what is cortex analyst

In today’s data-driven world, organizations are sinking in a sea of information. While this data holds enormous potential for valuable insights, the sheer volume of information can often overwhelm users.

According to Salesforce, more than two-thirds (67%) of business leaders are not using data to decide on pricing in line with economic conditions, such as inflation and less than one-third (29%) are using data to inform their strategy when launching in new markets.

So how can organizations unlock meaningful information from this torrent of data?

This is where business intelligence I solutions come into play.

To make informed business decisions, organizations are increasingly turning to AI-powered analytics tools. Snowflake's Cortex Analyst, a cloud-based service, is one such example that prioritizes accuracy, data security, and governance.

This article tells you everything you need to know about Cortext Analyst, an AI service launched by Snowflake, from how it works to key features and use cases. 

What is Cortex Analyst?

Cortex Analyst is a new, fully managed AI service similar to an AI-powered chatbot launched by Snowflake, a company specializing in delivering AI Data Cloud via its engine. The new solution aims to provide users with the easy ability to ask questions in natural language and receive accurate answers based on their structured data stored within Snowflake. 

Overall, Cortex Analyst aims to democratize data access by allowing users to ask questions in plain language rather than writing complex SQL code. 

The AI solution was built using Meta's Llama and Mistral models. Based on Snowflake’s internal evaluation mirroring real-world use cases, Cortex Analyst outperformed alternatives—nearly twice as accurate as single-prompt SQL generation from GPT-4o and about 14% more accurate than other solutions on the market.

Cortext Analyst equips developers with the ability to customize how and where an organization’s users can communicate with results, while still benefiting from Snowflake’s integrated security and governance features, including role-based access controls (RBAC), to protect valuable data.

How does Cortex Analyst work?

Cortex Analyst is a self-serve analytics solution that leverages advanced language models and generates accurate SQL queries based on the questions asked by users, enabling them to gain insights without submitting complex SQL code. The technology makes it easier to process analytical data and equips organizations with data-driven insights helping them make decisions more efficiently.

how does cortex analysrt work
Cortex Analyst chatbot feature on Snowflake.

The system first comprehends the query asked by the user by identifying keywords, phrases, and the overall intent. It then maps the data schema by detecting features of the query and understands the structure and relationships between tables, columns, and data types. Based on the system’s analysis and mapping, Cortex Analyst creates an SQL query designed to retrieve the necessary information related to the data asked from Snowflake's data warehouse.

The generated SQL query is executed against Snowflake's powerful engine, which quickly processes the data and presents the results clearly and concisely. This is often accompanied by visualizations or summaries to enhance understanding.

For high text-to-SQL accuracy, the new system deploys an AI agent powered by multiple SQL-generation agents, each using a different Large Language Model (LLM). It’s a tool that can be easily added to other software programs. It's like a building block that an organization can leverage to create more powerful applications. Using multiple LLMs boosts query-generation accuracy. 

Through an Error Correction Agent, the generated SQL is checked for syntactic and semantic errors, using core Snowflake services like the SQL Compiler. If errors are found, the agent runs a correction loop to fix them. This module also addresses hallucinations, correcting instances where the model might invent entities outside the Semantic Data Model or use nonexistent SQL functions.

In the final step, all of the generated SQL queries are forwarded to a Synthesizer Agent which leverages the work done by the previous agents. The Synthesizer Agent generates the final SQL query that most accurately answers the question at hand. The SQL query, along with the interpretation of the user question, is included in the API response. The returned SQL query can be executed in the background of the client application, and the final results are presented to the end user.

An organization can access Cortex Analyst, which provides a conversational interface where users can interact with structured data through a Snowflake-powered analytics platform. 

Key Features of Cortex Analyst 

1. Natural language queries

Cortex Analyst offers organizations and their teams including non-technical users instant answers and insights from the structured data in Snowflake. This enables teams to create downstream chat applications where users can ask questions using natural language and receive accurate answers quickly.

key features of cortex analyst
Cortex Analyst follows a series of steps to go from a natural language question to a response.

2. REST API for integration

Cortex Analyst is designed to be flexible and adaptable to an organization’s specific needs, which is why it takes an API (Application Programming Interface)-first approach, allowing organizations to integrate it into their existing software or platforms. It provides organizations with full control over the end-user experience. For example, you could integrate Cortex Analyst into a chat app like Slack or Teams, allowing users to ask questions and get answers directly within their preferred workspace. This flexibility ensures that Cortex Analyst can be seamlessly integrated into an organization’s existing operational processes, presenting valuable insights where your users need them most.

3. LLMs

Cortex Analyst uses powerful AI models to comprehend and answer users' queries. These models are very smart and can process information quickly. By default, it uses models – Llama and Mistral, which are crafted by Meta and they run securely inside Snowflake Cortex, Snowflake’s intelligent, fully managed AI service. However, organizations can also opt for other models from Azure OpenAI, which are essentially GPT models.

Cortex Analyst uses the best combination of models to provide users with accurate and relevant answers. As these AI models get better over time, Snowflake will keep adding new ones to Cortex Analyst.

4. Semantic model 

Cortex Analyst is aided by a semantic model that comprehends the data in an organization’s database. The model overcomes the challenges faced by generic AI solutions that struggle with text-to-SQL conversions when given only a database schema.

Instead, Cortex Analyst deploys a semantic model that bridges the gap between an organization’s users and databases. It’s fitted in a lightweight YAML file but the overall structure and concepts of the semantic model are similar to those of database schemas. However, the schemas allow for a greater description of the semantic information around the data, allowing the AI solution to provide more precise and relevant answers.

5. Security and governance

Cortex Analyst prioritizes data privacy and security foremost owing to Snowflake’s strong commitment to privacy and its advanced security measures. This ensures an organization's data remains protected while using the AI conversational interface. 

The tool doesn't use customer data to train its AI models, and it only uses the metadata (information about customer data) to generate SQL queries. The user's data stays within Snowflake's secure environment. Also, Cortex Analyst integrates seamlessly with Snowflake's security features, such as role-based access control, to protect users' data from unauthorized access.

Key benefits of Cortex Analyst 

1. Semantics

Cortex Analyst effectively captures semantics by using a semantic model that provides additional context and information beyond the basic database schema. This helps LLMs understand the intent of data questions and generate more accurate responses for users. 

They provide context and information beyond the basic structure of the data including the different terminology used in an organization, relationships between data elements, and corresponding units of measurement. This extra information helps Cortex Analyst understand the intent behind users’ questions and yield more accurate SQL queries. For example, if you ask "What were our total sales last month?", Cortex Analyst can use the semantic model to identify the relevant tables and columns, as well as the correct units of measurement (e.g. dollars).

2. Contained SQL-generation accuracy

Cortext Analyst contains the problem space by focusing on specific areas or use case. Instead of making efforts to analyze all the data in an organization’s database at once, it targets particular subjects such as analyzing marketing or sales. 

SQL-generation accuracy can be driven significantly higher within a contained scope, as opposed to targeting the entire database schema. Too many similar-sounding tables and columns can confuse the LLMs and thus reduce accuracy. By narrowing down the focus, Cortex Analyst can provide users with more precise and helpful answers.

3. User trust

Cortext Analyst rejects unanswered questions and suggests alternative questions. The AI chatbot proactively identifies and rejects ambiguous or unanswerable questions, depending the available data. Instead of producing incorrect results, it suggests alternative queries that can be confidently answered.

This helps to maintain user trust because it shows that Cortex Analyst is being honest and transparent about its capabilities. It's like having a helpful assistant who knows when to say "I don't know" and offer suggestions.

4. Evolving with technology

Cortex Analyst is continuously evolving to stay up-to-date with the latest advancements in AI. While even the most sophisticated LLMs still face challenges in generating accurate SQL from complex schemas, Cortex Analyst is designed to deliver reliable results by avoiding over/undercounting post-joins and handling complex schema structures like chasm traps and fan traps. Simplifying schemas can further enhance the accuracy and reliability of generated SQL.

Use cases

1. Healthcare Analytics

Cortex Analyst can be deployed by pharmaceutical companies or healthcare providers to analyze clinical trial data and identify trends or correlations that could potenially lead to new drug discoveries for the former. For healthcare providers, the tool can analyze patient records and detect patterns in disease outbreaks or treatment effectiveness. 

For instance, users could write prompts such as – 

"Are there any Jon Doe’s demographics more likely to experience side effects from this medication?" 

"What is the average length of stay for patients with malaria?"

2. Financial Services

Cortex Analyst can help banks and insurance companies analyze customer data to detect potential fraud patterns or customer behavior to optimize marketing campaigns respectively.

For instance, users could write prompts such as – 

"Are there any unusual spending patterns that might indicate fraudulent activity?"

"What factors are correlated with higher claim costs for car insurance?"

3. Retail

Cortex Analyst can be deployed by e-commerce organizations to analyze customer behavior and optimize marketing campaigns. It can also be used by brick-and-mortar retailer to analyze sales data and get the most use out of inventory management for best business practices.

For instance, users could write prompts such as – 

"What products are frequently purchased together?"

"What is the optimal stock level for product X based on historical sales data?"

4. Manufacturing

An organization specializing in manufacturing could deploy Cortex Analyst to analyze production data and identify areas for improvement.

For instance, users could write prompts such as – 

"What factors are contributing to increased production costs on assembly line A?"

cortex analyst by snowflake badge

About Snowflake

Snowflake delivers the AI Data Cloud — a global network where thousands of organizations mobilize data with near-unlimited scale, concurrency, and performance. Inside the AI Data Cloud, organizations unite their siloed data, easily discover and securely share governed data, and execute diverse analytic workloads. Wherever data or users live, Snowflake delivers a single and seamless experience across multiple public clouds. 

Snowflake’s platform is the engine that powers and provides access to the AI Data Cloud, creating a solution for data warehousing, data lakes, data engineering, data science, data application development, and data sharing. Join Snowflake customers, partners, and data providers already taking their businesses to new frontiers in the AI Data Cloud.