Cortex Analyst is Snowflake's managed text-to-SQL service. Rather than letting an LLM hallucinate column names against a raw information_schema, Analyst is grounded in a hand-curated semantic model — a YAML file that names tables, dimensions, measures, synonyms, sample questions, and verified queries. The model is the contract between business vocabulary and warehouse schema; the LLM's only job is to map natural-language intent to that contract and emit Snowflake SQL that runs against your governed data.
The pitch is simple: a self-serve analytics endpoint that doesn't fabricate metrics. Analyst won't invent a column called net_revenue if your semantic model only declares gross_revenue and refunds — it composes them, or it asks for clarification, or it returns a verified query if one matches. As of 2026 it is generally available across most Snowflake regions and is billed by token consumption against the underlying Cortex models.
Cortex Analyst is a hosted REST endpoint that accepts a natural-language question and returns a Snowflake SQL statement, an executed result set, and a chain-of-thought style explanation with citations back to the semantic model. It runs inside the customer's Snowflake account, meaning prompts and data never leave the account boundary, and the SQL it produces is executed under the calling user's role — so row access policies, masking policies, and warehouse grants all apply unchanged.
A semantic model is a YAML file uploaded to a Snowflake stage and referenced by path on each request. The file declares the logical world the LLM is allowed to talk about: the tables it can join, the columns it can select, the measures it can compute, and the synonyms a business user might use for each.
Every entry in the model maps a logical name to a physical Snowflake object. Dimensions are columns you can group by or filter on; measures are aggregations.
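That split maps directly onto the SQL Analyst emits: dimensions land in the GROUP BY, measures become aggregate expressions in the SELECT. A minimal sketch of the mapping (the compose_query helper is illustrative, not part of any Snowflake API):

```python
def compose_query(table: str, dimension_expr: str, measure_expr: str, measure_name: str) -> str:
    """Sketch of how a dimension + measure pair from the semantic model
    maps onto SQL: the dimension is grouped on, the measure is aggregated."""
    return (
        f"SELECT {dimension_expr}, {measure_expr} AS {measure_name}\n"
        f"FROM {table}\n"
        f"GROUP BY {dimension_expr}"
    )

sql = compose_query(
    "ANALYTICS.PUBLIC.FCT_ORDERS",
    "shipping_country",
    "SUM(order_total_usd)",
    "gross_revenue",
)
```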
```yaml
name: revenue_analytics
description: |
  Sales, refunds, and customer activity for the e-commerce business.
  Use for revenue, AOV, churn, and cohort questions.
tables:
  - name: orders
    description: One row per completed customer order.
    base_table:
      database: ANALYTICS
      schema: PUBLIC
      table: FCT_ORDERS
    primary_key:
      columns: [order_id]
    dimensions:
      - name: order_id
        expr: order_id
        data_type: VARCHAR
        unique: true
      - name: order_date
        expr: order_date
        data_type: DATE
        synonyms: ["date", "purchase date", "when ordered"]
      - name: country
        expr: shipping_country
        data_type: VARCHAR
        synonyms: ["region", "ship-to country"]
        sample_values: ["US", "CA", "GB", "DE", "FR"]
      - name: channel
        expr: acquisition_channel
        data_type: VARCHAR
        synonyms: ["marketing channel", "source"]
        sample_values: ["organic", "paid_search", "email", "affiliate"]
    measures:
      - name: gross_revenue
        expr: SUM(order_total_usd)
        data_type: NUMBER
        default_aggregation: sum
        synonyms: ["sales", "GMV", "top-line revenue"]
      - name: refunds
        expr: SUM(refund_amount_usd)
        data_type: NUMBER
        default_aggregation: sum
      - name: net_revenue
        expr: SUM(order_total_usd) - SUM(refund_amount_usd)
        data_type: NUMBER
        synonyms: ["net sales", "revenue after refunds"]
      - name: order_count
        expr: COUNT(DISTINCT order_id)
        data_type: NUMBER
        synonyms: ["orders", "number of orders"]
      - name: aov
        expr: SUM(order_total_usd) / NULLIF(COUNT(DISTINCT order_id), 0)
        data_type: NUMBER
        synonyms: ["average order value"]
  - name: customers
    description: One row per registered customer.
    base_table:
      database: ANALYTICS
      schema: PUBLIC
      table: DIM_CUSTOMERS
    primary_key:
      columns: [customer_id]
    dimensions:
      - name: customer_id
        expr: customer_id
        data_type: VARCHAR
      - name: signup_date
        expr: created_at::DATE
        data_type: DATE
relationships:
  - name: orders_to_customers
    left_table: orders
    right_table: customers
    relationship_columns:
      - left_column: customer_id
        right_column: customer_id
    join_type: left_outer
    relationship_type: many_to_one
```
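Before uploading a model to the stage, it helps to fail fast on structural mistakes. A sketch of a client-side validator over the parsed YAML (assuming the file has already been loaded into a dict, e.g. with PyYAML; the required keys mirror the example above, not Snowflake's full schema):

```python
def validate_model(model: dict) -> list:
    """Collect structural problems in a parsed semantic-model dict.
    The checks mirror the example model above; the real schema has more fields."""
    errors = []
    for key in ("name", "tables"):
        if key not in model:
            errors.append(f"model missing top-level key: {key}")
    for table in model.get("tables", []):
        tname = table.get("name", "<unnamed>")
        if "base_table" not in table:
            errors.append(f"table {tname}: missing base_table")
        # Both dimensions and measures need an expr mapping to a physical column.
        for col in table.get("dimensions", []) + table.get("measures", []):
            if "expr" not in col:
                errors.append(f"table {tname}: column {col.get('name')} missing expr")
    return errors

# A deliberately broken model: no base_table, and a dimension without expr.
model = {
    "name": "revenue_analytics",
    "tables": [{"name": "orders", "dimensions": [{"name": "order_id"}]}],
}
problems = validate_model(model)
```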
Two additional sections lift answer quality dramatically: verified_queries (canonical SQL the model should serve verbatim when matched) and per-table or per-model sample_questions (few-shot pairs the LLM uses for in-context learning).
```yaml
verified_queries:
  - name: weekly_net_revenue_by_country_last_90d
    question: "weekly net revenue by country for the last 90 days"
    use_as_onboarding_question: true
    sql: |
      SELECT
        DATE_TRUNC('week', order_date) AS week,
        shipping_country AS country,
        SUM(order_total_usd) - SUM(refund_amount_usd) AS net_revenue
      FROM ANALYTICS.PUBLIC.FCT_ORDERS
      WHERE order_date >= DATEADD('day', -90, CURRENT_DATE())
      GROUP BY 1, 2
      ORDER BY 1, 2;
    verified_by: kevin@example.com
    verified_at: 1745539200

custom_instructions: |
  - Always express revenue in USD; no currency conversion in SQL.
  - Treat "this quarter" as the current calendar quarter, not fiscal.
  - When a user asks for "top customers", default to top 25 by net_revenue.
```
Verified queries are the highest-leverage thing you can add. Each one removes an entire failure mode and gives the analyst team a way to canonicalize "the right way to compute X" so it stops drifting between dashboards.
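To see why a matched verified query removes a failure mode entirely: the lookup can be deterministic, with no LLM in the loop. A sketch of a normalized exact-match lookup (the service's own matching is fuzzier; the helper and its normalization rule here are illustrative):

```python
import re
from typing import Optional

def normalize(q: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace so trivially
    different phrasings of the same question compare equal."""
    q = re.sub(r"[^a-z0-9 ]", "", q.lower())
    return re.sub(r"\s+", " ", q).strip()

def lookup_verified(question: str, verified_queries: list) -> Optional[dict]:
    """Return the verified query whose recorded question matches, if any."""
    wanted = normalize(question)
    for vq in verified_queries:
        if normalize(vq["question"]) == wanted:
            return vq
    return None

store = [{
    "name": "weekly_net_revenue_by_country_last_90d",
    "question": "weekly net revenue by country for the last 90 days",
}]
hit = lookup_verified("Weekly net revenue by country for the last 90 days?", store)
```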
Analyst is exposed as a single POST endpoint per Snowflake account. Authentication is OAuth or key-pair against a Snowflake user; the calling user's role controls which warehouses can run the generated SQL.
```bash
curl -s -X POST \
  "https://${ACCOUNT}.snowflakecomputing.com/api/v2/cortex/analyst/message" \
  -H "Authorization: Bearer ${SNOWFLAKE_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "weekly net revenue by country for the last 90 days"}
        ]
      }
    ],
    "semantic_model_file": "@ANALYTICS.PUBLIC.SEMANTIC_MODELS/revenue_analytics.yaml"
  }'
```
The response shape contains a single message with an array of typed content blocks: text (the natural-language explanation), sql (the generated SQL string and a confidence indicator), and optionally suggestions (clarifying questions the LLM proposes when ambiguity is detected).
```json
{
  "message": {
    "role": "analyst",
    "content": [
      {"type": "text", "text": "Here is weekly net revenue by country for the last 90 days."},
      {
        "type": "sql",
        "statement": "SELECT DATE_TRUNC('week', order_date) AS week, shipping_country AS country, SUM(order_total_usd) - SUM(refund_amount_usd) AS net_revenue FROM ANALYTICS.PUBLIC.FCT_ORDERS WHERE order_date >= DATEADD('day', -90, CURRENT_DATE()) GROUP BY 1, 2 ORDER BY 1, 2",
        "confidence": {
          "verified_query_used": {
            "name": "weekly_net_revenue_by_country_last_90d",
            "verified_by": "kevin@example.com"
          }
        }
      }
    ]
  },
  "request_id": "01b2-..."
}
```
The same endpoint is reachable through the Snowflake Python connector or the Snowpark session. The pattern below sends the question, parses out the generated SQL, executes it under the caller's role, and returns a DataFrame plus the citation block.
```python
import os

import requests
import snowflake.connector


def ask_analyst(conn, question: str, semantic_model: str, history=None) -> dict:
    """
    Post a question to Cortex Analyst and execute the resulting SQL.
    Returns a dict with the natural-language text, the SQL, the rows, and citations.
    """
    history = history or []
    history.append({
        "role": "user",
        "content": [{"type": "text", "text": question}],
    })

    host = conn.host
    token = conn.rest.token
    resp = requests.post(
        f"https://{host}/api/v2/cortex/analyst/message",
        headers={
            "Authorization": f"Bearer {token}",
            "X-Snowflake-Authorization-Token-Type": "OAUTH",
            "Content-Type": "application/json",
        },
        json={
            "messages": history,
            "semantic_model_file": semantic_model,
        },
        timeout=60,
    )
    resp.raise_for_status()
    body = resp.json()

    content = body["message"]["content"]
    text_blocks = [b["text"] for b in content if b["type"] == "text"]
    sql_blocks = [b["statement"] for b in content if b["type"] == "sql"]
    citations = [b.get("confidence", {}) for b in content if b["type"] == "sql"]

    # Execute the generated SQL under the caller's role, so row access
    # and masking policies apply unchanged.
    rows = []
    if sql_blocks:
        cur = conn.cursor()
        cur.execute(sql_blocks[0])
        cols = [c[0] for c in cur.description]
        rows = [dict(zip(cols, r)) for r in cur.fetchall()]

    history.append(body["message"])
    return {
        "explanation": "\n".join(text_blocks),
        "sql": sql_blocks[0] if sql_blocks else None,
        "rows": rows,
        "citations": citations,
        "history": history,
    }


conn = snowflake.connector.connect(
    account="abc12345",
    user="analyst_app",
    authenticator="OAUTH",
    token=os.environ["SNOWFLAKE_OAUTH_TOKEN"],
    warehouse="ANALYST_WH",
    role="ANALYST_APP_ROLE",
)

result = ask_analyst(
    conn,
    question="weekly net revenue by country for the last 90 days",
    semantic_model="@ANALYTICS.PUBLIC.SEMANTIC_MODELS/revenue_analytics.yaml",
)

print(result["sql"])
print(f"Returned {len(result['rows'])} rows")
if result["citations"] and result["citations"][0].get("verified_query_used"):
    vq = result["citations"][0]["verified_query_used"]
    print(f"Served from verified query: {vq['name']} (by {vq['verified_by']})")
```
The conversation history pattern matters. Each subsequent call passes the prior messages array including the analyst's response, so a follow-up like "now exclude orders under $50" gets resolved against the same semantic context.
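Because the server keeps no state, the client owns the messages array. A sketch of the turn-appending shape (the content-block structure follows the request example above; append_user_turn itself is a hypothetical helper):

```python
def append_user_turn(history: list, question: str) -> list:
    """Append a user turn in the content-block shape the endpoint expects.
    Returns a new list so prior turns are never mutated in place."""
    return history + [{
        "role": "user",
        "content": [{"type": "text", "text": question}],
    }]

history = append_user_turn([], "weekly net revenue by country for the last 90 days")
# ...after posting, append the returned body["message"] before the next turn...
history = append_user_turn(history, "now exclude orders under $50")
```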
Out of the box, a thin semantic model over a wide schema produces mediocre answers. The following patterns are what move it from demo to production.
- **Scope each model narrowly.** Don't build one giant everything.yaml. Build a model per domain (revenue, ops, marketing) and route the user's question to the right model from your app layer. The LLM's job is much easier when there are six tables instead of six hundred.
- **Promote recurring questions into verified_queries.** Over time, the long tail shrinks and Analyst increasingly serves canonical SQL.
- **Declare sample_values on low-cardinality dimensions.** Knowing the real values of channel means the LLM stops guessing string casing and substring matches.
- **Pin ambiguous business definitions in custom_instructions.**

The text-to-SQL space has converged on the same architecture — semantic model plus LLM plus verified-query store — but the platform integration differs significantly.
| Product | Grounding | Where SQL Runs | Best For |
|---|---|---|---|
| Snowflake Cortex Analyst | YAML semantic model + verified queries. | Snowflake warehouse, caller's role. | Data already in Snowflake; want governance and zero data egress. |
| Databricks Genie | Genie space referencing UC tables, sample queries, instructions. | Databricks SQL warehouse, caller's principal. | Data in the Lakehouse; want UC lineage and notebook-style exploration. |
| OpenAI Code Interpreter | No persistent semantic model — LLM reads CSVs ad hoc in a sandbox. | Sandboxed Python, not your warehouse. | One-off analysis on uploaded files; not appropriate for governed data. |
| DataChat | Multi-warehouse semantic layer with proprietary DSL. | Pushes down to Snowflake/BigQuery/Redshift. | Multi-warehouse organizations needing a single NL surface. |
If the data lives in Snowflake and the goal is a chatbot that respects existing roles, masking policies, and row access policies, Analyst is the lowest-friction option because the SQL it generates simply runs as the caller would have run it manually.
Analyst is not a replacement for a well-curated BI dashboard — it is a complement to one. The decision boils down to question shape and traffic pattern.
The economic argument for Analyst is the displaced cost of bespoke ad-hoc work, not the cost of dashboards. Done well, it converts an analyst team's hours from "writing one-off SQL" into "writing semantic models and verified queries that the LLM serves at scale."
The information schema gives you names and types but no semantics — it can't tell the LLM that shipping_country means region, that net_revenue requires subtracting refunds, or that "this quarter" is calendar not fiscal. Without a semantic model the LLM hallucinates business definitions; with one, it composes verified building blocks. The model also acts as the access-control surface: anything not in the YAML is invisible to the LLM, which prevents it from selecting columns the calling user shouldn't even know exist.
There are three places to look, in order. First, the response payload — the generated SQL is right there, so eyeball whether the date predicate matches your definition of "last quarter". Second, the semantic model — check whether net_revenue is defined the way the business actually computes it, and whether custom_instructions pins calendar versus fiscal. Third, the verified queries — if a verified query exists for that question, the response will name it; if not, this is a candidate to promote into the verified store after fixing.
Analyst generates SQL but does not run it under a privileged service identity — it runs under the calling user's role via the Snowflake Python connector or REST endpoint. So row access policies and dynamic masking policies on the underlying tables apply unchanged. If user A can only see EMEA rows and user B can see global, the same Analyst question returns different result sets without any extra plumbing in your app.
Whenever a question is asked more than a handful of times and the canonical SQL is non-obvious. Rolling 28-day metrics, cohort retention, attribution joins, anything with a tricky window function — these are the queries you don't want the LLM re-deriving on every request. The operational pattern is to log accepted Analyst answers, surface the most frequent questions weekly, and have an analyst promote them. Over time the long tail shrinks and the LLM increasingly serves verified SQL with high confidence.
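The logging half of that workflow needs little more than a counter over normalized question text. A sketch (the normalization and the threshold of three are illustrative choices):

```python
from collections import Counter

def promotion_candidates(question_log: list, min_count: int = 3) -> list:
    """Surface questions asked at least min_count times, most frequent first,
    as candidates for promotion into verified_queries."""
    counts = Counter(q.strip().lower() for q in question_log)
    return [(q, n) for q, n in counts.most_common() if n >= min_count]

log = [
    "weekly net revenue by country",
    "Weekly net revenue by country",
    "weekly net revenue by country",
    "top customers this quarter",
]
candidates = promotion_candidates(log)
```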
Pass the full conversation history on each request — the prior user turns and the prior analyst message blocks — in the messages array. Analyst is stateless server-side, so the client owns the history. The LLM uses the prior turn's SQL and the semantic model to resolve the new turn against the same context, which is what makes "break it down by country" or "exclude refunded orders" work without re-stating the whole question.
Anything that requires arithmetic outside SQL — Monte Carlo simulations, statistical tests, time-series forecasts. Anything that needs joins to data not modeled in the semantic file. And anything where the cost of a wrong answer is catastrophic (regulatory filings, payroll). Analyst is excellent for exploratory analytics and for displacing the long tail of "can you pull a quick number" requests; it should be paired with a human review step for high-stakes outputs.