Cortex Analyst is Snowflake's managed text-to-SQL service. Rather than letting an LLM hallucinate column names against a raw information_schema, Analyst is grounded in a hand-curated semantic model — a YAML file that names tables, dimensions, measures, synonyms, sample questions, and verified queries. The model is the contract between business vocabulary and warehouse schema; the LLM's only job is to map natural-language intent to that contract and emit Snowflake SQL that runs against your governed data.
The pitch is simple: a self-serve analytics endpoint that doesn't fabricate metrics. Analyst won't invent a column called net_revenue if your semantic model only declares gross_revenue and refunds — it composes them, or it asks for clarification, or it returns a verified query if one matches. As of 2026 it is generally available across most Snowflake regions and is billed by token consumption against the underlying Cortex models.
Cortex Analyst is a hosted REST endpoint that accepts a natural-language question and returns a Snowflake SQL statement, an executed result set, and a chain-of-thought style explanation with citations back to the semantic model. It runs inside the customer's Snowflake account, meaning prompts and data never leave the account boundary, and the SQL it produces is executed under the calling user's role — so row access policies, masking policies, and warehouse grants all apply unchanged.
A semantic model is a YAML file uploaded to a Snowflake stage and referenced by path on each request. The file declares the logical world the LLM is allowed to talk about: the tables it can join, the columns it can select, the measures it can compute, and the synonyms a business user might use for each.
Every entry in the model maps a logical name to a physical Snowflake object. Dimensions are columns you can group by or filter on; measures are aggregations.
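That split maps directly onto the SQL Analyst emits: dimensions land in the GROUP BY, measures become aggregate expressions in the SELECT. A minimal sketch of the mapping (the compose_query helper is illustrative, not part of any Snowflake API):

```python
def compose_query(table: str, dimension_expr: str, measure_expr: str, measure_name: str) -> str:
    """Sketch of how a dimension + measure pair from the semantic model
    maps onto SQL: the dimension is grouped on, the measure is aggregated."""
    return (
        f"SELECT {dimension_expr}, {measure_expr} AS {measure_name}\n"
        f"FROM {table}\n"
        f"GROUP BY {dimension_expr}"
    )

sql = compose_query(
    "ANALYTICS.PUBLIC.FCT_ORDERS",
    "shipping_country",
    "SUM(order_total_usd)",
    "gross_revenue",
)
```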
```yaml
name: revenue_analytics
description: |
  Sales, refunds, and customer activity for the e-commerce business.
  Use for revenue, AOV, churn, and cohort questions.
tables:
  - name: orders
    description: One row per completed customer order.
    base_table:
      database: ANALYTICS
      schema: PUBLIC
      table: FCT_ORDERS
    primary_key:
      columns: [order_id]
    dimensions:
      - name: order_id
        expr: order_id
        data_type: VARCHAR
        unique: true
      - name: order_date
        expr: order_date
        data_type: DATE
        synonyms: ["date", "purchase date", "when ordered"]
      - name: country
        expr: shipping_country
        data_type: VARCHAR
        synonyms: ["region", "ship-to country"]
        sample_values: ["US", "CA", "GB", "DE", "FR"]
      - name: channel
        expr: acquisition_channel
        data_type: VARCHAR
        synonyms: ["marketing channel", "source"]
        sample_values: ["organic", "paid_search", "email", "affiliate"]
    measures:
      - name: gross_revenue
        expr: SUM(order_total_usd)
        data_type: NUMBER
        default_aggregation: sum
        synonyms: ["sales", "GMV", "top-line revenue"]
      - name: refunds
        expr: SUM(refund_amount_usd)
        data_type: NUMBER
        default_aggregation: sum
      - name: net_revenue
        expr: SUM(order_total_usd) - SUM(refund_amount_usd)
        data_type: NUMBER
        synonyms: ["net sales", "revenue after refunds"]
      - name: order_count
        expr: COUNT(DISTINCT order_id)
        data_type: NUMBER
        synonyms: ["orders", "number of orders"]
      - name: aov
        expr: SUM(order_total_usd) / NULLIF(COUNT(DISTINCT order_id), 0)
        data_type: NUMBER
        synonyms: ["average order value"]
  - name: customers
    description: One row per registered customer.
    base_table:
      database: ANALYTICS
      schema: PUBLIC
      table: DIM_CUSTOMERS
    primary_key:
      columns: [customer_id]
    dimensions:
      - name: customer_id
        expr: customer_id
        data_type: VARCHAR
      - name: signup_date
        expr: created_at::DATE
        data_type: DATE
relationships:
  - name: orders_to_customers
    left_table: orders
    right_table: customers
    relationship_columns:
      - left_column: customer_id
        right_column: customer_id
    join_type: left_outer
    relationship_type: many_to_one
```
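Before uploading a model to the stage, it helps to fail fast on structural mistakes. A sketch of a client-side validator over the parsed YAML (assuming the file has already been loaded into a dict, e.g. with PyYAML; the required keys mirror the example above, not Snowflake's full schema):

```python
def validate_model(model: dict) -> list:
    """Collect structural problems in a parsed semantic-model dict.
    The checks mirror the example model above; the real schema has more fields."""
    errors = []
    for key in ("name", "tables"):
        if key not in model:
            errors.append(f"model missing top-level key: {key}")
    for table in model.get("tables", []):
        tname = table.get("name", "<unnamed>")
        if "base_table" not in table:
            errors.append(f"table {tname}: missing base_table")
        # Both dimensions and measures need an expr mapping to a physical column.
        for col in table.get("dimensions", []) + table.get("measures", []):
            if "expr" not in col:
                errors.append(f"table {tname}: column {col.get('name')} missing expr")
    return errors

# A deliberately broken model: no base_table, and a dimension without expr.
model = {
    "name": "revenue_analytics",
    "tables": [{"name": "orders", "dimensions": [{"name": "order_id"}]}],
}
problems = validate_model(model)
```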
Two additional sections lift answer quality dramatically: verified_queries (canonical SQL the model should serve verbatim when matched) and per-table or per-model sample_questions (few-shot pairs the LLM uses for in-context learning).
```yaml
verified_queries:
  - name: weekly_net_revenue_by_country_last_90d
    question: "weekly net revenue by country for the last 90 days"
    use_as_onboarding_question: true
    sql: |
      SELECT
        DATE_TRUNC('week', order_date) AS week,
        shipping_country AS country,
        SUM(order_total_usd) - SUM(refund_amount_usd) AS net_revenue
      FROM ANALYTICS.PUBLIC.FCT_ORDERS
      WHERE order_date >= DATEADD('day', -90, CURRENT_DATE())
      GROUP BY 1, 2
      ORDER BY 1, 2;
    verified_by: kevin@example.com
    verified_at: 1745539200

custom_instructions: |
  - Always express revenue in USD; no currency conversion in SQL.
  - Treat "this quarter" as the current calendar quarter, not fiscal.
  - When a user asks for "top customers", default to top 25 by net_revenue.
```
Verified queries are the highest-leverage thing you can add. Each one removes an entire failure mode and gives the analyst team a way to canonicalize "the right way to compute X" so it stops drifting between dashboards.
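To see why a matched verified query removes a failure mode entirely: the lookup can be deterministic, with no LLM in the loop. A sketch of a normalized exact-match lookup (the service's own matching is fuzzier; the helper and its normalization rule here are illustrative):

```python
import re
from typing import Optional

def normalize(q: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace so trivially
    different phrasings of the same question compare equal."""
    q = re.sub(r"[^a-z0-9 ]", "", q.lower())
    return re.sub(r"\s+", " ", q).strip()

def lookup_verified(question: str, verified_queries: list) -> Optional[dict]:
    """Return the verified query whose recorded question matches, if any."""
    wanted = normalize(question)
    for vq in verified_queries:
        if normalize(vq["question"]) == wanted:
            return vq
    return None

store = [{
    "name": "weekly_net_revenue_by_country_last_90d",
    "question": "weekly net revenue by country for the last 90 days",
}]
hit = lookup_verified("Weekly net revenue by country for the last 90 days?", store)
```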
Analyst is exposed as a single POST endpoint per Snowflake account. Authentication is OAuth or key-pair against a Snowflake user; the calling user's role controls which warehouses can run the generated SQL.
```bash
curl -s -X POST \
  "https://${ACCOUNT}.snowflakecomputing.com/api/v2/cortex/analyst/message" \
  -H "Authorization: Bearer ${SNOWFLAKE_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "weekly net revenue by country for the last 90 days"}
        ]
      }
    ],
    "semantic_model_file": "@ANALYTICS.PUBLIC.SEMANTIC_MODELS/revenue_analytics.yaml"
  }'
```
The response shape contains a single message with an array of typed content blocks: text (the natural-language explanation), sql (the generated SQL string and a confidence indicator), and optionally suggestions (clarifying questions the LLM proposes when ambiguity is detected).
```json
{
  "message": {
    "role": "analyst",
    "content": [
      {"type": "text", "text": "Here is weekly net revenue by country for the last 90 days."},
      {
        "type": "sql",
        "statement": "SELECT DATE_TRUNC('week', order_date) AS week, shipping_country AS country, SUM(order_total_usd) - SUM(refund_amount_usd) AS net_revenue FROM ANALYTICS.PUBLIC.FCT_ORDERS WHERE order_date >= DATEADD('day', -90, CURRENT_DATE()) GROUP BY 1, 2 ORDER BY 1, 2",
        "confidence": {
          "verified_query_used": {
            "name": "weekly_net_revenue_by_country_last_90d",
            "verified_by": "kevin@example.com"
          }
        }
      }
    ]
  },
  "request_id": "01b2-..."
}
```
The same endpoint is reachable through the Snowflake Python connector or the Snowpark session. The pattern below sends the question, parses out the generated SQL, executes it under the caller's role, and returns a DataFrame plus the citation block.
```python
import os

import requests
import snowflake.connector


def ask_analyst(conn, question: str, semantic_model: str, history=None) -> dict:
    """
    Post a question to Cortex Analyst and execute the resulting SQL.
    Returns a dict with the natural-language text, the SQL, the rows, and citations.
    """
    history = history or []
    history.append({
        "role": "user",
        "content": [{"type": "text", "text": question}],
    })

    host = conn.host
    token = conn.rest.token
    resp = requests.post(
        f"https://{host}/api/v2/cortex/analyst/message",
        headers={
            "Authorization": f"Bearer {token}",
            "X-Snowflake-Authorization-Token-Type": "OAUTH",
            "Content-Type": "application/json",
        },
        json={
            "messages": history,
            "semantic_model_file": semantic_model,
        },
        timeout=60,
    )
    resp.raise_for_status()
    body = resp.json()

    content = body["message"]["content"]
    text_blocks = [b["text"] for b in content if b["type"] == "text"]
    sql_blocks = [b["statement"] for b in content if b["type"] == "sql"]
    citations = [b.get("confidence", {}) for b in content if b["type"] == "sql"]

    # Execute the generated SQL under the caller's role, so row access
    # and masking policies apply unchanged.
    rows = []
    if sql_blocks:
        cur = conn.cursor()
        cur.execute(sql_blocks[0])
        cols = [c[0] for c in cur.description]
        rows = [dict(zip(cols, r)) for r in cur.fetchall()]

    history.append(body["message"])
    return {
        "explanation": "\n".join(text_blocks),
        "sql": sql_blocks[0] if sql_blocks else None,
        "rows": rows,
        "citations": citations,
        "history": history,
    }


conn = snowflake.connector.connect(
    account="abc12345",
    user="analyst_app",
    authenticator="OAUTH",
    token=os.environ["SNOWFLAKE_OAUTH_TOKEN"],
    warehouse="ANALYST_WH",
    role="ANALYST_APP_ROLE",
)

result = ask_analyst(
    conn,
    question="weekly net revenue by country for the last 90 days",
    semantic_model="@ANALYTICS.PUBLIC.SEMANTIC_MODELS/revenue_analytics.yaml",
)

print(result["sql"])
print(f"Returned {len(result['rows'])} rows")
if result["citations"] and result["citations"][0].get("verified_query_used"):
    vq = result["citations"][0]["verified_query_used"]
    print(f"Served from verified query: {vq['name']} (by {vq['verified_by']})")
```
The conversation history pattern matters. Each subsequent call passes the prior messages array including the analyst's response, so a follow-up like "now exclude orders under $50" gets resolved against the same semantic context.
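Because the server keeps no state, the client owns the messages array. A sketch of the turn-appending shape (the content-block structure follows the request example above; append_user_turn itself is a hypothetical helper):

```python
def append_user_turn(history: list, question: str) -> list:
    """Append a user turn in the content-block shape the endpoint expects.
    Returns a new list so prior turns are never mutated in place."""
    return history + [{
        "role": "user",
        "content": [{"type": "text", "text": question}],
    }]

history = append_user_turn([], "weekly net revenue by country for the last 90 days")
# ...after posting, append the returned body["message"] before the next turn...
history = append_user_turn(history, "now exclude orders under $50")
```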
Out of the box, a thin semantic model over a wide schema produces mediocre answers. The following patterns are what move it from demo to production.
- **Scope each model narrowly.** Don't build one giant everything.yaml. Build a model per domain (revenue, ops, marketing) and route the user's question to the right model from your app layer. The LLM's job is much easier when there are six tables instead of six hundred.
- **Promote recurring questions into verified_queries.** Over time, the long tail shrinks and Analyst increasingly serves canonical SQL.
- **Declare sample_values on low-cardinality dimensions.** Knowing the real values of channel means the LLM stops guessing string casing and substring matches.
- **Pin ambiguous business definitions in custom_instructions.**

The text-to-SQL space has converged on the same architecture — semantic model plus LLM plus verified-query store — but the platform integration differs significantly.
| Product | Grounding | Where SQL Runs | Best For |
|---|---|---|---|
| Snowflake Cortex Analyst | YAML semantic model + verified queries. | Snowflake warehouse, caller's role. | Data already in Snowflake; want governance and zero data egress. |
| Databricks Genie | Genie space referencing UC tables, sample queries, instructions. | Databricks SQL warehouse, caller's principal. | Data in the Lakehouse; want UC lineage and notebook-style exploration. |
| OpenAI Code Interpreter | No persistent semantic model — LLM reads CSVs ad hoc in a sandbox. | Sandboxed Python, not your warehouse. | One-off analysis on uploaded files; not appropriate for governed data. |
| DataChat | Multi-warehouse semantic layer with proprietary DSL. | Pushes down to Snowflake/BigQuery/Redshift. | Multi-warehouse organizations needing a single NL surface. |
If the data lives in Snowflake and the goal is a chatbot that respects existing roles, masking policies, and row access policies, Analyst is the lowest-friction option because the SQL it generates simply runs as the caller would have run it manually.
Analyst is not a replacement for a well-curated BI dashboard — it is a complement to one. The decision boils down to question shape and traffic pattern.
The economic argument for Analyst is the displaced cost of bespoke ad-hoc work, not the cost of dashboards. Done well, it converts an analyst team's hours from "writing one-off SQL" into "writing semantic models and verified queries that the LLM serves at scale."
The information schema gives you names and types but no semantics — it can't tell the LLM that shipping_country means region, that net_revenue requires subtracting refunds, or that "this quarter" is calendar not fiscal. Without a semantic model the LLM hallucinates business definitions; with one, it composes verified building blocks. The model also acts as the access-control surface: anything not in the YAML is invisible to the LLM, which prevents it from selecting columns the calling user shouldn't even know exist.
There are three places to look, in order. First, the response payload — the generated SQL is right there, so eyeball whether the date predicate matches your definition of "last quarter". Second, the semantic model — check whether net_revenue is defined the way the business actually computes it, and whether custom_instructions pins calendar versus fiscal. Third, the verified queries — if a verified query exists for that question, the response will name it; if not, this is a candidate to promote into the verified store after fixing.
Analyst generates SQL but does not run it under a privileged service identity — it runs under the calling user's role via the Snowflake Python connector or REST endpoint. So row access policies and dynamic masking policies on the underlying tables apply unchanged. If user A can only see EMEA rows and user B can see global, the same Analyst question returns different result sets without any extra plumbing in your app.
Whenever a question is asked more than a handful of times and the canonical SQL is non-obvious. Rolling 28-day metrics, cohort retention, attribution joins, anything with a tricky window function — these are the queries you don't want the LLM re-deriving on every request. The operational pattern is to log accepted Analyst answers, surface the most frequent questions weekly, and have an analyst promote them. Over time the long tail shrinks and the LLM increasingly serves verified SQL with high confidence.
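The logging half of that workflow needs little more than a counter over normalized question text. A sketch (the normalization and the threshold of three are illustrative choices):

```python
from collections import Counter

def promotion_candidates(question_log: list, min_count: int = 3) -> list:
    """Surface questions asked at least min_count times, most frequent first,
    as candidates for promotion into verified_queries."""
    counts = Counter(q.strip().lower() for q in question_log)
    return [(q, n) for q, n in counts.most_common() if n >= min_count]

log = [
    "weekly net revenue by country",
    "Weekly net revenue by country",
    "weekly net revenue by country",
    "top customers this quarter",
]
candidates = promotion_candidates(log)
```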
Pass the full conversation history on each request — the prior user turns and the prior analyst message blocks — in the messages array. Analyst is stateless server-side, so the client owns the history. The LLM uses the prior turn's SQL and the semantic model to resolve the new turn against the same context, which is what makes "break it down by country" or "exclude refunded orders" work without re-stating the whole question.
Anything that requires arithmetic outside SQL — Monte Carlo simulations, statistical tests, time-series forecasts. Anything that needs joins to data not modeled in the semantic file. And anything where the cost of a wrong answer is catastrophic (regulatory filings, payroll). Analyst is excellent for exploratory analytics and for displacing the long tail of "can you pull a quick number" requests; it should be paired with a human review step for high-stakes outputs.