
# Soda AI

**Soda AI gets your data AI-ready**. It includes several capabilities that serve one goal: trusted data that's ready for an AI-driven workflow. With Soda AI, your team and AI agents can turn data into actionable, understandable and trustworthy information.

The following capabilities make this work:

<table data-header-hidden><thead><tr><th width="186.66668701171875">Feature</th><th>What it does</th></tr></thead><tbody><tr><td><a href="#chat-interface">Chat interface</a></td><td>Conversational entry point to Soda AI: <strong>performs</strong> actions in Soda, <strong>guides</strong> workflows, and <strong>answers</strong> questions, all in-product.</td></tr><tr><td><a href="/pages/SIhP0wH6BwPidEAnAUIS">Contract Autopilot</a></td><td><strong>Creates</strong> a first version of a data contract.</td></tr><tr><td><a href="/pages/PZNeraYaFXY51diHXyWO">Contract Copilot</a></td><td>Helps <strong>edit</strong> and <strong>iterate</strong> on existing data contracts.</td></tr><tr><td><a href="/pages/c9RIRkDSFHZNEE44x1zB">MCP</a></td><td><strong>Connect your own agents</strong> to trusted data.</td></tr><tr><td><a href="/pages/jsZPg1FqxguLLnwjLeHe">API</a></td><td><strong>Integrate</strong> Soda with your own systems.</td></tr><tr><td><a href="/pages/YhGOZdDCLcQVROIZr68m">CLI</a></td><td><strong>Run data quality programmatically</strong> in your existing pipelines.</td></tr></tbody></table>

## What is Soda AI?

**Soda AI is the collection of AI-powered features** that make your data quality AI-ready. It isn't a single tool but the umbrella over everything in Soda that uses AI to define, maintain, and reason about data quality: the **in-product chat interface**, **Autopilot**, **Copilot**, and the **tools** that connect your contracts and data-quality status to **external agents**.

Soda AI's features fall into a few groups:

<table data-card-size="large" data-view="cards"><thead><tr><th></th><th></th><th data-hidden data-card-cover data-type="image">Cover image</th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td><strong>Chat interface</strong></td><td>The <strong>chat interface</strong> is the in-product way of interacting with Soda AI in natural language. It answers questions about Soda and your data, guides you through platform workflows, and proposes/performs actions on your behalf. As a Soda AI interface, the chat reaches for the tools below to carry out tasks.</td><td><a href="/files/p8HLZZXJXHZp4DOqYhVW">/files/p8HLZZXJXHZp4DOqYhVW</a></td><td><a href="/pages/taqcZt5mpjPfYl9VC72C#chat-interface">/pages/taqcZt5mpjPfYl9VC72C#chat-interface</a></td></tr><tr><td><strong>Contract Autopilot</strong></td><td><strong>Generate fully populated contracts</strong> with recommended checks, drawn from your data's own profile. Data teams can reach broad coverage in an afternoon instead of authoring contracts by hand for weeks.</td><td><a href="/files/SeXLJXXWta7IkyaRBqSq">/files/SeXLJXXWta7IkyaRBqSq</a></td><td><a href="/pages/SIhP0wH6BwPidEAnAUIS">/pages/SIhP0wH6BwPidEAnAUIS</a></td></tr><tr><td><strong>Contract Copilot</strong></td><td><strong>Edit and iterate on contracts</strong> in plain English (adding a freshness check, tightening a rule, handling a nested field) without hand-writing code lines. Contract Copilot will show you exactly what changed.</td><td><a href="/files/lrjqKlsT85mjIKPm0Axj">/files/lrjqKlsT85mjIKPm0Axj</a></td><td><a href="/pages/PZNeraYaFXY51diHXyWO">/pages/PZNeraYaFXY51diHXyWO</a></td></tr><tr><td><strong>Agent-facing interfaces</strong></td><td><strong>Connect agents to trusted data.</strong> MCP, CLI and API put your contracts and data-quality status in front of the tools your team and your agents already use. 
Engineers can run quality checks inside their existing pipelines, and agents act on verified data.</td><td><a href="/files/irbgCrGkjfVTMHTTq99z">/files/irbgCrGkjfVTMHTTq99z</a></td><td><a href="/pages/c9RIRkDSFHZNEE44x1zB">/pages/c9RIRkDSFHZNEE44x1zB</a></td></tr></tbody></table>

Across all of these, **Soda AI is designed to**:

* **Reduce manual effort** in defining and maintaining data quality expectations
* **Help** users **reason about data** and contracts faster
* **Integrate with existing workflows**, enabling AI agents to interact with Soda Cloud in your own pipelines
* **Keep humans in control**: every proposed action is shown and approved by a person before it is applied
* **Operate safely** within strict privacy, security, and governance boundaries

### Limitations

* Soda AI features may not always produce complete or perfectly accurate answers.
* Users remain responsible for reviewing any proposed action before it is applied.
* The chat interface, Contract Autopilot, and Contract Copilot do not operate outside the context of your Soda Cloud workspace.

{% hint style="success" %}
**The chat interface, Contract Autopilot, and Contract Copilot are assistive, not autonomous**. These features are designed as a control surface for human-in-the-loop data quality, scoped to the same operations you could perform yourself, subject to your role and permissions.
{% endhint %}

## Chat interface

The chat interface is **the in-product, conversational face of Soda AI**. It lives alongside your work in Soda Cloud and offers AI-powered, assistive features that help you **understand data**, **author data contracts**, and **accelerate onboarding** without replacing human decision-making. You can ask it to perform actions on your behalf, get step-by-step guidance on platform workflows, or ask it questions about Soda, all in **natural language, without leaving the page you’re on**. As an agentic assistant within Soda Cloud, **the chat interface can access the tooling** it needs to perform those actions, such as Autopilot or MCP.

<figure><img src="/files/ASzSH7QppoZrDfe0SCHA" alt=""><figcaption></figcaption></figure>

### Chat interface capabilities

Soda AI's actions are scoped to the **same operations you can perform** yourself in Soda Cloud, subject to your role and permissions. **The chat interface can:**

* **Perform** safe, reviewable **actions in Soda Cloud** on your behalf. For example:
  * **Generate data contracts.** Using [Contract Autopilot](/soda-ai/contract-autopilot.md), you can point Soda AI at your datasets and generate fully populated contracts with recommended checks, drawn from your data's own profile. Data teams can get full coverage in an afternoon, instead of manually creating contracts for weeks.
  * **Edit and iterate on data contracts** in plain English. Add a freshness check, tighten a rule, or handle a nested field, all without hand-writing code. Within the chat, you see exactly what changed, so you keep full control over the editing process.
* **Guide you through platform workflows** (configuring datasets, authoring contracts, investigating incidents).
* **Answer product and docs questions.** Ask what a check, monitor, or incident means, or tell it to make a change. It proposes the action, scoped to what your role already allows, and waits for you to approve.

{% hint style="success" %}
Every change is proposed, shown, and approved by a human user. Soda AI is an assistant and control surface, and is designed to perform **human-in-the-loop processes**.
{% endhint %}
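
The propose, show, and approve flow described above can be pictured as a simple gate. The following is a minimal sketch, not Soda's implementation: the `ROLE_PERMISSIONS` mapping, the `ProposedAction` type, and the `apply_with_approval` function are hypothetical names invented for illustration, and the real permission model and actions live in Soda Cloud.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical role-to-operation mapping; not Soda's actual permission model.
ROLE_PERMISSIONS = {
    "viewer": {"read_contract"},
    "editor": {"read_contract", "edit_contract"},
}


@dataclass
class ProposedAction:
    operation: str    # e.g. "edit_contract"
    description: str  # human-readable summary shown to the user


def apply_with_approval(
    action: ProposedAction,
    role: str,
    approve: Callable[[ProposedAction], bool],
) -> str:
    """Apply an action only if the role allows it and a human approves it."""
    # 1. Scope check: the assistant cannot do what the user could not do.
    if action.operation not in ROLE_PERMISSIONS.get(role, set()):
        return "rejected: not permitted for role"
    # 2. Human-in-the-loop: show the proposal and wait for a decision.
    if not approve(action):
        return "rejected: not approved"
    # 3. Only now would the change be applied (omitted in this sketch).
    return "applied"


proposal = ProposedAction("edit_contract", "Add a freshness check to 'orders'")
print(apply_with_approval(proposal, "editor", lambda a: True))   # applied
print(apply_with_approval(proposal, "viewer", lambda a: True))   # rejected: not permitted for role
print(apply_with_approval(proposal, "editor", lambda a: False))  # rejected: not approved
```

The permission check runs before the approval prompt, so in this sketch a user is never asked to approve something their role could not perform anyway.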

{% if visitor.claims.plan === 'enterprise' || visitor.claims.plan === 'enterpriseUserBased' || visitor.claims.plan === 'datasetStandard' %}

### Conversations and history

Each chat is saved as a **conversation** in the Soda AI chat panel. You can resume a previous conversation or start a new one from any page on Soda Cloud.

{% hint style="info" %}
**Your prompts and conversation history are stored in your browser's local storage**. If you change browsers or open Soda Cloud on a different device, you will not have access to your conversation history.
{% endhint %}
{% endif %}

{% if visitor.claims.plan === 'enterprise' || visitor.claims.plan === 'enterpriseUserBased' || visitor.claims.plan === 'datasetStandard' %}

***

## Enable/Disable AI features

AI features in Soda can be configured from the **Organization Settings**.

{% stepper %}
{% step %}
Navigate to **Settings** > **Organization Settings**.
{% endstep %}

{% step %}
On the **Organization** tab, under **Soda AI features**, select or clear a feature's checkbox to enable or disable it.

<figure><img src="/files/RlxsQVoUUfmGEJMx3HDx" alt=""><figcaption></figcaption></figure>
{% endstep %}

{% step %}
Click **Save** in the top-right corner.
{% endstep %}
{% endstepper %}
{% endif %}

***

## Soda AI security & privacy measures

Soda's features are built with **privacy and security as first-class design principles**. The following safeguards apply to all AI/ML-powered features in Soda.

### Privacy and data usage

* Soda uses **third-party foundation models** (GPT-5 family).
* **API request data** sent to the model provider is **not retained for training**.
* **Soda does not use customer data**, metadata, or results from your environment to **train** or **fine-tune** external AI, LLM, or ML models.
* **Source data**, such as row-level data and/or PII, **is never sent by default** to Soda’s AI/ML engine or to any external model provider. **AI features that require the use of source data are opt-in**.
* To answer questions and propose/perform actions, **Soda AI analyzes** the user's **natural-language request**, Soda **documentation content**, **dataset schemas**, and **non-identifiable column-level metadata** when needed.

{% if visitor.claims.plan === 'enterprise' || visitor.claims.plan === 'enterpriseUserBased' || visitor.claims.plan === 'datasetStandard' %}

* Soda AI features can be pointed at an OpenAI account you own by [**bringing your own key (BYOK)**](broken://pages/cS46FGE9oVlrTS48iaBE). Soda stores only a reference to the key, never the key itself.
  {% endif %}

{% if !(visitor.claims.plan === 'enterprise' || visitor.claims.plan === 'enterpriseUserBased' || visitor.claims.plan === 'datasetStandard') %}

* Soda AI features can be pointed at an OpenAI account you own by bringing your own key (BYOK). Soda stores only a reference to the key, never the key itself.
  {% endif %}

{% hint style="info" %}
**The data sent to each model provider depends on the feature** and is limited to what is strictly required by the individual tool being used. See each tool's page for details.
{% endhint %}
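
The default-off handling of source data described above can be illustrated with a small sketch. It is an assumption-laden illustration, not Soda's actual request format: `build_ai_context`, its parameters, and the dictionary keys are hypothetical names, and documentation content is left out for brevity.

```python
from typing import Any


def build_ai_context(
    request: str,
    schema: dict[str, str],
    column_metadata: dict[str, dict[str, Any]],
    sample_rows: list[dict[str, Any]],
    source_data_opt_in: bool = False,
) -> dict[str, Any]:
    """Assemble the context for a model call, excluding rows unless opted in."""
    context: dict[str, Any] = {
        "request": request,
        "schema": schema,
        "column_metadata": column_metadata,
    }
    # Row-level data is left out by default and included only on explicit opt-in.
    if source_data_opt_in:
        context["sample_rows"] = sample_rows
    return context


context = build_ai_context(
    request="Suggest checks for the orders dataset",
    schema={"order_id": "integer", "email": "varchar"},
    column_metadata={"order_id": {"missing_count": 0}},
    sample_rows=[{"order_id": 1, "email": "a@example.com"}],
)
print(sorted(context))  # ['column_metadata', 'request', 'schema']
```

Making the opt-in an explicit argument that defaults to `False` means a caller has to ask for source data deliberately; forgetting the flag leaves rows out.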

#### Diagnostics data remains local

If record-level diagnostics (such as `Failed Rows`) are enabled:

{% if visitor.claims.plan === 'enterprise' || visitor.claims.plan === 'enterpriseUserBased' || visitor.claims.plan === 'datasetStandard' %}

* They are stored **only in your own data warehouse(s)** via [Soda’s Diagnostics Warehouse](broken://pages/UZCAHd8ZBgl01336NkZa)
  {% endif %}

{% if !(visitor.claims.plan === 'enterprise' || visitor.claims.plan === 'enterpriseUserBased' || visitor.claims.plan === 'datasetStandard') %}

* They are stored **only in your own data warehouse(s)** via [Soda’s Diagnostics Warehouse](/diagnostics-warehouse.md)
  {% endif %}

* They are **not** stored in Soda Cloud

* They are **not accessible** to Soda’s AI/ML processing

### Security and compliance

All AI/ML processing is performed securely and in alignment with **Soda’s SOC 2 Type 2 compliance**:

* All processes and development align with information security, application security, and Software Development Life Cycle (SDLC) policies
* **Data in transit is encrypted** using industry standards
* **Data at rest is encrypted** using industry standards
* **Access** to AI features is governed by **existing Soda roles and permissions**
* Third-party providers (such as OpenAI) are **evaluated for compliance before adoption**

{% hint style="success" %}
Soda continuously monitors AI features via metrics, quality indicators and internal testing to ensure reliable operation at scale.
{% endhint %}

#### Confidentiality & access control

* Soda AI features **accessed through the chat interface** work within an **authenticated Soda Cloud session** only.
* Soda AI's actions are subject to your **existing Soda roles and permissions**. No feature can do anything you could not do yourself.
* Requests are **not used for model training**.
* Soda **does not store prompts or conversations**. Your prompts and conversation history are stored in your browser's local storage.

  <div data-gb-custom-block data-tag="hint" data-style="info" class="hint hint-info"><p>If you change browsers or open Soda Cloud on a different device, you will not have access to your conversation history.</p></div>

### AI guardrails and ethics

* Foundation models include built-in safety guardrails from the provider
* Additional **system-level guardrails** restrict responses to the intended Soda use cases
* Soda AI features do not generate harmful, discriminatory, or user-targeted content

Because Soda AI features do not evaluate users or make decisions, traditional bias and fairness risks are minimal.

***

{% if (visitor.claims.plan === 'datasetStandard')%}
{% hint style="success" %}
You are **logged in to Soda** and seeing the **Dataset Standard license** documentation. Learn more about [Documentation access & licensing](/reference/documentation-access-and-licensing.md).
{% endhint %}
{% endif %}

{% if (visitor.claims.plan === 'enterprise')%}
{% hint style="success" %}
You are **logged in to Soda** and seeing the **Team license** documentation. Learn more about [Documentation access & licensing](/reference/documentation-access-and-licensing.md).
{% endhint %}
{% endif %}

{% if (visitor.claims.plan === 'enterpriseUserBased')%}
{% hint style="success" %}
You are **logged in to Soda** and seeing the **Enterprise license** documentation. Learn more about [Documentation access & licensing](/reference/documentation-access-and-licensing.md).
{% endhint %}
{% endif %}

{% if !(visitor.claims.plan === 'enterprise' || visitor.claims.plan === 'enterpriseUserBased' || visitor.claims.plan === 'datasetStandard')%}
{% hint style="info" %}
You are **not logged in to Soda** and are viewing the default public documentation. Learn more about [Documentation access & licensing](/reference/documentation-access-and-licensing.md).

If you do have a Soda license, make sure to **log in to Soda Cloud in this same browser**.
{% endhint %}
{% endif %}

