# Soda AI

**Soda** incorporates AI-powered, assistive features that help users **understand data**, **author data contracts**, and **accelerate onboarding** without replacing human decision-making.

Soda AI features are designed to:

* **Reduce manual effort** in defining and maintaining data quality expectations
* Help users **reason about data** and contracts faster
* **Operate safely** within strict privacy, security, and governance boundaries

Current Soda AI features include:

<table data-header-hidden><thead><tr><th width="284.39996337890625"></th><th></th></tr></thead><tbody><tr><td><a data-mention href="../data-testing/contract-autopilot">contract-autopilot</a></td><td>Creates a first version of a data contract.</td></tr><tr><td><a data-mention href="../data-testing/contract-copilot">contract-copilot</a></td><td>Helps edit and iterate on existing data contracts.</td></tr><tr><td><a data-mention href="soda-ai/askai">askai</a></td><td>Answers questions about Soda documentation.</td></tr></tbody></table>

## Soda AI principles

#### Assistive design

Soda AI features are designed to be assistive, not autonomous. They **do not make decisions** or take irreversible actions without user involvement. Outputs are always reviewable, editable, and attributable.

#### Narrow scope

All Soda AI capabilities are purpose-built for **technical data quality workflows**. They do not process personal user information or business context outside Soda.

***

## Soda's AI security & privacy measures

Soda's AI features are built with **privacy and security as first-class design principles**. The following safeguards apply to all AI/ML-powered features in Soda.

### Privacy and data usage

* Soda uses **third-party foundation models** (GPT-5 family).
* **API request data** sent to the model provider is **not retained for training**.
* **Soda does not use customer data**, metadata, or results from your environment to **train** or **fine-tune** external AI, LLM, or ML models.
* **Source data**, such as row-level data and/or PII, **is never sent by default** to Soda’s AI/ML engine or to any external model provider. **AI features that require the use of source data are opt-in**.

{% hint style="info" %}
**The data sent to each model provider depends on the feature** and is limited to what is strictly required by the individual tool being used. See each tool's page for details.
{% endhint %}

#### Diagnostics data remains local

If record-level diagnostics (such as `Failed Rows`) are enabled:

* They are stored **only in your own data warehouse(s)** via [Soda’s Diagnostics Warehouse](https://docs.soda.io/diagnostics-warehouse)
* They are **not** stored in Soda Cloud
* They are **not accessible** to Soda’s AI/ML processing

### Security and compliance

All AI/ML processing is performed securely and in alignment with **Soda’s SOC 2 Type 2 compliance**:

* All processes and development align with information security, application security, and Software Development Life Cycle (SDLC) policies
* **Data in transit is encrypted** using industry standards
* **Data at rest is encrypted** using industry standards
* **Access** to AI features is governed by **existing Soda roles and permissions**
* Third-party providers (such as OpenAI) are **evaluated for compliance before adoption**

{% hint style="success" %}
Soda continuously monitors AI features via metrics, quality indicators, and internal testing to ensure reliable operation at scale.
{% endhint %}

### AI guardrails and ethics

* Foundation models include built-in safety guardrails from the provider
* Additional **system-level guardrails** restrict responses to the intended Soda use cases
* Soda AI features do not generate harmful, discriminatory, or user-targeted content

Because Soda AI features do not evaluate users or make decisions, traditional bias and fairness risks are minimal.
