Soda AI

Discover Soda's AI-powered features that help improve your data quality, and learn about the privacy and security measures Soda applies when working with AI language models.

Soda incorporates AI-powered, assistive features that help users understand data, author data contracts, and accelerate onboarding without replacing human decision-making.

Soda AI features are designed to:

  • Reduce manual effort in defining and maintaining data quality expectations

  • Help users reason about data and contracts faster

  • Operate safely within strict privacy, security, and governance boundaries

Current Soda AI features include:

  • Creates a first version of a data contract.

  • Helps edit and iterate on existing data contracts.

  • Answers questions about Soda documentation.

Soda AI principles

Assistive design

Soda AI features are designed to be assistive, not autonomous. They do not make decisions or take irreversible actions without user involvement. Outputs are always reviewable, editable, and attributable.

Narrow scope

All Soda AI capabilities are purpose-built for technical data quality workflows. They do not process personal user information or business context outside Soda.


Soda's AI security & privacy measures

Soda's AI features are built with privacy and security as first-class design principles. The following safeguards apply to all AI/ML-powered features in Soda.

Privacy and data usage

  • Soda uses third-party foundation models (GPT-5 family).

  • API request data sent to the model provider is not retained for training.

  • Soda does not use customer data, metadata, or results from your environment to train or fine-tune external AI, LLM, or ML models.

  • Source data, such as row-level data and/or PII, is never sent to Soda’s AI/ML engine or to any external model provider by default.

The data sent to each model provider depends on the feature and is limited to what is strictly required by the individual tool being used. See more details on each specific tool page.

Diagnostics data remains local

If record-level diagnostics (such as Failed Rows) are enabled:

  • They are stored only in your own data warehouse(s) via Soda’s Diagnostics Warehouse

  • They are not stored in Soda Cloud

  • They are not accessible to Soda’s AI/ML processing

Security and compliance

All AI/ML processing is performed securely and in alignment with Soda’s SOC 2 Type 2 compliance:

  • All processes and development align with information security, application security, and Software Development Life Cycle (SDLC) policies

  • Data in transit is encrypted using industry standards

  • Data at rest is encrypted using industry standards

  • Access to AI features is governed by existing Soda roles and permissions

  • Third-party providers (such as OpenAI) are evaluated for compliance before adoption

AI guardrails and ethics

  • Foundation models include built-in safety guardrails from the provider

  • Additional system-level guardrails restrict responses to the intended Soda use cases

  • Soda AI features do not generate harmful, discriminatory, or user-targeted content

Because Soda AI features do not evaluate users or make decisions, traditional bias and fairness risks are minimal.

