Contract Autopilot

Learn more about Contract Autopilot, a Soda AI-powered feature that creates data contracts tailored to your data.

Contract Autopilot automatically generates an initial data contract, with recommended data quality checks, for each dataset you select, which speeds up onboarding in Soda Cloud. It is designed to give teams a working starting point quickly that they can refine later.

Watch Contract Autopilot in action

Generate a contract with Autopilot

1

Select datasets

On the Datasets page, select the datasets for which you want to generate a contract.

Then click Generate Contracts in the top right.

2

[Optional] Set a schedule

You can optionally define a schedule to run contract generation at a time of your choosing.

3

Generate

After you select the datasets, Soda shows a summary of the contracts that will be generated and a time estimate for generation. You can continue using Soda Cloud while contracts are being generated.

After reviewing what will be created, click on Generate [num] Contracts.

You can now open the generated contracts to review them, edit them, or suggest changes.

This process will create a scan for each selected dataset. You can cancel the scan or review the execution logs in the Scans tab on the dataset, as well as on the general Scans page.

Successful scan

  • The generated contract is published

    • A blue checkmark icon indicates that it is an automatically generated contract

    • After a user edits the contract, the icon changes to a green checkmark to indicate that it is now a user-generated contract.

  • A data contract history entry is published, indicating that Autopilot published a first version of the contract

  • [If configured] A "published contract" webhook is triggered (for example, to allow sync to Git)

Failed scan

  • The dataset is marked with the message "Contract generation has failed"

  • Scan logs can be reviewed to understand the underlying cause

  • The contract generation can be re-triggered after inspecting the root cause


Requirements & limitations

Autopilot is only available upon request. Contact us to enable it in your organization.

Autopilot is currently a Soda Cloud feature:

  • A Soda Agent is required

  • The CLI and Python API are not available

  • APIs are not available

Data sources

  • Supported on Snowflake, Databricks, and Postgres connections

Contract generation scope

Autopilot only generates data contracts for datasets that do not already have one. Datasets with an existing contract are not eligible for contract generation via Autopilot.

Privacy & security

Contract Autopilot is designed to minimize data exposure while generating meaningful contract recommendations.

The following data is used locally by Soda:

  • A sample of approximately 10,000 rows to compute profiling metrics

The following data is sent to Soda’s managed OpenAI API:

  • A sample of up to 100 rows

  • Computed profiling metrics

  • Data source type

  • Dataset schema
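To make the two sample sizes concrete, here is a minimal Python sketch of the data flow described above. It is an illustration only, not Soda's implementation: the function name `build_llm_payload`, the payload keys, and the two example metrics are assumptions. Only the row limits and the four categories of data sent come from this page.

```python
import random

PROFILING_SAMPLE_ROWS = 10_000  # approximate local profiling sample size (from this page)
LLM_SAMPLE_ROWS = 100           # upper bound on rows sent to the model (from this page)


def build_llm_payload(rows, schema, data_source_type, seed=0):
    """Hypothetical helper: profile locally, then assemble only what is sent out."""
    rng = random.Random(seed)
    # Step 1 (local): take a sample of about 10,000 rows for profiling.
    if len(rows) > PROFILING_SAMPLE_ROWS:
        profiling_sample = rng.sample(rows, PROFILING_SAMPLE_ROWS)
    else:
        profiling_sample = list(rows)

    # Step 2 (local): compute profiling metrics per column.
    # These two metrics are illustrative; the real metric set is not documented here.
    metrics = {}
    for column in schema:
        values = [row.get(column) for row in profiling_sample]
        non_null = [value for value in values if value is not None]
        metrics[column] = {
            "missing_fraction": 1 - len(non_null) / len(values) if values else 0.0,
            "distinct_count": len(set(non_null)),
        }

    # Step 3 (sent): at most 100 rows, plus metrics, data source type, and schema.
    return {
        "data_source_type": data_source_type,
        "schema": schema,
        "profiling_metrics": metrics,
        "row_sample": profiling_sample[:LLM_SAMPLE_ROWS],
    }
```

In this sketch the full profiling sample never appears in the returned payload; only the capped row sample and the aggregated metrics do.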

Access to Contract Autopilot is restricted to users with appropriate dataset permissions.

Learn more about Soda's AI security & privacy measures.


Next steps

After generation, you can:

  • Review generated checks

  • Validate assumptions with domain experts

  • Adjust schedules and thresholds

  • Iterate using Contract Copilot


Contract Autopilot vs. Create Contract

When you click Create Contract (or use the CLI generate command), Soda creates a contract skeleton based on the dataset schema. The skeleton includes columns only; no checks are added.

Contract Autopilot, on the other hand, analyzes sampled data and profiling metadata to automatically generate a fully populated contract with recommended checks.
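The difference between the two outputs can be sketched with plain Python dictionaries. This is a conceptual illustration, not Soda's contract file format or API: the function names, the dictionary keys, and the check names (`missing`, `duplicate`, `invalid`) are all assumptions chosen for readability.

```python
def contract_skeleton(dataset, schema):
    """Columns only, no checks: the shape described for Create Contract."""
    return {
        "dataset": dataset,
        "columns": [
            {"name": name, "data_type": data_type}
            for name, data_type in schema.items()
        ],
    }


def with_recommended_checks(skeleton, recommendations):
    """Return a copy of the skeleton with per-column checks attached.

    `recommendations` stands in for what Autopilot would derive from
    sampled data and profiling metadata; here it is supplied by hand.
    """
    columns = []
    for column in skeleton["columns"]:
        enriched = dict(column)
        checks = recommendations.get(column["name"], [])
        if checks:
            enriched["checks"] = list(checks)
        columns.append(enriched)
    return {**skeleton, "columns": columns}


# Hypothetical dataset and hand-written recommendations, for illustration only.
schema = {"order_id": "integer", "status": "varchar"}
skeleton = contract_skeleton("orders", schema)
populated = with_recommended_checks(
    skeleton,
    {
        "order_id": [{"type": "missing"}, {"type": "duplicate"}],
        "status": [{"type": "invalid", "valid_values": ["open", "shipped"]}],
    },
)
```

Here `skeleton` lists the two columns with no checks, while `populated` carries the same columns with checks attached, which mirrors the distinction between Create Contract and Contract Autopilot.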

