Migrate from v3 to v4
This guide details the steps required to migrate datasets from v3 to v4 contracts.
The migration needs to be enabled by the Soda team. Contact us to ensure that the feature flag (v3ToV4DatasetMigrationEnabled) is enabled for your organization.
This action requires the global Admin role. An Admin can also enable it for users with the Manage Contract permission; see Enable dataset owners to migrate datasets from v3→v4.
Using Soda Cloud
Start the migration
The migration can be performed in two ways:
Bulk migration: On the datasets page, you can select multiple datasets to migrate them all at once.
Single migration: On a dataset page, you can migrate the dataset individually.


Once initiated, Soda guides you through each step of the process to ensure a safe and transparent migration.
Select a v4 data source
In the first step, you’ll select which v4 data source the dataset(s) should be migrated to.
Choose an existing v4 data source from the dropdown list.
Or click Add a new v4 Data Source to create a new one.
Click Next
You can only migrate v3 datasets to a data source of the same type. For example, from Snowflake to Snowflake.
The migration process in Soda Cloud requires a valid connection configuration to your data source through Soda Agent.

Configure migration settings
In this step, you can customize optional settings before running the migration.
Available configurations:
Contract Schedule: Set a schedule for automatic contract verification.
Migrate History: Include up to 90 days of historical check results in the migration.
Migrate Responsibilities: Migrate the dataset's ownership and assigned responsibilities.
Migrate Attributes: Migrate the dataset's attributes.
Once your settings are configured, click Continue to proceed.

Review migration
Before migrating, you’ll see a detailed summary of what will be migrated and what won’t. Each dataset displays:
Number of checks that were:
Successfully translated: The behavior of the new checks matches that in v3.
Translated with a warning: Translated, but not an exact one-to-one match. Review recommended.
Not translated: Checks are either not supported in v4 or cannot be translated. See Current limitations.
For each check with a warning or that was not translated, you can view the definition of the v3 check, along with an indicative remark explaining the reason or impact.

Click the eye icon to open a preview of the generated contract before finalizing the migration. The preview lets you verify the contract structure, filters, variables, and checks. You can also test the contract directly from this view to ensure everything runs as expected before confirming the migration.


Complete migration
After reviewing, use the checkboxes to deselect any datasets that are not ready for migration, then click Migrate to finalize the process.
This action starts a background process that migrates the selected datasets. Depending on the volume being migrated, the migration may take from a few seconds to a few minutes.

Once started, you can navigate to the v3 datasets to monitor progress. Refresh the page after a few moments to view the migration results.

Post-migration results
Once migration is complete:
Your original v3 dataset remains accessible but marked as migrated.
Your new v4 dataset is fully active and ready for contract-based validation.
The v3 dataset page shows its migration status as Completed migration.
In the v4 dataset view, you can now see:
Migrated checks under Dataset checks and Column checks.
Contract verification results and coverage metrics.
The option to edit or version your new data contract.

Using Soda CLI
Install the migration CLI
In a virtual environment, install the Soda migration package as well as the Soda Core package for your data source (see Data source reference for Soda Core).
The soda-migration package is installed from the private PyPI repository, which requires a Team or Enterprise license.
Need access to the PyPI repository? Please contact us.
Choose your organization host to install the migration package:
In addition to the migration package, you must install the Soda Core package that connects to your data source; see Data source reference for Soda Core. Soda needs this connection to generate the contract skeleton before it translates the existing checks.
Configuration
Create a migration.json file to map v3 dataset IDs to v4 DQNs (Dataset Qualified Names). This file defines which datasets to migrate and their corresponding new DQNs.
Example structure:
Learn more about DQNs in Dataset fully qualified name
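As a rough illustration of the mapping idea, the sketch below builds a list of v3-ID-to-DQN entries and writes it to a JSON file. It does not reproduce the official file format: the key names v3_dataset_id and v4_dqn are assumptions borrowed from the property names that appear in generated contracts, and the ID and DQN values are placeholders.

```python
import json
from pathlib import Path

# ASSUMPTION: the real schema of the mapping file is not shown in this guide.
# The keys "v3_dataset_id" and "v4_dqn" are hypothetical.


def build_mapping(pairs):
    """Turn (v3 dataset ID, v4 DQN) pairs into a list of mapping entries."""
    seen = set()
    entries = []
    for v3_id, dqn in pairs:
        # Each v3 dataset should be mapped exactly once.
        if v3_id in seen:
            raise ValueError(f"duplicate v3 dataset ID: {v3_id}")
        seen.add(v3_id)
        entries.append({"v3_dataset_id": v3_id, "v4_dqn": dqn})
    return entries


def write_mapping(pairs, path="migration.json"):
    """Serialize the mapping entries to a JSON file."""
    Path(path).write_text(json.dumps(build_mapping(pairs), indent=2))


# Placeholder values for illustration only.
example = build_mapping([
    ("<v3-dataset-id-1>", "my_datasource/my_db/my_schema/orders"),
])
```

Rejecting duplicate v3 IDs early keeps one dataset from being mapped to two different DQNs by accident.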
Retrieve the v3 dataset IDs
To retrieve the IDs of the v3 datasets you want to migrate from Soda Cloud:
Use the Soda Cloud API to fetch the datasets' information and their IDs.
The IDs can also be found in the dataset URL for a given dataset.
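If you collect IDs from dataset URLs, a small helper can pull them out consistently. This is a minimal sketch that assumes the ID is the path segment directly after a "datasets" segment; that URL layout and the example host are assumptions, so check them against a real dataset URL in your organization.

```python
from urllib.parse import urlparse


def dataset_id_from_url(url):
    """Return the path segment that follows 'datasets', or None if absent.

    ASSUMPTION: dataset URLs look like https://<host>/datasets/<id>/...
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Stop one short of the end so segments[i + 1] always exists.
    for i, segment in enumerate(segments[:-1]):
        if segment == "datasets":
            return segments[i + 1]
    return None


# Hypothetical URL for illustration only.
print(dataset_id_from_url("https://cloud.example.com/datasets/abc-123/checks"))
```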
Generate contracts
Run the following command, replacing the paths and file names with your setup. Note that this action requires a valid connection configuration to your data source and Soda Cloud.
Parameters
| Parameter | Required | Description |
| --- | --- | --- |
| --bulk-config-file | Yes | Path to the JSON configuration file (for example, migration.json) that maps v3 dataset IDs to v4 DQNs. |
| --output-directory | Yes | Directory where the generated v4 contracts will be saved. The structure within this location mirrors the DQN hierarchy. |
| --v4-data-source | Yes | Path to the local v4 data source YAML file (e.g., ds.yml). Used to retrieve schema metadata during contract generation. See Data source reference for Soda Core. |
| --soda-cloud | Yes | Configuration required to connect to Soda Cloud. |
| --schedule | No | Optional cron expression defining the schedule for the generated contracts (e.g., "0 0 * * *"). |
| --verbose | No | Enables verbose output for detailed logs during the migration process. |
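Before passing a value to --schedule, it can help to confirm that it has the five-field shape of the example above. The helper below is a loose shape check only, not Soda's own validation: it accepts digits and the characters * / , - in each field, and it rejects named days or months even where a scheduler might accept them.

```python
import re

# One cron field: digits plus the operators *, /, comma and hyphen.
_FIELD = re.compile(r"^[0-9*/,\-]+$")


def looks_like_cron(expression):
    """Return True if the string has five whitespace-separated cron-like fields."""
    fields = expression.split()
    return len(fields) == 5 and all(_FIELD.match(field) for field in fields)


for candidate in ["0 0 * * *", "*/15 6-18 * * 1,3", "0 0 * *", "every day"]:
    print(candidate, looks_like_cron(candidate))
```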
Review migration
Ensure that each generated contract includes:
Correct v4_dqn in the dataset property
Correct v3_dataset_id property referencing the v3 dataset
Correct columns and types
Checks migrated from sodacl into contracts checks
v3 check IDs present in the qualifier field for each check
Accurate check filters, expressions, and metadata
See Current limitations in this guide to learn which checks cannot yet be migrated automatically. Those checks can still be added manually, and their history can be migrated by setting the v3 check ID in the qualifier.
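Part of this review can be scripted once a contract has been loaded into a Python dictionary. The sketch below covers the structural items in the list above. It assumes a contract shape with top-level dataset, v3_dataset_id, columns, and checks keys, a data_type key per column, and checks written as single-key mappings from check type to settings; treat those names as assumptions and adjust them to the contracts you actually generate.

```python
def review_contract(contract, expected_dqn, expected_v3_id):
    """Return a list of problems found in a parsed contract (empty if none).

    ASSUMPTION: key names and check layout are illustrative, not authoritative.
    """
    problems = []
    if contract.get("dataset") != expected_dqn:
        problems.append("dataset does not match the expected v4 DQN")
    if contract.get("v3_dataset_id") != expected_v3_id:
        problems.append("v3_dataset_id does not match the v3 dataset")

    columns = contract.get("columns") or []
    if not columns:
        problems.append("no columns found")

    # Collect dataset-level checks first, then the checks under each column.
    checks = [("dataset", c) for c in (contract.get("checks") or [])]
    for column in columns:
        name = column.get("name", "<unnamed>")
        if not column.get("data_type"):
            problems.append(f"column {name} has no data type")
        checks += [(name, c) for c in (column.get("checks") or [])]

    # Every migrated check should carry a qualifier holding the v3 check ID.
    for scope, check in checks:
        for check_type, body in check.items():
            if not isinstance(body, dict) or not body.get("qualifier"):
                problems.append(f"{scope}: {check_type} check has no qualifier")
    return problems
```

This does not replace reading the filters and expressions yourself; it only catches missing properties quickly when many contracts are generated at once.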
Complete migration
Once contracts are verified, publish them to Soda Cloud with the following command:
Parameters
| Parameter | Required | Description |
| --- | --- | --- |
| --contract | Yes | Path to the folder containing the contracts, or to a specific contract file, to be migrated. Folders are traversed recursively. ⚠️ Note: If the folder includes contracts for datasets that have already been migrated, those datasets will be migrated again. |
| --soda-cloud | Yes | Configuration required to connect to Soda Cloud. |
| --migrate-test-results | No | Include test history from v3 in v4. A maximum of 90 days of history is migrated. Default is false. |
| --migrate-responsibilities | No | Copy responsibilities from v3 datasets to v4 datasets. Default is false. |
| --migrate-attributes | No | Copy attributes from v3 datasets to v4 datasets. Default is false. |
| --verbose | No | Enable verbose mode. |
Post-migration results
After publishing, confirm the following:
Logs: Check migration status (overall and per dataset).
UI Overview: Verify datasets in the Soda Cloud UI.
Checks: Ensure checks appear with full history.
Flags: Confirm presence of the following indicators:
v4 flag present
v3 link available
Migration completed flag present

Notes and recommendations
Migration does not delete v3 datasets; it only marks them as migrated. After the migration is complete, you must update your pipelines or agreements so that they stop executing v3 checks. You can then remove the v3 datasets, or remove the v3 data source to remove all of its datasets.
Verify that your v4 data source has a valid connection configuration before migrating. Soda connects to your data source to generate the contract structure.
Review the migrated contract in detail before finishing the migration. If necessary, you can re-run the migration; see Re-running a migration.
Enable dataset owners to migrate datasets from v3→v4
An organization Admin can enable users with Manage Contract permission to migrate datasets.
Dataset owners have the Manage Contract permission by default.
Click on your profile and navigate to Organization Settings

Under Dataset migration, check the option "Allow users with Manage Contract permission to migrate datasets"
This option is disabled by default.

Users with permission will now be able to see the migration tool:

View and filter migrated datasets
After migration, you can review and manage your datasets from the Datasets page in Soda Cloud.
Use the filters to easily identify datasets based on their migration status and version:
Migration status filter: View datasets that are Pending, In progress, or Completed migration.
Version filter: Filter by dataset version (v3 or v4) to focus on datasets still awaiting migration or already upgraded.
This makes it simple to track migration progress, validate completed transitions, and identify datasets that still require attention.

Re-running a migration
Once a dataset has been successfully migrated to v4, Soda prevents it from being migrated again. To re-run the migration, first delete the v4 dataset, then run the migration again.
How checks are matched
The migration process uses the qualifier field (see Check qualifiers) to identify which v3 checks should be migrated. The qualifier values are set to the v3 check IDs. Because the qualifier is part of the check's identity algorithm, do not change it after migration: doing so would result in a loss of history for those checks in Soda Cloud.
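The effect of changing a qualifier can be pictured with a toy identity function. This is not Soda's actual identity algorithm, which is not described here; it is a hypothetical hash over a few check attributes, meant only to show that any input that feeds the identity, including the qualifier, yields a different identity when it changes.

```python
import hashlib


def check_identity(dataset, column, check_type, qualifier):
    """Toy identity: a short hash over the attributes that define a check.

    ASSUMPTION: illustrative only; the real inputs and hashing are not documented here.
    """
    raw = "|".join([dataset, column or "", check_type, qualifier or ""])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]


# Placeholder values for illustration only.
original = check_identity("ds/db/schema/orders", "id", "missing", "<v3-check-id>")
renamed = check_identity("ds/db/schema/orders", "id", "missing", "my_new_qualifier")
print(original == renamed)  # False: history keyed by the old identity no longer matches
```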
Current limitations
Translation step
Check types
The following check types are not yet supported:
Any reconciliation checks
Reference check
Any checks using anomaly score or anomaly detection
Dataset filters
Dataset filters (see In-check vs. dataset filters) are currently not migrated.
Column casting
Data contracts do not support casting yet. When casting is detected in a check, the check is not translated and is excluded from the migration.
Variables
Variables in names
If variables are used in the column name, the check will not be translated and will be excluded from the migration.
Example:
Variables default values
Variables used in SodaCL are automatically added to the data contract, but without default values. You can add default values manually.
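To estimate ahead of time which v3 checks might fall under these limitations, you could scan their SodaCL text with a few heuristics. The sketch below is a rough pre-scan, not the translator's real logic: it relies on keyword matching and on the ${NAME} variable syntax, it does not detect casting, and it can produce false positives or miss cases.

```python
import re

VARIABLE = re.compile(r"\$\{[^}]+\}")
# Arguments of a metric call such as missing_count(<column>).
METRIC_ARGS = re.compile(r"\w+\(([^)]*)\)")


def migration_concerns(check_text):
    """Return heuristic labels for limitations a v3 check may run into."""
    concerns = []
    lowered = check_text.lower()
    if "reconciliation" in lowered:
        concerns.append("reconciliation check")
    if "must exist in" in lowered:
        concerns.append("reference check")
    if "anomaly score" in lowered or "anomaly detection" in lowered:
        concerns.append("anomaly check")
    # A variable inside the metric arguments suggests a variable column name.
    if any(VARIABLE.search(args) for args in METRIC_ARGS.findall(check_text)):
        concerns.append("variable in column name")
    return concerns


for text in ["row_count > 0", "missing_count(${COLUMN}) = 0", "anomaly score for row_count < default"]:
    print(text, "->", migration_concerns(text))
```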
