
Migrate from v3 to v4

This guide details the steps required to migrate datasets from v3 to v4 contracts.

The migration needs to be enabled by the Soda team. Contact us to ensure that the feature flag (v3ToV4DatasetMigrationEnabled) is enabled for your organization.

Using Soda Cloud

Start the migration

The migration can be performed in two ways:

  1. Bulk migration: On the datasets page, you can select multiple datasets to migrate them all at once.

  2. Single migration: On a dataset page, you can migrate the dataset individually.

Once initiated, Soda guides you through each step of the process to ensure a safe and transparent migration.

Select a v4 data source

In the first step, you’ll select which v4 data source the dataset(s) should be migrated to.

  1. Choose an existing v4 data source from the dropdown list.

  2. Or click Add a new v4 Data Source to create a new one.

  3. Click Next.

Configure migration settings

In this step, you can customize optional settings before running the migration.

Available configurations:

  • Contract Schedule: Set a schedule for automatic contract verification.

  • Migrate History: Include up to 90 days of historical check results in the migration.

  • Migrate Responsibilities: Migrate the dataset's ownership and assigned responsibilities.

  • Migrate Attributes: Migrate the dataset's attributes.

Once your settings are configured, click Continue to proceed.

Review migration

Before migrating, you’ll see a detailed summary of what will be migrated and what won’t. Each dataset displays:

  • Number of checks that were:

    • Successfully translated: The behavior of the new checks matches that in v3.

    • Translated with a warning: Translated, but not an exact one-to-one match. Review recommended.

    • Not translated: Checks are either not supported in v4 or cannot be translated. See Current limitations.

  • For each check with a warning or that was not translated, you can view the definition of the v3 check, along with an indicative remark explaining the reason or impact.

Click the eye icon to open a preview of the generated contract before finalizing the migration. The preview lets you verify the contract structure, filters, variables, and checks. You can also test the contract directly from this view to ensure everything runs as expected before confirming the migration.

Complete migration

After reviewing, use the checkboxes to deselect any datasets that are not ready for migration, then click Migrate to finalize the process.

This action starts a background process that migrates the selected datasets. Depending on the volume being migrated, the migration may take a few seconds to a few minutes to complete.

Once started, you can navigate to the v3 datasets to monitor progress. Refresh the page after a few moments to view the migration results.

Post-migration results

Once migration is complete:

  • Your original v3 dataset remains accessible but is marked as migrated.

  • Your new v4 dataset is fully active and ready for contract-based validation.

The v3 dataset page shows its migration status as Completed migration.

In the v4 dataset view, you can now see:

  • Migrated checks under Dataset checks and Column checks.

  • Contract verification results and coverage metrics.

  • The option to edit or version your new data contract.

Using Soda CLI

Install the migration CLI

In a virtual environment, install the Soda migration package as well as the Soda Core package for your data source (see Data source reference for Soda Core).

Choose your organization host to install the migration package:

The data source package is required because Soda connects to your data source to generate the contract skeleton before translating existing checks.

Configuration

Create a migration.json file to map v3 dataset IDs to v4 DQNs (Dataset Qualified Names). This file defines which datasets to migrate and their corresponding new DQNs. Example structure:

Learn more about DQNs in Dataset fully qualified name.

Retrieve the v3 dataset IDs

To retrieve the IDs of the v3 datasets you want to migrate from Soda Cloud:

  • Use to fetch the datasets' information and their IDs

  • The IDs can also be found in the dataset URL for a given dataset.

Example Python script to generate the configuration file
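A minimal sketch of such a script is shown here. It assumes you have already collected the v3 dataset IDs and decided on the target DQNs. The wrapper key `datasets`, the per-entry key names (borrowed from the `v3_dataset_id` and `v4_dqn` contract properties mentioned in this guide), the placeholder IDs, and the DQN format are all assumptions for illustration; check the expected file structure before using the output.

```python
import json
from pathlib import Path

# Hypothetical input: v3 dataset IDs paired with their target v4 DQNs.
# Replace these placeholders with real IDs from Soda Cloud and real DQNs.
DATASETS = {
    "v3-dataset-id-1": "my_datasource/my_db/public/customers",
    "v3-dataset-id-2": "my_datasource/my_db/public/orders",
}


def build_config(mapping):
    """Build a JSON-serializable mapping of v3 dataset IDs to v4 DQNs.

    The key names used here are assumptions, not a confirmed schema.
    """
    entries = []
    for v3_id, dqn in sorted(mapping.items()):
        if not v3_id or not dqn:
            raise ValueError(f"Empty ID or DQN in entry: {v3_id!r} -> {dqn!r}")
        entries.append({"v3_dataset_id": v3_id, "v4_dqn": dqn})
    return {"datasets": entries}


def write_config(mapping, path):
    """Write the configuration to disk and return the path written."""
    path = Path(path)
    path.write_text(json.dumps(build_config(mapping), indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    print(f"Wrote {write_config(DATASETS, 'migration.json')}")
```

Sorting the entries keeps the generated file stable between runs, which makes it easier to review changes in version control.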

Generate contracts

Run the following command, replacing the paths and file names with your setup. Note that this action requires a valid connection configuration to your data source and Soda Cloud.

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| --bulk-config-file | Yes | Path to the JSON configuration file (bulk.json) that maps v3 dataset IDs to v4 DQNs. |
| --output-directory | Yes | Directory where the generated v4 contracts will be saved. The structure within this location mirrors the DQN hierarchy. |
| --v4-data-source | Yes | Path to the local v4 data source YAML file (e.g., ds.yml). Used to retrieve schema metadata during contract generation. See Data source reference for Soda Core. |
| --soda-cloud | Yes | Configuration required to connect to Soda Cloud. |
| --schedule | No | Optional cron expression defining the schedule for the generated contracts (e.g., "0 0 * * *"). |
| --verbose | No | Enables verbose output for detailed logs during the migration process. |
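Because --schedule takes a cron expression, a quick sanity check before running the command can catch malformed values early. The helper below is a shallow sketch, assuming the standard five-field cron format used in the example above. The function name is hypothetical, and it deliberately rejects named values such as MON or JAN, so treat a failure as a prompt to double-check rather than as a definitive verdict.

```python
def looks_like_cron(expression: str) -> bool:
    """Shallow check for a five-field cron expression.

    Only verifies that there are exactly five whitespace-separated fields
    made of digits and the characters * / , -. It does not validate ranges.
    """
    allowed = set("0123456789*/,-")
    fields = expression.split()
    return len(fields) == 5 and all(set(field) <= allowed for field in fields)


if __name__ == "__main__":
    for candidate in ("0 0 * * *", "*/15 8-18 * * 1-5", "0 0 * *"):
        print(f"{candidate!r}: {looks_like_cron(candidate)}")
```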

Review migration

Ensure that each generated contract includes:

  • Correct v4_dqn in the dataset property

  • Correct v3_dataset_id property referencing the v3 dataset

  • Correct columns and types

  • Checks migrated from SodaCL into contract checks

  • v3 check IDs present in the qualifier field for each check

  • Accurate check filters, expressions, and metadata

See Current limitations to find out which checks cannot be migrated automatically yet. You can still add those checks manually, and migrate their history by setting the v3 check ID as the qualifier.
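Part of the review checklist can be automated with a rough text scan over the output directory. The sketch below assumes the generated contracts are YAML files (.yml or .yaml) and that the property names from the checklist appear literally in the file text. Both are assumptions about the generated layout, and the directory name `contracts` is a placeholder. It only flags files that lack an expected marker; it does not replace reading the contract.

```python
from pathlib import Path

# Markers taken from the review checklist. Treating them as literal text in
# the contract file is an assumption about the generated YAML layout.
EXPECTED_MARKERS = ("dataset:", "v3_dataset_id", "columns:", "checks:", "qualifier")


def missing_markers(contract_text: str) -> list[str]:
    """Return the expected markers that do not occur in the contract text."""
    return [marker for marker in EXPECTED_MARKERS if marker not in contract_text]


def scan_contracts(output_directory: str) -> dict[str, list[str]]:
    """Map each contract file (relative path) to its missing markers, if any."""
    root = Path(output_directory)
    report = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in {".yml", ".yaml"}:
            missing = missing_markers(path.read_text(encoding="utf-8"))
            if missing:
                report[str(path.relative_to(root))] = missing
    return report


if __name__ == "__main__":
    # "contracts" stands in for the value passed to --output-directory.
    for name, missing in scan_contracts("contracts").items():
        print(f"{name}: missing {', '.join(missing)}")
```

An empty report means every file contains all the markers. Column types, filters, and expressions still need a manual read.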

Complete migration

Once contracts are verified, publish them to Soda Cloud with the following command:

Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| --contract | Yes | Specifies the path to the folder containing the contracts or a specific contract file to be migrated. This parameter supports recursive directory traversal. ⚠️ Note: If the folder includes contracts for datasets that have already been migrated, those datasets will be migrated again. |
| --soda-cloud | Yes | Configuration required to connect to Soda Cloud. |
| --migrate-test-results | No | Include test history from v3 to v4. A maximum of 90 days of history is migrated. Default is false. |
| --migrate-responsibilities | No | Copy responsibilities from v3 datasets to v4 datasets. Default is false. |
| --migrate-attributes | No | Copy attributes from v3 datasets to v4 datasets. Default is false. |
| --verbose | No | Enable verbose mode. |

Post-migration results

After publishing, confirm the following:

  • Logs: Check migration status (overall and per dataset).

  • UI Overview: Verify datasets in the Soda Cloud UI.

  • Checks: Ensure checks appear with full history.

  • Flags: Confirm presence of the following indicators:

    • v4 flag present

    • v3 link available

    • Migration completed flag present

Notes and recommendations

  • Migration does not delete v3 datasets; it simply marks them as migrated. Once migration is completed, you need to update your pipelines or agreements to stop executing v3 checks. You can then remove the v3 datasets individually, or remove the v3 data source to remove all of its datasets.

  • Verify that your v4 data source has a valid connection configuration before migrating. Soda connects to your data source to generate the contract structure.

  • Review the migrated contract in detail before finishing the migration. If necessary, you can re-run the migration; see Re-running a migration.

Enable dataset owners to migrate datasets from v3→v4

An organization Admin can enable users with Manage Contract permission to migrate datasets.

Dataset owners have the Manage Contract permission by default.

  1. Click on your profile and navigate to Organization Settings.

  2. Under Dataset migration, check the option "Allow users with Manage Contract permission to migrate datasets".

This option is disabled by default.

Users with the Manage Contract permission can now see the migration tool.

View and filter migrated datasets

After migration, you can review and manage your datasets from the Datasets page in Soda Cloud.

Use the filters to easily identify datasets based on their migration status and version:

  • Migration status filter: View datasets that are Pending, In progress, or Completed migration.

  • Version filter: Filter by dataset version (v3 or v4) to focus on datasets still awaiting migration or already upgraded.

This makes it simple to track migration progress, validate completed transitions, and identify datasets that still require attention.

Re-running a migration

Once a dataset has been successfully migrated to v4, Soda blocks it from being migrated again. To re-run the migration, delete the v4 dataset and then run the migration again.

How checks are matched

The migration process uses the qualifier field (see Check qualifiers) to identify which v3 checks should be migrated. The qualifier values are set to the v3 check IDs. Because the qualifier is part of the check's identity algorithm, it is important not to change the qualifier after migration. Changing it would result in a loss of history for the checks in Soda Cloud.
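To see why changing a qualifier loses history, consider a toy identity function. This guide does not describe Soda's actual identity algorithm, so the fields, the hashing scheme, and all values below are purely hypothetical. The sketch only illustrates the general principle: when the qualifier is one of the inputs to a check's identity, a new qualifier yields a different identity, and results stored under the old identity are no longer associated with the check.

```python
import hashlib


def check_identity(dataset: str, column: str, check_type: str, qualifier: str) -> str:
    """Hypothetical identity: a short hash over fields that include the qualifier.

    This is NOT Soda's algorithm; it only illustrates the principle.
    """
    raw = "|".join([dataset, column, check_type, qualifier])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]


# Placeholder values for illustration only.
original = check_identity("ds/db/public/orders", "id", "missing", "v3-check-123")
unchanged = check_identity("ds/db/public/orders", "id", "missing", "v3-check-123")
renamed = check_identity("ds/db/public/orders", "id", "missing", "my-new-qualifier")

print(original == unchanged)  # True: same qualifier, same identity
print(original == renamed)    # False: new qualifier, new identity
```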

Current limitations

Translation step

Check types

The following check types are not yet supported:

Dataset filters

Dataset filters (see In-check vs. dataset filters) are currently not migrated.

Column casting

Data contracts do not support casting yet. When casting is detected in a check, the check will not be translated and will be excluded from the migration.

Variables

Variables in names

If variables are used in the column name, the check will not be translated and will be excluded from the migration.

Example:

Variables default values

Variables used in SodaCL are automatically added to the data contract, but without a default value. You can add default values manually.
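To prepare the list of variables that will need a default value after migration, you can scan your existing SodaCL files before migrating. The sketch below assumes variables are written as `${NAME}`. The sample checks and variable names are illustrative only and are not taken from a real project.

```python
import re

# Matches ${NAME}, allowing optional whitespace inside the braces.
VARIABLE_PATTERN = re.compile(r"\$\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}")


def find_variables(sodacl_text: str) -> list[str]:
    """Return the variable names used in the text, in first-seen order, without duplicates."""
    seen = []
    for name in VARIABLE_PATTERN.findall(sodacl_text):
        if name not in seen:
            seen.append(name)
    return seen


# Illustrative SodaCL snippet; the checks and names are placeholders.
SAMPLE = """
checks for orders:
  - row_count > ${MIN_ROWS}
  - missing_count(customer_id) = 0:
      filter: country = '${COUNTRY}'
"""

print(find_variables(SAMPLE))  # ['MIN_ROWS', 'COUNTRY']
```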

