Additional settings

Test a contract on a sample

When testing a data contract, Soda allows you to run contract validation on a sample of your dataset instead of the full data. This feature helps you quickly and cost-efficiently verify that your contract runs correctly before executing full scans.

Running a test contract on a sample enables you to:

  • Validate that your contract syntax, checks, and filters work as expected.

  • Reduce data warehouse compute cost while verifying new or updated contracts.

  • Iterate faster on contract definitions in development environments.

Results from sampled runs reflect only a subset of your data and may not represent its actual quality. Use full verification once your contract logic is validated.
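
To make the caveat above concrete, here is a minimal, self-contained Python sketch of how a check can pass on a sample and still fail on the full dataset. It does not use Soda or reproduce how Soda selects sample rows. The in-memory `rows` data, the `missing_count` helper, and the "first N rows" sampling strategy are all assumptions made for illustration.

```python
# Hypothetical illustration only: this is not Soda's sampling implementation.
# Build 1,000 in-memory rows where missing emails appear only in the last 100.
rows = [{"id": i, "email": f"user{i}@example.com"} for i in range(1, 1001)]
for row in rows[900:]:
    row["email"] = None


def missing_count(records, column):
    """Count the records whose value for `column` is None."""
    return sum(1 for record in records if record[column] is None)


# Stands in for the sample size limit; taking the first N rows is an assumption.
SAMPLE_LIMIT = 100
sample = rows[:SAMPLE_LIMIT]

sample_missing = missing_count(sample, "email")
full_missing = missing_count(rows, "email")

print(f"sample: {sample_missing} missing of {len(sample)} rows")
print(f"full:   {full_missing} missing of {len(rows)} rows")
```

In this sketch a "no missing values" check would pass on the sample while the full dataset contains missing values, which is why a sampled run is best treated as a test of contract logic and not as a measure of data quality.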

Enable sampling for test contracts

Sampling is enabled at the data source level and applies to all datasets that use that connection.

To enable this feature:

  1. Go to Data sources.

  2. Click Edit connection for a data source.

  3. Under the Connection Details section, toggle Data Sampling.

  4. Specify your sample size in the Limit field.

  5. Click Connect.


Optimize computing with multiple warehouses

When connecting to Snowflake, you must provide a warehouse as part of the data source configuration. By default, this single warehouse is used for all operations, including discovery, metric monitoring, profiling, data contract executions, and Diagnostics Warehouse operations.

The Configure warehouses per dataset feature gives you greater control and flexibility by allowing you to define specific warehouses for individual datasets. This helps you optimize cost, manage compute workloads, and allocate resources efficiently across your data operations.

This feature is available only when using Soda Agent. When using Soda Core, the warehouse can be specified directly in the connection YAML instead.

Enable the use of multiple warehouses

  1. Go to Data sources in Soda Cloud.

  2. Click Edit connection for your Snowflake data source.

  3. Toggle on Configure Warehouses.

  4. Specify the list of allowed warehouses that can be used by this connection.

  5. Choose a default warehouse to use for all datasets unless otherwise specified.

  6. Click Save in the top right to save your configuration.

Default warehouse behavior

Once enabled:

  • The warehouse specified in the data source connection is used for discovery.

  • The default warehouse (defined under Configure Warehouses) is used for:

    • Metric monitoring

    • Profiling

    • Data contract executions

    • Diagnostics Warehouse operations

  • A different warehouse can be configured at the dataset level, overriding the default.
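
The precedence described above can be sketched as a small lookup function. This is a hedged Python model of the rules as stated in this section, not Soda's implementation. The `WarehouseConfig` class, the `resolve_warehouse` function, the operation names, and the warehouse names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class WarehouseConfig:
    """Hypothetical model of the settings described in this section."""
    connection_warehouse: str   # warehouse set in the data source connection
    default_warehouse: str      # default chosen under Configure Warehouses
    dataset_overrides: Dict[str, str] = field(default_factory=dict)


def resolve_warehouse(config: WarehouseConfig, operation: str,
                      dataset: Optional[str] = None) -> str:
    """Return the warehouse an operation would use under the stated rules."""
    # Discovery always uses the warehouse from the data source connection.
    if operation == "discovery":
        return config.connection_warehouse
    # A dataset-level setting overrides the default warehouse.
    if dataset is not None and dataset in config.dataset_overrides:
        return config.dataset_overrides[dataset]
    # All other operations fall back to the default warehouse.
    return config.default_warehouse


config = WarehouseConfig(
    connection_warehouse="CONNECTION_WH",
    default_warehouse="DEFAULT_WH",
    dataset_overrides={"orders": "LARGE_WH"},
)

print(resolve_warehouse(config, "discovery"))
print(resolve_warehouse(config, "profiling", dataset="customers"))
print(resolve_warehouse(config, "contract_verification", dataset="orders"))
```

Here `orders` has a dataset-level override, so its contract run resolves to `LARGE_WH`, while `customers` falls back to `DEFAULT_WH` and discovery stays on `CONNECTION_WH`.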

Specify a warehouse at the dataset level

  1. Go to a dataset in Soda Cloud.

  2. Click Edit dataset.

  3. Under the Snowflake section, select the warehouse to use for this dataset.

  4. Click Save to apply your changes.
