Via Soda Core

See the Soda Core data source reference for details on supported data sources and connection parameters.

A data source can be partially onboarded programmatically via Soda Core. Once a data source is fully onboarded via Soda Cloud, the engineering team can onboard datasets programmatically.

In a nutshell, a dataset is pushed onto Soda Cloud by:

  1. Creating a contract YAML that points to that dataset

  2. Pushing the results of the contract onto Soda Cloud

Partially onboard a data source via CLI

This flow partially onboards a data source programmatically so that setup can then be completed in Soda Cloud.

Step 1: Set up your connections

Follow the CLI reference to:

  1. Perform installation steps

Step 2: Create a contract

Run the following:

    soda contract create --dataset datasource/db/schema/table --file contract.yaml --data-source ds_config.yml --soda-cloud sc_config.yml

When should I use the --use-agent flag?

In some organizations, the data source Admin creates the data source connection in Soda Cloud without onboarding any datasets, and engineers then create and publish contracts to onboard datasets via the CLI. Because the data source already exists in Soda Cloud, use the --use-agent flag so that the datasets are pushed from Soda Core onto the existing data source through the Agent.
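The --dataset value in the create command is a slash-separated path (datasource/db/schema/table). The sketch below shows one way to assemble that path from its parts in POSIX shell. The helper name dataset_path is hypothetical, the example names are taken from the sample output later on this page, and the final command is only printed, not run.

```shell
# Join a data source name, its prefixes (database, schema), and a table
# name into the slash-separated path expected by --dataset.
# dataset_path is a hypothetical helper, not part of the Soda CLI.
dataset_path() (
  IFS='/'
  printf '%s\n' "$*"
)

# Example names borrowed from the sample output on this page.
DATASET="$(dataset_path CLI_testing postgres aldi_local retail_orders)"

# Print the command for review instead of executing it.
echo soda contract create --dataset "$DATASET" --file contract.yaml \
  --data-source ds_config.yml --soda-cloud sc_config.yml
```

Removing the leading echo would run the command, assuming the soda CLI is installed and both configuration files exist.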

Step 3: Test, verify & publish the contract

  1. Test that the contract is correct

    soda contract test --contract contract.yaml
  2. Verify the contract

    soda contract verify --contract contract.yaml --data-source ds_config.yml
  3. Publish the contract

    soda contract publish --contract contract.yaml --soda-cloud sc_config.yml
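The three commands above can be chained so that a later step runs only if the earlier one succeeded. This is a minimal sketch, assuming the soda CLI exits with a non-zero status when a step fails. SODA_BIN and run_contract_pipeline are hypothetical names introduced here for illustration.

```shell
# SODA_BIN is a hypothetical override that makes a dry run possible;
# it defaults to the real soda executable.
SODA_BIN="${SODA_BIN:-soda}"

# Run test, verify, and publish in order, stopping at the first failure.
# $1 = contract file, $2 = data source config, $3 = Soda Cloud config.
# Assumption: each soda command returns non-zero when it fails.
run_contract_pipeline() {
  "$SODA_BIN" contract test --contract "$1" &&
  "$SODA_BIN" contract verify --contract "$1" --data-source "$2" &&
  "$SODA_BIN" contract publish --contract "$1" --soda-cloud "$3"
}
```

Call it as run_contract_pipeline contract.yaml ds_config.yml sc_config.yml. Setting SODA_BIN=echo beforehand prints the three argument lists without contacting the data source or Soda Cloud.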

After successfully publishing the contract, you will see output similar to the following:

  __|  _ \|  \   \
\__ \ (   |   | _ \
____/\___/___/_/  _\ CLI v4.0.4b24
Fetching datasets configurations from Soda Cloud for datasets '[DatasetIdentifier(data_source='CLI_testing', prefixes=['postgres', 'aldi_local'], dataset='retail_orders')]'
Verifying contract 📜 contract.yaml 🤞

### Contract results for CLI_testing/postgres/aldi_local/retail_orders
+-----------------+-----------------------------------+-------------------------------+-----------+---------------+   
| Column          | Check                             | Threshold                     | Outcome   | Diagnostics   |   
+=================+===================================+===============================+===========+===============+   
| [dataset-level] | Schema matches expected structure | level: fail                   | ✅ PASSED |               |
|                 |                                   | must be less than or equal: 0 |           |               |   
+-----------------+-----------------------------------+-------------------------------+-----------+---------------+   
# Summary:
|----------------|---|----|
| Checks         | 1 |    |
| Passed         | 1 | ✅ |
| Failed         | 0 | ✅ |
| Warned         | 0 | ✅ |
| Not Evaluated  | 0 | ✅ |
| Excluded       | 0 | ✅ |
| Runtime Errors | 0 | ✅ |

👌 Results sent to Soda Cloud
To view the dataset on Soda Cloud, see https://cloud.us.soda.io/o/<datasetID>/datasets/ab1bc55d-c49a-441e-a0dc-c01857c71b21
Updating post processing stage 'diagnosticWarehouse' to state 'completed' for scan <scanID>
Updated post processing stage 'diagnosticWarehouse' to state 'completed' for scan <scanID>
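If a script needs the individual counts from the summary table, they can be read from saved output. The sketch below is one way to do that with awk. It assumes the rows keep the "| Label | count | ... |" layout shown in the sample above, which may differ between CLI versions. summary_count is a hypothetical helper, not a Soda command.

```shell
# Print the count from one row of the "# Summary:" table.
# $1 is the row label (for example "Failed"); CLI output is read from stdin.
# Assumes rows look like "| Failed         | 0 | ... |", as in the sample.
summary_count() {
  awk -F'|' -v label="$1" '
    {
      name = $2
      gsub(/^ +| +$/, "", name)   # trim padding around the label
    }
    name == label {
      count = $3
      gsub(/ /, "", count)        # trim padding around the number
      print count
      exit
    }
  '
}

# Example with two rows trimmed from the sample output; prints 0.
printf '%s\n' '| Passed         | 1 |' '| Failed         | 0 |' | summary_count Failed
```

With the output saved to a file, for example a hypothetical verify.log, the same helper can be used as summary_count Failed < verify.log.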

Complete onboarding via Soda Cloud

A data source must be connected to a Soda Agent before contract verification is available in Soda Cloud. The connection to the Soda Agent is set up in Soda Cloud.

  • If you attempt to use contract verification, you will find the following warning:

  • If you attempt to access Metric Monitoring capabilities in Soda Cloud, you will find the following warning:
