A data source can be partially onboarded programmatically via Soda Core. Once a data source is fully onboarded via Soda Cloud, the engineering team can onboard datasets programmatically.
In a nutshell, a dataset is pushed onto Soda Cloud by:
1. Creating a contract YAML that points to that dataset
2. Pushing the results of the contract onto Soda Cloud
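As a rough illustration of the first step, the Python sketch below writes a minimal contract file. The dataset path `my_datasource/my_db/public/orders`, the column names, and the file name `contract.yaml` are hypothetical. The YAML layout (a top-level `dataset` key plus a `columns` list) is an assumption about the contract format, so check the contract language reference for the exact schema.

```python
from pathlib import Path


def render_contract(dataset: str, columns: list[str]) -> str:
    """Return a minimal contract YAML document as text.

    Assumption: the contract identifies its dataset with a top-level
    `dataset` key and lists its columns under `columns`.
    """
    lines = [f"dataset: {dataset}", "columns:"]
    lines += [f"  - name: {column}" for column in columns]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical dataset path and column names, for illustration only.
    text = render_contract("my_datasource/my_db/public/orders", ["id", "status"])
    Path("contract.yaml").write_text(text, encoding="utf-8")
    print(text)
```

The file is built as plain text to keep the sketch free of third-party dependencies; a YAML library would be a sturdier choice for anything beyond a trivial contract.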
Soda does not support pushing a full data source configuration programmatically. This flow will create an empty data source that needs to be configured later in Soda Cloud.
Partially onboard a data source via CLI
This flow shows how to partially onboard a data source programmatically so that its setup can be completed in Soda Cloud.
Unlike the create command shown in the CLI reference, do not use the --use-agent flag when pushing a data source via CLI.
The --use-agent flag would attempt to find the data source in Soda Cloud, where it has not yet been set up.
When should I use the --use-agent flag?
In some organizations, the data source Admin can create the data source connection in Soda Cloud without onboarding any datasets. Then, engineers can create and publish new contracts to onboard datasets via CLI. Since the data source exists already in Soda Cloud, the --use-agent flag should be used so that the datasets are pushed from Soda Core onto the existing data source through the Agent.
3. Test, verify & publish the contract
Test that the contract is correct
Verify the contract
Publish the contract
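The three steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not the documented procedure: the subcommands and flags (`soda contract test`, `soda contract verify`, `soda contract publish`, `--contract`, `--data-source`, `--soda-cloud`, `--publish`) are assumed from the CLI reference and should be confirmed with `soda contract --help`, and the file names are hypothetical. The `use_agent` parameter mirrors the guidance above: leave it off while the data source does not yet exist in Soda Cloud.

```python
import shutil
import subprocess


def build_commands(contract: str, data_source: str, soda_cloud: str,
                   use_agent: bool = False) -> list[list[str]]:
    """Build the test, verify, and publish commands as argument lists.

    Assumption: subcommand and flag names follow the Soda CLI reference.
    """
    verify = ["soda", "contract", "verify",
              "--data-source", data_source,
              "--contract", contract,
              "--soda-cloud", soda_cloud,
              "--publish"]
    if use_agent:
        # Only appropriate when the data source already exists in Soda Cloud.
        verify.append("--use-agent")
    return [
        ["soda", "contract", "test", "--contract", contract],
        verify,
        ["soda", "contract", "publish",
         "--contract", contract, "--soda-cloud", soda_cloud],
    ]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    commands = build_commands("contract.yaml", "ds.yml", "sc.yml")
    if shutil.which("soda") is None:
        # The CLI is not installed: print the commands instead of running them.
        for command in commands:
            print(" ".join(command))
    else:
        run_all(commands)
```

Building the commands as lists and running them with `check=True` means a failed test or verification stops the script before anything is published.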
After successfully publishing the contract, you will
Complete onboarding via Soda Cloud
A data source must be connected to a Soda Agent before contract verification is available in Soda Cloud. The connection to the Soda Agent is configured in Soda Cloud.
If you attempt to use contract verification, Soda Cloud displays a warning.
If you attempt to access Metric Monitoring capabilities in Soda Cloud, a warning is displayed as well.