When you run a scan, Soda SQL uses the configurations in your scan YAML file and in your Soda Cloud monitors to prepare SQL queries, then runs those queries against the data in your data source. The default tests and metrics that Soda SQL configured when it created the scan YAML file focus on finding missing, invalid, or unexpected data in your datasets.
Each scan requires the following as input:
- a warehouse YAML file, which represents a connection to your data source
- a scan YAML file, referenced by its filepath, which contains the metric and test instructions that Soda SQL uses to scan datasets in your data source
$ soda scan warehouse.yml tables/demodata.yml
To run the same scan against different data sources, proceed as follows.
- Prepare one warehouse YAML file for each data source you wish to scan. For example:
name: my_postgres_datawarehouse_dev
connection:
  type: postgres
  host: localhost
  port: '5432'
  username: env_var(POSTGRES_USERNAME)
  password: env_var(POSTGRES_PASSWORD)
  database: dev
  schema: public

name: my_postgres_datawarehouse_prod
connection:
  type: postgres
  host: dbhost.example.com
  port: '5432'
  username: env_var(POSTGRES_USERNAME)
  password: env_var(POSTGRES_PASSWORD)
  database: prod
  schema: public
- Prepare a scan YAML file to define all the tests you wish to run against your data sources. See Define tests for details.
- Run separate Soda SQL scans against each data source by specifying which warehouse YAML to scan and using the same scan YAML file. For example:
soda scan warehouse_postgres_dev.yml tables/my_dataset_scan.yml
soda scan warehouse_postgres_prod.yml tables/my_dataset_scan.yml
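If you scan more than a couple of data sources, the repeated commands above can be driven from a small shell loop. The following is a minimal sketch, not part of Soda SQL: `scan_all` is a hypothetical helper name, and it assumes only the `soda scan <warehouse YAML> <scan YAML>` form shown above, plus the assumption that `soda scan` returns a non-zero exit status when a scan does not succeed.

```shell
# scan_all is a hypothetical helper, not a Soda SQL command.
# Usage: scan_all <scan YAML> <warehouse YAML> [<warehouse YAML> ...]
scan_all() {
  scan_yml="$1"
  shift
  failed=0
  for warehouse_yml in "$@"; do
    echo "Scanning with ${warehouse_yml}"
    # Same positional order as above: warehouse YAML first, then scan YAML.
    if ! soda scan "$warehouse_yml" "$scan_yml"; then
      # Assumption: a non-zero status means this scan did not succeed.
      # Remember the failure, but keep scanning the remaining data sources.
      failed=1
    fi
  done
  return "$failed"
}
```

For example, `scan_all tables/my_dataset_scan.yml warehouse_postgres_dev.yml warehouse_postgres_prod.yml` runs the same scan YAML against both warehouse files in turn and returns a non-zero status if any of the scans did.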
Use a single scan YAML file to run tests on different datasets in your data source.
Prepare one scan YAML file to define the tests you wish to apply against multiple datasets. Use custom metrics to write SQL queries and subqueries that run against multiple datasets. When you run a scan, Soda SQL uses your SQL queries to query data in the datasets you specified in your scan YAML file.
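As an illustration of a custom metric that spans two datasets, the sketch below writes a scan YAML file from the shell. Treat it as an assumption-laden example rather than a reference: `write_multi_dataset_scan` is a hypothetical helper, the `orders` and `customers` tables and their columns are invented, and the `table_name`, `sql_metrics`, `sql`, and `tests` keys reflect the custom metrics syntax as commonly documented for Soda SQL, so check them against the Metrics documentation for your version.

```shell
# write_multi_dataset_scan is a hypothetical helper, not a Soda SQL command.
# Usage: write_multi_dataset_scan <path to scan YAML to create>
write_multi_dataset_scan() {
  scan_yml="$1"
  mkdir -p "$(dirname "$scan_yml")"
  # The quoted 'EOF' keeps the shell from expanding anything inside the YAML.
  cat > "$scan_yml" <<'EOF'
table_name: orders
sql_metrics:
  - sql: |
      SELECT COUNT(*) AS orders_without_customer
      FROM orders o
      LEFT JOIN customers c ON c.id = o.customer_id
      WHERE c.id IS NULL
    tests:
      - orders_without_customer == 0
EOF
}
```

After calling `write_multi_dataset_scan tables/multi_dataset_scan.yml`, you would run the scan in the usual way, for example `soda scan warehouse.yml tables/multi_dataset_scan.yml`. The query joins the two hypothetical tables, and the test fails when any order lacks a matching customer.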
- See Example tests by metric to learn more about defining tests.
- Learn How Soda SQL works.
- Learn more about Metrics.
- Learn how to apply dynamic filters to your scan.
- Need help? Join the Soda community on Slack.
Last modified on 16-Jul-21