For each
Use a for each configuration to execute checks against multiple datasets during a scan.
for each dataset T:
  datasets:
    - dim_products%
    - fact%
    - exclude fact_survey_response
  checks:
    - row_count > 0
Configure for each
Add a for each section to your checks YAML file to specify a list of checks you wish to execute on multiple datasets.
- Add a for each dataset T section header anywhere in your YAML file. The purpose of the T is only to ensure that every for each configuration has a unique name.
- Nested under the section header, add two nested keys, one for datasets and one for checks.
- Nested under datasets, add a list of datasets against which to run the checks. Refer to the example below that illustrates how to use include and exclude configurations and wildcard characters (%).
- Nested under checks, write the checks you wish to execute against all the datasets listed under datasets.
for each dataset T:
  datasets:
    # include the dataset
    - dim_customers
    # include all datasets matching the wildcard expression
    - dim_products%
    # (optional) explicitly add the word include to make the list more readable
    - include dim_employee
    # exclude a specific dataset
    - exclude fact_survey_response
    # exclude any datasets matching the wildcard expression
    - exclude prospective_%
  checks:
    - row_count > 0
- Soda Core matches dataset names case-insensitively.
- If any of your checks specify column names as arguments, make sure the column exists in all datasets listed under the datasets heading.
- To add multiple for each configurations to your checks YAML file, configure another for each section header with a different letter identifier, such as for each dataset R.
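As a minimal sketch of the last two points, a checks YAML file with two for each configurations might look like the following. The dataset names are reused from the examples above. The column name id is hypothetical and is assumed to exist in every dataset that the R configuration matches.

```yaml
# First configuration, identified by the letter T
for each dataset T:
  datasets:
    - dim_customers
    - dim_products%
  checks:
    - row_count > 0

# Second configuration, identified by a different letter (R)
for each dataset R:
  datasets:
    - fact%
    - exclude fact_survey_response
  checks:
    # id is a hypothetical column; it must exist in all datasets matched above
    - missing_count(id) = 0
```

Keeping column-based checks in their own configuration limits them to datasets that are known to share that column, while the generic row_count check can run against a broader list.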
Go further
- Need help? Join the Soda community on Slack.
Last modified on 01-Jul-22