✔ Soda Checks Language (SodaCL): a human-readable, domain-specific language for data reliability
✔ Compatible with Soda Core and Soda Cloud
✔ Includes 25+ built-in metrics, plus the ability to write your own SQL queries
✔ Includes checks with change-over-time thresholds to gauge how metrics change over time
✔ Collaborate with your team to write SodaCL checks in a YAML file
Example checks
# Checks for basic validations
checks for dim_customer:
  - row_count between 10 and 1000
  - missing_count(birth_date) = 0
  - invalid_percent(phone) < 1 %:
      valid format: phone number
  - invalid_count(number_cars_owned) = 0:
      valid min: 1
      valid max: 6
  - duplicate_count(phone) = 0

checks for dim_product:
  - avg(safety_stock_level) > 50
  # Checks for schema changes
  - schema:
      name: Find forbidden, missing, or wrong type
      warn:
        when required column missing: [dealer_price, list_price]
        when forbidden column present: [credit_card]
        when wrong column type:
          standard_cost: money
      fail:
        when forbidden column present: [pii*]
        when wrong column index:
          model_name: 22
  # Check for freshness
  - freshness (start_date) < 1d

# Check for referential integrity
checks for dim_department_group:
  - values in (department_group_name) must exist in dim_employee (department_name)
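
The examples above do not show the custom SQL queries and change-over-time thresholds mentioned in the feature list. The following is a minimal sketch of how such checks might look, not a definitive reference. The metric name `low_stock_count` and the threshold values are hypothetical and chosen only for illustration. The dataset and column names are borrowed from the examples above.

```yaml
checks for dim_product:
  # User-defined metric: `low_stock_count` is a hypothetical name.
  # Its value comes from the SQL query defined under the check.
  - low_stock_count = 0:
      low_stock_count query: |
        SELECT COUNT(*)
        FROM dim_product
        WHERE safety_stock_level < 10

checks for dim_customer:
  # Change-over-time: compare row_count to the previous measurement
  - change for row_count < 50
  # Compare row_count to the average of the last 7 measurements
  - change avg last 7 for row_count < 50
```

Change-over-time checks compare a metric against earlier measurements, so they only produce a meaningful result once historical measurements are available; this sketch assumes they are stored in Soda Cloud.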
Last modified on 31-May-23