Link Search Menu Expand Document


✔ Human-readable, domain-specific language for data reliability

✔ Compatible with Soda Core and Soda Cloud

✔ Includes 25+ built-in metrics, plus the ability to include SQL queries

✔ Includes checks with dynamic thresholds to gauge changes to metrics over time

✔ Collaborate with your team to write SodaCL checks in a YAML file

Example checks

# Checks for basic validations
checks for dim_customer:
  - row_count between 10 and 1000
  - missing_count(birth_date) = 0
  - invalid_percent(phone) < 1 %:
      valid format: phone number
  - invalid_count(number_cars_owned) = 0:
      valid min: 1
      valid max: 6
  - duplicate_count(phone) = 0

checks for dim_product:
  - avg(safety_stock_level) > 50
# Checks for schema changes
  - schema:
      name: Find forbidden, missing, or wrong type
        when required column missing: [dealer_price, list_price]
        when forbidden column present: [credit_card]
        when wrong column type:
          standard_cost: money
        when forbidden column present: [pii*]
        when wrong column index:
          model_name: 22

# Check for freshness 
  - freshness (start_date) < 1d

# Check for referential integrity
checks for dim_department_group:
  - values in (department_group_name) must exist in dim_employee (department_name)

Last modified on 01-Jul-22

Was this documentation helpful?
Share feedback in the Soda community on Slack.

Help improve our docs!

  • Request a docs change.
  • Edit this page in our GitHub repo.