Sigma
Learn how to integrate Sigma with Soda.
Connect Soda Cloud to Sigma to build live reporting dashboards on top of your data quality results. You can create dashboards that visualize Soda check results, track failed rows over time, and report on the overall health of your data assets.

The integration works by exposing Soda data, such as check results, scan times, or failed rows, in a data source that Sigma can query. Once Sigma is connected to Soda or a Soda data source, such as a Diagnostics Warehouse, you can build and share dashboards directly in Sigma without any additional export or scripting steps.
Alternatively, you can use the Soda Cloud API to extract check and dataset results programmatically and write them to a data warehouse of your choice, which Sigma can then query. The steps below follow this approach, using Snowflake as the warehouse.
Prerequisites
Python 3.8+
Pip 21.0+
A Sigma account with permission to create workbooks and connect data sources
A Soda Cloud account with at least one data source configured and scans running
Permission in Soda Cloud to access dataset metadata; see Global and Dataset Roles
Soda Cloud API keys. See Generate API keys
Soda Cloud Reporting API dashboard
Set up a Python script
Install the required dependencies:
In a new Python script, configure your Soda Cloud connection. Use cloud.us.soda.io for US accounts and cloud.soda.io for EU accounts.
See Generate API keys for instructions on obtaining your keys.
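The connection settings might be sketched as follows. This is a minimal illustration, assuming the Soda Cloud API accepts HTTP Basic authentication with the API key ID as the username and the key secret as the password. The environment variable names and helper functions are hypothetical and not taken from Soda's documentation.

```python
import base64
import os

# Regional hosts named in the step above.
SODA_HOSTS = {"us": "cloud.us.soda.io", "eu": "cloud.soda.io"}


def soda_base_url(region: str) -> str:
    """Return the base URL for a Soda Cloud region ('us' or 'eu')."""
    return f"https://{SODA_HOSTS[region.lower()]}"


def basic_auth_header(key_id: str, key_secret: str) -> dict:
    """Build an HTTP Basic auth header (assumed authentication scheme)."""
    token = base64.b64encode(f"{key_id}:{key_secret}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


# Hypothetical environment variable names; this keeps keys out of the script.
SODA_API_KEY_ID = os.environ.get("SODA_API_KEY_ID", "")
SODA_API_KEY_SECRET = os.environ.get("SODA_API_KEY_SECRET", "")
```

Reading the keys from the environment rather than hard-coding them makes the script safer to commit and schedule.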
Define the Snowflake table names in which to store the metadata. Use uppercase names to meet Snowflake's case sensitivity requirements.
Configure your Snowflake connection:
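As a rough sketch of the two steps above, the table names and connection settings could be declared like this. The table names match the ones referenced later in this guide; the environment variable names and the `snowflake_settings` and `connect_snowflake` helpers are illustrative assumptions. Opening the connection requires the `snowflake-connector-python` package.

```python
import os

# Uppercase table names, matching the tables referenced later in this guide.
DATASETS_TABLE = "DATASETS_REPORT"
CHECKS_TABLE = "CHECKS_REPORT"


def snowflake_settings(env=os.environ) -> dict:
    """Collect connection settings from (hypothetical) environment variables."""
    return {
        "account": env.get("SNOWFLAKE_ACCOUNT", ""),
        "user": env.get("SNOWFLAKE_USER", ""),
        "password": env.get("SNOWFLAKE_PASSWORD", ""),
        "warehouse": env.get("SNOWFLAKE_WAREHOUSE", ""),
        "database": env.get("SNOWFLAKE_DATABASE", ""),
        "schema": env.get("SNOWFLAKE_SCHEMA", ""),
    }


def connect_snowflake(settings: dict):
    """Open a connection; needs the snowflake-connector-python package."""
    import snowflake.connector  # imported lazily so this file loads without it

    return snowflake.connector.connect(**settings)
```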
Capture and store metadata
Make a GET request to the Soda Cloud API to retrieve dataset information. The script exits with an error if the request is unauthorised.
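One way to sketch this connectivity check with the standard library is shown below. The URL passed in is up to the caller, and treating HTTP 403 the same way as 401 is an assumption rather than documented behaviour.

```python
import sys
import urllib.error
import urllib.request


def fetch_status(url: str, headers: dict) -> int:
    """Send a GET request and return the HTTP status code."""
    request = urllib.request.Request(url, headers=headers, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises 4xx/5xx responses as exceptions


def exit_if_unauthorised(status: int) -> None:
    """Stop the script when the API rejects the credentials."""
    if status in (401, 403):  # 403 handling is an assumption
        sys.exit("Unauthorised: check your Soda Cloud API keys and permissions.")
```

Calling `exit_if_unauthorised(fetch_status(url, headers))` before the main loop means the script fails fast rather than partway through an extract.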
Once the connection is confirmed, iterate over all pages of results and load them into a Pandas DataFrame. The script handles API rate limiting automatically: on an HTTP 429 response, it pauses for 30 seconds and then retries the same page.
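The paging and retry behaviour can be sketched as a small loop. This is an illustration under stated assumptions: the response field names `totalPages` and `content`, and zero-based page numbering, are assumed rather than confirmed here. `fetch_page` is a hypothetical callable that you supply to perform the actual HTTP request.

```python
import time
from typing import Callable


def fetch_all_pages(
    fetch_page: Callable[[int], tuple],
    sleep: Callable[[float], None] = time.sleep,
    pause_seconds: float = 30,
) -> list:
    """Collect every page, retrying the same page after a rate-limit pause.

    fetch_page(page) must return (status_code, body), where body is the
    decoded JSON dict for that page.
    """
    rows = []
    page = 0
    total_pages = 1  # updated from the first successful response
    while page < total_pages:
        status, body = fetch_page(page)
        if status == 429:
            sleep(pause_seconds)  # rate limited: wait, then retry this page
            continue
        if status != 200:
            raise RuntimeError(f"Unexpected HTTP {status} on page {page}")
        total_pages = body.get("totalPages", total_pages)  # assumed field name
        rows.extend(body.get("content", []))  # assumed field name
        page += 1
    return rows
```

The returned list of dictionaries can then be passed to `pandas.DataFrame(rows)`. Injecting `sleep` keeps the loop easy to test without waiting 30 seconds.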
Following the same pattern, extract all check information from the Checks endpoint. This retrieves each check's name, dataset, last evaluation time, result (pass/warn/fail), owner, and any custom attributes.
Any custom Soda attributes defined on your checks are automatically expanded into individual columns. If the target tables already exist in Snowflake, the script detects new attribute columns and adds them with ALTER TABLE without overwriting existing data.
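The attribute expansion and schema update could look roughly like the sketch below. It assumes each check record carries its custom attributes as a dictionary under an `attributes` key and that new columns are added as `VARCHAR`; both are illustrative choices, and the helper names are hypothetical.

```python
def flatten_check(check: dict) -> dict:
    """Copy a check record, expanding its attributes into top-level columns."""
    row = {k: v for k, v in check.items() if k != "attributes"}
    for name, value in (check.get("attributes") or {}).items():
        # Uppercase with underscores to suit Snowflake column naming.
        row[name.strip().upper().replace(" ", "_")] = value
    return row


def alter_statements(table: str, existing_columns: set, rows: list) -> list:
    """Return one ALTER TABLE statement per column missing from the table.

    existing_columns is expected to hold uppercase column names.
    """
    missing = []
    for row in rows:
        for column in row:
            name = column.upper()
            if name not in existing_columns and name not in missing:
                missing.append(name)
    return [f'ALTER TABLE {table} ADD COLUMN "{c}" VARCHAR' for c in missing]
```

Because the statements only add columns, rows already stored in the table are left untouched.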
Write both DataFrames to your data source:
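A minimal sketch of the write step, assuming Snowflake as the target and the `write_pandas` helper from `snowflake-connector-python` (installed with its pandas extra). The `write_reports` wrapper and its `writer` parameter are hypothetical conveniences, not part of any library.

```python
def write_reports(conn, frames: dict, writer=None) -> dict:
    """Write each DataFrame to its table and return rows written per table.

    frames maps table name -> DataFrame. writer defaults to Snowflake's
    write_pandas helper and can be swapped out for testing.
    """
    if writer is None:
        # Requires snowflake-connector-python with the pandas extra.
        from snowflake.connector.pandas_tools import write_pandas as writer

    written = {}
    for table, frame in frames.items():
        success, _chunks, nrows, _output = writer(conn, frame, table)
        if not success:
            raise RuntimeError(f"Write to {table} failed")
        written[table] = nrows
    return written
```

A call such as `write_reports(conn, {"DATASETS_REPORT": df_datasets, "CHECKS_REPORT": df_checks})` would write both tables in one pass; the DataFrame variable names here are placeholders.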
To track changes over time, schedule the script to run regularly and store results in incremental tables.
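One simple way to make the stored results incremental is to stamp every row with the time of the run before appending it. This is a sketch only; the `SNAPSHOT_AT` column name and the `stamp_rows` helper are assumptions introduced for illustration.

```python
from datetime import datetime, timezone


def stamp_rows(rows: list, snapshot_at: datetime = None) -> list:
    """Return copies of the rows with a shared snapshot timestamp column."""
    snapshot_at = snapshot_at or datetime.now(timezone.utc)
    stamp = snapshot_at.isoformat()
    # SNAPSHOT_AT is a hypothetical column name.
    return [{**row, "SNAPSHOT_AT": stamp} for row in rows]
```

With a timestamp on each row, a dashboard can filter to the latest snapshot or trend results across runs.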
Build a dashboard in Sigma
Follow Sigma's documentation to connect to your data source, pointing to the database and schema where your CHECKS_REPORT and DATASETS_REPORT tables are stored.
Access the metadata either by modelling data from the database tables directly, or by creating a dataset using custom SQL.
Create a new workbook in Sigma and add your visualisations.
A typical data quality dashboard might include:
KPI tiles: total datasets monitored, total checks executed, number of failing checks
Weighted data quality score: using the Weight check attribute to calculate a custom health score, trended over time
Breakdowns by check attribute: such as Data Quality Dimension (Completeness, Validity, Consistency, Accuracy, Timeliness, Uniqueness), Data Domain, Data Team, or Pipeline Stage

Check attributes such as Data Quality Dimension, Data Domain, Data Team, and Pipeline Stage are particularly useful for filtering and segmenting results in your dashboard. The Weight attribute lets you assign a numerical importance level to each check, enabling a custom data health score.
See check and dataset attributes for more.
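As an illustration of how a Weight attribute could feed a health score, the sketch below computes the weight-adjusted share of passing checks. This is one possible definition, not a formula prescribed by Soda. Counting `warn` results as not passing, defaulting a missing weight to 1, and the `result` and `weight` key names are all assumptions.

```python
def weighted_quality_score(checks: list) -> float:
    """Return the weight-adjusted share of passing checks, as a percentage.

    Each check is a dict with 'result' ('pass', 'warn' or 'fail') and an
    optional numeric 'weight' (defaults to 1).
    """
    total = sum(float(c.get("weight", 1)) for c in checks)
    if total == 0:
        return 0.0  # no checks, or all weights zero
    passed = sum(
        float(c.get("weight", 1)) for c in checks if c["result"] == "pass"
    )
    return round(100 * passed / total, 2)
```

The same calculation can be expressed in Sigma as a ratio of two weighted sums, which makes it straightforward to trend per scan date.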
