Sigma

Learn how to integrate Sigma with Soda.

Connect Soda Cloud to Sigma to build live reporting dashboards on top of your data quality results. You can create dashboards that visualize Soda check results, track failed rows over time, and report on the overall health of your data assets.

The integration works by exposing Soda data, such as check results, scan times, or failed rows, to Sigma. Once Sigma is connected to Soda or a Soda data source, such as a Diagnostics Warehouse, you can build and share dashboards directly in Sigma without any additional export or scripting steps.

You can also use the Soda Cloud API to extract check and dataset results programmatically and write them to a data warehouse of your choice, which Sigma can then query. The steps below follow this approach, using Snowflake as the warehouse.

Prerequisites

  • Python 3.8+

  • pip 21.0+

  • A Sigma account with permission to create workbooks and connect data sources

  • A Soda Cloud account with at least one data source configured and scans running


Soda Cloud Reporting API dashboard

Set up a Python script

1

Install the required dependencies:
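
The exact package list is not shown here. As an assumption, a script of this kind typically needs requests, pandas, and snowflake-connector-python, all installable with pip. A minimal sketch that checks whether those assumed packages are importable before you continue:

```python
import importlib.util

# Assumed dependencies (import name -> pip package name); adjust to your script.
REQUIRED = {
    "requests": "requests",
    "pandas": "pandas",
    "snowflake.connector": "snowflake-connector-python[pandas]",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    missing = []
    for module_name, pip_name in required.items():
        try:
            found = importlib.util.find_spec(module_name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (for example "snowflake") is absent.
            found = False
        if not found:
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All assumed dependencies are available.")
```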

2

In a new Python script, configure your Soda Cloud connection. Use cloud.us.soda.io for US accounts and cloud.soda.io for EU accounts.

See Generate API keys for instructions on obtaining your keys.
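
A minimal sketch of this configuration, assuming the keys are supplied through environment variables. The variable names and the /api/v1/ path prefix are assumptions made for illustration; confirm them against the Soda Cloud API reference.

```python
import os

# Hypothetical environment variable names; use whatever your deployment provides.
SODA_HOST = os.environ.get("SODA_CLOUD_HOST", "cloud.soda.io")  # or cloud.us.soda.io
SODA_API_KEY_ID = os.environ.get("SODA_API_KEY_ID", "")
SODA_API_KEY_SECRET = os.environ.get("SODA_API_KEY_SECRET", "")

# Tuple in the form accepted by the `auth=` parameter of requests (HTTP basic auth).
SODA_AUTH = (SODA_API_KEY_ID, SODA_API_KEY_SECRET)


def soda_url(path, host=SODA_HOST):
    """Build an HTTPS URL for an API path such as 'datasets' (path prefix assumed)."""
    return f"https://{host}/api/v1/{path.lstrip('/')}"
```

Reading the keys from the environment keeps secrets out of the script itself.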

3

Define the Snowflake table names in which to store the metadata. Use uppercase names to meet Snowflake's case sensitivity requirements.
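
As a sketch, the two table names used later in this guide can be defined through a small helper that enforces the uppercase convention. The helper name snowflake_name is hypothetical.

```python
def snowflake_name(name):
    """Normalize a table name to uppercase, the form Snowflake uses for unquoted identifiers."""
    cleaned = name.strip()
    if not cleaned.replace("_", "").isalnum():
        raise ValueError(f"Unsupported characters in table name: {name!r}")
    return cleaned.upper()


DATASETS_TABLE = snowflake_name("datasets_report")
CHECKS_TABLE = snowflake_name("checks_report")
```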

4

Configure your Snowflake connection:
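
One possible shape for this configuration, assuming the settings live in environment variables with the hypothetical SNOWFLAKE_ prefix. The connector import is deferred so the settings can be validated without the package installed.

```python
import os


def snowflake_params(env=os.environ):
    """Collect connection settings from (hypothetical) environment variables."""
    keys = ["account", "user", "password", "warehouse", "database", "schema"]
    params = {key: env.get(f"SNOWFLAKE_{key.upper()}") for key in keys}
    missing = [key for key, value in params.items() if not value]
    if missing:
        raise KeyError("Missing Snowflake settings: " + ", ".join(missing))
    return params


def open_snowflake_connection(params):
    """Open a connection; requires the snowflake-connector-python package."""
    import snowflake.connector  # imported lazily so the helper above works without it

    return snowflake.connector.connect(**params)
```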

Capture and store metadata

1

Make a GET request to the Soda Cloud API to retrieve dataset information. The script exits with an error if the request is unauthorized.
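
A hedged sketch of this check. It assumes the API signals bad credentials with HTTP 401 or 403 and that the URL and basic-auth tuple come from your own configuration; the function names are illustrative.

```python
import sys


def ensure_authorized(status_code):
    """Exit the script if the credentials are rejected; otherwise return True."""
    if status_code in (401, 403):
        sys.exit(f"Soda Cloud rejected the API keys (HTTP {status_code}).")
    return True


def confirm_connection(url, auth):
    """Send one GET request to confirm the keys work (requires `requests`)."""
    import requests  # lazy import: only needed when actually calling the API

    response = requests.get(url, auth=auth, timeout=30)
    ensure_authorized(response.status_code)
    response.raise_for_status()  # surface any other HTTP error
    return response.json()
```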

2

Once the connection is confirmed, iterate over all pages of results and load them into a pandas DataFrame. The script handles API rate limiting automatically: on an HTTP 429 response, it pauses for 30 seconds and retries the same page.
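
The pagination and retry logic can be sketched as follows. The fetch_page callable, zero-based page numbering, and the "content" and "totalPages" response fields are assumptions made for illustration; check them against the actual API response.

```python
import time


def fetch_all_pages(fetch_page, pause_seconds=30, sleep=time.sleep):
    """Collect rows from every page, retrying the same page after HTTP 429.

    `fetch_page(page)` is a hypothetical callable returning (status_code, payload),
    where payload is assumed to carry "content" (rows) and "totalPages".
    """
    rows, page, total_pages = [], 0, 1
    while page < total_pages:
        status, payload = fetch_page(page)
        if status == 429:
            sleep(pause_seconds)  # rate limited: wait, then retry this page
            continue
        if status != 200:
            raise RuntimeError(f"Page {page} failed with HTTP {status}")
        rows.extend(payload.get("content", []))
        total_pages = payload.get("totalPages", 1)
        page += 1
    return rows
```

The returned list of dictionaries can then be passed to pandas.DataFrame(rows) to build the DataFrame.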

3

Following the same pattern, extract all check information from the Checks endpoint. This retrieves each check's name, dataset, last evaluation time, result (pass/warn/fail), owner, and any custom attributes.

Any custom Soda attributes defined on your checks are automatically expanded into individual columns. If the target tables already exist in Snowflake, the script detects new attribute columns and adds them with ALTER TABLE without overwriting existing data.
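
A minimal sketch of both behaviors, assuming each check record carries its custom attributes as a dictionary under an "attributes" key (a hypothetical shape) and that new columns can be stored as VARCHAR:

```python
def flatten_check(check):
    """Copy a check record, promoting each custom attribute to its own uppercase key."""
    row = {key: value for key, value in check.items() if key != "attributes"}
    for name, value in (check.get("attributes") or {}).items():
        row[name.strip().replace(" ", "_").upper()] = value
    return row


def alter_statements(table, existing_columns, rows, column_type="VARCHAR"):
    """Build ALTER TABLE statements for columns present in rows but not in the table."""
    existing = {column.upper() for column in existing_columns}
    new_columns = []
    for row in rows:
        for column in row:
            name = column.upper()
            if name not in existing and name not in new_columns:
                new_columns.append(name)
    return [
        f'ALTER TABLE {table} ADD COLUMN "{name}" {column_type}'
        for name in new_columns
    ]
```

Because the statements only add columns, rows already stored in the table are left untouched.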

4

Write both DataFrames to Snowflake:
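
One way to do this is with write_pandas from the Snowflake Connector for Python. The wrapper below is a sketch, not the original script; its writer parameter exists only so the function can be exercised without a live connection.

```python
def write_reports(conn, frames, writer=None):
    """Write each DataFrame in `frames` (table name -> DataFrame) to Snowflake.

    `writer` defaults to snowflake.connector.pandas_tools.write_pandas.
    """
    if writer is None:
        from snowflake.connector.pandas_tools import write_pandas as writer
    written = {}
    for table_name, frame in frames.items():
        success, _chunks, row_count, _output = writer(
            conn, frame, table_name, auto_create_table=True
        )
        if not success:
            raise RuntimeError(f"Writing {table_name} failed")
        written[table_name] = row_count
    return written
```

By default write_pandas quotes identifiers, so uppercase DataFrame column names keep the resulting columns queryable without quotes.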

5

To track changes over time, schedule the script to run regularly and store results in incremental tables.
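
A simple way to make the stored results incremental is to stamp every row with the time of the run and append rather than replace. The SNAPSHOT_AT column name below is a hypothetical choice.

```python
from datetime import datetime, timezone


def stamp_snapshot(rows, snapshot_at=None):
    """Return copies of rows with a SNAPSHOT_AT value marking this run."""
    snapshot_at = snapshot_at or datetime.now(timezone.utc)
    stamp = snapshot_at.isoformat()
    return [{**row, "SNAPSHOT_AT": stamp} for row in rows]
```

Each scheduled run then adds a new set of rows that Sigma can trend by SNAPSHOT_AT.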

Build a dashboard in Sigma

1

Follow Sigma's documentation to connect to your data source, pointing to the database and schema where your CHECKS_REPORT and DATASETS_REPORT tables are stored.

2

Create a new workbook in Sigma and add your visualizations.

A typical data quality dashboard might include:

  • KPI tiles — total datasets monitored, total checks executed, number of failing checks

  • Weighted data quality score — using the Weight check attribute to calculate a custom health score, trended over time

  • Breakdowns by check attribute — such as Data Quality Dimension (Completeness, Validity, Consistency, Accuracy, Timeliness, Uniqueness), Data Domain, Data Team, or Pipeline Stage
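
As an illustration of the weighted score, one possible definition is the weight-adjusted share of passing checks. The sketch below assumes each row has RESULT and WEIGHT fields and that a passing check has the value "pass"; in Sigma the same calculation would be expressed as a workbook formula.

```python
def weighted_quality_score(checks, default_weight=1.0):
    """Return the weight-adjusted share of passing checks as a percentage (0-100)."""
    total = 0.0
    passed = 0.0
    for check in checks:
        weight = check.get("WEIGHT")
        weight = default_weight if weight is None else float(weight)
        total += weight
        if check.get("RESULT") == "pass":
            passed += weight
    # With no checks (or zero total weight) there is no meaningful score.
    return round(100.0 * passed / total, 1) if total else None
```

For example, three checks with weights 3 (pass), 1 (fail), and an unset weight (pass) give 4 of 5 weight units passing, a score of 80.0.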

Sigma dashboard on top of Soda Cloud Reporting API results

Check attributes such as Data Quality Dimension, Data Domain, Data Team, and Pipeline Stage are particularly useful for filtering and segmenting results in your dashboard. The Weight attribute lets you assign a numerical importance level to each check, enabling a custom data health score.

See check and dataset attributes for more.

