Soda Cloud - Release notes

Review release notes for Soda Cloud, a web app that enables you to visualize data quality test results and set alerts and notifications.

Version 4.0

28th January, 2026

Everything new in Soda 4.0

We are introducing a new data contracts engine and a unified cloud platform that brings observability, AI, and data quality enforcement together. This new version of Soda has transformed the software into a full data-quality platform by layering on end-to-end data observability and collaborative data contracts.

This marks the shift from a CLI-centric checks engine toward a unified, observability-driven data quality platform with a refined, three-tier Core + Agent + Cloud architecture, built-in contracts, orchestration, and deep integrations.

Soda Core 4.0

Data Contracts Engine: An open-source engine that formalizes the standard for defining and executing data contracts, with a clean, data-quality-first syntax that supports 50+ built-in check types.
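
To make the idea of a data contract concrete, here is a minimal Python sketch of declarative checks verified against a dataset. It is an illustration only and does not use Soda's contract syntax or API: the `contract` layout, the check type names, and the `verify` helper are all hypothetical, and the sample rows are made up.

```python
# Illustrative only: this is NOT Soda's contract syntax or API.
# The contract layout, check names, and verify() helper are hypothetical.

rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "c@example.com"},
]

# A contract states expectations declaratively, separate from the data.
contract = {
    "dataset_checks": [{"type": "row_count", "min": 1}],
    "column_checks": {
        "id": [{"type": "duplicate"}],
        "email": [{"type": "missing"}],
    },
}


def verify(contract, rows):
    """Return a list of (check name, passed) tuples."""
    results = []
    for check in contract["dataset_checks"]:
        if check["type"] == "row_count":
            results.append(("row_count", len(rows) >= check["min"]))
    for column, checks in contract["column_checks"].items():
        values = [row.get(column) for row in rows]
        for check in checks:
            if check["type"] == "missing":
                # Passes only when no value in the column is missing.
                passed = all(v is not None for v in values)
            elif check["type"] == "duplicate":
                # Passes only when all present values are distinct.
                present = [v for v in values if v is not None]
                passed = len(present) == len(set(present))
            else:
                raise ValueError(f"unknown check type: {check['type']}")
            results.append((f"{column}.{check['type']}", passed))
    return results


results = verify(contract, rows)
for name, passed in results:
    print(f"{name}: {'pass' if passed else 'fail'}")
```

In this sample data the row count check passes, while the duplicate check on `id` and the missing check on `email` both fail.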

Soda Cloud 4.0

Unified, self-driving data quality platform: It unites AI-powered contract generation, feedback-driven anomaly detection, deep diagnostic capabilities, and a faster, cleaner interface into a single platform where quality rules write themselves and bad data gets isolated instantly.


The improvements in this release are numerous. Here are some of the highlights:

  • Implemented a new automated dataset discovery and onboarding process to easily onboard datasets with metric monitoring:

    • Schema-less data source onboarding: Connect data sources without requiring predefined schemas

    • Rules-based dataset onboarding: Define rules to automatically onboard datasets matching specific criteria

    • New dataset onboarding UI: Redesigned interface for a smoother onboarding experience

    • Faster and more efficient dataset onboarding: Performance improvements reduce onboarding time

  • New data testing capabilities:

    • Introduced Data Contracts to formalize requirements and expectations of user datasets

      • Supported checks:

        • Dataset-level: row count, schema, freshness, duplicate, failed rows, custom metrics.

        • Column-level: missing, invalid, duplicate, aggregate, failed rows, custom metrics.

      • Available on supported data sources: PostgreSQL, Databricks, Snowflake, BigQuery, Athena, SQL Server, Dremio, Redshift, Oracle, Synapse and Fabric

    • Schedule your contract verifications with a Data Contract Schedule.

    • Introduced the new Collaborative Authoring UI that allows business and technical users to collaborate on Data Contracts.

    • Introduced Contract Requests to request and propose changes for Data Contracts.

      • Stay up-to-date with your requests and proposals with e-mail notifications.

    • Introduced Automated Contract Generation to kickstart your Data Contracts.

    • Introduced Secret Manager to securely store your data source connection credentials.

  • Powerful data observability:

    • Introduced AI-powered metrics observability at scale.

      • Supported monitors:

        • Dataset-level: total row count, total row count change, last modification time, schema changes, partition row count, most recent timestamp.

        • Column-level: missing values percentage, duplicate percentage, count, unique count, most recent timestamp, sum, minimum, maximum, average, standard deviation, variance, first quartile (Q1), median (Q2), third quartile (Q3), average length, minimum length, maximum length.

        • Group column-level monitor by any column to get insights per segment.

      • Available on supported data sources: PostgreSQL, Databricks, Snowflake, BigQuery, Athena, SQL Server, Dremio, Redshift, Oracle, Synapse and Fabric

      • Set a schedule for your monitors: daily, hourly, and custom intervals.

      • Fine-tune metric monitoring:

        • Set threshold strategy, exclusion values, and sensitivity.

        • Give feedback to improve detection.

        • Create incidents.

    • Introduced Cloud API to fetch your observability metrics.

    • Introduced programmatic configuration of metric monitoring.

    • Introduced historical metric collection: calculate past data quality metrics retroactively for up to 365 days.

    • Introduced metric monitor pages with interactive plots to understand and fine-tune your monitors.

  • With the new Diagnostics Warehouse, Soda stores all scans, failed records, and historical data quality issues directly in your own data warehouse:

    • Full diagnostic information in one place, including attributes.

    • Faster root-cause analysis: jump from a failed check to the exact failed rows, affected datasets/columns, and prior history to see if it’s a one-off issue or a pattern.

    • Open & portable features: it’s just tables in your warehouse. Query with SQL, power dashboards, join with lineage, incident, or cost data, and automate workflows.

    • Security & Governance: Diagnostics Warehouse stores tables in your own warehouse, giving you full control over security, retention and access.
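
Because the Diagnostics Warehouse is described as plain tables queryable with SQL, the "one-off issue or a pattern" question can be sketched as an ordinary aggregate query. The example below is a rough sketch under stated assumptions: the `check_results` table, its columns, and the sample rows are hypothetical, since these notes do not describe the actual schema, and an in-memory SQLite database stands in for your warehouse.

```python
import sqlite3

# Hypothetical table and column names; the real Diagnostics Warehouse
# schema is not described in these notes. SQLite stands in for a warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE check_results ("
    "scan_date TEXT, dataset TEXT, check_name TEXT, failed_rows INTEGER)"
)
conn.executemany(
    "INSERT INTO check_results VALUES (?, ?, ?, ?)",
    [
        ("2026-01-25", "orders", "missing_customer_id", 12),
        ("2026-01-26", "orders", "missing_customer_id", 9),
        ("2026-01-27", "orders", "missing_customer_id", 15),
        ("2026-01-26", "customers", "duplicate_email", 0),
        ("2026-01-27", "customers", "duplicate_email", 3),
    ],
)

# Count the scans in which each check failed: a single failing scan
# suggests a one-off issue, several suggest a recurring pattern.
query = """
    SELECT dataset, check_name,
           COUNT(*) AS failing_scans,
           SUM(failed_rows) AS total_failed_rows
    FROM check_results
    WHERE failed_rows > 0
    GROUP BY dataset, check_name
    ORDER BY failing_scans DESC
"""
history = conn.execute(query).fetchall()
for dataset, check_name, failing_scans, total in history:
    label = "pattern" if failing_scans > 1 else "one-off"
    print(f"{dataset}.{check_name}: {failing_scans} failing scan(s), "
          f"{total} failed rows ({label})")
```

The same kind of query could feed a dashboard or be joined with lineage or incident tables, as long as it is adapted to the table names that actually exist in your warehouse.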
