What's new in Soda docs?

Review a changelog of additions and revisions to Soda documentation.

March 14, 2025

February 27, 2025

February 26, 2025

February 24, 2025

February 21, 2025

February 20, 2025

February 14, 2025

February 6, 2025

February 4, 2025

January 28, 2025

January 9, 2025

January 8, 2025

January 7, 2025

January 6, 2025

November 29, 2024

November 28, 2024

November 27, 2024

November 26, 2024

November 19, 2024

November 15, 2024

November 14, 2024

November 13, 2024

November 1, 2024

October 29, 2024

October 28, 2024

October 25, 2024

October 23, 2024

  • Added instructions for installing pydanticv1 extra library package; see Troubleshoot section of installation instructions.

October 22, 2024

October 21, 2024

October 17, 2024

  • Added release notes documentation for Soda Library 1.6.5 and 1.7.0, and Soda Cloud Bulk-edit of dataset responsibilities and attributes.

October 16, 2024

October 10, 2024

  • Added an optional auto_exclude_anomalies parameter for anomaly detection that you can use to ignore or include existing anomalies in a training dataset.

October 9, 2024

October 8, 2024

October 7, 2024

October 3, 2024

  • Added documentation for using variables in the SampleRef message parameter for Python custom sampler to collect and display failed row samples.

October 2, 2024

September 30, 2024

September 26, 2024

September 25, 2024

September 24, 2024

September 23, 2024

  • Compiled and updated failed row samples documentation, including:

    • the option to use scan context in a CustomSampler to read/write data to/from a scan

    • the option to collect failed rows samples from specific columns in a dataset in Soda Cloud

    • the option to disable failed row sample collection from all datasets, except those with explicit configuration to collect samples

  • Updated Failed row checks and User-defined checks to include optional configuration to specify a single column against which to run the check.

  • Revised check attributes configuration when applying attributes to more than one check.

September 23, 2024

September 19, 2024

  • Added attribute mapping to Okta SSO integration documentation.

  • Corrected reconciliation check documentation to remove the option to add a list of comma-separated datasets to compare.

September 18, 2024

September 17, 2024

September 13, 2024

September 12, 2024

September 8, 2024

September 4, 2024

August 29, 2024

August 28, 2024

August 20, 2024

August 19, 2024

August 14, 2024

August 13, 2024

August 8, 2024

  • Added content to clarify that Soda Library officially supports Python 3.8, 3.9, and 3.10.

August 2, 2024

August 1, 2024

July 31, 2024

July 29, 2024

July 25, 2024

July 24, 2024

July 23, 2024

July 22, 2024

July 19, 2024

  • Updated MS Teams integration documentation to reference creating Workflows in MS Teams instead of Office 365 Connectors. Microsoft is retiring the connectors effective August 15, 2024. If you have previously set up a Soda integration with an Office 365 connector, follow the instructions for Creating a workflow from a channel in Teams, then update the integration URL in your existing Soda <> MS Teams integration in Soda Cloud.

July 18, 2024

July 17, 2024

July 16, 2024

July 15, 2024

July 10, 2024

  • Revised SodaGPT documentation to replace it with details about Ask AI, Soda’s in-product generative AI assistant.

July 8, 2024

July 5, 2024

July 2, 2024

June 28, 2024

  • Added release notes documentation for Soda Agent 1.1.15 & 1.1.16, Soda Library 1.5.13, and Soda Core 3.3.7, 3.3.8 & 3.3.9.

  • Published documentation to accompany data contracts version 4 release.

June 27, 2024

June 24, 2024

June 21, 2024

June 10, 2024

June 18, 2024

June 17, 2024

June 10, 2024

June 7, 2024

June 6, 2024

June 5, 2024

  • Added release notes documentation for Soda Library 1.5.5.

  • Added details about IRSA authentication for Athena and Redshift data sources.

  • Added new example script to fetch dataset and check info from a Soda Cloud account and transfer data into CSV files.

May 30, 2024

  • Added release notes documentation for the Soda AI features generally available or available for preview access upon request.

May 29, 2024

May 28, 2024

May 25, 2024

May 24, 2024

May 23, 2024

May 20, 2024

May 17, 2024

May 14, 2024

May 8, 2024

May 7, 2024

May 6, 2024

  • Removed the Agreement deprecation notice as the decision to deprecate the feature has been reversed.

April 30, 2024

April 29, 2024

April 26, 2024

April 25, 2024

April 24, 2024

April 22, 2024

April 12, 2024

April 10, 2024

April 4, 2024

March 27, 2024

March 21, 2024

  • Added release notes documentation for Soda Library 1.4.3.

  • Added more content and example for rerouting failed row samples.

  • Added requirement for using anomaly detection checks in group by configurations: requires Soda Library 1.1.27 or greater, or Soda Agent 0.8.57 or greater.

March 20, 2024

March 18, 2024

March 15, 2024

March 12, 2024

March 6, 2024

March 5, 2024

March 1, 2024

February 29, 2024

  • Following improvements and changes to the self-hosted Soda Agent 1.0.0, removed the documented details for including idle replicas and polling intervals in a cluster that aimed to improve scan times. Also, added release notes to inform existing Soda Agent users about changes to parameter configuration with 1.0.0 and advice for optimal performance using managed node groups instead of Fargate profiles in Amazon EKS, GCP Autopilot, or AKS Virtual Clusters. See Soda Agent release notes for upgrade details.

  • Added details for system requirements for deploying a Soda Agent in a Kubernetes cluster.

  • Included schema checks as available to add as a no-code check to a dataset in a data source that uses a Soda Agent to execute scans.

  • Added instructions for how to run a Soda Cloud-defined scan remotely using the Soda Library CLI. See the Remotely run a scan tab in Scan for data quality.

  • Added release notes documentation for Soda Library 1.3.4, 1.4.0 and Soda Core 1.3.3.

  • Added Databricks SQL to the list of data sources you can connect to using a Soda-hosted Agent.

February 28, 2024

February 26, 2024

February 22, 2024

February 21, 2024

February 14, 2024

February 13, 2024

February 9, 2024

February 8, 2024

February 2, 2024

February 1, 2024

January 29, 2024

January 26, 2024

January 22, 2024

  • Updated the documentation for rerouting failed row samples to include new, optional configuration parameters that offer users direct access to the failed row sample data.

January 19, 2024

January 15, 2024

January 12, 2024

January 5, 2024

January 3, 2024

January 2, 2024

  • Added alternate syntax for failed row check using a failed row condition.

December 21, 2023

December 15, 2023

December 13, 2023

December 7, 2023

December 4, 2023

November 29, 2023

November 28, 2023

November 24, 2023

November 22, 2023

November 21, 2023

  • Added documentation for managing scans and setting up failed scan notifications.

  • Added work_group as an optional connection configuration property for Athena.

  • Added troubleshooting tip for using quotes on column names within an in-check filter. See Troubleshoot SodaCL.

  • In the context of Soda Cloud, changed instances of scan definition to scan schedule to reflect the updated naming in the Soda Cloud UI.

November 16, 2023

November 15, 2023

  • Corrected the rule stating that numeric characters in a list of valid values, invalid values, or missing values must be wrapped in single quotes. This is not the case. See Specify valid or invalid values for corrected content.

November 14, 2023

November 8, 2023

  • Removed Reporting API v0 documentation as the version is now deprecated.

November 7, 2023

November 2, 2023

  • Added pollingInterval to Soda Agent deployment instructions.

  • Added release notes documentation for Soda Library 1.1.20 - 1.1.21 and Soda Core 3.0.52 - 3.0.53.

  • Added sample input values and clarifying notes to data source connection config reference for Athena.

October 30, 2023

  • Updated anomaly score documentation to include support for dataset filters.

  • Added documentation to accompany new support for Presto data source.

October 26, 2023

October 25, 2023

October 24, 2023

October 23, 2023

October 17, 2023

October 13, 2023

October 12, 2023

October 11, 2023

  • Refactored the content on docs.soda.io to focus more on use cases, tasks, and reader goals. The goal of the project was to pivot from a products-based set of documentation to task-based/use case-based content. You may notice a change to the navigation on docs.soda.io, which is now organized by actions (Install, Deploy, Run Scans, Set alerts, etc.) instead of by product (Soda Library, Soda Cloud, Soda Agent, etc.).

  • Updated Integrate Soda with dbt to install sub-packages with double-quotes.

  • Updated best practices for reconciliation checks to recommend creating a separate agreement for a reconciliation project.

  • Added release notes documentation for Soda Library 1.1.15 - 1.1.16 and Soda Core 3.0.51.

October 6, 2023

October 5, 2023

September 27, 2023

  • Added clarifying information about user input and how it is used by SodaGPT.

  • Added release notes documentation for Soda Library 1.1.13 and Soda Core 3.0.50.

September 26, 2023

September 21, 2023

September 20, 2023

September 19, 2023

September 18, 2023

September 14, 2023

  • Removed known issue for inability to use check identity with failed row checks. This is now supported in Soda Library.

September 13, 2023

September 12, 2023

September 11, 2023

September 1, 2023

August 31, 2023

August 30, 2023

August 24, 2023

August 23, 2023

  • Updated agreement documentation to reflect the change in behaviour where scans do not run until stakeholders have approved the agreement.

August 21, 2023

August 11, 2023

August 10, 2023

August 8, 2023

  • Revised documentation to reflect the new Checks dashboard, which displays checks and their latest scan results. This replaces the Check Results dashboard, which displayed all individual check results.

August 7, 2023

July 26, 2023

July 24, 2023

July 21, 2023

July 6, 2023

July 4, 2023

June 27, 2023

  • Documentation to accompany the preview launch of SodaGPT.

June 23, 2023

  • Changed requirement for check template to include the dataset identifier in the first line of the check so that Soda Cloud can properly render the check results.

  • Added release notes documentation for Soda Core 3.0.40 and Soda Core 3.0.41.

  • Added release notes documentation for Soda Library 1.0.1 and Soda Library 1.0.2.

  • Reverted Soda Agent to describe configuring Soda Core settings instead of Library. Will update to Soda Library details when updates are complete.

June 15, 2023

June 12, 2023

June 9, 2023

June 8, 2023

May 31, 2023

  • Added instructions and event payload details for using a webhook to notify a third-party of new, deleted, or changed Soda agreements.

May 30, 2023

May 25, 2023

  • Added a step for configuring soda-core-spark[databricks] that reminds you to install databricks-sql-connector as well.

May 23, 2023

May 20, 2023

May 15, 2023

May 11, 2023

May 9, 2023

May 2, 2023

  • Published content regarding the set up of multiple Soda Cloud organizations for use with different environments in your network infrastructure.

  • Added a note about selecting a region when you sign up for a new Soda Cloud account.

April 28, 2023

April 18, 2023

April 11, 2023

March 29, 2023

March 28, 2023

March 24, 2023

March 21, 2023

  • Added release notes documentation for Soda Core 3.0.29 and Soda Core 3.0.30.

  • Added instructions for limiting samples for an entire data source.

March 9, 2023

March 8, 2023

March 7, 2023

February 28, 2023

  • Published instructions for setting up private connectivity to a Soda Cloud account using AWS PrivateLink.

February 23, 2023

February 22, 2023

  • Removed preview status from agent deployment documentation for Azure Kubernetes Service (AKS) and Google Kubernetes Engine (GKE).

  • Added instructions for programmatically running a Soda scan of the contents of a local file using Dask.

February 21, 2023

February 16, 2023

February 10, 2023

February 9, 2023

January 25, 2023

January 24, 2023

January 20, 2023

January 19, 2023

January 13, 2023

  • Updated Soda Agent for GKE documentation so that the instructions for using a file reference for a BigQuery data source connection use a Kubernetes secret instead of a Kubernetes ConfigMap.

January 11, 2023

January 10, 2023

  • Added note about the new ability to add co-owners to an agreement.

December 28, 2022

December 20, 2022

December 15, 2022

December 12, 2022

December 8, 2022

  • Added preview documentation for the Soda Cloud Reporting API v1.

  • Corrected documentation to properly reflect that you can add only one column against which to execute a metric in a check.

  • Reverted the statement about using variables to pass any value anywhere in syntax or configuration at scan time. Refer to variables documentation for details on how to use them.

December 2, 2022

December 1, 2022

November 30, 2022

November 28, 2022

November 23, 2022

November 18, 2022

  • Added a list of valid formats for validity metrics that Soda for MS SQL Server supports.

  • Added documentation for rerouting failed rows samples to an HTTP endpoint; supported as of Soda Core 3.0.13.

  • Removed content for overwriting Soda Cloud checks results using -t option.

  • Archived all Soda SQL and Soda Spark content to the sodadata/soda-sql repository in GitHub.

November 16, 2022

  • Added content to more explicitly describe the metrics that dataset discovery and column profiling derive, and the potential compute costs associated with these configurations.

November 15, 2022

November 14, 2022

November 10, 2022

November 8, 2022

  • Added an example webhook integration for Soda Cloud and ServiceNow.

November 7, 2022

November 3, 2022

November 2, 2022

November 1, 2022

October 26, 2022

  • Removed the Preview status from self-serve features which are now generally available in Soda Cloud, such as agreements and profiling.

  • Migrated custom metric templates from Soda SQL to SodaCL.

October 19, 2022

October 13, 2022

October 11, 2022

October 5, 2022

September 29, 2022

  • Added a link to a community contribution for Prefect 2.0 collection for Soda Core.

  • Updated Reference checks documentation for displaying failed rows in Soda Cloud.

September 28, 2022

September 23, 2022

September 22, 2022

September 14, 2022

  • Added instructions for configuring a custom sampler for failed rows.

September 13, 2022

September 12, 2022

  • Documented Soda Cloud resources to add visual context to the parts that exist in Soda Cloud, and how they relate to each other, particularly when you delete a resource.

  • Added documentation to correspond with Soda Cloud’s new support for webhooks to integrate with third-party service providers to send alert notifications or create and track incidents externally.

  • Corrected documentation to indicate that reference checks do not support dataset filters.

September 9, 2022

  • Decoupled data source connection configuration details from Soda Core. Created a separate page for each data source’s connection config details. See Connect a data source.

September 8, 2022

September 7, 2022

August 30, 2022

  • Recorded known issue: Soda Core for SparkDF does not support anomaly score or distribution checks.

August 29, 2022

August 26, 2022

August 24, 2022

  • Adjusted configuration instructions for soda-core-spark-df to separately install dependencies for Hive and ODBC as needed.

  • Added content to correspond with Soda Core’s new support for Trino.

  • Removed the known issue: The missing format configuration does not function as expected.

August 22, 2022

August 11, 2022

  • Added documentation for the new -t option for use with scan commands to overwrite scan output in Soda Cloud.

August 10, 2022

August 9, 2022

  • Added documentation to describe the migration path from Soda SQL to Soda Core.

August 2, 2022

August 1, 2022

July 27, 2022

July 20, 2022

  • Published documentation associated with the preview release of Soda Cloud’s self-serve features and functionality. This is a limited access preview release, so please ask us for access at [email protected].

June 29, 2022

June 28, 2022

  • Revised documentation to reflect the general availability of Soda Core and SodaCL.

  • Archived the deprecated documentation for Soda SQL and Soda Spark.

June 23, 2022

June 22, 2022

June 21, 2022

  • Added details to Soda Core documentation for using system variables to store sensitive credentials.

  • Updated the Quick start for Soda Core and Soda Cloud with slightly changed instructions.

June 20, 2022

  • Changed all references to table in SodaCL to dataset, notably used with for each and distribution check syntax.

  • Added deprecation warning banners to all Soda SQL and Soda Spark content.

  • Revised and reorganized content to reframe focus on Soda Core in lieu of Soda SQL.

  • Added more Soda Core documentation to main docs set.

  • Updated Soda product overview to reflect new focus on Soda Core and imminent deprecation of Soda SQL and Soda Spark.

  • Updated Soda Cloud documentation to reflect new focus on Soda Core.

  • Updated links on docs home page to point to most recent content and shift Soda SQL and Soda Core to a Legacy section.

June 14, 2022

June 9, 2022

  • Added some new Soda Core content to documentation.

  • Moved Soda SQL and Soda Spark in documentation leftnav.

  • Updated Home page with links to new Soda Core documentation.

  • Fixed formatting in Quick start for Soda Core and Soda Cloud.

June 8, 2022

June 7, 2022

June 6, 2022

June 2, 2022

June 1, 2022

May 31, 2022

May 26, 2022

May 25, 2022

  • Revised and renamed Data observability to Data concepts.

May 24, 2022

May 19, 2022

May 18, 2022

  • Updated the details pertaining to connecting Soda Core to Soda Cloud. The syntax for the key-value pairs for API keys changed from api_key and api_secret to api_key_id and api_key_secret.

May 9, 2022

April 26, 2022

  • Updated a set of Soda product comparison matrices to illustrate the features and functionality available with different Soda tools.

April 25, 2022

  • Updated the Soda product overview with a more thorough explanation of the product suite and how the parts work together to establish and maintain data reliability.

April 22, 2022

  • Replaced the quick start tutorials for Soda SQL and Soda Cloud with two new tutorials:

    • Quick start for Soda SQL and Soda Cloud

    • Quick start for Soda Core and Soda Cloud

April 6, 2022

  • Added details to the Freshness check to clarify limitations when specifying duration.

  • Added documentation for how to use system variables to store property values instead of storing values in the env_vars.yml file.

  • Updated Soda Core documentation to remove aspirational content from Adding scans to a pipeline.

April 1, 2022

  • Added documentation for the dataset_name identifier in a scan YAML file. Use the identifier to send more precise dataset information to Soda Cloud.

March 22, 2022

  • New documentation for the beta release of Soda Core, a free, open-source, command-line tool that enables you to use the Soda Checks Language to turn user-defined input into aggregated SQL queries.

  • New documentation for the beta release of SodaCL, a domain-specific language you can use to define Soda Checks in a checks YAML file.

February 15, 2022

  • Added content to explain how Soda Cloud notifies users of a scan failure.

February 10, 2022

January 18, 2022

January 17, 2022

January 12, 2022

January 11, 2022

  • Added requirement for installing Soda Spark on a Databricks cluster. See Soda Spark Requirements.

December 22, 2021

  • Added data types information for Trino and MySQL.

  • Adjusted the docs footer to offer users ways to suggest or make improvements to our docs.

December 16, 2021

December 14, 2021

  • Added documentation to accompany the new Soda Cloud Incidents feature. Collaborate with your team in Soda Cloud and in Slack to investigate and resolve data quality issues.

December 13, 2021

December 6, 2021

  • Added documentation for the new audit trail feature for Soda Cloud.

  • Added further detail about which rows Soda SQL sends to Soda Cloud as samples.

December 2, 2021

  • Updated Quick start tutorial for Soda Cloud.

  • Added information about using regex in a YAML file.

November 30, 2021

  • Added documentation about the anonymous Soda SQL usage statistics that Soda collects. Learn more about the information Soda collects and how to opt out of sending statistics.

November 26, 2021

November 24, 2021

November 23, 2021

  • Revised the Quick start tutorial for Soda SQL to use the same demo repo as the interactive demo.

November 15, 2021

  • Added a new, embedded interactive demo for Soda SQL.

  • New documentation to accompany the soft-launch of Soda Spark, an extension of Soda SQL functionality.

November 9, 2021

  • New documentation to accompany the new, preview release of historic metrics. This type of metric enables you to use Soda SQL to access the historic measurements in the Cloud Metric Store and write tests that use those historic measurements.

October 29, 2021

  • Added SSO identity providers to the list of third-party IdPs to which you can add Soda Cloud as a service provider.

October 25, 2021

  • Removed the feature to Add datasets directly in Soda Cloud. Instead, users add datasets using Soda SQL.

  • Added support for Snowflake session parameter configuration in the warehouse YAML file.

October 18, 2021

  • New documentation to accompany the new Schema Evolution Monitor in Soda Cloud. Use this monitor type to get notifications when columns are changed, added, or deleted in your dataset.

October 17, 2021

  • New documentation to accompany the new feature to disable or reroute sample data to Soda Cloud.

September 30, 2021

September 28, 2021

September 17, 2021

  • Added Soda Cloud metric names to primary list of column metrics.

September 9, 2021

  • Published documentation for time partitioning, column metrics, and sample data in Soda Cloud.

September 1, 2021

  • Added information for new command options included in Soda CLI version 2.1.0b15 for

    • limiting the datasets that Soda SQL analyzes,

    • preventing Soda SQL from sending scan results to Soda Cloud after a scan, and

    • instructing Soda SQL to skip confirmations before running a scan.

  • Added information about how to use a new option, account_info_path, to direct Soda SQL to your BigQuery service account JSON key file for configuration details.

August 31, 2021

  • Added documentation for the feature that allows you to include or exclude specific datasets in your soda analyze command.

August 30, 2021

  • Updated content and changed the name of Data monitoring documentation to Data quality.

August 23, 2021

  • New document for custom metric templates that you can copy and paste into scan YAML files.

August 9, 2021

  • Added details for Apache Spark support. See Install Soda SQL.

  • Updated Adjust a dataset scan schedule to include detailed instructions for triggering a Soda scan externally.

August 2, 2021

  • Added new document to outline the Support that Soda provides its users and customers.

  • Updated BigQuery data source configuration to include auth_scopes.

July 29, 2021

  • Added instructions for configuring BigQuery permissions to run Soda scans.

  • Added an example of a programmatic scan using a lambda function.

  • Added instructions for overwriting scan output in Soda Cloud.

  • New document for Example test to compare row counts.

July 26, 2021

  • Added Soda SQL documentation for configuring excluded_columns during scans.

  • Updated compatible data sources for Soda SQL to include MySQL (experimental), and Soda Cloud to improve accuracy.

  • Updated Create monitors and alerts to include custom metrics as part of creation flow; updated prerequisites.

  • Updated Product overview comparison for new excluded_columns functionality and custom metrics in Soda Cloud.

  • Minor adjustments to reflect new and changed elements in the Soda SQL 2.1.0b12 release.

July 16, 2021

  • Added early iteration of content for Best practices for defining tests and running scans.

  • Added a link to the docs footer to open a GitHub issue to report issues with docs.

July 13, 2021

  • New Add datasets documentation for the newly launched feature that enables you to connect to data sources and add datasets directly in Soda Cloud.

  • New Collaborate on data monitoring documentation that incorporates how to integrate with Slack, and how to include your team in your efforts to monitor your data.

  • New Adjust a dataset scan schedule content to help you refine how often Soda scans a particular dataset.

  • Revised Quick start tutorial for Soda Cloud that incorporates the new feature to add datasets.

  • Improved Soda product overview page with a comparison chart for features and functionality.

July 6, 2021

If you want to know which flavor of Soda is best, you need to examine the criteria of what makes a good Soda. Is it sweet? Is it performant? Is it an appealing color? Does it produce valid SQL?

Though not conclusive, early test results would indicate that the best flavors of Soda, in descending order, are as follows:

  1. Cream Soda

  2. Root Beer

  3. Coca Cola

  4. Ginger Beer

  5. Cherry Cola
