
Release notes for Soda Library

[soda-library] 1.4.7

10 April 2024

Fixes and features

  • Rename argument in set_scan_results_file method (#2047)
  • Dremio: support disableCertificateVerification option (#2049)

[soda-library] 1.4.5 & 1.4.6

04 April 2024

Fixes and features

  • SAS-3165 Only reset sampler when originally SodaCloudSampler by @dirkgroenen in #207
  • Feature: enable new anomaly detection algo in group by checks by @bastienboutonnet in #208

[soda-library] 1.4.4

23 March 2024

Fixes and features

  • Failed rows: fix warn/fail thresholds by @m1n0 in #204
  • Bump opentelemetry to 1.22 by @m1n0 in #205

[soda-library] 1.4.3

20 March 2024

Fixes and features

  • Add missing import for type annotations backwards compatibility by @Antoninj in #196
  • Refactor: Parse access_url from dbt config for new multicell org by @bastienboutonnet in #194
  • Timestamp conversion fixes by @Antoninj in #200
  • SAS-2966 Remove scan reference exception throw in local mode by @dirkgroenen in #199
  • Add test for checks level attributes by @m1n0 in #201
  • Fix: Attribute handler timezone test by @m1n0 in #202
  • Feature: Better legend wording and nicer tooltip formatting by @bastienboutonnet in #203

[soda-library] 1.4.2

05 March 2024

Fixes

  • Dremio: fix token support (#2028) by @m1n0 in #195

[soda-library] 1.4.1

01 March 2024

Fixes and features

  • IA-533: Implement daily and monthly seasonality to external regressor by @baturayo in #189
  • Support GMT (Zulu) and microseconds time format by @m1n0 in #193

[soda-library] 1.4.0

28 February 2024

Fixes and features

  • Cloud 6550: remote scans by @m1n0 in #192

[soda-library] 1.3.4

28 February 2024

Fixes and features

  • Fix: timezone mismatch between the recent and historical ad results by @baturayo in #188
  • Feature: in anomaly detection simulator use soda core historic check results endpoint instead of test results by @baturayo in #190
  • Update dask-sql by @m1n0 in #191

[soda-library] 1.3.3

13 February 2024

Fixes and features

  • Fix: include simulator assets folder into the setup.py by @baturayo in #186

[soda-library] 1.3.2

13 February 2024

Fixes and features

  • Fix: simulator import and streamlit path by @m1n0 in #182
  • Oracle: create dsn if not provided (#2012) by @m1n0 in #183
  • Oracle: cast config to str/int to prevent oracledb errors (#2018) by @m1n0 in #184
  • Oracle: fix Cloud integration by @m1n0 in #185

[soda-library] 1.3.1

09 February 2024

Fixes and features

  • Feature: correctly identified anomalies are excluded from training data by @baturayo in #178
  • Fix: show more clearly the detected frequency using warning message first by @baturayo in #180
  • Pin segment analytics and typing-extensions by @m1n0 in #181

[soda-library] 1.3.0

08 February 2024

Fixes and features

  • Feature: anomaly detection simulator by @baturayo in #163
  • Feature: added dremio token support (#2009) by @m1n0 in #179
  • Temporarily affix Segment Analytics version by @dirkgroenen in #177
  • Cloud 6693 improve group by by @m1n0 in #176

[soda-library] 1.2.4

31 January 2024

Fixes and features

  • Feature: implement severity level parameters by @baturayo in #169
  • Fix for min_confidence_interval_ratio parameter by @baturayo in #170
  • Always use datasource specific COUNT expression (#2003) by @m1n0 in #172
  • Send result to Cloud if data source connection issue by @m1n0 in #171
  • CLOUD-6805: avoid sending empty error location when logging configuration file parsing errors by @Antoninj in #173
  • CLOUD-6817: Catch Cloud exceptions (failed insertions) properly by @dirkgroenen in #174

[soda-library] 1.2.2 & 1.2.3

26 January 2024

Fixes and features

  • Hive data source improvements by @robertomorandeira in sodadata/soda-core#1982
  • Feature: Implement migrate from anomaly score check config by @baturayo in sodadata/soda-core#1998
  • Bump Prophet by @m1n0 in sodadata/soda-core#2000
  • Tests: Use approx comparison for floats by @m1n0 in sodadata/soda-core#1999


  • Support token auth by @m1n0 in #159
  • Schema check: Support custom identity (#1988) by @m1n0 in #161
  • CLI: Omit exception if no cli args by @m1n0 in #162
  • Add semver release for major, minor and latest by @dirkgroenen in #164
  • Bug: Handle null values for continuous dist by @baturayo in #165
  • IA-486: implement new anomaly detection logic and syntax by @baturayo in #153
  • Fix Python3.8 type issues for new AD syntax by @baturayo in #166
  • Feature: Support built in prophet public holidays by @baturayo in #167

[soda-library] 1.2.0

16 January 2024

Fixes and features

  • dbt: improve parsing logs by @m1n0 in #157
  • Sampler: fix link href by @m1n0 in #158
  • BREAKING: Row Reconciliation, new simple strategy for batch processing by @m1n0 in #155

[soda-library] 1.2.1

14 January 2024

Fixes and features

  • Recon row fixes by @m1n0 in #160

[soda-library] 1.1.29

03 January 2024

Fixes and features

  • Feature: implement warn_only for anomaly score by @baturayo in #156

[soda-library] 1.1.28

15 December 2023

Fixes and features

  • Fix frequency aggregation bug for anomaly detection by @baturayo in #152
  • Bump pydantic from v1 to v2 by @baturayo in #151
  • Adding support for authentication via a chained list of delegate accounts by @m1n0 in #154

[soda-library] 1.1.27

15 December 2023

Fixes and features

  • Group by: support anomaly/cot, better names by @m1n0 in #147

[soda-library] 1.1.26

04 December 2023

Fixes and features

  • Freshness: support in-check filters (#1970) by @m1n0 in #150. Documentation to follow shortly.

[soda-library] 1.1.24 & 1.1.25

24 November 2023

Fixes and features

  • Reconciliation row: expose deepdiff config, lower sensitivity by @m1n0 in #149

  • Make custom identity fixed as v4 by @m1n0 in #143
  • Reconciliation row: fix key cols mapping, bugfixes by @m1n0 in #148

[soda-library] 1.1.23

19 November 2023

Fixes and features

  • Align usage of database/catalog and implement fallback by @dirkgroenen in #142
  • Remove segment logs by @m1n0 in #145
  • Align usage of exit codes and add exit_code(4) by @dirkgroenen in #146

[soda-library] 1.1.22

14 November 2023

Fixes and features

  • Cloud: Add ScanId by @dirkgroenen in #137
  • Athena: Set default catalog name by @dirkgroenen in #139
  • Sqlserver: remove % from pattern (#1956) by @m1n0 in #140
  • Sqlserver: support quoting tables with brackets, “quote_tables” mode by @m1n0 in #141

[soda-library] 1.1.20 & 1.1.21

02 November 2023

Fixes and features

  • Freshness: support mixed thresholds by @m1n0 in #134
  • Duckdb: Rename path to database by @dirkgroenen in #135

  • Failed rows: new ‘empty’ type, handle no rows scenario better by @m1n0 in #132
  • Extend Data Source identity migration to spark_df by @dirkgroenen in #133

[soda-library] 1.1.19

23 October 2023

Fixes and features

  • Fix: compute value counts in DB rather than in python for categoric distribution checks by @baturayo in #116
  • Run scientific unit tests in CI by @baturayo in #121
  • Raise a warning instead of exception when dataset name is incorrect in suggestions by @baturayo in #126
  • Add support for custom dask data source name by @dirkgroenen in #120

[soda-library] 1.1.17 & 1.1.18

12 October 2023

Fixes and features

  • Remove quotes from dataset name in check payload by @m1n0 in #124

  • Fix package specific tests by @m1n0 in #123
  • Cloud 4311 nightly dev builds by @vijaykiran in #125
  • Add threshold support to failed row query/condition checks by @vijaykiran in #127

[soda-library] 1.1.15 & 1.1.16

11 October 2023

Fixes and features

  • Change dbt version marker in extras_require. To install soda-dbt, use either pip install -i https://pypi.cloud.soda.io "soda-dbt[ver16]" or pip install -i https://pypi.cloud.soda.io "soda-dbt[ver15]".
  • Fix error on invalid check attributes by @vijaykiran in #117
  • CLOUD-5705 Fix schema attributes validation by @vijaykiran in #118
  • Add attributes to checks level by @vijaykiran in #119
  • Add tests for reconciliation checks, minor bugfixes by @m1n0 in #122
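
The first item in the list above names two extras markers for soda-dbt. As a minimal sketch, a small helper can assemble the matching install command. The index URL, package name, and the markers ver15 and ver16 come from the note above; the function name and the assumption that ver15 and ver16 correspond to dbt 1.5 and 1.6 are illustrative only.

```python
import shlex

SODA_INDEX_URL = "https://pypi.cloud.soda.io"

# Markers are taken from the release note. Mapping them to dbt 1.5 / 1.6
# is an assumption made for this illustration.
DBT_MARKERS = {"1.5": "ver15", "1.6": "ver16"}


def soda_dbt_install_command(dbt_minor):
    """Return the pip argument list for the given dbt minor version (hypothetical helper)."""
    try:
        marker = DBT_MARKERS[dbt_minor]
    except KeyError:
        raise ValueError(f"no soda-dbt marker known for dbt {dbt_minor!r}") from None
    return ["pip", "install", "-i", SODA_INDEX_URL, f"soda-dbt[{marker}]"]


if __name__ == "__main__":
    # shlex.join quotes the bracketed extra so the line is safe to paste into a shell.
    print(shlex.join(soda_dbt_install_command("1.6")))
```

Returning an argument list rather than a single string keeps the bracketed extra intact if the command is later passed to subprocess.run without a shell.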

[soda-library] 1.1.14

05 October 2023

Fixes and features

  • Chore: rename Soda Library docker image build step by @Antoninj in #109
  • Fix threshold cloud payload for freshness checks by @vijaykiran in #112
  • Fix schema reconciliation config parsing by @m1n0 in #113
  • Allow to specify virtual file name for add sodacl string by @m1n0 in #115
  • Check type segment tracking by @m1n0 in #114

[soda-library] 1.1.13

27 September 2023

Fixes and features

  • Reconciliation schema: support type mapping by @m1n0 in #110
  • Fix databricks numeric types profiling by @m1n0 in #111

[soda-library] 1.1.12

21 September 2023

Fixes and features

  • Add PR auto assign reviewer GH workflow by @Antoninj in #103
  • Fix: nofile payload when http sampler is used by @m1n0 in #104
  • Trino: fix dataset prefix by @m1n0 in #105
  • Row reconciliation improve sample by @m1n0 in #106
  • Schema reconciliation improve diagnostics by @m1n0 in #107
  • Add thresholds and diagnostics to scan result by @m1n0 in #108

[soda-library] 1.1.11

19 September 2023

Fixes and features

  • Feature: Support dbt 1.5 and 1.6 by @vijaykiran in #99
  • Feature: Reference check: support must NOT exist by @m1n0 in #100
  • Fix: Reconciliation variables support by @m1n0 in #93
  • Fix: Catch exceptions while building results file by @dirkgroenen in #63
  • Fix: Row diff: fix python 3.8 compatibility by @vijaykiran in #101
  • Improvement: Reconciliation row diff better config logging by @m1n0 in #102

[soda-library] 1.1.10

13 September 2023

Fixes

  • Reconciliation row diff handle incompatible schema by @m1n0 in #97
  • Fix: api_key_id tracking in check suggestions by @baturayo in #98

[soda-library] 1.1.9

12 September 2023

Fix

Soda Library 1.1.9 includes a fix for reconciliation check results that have been overwriting historical results data in Soda Cloud.

Upon upgrading, Soda Cloud archives any existing check history for reconciliation checks only. With 1.1.9, reconciliation check results start collecting a fresh history of results with an improved check identity algorithm that properly retains check history.

Action

  1. Upgrade to Soda Library 1.1.9 to leverage the fix.
  2. Initiate a new scan that involves your reconciliation checks.
  3. Review the refreshed check results in Soda Cloud, which mark the start of a new, properly retained history of results.
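
Before running the scan in step 2, it can help to confirm that the installed version already contains the fix. The following is a rough sketch, assuming the version is available as a plain dotted string such as "1.1.9"; the helper names are hypothetical, and how you obtain the installed version string is left to your environment.

```python
# Version that introduced the reconciliation history fix, per the note above.
FIXED_IN = (1, 1, 9)


def parse_version(text):
    """Turn '1.1.9' into (1, 1, 9). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def has_reconciliation_history_fix(installed):
    """Return True if the installed version is 1.1.9 or later (hypothetical helper)."""
    return parse_version(installed) >= FIXED_IN
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.1.10" would sort before "1.1.9".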

[soda-library] 1.1.6 - 1.1.8

11 September 2023

Fixes and features

  • Discussion scan type by @m1n0 in #91
  • Reconciliation schema remove warn, adjust pass graph numbers by @m1n0 in #92
  • Apply filter in row reconciliation by @m1n0 in #94
  • Reconciliation schema check by @m1n0 in #89
  • Reconciliation freshness check fix cloud ingest by @m1n0 in #87

[soda-library] 1.1.0 - 1.1.5

31 August 2023

Fixes and features

  • Remove label from recon checks by @m1n0 in #82
  • Recon row sample more intuitive header by @m1n0 in #83
  • Handle recon metric division by zero by @m1n0 in #84
  • WIP row recon column mapping by @m1n0 in #85
  • Add Presto support by @vijaykiran in #86

  • Push recon metric diagnostics to Soda Cloud by @m1n0 in #77
  • Fix key columns related issue by @vijaykiran in #78
  • CLOUD-4549 change source/target column to source/target columns by @vijaykiran in #79
  • Fix recon row column handling by @m1n0 in #80
  • Fix divide by zero when the metric value is 0 by @vijaykiran in #81

  • Fix recon group type by @m1n0 in #72
  • Update recon label behaviour by @m1n0 in #73

  • Row reconciliation samples by @m1n0 in #70

  • Support custom identity for failed rows check type by @m1n0 in #65
  • Row recon metric send count only by @m1n0 in #66
  • Source and target key columns support by @m1n0 in #67
  • Ingest recon checks as groups, add summary and diagnostics by @m1n0 in #68
  • Row reconciliation samples by @m1n0 in #69

[soda-library] 1.0.6 - 1.0.8

11 August 2023

Fixes

  • Metrics-based recon checks WIP by @m1n0 in #52
  • Fix typo in recon check name construction by @m1n0 in #57
  • CLOUD-4314: Make abs default for reconciliation checks by @vijaykiran in #58
  • CLOUD-4320: Fix between thresholds for reconciliation by @vijaykiran in #59
  • Build cleanup by @vijaykiran in #60
  • CLOUD-4319: Add support for metric expressions by @vijaykiran in #61
  • CLOUD-3993: Apply the CI/CD fix from soda-core to CI/CD by @milanaleksic in #62
  • Reconciliation row diff checks WIP by @vijaykiran in #64
  • Recon row diff: fix threshold-based outcome WIP

[soda-library] 1.0.5

26 July 2023

Fixes

  • Trino connector has new options: source, client_tags
  • Fix for optional schema_name property added to schema checks (743811c)

[soda-library] 1.0.3 & 1.0.4

21 July 2023

Fixes and features

  • CLOUD-4112 pass scan reference by @gregkaczan in #37
  • Source owner property in scan insert payload by @m1n0 in #39
  • Remove code that was originally copied over from core by @m1n0 in #41
  • CLOUD-4144 add attributes to cross-checks by @vijaykiran in #42
  • Evaluate group evolution conditions if no historical data is present by @m1n0 in #40
  • Add dict as an overridable field by @vijaykiran in #43
  • Samples columns support by @m1n0 in #38
  • Fix filter in failed rows samples with parenthesis by @m1n0 in #44
  • Add app identifier to datasources by @vijaykiran in #45
  • Bug: skipping partition suggestion were causing the app to fail by @baturayo in #46
  • CLOUD-4170 expose cloud url by @gregkaczan in #49
  • [sqlserver] fix port configuration by @vijaykiran in #50
  • Set metric for failed rows check by @m1n0 in #48
  • Block soda suggest if cloud config is missing by @m1n0 in #51
  • Fix templates for failed rows by @vijaykiran in #53
  • Introduce schema_name property for schema checks by @vijaykiran in #54
  • Bump requirements by @vijaykiran in #55
  • Bug: fix keyboard interrupt tracking in check suggestions by @baturayo in #21
  • CLOUD-3967 merge soda scientific into main package by @vijaykiran in #20
  • Include template definition in check definition by @m1n0 in #23
  • DB prefix set to None if no info available by @m1n0 in #27
  • Improve templates not found/provided msgs by @m1n0 in #26
  • Update check suggestion links by @janet-can in #25
  • Fix boolean attributes+add tests by @m1n0 in #28
  • Update PR Workflow for merge queue support by @vijaykiran in #29
  • Templates support for failed rows check by @m1n0 in #30
  • Feature: track supported and unsupported data sources by @baturayo in #32
  • CLOUD-3862 push ci info file contents to cloud scan results by @gregkaczan in #31
  • Fix link to attributes by @vijaykiran in #33
  • TRINO: add http_headers option by @vijaykiran in #35
  • HIVE: add configuration parameters by @vijaykiran in #36
  • CLOUD-3861 pass scanType with cicd option by @gregkaczan in #34

[soda-library] 1.0.1 & 1.0.2

23 June 2023

Fixes

  • Add dispatch pipeline for pushing to Dockerhub by @dakue-soda in #10
  • Fix container build, the reference to our own pypi was m… by @dakue-soda in #11
  • Allow newer version of pyyaml by @m1n0 in #13
  • Handle scenario where schema cannot be obtained by @m1n0 in #14
  • Set default for group by name by @vijaykiran in #16
  • Include checks metadata in scan result by @m1n0 in #17
  • Upgrade BigQuery client to 3.x by @m1n0 in #19
  • Include basic data source info in scan payload by @m1n0 in #15

[soda-library] 1.0.0

15 June 2023

General availability release

Introducing Soda Library, a Python library and CLI tool for testing data quality.

Built on top of Soda Core, Soda Library leverages all the features and functionality of the open-source tool, with newly added features. Install Soda Library from the command line, then configure it to connect to Soda Cloud using API keys that are valid for a free, 45-day trial.

pip install -i https://pypi.cloud.soda.io soda-postgres

If you already use Soda Core, you can seamlessly upgrade to Soda Library without changing any configurations, checks, or integrations. See Migrate from Soda Core for details.
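
The configuration step mentioned above, connecting to Soda Cloud with API keys, can be sketched as a small helper that renders the relevant block of a configuration file. This is an illustration, not an official template: the key names soda_cloud, host, api_key_id, and api_key_secret, the default host, and the ${VAR} environment-variable placeholders reflect an assumed configuration layout and should be checked against the Soda documentation. The function name and variable names are hypothetical.

```python
def render_soda_cloud_config(host="cloud.soda.io",
                             key_id_var="SODA_API_KEY_ID",
                             key_secret_var="SODA_API_KEY_SECRET"):
    """Render a soda_cloud YAML block that reads API keys from environment variables.

    The key names and the ${VAR} placeholder syntax are assumptions about
    the configuration format; verify them before use.
    """
    lines = [
        "soda_cloud:",
        f"  host: {host}",
        f"  api_key_id: ${{{key_id_var}}}",
        f"  api_key_secret: ${{{key_secret_var}}}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_soda_cloud_config(), end="")
```

Referencing environment variables instead of writing the key values into the file keeps secrets out of version control.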

Features

  • Soda Library supports SodaCL’s newest checks: Group By and Group Evolution.
    • For an individual dataset, add a Group By configuration to specify the categories into which Soda must group the check results. When you run a scan, Soda groups the results according to the unique values in the column you identified.
    • Use a Group Evolution check to validate the presence or absence of a group in a dataset, or to check for changes to groups in a dataset relative to their previous state.
  • Soda Library supports Check Suggestions, a helpful CLI tool that assists you in generating basic data quality checks. Instead of writing your own data quality checks from scratch, the check suggestions assistant profiles your dataset, then prompts you through a series of questions so that it can leverage the built-in Soda metrics and auto-generate quality checks tailored to your data.
  • Soda Library supports Check template configurations that enable you to prepare a user-defined metric that you can reuse in checks in multiple checks YAML files.
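
The Group By and Group Evolution ideas in the list above can be pictured with a small, library-free sketch. It is a conceptual illustration only, not Soda's implementation or SodaCL syntax: the function names, the sample rows, and the threshold are all made up for the example.

```python
from collections import defaultdict
from statistics import mean


def group_metric(rows, group_column, value_column):
    """Average value_column for each unique value found in group_column."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group_column]].append(row[value_column])
    return {group: mean(values) for group, values in buckets.items()}


def check_groups(metrics, fail_above):
    """Give every group its own pass/fail outcome against one threshold."""
    return {group: ("fail" if value > fail_above else "pass")
            for group, value in metrics.items()}


def group_evolution(previous, current):
    """Report which groups appeared or disappeared relative to the previous state."""
    previous, current = set(previous), set(current)
    return {"added": sorted(current - previous),
            "removed": sorted(previous - current)}


# Hypothetical sample data: discounts per sales territory.
rows = [
    {"territory": "EU", "discount": 10},
    {"territory": "EU", "discount": 30},
    {"territory": "US", "discount": 50},
]
metrics = group_metric(rows, "territory", "discount")
outcomes = check_groups(metrics, fail_above=40)
changes = group_evolution(previous=["EU", "APAC"], current=metrics.keys())
```

Each unique territory gets its own result, which mirrors how a grouped check reports per category, while the evolution helper compares the current set of groups with an earlier one.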

Last modified on 26-Apr-24