
Release notes for Soda Library

[soda-library] 1.1.13

27 September 2023

Fixes and features

  • Reconciliation schema: support type mapping by @m1n0 in #110
  • Fix databricks numeric types profiling by @m1n0 in #111

[soda-library] 1.1.12

21 September 2023

Fixes and features

  • Add PR auto assign reviewer GH workflow by @Antoninj in #103
  • Fix: nofile payload when http sampler is used by @m1n0 in #104
  • Trino: fix dataset prefix by @m1n0 in #105
  • Row reconciliation improve sample by @m1n0 in #106
  • Schema reconciliation improve diagnostics by @m1n0 in #107
  • Add thresholds and diagnostics to scan result by @m1n0 in #108

[soda-library] 1.1.11

19 September 2023

Fixes and features

  • Feature: Support dbt 1.5 and 1.6 by @vijaykiran in #99
  • Feature: Reference check: support must NOT exist by @m1n0 in #100
  • Fix: Reconciliation variables support by @m1n0 in #93
  • Fix: Catch exceptions while building results file by @dirkgroenen in #63
  • Fix: Row diff: fix python 3.8 compatibility by @vijaykiran in #101
  • Improvement: Reconciliation row diff better config logging by @m1n0 in #102

[soda-library] 1.1.10

13 September 2023

Fixes

  • Reconciliation row diff handle incompatible schema by @m1n0 in #97
  • Fix: api_key_id tracking in check suggestions by @baturayo in #98

[soda-library] 1.1.9

12 September 2023

Fix

Soda Library 1.1.9 fixes an issue in which reconciliation check results overwrote historical results data in Soda Cloud.

Upon upgrading, Soda Cloud archives any existing check history for reconciliation checks only. With 1.1.9, reconciliation checks start collecting a fresh history of results, using an improved check identity algorithm that properly retains check history.

Action

  1. Upgrade to Soda Library 1.1.9 to leverage the fix.
  2. Initiate a new scan that involves your reconciliation checks.
  3. Review the refreshed check results in Soda Cloud; these mark the start of a new, properly retained history of results.
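
Before step 1, it can help to confirm whether an environment is still on a release older than 1.1.9. The following is a minimal sketch in plain Python, not part of Soda Library: the helper names `parse_version` and `needs_upgrade` are hypothetical, and the sketch assumes plain dotted release numbers such as those listed on this page. It compares versions numerically, because a plain string comparison would wrongly rank 1.1.10 below 1.1.9.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted release string such as '1.1.9' into (1, 1, 9)."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(installed: str, minimum: str = "1.1.9") -> bool:
    """Return True when the installed release is older than the minimum.

    Hypothetical helper: the default minimum is the release that carries
    the reconciliation check history fix described above.
    """
    return parse_version(installed) < parse_version(minimum)
```

For example, `needs_upgrade("1.1.8")` is true, while `needs_upgrade("1.1.10")` is false because the numeric comparison treats 10 as greater than 9.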

[soda-library] 1.1.6 - 1.1.8

11 September 2023

Fixes and features

  • Discussion scan type by @m1n0 in #91
  • Reconciliation schema remove warn, adjust pass graph numbers by @m1n0 in #92
  • Apply filter in row reconciliation by @m1n0 in #94
  • Reconciliation schema check by @m1n0 in #89
  • Reconciliation freshness check fix cloud ingest by @m1n0 in #87

[soda-library] 1.1.0 - 1.1.5

31 August 2023

Fixes and features

  • Remove label from recon checks by @m1n0 in #82
  • Recon row sample more intuitive header by @m1n0 in #83
  • Handle recon metric division by zero by @m1n0 in #84
  • WIP row recon column mapping by @m1n0 in #85
  • Add Presto support by @vijaykiran in #86

  • Push recon metric diagnostics to Soda Cloud by @m1n0 in #77
  • Fix key columns related issue by @vijaykiran in #78
  • CLOUD-4549 change source/target column to source/target columns by @vijaykiran in #79
  • Fix recon row column handling by @m1n0 in #80
  • Fix divide by zero when the metric value is 0 by @vijaykiran in #81

  • Fix recon group type by @m1n0 in #72
  • Update recon label behaviour by @m1n0 in #73

  • Row reconciliation samples by @m1n0 in #70

  • Support custom identity for failed rows check type by @m1n0 in #65
  • Row recon metric send count only by @m1n0 in #66
  • Source and target key columns support by @m1n0 in #67
  • Ingest recon checks as groups, add summary and diagnostics by @m1n0 in #68
  • Row reconciliation samples by @m1n0 in #69

[soda-library] 1.0.6 - 1.0.8

11 August 2023

Fixes

  • Metrics-based recon checks WIP by @m1n0 in #52
  • Fix typo in recon check name construction by @m1n0 in #57
  • CLOUD-4314: Make abs default for reconciliation checks by @vijaykiran in #58
  • CLOUD-4320: Fix between thresholds for reconciliation by @vijaykiran in #59
  • Build cleanup by @vijaykiran in #60
  • CLOUD-4319: Add support for metric expressions by @vijaykiran in #61
  • CLOUD-3993: Apply the CI/CD fix from soda-core to CI/CD by @milanaleksic in #62
  • Reconciliation row diff checks WIP by @vijaykiran in #64
  • Recon row diff: fix threshold-based outcome WIP

[soda-library] 1.0.5

26 July 2023

Fixes

  • Trino connector has new options: source, client_tags
  • Fix for optional schema_name property added to schema checks (743811c)

[soda-library] 1.0.3 & 1.0.4

21 July 2023

Fixes and features

  • CLOUD-4112 pass scan reference by @gregkaczan in #37
  • Source owner property in scan insert payload by @m1n0 in #39
  • Remove code that was originally copied over from core by @m1n0 in #41
  • CLOUD-4144 add attributes to cross-checks by @vijaykiran in #42
  • Evaluate group evolution conditions if no historical data is present by @m1n0 in #40
  • Add dict as an overridable field by @vijaykiran in #43
  • Samples columns support by @m1n0 in #38
  • Fix filter in failed rows samples with parenthesis by @m1n0 in #44
  • Add app identifier to datasources by @vijaykiran in #45
  • Bug: skipping partition suggestion were causing the app to fail by @baturayo in #46
  • CLOUD-4170 expose cloud url by @gregkaczan in #49
  • [sqlserver] fix port configuration by @vijaykiran in #50
  • Set metric for failed rows check by @m1n0 in #48
  • Block soda suggest if cloud config is missing by @m1n0 in #51
  • Fix templates for failed rows by @vijaykiran in #53
  • Introduce schema_name property for schema checks by @vijaykiran in #54
  • Bump requirements by @vijaykiran in #55
  • Bug: fix keyboard interrupt tracking in check suggestions by @baturayo in #21
  • CLOUD-3967 merge soda scientific into main package by @vijaykiran in #20
  • Include template definition in check definition by @m1n0 in #23
  • DB prefix set to None if no info available by @m1n0 in #27
  • Improve templates not found/provided msgs by @m1n0 in #26
  • Update check suggestion links by @janet-can in #25
  • Fix boolean attributes+add tests by @m1n0 in #28
  • Update PR Workflow for merge queue support by @vijaykiran in #29
  • Templates support for failed rows check by @m1n0 in #30
  • Feature: track supported and unsupported data sources by @baturayo in #32
  • CLOUD-3862 push ci info file contents to cloud scan results by @gregkaczan in #31
  • Fix link to attributes by @vijaykiran in #33
  • TRINO: add http_headers option by @vijaykiran in #35
  • HIVE: add configuration parameters by @vijaykiran in #36
  • CLOUD-3861 pass scanType with cicd option by @gregkaczan in #34

[soda-library] 1.0.1 & 1.0.2

23 June 2023

Fixes

  • Add dispatch pipeline for pushing to Dockerhub by @dakue-soda in #10
  • Fix container build, the reference to our own pypi was m… by @dakue-soda in #11
  • Allow newer version of pyyaml by @m1n0 in #13
  • Handle scenario where schema cannot be obtained by @m1n0 in #14
  • Set default for group by name by @vijaykiran in #16
  • Include checks metadata in scan result by @m1n0 in #17
  • Upgrade BigQuery client to 3.x by @m1n0 in #19
  • Include basic data source info in scan payload by @m1n0 in #15

[soda-library] 1.0.0

15 June 2023

General availability release

Introducing Soda Library, a Python library and CLI tool for testing data quality.

Built on top of Soda Core, Soda Library leverages all the features and functionality of the open-source tool, with newly added features. Install Soda Library from the command line, then configure it to connect to Soda Cloud using API keys that are valid for a free, 45-day trial.

pip install -i https://pypi.cloud.soda.io soda-postgres

If you already use Soda Core, you can seamlessly upgrade to Soda Library without changing any configurations, checks, or integrations. See Migrate from Soda Core for details.

Features

  • Soda Library supports SodaCL’s newest checks: Group By and Group Evolution.
    • For an individual dataset, add a Group By configuration to specify the categories into which Soda must group the check results. When you run a scan, Soda groups the results according to the unique values in the column you identified.
    • Use a Group Evolution check to validate the presence or absence of a group in a dataset, or to check for changes to groups in a dataset relative to their previous state.
  • Soda Library supports Check Suggestions, a helpful CLI tool that assists you in generating basic data quality checks. Instead of writing your own data quality checks from scratch, the check suggestions assistant profiles your dataset, then prompts you through a series of questions so that it can leverage the built-in Soda metrics and auto-generate quality checks tailored to your data.
  • Soda Library supports Check template configurations that enable you to prepare a user-defined metric that you can reuse in checks across multiple checks YAML files.
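
To make the Group By and Group Evolution ideas above more concrete, here is a conceptual sketch in plain Python. It is not Soda's implementation and does not use the Soda Library API or SodaCL syntax: the function names, the row format, and the threshold are all assumptions chosen for illustration. The first two functions group values by the unique entries in one column and flag groups that exceed a threshold; the third compares the current set of groups with a previous set to report groups that appeared or disappeared.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Set, Tuple


def average_by_group(
    rows: Iterable[Mapping[str, object]], group_field: str, value_field: str
) -> Dict[str, float]:
    """Average value_field separately for each unique value of group_field."""
    values_per_group = defaultdict(list)
    for row in rows:
        values_per_group[row[group_field]].append(float(row[value_field]))
    return {
        group: sum(values) / len(values)
        for group, values in values_per_group.items()
    }


def failing_groups(averages: Mapping[str, float], fail_above: float) -> List[str]:
    """List the groups whose average exceeds the (hypothetical) threshold."""
    return sorted(group for group, avg in averages.items() if avg > fail_above)


def group_changes(
    previous: Set[str], current: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (added, removed) groups relative to the previous state."""
    return current - previous, previous - current
```

With rows for two territories, `average_by_group` yields one result per territory, which mirrors how a grouped check reports one result per unique column value. `group_changes` mirrors the idea of validating groups relative to their previous state.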

Last modified on 27-Sep-23