Release notes for Soda Core
Review release notes for Soda Core, an open-source tool for testing and monitoring data quality.
[soda-core] 3.5.0
20 February 2025
3.5.0 Features and fixes
Custom identity: Support in Reference and Group Evolution by @m1n0 in #2208
Validity: Accept 24-hr times in ISO 8601 dates by @pholser in #2133
Dask: Fix wrong count query when row_count used with duplicate check on two columns. by @jzalucki in #2209
Dask: Fix COUNT queries on windows by @mivds in #2210
Impala: Add support for Apache Impala by @marcocharlie in #2191
Athena: Add session token parameter by @ruixuantan in #2095
Postgres: Handle unsupported options parameter by @RekunDzmitry in #2176
Duckdb: Relax duckdb dependency to < 1.1.0 by @tombaeyens in #2162
Docs: Fix typo by @crazyshrut in #2151
Docs: Fix workflow path for CI by @syou6162 in #2206
Chore: Remove unused MarkupSafe dependency (#2103) by @ghjklw in #2200
Chore: Relax opentelemetry version contraint (#2192) by @ghjklw in #2199
Refer to the Soda Core Release Notes for details.
[soda-core] 3.4.4
07 January 2025
3.4.4 Features and fixes
Remove autoescape from jinja - allow use of special chars in variables. by @jzalucki in #2193
Add support for numbers in identifier. by @jzalucki in #2197
Refer to the Soda Core Release Notes for details.
[soda-core] 3.4.3
02 December 2024
3.4.3 Features and fixes
Add include null to valid_count and invalid_count and percentage version. by @jzalucki in #2186
Yaml: read and parse files thread-safe by @m1n0 in #2188
Refer to the Soda Core Release Notes for details.
[soda-core] 3.4.2
28 November 2024
3.4.2 Features and fixes
Foreach: Resolve vars in queries by @m1n0 in #2183
Chore: Use jinja sandbox for templates by @m1n0 in #2185
Refer to the Soda Core Release Notes for details.
[soda-core] 3.4.1
22 October 2024
3.4.1 Features and fixes
Add documentation for MS fabric package install + config by @janet-can in #2180
Fix: Comparison row count check secondary datasource filter by @asantoz in #2165
Refer to the Soda Core Release Notes for details.
[soda-core] 3.4.0
21 October 2024
3.4.0 Features and fixes
Add support for Azure SQL, Synapse, and Microsoft Fabric and extend support for SQL Server by @sdebruyn in #2160
Add page to docs folder for data contracts language reference by @janet-can in #2166
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.15 - 22
12 September 2024
3.3.15 - 3.3.22 Fixes
Removing the data source name lower case requirement by @tombaeyens in #2161
Fixing Spark session API by @tombaeyens in #2159
Fixing the lacking data source error message on contract build by @tombaeyens in #2158
Fixing test library dependencies for test table creation by @tombaeyens in #2157
Fixing atlan source contract yaml and lacking schema error message by @tombaeyens in #2153
Fixed the Atlan integration glue db-schema switch by @tombaeyens in #2152
Comparison check - fix other table filter by @jzalucki in #2149
Contracts7 : Fixing the integration correlation issue by @tombaeyens in #2148
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.14
14 August 2024
3.3.14 Fixes
Cross row count check should support custom identity. by @jzalucki in #2139
Handle SQL exception nicely for failed rows and user-defined check. by @jzalucki in #2140
Spark: Send discovery data despite errors. by @jzalucki in #2142
Spark: Failed rows should not be limited to max 100 total results. by @jzalucki in #2143
Freshness: Support variables in thresholds by @m1n0 in #2146
Spark: replicate implicit ‘include all’ in profiling consistently wit… by @m1n0 in #2147
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.13
29 July 2024
3.3.13 Fixes
Always reset logger when new scan instance is created. by @jzalucki in #2136
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.12
18 July 2024
3.3.12 Fixes
Scan context: support list keys in getter by @m1n0 in #2135
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.11
17 July 2024
3.3.11 Features and fixes
Date formats: improve date format regex by @pholser in #2128
Duckdb: fix schema check for db in file by @m1n0 in #2130
Snowflake: support custom hostname and port @whummer in #2109
Improve user provided query sanitization. by @jzalucki in #2131
Scan Context: read/write data from/to a scan by @m1n0 in #2134
Add sslmode support to postgres and denodo by @m1n0 in #2066
Sqlserver: add support for custom parameters. by @jzalucki in #2132
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.10
08 July 2024
3.3.10 Features and fixes
Bugfix for the Atlan integration when using soda-core & contracts
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.7 & 3.3.8 & 3.3.9
28 June 2024
3.3.7 - 9 Features and fixes
Contracts4 by @tombaeyens in #2116
Duplicate check: Remove unused aggregated query by @m1n0 in #2118
Updated readme by @janet-can in #2104
Oracle: Fix formats, freshness, other minor fixes by @m1n0 in #2106
Sampler: Do not invoke http endpoint if no failed rows present by @jzalucki in #2115
Missing count: Fix sample query by @jzalucki in #2114
Between threshold: Fix error when using variables by @jzalucki in #2113
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.6
20 June 2024
3.3.6 Features and fixes
CLOUD-7708 - Add Snowflake CI account to pipeline for soda-core by @dakue-soda in #2088
CLOUD-7400 - Improve memory usage by @dirkgroenen in #2081
Duplicate check: fail gracefully in case of error in query by @m1n0 in #2093
Bump requests and tox/docker by @m1n0 in #2094
Duplicate check: support sample exclude columns fully by @m1n0 in #2096
Spark: profiling support more text types by @m1n0 in #2099
Spark: profiling support more numeric types by @m1n0 in #2100
Oracle: fix profiling/discovery queries, add numeric profiling by @m1n0 in #2101
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.5
24 May 2024
Features and fixes
Failed rows: fix warn/fail thresholds for fail condition by @m1n0 in #2084
Upgrade to latest version of ibm-db python client by @Antoninj in #2076
User defined metric check: support fail query by @m1n0 in #2089
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.3 & 3.3.4
07 May 2024
Features and fixes
Fix automated monitoring, prevent duplicate queries by @m1n0 in #2075
Hive support scheme by @m1n0 in #2077
Bump deps by @m1n0 in #2079
Update autoflake precommit by @m1n0 in #2070
Contracts v3 by @tombaeyens in #2067
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.2
24 April 2024
Features and fixes
Rename argument in set_scan_results_file method by @ozgenbaris1 in #2047
Dremio: support disableCertificateVerification option by @m1n0 in #2049
Denodo: fix connection timeout attribute by @m1n0 in #2065
DB2: support security option by @4rahulae in #2063
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.1
23 March 2024
Features and fixes
Feature: improved wording and tooltip formatting in simulator by @bastienboutonnet in #2038
Failed rows: fix warn/fail thresholds by @m1n0 in #2042
Bump opentelemetry to 1.22 by @m1n0 in #2043
Bump dev requirements by @m1n0 in #2045
Refer to the Soda Core Release Notes for details.
[soda-core] 3.3.0
15 March 2024
Features
Contracts 2nd iteration @tombaeyens in #2006
Soda Core 3.3.0 supports the newest, experimental version of soda-contracts
. The new version introduces changes that may not be compatible with the previous experimental version of soda-contracts
. To continue using the first version of soda-contracts
without any adjustments, upgrade to Soda Core 3.2.4 for the latest in bug fixes and updates.
Refer to the Soda Core Release Notes for details.
[soda-core] 3.2.4
15 March 2024
Fixes and features
Fix: Support attributes on multiple checks by @milanaleksic in #2032
Use dbt’s new access_url pattern to access cloud API by @bastienboutonnet in #2035
Refer to the Soda Core Release Notes for details.
[soda-core] 3.2.3
05 March 2024
Fixes and features
Feature: implement daily and monthly seasonality to external regressor … by @baturayo in #2027
Dremio: Fix token support by @m1n0 in #2028
Refer to the Soda Core Release Notes for details.
[soda-core] 3.2.2
28 February 2024
Fixes and features
Fix assets folder by @m1n0 in #2020
Fix: timezone mismatch between the recent and historical ad results by @baturayo in #2023
Feature: in anomaly detection simulator use soda core historic check results endpoint instead of test results by @baturayo in #2025
Update dask-sql by @m1n0 in #2026
Refer to the Soda Core Release Notes for details.
[soda-core] 3.2.1
13 February 2024
Fixes and features
Feature: correctly identified anomalies are excluded from training data by @baturayo in #2013
Fix: show more clearly the detected frequency using warning message first by @baturayo in #2014
Fix: simulator streamlit path by @m1n0 in #2017
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #2016
Update oracle_data_source.py by @vinod901 in #2012
Oracle: cast config to str/int to prevent oracledb errors by @m1n0 in #2018
Refer to the Soda Core Release Notes for details.
[soda-core] 3.2.0
08 February 2024
Fixes and features
Feature: implement severity level paramaters by @baturayo in #2001
Always use datasource specifis COUNT expression by @m1n0 in #2003
Fix: anomaly detection feedbacks by @baturayo in #2005
Feature: anomaly detection simulator (#163) by @baturayo in #2010
Dremio Token Support by @JorisTruong in #2009
Refer to the Soda Core Release Notes for details.
[soda-core] 3.1.4 & 3.1.5
24 January 2024
Fixes and features
Hive data source improvements by @robertomorandeira in #1982
Featire: Implement migrate from anomaly score check config by @baturayo in #1998
Bump Prophet by @m1n0 in #2000
Tests: Use approx comparison for floats by @m1n0 in #1999
Dbt: Improve parsing logs by @m1n0 in #1981
Sampler: Fix link href by @m1n0 in #1983
Document group by example for Soda Core with failed rows check by @janet-can in #1984
Schema check: Support custom identity by @m1n0 in #1988
SAS-2735 Add semver release for major, minor and latest by @dirkgroenen in #1993
Bug: Handle null values for continuous dist (#165) by @baturayo in #1994
pre-commit autoupdate by @pre-commit-ci in #1977
Feature: Implement new anomaly detection in soda core by @baturayo in #1995
Feature: Support built-in prophet public holidays by @baturayo in #1997
Refer to the Soda Core Release Notes for details.
[soda-core] 3.1.3
03 January 2024
Fixes and features
Feature: implement warn_only for anomaly score (#156) by @baturayo in #1980
Refer to the Soda Core Release Notes for details.
[soda-core] 3.1.2
15 December 2023
Fixes and features
GCP Delegate Authentication support by @nathadfield in #1973
Fix anomaly detection frequency aggregation bug by @baturayo in #1975
Upgrade pydantic from v1 to v2 by @baturayo in #1974
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1938
Refer to the Soda Core Release Notes for details.
[soda-core] 3.1.1
04 December 2023
Fixes and features
Update python api docs by @m1n0 in #1967
Make custom identity fixed as v4 by @m1n0 in #1968
Freshness: support in-check filters by @m1n0 in #1970. Documentation to follow shortly.
Refer to the Soda Core Release Notes for details.
[soda-core] 3.1.0
16 November 2023
Fixes and features
Introducing the launch of data contracts, Soda’s experimental way to set data quality standards for data products.
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.54
14 November 2023
Fixes and features
Failed rows check: support thresholds by @m1n0 in #1960
Updated install doc to include MotherDuck support via DuckDB by @janet-can in #1963
Sqlserver: remove % from pattern by @chuwangBA in #1956
Sqlserver: support quoting tables using brackets, “quote_tables” mode by @m1n0 in #1959
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.52 & 3.0.53
02 November 2023
Fixes and features
Freshness: support mixed thresholds by @m1n0 in #1957
Add License to every package by @m1n0 in #1958
Fix: compute value counts in DB rather than in python for categoric d… by @baturayo in #1948
Feature: Add Dask/Pandas configurable data source naming support by @dirkgroenen in #1951
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.51
11 October 2023
Fixes and features
Allow specification of virtual file name for add sodacl string by @m1n0 in #1943
Duckdb: support csv, parquet and json file formats by @PaoloLeonard in #1942
BigQuery: support job Labels by @data-fool in #1947
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.50
27 September 2023
Fixes and features
Add thresholds and diagnostics to scan result by @m1n0 in #1939
Fix databricks numeric types profiling by @m1n0 in #1941
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.49
19 September 2023
Fixes and features
Feature: Reference check: support must NOT exist by @m1n0 in #1937
Chore: remove redundant workflow by @dirkgroenen in #1931
Catch exceptions while building results file by @m1n0 in #1936
Pre-commit.ci: pre-commit autoupdate by @pre-commit-ci in #1935
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.48
11 August 2023
Fixes
Update docs to include experimental support for soda-core-teradata by @janet-can in #1916
Remove duckdb constraint by @JCZuurmond in #1921
Remove Vertica from build by @vijaykiran in #1923
Fix boolean attributes formatting by @m1n0 in #1925; addresses 400 error in sending results to Soda Cloud that involve checks with a boolean attribute such as a checkbox.
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.42 - 3.0.47
21 July 2023
Fixes
Bump requests and markupsafe by @m1n0 in #1908
Create postgres_example.md by @rolandrmgservices in #1915
Pre-commit.ci: pre-commit autoupdate by @pre-commit-ci in #1905
Refactor: fix dask warnings for deprecated function by @baturayo in #1914
Added support for source and client_tags in trino data source by @deenkar in #1909
Set metric for failed rows check by @m1n0 in #1904
Add experimental Teradata support by @gpby in #1907
Fix multiple group by checks by @vijaykiran in #1918
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.40 & 3.0.41
23 June 2023
Fixes
Do not qualify metadata queries by @m1n0 in #1896
Pin dask-sql by @m1n0 in #1890
Added Soda Core docs by @janet-can in #1893
Formatting adjustments to docs by @janet-can in #1894
Handle scenario where schema cannot be obtained by @m1n0 in #1895
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.38 & 3.0.39
08 June 2023
Fixes
Core: Upgrade compatible duckdb version by @vijaykiran in #1875
Core: Revised README slightly for updated language by @janet-can in #1876
Cloud related code cleanup + simple log buffer by @m1n0 in #1881
Core Fix: Athena timestamp precision by @m1n0 in #1886
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.35, 3.0.36, 3.0.37
11 May 2023
Fixes
Snowflake: Add database.schema prefix for snowflake by @vijaykiran in #1872
No Changes in 3.0.36 & 3.0.37 - due to PyPi incident package publishing was broken during 3.0.36 release.
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.33 & 3.0.34
10 May 2023
Fixes and Features
Core: user defined checks using sql file by @vijaykiran in #1859
Scientific: implement optional sample field for DRO update by @baturayo in #1848
Core: Improved graphs and diagnostics for Schema check in Cloud by @m1n0 in #1789
Core: total failed rows for derived checks by @m1n0 in #1857
Core: dbt - use generic check type by @m1n0 in #1858
Core: snowflake - remove unnecessary dependencies by @vijaykiran in #1862
Core: pre-commit autoupdate by @pre-commit-ci in #1855
Core Generic check type for Schema check by @m1n0 in #1789
Scientific: pin pandas<2.0.0 on scientific lib by @bastienboutonnet in #1869
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.31 & 3.0.32
18 April 2023
Fixes and features
Core: Document Analyze Table in tests by @m1n0 in #1844
Core: Apply dataset filters to reference check by @m1n0 in #1846
Core: Cleanup Group Evolution by @vijaykiran in #1853
Scientific: Fix: improve error handling when invalid metric is provided to AnomalyMetricCheck by @tituskx in #1850
Scientific: feat: derive bins and weights for categorical DRO in SQL warehouse by @tituskx in #1847
Core: Upgrade snowflake connector version by @vijaykiran in #1854
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.29 & 3.0.30
21 March 2023
Fixes and features
Core: Use correct failing/passing query on aggregatedd query checks by @m1n0 in #1837
Core: Group by by @vijaykiran in #1840
Core: Fix wrong parsing of scheme for soda_cloud by @milanaleksic in #1841
Scientific: Fix: do not return anomalyProbability by @tituskx in #1807
Core: Add group evolution check by @vijaykiran in #1843
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.28
09 March 2023
Fixes and features
Cloud: Fix freshness check cloud payload by @m1n0 in #1814
Cloud: Send the link details from sample configuration by @vijaykiran in #1832
Vertica: Fix profiling by @m1n0 in #1826
Core: Duplicate check: specify table for * query by @m1n0 in #1829
Duckdb: Support connection configuration by @m1n0 in #1830
Trino: Remove experimental python types arg by @vijaykiran in #1831
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.24 & 3.0.25
23 February 2023
Fixes and features
Cloud: Fix freshness check cloud payload by @m1n0 in #1814
Core: Duplicate check: get/send both raw and aggregated rows by @m1n0 in #1818
Core: Fix condition for fail queries by @vijaykiran in #1811
Core: Do not create sample rows when limit is 0 by @m1n0 in #1820
Dremio: Update dremio_data_source.py by @aayush16 in #1824
CI: Distinct PR and Nightly test runs by @m1n0 in #1821
CI: Fix tests run by @m1n0 in #1825
Core: Initial group by check support by @vijaykiran in #1827
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.24 & 3.0.25
23 February 2023
Fixes and features
Core: Updates to README by @baturayo @janet-can
Core: Tiny update to readme subtitle byin #1806
Core: Improve profiling tests by @m1n0 in #1805
Core: Fix sampler parsing by @vijaykiran in #1813
Core: Ground work for query generation changes, and refactor reference check into two queries by @m1n0 in #1812
Core: Bump telemetry versions by @vijaykiran in #1810
Scientific: Fix: return yhat not trend as anomalyPredictedValue by @tituskx in #1802
Scientific: Refactor: only use last n_points in detect_anomalies method by @tituskx in #1808
Scientific: Fix: skip/ignore measurements does not work by @tituskx in #1799
Scientific: Fix anomaly result to expect feedback outside of anomaly diagnostics by @baturayo in #1797
Cloud: generic check API changes by @m1n0 in #1778
Cloud: Check attributes: send numeric type as string by @m1n0 in #1798
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.23
16 February 2023
Fixes and features
Core: Optimize duplicate query by @m1n0 in #1781
Core: Pin duckdb version by @baturayo in #1791
Core: Invalid values/format/regex for validity checks by @m1n0 in #1793
Core: Data source utils: support config as string by @m1n0 in #1796
Profiling refactor: profiler implementation by @baturayo in #1775
Anomaly detection: Fix - confidence bounds of anomaly detection sit too close together by @tituskx in #1780
Anomaly detection: Bug - anomaly detection doesn’t use more than 3 historical check results by @baturayo in #1787
Cloud: Fix verbose mode in cloud requests by @m1n0 in #1792
CI: Update workflow.yml by @vijaykiran in #1794
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.22
25 January 2023
Fixes and features
Vertica: Add Vertica Support by @mkovalyshev in #1771
Dask/Pandas: Dask and Pandas support by @baturayo in #1671
Profiling: Fix table and column inclusion/exclusion works correctly for profiling by @tituskx in #1735
Cloud: Add support for link text for failed row sampler messages by @vijaykiran in #1772
Cloud: Gather all queries in the query object by @m1n0 in #1773
Cloud: Add possibility to configure the URL scheme for Soda Cloud by @dakue-soda in #1776
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.21
19 January 2023
Fixes and features
Core: Check attributes: Enable tests and fix support on all check types by @m1n0 in #1767
Core: CI: Use the Github token from the pipeline as a dispatch token by @dakue-soda in #1768
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.20
19 January 2023
Fixes and features
Dremio: Fixes for profiling and discovery by @vijaykiran in #1764
Core: Attributes: add timezone to dates by @m1n0 in #1765
Core: Deprecation: actually remove execution of row count query in dataset discovery by @bastienboutonnet in #1763
Core: Add dataset filter test case starting with % by @baturayo in #1756
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.19
11 January 2023
Fixes
Fix Docker build issue.
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.18
11 January 2023
Features and fixes
Core: apply in-check filters to duplicate check by @m1n0 in #1748
Core: Deprecate global data source ‘disable_samples’, use one from sampler by @m1n0 in #1749
Core: Fix for user defined query check failure on zero result by @vijaykiran in #1750
Core: Remove unused soda_cloud property by @m1n0 in #1753
Core: Better warning message for invalid check without valid spec by @m1n0 in #1752
Cloud: Get available check attributes schema using correct cloud api by @m1n0 in #1751
Cloud: Skip all checks if invalid attributes found by @m1n0 in #1758
Profiling: Add row count to profiling by @baturayo in #1747
Dremio: Fix profiling for schemas with . s by @vijaykiran in #1757
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.17
28 December 2022
Features and fixes
Core: Custom query null result handling by @m1n0 in #1739
Core: Duplicate count workaround for complex types by @m1n0 in #1738
Core: Warn about unsupported multi-argument numeric checks by @m1n0 in #1741 and #1742
Core: Send check attributes to cloud by @m1n0 in #1743
API: Experimental connection API by @vijaykiran in #1745
Dremio: fix profiling issues by @vijaykiran in #1746
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.16
15 December 2022
Features and fixes
Cloud: Do not upload more than 100 rows to Cloud when no limit is specified by @m1n0 in #1719
Core: Bump requests version by @m1n0 in #1721
Core: Fix history-loss when custom identity is provided by @vijaykiran in #1720
Core: Fix json serialisation for HTTPSampler by @vijaykiran in #1723
Core: Bump dependency versions by @vijaykiran in #1728
Core: Remove cloud traces from telemetry by @m1n0 in #1729
Dremio: Fix profiling query by @vijaykiran in #1730
Scientific: Log metric identity when getting historical metrics for anomaly check for easier debugging by @bastienboutonnet in #1731
Databricks/Spark: Fix listing of tables by @vijaykiran in #1736
Sparkdf: Fix schema info with partition info by @vijaykiran in #1737
Snowflake: Add snowflake arg to allow temp credential file in Linux by @wintersrd in #1714
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.15
12 December 2022
Features and fixes
Core-267 variables by @m1n0 in #1716
Add snowflake arg to allow temp credential file in Linux by @wintersrd in #1714
Do not upload more than 100 rows to Cloud when no limit is specified by @m1n0 in #1719
Bump requests version by @m1n0 in #1721
Fix history-loss when custom identity is provided by @vijaykiran in #1720
Check attributes by @m1n0 in #1718
Fix json serialisation for HTTPSampler by @vijaykiran in #1723
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.14
01 December 2022
New features and improvements
Core: Date format fixes by @vijaykiran in #1691
Core: Variables everywhere by @m1n0 in #1700
Core: Update docker by @vijaykiran in #1699
Core: Refactor duplicate check into two queries by @m1n0 in #1698
Core: Remove row-count derivation from dataset discovery by @bastienboutonnet in #1706
Core: Support variables in configuration by @m1n0 in #1705
Core: Fix schema checks with table filter by @m1n0 in #1704
Core: Update CI to test support for python 3.10 @vijaykiran
SQL Server: Fix email regex, do not allow empty string by @m1n0 in #1688
Spark: Respect verbose setting when running a test query by @m1n0 in #1697
Spark: Remove unnecessary logging by @vijaykiran
Cloud: Updates to HTTP Sampler by @vijaykiran in #1702
Cloud: Ensure that profiling does not lowercase columns by @tituskx in #1687
Docs: Updates to list of compatible data sources. by @janet-can in #1694
New data source: Duckdb support (experimental) by @vijaykiran in #1709
New data source: Denodo support (experimental) by @vijaykiran in #1710
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.13
15 November 2022
New features and improvements
Core: Support Date type in freshness checks by @m1n0 in #1667
Core: Log current time to logs by @m1n0 in #1676
Core: Fixes to sampler, add logging by @vijaykiran in #1690
Core: Add file/check location to scan summary by @m1n0 in #1675
Core: Regex: support ‘+’ in email format by @m1n0 in #1677
Core: Generate passing query for built-in checks by @m1n0 in #1668
Core: Test statistical functions on all data sources by @m1n0 in #1678
Core: Add samples limit to queries by @m1n0 in #1685
Cloud: Add message configuration option to sampler by @vijaykiran in #1686
Oracle Oracle DB Support by @vijaykiran in #1682
Scientific: feat: support sampling for distribution checks by @baturayo in #1666
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.12
03 November 2022
New features and improvements
Core: Duplicate percent check by @m1n0 in #1649
Core: Change over time - remove ‘same day last month’ by @m1n0 in #1648
Core: Failed rows exclude columns by @m1n0 in #1657
Core: Introduce http sampler by @vijaykiran in #1665
Core: Modify Test Column Names by @tdstark in #1652
Cloud: Do not send null file ref, when failed rows are disabled by @vijaykiran in #1650
Scientific feat: Allow use of in-check filters for distribution checks by @tituskx in #1655
Trino: Update trino_data_source.py by @ScottAtDisney in #1658
MS SQL Server: Change count to big_count by @vijaykiran in #1660
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.11
19 October 2022
New features
Cloud: Change over time - add same day/month support by @m1n0 in #1645
Core: Verify data source connection command by @m1n0 in #1636
Enhancements and bug fixes
Core: Parse cli variables correctly, fix cli tests to actually assert result. by @m1n0 in #1634
Core: variable substitution in schema check query by @ceyhunkerti in #1628
Redshift: use SVV_COLUMNS to get table metadata by @m1n0 in #1635
Scientific: fix: limit the bin size and handle zero division for continious DRO by @baturayo in #1624
Scientific: fix: handle DRO generation for columns with 0 rows by @baturayo in #1627
Scientific: chore: pin prophet to >=1.1 by @bastienboutonnet in #1629
Scientific: refactor: add bins and weights doc link to DRO exception handling logs by @baturayo in #1633
Scientific: (anomaly_check): only send outcomeReasons with severity “warn” or “error” by @tituskx in #1640
Snowflake: use upper case in table metadata query by @m1n0 in #1639
Trino: fix py310 type hints by @m1n0 in #1641
BiQuery: fixing bq separate compute storage project by @thiagodeschamps in #1638
BiQuery: fix distribution check by @m1n0 in #1647
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.10
05 October 2022
New features
Dremio: First version of Dremio support by @vijaykiran in #1618
Core: Sample size is configurable for all failed row checks by @m1n0 in #1608
Enhancements and bug fixes
Core: Skip change over time checks when historical measurements not available by @m1n0 in #1615
Core: Include psycopg2 requirement for redshift by @m1n0 in #1620
Core: Use correct dicts when building scan result by @m1n0 in #1612
Cloud/dbt: Add Check source field for cloud by @m1n0 in #1614
Scientific feat: check historical metrics are not None or log helpful message by @bastienboutonnet in #1600
Scientific fix: handle very large bin sizes by filtering out outliers for dro generation by @baturayo in #1616
Scientific fix: ensure PSI and SWD can deal with decimal.Decimal type by @tituskx in #1611
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.9
28 September 2022
Enhancements and bug fixes
Limit failed rows sample limit to 1000 by @m1n0 in #1599
Add scan result getter by @m1n0 in #1602
BigQuery separate project for compute and storage. by @m1n0 in #1598
Scan results file argument by @vijaykiran in #1603
Chore/move snowflake account by @jmarien in #1607
Use filename in check identity by @m1n0 in #1606
Refer to the Soda Core Release Notes for details.
Troubleshoot
Problem: When you run a scan using Soda Core 3.0.9, you get an error message that reads, from google.protobuf.pyext import _message ImportError: dlopen(.../site-packages/google/protobuf/pyext/_message.cpython-310-darwin.so, 0x0002): symbol not found in flat namespace
Solution: This is the result of a transitive dependency from open telemetry that gathers OSS usage statistics. To resolve:
From the command-line, in the directory in which you installed your soda-core package, run
pip uninistall protobuf
.Reinstall protobuf with the command
pip install protobuf==3.19.4
.
[soda-core] 3.0.8
22 September 2022
Soda Core: Add variable resolution to queries/thresholds @vijaykiran in #1597
Soda Core: Scan results dict API method by @m1n0 in #1595
Soda Core: Minor edits to CLI help messages. by @janet-can in #1590
Soda Cloud: Fix change-over-time checks with percentage with no extra config by @m1n0 in #1592
Soda Cloud: Prevent empty message in outcomeReasons by @bastienboutonnet in #1596
Soda Scientific: Raise more user-friendly log messages when importing sci library fails by @bastienboutonnet in #1584
dbt: Fix sending correct table name to Soda Cloud @vijaykiran in #1587
BigQuery: Add context authentication and impersonation for BigQuery by @tooobsias in #1588
SQLServer: Basic Sqlserver regex support by @m1n0 in #1586
MySQL/MariaDB: Fix mysql/mariadb compatibility for regex by @vijaykiran in #1591
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.7
13 September 2022
Core: Update freshness value to be milliseconds and add measure by @vijaykiran in #1575
Core: Resolve variables in user defined queries by @vijaykiran in #1577
dbt: Add configurable API URL for dbt cloud by @vijaykiran in #1576
dbt: Add
dbt:
prefix to dbt check results in Soda Cloud by @vijaykiran in #1574dbt: Fix dbt cloud ingest, improve logging. by @m1n0 in #1578
dbt: Fix dbt checks not being sent properly to Soda Cloud by @vijaykiran in #1580
MySQL: Fixed port option a @ScottAtDisney in #1579
MySQL: Fix regex tests for mysql by @vijaykiran in #1583
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.6
07 September 2022
Fixed: add identityB to add datasource name in identity by @vijaykiran in #1556
Databricks SQL support by @vijaykiran in #1559
Added application flag to snowflake connect by @tombaeyens in #1561
Added identites by @vijaykiran in #1569
Added support for custom sampler by @vijaykiran in #1570
Handle numerical column/table names by @m1n0 in #1572
dbt ingestion support by @m1n0 in #1552
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.5
24 August 2022
New features
Support for Trino data source by @ScottAtDisney in #1553
Enhancements and bug fixes
Fix ‘missing format’ in numeric metrics by @m1n0 in #1549
Fix duplicate query by @m1n0 in #1543
Refactor: turn no matching table error into a warning to avoid scan failing when all tables are excluded by @bastienboutonnet in #1533
Add comments explaining cloud payload by @m1n0 in #1545
Add data source contributing docs by @m1n0 in #1546
Feature, profiling: add support for extra numeric and text datatypes by @bastienboutonnet in #1534
Change spark installation to decouple dependencies for Hive and ODBC by @vijaykiran in #1554 Read more about installing the dependencies separately, as needed.
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.4
10 August 2022
Testing switch to 22.04 for GA by @jmarien in #1521
Log and trace Soda Cloud trace IDs by @m1n0 in #1520
Update docker image for sqlserver support by @vijaykiran in #1522
Add option to set scan datatime by @vijaykiran in #1531
Add MySQL Support by @vijaykiran in #1526
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.3
27 July 2022
New Features
MS SQLServer support by @vijaykiran in #1515
IBM DB2 support
Bug Fixes
Fix: better logging messages for profiling and discover datasets by @baturayo in #1498
Fix config file creation when first path is not writable by @m1n0 in #1504
fix: Failed rows don’t consider filter by @vijaykiran in #1505
Fix log message by @m1n0 in #1507
Fix reference check for null values in source column by @m1n0 in #1509
Attach sample rows to reference check by @m1n0 in #1508
Make sure results to sodacloud are sent when there is an exception by @vijaykiran in #1510
Fix for regex on collated columns in Snowflake by @ScottAtDisney in #1516
Enhancements
Check name refactor by @m1n0 in #1502
Set basic telemetry scan data even in case of exceptions by @m1n0 in #1512
Improve athena text fixture auth setup by @m1n0 in #1501
Publish data source packages for python 3.7 by @m1n0 in #1514
Inform about wrong check indentation in logs by @m1n0 in #1517
Feat: skip row count query during column profiling by @bastienboutonnet in #1518
Feat: support ‘text’ data type in column profiling by @bastienboutonnet in #1519
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.2
18 July 2022
Enhancements and New Features
IBM db2 support
Support cli –version to output core version
Warn users when quotes are present in include excludes identifiers
Add samples limit to failed rows checks
BQ expose remaining client params and auth methods
Enable Snowflake Tokens
Treat zero missing or invalid rows as zero percent
Bug Fixes
Make name optional for failed rows
Use exception rather than exc_info to render traceback in soda-core logger’s call of prophet model
Stored row count in cloud is wrong
Handle exceptions from scientific library and log them instead or letting them raise
Spark DF: update example api usage
Change default scan definition name
BQ: remove schema, use dataset only
Use default distribution comparison method when user has not provided one
Fix utc timezone handling
Improve profiling test for all tables and all columns
Fix utc timezone handling
Set redshift host before trying to fetch credentials
Change unassigned min and max variables for profiling logs
Use check name in Metric checks
If anomaly detection fails other check results are not sent to cloud
Prevent empty table list from running all tables
Profile column parsing fails when user provides illegal column spec
Join check text with newlines instead of /n
Infra/CI
Async Docker image building through Actions and dispatch
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.1
29 June 2022
Re-introduce Spark for the Docker image by @jmarien in #1458
Build: require strict prophet v1.0.0 in scientific library by @bastienboutonnet in #1459
Comment for pinned prophet version by @m1n0 in #1460
Fix the e parameter by @jmarien in #1461
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0
28 June 2022
This is the general availability release for Soda Core with Soda CL.
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0rc3 Beta
27 June 2022
Doc: add comment about ordinal_position ordering by @bastienboutonnet in #1428
Refactor: use filesystem abstractions in distribution check by @baturayo in #1423
Fix: distribution check athena compatibility by @bastienboutonnet in #1429
Feat: profile and discover view tables by @baturayo in #1416
Code style section in contrib docs by @m1n0 in #1432
Unify data source api, remove redundant code. by @m1n0 in #1433
Fix: support athena in column profiling by @bastienboutonnet in #1430
Column profiling metadata fix by @tombaeyens in #1431
Feat: Support profile columns inclusion/exclusion behaviour for Spark by @baturayo in #1437
CORE-63 Added relative percentage change over time by @tombaeyens in #1435
Feat: Raise a MissingBinsAndWeights exception if soda scan runs without distribution_reference present by @tituskx in #1421
Flatten data source configuration schema by @m1n0 in #1441
Fix: Suppress prophet’s pandas: frame.append deprecation warning by @tituskx in #1440
Feat: send outcome reason to cloud for anomaly detection and schema checks by @baturayo in #1390
Add private key and other extra params to snowflake by @m1n0 in #1446
Feat: refer to DROs by name by @tituskx in #1422
Change: rename the update command to update-dro as it better describes what the command is used for by @tituskx in #1444
Feat/fix: ensure empty bins for integer columns are not created and fix bin width derivation by @baturayo in #1447
Do not quote table names in for-each block by @m1n0 in #1449
Feat: add env based option to run tests on views by @vijaykiran in #1442
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0rc2 Beta
22 June 2022
feat: add wasserstein distance and PSI methods to distribution checks by @tituskx in #1395
CORE-24 New freshness syntax by @tombaeyens in #1400
Verify that in spark-df arrays & structs don’t break anything by @tombaeyens in #1397
feat: add column exclusion to profile columns by @bastienboutonnet in #1396
feat: log no threshold error during parsing and provide more informative error during check summary by @tituskx in #1401
CORE-44 Fixed some extra timestamps to utc by @tombaeyens in #1405
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1407
SODA-23 table dataset rename by @tombaeyens in #1404
feat: send distribution check results to cloud so that they can be plotted by @tituskx in #1402
Update README to include support for Amazon Athena by @stuart-robinson in #1409
refactor: Refactor scan.py to remove code duplicates by @baturayo in #1391
Update CONTRIBUTING to stipulate that users fork the repo. by @janet-can in #1413
Core 70 clean test schemas by @tombaeyens in #1415
fix: hotfix for historic measurements having none values by @baturayo in #1418
CORE-26 Fix change over time results value parsing by @vijaykiran in #1419
CORE-57 improved exception handling when creating data source by @tombaeyens in #1411
Another approach for the Docker image for Soda Core by @jmarien in #1398
Added 5 random chars to CI schema names by @tombaeyens in #1424
Fix drop table statement in test suite by @m1n0 in #1425
SODA-44 Added Z to timestamps in soda cloud json by @tombaeyens in #1408
Added docs on running tests by @tombaeyens in #1426
Fix schema check title by @vijaykiran in #1427
fix: more useful profiling warnings by @bastienboutonnet in #1420
CORE-37 Fixed schema type comparison for BigQuery by @tombaeyens in #1410
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0rc1 Beta
08 June 2022
1175 spark by @m1n0 in #1382
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0b19 Beta
02 June 2022
fix: handle %.% in profile columns properly and other bugs by @bastienboutonnet in #1377
Fix: cope with cloud disabled samples. by @m1n0 in #1393
BQ: regex switch to ‘r’ instead of backslash escaping by @m1n0 in #1394
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0b18 Beta
01 June 2022
Scientific package tests on Athena. by @m1n0 in #1374
Update OT with scan/check counts by @vijaykiran in #1386
feat: add ability to send dataset samples to soda cloud (SODA-284) by @baturayo in #1372
fix: typo in data source package import by @bastienboutonnet in #1387
627 Added default sampler returning a sample that is is not persistent by @tombaeyens in #1385
feat: cap distribution check to 1M rows by default by @tituskx in #1379
refactor: clean up logging for anomaly detection by @bastienboutonnet in #1389
fix: avoid parsing DRO name in distribution check until fully implemented by @bastienboutonnet in #1388
Downgrade markupsafe dependency by @m1n0 in #1392
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0b17 Beta
26 May 2022
Pin versions in core by @vijaykiran in #1383
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0b16 Beta
26 May 2022
Fixing suffix scanning of configuration and check files by @tombaeyens in #1365
Refactored to actual table and actual column names by @tombaeyens in #1370
Send Soda Cloud logs by @tombaeyens in #1380
Prevent upload when no sample rows are present by @tombaeyens in #1378
refactor: inform when columns are skipped in profiling via logs by @bastienboutonnet in #1375
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0b15 Beta
23 May 2022
refactor: remove darts dependency by @bastienboutonnet in #1362
refactor: remove code duplication in sodacl_parser by @baturayo in #1361
SODA-248 fixed change over time checks by @tombaeyens in #1366
Athena support by @m1n0 in #1367
Added for each schema check by @tombaeyens in #1368
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0b14 Beta
19 May 2022
Add defaultDataSource to cloud payload by @vijaykiran in #1359
fix: provide docker image with soda-scientific packaged by @bastienboutonnet in #1355
Fixing data source validity error message by @tombaeyens in #1357
Added cython to the setup.py file by @tituskx in #1360
#1353 SODA-494 Fixing recursive loading of files by @tombaeyens in #1356
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0b13 Beta
18 May 2022
1237 samples2 by @tombaeyens in #1340
Getting disable samples from cloud config by @tombaeyens in #1348
Test anomaly detection for numeric metrics by @baturayo in #1349
Updated contributing, fixed logs and added hint in comment how to add… by @tombaeyens in #1350
Fix: Automated monitoring revert issues SODA-489 by @baturayo in #1351
Soda 159 - Test anomaly detection for nested metrics by @baturayo in #1354
Refer to the Soda Core Release Notes for details.
[soda-core] 3.0.0b12 Beta
16 May 2022
Fix date eu/us formats. by @m1n0 in #1334
1237 samples by @tombaeyens in #1328
feat: column profiling by @bastienboutonnet in #1322
Switch to latest prophet by @vijaykiran in #1335
throw log error and return empty string if histogram assumption broken by @bastienboutonnet in #1337
Freshness send microseconds to cloud. by @m1n0 in #1338
Cloud: timestamps use seconds resolution by @m1n0 in #1339
Deleted docs folder by @tombaeyens in #1343
feat: implement automated monitoring executor/runner by @baturayo in #1323
feat: add table discovery by @bastienboutonnet in #1341
fix(profiling): allow null results in text column aggregates by @bastienboutonnet in #1344
Fix: update check identity in case of automated monitoring by @baturayo in #1346
Refer to the Soda Core Release Notes for details.
[soda-core] 0.0.1 Beta
22 March 2022
This release marks the launch, or first beta release, of Soda Core and Soda Checks Language.
Reference the Soda Core OSS and SodaCL documentation for information on how to use the new CLI tool and domain-specific language for reliability.
Last updated
Was this helpful?