
Glossary

alert

A setting that you configure in a Soda Cloud monitor by specifying key:value thresholds which, if exceeded, trigger a notification. See also: notification.

alert configuration

A configuration in a SodaCL check that you use to explicitly specify the conditions that warrant a warn result. See Optional check configurations.
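As a rough illustration of how a warn result sits between pass and fail, the following Python sketch classifies a measurement against two thresholds. The function `evaluate_check` and its parameters `warn_above` and `fail_above` are hypothetical names, not part of Soda Core or SodaCL, and the sketch assumes that larger values are worse.

```python
from typing import Optional


def evaluate_check(measurement: float,
                   warn_above: Optional[float] = None,
                   fail_above: Optional[float] = None) -> str:
    """Classify a measurement as 'pass', 'warn', or 'fail'.

    Hypothetical helper for illustration only; assumes larger values are worse.
    """
    # The fail condition takes precedence over the warn condition.
    if fail_above is not None and measurement > fail_above:
        return "fail"
    if warn_above is not None and measurement > warn_above:
        return "warn"
    return "pass"
```

With `warn_above=5` and `fail_above=10`, a measurement of 7 would produce a warn result in this sketch, while 12 would fail.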

built-in metric

An out-of-the-box metric that you can configure in a checks YAML file. See Metrics and checks.

check

A test for data quality that you write using the Soda Checks Language (SodaCL). See Metrics and checks.

checks YAML

The file in which you define SodaCL checks. Soda Core uses the input from this file to prepare, then run SQL queries against your data. See How Soda Core works.

cloud metric store

The place in Soda Cloud that stores the values of measurements collected over time as Soda Core executes checks.

column

A column in a dataset in your data source.

configuration key

The key in the key-value pair that you use to define what qualifies as a missing or valid value in a column. A Soda scan uses the value of a column configuration key to determine if a check should pass, warn, or fail. For example, in valid format: UUID, valid format is a column configuration key and UUID is the only format of the data in the column that Soda considers valid. See Missing metrics and Validity metrics.
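To make the idea of a validity configuration concrete, here is a minimal Python sketch that counts the values in a column that do not match a UUID pattern. The regular expression, the name `UUID_PATTERN`, and the function `count_invalid` are assumptions made for illustration; they are not Soda's own definition of the UUID format. The sketch also assumes that missing values (`None`) are not counted as invalid.

```python
import re
from typing import Iterable, Optional

# Assumed pattern for the canonical 8-4-4-4-12 hexadecimal UUID layout.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def count_invalid(values: Iterable[Optional[str]],
                  pattern: "re.Pattern[str]" = UUID_PATTERN) -> int:
    """Count non-missing values that do not fully match the pattern."""
    return sum(
        1 for v in values
        if v is not None and pattern.fullmatch(v) is None
    )
```

A check built on such a count could then pass when the result is zero and fail otherwise.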

configuration YAML

The file in which you configure data source connection details and Soda Cloud connection details. See How Soda Core works.

data source

A storage location that contains a collection of datasets, such as Snowflake, Amazon Athena, or GCP BigQuery.

dataset

A representation of a tabular data structure with rows and columns. A dataset can take the form of a table in PostgreSQL or Snowflake, a stream in Kafka, or a DataFrame in a Spark application.

measurement

The value for a metric that Soda Core collects during a scan.

metric

A property of the data in your dataset. See Metrics and checks.

metric store

The component in Soda Cloud that stores metric measurements. This component facilitates the visualization of changes to your data over time.
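The following Python sketch shows one way to picture stored measurements and the change between consecutive scans. The `Measurement` class and the `changes_over_time` function are hypothetical and do not reflect how Soda Cloud actually stores or visualizes data.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Measurement:
    """A single metric value captured during one scan (illustrative only)."""
    metric: str
    value: float
    scanned_at: datetime


def changes_over_time(measurements: List[Measurement]) -> List[float]:
    """Return the difference between each measurement and the previous one,
    after ordering the measurements by scan time."""
    ordered = sorted(measurements, key=lambda m: m.scanned_at)
    return [later.value - earlier.value
            for earlier, later in zip(ordered, ordered[1:])]
```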

monitor

A set of details that you define in Soda Cloud, which the now-deprecated Soda SQL used when it ran a scan. Other systems sometimes refer to this as a “data quality rule”. Soda Cloud displays Soda Core check results as Monitors.
See Create monitors and alerts.

notification

A setting you configure in a Soda Cloud monitor that defines whom to notify when a data issue triggers an alert. See also: alert.

scan

A command that executes tests to extract information about data in a data source. See Run a scan.

SodaCL

The domain-specific language that you use to define Soda Checks in a checks YAML file. A Soda Check is a test that Soda Core executes when it scans a dataset in your data source. See SodaCL documentation.

Soda Cloud

A web application that enables you to examine scan results and create monitors and alerts. Create a Soda Cloud account at cloud.soda.io. If you also use Soda Core, you can connect Soda Core to Soda Cloud.

Soda Core

A free, open-source, command-line tool that enables you to use the Soda Checks Language to turn user-defined input into aggregated SQL queries. You can use it as a stand-alone tool to monitor data quality from the command line, or connect it to a Soda Cloud account to monitor your data using a web application. See Soda Core documentation.

Soda Spark (Deprecated)

Soda Spark was an extension of Soda SQL that allowed you to run Soda SQL functionality programmatically on a Spark DataFrame. It has been replaced by Soda Core configured to connect with Apache Spark DataFrames. Access legacy documentation.

Soda SQL (Deprecated)

Soda SQL was an open-source command-line tool that scanned the data in your data source. It has been replaced by Soda Core. Access legacy documentation.

threshold

The value for a metric that Soda checks against during a scan. See Metrics and checks.

validity rule

In Soda Cloud, the key-value pair that you use to define what qualifies as a missing or valid value in a column. A Soda scan uses the value defined in a validity rule to determine if it should pass or fail a check. See also: configuration key.



Last modified on 01-Jul-22
