
Common terms

See Glossary for a full list of Soda terminology.

Term Definition
analyze A Soda SQL CLI command that sifts through the contents of your data source and automatically prepares a scan YAML file for each table. See Create a scan YAML file.
data source A storage location that contains a collection of datasets. A warehouse in Soda SQL is one form of data source. A data source may also imply a compute engine that Soda SQL uses to compute measurements.
dataset A representation of a tabular data structure with rows and columns. A dataset can take the form of a table in PostgreSQL or Snowflake, a stream in Kafka, or a dataframe in a Spark application.
measurement The value for a metric that Soda SQL checks against during a scan. For example, in the test row_count == 5, row_count is the metric and 5 is the measurement.
metric A property of the data in your dataset. See Metrics.
monitor A set of details you define in Soda Cloud that Soda SQL uses when it runs a scan. Sometimes referred to in other systems as a “data quality rule”. For a new monitor, you define: the dataset and column to test, the test itself, an alert, a notification, an owner, and a description. See Create monitors and alerts.
scan A command that executes tests to extract information about data in a data source. See Run a scan.
scan YAML The file in which you configure scan metrics and tests. Soda SQL uses the input from this file to prepare, then run SQL queries against your data. See Scan YAML.
test A Python expression that, during a scan, checks metrics to see if they match the parameters you defined for a measurement. As a result of a scan, a test either passes or fails. See Tests.
warehouse A type of data source.
warehouse YAML The file in which you configure data source connection details and Soda Cloud connection details. See Warehouse YAML and Connect to Soda Cloud.
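
The relationship between metric, measurement, and test in the table above can be sketched in a few lines of Python. This is a minimal illustration and not Soda SQL's internal implementation: the `measurements` values are made up, `evaluate_test` is a hypothetical helper, and the equality test is written with Python's `==` operator because a test is a Python expression.

```python
# Hypothetical illustration only; this is not Soda SQL's internal code.

# Measurements: the value recorded for each metric (made-up numbers).
measurements = {"row_count": 5, "missing_count": 0}

# Tests: Python expressions that refer to metrics by name.
tests = ["row_count == 5", "missing_count == 0", "row_count > 10"]


def evaluate_test(expression, measurements):
    """Return True if the test expression passes for the given measurements."""
    # Metric names resolve to their measurements; builtins are disabled
    # so the expression can only see the metric values.
    return bool(eval(expression, {"__builtins__": {}}, dict(measurements)))


# Each test either passes (True) or fails (False).
results = {test: evaluate_test(test, measurements) for test in tests}

for test, passed in results.items():
    print(f"{test}: {'passed' if passed else 'failed'}")
```

With these assumed measurements, the first two tests pass and the third fails, which mirrors the pass-or-fail outcome described in the test definition.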



Last modified on 15-Sep-21
