See Glossary for a full list of Soda terminology.
|analyze||A Soda SQL CLI command that sifts through the contents of your data source and automatically prepares a scan YAML file for each table. See Create a scan YAML file.|
|data source||A storage location that contains a collection of datasets. A warehouse in Soda SQL is one form of data source. A data source may also imply a compute engine that Soda SQL uses to compute measurements.|
|dataset||A representation of a tabular data structure with rows and columns. A dataset can take the form of a table in PostgreSQL or Snowflake, a stream in Kafka, or a dataframe in a Spark application.|
|measurement||The value for a metric that Soda SQL checks against during a scan. For example, in the test |
|metric||A property of the data in your dataset. See Metrics.|
|monitor||A set of details you define in Soda Cloud which Soda SQL uses when it runs a scan. Sometimes referred to in other systems as a “data quality rule”. For a new monitor, you define: a dataset and column against which to execute a test, a test, an alert, a notification, an owner, and a description. See Create monitors and alerts.|
|scan||A command that executes tests to extract information about data in a data source. See Run a scan.|
|scan YAML||The file in which you configure scan metrics and tests. Soda SQL uses the input from this file to prepare, then run SQL queries against your data. See Scan YAML.|
|test||A Python expression that, during a scan, checks metrics to see if they match the parameters you defined for a measurement. As a result of a scan, a test either passes or fails. See Tests.|
|warehouse||A type of data source.|
|warehouse YAML||The file in which you configure data source connection details and Soda Cloud connection details. See Warehouse YAML and Connect to Soda Cloud.|
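
The relationship between a metric, a measurement, and a test can be sketched in a few lines of Python. This is an illustration only, not Soda SQL's implementation: the `measurements` values, the `evaluate_test` helper, and the three test expressions are all hypothetical, chosen to show how a test written as a Python expression might pass or fail against the values a scan produces.

```python
# Illustrative sketch only; this is NOT how Soda SQL is implemented.
# Hypothetical scan output: each metric name maps to its measured value.
measurements = {"row_count": 120, "missing_count": 0}


def evaluate_test(expression: str, measurements: dict) -> bool:
    """Evaluate a test expression against measurements.

    Returns True when the test passes and False when it fails.
    """
    # Built-ins are disabled so the expression can only see metric names.
    return bool(eval(expression, {"__builtins__": {}}, dict(measurements)))


# Hypothetical tests, each a Python expression over metric names.
tests = ["row_count > 0", "missing_count == 0", "row_count < 100"]

results = {test: evaluate_test(test, measurements) for test in tests}

for test, passed in results.items():
    print(f"{test}: {'passed' if passed else 'failed'}")
```

With these made-up values, the first two tests pass and the third fails, because the measurement for `row_count` is 120.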
Last modified on 15-Sep-21