
Troubleshoot SodaCL

Last modified on 26-Jan-23

Errors with invalid format
Soda does not recognize variables
Missing check results in Soda Cloud
Metrics were not computed for check

Errors with invalid format

Problem: You have written a check using an invalid_count or invalid_percent metric and used an invalid format configuration key to specify the values that qualify as invalid, but Soda returns an error when you run the scan.

Solution: The invalid format configuration key works only with columns of data type TEXT. See Specify valid format.

See also: Tips and best practices for SodaCL


Soda does not recognize variables

Problem: You execute a programmatic scan using Soda Core, but Soda does not seem to recognize the variables you included in the programmatic scan.

Solution: In your programmatic scan, add any variables before the line that identifies the checks YAML file. Refer to Basic programmatic scan for an example.
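The ordering described above can be sketched in Python. This is a minimal illustration, not the official example: configure_scan is a hypothetical helper, and the data source name, file names, and variable in the usage comment are assumptions. It accepts any object that exposes the Soda Core Scan methods set_data_source_name, add_configuration_yaml_file, add_variables, and add_sodacl_yaml_file, and calls add_variables before the checks file is added.

```python
def configure_scan(scan, data_source, config_file, variables, checks_file):
    """Configure a Soda Core Scan-like object, adding variables before checks.

    `scan` is expected to be a soda.scan.Scan instance (or anything with the
    same methods); this helper itself is hypothetical, not part of Soda Core.
    """
    scan.set_data_source_name(data_source)
    scan.add_configuration_yaml_file(file_path=config_file)
    # Add variables first, as the guidance above requires...
    scan.add_variables(variables)
    # ...and only then identify the checks YAML file.
    scan.add_sodacl_yaml_file(checks_file)
    return scan


# Usage sketch (assumes soda-core is installed and the files exist):
#   from soda.scan import Scan
#   scan = configure_scan(Scan(), "adventureworks", "configuration.yml",
#                         {"country": "BE"}, "checks.yml")
#   scan.execute()
```

Keeping the calls in one helper makes the order hard to get wrong when several scripts run scans with different variables.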


Missing check results in Soda Cloud

Problem, variation 1: You have written checks for a single dataset and use variables to provide check input at scan time, as in the example below. However, when you provide a different value for the variable and run the scan, the check result for the previous scan that used a different variable disappears or appears to be overwritten.

checks for test_table_${expected_country}:
  - failed rows:
      name: Check for ${expected_country}
      fail query: |
          select * from test_table where country = ${expected_country}

Problem, variation 2: You wrote one or more checks for a dataset and the scan produced check results for the check as expected. Then, you adjusted the check – for example, to apply to a different dataset, as in the example below – and ran another scan. The latest scan appears in the check results, but the previous check result seems to have disappeared or been archived.

checks for dataset_1:
  - failed rows:
      identity: failed-row-1
      fail query: |
        SELECT DISTINCT busbreakdown_id
        FROM breakdowns
checks for dataset_2:
  - failed rows:
      identity: failed-row-2
      fail query: |
        SELECT DISTINCT busbreakdown_id
        FROM breakdowns

Solution: Soda Cloud archives the results of any check that has been removed from the checks YAML file, whether by deletion or alteration. If two scans use the same checks YAML file, but a check in the file was altered or deleted between the scans, Soda Cloud automatically archives the results of any check that appeared in the file for the first scan but no longer exists in the same file during the second scan.

To force Soda Cloud to retain the check results of previous scans, you can use one of the following options:

  • Write individual checks and keep them static between scan executions.
  • Add the same check to different checks YAML files, then include both checks YAML files in a single scan command.
soda scan -d adventureworks -c configuration.yml checks_test.yml checks_test2.yml
  • Add a check identity and use the -s scan definition option in the scan command to explicitly specify separate scan definitions for each scan.
soda scan -d subscription_statuses -s subscription_statuses-BE -c configuration.yml -v country=BE checks.yml 

soda scan -d subscription_statuses -s subscription_statuses-CA -c configuration.yml -v country=CA checks.yml 
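When the same checks file runs for several variable values, the scan commands can be generated so that each value gets its own scan definition. The following is a minimal sketch that uses only the options shown in the commands above (-d, -s, -c, -v); the helper name build_scan_command and the naming pattern for scan definitions are illustrative assumptions.

```python
import shlex


def build_scan_command(data_source, country,
                       config_file="configuration.yml",
                       checks_file="checks.yml"):
    """Return a `soda scan` argument list with a per-country scan definition."""
    # Assumed naming pattern: <data source>-<country>, as in the commands above.
    scan_definition = f"{data_source}-{country}"
    return [
        "soda", "scan",
        "-d", data_source,
        "-s", scan_definition,
        "-c", config_file,
        "-v", f"country={country}",
        checks_file,
    ]


# Print one command per country; each uses a distinct scan definition.
for country in ("BE", "CA"):
    print(shlex.join(build_scan_command("subscription_statuses", country)))
```

The argument lists can also be passed directly to subprocess.run, which avoids shell quoting issues.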

See also: Add a check identity
See also: Configure a single scan to run in multiple environments.

Metrics were not computed for check

Problem, variation 1: You have written a check using the exact syntax provided in SodaCL documentation but when you run a scan, Soda produces an error that reads something like, Metrics 'schema' were not computed for check 'schema'.

Problem, variation 2: You can run scans successfully on some datasets, but one or two of them always produce errors when Soda tries to execute checks.

Solution: In your checks YAML file, you cannot use a dataset identifier that includes a schema, such as soda.test_table. You can only use a dataset name as an identifier, such as test_table.

However, if you were including the schema in the dataset identifier in an attempt to run the same set of checks against multiple environments, you can do so using the instructions to Configure a single scan to run in multiple environments.
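To spot schema-qualified identifiers before running a scan, a checks YAML file can be searched for "checks for" headers whose dataset name contains a dot. This is a rough sketch using only the Python standard library; the function name is hypothetical, and the regular expression assumes headers of the form shown in this document rather than parsing the full SodaCL syntax.

```python
import re

# Matches the dataset identifier that follows "checks for" at the start of a line.
_HEADER = re.compile(r"^checks for\s+([^\s:\[]+)", re.MULTILINE)


def find_schema_qualified_datasets(checks_yaml_text):
    """Return dataset identifiers that include a schema prefix, e.g. soda.test_table."""
    return [name for name in _HEADER.findall(checks_yaml_text) if "." in name]


sample = """\
checks for soda.test_table:
  - row_count > 0
checks for test_table:
  - row_count > 0
"""

# Only the schema-qualified identifier is reported.
print(find_schema_qualified_datasets(sample))
```

Any identifier the function reports should be shortened to the dataset name alone, such as test_table.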

See also: Add a check identity
