
Install Soda SQL

Soda SQL is a command-line interface (CLI) tool that enables you to scan the data in your database to surface invalid, missing, or unexpected data.



Use Soda SQL to scan a variety of data warehouses:

Amazon Athena
Amazon Redshift
Apache Hive
GCP BigQuery
Microsoft SQL Server
PostgreSQL
Snowflake


To use Soda SQL, you must have the following installed on your system:

  • Python 3.7 or greater. To check your existing version, use the CLI command: python --version
  • Pip 21.0 or greater. To check your existing version, use the CLI command: pip --version
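
The two version checks above can be scripted. The following is a minimal sketch, not part of Soda SQL: version_ge is a hypothetical helper name, and the script assumes that python3 is the interpreter on your PATH and that awk is available.

```shell
#!/bin/sh
# version_ge HAVE NEED: succeeds when dotted version HAVE is >= NEED.
# Hypothetical helper; compares up to three numeric components.
version_ge() {
  awk -v have="$1" -v need="$2" 'BEGIN {
    split(have, h, "."); split(need, n, ".")
    for (i = 1; i <= 3; i++) {
      if (h[i] + 0 > n[i] + 0) exit 0
      if (h[i] + 0 < n[i] + 0) exit 1
    }
    exit 0
  }'
}

# Assumption: python3 is the interpreter you intend to use with Soda SQL.
py_version=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null)
# "pip --version" prints "pip X.Y.Z from ..."; the second field is the version.
pip_version=$(python3 -m pip --version 2>/dev/null | awk '{print $2}')

if version_ge "${py_version:-0}" 3.7; then
  echo "Python ${py_version}: OK"
else
  echo "Python 3.7 or greater is required (found: ${py_version:-none})"
fi

if version_ge "${pip_version:-0}" 21.0; then
  echo "pip ${pip_version}: OK"
else
  echo "pip 21.0 or greater is required (found: ${pip_version:-none})"
fi
```

The comparison is numeric per component, so 3.10 is correctly treated as newer than 3.7.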

For Linux users only, install the following:

  • On Debian Buster: apt-get install g++ unixodbc-dev python3-dev libssl-dev libffi-dev
  • On CentOS 8: yum install gcc-c++ unixODBC-devel python38-devel libffi-devel openssl-devel

For MSSQL Server users only, install the following:


From your command-line interface, run the following command, replacing soda-sql-yourdatawarehouse with the install package that matches the type of warehouse you use to store data.

$ pip install soda-sql-yourdatawarehouse

Data warehouse      Install package
Amazon Athena       soda-sql-athena
Amazon Redshift     soda-sql-redshift
Apache Hive         soda-sql-hive
GCP BigQuery        soda-sql-bigquery
MS SQL Server       soda-sql-sqlserver
PostgreSQL          soda-sql-postgresql
Snowflake           soda-sql-snowflake
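
If you script installs for several warehouses, the package names listed above can be wrapped in a small lookup. This is only an illustrative sketch: soda_package_for is a hypothetical helper, not a Soda SQL command, and the short warehouse keys are simply the suffixes of the package names.

```shell
#!/bin/sh
# soda_package_for KEY: print the install package for a warehouse key.
# Hypothetical helper; the keys are the suffixes of the listed package names.
soda_package_for() {
  case "$1" in
    athena|redshift|hive|bigquery|sqlserver|postgresql|snowflake)
      printf 'soda-sql-%s\n' "$1"
      ;;
    *)
      echo "unknown warehouse: $1" >&2
      return 1
      ;;
  esac
}

# Example usage (left commented out so the sketch installs nothing by itself):
# pip install "$(soda_package_for snowflake)"
```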

Optionally, you can install Soda SQL in a virtual environment. Execute the following commands one by one:

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install soda-sql-yourdatawarehouse


To upgrade your existing Soda SQL tool to the latest version, use the following command:

pip install soda-sql-yourdatawarehouse -U


Problem: Soda SQL has known issues when used with pip version 19.
Solution: Upgrade pip to version 20 or greater using the following command:

$ pip install --upgrade pip

Problem: Upgrading Soda SQL does not seem to work.
Solution: Run the following command to skip your local cache when upgrading your Soda SQL version:

$ pip install --upgrade --no-cache-dir soda-sql-yourdatawarehouse
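
To confirm that an upgrade took effect, you can compare the installed version before and after. The following is a small sketch that reads the Version field from pip show output; installed_version is a hypothetical helper name, and soda-sql-postgresql stands in for whichever install package you use.

```shell
#!/bin/sh
# installed_version: read "pip show" output on stdin and print the Version field.
# Hypothetical helper, not part of Soda SQL.
installed_version() {
  awk -F': ' '/^Version:/ {print $2}'
}

# Assumption: the PostgreSQL package; substitute your own install package.
# Prints nothing if the package is not installed.
python3 -m pip show soda-sql-postgresql 2>/dev/null | installed_version
```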

Problem: I can’t run the soda command in my CLI. It returns command not found: soda.
Solution: If you followed the instructions to install Soda SQL and still receive the error, you may need to adjust your $PATH variable.

  1. Run the following command to find where pip installed Soda SQL. If you do not use PostgreSQL, replace soda-sql-postgresql with the install package that matches your warehouse:
    pip show soda-sql-postgresql

    The output includes a Location line that looks something like this example:
    Location: /Users/yourname/Library/Python/3.8/lib/python/site-packages
  2. Add the bin directory that sits next to that lib directory to your $PATH variable using the export PATH command as follows:
    export PATH=$PATH:/Users/yourname/Library/Python/3.8/bin
  3. Run the soda command again to receive the following output:
    Usage: soda [OPTIONS] COMMAND [ARGS]...
      Soda CLI version
      --help  Show this message and exit.
      analyze  Analyzes tables in the warehouse and creates scan YAML files...
      create   Creates a new warehouse.yml file and prepares credentials in
      scan     Computes all measurements and runs all tests on one table.
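
The steps above can be sketched as a short script. It assumes the layout shown in the example Location, where the bin directory sits beside the lib directory; other installations may differ. The helper names bin_dir_from_location and path_contains are hypothetical, and soda-sql-postgresql again stands in for your install package.

```shell
#!/bin/sh
# bin_dir_from_location LOCATION: drop everything from the last "/lib/" onward
# and append "/bin". Assumes bin sits beside lib, as in the example above.
bin_dir_from_location() {
  printf '%s/bin\n' "${1%/lib/*}"
}

# path_contains DIR: succeeds when DIR is already an entry in $PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: the PostgreSQL package; substitute your own install package.
location=$(python3 -m pip show soda-sql-postgresql 2>/dev/null | awk -F': ' '/^Location:/ {print $2}')

if [ -n "$location" ]; then
  bin_dir=$(bin_dir_from_location "$location")
  if path_contains "$bin_dir"; then
    echo "$bin_dir is already on your PATH"
  else
    # Print the command rather than running it, so you can review it first.
    echo "export PATH=\$PATH:$bin_dir"
  fi
fi
```

An export typed at the prompt only lasts for the current shell session; to keep the change, add the line to your shell's startup file.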

Go further