
Install Soda SQL

Soda SQL is a command-line interface (CLI) tool that enables you to scan the data in your database to surface invalid, missing, or unexpected data.

Compatibility
Requirements
Install
Upgrade
Troubleshoot
Go further

Compatibility

Use Soda SQL to scan a variety of data warehouses:

Amazon Athena
Amazon Redshift
Apache Hive
GCP BigQuery
Microsoft SQL Server
PostgreSQL
Snowflake

Requirements

To use Soda SQL, you must have the following installed on your system:

  • Python 3.7 or greater. To check your existing version, use the CLI command: python --version
  • Pip 21.0 or greater. To check your existing version, use the CLI command: pip --version
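The two version checks above can be combined into one small script. This is a minimal sketch, not part of Soda SQL: the helper names version_at_least, tool_version, and report are hypothetical, and the script assumes that the python and pip commands on your PATH are the ones you intend to use for Soda SQL.

```shell
#!/bin/sh
# Sketch: compare installed Python and pip versions with the stated minimums.
# All function names here are illustrative, not part of Soda SQL.

# version_at_least VERSION MIN_MAJOR MIN_MINOR
# Succeeds when VERSION (for example "3.8.10") is at least MIN_MAJOR.MIN_MINOR.
# The body runs in a subshell so its variables do not leak into the caller.
version_at_least() (
  version=$1 min_major=$2 min_minor=$3
  major=${version%%.*}
  minor=0
  case $version in
    *.*)
      rest=${version#*.}
      minor=${rest%%.*}
      minor=${minor%%[!0-9]*}   # drop suffixes such as "0rc1"
      ;;
  esac
  case $major in
    ''|*[!0-9]*)
      false ;;                  # empty or non-numeric: treat as too old
    *)
      [ "$major" -gt "$min_major" ] ||
        { [ "$major" -eq "$min_major" ] && [ "${minor:-0}" -ge "$min_minor" ]; }
      ;;
  esac
)

# tool_version TOOL
# Prints the second word of "TOOL --version", or nothing if TOOL is missing.
tool_version() {
  command -v "$1" >/dev/null 2>&1 || return 0
  "$1" --version 2>&1 | awk 'NR == 1 { print $2 }'
}

# report NAME VERSION MIN_MAJOR MIN_MINOR
report() {
  if version_at_least "$2" "$3" "$4"; then
    echo "$1 $2: OK"
  else
    echo "$1 ${2:-not found}: need $3.$4 or greater"
  fi
}

report Python "$(tool_version python)" 3 7
report pip "$(tool_version pip)" 21 0
```

If your system exposes Python 3 only as python3, pass python3 to tool_version instead.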

For Linux users only, install the following:

  • On Debian Buster: apt-get install g++ unixodbc-dev python3-dev libssl-dev libffi-dev
  • On CentOS 8: yum install gcc-c++ unixODBC-devel python38-devel libffi-devel openssl-devel

For MSSQL Server users only, install the following:

Install

From your command-line interface, run the following command, replacing soda-sql-yourdatawarehouse with the install package that matches the type of warehouse you use to store data.

$ pip install soda-sql-yourdatawarehouse

Data warehouse          Install package
Amazon Athena           soda-sql-athena
Amazon Redshift         soda-sql-redshift
Apache Hive             soda-sql-hive
GCP BigQuery            soda-sql-bigquery
MS SQL Server           soda-sql-sqlserver
PostgreSQL              soda-sql-postgresql
Snowflake               soda-sql-snowflake
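
As an illustration of how the package name is chosen, the sketch below maps a short warehouse keyword to its install package and prints the resulting command. The function soda_package_for and the keywords it accepts are hypothetical conveniences; only the package names come from the list above.

```shell
#!/bin/sh
# soda_package_for KEYWORD
# Prints the install package for a warehouse keyword, or fails for unknown input.
# The keywords are an assumption of this sketch; the package names are from the docs.
soda_package_for() {
  case $1 in
    athena|redshift|hive|bigquery|sqlserver|postgresql|snowflake)
      echo "soda-sql-$1"
      ;;
    *)
      echo "unknown warehouse: $1" >&2
      return 1
      ;;
  esac
}

# Print the command for PostgreSQL rather than running it.
echo "pip install $(soda_package_for postgresql)"
```

To run the installation directly, replace the echo line with: pip install "$(soda_package_for postgresql)"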

Optionally, you can install Soda SQL in a virtual environment. Execute the following commands one by one:

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install soda-sql-yourdatawarehouse
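
Before running the pip commands, it can help to confirm that the virtual environment is actually active in the current shell. The following sketch relies on the VIRTUAL_ENV variable that the venv activate script exports; the function name in_virtualenv is hypothetical.

```shell
#!/bin/sh
# Succeeds when a virtual environment has been activated in this shell.
# The activate script exports VIRTUAL_ENV, and the deactivate command unsets it.
in_virtualenv() {
  [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
  echo "Active virtual environment: $VIRTUAL_ENV"
else
  echo "No virtual environment is active in this shell."
fi
```

To leave the virtual environment when you are done, run the deactivate command.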

Upgrade

To upgrade your existing Soda SQL tool to the latest version, use the following command:

pip install soda-sql-yourdatawarehouse -U

Troubleshoot

Problem: Soda SQL has known issues with pip version 19.
Solution: Upgrade pip to a newer version (the requirements above call for 21.0 or greater) using the following command:

$ pip install --upgrade pip


Problem: Upgrading Soda SQL does not seem to work.
Solution: Run the following command to skip your local cache when upgrading your Soda SQL version:

$ pip install --upgrade --no-cache-dir soda-sql-yourdatawarehouse
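
One way to confirm whether an upgrade took effect is to compare the version that pip reports before and after. The sketch below extracts the Version field from pip show output; the helper name installed_version is hypothetical, and the sample input uses placeholder values rather than a real Soda SQL release number.

```shell
#!/bin/sh
# Reads `pip show` output on stdin and prints the value of its "Version:" field.
# Prints nothing when the field is absent, for example when the package is not installed.
installed_version() {
  awk -F': ' '$1 == "Version" { print $2 }'
}

# Demonstration with placeholder input (not a real release number):
printf 'Name: soda-sql-yourdatawarehouse\nVersion: 0.0.0\n' | installed_version

# Real usage, assuming the package is installed:
#   pip show soda-sql-yourdatawarehouse | installed_version
```

If the version is unchanged after the upgrade command, check that pip belongs to the same Python environment in which Soda SQL is installed.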

Go further