Soda Python Libraries
This page describes how to install the Soda Python packages, which are required for running Soda scans via the CLI or Python API.
Installation
Requirements
To use Soda, you must have installed the following on your system.
Python 3.8, 3.9, 3.10 or 3.11.
To check your existing version, use the CLI command: python --version or python3 --version. If you have not already installed Python, consider using pyenv to manage multiple versions of Python in your environment.
Pip 21.0 or greater.
To check your existing version, use the CLI command: pip --version
A Soda Cloud account; see how to Sign up.
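The version requirements above can be checked with a small script. The following is a minimal sketch, not part of Soda's tooling: the function names `python_version_supported` and `version_at_least` are hypothetical, and the pip check assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does.

```shell
# Hypothetical helper: succeeds if a Python version ($1) is 3.8, 3.9, 3.10 or 3.11.
python_version_supported() {
  case "$1" in
    3.8|3.8.*|3.9|3.9.*|3.10|3.10.*|3.11|3.11.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: succeeds if a dotted version ($1) is at least a minimum ($2).
# Assumes `sort -V` is available; the minimum must sort first or be equal.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_at_least "$(pip --version | awk '{print $2}')" 21.0 && echo "pip OK"` checks the installed pip, and `python_version_supported "$(python3 -c 'import platform; print(platform.python_version())')" && echo "Python OK"` checks the interpreter.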
Best practice dictates that you install the Soda CLI using a virtual environment. If you haven't yet, in your command-line interface tool, create a virtual environment in the .venv directory using the commands below. Depending on your version of Python, you may need to replace python with python3 in the first command.
python -m venv .venv
source .venv/bin/activate
Choose an installation flow
Before you install the Soda CLI, decide which installation flow applies to your environment and license type. The two flows available serve different purposes:
Public PyPI (Soda Core): Executing data contracts with basic data quality checks on enterprise data sources. Use this installation method if you’re just getting started. The public PyPI index hosts Soda Core packages for all supported data sources.
Private PyPI (commercial extensions): Same as above, plus: group by checks, reconciliation checks, migrating checks from v3 to v4, running checks on Oracle data, and capturing failed rows with the Diagnostics Warehouse. Private PyPI repositories are region-specific and require authentication using your API key credentials. This method ensures secure access to licensed components, enterprise-only extensions, and region-compliant hosting.
Public PyPI installation flow
To use the open source Soda Core Python packages, you must install them from the public Soda PyPI registry: https://pypi.dev.sodadata.io/simple
Install the Soda Core package for your data source. This gives you access to all the basic CLI functionality for working with contracts.
pip install -i https://pypi.dev.sodadata.io/simple -U soda-postgres
Replace soda-postgres with the appropriate package for your data source. See the Data source reference for Soda Core for supported packages and configurations.
Now you can use the Soda Python libraries.
Supported packages
soda: "umbrella" package (does not include Diagnostics Warehouse)
Data-source-specific packages: naming pattern is "soda-<datasource>" (e.g. soda-postgres, soda-bigquery, soda-sparkdf, etc.)
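The soda-<datasource> naming pattern can be applied mechanically when scripting installs for several data sources. The sketch below is illustrative only: `soda_package_name` is a hypothetical helper, and it does not check that the resulting package actually exists in the registry.

```shell
# Hypothetical helper: builds a package name from the soda-<datasource> pattern.
# Fails if no data source name is given; it does not verify that the package exists.
soda_package_name() {
  if [ -z "${1:-}" ]; then
    echo "usage: soda_package_name <datasource>" >&2
    return 1
  fi
  printf 'soda-%s\n' "$1"
}
```

With this in place, the public install command for BigQuery could be written as `pip install -i https://pypi.dev.sodadata.io/simple -U "$(soda_package_name bigquery)"`.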
Private PyPI installation flow
If you wish to use commercial extensions to the Soda Core Python package, you must install them from one of the private Soda PyPI registries below. The private PyPI installation process adds an authentication layer and region-based repositories for license-based access control for Team and Enterprise customers.
Upgrade pip inside your new virtual environment.
pip install --upgrade pip
Choose the correct repository based on your license and region.
1 Team: any license except "Trial" or "Enterprise" (see below).
2 Enterprise: one of the enterprise, enterprise_user_based, dataset_standard, or premier licenses.
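The license rules above amount to a simple lookup. The following sketch expresses them as a shell function; `soda_repository_tier` is a hypothetical name, the lowercase spelling `trial` is an assumption, and because this page does not say which repository a Trial license uses, the function treats that case as an error rather than guessing.

```shell
# Hypothetical helper: maps a license name ($1) to a repository tier.
soda_repository_tier() {
  case "$1" in
    # Licenses listed as Enterprise on this page.
    enterprise|enterprise_user_based|dataset_standard|premier)
      echo "enterprise" ;;
    # Assumption: Trial is spelled "trial"; its repository is not stated here.
    trial)
      echo "no repository documented for a Trial license" >&2
      return 1 ;;
    # Any other license counts as Team.
    *)
      echo "team" ;;
  esac
}
```

For example, `soda_repository_tier premier` prints `enterprise`.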
Set your credentials. See how to generate your own API key values.
export SODA_API_KEY_ID="your_key_id"
export SODA_API_KEY_SECRET="your_key_secret"
Execute the following command, replacing soda>=4.0.0b0 with the package that you need to install.
pip install "soda>=4.0.0b0" --pre -i "https://${SODA_API_KEY_ID}:${SODA_API_KEY_SECRET}@enterprise.pypi.cloud.soda.io" --extra-index-url=https://pypi.dev.sodadata.io
Included packages
