Soda Python Libraries

This page describes how to install the Soda Python packages, which are required for running Soda scans via the CLI or Python API.

Installation

Requirements

To use Soda, you must have installed the following on your system.

  • Python 3.8, 3.9, 3.10 or 3.11.

To check your existing version, use the CLI command: python --version or python3 --version. If you have not already installed Python, consider using pyenv to manage multiple versions of Python in your environment.

  • Pip 21.0 or greater.

To check your existing version, use the CLI command: pip --version

  • A Soda Cloud account; see how to Sign up.
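
The version requirements above can also be checked from Python itself. The following is a minimal sketch and not part of Soda: the helper names python_supported and pip_supported are hypothetical, and the pip check only reads the leading major number of a version string such as the one printed by pip --version.

```python
import sys

# Bounds taken from the requirements listed above.
MIN_PYTHON = (3, 8)
MAX_PYTHON = (3, 11)
MIN_PIP_MAJOR = 21  # "Pip 21.0 or greater"


def python_supported(version=None):
    """Return True if (major, minor) lies between 3.8 and 3.11 inclusive."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def pip_supported(version_string):
    """Return True if a pip version string such as '23.2.1' is 21.0 or newer."""
    major = int(version_string.split(".")[0])
    return major >= MIN_PIP_MAJOR


if __name__ == "__main__":
    print("Python version supported:", python_supported())
```

Called without arguments, python_supported() inspects the running interpreter; passing a tuple such as (3, 12) lets you test other versions.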

Best practice is to install the Soda CLI in a virtual environment. If you have not already done so, create and activate a virtual environment in the .venv directory using the commands below. Depending on your version of Python, you may need to replace python with python3 in the first command.

python -m venv .venv
source .venv/bin/activate
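
Before installing any packages, it can help to confirm that the virtual environment is actually active. This is a small sketch using only the standard library; the function name in_virtualenv is hypothetical, and the check assumes an environment created with the venv module as shown above.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print("Virtual environment active:", sys.prefix)
    else:
        print("No virtual environment active; run: source .venv/bin/activate")
```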

Choose an installation flow

Before you install the Soda CLI, decide which installation flow applies to your environment and license type. The two flows available serve different purposes:

Installation flow: Public PyPI installation flow
Use case: Executing data contracts with basic data quality checks on enterprise data sources. Use this installation method if you’re just getting started.
Description: The public PyPI index hosts Soda Core packages for all supported data sources.

Installation flow: Private PyPI installation flow
Use case: Same as above, plus: group by checks, reconciliation checks, migrating checks from v3 to v4, running checks on Oracle data, and capturing failed rows with the Diagnostics Warehouse.
Description: Private PyPI repositories are region-specific and require authentication using your API key credentials. This method ensures secure access to licensed components, enterprise-only extensions, and region-compliant hosting.

Different installations will support different packages. Learn more about which packages are supported in public and private PyPI.


Public PyPI installation flow

To use the open source Soda Core Python packages, you must install them from the public Soda PyPI registry: https://pypi.dev.sodadata.io/simple

  1. Install the Soda Core package for your data source. This gives you access to all the basic CLI functionality for working with contracts.

pip install -i https://pypi.dev.sodadata.io/simple -U soda-postgres

Replace soda-postgres with the appropriate package for your data source. See the Data source reference for Soda Core for supported packages and configurations.

Now you can use the Soda Python libraries.

Supported packages

  • soda: "umbrella" package (does not include Diagnostics Warehouse)

  • Data-source-specific packages: naming pattern is “soda-<datasource>” (e.g. soda-postgres, soda-bigquery, soda-sparkdf, etc.)
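
The naming pattern and the public install command above can be combined programmatically, for example in a setup script. This is a hedged sketch: package_for and public_install_command are hypothetical helper names, and the index URL is the public registry quoted earlier in this section. The script only builds and prints the command; it does not run pip.

```python
import sys

# Public registry URL as given in the public PyPI installation flow.
PUBLIC_INDEX = "https://pypi.dev.sodadata.io/simple"


def package_for(datasource):
    """Apply the soda-<datasource> pattern, e.g. 'postgres' -> 'soda-postgres'."""
    return f"soda-{datasource}"


def public_install_command(datasource):
    """Build the equivalent of: pip install -i <index> -U soda-<datasource>."""
    return [
        sys.executable, "-m", "pip", "install",
        "-i", PUBLIC_INDEX,
        "-U", package_for(datasource),
    ]


if __name__ == "__main__":
    print(" ".join(public_install_command("postgres")))
```

To execute the command rather than print it, the list can be passed to subprocess.run(..., check=True) from the standard library.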


Private PyPI installation flow

If you wish to use commercial extensions to the Soda Core Python package, you must install them from one of the private Soda PyPI registries below. The private PyPI installation process adds an authentication layer and region-based repositories for license-based access control of Team and Enterprise customers.

  1. Upgrade pip inside your new virtual environment.

pip install --upgrade pip

  2. Choose the correct repository based on your license and region.

License
Soda Region
Repository URL

1 Team: any license except "Trial" or "Enterprise" (see below).
2 Enterprise: one of the enterprise, enterprise_user_based, dataset_standard, or premier licenses.

  3. Set your credentials. See how to generate your own API key values.

export SODA_API_KEY_ID="your_key_id"
export SODA_API_KEY_SECRET="your_key_secret"

  4. Execute the following command, replacing soda>=4.0.0b0 with the package that you need to install.

pip install "soda>=4.0.0b0" --pre -i "https://${SODA_API_KEY_ID}:${SODA_API_KEY_SECRET}@enterprise.pypi.cloud.soda.io" --extra-index-url=https://pypi.dev.sodadata.io
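
The steps above can be sketched in Python for use in automation. This is an illustrative example under stated assumptions, not Soda's tooling: the helper names private_index_url and private_install_args are hypothetical, the repository host must be the one that matches your license and region, and the credentials are percent-encoded on the assumption that they may contain characters that are not URL-safe.

```python
import os
from urllib.parse import quote

# Extra index as it appears in the install command above.
PUBLIC_EXTRA_INDEX = "https://pypi.dev.sodadata.io"
REQUIRED_VARS = ("SODA_API_KEY_ID", "SODA_API_KEY_SECRET")


def private_index_url(host, key_id, key_secret):
    """Embed percent-encoded API key credentials in the private index URL."""
    return f"https://{quote(key_id, safe='')}:{quote(key_secret, safe='')}@{host}"


def private_install_args(host, package="soda>=4.0.0b0", env=None):
    """Build pip arguments mirroring the private install command above."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    index = private_index_url(host, env["SODA_API_KEY_ID"], env["SODA_API_KEY_SECRET"])
    return [
        "install", package, "--pre",
        "-i", index,
        f"--extra-index-url={PUBLIC_EXTRA_INDEX}",
    ]
```

For example, private_install_args("enterprise.pypi.cloud.soda.io") returns the argument list for the Enterprise repository shown above, and raises an error if either credential variable is unset. Avoid printing the result, since it contains the API key secret.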

Included packages

Team
  • soda: required for the contract generator (includes Diagnostics Warehouse)

pip install "soda" --pre -i "https://${SODA_API_KEY_ID}:${SODA_API_KEY_SECRET}@team.pypi.cloud.soda.io" --extra-index-url=https://pypi.dev.sodadata.io
  • soda-groupby

  • soda-migration

Enterprise
  • soda: required for the contract generator (includes Diagnostics Warehouse)

pip install "soda" --pre -i "https://${SODA_API_KEY_ID}:${SODA_API_KEY_SECRET}@enterprise.pypi.cloud.soda.io" --extra-index-url=https://pypi.dev.sodadata.io
  • soda-groupby

  • soda-migration

  • soda-reconciliation

  • soda-oracle
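
The package lists above, together with the license notes in the private installation flow, can be captured as a small lookup. This is a sketch for illustration only: the function names are hypothetical, and because this page does not say which repository a Trial license uses, that case returns None rather than guessing.

```python
# Package lists copied from the "Included packages" section above.
TEAM_PACKAGES = {"soda", "soda-groupby", "soda-migration"}
ENTERPRISE_PACKAGES = TEAM_PACKAGES | {"soda-reconciliation", "soda-oracle"}

# License names that count as Enterprise, per the license notes on this page.
ENTERPRISE_LICENSES = {
    "enterprise", "enterprise_user_based", "dataset_standard", "premier",
}


def tier_for_license(license_name):
    """Map a license name to 'team' or 'enterprise'; None for Trial (unspecified)."""
    name = license_name.lower()
    if name in ENTERPRISE_LICENSES:
        return "enterprise"
    if name == "trial":
        return None  # Assumption: not covered by the private repositories here.
    return "team"


def packages_for(tier):
    """Return the set of packages listed for the given tier."""
    return {"team": TEAM_PACKAGES, "enterprise": ENTERPRISE_PACKAGES}[tier]
```

For instance, "soda-oracle" in packages_for(tier_for_license("premier")) evaluates to True, while the same check for a Team license evaluates to False.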
