DuckDB advanced usage
Soda supports DuckDB as a flexible, lightweight SQL engine that can be used with files and in-memory data frames such as Pandas and Polars.
Install the following package:
pip install -i https://pypi.dev.sodadata.io/simple -U soda-duckdb
From Pandas DataFrame
import pandas as pd
import duckdb
from soda_core.contracts import verify_contract_locally
from soda_duckdb import DuckDBDataSource
# Load the data into a Pandas DataFrame.
df = pd.read_parquet("adventureworks.parquet")
# Open an in-memory DuckDB database and expose the DataFrame as a view named "adventureworks".
conn = duckdb.connect(database=":memory:")
cursor = conn.cursor()
cursor.register(view_name="adventureworks", python_object=df)
# Verify the contract against the registered view through the existing cursor.
result = verify_contract_locally(
    data_sources=[DuckDBDataSource.from_existing_cursor(cursor, name="duckdb")],
    contract_file_path="adventureworks.yml",
)
From Polars DataFrame
In-Memory with DuckDB SQL
Data from Parquet File
You can point directly to a .parquet file as a DuckDB "database":
Then you can verify a contract on this database using the CLI:
Or Python API:
