Redshift
Access configuration details to connect Soda to an Amazon Redshift data source.
Connection configuration reference
Install the following package:
pip install -i https://pypi.dev.sodadata.io/simple -U soda-redshift
Data source YAML
type: redshift
name: my_redshift
connection:
  host: ${env.REDSHIFT_HOST}
  port: 5439
  database: ${env.REDSHIFT_DB}
  user: ${env.REDSHIFT_USER}
  # optional
  password: ${env.REDSHIFT_PW}
  access_key_id: ${env.REDSHIFT_AWS_ACCESS_KEY_ID}
  secret_access_key: ${env.REDSHIFT_AWS_SECRET_ACCESS_KEY}
  session_token: ${env.REDSHIFT_AWS_SESSION_TOKEN}
  role_arn: ${env.REDSHIFT_ROLE_ARN} # e.g., arn:aws:iam::123456789012:role/MyRole
  region: ${env.REDSHIFT_REGION} # e.g., us-east-1
  profile_name: ${env.REDSHIFT_AWS_PROFILE}
  cluster_identifier: ${env.REDSHIFT_CLUSTER_ID}
Connection test
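The ${env.…} placeholders in the YAML are expected to resolve from environment variables, so they need to be set in the shell session before the connection is tested. A minimal sketch, assuming password authentication; every value below is a made-up placeholder, and optional keys you do not use can be left out of the YAML file:

```shell
# Placeholder values (assumptions, not real endpoints or credentials);
# replace them with your own connection details.
export REDSHIFT_HOST="my-cluster.abc123xyz789.us-east-1.redshift.amazonaws.com"
export REDSHIFT_DB="dev"
export REDSHIFT_USER="soda_user"
export REDSHIFT_PW="change-me"
```

In practice, secrets such as the password are better loaded from a secrets manager or a protected file than typed into shell history.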
Test the data source connection:
soda data-source test -ds ds.yml
Case sensitivity
Soda v3 supported only case-insensitive identifiers for this data source. Because Soda v4 enforces case-sensitive identifiers, users migrating from v3 to v4 might have to account for case.
By default, Redshift is case-insensitive and folds all identifiers (e.g. table and column names) to lower case regardless of the original input. Soda, however, enforces case-sensitive identifiers for its sessions.
Starting with Soda v4, all queries executed against Redshift run with enable_case_sensitive_identifier = True, regardless of your database’s default setting. This ensures consistent handling of table and column names, including those containing uppercase characters.
If you previously relied on case-insensitive behavior (the default in Soda v3), you may need to review and update any custom SQL, filters, or references to identifiers to ensure they match the exact case used in your Redshift schema.
This change prevents onboarding issues for mixed-case identifiers, but it also means all SQL executed by Soda is now case sensitive.
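To see what this means for custom SQL, the following sketch writes a small SQL file that could be run in a Redshift SQL client. It relies on Redshift's documented behavior that, with the setting enabled, double-quoted identifiers keep their exact case while unquoted identifiers are still folded to lower case. The file name and the table and column names ("Orders", "CustomerId") are hypothetical.

```shell
# Write a small SQL script contrasting quoted and unquoted identifiers.
# "Orders" and "CustomerId" are hypothetical names used only for illustration.
cat > case_sensitivity_demo.sql <<'SQL'
-- Soda v4 runs its sessions with this setting enabled.
SET enable_case_sensitive_identifier TO true;

-- Quoted identifiers keep their exact case, so this matches only a table
-- created as "Orders" with a column created as "CustomerId".
SELECT "CustomerId" FROM "Orders";

-- Unquoted identifiers are still folded to lower case, so this refers to
-- a table named orders with a column named customerid.
SELECT CustomerId FROM Orders;
SQL
```

If a custom query or filter refers to a mixed-case object, quoting the identifier with its exact case is the form that keeps matching under this setting.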
