Snowflake
Access configuration details to connect Soda to a Snowflake data source.
Connection configuration reference
Install the following package:
pip install -i https://pypi.dev.sodadata.io/simple -U soda-snowflakeCore connection:
type: snowflake
name: my_datasource_name
connection:
# Core
account: ${env.SNOWFLAKE_ACCOUNT} # <account locater>.<region>, e.g.,ET23172.eu-west-1
database: <database>
warehouse: <warehouse>
role: <role> # optional, strongly recommended
connection_timeout: 240 # optional
client_session_keep_alive: false # optionalChoose exactly one authentication approach of the four options below.
Supported by Soda
Default (
SnowflakePasswordAuth)External browser (
SnowflakeSSOAuth)key_pair (
SnowflakeKeyPairAuth)OAuth (
SnowflakeOAuthAuth)OAuth client credentials (
SnowflakeClientCredentialsOAuthAuth)
Not supported by Soda
OAuth Authorization Code grant
User/password MFA flows
Workload identity authenticator
Programmatic access token (PAT)
Programmatic access token with external session
1. User + password
# 1) User + password
user: ${env.SNOWFLAKE_USER}
password: ${env.SNOWFLAKE_PASSWORD}2. External browser SSO
# 2) External browser SSO (SAML 2.0 IdP such as Okta/OneLogin)
authenticator: externalbrowser3. Key pair (JWT) authentication
You can use the private_key and private_key_passphrase parameters to specify for key pair authentication. In you configuration YML file, add the parameters as per the following example.
# 3) Key pair (JWT) authentication
authenticator: SNOWFLAKE_JWT
user: <user>
private_key: |
-----BEGIN ENCRYPTED PRIVATE KEY-----
-----END ENCRYPTED PRIVATE KEY-----
private_key_passphrase: ${env.SNOWFLAKE_PASSPHRASE}
# or use a file path instead of inline key:
private_key_path: /path/to/private-key.pk84. OAuth 2.0 Client Credentials
# 4) OAuth 2.0 Client Credentials
authenticator: OAUTH_CLIENT_CREDENTIALS
oauth_client_id: <client-id>
oauth_client_secret: <client-secret>
oauth_token_request_url: https://<idp>/oauth/token
oauth_scope: "scope1 scope2" # space-delimited; optional if derived by roleOther parameters
# Optional Snowflake session parameters (passed through)
session_parameters:
QUERY_TAG: soda-queries
QUOTED_IDENTIFIERS_IGNORE_CASE: false
# Optional proxies (for agents behind proxy)
proxy_http: http://host:port
proxy_https: https://host:portUse a values file to store private key authentication values
If you use a private key authentication with Snowflake and have deployed a Soda Agent, you can provide the required private key values in a values.yml file when you deploy or redeploy the agent.
You can also use the values.yml file to store other environment variables for the Soda Agent to use, such as SNOWFLAKE_USER, SNOWFLAKE_ACCOUNT, SNOWFLAKE_PASSPHRASE, etc.
First, run the following command to create a local path to the Snowflake private key. Replace the
local path to the Snowflake private keywith your own value.
kubectl create secret generic -n <soda-agent-namespace> snowflake-private-key --from-file=snowflake-private-key.pk8=<local path to the Snowflake private key>Then, add the following to the your
values.ymlfile, adjusting the values to your own specific details.soda: scanLauncher: volumeMounts: - name: snowflake-private-key mountPath: /opt/soda/etc volumes: - name: snowflake-private-key secret: secretName: snowflake-private-key items: - key: snowflake-private-key.pk8 path: snowflake-private-key.pk8 contractLauncher: volumeMounts: - name: snowflake-private-key mountPath: /opt/soda/etcAdjust the
configuration.ymlfile to include the new path in the connection details, as in the following example.data_source ltsnowflakecustomer: type: snowflake account: ${env.SNOWFLAKE_ACCOUNT} database: PUBLISH_DEV warehouse: ${env.SNOWFLAKE_WAREHOUSE} role: ${env.SNOWFLAKE_ROLE} user: ${env.SNOWFLAKE_USER} authenticator: SNOWFLAKE_JWT private_key_passphrase: ${env.SNOWFLAKE_PASSPHRASE} private_key_path: /opt/soda/etc/snowflake-private-key.pk8 client_session_keep_alive: true session_parameters: QUERY_TAG: soda-queries QUOTED_IDENTIFIERS_IGNORE_CASE: falseDeploy, or redeploy, the agent for the changes to take effect.
Connection test
Test the data source connection:
soda data-source test -ds ds.ymlSupported data types
text
CHAR, VARCHAR, CHARACTER, STRING, TEXT
number
NUMBER, INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT, FLOAT, FLOAT4, FLOAT8, DOUBLE, DOUBLE PRECISION, REAL
time
DATE, DATETIME, TIME, TIMESTAMP, TIMESTAMPT_LTZ, TIMESTAMP_NTZ, TIMESTAMP_TZ
Troubleshoot
Problem: When testing the connection to your Snowflake data source, Snowflake returns an error message about using the use database command.
Solution: Use the role parameter to specify a Snowflake role that has permission to access the database and schema of your data source.
Though optional, best practice dictates that you provide a value for role. If you do not provide a role, and Snowflake has not assigned a Snowflake System-Defined Role to the user account, Snowflake may, confusingly, deny access to the data source.
Problem: When Soda attempts to connect to your Snowflake data source, it produces a connectivity error that includes something like RuntimeError: Command failed with exit code 2: ..... ocsp_response_validation_cash.lock.
Solution: Use Snowflake's troubleshooting guide to triage OCSP-related connectivity issues.
Problem: You have defined a Group By check and the scan that executes the check yields an error.
Solution: Be aware that, by default, Snowflake returns columns names in uppercase. Therefore, when defining a Group By check, you must specify the custom name of the check in uppercase as in the following example.
checks for VOLUME:
- group by:
group_limit: 50
query: |
SELECT TRADER, count(*) as trader_row_count
FROM TRADEVOLUME
WHERE TRADE_DATE = '2023-01-01'
GROUP BY TRADER
fields:
- TRADER
checks:
- TRADER_ROW_COUNT > 40:
name: Trader row count Problem: While attempting to connect Soda to a Snowflake data source using proxy parameters, you encounter an error that reads something similar to Could not connect to data source "name_db": 250001 (08001): Failed to connect to DB: mydb.eu-west-1.snowflakecomputing.com:443. Incoming request with IP/Token xx.xxx.xx.xxx is not allowed to access Snowflake.
data_source my_data_source:
type: snowflake
...
session_parameters:
QUERY_TAG: soda-test
QUOTED_IDENTIFIERS_IGNORE_CASE: false
proxy_http: http://a-proxy-o-dd-dddd-net:8000
proxy_https: https://a-proxy-o-dd-dddd-net:8000Solution: When connecting to a Snowflake data source by proxy, be sure to set the new proxy environment variables from the command-line using export statements, as in the following example.
export HTTP_PROXY=http://a-proxy-o-dd-dddd-net:8000
export HTTPS_PROXY=https://a-proxy-o-dd-dddd-net:8000Last updated
Was this helpful?
