Connect Soda to Snowflake
Last modified on 31-May-23
For Soda to run quality scans of your data, you must configure it to connect to your data source.
- For Soda Core, add the connection configurations to your
configuration.yml
file. Read more. - For Soda Cloud, add the connection configurations to step 3 of the New Data Source workflow. Read more.
Configuration
Private key authentication
Use a values file to store private key authenticaion values
Test the data source connection
Supported data types
Troubleshoot
Configuration
Install package: soda-core-snowflake
data_source my_datasource_name:
type: snowflake
connection:
username:
password:
account:
database:
warehouse:
connection_timeout:
role: PUBLIC
client_session_keep_alive: true
authenticator: externalbrowser
session_parameters:
QUERY_TAG: soda-queries
QUOTED_IDENTIFIERS_IGNORE_CASE: false
schema: public
Property | Required | Notes |
---|---|---|
type | required | |
username | required | Consider using system variables to retrieve this value securely using, for example, ${SNOWFLAKE_USER} . |
password | required | Consider using system variables to retrieve this value securely using, for example, ${SNOWFLAKE_PASSWORD} . |
account | required | Consider using system variables to retrieve this value securely using, for example, ${SNOWFLAKE_ACCOUNT} . |
database | required | |
schema | required | |
warehouse | required | |
connection_timeout | required | |
role | optional | See Snowflake System-Defined Roles for details. |
client_session_keep_alive | optional | Use this parameter with a boolean option to keep the session active, even with no user activity. |
authenticator 1 | optional | Add an authenticator paramater with value externalbrowser to authenticate the connection to your Snowflake data source using any SAML 2.0-compliant identity provider (IdP) such as Okta or OneLogin. |
other params | optional | You can pass any other Snowflake paramters you wish by adding the key:value pairs to your Snowflake connection configuration. See Snowflake Python Connector API documentation for a list of passable parameters. |
QUERY_TAG | optional | See QUERY_TAG in Snowflake documentation. |
QUOTED_IDENTIFIERS_ IGNORE_CASE | optional | See QUOTED_IDENTIFIERS_IGNORE_CASE in Snowflake documentation. |
schema | required |
1 Use this parameter when adding Snowflake connection configurations to a configuration.
yml file. However, if you are adding connection configuration details directly in Soda Cloud (connecting to your Snowflake data source via a Soda Agent) to authenticate using Okta, you must follow the instructions documented by Snowflake for Native SSO - Okta Only.
Private key authentication
You can use the private_key
and private_key_passphrase
parameters to specify for key pair authentication. In you configuration YML file, add the parameters as per the following example.
data_source snowflake:
type: snowflake
connection:
username: xxxyyyzzz
...
client_session_keep_alive: true
Authenticator: SNOWFLAKE_JWT
schema: TPCH_SF1
private_key_passphrase: "123xxx"
private_key: |
-----BEGIN ENCRYPTED PRIVATE KEY-----
-----END ENCRYPTED PRIVATE KEY-----
Use a values file to store private key authentication values
If you use a private key authentication with Snowflake and have deployed a Soda Agent, you can provide the required private key values in a values.yml
file when you deploy or redeploy the agent.
You can also use the values.yml
file to store other environment variables for the Soda Agent to use, such as SNOFLAKE_USER
, SNOWFLAKE_ACCOUNT
, SNOWFLAKE_PASSPHRASE
, etc.
- First, run the following command to create a local path to the Snowflake private key. Replace the
local path to the Snowflake private key
with your own value.kubectl create secret generic -n <soda-agent-namespace> snowflake-private-key --from-file=snowflake-private-key.pk8=<local path to the Snowflake private key>
- Then, add the following to the your
values.yml
file, adjusting the values to your own specific details.soda: scanlauncher: volumeMounts: - name: snowflake-private-key mountPath: /opt/soda/etc volumes: - name: snowflake-private-key secret: secretName: snowflake-private-key items: - key: snowflake-private-key.pk8 path: snowflake-private-key.pk8
- Adjust the
configuration.yml
file to include the new path in the connection details, as in the following example.data_source ltsnowflakecustomer: type: snowflake connection: username: ${SNOWFLAKE_USER} password: password account: ${SNOWFLAKE_ACCOUNT} database: PUBLISH_DEV warehouse: ${SNOWFLAKE_WAREHOUSE} role: ${SNOWFLAKE_ROLE} client_session_keep_alive: true session_parameters: QUERY_TAG: soda-queries QUOTED_IDENTIFIERS_IGNORE_CASE: false schema: CUSTOMER private_key_passphrase: ${SNOWFLAKE_PASSPHRASE} private_key_path: /opt/soda/etc/snowflake-private-key.pk8
- Deploy, or redeploy, the agent for the changes to take effect.
Test the data source connection
To confirm that you have correctly configured the connection details for the data source(s) in your configuration YAML file, use the test-connection
command. If you wish, add a -V
option to the command to returns results in verbose mode in the CLI.
soda test-connection -d my_datasource -c configuration.yml -V
Supported data types
Category | Data type |
---|---|
text | CHAR, VARCHAR, CHARACTER, STRING, TEXT |
number | NUMBER, INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT, FLOAT, FLOAT4, FLOAT8, DOUBLE, DOUBLE PRECISION, REAL |
time | DATE, DATETIME, TIME, TIMESTAMP, TIMESTAMPT_LTZ, TIMESTAMP_NTZ, TIMESTAMP_TZ |
Troubleshoot
Problem: When Soda attempts to connect to your Snowflake data source, it produces a connectivity error that includes something like RunteimError: Command failed with exit code 2: ..... ocsp_response_validation_cash.lock
.
Solution: Use Snowflake’s troubleshooting guide to triage OCSP-related connectivity issues.
Was this documentation helpful?
What could we do to improve this page?
- Suggest a docs change in GitHub.
- Share feedback in the Soda community on Slack.
Last modified on 31-May-23