Snowflake

Access configuration details to connect Soda to a Snowflake data source.

Connection configuration reference

Install the following package:

pip install -i https://pypi.dev.sodadata.io/simple -U soda-snowflake

Core connection:

type: snowflake
name: my_datasource_name
connection:
  # Core
  account: ${SNOWFLAKE_ACCOUNT}
  database: <database>
  schema: <schema>
  warehouse: <warehouse>
  role: <role>                            # optional, strongly recommended
  connection_timeout: 240                 # optional
  client_session_keep_alive: false        # optional

Though optional, best practice dictates that you provide a value for role. If you do not provide a role, and Snowflake has not assigned a Snowflake System-Defined Role to the user account, Snowflake may, confusingly, deny access to the data source.

Core connection properties
Property
Required
Type
Description

type

yes

string

Identify the type of data source for Soda. In this case, must be snowflake.

account

yes

string

Provide the unique value that identifies your account. Consider using system variables to retrieve this value securely using, for example, ${SNOWFLAKE_ACCOUNT}. Note: Account sometimes needs to take the form of <account_identifier>-<account_name> or <account_identifier>.<region>.

database

yes

string

Provide an identifier for your database.

schema

yes

string

Identify the schema in the data source in which your tables exist.

warehouse

yes

string

Provide an identifier for the cluster of resources that is a Snowflake virtual warehouse. See Overview of Warehouses.

role1

recommended

string

Specify a Snowflake role that has permission to access the database and schema of your data source.

connection_timeout

no

integer

Set the timeout period in seconds for an inactive login session.

client_session_keep_alive

no

boolean

Use this parameter to keep the session active, even with no user activity. Default value: false.

session_parameters

no

object

Pass-through Snowflake session params (e.g., QUERY_TAG, QUOTED_IDENTIFIERS_IGNORE_CASE).

proxy_http

no

string

HTTP proxy (agent environments); see Troubleshoot section.

proxy_https

no

string

HTTPS proxy (agent environments); see Troubleshoot section.

1 Though optional, best practice dictates that you provide a value for role. If you do not provide a role, and Snowflake has not assigned a Snowflake System-Defined Role to the user account, Snowflake may, confusingly, deny access to the data source.

Authenticator selector

Property
Required
Values / Example
Description

authenticator2

no

externalbrowser | SNOWFLAKE_JWT | OAUTH_CLIENT_CREDENTIALS

Add an authenticator paramater with value externalbrowser to authenticate the connection to your Snowflake data source using any SAML 2.0-compliant identity provider (IdP) such as Okta or OneLogin.

2 Use this parameter when adding Snowflake connection configurations to a configuration.yml file. However, if you are adding connection configuration details directly in Soda Cloud (connecting to your Snowflake data source via a Soda Agent) to authenticate using Okta, you must follow the instructions documented by Snowflake for Native SSO - Okta Only.

Choose exactly one authentication approach of the four options below.

Notes:

  • Use one authentication method per connection.

  • While role is technically optional, providing it avoids confusing access errors.

  • Private key auth can use inline private_key (PEM) or private_key_path.

  • externalbrowser SSO uses your SAML 2.0 IdP (e.g., Okta/OneLogin).

  • Proxy parameters are supported when connecting via an agent behind a proxy.

1. Username + password

  # 1) Username + password
  username: ${SNOWFLAKE_USER}
  password: ${SNOWFLAKE_PASSWORD}
Username/password properties (when no authenticator is set)
Property
Required
Description

username

yes

Snowflake user login. Consider using system variables to retrieve this value securely using, for example, ${SNOWFLAKE_USER}.

password

yes

Password for the user. Consider using system variables to retrieve this value securely using, for example, ${SNOWFLAKE_PASSWORD}.

2. External browser SSO

  # 2) External browser SSO (SAML 2.0 IdP such as Okta/OneLogin)
  authenticator: externalbrowser
External browser SSO properties (when authenticator: externalbrowser)
Property
Required
Description

username

no

Depending on IdP settings, may be required. Consider using system variables to retrieve this value securely using, for example, ${SNOWFLAKE_USER}.

authenticator

yes

Must be externalbrowser to enable SAML 2.0 browser-based SSO.

3. Private key authentication

You can use the private_key and private_key_passphrase parameters to specify for key pair authentication. In you configuration YML file, add the parameters as per the following example.

  # 3) Key pair (JWT) authentication
  authenticator: SNOWFLAKE_JWT
  username: <user>
  private_key: |
    -----BEGIN ENCRYPTED PRIVATE KEY-----
    -----END ENCRYPTED PRIVATE KEY-----
  private_key_passphrase: ${SNOWFLAKE_PASSPHRASE}
  # or use a file path instead of inline key:
  private_key_path: /path/to/private-key.pk8
Key pair / JWT properties (when authenticator: SNOWFLAKE_JWT)
Property
Required
Description

username

yes

Snowflake user that owns the key pair.

private_key

yes*

Inline PEM private key. Use either this or private_key_path.

private_key_path

yes*

Path to private key file (.pk8). Use either this or private_key.

private_key_passphrase

no

Passphrase for encrypted private key.

4. OAuth 2.0 Client Credentials (NEW)

  # 4) OAuth 2.0 Client Credentials
  authenticator: OAUTH_CLIENT_CREDENTIALS
  oauth_client_id: <client-id>
  oauth_client_secret: <client-secret>
  oauth_token_request_url: https://<idp>/oauth/token
  oauth_scope: "scope1 scope2"        # space-delimited; optional if derived by role
OAuth 2.0 client properties (when authenticator: OAUTH_CLIENT_CREDENTIALS)
Property
Required
Description

oauth_client_id

yes

Client ID from the IdP for the Snowflake security integration.

oauth_client_secret

yes

Client secret from the IdP for the Snowflake security integration.

oauth_token_request_url

yes

IdP token endpoint that issues access tokens to the driver. (With Snowflake as IdP, derive from server/account parameters.)

oauth_scope

no

Space-delimited, case-sensitive scopes. Defaults may be derived from role; specify explicitly for multiple/custom scopes.


Other parameters

  # Optional Snowflake session parameters (passed through)
  session_parameters:
    QUERY_TAG: soda-queries
    QUOTED_IDENTIFIERS_IGNORE_CASE: false

  # Optional proxies (for agents behind proxy)
  proxy_http: http://host:port
  proxy_https: https://host:port
Other parameters properties
Property
Required
Notes

other params

optional

You can pass any other Snowflake parameters you wish by adding the key:value pairs to your Snowflake connection configuration. See Snowflake Python Connector API documentation for a list of passable parameters.

QUERY_TAG

optional

See QUERY_TAG in Snowflake documentation.

QUOTED_IDENTIFIERS_ IGNORE_CASE

optional

See QUOTED_IDENTIFIERS_IGNORE_CASE in Snowflake documentation.


Use a values file to store private key authentication values

If you use a private key authentication with Snowflake and have deployed a Soda Agent, you can provide the required private key values in a values.yml file when you deploy or redeploy the agent.

You can also use the values.yml file to store other environment variables for the Soda Agent to use, such as SNOFLAKE_USER, SNOWFLAKE_ACCOUNT, SNOWFLAKE_PASSPHRASE, etc.

  1. First, run the following command to create a local path to the Snowflake private key. Replace the local path to the Snowflake private key with your own value.

kubectl create secret generic -n <soda-agent-namespace> snowflake-private-key --from-file=snowflake-private-key.pk8=<local path to the Snowflake private key>
  1. Then, add the following to the your values.yml file, adjusting the values to your own specific details.

    soda:
      scanlauncher:
        volumeMounts:
          - name: snowflake-private-key
            mountPath: /opt/soda/etc
        volumes:
          - name: snowflake-private-key
            secret:
              secretName: snowflake-private-key
              items:
                - key: snowflake-private-key.pk8
                  path: snowflake-private-key.pk8
  2. Adjust the configuration.yml file to include the new path in the connection details, as in the following example.

    data_source ltsnowflakecustomer:
      type: snowflake
      username: ${SNOWFLAKE_USER}
      password: password
      account: ${SNOWFLAKE_ACCOUNT}
      database: PUBLISH_DEV
      warehouse: ${SNOWFLAKE_WAREHOUSE}
      role: ${SNOWFLAKE_ROLE}
      client_session_keep_alive: true
      session_parameters:
        QUERY_TAG: soda-queries
        QUOTED_IDENTIFIERS_IGNORE_CASE: false
      schema: CUSTOMER
      private_key_passphrase: ${SNOWFLAKE_PASSPHRASE}
      private_key_path: /opt/soda/etc/snowflake-private-key.pk8
  3. Deploy, or redeploy, the agent for the changes to take effect.

Test the data source connection

To confirm that you have correctly configured the connection details for the data source(s) in your configuration YAML file, use the test-connection command. If you wish, add a -V option to the command to return results in verbose mode in the CLI.

soda test-connection -d my_datasource -c configuration.yml -V

Supported data types

Category
Data type

text

CHAR, VARCHAR, CHARACTER, STRING, TEXT

number

NUMBER, INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT, FLOAT, FLOAT4, FLOAT8, DOUBLE, DOUBLE PRECISION, REAL

time

DATE, DATETIME, TIME, TIMESTAMP, TIMESTAMPT_LTZ, TIMESTAMP_NTZ, TIMESTAMP_TZ

Troubleshoot

Problem: When testing the connection to your Snowflake data source, Snowflake returns an error message about using the use database command.

Solution: Use the role parameter to specify a Snowflake role that has permission to access the database and schema of your data source. Though optional, best practice dictates that you provide a value for role. If you do not provide a role, and Snowflake has not assigned a Snowflake System-Defined Role to the user account, Snowflake may, confusingly, deny access to the data source.

Problem: When Soda attempts to connect to your Snowflake data source, it produces a connectivity error that includes something like RunteimError: Command failed with exit code 2: ..... ocsp_response_validation_cash.lock.

Solution: Use Snowflake's troubleshooting guide to triage OCSP-related connectivity issues.

Problem: You have defined a Group By check and the scan that executes the check yields an error.

Solution: Be aware that, by default, Snowflake returns columns names in uppercase. Therefore, when defining a Group By check, you must specify the custom name of the check in uppercase as in the following example.

checks for VOLUME:
  - group by:
      group_limit: 50
      query: |
        SELECT TRADER, count(*) as trader_row_count
        FROM TRADEVOLUME
        WHERE TRADE_DATE = '2023-01-01'
        GROUP BY TRADER
      fields:
        - TRADER
      checks:
        - TRADER_ROW_COUNT > 40:
            name: Trader row count

Problem: While attempting to connect Soda to a Snowflake data source using proxy parameters, you encounter an error that reads something similar to Could not connect to data source "name_db": 250001 (08001): Failed to connect to DB: mydb.eu-west-1.snowflakecomputing.com:443. Incoming request with IP/Token xx.xxx.xx.xxx is not allowed to access Snowflake.

data_source: my_data_source
  type: snowflake
  ...
  session_param:
    QUERY_TAG: soda-test
    QUOTED_IDENTIFIERS_IGNORE_CASE: false
  proxy_http: http://a-proxy-o-dd-dddd-net:8000
  proxy_https: https://a-proxy-o-dd-dddd-net:8000

Solution: When connecting to a Snowflake data source by proxy, be sure to set the new proxy environment variables from the command-line using export statements, as in the following example.

export HTTP_PROXY=http://a-proxy-o-dd-dddd-net:8000
export HTTPS_PROXY=https://a-proxy-o-dd-dddd-net:800

Last updated

Was this helpful?