# Snowflake

## Connection configuration reference

Install the following package:

```bash
pip install soda-snowflake
```

#### Core connection:

**Create the config file:**

```shellscript
soda data-source create -f ds_config.yml
```

The data source configuration YAML should look like the following:

{% code title="ds\_config.yml" %}

```yaml
type: snowflake
name: my_datasource_name
connection:
  # Core
  account: ${env.SNOWFLAKE_ACCOUNT} # <account locater>.<region>, e.g.,ET23172.eu-west-1
  database: <database>
  warehouse: <warehouse>
  role: <role>                            # optional, strongly recommended
  connection_timeout: 240                 # optional
  client_session_keep_alive: false        # optional
```

{% endcode %}

{% hint style="info" %}
Though optional, best practice dictates that you provide a value for `role`. If you do not provide a role, and Snowflake has not assigned a [Snowflake System-Defined Role](https://docs.snowflake.com/en/user-guide/security-access-control-overview.html#system-defined-roles) to the user account, Snowflake may, confusingly, deny access to the data source.
{% endhint %}

<details>

<summary>Core connection properties</summary>

<table><thead><tr><th width="242.5">Property</th><th width="133.75">Required</th><th width="100">Type</th><th>Description</th></tr></thead><tbody><tr><td><code>type</code></td><td>yes</td><td>string</td><td>Identify the type of data source for Soda. In this case, must be <code>snowflake</code>.</td></tr><tr><td><code>account</code></td><td>yes</td><td>string</td><td>Provide the unique value that identifies your account. Consider using system variables to retrieve this value securely using, for example, <code>${SNOWFLAKE_ACCOUNT}</code>. Note: Account sometimes needs to take the form of <code>&#x3C;account_identifier>-&#x3C;account_name></code> or <code>&#x3C;account_identifier>.&#x3C;region></code>.</td></tr><tr><td><code>database</code></td><td>yes</td><td>string</td><td>Provide an identifier for your database.</td></tr><tr><td><code>warehouse</code></td><td>yes</td><td>string</td><td>Provide an identifier for the cluster of resources that is a Snowflake virtual warehouse. See <a href="https://docs.snowflake.com/en/user-guide/warehouses-overview">Overview of Warehouses</a>.</td></tr><tr><td><code>role</code><sup>1</sup></td><td>recommended</td><td>string</td><td>Specify a Snowflake role that has permission to access the <code>database</code> and <code>schema</code> of your data source.</td></tr><tr><td><code>connection_timeout</code></td><td>no</td><td>integer</td><td>Set the timeout period in seconds for an inactive login session.</td></tr><tr><td><code>client_session_keep_alive</code></td><td>no</td><td>boolean</td><td>Use this parameter to keep the session active, even with no user activity.<br>Default value: <code>false</code>.</td></tr><tr><td><code>session_parameters</code></td><td>no</td><td>object</td><td>Pass-through Snowflake session params (e.g., <code>QUERY_TAG</code>, <code>QUOTED_IDENTIFIERS_IGNORE_CASE</code>).</td></tr><tr><td><code>proxy_http</code></td><td>no</td><td>string</td><td>HTTP proxy (runner environments); see <a href="#troubleshoot">Troubleshoot section</a>.</td></tr><tr><td><code>proxy_https</code></td><td>no</td><td>string</td><td>HTTPS proxy (runner environments); see <a href="#troubleshoot">Troubleshoot section</a>.</td></tr></tbody></table>

<sup>1</sup> Though optional, best practice dictates that you provide a value for `role`. If you do not provide a role, and Snowflake has not assigned a [Snowflake System-Defined Role](https://docs.snowflake.com/en/user-guide/security-access-control-overview.html#system-defined-roles) to the user account, Snowflake may, confusingly, deny access to the data source.

**Authenticator selector**

<table><thead><tr><th width="154.75">Property</th><th width="113.5">Required</th><th>Values / Example</th><th>Description</th></tr></thead><tbody><tr><td><code>authenticator</code><sup>2</sup></td><td>no</td><td><code>externalbrowser</code> | <code>OAUTH_CLIENT_CREDENTIALS</code></td><td><p>Add an <code>authenticator</code> paramater with value <code>externalbrowser</code> to authenticate the connection to your Snowflake data source using any SAML 2.0-compliant identity provider (IdP) such as</p><p>Okta or OneLogin.</p><p>If omitted, Soda uses user/password auth (default)</p></td></tr></tbody></table>

<sup>2</sup> Use this parameter when adding Snowflake connection configurations to a `configuration.yml` file. However, if you are adding connection configuration details directly in Soda Cloud (connecting to your Snowflake data source via a Soda Runner) to authenticate using Okta, you must follow the instructions documented by Snowflake for [Native SSO - Okta Only](https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#native-sso-okta-only).

</details>

Choose exactly **one authentication approach** of the four options below.

<i class="fa-octagon-check">:octagon-check:</i> **Supported by Soda**

* Default (`SnowflakePasswordAuth`)
* External browser (`SnowflakeSSOAuth`)
* key\_pair (`SnowflakeKeyPairAuth`)
* OAuth (`SnowflakeOAuthAuth`)
* OAuth client credentials (`SnowflakeClientCredentialsOAuthAuth`)

<i class="fa-octagon-xmark">:octagon-xmark:</i> **Not supported by Soda**

* OAuth **Authorization Code** grant
* User/password **MFA** flows
* **Workload identity** authenticator
* **Programmatic access token** (PAT)
* Programmatic access token **with external session**

{% hint style="info" %}
**Notes**:

* Use **one** authentication method per connection.
* While `role` is technically optional, providing it avoids confusing access errors.
* Private key auth can use inline `private_key` (PEM) or `private_key_path`.
* `externalbrowser` SSO uses your SAML 2.0 IdP (e.g., Okta/OneLogin).
* Proxy parameters are supported when connecting via a runner behind a proxy.
  {% endhint %}

### 1. User + password

```yaml
  # 1) User + password
  user: ${env.SNOWFLAKE_USER}
  password: ${env.SNOWFLAKE_PASSWORD}
```

<details>

<summary>User/password properties (when <em>no</em> <code>authenticator</code> is set)</summary>

<table><thead><tr><th width="120">Property</th><th width="104">Required</th><th>Description</th></tr></thead><tbody><tr><td><code>user</code></td><td>yes</td><td>Snowflake user login.<br>Consider using system variables to retrieve this value securely using, for example, <code>${SNOWFLAKE_USER}</code>.</td></tr><tr><td><code>password</code></td><td>yes</td><td>Password for the user.<br>Consider using system variables to retrieve this value securely using, for example, <code>${SNOWFLAKE_PASSWORD}</code>.</td></tr></tbody></table>

</details>

### 2. External browser SSO

```yaml
  # 2) External browser SSO (SAML 2.0 IdP such as Okta/OneLogin)
  authenticator: externalbrowser
```

<details>

<summary>External browser SSO properties (when <code>authenticator: externalbrowser</code>)</summary>

<table><thead><tr><th width="160.5">Property</th><th width="116.25">Required</th><th>Description</th></tr></thead><tbody><tr><td><code>user</code></td><td>no</td><td>Depending on IdP settings, may be required.<br>Consider using system variables to retrieve this value securely using, for example, <code>${SNOWFLAKE_USER}</code>.</td></tr><tr><td><code>authenticator</code></td><td>yes</td><td>Must be <code>externalbrowser</code> to enable SAML 2.0 browser-based SSO.</td></tr></tbody></table>

</details>

### 3. Key pair (JWT) authentication

You can use the `private_key` and `private_key_passphrase` parameters to specify for key pair authentication. In you configuration YML file, add the parameters as per the following example.

```yaml
  # 3) Key pair (JWT) authentication
  authenticator: SNOWFLAKE_JWT
  user: <user>
  private_key: |
    -----BEGIN ENCRYPTED PRIVATE KEY-----
    -----END ENCRYPTED PRIVATE KEY-----
  private_key_passphrase: ${env.SNOWFLAKE_PASSPHRASE}
  # or use a file path instead of inline key:
  private_key_path: /path/to/private-key.pk8
```

<details>

<summary>Key pair / JWT properties (when <code>authenticator: SNOWFLAKE_JWT</code>)</summary>

<table><thead><tr><th width="213.75">Property</th><th width="111.75">Required</th><th>Description</th></tr></thead><tbody><tr><td><code>user</code></td><td>yes</td><td>Snowflake user that owns the key pair.</td></tr><tr><td><code>private_key</code></td><td>yes*</td><td>Inline PEM private key. Use either this <strong>or</strong> <code>private_key_path</code>.</td></tr><tr><td><code>private_key_path</code></td><td>yes*</td><td>Path to private key file (<code>.pk8</code>). Use either this <strong>or</strong> <code>private_key</code>.</td></tr><tr><td><code>private_key_passphrase</code></td><td>no</td><td>Passphrase for encrypted private key.</td></tr></tbody></table>

{% hint style="info" %}
Do **not** include `password` when `authenticator: SNOWFLAKE_JWT` is used.
{% endhint %}

</details>

### 4. OAuth 2.0 Client Credentials

```yaml
  # 4) OAuth 2.0 Client Credentials
  authenticator: OAUTH_CLIENT_CREDENTIALS
  oauth_client_id: <client-id>
  oauth_client_secret: <client-secret>
  oauth_token_request_url: https://<idp>/oauth/token
  oauth_scope: "scope1 scope2"        # space-delimited; optional if derived by role
```

<details>

<summary>OAuth 2.0 client properties (when <code>authenticator: OAUTH_CLIENT_CREDENTIALS</code>)</summary>

<table><thead><tr><th width="231.75">Property</th><th width="93">Required</th><th>Description</th></tr></thead><tbody><tr><td><code>oauth_client_id</code></td><td>yes</td><td>Client ID from the IdP for the Snowflake <strong>security integration</strong>.</td></tr><tr><td><code>oauth_client_secret</code></td><td>yes</td><td>Client secret from the IdP for the Snowflake <strong>security integration</strong>.</td></tr><tr><td><code>oauth_token_request_url</code></td><td>yes</td><td>IdP token endpoint that issues access tokens to the driver. (With Snowflake as IdP, derive from server/account parameters.)</td></tr><tr><td><code>oauth_scope</code></td><td>no</td><td>Space-delimited, case-sensitive scopes. Defaults may be derived from role; specify explicitly for multiple/custom scopes.</td></tr></tbody></table>

</details>

{% hint style="info" %}
Only the **Client Credentials** grant is supported. The **Authorization Code** grant is **not** supported.
{% endhint %}

***

### Other parameters

```yaml
  # Optional Snowflake session parameters (passed through)
  session_parameters:
    QUERY_TAG: soda-queries
    QUOTED_IDENTIFIERS_IGNORE_CASE: false

  # Optional proxies (for runners behind proxy)
  proxy_http: http://host:port
  proxy_https: https://host:port
```

<details>

<summary>Other parameters properties</summary>

| Property                                     | Required | Notes                                                                                                                                                                                                                                                                                            |
| -------------------------------------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| other params                                 | optional | You can pass any other Snowflake parameters you wish by adding the key:value pairs to your Snowflake connection configuration. See [Snowflake Python Connector API documentation](https://docs.snowflake.com/en/user-guide/python-connector-api.html#connect) for a list of passable parameters. |
| QUERY\_TAG                                   | optional | See [QUERY\_TAG](https://docs.snowflake.com/en/sql-reference/parameters.html#query-tag) in Snowflake documentation.                                                                                                                                                                              |
| <p>QUOTED\_IDENTIFIERS\_<br>IGNORE\_CASE</p> | optional | See [QUOTED\_IDENTIFIERS\_IGNORE\_CASE](https://docs.snowflake.com/en/sql-reference/parameters.html#quoted-identifiers-ignore-case) in Snowflake documentation.                                                                                                                                  |

</details>

***

## Use a values file to store private key authentication values

If you use a private key authentication with Snowflake and have deployed a [Soda Runner](https://docs.soda.io/deployment-options/soda-agent), you can provide the required private key values in a `values.yml` file when you deploy or redeploy the runner.

You can also use the `values.yml` file to store other environment variables for the Soda Runner to use, such as `SNOWFLAKE_USER`, `SNOWFLAKE_ACCOUNT`, `SNOWFLAKE_PASSPHRASE`, etc.

1. First, run the following command to create a local path to the Snowflake private key. Replace the `local path to the Snowflake private key` with your own value.

   <pre class="language-shell" data-overflow="wrap"><code class="lang-shell">kubectl create secret generic -n &#x3C;soda-runner-namespace> snowflake-private-key --from-file=snowflake-private-key.pk8=&#x3C;local path to the Snowflake private key>
   </code></pre>
2. Then, add the following to your `values.yml` file, adjusting the values to your own specific details.

   <pre class="language-yaml" data-title="values.yml"><code class="lang-yaml">soda:
     scanLauncher:
       volumeMounts:
         - name: snowflake-private-key
           mountPath: /opt/soda/etc
       volumes:
         - name: snowflake-private-key
           secret:
             secretName: snowflake-private-key
             items:
               - key: snowflake-private-key.pk8
                 path: snowflake-private-key.pk8
     contractLauncher:
       volumeMounts:
         - name: snowflake-private-key
           mountPath: /opt/soda/etc
   </code></pre>
3. Adjust the configuration file to include the new path in the connection details, as in the following example.

   ```yaml
   data_source ltsnowflakecustomer:
     type: snowflake
     account: ${env.SNOWFLAKE_ACCOUNT}
     database: PUBLISH_DEV
     warehouse: ${env.SNOWFLAKE_WAREHOUSE}
     role: ${env.SNOWFLAKE_ROLE}
     user: ${env.SNOWFLAKE_USER}
     authenticator: SNOWFLAKE_JWT
     private_key_passphrase: ${env.SNOWFLAKE_PASSPHRASE}
     private_key_path: /opt/soda/etc/snowflake-private-key.pk8
     client_session_keep_alive: true
     session_parameters:
       QUERY_TAG: soda-queries
       QUOTED_IDENTIFIERS_IGNORE_CASE: false
   ```
4. Deploy, or redeploy, the runner for the changes to take effect.

## Connection test

Test the data source connection:

```bash
soda data-source test -ds ds_config.yml
```

## Supported data types

| Category | Data type                                                                                                       |
| -------- | --------------------------------------------------------------------------------------------------------------- |
| text     | CHAR, VARCHAR, CHARACTER, STRING, TEXT                                                                          |
| number   | NUMBER, INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT, FLOAT, FLOAT4, FLOAT8, DOUBLE, DOUBLE PRECISION, REAL |
| time     | DATE, DATETIME, TIME, TIMESTAMP, TIMESTAMPT\_LTZ, TIMESTAMP\_NTZ, TIMESTAMP\_TZ                                 |

## Troubleshoot

<i class="fa-square-xmark">:square-xmark:</i> **Problem:** When testing the connection to your Snowflake data source, Snowflake returns an error message about using the `use database` command.

<i class="fa-square-check">:square-check:</i> **Solution:** Use the `role` parameter to specify a Snowflake role that has permission to access the `database` and `schema` of your data source.\
Though optional, best practice dictates that you provide a value for `role`. If you do not provide a role, and Snowflake has not assigned a [Snowflake System-Defined Role](https://docs.snowflake.com/en/user-guide/security-access-control-overview.html#system-defined-roles) to the user account, Snowflake may, confusingly, deny access to the data source.

<br>

<i class="fa-square-xmark">:square-xmark:</i> **Problem:** When Soda attempts to connect to your Snowflake data source, it produces a connectivity error that includes something like `RuntimeError: Command failed with exit code 2: ..... ocsp_response_validation_cash.lock`.

<i class="fa-square-check">:square-check:</i> **Solution:**\
Use [Snowflake's troubleshooting guide](https://community.snowflake.com/s/article/How-to-Triage-OCSP-Related-Connectivity-Problems) to triage OCSP-related connectivity issues.

<br>

<i class="fa-square-xmark">:square-xmark:</i> **Problem:** You have defined a Group By check and the scan that executes the check yields an error.

<i class="fa-square-check">:square-check:</i> **Solution:** Be aware that, by default, Snowflake returns columns names in uppercase. Therefore, when defining a Group By check, you must specify the custom name of the check in uppercase as in the following example.

```yaml
checks for VOLUME:
  - group by:
      group_limit: 50
      query: |
        SELECT TRADER, count(*) as trader_row_count
        FROM TRADEVOLUME
        WHERE TRADE_DATE = '2023-01-01'
        GROUP BY TRADER
      fields:
        - TRADER
      checks:
        - TRADER_ROW_COUNT > 40:
            name: Trader row count
```

<i class="fa-square-xmark">:square-xmark:</i> **Problem:** While attempting to connect Soda to a Snowflake data source using proxy parameters, you encounter an error that reads something similar to `Could not connect to data source "name_db": 250001 (08001): Failed to connect to DB: mydb.eu-west-1.snowflakecomputing.com:443. Incoming request with IP/Token xx.xxx.xx.xxx is not allowed to access Snowflake.`

```yaml
data_source my_data_source:
  type: snowflake
  ...
  session_parameters:
    QUERY_TAG: soda-test
    QUOTED_IDENTIFIERS_IGNORE_CASE: false
  proxy_http: http://a-proxy-o-dd-dddd-net:8000
  proxy_https: https://a-proxy-o-dd-dddd-net:8000
```

<i class="fa-square-check">:square-check:</i> **Solution:** When connecting to a Snowflake data source by proxy, be sure to set the new proxy environment variables from the command-line using export statements, as in the following example.

```sh
export HTTP_PROXY=http://a-proxy-o-dd-dddd-net:8000
export HTTPS_PROXY=https://a-proxy-o-dd-dddd-net:8000
```

<br>

***

{% if (visitor.claims.plan === 'datasetStandard')%}
{% hint style="success" %}
You are **logged in to Soda** and seeing the **Free license** documentation. Learn more about [documentation-access-and-licensing](https://docs.soda.io/reference/documentation-access-and-licensing "mention").
{% endhint %}
{% endif %}

{% if (visitor.claims.plan === 'enterprise')%}
{% hint style="success" %}
You are **logged in to Soda** and seeing the **Team license** documentation. Learn more about [documentation-access-and-licensing](https://docs.soda.io/reference/documentation-access-and-licensing "mention").
{% endhint %}
{% endif %}

{% if (visitor.claims.plan === 'enterpriseUserBased')%}
{% hint style="success" %}
You are **logged in to Soda** and seeing the **Enterprise license** documentation. Learn more about [documentation-access-and-licensing](https://docs.soda.io/reference/documentation-access-and-licensing "mention").
{% endhint %}
{% endif %}

{% if !(visitor.claims.plan === 'enterprise' || visitor.claims.plan === 'enterpriseUserBased' || visitor.claims.plan === 'datasetStandard')%}
{% hint style="info" %}
You are **not logged in to Soda** and are viewing the default public documentation. Learn more about [documentation-access-and-licensing](https://docs.soda.io/reference/documentation-access-and-licensing "mention").
{% endhint %}
{% endif %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.soda.io/reference/data-source-reference-for-soda-core/snowflake.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
