Link Search Menu Expand Document

Set warehouse configurations

Soda SQL needs connection details in order to access your warehouse to scan your data. Each type of warehouse uses different configuration parameters. To set the warehouse configurations in your warehouse YAML, use the following example configurations that correspond to each database type that Soda SQL supports.

Amazon Athena
Amazon Redshift
Apache Hive
Google Cloud Platform BigQuery
Microsoft SQL Server (Experimental)
PostgreSQL
Snowflake

Amazon Athena

name: my_athena_project
connection:
    type: athena
    catalog: AwsDataCatalog
    database: sodalite_test
    access_key_id: env_var(AWS_ACCESS_KEY_ID)
    secret_access_key: env_var(AWS_SECRET_ACCESS_KEY)
    role_arn: 
    region: eu-west-1
    staging_dir: <YOUR STAGING PATH IN AWS S3>
...
Property Required Notes
type required  
catalog optional Default is AwsDataCatalog.
database required  
staging_dir required  
access_key_id optional Use environment variables to retrieve this value securely.
secret_access_key optional Use environment variables to retrieve this value securely.
role_arn optional The Amazon Resource Name of an IAM role that you want to use.
region optional  

Access keys and IAM role are mutually exclusive: if you provide values for access_key_id and secret_access_key, you cannot use Identity and Access Management role; if you provide value for role_arn, then you cannot use the access keys. Refer to Identity and Access Management in Athena for details.

Amazon Redshift

name: my_redshift_project
connection:
    type: redshift
    host: <YOUR AMAZON REDSHIFT HOSTNAME>
    username: soda
    password: <YOUR AMAZON REDSHIFT PASSWORD>
    database: soda_agent_test
    schema: public
    access_key_id: env_var(AWS_ACCESS_KEY_ID)
    secret_access_key: env_var(AWS_SECRET_ACCESS_KEY)
    role_arn: 
    region: eu-west-1
...
Property Required Notes
type required  
host required  
username required  
password required  
database required  
schema    
access_key_id optional Use environment variables to retrieve this value securely.
secret_access_key optional Use environment variables to retrieve this value securely.
role_arn optional The Amazon Resource Name of an IAM role that you want to use.
region optional  

Access keys and IAM role are mutually exclusive: if you provide values for access_key_id and secret_access_key, you cannot use Identity and Access Management role; if you provide value for role_arn, then you cannot use the access keys. Refer to Amazon Redshift Authorization parameters for details.

Apache Hive

name: my_hive_project
connection:
    type: hive
    host: localhost
    port: 10000
    username: env_var(HIVE_USERNAME)
    password: env_var(HIVE_PASSWORD)
    database: default
    configuration:
      hive.execution.engine: mr
      mapreduce.job.reduces: 2
...
Property Required Notes
type required  
host required  
port required  
username required Use environment variables to retrieve this value securely.
password required Use environment variables to retrieve this value securely.
database required  
hive.execution.engine required Input options are:
mr (Map reduce, default)
tez (Tez execution for Hadoop 2)
spark (Spark execution for Hive 1.1.0 or later)
mapreduce.job.reduces required Sets the number of reduce tasks per job. Input -1 for Hive to automatically determine the number of reducers.

GCP BigQuery

Use the values Google Cloud Platform provides when you create a service account. Use your BigQuery service account JSON key file.

name: my_bigquery_project
connection:
    type: bigquery
    account_info_json: >
      {
        "type": "service_account",
        "project_id": "...",
        "private_key_id": "...",
        "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
        "client_email": "...@project.iam.gserviceaccount.com",
        "client_id": "...",
        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
        "token_uri": "https://accounts.google.com/o/oauth2/token",
        "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
        "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
      }
    dataset: sodasql
...
Property Required
type required
project_id required
private_key_id required
private_key required
client_email required
client_id required
auth_uri required
token_uri required
auth_provider_x509_cert_url required
client_x509_cert_url required

Microsoft SQL Server (Experimental)

Note: MS SQL Server support is experimental.

name: my_sqlserver_project
connection:
  type: sqlserver
  host: <YOUR SQLServer HOSTNAME>
  username: env_var(SQL_SERVER_USERNAME)
  password: env_var(SQL_SERVER_PASSWORD)
  database: master
  schema: dbo
Property Required Notes
type required  
host required  
username required Use environment variables to retrieve this value securely.
password required Use environment variables to retrieve this value securely.
database required  
schema required  

PostgreSQL

name: my_postgres_project
connection:
    type: postgres
    host: localhost
    username: sodasql
    password: sodasql
    database: sodasql
    schema: public
...
Property Required Notes
type required  
host required  
username required Use environment variables to retrieve this value securely.
password required Use environment variables to retrieve this value securely.
database    
schema    

Snowflake

name: my_snowflake_project
connection:
    type: snowflake
    username: env_var(SNOWFLAKE_USERNAME)
    password: env_var(SNOWFLAKE_PASSWORD)
    account: YOUR_SNOWFLAKE_ACCOUNT.eu-west-1
    warehouse: YOUR_WAREHOUSE
    database: YOUR_DATABASE
    schema: PUBLIC
    role: PUBLIC
...
Property Required Notes
type required  
username required Use environment variables to retrieve this value securely.
password required Use environment variables to retrieve this value securely.
account required Example: YOUR_SNOWFLAKE_ACCOUNT.eu-west-1
warehouse required The name of your Snowflake virtual warehouse.
database optional  
schema required  
role optional See Snowflake System-Defined Roles for details.