A warehouse represents a SQL engine or database such as Snowflake, Amazon Redshift, or PostgreSQL. You use a warehouse YAML file to configure connection details for Soda SQL to access your warehouse.
You need to create a warehouse YAML file for every warehouse to which you want to connect. You can create warehouse YAML files manually, but the CLI command
soda create yourwarehousetype automatically prepares a warehouse YAML file and an env_vars YAML file for you. (Use the env-vars YAML to securely store warehouse login credentials. See Env_vars YAML below.)
When you execute the
soda create yourwarehousetype command, you include options that instruct Soda SQL in the creation of the file, and you indicate the type of warehouse, a specification Soda SQL requires. Use
soda create --help for a list of all available warehouse types and options.
The example below includes the following optional details:
-dprovides the name of the database
-uprovides the username to log in to the database
-wprovides the name of the warehouse directory
$ soda create yourwarehousetype -d sodasql -u sodasql -w soda_sql_tutorial
| Soda CLI version 2.x.xx | Creating warehouse YAML file warehouse.yml ... | Creating /Users/Me/.soda/env_vars.yml with example env vars in section soda_sql_tutorial | Review warehouse.yml by running command | cat warehouse.yml | Review section soda_sql_tutorial in ~/.soda/env_vars.yml by running command | cat ~/.soda/env_vars.yml | Then run the soda analyze command
In the above example, Soda SQL created a warehouse YAML file and put it in a warehouse directory which is the top directory in your Soda SQL project directory structure. (It puts the env_vars YAML file in your local user home directory.)
When it creates your warehouse YAML file, Soda SQL pre-populates it with the options details you provided. The following is an example of a warehouse YAML file that Soda SQL created and pre-populated.
name: soda_sql_tutorial connection: type: postgres host: localhost username: env_var(POSTGRES_USERNAME) password: env_var(POSTGRES_PASSWORD) database: sodasql schema: public
Notice that even though the command provided a value for
username, Soda SQL automatically used
env_var(POSTGRES_USERNAME) instead. By default, Soda SQL stores database login credentials in an env_vars YAML file so that this sensitive information stays locally stored. See Env_vars YAML below for details.
Each type of warehouse requires different configuration parameters. Refer to Set warehouse configurations for details that correspond to the type of database you are using.
To keep your warehouse YAML file free of warehouse login credentials, Soda SQL references environment variables. When it creates a new warehouse YAML file, Soda SQL also creates an env_vars YAML file to store your database username and password values. Soda SQL does not overwrite or remove and existing environment variables, it only adds new.
When it runs a scan, Soda SQL loads environment variables from your local user home directory where it stored your env_vars YAML file.
Use the command
cat ~/.soda/env_vars.yml to review the contents of your env_vars YAML file. Open this hidden file from your local user home directory to input the values for your database credentials.
soda_sql_tutorial: POSTGRES_USERNAME: myexampleusername POSTGRES_PASSWORD: myexamplepassword some_other_soda_project: SNOWFLAKE_USERNAME: someotherexampleusername SNOWFLAKE_PASSWORD: someotherexamplepassword
Beyond storing warehouse login credentials, you can use env_vars to securely store any parameter in the warehouse YAML file. Best practice dictates that you use env_vars to store login credentials and any other sensitive data such as API keys or tokens. Note, however, that it is not mandatory that you use this env_vars YAML file to store your credentials. The env_vars YAML is where Soda SQL looks first when it runs a scan, but if the file does not exist, it uses the runtime environment variables from the current shell.
For example, if you use you Google Cloud Platform, you can set runtime environment variables in your cloud platform where Soda SQL will find the login credentials it needs to access your warehouse. Set a single variable in your cloud platform that Soda SQL can locate and use. Then, since you do not need it, locate and delete the env_vars YAML file that Soda SQL created in your local home directory.