Connect Soda to Amazon Athena
Access configuration details to connect Soda to an Athena data source.
For Soda to run quality scans on your data, you must configure it to connect to your data source. To learn how to set up Soda and configure it to connect to your data sources, see Get started.
Connection configuration reference
Install package: soda-athena
data_source my_datasource_name:
  type: athena
  access_key_id: kk9gDU6800xxxx
  secret_access_key: 88f&eeTuT47xxxx
  region_name: eu-west-1
  staging_dir: s3://s3-results-bucket/output/
  schema: public
| Property | Required | Notes |
| --- | --- | --- |
| type | required | Identify the type of data source for Soda. |
| access_key_id | required 1 | Consider using system variables to retrieve this value securely. See Manage access keys for IAM users. |
| secret_access_key | required 1 | Consider using system variables to retrieve this value securely. See Manage access keys for IAM users. |
| role_arn | optional 1 | Specify the role to use for authentication and authorization. |
| staging_dir | required | Identify the Amazon S3 staging directory (the Query Result Location in AWS); see Specifying a query result location. |
| schema | required | Identify the schema in the data source in which your tables exist. |
| catalog | optional | Identify the name of the data source, also referred to as a catalog. The default value is awsdatacatalog. |
| work_group | optional | Identify a non-default workgroup in your region. In your Athena console, find your current workgroup in the Workgroup option at the upper right. Read more about Athena Workgroups. |
| session_token | optional | Add a session token to use for authentication and authorization. |
| profile_name | optional | Specify the profile name from your local AWS configuration to use for authentication and authorization. |
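The advice to keep credentials out of the configuration file can be sketched as follows. This is a minimal example, assuming your Soda version resolves `${...}` references in the configuration YAML from environment variables. The variable names `ATHENA_ACCESS_KEY_ID` and `ATHENA_SECRET_ACCESS_KEY` and the workgroup name `my_workgroup` are illustrative assumptions, not names that Soda or AWS require.

```yaml
# configuration.yml
# Sketch only: the environment variable names below are hypothetical.
data_source my_datasource_name:
  type: athena
  # Read from the environment instead of being stored in the file
  access_key_id: ${ATHENA_ACCESS_KEY_ID}
  secret_access_key: ${ATHENA_SECRET_ACCESS_KEY}
  region_name: eu-west-1
  staging_dir: s3://s3-results-bucket/output/
  schema: public
  # Optional properties
  catalog: awsdatacatalog     # the default value, shown for illustration
  work_group: my_workgroup    # hypothetical non-default workgroup name
```

Set both variables in the shell or CI environment that runs Soda before you start a scan, so that the secret values never appear in version control.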
Test the data source connection
To confirm that you have correctly configured the connection details for the data source(s) in your configuration YAML file, use the test-connection command. Optionally, add the -V option to return results in verbose mode in the CLI.
soda test-connection -d my_datasource_name -c configuration.yml -V
Supported data types
| Category | Data type |
| --- | --- |
| text | CHAR, VARCHAR, STRING |
| number | TINYINT, SMALLINT, INT, INTEGER, BIGINT, DOUBLE, FLOAT, DECIMAL |
| time | DATE, TIMESTAMP |