Connect Soda to Trino
Access configuration details to connect Soda to a Trino data source.
For Soda to run quality scans on your data, you must configure it to connect to your data source. To learn how to set up Soda and configure it to connect to your data sources, see Get started.
Connection configuration reference
Install package: soda-trino
Refer to the Trino documentation for assistance.
data_source my_datasource_name:
  type: trino
  host: my.trino.host
  catalog: datalake
  schema: cw_dq
  auth_type: OAuth2ClientCredentialsAuthentication
  oauth:
    token_url: https://token-url.test.com/token
    client_id: XXX
    client_secret: YYY
    scope: "scope1 scope2" # optional, space-delimited
    grant_type: client_credentials # optional; this is the default value
Core connection properties
Property | Required | Description
type | yes | Identify the type of data source for Soda. Must be trino in this case.
host | yes | Provide a host identifier.
catalog | yes | Provide an identifier for the catalog which contains schemas and which references a data source using a connector. See Catalog in the Trino documentation.
schema | yes | Provide an identifier for the schema in which your dataset exists.
auth_type | yes | Authentication mode. Use OAuth2ClientCredentialsAuthentication for the OAuth 2.0 client credentials flow, BasicAuthentication in combination with user + password, or JWTAuthentication in combination with access_token and, optionally, username. Default: BasicAuthentication.
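As an illustration of the two non-OAuth modes described above, the following is a minimal sketch rather than a verified configuration. The key names username and password are assumptions inferred from the "user + password" wording. The data source names, host, and credential values are placeholders.

```yaml
# Sketch only: key names "username" and "password" are assumed, not confirmed by this page.
data_source my_trino_basic:
  type: trino
  host: my.trino.host
  catalog: datalake
  schema: cw_dq
  auth_type: BasicAuthentication   # also the default when auth_type is omitted
  username: my_user                # placeholder value
  password: my_password            # placeholder value

data_source my_trino_jwt:
  type: trino
  host: my.trino.host
  catalog: datalake
  schema: cw_dq
  auth_type: JWTAuthentication
  access_token: XXX                # placeholder token
  username: my_user                # optional with JWTAuthentication
```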
OAuth (client credentials flow) properties
When auth_type is OAuth2ClientCredentialsAuthentication, configure the following nested properties under oauth:
Property | Required | Description
oauth.token_url | yes | The OAuth 2.0 token endpoint to obtain an access token.
oauth.client_id | yes | OAuth 2.0 client identifier.
oauth.client_secret | yes | OAuth 2.0 client secret.
oauth.scope | no | Space-delimited, case-sensitive list of strings per RFC 6749 (e.g., "scope1 scope2"). No default value.
oauth.grant_type | no | OAuth 2.0 grant type. Leave unset to use the default client_credentials flow.
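Because scope and grant_type are optional, a configuration can carry only the three required oauth properties. The sketch below shows that reduced form. It also assumes that your Soda installation resolves ${VAR}-style environment variable references in the configuration file. The variable names TRINO_CLIENT_ID and TRINO_CLIENT_SECRET are hypothetical.

```yaml
# Sketch only: assumes ${...} environment variable substitution is available;
# TRINO_CLIENT_ID and TRINO_CLIENT_SECRET are hypothetical variable names.
data_source my_datasource_name:
  type: trino
  host: my.trino.host
  catalog: datalake
  schema: cw_dq
  auth_type: OAuth2ClientCredentialsAuthentication
  oauth:
    token_url: https://token-url.test.com/token
    client_id: ${TRINO_CLIENT_ID}
    client_secret: ${TRINO_CLIENT_SECRET}
    # scope omitted: no default value is applied
    # grant_type omitted: defaults to client_credentials
```

Keeping the client secret out of the file itself avoids committing credentials alongside the rest of the configuration.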
Test the data source connection
To confirm that you have correctly configured the connection details for the data source(s) in your configuration YAML file, use the test-connection command. Optionally, add the -V option to the command to return results in verbose mode in the CLI.
soda test-connection -d my_datasource_name -c configuration.yml -V
Supported data types
Category | Data types
text | CHAR, VARCHAR
number | NUMBER, INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT, FLOAT, FLOAT4, FLOAT8, DOUBLE, DOUBLE PRECISION, REAL
time | DATE, DATETIME, TIME, TIMESTAMP, TIMESTAMP_LTZ, TIMESTAMP_NTZ, TIMESTAMP_TZ