Connect Soda to Trino

Access configuration details to connect Soda to a Trino data source.

For Soda to run quality scans on your data, you must configure it to connect to your data source. To learn how to set up Soda and configure it to connect to your data sources, see Get started.

Connection configuration reference

Install package: soda-trino

Refer to the Trino documentation for assistance.

data_source my_datasource_name:
  type: trino
  host: my.trino.host
  catalog: datalake
  schema: cw_dq
  auth_type: OAuth2ClientCredentialsAuthentication
  oauth:
    token_url: https://token-url.test.com/token
    client_id: XXX
    client_secret: YYY
    scope: "scope1 scope2" # optional, space-delimited
    grant_type: client_credentials # optional, this is the default value

Core connection properties

| Property | Required | Description |
| --- | --- | --- |
| type | yes | Identify the type of data source for Soda. Must be trino in this case. |
| host | yes | Provide a host identifier. |
| catalog | yes | Provide an identifier for the catalog which contains schemas and which references a data source using a connector. See Catalog in the Trino documentation. |
| schema | yes | Provide an identifier for the schema in which your dataset exists. |
| auth_type | yes | Authentication mode. One of: BasicAuthentication, used in combination with user and password; JWTAuthentication, used in combination with access_token and, optionally, username; or OAuth2ClientCredentialsAuthentication, which uses the OAuth 2.0 client credentials flow. Default: BasicAuthentication. |
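
The core properties listed above can be checked before a scan is attempted. The following is a minimal sketch, not part of Soda: check_core_properties is a hypothetical helper, and it assumes the data source block has already been loaded into a Python dict (for example, with a YAML parser). Soda performs its own validation when it connects.

```python
# Hypothetical pre-flight check based on the core properties described above.
REQUIRED_CORE = ("type", "host", "catalog", "schema", "auth_type")
AUTH_TYPES = (
    "BasicAuthentication",
    "JWTAuthentication",
    "OAuth2ClientCredentialsAuthentication",
)


def check_core_properties(config: dict) -> list[str]:
    """Return a list of problems found in a Trino data source block."""
    # Every property marked as required must be present and non-empty.
    problems = [
        f"missing required property: {key}"
        for key in REQUIRED_CORE
        if not config.get(key)
    ]
    # The type property must be trino for this data source.
    if config.get("type") not in (None, "trino"):
        problems.append("type must be trino")
    # auth_type must be one of the three documented modes.
    auth_type = config.get("auth_type")
    if auth_type and auth_type not in AUTH_TYPES:
        problems.append(f"unknown auth_type: {auth_type}")
    return problems
```

An empty list means the block has every core property and a recognized auth_type; it does not confirm that the host is reachable or that the credentials are valid.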

OAuth (client credentials flow) properties

When auth_type is OAuth2ClientCredentialsAuthentication, configure the following nested properties under oauth:.

| Property | Required | Description |
| --- | --- | --- |
| oauth.token_url | yes | The OAuth 2.0 token endpoint to obtain an access token. |
| oauth.client_id | yes | OAuth 2.0 client identifier. |
| oauth.client_secret | yes | OAuth 2.0 client secret. |
| oauth.scope | no | Space-delimited, case-sensitive list of strings per RFC 6749 (e.g., "scope1 scope2"). No default value. |
| oauth.grant_type | no | OAuth 2.0 grant type. Leave unset to use the default client_credentials flow. |
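
To make the role of each oauth property concrete, the sketch below builds the token request that the client credentials grant in RFC 6749 (section 4.4) describes. It illustrates the generic flow, not Soda's internal implementation: build_token_request is a hypothetical helper, and sending client_id and client_secret in the form body (rather than in an HTTP Basic header) is an assumption that depends on the identity provider.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_url: str, client_id: str, client_secret: str,
                        scope: Optional[str] = None,
                        grant_type: str = "client_credentials") -> Request:
    """Build (but do not send) a POST request for an OAuth 2.0 access token."""
    form = {
        "grant_type": grant_type,  # defaults to the client credentials grant
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scope:
        # Scope is a single space-delimited string, e.g. "scope1 scope2".
        form["scope"] = scope
    return Request(
        token_url,
        data=urlencode(form).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request. Per RFC 6749, a successful response is a JSON document that contains an access_token field.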

Test the data source connection

To confirm that you have correctly configured the connection details for the data source(s) in your configuration YAML file, use the test-connection command. If you wish, add a -V option to the command to return results in verbose mode in the CLI.

soda test-connection -d my_datasource_name -c configuration.yml -V
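
If the connection test is part of an automated setup, the same command can be invoked from a script. This is a minimal sketch, assuming the soda CLI is installed and on the PATH. The helper names are hypothetical, the flags are the ones shown in the command above, and treating a zero exit code as success is an assumption.

```python
import subprocess


def build_test_connection_command(data_source: str, config_path: str,
                                  verbose: bool = False) -> list[str]:
    """Assemble the soda test-connection command as an argument list."""
    command = ["soda", "test-connection", "-d", data_source, "-c", config_path]
    if verbose:
        command.append("-V")  # verbose output in the CLI
    return command


def connection_ok(data_source: str, config_path: str) -> bool:
    """Run the command; assumes a zero exit code means the test passed."""
    result = subprocess.run(
        build_test_connection_command(data_source, config_path, verbose=True),
        check=False,
    )
    return result.returncode == 0
```

For example, connection_ok("my_datasource_name", "configuration.yml") runs the test against the data source defined in the configuration file.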

Supported data types

| Category | Data type |
| --- | --- |
| text | CHAR, VARCHAR |
| number | NUMBER, INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT, FLOAT, FLOAT4, FLOAT8, DOUBLE, DOUBLE PRECISION, REAL |
| time | DATE, DATETIME, TIME, TIMESTAMP, TIMESTAMP_LTZ, TIMESTAMP_NTZ, TIMESTAMP_TZ |
