Connect Soda to Trino

Access configuration details to connect Soda to a Trino data source.

For Soda to run quality scans on your data, you must configure it to connect to your data source. To learn how to set up Soda and configure it to connect to your data sources, see Get started.

Connection configuration reference

Install package: soda-trino

Refer to the Trino documentation for assistance.

data_source my_datasource_name:
  type: trino
  host: 127.0.0.1
  port: "5432"
  username: simple
  password: simple_pass
  catalog: hive
  schema: public
  source: 
  http_headers: 
  client_tags: ["test","test2"]

| Property | Required | Notes |
| --- | --- | --- |
| type | required | Identify the type of data source for Soda. |
| host | required | Provide a host identifier. |
| port | optional | Provide a port identifier. |
| auth_type | optional | Use BasicAuthentication in combination with username and password, or JWTAuthentication in combination with access_token and, optionally, username. Default: BasicAuthentication. |
| access_token | optional | Map to the JWT access token. Only applicable if auth_type = JWTAuthentication. |
| username | required | Optional if auth_type = JWTAuthentication. Consider using system variables to retrieve this value securely, for example ${TRINO_USER}. |
| password | optional | Only applicable if auth_type = BasicAuthentication. Consider using system variables to retrieve this value securely, for example ${TRINO_PASSWORD}. |
| catalog | required | Provide an identifier for the catalog, which contains schemas and references a data source using a connector. See Catalog in the Trino documentation. |
| schema | required | Provide an identifier for the schema in which your dataset exists. |
| source | optional | |
| http_headers | optional | Provide any HTTP headers as needed. See Trino documentation for details. |
| client_tags | optional | Provide a list of tag strings to identify Trino resource groups. See Trino documentation for details. |
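
The authentication properties described above can be combined in two ways. The following is a minimal sketch, not a definitive configuration: the data source names trino_basic and trino_jwt and the variable name TRINO_ACCESS_TOKEN are hypothetical placeholders chosen for illustration, and the host, catalog, and schema values are copied from the example above.

```yaml
# Basic authentication (the default auth_type): username and password
# are read from system variables rather than stored in the file.
data_source trino_basic:
  type: trino
  host: 127.0.0.1
  username: ${TRINO_USER}
  password: ${TRINO_PASSWORD}
  catalog: hive
  schema: public

# JWT authentication: access_token replaces password, and username
# becomes optional. TRINO_ACCESS_TOKEN is an assumed variable name.
data_source trino_jwt:
  type: trino
  host: 127.0.0.1
  auth_type: JWTAuthentication
  access_token: ${TRINO_ACCESS_TOKEN}
  catalog: hive
  schema: public
```

The referenced system variables must be set in the environment in which Soda runs. Otherwise, the placeholders cannot resolve to real credentials.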

Test the data source connection

To confirm that you have correctly configured the connection details for the data source(s) in your configuration YAML file, use the test-connection command. If you wish, add a -V option to the command to return results in verbose mode in the CLI.

soda test-connection -d my_datasource_name -c configuration.yml -V

Supported data types

| Category | Data type |
| --- | --- |
| text | CHAR, VARCHAR |
| number | NUMBER, INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT, FLOAT, FLOAT4, FLOAT8, DOUBLE, DOUBLE PRECISION, REAL |
| time | DATE, DATETIME, TIME, TIMESTAMP, TIMESTAMP_LTZ, TIMESTAMP_NTZ, TIMESTAMP_TZ |
