Connect Soda to Trino
Access configuration details to connect Soda to a Trino data source.
For Soda to run quality scans on your data, you must configure it to connect to your data source. To learn how to set up Soda and configure it to connect to your data sources, see Get started.
Connection configuration reference
Install package: soda-trino
Reference Trino documentation for assistance.
data_source my_datasource_name:
  type: trino
  host: 127.0.0.1
  port: "5432"
  username: simple
  password: simple_pass
  catalog: hive
  schema: public
  source:
  http_headers:
  client_tags: ["test","test2"]
type (required): Identify the type of data source for Soda.
host (required): Provide a host identifier.
port (optional): Provide a port identifier.
auth_type (optional): BasicAuthentication in combination with username and password, or JWTAuthentication in combination with access_token and, optionally, username. Default: BasicAuthentication.
access_token (optional): Map to the JWT access token. Only applicable if auth_type = JWTAuthentication.
username (required, unless auth_type = JWTAuthentication): Consider using system variables to retrieve this value securely, for example ${TRINO_USER}.
password (optional): Consider using system variables to retrieve this value securely, for example ${TRINO_PASSWORD}. Only applicable if auth_type = BasicAuthentication.
catalog (required): Provide an identifier for the catalog, which contains schemas and references a data source using a connector. See Catalog in the Trino documentation.
schema (required): Provide an identifier for the schema in which your dataset exists.
source (optional)
client_tags (optional): Provide a list of tag strings to identify Trino resource groups. See Trino documentation for details.
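The two authentication modes described above can be sketched as follows. This is a minimal illustration and not a definitive configuration: the data source names my_trino_basic and my_trino_jwt are hypothetical, the host, port, catalog, and schema values are reused from the example above, and TRINO_ACCESS_TOKEN is an assumed environment variable name chosen to match the ${TRINO_USER} and ${TRINO_PASSWORD} pattern.

```yaml
# Basic authentication (the default): username and password are read
# from environment variables instead of being stored in the file.
data_source my_trino_basic:        # hypothetical data source name
  type: trino
  host: 127.0.0.1
  port: "5432"
  auth_type: BasicAuthentication
  username: ${TRINO_USER}
  password: ${TRINO_PASSWORD}
  catalog: hive
  schema: public

# JWT authentication: access_token replaces password, and username
# becomes optional (omitted here).
data_source my_trino_jwt:          # hypothetical data source name
  type: trino
  host: 127.0.0.1
  port: "5432"
  auth_type: JWTAuthentication
  access_token: ${TRINO_ACCESS_TOKEN}   # assumed variable name
  catalog: hive
  schema: public
```

Keeping credentials in environment variables means the configuration file can be committed to version control without exposing secrets.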
Test the data source connection
To confirm that you have correctly configured the connection details for the data source(s) in your configuration YAML file, use the test-connection command. If you wish, add a -V option to the command to return results in verbose mode in the CLI.
soda test-connection -d my_datasource_name -c configuration.yml -V
Supported data types
text: CHAR, VARCHAR
number: NUMBER, INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT, FLOAT, FLOAT4, FLOAT8, DOUBLE, DOUBLE PRECISION, REAL
time: DATE, DATETIME, TIME, TIMESTAMP, TIMESTAMP_LTZ, TIMESTAMP_NTZ, TIMESTAMP_TZ