Alation

You will be prompted to contact support as a last step to finish setting up the Alation integration.

Integrate Soda with Alation to access details about the quality of your data from within the data catalog.

  • Run data quality checks using Soda and visualize quality metrics and rules within the context of a data source, dataset, or column in Alation.

  • Use Soda Cloud to flag poor-quality data in lineage diagrams and during live querying.

  • Give your Alation users the confidence of knowing that the data they are using is sound.

🎥 Watch a 5-minute overview showcasing the integration of Soda and Alation.

Alation dashboard when integrated with Soda

Prerequisites

  • You have verified some contracts and published the results to Soda Cloud.

  • You have an Alation account with the privileges necessary to allow you to add a data source, create custom fields, and customize templates.

  • You have a git repository in which to store the integration project files.


Local setup

1

Credentials

1.1. Log in to Soda Cloud

Sign in to your Soda Cloud account and confirm that the data source you wish to test for quality contains the datasets you expect to see.

1.2. Create an .env file

To connect your Soda Cloud account to your Alation Service Account, create an .env file in your integration project in your git repo and include details according to the example below. Refer to Generate API keys to obtain the values for your Soda API keys.

Optional Configurations:

  • SSL_VERIFY - True / False. Sets whether SSL certificates are verified (default True). Disable SSL verification only temporarily, for example during a proof of concept, and never in production.

  • SSL_VERIFY_PATH - string. Path to a local certificate to use when the connection relies on custom or self-signed SSL certificates.

  • REFRESH_TOKEN_PATH - string. Storage path for the Alation refresh token.

  • CONFIGURED_REFRESH_TOKEN_PATH - string. Path to a directory where the Alation refresh token is located. This is used to initially configure an Alation refresh token. The file in this path is copied into the REFRESH_TOKEN_PATH.

  • SODA_DATASOURCE_MAPPING_FILE - string. Path, including the file name, to the data source mapping file.

  • SODA_DESCRIPTION_ATTRIBUTE - string. Name of the Soda check attribute to use for the custom description fields in Alation.
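To make the defaults and precedence of these optional settings concrete, here is a minimal Python sketch that reads them from the environment. It is not the integration's own code: the `OptionalSettings` class, the helper names, and the rule that a certificate path is only consulted when verification is enabled are assumptions made for illustration. Only the variable names and the SSL_VERIFY default come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Union


@dataclass(frozen=True)
class OptionalSettings:
    """Hypothetical container for the optional settings listed above."""
    ssl_verify: bool
    ssl_verify_path: Optional[str]
    refresh_token_path: Optional[str]
    configured_refresh_token_path: Optional[str]
    datasource_mapping_file: Optional[str]
    description_attribute: Optional[str]

    def verify_value(self) -> Union[bool, str]:
        """Return False, a certificate path, or True (assumed precedence)."""
        if not self.ssl_verify:
            return False
        return self.ssl_verify_path or True


def parse_bool(raw: Optional[str], default: bool) -> bool:
    """Interpret 'True'/'False' strings; fall back to the default when unset."""
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() == "true"


def load_optional_settings(env: Mapping[str, str] = os.environ) -> OptionalSettings:
    """Read the optional variables; unset or empty values become None."""
    return OptionalSettings(
        ssl_verify=parse_bool(env.get("SSL_VERIFY"), default=True),
        ssl_verify_path=env.get("SSL_VERIFY_PATH") or None,
        refresh_token_path=env.get("REFRESH_TOKEN_PATH") or None,
        configured_refresh_token_path=env.get("CONFIGURED_REFRESH_TOKEN_PATH") or None,
        datasource_mapping_file=env.get("SODA_DATASOURCE_MAPPING_FILE") or None,
        description_attribute=env.get("SODA_DESCRIPTION_ATTRIBUTE") or None,
    )
```

Calling `load_optional_settings()` with no argument reads the process environment; passing a plain dictionary makes it easy to check a configuration before committing it to the .env file.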

2

Data source mappings

Create a mapping file

To sync a data source and schema in the Alation catalog to a data source in Soda Cloud, you must map it from Soda Cloud to Alation. Create a .datasource-mapping.yml file in your integration project and populate it with mapping data according to the following example. The table below describes where to retrieve the values for each field.

| Field | Retrieve value from |
| --- | --- |
| name | A name you choose as an identifier for an integration between Soda Cloud and a data catalog. |
| soda: datasource_id | The data source information panel in Soda Cloud. |
| soda: datasource_name | The data source information panel in Soda Cloud. |
| soda: dataset_mapping | (Optional) When you run the integration, Soda automatically maps all of the datasets between data sources. However, if the names of the datasets differ between the tools, you can use this property to manually map datasets between tools. |
| catalog: type | The name of the cataloging software; in this case, “alation”. |
| catalog: datasource_id | Retrieve this value from the URL on the data source page in the Alation catalog; see image below. |
| catalog: datasource_container_name | The schema of the data source; retrieve this value from the data source page in the Alation catalog under the subheading Schemas. See image below. |
| catalog: datasource_container_id | The ID of the datasource_container_name (the schema of the data source); retrieve this value from the schema page in the Alation catalog. See image below. |
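As a rough illustration of how the fields above fit together, the following Python sketch models one mapping entry as a nested dictionary and checks it for missing values. The nesting under `soda` and `catalog` is inferred from the field names in the table; the placeholder values, the function names, and the direction of `dataset_mapping` (Soda dataset name to catalog dataset name) are assumptions, not the integration's documented behavior.

```python
from typing import Any, Dict, List

# Required fields per section, taken from the table above.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "soda": ["datasource_id", "datasource_name"],
    "catalog": [
        "type",
        "datasource_id",
        "datasource_container_name",
        "datasource_container_id",
    ],
}


def validate_mapping(entry: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    if not entry.get("name"):
        problems.append("name is missing")
    for section, fields in REQUIRED_FIELDS.items():
        block = entry.get(section) or {}
        for field in fields:
            if block.get(field) in (None, ""):
                problems.append(f"{section}.{field} is missing")
    catalog_type = (entry.get("catalog") or {}).get("type")
    if catalog_type not in (None, "") and catalog_type != "alation":
        problems.append("catalog.type must be 'alation'")
    return problems


def resolve_dataset_name(entry: Dict[str, Any], soda_dataset: str) -> str:
    """Apply the optional manual mapping; default to the same name."""
    mapping = (entry.get("soda") or {}).get("dataset_mapping") or {}
    return mapping.get(soda_dataset, soda_dataset)


# Hypothetical entry with placeholder values only.
example_entry = {
    "name": "example_mapping",
    "soda": {
        "datasource_id": "placeholder-soda-id",
        "datasource_name": "example_source",
        "dataset_mapping": {"orders_v2": "orders"},
    },
    "catalog": {
        "type": "alation",
        "datasource_id": "1",
        "datasource_container_name": "public",
        "datasource_container_id": "1",
    },
}
```

Running `validate_mapping(example_entry)` returns an empty list, while an entry without its `catalog` section reports each missing catalog field by name.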

  • Retrieve the Alation datasource_id from the URL

    The datasource_id is in the URL of the data source page

  • Retrieve the Alation datasource_container_name (schema) from the data source page

    The datasource_container_name is the data source schema name

  • Retrieve the Alation datasource_container_id for the datasource_container_name from the URL on the schema page.
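Both IDs are read from a page URL, so a small helper can reduce copy-and-paste mistakes. The sketch below assumes the ID is the last purely numeric segment of the URL path; the example hosts and paths in the usage note are hypothetical, so check them against the URLs your own Alation instance displays.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def last_numeric_segment(url: str) -> Optional[int]:
    """Return the last all-digit path segment of a URL, or None if absent.

    Assumption: the Alation ID appears as a numeric path segment.
    """
    path = urlparse(url).path
    segments = [segment for segment in path.split("/") if segment]
    for segment in reversed(segments):
        if re.fullmatch(r"[0-9]+", segment):
            return int(segment)
    return None
```

For a hypothetical data source URL such as `https://alation.example.com/data/42/`, the helper returns `42`; a URL with no numeric segment yields `None`.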

3

Onboarded datasets

Make sure that the datasets whose data quality results you wish to sync from Soda Cloud to the catalog are fully onboarded in both tools.

4

[Optional] Enable API access to Alation with SSO

If your Alation account employs single sign-on (SSO) access, you must create an API service account for Soda to integrate with Alation.

If your Alation account does not use SSO, skip this step and proceed to Customize the catalog.

5

Customize the catalog

Some catalogs require manual customization for this integration to work.

In Alation, you create custom fields in a global context on the Settings > Customize Catalog > Custom Fields page. You can then attach these fields to any entity on the Custom Templates tab of the same page.

Set up the following custom fields and then attach them to the Table entity:

  • Has DQ - Picker with True and False values. Use these values exactly as written because the Alation API is case-sensitive.

  • Profile - Last Run - Date

  • Soda DQ Overview - Rich Text

5.1. Custom fields

Create custom fields in Alation that reference information that Soda Cloud pushes to the catalog. Catalog users see Soda Cloud data quality details in these fields. In your Alation account, navigate to Settings > Catalog Admin > Customize Catalog. In the Custom Fields tab, create the following fields:

  • Under the Pickers heading, create a field for “Has DQ” with Options “True” and “False”. The Alation API is case-sensitive, so be sure to use these exact values.

  • Under the Dates heading, create a field for “Profile - Last Run”.

  • Under the Rich Texts heading, create the following fields:

    • “Soda DQ Overview”

    • “Soda Data Quality Rules”

    • “Data Quality Metrics”
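The field names and types above can be kept as a small checklist in code, which helps when confirming that nothing was missed or misspelled. This Python sketch is only an illustration: the function names are hypothetical, and it compares against a list of names you supply by hand rather than calling any Alation API.

```python
from typing import Dict, Iterable, List

# Custom field names and types, exactly as listed above.
EXPECTED_CUSTOM_FIELDS: Dict[str, str] = {
    "Has DQ": "Picker",
    "Profile - Last Run": "Date",
    "Soda DQ Overview": "Rich Text",
    "Soda Data Quality Rules": "Rich Text",
    "Data Quality Metrics": "Rich Text",
}


def missing_custom_fields(existing_names: Iterable[str]) -> List[str]:
    """Return expected field names that are absent (exact, case-sensitive match)."""
    existing = set(existing_names)
    return [name for name in EXPECTED_CUSTOM_FIELDS if name not in existing]


def picker_value(has_dq: bool) -> str:
    """Map a boolean to the exact picker option text, 'True' or 'False'."""
    return "True" if has_dq else "False"
```

Because the comparison is case-sensitive, a field created as “has dq” is reported as missing, which mirrors the exact-match requirement for the picker values.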

5.2. Add custom fields to Custom Templates

Add each new custom field to a Custom Template in Alation. In Customize Catalog, in the Custom Templates tab, select the Table template, then click Insert… to add a custom field to the template:

  • “Soda DQ Overview”

5.3. Add "Data Quality Info" grouping

In the Table template, click Insert… to add a Grouping of Custom Fields. Label the grouping “Data Quality Info”, then Insert… two custom fields:

  • “Has DQ”

  • “Profile - Last Run”

In the Column template, click Insert… to add a custom field to the template:

    • “Has DQ”

5.4. Add "Soda Data Profile Information" grouping

In the Column template, click Insert… to add a Grouping of Custom Fields. Label the grouping “Soda Data Profile Information”, then Insert… two custom fields:

  • Data Quality Metrics

  • Soda Data Quality Rules

Kubernetes setup

The Soda-Alation integration can be run in a Kubernetes cluster. Contact Soda support to run the integration in your organization.


Run the integration

Contact Soda support directly to acquire the assets and instructions to run the integration and view Soda Cloud details in your Alation catalog.


Use the integration

In Soda Cloud, create no-code checks or initiate a request so that checks execute against the datasets in your data source each time you run a Soda scan manually or orchestrate a scan using a data pipeline tool, such as Airflow. Soda Cloud pushes data quality scan results to the corresponding data source in Alation so that users can review data quality information from within the catalog.

In Alation, beyond reviewing data quality information for the data source, users can access the Joins and Lineage tabs of individual datasets to examine details and investigate the source of any data quality issues.

Open in Soda

On a dataset page in Alation, in the Overview tab, users can review Soda information within Alation or open Soda Cloud directly to examine data quality details.

Under the Soda DQ Overview heading in Alation, click Open in Soda to access the dataset page in Soda Cloud.

