
Data security and privacy

Last modified on 18-Mar-24

Soda works in several ways to ensure your data and systems are secure and remain private.

Compliance and reporting
Using a Soda-hosted agent
Connecting with Soda Library
Sending data to Soda Cloud
Single sign-on with Soda Cloud

Compliance and reporting

Following an independent review in April 2022, Soda was found to be SOC 2 Type 2 compliant. Contact support@soda.io for more information.


Using a Soda-hosted agent

Soda hosts agents in a secure environment in Amazon Web Services (AWS). As a SOC 2 Type 2 certified business, Soda manages Soda-hosted agents so that each one remains private, secure, and independent of all other hosted agents.

  • Soda encrypts the values that pertain to data source connections and uses them only to access the data and run scans for data quality. It uses asymmetric keys in AWS KMS to encrypt and store the values you provide for access to your data source. AWS KMS keys are certified under the FIPS 140-2 Cryptographic Module Validation Program.
  • Soda encrypts the secrets you provide via Soda Cloud both in transit and at rest. This end-to-end encryption means that secrets leave your browser already encrypted and can only be decrypted with a private key that only the Soda Agent can access.
  • Once you enter data source access credentials into Soda Cloud, neither you nor any other user or entity can access the values, because they have been encrypted and can only be decrypted by the Soda Agent.
  • If your data source accepts allowlisted IP addresses, add the Soda Cloud IP address to the allowlist to access your data sources via the Soda-hosted Agent. Obtain this IP address in Soda Cloud when connecting a new data source.

Connecting with Soda Library

You install Soda Library in your own environment and use its command-line tools to connect securely to a data source, with login credentials stored in system variables.
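As an illustration of keeping login credentials in system variables rather than in a file, the following is a minimal Python sketch. It is not Soda Library's own implementation: the `CONNECTION_TEMPLATE` dictionary, the `resolve_connection` helper, and the `POSTGRES_*` variable names are all hypothetical and exist only for this example.

```python
import os
from string import Template

# Hypothetical connection template. The ${...} placeholders name
# environment variables, so no secret is written into the template itself.
CONNECTION_TEMPLATE = {
    "type": "postgres",
    "host": "${POSTGRES_HOST}",
    "username": "${POSTGRES_USERNAME}",
    "password": "${POSTGRES_PASSWORD}",
}


def resolve_connection(template, env=None):
    """Return a copy of template with ${VAR} placeholders filled from env.

    Raises KeyError if a referenced variable is not set, so a missing
    credential fails loudly instead of becoming an empty string.
    """
    env = os.environ if env is None else env
    return {key: Template(value).substitute(env) for key, value in template.items()}


if __name__ == "__main__":
    # Stand-in values; in practice these come from the real environment.
    example_env = {
        "POSTGRES_HOST": "db.internal.example",
        "POSTGRES_USERNAME": "soda_reader",
        "POSTGRES_PASSWORD": "not-a-real-password",
    }
    resolved = resolve_connection(CONNECTION_TEMPLATE, example_env)
    print(resolved["username"])
```

Passing `env=None` makes the helper read from `os.environ`, which is the case this section describes; the explicit dictionary is used here only so the example runs without any variables being set.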

You can connect Soda Library to your Soda Cloud account. To communicate with your data source, Soda Cloud uses a Network Address Translation (NAT) gateway with the IP address 54.78.91.111. You may wish to add this IP address to your data source’s allowlist.
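Before relying on an allowlist, it can help to confirm that the address above is actually covered by the entries you have configured. The sketch below uses Python's standard `ipaddress` module; the `is_allowlisted` helper and the sample allowlist are hypothetical, and only the IP address itself comes from this page.

```python
import ipaddress

# The NAT gateway address stated in this document.
SODA_CLOUD_NAT_IP = "54.78.91.111"


def is_allowlisted(ip, allowlist):
    """Return True if ip falls inside any entry of allowlist.

    Entries may be single addresses ("54.78.91.111") or CIDR ranges
    ("10.0.0.0/8").
    """
    address = ipaddress.ip_address(ip)
    networks = (ipaddress.ip_network(entry, strict=False) for entry in allowlist)
    return any(address in network for network in networks)


if __name__ == "__main__":
    # Hypothetical allowlist for a data source firewall.
    allowlist = ["10.0.0.0/8", "54.78.91.111/32"]
    print(is_allowlisted(SODA_CLOUD_NAT_IP, allowlist))
```

How the allowlist is stored and enforced depends on your data source or firewall; this check only mirrors the matching logic locally.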

Sending data to Soda Cloud

Soda Library uses a secure API to connect to Soda Cloud. When Soda Library completes a scan, it pushes the scan results to your Soda Cloud account, where you can log in and examine the details in the web application.

Notably, your Soda Cloud account does not store the raw data that Soda Library scans. Soda Library pushes metadata to Soda Cloud; by default all your data stays inside your private network.

Soda Cloud does store the following:

  • metadata, such as column names
  • aggregated metrics, such as averages
  • sample rows and failed rows, if you explicitly set up your configuration to send this data to Soda Cloud
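The difference between raw data and the metadata and aggregated metrics listed above can be sketched in a few lines of Python. This is an illustration only and does not reflect the actual format of what Soda Library sends; the `summarize` function, its field names, and the sample rows are assumptions made for this example.

```python
from statistics import mean


def summarize(dataset_name, rows):
    """Build a summary holding column names and aggregates, not row values."""
    # Metadata: the set of column names seen across all rows.
    columns = sorted({column for row in rows for column in row})

    # Aggregated metrics: an average for each column with numeric values.
    averages = {}
    for column in columns:
        values = [
            row[column]
            for row in rows
            if isinstance(row.get(column), (int, float))
            and not isinstance(row.get(column), bool)
        ]
        if values:
            averages[column] = mean(values)

    return {
        "dataset": dataset_name,
        "row_count": len(rows),
        "columns": columns,
        "averages": averages,
    }


if __name__ == "__main__":
    # Made-up rows standing in for data that stays in your network.
    rows = [
        {"id": 1, "amount": 10.0, "email": "a@example.com"},
        {"id": 2, "amount": 30.0, "email": "b@example.com"},
    ]
    print(summarize("orders", rows))
```

The summary names the `email` column and reports averages for the numeric columns, but no individual email address appears in it.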

Where your datasets contain sensitive data or private information, you may not want to send failed row samples from your data source to Soda Cloud. In that case, you can disable the failed row samples feature entirely in Soda Cloud. Read more about disabling samples and failed row samples in Failed rows checks.

Read more about Soda’s Privacy Policy.

Single sign-on with Soda Cloud

Organizations that use a SAML 2.0 single sign-on (SSO) identity provider can add Soda Cloud as a service provider. Once added, employees of the organization can gain authorized and authenticated access to the organization’s Soda Cloud account by successfully logging in to their SSO. Refer to Set up single sign-on with Soda Cloud for details.


Documentation always applies to the latest version of Soda products