Metadata data sources
For each supported data source, Soda always calculates two metrics:
Total row count
Last modification time
Oracle
Row count: metadata row count is actually a
count(*)
Soda does not use metadata for this metric in Oracle. It requires an additional package and/or is unreliable based on the schedule of that package.
Last insertion time: Soda uses metadata
Note that past data is only available for a limited amount of time, which varies depending on the system. The minimum goes back 120 h.
Historical backfilling: not possible.
Postgres
Metadata is supported, but it requires some additional setup on Postgres side.
Row count: enabled out of the box.
Last insertion time:
track_commit_timestamp
must be enabled: https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-TRACK-COMMIT-TIMESTAMPIf
track_commit_timestamp
is not enabled, Soda will return a warning.
Historical backfilling: not possible.
BigQuery
Metadata metrics are available and supported in BigQuery.
Historical backfilling: possible.
Partition column: can be suggested based on metadata available in BigQuery.
Soda will prioritize user-suggested columns.
If there are no user-suggested columns, Soda will try a metadata approach to find the partition column automatically.
If there are no columns found in the metadata of BigQuery, Soda will fall back on its own heuristic.
Redshift
Historical backfilling is supported on Redshift and it is limited to 7 days for the metadata.
Last updated
Was this helpful?