Data sources and file formats supported for data quality in Unified Catalog

Supported sources

| Data source | Data profiling | Data quality scan | Virtual network support | Note |
| --- | --- | --- | --- | --- |
| Azure Data Lake Storage Gen2 | Yes | Yes | Yes | |
| Azure Databricks Unity Catalog | Yes | Yes | Yes | |
| Azure Synapse serverless | Yes | Yes | Yes | |
| Azure Synapse Data Warehouse | Yes | Yes | Yes | |
| Azure SQL Database | Yes | Yes | Yes | |
| Azure Dedicated SQL Pool (formerly SQL DW) | Yes | Yes | No | |
| Google BigQuery | Yes | Yes | No | |
| Snowflake | Yes | Yes | Yes | |
| Fabric | Yes | Yes | Yes | Lakehouse, shortcuts to other file systems, and mirroring with other databases |
| Amazon S3 | Yes | Yes | No | Supported via Fabric shortcut |
| Dataverse | Yes | Yes | No | Supported via Fabric shortcut |
| Google Cloud Storage | Yes | Yes | No | Supported via Fabric shortcut |

Supported file formats

| File format | Data profiling | Data quality scan | Virtual network support |
| --- | --- | --- | --- |
| Delta | Yes | Yes | N/A |
| Parquet | Yes | Yes | N/A |
| Iceberg Avro | Yes | Yes | N/A |
| Iceberg ORC | Yes | Yes | N/A |

Supported resource set pattern

| Pattern name | Details |
| --- | --- |
| SparkPartitions | All patterns ending with {SparkPartitions} are supported, provided that the folder path contains no other non-column patterns. |
| Column partitions | All column partition patterns for Parquet, Delta, and Iceberg datasets are supported. |
  • Example 1 (supported): Standard resource-set folder path: https://myblob.blob.core.windows.net/sample-data/name-of-folder-output/{SparkPartitions}
  • Example 2 (supported): Column partitioned resource-set folder path: https://myblob.blob.core.windows.net/my-partitioned-data/Year={Year}/Month={Month}/Day={Day}/{SparkPartitions}
  • Example 3 (not supported): Mixed resource-set path: https://myblob.blob.core.windows.net/sample-data/data{N}.parquet
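
The three examples above can be checked with a small script. The following is a minimal Python sketch, not an official validation routine: the function name `is_supported_resource_set_path` and the regular expressions are hypothetical, and the rule it encodes (the path must end with `{SparkPartitions}`, and any other placeholder in the folder path must be a column partition of the form `Name={Name}`) is only one reading of the table above.

```python
import re
from urllib.parse import urlparse

# Assumption: these rules are an interpretation of the pattern table,
# not the service's actual matching logic.
SPARK_SUFFIX = "{SparkPartitions}"
PLACEHOLDER = re.compile(r"\{[^{}]+\}")              # any {Something} token
COLUMN_PARTITION = re.compile(r"^(\w+)=\{(\w+)\}$")  # e.g. Year={Year}


def is_supported_resource_set_path(path: str) -> bool:
    """Return True if the path looks like a supported resource-set pattern."""
    # Split the URL path into non-empty folder segments.
    segments = [s for s in urlparse(path).path.split("/") if s]

    # The last segment must be exactly {SparkPartitions}.
    if not segments or segments[-1] != SPARK_SUFFIX:
        return False

    # Every earlier segment containing a placeholder must be a column partition.
    for segment in segments[:-1]:
        if PLACEHOLDER.search(segment) and not COLUMN_PARTITION.match(segment):
            return False
    return True


examples = [
    "https://myblob.blob.core.windows.net/sample-data/name-of-folder-output/{SparkPartitions}",
    "https://myblob.blob.core.windows.net/my-partitioned-data/Year={Year}/Month={Month}/Day={Day}/{SparkPartitions}",
    "https://myblob.blob.core.windows.net/sample-data/data{N}.parquet",
]
for url in examples:
    print(is_supported_resource_set_path(url), url)
```

Under these assumptions, the sketch accepts Examples 1 and 2 and rejects Example 3, because the last path ends with `data{N}.parquet` rather than `{SparkPartitions}`.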

For more information on Microsoft Data Map resource sets, see Understanding resource set.