MS Purview: Spark-partitioned ADLS Gen2 resource set shows no schema in DQ & Profiling, while the parent ADLS Gen2 folder shows as not supported

Arjun Dhingra 0 Reputation points
2025-12-05T05:48:15.6266667+00:00

We have ADLS Gen2 storage holding Spark-partitioned data. When we scan it, we see two assets: the Spark-partitioned resource set and its parent folder.

In Discovery we can see the schema for the Spark-partitioned asset; however, when we try to run Profiling or DQ on it, it complains that it cannot find the schema.

To add to the misery, the parent folder is marked as not supported in the DQ/Profiling section.

Microsoft Security | Microsoft Purview

1 answer

  1. Pratyush Vashistha 5,045 Reputation points Microsoft External Staff Moderator
    2025-12-05T06:05:25.2266667+00:00

    Hey Arjun, it looks like you're having some trouble with Microsoft Purview while trying to profile and perform data quality checks on your ADLS Gen2 Spark partitioned storage. Here’s a breakdown of what you might consider doing to resolve this issue.

    Steps to Resolve Your Issue:

    1. Check Storage Permissions:
      • Ensure that the Microsoft Purview account has the Storage Blob Data Reader role on your ADLS Gen2 storage account. If the Purview account lacks the necessary permissions, it can't read the files to resolve the schema.
    2. Wait for Schema Update:
      • Sometimes changes in schema can take time to reflect. After the initial scan, it can take up to 12 hours for the schema to update properly. If it hasn't been long since you scanned, consider waiting before trying to analyze it again.
    3. Enable Advanced Resource Sets:
      • Make sure the Advanced Resource Sets option is turned on in your Microsoft Purview settings. This feature must be enabled for the schema to be accurately captured for partitioned datasets.
    4. Validate File Format and Structure:
      • Check that the data follows a directory structure that Purview supports for profiling and DQ, and that your file names and folder structures conform to the expected formats. For instance, use paths structured like https://<storage_account>.dfs.core.windows.net/<container>/path/{Partitioned Files}.
    5. Consider Data Type Support:
      • Ensure none of the fields in your files utilize unsupported data types. If they do, it might affect schema extraction.
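
For step 4, a quick local check of the folder layout can help before re-running the scan. The following is only a minimal sketch, not Purview's own resource-set logic: it assumes Spark's default Hive-style `key=value` partition folders, and the function names and sample paths are hypothetical. It groups file paths by partition keys and file extension so that a stray file or a mixed format stands out.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def summarize_partition_layout(paths):
    """Group file paths by (partition key names, file extension).

    Assumption: partitions use Spark's default key=value folder naming.
    """
    layouts = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        # Skip Spark marker/metadata files such as _SUCCESS or hidden .crc files.
        if path.name.startswith(("_", ".")):
            continue
        # Collect the key part of every key=value folder above the file.
        keys = tuple(
            part.split("=", 1)[0] for part in path.parent.parts if "=" in part
        )
        layouts[(keys, path.suffix)].append(raw)
    return dict(layouts)


def is_consistent(paths):
    """True when every data file shares one partition scheme and one extension."""
    return len(summarize_partition_layout(paths)) == 1


if __name__ == "__main__":
    # Hypothetical listing, relative to the container root.
    sample = [
        "sales/_SUCCESS",
        "sales/year=2024/month=01/part-00000.snappy.parquet",
        "sales/year=2024/month=02/part-00001.snappy.parquet",
        "sales/year=2024/part-00002.csv",
    ]
    for (keys, ext), files in summarize_partition_layout(sample).items():
        print(keys, ext, len(files))
    print("consistent:", is_consistent(sample))
```

If this reports more than one group, the odd files out are a reasonable first place to look; whether they actually affect profiling or DQ in your tenant is something to confirm against the Purview documentation.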

    Follow-Up Questions:

    1. Are all the file formats in use ones that Purview can detect and analyze?
    2. Are there any specific error messages that you encountered while conducting the profiling or DQ checks?

    Hope this helps you get your schema issues sorted out! Let me know if you have any more questions.

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.

