.alter column policy encoding for objects larger than the BigObject32 provision

databricksuser-5173 20 Reputation points
2025-08-17T20:59:53.71+00:00

Hi

https://learn.microsoft.com/en-us/kusto/management/alter-encoding-policy?view=microsoft-fabric

Per the above documentation, the largest value that can be held in a column is 32 MB. How do we deal with objects larger than that, for example a 100 MB XML input or a 50 MB JSON input? How can such large inputs be natively ingested into Azure Data Explorer from ADLS Gen2?

Cheers.

Azure Data Explorer
An Azure data analytics service for real-time analysis on large volumes of data streaming from sources including applications, websites, and internet of things devices.

1 answer

  1. Venkat Reddy Navari 5,830 Reputation points Microsoft External Staff Moderator
    2025-08-18T10:20:27.1166667+00:00

    Hi databricksuser-5173, thanks for confirming that you're using a standalone Azure Data Explorer (ADX) cluster.

    You're right to point this out: ADX enforces a 32 MB limit per column value by design, to ensure stable performance and efficient querying.

    If you're working with large files (like a 100 MB XML or a 50 MB JSON), you'll typically need a different approach, since such payloads can't be ingested directly as a single column value.

    Here are a few practical approaches:

    • Break the file into smaller parts before ingestion. If it's a large JSON array or an XML document with repeated elements, you can split it into individual objects or nodes and ingest them as separate rows, which keeps each row's value well within the limit (see the first sketch below).
    • Ingest only references or metadata into ADX – store the full payload in ADLS Gen2 and ingest only the path, metadata (such as size or schema), and a reference ID into ADX. That way ADX stays optimized for querying, and you can retrieve the full file separately when needed (second sketch below).
    • Use compression – in some cases, compressing the payload (e.g., Base64-encoded GZIP) before ingestion can bring it under 32 MB, though that depends on how compressible the content is and requires decompression logic at query time (third sketch below).
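    To make the first option concrete, here is a minimal sketch of splitting a top-level JSON array into newline-delimited JSON (NDJSON), one object per line. The file names are hypothetical, and it assumes the whole source file fits comfortably in memory (true for a 50 MB input):

    ```python
    import json

    # Minimal sketch: split a top-level JSON array into newline-delimited
    # JSON (NDJSON), one object per line, so each ingested row stays small.
    # "large_input.json" / "split_output.ndjson" are hypothetical names.
    with open("large_input.json", "r", encoding="utf-8") as src:
        records = json.load(src)  # a 50 MB file still fits in memory

    with open("split_output.ndjson", "w", encoding="utf-8") as dst:
        for record in records:
            dst.write(json.dumps(record) + "\n")  # one row per JSON object
    ```

    The resulting file can then be ingested as line-delimited JSON, with each array element landing in its own row.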
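    For the reference/metadata approach, a sketch along these lines (assuming the azure-storage-blob package; the connection string, container name, and file names are placeholders) lists the large payloads in ADLS Gen2 and writes a small CSV of references that ADX can ingest instead of the payloads themselves:

    ```python
    import csv
    from azure.storage.blob import ContainerClient

    # Minimal sketch: list large payloads in an ADLS Gen2 container and
    # write a small CSV of references/metadata for ingestion into ADX.
    # Connection string, container name, and output file are placeholders.
    container = ContainerClient.from_connection_string(
        conn_str="<storage-connection-string>",
        container_name="large-payloads",
    )

    with open("payload_refs.csv", "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["ReferenceId", "BlobPath", "SizeBytes", "LastModified"])
        for blob in container.list_blobs():
            writer.writerow([
                blob.name,                       # doubles as the reference ID
                f"large-payloads/{blob.name}",   # path back to the full file
                blob.size,
                blob.last_modified.isoformat(),
            ])
    ```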
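    For the compression route, keep in mind that Base64 inflates the compressed bytes by roughly a third, so the payload has to compress well for this to pay off. A minimal sketch (hypothetical file name):

    ```python
    import base64
    import gzip

    # Minimal sketch: gzip + Base64 a payload so the resulting string may
    # fit under the 32 MB column-value limit. Whether it actually fits
    # depends on how compressible the content is.
    with open("large_input.xml", "rb") as src:
        compressed = gzip.compress(src.read())

    encoded = base64.b64encode(compressed).decode("ascii")
    print(f"Encoded size: {len(encoded) / (1024 * 1024):.1f} MB")
    ```

    If the encoded string fits, it can be stored in a string column and decompressed at query time, for example with KQL's gzip_decompress_from_base64_string().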

    This 32 MB limit is documented by Microsoft. The BigObject32 encoding policy profile explicitly states:

    Overrides the MaxValueSize property in the encoding policy to 32 MB.

    You can refer to the official documentation here for confirmation: https://learn.microsoft.com/en-us/kusto/management/alter-encoding-policy?view=microsoft-fabric#policy-encoding


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further questions, do let us know.

