Document Intelligence Custom Classification Model

Robby Read 0 Reputation points
2026-01-12T21:45:43.5566667+00:00

We have set up a custom composed model for 7 different document types. In order to do that we had to set up a custom classification model and train it for each specific document type. I have two specific types that keep getting misclassified as one another. The documents are not close in format at all, and I have 14 documents trained in one type and 15 in the other. How can we fix the documents being classified incorrectly? This is supposed to be intelligent, but there seems to be nothing I can do, and adding more training documents does not help.

Azure AI Document Intelligence

3 answers

  1. Q&A Assist
    2026-01-12T21:46:21.0933333+00:00
    AI generated content. This answer was created with AI from Q&A Assist.

    To address the issue of inaccurate classification between two distinct document types in your custom composed model, consider the following strategies:

    1. Increase Training Samples: Although you mentioned that adding more training documents has not helped, ensure that you have at least five samples per class, as this is the minimum required for effective training. If the classes are similar, adding even more samples can improve model accuracy.
    2. Separate Models: If the documents are visually distinct, it may be beneficial to train separate models for each document type instead of relying on a composed model. This can help the classifier focus on the specific characteristics of each document type without confusion.
    3. Review Training Data: Ensure that the training dataset includes all variations of the documents. If there are format differences (e.g., digital vs. scanned PDFs), make sure to include samples of each format in the training set.
    4. Check for Extraneous Labels: Ensure that your training data does not contain extraneous labels that could confuse the model. Clean and label your training data carefully.
    5. Incremental Training: If you have already trained models, consider using incremental training to refine the models with additional examples that highlight the differences between the two types. This requires using the same API version as the original model.
    6. Evaluate Model Performance: Use the accuracy and confidence scores to assess how well the model is performing. If the accuracy is low, it may indicate that the model needs more labeled data or that the document types are too visually similar.

    By implementing these strategies, you should be able to improve the classification accuracy of your custom model.



  2. SRILAKSHMI C 13,065 Reputation points Microsoft External Staff Moderator
    2026-01-14T10:45:08.0666667+00:00

    Hi Robby Read,

    Welcome to Microsoft Q&A,

    Thank you for the detailed explanation of your setup. What you are experiencing is a known and common limitation of custom classification models in Azure AI Document Intelligence, especially when multiple document types are involved.

    Important clarification upfront

    This is not a bug, and it does not mean your training is incorrect. Simply adding more training documents does not always improve accuracy once the model reaches a certain learning limit.

    Why this happens (even when documents look very different)

    Custom classification models do not understand documents semantically the way humans do. Instead, they rely on statistical and visual signals such as:

    • Page layout and structure
    • Text density and distribution
    • Common keywords and phrases
    • Header/footer placement
    • Overall visual composition

    Because of this:

    • Two documents that are clearly different to a human can still appear statistically similar to the model
    • If both document types share:
      • Similar page counts
      • Similar text density
      • Overlapping keywords (for example: *Total*, *Date*, *Reference*)
      • Similar header positioning

    the classifier can repeatedly confuse them.

    Once this happens, adding more documents (beyond ~10–15 per class) often does not improve accuracy, which aligns with what you’re seeing.

    Key limitations to be aware of

    Custom classification models currently:

    • Do not support feature weighting
    • Do not allow tuning confidence thresholds
    • Do not support “hard negative” or contrastive training
    • Do not allow you to explicitly define distinguishing features

    Because of these constraints, persistent confusion between two specific document types can occur and may not be solvable through retraining alone.

    What does work

    1. Add a rule-based pre-classification layer

    Before calling the classification model, apply deterministic rules such as:

    • Presence or absence of a unique keyword or phrase
    • A specific header string
    • A known document identifier
    • Page count rules
    • Regex-based patterns

    You can then:

    • Route only ambiguous cases to the classifier, or
    • Override the classifier’s output when a strong rule matches

    This hybrid approach is the most reliable production pattern.
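    As a minimal sketch of such a pre-filter: the document types ("invoice", "purchase_order") and the regex patterns below are placeholders, not your actual types; replace them with phrases unique to your own two confusable documents. The function only decides when exactly one rule set matches, so ambiguous documents still fall through to the classifier.

    ```python
    import re
    from typing import Optional

    # Hypothetical deterministic rules keyed by document type. Swap in
    # keywords/identifiers that are unique to each of your document types.
    RULES = {
        "invoice": [
            re.compile(r"\bINVOICE\b", re.IGNORECASE),
            re.compile(r"\bInvoice\s+Number\b", re.IGNORECASE),
        ],
        "purchase_order": [
            re.compile(r"\bPURCHASE\s+ORDER\b", re.IGNORECASE),
            re.compile(r"\bPO\s*#\s*\d+", re.IGNORECASE),
        ],
    }

    def pre_classify(ocr_text: str) -> Optional[str]:
        """Return a document type if exactly one rule set matches, else None.

        Returning None signals an ambiguous document that should be routed
        to the custom classification model instead of decided by rules.
        """
        matched = [doc_type for doc_type, patterns in RULES.items()
                   if any(p.search(ocr_text) for p in patterns)]
        # Only trust the rules on an unambiguous, single-type match.
        return matched[0] if len(matched) == 1 else None
    ```

    The `ocr_text` input can come from any OCR/read step you already run; when `pre_classify` returns a type, you can skip (or override) the classifier entirely.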

    2. Split the classification workload

    Instead of one classification model with 7 document types:

    • Create multiple classification models
    • Separate the two problematic document types into different models
    • Use a lightweight first-pass rule to decide which classifier to call

    This reduces class competition and improves accuracy.
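    A first-pass router can be as small as the sketch below. The classifier IDs and the "Remittance" keyword are purely illustrative assumptions; the point is only that a cheap, deterministic signal chooses which of the two trained classifiers to invoke.

    ```python
    # Hypothetical classifier IDs: in practice these would be the IDs of two
    # separately trained Document Intelligence classification models.
    CLASSIFIER_GENERAL = "classifier-types-1-to-5"   # the five unambiguous types
    CLASSIFIER_PAIR = "classifier-types-6-and-7"     # the two confusable types

    def choose_classifier(page_count: int, first_page_text: str) -> str:
        """Cheap first-pass rule: route documents that look like the two
        problematic types to their dedicated classifier, everything else to
        the general one. Page count and the keyword are placeholder signals."""
        if page_count <= 2 or "Remittance" in first_page_text:
            return CLASSIFIER_PAIR
        return CLASSIFIER_GENERAL
    ```

    The returned ID would then be passed to your classify call, so each model competes over a smaller, more separable set of classes.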

    3. Use confidence-based post-processing

    Although thresholds can’t be tuned inside the service:

    • Capture the confidence score returned by the classifier
    • If confidence is below a safe value (for example, < 0.85):
      • Apply fallback rules
      • Route for secondary validation
      • Or classify using an alternate model
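    The steps above can be sketched as a small post-processing function. `ClassifiedDoc` is a stand-in for the classifier's per-document result (the real response exposes a document type and a confidence score per document); the 0.85 floor is the example value from this answer and should be tuned against your own data.

    ```python
    from dataclasses import dataclass
    from typing import Callable, Optional

    @dataclass
    class ClassifiedDoc:
        """Minimal stand-in for one classified document in the service response."""
        doc_type: str
        confidence: float

    CONFIDENCE_FLOOR = 0.85  # example "safe value"; tune empirically

    def resolve_doc_type(result: ClassifiedDoc,
                         fallback_rule: Callable[[], Optional[str]]) -> str:
        """Trust the classifier above the floor; otherwise try a fallback
        rule, and if that also fails, flag the document for manual review."""
        if result.confidence >= CONFIDENCE_FLOOR:
            return result.doc_type
        ruled = fallback_rule()
        return ruled if ruled is not None else "needs_review"
    ```

    The fallback callable can wrap your deterministic rules, a secondary classifier, or a human-review queue.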

    4. Strengthen visual differentiation

    If you have influence over document templates:

    • Add a consistent, unique header or label
    • Introduce a distinctive first-page identifier
    • Ensure a repeated keyword unique to each type

    Even a small, consistent visual cue can significantly improve classification.

    What will not help

    • Adding large numbers of similar training documents
    • Re-training repeatedly without changing signals
    • Expecting semantic understanding from the classifier
    • Relying on classification alone for routing decisions

    Recommended best-practice architecture

    In real-world implementations, the most stable approach is:

    1. Rule-based pre-filter
    2. Custom classification model
    3. Confidence evaluation
    4. Fallback or override logic

    Classification should be treated as assistive, not authoritative.
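    Those four stages compose into one short pipeline. In the sketch below the actual classifier call is passed in as a callable (so the example stays runnable without Azure credentials); in production that callable would wrap your Document Intelligence classify request and return the document type and confidence it reports.

    ```python
    from typing import Callable, Optional, Tuple

    def classify_document(
        ocr_text: str,
        rule_match: Callable[[str], Optional[str]],
        run_classifier: Callable[[str], Tuple[str, float]],
        confidence_floor: float = 0.85,
    ) -> str:
        """Assistive-classification pipeline:

        1. Rule-based pre-filter: a strong deterministic match wins outright.
        2. Custom classification model: called only for ambiguous documents.
        3. Confidence evaluation: low-confidence results are not trusted.
        4. Fallback/override: anything still uncertain is flagged for review.
        """
        ruled = rule_match(ocr_text)
        if ruled is not None:                 # step 1: rule overrides the model
            return ruled
        doc_type, confidence = run_classifier(ocr_text)   # step 2
        if confidence >= confidence_floor:                # step 3
            return doc_type
        return "needs_review"                             # step 4
    ```

    Wiring in your real rules and classify call is then a matter of supplying the two callables.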

    To summarize:

    • This behavior is expected with custom classification models
    • Adding more samples often does not improve accuracy
    • Visually different documents can still conflict statistically
    • There is no way to “force” separation inside the classifier today
    • Hybrid rule + ML approaches deliver the best results


    I hope this helps. Do let me know if you have any further queries.


    If this answers your query, please do click Accept Answer and Yes for was this answer helpful.

    Thank you!


  3. Robby Read 0 Reputation points
    2026-01-19T13:55:10.0033333+00:00

    Thank you for the clarification. You mentioned that unique headers or page layout should help with this issue, but they do not seem to make much difference. We are actually using a custom extraction composed model, and in the 4.0 version we were forced to add the classification to it. I did not know that you needed to make separate classification models for it to work correctly. We currently have one classification model with seven different classes, and that is the basis for our composed model, which has seven individual models, one for each document type. Is there a way to remove the classification model altogether and just have a composed model? Will there be a way to fine-tune the classification models in the future?

