Azure Document Intelligence production server specs

moaz alla 20 Reputation points
2025-11-30T12:49:39.7033333+00:00

Hello
I am trying to estimate the specs needed to run the Azure disconnected containers (Layout, Read, and Custom Template) in a production environment that will serve 50 concurrent users. We assume each user will send about 10–15 pages (dense pages with detailed tables or figures). I know the documentation lists the specs for each container, but it doesn't say how many users each container can handle with those specs. What are the estimated specs needed to run a production server in the environment described above?

Thanks in advance.

Azure AI Document Intelligence

Answer accepted by question author
  1. Anshika Varshney 3,795 Reputation points Microsoft External Staff Moderator
    2025-12-01T14:54:41.1933333+00:00

    Hi moaz alla,

    Thank you for reaching out on Microsoft Q&A.

    There is no official Microsoft sizing guide for how much server power you need for a certain number of users in Azure Document Intelligence. The documentation only provides minimum requirements for running the containers, such as 8 CPU cores and around 16–24 GB of RAM depending on the model. However, community experience shows that these minimum specs are usually not enough for real production workloads, especially when documents contain many pages, tables, images, or scanned content. Most users report that performance becomes smoother and more stable when the server has at least 32 GB of RAM and fast SSD storage for temporary processing.

    For your scenario with about 50 concurrent users uploading 10–15 dense pages each, a stronger setup is recommended. A good starting point would be a server with 8–12 CPU cores, 32–48 GB RAM, and SSD storage with at least 20–30 GB free. This setup gives the service enough headroom to handle multiple heavy documents at the same time. Since actual performance depends on the complexity and quality of your documents, the best next step is to run a load test with your real files and monitor CPU, memory, and processing time.
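
    A load test of this kind could be sketched roughly as follows. This is a minimal harness under stated assumptions, not an official tool: `analyze` is a hypothetical callable that you would write to submit one document to your container and block until the result is ready, and no Document Intelligence API is called in the sketch itself. CPU and memory still need to be watched separately on the host while it runs.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(analyze, documents, concurrency=50):
    """Submit documents with `concurrency` worker threads and time each one.

    `analyze` is a hypothetical placeholder: a callable that sends one
    document to the container and returns once processing has finished.
    """
    if not documents:
        raise ValueError("documents must not be empty")

    def timed(doc):
        # Measure the end-to-end latency of a single document.
        start = time.perf_counter()
        analyze(doc)
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed, documents))
    wall = time.perf_counter() - wall_start

    return {
        "documents": len(latencies),
        "wall_seconds": wall,
        "docs_per_second": len(latencies) / wall,
        "median_latency": statistics.median(latencies),
        "max_latency": max(latencies),
    }
```

    Running it with `concurrency=50` and a set of real 10–15 page files approximates the scenario in the question; a rising `max_latency` as concurrency increases is one sign that the server is saturated.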

    This testing will help you adjust your resources as needed. If you see the server getting overloaded, you can either increase the machine size or run multiple container instances behind a load balancer. This approach ensures you get reliable performance without over- or under-provisioning your system.
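
    To make the scaling decision concrete, the arithmetic can be sketched as below. The per-instance throughput (2.0 pages per second) and the 120-second target are placeholder figures chosen for illustration, not measurements or Microsoft guidance; substitute the numbers from your own load test.

```python
import math


def instances_needed(users, pages_per_user, pages_per_second_per_instance,
                     target_seconds):
    """Estimate how many container instances clear one burst within a target time."""
    total_pages = users * pages_per_user
    required_rate = total_pages / target_seconds  # pages per second overall
    return math.ceil(required_rate / pages_per_second_per_instance)


# Worst case from the question: 50 users x 15 pages = 750 pages in one burst.
# 2.0 pages/s per instance and a 120 s target are assumed placeholder values.
print(instances_needed(50, 15, 2.0, 120))  # prints 4
```

    The estimate assumes throughput scales linearly with the number of instances behind the load balancer, which should itself be verified under load.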

    Please let me know if you have any remaining questions or need additional details. I'll be glad to provide further clarification or guidance.

    Thank you!

    1 person found this answer helpful.
