Cannot use GPU VMs (NC4as_T4_v3) in Azure ML despite having quota - VMs grayed out

Max B 0 Reputation points
2025-12-01T14:42:42.2866667+00:00

Hello Azure Community,

I'm trying to deploy an Azure ML real-time endpoint with GPU compute, but I am unable to select GPU VMs despite having quota allocated.

Subscription Details

  • Type: Pay-As-You-Go
  • Status: Active
  • Region: West Europe

Quota Status

I have verified quota allocation in Azure Portal → Quotas:

  • Standard NCASv3_T4 Family - West Europe: 0 of 12 cores available
  • Standard NCASv3_T4 Family - West US: 0 of 12 cores available

Problem Description

When creating a Managed Online Endpoint in Azure ML Studio:

  1. Navigate to: ML Studio → Endpoints → Real-time endpoints → Create
  2. Register model and environment successfully
  3. On Compute configuration page, attempt to select GPU VM
  4. All GPU VM sizes appear grayed out with message:

    "You do not have enough quota for the following VM sizes"

  5. Specifically trying to use: Standard_NC4as_T4_v3 (4 cores, 28GB RAM, NVIDIA T4 GPU)

What I've Verified

✓ Quota page shows 12 cores available for NCASv3_T4 family in both West Europe and West US

✓ Subscription is Pay-As-You-Go

✓ NC4as_T4_v3 is officially listed as available in West Europe region

✓ Model and environment registered successfully

My Question

Is there any additional approval or activation needed for first-time GPU VM usage in Azure ML, even with a Pay-As-You-Go subscription and available quota?

The quota shows as available but VMs are not selectable in ML Studio. How can I enable access to GPU VMs for ML endpoint deployments?

Any guidance would be greatly appreciated!

Thank you!

Azure Machine Learning

2 answers

  1. Aryan Parashar 3,380 Reputation points Microsoft External Staff Moderator
    2025-12-02T09:17:19.4766667+00:00

    Hi Max B,

    I understand how frustrating it can be to have quota allocated but still be unable to select GPU VM sizes when deploying your Azure ML real-time endpoint.

    Quota must be requested and approved at the ML workspace level for GPU VM sizes (such as Standard_NC4as_T4_v3) to become available during Managed Online Endpoint creation.

    To resolve this, please verify and request the quota directly within the ML workspace:

    Navigate to All workspaces → Quotas.

    If no quota is available, request it as follows:

    Select the compute family → Select Request quota

    Enter the New cores limit and click Submit

    Please accept this as an answer.
    Thank you for reaching out to The Microsoft Q&A Portal.


  2. Max B 0 Reputation points
    2025-12-05T09:52:41.38+00:00

    Hi, I finally found a solution. I don't know why, but the default value for Instance count is 3. Since NC4as_T4_v3 has 4 cores, 3 instances need 12 cores (and even more once you account for the extra 20% of quota that seems to be reserved on some SKUs, possibly for redundancy). With the default of 3 instances, my quota of 8 cores was not enough, so NC4as_T4_v3 did not even appear in the VM list.

    My advice is to always set the instance count to 1 first so that you can see all available VMs. Also keep your quota and that extra 20% in mind. Good luck!
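
The arithmetic in this answer can be sketched in a few lines of Python. This is only an illustration, not an official formula. It assumes that the 20% reserve is applied to the instance count and rounded up to a whole instance before multiplying by the cores per instance. The function names `required_cores` and `max_instances` are hypothetical helpers, not part of any Azure SDK.

```python
def required_cores(instance_count, cores_per_instance, reserve_percent=20):
    """Estimate the quota (in cores) a deployment needs.

    Assumption: the reserve is added to the instance count and rounded up
    to a whole instance. Integer math avoids floating point rounding issues.
    """
    # Ceiling division: ceil(instance_count * (100 + reserve_percent) / 100)
    padded_instances = -(-instance_count * (100 + reserve_percent) // 100)
    return padded_instances * cores_per_instance


def max_instances(quota_cores, cores_per_instance, reserve_percent=20):
    """Largest instance count whose estimated requirement fits in the quota."""
    count = 0
    while required_cores(count + 1, cores_per_instance, reserve_percent) <= quota_cores:
        count += 1
    return count


if __name__ == "__main__":
    cores = 4  # Standard_NC4as_T4_v3 has 4 cores
    for instances in (1, 2, 3):
        print(instances, "instance(s):", required_cores(instances, cores), "cores")
    print("Max instances with a quota of 8 cores:", max_instances(8, cores))
```

Under these assumptions, the default of 3 instances needs more cores than either an 8-core or a 12-core quota provides, while a single instance fits in 8 cores, which is consistent with the behavior described above.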
