Use serverless GPU compute in Microsoft Dev Box

This article explains what serverless GPU compute is, how it works, and key scenarios for its use. Serverless GPU compute in Microsoft Dev Box (preview) lets developers add GPU acceleration to their dev boxes on demand, without permanent GPU infrastructure or complex setup.

Common scenarios for serverless GPU compute include compute-intensive workloads like AI model training, inference, and data processing. Serverless GPU compute lets you:

  • Use GPU resources only when you need them
  • Scale GPU resources based on workload demands
  • Pay only for the GPU time you use
  • Work in your organization's secure network environment

This capability integrates Microsoft Dev Box with Azure Container Apps to deliver GPU power without requiring developers to manage infrastructure.

Serverless GPU compute in Dev Box uses Azure Container Apps (ACA). When a developer starts a GPU-enabled shell or tool, Dev Box automatically:

  • Creates a connection to a serverless GPU session
  • Provisions the necessary GPU resources
  • Makes those resources available through the developer's terminal or integrated development environment
  • Terminates the session when it's no longer needed

Prerequisites

  • An Azure subscription
  • The Microsoft.App resource provider registered for your subscription
  • The Microsoft.CognitiveServices resource provider registered for your subscription
  • A dev center and project
  • A managed service identity (MSI) configured for the dev center
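
The two resource provider registrations above can be checked and requested from the Azure CLI. The following is a minimal sketch, assuming the Azure CLI is installed, you're signed in with `az login`, and the right subscription is selected. The helper names `ensure_provider` and `register_gpu_prerequisites` are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: register a resource provider unless it is already registered.
ensure_provider() {
  namespace="$1"
  # Ask Azure for the provider's current registration state (for example, "Registered").
  state=$(az provider show --namespace "$namespace" --query registrationState --output tsv)
  if [ "$state" = "Registered" ]; then
    echo "$namespace: already registered"
  else
    echo "$namespace: state is '$state', requesting registration"
    az provider register --namespace "$namespace"
  fi
}

# Hypothetical wrapper: covers the two providers named in the prerequisites.
register_gpu_prerequisites() {
  ensure_provider Microsoft.App
  ensure_provider Microsoft.CognitiveServices
}
```

Run `register_gpu_prerequisites` once per subscription. Registration can take a few minutes to complete, so running it again later is a reasonable way to confirm that both providers report as registered.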

Configure serverless GPU

Administrators control serverless GPU access at the project level through Dev Center. Key management capabilities include:

  • Enable/disable GPU access: Control whether projects can use serverless GPU resources.
  • Set concurrent GPU limits: Set the maximum number of GPUs that can be used at the same time in a project.

Access to serverless GPU resources is managed through project-level properties. When the serverless GPU feature is enabled for a project, all dev boxes in that project can use GPU compute. This access model removes the need for custom roles or pool-based configurations.

Important

Serverless GPU is available only in specific regions. Your project must be in one of the following regions: BrazilSouth, CanadaCentral, CentralUS, EastUS, EastUS2, SouthCentralUS, or WestUS3.
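
Before you enable the feature, it can help to confirm that a project's region is on this list. The sketch below hard-codes the regions named above. The helper name `is_gpu_region` is hypothetical, and the list may change while the feature is in preview.

```shell
# Regions listed in this article as supporting serverless GPU (preview).
SUPPORTED_GPU_REGIONS="brazilsouth canadacentral centralus eastus eastus2 southcentralus westus3"

# Hypothetical helper: succeeds (exit status 0) if the region is in the list above.
is_gpu_region() {
  # Normalize to lowercase and drop spaces so "East US 2" and "EastUS2" both match.
  region=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d ' ')
  for r in $SUPPORTED_GPU_REGIONS; do
    if [ "$r" = "$region" ]; then
      return 0
    fi
  done
  return 1
}
```

For example, `is_gpu_region "West US 3" && echo supported` prints `supported`, while a region that isn't listed, such as `westeurope`, makes the function return a nonzero status.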

Register serverless GPU for the subscription

  1. Sign in to the Azure portal.
  2. Navigate to your subscription.
  3. Select Settings > Preview features.
  4. Select Dev Box Serverless GPU Preview, and then select Register.
     [Screenshot: the Azure subscription page, showing the Dev Box Serverless GPU Preview feature.]

Enable serverless GPU for a project

  1. Go to your project.
  2. Select Settings > Dev box settings.
  3. Under AI workloads, select Enable, and then select Apply.
     [Screenshot: the dev box settings page, showing the Serverless GPU option Enabled.]

Connect to a GPU

After you enable serverless GPU, Dev Box users in that project see GPU options in their terminal and Visual Studio Code (VS Code) environments.

You can connect using one of these methods:

Method 1: Launch a Dev Box GPU shell

  1. Open Windows Terminal on your dev box.
  2. Run the following command:
    devbox gpu shell
    
  3. Wait for the command to connect you to a preconfigured GPU container.

Method 2: Use VS Code with remote tunnels

  1. Open Windows Terminal on your dev box.
  2. Run the following command:
    devbox gpu shell
    
  3. Launch Visual Studio Code.
  4. Install the Remote Tunnels extension.
  5. Connect to the gpu-session tunnel.
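
After you connect with either method, you might want to confirm that the session actually exposes a GPU. The following is a minimal sketch, assuming the preconfigured container is Linux-based and includes the NVIDIA `nvidia-smi` utility, which this article doesn't state. The helper name `check_gpu` is hypothetical.

```shell
# Hypothetical helper: report the GPUs visible in the current session.
# Assumes nvidia-smi is available in the container (not confirmed by this article).
check_gpu() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found; this session may not expose an NVIDIA GPU" >&2
    return 1
  fi
  # Print one line per visible GPU: its name and total memory.
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
}
```

Run `check_gpu` inside the GPU shell or in a VS Code terminal attached to the tunnel. A nonzero exit status means the utility wasn't found, which doesn't by itself prove that no GPU is attached.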