This article describes how to perform lifecycle management operations on Bare Metal Machines (BMM). These steps should be used for troubleshooting purposes to recover from failures or when taking maintenance actions.
Before proceeding, read the advice in the article Best Practices for Bare Metal Machine Operations.
The bolded actions listed are considered disruptive (Power off, Restart, Reimage, Replace).
The Cordon action without the evacuate parameter isn't considered disruptive, while Cordon with the evacuate parameter is considered disruptive.
- **Power off a Bare Metal Machine**
- Start a Bare Metal Machine
- **Restart a Bare Metal Machine**
- Make a Bare Metal Machine unschedulable (cordon without evacuate, doesn't drain the node)
- Make a Bare Metal Machine unschedulable (cordon with evacuate, drains the node)
- Make a Bare Metal Machine schedulable (uncordon)
- **Reimage a Bare Metal Machine**
- **Replace a Bare Metal Machine**
Action comparison
The following table summarizes each action to help you select the appropriate operation for your scenario:
| Action | Purpose | Data Loss | Downtime | Hardware Change | Timeout |
|---|---|---|---|---|---|
| Cordon | Mark node unschedulable | None | None | No | 10 minutes |
| Uncordon | Remove scheduling restriction | None | None | No | 10 minutes |
| Power off | Gracefully power down the machine | None | Yes | No | 40 minutes |
| Start | Power on a machine | None | Recovery | No | 30 minutes |
| Restart | Reboot the machine while preserving OS and data | None | Minutes | No | 40 minutes |
| Reimage | Reinstall the OS image on existing hardware | Full | Hours | No | 3 hours |
| Replace | Swap physical hardware with new machine | Full | Hours | Yes | 4 hours |
Choose the right action
Use the following guidance to determine which action best fits your situation:
| Symptom | Recommended Action |
|---|---|
| Preparing node for maintenance | Cordon |
| Resume scheduling after maintenance | Uncordon |
| Machine needs to be offline for maintenance | Power off |
| Bring offline machine back online | Start |
| Machine needs reboot | Restart |
| OS corrupted or software issues | Reimage |
| Hardware failure detected and repaired | Replace |
| Need fresh OS installation | Reimage |
| Replacing physical server | Replace |
| System unresponsive due to temporary software issues | Restart |
| Rolling maintenance across nodes | Cordon |
| BMC credentials need manual rotation | Replace |
| Firmware reconciliation needed | Replace |
Control plane node considerations
Control plane nodes require extra caution when performing lifecycle actions. The platform implements special handling for control plane nodes to maintain cluster quorum and availability:
- One at a time: The platform prevents multiple concurrent disruptive actions (restart, reimage, replace) on control plane nodes. If another control plane node is already undergoing a disruptive action, new requests are blocked until that action completes and the node rejoins the cluster.
- Quorum safety: The platform verifies that sufficient healthy control plane nodes remain before allowing disruptive operations. Actions may be rejected if proceeding would break cluster quorum.
- Extended coordination: Restart, reimage, and replace actions on control plane nodes include extra steps to safely remove and rejoin the node to the control plane.
Action locking
Only one lifecycle action can run on a BMM at a time. If you attempt to start a new action while another is in progress, the request is rejected. Before starting a new action:
- Verify any previous action has completed by checking the BMM's `actionStates` in the Azure portal or via the API (see the sketch after this list)
- If an action appears stuck, investigate the root cause before attempting another action
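For example, a quick CLI check (a minimal sketch; it assumes the action state fields are exposed at the top level of the CLI output, as with other BMM properties shown later in this article):

```azurecli
# Inspect recent or in-progress action states on the BMM.
# The exact field layout may vary by API version.
az networkcloud baremetalmachine show \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID> | jq '.actionStates'
```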
Caution
Don't perform any action against control or management plane servers without first consulting Microsoft support personnel; doing so could affect the integrity of the Operator Nexus Cluster.
Important
Multiple disruptive command requests against a Kubernetes Control Plane (KCP) node are rejected. This check maintains the integrity of the Nexus Cluster instance and prevents multiple KCP nodes from becoming nonoperational at once due to simultaneous disruptive actions. A disruptive action command is rejected if another disruptive action is already running against a KCP node or if the full KCP isn't available. If multiple nodes become nonoperational, the healthy quorum threshold of the Kubernetes Control Plane is broken.
The actions listed are considered disruptive to BareMetal Machines (BMM):
- Power off a BMM
- Restart a BMM
- Make a BMM unschedulable (cordon with evacuate, drains the node)
- Reimage a BMM
- Replace a BMM
Leaving only the nondisruptive actions:
- Start a BMM
- Make a BMM unschedulable (cordon without evacuate, doesn't drain node)
- Make a BMM schedulable (uncordon)
Tip
In version 2509.1 and above, you can monitor recent or in-progress BMM actions in the Azure portal. For more information, see Monitor status in Bare Metal Machine JSON properties.
Prerequisites
- Install the latest version of the appropriate CLI extensions.
- Request access to run the Azure Operator Nexus network fabric (NF) and network cloud CLI extension commands.
- Sign in to the Azure CLI and select the subscription where the cluster is deployed.
- Collect the following information (the sketch after this list shows one way to capture these values):
  - Subscription ID (`SUBSCRIPTION`)
  - Cluster name (`CLUSTER`)
  - Resource group (`CLUSTER_RG`)
  - Managed resource group (`CLUSTER_MRG`): BareMetal Machine (BMM) resources are present in the managed resource group
  - BareMetal Machine name (`BMM_NAME`) that requires lifecycle management operations
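A minimal sketch of signing in and capturing these values as shell variables (the variable names mirror the placeholders above; the values are examples only):

```azurecli
# Sign in and select the subscription where the cluster is deployed.
az login
az account set --subscription "<subscription-id>"

# Capture the collected values for use in the commands that follow.
SUBSCRIPTION="<subscription-id>"
CLUSTER="<cluster-name>"
CLUSTER_RG="<cluster-resource-group>"
CLUSTER_MRG="<cluster-managed-resource-group>"
BMM_NAME="<bare-metal-machine-name>"
```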
Power off a Bare Metal Machine
The power-off action gracefully shuts down a bare metal machine, making it unavailable to the cluster while preserving its data. The machine remains in a powered-off state until explicitly started again. This action is useful for maintenance scenarios where the hardware needs to be offline but no reprovisioning is required.
Important
There are rare cases where running Nexus VMs fail to relaunch after BMM shutdown or restart. To prevent these cases, power off any virtual machines on the BMM before powering off or restarting the BMM. See the cordon command for instructions on finding the workloads running on a BMM.
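A sketch of that precaution: list the VMs associated with the BMM, then power each one off with the Nexus virtual machine command (the VM name and resource group here are placeholders):

```azurecli
# Find Nexus VMs hosted on this BMM.
az networkcloud baremetalmachine show \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID> | jq '.virtualMachinesAssociatedIds'

# Power off each associated VM before powering off the BMM.
az networkcloud virtualmachine power-off \
  --name <VirtualMachineName> \
  --resource-group <vmResourceGroup> \
  --subscription <subscriptionID>
```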
Use the power-off command when the machine needs to be taken completely offline, such as for physical maintenance that requires the machine to be powered down or to reduce power consumption for unused capacity.
This command powers off the specified `bareMetalMachineName`.

```azurecli
az networkcloud baremetalmachine power-off \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID>
```
Start a Bare Metal Machine
The start action powers on a bare metal machine that was previously powered off, bringing it back online and making it available to the cluster. This action is the inverse of the power-off action and restores the machine to an operational state without reinstalling the operating system or losing any data.
Use the start command when a powered-off machine needs to be brought back online, such as recovering from a power-off action or restoring capacity after maintenance.
Note
After a start operation, if the machine was cordoned before being powered off, you may need to execute an uncordon command to allow workloads to be scheduled on the node.
This command starts the specified `bareMetalMachineName`.

```azurecli
az networkcloud baremetalmachine start \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID>
```
Restart a Bare Metal Machine
The restart action performs a controlled reboot of the bare metal machine. Unlike power-off followed by start, the restart action coordinates the shutdown and startup as a single operation, ensuring workloads are gracefully terminated and the machine rejoins the cluster after rebooting. The operating system and all data on the machine are preserved.
Important
There are rare cases where running Nexus VMs fail to relaunch after BMM shutdown or restart. To prevent these cases, power off any virtual machines on the BMM before powering off or restarting the BMM. See the cordon command for instructions on finding the workloads running on a BMM.
Use the restart command when the machine is unresponsive but hardware is healthy, a reboot is needed to apply configuration changes, or temporary software issues need to be cleared. The restart action is the least disruptive operation among those that cause downtime.
During a restart, the system:
- Cordons the node to prevent new workload scheduling
- Waits for workloads to gracefully terminate
- Powers off the hardware
- Powers on and waits for the node to rejoin the cluster
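To follow these steps from the CLI, one option is to poll the machine's power state and detailed status (a sketch; the property names follow the BMM JSON properties surfaced by the CLI):

```azurecli
# One-shot status check; rerun until the node is powered on and available again.
az networkcloud baremetalmachine show \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID> \
  --query "{power:powerState, status:detailedStatus, message:detailedStatusMessage}"
```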
This command restarts the specified `bareMetalMachineName`.

```azurecli
az networkcloud baremetalmachine restart \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID>
```
Make a Bare Metal Machine unschedulable (cordon)
The cordon action marks a bare metal machine as unschedulable, preventing new workloads from being placed on the node. Unlike power-off or restart, the machine remains powered on and existing workloads continue running. This action is commonly used as a preparatory step before maintenance operations or to isolate a node for troubleshooting.
When cordon is set, Operator Nexus workloads aren't scheduled on the Bare Metal Machine.
Any attempt to create a workload on a cordoned Bare Metal Machine results in the workload being set to a pending state.
Existing workloads continue to run on the Bare Metal Machine unless the workloads are drained.
Use cordon when:
- You need to prevent new workloads from scheduling on a node
- Performing rolling maintenance across multiple nodes
- Troubleshooting a node while keeping existing workloads running
Note
The platform may automatically cordon nodes due to detected hardware issues such as port flapping, NIC failures, or LACP issues. When you execute an uncordon command, it clears both your cordon and any platform-applied cordons. However, if the node is still degraded due to an unresolved hardware issue, the uncordon is rejected.
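To cordon without draining, omit the evacuate parameter (it defaults to False, as described in the next section):

```azurecli
az networkcloud baremetalmachine cordon \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID>
```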
Drain Bare Metal Machine workloads
The cordon command supports an evacuate parameter; with its default value of False, the cordon command only prevents the scheduling of new workloads.
To drain workloads with the cordon command, set the evacuate parameter to True.
The workloads running on the Bare Metal Machine are stopped and the Bare Metal Machine is set to a pending state.
Note
Nexus Management Workloads continue to run on the Bare Metal Machine even when the server is cordoned and evacuated.
It's a best practice to set the evacuate value to True when performing any maintenance operations on the Bare Metal server.
For more best practices to follow, read through Best Practices for Bare Metal Machine Operations.
```azurecli
az networkcloud baremetalmachine cordon \
  --evacuate "True" \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID>
```
To identify whether any workloads are currently running on a Bare Metal Machine, run the following commands.

For Virtual Machines:

```azurecli
az networkcloud baremetalmachine show -n <nodeName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID> | jq '.virtualMachinesAssociatedIds'
```

For Nexus Kubernetes cluster nodes (requires logging in to the Nexus Kubernetes cluster):

```bash
kubectl get nodes <resourceName> -o json | jq '.metadata.labels."topology.kubernetes.io/baremetalmachine"'
```
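Conversely, to list every Nexus Kubernetes node hosted on a given BMM, a standard label selector works (a sketch; it assumes the label value matches the BMM resource name, as the label above suggests):

```bash
kubectl get nodes -l topology.kubernetes.io/baremetalmachine=<BMM_NAME>
```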
Make a Bare Metal Machine schedulable (uncordon)
The uncordon action removes the scheduling restriction from a bare metal machine, allowing new workloads to be placed on the node. This action is the inverse of the cordon action and is typically performed after maintenance is complete. The uncordon action also clears any automatic cordons that the platform may have applied due to detected hardware issues.
All workloads in a pending state on the Bare Metal Machine are restarted when the Bare Metal Machine is uncordoned.
Use uncordon when:
- Maintenance is complete and the node should resume normal scheduling
- A hardware issue has been resolved and the auto-cordon should be cleared
- The node is ready to accept new workloads again
Note
For compute nodes, if the node is degraded due to a hardware issue and was automatically cordoned by the platform, the uncordon action is rejected until the underlying hardware issue is resolved. The error message indicates the node is degraded and which condition is preventing uncordon. This protection prevents accidentally scheduling workloads on nodes with known hardware issues.
```azurecli
az networkcloud baremetalmachine uncordon \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID>
```
Reimage a Bare Metal Machine
The reimage action completely reinstalls the operating system on the bare metal machine, returning it to a clean state. The existing machine is deprovisioned, the disk is wiped, and a fresh OS image is deployed. After reimaging, the machine rejoins the cluster with the same identity (hostname, IP addresses) but with a freshly installed operating system. Use this action when software issues can't be resolved through a restart.
This process redeploys the runtime image on the target Bare Metal Machine and executes the steps to rejoin the cluster with the same identifiers.
Use reimage when:
- The OS has become corrupted or unstable
- A clean slate is needed without changing hardware
- Software configuration has drifted beyond recovery
During a reimage, the system progresses through the following phases:
- Deprovisioning: Deletes the existing machine, triggering disk wipe and power-off
- Provisioning: Creates a new machine with fresh OS image
- Cloud Init: Waits for the reimaged machine to complete initialization and rejoin the cluster
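Because a reimage can run for hours, polling the detailed status message is a convenient way to follow these phases (a sketch using the standard watch utility; the property name follows the BMM JSON properties):

```azurecli
# Refresh the status message every 60 seconds while the reimage progresses.
watch -n 60 "az networkcloud baremetalmachine show --name <BareMetalMachineName> --resource-group <resourceGroup> --subscription <subscriptionID> --query detailedStatusMessage --output tsv"
```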
Note
Both reimage and replace result in a freshly provisioned machine, but they differ in key ways. Reimage reinstalls the OS on the same hardware, while replace swaps the physical hardware entirely. Use reimage for software-related issues and replace for hardware failures.
As a best practice, ensure the Bare Metal Machine's workloads are drained using the cordon command, with evacuate set to True, before executing the reimage command.
For more best practices to follow, read through Best Practices for Bare Metal Machine Operations.
Important
Avoid write or edit actions on the node performed via Bare Metal Machine access.
A reimage action is required to restore Microsoft support, and any changes made to the Bare Metal Machine are lost when the node is restored to its expected state.
Warning
Don't run more than one baremetalmachine replace or reimage command at the same time for the same BareMetal Machine (BMM) resource.
Executing replace at the same time as a reimage leaves servers in a nonoperational state.
Make sure any replace/reimage on the BMM completes fully before starting another one.
Additionally, avoid executing sequential reimage actions on a BMM that just completed a replace action unless a specific maintenance operation requires it.
```azurecli
az networkcloud baremetalmachine reimage \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --subscription <subscriptionID>
```
Replace a Bare Metal Machine
The replace action integrates new or repaired physical hardware into the cluster. Before provisioning the new hardware, the system validates that the replacement hardware meets requirements by testing BMC connectivity, verifying credentials, and checking network links. After validation passes, the old machine is deprovisioned and the replacement hardware is provisioned with a fresh OS image. The machine then rejoins the cluster with the same logical identity.
After replacing components such as the motherboard or Network Interface Card (NIC), the MAC address of the Bare Metal Machine changes; however, the iDRAC IP address and hostname remain the same.
A replace must be executed after each hardware maintenance operation. For more details, read through Best practices for a Bare Metal Machine replace.
Use replace when:
- Hardware has failed (disk, memory, CPU, NIC)
- Physical maintenance requires swapping the chassis
- BMC credentials need to be updated along with hardware
- Hardware components were replaced and firmware needs to be reconciled with the platform
During a replace operation, the system progresses through the following phases:
- Hardware Validation: Validates replacement hardware meets requirements (BMC credentials, serial number, MAC addresses)
- Deprovisioning: Removes the old machine from cluster control and deletes associated resources
- Provisioning: Registers, inspects, and provisions the replacement hardware
- Cloud Init: Waits for the replacement machine to join the cluster and become ready
As of the 2506.2 release, the password value for iDRAC can be provided as a Key Vault Uniform Resource Identifier (URI) or password value. See Key Vault Credential Reference. Using a URI instead of a plaintext password provides extra security.
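For illustration, the secret identifier returned by the Key Vault CLI is the URI to supply as the password value (the vault and secret names here are hypothetical):

```azurecli
# The secret's id field is its full URI, including the version.
PASSWORD_URI=$(az keyvault secret show \
  --vault-name <KeyVaultName> \
  --name <iDRACSecretName> \
  --query id --output tsv)
```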
Warning
Don't run more than one baremetalmachine replace or reimage command at the same time for the same BareMetal Machine (BMM) resource.
Executing replace at the same time as a reimage leaves servers in a nonoperational state.
Make sure any replace/reimage on the BMM completes fully before starting another one.
Additionally, avoid executing sequential reimage actions on a BMM that just completed a replace action unless a specific maintenance operation requires it.
```azurecli
az networkcloud baremetalmachine replace \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --bmc-credentials password=<PASSWORD_URI or IDRAC_PASSWORD> username=<IDRAC_USER> \
  --bmc-mac-address <IDRAC_MAC> \
  --boot-mac-address <PXE_MAC> \
  --machine-name <OS_HOSTNAME> \
  --serial-number <SERIAL_NUMBER> \
  --subscription <subscriptionID> \
  --safeguard-mode <"All" or "None">
```
Important
For replace actions made using API version 2025-07-01-preview and later: by default, the replace action uses a safeguard that prevents replacing a healthy machine (powered on, ready, provisioned, joined to cluster) to avoid unnecessary disruptive operations. If a replace is attempted while the machine is healthy, the action is rejected with the following response:

```
(action rejected) cannot replace healthy machine (powered on, ready, provisioned, joined to cluster). Use --safeguard-mode None to override
Code: action rejected
Message: cannot replace healthy machine (powered on, ready, provisioned, joined to cluster). Use --safeguard-mode None to override
```

To override the safeguard, specify `--safeguard-mode None`:
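```azurecli
# Same arguments as the replace example above, with the safeguard explicitly disabled.
az networkcloud baremetalmachine replace \
  --name <BareMetalMachineName> \
  --resource-group <resourceGroup> \
  --bmc-credentials password=<PASSWORD_URI or IDRAC_PASSWORD> username=<IDRAC_USER> \
  --bmc-mac-address <IDRAC_MAC> \
  --boot-mac-address <PXE_MAC> \
  --machine-name <OS_HOSTNAME> \
  --serial-number <SERIAL_NUMBER> \
  --subscription <subscriptionID> \
  --safeguard-mode "None"
```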
If the replace action fails due to a hardware validation failure, the specific error or test failure is shown in the replace response, as shown in the following examples.
This information can also be found in the Activity Log for the Bare Metal Machine (Operator Nexus).
The error code and error message are also included in the JSON properties of the corresponding BareMetalMachines_Replace operation.
Example 1: Hardware validation fails due to invalid Key Vault URI for Baseboard Management Controller (BMC) credentials
```
$ az networkcloud baremetalmachine replace --name rack1compute02 --resource-group hostedRG --bmc-credentials password=$KEY_VAULT_URI username=root --bmc-mac-address 00-00-5E-00-01-00 --boot-mac-address 00-00-5E-00-02-00 --machine-name RACK1COMPUTE02 --serial-number SN123435

(failed to retrieve password from key vault) failed to get secret value from key vault: failed to get cluster key vault secret
Code: failed to retrieve password from key vault
Message: failed to retrieve password from key vault
Response: 400 Bad Request
```
Example 2: Hardware validation fails due to invalid Baseboard Management Controller (BMC) credentials provided
```
$ az networkcloud baremetalmachine replace --name rack1compute02 --resource-group hostedRG --bmc-credentials password=REDACTED username=root --bmc-mac-address 00-00-5E-00-01-00 --boot-mac-address 00-00-5E-00-02-00 --machine-name RACK1COMPUTE02 --serial-number SN123435

(None) BMC login unsuccessful: Fail - Unauthorized; System health test(s) failed: [Additional logs: Server power down at end of test failed with: Unauthorized]
Code: None
Message: BMC login unsuccessful: Fail - Unauthorized; System health test(s) failed: [Additional logs: Server power down at end of test failed with: Unauthorized]
```
Note
When hardware validation fails due to BMC credential authentication issues (Unauthorized), the action is rejected but the Bare Metal Machine isn't marked as failed or put into an error state. The Bare Metal Machine maintains its current operational status while the hardware validation reports the credential authentication failure.
Example 3: Hardware validation fails due to networking failure
```
$ az networkcloud baremetalmachine replace --name rack1compute02 --resource-group hostedRG --bmc-credentials password=REDACTED username=root --bmc-mac-address 00-00-5E-00-01-00 --boot-mac-address 00-00-5E-00-02-00 --machine-name RACK1COMPUTE02 --serial-number SN123435

(None) Networking test(s) failed: [NIC.Slot.6-1-1_LinkStatus] expected: up; observed: Down; [Additional logs: Link failure detected on NIC.Slot.6-1-1; Unable to perform cabling check on PCI Slot 6]
Code: None
Message: Networking test(s) failed: [NIC.Slot.6-1-1_LinkStatus] expected: up; observed: Down; [Additional logs: Link failure detected on NIC.Slot.6-1-1; Unable to perform cabling check on PCI Slot 6]
```
For more information about troubleshooting hardware validation failures, see Troubleshoot Hardware Validation Failure.