Hello Vinodh247,
Welcome to the Microsoft Q&A and thank you for posting your questions here.
I understand that your Azure Stack HCI 24H2 deployment fails with the error dial tcp: lookup clustername.domain: i/o timeout, and that the Arc Resource Bridge network and internet connectivity validation also fails.
The first step in resolving this deployment error is to get access to the appliance VM so you can diagnose from inside it. The error comes from DNS resolution within the appliance, so host-level tests such as Test-NetConnection run on the cluster nodes are not sufficient on their own. Use the Serial Console or an emergency console to reach the VM environment. See https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/windows/serial-console-overview and https://learn.microsoft.com/en-us/cli/azure/serial-console?view=azure-cli-latest for more details. Note that these articles describe the serial console for Azure-hosted VMs; for an appliance VM running on your own cluster, the equivalent is typically a console session to the VM from the Hyper-V host. This access lets you inspect the network settings the VM is actually using, which is critical for troubleshooting connectivity issues.
Once you have access to the appliance VM, validate its DNS and network configuration. Confirm the DNS servers and gateway it is actually using (ipconfig /all on Windows; because the Arc Resource Bridge appliance is Linux-based, the equivalents there are ip addr, ip route, and cat /etc/resolv.conf), and run nslookup clustername.domain <DNS_IP>, where available, to verify that the cluster FQDN resolves against each configured DNS server. Then test connectivity to the cluster on port 55000, for example with Test-NetConnection clustername.domain -Port 55000 from a Windows machine on the appliance subnet. If these checks fail, the most likely causes are an incorrect DNS server assignment or a missing A record. Make sure the DNS servers specified in your deployment template are reachable from the appliance subnet and that the cluster FQDN exists in the DNS zone.
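If nslookup is not available where you are testing, the same two checks can be sketched in Python using only the standard library. This is a minimal sketch, not a Microsoft tool: the function names are hypothetical, clustername.domain and 10.0.0.5 are the placeholders used in this answer, and it assumes you run it from a machine on the same subnet as the appliance VM. It sends an A-record query directly to one DNS server, so the local resolver cache is bypassed, and separately tries a TCP connection to a port.

```python
import socket
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet for an A record (RFC 1035 layout)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def parse_dns_response(data):
    """Return (rcode, answer_count) read from the 12-byte DNS response header."""
    if len(data) < 12:
        raise ValueError("DNS response shorter than a header")
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", data[:12])
    return flags & 0x000F, ancount


def query_dns_server(hostname, dns_ip, timeout=3.0):
    """Send one A-record query over UDP to dns_ip:53 and summarize the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # a timeout here mirrors the "i/o timeout" symptom
        sock.sendto(build_dns_query(hostname), (dns_ip, 53))
        data, _ = sock.recvfrom(512)
    return parse_dns_response(data)


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers resolution failures, refusals, and timeouts
        return False
```

Calling query_dns_server("clustername.domain", "10.0.0.5") returns an rcode of 0 with at least one answer when the record exists; an rcode of 3 (NXDOMAIN) suggests a missing A record, and a socket timeout suggests the DNS server is unreachable from that subnet. tcp_port_open("clustername.domain", 55000) covers the port check.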
The next step is to review your deployment JSON or ARM template and set the DNS servers explicitly, so that the deployment does not fall back on defaults that may not apply in your environment. For example, include a snippet along these lines in your deployment configuration (the exact property names depend on the template schema you are using):
"networkProfile": {
"dnsServers": ["10.0.0.5", "10.0.0.6"]
}
With this set, the appliance is given the DNS servers you intend during deployment rather than an inherited default. Refer to Microsoft’s schema documentation for network profiles: Azure Resource Manager Templates.
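Before redeploying, it can help to sanity-check the DNS entries in the configuration file. The following is a minimal sketch that assumes the networkProfile / dnsServers layout from the snippet above; if your template uses different property names, adjust the lookup accordingly. The function names are hypothetical.

```python
import ipaddress
import json


def extract_dns_servers(config_text):
    """Read the dnsServers list from a config shaped like the snippet above."""
    config = json.loads(config_text)
    return config.get("networkProfile", {}).get("dnsServers", [])


def invalid_dns_entries(servers):
    """Return the entries that are not valid IPv4 or IPv6 address strings."""
    bad = []
    for entry in servers:
        if not isinstance(entry, str):
            bad.append(entry)  # numbers or nested values are never valid here
            continue
        try:
            ipaddress.ip_address(entry)
        except ValueError:
            bad.append(entry)  # typos such as 10.0.0.256, or hostnames
    return bad
```

An empty list from extract_dns_servers means no DNS servers are set explicitly, and anything returned by invalid_dns_entries is worth correcting before the next deployment attempt.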
After validating DNS, confirm that the appliance VM has a valid default gateway and routing configuration. A misconfigured gateway or a blocked route can prevent DNS queries from ever reaching the server. Check the subnet assignments and verify that NSGs, vSwitch ACLs, or firewalls are not blocking traffic between the appliance and the DNS servers (DNS uses UDP and TCP port 53). If necessary, temporarily relax the firewall rules or adjust the routing tables to confirm where the traffic is being dropped. Microsoft’s networking best practices for Azure Stack HCI can be found here: https://learn.microsoft.com/en-za/azure-stack/hci/deploy/network-atc and https://learn.microsoft.com/en-ca/azure-stack/hci/concepts/plan-software-defined-networking-infrastructure
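The subnet and gateway check can also be done on paper before touching the network. This is a minimal sketch with hypothetical names and example addresses; it only checks that the planned addresses are consistent with each other and says nothing about what firewalls allow.

```python
import ipaddress


def check_ip_plan(appliance_ip, prefix_length, gateway_ip, dns_ips):
    """Check that the gateway is on the appliance subnet and list off-subnet DNS servers."""
    # strict=False lets us derive the network from a host address plus prefix length
    network = ipaddress.ip_network(f"{appliance_ip}/{prefix_length}", strict=False)
    gateway = ipaddress.ip_address(gateway_ip)
    return {
        "network": str(network),
        # A gateway outside the appliance subnet cannot be reached directly
        "gateway_in_subnet": gateway in network,
        # These DNS servers are only reachable if routing through the gateway works
        "dns_via_gateway": [
            ip for ip in dns_ips if ipaddress.ip_address(ip) not in network
        ],
    }
```

If gateway_in_subnet is False, fix the IP plan first. Any server listed under dns_via_gateway depends on the gateway and on every firewall along that path, so those are the routes to test.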
If the deployment still fails after these adjustments, redeploy with verbose logging enabled by adding -Verbose or -Debug in PowerShell, or --verbose or --debug in the Azure CLI. This captures more detailed logs from the network validation phase for further analysis. Additionally, consider opening a support case with Microsoft and referencing the GitHub supportability guide for Azure Stack HCI: Azure Stack HCI Supportability. Including the verbose logs lets support start with full diagnostic context.
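Once you have verbose logs, a quick scan for the lookup failures shows which names and DNS servers are involved. This is a minimal sketch: the pattern is based on the error text quoted in this thread, plus the longer "lookup <name> on <server>:53" form that Go-based tools commonly print, which is an assumption about what your logs contain. The function name is hypothetical.

```python
import re
from collections import Counter

# Matches "dial tcp: lookup <host>:" and "dial tcp: lookup <host> on <ip>:<port>:"
LOOKUP_FAILURE = re.compile(
    r"dial tcp: lookup (?P<host>[^\s:]+)(?: on (?P<server>\S+?:\d+))?:"
)


def failed_lookups(log_text):
    """Count lookup failures per (hostname, DNS server) pair found in the log text."""
    counts = Counter()
    for match in LOOKUP_FAILURE.finditer(log_text):
        # server is None when the log line does not name the DNS server
        counts[(match.group("host"), match.group("server"))] += 1
    return counts
```

If every failure names the same DNS server, focus on reachability of that server; if several names fail against several servers, the gateway or a firewall is the more likely cause.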
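Finally, do not draw conclusions from host-level connectivity tests alone. The appliance VM has its own IP address, gateway, and DNS configuration, so a test that succeeds on the host says little about what the appliance itself can reach. Validate from the appliance, or from a machine on the same subnet with the same settings.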
I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.
If this answer was helpful, please close out the thread by upvoting it and accepting it as the answer.