Hi pipi,
Thanks for reaching out on Q&A. A VM Scale Set (VMSS) does not wait for Application Gateway's connection draining to complete during scale-in. Although connection draining stops new incoming requests, VMSS terminates the instance immediately, so active requests are dropped before they complete.
This happens because Application Gateway marks the backend instance as draining and stops sending it new requests, but VMSS does not wait for draining to finish and deletes the VM before existing sessions are completed.
Below are the recommendations:
1. Enable VM Termination Notification:
This delays the deletion and sends the VM a Terminate event through Azure Scheduled Events (the delay is configurable from 5 to 15 minutes), so the application can stop accepting new requests and complete in-flight transactions.
2. Use VMSS Instance Protection:
This prevents protected instances from being deleted during scale-in. Keep the protection on while an instance is still serving requests, and remove it once the instance is safe to delete.
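3. Verify Application Gateway Connection Draining Configuration:
Confirm that connection draining is enabled on the backend settings and that the drain timeout is long enough for your longest requests, so new requests are stopped while existing connections are allowed to finish.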
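To illustrate recommendation 1, here is a minimal Python sketch of how an application on the instance might react to a termination notification. It assumes the Scheduled Events endpoint of the Azure Instance Metadata Service with api-version 2020-07-01; the `instance_name` value and the `drain` callback are hypothetical placeholders for your VM's name as reported in the event and for your own shutdown logic. Please treat it as a starting point rather than a finished implementation.

```python
import json
import urllib.request

# Scheduled Events endpoint of the Instance Metadata Service.
# It is only reachable from inside the VM. The api-version is an assumption.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def fetch_scheduled_events(url=SCHEDULED_EVENTS_URL, timeout=5):
    """Fetch the current Scheduled Events document from inside the VM."""
    request = urllib.request.Request(url, headers={"Metadata": "true"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def pending_terminate_event_ids(document, instance_name):
    """Return the IDs of Terminate events that target this instance."""
    return [
        event["EventId"]
        for event in document.get("Events", [])
        if event.get("EventType") == "Terminate"
        and instance_name in event.get("Resources", [])
    ]


def build_approval_body(event_ids):
    """Build the StartRequests payload that acknowledges the given events."""
    payload = {"StartRequests": [{"EventId": event_id} for event_id in event_ids]}
    return json.dumps(payload).encode("utf-8")


def approve_events(event_ids, url=SCHEDULED_EVENTS_URL, timeout=5):
    """Acknowledge the events so deletion need not wait for the full timeout."""
    request = urllib.request.Request(
        url,
        data=build_approval_body(event_ids),
        headers={"Metadata": "true", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def handle_termination(instance_name, drain):
    """Poll once; if this instance is being terminated, drain, then approve."""
    event_ids = pending_terminate_event_ids(fetch_scheduled_events(), instance_name)
    if event_ids:
        drain()  # hypothetical callback: stop new requests, finish in-flight work
        approve_events(event_ids)
    return event_ids
```

You would typically call `handle_termination` in a loop every few seconds from a small background service. In `drain`, one option is to start failing the Application Gateway health probe first, so the gateway stops routing new requests before the instance is removed.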
Additional Best Practices:
- Review and adjust the scale-in policy (Default, NewestVM, or OldestVM) based on workload behavior
- Configure appropriate autoscale thresholds and cooldown periods to avoid aggressive scale-in
- Monitor scale events using Azure Monitor to validate expected behavior
- Consider leveraging Availability Zones to improve resiliency during scale operations
Hope this helps! Please let me know if you have any queries.