Hi Lucas,
The information about cloning the passive node is the "smoking gun." Cloning a domain-joined cluster node is unsupported by Microsoft and breaks the way Active Directory tracks machine identity. This action is almost certainly the root cause of the password synchronization failure you are seeing on PWGIRSQL2.
Root Cause (SID and Machine Password Duplication): When you cloned PWGIRSQL2 to create PWGIRSQL3, you did not just copy the files; you also copied the machine account password and, unless Sysprep was run properly, the local machine Security Identifier (SID).
The conflict: Active Directory relies on a secure channel password that each machine rotates automatically (every 30 days by default). If PWGIRSQL3 (the clone) updated the machine password with the Domain Controller, PWGIRSQL2 (the source) is left with an old local password that the Domain Controller no longer recognizes. The Cluster Service runs on top of these OS credentials, so when PWGIRSQL2 tries to reach the Cluster Name Object (CLPWGIRSQL$) in AD to update the CNO password, authentication fails because the node's own secure channel is broken or confused by the duplicate identity artifacts.
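To make the conflict concrete, here is a minimal Python sketch of the scenario described above. It is a toy model, not how Netlogon or Active Directory is actually implemented: the `DomainController` and `Machine` classes, the `simulate` function, and the starting password are all hypothetical names chosen for illustration. It assumes, as in the explanation above, that the clone still holds the source node's machine account credentials.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class DomainController:
    """Hypothetical stand-in for AD: one stored password per computer account."""
    accounts: dict = field(default_factory=dict)

    def authenticate(self, account: str, password: str) -> bool:
        return self.accounts.get(account) == password

    def rotate(self, account: str, current: str) -> str:
        # Rotation only succeeds if the caller proves it knows the current password.
        if not self.authenticate(account, current):
            raise PermissionError("secure channel is broken")
        new_password = secrets.token_hex(8)
        self.accounts[account] = new_password
        return new_password


@dataclass
class Machine:
    hostname: str
    account: str   # AD computer account this machine believes it owns
    password: str  # locally cached machine account password

    def clone(self, new_hostname: str) -> "Machine":
        # A disk-level clone copies the cached account and password unchanged.
        return Machine(new_hostname, self.account, self.password)


def simulate():
    dc = DomainController({"PWGIRSQL2$": "initial"})
    source = Machine("PWGIRSQL2", "PWGIRSQL2$", "initial")
    clone = source.clone("PWGIRSQL3")

    # Right after cloning, both machines can still authenticate.
    before = dc.authenticate(source.account, source.password)

    # The clone performs the periodic rotation first and keeps the new password.
    clone.password = dc.rotate(clone.account, clone.password)

    # The source still holds the old password, so its secure channel fails.
    source_ok = dc.authenticate(source.account, source.password)
    clone_ok = dc.authenticate(clone.account, clone.password)
    return before, source_ok, clone_ok


if __name__ == "__main__":
    print(simulate())
```

In this model, whichever machine rotates first locks the other one out, which matches the symptom of PWGIRSQL2 suddenly failing to authenticate even though nothing changed on it locally.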
You cannot simply "fix" the affected node while it is live in the cluster. You must remove the compromised node first to restore stability. Here is my suggestion:
Step 1: Validate the Healthy Node (PWGIRSQL1)
Ensure that your primary node (PWGIRSQL1) is hosting all resources (SQL Server, etc.) and is functioning correctly.
Step 2: Evict the Problem Node (PWGIRSQL2)
Since PWGIRSQL2 is the source of the clone and is throwing errors, it is "tainted."
Open Failover Cluster Manager.
Right-click PWGIRSQL2 -> More Actions -> Evict.
This removes the node from the cluster configuration so it stops trying to corrupt the CNO credentials.
Step 3: Fix the Trust Relationship on PWGIRSQL2 (or Reinstall)
Option A (Reinstall - Recommended): Format PWGIRSQL2 and install the OS fresh. This guarantees no lingering SID duplication issues.
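Option B (Re-join): Remove PWGIRSQL2 from the domain (join a Workgroup), reboot, delete the computer account in AD, and then re-join the domain. This creates a new computer account in AD (with a new domain SID) and a new machine password. It does not change the local machine SID, which is why Option A is the safer choice.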
Step 4: Repair the Cluster Name Object (CNO)
Now that the "bad" node is gone, perform the repair action we discussed earlier on the surviving node (PWGIRSQL1).
Right-click Cluster Name -> More Actions -> Repair Active Directory Object.
This ensures the CNO (CLPWGIRSQL$) is perfectly synced with the surviving authoritative node.
Step 5: Re-add PWGIRSQL2
Once PWGIRSQL2 is fresh (reinstalled or fully re-joined) and showing "Ready" in validation, add it back to the cluster using the Add Node wizard.
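If you prefer to keep the sequence as a written runbook, the five steps can be sketched as a small Python dry run that only prints the order of operations and executes nothing. This is an illustration under assumptions: the steps above use the Failover Cluster Manager GUI, and the PowerShell cmdlets shown (`Get-ClusterGroup`, `Remove-ClusterNode`, `Add-ClusterNode` from the FailoverClusters module, and `Test-ComputerSecureChannel`) are my suggested command-line equivalents, not something taken from your environment. The `Step` type and `recovery_plan` function are hypothetical names. The CNO repair is listed as a manual GUI action.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    number: int
    where: str   # machine the action is performed on
    action: str  # PowerShell command, or a manual action prefixed with "manual:"


def recovery_plan(healthy: str = "PWGIRSQL1", tainted: str = "PWGIRSQL2") -> List[Step]:
    """Return the recovery actions in the order they should be carried out."""
    return [
        # Step 1: confirm the healthy node owns all cluster groups.
        Step(1, healthy, "Get-ClusterGroup"),
        # Step 2: evict the problem node from the cluster configuration.
        Step(2, healthy, f"Remove-ClusterNode -Name {tainted}"),
        # Step 3: rebuild or re-join, then check the node's secure channel.
        Step(3, tainted, "manual: reinstall the OS, or leave and re-join the domain"),
        Step(3, tainted, "Test-ComputerSecureChannel"),
        # Step 4: repair the CNO from Failover Cluster Manager.
        Step(4, healthy, "manual: Repair Active Directory Object on the cluster name"),
        # Step 5: add the rebuilt node back.
        Step(5, healthy, f"Add-ClusterNode -Name {tainted}"),
    ]


def print_plan(plan: List[Step]) -> None:
    for step in plan:
        print(f"Step {step.number} [{step.where}] {step.action}")


if __name__ == "__main__":
    print_plan(recovery_plan())
```

The point of writing it down this way is the ordering: the eviction happens before the node's domain identity is rebuilt, and the CNO repair happens before the node is added back.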
Important Warning regarding PWGIRSQL3: If PWGIRSQL3 was created via cloning without running Sysprep /generalize, it is also a time-bomb. It is highly recommended to eventually evict it and reinstall it properly from an ISO, rather than a clone. For now, prioritize fixing PWGIRSQL2 to regain redundancy.
I hope you've found something useful here. If this helped you understand the issue, please consider accepting the answer. Should you have more questions, feel free to leave a message. Have a nice day!
VP