Azure Kubernetes Fleet Manager staged update runs provide a controlled way to deploy workloads across multiple member clusters using a staged process. To reduce risk, this approach deploys to target clusters in sequence, with optional wait times and approval gates between stages.
This article shows you how to create and run a staged update run to roll out workloads progressively and, when needed, roll back to a previous version.
Note
Azure Kubernetes Fleet Manager offers two approaches to staged updates:

- Cluster-scoped: use ClusterStagedUpdateRun with ClusterResourcePlacement, for fleet administrators managing infrastructure-level changes
- Namespace-scoped: use StagedUpdateRun with ResourcePlacement, for application teams managing rollouts within their own namespaces

The examples in this article use tabs to demonstrate both approaches. Choose the tab that matches your deployment scope.
Prerequisites
You need an Azure account with an active subscription. Create an account for free.
To understand the concepts and terminology used in this article, read the conceptual overview of staged rollout strategies.
You need Azure CLI version 2.58.0 or later to complete this article. To install or upgrade, see Install Azure CLI.
If you don't already have the Kubernetes CLI (kubectl), you can install it with this command:
az aks install-cli
You need the fleet Azure CLI extension. You can install it by running the following command:
az extension add --name fleet
Run the az extension update command to update to the latest version of the extension:
az extension update --name fleet
Set up the demo environment
This demo runs on a Fleet Manager with a hub cluster and three member clusters. If you don't have one yet, follow the quickstart guide to create a Fleet Manager with a hub cluster. Then join Azure Kubernetes Service (AKS) clusters as members.
This tutorial demonstrates staged update runs using a demo fleet environment with three member clusters that carry the following labels:
| Cluster name | Labels |
|---|---|
| member1 | environment=canary, order=2 |
| member2 | environment=staging |
| member3 | environment=canary, order=1 |
These labels let us define stages that group clusters by environment and control the deployment order within each stage.
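The labels are applied to the MemberCluster objects on the hub cluster. As a hypothetical sketch (the API group and version below are assumptions; verify them with `kubectl api-resources` against your Fleet Manager version), member1's metadata would carry:

```yaml
# Sketch only: labels on the MemberCluster object for member1.
# apiVersion is an assumption based on the fleet v1beta1 cluster API.
apiVersion: cluster.kubernetes-fleet.io/v1beta1
kind: MemberCluster
metadata:
  name: member1
  labels:
    environment: canary
    order: "2"
```

You can also add labels imperatively, for example `kubectl label membercluster member1 environment=canary order=2` run against the hub cluster.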
Prepare workloads for deployment
Publish workloads to the hub cluster so that they can be placed onto member clusters.
- ClusterResourcePlacement
- ResourcePlacement
Create a namespace and a ConfigMap for the workload on the hub cluster.
kubectl create ns test-namespace
kubectl create cm test-cm --from-literal=key=value1 -n test-namespace
To deploy the resources, create a ClusterResourcePlacement:
Note
spec.strategy.type is set to External to allow rollouts to be triggered through a ClusterStagedUpdateRun.
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
name: example-placement
spec:
resourceSelectors:
- group: ""
kind: Namespace
name: test-namespace
version: v1
policy:
placementType: PickAll
strategy:
type: External
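The namespace-scoped tab uses a ResourcePlacement instead. The following is a sketch under the assumption that ResourcePlacement mirrors the ClusterResourcePlacement spec (the resource name comes from the comparison table later in this article); because it's namespaced, it selects resources inside its own namespace, such as the ConfigMap, rather than the namespace object itself:

```yaml
# Hypothetical namespace-scoped sketch; fields assumed to mirror
# ClusterResourcePlacement. Verify against your fleet API version.
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ResourcePlacement
metadata:
  name: example-placement
  namespace: test-namespace
spec:
  resourceSelectors:
    - group: ""
      kind: ConfigMap
      name: test-cm
      version: v1
  policy:
    placementType: PickAll
  strategy:
    type: External
```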
Because we use the PickAll policy, the placement should be scheduled to all three clusters. However, because no ClusterStagedUpdateRun has been created yet, no resources are deployed to the member clusters.
Verify that the placement is scheduled:
kubectl get crp example-placement
Your output should look similar to the following example:
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE
example-placement 1 True 1 51s
Work with resource snapshots
When resources change, Fleet Manager creates a resource snapshot. Each snapshot has a unique index that can be used to reference a specific version of the resources.
Tip
For more information about resource snapshots and how they work, see Understanding resource snapshots.
Check the current resource snapshot
- ClusterResourcePlacement
- ResourcePlacement
To check the current resource snapshots:
kubectl get clusterresourcesnapshots --show-labels
Your output should look similar to the following example:
NAME GEN AGE LABELS
example-placement-0-snapshot 1 60s kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
We only have one snapshot version. It's the current latest (kubernetes-fleet.io/is-latest-snapshot=true) and has resource index 0 (kubernetes-fleet.io/resource-index=0).
Create a new resource snapshot
- ClusterResourcePlacement
- ResourcePlacement
Now modify the ConfigMap with a new value:
kubectl edit cm test-cm -n test-namespace
Update the value from value1 to value2, then verify the change:
kubectl get configmap test-cm -n test-namespace -o yaml
Your output should look similar to the following example:
apiVersion: v1
data:
key: value2 # value updated here, old value: value1
kind: ConfigMap
metadata:
creationTimestamp: ...
name: test-cm
namespace: test-namespace
resourceVersion: ...
uid: ...
You should now see two versions of the resource snapshot, with indexes 0 and 1:
kubectl get clusterresourcesnapshots --show-labels
Your output should look similar to the following example:
NAME GEN AGE LABELS
example-placement-0-snapshot 1 2m6s kubernetes-fleet.io/is-latest-snapshot=false,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
example-placement-1-snapshot 1 10s kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=1
The latest label is now set on example-placement-1-snapshot, which contains the latest ConfigMap data:
kubectl get clusterresourcesnapshots example-placement-1-snapshot -o yaml
Your output should look similar to the following example:
apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceSnapshot
metadata:
annotations:
kubernetes-fleet.io/number-of-enveloped-object: "0"
kubernetes-fleet.io/number-of-resource-snapshots: "1"
kubernetes-fleet.io/resource-hash: 10dd7a3d1e5f9849afe956cfbac080a60671ad771e9bda7dd34415f867c75648
creationTimestamp: "2025-07-22T21:26:54Z"
generation: 1
labels:
kubernetes-fleet.io/is-latest-snapshot: "true"
kubernetes-fleet.io/parent-CRP: example-placement
kubernetes-fleet.io/resource-index: "1"
name: example-placement-1-snapshot
ownerReferences:
- apiVersion: placement.kubernetes-fleet.io/v1beta1
blockOwnerDeletion: true
controller: true
kind: ClusterResourcePlacement
name: example-placement
uid: e7d59513-b3b6-4904-864a-c70678fd6f65
resourceVersion: "19994"
uid: 79ca0bdc-0b0a-4c40-b136-7f701e85cdb6
spec:
selectedResources:
- apiVersion: v1
kind: Namespace
metadata:
labels:
kubernetes.io/metadata.name: test-namespace
name: test-namespace
spec:
finalizers:
- kubernetes
- apiVersion: v1
data:
key: value2 # latest value: value2, old value: value1
kind: ConfigMap
metadata:
name: test-cm
namespace: test-namespace
Deploy a staged update strategy
- ClusterResourcePlacement
- ResourcePlacement
A ClusterStagedUpdateStrategy defines the orchestration pattern that groups clusters into stages and specifies the rollout sequence. It selects member clusters by label. For the demo, we create a strategy with two stages, staging and canary:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateStrategy
metadata:
name: example-strategy
spec:
stages:
- name: staging
labelSelector:
matchLabels:
environment: staging
afterStageTasks:
- type: TimedWait
waitTime: 1m
- name: canary
labelSelector:
matchLabels:
environment: canary
sortingLabelKey: order
afterStageTasks:
- type: Approval
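For the namespace-scoped tab, the corresponding object is a StagedUpdateStrategy. The sketch below assumes its spec mirrors ClusterStagedUpdateStrategy (the resource name comes from the comparison table later in this article):

```yaml
# Hypothetical namespace-scoped sketch; spec assumed to mirror
# ClusterStagedUpdateStrategy. Verify against your fleet API version.
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateStrategy
metadata:
  name: example-strategy
  namespace: test-namespace
spec:
  stages:
  - name: staging
    labelSelector:
      matchLabels:
        environment: staging
    afterStageTasks:
    - type: TimedWait
      waitTime: 1m
  - name: canary
    labelSelector:
      matchLabels:
        environment: canary
    sortingLabelKey: order
    afterStageTasks:
    - type: Approval
```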
Deploy a staged update run to roll out the latest change
- ClusterResourcePlacement
- ResourcePlacement
A ClusterStagedUpdateRun executes the rollout of a ClusterResourcePlacement following a ClusterStagedUpdateStrategy. To trigger a staged update run for a ClusterResourcePlacement (CRP), we create a ClusterStagedUpdateRun that specifies the CRP name, the update run strategy name, and the latest resource snapshot index ("1").
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
name: example-run
spec:
placementName: example-placement
resourceSnapshotIndex: "1"
stagedRolloutStrategyName: example-strategy
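The namespace-scoped equivalent is a StagedUpdateRun created in the workload's namespace. This sketch assumes its spec mirrors ClusterStagedUpdateRun (the resource name comes from the comparison table later in this article):

```yaml
# Hypothetical namespace-scoped sketch; spec assumed to mirror
# ClusterStagedUpdateRun.
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateRun
metadata:
  name: example-run
  namespace: test-namespace
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
```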
The staged update run should be initialized and running:
kubectl get csur example-run
Your output should look similar to the following example:
NAME PLACEMENT RESOURCE-SNAPSHOT-INDEX POLICY-SNAPSHOT-INDEX INITIALIZED SUCCEEDED AGE
example-run example-placement 1 0 True 7s
After the one-minute TimedWait expires, a more detailed view of the status looks like this:
kubectl get csur example-run -o yaml
Your output should look similar to the following example:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
...
name: example-run
...
spec:
placementName: example-placement
resourceSnapshotIndex: "1"
stagedRolloutStrategyName: example-strategy
status:
conditions:
- lastTransitionTime: "2025-07-22T21:28:08Z"
message: ClusterStagedUpdateRun initialized successfully
observedGeneration: 1
reason: UpdateRunInitializedSuccessfully
status: "True" # the updateRun is initialized successfully
type: Initialized
- lastTransitionTime: "2025-07-22T21:29:53Z"
message: The updateRun is waiting for after-stage tasks in stage canary to complete
observedGeneration: 1
reason: UpdateRunWaiting
status: "False" # the updateRun is still progressing and waiting for approval
type: Progressing
deletionStageStatus:
clusters: [] # no clusters need to be cleaned up
stageName: kubernetes-fleet.io/deleteStage
policyObservedClusterCount: 3 # number of clusters to be updated
policySnapshotIndexUsed: "0"
stagedUpdateStrategySnapshot: # snapshot of the strategy used for this update run
stages:
- afterStageTasks:
- type: TimedWait
waitTime: 1m0s
labelSelector:
matchLabels:
environment: staging
name: staging
- afterStageTasks:
- type: Approval
labelSelector:
matchLabels:
environment: canary
name: canary
sortingLabelKey: order
stagesStatus: # detailed status for each stage
- afterStageTaskStatus:
- conditions:
- lastTransitionTime: "2025-07-22T21:29:23Z"
message: Wait time elapsed
observedGeneration: 1
reason: AfterStageTaskWaitTimeElapsed
status: "True" # the wait after-stage task has completed
type: WaitTimeElapsed
type: TimedWait
clusters:
- clusterName: member2 # stage staging contains member2 cluster only
conditions:
- lastTransitionTime: "2025-07-22T21:28:08Z"
message: Cluster update started
observedGeneration: 1
reason: ClusterUpdatingStarted
status: "True"
type: Started
- lastTransitionTime: "2025-07-22T21:28:23Z"
message: Cluster update completed successfully
observedGeneration: 1
reason: ClusterUpdatingSucceeded
status: "True" # member2 is updated successfully
type: Succeeded
conditions:
- lastTransitionTime: "2025-07-22T21:28:23Z"
message: All clusters in the stage are updated and after-stage tasks are completed
observedGeneration: 1
reason: StageUpdatingSucceeded
status: "False"
type: Progressing
- lastTransitionTime: "2025-07-22T21:29:23Z"
message: Stage update completed successfully
observedGeneration: 1
reason: StageUpdatingSucceeded
status: "True" # stage staging has completed successfully
type: Succeeded
endTime: "2025-07-22T21:29:23Z"
stageName: staging
startTime: "2025-07-22T21:28:08Z"
- afterStageTaskStatus:
- approvalRequestName: example-run-canary # ClusterApprovalRequest name for this stage
conditions:
- lastTransitionTime: "2025-07-22T21:29:53Z"
message: ClusterApprovalRequest is created
observedGeneration: 1
reason: AfterStageTaskApprovalRequestCreated
status: "True"
type: ApprovalRequestCreated
type: Approval
clusters:
- clusterName: member3 # according to the labelSelector and sortingLabelKey, member3 is selected first in this stage
conditions:
- lastTransitionTime: "2025-07-22T21:29:23Z"
message: Cluster update started
observedGeneration: 1
reason: ClusterUpdatingStarted
status: "True"
type: Started
- lastTransitionTime: "2025-07-22T21:29:38Z"
message: Cluster update completed successfully
observedGeneration: 1
reason: ClusterUpdatingSucceeded
status: "True" # member3 update is completed
type: Succeeded
- clusterName: member1 # member1 is selected after member3 because of order=2 label
conditions:
- lastTransitionTime: "2025-07-22T21:29:38Z"
message: Cluster update started
observedGeneration: 1
reason: ClusterUpdatingStarted
status: "True"
type: Started
- lastTransitionTime: "2025-07-22T21:29:53Z"
message: Cluster update completed successfully
observedGeneration: 1
reason: ClusterUpdatingSucceeded
status: "True" # member1 update is completed
type: Succeeded
conditions:
- lastTransitionTime: "2025-07-22T21:29:53Z"
message: All clusters in the stage are updated, waiting for after-stage tasks
to complete
observedGeneration: 1
reason: StageUpdatingWaiting
status: "False" # stage canary is waiting for approval task completion
type: Progressing
stageName: canary
startTime: "2025-07-22T21:29:23Z"
We can see that the TimedWait period for the staging stage has elapsed and that a ClusterApprovalRequest object has been created for the Approval task of the canary stage. We can inspect the generated ClusterApprovalRequest and see that it isn't yet approved:
kubectl get clusterapprovalrequest
Your output should look similar to the following example:
NAME UPDATE-RUN STAGE APPROVED APPROVALACCEPTED AGE
example-run-canary example-run canary 2m39s
Approve the staged update run
- ClusterResourcePlacement
- ResourcePlacement
We can approve the ClusterApprovalRequest by creating a JSON patch file and applying it:
cat << EOF > approval.json
{
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
                "message": "lgtm",
                "observedGeneration": 1,
                "reason": "testPassed",
                "status": "True",
                "type": "Approved"
            }
        ]
    }
}
EOF
Submit the patch request to approve it using the JSON file we created:
kubectl patch clusterapprovalrequests example-run-canary --type='merge' --subresource=status --patch-file approval.json
Then verify that the request is approved:
kubectl get clusterapprovalrequest
Your output should look similar to the following example:
NAME UPDATE-RUN STAGE APPROVED APPROVALACCEPTED AGE
example-run-canary example-run canary True True 3m35s
Now the ClusterStagedUpdateRun is able to proceed and complete.
kubectl get csur example-run
Your output should look similar to the following example:
NAME PLACEMENT RESOURCE-SNAPSHOT-INDEX POLICY-SNAPSHOT-INDEX INITIALIZED SUCCEEDED AGE
example-run example-placement 1 0 True True 5m28s
Verify that the rollout is complete
- ClusterResourcePlacement
- ResourcePlacement
The ClusterResourcePlacement also shows that the rollout is complete and the resources are available on all member clusters:
kubectl get crp example-placement
Your output should look similar to the following example:
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE
example-placement 1 True 1 True 1 8m55s
The ConfigMap test-cm should be deployed on all three member clusters with the latest data:
apiVersion: v1
data:
key: value2
kind: ConfigMap
metadata:
...
name: test-cm
namespace: test-namespace
...
Deploy a second staged update run to roll back to a previous version
- ClusterResourcePlacement
- ResourcePlacement
Suppose the workload administrator wants to roll back the ConfigMap change, reverting the value from value2 back to value1. Rather than manually updating the ConfigMap on the hub, they can create a new ClusterStagedUpdateRun that references the previous resource snapshot index, "0", and reuse the same strategy:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
name: example-run-2
spec:
placementName: example-placement
resourceSnapshotIndex: "0"
stagedRolloutStrategyName: example-strategy
Let's look at the new ClusterStagedUpdateRun:
kubectl get csur
Your output should look similar to the following example:
NAME PLACEMENT RESOURCE-SNAPSHOT-INDEX POLICY-SNAPSHOT-INDEX INITIALIZED SUCCEEDED AGE
example-run example-placement 1 0 True True 13m
example-run-2 example-placement 0 0 True 9s
After the one-minute TimedWait expires, we should see a ClusterApprovalRequest object created for the new ClusterStagedUpdateRun:
kubectl get clusterapprovalrequest
Your output should look similar to the following example:
NAME UPDATE-RUN STAGE APPROVED APPROVALACCEPTED AGE
example-run-2-canary example-run-2 canary 75s
example-run-canary example-run canary True True 14m
To approve the new ClusterApprovalRequest object, let's reuse the same approval.json file to patch it:
kubectl patch clusterapprovalrequests example-run-2-canary --type='merge' --subresource=status --patch-file approval.json
Verify that the new object is approved:
kubectl get clusterapprovalrequest
Your output should look similar to the following example:
NAME UPDATE-RUN STAGE APPROVED APPROVALACCEPTED AGE
example-run-2-canary example-run-2 canary True True 2m7s
example-run-canary example-run canary True True 15m
The ConfigMap test-cm should now be deployed on all three member clusters, with the data reverted to value1:
apiVersion: v1
data:
key: value1
kind: ConfigMap
metadata:
...
name: test-cm
namespace: test-namespace
...
Key differences between the approaches

| Aspect | Cluster-scoped | Namespace-scoped |
|---|---|---|
| Strategy resource | ClusterStagedUpdateStrategy | StagedUpdateStrategy |
| Update run resource | ClusterStagedUpdateRun | StagedUpdateRun |
| Placement targeting | ClusterResourcePlacement | ResourcePlacement |
| Approval resource | ClusterApprovalRequest (short name: careq) | ApprovalRequest (short name: areq) |
| Snapshot resource | ClusterResourceSnapshot | ResourceSnapshot |
| Scope | Entire cluster | Namespace-bound |
| Use case | Infrastructure rollouts | Application rollouts |
| Permissions | Cluster administrator level | Namespace level |
Clean up resources
After you finish this tutorial, you can clean up the resources you created:
- ClusterResourcePlacement
- ResourcePlacement
kubectl delete csur example-run example-run-2
kubectl delete csus example-strategy
kubectl delete crp example-placement
kubectl delete namespace test-namespace
Next steps
In this article, you learned how to orchestrate deployments across member clusters using staged update runs. You created staged update strategies for cluster-scoped and namespace-scoped deployments, performed progressive rollouts, and rolled back to a previous version.
To learn more about staged update runs and related concepts, see the following resources: