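Orchestrate staged rollouts across member clusters

Azure Kubernetes Fleet Manager staged update runs provide a controlled way to deploy workloads across multiple member clusters, one stage at a time. To reduce risk, this approach deploys to the targeted clusters sequentially, with optional wait times and approval gates between stages.

This article shows you how to create and execute staged update runs to deploy workloads progressively and, when needed, roll back to a previous version.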

Note

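Azure Kubernetes Fleet Manager provides two approaches to staged updates:

  • Cluster-scoped: Use ClusterStagedUpdateRun with ClusterResourcePlacement. Intended for fleet administrators who manage infrastructure-level changes.
  • Namespace-scoped: Use StagedUpdateRun with ResourcePlacement. Intended for application teams who manage rollouts within their own namespaces.

The examples in this article show the cluster-scoped approach. The equivalent namespace-scoped resources are listed in the comparison table near the end of the article.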

Prerequisites

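  • You need an Azure account with an active subscription. Create an account for free.

  • To understand the concepts and terminology used in this article, read the conceptual overview of staged rollout strategies.

  • You need Azure CLI version 2.58.0 or later installed to complete this article. To install or upgrade, see Install the Azure CLI.

  • If you don't already have the Kubernetes CLI (kubectl), you can install it by using this command: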

    az aks install-cli
    
  • You need the fleet Azure CLI extension. You can install it by running the following command:

    az extension add --name fleet
    

    Run the az extension update command to update to the latest version of the extension:

    az extension update --name fleet
    

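Set up the demo environment

This demo runs on a Fleet Manager with a hub cluster and three member clusters. If you don't have one, follow the quickstart to create a Fleet Manager with a hub cluster. Then join Azure Kubernetes Service (AKS) clusters as members.

This tutorial demonstrates staged update runs in a demo fleet environment with three member clusters that have the following labels:

cluster name   labels
member1        environment=canary, order=2
member2        environment=staging
member3        environment=canary, order=1

These labels let us create stages that group clusters by environment and control the deployment order within each stage.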
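
Before you define stages, it can help to confirm which member clusters a given label selector would match. The following is a minimal sketch that assumes your current kubectl context points at the hub cluster; the helper name show_stage_members is hypothetical and not part of Fleet Manager.

```shell
# Hypothetical helper: list the member clusters whose "environment" label
# matches the given value. Assumes kubectl targets the hub cluster.
show_stage_members() {
  env_value="$1"
  kubectl get memberclusters -l "environment=${env_value}" --show-labels
}

# Example call (commented out so that sourcing this file has no side effects):
# show_stage_members canary
```

In this demo environment, calling the helper with canary should list member1 and member3, and calling it with staging should list member2.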

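Prepare workloads for deployment

Publish workloads to the hub cluster so that they can be placed onto member clusters.

Create a namespace and a ConfigMap for the workload on the hub cluster: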

kubectl create ns test-namespace
kubectl create cm test-cm --from-literal=key=value1 -n test-namespace

To deploy the resources, create a ClusterResourcePlacement:

Note

spec.strategy.type is set to External so that the rollout can be triggered by a ClusterStagedUpdateRun.

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
  name: example-placement
spec:
  resourceSelectors:
    - group: ""
      kind: Namespace
      name: test-namespace
      version: v1
  policy:
    placementType: PickAll
  strategy:
    type: External

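Because we use the PickAll policy, all three clusters should be scheduled. However, no resources are deployed to the member clusters yet, because we haven't created a ClusterStagedUpdateRun.

Verify that the placement is scheduled:

kubectl get crp example-placement

Your output should look similar to the following example: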

NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
example-placement   1     True        1                                           51s

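Work with resource snapshots

Fleet Manager creates resource snapshots when resources change. Each snapshot has a unique index that you can use to reference a specific version of the resources.

Tip

For more information about resource snapshots and how they work, see Understand resource snapshots.

Check the current resource snapshots

To check the current resource snapshots:

kubectl get clusterresourcesnapshots --show-labels

Your output should look similar to the following example: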

NAME                           GEN   AGE   LABELS
example-placement-0-snapshot   1     60s   kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0

We have only one version of the snapshot. It's the current latest (kubernetes-fleet.io/is-latest-snapshot=true) and has resource index 0 (kubernetes-fleet.io/resource-index=0).
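
If you want to capture the latest index in a script rather than reading it from the table, the labels shown above can be used as a selector. This is a sketch only, assuming your kubectl context points at the hub cluster; latest_snapshot_index is a hypothetical helper name.

```shell
# Hypothetical helper: print the resource index of the latest snapshot
# for a given ClusterResourcePlacement, using the labels shown above.
latest_snapshot_index() {
  crp_name="$1"
  kubectl get clusterresourcesnapshots \
    -l "kubernetes-fleet.io/parent-CRP=${crp_name},kubernetes-fleet.io/is-latest-snapshot=true" \
    -o jsonpath='{.items[0].metadata.labels.kubernetes-fleet\.io/resource-index}'
}

# Example call (commented out so that sourcing this file has no side effects):
# latest_snapshot_index example-placement
```

The dot in the label key is escaped with a backslash because kubectl's JSONPath syntax otherwise treats it as a path separator.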

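Create a new resource snapshot

Now modify the ConfigMap with a new value:

kubectl edit cm test-cm -n test-namespace

Update the value from value1 to value2, then confirm the change:

kubectl get configmap test-cm -n test-namespace -o yaml

Your output should look similar to the following example: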

apiVersion: v1
data:
  key: value2 # value updated here, old value: value1
kind: ConfigMap
metadata:
  creationTimestamp: ...
  name: test-cm
  namespace: test-namespace
  resourceVersion: ...
  uid: ...
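
If you prefer a non-interactive alternative to kubectl edit, for example in a script, a merge patch can make the same change to the ConfigMap. This is a minimal sketch that assumes the test-cm ConfigMap and test-namespace created earlier; set_demo_value is a hypothetical helper name.

```shell
# Hypothetical helper: set data.key on the demo ConfigMap without opening an editor.
# Assumes kubectl targets the hub cluster and that the value needs no JSON escaping.
set_demo_value() {
  new_value="$1"
  kubectl patch configmap test-cm -n test-namespace \
    --type merge -p "{\"data\":{\"key\":\"${new_value}\"}}"
}

# Example call (commented out so that sourcing this file has no side effects):
# set_demo_value value2
```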

You should now see two versions of the resource snapshot, with indexes 0 and 1:

kubectl get clusterresourcesnapshots --show-labels

Your output should look similar to the following example:

NAME                           GEN   AGE    LABELS
example-placement-0-snapshot   1     2m6s   kubernetes-fleet.io/is-latest-snapshot=false,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
example-placement-1-snapshot   1     10s    kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=1

The latest label is now set on example-placement-1-snapshot, which contains the latest ConfigMap data:

kubectl get clusterresourcesnapshots example-placement-1-snapshot -o yaml

Your output should look similar to the following example:

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceSnapshot
metadata:
  annotations:
    kubernetes-fleet.io/number-of-enveloped-object: "0"
    kubernetes-fleet.io/number-of-resource-snapshots: "1"
    kubernetes-fleet.io/resource-hash: 10dd7a3d1e5f9849afe956cfbac080a60671ad771e9bda7dd34415f867c75648
  creationTimestamp: "2025-07-22T21:26:54Z"
  generation: 1
  labels:
    kubernetes-fleet.io/is-latest-snapshot: "true"
    kubernetes-fleet.io/parent-CRP: example-placement
    kubernetes-fleet.io/resource-index: "1"
  name: example-placement-1-snapshot
  ownerReferences:
  - apiVersion: placement.kubernetes-fleet.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ClusterResourcePlacement
    name: example-placement
    uid: e7d59513-b3b6-4904-864a-c70678fd6f65
  resourceVersion: "19994"
  uid: 79ca0bdc-0b0a-4c40-b136-7f701e85cdb6
spec:
  selectedResources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      labels:
        kubernetes.io/metadata.name: test-namespace
      name: test-namespace
    spec:
      finalizers:
      - kubernetes
  - apiVersion: v1
    data:
      key: value2 # latest value: value2, old value: value1
    kind: ConfigMap
    metadata:
      name: test-cm
      namespace: test-namespace

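Deploy a staged update strategy

A ClusterStagedUpdateStrategy defines the orchestration pattern that groups clusters into stages and specifies the rollout order. It selects member clusters by labels. For this demo, we create a strategy with two stages, staging and canary: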

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateStrategy
metadata:
  name: example-strategy
spec:
  stages:
    - name: staging
      labelSelector:
        matchLabels:
          environment: staging
      afterStageTasks:
        - type: TimedWait
          waitTime: 1m
    - name: canary
      labelSelector:
        matchLabels:
          environment: canary
      sortingLabelKey: order
      afterStageTasks:
        - type: Approval

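Deploy a staged update run to roll out the latest change

A ClusterStagedUpdateRun executes the rollout of a ClusterResourcePlacement by following a ClusterStagedUpdateStrategy. To trigger a staged update run for our ClusterResourcePlacement (CRP), we create a ClusterStagedUpdateRun that specifies the CRP name, the update run strategy name, and the latest resource snapshot index ("1"):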

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  name: example-run
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy

The staged update run is initialized and running:

kubectl get csur example-run

Your output should look similar to the following example:

NAME          PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run   example-placement   1                         0                       True                      7s

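After the one-minute TimedWait elapses, a more detailed view of the status looks like this:

kubectl get csur example-run -o yaml

Your output should look similar to the following example: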

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  ...
  name: example-run
  ...
spec:
  placementName: example-placement
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: example-strategy
status:
  conditions:
  - lastTransitionTime: "2025-07-22T21:28:08Z"
    message: ClusterStagedUpdateRun initialized successfully
    observedGeneration: 1
    reason: UpdateRunInitializedSuccessfully
    status: "True" # the updateRun is initialized successfully
    type: Initialized
  - lastTransitionTime: "2025-07-22T21:29:53Z"
    message: The updateRun is waiting for after-stage tasks in stage canary to complete
    observedGeneration: 1
    reason: UpdateRunWaiting
    status: "False" # the updateRun is still progressing and waiting for approval
    type: Progressing
  deletionStageStatus:
    clusters: [] # no clusters need to be cleaned up
    stageName: kubernetes-fleet.io/deleteStage
  policyObservedClusterCount: 3 # number of clusters to be updated
  policySnapshotIndexUsed: "0"
  stagedUpdateStrategySnapshot: # snapshot of the strategy used for this update run
    stages:
    - afterStageTasks:
      - type: TimedWait
        waitTime: 1m0s
      labelSelector:
        matchLabels:
          environment: staging
      name: staging
    - afterStageTasks:
      - type: Approval
      labelSelector:
        matchLabels:
          environment: canary
      name: canary
      sortingLabelKey: order
  stagesStatus: # detailed status for each stage
  - afterStageTaskStatus:
    - conditions:
      - lastTransitionTime: "2025-07-22T21:29:23Z"
        message: Wait time elapsed
        observedGeneration: 1
        reason: AfterStageTaskWaitTimeElapsed
        status: "True" # the wait after-stage task has completed
        type: WaitTimeElapsed
      type: TimedWait
    clusters:
    - clusterName: member2 # stage staging contains member2 cluster only
      conditions:
      - lastTransitionTime: "2025-07-22T21:28:08Z"
        message: Cluster update started
        observedGeneration: 1
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-07-22T21:28:23Z"
        message: Cluster update completed successfully
        observedGeneration: 1
        reason: ClusterUpdatingSucceeded
        status: "True" # member2 is updated successfully
        type: Succeeded
    conditions:
    - lastTransitionTime: "2025-07-22T21:28:23Z"
      message: All clusters in the stage are updated and after-stage tasks are completed
      observedGeneration: 1
      reason: StageUpdatingSucceeded
      status: "False"
      type: Progressing
    - lastTransitionTime: "2025-07-22T21:29:23Z"
      message: Stage update completed successfully
      observedGeneration: 1
      reason: StageUpdatingSucceeded
      status: "True" # stage staging has completed successfully
      type: Succeeded
    endTime: "2025-07-22T21:29:23Z"
    stageName: staging
    startTime: "2025-07-22T21:28:08Z"
  - afterStageTaskStatus:
    - approvalRequestName: example-run-canary # ClusterApprovalRequest name for this stage
      conditions:
      - lastTransitionTime: "2025-07-22T21:29:53Z"
        message: ClusterApprovalRequest is created
        observedGeneration: 1
        reason: AfterStageTaskApprovalRequestCreated
        status: "True"
        type: ApprovalRequestCreated
      type: Approval
    clusters:
    - clusterName: member3 # according to the labelSelector and sortingLabelKey, member3 is selected first in this stage
      conditions:
      - lastTransitionTime: "2025-07-22T21:29:23Z"
        message: Cluster update started
        observedGeneration: 1
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-07-22T21:29:38Z"
        message: Cluster update completed successfully
        observedGeneration: 1
        reason: ClusterUpdatingSucceeded
        status: "True" # member3 update is completed
        type: Succeeded
    - clusterName: member1 # member1 is selected after member3 because of order=2 label
      conditions:
      - lastTransitionTime: "2025-07-22T21:29:38Z"
        message: Cluster update started
        observedGeneration: 1
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-07-22T21:29:53Z"
        message: Cluster update completed successfully
        observedGeneration: 1
        reason: ClusterUpdatingSucceeded
        status: "True" # member1 update is completed
        type: Succeeded
    conditions:
    - lastTransitionTime: "2025-07-22T21:29:53Z"
      message: All clusters in the stage are updated, waiting for after-stage tasks
        to complete
      observedGeneration: 1
      reason: StageUpdatingWaiting
      status: "False" # stage canary is waiting for approval task completion
      type: Progressing
    stageName: canary
    startTime: "2025-07-22T21:29:23Z"

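We can see that the TimedWait period for the staging stage has elapsed, and that a ClusterApprovalRequest object was created for the approval task in the canary stage. We can check the generated ClusterApprovalRequest and see that nobody has approved it yet:

kubectl get clusterapprovalrequest

Your output should look similar to the following example: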

NAME                 UPDATE-RUN    STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-canary   example-run   canary                                 2m39s

Approve the staged update run

We can approve the ClusterApprovalRequest by creating a JSON patch file and applying it:

cat << EOF > approval.json
{
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
                "message": "lgtm",
                "observedGeneration": 1,
                "reason": "testPassed",
                "status": "True",
                "type": "Approved"
            }
        ]
    }
}
EOF

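Submit a patch request to approve the request, using the JSON file you created:

kubectl patch clusterapprovalrequests example-run-canary --type='merge' --subresource=status --patch-file approval.json

Then verify that the request is approved:

kubectl get clusterapprovalrequest

Your output should look similar to the following example: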

NAME                 UPDATE-RUN    STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-canary   example-run   canary   True       True               3m35s

The ClusterStagedUpdateRun can now proceed and complete:

kubectl get csur example-run

Your output should look similar to the following example:

NAME          PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run   example-placement   1                         0                       True          True        5m28s
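
In automation, instead of polling kubectl get, you could block until the run reports the Succeeded condition that appears in its status. The following sketch uses kubectl wait for that purpose. The helper name wait_for_update_run and the five-minute default timeout are assumptions made for illustration. Keep in mind that the command would time out while an Approval task is still pending.

```shell
# Hypothetical helper: block until a ClusterStagedUpdateRun reports Succeeded=True.
# The second argument is an optional timeout; 5m is an arbitrary default.
wait_for_update_run() {
  run_name="$1"
  wait_timeout="${2:-5m}"
  kubectl wait "clusterstagedupdaterun/${run_name}" \
    --for=condition=Succeeded --timeout="${wait_timeout}"
}

# Example call (commented out so that sourcing this file has no side effects):
# wait_for_update_run example-run 10m
```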

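Verify that the rollout completed

The ClusterResourcePlacement also shows that the rollout is complete and that resources are available on all member clusters:

kubectl get crp example-placement

Your output should look similar to the following example: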

NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
example-placement   1     True        1               True        1               8m55s

The ConfigMap test-cm should be deployed on all three member clusters and contain the latest data:

apiVersion: v1
data:
  key: value2
kind: ConfigMap
metadata:
  ...
  name: test-cm
  namespace: test-namespace
  ...
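
One way to check the value on each member cluster is to query the ConfigMap through a kubeconfig context per cluster. This is a sketch under the assumption that you have a kubeconfig context for each member cluster. The context names passed to the helper are placeholders, and check_members is a hypothetical helper name.

```shell
# Hypothetical helper: print data.key of the demo ConfigMap for each given kubeconfig context.
# The context names are assumptions; replace them with the contexts of your member clusters.
check_members() {
  for ctx in "$@"; do
    printf '%s: ' "$ctx"
    kubectl --context "$ctx" get configmap test-cm -n test-namespace \
      -o jsonpath='{.data.key}'
    printf '\n'
  done
}

# Example call (commented out so that sourcing this file has no side effects):
# check_members member1 member2 member3
```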

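Deploy a second ClusterStagedUpdateRun to roll back to a previous version

Suppose the workload admin wants to roll back the ConfigMap change and revert the value from value2 to value1. Instead of manually updating the ConfigMap on the hub, they can create a new ClusterStagedUpdateRun with a previous resource snapshot index, "0" in our case, and reuse the same strategy: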

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  name: example-run-2
spec:
  placementName: example-placement
  resourceSnapshotIndex: "0"
  stagedRolloutStrategyName: example-strategy

Let's check the new ClusterStagedUpdateRun:

kubectl get csur

Your output should look similar to the following example:

NAME            PLACEMENT           RESOURCE-SNAPSHOT-INDEX   POLICY-SNAPSHOT-INDEX   INITIALIZED   SUCCEEDED   AGE
example-run     example-placement   1                         0                       True          True        13m
example-run-2   example-placement   0                         0                       True                      9s

After the one-minute TimedWait elapses, we should see the ClusterApprovalRequest object that was created for the new ClusterStagedUpdateRun:

kubectl get clusterapprovalrequest

Your output should look similar to the following example:

NAME                   UPDATE-RUN      STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-2-canary   example-run-2   canary                                 75s
example-run-canary     example-run     canary   True       True               14m

To approve the new ClusterApprovalRequest object, reuse the same approval.json file to patch it:

kubectl patch clusterapprovalrequests example-run-2-canary --type='merge' --subresource=status --patch-file approval.json

Verify that the new object is approved:

kubectl get clusterapprovalrequest

Your output should look similar to the following example:

NAME                   UPDATE-RUN      STAGE    APPROVED   APPROVALACCEPTED   AGE
example-run-2-canary   example-run-2   canary   True       True               2m7s
example-run-canary     example-run     canary   True       True               15m

The ConfigMap test-cm should now be deployed on all three member clusters, with the data reverted to value1:

apiVersion: v1
data:
  key: value1
kind: ConfigMap
metadata:
  ...
  name: test-cm
  namespace: test-namespace
  ...

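Key differences between the approaches

Aspect                Cluster-scoped                               Namespace-scoped
Strategy resource     ClusterStagedUpdateStrategy                  StagedUpdateStrategy
Update run resource   ClusterStagedUpdateRun                       StagedUpdateRun
Target placement      ClusterResourcePlacement                     ResourcePlacement
Approval resource     ClusterApprovalRequest (short name: careq)   ApprovalRequest (short name: areq)
Snapshot resource     ClusterResourceSnapshot                      ResourceSnapshot
Scope                 Cluster-wide                                 Namespace-bound
Use case              Infrastructure rollouts                      Application rollouts
Permissions           Cluster-admin level                          Namespace level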

Clean up resources

When you finish this tutorial, you can clean up the resources that you created:

kubectl delete csur example-run example-run-2
kubectl delete csus example-strategy
kubectl delete crp example-placement
kubectl delete namespace test-namespace

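Next steps

In this article, you learned how to use staged update runs to orchestrate rollouts across member clusters. You created a staged update strategy, executed progressive rollouts, and rolled back to a previous version.

To learn more about staged update runs and related concepts, see the following resources: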