How to Roll Out and Roll Back Changes in Stages
This how-to guide demonstrates how to use staged update runs to roll out resources to member clusters in stages and to roll back resources to a previous version. We’ll cover both cluster-scoped and namespace-scoped approaches to serve different organizational needs.
Overview of Staged Update Approaches
Kubefleet provides two staged update approaches:
Cluster-Scoped: Use ClusterStagedUpdateRun with ClusterResourcePlacement (CRP) for fleet administrators managing infrastructure-level changes.
Namespace-Scoped: Use StagedUpdateRun with ResourcePlacement for application teams managing rollouts within their specific namespaces.
Prerequisite
This tutorial is based on a demo fleet environment with 3 member clusters:
| cluster name | labels |
|---|---|
| member1 | environment=canary, order=2 |
| member2 | environment=staging |
| member3 | environment=canary, order=1 |
We’ll demonstrate both cluster-scoped and namespace-scoped staged updates using different scenarios.
Cluster-Scoped Staged Updates
Setup for Cluster-Scoped Updates
To demonstrate cluster-scoped rollout and rollback behavior, we create a demo namespace and a sample configmap with simple data on the hub cluster. The namespace, along with the configmap, will be deployed to the member clusters.
kubectl create ns test-namespace
kubectl create cm test-cm --from-literal=key=value1 -n test-namespace
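Verify both resources exist on the hub cluster:
kubectl get cm test-cm -n test-namespace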
Now we create a ClusterResourcePlacement to deploy the resources:
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
name: example-placement
spec:
resourceSelectors:
- group: ""
kind: Namespace
name: test-namespace
version: v1
policy:
placementType: PickAll
strategy:
type: External
EOF
Note that spec.strategy.type is set to External to allow rollout to be triggered by a ClusterStagedUpdateRun.
All clusters should be scheduled, since we use the PickAll policy, but no resources should be deployed on the member clusters yet because we haven’t created a ClusterStagedUpdateRun. The CRP is therefore not AVAILABLE yet.
kubectl get crp example-placement
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE
example-placement 1 True 1 8s
Check cluster resource snapshot versions
Fleet keeps a list of resource snapshots for version control and auditing (for more details, please refer to the api-reference).
To check current resource snapshots:
kubectl get clusterresourcesnapshots --show-labels
NAME GEN AGE LABELS
example-placement-0-snapshot 1 7m31s kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
We only have one version of the snapshot. It is the current latest (kubernetes-fleet.io/is-latest-snapshot=true) and has resource-index 0 (kubernetes-fleet.io/resource-index=0).
Now we modify our configmap with a new value value2:
kubectl edit cm test-cm -n test-namespace
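Alternatively, if you prefer a non-interactive edit, the same change can be applied with a one-line merge patch (a minimal sketch; adjust the key and value to your own data):
kubectl patch cm test-cm -n test-namespace --type merge -p '{"data":{"key":"value2"}}'
Either way, verify the change: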
kubectl get configmap test-cm -n test-namespace -o yaml
apiVersion: v1
data:
key: value2 # value updated here, old value: value1
kind: ConfigMap
metadata:
creationTimestamp: ...
name: test-cm
namespace: test-namespace
resourceVersion: ...
uid: ...
There are now two versions of resource snapshots, with indexes 0 and 1 respectively:
kubectl get clusterresourcesnapshots --show-labels
NAME GEN AGE LABELS
example-placement-0-snapshot 1 17m kubernetes-fleet.io/is-latest-snapshot=false,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=0
example-placement-1-snapshot 1 2m2s kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=example-placement,kubernetes-fleet.io/resource-index=1
The is-latest-snapshot label is now set on example-placement-1-snapshot, which contains the latest configmap data:
kubectl get clusterresourcesnapshots example-placement-1-snapshot -o yaml
apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceSnapshot
metadata:
...
labels:
kubernetes-fleet.io/is-latest-snapshot: "true"
kubernetes-fleet.io/parent-CRP: example-placement
kubernetes-fleet.io/resource-index: "1"
name: example-placement-1-snapshot
...
spec:
selectedResources:
- apiVersion: v1
kind: Namespace
metadata:
labels:
kubernetes.io/metadata.name: test-namespace
name: test-namespace
spec:
finalizers:
- kubernetes
- apiVersion: v1
data:
key: value2 # latest value: value2, old value: value1
kind: ConfigMap
metadata:
name: test-cm
namespace: test-namespace
Deploy a ClusterStagedUpdateStrategy
A ClusterStagedUpdateStrategy defines the orchestration pattern that groups clusters into stages and specifies the rollout sequence.
Each stage selects member clusters by labels. For our demonstration, we create one with two stages:
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateStrategy
metadata:
name: example-strategy
spec:
stages:
- name: staging
labelSelector:
matchLabels:
environment: staging
afterStageTasks:
- type: TimedWait
waitTime: 1m
- name: canary
labelSelector:
matchLabels:
environment: canary
sortingLabelKey: order
afterStageTasks:
- type: Approval
EOF
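You can confirm the strategy exists on the hub before referencing it from an update run:
kubectl get clusterstagedupdatestrategy example-strategy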
Deploy a ClusterStagedUpdateRun to roll out the latest change
A ClusterStagedUpdateRun executes the rollout of a ClusterResourcePlacement following a ClusterStagedUpdateStrategy. To trigger the staged update run for our CRP, we create a ClusterStagedUpdateRun specifying the CRP name, updateRun strategy name, and the latest resource snapshot index (“1”):
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
name: example-run
spec:
placementName: example-placement
resourceSnapshotIndex: "1"
stagedRolloutStrategyName: example-strategy
EOF
The staged update run is initialized and running:
kubectl get csur example-run
NAME PLACEMENT RESOURCE-SNAPSHOT POLICY-SNAPSHOT INITIALIZED SUCCEEDED AGE
example-run example-placement 1 0 True 44s
A more detailed look at the status:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
...
name: example-run
...
spec:
placementName: example-placement
resourceSnapshotIndex: "1"
stagedRolloutStrategyName: example-strategy
status:
conditions:
- lastTransitionTime: ...
message: ClusterStagedUpdateRun initialized successfully
observedGeneration: 1
reason: UpdateRunInitializedSuccessfully
status: "True" # the updateRun is initialized successfully
type: Initialized
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: UpdateRunStarted
status: "True"
type: Progressing # the updateRun is still running
deletionStageStatus:
clusters: [] # no clusters need to be cleaned up
stageName: kubernetes-fleet.io/deleteStage
policyObservedClusterCount: 3 # number of clusters to be updated
policySnapshotIndexUsed: "0"
stagedUpdateStrategySnapshot: # snapshot of the strategy
stages:
- afterStageTasks:
- type: TimedWait
waitTime: 1m0s
labelSelector:
matchLabels:
environment: staging
name: staging
- afterStageTasks:
- type: Approval
labelSelector:
matchLabels:
environment: canary
name: canary
sortingLabelKey: order
stagesStatus: # detailed status for each stage
- afterStageTaskStatus:
- conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: AfterStageTaskWaitTimeElapsed
status: "True" # the wait after-stage task has completed
type: WaitTimeElapsed
type: TimedWait
clusters:
- clusterName: member2 # stage staging contains member2 cluster only
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingStarted
status: "True"
type: Started
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingSucceeded
status: "True" # member2 is updated successfully
type: Succeeded
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: StageUpdatingWaiting
status: "False"
type: Progressing
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: StageUpdatingSucceeded
status: "True" # stage staging has completed successfully
type: Succeeded
endTime: ...
stageName: staging
startTime: ...
- afterStageTaskStatus:
- approvalRequestName: example-run-canary # ClusterApprovalRequest name for this stage
type: Approval
clusters:
- clusterName: member3 # per the labelSelector and sortingLabelKey (order=1), member3 is selected first in this stage
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingStarted
status: "True"
type: Started
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingSucceeded
status: "True" # member3 update is completed
type: Succeeded
- clusterName: member1 # member1 is selected after member3 because of order=2 label
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingStarted
status: "True" # member1 update has not finished yet
type: Started
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: StageUpdatingStarted
status: "True" # stage canary is still executing
type: Progressing
stageName: canary
startTime: ...
Wait a little longer, and we can see that stage canary has finished updating its clusters and is waiting for the Approval task.
We can check the generated ClusterApprovalRequest, which is not yet approved:
kubectl get clusterapprovalrequest
NAME UPDATE-RUN STAGE APPROVED APPROVALACCEPTED AGE
example-run-canary example-run canary 2m2s
We can approve the ClusterApprovalRequest by patching its status:
kubectl patch clusterapprovalrequests example-run-canary --type=merge -p '{"status":{"conditions":[{"type":"Approved","status":"True","reason":"lgtm","message":"lgtm","lastTransitionTime":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","observedGeneration":1}]}}' --subresource=status
clusterapprovalrequest.placement.kubernetes-fleet.io/example-run-canary patched
This can be done equivalently by creating a JSON patch file and applying it:
cat << EOF > approval.json
{
  "status": {
    "conditions": [
      {
        "lastTransitionTime": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
        "message": "lgtm",
        "observedGeneration": 1,
        "reason": "lgtm",
        "status": "True",
        "type": "Approved"
      }
    ]
  }
}
EOF
kubectl patch clusterapprovalrequests example-run-canary --type='merge' --subresource=status --patch-file approval.json
Then verify it’s approved:
kubectl get clusterapprovalrequest
NAME UPDATE-RUN STAGE APPROVED APPROVALACCEPTED AGE
example-run-canary example-run canary True True 2m30s
The updateRun is now able to proceed and complete:
kubectl get csur example-run
NAME PLACEMENT RESOURCE-SNAPSHOT POLICY-SNAPSHOT INITIALIZED SUCCEEDED AGE
example-run example-placement 1 0 True True 4m22s
The CRP also shows that the rollout has completed and resources are available on all member clusters:
kubectl get crp example-placement
NAME GEN SCHEDULED SCHEDULED-GEN AVAILABLE AVAILABLE-GEN AGE
example-placement 1 True 1 True 1 134m
The configmap test-cm should now be deployed on all 3 member clusters with the latest data:
data:
key: value2
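To confirm, you can query each member cluster directly. This sketch assumes your kubeconfig has one context per member named member1, member2, and member3 (the context names will differ in your environment); each iteration should print value2:
for ctx in member1 member2 member3; do
  kubectl --context "$ctx" get cm test-cm -n test-namespace -o jsonpath='{.data.key}{"\n"}'
done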
Deploy a second ClusterStagedUpdateRun to roll back to a previous version
Now suppose the workload admin wants to roll back the configmap change, reverting the value value2 back to value1.
Instead of manually updating the configmap on the hub, they can create a new ClusterStagedUpdateRun with a previous resource snapshot index (“0” in our case), reusing the same strategy:
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
name: example-run-2
spec:
placementName: example-placement
resourceSnapshotIndex: "0"
stagedRolloutStrategyName: example-strategy
EOF
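As before, you can watch the run progress through the stages; remember to approve the new ClusterApprovalRequest (example-run-2-canary) when the canary stage finishes updating:
kubectl get csur example-run-2 -w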
Following the same steps as for the first updateRun, the second updateRun should also succeed. The complete status is shown below:
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
...
name: example-run-2
...
spec:
placementName: example-placement
resourceSnapshotIndex: "0"
stagedRolloutStrategyName: example-strategy
status:
conditions:
- lastTransitionTime: ...
message: ClusterStagedUpdateRun initialized successfully
observedGeneration: 1
reason: UpdateRunInitializedSuccessfully
status: "True"
type: Initialized
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: UpdateRunStarted
status: "True"
type: Progressing
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: UpdateRunSucceeded # updateRun succeeded
status: "True"
type: Succeeded
deletionStageStatus:
clusters: []
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: StageUpdatingStarted
status: "True"
type: Progressing
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: StageUpdatingSucceeded
status: "True" # no clusters in the deletion stage, it completes directly
type: Succeeded
endTime: ...
stageName: kubernetes-fleet.io/deleteStage
startTime: ...
policyObservedClusterCount: 3
policySnapshotIndexUsed: "0"
stagedUpdateStrategySnapshot:
stages:
- afterStageTasks:
- type: TimedWait
waitTime: 1m0s
labelSelector:
matchLabels:
environment: staging
name: staging
- afterStageTasks:
- type: Approval
labelSelector:
matchLabels:
environment: canary
name: canary
sortingLabelKey: order
stagesStatus:
- afterStageTaskStatus:
- conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: AfterStageTaskWaitTimeElapsed
status: "True"
type: WaitTimeElapsed
type: TimedWait
clusters:
- clusterName: member2
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingStarted
status: "True"
type: Started
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingSucceeded
status: "True"
type: Succeeded
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: StageUpdatingWaiting
status: "False"
type: Progressing
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: StageUpdatingSucceeded
status: "True"
type: Succeeded
endTime: ...
stageName: staging
startTime: ...
- afterStageTaskStatus:
- approvalRequestName: example-run-2-canary
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: AfterStageTaskApprovalRequestCreated
status: "True"
type: ApprovalRequestCreated
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: AfterStageTaskApprovalRequestApproved
status: "True"
type: ApprovalRequestApproved
type: Approval
clusters:
- clusterName: member3
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingStarted
status: "True"
type: Started
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingSucceeded
status: "True"
type: Succeeded
- clusterName: member1
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingStarted
status: "True"
type: Started
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: ClusterUpdatingSucceeded
status: "True"
type: Succeeded
conditions:
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: StageUpdatingWaiting
status: "False"
type: Progressing
- lastTransitionTime: ...
message: ""
observedGeneration: 1
reason: StageUpdatingSucceeded
status: "True"
type: Succeeded
endTime: ...
stageName: canary
startTime: ...
The configmap test-cm should now be reverted on all 3 member clusters to the old data:
data:
key: value1
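The same per-member check as before (again assuming one kubeconfig context per member cluster) should now print value1, for example:
kubectl --context member1 get cm test-cm -n test-namespace -o jsonpath='{.data.key}{"\n"}'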
Namespace-Scoped Staged Updates
Namespace-scoped staged updates allow application teams to manage rollouts independently within their namespaces using StagedUpdateRun and StagedUpdateStrategy resources.
Setup for Namespace-Scoped Updates
Let’s demonstrate namespace-scoped staged updates by deploying an application within a specific namespace. Create a namespace, a deployment, and a service:
kubectl create ns my-app-namespace
kubectl create deployment web-app --image=nginx:1.20 --port=80 -n my-app-namespace
kubectl expose deployment web-app --port=80 --target-port=80 -n my-app-namespace
Create a namespace-scoped ResourcePlacement to deploy the application:
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ResourcePlacement
metadata:
name: web-app-placement
namespace: my-app-namespace
spec:
resourceSelectors:
- group: "apps"
kind: Deployment
name: web-app
version: v1
- group: ""
kind: Service
name: web-app
version: v1
policy:
placementType: PickAll
strategy:
type: External # enables namespace-scoped staged updates
EOF
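As in the cluster-scoped example, all clusters should be scheduled, but nothing is deployed until a StagedUpdateRun is created:
kubectl get resourceplacement web-app-placement -n my-app-namespace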
Check namespace-scoped resource snapshots
Check the resource snapshots for the namespace-scoped placement:
kubectl get resourcesnapshots -n my-app-namespace
NAME GEN AGE LABELS
web-app-placement-0-snapshot 1 63s kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=web-app-placement,kubernetes-fleet.io/resource-index=0
Update the deployment to a new version:
kubectl set image deployment/web-app web-app=nginx:1.21 -n my-app-namespace
Verify the new snapshot is created:
kubectl get resourcesnapshots -n my-app-namespace --show-labels
NAME GEN AGE LABELS
web-app-placement-0-snapshot 1 263s kubernetes-fleet.io/is-latest-snapshot=false,kubernetes-fleet.io/parent-CRP=web-app-placement,kubernetes-fleet.io/resource-index=0
web-app-placement-1-snapshot 1 23s kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=web-app-placement,kubernetes-fleet.io/resource-index=1
Deploy a StagedUpdateStrategy
Create a namespace-scoped staged update strategy:
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateStrategy
metadata:
name: app-rollout-strategy
namespace: my-app-namespace
spec:
stages:
- name: dev-clusters
labelSelector:
matchLabels:
environment: staging
afterStageTasks:
- type: TimedWait
waitTime: 30s
- name: prod-clusters
labelSelector:
matchLabels:
environment: canary
sortingLabelKey: order
afterStageTasks:
- type: Approval
EOF
Deploy a StagedUpdateRun
Create a namespace-scoped staged update run to roll out the new image version:
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateRun
metadata:
name: web-app-rollout-v1-21
namespace: my-app-namespace
spec:
placementName: web-app-placement
resourceSnapshotIndex: "1" # Latest snapshot with nginx:1.21
stagedRolloutStrategyName: app-rollout-strategy
EOF
Monitor namespace-scoped staged rollout
Check the status of the staged update run:
kubectl describe sur web-app-rollout-v1-21 -n my-app-namespace
Wait for the first stage to complete, then check for approval requests:
kubectl get approvalrequests -n my-app-namespace
Approve the request for the prod-clusters stage so the run can complete:
kubectl patch approvalrequests web-app-rollout-v1-21-prod-clusters -n my-app-namespace --type='merge' \
-p '{"status":{"conditions":[{"type":"Approved","status":"True","reason":"approved","message":"approved","lastTransitionTime":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","observedGeneration":1}]}}' \
--subresource=status
Verify the rollout completion:
kubectl get sur web-app-rollout-v1-21 -n my-app-namespace
kubectl get resourceplacement web-app-placement -n my-app-namespace
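Optionally, confirm the new image on a member cluster. This assumes a kubeconfig context named member1 for the member cluster, as in the earlier sketch; the command should print nginx:1.21:
kubectl --context member1 get deployment web-app -n my-app-namespace -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'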
Rollback with namespace-scoped staged updates
To roll back to the previous version (nginx:1.20), create another staged update run referencing the earlier snapshot:
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateRun
metadata:
name: web-app-rollback-v1-20
namespace: my-app-namespace
spec:
placementName: web-app-placement
resourceSnapshotIndex: "0" # Previous snapshot with nginx:1.20
stagedRolloutStrategyName: app-rollout-strategy
EOF
Follow the same monitoring and approval process as above to complete the rollback.
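For reference, the approval step for the rollback run would look like this; the request name web-app-rollback-v1-20-prod-clusters is an assumption based on the updateRun-name plus stage-name convention seen above, so check the actual name first:
kubectl get approvalrequests -n my-app-namespace
kubectl patch approvalrequests web-app-rollback-v1-20-prod-clusters -n my-app-namespace --type='merge' \
-p '{"status":{"conditions":[{"type":"Approved","status":"True","reason":"approved","message":"approved","lastTransitionTime":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","observedGeneration":1}]}}' \
--subresource=status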
Key Differences Summary
| Aspect | Cluster-Scoped | Namespace-Scoped |
|---|---|---|
| Strategy Resource | ClusterStagedUpdateStrategy | StagedUpdateStrategy |
| UpdateRun Resource | ClusterStagedUpdateRun | StagedUpdateRun |
| Target Placement | ClusterResourcePlacement | ResourcePlacement |
| Approval Resource | ClusterApprovalRequest (short name: careq) | ApprovalRequest (short name: areq) |
| Scope | Cluster-wide | Namespace-bound |
| Use Case | Infrastructure rollouts | Application rollouts |
| Permissions | Cluster-admin level | Namespace-level |