Tutorials

Guide for integrating KubeFleet with your development and operations workflows

This guide will help you understand how KubeFleet can seamlessly integrate with your development and operations workflows. Follow the instructions provided to get the most out of KubeFleet’s features. Below is a walkthrough of all the tutorials currently available.

1 - Resource Migration Across Clusters

Migrating Applications to Another Cluster When a Cluster Goes Down

This tutorial demonstrates how to move applications from clusters that have gone down to other operational clusters using Fleet.

Scenario

Your fleet consists of the following clusters:

  1. Member Cluster 1 & Member Cluster 2 (WestUS, 1 node each)
  2. Member Cluster 3 (EastUS2, 2 nodes)
  3. Member Cluster 4 & Member Cluster 5 (WestEurope, 3 nodes each)

Due to certain circumstances, Member Cluster 1 and Member Cluster 2 are down, requiring you to migrate your applications from these clusters to other operational ones.

Current Application Resources

The following resources are currently deployed in Member Cluster 1 and Member Cluster 2 by the ClusterResourcePlacement:

Service

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: test-app
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: LoadBalancer

Summary:

  • This defines a Kubernetes Service named nginx-service in the test-app namespace.
  • The service is of type LoadBalancer, meaning it exposes the application to the internet.
  • It targets pods with the label app: nginx and forwards traffic to port 80 on the pods.

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test-app
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.16.1 
        ports:
        - containerPort: 80

Summary:

  • This defines a Kubernetes Deployment named nginx-deployment in the test-app namespace.
  • It creates 2 replicas of the nginx pod, each running the nginx:1.16.1 image.
  • The deployment ensures that the specified number of pods (replicas) are running and available.
  • The pods are labeled with app: nginx and expose port 80.
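
If you want to confirm these resources on one of the selected member clusters, a quick check with kubectl is enough. The commands below assume your kubeconfig already has a context named after the member cluster (aks-member-1 here, matching the cluster names used later in this tutorial):

kubectl config use-context aks-member-1
kubectl get deploy,svc -n test-app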

ClusterResourcePlacement

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourcePlacement
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"placement.kubernetes-fleet.io/v1","kind":"ClusterResourcePlacement","metadata":{"annotations":{},"name":"crp-migration"},"spec":{"policy":{"affinity":{"clusterAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"clusterSelectorTerms":[{"labelSelector":{"matchLabels":{"fleet.azure.com/location":"westus"}}}]}}},"numberOfClusters":2,"placementType":"PickN"},"resourceSelectors":[{"group":"","kind":"Namespace","name":"test-app","version":"v1"}],"revisionHistoryLimit":10,"strategy":{"type":"RollingUpdate"}}}
  creationTimestamp: "2024-07-25T21:27:35Z"
  finalizers:
    - kubernetes-fleet.io/crp-cleanup
    - kubernetes-fleet.io/scheduler-cleanup
  generation: 1
  name: crp-migration
  resourceVersion: "22177519"
  uid: 0683cfaa-df24-4b2c-8a3d-07031692da8f
spec:
  policy:
    affinity:
      clusterAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          clusterSelectorTerms:
            - labelSelector:
                matchLabels:
                  fleet.azure.com/location: westus
    numberOfClusters: 2
    placementType: PickN
  resourceSelectors:
    - group: ""
      kind: Namespace
      name: test-app
      version: v1
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
status:
  conditions:
    - lastTransitionTime: "2024-07-25T21:27:35Z"
      message: found all cluster needed as specified by the scheduling policy, found
        2 cluster(s)
      observedGeneration: 1
      reason: SchedulingPolicyFulfilled
      status: "True"
      type: ClusterResourcePlacementScheduled
    - lastTransitionTime: "2024-07-25T21:27:35Z"
      message: All 2 cluster(s) start rolling out the latest resource
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: ClusterResourcePlacementRolloutStarted
    - lastTransitionTime: "2024-07-25T21:27:35Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: ClusterResourcePlacementOverridden
    - lastTransitionTime: "2024-07-25T21:27:35Z"
      message: Works(s) are succcesfully created or updated in 2 target cluster(s)'
        namespaces
      observedGeneration: 1
      reason: WorkSynchronized
      status: "True"
      type: ClusterResourcePlacementWorkSynchronized
    - lastTransitionTime: "2024-07-25T21:27:35Z"
      message: The selected resources are successfully applied to 2 cluster(s)
      observedGeneration: 1
      reason: ApplySucceeded
      status: "True"
      type: ClusterResourcePlacementApplied
    - lastTransitionTime: "2024-07-25T21:27:45Z"
      message: The selected resources in 2 cluster(s) are available now
      observedGeneration: 1
      reason: ResourceAvailable
      status: "True"
      type: ClusterResourcePlacementAvailable
  observedResourceIndex: "0"
  placementStatuses:
    - clusterName: aks-member-2
      conditions:
        - lastTransitionTime: "2024-07-25T21:27:35Z"
          message: 'Successfully scheduled resources for placement in "aks-member-2"
        (affinity score: 0, topology spread score: 0): picked by scheduling policy'
          observedGeneration: 1
          reason: Scheduled
          status: "True"
          type: Scheduled
        - lastTransitionTime: "2024-07-25T21:27:35Z"
          message: Detected the new changes on the resources and started the rollout process
          observedGeneration: 1
          reason: RolloutStarted
          status: "True"
          type: RolloutStarted
        - lastTransitionTime: "2024-07-25T21:27:35Z"
          message: No override rules are configured for the selected resources
          observedGeneration: 1
          reason: NoOverrideSpecified
          status: "True"
          type: Overridden
        - lastTransitionTime: "2024-07-25T21:27:35Z"
          message: All of the works are synchronized to the latest
          observedGeneration: 1
          reason: AllWorkSynced
          status: "True"
          type: WorkSynchronized
        - lastTransitionTime: "2024-07-25T21:27:35Z"
          message: All corresponding work objects are applied
          observedGeneration: 1
          reason: AllWorkHaveBeenApplied
          status: "True"
          type: Applied
        - lastTransitionTime: "2024-07-25T21:27:45Z"
          message: All corresponding work objects are available
          observedGeneration: 1
          reason: AllWorkAreAvailable
          status: "True"
          type: Available
    - clusterName: aks-member-1
      conditions:
        - lastTransitionTime: "2024-07-25T21:27:35Z"
          message: 'Successfully scheduled resources for placement in "aks-member-1"
        (affinity score: 0, topology spread score: 0): picked by scheduling policy'
          observedGeneration: 1
          reason: Scheduled
          status: "True"
          type: Scheduled
        - lastTransitionTime: "2024-07-25T21:27:35Z"
          message: Detected the new changes on the resources and started the rollout process
          observedGeneration: 1
          reason: RolloutStarted
          status: "True"
          type: RolloutStarted
        - lastTransitionTime: "2024-07-25T21:27:35Z"
          message: No override rules are configured for the selected resources
          observedGeneration: 1
          reason: NoOverrideSpecified
          status: "True"
          type: Overridden
        - lastTransitionTime: "2024-07-25T21:27:35Z"
          message: All of the works are synchronized to the latest
          observedGeneration: 1
          reason: AllWorkSynced
          status: "True"
          type: WorkSynchronized
        - lastTransitionTime: "2024-07-25T21:27:35Z"
          message: All corresponding work objects are applied
          observedGeneration: 1
          reason: AllWorkHaveBeenApplied
          status: "True"
          type: Applied
        - lastTransitionTime: "2024-07-25T21:27:45Z"
          message: All corresponding work objects are available
          observedGeneration: 1
          reason: AllWorkAreAvailable
          status: "True"
          type: Available
  selectedResources:
    - kind: Namespace
      name: test-app
      version: v1
    - group: apps
      kind: Deployment
      name: nginx-deployment
      namespace: test-app
      version: v1
    - kind: Service
      name: nginx-service
      namespace: test-app
      version: v1

Summary:

  • This defines a ClusterResourcePlacement named crp-migration.
  • The PickN placement policy selects 2 clusters based on the label fleet.azure.com/location: westus. Consequently, it chooses Member Cluster 1 and Member Cluster 2, as they are located in WestUS.
  • It targets resources in the test-app namespace.
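
Before migrating, you can double-check which member clusters carry the fleet.azure.com/location=westus label (and are therefore eligible for this placement) with a label-filtered query against the hub cluster:

kubectl get memberclusters -l fleet.azure.com/location=westus --show-labels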

Migrating Applications from Downed Clusters to Other Operational Clusters

When the clusters in WestUS go down, update the ClusterResourcePlacement (CRP) to migrate the applications to other clusters. In this tutorial, we will move them to Member Cluster 4 and Member Cluster 5, which are located in WestEurope.

Update the CRP for Migration to Clusters in WestEurope

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourcePlacement
metadata:
  name: crp-migration
spec:
  policy:
    placementType: PickN
    numberOfClusters: 2
    affinity:
      clusterAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          clusterSelectorTerms:
            - labelSelector:
                matchLabels:
                  fleet.azure.com/location: westeurope  # updated label
  resourceSelectors:
  - group: ""
    kind: Namespace
    name: test-app
    version: v1
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate

Update the crp.yaml to reflect the new region and apply it:

kubectl apply -f crp.yaml

Results

After applying the updated crp.yaml, the Fleet will schedule the application on the available clusters in WestEurope. You can check the status of the CRP to ensure that the application has been successfully migrated and is running on the newly selected clusters:

kubectl get crp crp-migration -o yaml

You should see a status indicating that the application is now running in the clusters located in WestEurope, similar to the following:

CRP Status

...
status:
  conditions:
    - lastTransitionTime: "2024-07-25T21:36:02Z"
      message: found all cluster needed as specified by the scheduling policy, found
        2 cluster(s)
      observedGeneration: 2
      reason: SchedulingPolicyFulfilled
      status: "True"
      type: ClusterResourcePlacementScheduled
    - lastTransitionTime: "2024-07-25T21:36:14Z"
      message: All 2 cluster(s) start rolling out the latest resource
      observedGeneration: 2
      reason: RolloutStarted
      status: "True"
      type: ClusterResourcePlacementRolloutStarted
    - lastTransitionTime: "2024-07-25T21:36:14Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 2
      reason: NoOverrideSpecified
      status: "True"
      type: ClusterResourcePlacementOverridden
    - lastTransitionTime: "2024-07-25T21:36:14Z"
      message: Works(s) are succcesfully created or updated in 2 target cluster(s)'
        namespaces
      observedGeneration: 2
      reason: WorkSynchronized
      status: "True"
      type: ClusterResourcePlacementWorkSynchronized
    - lastTransitionTime: "2024-07-25T21:36:14Z"
      message: The selected resources are successfully applied to 2 cluster(s)
      observedGeneration: 2
      reason: ApplySucceeded
      status: "True"
      type: ClusterResourcePlacementApplied
    - lastTransitionTime: "2024-07-25T21:36:14Z"
      message: The selected resources in 2 cluster(s) are available now
      observedGeneration: 2
      reason: ResourceAvailable
      status: "True"
      type: ClusterResourcePlacementAvailable
  observedResourceIndex: "0"
  placementStatuses:
    - clusterName: aks-member-5
      conditions:
        - lastTransitionTime: "2024-07-25T21:36:02Z"
          message: 'Successfully scheduled resources for placement in "aks-member-5" (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
          observedGeneration: 2
          reason: Scheduled
          status: "True"
          type: Scheduled
        - lastTransitionTime: "2024-07-25T21:36:14Z"
          message: Detected the new changes on the resources and started the rollout process
          observedGeneration: 2
          reason: RolloutStarted
          status: "True"
          type: RolloutStarted
        - lastTransitionTime: "2024-07-25T21:36:14Z"
          message: No override rules are configured for the selected resources
          observedGeneration: 2
          reason: NoOverrideSpecified
          status: "True"
          type: Overridden
        - lastTransitionTime: "2024-07-25T21:36:14Z"
          message: All of the works are synchronized to the latest
          observedGeneration: 2
          reason: AllWorkSynced
          status: "True"
          type: WorkSynchronized
        - lastTransitionTime: "2024-07-25T21:36:14Z"
          message: All corresponding work objects are applied
          observedGeneration: 2
          reason: AllWorkHaveBeenApplied
          status: "True"
          type: Applied
        - lastTransitionTime: "2024-07-25T21:36:14Z"
          message: All corresponding work objects are available
          observedGeneration: 2
          reason: AllWorkAreAvailable
          status: "True"
          type: Available
    - clusterName: aks-member-4
      conditions:
        - lastTransitionTime: "2024-07-25T21:36:02Z"
          message: 'Successfully scheduled resources for placement in "aks-member-4" (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
          observedGeneration: 2
          reason: Scheduled
          status: "True"
          type: Scheduled
        - lastTransitionTime: "2024-07-25T21:36:14Z"
          message: Detected the new changes on the resources and started the rollout process
          observedGeneration: 2
          reason: RolloutStarted
          status: "True"
          type: RolloutStarted
        - lastTransitionTime: "2024-07-25T21:36:14Z"
          message: No override rules are configured for the selected resources
          observedGeneration: 2
          reason: NoOverrideSpecified
          status: "True"
          type: Overridden
        - lastTransitionTime: "2024-07-25T21:36:14Z"
          message: All of the works are synchronized to the latest
          observedGeneration: 2
          reason: AllWorkSynced
          status: "True"
          type: WorkSynchronized
        - lastTransitionTime: "2024-07-25T21:36:14Z"
          message: All corresponding work objects are applied
          observedGeneration: 2
          reason: AllWorkHaveBeenApplied
          status: "True"
          type: Applied
        - lastTransitionTime: "2024-07-25T21:36:14Z"
          message: All corresponding work objects are available
          observedGeneration: 2
          reason: AllWorkAreAvailable
          status: "True"
          type: Available
  selectedResources:
    - kind: Namespace
      name: test-app
      version: v1
    - group: apps
      kind: Deployment
      name: nginx-deployment
      namespace: test-app
      version: v1
    - kind: Service
      name: nginx-service
      namespace: test-app
      version: v1
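
If you only need the names of the clusters the CRP landed on, rather than the full status, a jsonpath query is a lightweight alternative (a sketch; the field paths follow the status shown above):

kubectl get crp crp-migration -o jsonpath='{range .status.placementStatuses[*]}{.clusterName}{"\n"}{end}'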

Conclusion

This tutorial demonstrated how to migrate applications using Fleet when clusters in one region go down. By updating the ClusterResourcePlacement, you can ensure that your applications are moved to available clusters in another region, maintaining availability and resilience.

2 - Resource Migration With Overrides

Migrating Applications to Another Cluster For Higher Availability With Overrides

This tutorial shows how to migrate applications from clusters with lower availability to those with higher availability, while also scaling up the number of replicas, using Fleet.

Scenario

Your fleet consists of the following clusters:

  1. Member Cluster 1 & Member Cluster 2 (WestUS, 1 node each)
  2. Member Cluster 3 (EastUS2, 2 nodes)
  3. Member Cluster 4 & Member Cluster 5 (WestEurope, 3 nodes each)

Due to a sudden increase in traffic and resource demands in your WestUS clusters, you need to migrate your applications to clusters in EastUS2 or WestEurope that have higher availability and can better handle the increased load.

Current Application Resources

The following resources are currently deployed in the WestUS clusters:

Service

Note: Service test file located here.

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: test-app
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  type: LoadBalancer

Summary:

  • This defines a Kubernetes Service named nginx-service in the test-app namespace.
  • The service is of type LoadBalancer, meaning it exposes the application to the internet.
  • It targets pods with the label app: nginx and forwards traffic to port 80 on the pods.

Deployment

Note: Deployment test file located here.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test-app
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.16.1 
        ports:
        - containerPort: 80

Note: The current deployment has 2 replicas.

Summary:

  • This defines a Kubernetes Deployment named nginx-deployment in the test-app namespace.
  • It creates 2 replicas of the nginx pod, each running the nginx:1.16.1 image.
  • The deployment ensures that the specified number of pods (replicas) are running and available.
  • The pods are labeled with app: nginx and expose port 80.

ClusterResourcePlacement

Note: CRP Availability test file located here

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourcePlacement
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"placement.kubernetes-fleet.io/v1","kind":"ClusterResourcePlacement","metadata":{"annotations":{},"name":"crp-availability"},"spec":{"policy":{"affinity":{"clusterAffinity":{"requiredDuringSchedulingIgnoredDuringExecution":{"clusterSelectorTerms":[{"labelSelector":{"matchLabels":{"fleet.azure.com/location":"westus"}}}]}}},"numberOfClusters":2,"placementType":"PickN"},"resourceSelectors":[{"group":"","kind":"Namespace","name":"test-app","version":"v1"}],"revisionHistoryLimit":10,"strategy":{"type":"RollingUpdate"}}}
  creationTimestamp: "2024-07-25T23:00:53Z"
  finalizers:
    - kubernetes-fleet.io/crp-cleanup
    - kubernetes-fleet.io/scheduler-cleanup
  generation: 1
  name: crp-availability
  resourceVersion: "22228766"
  uid: 58dbb5d1-4afa-479f-bf57-413328aa61bd
spec:
  policy:
    affinity:
      clusterAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          clusterSelectorTerms:
            - labelSelector:
                matchLabels:
                  fleet.azure.com/location: westus
    numberOfClusters: 2
    placementType: PickN
  resourceSelectors:
    - group: ""
      kind: Namespace
      name: test-app
      version: v1
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate
status:
  conditions:
    - lastTransitionTime: "2024-07-25T23:00:53Z"
      message: found all cluster needed as specified by the scheduling policy, found
        2 cluster(s)
      observedGeneration: 1
      reason: SchedulingPolicyFulfilled
      status: "True"
      type: ClusterResourcePlacementScheduled
    - lastTransitionTime: "2024-07-25T23:00:53Z"
      message: All 2 cluster(s) start rolling out the latest resource
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: ClusterResourcePlacementRolloutStarted
    - lastTransitionTime: "2024-07-25T23:00:53Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: ClusterResourcePlacementOverridden
    - lastTransitionTime: "2024-07-25T23:00:53Z"
      message: Works(s) are succcesfully created or updated in 2 target cluster(s)'
        namespaces
      observedGeneration: 1
      reason: WorkSynchronized
      status: "True"
      type: ClusterResourcePlacementWorkSynchronized
    - lastTransitionTime: "2024-07-25T23:00:53Z"
      message: The selected resources are successfully applied to 2 cluster(s)
      observedGeneration: 1
      reason: ApplySucceeded
      status: "True"
      type: ClusterResourcePlacementApplied
    - lastTransitionTime: "2024-07-25T23:01:02Z"
      message: The selected resources in 2 cluster(s) are available now
      observedGeneration: 1
      reason: ResourceAvailable
      status: "True"
      type: ClusterResourcePlacementAvailable
  observedResourceIndex: "0"
  placementStatuses:
    - clusterName: aks-member-2
      conditions:
        - lastTransitionTime: "2024-07-25T23:00:53Z"
          message: 'Successfully scheduled resources for placement in "aks-member-2"
        (affinity score: 0, topology spread score: 0): picked by scheduling policy'
          observedGeneration: 1
          reason: Scheduled
          status: "True"
          type: Scheduled
        - lastTransitionTime: "2024-07-25T23:00:53Z"
          message: Detected the new changes on the resources and started the rollout process
          observedGeneration: 1
          reason: RolloutStarted
          status: "True"
          type: RolloutStarted
        - lastTransitionTime: "2024-07-25T23:00:53Z"
          message: No override rules are configured for the selected resources
          observedGeneration: 1
          reason: NoOverrideSpecified
          status: "True"
          type: Overridden
        - lastTransitionTime: "2024-07-25T23:00:53Z"
          message: All of the works are synchronized to the latest
          observedGeneration: 1
          reason: AllWorkSynced
          status: "True"
          type: WorkSynchronized
        - lastTransitionTime: "2024-07-25T23:00:53Z"
          message: All corresponding work objects are applied
          observedGeneration: 1
          reason: AllWorkHaveBeenApplied
          status: "True"
          type: Applied
        - lastTransitionTime: "2024-07-25T23:01:02Z"
          message: All corresponding work objects are available
          observedGeneration: 1
          reason: AllWorkAreAvailable
          status: "True"
          type: Available
    - clusterName: aks-member-1
      conditions:
        - lastTransitionTime: "2024-07-25T23:00:53Z"
          message: 'Successfully scheduled resources for placement in "aks-member-1"
        (affinity score: 0, topology spread score: 0): picked by scheduling policy'
          observedGeneration: 1
          reason: Scheduled
          status: "True"
          type: Scheduled
        - lastTransitionTime: "2024-07-25T23:00:53Z"
          message: Detected the new changes on the resources and started the rollout process
          observedGeneration: 1
          reason: RolloutStarted
          status: "True"
          type: RolloutStarted
        - lastTransitionTime: "2024-07-25T23:00:53Z"
          message: No override rules are configured for the selected resources
          observedGeneration: 1
          reason: NoOverrideSpecified
          status: "True"
          type: Overridden
        - lastTransitionTime: "2024-07-25T23:00:53Z"
          message: All of the works are synchronized to the latest
          observedGeneration: 1
          reason: AllWorkSynced
          status: "True"
          type: WorkSynchronized
        - lastTransitionTime: "2024-07-25T23:00:53Z"
          message: All corresponding work objects are applied
          observedGeneration: 1
          reason: AllWorkHaveBeenApplied
          status: "True"
          type: Applied
        - lastTransitionTime: "2024-07-25T23:01:02Z"
          message: All corresponding work objects are available
          observedGeneration: 1
          reason: AllWorkAreAvailable
          status: "True"
          type: Available
  selectedResources:
    - kind: Namespace
      name: test-app
      version: v1
    - group: apps
      kind: Deployment
      name: nginx-deployment
      namespace: test-app
      version: v1
    - kind: Service
      name: nginx-service
      namespace: test-app
      version: v1

Summary:

  • This defines a ClusterResourcePlacement named crp-availability.
  • The placement policy PickN selects 2 clusters. The clusters are selected based on the label fleet.azure.com/location: westus.
  • It targets resources in the test-app namespace.

Identify Clusters with More Availability

To identify clusters with more availability, you can check the member cluster properties.

kubectl get memberclusters -A -o wide

The output will show the availability in each cluster, including the number of nodes, available CPU, and memory.

NAME                                JOINED   AGE   NODE-COUNT   AVAILABLE-CPU   AVAILABLE-MEMORY   ALLOCATABLE-CPU   ALLOCATABLE-MEMORY
aks-member-1                        True     22d   1            30m             40Ki               1900m             4652296Ki
aks-member-2                        True     22d   1            30m             40Ki               1900m             4652296Ki
aks-member-3                        True     22d   2            2820m           8477196Ki          3800m             9304588Ki
aks-member-4                        True     22d   3            4408m           12896012Ki         5700m             13956876Ki
aks-member-5                        True     22d   3            4408m           12896024Ki         5700m             13956888Ki

Based on the available resources, you can see that Member Cluster 3 in EastUS2 and Member Cluster 4 & 5 in WestEurope have more nodes and available resources compared to the WestUS clusters.

Migrating Applications to Clusters with More Availability While Scaling Up

When the clusters in WestUS are nearing capacity limits and risk becoming overloaded, update the ClusterResourcePlacement (CRP) to migrate the applications to clusters in EastUS2 or WestEurope, which have more available resources and can handle increased demand more effectively. For this tutorial, we will move them to WestEurope.

Create Resource Override

Note: Resource override test file located here

To scale up during the migration, apply this override before updating the CRP:

apiVersion: placement.kubernetes-fleet.io/v1alpha1
kind: ResourceOverride
metadata:
  name: ro-1
  namespace: test-app
spec:
  resourceSelectors:
    -  group: apps
       kind: Deployment
       version: v1
       name: nginx-deployment
  policy:
    overrideRules:
      - clusterSelector:
          clusterSelectorTerms:
            - labelSelector:
                matchLabels:
                  fleet.azure.com/location: westeurope
        jsonPatchOverrides:
          - op: replace
            path: /spec/replicas
            value:
              4

This override updates the nginx-deployment Deployment in the test-app namespace by setting the number of replicas to 4 for clusters located in the westeurope region.
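
The tutorial does not prescribe a file name for this override; assuming it is saved locally as ro-1.yaml (a placeholder name), apply it to the hub cluster and confirm it was created before touching the CRP:

kubectl apply -f ro-1.yaml
kubectl get resourceoverrides ro-1 -n test-app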

Update the CRP for Migration

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourcePlacement
metadata:
  name: crp-availability
spec:
  policy:
    placementType: PickN
    numberOfClusters: 2
    affinity:
      clusterAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          clusterSelectorTerms:
            - propertySelector:
                matchExpressions:
                  - name: kubernetes-fleet.io/node-count
                    operator: Ge
                    values:
                      - "3"
  resourceSelectors:
    - group: ""
      kind: Namespace
      name: test-app
      version: v1
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate

Update crp-availability.yaml so that it selects clusters with a higher node count, then apply it:

kubectl apply -f crp-availability.yaml

Results

After applying the updated crp-availability.yaml, the Fleet will schedule the application on the available clusters in WestEurope as they each have 3 nodes. You can check the status of the CRP to ensure that the application has been successfully migrated and is running in the new region:

kubectl get crp crp-availability -o yaml

You should see a status indicating that the application is now running in the WestEurope clusters, similar to the following:

CRP Status

...
status:
  conditions:
    - lastTransitionTime: "2024-07-25T23:10:08Z"
      message: found all cluster needed as specified by the scheduling policy, found
        2 cluster(s)
      observedGeneration: 2
      reason: SchedulingPolicyFulfilled
      status: "True"
      type: ClusterResourcePlacementScheduled
    - lastTransitionTime: "2024-07-25T23:10:20Z"
      message: All 2 cluster(s) start rolling out the latest resource
      observedGeneration: 2
      reason: RolloutStarted
      status: "True"
      type: ClusterResourcePlacementRolloutStarted
    - lastTransitionTime: "2024-07-25T23:10:20Z"
      message: The selected resources are successfully overridden in 2 cluster(s)
      observedGeneration: 2
      reason: OverriddenSucceeded
      status: "True"
      type: ClusterResourcePlacementOverridden
    - lastTransitionTime: "2024-07-25T23:10:20Z"
      message: Works(s) are succcesfully created or updated in 2 target cluster(s)'
        namespaces
      observedGeneration: 2
      reason: WorkSynchronized
      status: "True"
      type: ClusterResourcePlacementWorkSynchronized
    - lastTransitionTime: "2024-07-25T23:10:21Z"
      message: The selected resources are successfully applied to 2 cluster(s)
      observedGeneration: 2
      reason: ApplySucceeded
      status: "True"
      type: ClusterResourcePlacementApplied
    - lastTransitionTime: "2024-07-25T23:10:30Z"
      message: The selected resources in 2 cluster(s) are available now
      observedGeneration: 2
      reason: ResourceAvailable
      status: "True"
      type: ClusterResourcePlacementAvailable
  observedResourceIndex: "0"
  placementStatuses:
    - applicableResourceOverrides:
        - name: ro-1-0
          namespace: test-app
      clusterName: aks-member-5
      conditions:
        - lastTransitionTime: "2024-07-25T23:10:08Z"
          message: 'Successfully scheduled resources for placement in "aks-member-5" (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
          observedGeneration: 2
          reason: Scheduled
          status: "True"
          type: Scheduled
        - lastTransitionTime: "2024-07-25T23:10:20Z"
          message: Detected the new changes on the resources and started the rollout process
          observedGeneration: 2
          reason: RolloutStarted
          status: "True"
          type: RolloutStarted
        - lastTransitionTime: "2024-07-25T23:10:20Z"
          message: Successfully applied the override rules on the resources
          observedGeneration: 2
          reason: OverriddenSucceeded
          status: "True"
          type: Overridden
        - lastTransitionTime: "2024-07-25T23:10:20Z"
          message: All of the works are synchronized to the latest
          observedGeneration: 2
          reason: AllWorkSynced
          status: "True"
          type: WorkSynchronized
        - lastTransitionTime: "2024-07-25T23:10:21Z"
          message: All corresponding work objects are applied
          observedGeneration: 2
          reason: AllWorkHaveBeenApplied
          status: "True"
          type: Applied
        - lastTransitionTime: "2024-07-25T23:10:30Z"
          message: All corresponding work objects are available
          observedGeneration: 2
          reason: AllWorkAreAvailable
          status: "True"
          type: Available
    - applicableResourceOverrides:
        - name: ro-1-0
          namespace: test-app
      clusterName: aks-member-4
      conditions:
        - lastTransitionTime: "2024-07-25T23:10:08Z"
          message: 'Successfully scheduled resources for placement in "aks-member-4" (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
          observedGeneration: 2
          reason: Scheduled
          status: "True"
          type: Scheduled
        - lastTransitionTime: "2024-07-25T23:10:08Z"
          message: Detected the new changes on the resources and started the rollout process
          observedGeneration: 2
          reason: RolloutStarted
          status: "True"
          type: RolloutStarted
        - lastTransitionTime: "2024-07-25T23:10:08Z"
          message: Successfully applied the override rules on the resources
          observedGeneration: 2
          reason: OverriddenSucceeded
          status: "True"
          type: Overridden
        - lastTransitionTime: "2024-07-25T23:10:08Z"
          message: All of the works are synchronized to the latest
          observedGeneration: 2
          reason: AllWorkSynced
          status: "True"
          type: WorkSynchronized
        - lastTransitionTime: "2024-07-25T23:10:09Z"
          message: All corresponding work objects are applied
          observedGeneration: 2
          reason: AllWorkHaveBeenApplied
          status: "True"
          type: Applied
        - lastTransitionTime: "2024-07-25T23:10:19Z"
          message: All corresponding work objects are available
          observedGeneration: 2
          reason: AllWorkAreAvailable
          status: "True"
          type: Available
  selectedResources:
    - kind: Namespace
      name: test-app
      version: v1
    - group: apps
      kind: Deployment
      name: nginx-deployment
      namespace: test-app
      version: v1
    - kind: Service
      name: nginx-service
      namespace: test-app
      version: v1

The status indicates that the application has been successfully migrated to the WestEurope clusters and is now running with 4 replicas, as the resource override has been applied.

To double-check, you can also verify the number of replicas in the nginx-deployment:

  1. Change context to member cluster 4 or 5:
    kubectl config use-context aks-member-4
    
  2. Get the deployment:
    kubectl get deployment nginx-deployment -n test-app -o wide
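
Once your kubeconfig points at one of the WestEurope member clusters, a jsonpath query is another quick way to confirm the replica count directly; it should print 4 if the override took effect:

kubectl get deployment nginx-deployment -n test-app -o jsonpath='{.spec.replicas}'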
    

Conclusion

This tutorial demonstrated how to migrate applications using Fleet from clusters with lower availability to those with higher availability. By updating the ClusterResourcePlacement and applying a ResourceOverride, you can ensure that your applications are moved to clusters with better availability while also scaling up the number of replicas to enhance performance and resilience.

3 - KubeFleet and ArgoCD Integration

See KubeFleet and ArgoCD working together to efficiently manage Gitops promotion

This hands-on guide to KubeFleet and ArgoCD integration shows how these powerful tools work in concert to revolutionize multi-cluster application management. Discover how KubeFleet’s intelligent orchestration capabilities complement ArgoCD’s popular GitOps approach, enabling seamless deployments across diverse environments while maintaining consistency and control. This tutorial illuminates practical strategies for targeted deployments, environment-specific configurations, and safe, controlled rollouts. Follow along to transform your multi-cluster challenges into streamlined, automated workflows that enhance both developer productivity and operational reliability.

Suppose that in a multi-cluster, multi-tenant organization, team A wants to deploy resources ONLY to the clusters they own. They want to make sure each cluster receives the correct configuration, and they want to ensure safe deployment by rolling out to their staging environment first, then to canary if staging is healthy, and lastly to production. This tutorial walks you through a hands-on experience of how to achieve this. The image below demonstrates the major components and their interactions.

Prerequisites

KubeFleet environment

In this tutorial, we prepare a fleet environment with one hub cluster and four member clusters. The member clusters are labeled to indicate their environment and team ownership. From the hub cluster, we can verify the cluster memberships and their labels:

kubectl config use-context hub
kubectl get memberclusters --show-labels
NAME      JOINED   AGE    MEMBER-AGENT-LAST-SEEN   NODE-COUNT   AVAILABLE-CPU   AVAILABLE-MEMORY   LABELS
member1   True     84d    10s                      3            4036m           13339148Ki         environment=staging,team=A,...
member2   True     84d    14s                      3            4038m           13354748Ki         environment=canary,team=A,...
member3   True     144m   6s                       3            3676m           12458504Ki         environment=production,team=A,...
member4   True     6m7s   15s                      3            4036m           13347336Ki         team=B,...

From the above output, we can see that:

  • member1 is in staging environment and owned by team A.
  • member2 is in canary environment and owned by team A.
  • member3 is in production environment and owned by team A.
  • member4 is owned by team B.
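
If your own member clusters are not labeled this way yet, the labels can be added from the hub cluster with standard kubectl commands, for example (using the cluster names above; adjust to your environment):

kubectl config use-context hub
kubectl label memberclusters member1 environment=staging team=A
kubectl label memberclusters member2 environment=canary team=A
kubectl label memberclusters member3 environment=production team=A
kubectl label memberclusters member4 team=B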

Install ArgoCD

In this tutorial, we expect ArgoCD controllers to be installed on each member cluster. Only ArgoCD CRDs need to be installed on the hub cluster so that ArgoCD Applications can be created.

  • Option 1: Install ArgoCD on each member cluster directly (RECOMMENDED)

    It’s straightforward to install ArgoCD on each member cluster. You can follow the instructions in ArgoCD Getting Started.
    To install only CRDs on the hub cluster, you can run the following command:

    kubectl config use-context hub
    kubectl apply -k https://github.com/argoproj/argo-cd/manifests/crds?ref=stable --server-side=true
    
  • Option 2: Use KubeFleet ClusterResourcePlacement (CRP) to install ArgoCD on member clusters (Experimental)

    Alternatively, you can first install all the ArgoCD manifests on the hub cluster, and then use a KubeFleet ClusterResourcePlacement to propagate them to the member clusters. Install the CRDs on the hub cluster:

    kubectl config use-context hub
    kubectl apply -k https://github.com/argoproj/argo-cd/manifests/crds?ref=stable --server-side=true
    

    Then apply the resource manifest we prepared (argocd-install.yaml) to the hub cluster:

    kubectl config use-context hub
    kubectl create ns argocd && kubectl apply -f ./manifests/argocd-install.yaml -n argocd --server-side=true
    

    We then use a ClusterResourcePlacement (refer to argocd-crp.yaml) to propagate the manifests to the member clusters:

    kubectl config use-context hub
    kubectl apply -f ./manifests/argocd-crp.yaml
    

    Verify the CRP becomes available:

    kubectl get crp
    NAME                GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
    crp-argocd          1     True        1               True        1               79m
    

Enable “Applications in any namespace” in ArgoCD

In this tutorial, we are going to deploy an ArgoCD Application in the guestbook namespace. With the “Applications in any namespace” feature enabled, application teams can manage their applications more flexibly without the risk of privilege escalation. Here we need to allow Applications to be created in the guestbook namespace.

  • Option 1: Enable on each member cluster manually

    You can follow the instructions in ArgoCD Applications-in-any-namespace documentation to enable this feature on each member cluster manually.
    It generally involves updating the argocd-cmd-params-cm configmap and restarting the argocd-application-controller statefulset and the argocd-server deployment (see the example patch after these options).
    You will also want to create an ArgoCD AppProject in the argocd namespace for Applications to refer to. You can find the manifest at guestbook-appproject.yaml.

    cat ./manifests/guestbook-appproject.yaml
    apiVersion: argoproj.io/v1alpha1
    kind: AppProject
    metadata:
      name: guestbook-project
      namespace: argocd
    spec:
      sourceNamespaces:
      - guestbook
      destinations:
      - namespace: '*'
        server: https://kubernetes.default.svc
      sourceRepos:
      - '*'
    
    kubectl config use-context member<*>
    kubectl apply -f ./manifests/guestbook-appproject.yaml
    
  • Option 2: Populate ArgoCD AppProject to member clusters with CRP (Experimental)

    If you used Option 2 above to install ArgoCD from the hub cluster onto the member clusters, you only need to update the argocd-cmd-params-cm configmap and add the guestbook-appproject AppProject to the argocd namespace on the hub; the existing CRP will propagate the resources to the member clusters automatically. Note: you will probably also need to tweak the argocd-application-controller and argocd-server workloads slightly to trigger pod restarts.
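
For Option 1, a minimal sketch of the configmap update and the component restarts might look like the following; it assumes a default ArgoCD installation in the argocd namespace and only adds the guestbook namespace to the application.namespaces setting:

kubectl config use-context member<*>
kubectl patch configmap argocd-cmd-params-cm -n argocd --type merge -p '{"data":{"application.namespaces":"guestbook"}}'
kubectl rollout restart statefulset argocd-application-controller -n argocd
kubectl rollout restart deployment argocd-server -n argocd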

Deploy resources to clusters using ArgoCD Application orchestrated by KubeFleet

We have prepared one guestbook-ui deployment with a corresponding service for each environment. The deployments are the same except for the replica count, which simulates different configurations for different clusters. You may find the manifests here.

guestbook
│
├───staging
│   │   guestbook-ui.yaml
│
├───canary
│   │   guestbook-ui.yaml
│
└───production
    │   guestbook-ui.yaml

Deploy an ArgoCD Application for gitops continuous delivery

Team A wants to create an ArgoCD Application to automatically sync the manifests from the git repository to the member clusters. The Application should be created on the hub cluster and placed onto the member clusters that team A owns. The Application example can be found at guestbook-app.yaml.

kubectl config use-context hub
kubectl create ns guestbook
kubectl apply -f - << EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: guestbook-app
  namespace: guestbook 
spec:
  destination:
    namespace: guestbook
    server: https://kubernetes.default.svc
  project: guestbook-project
  source:
    path: content/en/docs/tutorials/ArgoCD/manifests/guestbook
    repoURL: https://github.com/kubefleet-dev/website.git
    targetRevision: main
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    retry:
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m0s
      limit: 10
    syncOptions:
    - PruneLast=true
    - PrunePropagationPolicy=foreground
    - CreateNamespace=true
    - ApplyOutOfSyncOnly=true
EOF
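
Regardless of which install option you chose, you can confirm the Application object now exists on the hub cluster before placing it:

kubectl get applications guestbook-app -n guestbook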

Place ArgoCD Application to member clusters with CRP

A ClusterResourcePlacement (CRP) is used to place resources from the hub cluster onto member clusters. Team A can select their own member clusters by specifying cluster labels. In spec.resourceSelectors, specifying the guestbook namespace includes all resources in it, including the Application just deployed. The spec.strategy.type is set to External so that the CRP is not rolled out immediately; instead, the rollout will be triggered separately in the next steps. The CRP resource can be found at guestbook-crp.yaml.

kubectl config use-context hub
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
  name: guestbook-crp
spec:
  policy:
    placementType: PickAll # select all member clusters with label team=A
    affinity:
      clusterAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          clusterSelectorTerms:
            - labelSelector:
                matchLabels:
                  team: A # label selectors
  resourceSelectors:
  - group: ""
    kind: Namespace
    name: guestbook # select guestbook namespace with all resources in it
    version: v1
  revisionHistoryLimit: 10
  strategy:
    type: External # will use an updateRun to trigger the rollout
EOF

Verify the CRP status: it's clear that only member1, member2, and member3, which carry the team=A label, are selected, and the rollout has not started yet.

kubectl get crp guestbook-crp -o yaml
...
status:
  conditions:
  - lastTransitionTime: "2025-03-23T23:46:56Z"
    message: found all cluster needed as specified by the scheduling policy, found
      3 cluster(s)
    observedGeneration: 1
    reason: SchedulingPolicyFulfilled
    status: "True"
    type: ClusterResourcePlacementScheduled
  - lastTransitionTime: "2025-03-23T23:46:56Z"
    message: There are still 3 cluster(s) in the process of deciding whether to roll
      out the latest resources or not
    observedGeneration: 1
    reason: RolloutStartedUnknown
    status: Unknown
    type: ClusterResourcePlacementRolloutStarted
  observedResourceIndex: "0"
  placementStatuses:
  - clusterName: member1
    conditions:
    - lastTransitionTime: "2025-03-24T00:22:22Z"
      message: 'Successfully scheduled resources for placement in "member1" (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2025-03-24T00:22:22Z"
      message: In the process of deciding whether to roll out the latest resources
        or not
      observedGeneration: 1
      reason: RolloutStartedUnknown
      status: Unknown
      type: RolloutStarted
  - clusterName: member2
    conditions:
    - lastTransitionTime: "2025-03-23T23:46:56Z"
      message: 'Successfully scheduled resources for placement in "member2" (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2025-03-23T23:46:56Z"
      message: In the process of deciding whether to roll out the latest resources
        or not
      observedGeneration: 1
      reason: RolloutStartedUnknown
      status: Unknown
      type: RolloutStarted
  - clusterName: member3
    conditions:
    - lastTransitionTime: "2025-03-23T23:46:56Z"
      message: 'Successfully scheduled resources for placement in "member3" (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2025-03-23T23:46:56Z"
      message: In the process of deciding whether to roll out the latest resources
        or not
      observedGeneration: 1
      reason: RolloutStartedUnknown
      status: Unknown
      type: RolloutStarted
...

Override path for different member clusters with ResourceOverride

The Application above specifies spec.source.path as content/en/docs/tutorials/ArgoCD/manifests/guestbook. By default, every selected member cluster receives the same Application resource. In this tutorial, member clusters from different environments should receive different manifests, as configured in different folders in the git repo. To achieve this, a ResourceOverride is used to override the Application resource for each member cluster. The ResourceOverride resource can be found at guestbook-ro.yaml.

kubectl config use-context hub
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1alpha1
kind: ResourceOverride
metadata:
  name: guestbook-app-ro
  namespace: guestbook # ro needs to be created in the same namespace as the resource it overrides
spec:
  placement:
    name: guestbook-crp # specify the CRP name
  policy:
    overrideRules:
    - clusterSelector:
        clusterSelectorTerms:
        - labelSelector: 
            matchExpressions:
            - key: environment
              operator: Exists
      jsonPatchOverrides:
      - op: replace
        path: /spec/source/path # spec.source.path is overridden
        value: "content/en/docs/tutorials/ArgoCD/manifests/guestbook/${MEMBER-CLUSTER-LABEL-KEY-environment}"
      overrideType: JSONPatch
  resourceSelectors:
  - group: argoproj.io
    kind: Application
    name: guestbook-app # name of the Application
    version: v1alpha1
EOF
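
Before triggering the rollout, you can confirm the override was created on the hub cluster:

kubectl get resourceoverrides guestbook-app-ro -n guestbook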

Trigger CRP progressive rollout with clusterStagedUpdateRun

A ClusterStagedUpdateRun (or updateRun for short) is used to trigger the rollout of the CRP in a progressive, stage-by-stage manner by following a pre-defined rollout strategy, namely ClusterStagedUpdateStrategy.

A ClusterStagedUpdateStrategy is provided at teamA-strategy.yaml. It defines 3 stages: staging, canary, and production. Clusters are grouped into stages by the environment label. The TimedWait after-stage task in the staging stage pauses the rollout for 1 minute before moving to the canary stage. The Approval after-stage task in the canary stage waits for manual approval before moving to the production stage. After applying the strategy, a ClusterStagedUpdateRun can then reference it to generate the concrete rollout plan.

kubectl config use-context hub
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateStrategy
metadata:
  name: team-a-strategy
spec:
  stages: # 3 stages: staging, canary, production
  - afterStageTasks:
    - type: TimedWait
      waitTime: 1m # wait 1 minute before moving to canary stage
    labelSelector:
      matchLabels:
        environment: staging
    name: staging
  - afterStageTasks:
    - type: Approval # wait for manual approval before moving to production stage
    labelSelector:
      matchLabels:
        environment: canary
    name: canary
  - labelSelector:
      matchLabels:
        environment: production
    name: production
EOF

Now it's time to trigger the rollout. A sample ClusterStagedUpdateRun can be found at guestbook-updaterun.yaml. It's pretty straightforward: it just specifies the CRP name, the strategy name, and the resource snapshot index.

kubectl config use-context hub
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  name: guestbook-updaterun
spec:
  placementName: guestbook-crp
  resourceSnapshotIndex: "0"
  stagedRolloutStrategyName: team-a-strategy
EOF

Checking the updateRun status shows the rollout progress: member1 in the staging stage has been updated, and the run is pausing at the after-stage task before moving to the canary stage.

kubectl config use-context hub
kubectl get crsur guestbook-updaterun -o yaml
...
stagesStatus:
  - afterStageTaskStatus:
    - type: TimedWait
    clusters:
    - clusterName: member1
      conditions:
      - lastTransitionTime: "2025-03-24T00:47:41Z"
        message: ""
        observedGeneration: 1
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-03-24T00:47:56Z"
        message: ""
        observedGeneration: 1
        reason: ClusterUpdatingSucceeded
        status: "True"
        type: Succeeded
      resourceOverrideSnapshots:
      - name: guestbook-app-ro-0
        namespace: guestbook
    conditions:
    - lastTransitionTime: "2025-03-24T00:47:56Z"
      message: ""
      observedGeneration: 1
      reason: StageUpdatingWaiting
      status: "False"
      type: Progressing
    stageName: staging
    startTime: "2025-03-24T00:47:41Z"
  - afterStageTaskStatus:
    - approvalRequestName: guestbook-updaterun-canary
      type: Approval
    clusters:
    - clusterName: member2
      resourceOverrideSnapshots:
      - name: guestbook-app-ro-0
        namespace: guestbook
    stageName: canary
  - clusters:
    - clusterName: member3
      resourceOverrideSnapshots:
      - name: guestbook-app-ro-0
        namespace: guestbook
    stageName: production
...

Checking the Application status on member1, we can see it is synced and healthy:

kubectl config use-context member1
kubectl get Applications -n guestbook
NAMESPACE   NAME            SYNC STATUS   HEALTH STATUS
guestbook   guestbook-app   Synced        Healthy

At the same time, there is no Application in member2 or member3, as they have not been rolled out yet.
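
You can verify this directly on one of the not-yet-updated clusters; the query should report that no Applications exist in the guestbook namespace:

kubectl config use-context member2
kubectl get applications -n guestbook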

After 1 minute, the staging stage is completed, and member2 in the canary stage is updated.

kubectl config use-context hub
kubectl get crsur guestbook-updaterun -o yaml
...
- afterStageTaskStatus:
    - approvalRequestName: guestbook-updaterun-canary
      conditions:
      - lastTransitionTime: "2025-03-24T00:49:11Z"
        message: ""
        observedGeneration: 1
        reason: AfterStageTaskApprovalRequestCreated
        status: "True"
        type: ApprovalRequestCreated
      type: Approval
    clusters:
    - clusterName: member2
      conditions:
      - lastTransitionTime: "2025-03-24T00:48:56Z"
        message: ""
        observedGeneration: 1
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-03-24T00:49:11Z"
        message: ""
        observedGeneration: 1
        reason: ClusterUpdatingSucceeded
        status: "True"
        type: Succeeded
      resourceOverrideSnapshots:
      - name: guestbook-app-ro-0
        namespace: guestbook
    conditions:
    - lastTransitionTime: "2025-03-24T00:49:11Z"
      message: ""
      observedGeneration: 1
      reason: StageUpdatingWaiting
      status: "False"
      type: Progressing
    stageName: canary
    startTime: "2025-03-24T00:48:56Z"
...

The canary stage requires manual approval to complete. The controller generates a ClusterApprovalRequest object for the user to approve. Its name is included in the updateRun status shown above as approvalRequestName: guestbook-updaterun-canary. Team A can verify that everything works properly and then approve the request to proceed to the production stage:

kubectl config use-context hub

kubectl get clusterapprovalrequests
NAME                         UPDATE-RUN            STAGE    APPROVED   APPROVALACCEPTED   AGE
guestbook-updaterun-canary   guestbook-updaterun   canary                                 21m

kubectl patch clusterapprovalrequests guestbook-updaterun-canary --type='merge' -p '{"status":{"conditions":[{"type":"Approved","status":"True","reason":"lgtm","message":"lgtm","lastTransitionTime":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","observedGeneration":1}]}}' --subresource=status

kubectl get clusterapprovalrequests
NAME                         UPDATE-RUN            STAGE    APPROVED   APPROVALACCEPTED   AGE
guestbook-updaterun-canary   guestbook-updaterun   canary   True       True               22m

Now the updateRun moves on to the production stage, and member3 is updated. The whole updateRun is completed:

kubectl config use-context hub

kubectl get crsur guestbook-updaterun -o yaml
...
status:
  conditions:
  - lastTransitionTime: "2025-03-24T00:47:41Z"
    message: ClusterStagedUpdateRun initialized successfully
    observedGeneration: 1
    reason: UpdateRunInitializedSuccessfully
    status: "True"
    type: Initialized
  - lastTransitionTime: "2025-03-24T00:47:41Z"
    message: ""
    observedGeneration: 1
    reason: UpdateRunStarted
    status: "True"
    type: Progressing
  - lastTransitionTime: "2025-03-24T01:11:45Z"
    message: ""
    observedGeneration: 1
    reason: UpdateRunSucceeded
    status: "True"
    type: Succeeded
...
  stagesStatus:
  ...
  - clusters:
    - clusterName: member3
      conditions:
      - lastTransitionTime: "2025-03-24T01:11:30Z"
        message: ""
        observedGeneration: 1
        reason: ClusterUpdatingStarted
        status: "True"
        type: Started
      - lastTransitionTime: "2025-03-24T01:11:45Z"
        message: ""
        observedGeneration: 1
        reason: ClusterUpdatingSucceeded
        status: "True"
        type: Succeeded
      resourceOverrideSnapshots:
      - name: guestbook-app-ro-0
        namespace: guestbook
    conditions:
    - lastTransitionTime: "2025-03-24T01:11:45Z"
      message: ""
      observedGeneration: 1
      reason: StageUpdatingWaiting
      status: "False"
      type: Progressing
    - lastTransitionTime: "2025-03-24T01:11:45Z"
      message: ""
      observedGeneration: 1
      reason: StageUpdatingSucceeded
      status: "True"
      type: Succeeded
    endTime: "2025-03-24T01:11:45Z"
    stageName: production
    startTime: "2025-03-24T01:11:30Z"
...

Verify the Application on member clusters

Now we can see that the Application is created, synced, and healthy on all member clusters except member4, which does not belong to team A. We can also verify that the resources synced from the git repo differ for each member cluster (note the different replica counts below):

kubectl config use-context member1
kubectl get app -n guestbook
NAMESPACE   NAME            SYNC STATUS   HEALTH STATUS
guestbook   guestbook-app   Synced        Healthy

kubectl get deploy,svc -n guestbook
NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/guestbook-ui   1/1     1            1           80s # 1 replica in staging env

NAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
service/guestbook-ui   ClusterIP   10.0.20.139   <none>        80/TCP    79s

# verify member2
kubectl config use-context member2
kubectl get app -n guestbook
NAMESPACE   NAME            SYNC STATUS   HEALTH STATUS
guestbook   guestbook-app   Synced        Healthy

kubectl get deploy,svc -n guestbook
NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/guestbook-ui   2/2     2            2           54s # 2 replicas in canary env

NAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
service/guestbook-ui   ClusterIP   10.0.20.139   <none>        80/TCP    54s

# verify member3
kubectl config use-context member3
kubectl get app -n guestbook
NAMESPACE   NAME            SYNC STATUS   HEALTH STATUS
guestbook   guestbook-app   Synced        Healthy

kubectl get deploy,svc -n guestbook
NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/guestbook-ui   4/4     4            4           18s # 4 replicas in production env

NAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
service/guestbook-ui   ClusterIP   10.0.20.139   <none>        80/TCP    17s

# verify member4
kubectl config use-context member4
kubectl get app -A
No resources found

Release a new version

When team A makes some changes and decides to release a new version, they can cut a new branch or tag in the git repo. To roll out this new version progressively, they can simply:

  1. Update the targetRevision in the Application resource to the new branch or tag on the hub cluster.
  2. Create a new ClusterStagedUpdateRun with the new resource snapshot index.

Suppose we now cut a new release on branch v0.0.1. Updating spec.source.targetRevision in the Application resource to v0.0.1 will not trigger a rollout instantly.

kubectl config use-context hub
kubectl edit app guestbook-app -n guestbook
...
spec:
  source:
    targetRevision: v0.0.1 # <- replace with your release branch
...

Checking the CRP, it's clear that the new resource version is not available on the member clusters yet:

kubectl config use-context hub
kubectl get crp
NAME            GEN   SCHEDULED   SCHEDULED-GEN   AVAILABLE   AVAILABLE-GEN   AGE
guestbook-crp   1     True        1                                           130m

Check that a new ClusterResourceSnapshot has been generated:

kubectl config use-context hub
kubectl get clusterresourcesnapshots --show-labels
NAME                       GEN   AGE     LABELS
guestbook-crp-0-snapshot   1     133m    kubernetes-fleet.io/is-latest-snapshot=false,kubernetes-fleet.io/parent-CRP=guestbook-crp,kubernetes-fleet.io/resource-index=0
guestbook-crp-1-snapshot   1     3m46s   kubernetes-fleet.io/is-latest-snapshot=true,kubernetes-fleet.io/parent-CRP=guestbook-crp,kubernetes-fleet.io/resource-index=1

Notice that guestbook-crp-1-snapshot is the latest snapshot, with resource-index set to 1.

Create a new ClusterStagedUpdateRun with the new resource snapshot index:

kubectl config use-context hub
kubectl apply -f - << EOF
apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  name: guestbook-updaterun
spec:
  placementName: guestbook-crp
  resourceSnapshotIndex: "1"
  stagedRolloutStrategyName: team-a-strategy
EOF

Following the same steps as before, we can see the new version is rolled out progressively to all member clusters.
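
To watch the new run progress stage by stage, the same status checks used earlier apply, for example:

kubectl config use-context hub
kubectl get crsur guestbook-updaterun -o yaml
kubectl get crp guestbook-crp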

Summary

KubeFleet and ArgoCD integration offers a powerful solution for multi-cluster application management, combining KubeFleet’s intelligent orchestration with ArgoCD’s popular GitOps approach. This tutorial showcased how teams can deploy applications across diverse environments with cluster-specific configurations while maintaining complete control over the rollout process. Through practical examples, we demonstrated targeted deployments using cluster labels, environment-specific configurations via overrides, and safe, controlled rollouts with staged update runs. This integration enables teams to transform multi-cluster challenges into streamlined, automated workflows that enhance both developer productivity and operational reliability.

Next steps