Staged Update

Concept about Staged Update

While users can rely on the RollingUpdate rollout strategy in the placement to safely roll out their workloads, many organizations require more sophisticated deployment control mechanisms for their fleet operations. Standard rolling updates, while effective for individual rollout, lack the granular control and coordination capabilities needed for complex, multi-cluster environments.

Why Staged Rollouts?

Staged rollouts address critical deployment challenges that organizations face when managing workloads across distributed environments:

Strategic Grouping and Ordering

Traditional rollout strategies treat all target clusters uniformly, but real-world deployments often require strategic sequencing. Staged rollouts enable you to:

  • Group clusters by environment (development → staging → production)
  • Prioritize by criticality (test clusters before business-critical systems)
  • Sequence by dependencies (deploy foundational services before dependent applications)

This grouping capability ensures that updates follow your organization’s risk management and operational requirements rather than arbitrary ordering.

Enhanced Control and Approval Gates

Automated rollouts provide speed but can propagate issues rapidly across your entire fleet. Staged rollouts introduce deliberate checkpoints that give teams control over the rollout progression:

  • Manual approval gates between stages allow human validation of rollout health
  • Automated wait periods provide time for monitoring and observation

Dual-Scope Architecture

Kubefleet provides staged update capabilities at two different scopes to address different organizational roles and operational requirements:

  1. Cluster-scoped staged updates for fleet administrators who need to manage rollouts across entire clusters
  2. Namespace-scoped staged updates for application teams who need to manage rollouts within their specific namespaces

This dual approach enables fleet administrators to maintain oversight of infrastructure-level changes while empowering application teams with autonomous control over their specific workloads.

Overview

Kubefleet provides staged update capabilities at two scopes to serve different organizational needs:

Cluster-Scoped: ClusterStagedUpdateStrategy, ClusterStagedUpdateRun, and ClusterApprovalRequest for fleet administrators managing infrastructure-level changes across clusters.

Namespace-Scoped: StagedUpdateStrategy, StagedUpdateRun, and ApprovalRequest for application teams managing rollouts within their specific namespaces.

Both systems follow the same pattern:

  1. Strategy resources define reusable orchestration patterns with stages, ordering, and approval gates
  2. UpdateRun resources execute the strategy for specific rollouts using a target placement, resource snapshot, and strategy name
  3. External rollout strategy must be set on the target placement to enable staged updates

Key Differences by Persona

AspectFleet Administrators (Cluster-Scoped)Application Teams (Namespace-Scoped)
ScopeEntire clusters and infrastructureIndividual applications within namespaces
ResourcesClusterStagedUpdateStrategy, ClusterStagedUpdateRun, ClusterApprovalRequestStagedUpdateStrategy, StagedUpdateRun, ApprovalRequest
TargetClusterResourcePlacementResourcePlacement (namespace-scoped)
Use CasesFleet-wide updates, infrastructure changesApplication rollouts, service updates
PermissionRequires cluster-admin permissionsOperates within namespace boundaries

Configure External Rollout Strategy

Both placement types require setting the rollout strategy to External to enable staged updates:

Cluster-scoped example:

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterResourcePlacement
metadata:
  name: example-placement
spec:
  resourceSelectors:
    - group: ""
      kind: Namespace
      name: test-namespace
      version: v1
  policy:
    placementType: PickAll
  strategy:
    type: External # enables staged updates

Namespace-scoped example:

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ResourcePlacement
metadata:
  name: example-namespace-placement
  namespace: my-app-namespace
spec:
  resourceSelectors:
    - group: "apps"
      kind: Deployment
      name: my-application
      version: v1
  policy:
    placementType: PickAll
  strategy:
    type: External # enables staged updates

Define Staged Update Strategies

Staged update strategies define reusable orchestration patterns that organize target clusters into stages with specific rollout sequences and approval gates. Both scopes use similar configurations but operate at different levels.

Cluster-Scoped Strategy

ClusterStagedUpdateStrategy organizes member clusters into stages:

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateStrategy
metadata:
  name: cluster-config-strategy
spec:
  stages:
    - name: staging
      labelSelector:
        matchLabels:
          environment: staging
      afterStageTasks:
        - type: TimedWait
          waitTime: 1h
    - name: canary
      labelSelector:
        matchLabels:
          environment: canary
      afterStageTasks:
        - type: Approval
    - name: production
      labelSelector:
        matchLabels:
          environment: production
      sortingLabelKey: order
      afterStageTasks:
        - type: Approval
        - type: TimedWait
          waitTime: 1h

Namespace-Scoped Strategy

StagedUpdateStrategy follows the same pattern but operates within namespace boundaries:

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateStrategy
metadata:
  name: app-rollout-strategy
  namespace: my-app-namespace
spec:
  stages:
    - name: dev-clusters
      labelSelector:
        matchLabels:
          environment: development
      afterStageTasks:
        - type: TimedWait
          waitTime: 30m
    - name: production-clusters
      labelSelector:
        matchLabels:
          environment: production
      sortingLabelKey: deployment-order
      afterStageTasks:
        - type: Approval

Stage Configuration

Each stage includes:

  • name: Unique identifier for the stage
  • labelSelector: Selects target clusters for this stage
  • sortingLabelKey (optional): Label whose integer value determines update sequence within the stage
  • afterStageTasks (optional): Tasks that must complete before proceeding to the next stage

After-Stage Tasks

Two task types to control stage progression:

  • TimedWait: Waits for a specified duration before proceeding
  • Approval: Requires manual approval via an approval request object

For approval tasks, the system automatically creates an approval request object named <updateRun-name>-<stage-name>. The approval request type depends on the scope:

  • Cluster-scoped: Creates ClusterApprovalRequest (short name: careq) - a cluster-scoped resource containing a spec with parentStageRollout (the UpdateRun name) and targetStage (the stage name). The spec is immutable after creation.
  • Namespace-scoped: Creates ApprovalRequest (short name: areq) within the same namespace - a namespace-scoped resource with the same spec structure as ClusterApprovalRequest.

Both approval request types use status conditions to track approval state:

  • Approved condition (status True): indicates the request was approved by a user
  • ApprovalAccepted condition (status True): indicates the approval was processed and accepted by the system

Approve manually by setting the Approved condition to True using kubectl patch:

# For cluster-scoped approvals
kubectl patch clusterapprovalrequests example-run-canary --type='merge' \
  -p '{"status":{"conditions":[{"type":"Approved","status":"True","reason":"approved","message":"approved"}]}}' \
  --subresource=status

# For namespace-scoped approvals
kubectl patch approvalrequests app-run-canary -n my-app-namespace --type='merge' \
  -p '{"status":{"conditions":[{"type":"Approved","status":"True","reason":"approved","message":"approved"}]}}' \
  --subresource=status

Trigger Staged Rollouts

UpdateRun resources execute strategies for specific rollouts. Both scopes follow the same pattern with three required parameters:

Cluster-scoped example:

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: ClusterStagedUpdateRun
metadata:
  name: example-run
spec:
  placementName: example-placement              # Target ClusterResourcePlacement
  resourceSnapshotIndex: "0"                    # Resource version to deploy
  stagedRolloutStrategyName: example-strategy   # Strategy to execute

Namespace-scoped example:

apiVersion: placement.kubernetes-fleet.io/v1beta1
kind: StagedUpdateRun
metadata:
  name: app-rollout-v1-2-3
  namespace: my-app-namespace
spec:
  placementName: example-namespace-placement    # Target ResourcePlacement
  resourceSnapshotIndex: "5"                    # Resource version to deploy
  stagedRolloutStrategyName: app-rollout-strategy # Strategy to execute

UpdateRun Execution

UpdateRuns execute in two phases:

  1. Initialization: Captures strategy snapshot, collects target bindings, generates cluster update sequence
  2. Execution: Processes stages sequentially, updates clusters within each stage, enforces after-stage tasks

Important Constraints and Validation

Immutable Fields: Once created, the following UpdateRun spec fields cannot be modified:

  • placementName: Target placement resource name
  • resourceSnapshotIndex: Resource version to deploy
  • stagedRolloutStrategyName: Strategy to execute

Strategy Limits: Each strategy can define a maximum of 31 stages to ensure reasonable execution times.

Monitor UpdateRun Status

UpdateRun status provides detailed information about rollout progress across stages and clusters. The status includes:

  • Overall conditions: Initialization, progression, and completion status
  • Stage status: Progress and timing for each stage
  • Cluster status: Individual cluster update results
  • After-stage task status: Approval and wait task progress

Use kubectl describe to view detailed status:

# Cluster-scoped (can use short name 'csur')
kubectl describe clusterstagedupdaterun example-run
kubectl describe csur example-run

# Namespace-scoped (can use short name 'sur')
kubectl describe stagedupdaterun app-rollout-v1-2-3 -n my-app-namespace
kubectl describe sur app-rollout-v1-2-3 -n my-app-namespace

UpdateRun and Placement Relationship

UpdateRuns and Placements work together to orchestrate rollouts, with each serving a distinct purpose:

The Trigger Mechanism

UpdateRuns serve as the trigger that initiates rollouts for their respective placement resources:

  • Cluster-scoped: ClusterStagedUpdateRun triggers ClusterResourcePlacement
  • Namespace-scoped: StagedUpdateRun triggers ResourcePlacement

Before an UpdateRun is Created

When you create a Placement with strategy.type: External, the system schedules which clusters should receive the resources, but does not deploy anything yet. The Placement remains in a scheduled but not available state, waiting for an UpdateRun to trigger the actual rollout.

After an UpdateRun is Created

Once you create an UpdateRun, the system begins executing the staged rollout. Both resources provide status information, but at different levels of detail:

UpdateRun Status - High-level rollout orchestration:

  • Which stage is currently executing
  • Which clusters have started/completed updates
  • Whether after-stage tasks (approvals, waits) are complete
  • Overall rollout progression through stages

Placement Status - Detailed deployment information for each cluster:

  • Success or failure of individual resource creation (e.g., did the Deployment create successfully?)
  • Whether overrides were applied correctly
  • Specific error messages if resources failed to deploy
  • Detailed conditions for troubleshooting

This separation of concerns allows you to:

  1. Monitor high-level rollout progress and stage execution through the UpdateRun
  2. Drill down into specific deployment issues on individual clusters through the Placement
  3. Understand whether a problem is with the staged rollout orchestration or with resource deployment itself

Concurrent UpdateRuns

Multiple UpdateRuns can execute concurrently for the same placement with one constraint: all concurrent runs must use identical strategy configurations to ensure consistent behavior.

Next Steps