Table of Contents
Zero-downtime deployment ensures your service remains available during updates. This guide covers proven strategies and implementation patterns.
Deployment Strategies
Three main approaches exist for zero-downtime deployments:
- Blue-Green: Run two identical environments, switch traffic instantly
- Rolling: Gradually replace instances one by one
- Canary: Route small percentage to new version first
Blue-Green Deployments
Maintain two production environments. Deploy to inactive one, then switch:
# Deploy to green environment
kubectl apply -f deployment-green.yaml
# Verify green is healthy
kubectl get pods -l env=green
# Switch traffic
kubectl patch service myapp -p '{"spec":{"selector":{"env":"green"}}}'
# Monitor for issues
# If problems, switch back instantly
Advantages
- Instant rollback capability
- Test production environment before switching
- No mixed version state
Disadvantages
- Requires 2x infrastructure
- Database migrations need special handling
- Stateful applications are complex
Rolling Updates
Replace instances gradually to maintain availability:
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
replicas: 6
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 2 # Allow 2 extra pods during update
maxUnavailable: 1 # At most 1 pod down at a time
The update process:
- Create new pod with updated version
- Wait for health check to pass
- Terminate old pod
- Repeat until all pods updated
Health Checks
Critical for zero-downtime deployments. Implement both readiness and liveness probes:
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 10
periodSeconds: 5
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 3
Your health endpoint should verify:
- Database connectivity
- Required dependencies availability
- Critical background workers running
Rollback Procedures
Always have a rollback plan. With Kubernetes:
# View deployment history
kubectl rollout history deployment/myapp
# Rollback to previous version
kubectl rollout undo deployment/myapp
# Rollback to specific revision
kubectl rollout undo deployment/myapp --to-revision=3
Automated Rollback
Implement automatic rollback on error rate threshold:
if error_rate > 5% for 2 minutes:
trigger rollback
alert oncall team
create incident
Summary
Zero-downtime deployment is achievable with proper planning. Choose the strategy that fits your constraints, implement comprehensive health checks, and always have a tested rollback procedure. Start with rolling updates for stateless services and consider blue-green for critical systems.