Well-Architected Framework
Deploy applications with zero downtime
Deploying updates directly to production causes downtime and service disruptions. Traditional deployment approaches take applications offline during updates, causing revenue loss and a poor user experience. Deploy applications using zero-downtime strategies such as blue/green, canary, rolling, or a combination of these to maintain availability during updates, reduce deployment risk, and enable safe rollback.
Your deployment strategy depends on your infrastructure, such as virtual machines or containers, and orchestration tools, like Nomad or Kubernetes. Use load balancers and orchestrators to gradually shift traffic, test changes with production load, and roll back instantly when issues occur.
Why deploy applications with zero downtime
Deploying applications with zero-downtime strategies addresses the following operational challenges:
Reduce service disruptions and revenue loss: Application downtime during deployments causes lost revenue, frustrated users, and a damaged reputation. Zero-downtime deployments maintain service availability throughout updates, so users experience no interruptions.
Reduce deployment risk with gradual rollouts: Deploying application changes to all users simultaneously creates high risk. If issues occur, all users are affected. Canary and rolling deployments gradually shift traffic, limiting the blast radius and allowing you to catch issues before full rollout.
Enable instant rollback capabilities: When application updates cause bugs or performance issues, traditional deployments require time-consuming rollback procedures. Blue/green deployments maintain the previous version running, allowing traffic switching back to the working version.
Test changes with production traffic: Canary deployments let you test changes with real production traffic on a small user subset, validating performance and functionality before full deployment.
Choose a deployment strategy
Select your deployment strategy based on application requirements, infrastructure constraints, and risk tolerance.
Use the following criteria to choose a deployment strategy:
Use blue/green deployments when you need:
- Instant rollback capability for critical applications
- Complete validation before switching traffic
- Ability to maintain two full environments simultaneously
- Predictable cutover timing
Use canary deployments when you need:
- Risk reduction for high-impact changes
- Gradual validation with real production traffic
- Early detection of issues before full rollout
- Ability to test with a subset of users first
Use rolling deployments when you need:
- Resource efficiency with minimal overhead
- Gradual replacement without double infrastructure costs
- Continuous availability during updates
- Automated orchestration with Kubernetes or Nomad
Combine strategies for comprehensive safety. For example, you can use blue/green deployment with canary testing: deploy to the green environment, route 10% of traffic to it for canary validation, then switch all traffic if the canary succeeds.
Deploy applications on virtual machines with load balancers
Blue/green and canary deployments work well for applications on virtual machines. Load balancers and reverse proxies manage traffic between blue and green environments, enabling you to direct a subset of users for canary testing and control traffic for rolling deployments.
Load balancers route traffic between application environments during updates, supporting blue/green deployments and canary releases. They allow you to gradually shift users to new versions while maintaining the ability to roll back if issues occur. By continuously monitoring application health and automatically routing around failed instances, load balancers increase service availability throughout the deployment process.
Regardless of your cloud provider, you can use Terraform to manage load balancers and proxies. Using Terraform for infrastructure as code allows you to version control your load balancer configurations alongside your application code, ensuring that changes are tracked, reviewed, and rolled back if needed. You can define target groups, health check parameters, routing rules, and SSL certificates declaratively, then apply these configurations automatically as part of your CI/CD pipeline.
The following example shows Terraform configuration for canary deployment using AWS Application Load Balancer with weighted target groups:
# Create target group for stable (blue) version
resource "aws_lb_target_group" "blue" {
  name     = "app-blue"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    enabled           = true
    healthy_threshold = 2
    interval          = 30
    path              = "/health"
    timeout           = 5
  }
}

# Create target group for new (green) version
resource "aws_lb_target_group" "green" {
  name     = "app-green"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    enabled           = true
    healthy_threshold = 2
    interval          = 30
    path              = "/health"
    timeout           = 5
  }
}

# Configure listener with weighted traffic distribution
resource "aws_lb_listener" "app" {
  load_balancer_arn = aws_lb.main.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type = "forward"

    forward {
      target_group {
        arn    = aws_lb_target_group.blue.arn
        weight = 90 # 90% of traffic to stable version
      }

      target_group {
        arn    = aws_lb_target_group.green.arn
        weight = 10 # 10% of traffic to canary version
      }

      stickiness {
        enabled  = false
        duration = 600
      }
    }
  }
}
The Terraform configuration creates two target groups and distributes traffic with a 90/10 split for canary testing. To gradually shift traffic, update the weight values (for example, to 50/50, then 0/100) and run terraform apply. The load balancer immediately adjusts traffic distribution without downtime.
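One way to make the traffic shift repeatable is to drive the weights from an input variable, so each stage of the canary rollout is a single terraform apply with a different value. The following is an illustrative sketch, not part of the tutorial configuration; the variable name is an assumption:

```hcl
# Hypothetical variable that controls the canary traffic split.
variable "green_weight" {
  description = "Percentage of traffic routed to the green (canary) target group"
  type        = number
  default     = 10

  validation {
    condition     = var.green_weight >= 0 && var.green_weight <= 100
    error_message = "green_weight must be between 0 and 100."
  }
}

# Reference the variable in the listener so one apply shifts traffic.
resource "aws_lb_listener" "app" {
  load_balancer_arn = aws_lb.main.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type = "forward"

    forward {
      target_group {
        arn    = aws_lb_target_group.blue.arn
        weight = 100 - var.green_weight
      }

      target_group {
        arn    = aws_lb_target_group.green.arn
        weight = var.green_weight
      }
    }
  }
}
```

You can then advance the rollout from the command line, for example with terraform apply -var="green_weight=50" and finally terraform apply -var="green_weight=100", without editing the configuration between stages.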
Canary testing workflow
After the green environment is ready, the load balancer sends a small fraction of traffic to the green environment (in this example, 10%).

If the canary test succeeds without errors, you can incrementally direct traffic to the green environment over time. In the end state, you redirect all traffic to the green environment. After verifying the new deployment, you can destroy the old blue environment. The green environment is now your current production service.

To learn how to implement canary deployments with AWS Application Load Balancers, follow the blue-green and canary deployments tutorial.
Deploy containerized applications with orchestration tools
Containers support rolling, blue/green, and canary deployments through orchestration tools like Nomad and Kubernetes. Orchestrators automate the deployment process, manage health checks, and handle traffic routing during updates.
The following deployment strategies lower downtime risk:
- Blue/green deployments: Provide instant rollback capability by maintaining two identical environments and switching traffic between them, ensuring zero downtime but requiring double the resources.
- Rolling deployments: Gradually replace instances one by one, minimizing resource usage while maintaining availability, making them efficient for resource-constrained environments.
- Canary deployments: Mitigate risk by releasing to a small subset of users first, allowing you to validate changes and catch issues before full rollout.
Rolling deployments with Nomad
Nomad supports rolling updates as a first-class feature. Use the update block to control how Nomad replaces old allocations with new ones during deployment.
The following example shows a Nomad job specification with rolling update configuration:
job "web-app" {
  datacenters = ["dc1"]
  type        = "service"

  update {
    max_parallel      = 2        # Update 2 instances at a time
    health_check      = "checks" # Wait for health checks to pass
    min_healthy_time  = "10s"    # Minimum time to be healthy
    healthy_deadline  = "5m"     # Maximum time to become healthy
    progress_deadline = "10m"    # Overall deployment timeout
    auto_revert       = true     # Automatically revert on failure
    canary            = 2        # Deploy 2 canary instances first
  }

  group "web" {
    count = 6 # Total 6 instances

    network {
      port "http" {
        to = 8080
      }
    }

    service {
      name = "web-app"
      port = "http"

      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "app" {
      driver = "docker"

      config {
        image = "myregistry/myapp:1.0.0"
        ports = ["http"]
      }
    }
  }
}
The Nomad job specification deploys 6 instances with rolling updates. Nomad first deploys 2 canary instances and waits for their health checks to remain passing for at least 10 seconds. Because the job does not set auto_promote, you must promote the deployment once the canaries are healthy; Nomad then progressively updates the remaining instances 2 at a time. If any instance fails its health checks, Nomad automatically reverts to the previous version. The progress_deadline ensures the entire deployment completes within 10 minutes or fails.
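If you prefer Nomad to promote healthy canaries automatically instead of waiting for a manual nomad deployment promote, the update block supports an auto_promote attribute. The following variant sketches that option for the same job:

```hcl
update {
  max_parallel      = 2
  health_check      = "checks"
  min_healthy_time  = "10s"
  healthy_deadline  = "5m"
  progress_deadline = "10m"
  auto_revert       = true
  canary            = 2
  auto_promote      = true # Promote canaries automatically once they pass health checks
}
```

With auto_promote enabled, the rolling replacement of the remaining instances begins as soon as both canaries report healthy, which suits automated pipelines; leave it disabled when you want a human to inspect canary metrics before promotion.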
To learn how to implement rolling and canary deployments with Nomad, follow the Nomad job updates tutorials.
Rolling deployments with Kubernetes
Kubernetes uses rolling updates by default. Kubernetes incrementally replaces current pods with new ones, scheduling new pods on nodes with available resources and waiting for them to become ready before removing old pods.
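If you manage Kubernetes resources with Terraform, you can express the rolling update strategy declaratively through the Kubernetes provider. The following is a minimal sketch, assuming the provider is already configured against your cluster; the names, image, and probe values are illustrative:

```hcl
resource "kubernetes_deployment_v1" "web" {
  metadata {
    name = "web-app"
  }

  spec {
    replicas = 6

    strategy {
      type = "RollingUpdate"

      rolling_update {
        max_surge       = "1" # Allow one extra pod during the update
        max_unavailable = "0" # Never drop below the desired replica count
      }
    }

    selector {
      match_labels = {
        app = "web-app"
      }
    }

    template {
      metadata {
        labels = {
          app = "web-app"
        }
      }

      spec {
        container {
          name  = "app"
          image = "myregistry/myapp:1.0.0"

          # Kubernetes only removes an old pod after the new one reports ready.
          readiness_probe {
            http_get {
              path = "/health"
              port = 8080
            }

            period_seconds  = 10
            timeout_seconds = 2
          }
        }
      }
    }
  }
}
```

Updating the image attribute and running terraform apply triggers the same incremental pod replacement that kubectl would perform, with the readiness probe gating each step.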
Both Nomad and Kubernetes support blue/green deployments. Before sending all traffic to your new deployment, use canary testing to validate that the new version works correctly with production traffic.
HashiCorp resources:
- Learn about zero-downtime deployment strategies overview
- Deploy blue/green infrastructure for zero downtime
- Deploy with traffic splitting using service mesh
- Implement atomic deployments for infrastructure
- Learn how to package applications for deployment
- Implement automated testing before deploying
- Use infrastructure as code to manage deployments
Terraform load balancer resources:
- Learn to use Application Load Balancers for blue-green and canary deployments
- Use AWS Load Balancer target groups for traffic routing
- Use AWS Load Balancer listener for traffic distribution
- Configure AWS Load Balancer listener rules for advanced routing
- Read the Terraform AWS provider documentation for additional resources
Nomad deployment resources:
- Nomad blue/green and canary deployments - Complete tutorial on Nomad deployment strategies
- Nomad rolling updates - Tutorial on rolling deployments
- Nomad update block reference - Configure rolling upgrades and canary deployments
- Nomad job updates tutorials - All Nomad deployment tutorials
- Read the Nomad documentation for comprehensive feature guide
- Learn about Nomad job specifications for application deployments
- Use the Nomad Terraform provider to manage jobs as code
Next steps
In this section of Zero-downtime deployments, you learned how to deploy application changes using blue/green, canary, and rolling strategies with load balancers and orchestrators. Zero-downtime deployments are part of the Define and automate processes pillar.
Refer to the following documents to learn more about deployment strategies: