🏛️ SAP-C02 Domain Report: Compute & Load Balancing
🧠 Domain Summary & Key Takeaways
The Compute and Load Balancing domain on the SAP-C02 exam does not just test what services do; it tests how they break under pressure and how to fix them using strict constraints (e.g., "least operational overhead," "no application code changes").
Here are the master-level rules you must memorize for the exam based on our session:
1. Advanced Load Balancing & Edge Routing
- Application Load Balancer (ALB): HTTP/HTTPS/gRPC only. Trap: ALBs cannot perform transparent URL rewrites, only HTTP redirects. If you need transparent rewrites, use CloudFront Functions, Lambda@Edge, or API Gateway. Use the Least Outstanding Requests (LOR) algorithm if backend targets process requests at vastly different speeds or are prone to compute exhaustion.
- Network Load Balancer (NLB): TCP/UDP. Used for extreme performance and static IP requirements. Trap: To preserve the true client IP when fronting an NLB with Global Accelerator, the NLB must be configured with Security Groups.
- Gateway Load Balancer (GWLB): Uses the GENEVE protocol (port 6081) for transparent 3rd-party firewall insertion. Trap: GWLB does not fail over across Availability Zones by default; you must manually enable Cross-Zone Load Balancing. Do not confuse this with Appliance Mode, which is an AWS Transit Gateway feature to prevent asymmetric routing.
- Global Accelerator vs. CloudFront: Use Global Accelerator for non-HTTP traffic (UDP/TCP), static Anycast IPs, and sub-second deterministic failover immune to client-side DNS caching. Use CloudFront when the workload is HTTP/HTTPS and benefits from edge caching.
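To make the Least Outstanding Requests rule concrete, here is a minimal Python sketch that compares it with round robin when one target is much slower than the others. The target names, service times, and one-request-per-tick arrival pattern are invented for illustration; this models the selection rule only, not the ALB's internal implementation.

```python
from itertools import cycle


def make_round_robin(targets):
    """Picker that ignores load and simply rotates through the targets."""
    order = cycle(targets)
    return lambda counts: next(order)


def least_outstanding(counts):
    """Picker that chooses the target with the fewest in-flight requests."""
    return min(counts, key=counts.get)


def simulate(service_ticks, n_requests, picker):
    """One request arrives per tick; return the peak in-flight count per target."""
    remaining = {t: [] for t in service_ticks}  # ticks left for each in-flight request
    peak = {t: 0 for t in service_ticks}
    for _ in range(n_requests):
        # Advance the clock by one tick and drop requests that have finished.
        for t in remaining:
            remaining[t] = [r - 1 for r in remaining[t] if r > 1]
        counts = {t: len(rs) for t, rs in remaining.items()}
        target = picker(counts)
        remaining[target].append(service_ticks[target])
        peak[target] = max(peak[target], len(remaining[target]))
    return peak


if __name__ == "__main__":
    ticks = {"slow": 9, "fast-1": 1, "fast-2": 1}  # hypothetical service times
    print("round robin:", simulate(ticks, 12, make_round_robin(list(ticks))))
    print("least outstanding:", simulate(ticks, 12, least_outstanding))
```

In this toy run, round robin keeps feeding the slow target until requests pile up on it, while the least-outstanding picker only returns to it once its previous request has completed.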
2. Container & Serverless Scaling Constraints
- Long-Running Tasks: Target Group Deregistration Delay (connection draining) has a hard limit of 1 hour (3,600 seconds). For game servers or workers running longer than an hour, you must use Amazon ECS Task Scale-In Protection via the local container agent to prevent Auto Scaling from terminating the task.
- Serverless Database Exhaustion: Relational databases (RDS/Aurora) cannot handle tens of thousands of simultaneous connections from massive Lambda scale-outs. You must insert Amazon RDS Proxy as a connection pooler.
- NAT Gateway Port Exhaustion: A single NAT Gateway supports a maximum of 55,000 concurrent connections to a single unique destination. If thousands of Lambda functions are calling a single 3rd-party API endpoint, you will hit ErrorPortAllocation. The solution is to attach additional Elastic IPs to the NAT Gateway (up to 8 IP addresses in total, each adding another 55,000 connections to that destination).
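The NAT Gateway arithmetic is worth working through once. The sketch below assumes the figures quoted above (55,000 concurrent connections per IP address to one unique destination) and a ceiling of 8 addresses per gateway; the function names are hypothetical helpers, not an AWS API.

```python
import math

PORTS_PER_IP = 55_000  # concurrent connections per NAT IP to one unique destination
MAX_IPS = 8            # assumed ceiling: primary address plus secondary addresses


def nat_capacity(ip_count):
    """Concurrent connections to a single destination for a given number of IPs."""
    if not 1 <= ip_count <= MAX_IPS:
        raise ValueError(f"ip_count must be between 1 and {MAX_IPS}")
    return ip_count * PORTS_PER_IP


def ips_needed(peak_connections):
    """Smallest number of IPs that covers the expected peak to one destination."""
    needed = max(1, math.ceil(peak_connections / PORTS_PER_IP))
    if needed > MAX_IPS:
        raise ValueError("peak exceeds what a single NAT Gateway can offer")
    return needed


if __name__ == "__main__":
    print(nat_capacity(3))      # one primary IP plus two additional Elastic IPs
    print(ips_needed(120_000))  # hypothetical flash-sale peak
```

With two additional Elastic IPs the gateway has three addresses, so the pool to a single destination grows from 55,000 to 165,000 connections.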
📝 Mock Exam Archive: Scenarios & Explanations
Below are the 5 brutal, SAP-level scenarios we went through, including the exact reasoning for why the correct answers work and why the distractors fail.
Scenario 1: Global Payment Gateway Modernization
The Challenge: Route custom TCP payloads (port 8200) to Fargate containers in two regions (us-east-1, eu-west-1). You need static IPs for B2B whitelisting, true client IP logging, and sub-second failover immune to DNS caching.
- A) ALB + Global Accelerator + X-Forwarded-For
- B) NLB + Route 53 Latency Routing + Proxy Protocol v2
- C) NLB with Security Groups + Global Accelerator with Client IP Preservation (Correct)
- D) Transit Gateway + PrivateLink + Global Accelerator
Coach's Breakdown:
- Why C is correct: Global Accelerator provides static Anycast IPs, and its failover does not depend on DNS, so client-side DNS caching cannot delay it. Because traffic is custom TCP, we must use NLB. Global Accelerator supports Client IP Preservation for NLB endpoints, but only if the NLB uses Security Groups.
- Traps: ALB (Option A) only handles HTTP/HTTPS/gRPC, not custom TCP payloads. Route 53 (Option B) relies on DNS TTLs, which clients often ignore, breaking the failover constraint.
Scenario 2: AI Inference API Scaling & Consolidation
The Challenge: A REST API on ECS crashes if a container receives more than 3 concurrent requests. You also need to transparently rewrite incoming URLs from /legacy/v2/predict to /predict without using HTTP redirects. You cannot rewrite the app to an SQS queue.
- A) API Gateway with Path Rewriting + Usage Plan (Rate Limit 3)
- B) ALB Target Optimizer + Concurrency Agent + ALB URL Rewrite action
- C) CloudFront Function for URI Rewrite + ALB Least Outstanding Requests algorithm + Target Tracking on RequestCountPerTarget (Correct)
- D) SQS Queue + Lambda URL Transformer + ECS pulling from SQS
Coach's Breakdown:
- Why C is correct: CloudFront Functions provide lightweight, transparent URL rewrites before the ALB. Changing the ALB algorithm to Least Outstanding Requests steers new requests away from busy containers, which keeps per-container concurrency low enough to avoid the 3-request crash, while Target Tracking on RequestCountPerTarget adds capacity as load grows.
- Traps: ALBs cannot transparently rewrite URLs (Option B is fake). API Gateway Usage Plans (Option A) rate-limit the entire API, not individual backend targets.
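The difference between a transparent rewrite and a redirect can be sketched in a few lines. Real CloudFront Functions are written in JavaScript; the Python below only models the logic, and the request/response dictionaries are simplified stand-ins rather than the actual CloudFront event shape.

```python
LEGACY_PREFIX = "/legacy/v2"


def is_legacy(uri):
    """True for the legacy prefix itself or any path underneath it."""
    return uri == LEGACY_PREFIX or uri.startswith(LEGACY_PREFIX + "/")


def rewrite_uri(request):
    """Transparent rewrite: the same request continues to the origin with a new path."""
    if is_legacy(request["uri"]):
        stripped = request["uri"][len(LEGACY_PREFIX):] or "/"
        return {**request, "uri": stripped}
    return request


def redirect_response(request):
    """What a redirect does instead: the client receives a 301 and must call again."""
    if is_legacy(request["uri"]):
        target = request["uri"][len(LEGACY_PREFIX):] or "/"
        return {"statusCode": 301, "headers": {"location": target}}
    return None


if __name__ == "__main__":
    req = {"method": "POST", "uri": "/legacy/v2/predict"}
    print(rewrite_uri(req))        # forwarded onward, client unaware
    print(redirect_response(req))  # sent back to the client
```

The rewrite returns a request, so the client never sees the new path; the redirect returns a response, which is exactly what the scenario forbids.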
Scenario 3: The Serverless Flash Sale Meltdown
The Challenge: A massive spike in Lambda functions in a private VPC is causing two errors: NAT Gateway ErrorPortAllocation (hitting a 3rd party API) and Aurora PostgreSQL TooManyConnections (hitting the database).
- A) Add 3 NAT Gateways + update Route Table 0.0.0.0/0 to load balance + increase Aurora max_connections to 100,000
- B) Attach 2 additional Elastic IPs to the existing NAT Gateway + Deploy Amazon RDS Proxy (Correct)
- C) Move Lambda out of VPC + Deploy ElastiCache Redis
- D) Deploy Global Accelerator for outbound traffic + Aurora Serverless v2
Coach's Breakdown:
- Why B is correct: A NAT Gateway is limited to 55,000 connections to a single destination. Adding EIPs expands this port pool linearly. RDS Proxy is the required AWS-native connection pooler to protect relational databases from Serverless compute exhaustion.
- Traps: Aurora cannot handle 100,000 native connections, and VPC route tables cannot load balance 0.0.0.0/0 across multiple NAT Gateways (Option A). Global Accelerator (Option D) is for inbound traffic, not outbound.
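Why a pooler helps can be shown with a toy model. The class below is a hypothetical stand-in for a pooling proxy, not RDS Proxy's actual behavior: invocations borrow a database connection for one short transaction and hand it back, so the database only ever sees as many connections as are in use at the same moment.

```python
class ConnectionPool:
    """Toy pooling proxy: callers borrow and return database connections."""

    def __init__(self, max_db_connections):
        self.max_db_connections = max_db_connections
        self.idle = []   # connections opened earlier and currently unused
        self.opened = 0  # total connections ever opened against the database

    def acquire(self):
        if self.idle:
            return self.idle.pop()
        if self.opened >= self.max_db_connections:
            raise RuntimeError("pool exhausted (a real proxy would queue the caller)")
        self.opened += 1
        return f"conn-{self.opened}"

    def release(self, conn):
        self.idle.append(conn)


def run_invocations(pool, total, concurrent):
    """Run `total` short transactions, at most `concurrent` at a time."""
    done = 0
    while done < total:
        wave = min(concurrent, total - done)
        borrowed = [pool.acquire() for _ in range(wave)]
        for conn in borrowed:
            pool.release(conn)
        done += wave
    return pool.opened


if __name__ == "__main__":
    pool = ConnectionPool(max_db_connections=100)
    print(run_invocations(pool, total=10_000, concurrent=50))
```

In this model 10,000 invocations are served by the 50 connections opened during the first wave; without the pool, each invocation that opened its own connection would count against the database limit.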
Scenario 4: The "Bump-in-the-Wire" Blackhole
The Challenge: An Ingress VPC uses a Gateway Load Balancer (GWLB) to route internet traffic to 3rd-party firewalls across two AZs. When firewalls in AZ-A fail, the GWLB endpoint drops the traffic instead of routing it to healthy firewalls in AZ-B.
- A) Replace GWLB with NLB + Cross-Zone Load Balancing
- B) Enable Appliance Mode on the Ingress VPC
- C) Modify Gateway Load Balancer attributes to enable cross-zone load balancing (Correct)
- D) EventBridge + Lambda to dynamically update Route Tables
Coach's Breakdown:
- Why C is correct: GWLB does not route traffic across Availability Zones by default to save on latency and data transfer costs. You must explicitly enable Cross-Zone Load Balancing on the GWLB attributes.
- Traps: Appliance Mode (Option B) is a valid AWS concept, but it belongs to Transit Gateway, not Ingress VPCs. NLBs (Option A) do not support the GENEVE protocol required for transparent firewall insertion.
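For reference, the attribute change in Option C maps to a single Elastic Load Balancing v2 API call. The sketch below takes the client as a parameter so it can be exercised offline; `RecordingClient` is a made-up stand-in for `boto3.client("elbv2")`, and the ARN is a placeholder, not a real resource.

```python
CROSS_ZONE_KEY = "load_balancing.cross_zone.enabled"


def enable_cross_zone(elbv2_client, load_balancer_arn):
    """Turn on cross-zone load balancing for the given load balancer."""
    return elbv2_client.modify_load_balancer_attributes(
        LoadBalancerArn=load_balancer_arn,
        Attributes=[{"Key": CROSS_ZONE_KEY, "Value": "true"}],
    )


class RecordingClient:
    """Offline stand-in for an elbv2 client: records the call instead of sending it."""

    def __init__(self):
        self.calls = []

    def modify_load_balancer_attributes(self, **kwargs):
        self.calls.append(kwargs)
        return {"Attributes": kwargs["Attributes"]}


if __name__ == "__main__":
    client = RecordingClient()
    enable_cross_zone(client, "arn:example:placeholder-gwlb")  # placeholder ARN
    print(client.calls[0])
```

Against a real account you would pass `boto3.client("elbv2")` and the GWLB's actual ARN instead of the recording stand-in.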
Scenario 5: The Global Multiplayer Backend
The Challenge: A global gaming backend uses UDP. Matches last up to 2.5 hours. You need static global entry IPs, true client IP visibility, and you must guarantee ECS Service Auto Scaling does not terminate containers hosting active matches.
- A) Global Accelerator + NLB + Target Group Deregistration Delay set to 9,000 seconds
- B) API Gateway WebSockets + Route 53 Geolocation + EC2 Scale-In Protection
- C) CloudFront UDP Origin + NLB Proxy Protocol v2 + ECS UpdateTaskProtection
- D) Global Accelerator + NLB with Security Groups + ECS Task Scale-In Protection (Correct)
Coach's Breakdown:
- Why D is correct: Global Accelerator is required for global UDP routing. NLB Security Groups are required for GA Client IP preservation. ECS Task Scale-In Protection allows the container to locally signal AWS to block termination until the match concludes.
- Traps: Target Group Deregistration Delay has a hard limit of 3,600 seconds / 1 hour (Option A). API Gateway and CloudFront do not support raw UDP routing (Options B and C).
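The "locally signal AWS" step can be sketched as a call to the ECS container agent's task protection endpoint, which the container reaches through the `ECS_AGENT_URI` environment variable. This is a minimal sketch: the helper names are hypothetical, the 180-minute expiry is an assumed value chosen to cover a 2.5-hour match (to my understanding the default expiry is 2 hours, which would be too short here), and error handling is omitted.

```python
import json
import os
import urllib.request

MAX_EXPIRY_MINUTES = 2880  # assumed upper bound (48 hours)


def build_protection_request(agent_uri, enabled, expires_in_minutes=None):
    """Build the PUT request that sets or clears scale-in protection for this task."""
    body = {"ProtectionEnabled": enabled}
    if enabled and expires_in_minutes is not None:
        if not 1 <= expires_in_minutes <= MAX_EXPIRY_MINUTES:
            raise ValueError("expires_in_minutes out of range")
        body["ExpiresInMinutes"] = expires_in_minutes
    return urllib.request.Request(
        url=f"{agent_uri}/task-protection/v1/state",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def set_protection(enabled, expires_in_minutes=None):
    """Send the request from inside the task; only works where ECS_AGENT_URI is set."""
    request = build_protection_request(
        os.environ["ECS_AGENT_URI"], enabled, expires_in_minutes
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


# Typical flow inside a game server process (not executed here):
#   set_protection(True, expires_in_minutes=180)   # match starts
#   ...host the match...
#   set_protection(False)                          # match ends, task may scale in
```

Clearing protection as soon as the match ends matters as much as setting it, since a task that stays protected cannot be removed during scale-in until the expiry passes.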