
API Management, Edge Acceleration, and CDN

6,100 words · ≈ 31 min read

What Is API Gateway and Edge Architecture on AWS

API Gateway and edge architecture on AWS is the layer that sits between your end users and your backend compute, databases, and storage. For the SAA-C03 exam, API Gateway and edge means three primary services working together: Amazon API Gateway for publishing and managing HTTP, REST, and WebSocket APIs; Amazon CloudFront for global content delivery and HTTP-level edge acceleration; and AWS Global Accelerator for TCP/UDP anycast acceleration over the AWS backbone. Supporting services that appear in API Gateway and edge scenarios include Lambda@Edge, CloudFront Functions, Route 53 routing policies, AWS WAF, and AWS Shield.

The API Gateway and edge topic lives in SAA-C03 Task 2.1 "Design scalable and loosely coupled architectures," but it crosses into Task 3.4 (high-performing network) and Task 4.4 (cost-optimized network). You will see API Gateway and edge questions framed as scenario trade-offs: a global SaaS company needs low-latency static asset delivery (CloudFront), a multiplayer game needs static IPs for corporate firewall allowlists (Global Accelerator), a startup wants a cheap HTTP facade in front of Lambda (HTTP API), and an enterprise needs per-customer rate limits with monetized API keys (REST API with usage plans). Mastering API Gateway and edge means knowing which front door to pick for each story.

This SAA-C03 topic differs from the foundational CLF-C02 network-services topic in two ways. First, the API Gateway and edge exam bar is prescriptive: you must choose between REST API and HTTP API given cost and feature trade-offs, between CloudFront and Global Accelerator given protocol and caching needs, and between Lambda@Edge and CloudFront Functions given latency and language constraints. Second, the API Gateway and edge topic assumes you already know VPC, IAM, Lambda, and ALB — those are inputs to the design decisions, not the decisions themselves. With that scope set, let's dive into the SAA-C03 API Gateway and edge knowledge you will be tested on.

Analogy 1 — The Shopping Mall Reception Desk (API Gateway)

Picture a large shopping mall. API Gateway is the reception desk at the main entrance. Every visitor (HTTP request) must pass through reception before going deeper into the mall. Reception does five things at once that map directly onto API Gateway features: they check your ID badge (authentication via IAM, Cognito, or Lambda Authorizer); they check which shops you are allowed to visit (authorization); they enforce the crowd-control rope that limits how many people enter per minute (throttling); they hand out pre-printed flyers for popular FAQs instead of phoning each shop (caching); and they stamp a ticket with your visit number for auditing (request/response logging). Different customer tiers get different wristbands at reception — free tier, silver, gold — and reception tracks how many visits each tier has used this month (usage plans with API keys). Without the reception desk, every visitor would walk straight into every shop, and each shop would have to implement its own ID check, crowd control, and accounting. With API Gateway as reception, the shops (your Lambda functions, ECS services, HTTP backends) focus on selling goods.

Analogy 2 — The Chain of 7-Eleven Stores (CloudFront)

Amazon CloudFront is the chain of 7-Eleven convenience stores that dot every neighborhood. Instead of every customer driving to the central Amazon warehouse (your origin: S3, ALB, or custom HTTP server) for a cold drink, they walk to the nearest 7-Eleven (the CloudFront edge location) and grab it there. Each 7-Eleven keeps a small inventory of popular items (the cache), and only when the local store is out of stock does the clerk phone the central warehouse to restock (cache miss, origin fetch). CloudFront distributions are the franchise network — you tell CloudFront "here is my central warehouse address, please open stores worldwide pointing to it." Cache behaviors are the per-aisle restocking policies ("keep drinks cold for 24 hours; keep newspapers for 1 hour"). Signed URLs are numbered ticket stubs — only the customer holding stub #4782 can claim the pre-paid item at the counter. Lambda@Edge is the on-site store manager who can rewrite your shopping receipt before handing it back ("add a coupon code based on the customer's location"). And Origin Access Control (OAC) is the strict rule that the warehouse only accepts orders from the franchise stores — walk-in customers at the warehouse gate are turned away.

Analogy 3 — The Corporate Toll-Free Number (Global Accelerator)

AWS Global Accelerator is the corporate 1-800 toll-free number printed on every business card. No matter where you dial from — Taipei, Tokyo, Tallinn — the phone company routes your call onto the corporate private telephone network (the AWS backbone) at the nearest exchange, then delivers it fast and consistently to the right internal extension (your ALB, NLB, or EC2 endpoint in a specific AWS Region). Two crucial properties fall out of this analogy. First, the toll-free number never changes — if the corporate HQ moves from Seattle to Dublin, the number on the business card stays the same (static anycast IPs survive Region and endpoint changes). Second, the phone company's private network is faster and more reliable than making the caller navigate the public telephone system, dropping calls and hopping between carriers (Global Accelerator uses the AWS backbone, not the public internet). You can also dial down the traffic volume to an extension while you do maintenance (traffic dials, 0–100%) or weight calls toward senior operators (endpoint weights).

Analogy 4 — The Door Access Card System (Lambda Authorizer)

A Lambda Authorizer is the electronic door access card system for an office building with many tenants. When you tap your card at the front door, a little computer (the Lambda function) looks up your badge number, confirms you work for Tenant X, and decides "yes, let them through to floors 3–5, but not to floor 7." API Gateway caches that decision for the next 5 minutes so the same card swipe does not round-trip to the database every time — that is the authorizer result cache TTL. If the card is expired or revoked, the system says "deny" and the door stays locked. This analogy is why Lambda Authorizers are perfect for OAuth/JWT validation, custom HMAC signatures, or legacy corporate SSO systems that do not fit into IAM or Cognito.

With reception desks, 7-Elevens, corporate toll-free numbers, and door access cards as your four mental hooks, API Gateway and edge questions on SAA-C03 become pattern-matching exercises.

Amazon API Gateway — The Managed API Front Door

Amazon API Gateway is a fully managed service that lets you publish, secure, monitor, and monetize HTTP, REST, and WebSocket APIs at any scale. For SAA-C03 API Gateway and edge scenarios, treat API Gateway as the API facade in front of Lambda, ECS/EKS services, ALB/NLB, or any public HTTP backend. API Gateway handles TLS, authentication, throttling, caching, request/response transformation, and CloudWatch metrics out of the box, so your backend Lambda or container only has to run business logic.

REST API vs HTTP API vs WebSocket API — The Three Flavors

API Gateway offers three API types, and SAA-C03 routinely asks you to pick between them:

  • REST API (v1) — the full-featured flagship. Supports API keys, usage plans, request validation, WAF integration, private APIs inside a VPC, request/response transformations (mapping templates), edge-optimized endpoints (built-in CloudFront), and all authorizer types. Highest latency of the three and the most expensive per million calls.
  • HTTP API (v2) — a leaner, faster, cheaper successor launched in 2020. Up to 70% cheaper and roughly 60% lower latency than REST API. Supports JWT authorizers natively (no Lambda needed for OIDC), Lambda proxy and HTTP proxy integrations, and CORS. Drops some REST features: no API keys / usage plans, no request validation, no response mapping templates, no AWS WAF integration (at the time of writing), no edge-optimized mode.
  • WebSocket API — bidirectional persistent connection for real-time apps (chat, collaboration tools, live dashboards, multiplayer game state). Routes messages via the predefined $connect, $disconnect, and $default routes, or via a custom route key that the route selection expression extracts from the message payload. Backend integrations can be Lambda, HTTP, or AWS services.

API Gateway REST vs HTTP API Decision Rule — Choose HTTP API by default for new serverless APIs — it is cheaper, faster, and simpler. Choose REST API only when you need API keys with usage plans, request validation, AWS WAF attached directly, private APIs inside a VPC, or mapping templates for legacy request transformations. Choose WebSocket API whenever the client needs the server to push without polling. Reference: https://docs.aws.amazon.com/apigateway/latest/developerguide/http-api-vs-rest.html
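The decision rule above can be condensed into a small helper. This is an illustrative sketch for study purposes — the function and flag names are invented for this note, not an AWS API:

```python
# Illustrative SAA-C03 decision rule: default to HTTP API, escalate to
# REST API only when a REST-only feature is required, and pick
# WebSocket API whenever the server must push without polling.
def choose_api_type(needs_server_push=False, needs_api_keys=False,
                    needs_request_validation=False, needs_waf=False,
                    needs_mapping_templates=False, needs_private_vpc=False):
    if needs_server_push:
        return "WebSocket API"
    rest_only_features = [needs_api_keys, needs_request_validation,
                          needs_waf, needs_mapping_templates,
                          needs_private_vpc]
    if any(rest_only_features):
        return "REST API"
    return "HTTP API"  # cheaper, faster default for new serverless APIs
```

Note how the WebSocket check comes first: server push is an architectural requirement, not a feature toggle, so it overrides the cost comparison entirely.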

Endpoint Types — Edge-Optimized vs Regional vs Private

REST APIs ship with three deployment endpoint types, and this is a favorite SAA-C03 trap:

  • Edge-optimized (REST default) — clients hit a CloudFront edge, which forwards to API Gateway in the API's home Region. Best for globally distributed clients hitting a single Region.
  • Regional — clients hit the API Gateway endpoint directly in the Region. Best when clients are in the same Region, or when you want to put your own CloudFront distribution in front (for custom caching, multiple origins, or Lambda@Edge). Also required if you want to pair API Gateway with AWS Global Accelerator.
  • Private — the API is only accessible from inside a VPC via an Interface VPC Endpoint (AWS PrivateLink). Useful for internal-only microservices.

HTTP APIs are Regional only.

Authentication and Authorization on API Gateway

API Gateway supports four authentication patterns. Knowing when each applies is the highest-yield section of any API Gateway and edge study:

  1. IAM Authorization (AWS_IAM) — the caller signs the request with SigV4 using AWS credentials. Used for service-to-service calls inside AWS, or for internal tools where every caller already has IAM identities. Works with REST and HTTP APIs.
  2. Amazon Cognito User Pools — callers authenticate to a Cognito User Pool and receive a JWT. API Gateway validates the JWT automatically. Built for end-user auth in mobile and web apps. Works with REST (Cognito authorizer) and HTTP (JWT authorizer pointing at Cognito issuer).
  3. Lambda Authorizer (formerly Custom Authorizer) — API Gateway calls your Lambda with the incoming token or request context; your Lambda returns an IAM-style policy document granting or denying access. Use for OAuth / OIDC from third-party providers, HMAC signatures, SAML assertions, or any proprietary auth. Result caching up to 1 hour reduces Lambda invocations.
  4. API Keys with Usage Plans (REST only) — not true authentication, more like identification plus quota enforcement. Used to meter and bill API consumers in a SaaS monetization model.
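A minimal TOKEN-type Lambda Authorizer looks like the sketch below. The response shape (principalId plus an IAM-style policyDocument) is the real API Gateway contract; the token comparison itself is a placeholder for whatever JWT, OAuth, or HMAC validation you actually run:

```python
# TOKEN-type Lambda Authorizer sketch. API Gateway invokes this with the
# incoming token and the method ARN; we return an IAM-style policy that
# allows or denies execute-api:Invoke on that ARN.
def lambda_handler(event, context):
    token = event.get("authorizationToken", "")
    # Placeholder check — substitute real JWT/OAuth/HMAC validation here
    effect = "Allow" if token == "valid-token" else "Deny"
    return {
        "principalId": "user-123",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": event["methodArn"],
            }],
        },
        # Optional key/value pairs forwarded to the backend integration
        "context": {"tier": "gold"},
    }
```

API Gateway caches the returned policy (keyed by the token string) for the configured TTL, which is why one Allow/Deny decision can cover many requests.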

Cognito User Pool vs Cognito Identity Pool — The Classic API Gateway Trap — SAA-C03 loves to confuse Cognito User Pool (user directory and JWT issuer — authentication) with Cognito Identity Pool (temporary AWS credentials broker — authorization to AWS services). For API Gateway authentication, the answer is almost always User Pool (the JWT authorizer). Identity Pool is for granting a signed-in user direct access to S3 or DynamoDB via temporary IAM credentials. Read the scenario twice: if it mentions "JWT" or "sign in," it is User Pool. If it mentions "AWS credentials" or "direct access to S3 from the mobile client," it is Identity Pool. Reference: https://docs.aws.amazon.com/cognito/latest/developerguide/what-is-amazon-cognito.html

Stages, Deployments, and Stage Variables

API Gateway decouples API definitions from runtime deployments. You edit the API definition, then create a deployment (an immutable snapshot) and point a stage (dev, test, prod) at that deployment. Stage variables are like environment variables — the same API definition can point at different Lambda ARNs per stage (e.g., ${stageVariables.functionArn}). Canary deployments on a stage let you shift a percentage of traffic to a new deployment while keeping the rest on the stable one, which is the API Gateway equivalent of blue/green.
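To make the stage-variable mechanic concrete, here is an illustrative re-implementation of the ${stageVariables.name} substitution — it shows how one API definition resolves to different Lambda ARNs per stage. This is a simulation for study purposes, not an AWS SDK call:

```python
import re

# Substitute ${stageVariables.name} placeholders in an integration URI
# or ARN, the way API Gateway does when a stage is invoked. Unknown
# variables are left untouched, like an unset placeholder.
def resolve_stage_variables(template, stage_variables):
    def substitute(match):
        return stage_variables.get(match.group(1), match.group(0))
    return re.sub(r"\$\{stageVariables\.(\w+)\}", substitute, template)
```

Calling it with the same template and per-stage variable maps — {"fn": "orders-dev"} for dev, {"fn": "orders-prod"} for prod — yields a different target ARN per stage from a single API definition.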

Throttling and Caching on API Gateway

Throttling protects your backend from traffic spikes and prevents a single client from consuming all capacity:

  • Account-level throttle — default 10,000 RPS steady-state with a burst capacity of 5,000 requests, per Region (soft limit).
  • Stage-level throttle — override per stage.
  • Method-level throttle — override per resource + method inside a stage.
  • Per-client (usage plan) throttle — limit by API key (REST only).
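All four throttle tiers share the same token-bucket model: the rate is how fast tokens refill, the burst is the bucket capacity. A deterministic sketch (timestamps passed in explicitly so the behavior is reproducible — this simulates the model, it is not how you configure API Gateway):

```python
# Token-bucket throttle sketch: `rate` is the steady-state RPS refill,
# `burst` is the maximum number of stored tokens (the burst capacity).
class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate              # tokens refilled per second
        self.capacity = burst         # bucket size
        self.tokens = float(burst)    # bucket starts full
        self.last = 0.0

    def allow(self, now):
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True               # request passes
        return False                  # API Gateway would return 429 here
```

With rate=1 and burst=2, two back-to-back requests pass (draining the bucket), the third is rejected, and one second later a refilled token lets the next request through.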

Caching (REST only) is provisioned as a dedicated cache cluster attached to a stage, sized from 0.5 GB to 237 GB. Cache keys default to the full request URL; you can include headers or query parameters in the key. Cache TTL defaults to 300 seconds and goes up to 3600 seconds. Cache invalidation requires the caller to send Cache-Control: max-age=0 and hold InvalidateCache permission.

Usage Plans and API Keys — Monetization Primitives

Usage plans (REST only) tie API keys to quotas (requests per day/week/month) and throttle limits (RPS and burst). Typical SaaS pattern: you create three usage plans — Free (1,000 req/month, 10 RPS), Pro (100,000 req/month, 100 RPS), Enterprise (custom) — and each customer gets an API key associated with their chosen plan. API Gateway rejects requests over quota with 429 Too Many Requests. API keys alone do not authenticate; always combine with IAM, Cognito, or a Lambda Authorizer for real security.
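The quota half of a usage plan reduces to a counter comparison. The plan dicts below mirror the Free/Pro tiers described above; the field names are invented for this note:

```python
# Usage-plan quota check sketch: over-quota callers get 429, matching
# API Gateway's behavior when a key exhausts its plan.
PLANS = {
    "free": {"monthly_quota": 1_000, "rate_rps": 10},
    "pro":  {"monthly_quota": 100_000, "rate_rps": 100},
}

def check_quota(plan_name, calls_this_month):
    plan = PLANS[plan_name]
    if calls_this_month >= plan["monthly_quota"]:
        return 429, "Too Many Requests"   # quota exhausted for the period
    return 200, "OK"
```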

API Keys Are For Metering, Not Security — An API key is a string passed in the x-api-key header. It is not a cryptographic credential — anyone who extracts it from a mobile app binary can reuse it. Always combine API keys with true authentication (Cognito, IAM SigV4, Lambda Authorizer) when the endpoint handles sensitive data. Use API keys for plan identification and billing, not authorization. Reference: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-api-usage-plans.html

API Gateway vs Application Load Balancer — The Decision

SAA-C03 asks this pairing often. Choose API Gateway when you want managed API features: native Lambda integration, API keys/usage plans, request validation, multiple auth methods, per-method throttling, or a pure serverless stack. Choose ALB when you want the cheapest HTTPS load balancer for container or EC2 workloads, path/host routing at massive scale, gRPC support, or when your cost-per-request dominates and the 10,000 RPS account default of API Gateway is a concern. HTTP API narrows the price gap significantly — for serverless Lambda backends, HTTP API beats ALB + Lambda in cost under about 1 million invocations per day.

Amazon CloudFront — Global CDN and Edge Front Door

Amazon CloudFront is AWS's content delivery network with 600+ edge locations plus a smaller tier of regional edge caches sitting between the edges and your origins. For SAA-C03 API Gateway and edge scenarios, CloudFront plays three roles: cache HTTP/S content close to users, terminate TLS at the edge (using ACM certificates), and apply edge-side logic (WAF rules, Lambda@Edge, CloudFront Functions, signed URLs).

Distributions, Origins, and Cache Behaviors

A distribution is the top-level CloudFront configuration, identified by a d1234abcd.cloudfront.net hostname (you can alias with a custom domain plus ACM). Each distribution has one or more origins — the backend that CloudFront fetches from on a cache miss. Origin types include Amazon S3, S3 website endpoints, Application Load Balancer, EC2, API Gateway (Regional), AWS Elemental services such as MediaStore and MediaPackage, and any HTTP server reachable from the internet (custom origin).

Cache behaviors map URL path patterns to origins and per-path settings (cache TTLs, allowed HTTP methods, forwarded headers/cookies/query strings, viewer protocol policy, trusted signers for signed URLs). Behaviors are evaluated in order; the first matching path wins, and a default behavior (*) catches everything else. A classic pattern: /api/* behavior forwards to ALB/API Gateway with caching disabled, /static/* forwards to S3 with 1-day caching, and * forwards to the ALB for dynamic HTML.
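That /api/, /static/, default-behavior example works as a first-match-wins lookup. The sketch below simulates the evaluation order; fnmatch is an approximation of CloudFront's path-pattern wildcards, good enough to show the mechanic:

```python
from fnmatch import fnmatch

# Behaviors evaluated top-down; the first matching path pattern wins,
# and "*" is the default catch-all.
BEHAVIORS = [
    ("/api/*",    {"origin": "alb-or-apigw", "caching": False}),
    ("/static/*", {"origin": "s3",           "ttl_seconds": 86_400}),
    ("*",         {"origin": "alb",          "caching": False}),
]

def match_behavior(path):
    for pattern, settings in BEHAVIORS:
        if fnmatch(path, pattern):
            return pattern, settings
    raise ValueError("a distribution always has a default (*) behavior")
```

Order matters: if "*" were listed first, every request would hit the ALB and the S3 static-asset behavior would never fire.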

Origin Groups and Origin Failover

Define two origins as a primary and a secondary in an origin group, and CloudFront automatically fails over to the secondary when the primary returns configured HTTP status codes (typically 5xx and 4xx). This implements origin DR without needing DNS failover, and the failover is transparent to the viewer.

TTL, Cache Keys, and Invalidation

CloudFront cache TTL is controlled by origin Cache-Control and Expires headers, or by distribution minimum/default/maximum TTL settings. Cache key by default is the hostname + path; you opt in to include headers, cookies, or query strings via a cache policy. Invalidations force-evict objects from all edges; the first 1,000 paths per month are free, more are paid. Best practice: avoid invalidations, use versioned filenames (app.v42.js) instead.
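The interaction between origin headers and distribution TTL settings reduces to a clamp. A sketch of the resolution logic, using the 24-hour default quoted above (assume origin_max_age is the parsed Cache-Control max-age, or None when the origin sent no caching headers):

```python
# CloudFront TTL resolution sketch: an origin-supplied max-age is
# clamped into [min_ttl, max_ttl]; with no origin caching headers,
# the distribution's default TTL applies.
def effective_ttl(origin_max_age, min_ttl=0,
                  default_ttl=86_400, max_ttl=31_536_000):
    if origin_max_age is None:        # no Cache-Control / Expires at origin
        return default_ttl
    return max(min_ttl, min(origin_max_age, max_ttl))
```

This is why setting a distribution minimum TTL above zero can silently override an origin that asks for short caching — a classic source of stale-content confusion.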

Security at the Edge — OAC, WAF, Shield, Signed URLs, Signed Cookies

Origin Access Control (OAC) is the current recommended way to lock an S3 origin so that only CloudFront can read from it. OAC replaces the older Origin Access Identity (OAI). CloudFront signs the origin request with SigV4, and the S3 bucket policy grants access only to the specific CloudFront distribution via AWS:SourceArn condition. With OAC, you block all other S3 public access and serve exclusively through CloudFront — the bucket disappears from the public internet.

AWS WAF attaches to a CloudFront distribution and filters malicious HTTP requests at the edge — SQL injection, XSS, bot patterns, rate-based rules, managed rule groups, and custom rules. AWS Shield Standard is free and automatic on every CloudFront distribution (common DDoS protection). AWS Shield Advanced adds L3/L4/L7 protections, 24/7 DDoS response team access, and cost-protection credits for scaling events caused by attacks.

Signed URLs restrict access to individual objects and expire at a timestamp you set. Signed Cookies restrict access to sets of objects (e.g., an entire video catalog or a premium user area) with a single cookie covering many URLs. Both rely on a trusted key group of public keys uploaded to CloudFront; your backend signs with the matching private key. Use signed URLs for one-off download links, signed cookies when many files share one access boundary.
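A canned-policy signed URL is a policy JSON, a signature over it, and three query parameters. The sketch below shows the structure; rsa_sign stands in for an RSA signature over the policy bytes with the private key whose public half sits in a trusted key group — wiring up real crypto is out of scope for this note:

```python
import base64
import json

# CloudFront's URL-safe base64 variant: '+', '=', '/' become '-', '_', '~'
def cloudfront_safe_b64(data: bytes) -> str:
    return (base64.b64encode(data).decode("ascii")
            .replace("+", "-").replace("=", "_").replace("/", "~"))

# Canned-policy signed URL sketch: one Resource, one expiry condition.
def signed_url(url, expires_epoch, key_pair_id, rsa_sign):
    policy = json.dumps({"Statement": [{
        "Resource": url,
        "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
    }]}, separators=(",", ":"))
    signature = cloudfront_safe_b64(rsa_sign(policy.encode()))
    return (f"{url}?Expires={expires_epoch}"
            f"&Signature={signature}&Key-Pair-Id={key_pair_id}")
```

Signed cookies carry the same three values (policy/expiry, signature, key-pair ID) as CloudFront-* cookies instead of query parameters, which is what lets one grant cover many URLs.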

OAC Has Fully Replaced OAI — Use OAC for New Distributions — AWS launched Origin Access Control (OAC) in 2022 as the successor to Origin Access Identity (OAI). OAC supports SSE-KMS encryption on S3 origins, uses SigV4 (supports all Regions including opt-in), and is the AWS-recommended pattern for new distributions. On SAA-C03, if the scenario involves S3 + CloudFront + locked-down bucket, the answer is OAC. OAI still works for backward compatibility but is no longer the default recommendation. Reference: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-restricting-access-to-s3.html

Lambda@Edge vs CloudFront Functions — Edge Compute Choice

Both run your code at CloudFront edges, but they are optimized for different jobs:

  • CloudFront Functions — JavaScript (ECMAScript 5.1) only, sub-millisecond execution, runs at the edge location itself, up to 10 million requests per second per distribution. Scope: HTTP header manipulation, URL rewrites, cache-key normalization, viewer-request / viewer-response event types only (not origin-side). Up to 1 ms CPU, 2 MB memory. Six times cheaper than Lambda@Edge for the same request. Cannot call other AWS services or the network.
  • Lambda@Edge — Node.js or Python, runs at regional edge caches (you author the function in us-east-1 and CloudFront replicates it to edge Regions worldwide). Scope: all four event types (viewer-request, viewer-response, origin-request, origin-response). Up to 5 seconds (viewer events) or 30 seconds (origin events), 128 MB–10 GB memory, can call AWS services and make network requests. Higher latency than CloudFront Functions.

Decision rule: if you only need to rewrite headers or URLs at line rate, use CloudFront Functions. If you need to fetch data from DynamoDB, call another service, or do heavier logic at the origin side, use Lambda@Edge.

CloudFront Functions Is the Default for Simple Edge Logic — SAA-C03 favors the simpler, cheaper service when it fits. Header rewrites, A/B-test cookie assignment, URL canonicalization, and JWT signature pre-validation should all be answered with CloudFront Functions, not Lambda@Edge. Reserve Lambda@Edge for scenarios that explicitly mention dynamic origin selection, DynamoDB lookups at the edge, or runtime longer than 1 ms. Reference: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/cloudfront-functions.html

Real-Time Logs and Standard Logs

CloudFront standard logs deliver access logs to S3 every few minutes (up to 24 hours lag, best-effort). Real-time logs stream to Kinesis Data Streams within seconds, for live dashboards, attack detection, or log-driven automation. Real-time logs cost extra per log line but deliver near-real-time visibility.

AWS Global Accelerator — Static Anycast Over the AWS Backbone

AWS Global Accelerator provides two static anycast IP addresses (or a custom BYOIP prefix) that serve as a fixed front door to your application. Traffic from a user enters the AWS global network at the nearest AWS edge POP (same POPs CloudFront uses) and then travels over the AWS private backbone to your application endpoints in a specific Region — bypassing the congested public internet for most of the path.

The Three Knobs: Listeners, Endpoint Groups, Endpoints

  • Accelerator — the top-level resource, owns the two static anycast IPs and a DNS name like a1234.awsglobalaccelerator.com.
  • Listener — binds a port and protocol (TCP or UDP) on the anycast IPs.
  • Endpoint group — per-Region grouping of targets, with a traffic dial (0–100%) that controls what percentage of the traffic routed to this Region actually reaches its endpoints (the remainder is shifted to endpoint groups in other Regions).
  • Endpoint — a specific ALB, NLB, EC2 instance, or Elastic IP inside the endpoint group. Each endpoint has a weight (0–255) that distributes traffic among endpoints in the same group.
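The two knobs compose as simple proportions. An illustrative sketch (simplified versus real anycast routing — endpoint names are made up):

```python
# Endpoint weights (0-255) split traffic proportionally inside one
# endpoint group; weight 0 takes an endpoint out of rotation.
def weighted_split(endpoints):
    total = sum(weight for _, weight in endpoints)
    if total == 0:
        return {name: 0.0 for name, _ in endpoints}
    return {name: weight / total for name, weight in endpoints}

# The Region's traffic dial then scales whatever share the group would
# otherwise receive; dial 0 drains the Region entirely.
def apply_traffic_dial(dial_percent, region_share):
    return region_share * dial_percent / 100
```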

Traffic Dials and Endpoint Weights — Operational Controls

The traffic dial is the headline Global Accelerator feature on SAA-C03. Setting the dial for a Region to 0 drains all traffic out of that Region within seconds — perfect for blue/green Region cutover, maintenance windows, or dialing down a struggling Region without touching DNS. Endpoint weights are for intra-Region distribution: set one endpoint to weight 0 to take it out of rotation while keeping it warm.

CloudFront vs Global Accelerator — The Canonical Comparison

This is the most-asked API Gateway and edge trap on SAA-C03. Memorize this cheat sheet:

| Dimension | CloudFront | Global Accelerator |
| --- | --- | --- |
| Protocol | HTTP / HTTPS only | Any TCP or UDP |
| Caching | Yes, at edge | No caching — acceleration only |
| IP addresses | Changing set per distribution | 2 static anycast IPs |
| TLS termination | At edge | At your endpoint (or pass-through) |
| Best for | Cacheable web/API content | Non-HTTP workloads, stable IPs, failover speed |
| Failover | Origin groups + DNS-level | Regional endpoint health + traffic dials, in seconds |
| Common use cases | Websites, SPAs, APIs, video | Multiplayer games, VoIP, IoT, trading, MQTT |

Decision shortcut: if the scenario says HTTP/HTTPS and caching helps, answer CloudFront. If the scenario says TCP/UDP, needs static IPs for firewall allowlists, or needs fast cross-Region failover without relying on DNS TTL, answer Global Accelerator. The two are not mutually exclusive — some architectures use CloudFront for static assets and Global Accelerator for a real-time TCP API in the same product.

CloudFront Does Not Accept UDP; Global Accelerator Does Not Cache — If a SAA-C03 scenario mentions UDP (gaming, VoIP, QUIC without HTTP/3 CDN), CloudFront is automatically wrong. If a scenario says "accelerate global access to static assets" or "reduce origin data transfer cost through caching," Global Accelerator is automatically wrong (it does not cache). Read the protocol and the word "cache" first — they eliminate one of the two options in most questions. Reference: https://docs.aws.amazon.com/global-accelerator/latest/dg/what-is-global-accelerator.html

Client IP Preservation

Global Accelerator supports client IP preservation when the endpoint is an ALB or a Network Load Balancer — your backend sees the original viewer's IP rather than a Global Accelerator internal IP. This matters for geolocation, fraud detection, and logging. Client IP preservation is on by default for ALB endpoints and supported on NLB with TCP listeners.

Route 53 Routing Policies in the API Gateway and Edge Picture

Route 53 routing policies frequently appear alongside API Gateway and edge services, because DNS is the ultimate steering wheel. SAA-C03 expects you to pick the right policy for the scenario:

  • Simple — single record, no health checks.
  • Weighted — split across records by weight. Blue/green deployments.
  • Latency-based — route to the Region with lowest measured client latency. Global multi-Region apps.
  • Failover — active-passive. Primary serves until health check fails, then secondary takes over.
  • Geolocation — route by user's continent/country/state. Regulatory content localization.
  • Geoproximity — route by geographic distance with a bias. Advanced fine-tuning.
  • Multi-value answer — up to 8 healthy records returned. Poor-man's load balancing with health checks.
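The weighted policy is worth making concrete, since it underpins blue/green DNS cutovers. A sketch of the selection math (record names are illustrative; real Route 53 resolves this server-side):

```python
import random

# Route 53 weighted routing sketch: a record is answered with
# probability weight / total weight across all records in the set.
def weighted_pick(records, rng=random):
    total = sum(weight for _, weight in records)
    roll = rng.uniform(0, total)
    cumulative = 0.0
    for name, weight in records:
        cumulative += weight
        if roll <= cumulative:
            return name
    return records[-1][0]   # guard against float rounding at the edge
```

A blue/green shift is just moving weight between records: start at (blue=100, green=0), step green up while watching error rates, finish at (blue=0, green=100).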

When pairing Route 53 with API Gateway and edge: Route 53 is DNS-layer steering, so failover reacts at DNS TTL speed (30–60 seconds typical). Global Accelerator failover reacts in under 30 seconds at the anycast layer, without waiting for DNS. If "fastest failover" is the question, Global Accelerator beats Route 53.

Stage, Deployment, and Environment Patterns

Beyond API Gateway stages, the broader API Gateway and edge pattern for multi-environment work looks like:

  • dev: Regional HTTP API, no CloudFront, dev Lambda aliases, 100% traffic dial.
  • staging: Regional REST API + regional CloudFront distribution, staging Lambda aliases, WAF with staging rules.
  • prod: Edge-optimized REST API (or Regional API + CloudFront for custom caching), prod Lambda aliases, WAF with full rule set, Shield Advanced, Global Accelerator if latency-sensitive or static-IP required.

Deployment pipelines promote an API Gateway stage through canary deployments and promote a Lambda alias (via weighted aliases) — these are independent blue/green primitives, which is a subtle point worth remembering.

When to Pair Each Edge Service — Five Canonical Architectures

Pattern 1: Static Website on S3 + CloudFront + OAC

Classic Jamstack. S3 holds the static assets, CloudFront caches them globally, OAC restricts S3 access to the CloudFront distribution, Route 53 alias maps the apex domain to the distribution, ACM provides the TLS cert. WAF adds bot protection, Lambda@Edge (or CloudFront Functions) rewrites paths (e.g., add index.html to directory requests).

Pattern 2: Serverless HTTP API + Lambda

HTTP API (cheap, fast) directly invokes Lambda. JWT authorizer validates Cognito User Pool tokens. No CloudFront by default; add CloudFront in front only if you need caching or WAF. Usage plans are not available on HTTP API — if you need API keys and quotas, switch to REST API.

Pattern 3: REST API + CloudFront (Edge-Optimized) + Usage Plans

SaaS monetization stack. Edge-optimized REST API auto-pairs with an AWS-managed CloudFront. Usage plans tie API keys to quotas per customer tier. Lambda Authorizer validates OAuth access tokens from the customer's IdP. WAF attached at the API Gateway level rate-limits per IP. Pay-per-call billing maps to usage-plan reports.

Pattern 4: Global Multi-Region Active-Active with Global Accelerator

An ALB in us-east-1 and an ALB in eu-west-1 both front identical ECS services. A single Global Accelerator exposes two static anycast IPs. Traffic dials set to 100% on both Regions; anycast plus AWS backbone routes each user to the nearest Region. If us-east-1 fails, endpoint health checks remove the Region in under 30 seconds, and all traffic shifts to eu-west-1 without DNS TTL wait. Route 53 remains only for the custom CNAME to the accelerator.

Pattern 5: Gaming UDP Servers + Global Accelerator + NLB

Real-time multiplayer game. Fleet of EC2 game servers behind NLBs in multiple Regions. Global Accelerator UDP listener forwards to the NLBs. Clients connect to the two static IPs burned into the game client — IPs never change, even as game servers scale or Regions get added. Traffic dials let ops drain a Region during releases.

API Gateway and Edge Quick Numbers for SAA-C03

  • API Gateway default throttle: 10,000 RPS steady-state, 5,000-request burst, per Region.

  • API Gateway payload limit: 10 MB request, 10 MB response.
  • API Gateway timeout: 29 seconds (integration timeout maximum).
  • Lambda Authorizer result cache: up to 1 hour (TTL configurable).
  • HTTP API vs REST API: HTTP API up to 70% cheaper, ~60% lower latency.
  • CloudFront edge locations: 600+ globally, across 100+ cities.
  • CloudFront default TTL: 24 hours (86,400 seconds).
  • CloudFront Functions: up to 10M RPS per distribution, 1 ms CPU, 2 MB memory, ES5.1 JavaScript only.
  • Lambda@Edge: up to 30s origin event timeout, Node.js/Python, all four event types.
  • Global Accelerator: 2 static anycast IPs per accelerator, failover < 30 seconds.
  • Global Accelerator traffic dial: 0–100%, endpoint weight 0–255.
  • CloudFront invalidation free tier: 1,000 paths per month. Reference: https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html

Caching Strategies Across API Gateway and Edge Layers

Caching in API Gateway and edge architectures is layered. From client to database:

  1. Browser cache — driven by Cache-Control from CloudFront/API Gateway.
  2. CloudFront edge cache — cache policy + origin headers, TTL controlled by you.
  3. CloudFront regional edge cache — second-tier automatic cache.
  4. API Gateway cache — 0.5–237 GB per stage, only on REST API.
  5. Application cache (ElastiCache, DAX) — Redis/Memcached in front of RDS or DynamoDB.
  6. Database buffer pool — RDS/Aurora internal cache.

A well-designed API Gateway and edge stack caches at the outermost layer the data permits, and only falls through to deeper layers on miss. When cost is the concern, CloudFront (per-GB pricing) usually beats API Gateway cache (per-hour flat pricing) for high-fan-out reads.
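The fall-through behavior across these layers can be sketched as a read-through lookup: serve from the outermost layer that holds the object, and populate the layers that missed on the way back. The dicts below are stand-ins for the real caches, which do this for you:

```python
# Layered read-through sketch: layers are ordered outermost (browser/
# edge) to innermost (origin/database). A hit at layer N back-fills
# layers 0..N-1 so the next request is served further out.
def layered_get(key, layers):
    missed = []
    for name, cache in layers:
        if key in cache:
            value = cache[key]
            for _, outer in missed:
                outer[key] = value    # warm the outer caches
            return name, value
        missed.append((name, cache))
    raise KeyError(key)
```

Run twice against an empty edge cache and a populated origin: the first call is served from the origin (and warms the edge), the second from the edge — the cost profile the paragraph above describes.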

Common Exam Traps for API Gateway and Edge

  1. HTTP API does not support API keys / usage plans — need those? Use REST API.
  2. Edge-optimized is REST-only — HTTP APIs are Regional; front with your own CloudFront for edge caching.
  3. CloudFront does not support UDP — UDP scenarios point to Global Accelerator.
  4. Global Accelerator does not cache — "accelerate static content" with caching is CloudFront.
  5. Lambda Authorizer result cache is per-token-value — changing the token bypasses cache; the cache key is the token string (or the request context for REQUEST-type authorizers).
  6. Cognito User Pool vs Identity Pool — User Pool for API Gateway JWT; Identity Pool for temporary AWS creds to access S3/DynamoDB directly.
  7. OAI is legacy; OAC is current — pick OAC for new S3 + CloudFront lockdown questions.
  8. CloudFront Functions cannot call AWS services — network calls or AWS SDK access require Lambda@Edge.
  9. API Gateway has a hard 29-second integration timeout — long-running requests need async patterns (SQS + Step Functions + WebSocket push).
  10. Global Accelerator failover is faster than Route 53 failover — sub-30-second regional shift vs DNS TTL wait.
  11. REST API request/response size limit is 10 MB — large file uploads go direct-to-S3 via pre-signed URL, not through API Gateway.
  12. WAF attaches at the CloudFront layer (global) or regional API Gateway layer — HTTP API does not support WAF attachment directly.
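
Trap 9's async pattern is worth internalizing. The sketch below is a minimal, self-contained Python simulation of it — an in-memory list stands in for SQS and a dict for the job-status table; in a real build these would be SQS, a Lambda consumer, and DynamoDB:

```python
import uuid

# In-memory stand-ins for SQS and a job-status table (illustrative only).
job_queue = []
job_status = {}

def submit_handler(payload):
    """API-facing handler: enqueue and return 202 immediately,
    well inside the 29-second integration timeout."""
    job_id = str(uuid.uuid4())
    job_queue.append((job_id, payload))
    job_status[job_id] = "PENDING"
    return {"statusCode": 202, "body": {"jobId": job_id}}

def worker():
    """Background consumer (an SQS-triggered Lambda in a real build);
    it can take minutes without any API Gateway timeout in play."""
    while job_queue:
        job_id, payload = job_queue.pop(0)
        job_status[job_id] = f"DONE: processed {payload}"

def status_handler(job_id):
    """Polling endpoint; a WebSocket push would replace polling for real-time UX."""
    return {"statusCode": 200, "body": {"status": job_status.get(job_id, "UNKNOWN")}}
```

The client never waits on the long-running work: it gets a job ID in milliseconds and learns the result out of band.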

The Big Three API Gateway and Edge Distinctions for SAA-C03 — If you only memorize three API Gateway and edge facts, make them:

  1. HTTP API ≠ REST API: HTTP API is cheaper and faster but has no API keys, no usage plans, no WAF, no edge-optimized mode, no mapping templates.
  2. CloudFront ≠ Global Accelerator: CloudFront is HTTP + cache; Global Accelerator is any TCP/UDP + static IPs + no cache.
  3. CloudFront Functions ≠ Lambda@Edge: CloudFront Functions is header/URL rewrites at line rate; Lambda@Edge is full-fat Lambda at regional edges for origin manipulation and AWS-SDK calls. Reference: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/introduction-what-is-cloudfront.html

Side-by-Side: API Gateway and Edge Cheat Sheet

Scenario → Correct Service
Cheap serverless HTTP facade in front of Lambda → API Gateway HTTP API
SaaS API with API keys, quotas, per-plan throttle → API Gateway REST API + usage plans
Real-time chat or live collab with server push → API Gateway WebSocket API
JWT-based user auth from Cognito → Cognito User Pool authorizer
OAuth from Okta / Auth0 / custom HMAC → Lambda Authorizer
Global static website with S3 origin → CloudFront + OAC
Header rewrites at line rate, no backend calls → CloudFront Functions
URL rewrite + DynamoDB lookup at the edge → Lambda@Edge (origin-request)
Multiplayer game over UDP with static IPs → Global Accelerator UDP listener + NLB
Cross-Region active-active with sub-30s failover → Global Accelerator with traffic dials
Video catalog with pay-per-view access → CloudFront signed cookies
One-off premium PDF download link → CloudFront signed URL
Restrict S3 bucket to only CloudFront → Origin Access Control (OAC)
Rate-limit malicious IPs at the edge → AWS WAF on CloudFront + rate-based rule
Global DDoS protection beyond Shield Standard → AWS Shield Advanced on CloudFront/GA
DNS failover across Regions → Route 53 failover policy

Practice Question Patterns for API Gateway and Edge

  • Protocol discriminator: "UDP or non-HTTP traffic to be accelerated globally?" → Global Accelerator.
  • Caching vs acceleration: "Reduce origin bandwidth cost through caching?" → CloudFront.
  • Auth shape: "User base already in Cognito, JWT-based auth, simplest integration?" → Cognito authorizer (User Pool).
  • Auth shape: "Custom HMAC or OAuth from Okta?" → Lambda Authorizer.
  • API type: "Needs usage plans and API keys for monetization?" → REST API.
  • API type: "Cheapest Lambda facade, no monetization needed?" → HTTP API.
  • Edge compute: "Rewrite a header at 10 million RPS, no AWS SDK calls?" → CloudFront Functions.
  • Edge compute: "Look up customer tier in DynamoDB before choosing origin?" → Lambda@Edge origin-request.
  • Static IP requirement: "Corporate firewall needs fixed IP allowlist to the application?" → Global Accelerator.
  • S3 lockdown: "Only CloudFront can read from S3; bucket has no public access?" → Origin Access Control (OAC).
  • Failover speed: "Need regional failover in seconds, not DNS TTL?" → Global Accelerator.
  • Private API: "API reachable only from inside a VPC, not from the internet?" → REST API with private endpoint + VPC Interface Endpoint.

FAQ — API Gateway and Edge Top Questions

Q1: When should I choose API Gateway HTTP API over REST API on SAA-C03?

Choose HTTP API by default for new serverless APIs: it is roughly 70% cheaper, about 60% lower latency, and simpler to configure. Choose REST API only when you need features HTTP API does not support — API keys with usage plans for monetization, request validation, mapping templates for legacy request/response transformation, AWS WAF attached directly to the API, edge-optimized endpoints with built-in CloudFront, or private APIs inside a VPC. On exam scenarios, look for the words "API keys," "quota per customer," "usage plan," or "throttling per subscriber" — those all point to REST API. Look for "cheapest," "simplest Lambda proxy," or "just JWT auth" — those point to HTTP API.
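
To make the "roughly 70% cheaper" claim concrete, here is the back-of-envelope arithmetic, assuming the published us-east-1 first-tier rates at the time of writing (REST API $3.50 and HTTP API $1.00 per million requests — verify current pricing before relying on these numbers):

```python
# Assumed us-east-1 first-tier request rates (check current AWS pricing pages).
rest_rate_per_million = 3.50
http_rate_per_million = 1.00

monthly_requests_millions = 100  # a 100M-request/month workload

rest_cost = monthly_requests_millions * rest_rate_per_million   # $350
http_cost = monthly_requests_millions * http_rate_per_million   # $100

savings_pct = (rest_cost - http_cost) / rest_cost * 100
print(f"REST: ${rest_cost:.0f}/mo, HTTP: ${http_cost:.0f}/mo, "
      f"savings: {savings_pct:.0f}%")
```

The ratio works out to about 71%, which is where the "roughly 70% cheaper" rule of thumb comes from.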

Q2: What is the difference between Cognito User Pool and Cognito Identity Pool for API Gateway?

Cognito User Pool is a user directory and JWT issuer — it handles sign-up, sign-in, password reset, MFA, and returns ID/access tokens that API Gateway can validate natively with a Cognito authorizer (REST) or JWT authorizer (HTTP). Cognito Identity Pool is a credential broker — it exchanges a User Pool token (or a third-party IdP token) for temporary AWS IAM credentials so the client can call AWS services like S3 or DynamoDB directly. For API Gateway authentication, you almost always want User Pool. Use Identity Pool when a mobile client needs direct AWS SDK access to resources without a backend in the middle. The two can be used together: User Pool for sign-in, Identity Pool for direct S3 upload credentials, API Gateway for business logic.
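
The claim checks a JWT authorizer performs can be illustrated in plain Python. This is a simplified sketch: signature verification against the User Pool's JWKS is deliberately omitted, and the token, pool ID, and client ID below are hand-built placeholders, not real Cognito output:

```python
import base64
import json
import time

def b64url(data: bytes) -> str:
    # Base64url without padding, as used in JWTs.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def check_claims(token: str, expected_issuer: str, expected_audience: str) -> bool:
    """The claim checks an authorizer performs AFTER verifying the signature
    (signature verification against the User Pool JWKS is omitted here)."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return (
        claims.get("iss") == expected_issuer      # issued by the right pool
        and expected_audience in audiences        # meant for this app client
        and claims.get("exp", 0) > time.time()    # not expired
    )

# Hand-built sample token (placeholder pool ID; a real token is RS256-signed).
issuer = "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_EXAMPLE"
header = b64url(json.dumps({"alg": "RS256", "kid": "example-key-id"}).encode())
payload = b64url(json.dumps(
    {"iss": issuer, "aud": "my-app-client-id", "exp": time.time() + 3600}
).encode())
sample_token = f"{header}.{payload}.signature-omitted"
```

A token from the wrong issuer or for the wrong app client fails these checks even if it is otherwise well formed — which is why the authorizer configuration pins both the issuer URL and the audience.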

Q3: When do I pair CloudFront with Global Accelerator, and when do I use just one?

Pair both when your application has both cacheable HTTP content and non-cacheable real-time traffic. Example: a gaming company uses CloudFront for the game launcher's static files (patches, images, landing page) and Global Accelerator for the UDP game server traffic. Use only CloudFront when traffic is 100% HTTP/HTTPS and caching delivers obvious benefit — classic web and API workloads. Use only Global Accelerator when you need stable IPs for firewall allowlists, UDP/non-HTTP protocols, or sub-30-second regional failover that beats DNS TTL. If you see both in an answer set and the scenario is plain HTTP with caching, the single CloudFront answer usually beats the paired one on cost.

Q4: How does Origin Access Control (OAC) differ from the older Origin Access Identity (OAI)?

OAC is the 2022 replacement for OAI. OAC supports SSE-KMS-encrypted S3 buckets (OAI does not), works in all AWS Regions including opt-in Regions, uses AWS SigV4 signing, and enables advanced use cases like S3 Object Lambda. Functionally both lock the S3 bucket so only the CloudFront distribution can read from it, but OAC is the AWS-recommended pattern for new distributions. On SAA-C03, if a question offers both OAC and OAI as options, pick OAC. OAI remains supported for legacy distributions.
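
For reference, this is the documented shape of the S3 bucket policy that OAC relies on — the bucket name, account ID, and distribution ID below are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCloudFrontServicePrincipalReadOnly",
      "Effect": "Allow",
      "Principal": { "Service": "cloudfront.amazonaws.com" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "StringEquals": {
          "AWS:SourceArn": "arn:aws:cloudfront::111122223333:distribution/EDFDVBD6EXAMPLE"
        }
      }
    }
  ]
}
```

Note the principal is the CloudFront service itself, scoped down by the `AWS:SourceArn` condition to one specific distribution — with OAI, the principal was a special CloudFront identity instead.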

Q5: What is the difference between Lambda@Edge and CloudFront Functions, and when should I pick each?

CloudFront Functions runs ECMAScript 5.1 JavaScript at the edge location itself, scales to 10 million requests per second per distribution, has a 1 ms CPU limit and 2 MB memory, supports only viewer-request and viewer-response events, and cannot call AWS services or the network. It is roughly one-sixth the cost of Lambda@Edge. Lambda@Edge runs Node.js or Python at regional edge caches, supports all four event types (viewer-request, viewer-response, origin-request, origin-response), allows AWS SDK and network calls, and has far higher limits (5 seconds on viewer events, 30 seconds on origin events, up to 10 GB memory). Pick CloudFront Functions for header/URL rewrites, cache-key normalization, A/B cookie assignment, JWT signature pre-check. Pick Lambda@Edge when you need to call DynamoDB from the edge, dynamically pick an origin based on user attributes, or do anything a CloudFront Function cannot express.
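
A hedged sketch of the dynamic-origin pattern, using the Lambda@Edge origin-request event shape in Python. To keep the example self-contained and offline-testable, the customer-tier lookup reads a viewer header instead of querying DynamoDB (which would be a boto3 call in the real version); the domain names are placeholders:

```python
def handler(event, context):
    """Lambda@Edge origin-request handler: pick the origin per viewer tier.
    In the real pattern the tier comes from a DynamoDB lookup; a viewer
    header stands in here so the sketch runs offline."""
    request = event["Records"][0]["cf"]["request"]
    # Header names are lowercased in the Lambda@Edge event.
    tier_header = request["headers"].get("x-customer-tier", [{}])
    tier = tier_header[0].get("value", "standard")
    if tier == "premium":
        # Swap the origin; field names follow the custom-origin event spec.
        request["origin"] = {
            "custom": {
                "domainName": "premium.example.com",  # placeholder origin
                "port": 443,
                "protocol": "https",
                "path": "",
                "sslProtocols": ["TLSv1.2"],
                "readTimeout": 30,
                "keepaliveTimeout": 5,
                "customHeaders": {},
            }
        }
        # The Host header must match the new origin.
        request["headers"]["host"] = [
            {"key": "Host", "value": "premium.example.com"}
        ]
    return request
```

Returning the (possibly rewritten) request tells CloudFront which origin to contact — something a CloudFront Function can never do, since origin-request events are Lambda@Edge-only.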

Q6: How do API Gateway usage plans, API keys, and throttling work together?

A usage plan bundles a quota (requests per day/week/month) and a throttle (steady-state RPS plus burst) and associates with one or more API Gateway stages. You attach API keys to the usage plan. When a request arrives with an x-api-key header, API Gateway validates the key, finds its usage plan, checks the quota (returns 429 if exceeded for the period), and applies the throttle (returns 429 if over RPS). API keys do not authenticate the caller — always layer a real auth mechanism (IAM, Cognito, Lambda Authorizer) on top. Usage plans and API keys are REST API only. HTTP API has no equivalent; if monetization and quota per customer are required, REST API is the answer.
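
The quota-plus-throttle interaction can be modeled as a toy token bucket in Python — this illustrates the semantics (quota checked per period, throttle checked per moment, 429 on either), not how API Gateway actually implements it:

```python
import time

class UsagePlan:
    """Toy model of a REST API usage plan: a periodic quota plus a
    token-bucket throttle (steady-state rate + burst capacity)."""
    def __init__(self, quota, rate, burst):
        self.quota_remaining = quota    # requests allowed this period
        self.rate = rate                # tokens refilled per second (RPS)
        self.burst = burst              # bucket capacity (burst limit)
        self.tokens = float(burst)
        self.last_refill = time.monotonic()

    def allow(self):
        # Refill the bucket based on elapsed time, capped at the burst size.
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.quota_remaining <= 0:
            return 429  # quota exceeded for the day/week/month
        if self.tokens < 1:
            return 429  # over the steady-state/burst throttle right now
        self.tokens -= 1
        self.quota_remaining -= 1
        return 200
```

A key whose plan allows 10 RPS with burst 5 can absorb a 5-request spike instantly, then settles to 10 per second — until the period quota runs out, at which point every request gets 429 regardless of rate.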

Q7: Why would I put API Gateway behind CloudFront rather than use the edge-optimized endpoint?

Edge-optimized REST API bundles an AWS-managed CloudFront distribution in front of your API, with no knobs you can turn (cache policy, custom behaviors, WAF rules, Lambda@Edge). When you need any of those custom controls, deploy a Regional REST API or HTTP API and put your own CloudFront distribution in front. Now you own the cache behaviors (cache GET responses, bypass POST), attach a WAF Web ACL with custom rules, chain Lambda@Edge functions, add multiple origins (e.g., API for /api/*, S3 for /static/* under one domain), and use signed URLs. You also pay CloudFront separately, but you gain flexibility that edge-optimized REST does not offer.

Q8: What's the fastest failover option across AWS Regions for a global API?

Global Accelerator is the fastest. Its two static anycast IPs stay announced globally and never change, so no DNS propagation is involved: when an endpoint group's endpoints fail health checks, Global Accelerator simply stops routing to that Region and shifts traffic to healthy endpoint groups over the AWS backbone, typically in under 30 seconds. Route 53 failover depends on DNS resolvers honoring the record TTL; typical failover is 60–120 seconds, and some resolvers ignore short TTLs. CloudFront origin failover is also fast (seconds), but it only fails over between the origins configured in an origin group within one distribution, and only for GET, HEAD, and OPTIONS requests — it is not a general cross-Region failover mechanism for your backend. For sub-30-second cross-Region failover on any TCP or UDP protocol, Global Accelerator is the SAA-C03 answer.

Further Reading

  • AWS Well-Architected Framework — Performance Efficiency Pillar
  • Amazon API Gateway Developer Guide — Choosing Between HTTP API and REST API
  • Amazon CloudFront Developer Guide — Getting Started, Security, and Private Content
  • AWS Global Accelerator Developer Guide — Endpoint Groups and Traffic Dials
  • AWS Lambda@Edge Developer Guide — Event Types and Restrictions
  • AWS re:Invent talks on CloudFront Functions and Multi-Region Active-Active patterns

Mastering API Gateway and edge services — Amazon API Gateway (REST, HTTP, WebSocket), Amazon CloudFront (with OAC, signed URLs, Lambda@Edge, CloudFront Functions), and AWS Global Accelerator (anycast static IPs with traffic dials) — turns SAA-C03 Task 2.1 scenarios into pattern-match exercises. The canonical API Gateway and edge traps (HTTP vs REST API, CloudFront vs Global Accelerator, CloudFront Functions vs Lambda@Edge, Cognito User Pool vs Identity Pool, OAC vs OAI) are recurring questions; build the mental models now and you will recognize them instantly on exam day. Good luck on your SAA-C03 API Gateway and edge questions.

Official sources