
High-Performing and Scalable Network Architectures

6,180 words · ≈ 31 min read

What Are High-Performing Network Architectures on AWS

High-performing network architectures on AWS are the combination of EC2 networking primitives (ENA, EFA, placement groups), edge acceleration services (CloudFront, Global Accelerator), hub-and-spoke connectivity (Transit Gateway), dedicated hybrid links (Direct Connect), and private service exposure (AWS PrivateLink) that together minimize latency, maximize throughput, and stay predictable under load. SAA-C03 Task Statement 3.4 asks you to "determine high-performing and/or scalable network architectures" — meaning every scenario expects you to pick the right knob from the right layer. High-performing network architectures are never the result of one clever service; they are the result of aligning instance-level packet processing, intra-VPC placement, inter-VPC routing, edge caching, and hybrid bandwidth with the actual traffic pattern.

On SAA-C03 the high-performing network architectures topic is worth about 270 of the 1,920 Domain-3 questions, and it reuses concepts from Domain 1 (VPC security) and Domain 4 (network cost). The exam rewards candidates who can distinguish CloudFront from Global Accelerator in one sentence, explain why cluster placement groups beat spread placement groups for HPC, describe exactly what ENA and EFA add above the baseline, and name the scenarios where Transit Gateway replaces VPC peering. High-performing network architectures are also where the majority of "which service do I pick for low latency" questions live, so a clean mental map pays off on many sibling questions as well.

This note walks the high-performing network architectures stack from the NIC up: enhanced networking with ENA and EFA, placement group strategies for EC2, CloudFront cache policies and Origin Shield, the canonical CloudFront vs Global Accelerator decision, Transit Gateway at scale (routing, multicast, peering), Direct Connect connection types, LAG, and SiteLink, AWS PrivateLink for low-latency private service access, and bandwidth considerations that tie every layer together. Every section closes the loop back to "how does this change latency or throughput on exam day" — because high-performing network architectures questions are always decided by latency, throughput, jitter, or fixed-IP requirements, never by buzzwords.

Plain-Language Explanation of High-Performing Network Architectures

High-performing network architectures sound academic, but three everyday analogies turn them into something you can reason about under exam pressure.

Analogy 1 — The Restaurant Kitchen (ENA, EFA, and Placement Groups)

Imagine a busy restaurant kitchen. The standard EC2 virtual NIC is a single waiter carrying plates from the kitchen line to each table — fine for light lunch traffic, but the waiter becomes the bottleneck when twelve tables order at once. Enabling ENA (Elastic Network Adapter) is like swapping in a veteran server with a rolling trolley: same kitchen, same tables, but dramatically higher throughput and lower per-plate overhead because the delivery path is optimized. EFA (Elastic Fabric Adapter) goes one step further — it installs a dumbwaiter that lets the chefs skip the dining room entirely and pass plates directly between stations. That bypass is exactly what MPI workloads need to communicate with microsecond latency. Cluster placement groups are like putting all the prep stations on the same countertop so knives, oils, and pans are within arm's reach; spread placement groups deliberately place each chef at a different station so one dropped pan does not ruin the whole service; and partition placement groups divide the kitchen into racks of stations so one rack can fail without affecting the others. High-performing network architectures inside a single AZ are chosen by picking the right kitchen layout for the workload.

Analogy 2 — The Global Postal System (CloudFront, Global Accelerator, and Transit Gateway)

Now zoom out to planetary scale. CloudFront is the chain of neighborhood post office branches: cacheable HTTP content (packages that everyone wants) is pre-stocked near the customer, so they walk two blocks instead of waiting for a delivery from the central warehouse. Cache policies are the pick-up rules that decide whether the branch can reuse the parcel it already has or must call the warehouse. Origin Shield is the regional sorting hub that consolidates cache misses so the warehouse only sees one truck a day instead of fifty. AWS Global Accelerator, by contrast, is the corporate toll-free number: the customer dials one fixed number from anywhere on earth, the call instantly enters the AWS private phone backbone, and the call is routed to the healthy office with the shortest backbone path — no caching, no HTTP, just a faster wire. Transit Gateway is the central mail-sorting facility: every branch office (VPC) drops its outbound mail into the hub, the hub reads the route table and forwards to the correct destination, and new branches plug in without needing bilateral agreements with every existing branch. High-performing network architectures that span regions or thousands of VPCs always reduce to picking between "branch cache," "toll-free number," and "central sorter."

Analogy 3 — The Apartment Utilities (Direct Connect, LAG, SiteLink, and PrivateLink)

An on-premises datacenter connecting to AWS is an apartment building hooking into city utilities. Site-to-Site VPN is running the electric cable through a public power grid with a lock on your meter — cheap, fast to install, but voltage fluctuates. AWS Direct Connect is pulling a dedicated feeder cable from the utility substation straight into your building — weeks of trenching, but the wattage is contractually yours and no one else shares it. A dedicated connection is the whole feeder cable (1/10/100 Gbps); a hosted connection is a rented branch of a cable someone else already pulled (50 Mbps to 10 Gbps). Link Aggregation Groups (LAG) bundle up to four feeder cables into one thick trunk so you get 4×10 Gbps as one logical pipe. SiteLink is the utility company's new service that lets your Tokyo apartment and your Frankfurt apartment exchange power through the utility backbone without renting office space in between — branch-to-branch over the AWS private network, bypassing the public internet. AWS PrivateLink is the private service corridor that lets your building receive cable TV from one specific provider through a dedicated conduit, not the shared riser — unidirectional, low-latency, and invisible to other tenants. High-performing network architectures at the hybrid layer are all about which cable you pull and which corridor you reserve.

With the kitchen, the postal system, and the apartment utilities as a mental map, every high-performing network architectures question on SAA-C03 reduces to "which tool at which layer."

EC2 Network Performance Foundations — ENA, EFA, and the Baseline

High-performing network architectures start at the instance. Everything you build above EC2 inherits the NIC's packet-per-second rate, jitter, and cross-AZ latency — so picking the right enhanced networking driver and placement group is the first lever, not the last.

Enhanced Networking with ENA

The Elastic Network Adapter (ENA) is the standard high-throughput virtual NIC available on almost every modern EC2 instance family (M5 and up, C5 and up, R5 and up, Nitro-based). ENA delivers up to 100 Gbps of aggregate throughput on the largest instances, uses SR-IOV to bypass the hypervisor packet path, and ships with drivers pre-installed on every current Amazon Linux, Ubuntu LTS, and Windows Server AMI.

What ENA buys you compared to the legacy Xen paravirtualized driver:

  • Higher packets-per-second (PPS) — often 2 to 4 million PPS on c5n, up from a few hundred thousand.
  • Lower and more consistent inter-instance latency — single-digit microseconds inside the same AZ.
  • Hardware checksum offload and multi-queue support so a single instance's vCPUs do not become the NIC bottleneck.

For SAA-C03 you do not need to benchmark ENA; you need to recognize that any time a scenario says "high packet rate," "low jitter," "network-intensive EC2 workload," or mentions c5n, m5n, m5dn, r5n, the answer involves enhanced networking with ENA.

Elastic Fabric Adapter (EFA) for HPC and ML

EFA is ENA plus an OS-bypass data path. On supported instances (c5n.18xlarge, p4d.24xlarge, p5, hpc6a, hpc7a, and similar), EFA adds the Scalable Reliable Datagram (SRD) protocol that lets MPI and NCCL traffic skip the TCP/IP stack and the kernel entirely, reaching consistent sub-20-microsecond application-to-application latency between instances in the same cluster placement group.

When EFA is the correct answer:

  • Tightly-coupled HPC (CFD, molecular dynamics, weather simulation) running MPI.
  • Distributed deep-learning training (PyTorch DDP, Horovod, NCCL all-reduce) at 8+ nodes.
  • Any scenario that names libfabric, MPI, NCCL, or "tightly coupled."

When EFA is not the answer:

  • General web traffic, databases, or anything using plain TCP sockets — EFA falls back to standard ENA behavior for those, so you pay no penalty, but you also gain nothing.
  • Cross-AZ or cross-Region traffic — EFA requires the instances to be in the same AZ and the same cluster placement group.

Enhanced Networking, Defined — Enhanced networking is the AWS umbrella term for SR-IOV-based network acceleration on EC2. It covers two drivers: ENA (high throughput, available broadly, used by default on Nitro) and EFA (ENA plus OS-bypass for HPC/ML via the SRD protocol). Enhanced networking requires a supported instance type, a supported AMI, and, for EFA, the libfabric stack on the guest. Reference: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html

EC2 Placement Groups — Cluster, Spread, and Partition

Placement groups tell EC2 how to physically distribute your instances relative to each other. High-performing network architectures use placement groups to either maximize throughput (pack tightly) or minimize correlated failure (spread out). The three strategies are:

  1. Cluster placement group — All instances go into a single rack or pair of adjacent racks inside one AZ. This minimizes inter-instance latency and maximizes per-flow bandwidth (up to 10 Gbps per flow between instances). Best for HPC, MPI, financial trading, and tightly-coupled ML training. The classic trade-off is that a rack-level failure affects the whole group.

  2. Spread placement group — Each instance lands on a distinct underlying hardware rack, optionally across multiple AZs. Maximum 7 instances per AZ per spread group. Best for small critical fleets (for example, a handful of control-plane nodes or a NoSQL quorum) where losing two of seven at once is unacceptable. The hard limit of 7 is a favorite SAA-C03 trap.

  3. Partition placement group — Divides instances into up to 7 partitions per AZ, with each partition on a separate set of racks. Scales to hundreds of instances. Best for large distributed workloads like HDFS, Cassandra, or Kafka where the application can survive losing a whole partition but wants partitions to fail independently.

Placement Group Limits to Lock In —

  • Cluster: single AZ, maximum 10 Gbps per-flow between instances, same-rack locality, use with EFA for HPC.

  • Spread: max 7 instances per AZ per group (hard limit), each on distinct rack hardware.
  • Partition: up to 7 partitions per AZ, hundreds of instances per partition, rack-aware.
  • A single instance can be in only one placement group at a time.
  • You cannot merge placement groups; you can move an instance between groups only by stopping it. Reference: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html
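The three strategies and their limits reduce to a small decision rule. The helper below is a hypothetical illustration (not an AWS API) that encodes the choices and the hard spread-group limit from this section:

```python
def choose_placement_strategy(instances: int, goal: str) -> str:
    """Pick an EC2 placement group strategy using the exam's decision rules.

    goal: "latency" (tightly coupled, lowest inter-instance latency),
          "isolation" (small critical fleet, one instance per rack), or
          "rack_aware_scale" (large distributed store that tolerates losing
          a whole partition).
    """
    if goal == "latency":
        return "cluster"      # single AZ, same-rack locality, pair with EFA
    if goal == "isolation":
        if instances > 7:
            # spread groups enforce a hard limit of 7 instances per AZ
            raise ValueError("spread groups allow at most 7 instances per AZ")
        return "spread"
    if goal == "rack_aware_scale":
        return "partition"    # up to 7 partitions per AZ, hundreds of nodes
    raise ValueError(f"unknown goal: {goal}")
```

Feeding it the canonical exam scenarios (64-node MPI cluster, 5 control-plane nodes, 100-node Cassandra ring) returns cluster, spread, and partition respectively.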

Instance Network Bandwidth — The Hidden Ceiling

Every EC2 instance type has a documented baseline and burst network bandwidth. Small instances (t3.small, m5.large) have bursty credits similar to EBS; larger sizes (m5.24xlarge, c5n.18xlarge, c7gn.16xlarge) provide guaranteed throughput up to 100 Gbps. Two traps appear on exam questions:

  • Single-flow cap: Even on a 100 Gbps instance, a single TCP flow is capped at 5 Gbps (or 10 Gbps inside a cluster placement group). Scaling throughput requires multiple parallel flows.
  • Cross-AZ and cross-Region limits: AZ-to-AZ traffic incurs per-GB data-transfer charges and adds latency (roughly 1–2 ms within a region). Cross-region bandwidth has no hard cap but costs data transfer fees and adds tens of milliseconds of latency.

If a scenario demands "maximum throughput between two EC2 instances," the answer is: same AZ + cluster placement group + ENA or EFA + instance type with 25 Gbps or higher network bandwidth + multiple parallel flows.

The Four-Step EC2 Network Tuning Checklist — For any EC2-to-EC2 performance question, apply these four steps in order:

  1. Instance type: pick a Nitro family with sufficient baseline bandwidth (c5n, m5n, c6gn, c7gn).
  2. Driver: enable ENA; for HPC or ML all-reduce, add EFA.
  3. Placement: use a cluster placement group for lowest latency; spread or partition for fault isolation.
  4. Flow parallelism: open multiple TCP streams to bypass the single-flow cap. Skipping any step leaves performance on the table. Reference: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-network-bandwidth.html
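The per-flow cap and the instance rating interact in a way worth doing the arithmetic on once. This small sketch (a hypothetical helper, using the 5/10 Gbps per-flow figures quoted in this note) shows why step 4 of the checklist matters:

```python
def achievable_gbps(flows: int, instance_cap_gbps: float,
                    in_cluster_pg: bool = False) -> float:
    """Aggregate throughput between two EC2 instances, applying the
    per-flow cap: 5 Gbps normally, 10 Gbps inside a cluster placement
    group, never more than the instance's documented rating."""
    per_flow_cap = 10.0 if in_cluster_pg else 5.0
    return min(flows * per_flow_cap, instance_cap_gbps)

# One flow on a 100 Gbps instance outside a placement group: 5 Gbps.
# Ten parallel flows inside a cluster placement group: the full 100 Gbps.
```

A single flow leaves 95% of a 100 Gbps NIC idle; ten parallel flows inside a cluster placement group saturate it.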

CloudFront Deep Dive — Edge Caching for Latency Reduction

Amazon CloudFront is the AWS CDN and the first stop for any scenario involving "reduce latency for global users" on cacheable HTTP/HTTPS content. High-performing network architectures treat CloudFront as a four-layer cache: browser, edge cache, Regional Edge Cache, and Origin Shield, with the origin fetched only when every layer misses.

CloudFront Origins and Origin Groups

A CloudFront origin is the source of truth that CloudFront fetches from on cache miss. Supported origin types include:

  • Amazon S3 buckets (with Origin Access Control locking the bucket to CloudFront-only).
  • AWS MediaStore and MediaPackage for video workflows.
  • Elastic Load Balancers (ALB or NLB) fronting EC2, ECS, or EKS.
  • API Gateway endpoints (though API Gateway has its own edge caching).
  • Any publicly reachable HTTP server, including on-premises origins and non-AWS clouds.

An origin group pairs a primary origin with a secondary origin plus the set of HTTP status codes (typically 500, 502, 503, 504, and specific 4xx) that trigger origin failover. This is the fastest DR pattern you can implement at the edge — failover happens at the edge in milliseconds, without waiting for a Route 53 TTL to expire.
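The failover logic is simple enough to model directly. The status-code set below is hypothetical — real origin groups let you choose exactly which 4xx/5xx codes trigger failover — but the mechanism is as described above:

```python
# Hypothetical failover criteria for an origin group; the 4xx entries are
# configurable per distribution.
FAILOVER_CODES = {500, 502, 503, 504, 403, 404}

def serving_origin(primary_status: int) -> str:
    """Origin-group sketch: CloudFront retries the secondary origin when
    the primary answers with a configured failover status code."""
    return "secondary" if primary_status in FAILOVER_CODES else "primary"
```

Because the retry happens at the edge per-request, recovery is immediate rather than gated on a DNS TTL.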

Cache Behaviors and Path Patterns

A cache behavior maps a path pattern (for example /api/* or /static/*) to a specific origin plus a set of cache settings. CloudFront evaluates behaviors in order of specificity; the default behavior (*) is the fallback. Use cache behaviors to:

  • Send /api/* to an ALB with caching disabled while /static/* goes to S3 with a 24-hour TTL.
  • Attach different WAF Web ACLs to different path prefixes.
  • Enable different viewer protocol policies (HTTP redirects to HTTPS vs HTTPS-only) per path.
  • Associate Lambda@Edge or CloudFront Functions to specific behaviors only.

Cache Policies, Origin Request Policies, and Response Headers Policies

Modern CloudFront separates three concerns that used to share one settings blob:

  • Cache Policy — defines the cache key (which headers, cookies, and query strings are part of the hash) and the TTL. A narrower cache key means a higher hit ratio.
  • Origin Request Policy — defines what CloudFront forwards to the origin on cache miss, independent of the cache key. You can forward more to the origin than you key on, avoiding cache-key explosion.
  • Response Headers Policy — injects security headers (HSTS, CSP), CORS headers, and custom headers on the way out to the viewer.

AWS publishes managed policies (CachingOptimized, CachingDisabled, CachingOptimizedForUncompressedObjects, AllViewer, etc.) that cover 80% of use cases. Custom policies come into play when you need to key on a specific header like Accept-Language.

Raise Your Cache Hit Ratio by Narrowing the Cache Key — The single biggest CloudFront performance gain is increasing the cache hit ratio. Use a managed CachingOptimized policy as the starting point, forward only the headers/cookies/query strings that actually affect the response, and let the origin request policy forward the rest without keying on them. A hit-ratio jump from 70% to 95% effectively triples origin capacity and triples perceived speed for users. Reference: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/controlling-the-cache-key.html
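A toy model makes the cache-key mechanics concrete. This is not CloudFront's internal algorithm, just an illustration of the principle: only the keyed parts of a request participate in the hash, so anything forwarded-but-not-keyed cannot fragment the cache:

```python
import hashlib

def cache_key(path: str, query: dict, headers: dict,
              keyed_query: set, keyed_headers: set) -> str:
    """Toy CloudFront cache key: hash only the parts named in the cache
    policy. Values forwarded via the origin request policy but absent from
    keyed_query / keyed_headers never change the key."""
    parts = [path]
    parts += sorted(f"{k}={query[k]}" for k in query if k in keyed_query)
    parts += sorted(f"{h}={headers[h]}" for h in headers if h in keyed_headers)
    return hashlib.sha256("|".join(parts).encode()).hexdigest()

# Two requests differing only in a tracking parameter share one cache entry
# because utm_source is not part of the cache key:
a = cache_key("/static/app.js", {"v": "1", "utm_source": "ads"}, {}, {"v"}, set())
b = cache_key("/static/app.js", {"v": "1", "utm_source": "mail"}, {}, {"v"}, set())
assert a == b
```

Key on `utm_source` by mistake and every campaign link becomes its own cache miss — the cache-key-explosion failure mode the managed policies guard against.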

Origin Shield — The Regional Cache Consolidator

Origin Shield is an optional additional caching layer placed between CloudFront's Regional Edge Caches and your origin. When enabled, every cache miss from every edge location funnels through the Origin Shield Region first. Benefits:

  • Dramatically lower origin load: instead of hundreds of Regional Edge Caches each pulling from the origin, only Origin Shield does, and it coalesces concurrent requests for the same object into a single origin fetch.
  • Higher cache hit ratio for long-tail content where requests are too sparse to populate every Regional Edge Cache.
  • Lower origin egress cost because far fewer bytes leave the origin.

Origin Shield is especially valuable for:

  • Live-streaming video workflows where thousands of viewers request the same segment within seconds.
  • APIs with moderate cardinality where regional edge caches see low individual hit rates.
  • Any origin with strict egress limits (a small on-premises server or a Direct Connect link).

The flip side is the per-GB Origin Shield fee; on pure static assets where every edge already has high hit ratios, Origin Shield sometimes adds cost without meaningful benefit.
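The origin-load benefit is easiest to see as a count of simultaneous misses. This deliberately simplified model (assuming all regional caches miss on the same object at once) captures the request-coalescing claim above:

```python
def origin_fetches(missing_regional_caches: int, origin_shield: bool) -> int:
    """Toy model of origin load when several Regional Edge Caches all miss
    on the same object at once. Without Origin Shield each regional cache
    fetches from the origin itself; with it, the misses funnel through one
    shield Region, which coalesces them into a single origin fetch."""
    return 1 if origin_shield else missing_regional_caches

# A live-stream segment requested through 50 regional caches simultaneously:
# 50 origin fetches without Origin Shield, 1 with it.
```

Multiply that fetch count by segment size and segments per hour and the egress-cost argument for live streaming falls out directly.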

CloudFront Security and Performance Integrations

High-performing network architectures treat CloudFront as both a latency tool and a security tool:

  • AWS Shield Standard is free on every distribution and blocks layer 3/4 DDoS at the edge.
  • AWS WAF attaches a Web ACL for layer-7 filtering (SQL injection, XSS, rate limits).
  • Field-level encryption encrypts specific form fields (credit card numbers) at the edge so they are never decrypted until a specific backend service.
  • CloudFront Functions runs tiny JavaScript at every edge location (sub-millisecond) for URL rewrites, A/B splits, and header manipulation.
  • Lambda@Edge runs Node.js or Python in the Regional Edge Cache for richer logic (authentication, image resizing).

AWS Global Accelerator — Anycast IPs Over the AWS Backbone

Global Accelerator is CloudFront's non-HTTP cousin. It gives you two static anycast IPv4 addresses (optionally two additional IPv6) that are advertised from every AWS edge location. A client hitting those IPs enters the AWS backbone at the nearest edge and traverses the private backbone to the configured endpoint group in a specific region — bypassing congested public-internet paths for the bulk of the trip.

Core Global Accelerator Building Blocks

  • Accelerator — the top-level resource, issued the two static anycast IPs.
  • Listener — TCP or UDP on a specific port (or port range), optionally with client-affinity stickiness.
  • Endpoint Group — one per AWS Region, with a traffic-dial percentage (0–100) controlling how much traffic that region receives.
  • Endpoint — an ALB, NLB, EC2 instance, or Elastic IP. Weights control distribution inside a group.

Why Global Accelerator Beats Raw Public Internet

  • Anycast IP at the edge: packets enter AWS within tens of milliseconds regardless of the client's ISP path.
  • Private backbone: the rest of the journey runs on AWS-owned fiber with predictable latency and low jitter.
  • Fast failover: unhealthy endpoints are pulled from rotation in under 30 seconds, and traffic reroutes to the next-best region automatically.
  • Fixed IPs for allowlists: enterprise firewalls, gaming consoles, and IoT gateways often bake in IPs — the two anycast IPs never change for the lifetime of the accelerator.

Global Accelerator vs CloudFront — The Canonical Decision

This is the single most-tested distinction on SAA-C03 in the high-performing network architectures area. Use this side-by-side comparison, then the decision rule below:

  • Protocol — CloudFront: HTTP/HTTPS only. Global Accelerator: any TCP or UDP.
  • Caching — CloudFront: yes (edge + regional + Origin Shield). Global Accelerator: never caches.
  • IP addresses — CloudFront: many, rotating per distribution. Global Accelerator: 2 fixed anycast IPs (plus optional IPv6).
  • Best use case — CloudFront: static assets, video streaming, cacheable APIs. Global Accelerator: gaming, VoIP, IoT, financial trading, non-HTTP TCP, and HTTP where caching is impossible.
  • Pricing — CloudFront: per-request + per-GB egress. Global Accelerator: hourly accelerator fee + per-GB data-transfer premium.
  • Failover — CloudFront: origin failover on 5xx (and selected 4xx). Global Accelerator: health-check-driven across regions in under a minute.
  • Fixed-IP allowlists — CloudFront: no. Global Accelerator: yes.

Decision rule: if the traffic is cacheable HTTP(S), start with CloudFront. If the traffic is UDP, non-HTTP TCP, or HTTP but not cacheable and you need fixed IPs, pick Global Accelerator. They are not mutually exclusive — large stacks run CloudFront for /static/* and Global Accelerator for a real-time WebSocket or game server fleet.

Global Accelerator Does Not Cache — Ever — A common SAA-C03 trap is a scenario offering "accelerate a static website with Global Accelerator." Global Accelerator never caches content; it only accelerates the routing path. For static assets, CloudFront's edge cache is orders of magnitude faster because cached hits return from the edge without round-tripping to origin at all. Global Accelerator wins only on non-cacheable or non-HTTP traffic. Reference: https://docs.aws.amazon.com/global-accelerator/latest/dg/what-is-global-accelerator.html
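The decision rule above fits in a few lines. A hypothetical helper, encoding exactly the rule stated in this section:

```python
def edge_service(protocol: str, cacheable: bool) -> str:
    """The canonical CloudFront vs Global Accelerator decision as code.

    protocol: "http" covers HTTP/HTTPS; anything else ("tcp", "udp") is
    non-HTTP traffic that CloudFront cannot carry at all.
    """
    if protocol == "http" and cacheable:
        return "CloudFront"
    # UDP, non-HTTP TCP, or non-cacheable HTTP (often paired with a
    # fixed-IP allowlist requirement) -> Global Accelerator.
    return "Global Accelerator"
```

Run the classic trap scenario through it: a static website is cacheable HTTP, so the answer is CloudFront, never Global Accelerator.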

AWS Transit Gateway at Scale

AWS Transit Gateway is a regional cloud router that attaches VPCs, VPNs, Direct Connect gateways, and peer Transit Gateways in other regions. High-performing network architectures use it whenever the number of interconnected VPCs exceeds a handful — VPC peering becomes combinatorial (N×(N-1)/2) at scale, while Transit Gateway grows linearly.
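The combinatorial-vs-linear claim is worth one line of arithmetic. These two helpers (illustrative, not an AWS API) show why the exam's "40 VPCs" scenarios always land on Transit Gateway:

```python
def full_mesh_peerings(n_vpcs: int) -> int:
    """Bilateral VPC peering connections for a full mesh: N*(N-1)/2."""
    return n_vpcs * (n_vpcs - 1) // 2

def tgw_attachments(n_vpcs: int) -> int:
    """Transit Gateway attachments for the same VPCs: one per VPC."""
    return n_vpcs

# 40 VPCs: 780 bilateral peerings to manage, versus 40 hub attachments.
assert full_mesh_peerings(40) == 780
assert tgw_attachments(40) == 40
```

At 10 VPCs the mesh already needs 45 peerings, each with its own route-table entries on both sides; the hub needs 10 attachments.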

Transit Gateway Routing and Route Tables

Every attachment (VPC, VPN, DX, peering) is associated with one Transit Gateway route table but can propagate routes into many. Static routes and propagated (dynamic) routes coexist. Typical patterns:

  • Flat topology: a single default route table; every VPC reaches every other VPC.
  • Segmented topology: separate route tables for prod, dev, and shared services, with static routes controlling which segments can talk.
  • Shared-services hub: a central VPC hosting DNS, AD, and CI/CD, with both prod and dev route tables routing to it but not to each other.
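The segmented and shared-services patterns hinge on the association/propagation split, which a small data model makes tangible. The attachment and route-table names below are hypothetical:

```python
# Each attachment looks up routes in exactly one associated route table;
# each route table learns (propagates) routes from a chosen set of
# attachments. This pair of maps models the shared-services hub above.
ASSOCIATIONS = {
    "vpc-prod": "rt-prod",
    "vpc-dev": "rt-dev",
    "vpc-shared": "rt-shared",
}
PROPAGATIONS = {
    "rt-prod":   {"vpc-shared"},            # prod learns only shared services
    "rt-dev":    {"vpc-shared"},            # dev learns only shared services
    "rt-shared": {"vpc-prod", "vpc-dev"},   # shared services learns everyone
}

def can_reach(src: str, dst: str) -> bool:
    """True if src's associated route table has learned a route to dst."""
    return dst in PROPAGATIONS[ASSOCIATIONS[src]]

assert can_reach("vpc-prod", "vpc-shared")    # prod -> shared works
assert not can_reach("vpc-prod", "vpc-dev")   # segments stay isolated
```

Segmentation is therefore a route-table design decision, not a firewall rule: prod and dev never learn routes to each other, so the packets have nowhere to go.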

Transit Gateway Multicast

Transit Gateway supports native IP multicast across VPCs — useful for legacy enterprise apps (market-data distribution, video broadcast, service discovery) that require multicast. Multicast groups are defined on the Transit Gateway with sources and members expressed as ENIs; cross-AZ and cross-VPC delivery is handled by the TGW. This is the only way to do multicast inside AWS; native VPC multicast is not supported.

Transit Gateway Peering and Inter-Region Traffic

Two Transit Gateways in different regions can be peered over the AWS backbone, giving you transitive multi-region connectivity without stitching VPNs together across the public internet. Traffic between peered TGWs rides the AWS private network with the usual inter-region data-transfer pricing. Peering cannot span AWS partitions, and routes do not propagate dynamically across a peering attachment — you add static routes pointing at the peering attachment in each Transit Gateway's route tables.

Transit Gateway Scale Numbers to Remember

  • Up to 5,000 attachments per Transit Gateway.
  • Up to 10,000 routes per Transit Gateway route table.
  • 50 Gbps per VPC attachment (aggregate across flows).
  • Equal-Cost Multi-Path (ECMP) across multiple Direct Connect or VPN attachments for bandwidth aggregation.

Transit Gateway Is the Answer for "Many VPCs" Questions — Whenever SAA-C03 describes ten or more VPCs, multi-account organizations with dozens of accounts, hub-and-spoke shared services, or transitive routing ("A must reach C through the hub"), the answer is AWS Transit Gateway plus AWS Resource Access Manager for cross-account sharing. VPC peering is the right answer only when the scenario explicitly has two or three VPCs and wants the cheapest option. Reference: https://docs.aws.amazon.com/vpc/latest/tgw/how-transit-gateways-work.html
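Framed as the exam frames it, the VPC-connectivity choice is itself a small decision function. A hypothetical helper encoding the rule in the callout above:

```python
def vpc_connectivity(n_vpcs: int, needs_transitive: bool) -> str:
    """Exam rule of thumb: peering only for 2-3 VPCs with no transitive
    routing; Transit Gateway (shared via AWS RAM across accounts) for
    everything larger or anything hub-and-spoke."""
    if n_vpcs <= 3 and not needs_transitive:
        return "VPC peering"
    return "Transit Gateway + AWS RAM"
```

Two VPCs wanting the cheapest private link get peering; forty VPCs across five accounts, or any "A must reach C through the hub" wording, gets Transit Gateway.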

AWS Direct Connect — Dedicated Hybrid Bandwidth

Direct Connect is the physical layer of high-performing network architectures that span on-premises and AWS. It provides a private fiber link from a customer router in a Direct Connect location (a carrier-neutral colocation facility) to AWS, bypassing the public internet entirely.

Dedicated vs Hosted Connections

  • Dedicated Connection: a physical port (1 Gbps, 10 Gbps, or 100 Gbps) allocated exclusively to your AWS account. Ordered directly from AWS; provisioning takes weeks because the cross-connect must be physically patched in the colo.
  • Hosted Connection: a logical slice of a partner's existing Direct Connect port, provisioned by an AWS Direct Connect Partner. Speeds from 50 Mbps up to 10 Gbps. Faster to light (often days), and better for sub-gig bandwidth needs. Only one VIF per hosted connection.

Virtual Interfaces (VIFs)

Over a DX connection you create VIFs, each tagged with a VLAN:

  • Private VIF reaches a specific VPC through a Virtual Private Gateway, or many VPCs through a Direct Connect Gateway and Transit Gateway Transit VIF.
  • Public VIF reaches AWS public services (S3, DynamoDB, API endpoints) across the DX link without traversing the internet — useful when compliance forbids public egress.
  • Transit VIF connects to a Direct Connect Gateway associated with a Transit Gateway, letting one DX link serve many VPCs across multiple regions.

Link Aggregation Groups (LAG)

A LAG bundles up to four identical-speed dedicated connections into a single logical connection using LACP. Benefits:

  • Aggregate bandwidth: 4 × 10 Gbps behaves like 40 Gbps logically.
  • Redundancy: losing one physical port drops capacity but does not drop the BGP session if the minimum-links threshold is met.
  • Simpler BGP: one BGP peer per LAG instead of one per port.

For 100 Gbps workloads you can combine multiple 100 Gbps ports into a single LAG, though most customers satisfy their needs with a single 100 Gbps port.
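Capacity and the minimum-links threshold interact exactly as the bullets describe. A hypothetical sketch, using the up-to-four-connections limit stated above:

```python
def lag_state(port_gbps: int, total_ports: int, active_ports: int,
              minimum_links: int) -> tuple[int, bool]:
    """Return (current LAG capacity in Gbps, whether the LAG stays up).

    The LAG - and its single BGP session - survives port loss as long as
    the number of active ports meets the minimum-links threshold."""
    if total_ports > 4:
        raise ValueError("a LAG bundles at most four connections")
    return port_gbps * active_ports, active_ports >= minimum_links

# 4x10 Gbps LAG, minimum-links=2: losing one port drops capacity to
# 30 Gbps but keeps the bundle (and BGP) up.
```

Set minimum-links too high and a single port failure takes down the whole logical connection, defeating the redundancy benefit.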

Direct Connect SiteLink

AWS Direct Connect SiteLink is a feature you enable per private or transit VIF. When enabled, traffic between two Direct Connect locations flows across the AWS global backbone instead of the public internet, even if neither endpoint is a VPC. Use cases:

  • Branch-to-branch WAN replacement: Tokyo datacenter and Frankfurt datacenter each land a DX, enable SiteLink, and exchange traffic at AWS-backbone latency.
  • Failover WAN: SiteLink is a cheaper secondary to an MPLS WAN.
  • Multi-colo enterprise: consolidating many legacy MPLS circuits onto a few DX ports.

SiteLink charges a per-hour fee per enabled VIF plus the usual DX data transfer. It is the first AWS service that explicitly lets customers use the AWS backbone as a WAN without touching a VPC.

Direct Connect Resiliency Recommendations

AWS publishes four resiliency tiers; SAA-C03 expects you to recognize them:

  1. Development and test: one connection at one location.
  2. High resiliency: two connections to two different devices at the same DX location.
  3. Maximum resiliency: two connections at two different DX locations (with two different customer routers).
  4. Active/active plus VPN backup for disaster tolerance at the region level.

Direct Connect Is Not Encrypted by Default — Even though Direct Connect is a private dedicated link, traffic on it is not encrypted. If the workload requires encryption in transit (HIPAA, PCI-DSS, financial sector mandates), either enable MACsec on 10/100 Gbps dedicated ports, or run an IPSec VPN on top of the DX link terminating on a Transit Gateway or VGW. Assuming "private equals encrypted" is a classic high-performing network architectures exam trap. Reference: https://docs.aws.amazon.com/directconnect/latest/UserGuide/encryption-options.html

AWS PrivateLink — Private Service Exposure

High-performing network architectures use PrivateLink when they need to expose a service to many consumers without broad VPC peering. PrivateLink uses Interface VPC Endpoints (ENIs with private IPs) in the consumer's VPC, fronted by a Network Load Balancer in the producer's VPC, over the AWS private network. Its strengths:

  • SaaS-style service exposure: one producer VPC, hundreds or thousands of consumer VPCs, often across accounts.
  • Unidirectional service model: consumer calls producer; producer cannot initiate back — good for security boundaries.
  • No CIDR overlap concerns: peering fails when two VPCs have overlapping CIDRs; PrivateLink does not care because the interface endpoint gets an IP from the consumer's CIDR.
  • Fine-grained IAM: the endpoint supports endpoint policies, and the service provider controls which AWS accounts can connect.
How PrivateLink compares to the other VPC-connectivity options:

  • PrivateLink — lowest operational complexity for many-to-one service exposure, NLB-fronted, unidirectional, ENI-based.
  • VPC Peering — bidirectional, lowest data-processing cost at low VPC counts, does not scale to many consumers cleanly, CIDR must not overlap.
  • Transit Gateway — transitive routing across many VPCs, supports DX and VPN in the same hub, highest flexibility, per-attachment and per-GB processing fee.

For low-latency service calls at scale (thousands of microservice consumers hitting a small set of platform services), PrivateLink is usually the high-performing answer.

Many AWS services (Kinesis, SNS, SQS, Systems Manager, Secrets Manager, ECR, etc.) expose Interface VPC Endpoints so private subnets can reach them without a NAT Gateway. This both improves security and reduces NAT egress cost — a classic overlap between high-performing network architectures and cost-optimized network.

Bandwidth Considerations and Capacity Planning

The final layer of high-performing network architectures is capacity planning — making sure every piece of the path has headroom for the expected load.

Per-Flow vs Aggregate Bandwidth

  • Single TCP flow between two EC2 instances is capped at 5 Gbps in most cases, or 10 Gbps inside a cluster placement group.
  • Aggregate bandwidth up to the instance's documented rating (up to 100 Gbps on the largest instances, 200 Gbps on select Nitro instances).
  • If a single flow is the bottleneck, architect the application to use multiple parallel TCP connections, for example by sharding by key or running multiple worker threads.

NAT Gateway and Egress Throughput

NAT Gateway scales up to 100 Gbps per NAT Gateway (recently raised from the older 45 Gbps), but throughput is still a consideration:

  • Deploy one NAT Gateway per AZ to avoid cross-AZ latency and cross-AZ data-transfer charges.
  • For very-high-egress patching or container-image pulls, consider VPC Interface Endpoints for the specific AWS service instead of routing through NAT.
  • For DynamoDB and S3, a Gateway Endpoint is free and eliminates NAT traffic entirely.

Inter-AZ and Inter-Region Bandwidth

  • Cross-AZ traffic inside a region typically adds 1–2 ms of latency and is billed per GB in each direction; traffic that stays within a single AZ over private IPs is free.
  • Inter-region traffic adds tens to hundreds of milliseconds depending on geography and is priced at a premium per GB.
  • If sustained inter-region bandwidth is material, evaluate Transit Gateway Inter-Region Peering (runs on backbone, still priced per GB) vs Direct Connect + SiteLink (flat port fee, cheaper per GB at high volume).

Bandwidth Per Direct Connect Connection

  • 1 Gbps / 10 Gbps / 100 Gbps dedicated ports.
  • Up to four ports in a single LAG for 1 Gbps and 10 Gbps connections (100 Gbps LAGs support up to two).
  • Hosted connections from 50 Mbps up to 10 Gbps.
  • Remember DX ports are full-duplex; for pricing, only data transfer out of AWS over the VIF is billed (at the reduced DX rate), while data transfer in is free.

The Bandwidth Planning Checklist

For any SAA-C03 scenario that names throughput (e.g., "transfer 20 TB per day," "sustain 5 Gbps of streaming video"), walk this checklist:

  1. What is the source and destination — same AZ, cross-AZ, cross-Region, or on-prem?
  2. Is the traffic cacheable? If yes, CloudFront drops origin bandwidth dramatically.
  3. Is the traffic egress from a private subnet? NAT Gateway, Gateway Endpoint, or Interface Endpoint?
  4. Does a single TCP flow suffice, or do we need parallelism to beat the per-flow cap?
  5. For hybrid, is VPN (up to 1.25 Gbps per tunnel, can aggregate with ECMP) enough, or do we need Direct Connect?
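
Checklist step 5 reduces to arithmetic: each Site-to-Site VPN tunnel tops out around 1.25 Gbps, ECMP across tunnels (via Transit Gateway) multiplies that, and beyond a practical tunnel count Direct Connect takes over. A minimal sketch; the four-tunnel threshold is an illustrative design choice, not an AWS limit:

```python
import math

def hybrid_path(required_gbps: float, max_ecmp_tunnels: int = 4) -> str:
    """Pick a hybrid connectivity option from a raw bandwidth requirement."""
    tunnels = math.ceil(required_gbps / 1.25)  # ~1.25 Gbps per VPN tunnel
    if tunnels == 1:
        return "single VPN tunnel"
    if tunnels <= max_ecmp_tunnels:
        return f"VPN with ECMP across {tunnels} tunnels"
    return "Direct Connect"
```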

Side-by-Side Decision Table for High-Performing Network Architectures

Scenario → Correct High-Performing Network Architecture Choice

  • Global users, cacheable HTTP content → Amazon CloudFront with Origin Shield and optimized cache key
  • Global users, UDP game server with fixed IPs → AWS Global Accelerator, TCP/UDP listeners, endpoint groups per region
  • HPC with MPI across 64 nodes → c5n/hpc6a + EFA + cluster placement group, single AZ
  • Maximum per-instance throughput → Nitro instance + ENA + cluster placement group + multiple flows
  • 7 critical control-plane EC2s that must never fail together → Spread placement group across AZs
  • 100-node Cassandra ring tolerating partition failures → Partition placement group, one partition per rack
  • 40 VPCs across 5 accounts → AWS Transit Gateway shared via AWS RAM
  • Multicast market-data across VPCs → Transit Gateway multicast
  • On-prem to AWS at guaranteed 10 Gbps → Direct Connect dedicated 10 Gbps + private VIF
  • 40 Gbps hybrid bandwidth → LAG of 4×10 Gbps dedicated DX connections
  • Tokyo office to Frankfurt office via AWS → Direct Connect + SiteLink at each end
  • SaaS service exposed privately to 500 consumer VPCs → AWS PrivateLink with NLB
  • Reduce cross-region DX latency → Transit Gateway inter-region peering over backbone
  • Compliance requires encrypted hybrid traffic → DX with MACsec, or IPSec VPN over DX
  • Free private access to S3 from private subnet → VPC Gateway Endpoint for S3

Common High-Performing Network Architectures Exam Traps

  1. CloudFront vs Global Accelerator by protocol — HTTP cacheable → CloudFront. TCP/UDP or non-cacheable with fixed IPs → Global Accelerator.
  2. Spread placement group 7-instance-per-AZ limit — larger fleets need partition, not spread.
  3. Cluster placement group is single-AZ only — cross-AZ breaks the locality guarantee.
  4. EFA requires same cluster placement group and same AZ — otherwise you are paying for ENA with unused EFA libraries.
  5. Direct Connect is not encrypted — add MACsec or IPSec VPN over DX.
  6. VPC Peering is not transitive — many VPCs means Transit Gateway, not a peering mesh.
  7. Global Accelerator does not cache — static content still wants CloudFront in front.
  8. Single TCP flow cap at 5 Gbps — parallel flows required to saturate larger instances.
  9. NAT Gateway is zonal — one per AZ for HA and to avoid cross-AZ charges.
  10. Origin Shield is optional — pick only when cache-hit ratio or origin egress is the pain point.
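
Traps 1 and 7 reduce to a two-question decision. A toy helper under the assumptions above (cacheable HTTP goes to CloudFront; raw TCP/UDP or fixed-IP requirements go to Global Accelerator, which never caches):

```python
def edge_service(protocol: str, cacheable: bool, needs_fixed_ips: bool) -> str:
    """Toy decision helper for the CloudFront vs Global Accelerator traps."""
    if protocol in ("tcp", "udp") or needs_fixed_ips:
        return "Global Accelerator"  # no caching, but fixed anycast IPs
    if protocol in ("http", "https") and cacheable:
        return "CloudFront"          # edge caching drops origin load
    return "CloudFront (dynamic acceleration only, no cache benefit)"
```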

The Four Pillars of High-Performing Network Architectures

Every SAA-C03 scenario in this topic sits on one or more of these four pillars; use them as a checklist:

  1. Instance layer: Nitro + ENA (or EFA) + cluster placement group for intra-AZ throughput and latency.
  2. Edge layer: CloudFront for cacheable HTTP, Global Accelerator for TCP/UDP with fixed IPs.
  3. Inter-VPC layer: Transit Gateway at scale; VPC peering only for small counts; PrivateLink for service exposure.
  4. Hybrid layer: Direct Connect for bandwidth and predictability; SiteLink for site-to-site over backbone; VPN for speed of setup and backup.

Reference: https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/introduction.html

High-Performing Network Architectures vs VPC Security Foundations — Topic Boundary

Task 1.2 (VPC security) and Task 3.4 (high-performing network architectures) both touch VPC and both touch load balancers. The boundary:

  • 1.2 VPC security owns control-plane decisions: security groups, NACLs, private subnet design, encryption.
  • 3.4 high-performing network architectures owns data-plane decisions: ENA/EFA, placement groups, CloudFront, Global Accelerator, Transit Gateway throughput, Direct Connect bandwidth.

When a question asks "what protects the traffic?" it is 1.2. When it asks "what accelerates the traffic?" it is 3.4. Occasionally a single service (for example PrivateLink) appears in both; pick the domain that matches the task statement's verb — "secure" or "perform."

High-Performing Network Architectures vs Cost-Optimized Network — Topic Boundary

Task 4.4 (cost-optimized network) overlaps with 3.4 on Transit Gateway, Direct Connect, and NAT Gateway. The boundary is straightforward:

  • 3.4 high-performing cares about latency, throughput, jitter, and fixed IPs.
  • 4.4 cost-optimized cares about dollars per GB, attachment fees, and egress reduction.

Cluster placement groups and EFA never appear in 4.4 because they are purely performance levers. Gateway endpoints appear in both — they improve latency (3.4) and eliminate NAT cost (4.4), which is why they are a frequent correct answer across the exam.

Practice Question Patterns for High-Performing Network Architectures

Expect SAA-C03 to drill high-performing network architectures in these patterns:

  • Edge-service selection: "Users worldwide, cacheable JSON API" → CloudFront with CachingOptimized policy.
  • Protocol discrimination: "UDP game traffic, fixed IPs for console firewalls" → Global Accelerator.
  • Placement group scenario: "HPC MPI cluster of 32 nodes, microsecond latency" → cluster placement group + EFA.
  • Spread limit trap: "11 critical servers that must not fail together" → partition (not spread, because spread caps at 7 per AZ).
  • Multi-VPC scale: "50 VPCs across 10 accounts" → Transit Gateway + AWS RAM.
  • Hybrid bandwidth: "Consistent 8 Gbps from on-prem to AWS" → Direct Connect 10 Gbps dedicated + private VIF.
  • Cross-branch WAN: "Tokyo and Frankfurt offices connected via AWS private backbone" → Direct Connect + SiteLink.
  • Private SaaS exposure: "Expose internal service to 300 consumer VPCs without peering" → PrivateLink + NLB.

FAQ — High-Performing Network Architectures Top Questions

Q1: When should I use EFA instead of plain ENA for high-performing network architectures?

Use EFA when your workload is tightly coupled, uses MPI or NCCL, and runs across multiple EC2 instances in the same cluster placement group and same AZ. EFA's OS-bypass SRD protocol delivers consistent sub-20-microsecond application latency, which is the difference between a 10-node HPC cluster scaling linearly and collapsing under MPI all-reduce overhead. For everything else — web servers, databases, TCP microservices — ENA is sufficient and EFA adds no performance benefit because the application still goes through the normal kernel network stack. Remember EFA requires same-AZ, same-cluster-placement-group, a supported instance type (c5n.18xlarge, p4d, p5, hpc6a, hpc7a, etc.), and the libfabric user-space libraries installed in the AMI.

Q2: How do I decide between cluster, spread, and partition placement groups in high-performing network architectures?

Start from the failure-tolerance and latency requirements. Cluster is the answer when latency and throughput matter most and the whole group can tolerate rack-level correlated failure (HPC, tightly coupled ML training, low-latency trading). Spread is the answer when you have seven or fewer critical instances per AZ that must absolutely not share hardware (NoSQL quorum nodes, control-plane schedulers). Partition is the answer when you have many instances that should fail in independent blast-radius zones — HDFS workers, Kafka brokers, Cassandra rings with hundreds of nodes. The hard limits are what decide most exam questions: spread caps at 7 per AZ, partition caps at 7 partitions per AZ with unlimited instances inside each partition, cluster is single-AZ. If a scenario describes more than 7 critical instances per AZ, spread is wrong by definition.
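
The hard limits in this answer can be encoded directly. A minimal sketch, using the limits quoted above (spread caps at 7 instances per AZ, cluster is single-AZ, partition handles large fleets):

```python
def placement_group(critical_instances_per_az: int, priority: str) -> str:
    """Choose a placement group from the exam's hard limits."""
    if priority == "latency":
        return "cluster (single AZ only)"
    if priority == "isolation" and critical_instances_per_az <= 7:
        return "spread"
    return "partition"  # up to 7 partitions per AZ, unlimited instances each
```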

Q3: CloudFront Origin Shield vs a regular CloudFront distribution — when is Origin Shield worth the extra fee?

Origin Shield is worth it when your cache hit ratio at regional edge caches is low and your origin egress is expensive or bandwidth-constrained. Typical fits: live video (thousands of viewers request the same segment within seconds, Origin Shield collapses those into a single origin fetch), long-tail content libraries (Wikipedia-style, where individual edge caches rarely see the same object twice), and on-premises origins reached via Direct Connect where origin egress counts against a finite DX port. For static websites with high natural cache-hit ratio (everyone requests the same CSS/JS), Origin Shield can add cost without meaningful savings — measure first, then enable. Origin Shield is enabled per origin, so you can use it selectively on the one origin that benefits.

Q4: Global Accelerator or CloudFront for a REST API serving mobile clients worldwide?

The answer depends on the cacheability of the responses. If responses are cacheable (idempotent GETs, feature flags, product catalogs), front the API with CloudFront — a cache hit returns in tens of milliseconds at the edge and never reaches the origin. If responses are per-user and non-cacheable (POSTs, user-specific data, realtime signals) and the mobile clients need fixed IPs for enterprise firewalls or lowest-possible non-cached latency to a specific region, pick Global Accelerator — it routes the TCP connection onto the AWS backbone at the nearest edge and maintains fast failover across regions. A common pattern is both: CloudFront for cacheable paths and Global Accelerator for the WebSocket endpoint.

Q5: How do I choose between VPC Peering, Transit Gateway, and PrivateLink?

VPC Peering is the right choice for two or three VPCs with no CIDR overlap and no transitive needs — it has no hourly charge and no per-GB data-processing fee (standard cross-AZ data-transfer rates still apply). Transit Gateway is the right choice for many VPCs (typically 4+), multi-account sharing through AWS RAM, transitive routing, or consolidating VPN and Direct Connect alongside VPC attachments. TGW adds a per-attachment hourly fee and a per-GB processing fee, but its linear scaling easily beats the mesh complexity of peering. PrivateLink is the right choice when you expose a specific service to many consumers — it is unidirectional, uses an NLB in the producer VPC and an Interface Endpoint in each consumer VPC, tolerates CIDR overlap, and scales to thousands of consumers without touching route tables. In practice, large organizations run all three in the same architecture: Transit Gateway for general multi-VPC routing, peering for a couple of special-case pairs, and PrivateLink for platform services.

Q6: Can Direct Connect SiteLink replace an MPLS WAN between offices?

For many enterprises, yes — or at least a credible second option. SiteLink lets traffic between two on-premises sites, both with Direct Connect, flow across the AWS private backbone instead of the public internet or an MPLS circuit. The latency is comparable to or better than MPLS in most geographies, and the per-Mbps cost is typically far lower than incumbent MPLS pricing. SiteLink's caveats: both ends must land a Direct Connect, you pay a per-hour SiteLink VIF fee plus per-GB data transfer, and failover to a secondary path must be designed deliberately (BGP weighting on your side). A common migration pattern is running SiteLink as the primary path with the legacy MPLS as backup, then retiring MPLS once SiteLink has proven stable for 6–12 months.

Q7: When is PrivateLink the faster option for private service access?

PrivateLink is faster in three specific cases. First, accessing AWS services (S3, DynamoDB, Kinesis, SNS, Secrets Manager, ECR, etc.) from a private subnet via Interface Endpoint is faster than routing through a NAT Gateway because the hop count is shorter and the traffic stays on the AWS private network end-to-end — plus it eliminates NAT Gateway data-processing fees. Second, exposing a service to many VPCs via PrivateLink gives each consumer a local ENI with a private IP; round-trip latency within a region is sub-millisecond, with no intermediate route-table hops to manage as with peering. Third, PrivateLink is faster to scale operationally: adding the 501st consumer takes zero changes on the producer side, while peering requires a new peering connection and route-table entry on both sides. For cross-region private service access, pair PrivateLink with Transit Gateway inter-region peering or with a dedicated replication layer.

Further Reading on High-Performing Network Architectures

  • AWS VPC Connectivity Options Whitepaper (latest revision)
  • Amazon EC2 Enhanced Networking documentation (ENA and EFA)
  • Amazon CloudFront Developer Guide — Cache Policies, Origin Request Policies, and Origin Shield chapters
  • AWS Global Accelerator Developer Guide — Endpoints and Traffic Dials sections
  • AWS Transit Gateway Design Best Practices Whitepaper
  • AWS Direct Connect Resiliency Recommendations Whitepaper
  • AWS PrivateLink Concepts and Scenarios

Mastering high-performing network architectures — enhanced networking with ENA and EFA, cluster/spread/partition placement groups, CloudFront with Origin Shield, Global Accelerator for TCP/UDP, Transit Gateway at scale, Direct Connect with LAG and SiteLink, and PrivateLink for private service exposure — gives you the tool chest SAA-C03 Task 3.4 tests repeatedly. Every high-performing network architectures scenario resolves to picking the right layer: instance tuning, edge acceleration, inter-VPC routing, or hybrid bandwidth. Memorize the four pillars, practice the decision tables, and high-performing network architectures questions become reliable points on exam day.

Official sources