Cost-optimized compute is the SAA-C03 Task 4.2 discipline of matching every workload to the cheapest compute contract and runtime that still meets its availability, latency, and flexibility SLAs. Unlike the CLF-C02 pricing-models topic — which only asks you to recognize On-Demand, Reserved Instances, Savings Plans, Spot, and Dedicated Hosts — cost-optimized compute on SAA-C03 expects you to architect the whole stack: pick the purchasing option, mix Spot with On-Demand in an Auto Scaling group, migrate to Graviton, rightsize with Compute Optimizer, tune Lambda memory, and know when Fargate Spot beats EC2 Spot.
This study guide walks the full cost-optimized compute decision tree, compares Compute Savings Plans against EC2 Instance Savings Plans in exam-relevant depth, unpacks Spot Instance architectural patterns (mixed-instances ASG, interruption handling, Spot fleet), explains the up-to-40% price-performance advantage of Graviton 2/3, and closes with Lambda memory-vs-duration math and a realistic Compute Optimizer rightsizing workflow. By the end you should be able to read any SAA-C03 cost-optimized compute scenario and name the right purchasing option, instance family, and runtime without hesitation.
What is Cost-Optimized Compute on AWS
Cost-optimized compute is a design discipline, not a single service. It combines three levers you pull together: the purchasing option (how you pay), the runtime (EC2 vs container vs Lambda), and the processor family (x86 Intel, x86 AMD, or ARM Graviton). The SAA-C03 exam tests all three levers, often in the same scenario question.
The three levers of cost-optimized compute
- Purchasing option — On-Demand, Reserved Instances, Compute Savings Plans, EC2 Instance Savings Plans, Spot Instances, Dedicated Hosts, On-Demand Capacity Reservations. Each trades commitment or interruption risk for discount.
- Runtime selection — EC2, ECS on EC2, ECS on Fargate, EKS on EC2, EKS on Fargate, AWS Batch, AWS Lambda. Each has a different billing granularity (per-second EC2, per-millisecond Lambda, per-second Fargate).
- Processor family — Intel x86 (default), AMD x86 (roughly 10% cheaper than Intel on same vCPU), AWS Graviton (up to 40% better price-performance versus comparable x86 instances).
Cost-optimized compute on AWS is the design discipline of selecting the purchasing option, runtime, and processor family that deliver a workload's required throughput and availability at the lowest total cost. It is the core deliverable of SAA-C03 Task 4.2 and always involves a trade-off between commitment flexibility, interruption risk, and operational complexity.
Why cost-optimized compute dominates Domain 4
Domain 4 of SAA-C03 is 20% of the exam, and community retros consistently report that cost-optimized compute questions outnumber every other Domain 4 sub-topic combined. The reason is simple: compute is the largest line item in most AWS bills, so AWS expects a Solutions Architect Associate to save customers real money on compute before anything else. Every cost-optimized compute question on SAA-C03 ultimately tests whether you can pick the cheapest compute that meets the availability requirement in the scenario.
Cost-Optimized Compute Purchasing Decision Tree
The single most tested artifact on SAA-C03 Domain 4 is the cost-optimized compute purchasing decision tree. Memorize it in this exact order of questions.
The seven-question decision tree
- Can the workload tolerate a 2-minute interruption? If yes → consider Spot Instances (or Fargate Spot for containers). If no → continue.
- Is the workload steady, predictable, and running for 1+ years? If no → On-Demand is the floor. If yes → continue.
- Will you keep the same instance family, region, OS, and tenancy for the full commitment term? If yes → Standard Reserved Instance or EC2 Instance Savings Plan (deepest discount, up to 72%). If no → continue.
- Do you need flexibility to change instance family or switch to Fargate/Lambda? If yes → Compute Savings Plan (up to 66% off, covers EC2 + Fargate + Lambda).
- Do you need guaranteed capacity in a specific AZ without a pricing commitment? If yes → On-Demand Capacity Reservation (stacks with Savings Plans / RIs).
- Do you need physical server isolation or per-socket BYOL licensing (SQL Server, Oracle)? If yes → Dedicated Hosts.
- Can you move off EC2 entirely? If the workload is event-driven and short-lived → Lambda. If it's containerized but bursty → Fargate or Fargate Spot.
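As a study aid, the seven questions above can be collapsed into a small hypothetical Python function. The `Workload` fields and the exact branch ordering are illustrative simplifications for memorization, not an AWS API:

```python
from dataclasses import dataclass

@dataclass
class Workload:
    # All fields are illustrative flags, not AWS attributes.
    interruptible: bool = False          # survives a 2-minute Spot interruption?
    steady_one_year_plus: bool = False   # predictable and running 1+ years?
    fixed_config: bool = False           # same family/region/OS/tenancy all term?
    needs_az_capacity: bool = False      # guaranteed capacity in a specific AZ?
    byol_per_socket: bool = False        # per-socket BYOL licensing?

def pick_purchasing_option(w: Workload) -> str:
    """Walk the decision tree roughly in the order the guide lists the questions."""
    if w.interruptible:
        return "Spot Instances"
    if not w.steady_one_year_plus:
        return "On-Demand"
    if w.byol_per_socket:
        return "Dedicated Hosts"
    if w.needs_az_capacity:
        return "On-Demand Capacity Reservation"
    if w.fixed_config:
        return "Standard RI / EC2 Instance Savings Plan"
    # Steady but flexible: might change family or add Fargate/Lambda.
    return "Compute Savings Plan"
```

For example, a steady workload that may later move to Fargate maps to `pick_purchasing_option(Workload(steady_one_year_plus=True))`, which lands on the Compute Savings Plan branch.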
Purchasing options side-by-side
| Option | Discount vs On-Demand | Commitment | Flexibility | Interruption | Capacity guarantee |
|---|---|---|---|---|---|
| On-Demand | 0% | None | Full | No | No |
| EC2 Instance Savings Plan | Up to 72% | 1 or 3 yr | Family + region locked | No | No |
| Standard Reserved Instance | Up to 72% | 1 or 3 yr | Instance config locked | No | Zonal RI only |
| Convertible Reserved Instance | Up to 66% | 1 or 3 yr | Can exchange family | No | No |
| Compute Savings Plan | Up to 66% | 1 or 3 yr | EC2 + Fargate + Lambda | No | No |
| Spot Instance | Up to 90% | None | Any | 2-min notice | No |
| On-Demand Capacity Reservation | 0% (stacks with SP/RI) | None | AZ-specific | No | Yes |
| Dedicated Host (On-Demand) | 0% | None | BYOL licensing | No | Host-level |
| Dedicated Host Reservation | Up to 70% | 1 or 3 yr | BYOL licensing | No | Host-level |
For 90% of SAA-C03 cost-optimized compute scenarios, this short decision script works: "Interruptible → Spot. Steady and identical config for 3 years → Reserved Instance. Steady but might change family or add Lambda/Fargate → Compute Savings Plan. Unpredictable or short-term → On-Demand. BYOL per-socket license → Dedicated Hosts." Learn the qualifier keywords in the question — they always map to one branch.
Compute Savings Plans vs EC2 Instance Savings Plans — The Canonical SAA Trap
SAA-C03 loves the Compute Savings Plans vs EC2 Instance Savings Plans distinction because both hit cost-optimized compute but with very different flexibility scopes. Let's dissect it.
Compute Savings Plan — maximum flexibility
A Compute Savings Plan commits you to a dollar-per-hour compute spend (for example, $10/hour) for 1 or 3 years. In exchange, AWS applies up to 66% off On-Demand across:
- Any EC2 instance family (c5, m6g, r7i, anything).
- Any size within that family.
- Any region (the discount moves with your workload).
- Any OS (Linux, Windows, RHEL, SUSE).
- Any tenancy (shared or dedicated instance tenancy).
- AWS Fargate — both ECS on Fargate and EKS on Fargate.
- AWS Lambda — duration cost (up to 17% off Lambda duration pricing).
Compute Savings Plans are the most flexible commitment-based cost-optimized compute option on AWS. They are the right answer whenever a scenario mentions changing instance families, mixing EC2 with Fargate or Lambda, or migrating across regions.
EC2 Instance Savings Plan — deeper discount, tighter lock
An EC2 Instance Savings Plan matches the Standard Reserved Instance discount (up to 72%) but is less flexible than a Compute Savings Plan. You commit to a dollar-per-hour spend AND lock to:
- A specific EC2 instance family (for example, c5).
- A specific AWS region (for example, us-east-1).
Within that family+region lock you can still change size (c5.large → c5.xlarge), OS (Linux → Windows), AZ, and tenancy. But you cannot use an EC2 Instance Savings Plan for Fargate, Lambda, or a different family.
Coverage scope at a glance
| Coverage | Compute SP | EC2 Instance SP | Standard RI | Convertible RI |
|---|---|---|---|---|
| EC2 across all families | Yes | No (one family) | No (one config) | No |
| EC2 across all regions | Yes | No (one region) | No (one region) | No |
| AWS Fargate | Yes | No | No | No |
| AWS Lambda | Yes | No | No | No |
| Max discount | 66% | 72% | 72% | 66% |
| Can exchange | N/A (auto-applies) | N/A | No | Yes |
| Capacity reservation | No | No | Zonal only | No |
How the discounts stack and apply order
When you run workloads with both Reserved Instances and Savings Plans, AWS applies discounts in this order to maximize your savings:
- Zonal / Standard Reserved Instances — applied first to matching usage.
- EC2 Instance Savings Plans — applied next to remaining eligible EC2 usage.
- Compute Savings Plans — applied last to any remaining EC2, Fargate, or Lambda spend.
- On-Demand rate — billed for any usage above your Savings Plan commitment.
This stacking is why sophisticated FinOps teams layer RIs as a rock-solid baseline under a flexible Compute Savings Plan — the RI covers the exact instance config they're sure about, and the Compute Savings Plan mops up the rest of their cost-optimized compute spend.
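The application order above can be sketched as a toy waterfall. This is a simplification — real AWS billing matches individual usage line items rather than one dollar pool — and all names here are illustrative:

```python
def apply_discount_layers(on_demand_usd, ri_covered_usd,
                          ec2_sp_commit_usd, compute_sp_commit_usd):
    """Toy model of the order AWS applies discounts to an hour of usage.

    Each layer absorbs as much of the remaining On-Demand-equivalent spend
    as its coverage allows; whatever is left bills at the On-Demand rate.
    """
    remaining = on_demand_usd
    layers = {}
    for name, coverage in [("reserved_instances", ri_covered_usd),
                           ("ec2_instance_sp", ec2_sp_commit_usd),
                           ("compute_sp", compute_sp_commit_usd)]:
        layers[name] = min(remaining, coverage)
        remaining -= layers[name]
    layers["on_demand"] = remaining  # usage above all commitments
    return layers
```

With $100 of hourly usage, $30 RI coverage, a $20 EC2 Instance SP commitment, and a $25 Compute SP commitment, the final $25 bills at On-Demand rates.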
If a SAA-C03 scenario asks "How does a company reduce cost on a steady, predictable AWS Lambda workload?" the correct cost-optimized compute answer is Compute Savings Plan — up to 17% off Lambda duration. Reserved Instances do not exist for Lambda or Fargate. An EC2 Instance Savings Plan does not cover Lambda or Fargate either. Every time Lambda or Fargate appears in a steady-state cost question, the Compute Savings Plan is the answer.
A classic cost-optimized compute trap on SAA-C03 is a scenario where a team runs ECS on Fargate for steady containers and picks "EC2 Instance Savings Plan" because it matches the 72% RI discount number. This is wrong — EC2 Instance Savings Plans only cover EC2 instances in one family and region, not Fargate or Lambda. For Fargate steady-state cost optimization, the only Savings Plan flavor that applies is the Compute Savings Plan. The 6-percentage-point discount difference (72% vs 66%) is the price of Fargate coverage.
Spot Instances — Architectural Patterns for Cost-Optimized Compute
Spot Instances give you up to 90% off On-Demand for interruption-tolerant workloads. SAA-C03 does not just test "what is Spot?" — it tests how you architect with Spot to reduce cost-optimized compute spend without sacrificing availability.
Spot interruption handling fundamentals
When AWS needs Spot capacity back for On-Demand customers, you receive a 2-minute interruption notice through two channels:
- EC2 instance metadata endpoint at http://169.254.169.254/latest/meta-data/spot/instance-action returns the interruption action and timestamp once reclamation is scheduled (it returns 404 until then).
- Amazon EventBridge emits an `EC2 Spot Instance Interruption Warning` event you can route to Lambda, SQS, or Step Functions.
Your instance can respond to the interruption notice by:
- Draining connections — deregister from an Application Load Balancer target group before the 2-minute window expires.
- Checkpointing state — flush in-progress work to S3, DynamoDB, or EBS.
- Graceful task shutdown — if running ECS, set `stopTimeout` in the container definition so your containers get time to exit cleanly after SIGTERM.
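A minimal sketch of handling the metadata channel: the helper below parses the `spot/instance-action` JSON body and computes the time remaining before reclamation. The function name and structure are our own; only the endpoint and payload shape come from the text above:

```python
import json
from datetime import datetime, timezone

def seconds_until_reclaim(instance_action_body, now):
    """Parse a spot/instance-action response and return seconds left.

    The body looks like {"action": "terminate", "time": "2025-04-01T12:00:00Z"}.
    In production you would poll the metadata endpoint itself
    (http://169.254.169.254/latest/meta-data/spot/instance-action, which
    404s until an interruption is scheduled) and then drain/checkpoint
    within the remaining window.
    """
    payload = json.loads(instance_action_body)
    reclaim_at = datetime.strptime(payload["time"], "%Y-%m-%dT%H:%M:%SZ")
    reclaim_at = reclaim_at.replace(tzinfo=timezone.utc)
    return (reclaim_at - now).total_seconds()
```

A notice received two minutes before the `time` value yields 120 seconds — the full interruption window in which to drain connections and flush state.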
Interruption behaviors — stop, hibernate, terminate
When the 2-minute window ends, EC2 applies the interruption behavior you configured:
- Terminate (default) — instance removed, instance store lost, EBS retained per DeleteOnTermination flag.
- Stop — EBS-backed Spot Instances stop (EBS retained). You can restart when Spot capacity is available again.
- Hibernate — RAM is flushed to the root EBS volume before stop. On restart, your process resumes where it was. Only supported on specific instance families.
Spot Fleet and EC2 Fleet
A Spot Fleet or EC2 Fleet launches a target capacity across multiple instance types and AZs, automatically diversifying to reduce interruption blast radius. Fleet types:
- request — one-time launch to meet target capacity.
- maintain — if any instance is interrupted, fleet automatically replaces it with another Spot instance from a different pool.
ASG mixed-instances policy — the SAA-favorite pattern
The single most-tested cost-optimized compute architectural pattern on SAA-C03 is the EC2 Auto Scaling group with a mixed-instances policy. Instead of locking the ASG to one purchasing option and one instance type, you configure:
- Multiple instance types — for example, c5.large, c5a.large, c6i.large, c6a.large, m5.large. More instance pools means lower interruption probability for any one Spot pool.
- Multiple purchasing options — `OnDemandBaseCapacity` sets a floor of On-Demand instances for stability; `OnDemandPercentageAboveBaseCapacity` decides what fraction of instances above the base are On-Demand versus Spot.
- Allocation strategies — `price-capacity-optimized` (recommended) picks the Spot pools with lowest price AND lowest interruption probability. `capacity-optimized` prioritizes availability. `lowest-price` picks pure cost (risky).
A typical cost-optimized compute ASG mixed-instances policy might be: base capacity of 2 On-Demand instances, 20% of capacity above base as On-Demand, 80% Spot, across 5 instance types in 3 AZs. You get roughly 70% blended cost savings versus pure On-Demand while maintaining a stable floor for graceful degradation during Spot interruptions.
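The blended-savings arithmetic behind such a policy can be sanity-checked with a few lines of Python. This is illustrative math only — real Spot discounts float per pool and instance type:

```python
def blended_savings(total_instances, od_base, od_pct_above_base, spot_discount):
    """Blended savings vs pure On-Demand for an ASG mixed-instances policy.

    Assumes every instance would otherwise cost the same On-Demand rate
    (a simplification; a real fleet mixes instance types and prices).
    """
    above_base = total_instances - od_base
    od_above = above_base * od_pct_above_base      # On-Demand above the floor
    spot = above_base - od_above                   # everything else is Spot
    # Normalize the pure On-Demand cost to 1 unit per instance.
    blended_cost = od_base + od_above + spot * (1 - spot_discount)
    return 1 - blended_cost / total_instances
```

At 12 instances with a base of 2, 20% On-Demand above base, and Spot at 90% off, the blend saves 60% versus pure On-Demand; larger fleets with the same floor push the figure higher, toward the ~70% the text cites.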
Spot Instances: up to 90% off, 2-minute interruption notice via instance metadata + EventBridge. Use interruption behaviors (terminate / stop / hibernate) to preserve state. Diversify across instance pools with Spot Fleet or ASG mixed-instances policy. Default to price-capacity-optimized allocation strategy. Combine OnDemandBaseCapacity (stable floor) + Spot (majority) for resilient cost-optimized compute. Never use pure Spot for stateful production databases — use it for stateless workers, batch jobs, CI runners, and EMR task nodes.
Spot workload patterns — what fits, what doesn't
Fits Spot:
- Amazon EMR task nodes (core nodes should stay On-Demand for HDFS durability).
- AWS Batch array jobs (automatic retry on interruption).
- CI/CD build workers (GitHub Actions runners, Jenkins agents).
- Stateless web tier behind ALB with enough over-capacity to absorb a Spot eviction.
- ML training jobs that checkpoint.
- Kubernetes worker nodes running stateless pods (paired with pod disruption budgets).
Does not fit Spot:
- Stateful production databases (RDS primary, self-managed PostgreSQL).
- Session-affinity web servers without session replication.
- Jobs that cannot be interrupted and resumed (real-time trading matching engine).
- Kubernetes control plane nodes.
Graviton — The 40% Cost-Optimized Compute Shortcut
AWS Graviton is a family of ARM-based processors designed by AWS that deliver up to 40% better price-performance than comparable x86 Intel or AMD instances. Graviton is the single biggest cost-optimized compute lever available without committing to Savings Plans or tolerating Spot interruption.
Graviton generations — 2, 3, and 4
- Graviton 2 — instance family suffix `g` (c6g, m6g, r6g, t4g, x2gd, i4g). Up to 40% better price-performance than comparable x86 fifth-generation instances.
- Graviton 3 — instance family suffix `g` on seventh-generation families (c7g, m7g, r7g). Up to 25% better performance than Graviton 2, meaning even deeper price-performance gains.
- Graviton 4 — newest generation on r8g and the latest c8g/m8g families. Further performance gains for memory-intensive workloads.
Identifying Graviton in instance names
If an EC2 instance name has a `g` before the dot — like m6g.large, c7g.2xlarge, r7g.16xlarge, t4g.medium — it's Graviton. The `g` stands for Graviton. Intel is typically `i` or no letter (c5, m5, r5, m5n, c6i). AMD is `a` (c5a, m5a, r5a).
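That naming convention can be encoded as a quick heuristic. This is our own helper, not an official AWS mapping, and it will misclassify exotic families (GPU `g4dn`/`g5` instances are handled by skipping the family letter and generation digit, but accelerator families like inf2 fall through to the Intel default):

```python
def processor_family(instance_type):
    """Guess the processor from an EC2 instance type name (heuristic only).

    Looks at the letters AFTER the family letter and generation digit,
    e.g. the "g" in "m6g" or the "a" in "c5a".
    """
    prefix = instance_type.split(".")[0]   # "m6g" from "m6g.large"
    capability_letters = prefix[2:]        # skip family letter + generation digit
    if "g" in capability_letters:
        return "Graviton (ARM)"
    if "a" in capability_letters:
        return "AMD x86"
    return "Intel x86"
```

Note that GPU families starting with `g` (g4dn.xlarge) are still classified as Intel, because the leading `g` is the family letter, not a suffix.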
What workloads fit Graviton
Graviton runs anything that compiles to ARM, which now covers nearly every modern workload:
- Containers — any OCI image with a multi-arch manifest (most AWS-published images are multi-arch).
- Java, Go, Python, Node.js, .NET 6+, Ruby — all run natively on Graviton.
- Databases — RDS and Aurora support Graviton db.r6g / db.m6g / db.r7g nodes.
- ElastiCache — Redis and Memcached on cache.m6g / cache.r6g.
- OpenSearch Service — data nodes on r6g / m6g.
- AWS Lambda — Lambda supports ARM architecture; up to 20% lower cost per ms versus x86.
What doesn't fit Graviton
- Workloads with hand-tuned x86 assembly.
- Proprietary software shipped as x86-only binaries without an ARM build.
- Windows Server workloads — AWS does not offer Windows on Graviton, so anything requiring Windows Server stays on x86.
Graviton migration playbook
A typical cost-optimized compute Graviton migration looks like:
- Audit current spend — use AWS Compute Optimizer to see which instances have Graviton recommendations.
- Rebuild container images for multi-arch — `docker buildx build --platform linux/amd64,linux/arm64`.
- Test in staging — deploy Graviton nodes alongside x86 in an ASG mixed-instances policy at 10% capacity.
- Benchmark — confirm latency and throughput.
- Roll forward — shift the ASG to 100% Graviton or keep a mix if some workloads need x86.
- Layer Savings Plans — Compute Savings Plans apply to Graviton usage just like x86.
Graviton's 40% price-performance improvement is multiplicative with purchasing option discounts. A 3-year Compute Savings Plan on Graviton delivers roughly 60% cost savings versus On-Demand x86 (40% Graviton + additional Compute Savings Plan discount on the already-reduced Graviton rate). Graviton Spot Instances go further — up to 90% off the Graviton On-Demand rate. For new greenfield workloads, cost-optimized compute on AWS today means Graviton by default, Compute Savings Plans for steady spend, and Spot for batch.
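The multiplicative stacking works out as follows. This is illustrative arithmetic; the 34% effective Savings Plan rate used in the example is an assumed value chosen to land near the "roughly 60%" figure:

```python
def stacked_savings(graviton_gain, sp_discount):
    """Savings vs On-Demand x86 when a Graviton price-performance gain and
    a Savings Plan discount multiply (the SP applies to the already-reduced
    Graviton rate, so the remaining cost fractions multiply)."""
    remaining_cost = (1 - graviton_gain) * (1 - sp_discount)
    return 1 - remaining_cost
```

A 40% Graviton gain combined with a 34% effective Savings Plan rate leaves 0.60 × 0.66 = 0.396 of the original cost, i.e. about 60% total savings.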
Rightsizing with AWS Compute Optimizer and CloudWatch
Before you buy a Savings Plan or migrate to Graviton, the highest-ROI cost-optimized compute move is often rightsizing — shutting off oversized resources. AWS Compute Optimizer and CloudWatch are the two tools SAA-C03 expects you to know.
AWS Compute Optimizer
AWS Compute Optimizer is a machine-learning-based service that analyzes your resource utilization (from CloudWatch metrics) and provides rightsizing recommendations across:
- EC2 instances — recommends smaller, Graviton, or different-family instances.
- EC2 Auto Scaling groups — recommends the mix of instance types that balances cost and availability.
- EBS volumes — recommends gp3 volumes or smaller sizes.
- AWS Lambda functions — recommends memory allocations for lowest cost at acceptable duration.
- Amazon ECS services on Fargate — recommends task CPU/memory sizes.
- Amazon RDS instances — recommends smaller DB instance classes.
Each Compute Optimizer recommendation includes:
- Finding classification — Under-provisioned, Over-provisioned, Optimized, or Not optimized.
- Projected savings — estimated monthly dollar savings from each recommendation.
- Risk — low, medium, or high risk of performance regression.
Compute Optimizer analyzes the last 14 days of CloudWatch metric history by default and needs a minimum run of recent data before it generates recommendations. The service is free for default metrics; enabling "Enhanced infrastructure metrics" at $0.0003360215 per resource per hour extends the lookback to 93 days for deeper recommendations.
CloudWatch metrics for detecting underutilization
Before Compute Optimizer existed, the standard cost-optimized compute rightsizing workflow used raw CloudWatch metrics:
- CPUUtilization — if average over 14 days is below 20%, the instance is likely oversized.
- NetworkIn / NetworkOut — confirms the instance is actually serving traffic.
- MemoryUtilization (via CloudWatch Agent; not a default metric) — if below 40%, consider a smaller memory tier.
- DiskReadOps / DiskWriteOps — for I/O-bound workloads, confirms you need the IOPS you're paying for.
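Those rule-of-thumb thresholds translate into a trivial screening helper. This is a hypothetical sketch — the cutoffs are the ones quoted above, not AWS-published values, and a real workflow would pull the averages via the CloudWatch `GetMetricStatistics` API:

```python
def is_oversized(avg_cpu_pct, avg_mem_pct=None):
    """Flag an instance as a downsizing candidate using the rule-of-thumb
    thresholds above: <20% average CPU over the lookback window, and <40%
    memory when the CloudWatch Agent reports MemoryUtilization."""
    if avg_cpu_pct >= 20:
        return False
    if avg_mem_pct is not None and avg_mem_pct >= 40:
        return False
    return True
```

An instance averaging 8% CPU is flagged; one averaging 10% CPU but 70% memory is not, since the memory tier is doing real work.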
A common SAA-C03 scenario is "company has many EC2 instances with CPU consistently below 10% — what tool recommends rightsizing?" The answer is always AWS Compute Optimizer.
Rightsizing recommendation types
- Downsize — `c5.4xlarge` → `c5.xlarge` (75% cost reduction if utilization justifies).
- Family switch — `c5.2xlarge` → `m5.2xlarge` if memory is the bottleneck, not CPU.
- Generation upgrade — `c5.large` → `c6i.large` for better performance at the same price.
- Graviton migration — `c5.large` → `c6g.large` for up to 40% price-performance gain.
- Purchasing option — pair with Compute Savings Plans for compounded savings.
A common cost-optimized compute mistake is buying a 3-year Compute Savings Plan on your current over-provisioned footprint. If you then rightsize and your spend drops below your Savings Plan commitment, you're paying for unused commitment. The correct sequence is: (1) rightsize with Compute Optimizer → (2) migrate to Graviton where possible → (3) only then commit to Savings Plans at the new, lower spend level. SAA-C03 scenarios often present this exact sequence in the correct answer.
AWS Lambda Cost Optimization — Memory-Duration Math
AWS Lambda is a fundamentally different cost-optimized compute model: you pay per invocation plus per GB-second of compute time. The formula is:
Lambda cost = invocations × request-price + (memory-GB × duration-seconds) × GB-second-price
At current pricing (us-east-1, x86, April 2026):
- Requests — $0.20 per 1 million requests.
- Duration — $0.0000166667 per GB-second (x86) or $0.0000133334 per GB-second (Graviton / arm64).
- Provisioned Concurrency — an additional charge for keeping function instances warm.
Memory tuning — the counterintuitive cost lever
Lambda allocates CPU proportionally to memory: more memory = more vCPU = faster execution. This means increasing memory can decrease total cost if it decreases duration more than proportionally.
Example: A function runs in 10 seconds at 512 MB (0.5 GB). Cost per invocation = 0.5 GB × 10 s × $0.0000166667 = $0.0000833335.
Increase memory to 1024 MB (1 GB). If the function now runs in 4 seconds (because more CPU was allocated), cost = 1 GB × 4 s × $0.0000166667 = $0.0000666668. You doubled the memory but reduced cost by 20%.
Increase to 2048 MB (2 GB). If duration drops to 2.5 seconds, cost = 2 × 2.5 × $0.0000166667 = $0.0000833335 — back to the original cost. The sweet spot for this function is 1024 MB.
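The worked example is easy to verify programmatically with the duration price quoted above (request fees excluded; the helper name is ours):

```python
GB_SECOND_X86 = 0.0000166667   # us-east-1 x86 duration price from the text

def lambda_duration_cost(memory_mb, duration_s, gb_second_price=GB_SECOND_X86):
    """Per-invocation duration cost: memory in GB x seconds x GB-second price."""
    return (memory_mb / 1024) * duration_s * gb_second_price
```

Running the three configurations confirms the curve: 512 MB at 10 s and 2048 MB at 2.5 s both burn 5 GB-seconds per invocation, while 1024 MB at 4 s burns only 4 GB-seconds, a 20% saving.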
AWS Lambda Power Tuning
The community-built AWS Lambda Power Tuning tool (a Step Functions state machine you deploy from the Serverless Application Repository) runs your function at multiple memory settings, measures duration, and plots the cost-vs-memory curve. It is the SAA-referenced mechanism for finding the optimal Lambda memory setting.
AWS Compute Optimizer also provides Lambda memory recommendations based on observed invocation patterns.
Additional Lambda cost-optimized compute levers
- arm64 (Graviton) architecture — up to 20% lower per-ms cost for compatible functions. Free to switch for most runtimes.
- Compute Savings Plans — up to 17% off Lambda duration cost for committed spend.
- Provisioned Concurrency with a Compute Savings Plan — for predictable high-throughput functions, Provisioned Concurrency can be cheaper than On-Demand Lambda when usage is steady. The Compute Savings Plan applies to Provisioned Concurrency fees.
- Tiered pricing — Lambda duration cost has volume tiers; above 6 billion GB-seconds per month, the per-GB-second rate drops.
- Avoid over-provisioning memory — the reverse of the memory tuning argument; for I/O-bound functions that don't benefit from more CPU, extra memory is wasted money.
When Lambda beats EC2 for cost-optimized compute
Lambda is cheaper than EC2 when:
- Invocations are bursty or infrequent — you don't pay for idle time.
- Average utilization would be under ~30% of an EC2 instance — you'd be paying for mostly-idle EC2.
- Developer operational overhead is high — Lambda abstracts patching, OS updates, and capacity planning.
Lambda loses to EC2 on cost when:
- Traffic is sustained and high — a continuously-busy workload on EC2 with a Compute Savings Plan beats per-invocation Lambda pricing at scale.
- Execution exceeds 15 minutes — Lambda's hard timeout forces you onto Fargate or EC2.
- Memory needs exceed 10 GB — Lambda's maximum memory is 10,240 MB.
Fargate and Fargate Spot — Cost-Optimized Compute for Containers
AWS Fargate is serverless container compute for Amazon ECS and Amazon EKS. You pay per vCPU-second and per GB-second of memory allocated to your task, with no EC2 instances to manage.
Fargate pricing model
Fargate charges two dimensions per second (minimum 1 minute per task):
- vCPU — $0.04048 per vCPU per hour (us-east-1, Linux, x86).
- Memory — $0.004445 per GB per hour.
- Graviton (arm64) Fargate — roughly 20% cheaper per second than x86 Fargate.
- Storage — ephemeral storage is free up to 20 GB; additional GB is charged separately.
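A quick cost estimator for one task, using the rates above. This is a sketch: real Fargate Spot prices float, so the flat `spot_discount` parameter is our simplification:

```python
VCPU_HOUR = 0.04048    # us-east-1, Linux, x86 (rates from the text)
GB_HOUR = 0.004445

def fargate_task_cost(vcpu, memory_gb, hours, spot_discount=0.0):
    """Cost of one Fargate task over `hours`, optionally modeling Fargate
    Spot as a flat discount off the On-Demand rate."""
    on_demand = (vcpu * VCPU_HOUR + memory_gb * GB_HOUR) * hours
    return on_demand * (1 - spot_discount)
```

One 1 vCPU / 2 GB task running a full 730-hour month costs about $36 On-Demand; the same task on Fargate Spot at the full 70% discount costs about $10.80.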
Fargate Spot — 70% off for interruption-tolerant containers
Fargate Spot runs your ECS tasks on spare Fargate capacity for up to 70% off Fargate On-Demand pricing. Fargate Spot tasks can be interrupted with a 2-minute SIGTERM signal when AWS needs the capacity back (followed by SIGKILL after 2 minutes).
Fargate Spot is available only for ECS on Fargate, not EKS on Fargate (as of SAA-C03 exam guide scope). You configure Fargate Spot through capacity providers attached to an ECS cluster.
Capacity provider strategy — mixing Fargate and Fargate Spot
A typical cost-optimized compute ECS capacity provider strategy mixes FARGATE (On-Demand) and FARGATE_SPOT with a base + weight configuration:
- FARGATE — base of 2 tasks (always On-Demand), weight 1.
- FARGATE_SPOT — base of 0, weight 4.
With this strategy, the first 2 tasks always run on Fargate On-Demand (stable floor). Beyond that, for every 5 tasks launched, 1 runs on Fargate and 4 on Fargate Spot — achieving roughly 56% blended cost savings.
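The 56% figure can be checked with the same blended-savings arithmetic. This is an illustrative helper that assumes the full 70% Fargate Spot discount on every Spot task:

```python
def blended_fargate_savings(total_tasks, od_base, od_weight, spot_weight,
                            spot_discount=0.70):
    """Blended savings vs all-On-Demand for a FARGATE / FARGATE_SPOT
    capacity provider strategy: `od_base` tasks always run On-Demand,
    tasks above the base split by the weight ratio."""
    above = total_tasks - od_base
    od_above = above * od_weight / (od_weight + spot_weight)
    spot = above - od_above
    blended_cost = od_base + od_above + spot * (1 - spot_discount)
    return 1 - blended_cost / total_tasks
```

With no base and a 1:4 weight, savings converge on 4/5 × 70% = 56%; the base of 2 On-Demand tasks drags the blended figure down slightly at small task counts.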
Fargate Spot fit patterns
Fargate Spot is suitable for:
- Stateless API tiers behind ALB with health-check-driven replacement.
- Asynchronous workers processing SQS queues with visibility timeout handling.
- Parallel CI jobs that can restart.
- Development and staging environments where a 70% saving offsets occasional restarts.
Fargate Spot is unsuitable for:
- Long-running stateful tasks without external state checkpointing.
- Tasks that cannot handle SIGTERM gracefully.
- Production-critical batch jobs with hard SLAs that cannot absorb delayed restarts.
A recurring SAA-C03 cost-optimized compute trap is a scenario asking "how do you get Spot-like savings on EKS on Fargate?" The distractor answer is "Fargate Spot" — but Fargate Spot is not available for EKS on Fargate under current AWS scope. For EKS cost optimization with Spot, you must use EKS managed node groups on EC2 with a Spot capacity type, or a Karpenter provisioner configured for Spot. The Fargate Spot answer only applies to ECS on Fargate.
On-Demand Capacity Reservations and Dedicated Hosts
Two cost-optimized compute constructs sit outside the main purchasing-option discount flow but still appear in SAA-C03 scenarios.
On-Demand Capacity Reservations
An On-Demand Capacity Reservation (ODCR) reserves EC2 capacity in a specific Availability Zone without a pricing commitment. You pay the On-Demand rate whether you use the reservation or not, but the capacity is guaranteed.
ODCR use cases:
- Disaster recovery — guarantee that failover capacity exists when you need it.
- Predictable peak events — Black Friday, product launches, broadcast windows.
- Compliance or availability SLAs requiring AZ-level capacity guarantees.
ODCRs stack with Savings Plans and Regional Reserved Instances — the reservation guarantees capacity, and the SP/RI applies the discount on top. This is the cost-optimized compute pattern when you need both guaranteed capacity AND a commitment discount.
Dedicated Hosts — BYOL and physical isolation
Dedicated Hosts give you a full physical EC2 server reserved for your account. You see the sockets, cores, and physical host ID. Dedicated Hosts are the only option for:
- Bring-Your-Own-License (BYOL) for per-socket or per-core licensed software — Microsoft Windows Server, Microsoft SQL Server, Oracle Database, SUSE Linux Enterprise Server.
- Regulatory compliance requiring physical hardware isolation (certain government, healthcare, or financial regulations).
Dedicated Hosts are priced per host-hour (not per instance) and are available as:
- On-Demand Dedicated Hosts — pay per hour, no commitment.
- Dedicated Host Reservations — 1 or 3 years with up to 70% off On-Demand Dedicated Host rate.
Dedicated Hosts are not a cost-optimization tool by themselves — they exist for licensing and compliance. But on SAA-C03, if the scenario mentions "per-socket license" or "physical isolation" alongside cost considerations, Dedicated Host Reservations are the cost-optimized compute answer.
These three cost-optimized compute constructs are often confused. Savings Plans provide discount without capacity guarantee. Zonal Reserved Instances provide both discount and capacity guarantee in one AZ. On-Demand Capacity Reservations provide capacity guarantee without discount. If a scenario asks for capacity guarantee in a specific AZ plus a commitment discount, the answer is typically "Zonal RI" or "ODCR + Savings Plan". If it asks for capacity guarantee WITHOUT commitment, it's ODCR alone.
SAA-C03 Scenario Patterns — Cost-Optimized Compute Playbook
These are the recurring cost-optimized compute scenario archetypes that SAA-C03 rotates through question variations. Learn the mapping.
Steady 24/7 production workload, same instance family for 3 years
Answer: 3-year Standard Reserved Instance (All Upfront) OR 3-year EC2 Instance Savings Plan, both up to 72% off. The discounts are equivalent; prefer a zonal Standard RI when you also want a capacity reservation for that exact instance configuration.
Steady 24/7 workload but might switch instance families, add Fargate or Lambda
Answer: 3-year Compute Savings Plan (up to 66% off, covers EC2 + Fargate + Lambda).
Fault-tolerant batch workload that can retry on interruption
Answer: Spot Instances in an ASG with price-capacity-optimized allocation and multiple instance types. For containers, Fargate Spot on ECS.
Mix of steady baseline and bursty peak
Answer: ASG mixed-instances policy with OnDemandBaseCapacity of N instances (covered by Compute Savings Plan), OnDemandPercentageAboveBaseCapacity = 0%, rest on Spot across multiple instance types.
Need to reduce compute cost by 30-40% without changing purchasing model
Answer: Migrate to AWS Graviton (c6g, m6g, r7g, Lambda arm64). Works alongside existing Savings Plans.
Many oversized EC2 instances, uncertain which to downsize
Answer: AWS Compute Optimizer — produces ML-based rightsizing recommendations from CloudWatch metrics.
Steady Lambda invocation volume, $5K/month, want commitment discount
Answer: Compute Savings Plan (only Savings Plan flavor that covers Lambda). Also switch Lambda to arm64 for additional ~20% savings.
EC2 capacity guaranteed in us-east-1a for DR without commitment
Answer: On-Demand Capacity Reservation in us-east-1a.
Microsoft SQL Server with per-socket BYOL license
Answer: Dedicated Host (optionally with 3-year Dedicated Host Reservation for cost optimization).
Short-term 2-week load test or proof of concept
Answer: On-Demand — no commitment or purchase option beats On-Demand for workloads shorter than any commitment term.
Purchasing ladder (ascending discount, descending flexibility): On-Demand (0%) → Compute Savings Plan (66%, covers EC2+Fargate+Lambda) → EC2 Instance Savings Plan / Standard RI (72%, family/region locked) → Spot (90%, interruptible). Plus Graviton (40% price-performance on top of any option) and Dedicated Hosts (BYOL/compliance). Rightsizing tool: AWS Compute Optimizer. Container cost-optimized compute: Fargate Spot (ECS only) + capacity provider strategy. Lambda cost-optimized compute: tune memory (AWS Lambda Power Tuning), switch to arm64, apply Compute Savings Plan. Sequence: rightsize → Graviton → commit.
Key Numbers and Must-Memorize Facts for SAA-C03
- On-Demand — baseline, no discount.
- Compute Savings Plan — up to 66% off, covers EC2 + Fargate + Lambda.
- EC2 Instance Savings Plan — up to 72% off, one family + one region.
- Standard Reserved Instance — up to 72% off.
- Convertible Reserved Instance — up to 66% off, exchangeable.
- Spot Instance — up to 90% off, 2-minute interruption notice.
- Graviton — up to 40% better price-performance versus comparable x86.
- Lambda arm64 — up to 20% cheaper per GB-second versus x86.
- Lambda Compute Savings Plan — up to 17% off duration.
- Fargate Spot — up to 70% off Fargate On-Demand (ECS only).
- Compute Optimizer enhanced metrics — 93-day lookback vs default 14 days.
- Spot interruption behaviors — terminate (default), stop, hibernate.
- ASG mixed-instances allocation strategies — price-capacity-optimized (recommended), capacity-optimized, lowest-price.
- Savings Plans commitment terms — 1 year or 3 years.
- Payment options — All Upfront, Partial Upfront, No Upfront (applies to RI, Savings Plans, Dedicated Host Reservations).
- Dedicated Host Reservation — up to 70% off On-Demand Dedicated Host rate.
Cost-Optimized Compute vs Elastic Compute Scaling — Scope Boundary
SAA-C03 Task 4.2 (this topic, cost-optimized compute) covers the cost view of compute: how to pay less by choosing the right purchasing option, runtime, processor, and sizing. Task 3.2 (elastic-compute-scaling) covers the performance view: EC2 instance families, placement groups, Auto Scaling policies, Lambda concurrency tuning, and Batch. The two topics overlap on ASG configuration (both care about scaling policies) but diverge on the primary optimization axis.
If a question asks "what instance type offers the best GPU-to-price ratio for ML inference?" that is cost-optimized compute. If the question asks "what instance type provides the best sustained compute performance?" that is elastic-compute-scaling. Both topics share the AWS Compute Optimizer service as a cross-cutting tool — rightsizing is simultaneously a cost-optimization and a performance-tuning activity.
Compare also to the CLF-C02 pricing-models topic, which covers the same purchasing options at a beginner level. SAA-C03 cost-optimized compute goes deeper: ASG mixed-instances patterns, Spot Fleet allocation strategies, Graviton migration playbooks, and Lambda memory-duration math are all SAA-exclusive.
FAQ — Cost-Optimized Compute Top 6 Questions
1. When should I pick a Compute Savings Plan over a Reserved Instance for cost-optimized compute?
Pick a Compute Savings Plan when you value flexibility — the ability to change instance family, switch regions, migrate to Fargate, or add Lambda — more than the extra 6 percentage points of discount (72% RI vs 66% Compute SP). Pick a Reserved Instance (or EC2 Instance Savings Plan) only when you are absolutely certain your instance family, region, OS, and tenancy will not change for the full 1-year or 3-year commitment. In modern architectures that blend EC2 with Fargate and Lambda, the Compute Savings Plan almost always wins on risk-adjusted cost-optimized compute because its flexibility absorbs migration decisions that would strand an RI commitment.
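The "risk-adjusted" argument can be made concrete with back-of-envelope math. The hourly rate below is a made-up illustrative number, not a published price; only the 72%/66% discount ceilings come from the figures above:

```python
od = 0.10             # illustrative On-Demand $/hour (assumption, not a real price)
hours = 3 * 365 * 24  # 3-year commitment term

ri_rate = od * (1 - 0.72)   # Standard RI / EC2 Instance SP: up to 72% off
csp_rate = od * (1 - 0.66)  # Compute Savings Plan: up to 66% off

# Happy path: the workload never changes for 3 years — the RI wins.
steady_ri = ri_rate * hours
steady_csp = csp_rate * hours

# Risk path: the workload migrates to Fargate at the halfway point.
# The RI commitment keeps billing for capacity you no longer use, and
# the replacement runtime bills on top; the Compute SP discount simply
# follows the spend to Fargate.
migrated_ri = ri_rate * hours + od * hours / 2  # stranded RI + new runtime
migrated_csp = csp_rate * hours                 # discount moves with the workload
```

Under these assumptions the RI saves a few percent if nothing changes, but costs more than double the Compute SP if a mid-term migration strands the commitment — the "risk-adjusted" point in the answer above.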
2. What is the best cost-optimized compute pattern for a stateless web tier?
An EC2 Auto Scaling group with a mixed-instances policy, using OnDemandBaseCapacity for a stable floor (covered by a Compute Savings Plan) and price-capacity-optimized Spot allocation above the base across 4-6 instance types in 3 AZs. Pair this with an Application Load Balancer that health-checks instances and drains connections gracefully, and a shutdown handler listening for the Spot interruption notice. Blended savings typically land at 60-75% versus pure On-Demand. Migrate the instance types to Graviton (c6g, m6g) for an additional 30-40% price-performance gain.
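The blended-savings range quoted above follows directly from the discount ceilings. A minimal sketch, with an illustrative On-Demand rate and a deliberately conservative 70% Spot discount (both assumptions):

```python
od_price = 0.10                     # illustrative On-Demand $/hour (assumption)
csp_price = od_price * (1 - 0.66)   # base capacity under a Compute Savings Plan
spot_price = od_price * (1 - 0.70)  # conservative Spot discount assumption

base, burst = 4, 12                 # OnDemandBaseCapacity floor + Spot above it
fleet = base + burst

blended = (base * csp_price + burst * spot_price) / fleet
savings = 1 - blended / od_price
print(f"blended savings vs pure On-Demand: {savings:.0%}")  # → 69%
```

With a 4-instance committed floor and 12 Spot instances above it, the blend lands squarely inside the 60-75% range; deeper Spot discounts or a smaller floor push it toward the top of that range.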
3. How do I handle Spot Instance interruptions safely in an ASG?
Three layers: (1) configure the ASG with a mixed-instances policy across multiple instance types to reduce the blast radius of any single Spot pool interruption; (2) use the price-capacity-optimized allocation strategy so AWS weighs both price and pool depth, steering launches toward Spot pools with lower interruption risk; (3) attach a lifecycle hook or a listener to the EC2 Spot Instance Interruption Warning EventBridge event, and in response deregister from the ALB target group, flush state to external storage, and exit gracefully. Many teams also set a minimum On-Demand base capacity so full cluster eviction is impossible. This is the canonical SAA-C03 cost-optimized compute answer for "how do you use Spot for production?".
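Layer (3) can be sketched as a handler for the EventBridge interruption event. The event shape (source, detail-type, detail fields) matches the documented EC2 Spot Instance Interruption Warning; the drain/flush helpers are hypothetical stand-ins for your own ALB deregistration and state-flush logic:

```python
# Documented shape of the EventBridge Spot interruption warning event.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Spot Instance Interruption Warning",
    "detail": {
        "instance-id": "i-0123456789abcdef0",
        "instance-action": "terminate",
    },
}

def drain_from_alb(instance_id):
    # Hypothetical helper: deregister the instance from its ALB target group.
    print(f"draining {instance_id} from target group")

def flush_state(instance_id):
    # Hypothetical helper: persist in-memory state to external storage.
    print(f"flushing state for {instance_id}")

def handle_interruption(event):
    """React to the ~2-minute warning: drain, flush, then exit gracefully."""
    if event.get("detail-type") != "EC2 Spot Instance Interruption Warning":
        return None  # ignore unrelated events
    instance_id = event["detail"]["instance-id"]
    drain_from_alb(instance_id)
    flush_state(instance_id)
    return instance_id
```

In production this would run as an EventBridge-triggered Lambda or an on-instance agent; the key idea is that the two-minute notice is enough time to drain and flush, not to finish long jobs, which is why checkpointing belongs in the application itself.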
4. Why does increasing Lambda memory sometimes reduce total cost?
Because Lambda allocates CPU proportional to memory — doubling memory from 512 MB to 1024 MB doubles the vCPU allocation, which can more than halve the function duration for CPU-bound work. Since Lambda duration cost is memory × duration, if duration drops by more than 50% when you double memory, total cost decreases. This is why AWS Lambda Power Tuning is the recommended tool: it runs your function at multiple memory settings and plots the actual cost-vs-memory curve, letting you pick the minimum-cost configuration. For I/O-bound functions where more CPU does not help, this does not apply and extra memory is wasted cost-optimized compute spend.
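The math is easy to verify. The rate below is the published arm64 per-GB-second price at the time of writing (check the Lambda pricing page for current numbers); the durations are made-up figures for a hypothetical CPU-bound function:

```python
PRICE_PER_GB_SECOND = 0.0000133334  # arm64 rate at time of writing; verify on the pricing page

def invocation_cost(memory_mb, duration_ms):
    """Lambda duration cost = GB allocated x seconds run x rate."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * PRICE_PER_GB_SECOND

# CPU-bound function (illustrative timings): doubling memory doubles the
# vCPU allocation, and here cuts duration by more than half...
small = invocation_cost(512, 2400)   # 512 MB x 2.4 s  = 1.2 GB-seconds
large = invocation_cost(1024, 1000)  # 1024 MB x 1.0 s = 1.0 GB-seconds

# ...so the larger configuration is cheaper per invocation, and faster.
assert large < small
```

The crossover condition falls out of the formula: doubling memory pays off whenever duration drops below half its previous value. AWS Lambda Power Tuning automates exactly this measurement across many memory settings.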
5. Is Graviton safe for production cost-optimized compute workloads?
Yes for the vast majority of modern workloads. Every mainstream language runtime (Java 11+, Python 3.8+, Node.js 14+, Go 1.15+, Ruby 2.7+, .NET 6+, PHP 7.4+) has a native ARM build. Nearly every AWS-published container image is multi-arch. Major OSS databases (PostgreSQL, MySQL, Redis, MongoDB, Kafka) ship official ARM builds. The migration playbook is: rebuild images with docker buildx for linux/arm64, deploy to a Graviton node pool alongside existing x86 at 10% capacity, benchmark, then scale up. The 40% price-performance improvement is the single biggest cost-optimized compute lever available without changing your purchasing option. The cases where Graviton does not fit are narrow — proprietary x86-only binaries, hand-tuned x86 assembly, or workloads on legacy Windows Server versions without ARM support.
6. How do I decide between Fargate Spot and EC2 Spot for a containerized cost-optimized compute workload?
Choose Fargate Spot (ECS only) when you want zero node management and your container workload is stateless, interruption-tolerant, and fits within Fargate's task size limits (up to 16 vCPU, 120 GB memory per task). You save up to 70% versus Fargate On-Demand, and you pay nothing for idle capacity. Choose EC2 Spot with ECS or EKS managed node groups when you need larger tasks, GPU instances (Fargate does not support GPUs), custom AMI or kernel features, or deeper savings (EC2 Spot discount reaches 90% versus Fargate Spot's 70%). For EKS specifically, Fargate Spot is not available — you must use EC2 Spot via managed node groups or Karpenter provisioners for Spot-based cost-optimized compute. SAA-C03 commonly tests this EKS-Fargate-Spot trap.
Further Reading
- Amazon EC2 Pricing — https://aws.amazon.com/ec2/pricing/
- Amazon EC2 Reserved Instances — https://aws.amazon.com/ec2/pricing/reserved-instances/
- AWS Savings Plans — https://aws.amazon.com/savingsplans/
- Compute Savings Plans Pricing — https://aws.amazon.com/savingsplans/compute-pricing/
- Amazon EC2 Spot Instances — https://aws.amazon.com/ec2/spot/
- Auto Scaling groups with mixed instance types — https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-mixed-instances-groups.html
- AWS Graviton Processor — https://aws.amazon.com/ec2/graviton/
- AWS Compute Optimizer — https://aws.amazon.com/compute-optimizer/
- On-Demand Capacity Reservations — https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-capacity-reservations.html
- Amazon EC2 Dedicated Hosts — https://aws.amazon.com/ec2/dedicated-hosts/
- AWS Fargate Pricing — https://aws.amazon.com/fargate/pricing/
- Fargate Capacity Providers (Fargate Spot) — https://docs.aws.amazon.com/AmazonECS/latest/developerguide/fargate-capacity-providers.html
- AWS Lambda Pricing — https://aws.amazon.com/lambda/pricing/
- Lambda Computing Power and Memory — https://docs.aws.amazon.com/lambda/latest/operatorguide/computing-power.html
- AWS SAA-C03 Exam Guide — https://d1.awsstatic.com/training-and-certification/docs-sa-associate/AWS-Certified-Solutions-Architect-Associate_Exam-Guide.pdf