CI/CD pipelines and infrastructure-as-code (IaC) deployment strategies sit at the intersection of reliability, security, and velocity on SAP-C02. Every scenario that says "the company must deploy a new version with minimal downtime", "roll out a baseline to 40 accounts", or "auto-rollback when error rate spikes" is really a deployment-strategy question. At the Professional tier, AWS expects you to pick the right deployment pattern, the right IaC unit of work, and the right multi-account distribution mechanism — and to know the failure modes of each.
This guide assumes you already know the Associate-level fundamentals: what a CloudFormation stack is, what CodeBuild does, what a Lambda function alias is. It focuses on the Pro-level decisions: StackSets vs nested stacks vs CDK Pipelines, CodeDeploy canary vs linear vs all-at-once, ECS blue/green mechanics, Lambda alias-based traffic shifting, AppConfig feature flags, Service Catalog governed self-service, and when Terraform fits the picture. The goal is to make every SAP-C02 deployment-strategy question a 30-second recognition exercise rather than a 3-minute elimination puzzle.
Why Deployment Strategy Matters on SAP-C02
On Professional tier, AWS rarely asks "what is CloudFormation". It asks: "the team has 20 AWS accounts, an org-wide IAM baseline must be deployable, auditable, and drift-detected — which combination of services do you use?" That single question pulls in AWS Organizations, CloudFormation StackSets service-managed permissions, delegated administrator, drift detection, change sets, and CodePipeline approval actions. Miss any one piece and you pick a technically working but operationally inferior answer.
The exam also likes to pit deployment options against each other: all-at-once vs rolling vs blue/green vs canary, CloudFormation vs CDK vs SAM vs Terraform, StackSets service-managed vs self-managed, CodeDeploy ECS blue/green vs rolling, Lambda canary vs linear vs all-at-once, AppConfig vs environment variables for feature flags, Service Catalog vs direct CloudFormation access for developers. The fastest way to nail these is to know each tool's sweet spot and its specific failure modes cold.
- Deployment strategy: the pattern used to cut over traffic from old version to new — all-at-once, rolling, blue/green, canary, linear.
- Infrastructure as code (IaC): declarative (or imperative) description of cloud resources in version-controlled source. On AWS: CloudFormation, CDK, SAM, Terraform.
- CloudFormation StackSet: an AWS Organizations-aware mechanism to deploy the same CloudFormation template to many accounts and regions as stack instances.
- Change set: a preview of the resource-level diff CloudFormation will apply to a stack, including replacements that could cause downtime, before you execute.
- Nested stack: a CloudFormation resource of type AWS::CloudFormation::Stack whose template lives in S3 and is provisioned as a child of a parent stack.
- Cross-stack reference: an Export / Fn::ImportValue mechanism that lets one stack consume outputs of another in the same account and region.
- CDK construct (L1/L2/L3): CDK's building blocks — L1 (CFN resource), L2 (curated resource with sensible defaults), L3 (pattern combining multiple resources).
- CodeDeploy deployment configuration: the named policy (AllAtOnce, OneAtATime, HalfAtATime, Linear, Canary, Custom) that governs how traffic or instances shift during a deployment.
- Lambda alias-based canary: Lambda's built-in traffic-splitting on an alias (e.g., 10 percent to new version), normally driven by CodeDeploy.
- AWS AppConfig: a runtime configuration and feature-flag service with validation, staged deployments, and automatic rollback on CloudWatch alarm.
- AWS Service Catalog: governed self-service catalog that wraps CloudFormation templates into launchable products for end users.
- Reference: https://docs.aws.amazon.com/codepipeline/latest/userguide/welcome.html
Plain-Language Explanation: CI/CD and IaC Deployment
Deployment strategy is one of those topics where the mechanics look similar but the consequences diverge sharply. Three analogies from different domains make the trade-offs stick.
Analogy 1: The Restaurant Menu Change
Picture a busy restaurant rolling out a new menu. All-at-once deployment is flipping the kitchen over to the new menu during lunch rush — if the new lasagna recipe is broken, every table suffers at the same time. Rolling deployment is switching one cook at a time to the new menu while the others keep cooking the old one — fewer tables affected per minute, but for a longer window. Blue/green is the most cautious: you set up a second identical kitchen (same grills, same staff, same ingredients) running the new menu, plate up a few test dishes, then at a precise moment the host starts seating all new parties in the new dining room while the old dining room finishes its current guests and then closes. If the new menu bombs, the host reopens the old dining room instantly — zero wasted plates. Canary deployment is giving the new lasagna to the first ten percent of diners, watching their reactions for ten minutes, then expanding to the next twenty percent, and so on. Linear is the same idea but with a fixed ramp (ten percent every ten minutes, regardless of feedback).
CloudFormation is the chef's written recipe — declarative, version-controlled, idempotent; same recipe, same dish. CDK is the sous-chef who reads your outline and writes the recipe — you describe "a Mediterranean pasta lunch for 200", the sous-chef synthesizes the full recipe. SAM is the dedicated pastry recipe format — optimized for serverless dishes, shorter than CloudFormation for the same output. Terraform is a multi-cuisine recipe format — works in your restaurant (AWS), the bistro across the street (Azure), and the noodle bar (GCP). StackSets is franchising your recipe to 40 restaurants in one operation, with a health inspector (drift detection) checking that none of them secretly added MSG.
Analogy 2: The Airline Fleet Upgrade
An airline replacing a cabin layout gives you the deployment-strategy intuition with stakes. All-at-once is grounding the whole fleet on Saturday night and retrofitting all planes by Monday morning — fastest, highest risk. Rolling is retrofitting one plane per week while the rest keep flying — low blast radius per step but total rollout takes a quarter. Blue/green is buying a new fleet, parking it on a second concourse, test-flying empty, and on cut-over day routing every new flight to the new fleet while the old fleet finishes scheduled routes and retires. If the new entertainment system crashes, reopen the old fleet in an hour. Canary is painting five planes with the new livery and loading normal passengers, checking reviews for a week, then expanding to twenty planes if reviews are good. Linear is the scheduled version — promote five planes every Friday no matter what.
AWS CodeDeploy plays the role of the flight dispatcher: it decides which planes (or ECS tasks, or Lambda versions) get passengers. CloudWatch alarms are the safety sensors wired to the dispatcher; if smoke is detected (error rate spikes, p99 latency climbs), the dispatcher auto-aborts the remaining rollout and routes passengers back to the old fleet. CodePipeline is the ground operations software chaining the whole workflow: pull parts from inventory (source), assemble the cabin (build), run hangar tests (test), coordinate with dispatcher (deploy), notify flight crew (notify). Multi-account CodePipeline with approval actions is the airline corporate review board — production deploys must have the VP's signoff before the dispatcher is allowed to route passengers.
Analogy 3: The Hospital Surgical Protocol
Hospitals introduce new surgical techniques with staged rollout that maps perfectly to deployment strategy. All-at-once is flipping every OR to the new technique overnight — never done on real humans. Rolling is training one surgeon per month; risk spreads over time. Blue/green is running the new technique in a simulation OR (staging environment) until the team is ready, then opening a second fully equipped OR (green environment) and scheduling new cases there while the old OR finishes its booked cases; if the first new-technique case goes badly, the hospital returns to the old OR immediately. Canary is the ethics-board-approved trial: first ten patients, monitor outcomes for two weeks, if outcomes hold expand to the next hundred. Linear is the standardized ramp — ten percent per month for the next ten months.
Change sets are the pre-surgery checklist — CloudFormation tells you "your update will replace the RDS instance" before you touch the patient; you can abort. Drift detection is the post-op audit — did someone re-intubate with a non-sterile tube outside the written protocol? AppConfig feature flags are the per-OR toggles — you can turn off the new anesthesia protocol instantly via a dashboard without redeploying the entire hospital. Service Catalog is the approved-procedures manual — surgeons self-serve from a curated list of techniques that have passed review, instead of inventing their own.
For SAP-C02, the airline fleet analogy maps the cleanest to CodeDeploy's traffic-shifting model: you always think in terms of "old fleet vs new fleet, percentage of passengers routed, rollback time". The hospital analogy is most useful for change-control reasoning: when the exam talks about "pre-deployment validation, approval, auditable rollback", think surgical protocol. Reference: https://docs.aws.amazon.com/codedeploy/latest/userguide/deployment-configurations.html
Deployment Strategy Selection — All-at-Once, Rolling, Blue/Green, Canary, Linear
Before IaC tooling, you pick the traffic-shifting pattern. SAP-C02 exam stems almost always embed a hint: "zero downtime", "fast rollback", "gradual traffic shift", "cost-sensitive dev environment". Match the hint to the right strategy.
All-at-once (in-place)
- Pattern: stop old version on all hosts, deploy new version to the same hosts, start new version.
- Downtime: yes, for the duration of the stop-deploy-start cycle.
- Rollback speed: slow — requires a second deployment to revert.
- Cost: cheapest (no duplicate fleet).
- Use case: dev/staging, non-customer-facing batch jobs, acceptable-maintenance-window environments.
- CodeDeploy config: CodeDeployDefault.AllAtOnce (EC2/on-prem), or CodeDeployDefault.LambdaAllAtOnce / CodeDeployDefault.ECSAllAtOnce.
Rolling (in-place, staggered)
- Pattern: replace instances/tasks in waves. Common waves: one-at-a-time (OneAtATime), half-at-a-time (HalfAtATime), custom percentage.
- Downtime: minimal per instance; total throughput dips during rollout.
- Rollback speed: slow — reverting means running another deployment with the old version.
- Cost: same as in-place (no duplicate fleet).
- Use case: EC2 Auto Scaling groups, ECS services with sufficient headroom, internal tools.
- CodeDeploy configs: CodeDeployDefault.OneAtATime, CodeDeployDefault.HalfAtATime.
Blue/Green
- Pattern: provision a parallel "green" fleet, deploy new version there, cut traffic over (usually via ELB listener swap or Route 53 weighted record), drain and retire "blue".
- Downtime: zero if cut-over is atomic.
- Rollback speed: very fast — flip the listener back to blue; blue is still warm.
- Cost: high during cut-over (two full fleets live).
- Use case: customer-facing production, stateful workloads where in-place upgrade is risky, ECS services, EC2 behind ELB.
- AWS services: CodeDeploy ECS blue/green, CodeDeploy EC2 blue/green with Auto Scaling, Elastic Beanstalk swap environment URLs.
Canary
- Pattern: shift a small percentage (typically 10 percent) of traffic to the new version for a bake period, then shift the rest.
- Downtime: zero.
- Rollback speed: fast — traffic rerouted off canary in seconds.
- Cost: low incremental (only the small canary fleet is duplicated, or with weighted traffic no duplication at all for Lambda).
- Use case: public APIs, Lambda functions, anything with CloudWatch alarms that fire on error-rate regression.
- CodeDeploy configs: CodeDeployDefault.LambdaCanary10Percent5Minutes, LambdaCanary10Percent10Minutes, LambdaCanary10Percent15Minutes, LambdaCanary10Percent30Minutes.
Linear
- Pattern: shift traffic in equal-sized increments on a fixed schedule (e.g., 10 percent every minute).
- Downtime: zero.
- Rollback speed: fast during rollout; once complete, full redeploy to revert.
- Cost: low.
- Use case: when you want a smooth ramp without a sudden jump; ideal when telemetry lag is known.
- CodeDeploy configs: CodeDeployDefault.LambdaLinear10PercentEvery1Minute, LambdaLinear10PercentEvery2Minutes, LambdaLinear10PercentEvery3Minutes, LambdaLinear10PercentEvery10Minutes.
Custom deployment configurations
For EC2/on-prem (non-Lambda, non-ECS), you can create a custom deployment configuration specifying a minimum healthy hosts value (percentage or absolute count). For Lambda and ECS you get the built-in canary and linear variants above; custom time-based canary and linear configurations are also supported, alongside lifecycle hooks in the AppSpec file for ECS blue/green.
On SAP-C02, canary means "two-step shift: X percent for Y minutes, then the rest", while linear means "equal-step ramp every Y minutes until 100 percent". Both are zero-downtime and both trigger CloudWatch-alarm-driven auto-rollback when wired via CodeDeploy. Canary is preferred when you want a bake window with human review; linear is preferred when the ramp is automated end-to-end and you trust alarms. Reference: https://docs.aws.amazon.com/codedeploy/latest/userguide/deployment-configurations.html
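The two shapes are easy to internalize by computing them. A small Python sketch (illustrative helper functions, not any AWS SDK) that models the traffic schedule of the built-in Lambda configs:

```python
def canary_schedule(percent, bake_minutes):
    """Two-step shift: X percent immediately, then 100 percent after the bake window."""
    return [(0, percent), (bake_minutes, 100)]


def linear_schedule(step_percent, interval_minutes):
    """Equal increments on a fixed schedule until all traffic is shifted."""
    schedule, shifted, minute = [], 0, 0
    while shifted < 100:
        shifted = min(100, shifted + step_percent)
        schedule.append((minute, shifted))
        minute += interval_minutes
    return schedule


# CodeDeployDefault.LambdaCanary10Percent5Minutes
print(canary_schedule(10, 5))   # [(0, 10), (5, 100)]
# CodeDeployDefault.LambdaLinear10PercentEvery1Minute: ten equal steps
print(linear_schedule(10, 1))
```

Each tuple is (minute, cumulative percent on the new version); the canary's single long bake window is exactly what gives a human time to review before the second jump.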
CloudFormation at Professional Depth — Stacks, Nested Stacks, Cross-Stack References
CloudFormation is the AWS-native IaC backbone. Every SAP-C02 answer that says "use IaC" is really a CloudFormation or CDK answer unless Terraform is named explicitly.
Stack organization patterns
You can fit an entire architecture into one monolithic stack, but at Pro scale that breaks down:
- Blast radius: a failed update rolls back the whole stack.
- Change frequency mismatch: the VPC changes yearly, the Lambda changes daily; one stack forces them into the same cadence.
- Ownership: network team owns VPC, app team owns Lambda; one stack creates ownership confusion.
- Service limits: 500 resources per stack (a hard quota), 200 each of parameters, mappings, and outputs per stack.
The canonical AWS pattern is to split by change cadence and ownership: foundational stacks (VPC, IAM roles, shared KMS keys), platform stacks (ECS cluster, EKS cluster), application stacks (services, Lambdas). You then wire them together with either nested stacks or cross-stack references.
Nested stacks
A nested stack is an AWS::CloudFormation::Stack resource whose TemplateURL points to a child template in S3. The parent stack creates, updates, and deletes children as part of its own lifecycle.
- Strength: single lifecycle — update the parent and all children update as one transaction; roll back together on failure.
- Strength: reusable module — drop the same child template into many parents with different parameters.
- Weakness: tight coupling — you cannot update a child independently without going through the parent.
- Weakness: template size can balloon when many children reference one another.
- Good for: VPC module + subnet module + security group module as sibling children of one application stack.
Cross-stack references
Cross-stack references use the Outputs section of one stack (with Export) and Fn::ImportValue in another stack to share values across stack boundaries.
- Strength: loose coupling — stacks have independent lifecycles and owners.
- Strength: clear contract — the export name is the contract.
- Weakness: the exporting stack cannot delete or rename the export while importers exist (you get a dependency error). This is a famous Day-2 pain point: you must delete importers first, then update the exporter.
- Weakness: same account and region only — cross-stack references do not cross accounts or regions. For cross-account sharing, use SSM Parameter Store (optionally shared via AWS RAM) or pass values through CDK Pipelines context.
- Good for: VPC stack exports subnet IDs, many independent application stacks import them.
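As a concrete sketch of that contract, here are the two sides of a cross-stack reference expressed as Python dicts (the form boto3 accepts after json.dumps); resource names and the export name are illustrative:

```python
import json

# Exporting stack: the network stack publishes a subnet ID under a stable export name.
network_template = {
    "Resources": {
        "AppSubnet": {
            "Type": "AWS::EC2::Subnet",
            "Properties": {"VpcId": "vpc-placeholder", "CidrBlock": "10.0.1.0/24"},
        }
    },
    "Outputs": {
        "AppSubnetId": {
            "Value": {"Ref": "AppSubnet"},
            "Export": {"Name": "network-app-subnet-id"},  # the export name IS the contract
        }
    },
}

# Importing stack: any app stack in the same account and region consumes the export.
app_template = {
    "Resources": {
        "Web": {
            "Type": "AWS::EC2::Instance",
            "Properties": {"SubnetId": {"Fn::ImportValue": "network-app-subnet-id"}},
        }
    },
}

print(json.dumps(app_template, indent=2))
```

While any stack imports network-app-subnet-id, the network stack cannot delete or rename that export, which is exactly the Day-2 coupling described above.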
Choosing nested stacks vs cross-stack references vs one big stack
| Criterion | Nested stacks | Cross-stack references | One big stack |
|---|---|---|---|
| Lifecycle | Single transaction with parent | Independent per stack | Single transaction |
| Ownership boundary | Weak (child belongs to parent) | Strong (each stack can be owned separately) | None |
| Update cadence | Must match parent | Independent | Must match whole system |
| Cross-account | No | No (use SSM Parameter Store) | No |
| Typical use | Reusable modules | Foundational sharing (VPC, IAM) | Small, single-team workloads |
Change sets
A change set previews the resource-level diff CloudFormation will apply before you execute. Key properties:
- Lists every resource being added, modified, removed, or replaced (replacement means destroy and recreate — potential downtime).
- Can be reviewed by humans or programmatically (CodePipeline's split of a CHANGE_SET_REPLACE action followed by a CHANGE_SET_EXECUTE action is the canonical Pro pipeline pattern).
- Required for production pipelines: no direct stack updates; always deploy via a change set with an approval gate.
A common SAP-C02 answer pattern for change-management scenarios: CodePipeline stage 1 creates a change set (CHANGE_SET_REPLACE), stage 2 requires a manual approval action, stage 3 executes the change set. This gives you a previewable diff, an approval signoff, and a clean audit trail in CloudTrail. The alternative — directly updating stacks — skips the preview and is a real operational risk. Reference: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-changesets.html
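The property reviewers scan for first is Replacement. A hedged sketch of a pipeline-side check that flags replacements in a DescribeChangeSet-shaped response (the response is abbreviated to the fields that matter; this is not a live API call):

```python
def replacements(change_set):
    """Return logical IDs of resources the change set would destroy and recreate."""
    risky = []
    for change in change_set.get("Changes", []):
        rc = change.get("ResourceChange", {})
        # The API reports Replacement as the string "True", "False", or "Conditional".
        if rc.get("Action") == "Modify" and rc.get("Replacement") == "True":
            risky.append(rc.get("LogicalResourceId"))
    return risky


# Abbreviated shape of a describe-change-set response (illustrative values).
sample = {"Changes": [
    {"ResourceChange": {"Action": "Modify", "LogicalResourceId": "Database",
                        "Replacement": "True"}},
    {"ResourceChange": {"Action": "Modify", "LogicalResourceId": "WebAsg",
                        "Replacement": "False"}},
]}
print(replacements(sample))  # ['Database'] -> fail the gate, require signoff
```

A CodeBuild step between change-set creation and the manual approval could run a check like this and annotate the approval request with the replacement list.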
Drift detection and remediation
Drift is any divergence between the stack's expected template state and the actual live resources. Examples: someone manually edited a security group rule, someone deleted an IAM policy attachment, an external script changed an S3 bucket versioning setting.
CloudFormation drift detection:
- On-demand: trigger DetectStackDrift (per stack) or DetectStackSetDrift (per stack set) to get a snapshot.
- Scheduled scans: for stack sets, run periodic drift detection via EventBridge scheduled rules or the AWS Config rule cloudformation-stack-drift-detection-check.
- Remediation paths: (a) bring the resource back to the template's desired state manually or via a Systems Manager Automation runbook; (b) update the template to reflect the new intended state and redeploy; (c) replace the drifted resource by updating the stack so CloudFormation recreates it.
Drift detection does not auto-remediate. The full pattern at Pro scale is drift detection → EventBridge event → Lambda or SSM runbook that either remediates (for known safe drift like "security group rule added") or notifies SecOps (for unknown drift).
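The triage step in that pattern reduces to a small decision function. A sketch assuming a simplified drift event shape and an invented allow-list of "known safe" resource types (both are illustrative, not any real event schema):

```python
# Resource types ops has pre-approved for automatic remediation (illustrative).
KNOWN_SAFE = {"AWS::EC2::SecurityGroup"}


def triage_drift(drifted_resource):
    """Decide remediate vs escalate for one drifted resource (simplified event shape)."""
    if drifted_resource["StackResourceDriftStatus"] == "IN_SYNC":
        return "ignore"
    if drifted_resource["ResourceType"] in KNOWN_SAFE:
        return "remediate"   # e.g. start an SSM Automation runbook
    return "notify-secops"   # unknown drift: page a human


event = {"ResourceType": "AWS::IAM::Role", "StackResourceDriftStatus": "MODIFIED"}
print(triage_drift(event))  # notify-secops
```

In the real pattern, a function like this would run inside the Lambda target of the EventBridge rule, with the allow-list kept in SSM Parameter Store so SecOps can tune it without redeploying.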
StackSets at scale — self-managed vs service-managed
CloudFormation StackSets is the multi-account, multi-region distribution layer. Two permission models:
- Self-managed permissions: you pre-create AWSCloudFormationStackSetAdministrationRole in the administrator account and AWSCloudFormationStackSetExecutionRole in every target account. Target by account ID. Use when the organization is not using AWS Organizations, or when targets live outside your organization.
- Service-managed permissions: CloudFormation creates the required IAM roles for you through its AWS Organizations integration. Target by OU. New accounts added to the OU receive the stack automatically via automatic deployment. This is almost always the SAP-C02 answer for organization-wide deployment.
Operational controls on every StackSet operation:
- Maximum concurrent accounts (count or percentage) — how many accounts deploy at once.
- Failure tolerance (count or percentage) — how many failures to allow before halting.
- Region order — explicit list; rollouts are sequential across regions, parallel across accounts within a region.
- Parameter and tag overrides per stack instance — an account or region can override values without template change.
- Deletion behavior on OU removal — retain or delete stack instances when an account leaves an OU.
The delegated administrator pattern matters here: from the management account you designate a member account (typically Shared Services or Deployments) as the StackSets delegated admin; daily operations then happen outside the management account. SAP-C02 rarely accepts an answer that runs StackSets from the management account for day-to-day work.
- Self-managed = account IDs; Service-managed = OUs + auto-deployment for new accounts.
- Delegated admin lets a member account run StackSets so the management account stays clean.
- Concurrency controls blast radius per operation (count or percent).
- Failure tolerance aborts the operation before the failure fraction is exceeded.
- Region order rolls out sequentially across regions — useful for progressive region adoption.
- Automatic deployment auto-applies to new accounts added to targeted OUs. Reference: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/stacksets-concepts.html
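These knobs map directly onto the OperationPreferences structure that StackSet create/update operations accept. A minimal sketch that builds the structure (the percentages and regions are illustrative defaults, not AWS recommendations):

```python
def operation_preferences(region_order,
                          max_concurrent_pct=25,
                          failure_tolerance_pct=10):
    """Build an OperationPreferences block that limits blast radius per wave."""
    return {
        "RegionOrder": region_order,                          # explicit region sequence
        "RegionConcurrencyType": "SEQUENTIAL",                # one region at a time
        "MaxConcurrentPercentage": max_concurrent_pct,        # parallel accounts per region
        "FailureTolerancePercentage": failure_tolerance_pct,  # halt once exceeded
    }


prefs = operation_preferences(["us-east-1", "eu-west-1"])
print(prefs["RegionOrder"])  # ['us-east-1', 'eu-west-1']
```

The same dict would be passed as OperationPreferences to a create-stack-instances or update-stack-set call; count-based variants (MaxConcurrentCount, FailureToleranceCount) exist for small fleets where a percentage is too coarse.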
Canonical pattern: 20 accounts, org-wide baseline, deployable + auditable + drift-detected
This is the scenario SAP-C02 loves. The correct architecture:
- AWS Organizations with all features enabled, OUs aligned to policy (Security, Infrastructure, Workloads/Prod, Workloads/Non-Prod, Sandbox).
- AWS Control Tower enrolled for the Security OU baseline (Log Archive, Audit, CloudTrail, Config).
- Delegated administrator set for CloudFormation StackSets on the Shared Services / Deployments account.
- CodeCommit (or GitHub via connection) repository for baseline templates, peer-reviewed via pull requests.
- CodePipeline in the Deployments account: source → build/lint (CodeBuild + cfn-lint + cfn-nag) → create change set on a StackSet → manual approval → execute StackSet operation targeting specific OUs with service-managed permissions and automatic deployment enabled.
- Stack instance per account and region is then provisioned; new accounts joining the OU auto-enroll.
- Drift detection is scheduled via EventBridge every 24 hours on the StackSet; drift events flow to Security Hub and trigger SNS to SecOps.
- AWS Config rules (cloudformation-stack-drift-detection-check, cloudformation-stack-notification-check) add a detective layer over the same posture.
- A root-level SCP denies cloudformation:DeleteStack and cloudformation:UpdateStack on baseline stack name patterns unless the principal is the CodePipeline execution role — this locks developers out of tampering.
- CloudTrail organization trail (already enabled by Control Tower) records every StackSet operation; auditors review via Athena.
This architecture gives you deployable (one pipeline pushes baseline to 20 accounts in parallel), auditable (CloudTrail + change set diff + approval signoff), and drift-detected (CloudFormation + Config + EventBridge). Any SAP-C02 answer that reuses this skeleton is almost always the right one.
AWS CDK — L1, L2, L3 Constructs, and CDK Pipelines
The AWS Cloud Development Kit (CDK) is the programmatic IaC layer that synthesizes CloudFormation templates from TypeScript, Python, Java, C#, or Go code. CDK has become the AWS-preferred authoring experience for new workloads; SAP-C02 expects you to recognize when CDK is the right answer.
Why CDK exists
CloudFormation is declarative YAML/JSON. At scale this becomes:
- Verbose — dozens of lines for a simple "Lambda with IAM role, log group, and X-Ray tracing".
- Repetitive — the same IAM boilerplate copied across templates.
- Fragile — parameters and mappings substitute for true programming (no loops, weak conditionals).
CDK replaces templates with constructs — composable programming-language classes that synthesize to CloudFormation at deploy time.
Construct levels — L1, L2, L3
- **L1 (Cfn constructs)**: one-to-one with CloudFormation resources — CfnFunction, CfnBucket, CfnTable. You get every attribute exactly as CloudFormation exposes it. Use L1 when you need fine control, or when a new AWS feature hasn't been abstracted to L2 yet.
- **L2 (curated constructs)**: opinionated wrappers with sensible defaults and higher-level helpers. lambda.Function auto-creates an IAM role and log group; s3.Bucket sets encryption and blocks public access by default. L2 is the bulk of day-to-day CDK code.
- **L3 (patterns)**: multi-resource patterns solving a use case end-to-end. aws-ecs-patterns.ApplicationLoadBalancedFargateService creates a VPC, ECS cluster, Fargate service, ALB, security groups, log groups, and IAM roles in one line. L3 constructs encode AWS best-practice architectures and are the fastest way to ship.
SAP-C02 expects you to distinguish L1 (raw power, more boilerplate) from L2 (sensible defaults) from L3 (pattern wholesale). A scenario that says "the team wants to deploy a production-grade Fargate web app with minimal code" points at an L3 pattern.
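The L1-to-L2 relationship is essentially "same resource, security defaults filled in". A toy sketch in plain Python (no aws-cdk-lib involved; the function names are invented for illustration) that mimics how an L2-style wrapper layers defaults over a raw L1-style property bag:

```python
def l1_bucket(props):
    """L1-style: you supply every CloudFormation property yourself."""
    return {"Type": "AWS::S3::Bucket", "Properties": props}


def l2_bucket(overrides=None):
    """L2-style: sensible security defaults; the caller overrides only what differs."""
    defaults = {
        "BucketEncryption": {"ServerSideEncryptionConfiguration": [
            {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}]},
        "PublicAccessBlockConfiguration": {
            "BlockPublicAcls": True, "BlockPublicPolicy": True,
            "IgnorePublicAcls": True, "RestrictPublicBuckets": True},
    }
    return l1_bucket({**defaults, **(overrides or {})})


bucket = l2_bucket({"VersioningConfiguration": {"Status": "Enabled"}})
print(sorted(bucket["Properties"]))
```

An L3 pattern would go one step further and return several such resources wired together; the point is that each level trades raw control for fewer decisions.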
Environments, bootstrapping, and context
CDK deploys into environments (account + region pairs). Before deploying, you run cdk bootstrap aws://ACCOUNT/REGION once per account-region — this provisions a bootstrap stack containing:
- An S3 bucket for template assets and Lambda code zips.
- An ECR repo for container image assets.
- IAM roles the CDK Toolkit assumes during deploy (cfn-exec, deploy, lookup, publish-file, publish-image).
Bootstrapping is the most forgotten operational step on SAP-C02; answers that say "CDK deploy fails with a bucket-not-found error" usually point to missing bootstrap.
CDK Pipelines — self-mutating CI/CD
CDK Pipelines (aws-cdk-lib.pipelines) is a construct library that defines a CodePipeline whose first stage is "build and update the pipeline itself", followed by build+synth+deploy stages per environment. Properties:
- Self-mutating: when you add a new stage in code and push, the pipeline first updates itself to include the stage, then runs the new stage. No manual pipeline edits.
- Multi-account: each stage can target a different AWS account (via cross-account CDK bootstrap roles); useful for dev→staging→prod promotion.
- Security defaults: artifact bucket is KMS-encrypted, roles are least-privilege, secrets are read from Secrets Manager or SSM Parameter Store.
- Waves: parallel stages in a wave deploy to multiple regions or accounts simultaneously.
When SAP-C02 asks "the team wants a pipeline that promotes an application through dev, staging, and prod accounts with zero manual pipeline maintenance", CDK Pipelines is the canonical answer.
CDK vs SAM vs CloudFormation — the decision
| Criterion | CloudFormation | CDK | SAM |
|---|---|---|---|
| Authoring | Declarative YAML/JSON | Programming language (TS/Py/Java/C#/Go) | Declarative YAML extension |
| Abstraction | Low (raw resources) | Medium to high (L1/L2/L3) | Medium (serverless macros) |
| Best for | Any AWS resource, especially for templated reuse | New workloads, complex architectures, multi-account pipelines | Serverless apps (Lambda, API Gateway, DynamoDB, Step Functions) |
| Local testing | Poor | Medium (jest for CDK code) | Good (sam local invoke, sam local start-api) |
| Multi-account | StackSets, Service Catalog | CDK Pipelines | SAM Pipelines |
| Learning curve | Low for simple, high for complex | Medium | Low (if you know CloudFormation) |
If the workload is purely serverless (Lambda + API Gateway + DynamoDB + Step Functions + EventBridge), SAM gives you the best local-dev experience and the shortest template. If the workload mixes serverless and traditional (ECS, EC2, RDS, VPCs) or if you want a single authoring language across everything, CDK wins. If the team already has a deep CloudFormation estate and just needs templated reuse with parameters, plain CloudFormation + StackSets + Service Catalog is fine. Reference: https://docs.aws.amazon.com/cdk/v2/guide/home.html
AWS SAM and Serverless Application Repository
AWS Serverless Application Model (SAM) is a CloudFormation extension specialized for serverless workloads.
SAM in a nutshell
- Template: YAML using Transform: AWS::Serverless-2016-10-31. Shorthand resources (AWS::Serverless::Function, AWS::Serverless::Api, AWS::Serverless::StateMachine, AWS::Serverless::HttpApi, AWS::Serverless::LayerVersion) expand into underlying CloudFormation at deploy time.
- SAM CLI: sam build packages the function code, sam deploy wraps aws cloudformation deploy, sam local invoke runs the function in a Docker container against a local event, sam local start-api runs a local HTTP simulator.
- Gradual deployments: the DeploymentPreference property on AWS::Serverless::Function wires up a CodeDeploy application + deployment group with a pre-traffic Lambda hook, a post-traffic Lambda hook, and a CloudWatch alarms list for auto-rollback. Set Type: Canary10Percent10Minutes or Linear10PercentEvery2Minutes to get alias-based traffic shifting out of the box.
- Policy templates: prebuilt IAM policy shortcuts (DynamoDBCrudPolicy, S3ReadPolicy, SQSSendMessagePolicy) reduce IAM authoring.
SAM's unique value over CloudFormation: the serverless-friendly shorthand + local dev + the built-in CodeDeploy canary wiring. Its unique value over CDK: fewer moving parts, no synthesis step, no programming-language runtime in the pipeline.
Serverless Application Repository (SAR)
AWS Serverless Application Repository is a managed repo for publishing and discovering SAM applications. Use cases:
- Internal organization-wide reuse — a platform team publishes a SAM app that deploys a standard logging Lambda + SQS DLQ pattern; every application team deploys a copy with a few parameters.
- Public reuse — AWS publishes reference applications (alarm handlers, Slack notifiers, CSV-to-DynamoDB loaders) you can deploy into your account with two clicks.
- Distribution — vendors package their serverless integrations and customers subscribe.
SAR applications are deployed via an AWS::Serverless::Application resource referencing the application ARN and a semver version. The published template lives in SAR's managed storage; when you deploy, SAR synthesizes a nested CloudFormation stack in your account.
Putting SAM's gradual-deployment pieces together:
- Set AutoPublishAlias: live on AWS::Serverless::Function — every deploy publishes a new Lambda version and moves the live alias.
- Set DeploymentPreference.Type to one of the canary/linear built-ins.
- Set DeploymentPreference.Alarms to a list of CloudWatch alarms — any alarm in ALARM state triggers auto-rollback.
- Set DeploymentPreference.Hooks to pre-traffic and post-traffic Lambda ARNs for integration checks.
- On sam deploy, CodeDeploy shifts traffic per the configuration, monitors alarms, and rolls back if any trip. Reference: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/automating-updates-to-serverless-apps.html
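Rendered as a template fragment, those settings all sit on the function resource. A sketch shown as a Python dict rather than YAML (handler, runtime, alarm, and hook names are placeholders):

```python
# Illustrative AWS::Serverless::Function fragment with gradual deployment enabled.
sam_function = {
    "Type": "AWS::Serverless::Function",
    "Properties": {
        "Handler": "app.handler",                       # placeholder handler
        "Runtime": "python3.12",
        "AutoPublishAlias": "live",                     # publish a version + move the alias
        "DeploymentPreference": {
            "Type": "Canary10Percent10Minutes",         # 10% for 10 minutes, then the rest
            "Alarms": ["ApiErrorRateAlarm"],            # placeholder alarm name
            "Hooks": {
                "PreTraffic": "PreTrafficCheckFn",      # placeholder hook functions
                "PostTraffic": "PostTrafficCheckFn",
            },
        },
    },
}
print(sam_function["Properties"]["DeploymentPreference"]["Type"])
```

Dumped to YAML under Resources, this is the whole canary wiring; SAM expands it into the CodeDeploy application, deployment group, and alias plumbing at deploy time.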
AWS CodePipeline — Multi-Account and Multi-Region Deployments
CodePipeline is the orchestration layer that chains source, build, test, and deploy actions into a pipeline with stages. At Professional scale the interesting shape is multi-account, multi-region — which is where most real enterprises run.
Cross-account action pattern
The canonical cross-account CodePipeline has:
- Pipeline account (often the Deployments account) hosts CodePipeline, CodeBuild, and the artifact S3 bucket.
- Target accounts (dev, staging, prod) host the actual resources being deployed.
- The artifact bucket in the pipeline account is encrypted with a customer-managed KMS key; the key policy grants decrypt permission to IAM roles in the target accounts.
- Each target account has a pipeline execution role (for example, CodePipelineCrossAccountRole) whose trust policy trusts the pipeline account's CodePipeline service role, with an external ID condition.
- Pipeline actions targeting a remote account use the RoleArn field on the action to assume the target account role.
Key enablers:
- Artifact bucket KMS key must allow `kms:Decrypt` and `kms:GenerateDataKey` from target-account principals (or from any principal in the organization via `aws:PrincipalOrgID`).
- S3 bucket policy on the artifact bucket must allow `s3:GetObject`/`s3:GetObjectVersion` from target accounts.
- CloudFormation deploy actions in a cross-account target stage must include a deployment role — the `RoleArn` in the CloudFormation action's configuration, which is the role CloudFormation itself assumes inside the target account — distinct from the action-level `RoleArn` the pipeline uses to reach that account.
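The KMS requirement in the first bullet can be sketched as a key-policy statement on the artifact bucket's customer-managed key; the account ID and role name below are placeholders:

```yaml
# Illustrative key-policy statement (YAML rendering of the JSON policy document).
# The target-account ID and role name are placeholders.
- Sid: AllowTargetAccountUseOfArtifactKey
  Effect: Allow
  Principal:
    AWS: arn:aws:iam::222233334444:role/CodePipelineCrossAccountRole
  Action:
    - kms:Decrypt
    - kms:DescribeKey
    - kms:GenerateDataKey*
  Resource: "*"
```

Without a statement like this, an IAM allow on the target-account role alone does not grant use of the key — KMS evaluates the key policy first.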
Multi-region considerations
CodePipeline pipelines are regional; cross-region actions are supported but require an artifact bucket in every region the pipeline crosses into. When the pipeline stage targets another region, CodePipeline replicates the artifact to the regional bucket and then runs the action there.
Pro patterns:
- Region fan-out: use a single pipeline with parallel actions in `us-east-1`, `eu-west-1`, `ap-northeast-1` — each action deploys the same template or image.
- Region sequencing: for a gradual region rollout (region A first, then B if A is healthy), chain sequential stages and gate each with a CloudWatch metric check.
- Approval gating: add a manual approval after region 1 deploys; don't auto-propagate to region 2 until alarms are green.
Source-action options
- CodeCommit (native AWS Git).
- GitHub / GitLab / Bitbucket Cloud via AWS CodeStar Connections — the current recommended mechanism for third-party Git (replaces the old GitHub action).
- S3 object — a zipped bundle deposited in S3.
- ECR image — a newly pushed container image.
- AWS CodeArtifact package — a new NPM/PyPI/Maven package version.
Action providers relevant to deployment strategy
- CloudFormation — `CREATE_UPDATE`, `DELETE_ONLY`, `CHANGE_SET_REPLACE`, `CHANGE_SET_EXECUTE`, `REPLACE_ON_FAILURE` action modes. Pro pipelines always split `CHANGE_SET_REPLACE` + approval + `CHANGE_SET_EXECUTE`.
- CloudFormation StackSets — create or update a stack set, create stack instances targeting specific OUs.
- CodeDeploy — invoke an EC2/on-prem or Lambda or ECS CodeDeploy deployment.
- CodeBuild — run a build (tests, linting, security scans).
- Manual approval — pause pipeline pending human signoff; supports SNS notification and reviewer comments.
- Lambda invoke — custom pre- or post-deployment logic (database migration, smoke test, integration call).
- Step Functions invoke — long-running or complex orchestration.
- Service Catalog — deploy a product.
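The change-set split recommended above looks like this as a pipeline stage; stack, artifact, account, and role names are placeholders, and a cross-account variant would carry both the action-level `RoleArn` and the CloudFormation configuration `RoleArn` as shown:

```yaml
# Illustrative CodePipeline deploy stage (CloudFormation YAML) — names and ARNs are placeholders.
- Name: DeployProd
  Actions:
    - Name: CreateChangeSet
      ActionTypeId: { Category: Deploy, Owner: AWS, Provider: CloudFormation, Version: "1" }
      Configuration:
        ActionMode: CHANGE_SET_REPLACE
        StackName: app-prod
        ChangeSetName: app-prod-changes
        TemplatePath: BuildOutput::packaged.yaml
        RoleArn: arn:aws:iam::222233334444:role/CfnDeploymentRole   # role CloudFormation assumes
      InputArtifacts:
        - Name: BuildOutput
      RoleArn: arn:aws:iam::222233334444:role/CodePipelineCrossAccountRole  # action role
      RunOrder: 1
    - Name: ApproveChanges
      ActionTypeId: { Category: Approval, Owner: AWS, Provider: Manual, Version: "1" }
      RunOrder: 2
    - Name: ExecuteChangeSet
      ActionTypeId: { Category: Deploy, Owner: AWS, Provider: CloudFormation, Version: "1" }
      Configuration:
        ActionMode: CHANGE_SET_EXECUTE
        StackName: app-prod
        ChangeSetName: app-prod-changes
      RoleArn: arn:aws:iam::222233334444:role/CodePipelineCrossAccountRole
      RunOrder: 3
```

The approval action between create and execute is what makes the change auditable: reviewers see the exact change set before anything mutates.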
A classic SAP-C02 setup: the team creates a cross-account pipeline, sets up the target role with assume-role trust, and the pipeline fails at the CloudFormation deploy step with a decrypt error. The missing piece is almost always the key policy on the customer-managed KMS key that encrypts the pipeline-account artifact bucket — it must allow `kms:Decrypt` to the target account's role. An IAM allow on the consumer side is not enough; KMS requires the key policy to grant the principal. Reference: https://docs.aws.amazon.com/codepipeline/latest/userguide/pipelines-create-cross-account.html
AWS CodeDeploy — Deployment Configurations in Depth
CodeDeploy is AWS's deployment engine for EC2/on-prem instances, Lambda functions, and ECS services. The deployment configuration is the policy that governs how traffic or instances shift during the deployment.
EC2 / on-premises deployment configurations
These apply to in-place and blue/green deployments on EC2 instances or on-prem servers with the CodeDeploy agent.
- `CodeDeployDefault.AllAtOnce` — update all instances simultaneously. Fast, high risk, cheap. Use in dev only.
- `CodeDeployDefault.HalfAtATime` — update half the fleet at a time, in waves. Balances speed and risk.
- `CodeDeployDefault.OneAtATime` — update one instance at a time. Slowest but safest.
- Custom deployment configurations — you define `MinimumHealthyHosts` as a count or percentage; CodeDeploy ensures at least that many instances stay healthy during the deploy.
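A custom configuration is a small CloudFormation resource; the name and threshold here are illustrative:

```yaml
# Illustrative custom EC2/on-prem deployment configuration — name and value are placeholders.
SafeFleetConfig:
  Type: AWS::CodeDeploy::DeploymentConfig
  Properties:
    DeploymentConfigName: Custom.AtLeast75PercentHealthy
    MinimumHealthyHosts:
      Type: FLEET_PERCENT   # or HOST_COUNT for an absolute instance count
      Value: 75             # keep at least 75% of the fleet healthy during the deploy
```

With `FLEET_PERCENT: 75`, a 20-instance fleet is updated roughly 5 instances at a time — CodeDeploy derives the wave size from the healthy-host floor.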
Lambda deployment configurations
Lambda deployments never run in-place; CodeDeploy publishes a new Lambda version and shifts alias traffic.
- `CodeDeployDefault.LambdaAllAtOnce` — immediate alias cutover. Dev use.
- `CodeDeployDefault.LambdaCanary10Percent5Minutes` — 10% traffic for 5 minutes, then 100%.
- `CodeDeployDefault.LambdaCanary10Percent10Minutes` — 10% for 10 minutes, then 100%.
- `CodeDeployDefault.LambdaCanary10Percent15Minutes` — 10% for 15 minutes, then 100%.
- `CodeDeployDefault.LambdaCanary10Percent30Minutes` — 10% for 30 minutes, then 100%.
- `CodeDeployDefault.LambdaLinear10PercentEvery1Minute` — +10% every minute (total ~10 minutes).
- `CodeDeployDefault.LambdaLinear10PercentEvery2Minutes` — +10% every 2 minutes.
- `CodeDeployDefault.LambdaLinear10PercentEvery3Minutes` — +10% every 3 minutes.
- `CodeDeployDefault.LambdaLinear10PercentEvery10Minutes` — +10% every 10 minutes (total ~100 minutes).
ECS deployment configurations
ECS deployments via CodeDeploy use blue/green only — CodeDeploy provisions a new task set behind a test listener on the ALB/NLB, then shifts the production listener.
- `CodeDeployDefault.ECSAllAtOnce` — cut all traffic over to the new task set immediately after test-traffic validation.
- `CodeDeployDefault.ECSCanary10Percent5Minutes` and `CodeDeployDefault.ECSCanary10Percent15Minutes` — canary with the named bake window.
- `CodeDeployDefault.ECSLinear10PercentEvery1Minutes` and `CodeDeployDefault.ECSLinear10PercentEvery3Minutes` — linear ramp.
Automatic rollback and alarms
Every CodeDeploy deployment group can be wired to CloudWatch alarms. Properties:
- Alarms: list of CloudWatch alarms; if any alarm is in ALARM state during the deploy, CodeDeploy auto-rolls back.
- Auto-rollback events: `DEPLOYMENT_FAILURE`, `DEPLOYMENT_STOP_ON_ALARM`, `DEPLOYMENT_STOP_ON_REQUEST`.
- Lifecycle hooks: pre-traffic and post-traffic hooks invoke Lambda functions for integration tests; hook failure triggers rollback.
Pro pattern: define a `p99Latency` CloudWatch alarm and an `Error5xxRate` alarm; attach both to every deployment group. For Lambda, use the built-in CloudWatch metrics `Errors` and `Duration`, plus custom application metrics emitted via embedded metric format.
ECS blue/green with CodeDeploy requires: (1) an ECS service with deployment controller = CODE_DEPLOY; (2) an ALB or NLB with two listeners — a production listener and a test listener; (3) two target groups — blue (currently serving) and green (spawned for the deploy); (4) a CodeDeploy application and deployment group referencing both target groups and both listeners; (5) an appspec.yaml specifying the new task definition. CodeDeploy registers the new task set against the green target group, optionally routes test traffic for validation via the test listener, then shifts the production listener from blue to green per the deployment configuration. Rollback is a listener flip back to blue. Reference: https://docs.aws.amazon.com/codedeploy/latest/userguide/deployments-create-ecs-bg.html
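The appspec.yaml from step (5) is short; the task definition ARN, container details, and hook function names below are placeholders:

```yaml
# Illustrative appspec.yaml for an ECS blue/green deployment — ARNs and names are placeholders.
version: 0.0
Resources:
  - TargetService:
      Type: AWS::ECS::Service
      Properties:
        TaskDefinition: arn:aws:ecs:us-east-1:111122223333:task-definition/web:42
        LoadBalancerInfo:
          ContainerName: web
          ContainerPort: 8080
Hooks:
  - BeforeAllowTraffic: PreTrafficHookFunction    # Lambda that validates via the test listener
  - AfterAllowTraffic: PostTrafficHookFunction    # Lambda that runs post-cutover checks
```

CodeDeploy reads this file to know which task definition to launch in the green task set and which Lambda hooks gate the listener shift.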
Lambda Alias-Based Canary and Version Management
Lambda has native versioning and alias support that underpins CodeDeploy canary deployments.
Lambda versions and aliases
- A Lambda function version is an immutable snapshot of the function's code + config + environment variables at publish time. Versions are numbered (`1`, `2`, `3`, ...) with `$LATEST` being the mutable working copy.
- A Lambda alias is a named pointer to a specific version (`prod → 7`, `staging → 8`). Aliases are invokable (their ARN is stable), so clients call the alias ARN and never the version directly.
Alias traffic shifting
An alias can point to one or two versions simultaneously with a RoutingConfig weight. Example:
```yaml
alias: prod
FunctionVersion: 7            # primary, 90% of invocations
RoutingConfig:
  AdditionalVersionWeights:
    "8": 0.10                 # 10% of invocations routed to version 8
```
CodeDeploy drives this alias traffic shifting on a deployment schedule (canary or linear). The application never knows there are two versions; the Lambda service handles the per-invocation routing.
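In raw CloudFormation, the same weighted alias uses a slightly different shape (a list of version/weight pairs rather than a map); the function name and version numbers here are illustrative:

```yaml
# Illustrative AWS::Lambda::Alias with weighted routing — names and versions are placeholders.
ProdAlias:
  Type: AWS::Lambda::Alias
  Properties:
    FunctionName: !Ref CheckoutFunction
    FunctionVersion: "7"                  # primary version (90% of invocations)
    Name: prod
    RoutingConfig:
      AdditionalVersionWeights:
        - FunctionVersion: "8"            # canary version
          FunctionWeight: 0.10            # 10% of invocations
```

During a CodeDeploy canary, CodeDeploy updates these weights on your behalf; you only author the alias once.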
SAM vs direct CodeDeploy wiring
The same alias-canary pattern is available in three authoring styles:
- SAM — `AutoPublishAlias: live` + `DeploymentPreference.Type: Canary10Percent10Minutes` + `DeploymentPreference.Alarms`. Shortest.
- CDK — `LambdaDeploymentGroup` construct with `deploymentConfig` and `alarms`. Idiomatic for CDK estates.
- Direct CloudFormation / CodeDeploy — `AWS::CodeDeploy::Application`, `AWS::CodeDeploy::DeploymentGroup`, `AWS::Lambda::Alias`. Full control, most verbose.
Provisioned concurrency and aliases
Provisioned concurrency is assigned to an alias (or a specific version). When you canary, both the old version (primary) and new version (canary) need adequate provisioned concurrency to avoid cold starts during the shift. A common Pro mistake is provisioning concurrency on the alias (which in canary mode is split) without accounting for the canary percentage — the canary version starts cold. Fix: provision concurrency on the version being rolled out just before the canary starts, not on the alias.
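The fix described above — warm capacity on the version, not the alias — can be expressed directly on the version resource; the function name and concurrency number are placeholders:

```yaml
# Illustrative: provisioned concurrency attached to the version being rolled out,
# so canary invocations don't start cold. Names and numbers are placeholders.
CheckoutVersion8:
  Type: AWS::Lambda::Version
  Properties:
    FunctionName: !Ref CheckoutFunction
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 10   # warm capacity for the canary version itself
```

Sizing note: if the alias carries 1,000 concurrent executions and the canary takes 10%, the new version needs roughly 100 warm executions from the moment the shift starts.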
When SAP-C02 says "a new Lambda version must auto-rollback if error rate spikes in the first 10 minutes after deploy", the textbook answer stack is: (1) publish new version via CodeDeploy, (2) use CodeDeployDefault.LambdaCanary10Percent10Minutes, (3) attach a CloudWatch alarm on function errors + custom metric, (4) auto-rollback on alarm. No custom Lambda orchestration, no Step Function, no manual runbook. Reference: https://docs.aws.amazon.com/lambda/latest/dg/configuration-aliases.html
AWS AppConfig — Runtime Feature Flags and Configuration
Deployment strategy is not only about shipping code — it's also about shipping configuration and feature flags without code deploys. AWS AppConfig is the managed service for this.
Core components
- Application: logical grouping (e.g., "checkout-service").
- Environment: logical deployment target (e.g., "prod-us-east-1", "staging-eu-west-1"). Environments can have CloudWatch alarms for auto-rollback.
- Configuration profile: the unit of configuration — freeform JSON/YAML/text, a feature-flag profile (structured feature-flag JSON), or a hosted configuration. Sources include AppConfig-hosted, SSM Parameter Store, S3, and Systems Manager documents.
- Deployment strategy: named policy governing the rollout — `AppConfig.AllAtOnce`, `AppConfig.Linear50PercentEvery30Seconds`, `AppConfig.Canary10Percent20Minutes`, or custom.
- Validator: optional Lambda or JSON Schema that validates the new config at deploy time.
Deployment flow
- Operator updates the configuration profile with a new version (feature flag toggled, threshold raised).
- Operator starts a deployment to an environment with a chosen deployment strategy.
- AppConfig validates the config (syntax + JSON Schema + custom Lambda).
- AppConfig rolls the config out per the strategy's ramp.
- Clients (via the AppConfig Agent sidecar or SDK) poll and pick up new config at their next poll interval.
- If an environment CloudWatch alarm enters ALARM state during the deploy, AppConfig automatically rolls back to the previous configuration version.
Why AppConfig over environment variables
- Changing a Lambda environment variable requires a full deploy and alias cutover.
- Changing an ECS task definition environment variable requires a task definition update + service deploy.
- AppConfig lets you change a behavior flag in seconds without a deploy, and with the same rollback guardrails as code.
Feature flags vs configuration
AppConfig has a first-class feature flag profile type with structured flags (boolean, value, constraints). The runtime SDK exposes getFlag("newCheckoutFlow") with sane defaults, and the UI lets you enable/disable per environment. This is the service to reach for when SAP-C02 says "gradually enable a new feature to 10 percent of users, with a kill-switch".
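The application/environment/strategy components described above can be sketched in CloudFormation; the names, alarm, and monitor role are illustrative placeholders:

```yaml
# Illustrative AppConfig setup — names, alarm, and role are placeholders.
CheckoutApp:
  Type: AWS::AppConfig::Application
  Properties:
    Name: checkout-service
ProdEnv:
  Type: AWS::AppConfig::Environment
  Properties:
    ApplicationId: !Ref CheckoutApp
    Name: prod-us-east-1
    Monitors:
      - AlarmArn: !GetAtt ConversionRateAlarm.Arn      # ALARM during deploy => auto-rollback
        AlarmRoleArn: !GetAtt AppConfigMonitorRole.Arn
SlowRollout:
  Type: AWS::AppConfig::DeploymentStrategy
  Properties:
    Name: linear-20pct-every-interval
    DeploymentDurationInMinutes: 20
    GrowthType: LINEAR
    GrowthFactor: 20            # +20% of targets per interval
    FinalBakeTimeInMinutes: 10  # keep watching alarms after 100% before marking complete
    ReplicateTo: NONE
```

The `FinalBakeTimeInMinutes` window is the AppConfig analog of a CodeDeploy bake: the deploy stays "in progress" and reversible while alarms are observed.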
Think of AppConfig as a second deployment surface alongside your code pipeline. The exam loves scenarios where a risky config change must be reversible without redeploying the application — AppConfig is the answer. Pair it with CloudWatch alarms for automatic rollback and you get a runtime analog of the CodeDeploy canary pattern, without touching Lambda versions or ECS task definitions. Reference: https://docs.aws.amazon.com/appconfig/latest/userguide/what-is-appconfig.html
AWS Service Catalog — Governed Self-Service for Developers
When developers need to spin up their own resources but the organization needs guardrails, AWS Service Catalog wraps CloudFormation templates into versioned, launchable products end users consume through a controlled portfolio.
Core components
- Product: a versioned CloudFormation template representing a reusable architecture (e.g., "standard three-tier web stack", "Aurora Postgres with backup policies").
- Portfolio: a grouping of products shared with specific IAM principals or accounts.
- Launch constraint: an IAM role that Service Catalog assumes when launching a product. End users don't need CloudFormation permissions — they only need permission to launch the product, and the launch constraint role provisions the resources.
- Template constraint: rules restricting parameter values end users can pick (e.g., instance type must be from an allowed list).
- TagOption library: managed tag key/value allowlist; products inherit TagOptions so all launched resources get consistent tags.
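The launch constraint from the list above is its own CloudFormation resource; the portfolio, product, and role names here are placeholders:

```yaml
# Illustrative launch constraint — the role provisions resources on the end user's behalf.
WebStackLaunchConstraint:
  Type: AWS::ServiceCatalog::LaunchRoleConstraint
  Properties:
    PortfolioId: !Ref SharedPortfolio
    ProductId: !Ref ThreeTierWebProduct
    RoleArn: !GetAtt ProductLaunchRole.Arn   # end users never hold these permissions directly
```

This is the mechanism behind "launch permission only": the user's identity authorizes the launch, and the constraint role authorizes the resources.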
Multi-account pattern
In a multi-account organization:
- Hub account (often Shared Services) hosts the portfolio and products.
- Spoke accounts receive portfolio shares via the Service Catalog portfolio share mechanism, or via AWS RAM for organization-scope sharing.
- Developers in spoke accounts launch products from the locally-shared portfolio; the launch happens in the spoke account using the launch constraint role.
- Product updates propagate automatically to shared portfolios; spoke admins see new versions and can apply them to existing provisioned products.
Service Catalog vs direct CloudFormation access
| Criterion | Direct CloudFormation | Service Catalog |
|---|---|---|
| Developer permissions | Need broad CloudFormation + resource permissions | Launch permission only |
| Template governance | Per-team, no enforcement | Curated, versioned portfolio |
| Parameter controls | Free-form | Template constraints restrict values |
| Tagging enforcement | Per-template | TagOption library enforces consistency |
| Multi-account sharing | StackSets or manual | Portfolio share + RAM |
| Use case | Platform teams | End-user self-service |
SAP-C02 answer signal: "developers should self-serve standard architectures without broad IAM permissions" → Service Catalog. "Platform team deploys org-wide baseline" → StackSets.
Terraform on AWS Considerations
Terraform is not an AWS-native service, but many enterprises run Terraform at scale on AWS. SAP-C02 occasionally asks how Terraform fits alongside AWS-native IaC.
AWS-supported integration points
- AWS Provider for Terraform: the standard Terraform provider covering essentially every AWS service. Versioned, well-maintained by HashiCorp and AWS.
- AWS Cloud Control Provider: provides Terraform access to CloudFormation-supported resources via the Cloud Control API — useful for resources not yet in the AWS Provider.
- AWS Control Tower Account Factory for Terraform (AFT): AWS-provided reference implementation of Terraform-based account vending with Control Tower. Uses CodeCommit repos per account for baseline configuration + global customizations.
- Terraform Cloud / Terraform Enterprise integration: CodePipeline supports invoking Terraform Cloud runs; private module registries can reference AWS-native artifacts.
State management on AWS
Terraform requires remote state at Pro scale:
- S3 backend for the state file, with DynamoDB table for state locking (prevents concurrent writes).
- S3 bucket versioning on the state bucket for audit and rollback.
- KMS CMK for state encryption at rest.
- Separate state files per environment to limit blast radius (dev, staging, prod each have their own).
Terraform vs CloudFormation — the Pro-level choice
| Criterion | CloudFormation/CDK | Terraform |
|---|---|---|
| AWS coverage | Day 0 for every new AWS service | Fast but typically lags by weeks |
| Multi-cloud | AWS only | AWS + Azure + GCP + many others |
| State | AWS-managed | Operator-managed (S3 + DynamoDB) |
| Drift detection | Native (CloudFormation + Config rule) | terraform plan + third-party (e.g., driftctl) |
| Multi-account distribution | StackSets (service-managed) | AFT, custom pipelines, Terragrunt |
| Change preview | Change sets | terraform plan |
| Rollback | Automatic on failure | Manual (re-apply previous state) |
| Org policy integration | Native (SCPs, Control Tower) | External (Sentinel, OPA) |
| Cost | Free | Free (OSS); paid tiers for Terraform Cloud |
SAP-C02 typically prefers CloudFormation/CDK answers unless the scenario explicitly mentions an existing Terraform estate or a multi-cloud requirement. When Terraform is in play, the AFT pattern is the AWS-recommended multi-account approach.
The exam chooses Terraform in three scenarios: (1) "the company already runs Terraform in production and wants to extend to AWS", (2) "the architecture spans AWS + Azure + GCP and needs a single IaC tool", (3) "Control Tower Account Factory for Terraform (AFT) is already the baseline". In every other SAP-C02 IaC scenario, CloudFormation or CDK is the right answer. Reference: https://docs.aws.amazon.com/prescriptive-guidance/latest/terraform-aws-provider-best-practices/introduction.html
Systems Manager Automation and Change Management
Deployment strategy at Pro scale also covers operational changes — patch rollouts, configuration baselines, one-off runbook executions. AWS Systems Manager Automation is the service for these.
Automation documents (runbooks)
- AWS-managed documents (hundreds of them): `AWS-RestartEC2Instance`, `AWS-PatchInstanceWithRollback`, `AWS-EnableExplorer`, `AWSSupport-TroubleshootRDP`, etc.
- Custom documents: YAML or JSON definitions with steps (`aws:runCommand`, `aws:invokeLambdaFunction`, `aws:approve`, `aws:waitForAwsResourceProperty`, `aws:branch`).
- Multi-account, multi-region automation: runbooks can target Organizations OUs, spanning accounts and regions in one execution.
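A minimal custom runbook combining two of the step types above might look like this; the service name, approver ARN, and parameter are illustrative:

```yaml
# Illustrative custom Automation runbook (schemaVersion 0.3) — names and ARNs are placeholders.
schemaVersion: "0.3"
description: Restart an application service with an approval gate.
parameters:
  InstanceId:
    type: String
mainSteps:
  - name: approvalGate
    action: aws:approve
    inputs:
      Approvers:
        - arn:aws:iam::111122223333:role/OpsApprover   # placeholder principal
      MinRequiredApprovals: 1
  - name: restartService
    action: aws:runCommand
    inputs:
      DocumentName: AWS-RunShellScript
      InstanceIds:
        - "{{ InstanceId }}"
      Parameters:
        commands:
          - systemctl restart myapp   # placeholder service name
```

The same document can be executed against a single instance or fanned out across OUs via multi-account/multi-region execution.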
Patch Manager
- Patch baselines per OS and compliance level define which patches are approved.
- Patch groups tag EC2 instances with the baseline they follow.
- Maintenance windows schedule patching, respecting business hours and maintenance approvals.
- Organization-wide rollouts: Systems Manager Quick Setup + Patch Manager baselines applied across OUs.
Change Manager
- Change templates define the review workflow and approvers for a given type of change.
- Change requests are proposals tied to a runbook; approvers review and approve or reject.
- Calendar integration prevents changes during blackout windows.
- CloudTrail auditability: every step is logged.
SAP-C02 answer signals: "approve and auditable operational change across 40 accounts" → Systems Manager Change Manager + Automation. "Schedule OS patching across the fleet with compliance reporting" → Patch Manager with maintenance windows.
Scenario Patterns — CI/CD and IaC on SAP-C02
The following patterns map exam stems to architectures.
Scenario Pattern 1: "Deploy org-wide baseline to 20 accounts, auditable and drift-detected"
This is the canonical pattern already covered in detail above. Summary:
- AWS Organizations + Control Tower for the OU structure and baseline guardrails.
- CloudFormation StackSets with service-managed permissions targeting specific OUs, with automatic deployment on new accounts.
- Delegated administrator for StackSets on the Deployments account.
- CodePipeline with `CHANGE_SET_REPLACE` → manual approval → `CHANGE_SET_EXECUTE` actions.
- Drift detection via scheduled EventBridge rules plus the AWS Config rule `cloudformation-stack-drift-detection-check`.
- Root-level SCP denies tampering with baseline stacks except by the pipeline role.
- CloudTrail organization trail for audit.
Scenario Pattern 2: "Lambda must deploy with auto-rollback on p99 latency regression"
- Publish the new version with SAM `AutoPublishAlias` or the CDK `LambdaDeploymentGroup`.
- CodeDeploy deployment configuration = `CodeDeployDefault.LambdaCanary10Percent10Minutes`.
- CloudWatch alarm on Lambda `Duration` p99 above threshold, and on the `Errors` rate.
- `DeploymentPreference.Alarms` references both alarms.
- On alarm, CodeDeploy auto-rolls back the alias weight.
Scenario Pattern 3: "ECS service zero-downtime deploy with automated rollback"
- ECS service `deploymentController = CODE_DEPLOY`.
- ALB with two listeners (production + test) and two target groups (blue + green).
- CodeDeploy application + deployment group with `ECSCanary10Percent5Minutes`, or `ECSAllAtOnce` after hook validation.
- Pre-traffic Lambda hook hits `/health` via the test listener; a failed hook triggers rollback.
- Post-traffic Lambda hook runs integration tests; a failed hook triggers rollback.
- CloudWatch alarms on ALB 5xx, target group healthy host count.
Scenario Pattern 4: "Developer self-service with governance"
- AWS Service Catalog hub in Shared Services account.
- Portfolio shared with workload OUs via the native share mechanism.
- Launch constraint role provisions resources in spoke accounts on behalf of developers.
- Template constraints restrict instance types, regions, AZs.
- TagOption library enforces mandatory tags.
- Developers launch from the Service Catalog console or via the CloudFormation resource `AWS::ServiceCatalog::CloudFormationProvisionedProduct`.
Scenario Pattern 5: "Feature flag rollout with kill-switch"
- AWS AppConfig application + environment + feature-flag profile.
- AppConfig deployment strategy = `AppConfig.Linear50PercentEvery30Seconds`.
- Environment alarm on a business metric (e.g., conversion rate).
- Application uses the AppConfig Agent Lambda extension for zero-latency fetch.
- Kill-switch: the operator flips the flag to `false` and starts a deployment — seconds to full effect.
Scenario Pattern 6: "Multi-account CodePipeline — dev → staging → prod"
- Pipeline account hosts CodePipeline, CodeBuild, artifact S3 with customer-managed KMS key.
- Dev / staging / prod accounts each have a CloudFormation deployment role trusting the pipeline's CodePipeline service role.
- KMS key policy allows decrypt from each target account role.
- S3 bucket policy allows `s3:GetObject` from target-account roles.
- Pipeline stages use the action's `RoleArn` field to assume the target role, plus the CloudFormation action configuration's `RoleArn` for the role CloudFormation assumes in the target account.
- Manual approval between staging and prod.
Scenario Pattern 7: "Multi-region rollout with progressive validation"
- Single pipeline with sequential region stages: `us-east-1` → `eu-west-1` → `ap-northeast-1`.
- Each region has its own artifact bucket (regional requirement).
- Between regions, a CloudWatch alarm check action (custom Lambda) verifies no metrics in ALARM state.
- Manual approval for the first production region; automatic promotion after alarm clear for subsequent regions.
Decision Matrix — Which IaC and Deployment Tool for Which Goal
| Goal | Primary choice | Notes |
|---|---|---|
| Deploy baseline to many accounts via OU | CloudFormation StackSets (service-managed) | Automatic deployment on new accounts |
| Deploy to one account with modular templates | Nested stacks | Single-lifecycle reusable modules |
| Share VPC/IAM outputs between independent stacks | Cross-stack references (Export/ImportValue) | Same account, same region only |
| Preview infra change before applying | CloudFormation change sets | Required for Pro pipelines |
| Detect drift on deployed stacks | CloudFormation drift detection + Config rule | Pair with EventBridge alerting |
| Greenfield complex architecture | CDK (L2/L3) | Programmatic abstractions |
| Greenfield serverless app | SAM | Best local-dev, shortest template |
| Multi-account self-mutating pipeline | CDK Pipelines | Cross-account via bootstrap roles |
| Governed developer self-service | AWS Service Catalog | Launch constraint + TagOptions |
| Reversible config/feature flag change | AWS AppConfig | Runtime, alarm-driven rollback |
| Zero-downtime Lambda deploy | CodeDeploy Lambda canary or linear | Alias-based traffic shifting |
| Zero-downtime ECS deploy | CodeDeploy ECS blue/green | Two listeners + two target groups |
| Zero-downtime EC2 fleet deploy | CodeDeploy EC2 blue/green | New ASG + ELB target group swap |
| Fast in-place dev deploy | CodeDeployDefault.AllAtOnce | Dev only; avoid in prod |
| Org-wide OS patching | Systems Manager Patch Manager + Maintenance Windows | Organization-scoped |
| Multi-account runbook execution | Systems Manager Automation (Multi-Account/Region) | Approve via Change Manager |
| Approval-gated change with audit | Systems Manager Change Manager | Integrates with Automation |
| Multi-cloud IaC | Terraform (+ AFT for AWS multi-account) | Remote state in S3 + DynamoDB |
| Publish reusable serverless apps internally | Serverless Application Repository (SAR) | Nested-stack deployment |
Common Traps — CI/CD and IaC Deployment Strategy
Expect multiple of these distractors on every SAP-C02 attempt.
Trap 1: "Use self-managed StackSets permissions" when Organizations is in play
If the organization is using AWS Organizations (and the scenario mentions OUs), service-managed permissions is the right answer. Self-managed is only correct when targets are outside the organization or Organizations isn't enabled.
Trap 2: Running StackSets from the management account
Day-to-day StackSet operations should happen in the delegated administrator account, not the management account. Answers that keep StackSets in the management account are governance anti-patterns.
Trap 3: Cross-stack references across accounts
`Fn::ImportValue` is same-account, same-region only. To share values across accounts, use SSM Parameter Store with RAM sharing, or pass values explicitly through pipeline parameters or CDK Pipelines context.
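The SSM Parameter Store alternative can be sketched as follows; the parameter name and resource names are placeholders, and the cross-account case assumes the parameter (advanced tier) is shared via RAM or copied by a pipeline:

```yaml
# Illustrative: a producer stack publishes a value as an SSM parameter instead of an export.
# Parameter name and resources are placeholders.
SharedVpcIdParam:
  Type: AWS::SSM::Parameter
  Properties:
    Name: /network/prod/vpc-id
    Type: String
    Value: !Ref Vpc
# A same-account consumer template then resolves it with a dynamic reference:
#   VpcId: '{{resolve:ssm:/network/prod/vpc-id}}'
```

Unlike `Fn::ImportValue`, nothing locks the producer stack: the parameter can be updated without the export-in-use restriction.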
Trap 4: Forgetting CDK bootstrap
CDK deploys require a bootstrap stack per account-region pair. A missing bootstrap is the #1 cause of "bucket not found" errors in CDK pipelines. The fix is `cdk bootstrap` once per environment.
Trap 5: CodeDeploy ECS without two listeners or two target groups
ECS blue/green needs both a production and a test listener, and two target groups. Single-listener/single-target-group scenarios default to the ECS rolling update controller, not CodeDeploy blue/green.
Trap 6: AppConfig vs environment variables
Scenarios asking for "fast reversible config change without deploy" want AppConfig, not env vars. Env vars require a full function/task update to change.
Trap 7: Lambda provisioned concurrency misassignment during canary
Provision concurrency on the new version during canary rollout, not on the alias (which is split). Otherwise the canary traffic starts cold.
Trap 8: Terraform when CloudFormation fits
Unless the scenario explicitly mentions multi-cloud or an existing Terraform estate, the AWS-native answer (CloudFormation, CDK, SAM, StackSets) is preferred on SAP-C02. Terraform is not wrong, but it's rarely the "best" answer on AWS-only scenarios.
Trap 9: Tag policy enforcement misunderstanding
Organizations tag policies report non-compliance by default; they do not block resource creation. To block, combine tag policies with an SCP using aws:RequestTag conditions, or enforce at the Service Catalog product level with TagOptions.
Trap 10: KMS key policy missing on cross-account pipelines
Cross-account CodePipeline artifacts require the customer-managed KMS key's key policy to grant decrypt to the target account's principals. IAM allow on the target side is not sufficient — KMS uses resource-based permissions primarily.
Trap 11: Assuming CodeDeploy does rolling update for ECS in-place
CodeDeploy's ECS deployment controller is blue/green only. If you want rolling in-place on ECS, use ECS's native rolling update controller (not CodeDeploy). The exam occasionally offers "CodeDeploy rolling for ECS" as a distractor — it doesn't exist.
Trap 12: Change sets skipped in production
Any production pipeline answer that uses `CREATE_UPDATE` directly is suspect. Pro pipelines always use `CHANGE_SET_REPLACE` + approval + `CHANGE_SET_EXECUTE`.
Diagnostic Entry Point — "Why is this deployment failing?"
When a scenario says "the deployment failed, the team expected X but saw Y", walk the diagnostic ladder:
- Pipeline role and cross-account trust — does the pipeline's service role have `sts:AssumeRole` on the target-account role? Does the target role's trust policy trust the pipeline service role with the right external ID?
- KMS CMK key policy — for artifact-bucket-encrypted pipelines, does the KMS key allow `kms:Decrypt` and `kms:GenerateDataKey` to the target account?
- S3 bucket policy on the artifact bucket — does it allow `s3:GetObject`/`s3:GetObjectVersion` to the target-account role?
- CloudFormation deployment role — for cross-account CloudFormation actions, the role CloudFormation itself assumes in the target account (the `RoleArn` in the CloudFormation action's configuration) needs permissions on every resource type the template touches. This is a separate role from the pipeline role.
- SCP on the target account path — is a deny SCP blocking the CloudFormation action or a resource type?
- Service quotas — CloudFormation 500-resource limit, StackSet 100 stack-instance operation limit, pipeline action timeout (default 1 hour).
- Change set validity — an expired change set must be recreated; a change set referencing a resource that no longer exists fails execution.
- CloudFormation Hooks or proactive guardrails — a Control Tower proactive control may block the change at hook evaluation; check the hook outcome in the CloudFormation events.
- CloudWatch alarm state for CodeDeploy deployments — was an alarm already in ALARM before deploy started (auto-rollback triggers instantly)?
- Lambda function limits — concurrency limits, code size, deployment package size, reserved concurrency conflicts.
CloudTrail (in the Log Archive account via the organization trail) plus the pipeline execution history plus the CloudFormation stack events give you the forensic trail.
Cost Model Notes
Most CI/CD services charge based on concrete units:
- CodePipeline: first pipeline free; additional active pipelines billed monthly. V2 pipelines charge per execution minute.
- CodeBuild: per-minute billing per build host (general1.small through large).
- CodeDeploy: no charge for EC2/on-prem/Lambda/ECS deployments themselves; you pay for the underlying compute.
- CloudFormation StackSets: no service charge; you pay for the provisioned resources.
- CDK: free; you pay for provisioned resources and the bootstrap assets (S3 + ECR storage).
- SAM CLI: free local use.
- Serverless Application Repository: free for publishing and consuming.
- AWS AppConfig: per configuration call + per configuration retrieved (tiny unit cost; cache via agent/extension).
- Service Catalog: free; you pay for provisioned resources.
- Systems Manager: free for most features; Automation charges per step execution for automation running across many instances.
At Pro scale, the dominant costs are the provisioned infrastructure itself and the CodeBuild minutes for heavy pipelines; the CI/CD control plane is cheap.
FAQ — CI/CD and IaC Deployment Strategy Top Questions
Q1: When should I use CloudFormation StackSets vs nested stacks vs cross-stack references?
Use nested stacks when a single team owns all the resources and wants them to deploy, update, and roll back as one transaction — the parent stack drives every child. Use cross-stack references when different teams own different pieces in the same account and region and need independent lifecycles (VPC team, IAM team, application team) but still want to share outputs like subnet IDs. Use CloudFormation StackSets when you must deploy the same template to many accounts or regions — nested stacks and cross-stack references don't cross account boundaries. For SAP-C02, "org-wide baseline" always maps to StackSets with service-managed permissions; "modular VPC composed of subnets" maps to nested stacks; "shared foundational IAM consumed by multiple applications" maps to cross-stack references via SSM Parameter Store across accounts.
Q2: How do I choose between CloudFormation, CDK, and SAM for a new workload?
Choose SAM for pure-serverless workloads (Lambda, API Gateway, DynamoDB, Step Functions, EventBridge) when the team values short templates and best-in-class local development. Choose CDK for mixed workloads (serverless + containers + VMs + RDS) or when the team wants programmatic reuse, typed infrastructure, and a single language across everything. Choose plain CloudFormation YAML when the estate is already heavily CloudFormation-based, when you want the simplest possible authoring without a build step, or when you're publishing a reusable template consumed by Service Catalog or StackSets. All three synthesize to CloudFormation at the end — you can mix them (CDK for new services, CloudFormation templates consumed via CfnInclude, SAM for serverless subsystems).
Q3: What is the canonical pattern for deploying to 20 AWS accounts with an auditable pipeline?
From the management account, enable Organizations with all features and designate a Deployments account as the CloudFormation StackSets delegated administrator. In the Deployments account, run a CodePipeline that: pulls the template from CodeCommit or GitHub via CodeStar Connections, runs CodeBuild for linting (cfn-lint, cfn-nag, tfsec for Terraform) and unit tests, creates a change set on a StackSet with service-managed permissions targeting specific OUs, pauses on a manual approval action with SNS notification to SecOps, then executes the change set which deploys to all targeted accounts in parallel (with the configured concurrency and failure tolerance). Enable automatic deployment so new accounts added to the OU get the stack on day zero. Schedule drift detection via EventBridge. Lock everything with a root SCP that denies direct cloudformation:* on baseline stack name patterns except by the pipeline role. Every step is logged in CloudTrail org trail, giving full audit.
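The StackSets step of that pipeline can be sketched as the two boto3-style parameter dicts the deploy action would pass — a hypothetical stack set name and OU ID, but real API parameter names — so the concurrency and failure-tolerance knobs are visible:

```python
# Parameters for cloudformation.create_stack_set(...): service-managed
# permissions plus automatic deployment to new OU accounts.
create_stack_set_params = {
    "StackSetName": "org-iam-baseline",
    "TemplateBody": "...",                 # the linted template artifact
    "PermissionModel": "SERVICE_MANAGED",  # Organizations manages the roles
    "AutoDeployment": {                    # day-zero coverage for new accounts
        "Enabled": True,
        "RetainStacksOnAccountRemoval": False,
    },
}

# Parameters for cloudformation.create_stack_instances(...): OU targeting
# and operation preferences for the rollout.
create_instances_params = {
    "StackSetName": "org-iam-baseline",
    "DeploymentTargets": {"OrganizationalUnitIds": ["ou-example-11111111"]},
    "Regions": ["us-east-1", "eu-west-1"],
    "OperationPreferences": {
        "RegionConcurrencyType": "PARALLEL",
        "MaxConcurrentCount": 5,     # at most 5 accounts in flight at once
        "FailureToleranceCount": 0,  # halt the operation on the first failure
    },
}

assert create_stack_set_params["PermissionModel"] == "SERVICE_MANAGED"
assert create_instances_params["OperationPreferences"]["FailureToleranceCount"] == 0
```

FailureToleranceCount of 0 is the conservative baseline choice: one bad account stops the wave instead of propagating a broken template to all twenty.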
Q4: Should I use canary or linear for a Lambda deployment?
Use canary when you want a clear two-step shift with a bake window for human review or alarm observation — the classic pattern is LambdaCanary10Percent10Minutes, giving 10 minutes of canary traffic before full cutover. Canary fits when you have confidence that 10 minutes of data is enough to catch regressions (good telemetry, high traffic). Use linear when you want a smoother ramp — 10 percent every minute over 10 minutes — which reduces the "jump" when the canary phase ends. Linear fits when alarms need more data points and traffic is steady; the gradual ramp lets each step's alarm evaluation feed into the next. Both auto-rollback on CloudWatch alarm. For low-traffic Lambdas where 10 percent of little is still too little data, consider lengthening the bake time (LambdaCanary10Percent30Minutes) or preferring blue/green-like staging tests before prod deploy.
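The shape difference between the two configs is easiest to see as the traffic percentage on the new version over time — computed locally here, modeling LambdaCanary10Percent10Minutes against LambdaLinear10PercentEvery1Minute:

```python
def canary_traffic(minute: int, initial: int = 10, bake_minutes: int = 10) -> int:
    """Percent of traffic on the new version: flat canary, then full cutover."""
    return initial if minute < bake_minutes else 100

def linear_traffic(minute: int, step: int = 10, interval_minutes: int = 1) -> int:
    """Percent of traffic on the new version: a fixed step every interval."""
    return min(100, step * (minute // interval_minutes + 1))

canary = [canary_traffic(m) for m in range(11)]
linear = [linear_traffic(m) for m in range(11)]

# Canary holds at 10% for the whole bake window, then jumps 10% -> 100%.
assert canary == [10] * 10 + [100]
# Linear walks up smoothly, so no single step exposes 90% of users at once.
assert linear == [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 100]
```

The 90-point jump at the end of the canary window is the risk the linear config trades away — in exchange for exposing progressively more users while alarms are still evaluating.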
Q5: How does ECS blue/green with CodeDeploy actually work end-to-end?
The ECS service is configured with deploymentController: CODE_DEPLOY. You have an ALB with two listeners — a production listener on port 80/443 and a test listener on a different port (e.g., 8080) — and two target groups (blue and green). The CodeDeploy deployment group references the cluster, service, both target groups, both listeners, and a deployment configuration. When a deploy starts, CodeDeploy creates a new ECS task set registered against the green target group. You can optionally route test traffic through the test listener for pre-production validation; a pre-traffic Lambda hook runs integration tests against the test listener. If the hook passes, CodeDeploy shifts the production listener from blue to green per the deployment configuration (all-at-once, canary, or linear). A post-traffic hook runs smoke tests after cutover. If any CloudWatch alarm in the deployment group's alarm list trips, CodeDeploy auto-rolls back — the production listener flips back to blue, and the green task set is terminated. This gives you rollback in seconds without a second deploy.
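The AppSpec that drives that flow can be sketched as follows — the hook names are the CodeDeploy-for-ECS lifecycle events (AfterAllowTestTraffic is where test-listener validation runs), while the container name and Lambda function names are hypothetical:

```python
# ECS blue/green AppSpec, shown as a Python dict for readability.
appspec = {
    "version": 0.0,
    "Resources": [{
        "TargetService": {
            "Type": "AWS::ECS::Service",
            "Properties": {
                "TaskDefinition": "<TASK_DEFINITION>",  # substituted per revision
                "LoadBalancerInfo": {
                    "ContainerName": "web",
                    "ContainerPort": 8080,
                },
            },
        }
    }],
    "Hooks": [
        # Integration tests against the test listener, before any
        # production traffic reaches the green task set.
        {"AfterAllowTestTraffic": "IntegrationTestHookFn"},
        # Smoke tests after the production listener cuts over to green.
        {"AfterAllowTraffic": "SmokeTestHookFn"},
    ],
}

hook_events = [name for hook in appspec["Hooks"] for name in hook]
assert hook_events == ["AfterAllowTestTraffic", "AfterAllowTraffic"]
```

If either hook's Lambda reports failure (or a deployment-group alarm trips), CodeDeploy rolls back by flipping the production listener to blue — no redeploy required.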
Q6: When would I use AWS AppConfig instead of redeploying?
Use AppConfig for configuration and feature flags that must change at runtime without a code deploy. Typical uses: toggling a feature on/off for a percentage of users, raising a rate-limit threshold, switching between algorithms, enabling a maintenance-mode banner, rolling out a regional feature. AppConfig gives you the same guardrails as code deploys — validation (Lambda or JSON Schema), staged rollout (linear, canary, all-at-once), and automatic rollback on CloudWatch alarm — but the rollout is measured in seconds because nothing is redeployed, only fetched at the next poll interval. SAP-C02 scenarios that say "reverse a config change in seconds", "feature flag rollout", or "kill switch" all point to AppConfig. Don't use AppConfig for secrets (use Secrets Manager) or for large static assets (use S3). Do use it for JSON/YAML config blobs up to 1 MiB and for feature-flag profiles.
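The consumer side of that pattern is a poll-and-cache loop. Here is a minimal sketch with the fetcher stubbed out; in a real service it would call the AppConfig Data API (StartConfigurationSession / GetLatestConfiguration) or go through the AppConfig agent sidecar, and the flag name is hypothetical:

```python
import json
import time

class FlagClient:
    """Caches a JSON config blob and refreshes it at a poll interval."""

    def __init__(self, fetch, poll_interval_s: float = 45.0):
        self._fetch = fetch              # stub for the AppConfig fetch call
        self._interval = poll_interval_s
        self._cached: dict = {}
        self._last_poll = float("-inf")  # force a fetch on first use

    def flag(self, name: str, default: bool = False) -> bool:
        now = time.monotonic()
        if now - self._last_poll >= self._interval:
            self._cached = json.loads(self._fetch())  # refresh from source
            self._last_poll = now
        return self._cached.get(name, default)

# Simulated rollout: the first poll sees the flag off; the config is
# flipped server-side; the next poll sees it on — no redeploy anywhere.
responses = iter(['{"maintenance_mode": false}', '{"maintenance_mode": true}'])
client = FlagClient(lambda: next(responses), poll_interval_s=0.0)
assert client.flag("maintenance_mode") is False
assert client.flag("maintenance_mode") is True
```

The poll interval is the rollback latency: reverting a bad config in AppConfig propagates to every instance within one interval, which is why the scenarios say "seconds" rather than "minutes".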
Q7: How do I deploy CodePipeline across multiple AWS accounts?
Host the pipeline in a dedicated pipeline account (often the Deployments account). Encrypt the artifact S3 bucket with a customer-managed KMS key whose key policy grants kms:Decrypt and kms:GenerateDataKey to roles in every target account; also ensure the S3 bucket policy allows s3:GetObject* from target-account roles. In each target account, create a CloudFormation deployment role trusted by the pipeline's CodePipeline service role (the trust policy narrows access with an aws:PrincipalArn or sts:ExternalId condition). In the pipeline, the action-level RoleArn specifies the cross-account role to assume, while the CloudFormation deploy action's configuration carries a second RoleArn — the service role CloudFormation itself assumes to provision resources. Test the trust chain first with a minimal "list-stacks" action before trying a full deploy. Use CDK Pipelines if you want this wiring generated automatically from code.
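Two policy documents do most of the work in that trust chain. A sketch with hypothetical account IDs and role names — the target-account role's trust policy, and the artifact key's policy statement for a deploy role:

```python
PIPELINE_ACCOUNT = "111111111111"
TARGET_ACCOUNT = "222222222222"

# Trust policy on the target account's deployment role: trust the
# pipeline account, but only its pipeline role specifically.
target_account_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{PIPELINE_ACCOUNT}:root"},
        "Action": "sts:AssumeRole",
        "Condition": {
            "ArnEquals": {
                "aws:PrincipalArn":
                    f"arn:aws:iam::{PIPELINE_ACCOUNT}:role/pipeline-role"
            }
        },
    }],
}

# Statement added to the artifact KMS key's policy so the target-account
# role can decrypt pipeline artifacts. Key policies reference the key
# itself as Resource "*".
artifact_key_policy_statement = {
    "Effect": "Allow",
    "Principal": {"AWS": f"arn:aws:iam::{TARGET_ACCOUNT}:role/cfn-deploy-role"},
    "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
    "Resource": "*",
}

assert target_account_trust_policy["Statement"][0]["Action"] == "sts:AssumeRole"
assert "kms:Decrypt" in artifact_key_policy_statement["Action"]
```

A default AWS-managed S3 key will not work here — only a customer-managed key can carry the cross-account grants, which is why the artifact bucket must use one.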
Q8: How do I detect and auto-remediate CloudFormation drift at scale?
Combine three mechanisms. (1) CloudFormation drift detection on each stack set or stack, scheduled via EventBridge every N hours (DetectStackDrift or DetectStackSetDrift API). (2) AWS Config rule cloudformation-stack-drift-detection-check as a detective control that records non-compliance on drifted stacks. (3) EventBridge rule matching CloudFormation drift detection completion events, routing drifted-stack events to either an SNS topic for SecOps alerting or a Systems Manager Automation runbook for auto-remediation. For known-safe drift categories (e.g., security group rule additions) a runbook can re-apply the CloudFormation template. For unknown drift, alert humans and freeze the stack until investigation. At Pro scale, also attach a root-level SCP that denies manual resource edits on tag-identified baseline resources so drift is prevented at the source. Pair this with Control Tower drift detection on the landing-zone level for complete coverage.
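Mechanism (3) hinges on the EventBridge rule pattern. A sketch of that pattern plus a toy matcher showing which events it routes — the detail-type is the documented CloudFormation event type, but the detail field names are written from memory and should be verified against the CloudFormation event reference:

```python
# Rule pattern: only completed drift detections that found drift.
drift_rule_pattern = {
    "source": ["aws.cloudformation"],
    "detail-type": ["CloudFormation Drift Detection Status Change"],
    "detail": {"status-details": {"stack-drift-status": ["DRIFTED"]}},
}

def matches(pattern: dict, event: dict) -> bool:
    """Toy subset of EventBridge matching: each pattern leaf lists
    allowed values; nested dicts recurse; extra event keys are ignored."""
    for key, expected in pattern.items():
        value = event.get(key)
        if isinstance(expected, dict):
            if not isinstance(value, dict) or not matches(expected, value):
                return False
        elif value not in expected:
            return False
    return True

drifted = {
    "source": "aws.cloudformation",
    "detail-type": "CloudFormation Drift Detection Status Change",
    "detail": {"status-details": {"stack-drift-status": "DRIFTED",
                                  "detection-status": "DETECTION_COMPLETE"}},
}
in_sync = {
    "source": "aws.cloudformation",
    "detail-type": "CloudFormation Drift Detection Status Change",
    "detail": {"status-details": {"stack-drift-status": "IN_SYNC",
                                  "detection-status": "DETECTION_COMPLETE"}},
}

assert matches(drift_rule_pattern, drifted)      # routed to SNS/Automation
assert not matches(drift_rule_pattern, in_sync)  # dropped by the rule
```

Filtering on DRIFTED in the rule itself keeps the SecOps topic quiet: in-sync completions never leave EventBridge.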
Q9: What is the difference between CDK Pipelines and CodePipeline?
CodePipeline is the underlying AWS service — a linear workflow of stages containing actions. You write the pipeline definition in CloudFormation, CDK, SAM, or the console, and AWS runs it. CDK Pipelines (aws-cdk-lib.pipelines) is a higher-level CDK construct library that generates a CodePipeline from your CDK application code. The signature feature is self-mutation: when you add a new stage or environment in your CDK code, the pipeline's first action rebuilds itself to include the new stage. CDK Pipelines also handles cross-account CDK bootstrap role wiring automatically, sets up artifact encryption with best-practice defaults, and integrates naturally with CDK applications' stages (dev/staging/prod as CDK stages, each targeting a different environment). For SAP-C02 "pipeline with multi-environment promotion and minimal ongoing maintenance" answers, CDK Pipelines is the native choice if you're already in CDK.
Q10: How does SAR (Serverless Application Repository) fit into a multi-team organization?
A platform team packages a reusable serverless pattern (say, a "standard event-driven Lambda with DLQ, structured logging, X-Ray tracing, and CloudWatch dashboard") as a SAM application and publishes it to SAR. Publish scope can be private to the AWS account, shared with specific AWS accounts, shared with the AWS Organization, or public. Consuming teams deploy the SAR app in their account with a single AWS::Serverless::Application resource referencing the SAR application ARN and version — SAR synthesizes a nested CloudFormation stack into the consumer's account. Benefits: one source of truth for reusable patterns, semantic versioning with explicit consumer opt-in to updates, and SAM-style parameter passing for customization. SAR complements (not replaces) AWS Service Catalog: SAR is serverless-specific and self-service with SAM conventions; Service Catalog is generic CloudFormation wrapped in governance (launch constraints, TagOptions, portfolio sharing) and fits non-serverless architectures too.
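The consumer side is a single resource in the team's own SAM template. A sketch with a hypothetical application ARN and parameter — SAR expands this resource into a nested stack in the consumer's account:

```python
# Consumer SAM template (as a Python dict) pulling in the platform
# team's published SAR application at a pinned semantic version.
consumer_template = {
    "Transform": "AWS::Serverless-2016-10-31",
    "Resources": {
        "EventHandler": {
            "Type": "AWS::Serverless::Application",
            "Properties": {
                "Location": {
                    "ApplicationId": ("arn:aws:serverlessrepo:us-east-1:"
                                      "123456789012:applications/"
                                      "standard-event-handler"),
                    "SemanticVersion": "1.4.0",  # explicit opt-in to updates
                },
                # SAM-style parameter passing for per-team customization
                "Parameters": {"LogLevel": "INFO"},
            },
        }
    },
}

props = consumer_template["Resources"]["EventHandler"]["Properties"]
assert props["Location"]["SemanticVersion"] == "1.4.0"
```

Pinning SemanticVersion is the consumer opt-in mentioned above: the platform team can publish 1.5.0 without touching anyone, and each team upgrades by bumping one field in their own template.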
Further Reading
- AWS CloudFormation StackSets
- CloudFormation StackSets Concepts — Self-Managed vs Service-Managed
- CloudFormation Nested Stacks
- CloudFormation Cross-Stack References
- CloudFormation Change Sets
- CloudFormation Drift Detection
- AWS CDK v2 Developer Guide
- AWS CDK Constructs (L1/L2/L3)
- AWS CDK Pipelines
- AWS SAM Developer Guide
- AWS SAM — Gradual Deployments
- AWS Serverless Application Repository
- AWS CodePipeline User Guide
- CodePipeline Cross-Account Actions
- AWS CodeDeploy User Guide
- CodeDeploy Deployment Configurations
- CodeDeploy Blue/Green for Amazon ECS
- Lambda Aliases and Traffic Shifting
- AWS AppConfig User Guide
- AWS Service Catalog Administrator Guide
- AWS Systems Manager Automation
- Terraform AWS Provider Best Practices (AWS Prescriptive Guidance)
- AWS SAP-C02 Exam Guide (PDF)