examhub .cc The most efficient path to the most valuable certifications.

Data Governance, Backup, and Compliance Controls

6,400 words · ≈ 32 min read

Data governance on AWS is the practice of controlling what data you keep, where you keep it, how long you keep it, who can touch it, and how you prove it to an auditor. On the SAA-C03 exam, data governance and compliance controls show up in Task Statement 1.3 ("Determine appropriate data security controls") and are tested at a scenario level: you are given a regulatory requirement (HIPAA, PCI DSS, GDPR, ISO 27001, financial records retention) and asked to pick the combination of AWS services that satisfies it. The core services you must know cold are Amazon S3 Object Lock, AWS Backup, AWS CloudTrail, AWS Config, Amazon Macie, AWS Audit Manager, AWS Artifact, and supporting primitives like S3 Versioning and S3 Replication. Together they implement data retention, immutability, auditability, data classification, continuous compliance, and evidence collection, the six pillars of data governance compliance.

This guide walks through every construct in plain language, highlights the exam traps (governance mode vs compliance mode, Versioning vs Object Lock, AWS Backup vs manual snapshots, management events vs data events), and pairs each service with memorable analogies so the vocabulary sticks. Master this topic and you own the data governance compliance questions on SAA-C03, plus the follow-on compliance scenarios that overlap with data encryption and multi-account governance.

What is Data Governance on AWS?

Data governance is the overall framework that decides how an organization's data is produced, stored, classified, protected, retained, deleted, and audited. On AWS, data governance is never a single service — it is a stack of AWS services that each own one layer of the framework. Retention and immutability live in Amazon S3 Object Lock. Centralized backup policies live in AWS Backup. Audit trails live in AWS CloudTrail. Continuous configuration compliance lives in AWS Config. Automated data classification lives in Amazon Macie. Evidence collection for auditors lives in AWS Audit Manager. Downloadable compliance attestations (SOC reports, ISO certifications, PCI DSS AOC) live in AWS Artifact.

Data governance compliance on SAA-C03 is mostly about picking the right AWS service for a given regulatory or business requirement. The regulations themselves (HIPAA, PCI DSS, GDPR, ISO 27001, SEC 17a-4, FINRA, CJIS) are not tested deeply — you are expected to recognize the scenario clues and know which AWS service combination addresses them. Most questions map to one of five archetypes:

  1. "We need WORM storage that even an administrator cannot delete for 7 years." → Amazon S3 Object Lock in compliance mode.
  2. "We need one dashboard and one policy to back up EBS, RDS, DynamoDB, and EFS." → AWS Backup.
  3. "We need to know who called DeleteObject on our sensitive bucket at 02:13 last Tuesday." → AWS CloudTrail data events.
  4. "We need continuous evaluation that all S3 buckets block public access." → AWS Config rules.
  5. "We need to find PII in our S3 buckets automatically." → Amazon Macie.

Key terms:

  • WORM (Write-Once-Read-Many): object cannot be overwritten or deleted until the retention window expires.
  • Retention period: a specific number of days or a fixed expiry date during which an object is immutable.
  • Legal hold: an indefinite immutability flag that persists until a user with the right permission explicitly removes it.
  • Governance mode (S3 Object Lock): immutability that a privileged user (s3:BypassGovernanceRetention) can override.
  • Compliance mode (S3 Object Lock): immutability that no one, not even the AWS account root user, can override until the retention period ends.
  • Management events (CloudTrail): control-plane API calls (IAM changes, VPC changes, bucket creation).
  • Data events (CloudTrail): data-plane API calls (S3 GetObject, Lambda Invoke, DynamoDB item access).
  • Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock-overview.html

Why data governance and compliance matters for SAA-C03

Task 1.3 of the SAA-C03 exam guide asks you to "determine appropriate data security controls", and compliance-driven scenarios make up a meaningful share of Domain 1 (Design Secure Architectures), which carries 30% of the scored content. You will see questions phrased like "a healthcare company must keep patient records immutable for 6 years" or "a financial services firm must prove that no administrator can delete audit logs". Both point to S3 Object Lock compliance mode plus CloudTrail log file integrity validation. Recognizing those clues in the first read-through is the fastest path to scoring on this topic.

Plain-Language Explanation: Data Governance and Compliance on AWS

Abstract policy language becomes concrete when you tie it to everyday systems. Here are three distinct analogies covering every major data governance compliance construct on SAA-C03.

Analogy 1: The Safety Deposit Box at the Bank

Picture a bank vault full of safety deposit boxes. Amazon S3 Object Lock compliance mode is like a box with a time-release lock set by the bank's compliance officer — even the bank president cannot open it early, because the hardware is physically rated to refuse. The bank's internal audit team — AWS CloudTrail — records every visit: who opened the vault door, at what time, what keycard they used, and which box they touched. The bank's filing cabinet with periodic inspections is AWS Config, continuously checking "is each box still in compliance with the vault rules? did someone accidentally change the lock setting?" The cleaning service that scans abandoned boxes for contraband is Amazon Macie — it opens boxes (via machine learning on object content) and flags ones that contain personally identifiable information the customer shouldn't have stored unencrypted. The annual government auditor who shows up with a checklist is AWS Audit Manager, and the binder full of certifications the bank hands the auditor — SOC 2, ISO 27001, PCI DSS — is AWS Artifact. Finally, the bank's off-site backup vault in another city is AWS Backup plus S3 Cross-Region Replication — a copy of every box in a geographically separated facility so a single-building disaster never wipes out the records.

Governance mode in this analogy is a box with a time-release lock that the bank president can override in an emergency — it is still hard to open, but there is an escape hatch for authorized staff with the right key. Compliance mode is the box where no override exists — even the president cannot open it until the clock hits zero. That irreversibility is the whole point; it is what makes compliance mode satisfy SEC 17a-4(f), FINRA Rule 4511, and similar WORM-storage regulations.

Analogy 2: The Hospital Medical Records Room

Think about how a hospital handles patient records. Every chart is tagged with a retention label — "retain for 7 years after last treatment" — which maps directly to an S3 Object Lock retain-until-date. When a patient files a malpractice lawsuit, a legal hold tag is added so the chart cannot be purged even after its retention expires — exactly what an S3 Object Lock legal hold does (indefinite, separate from retention period, removable only by users with s3:PutObjectLegalHold permission).

The sign-in log at the records room door is AWS CloudTrail management events — who entered, who requested a chart. The request slip for an individual chart is CloudTrail data events — who read which specific patient file. Most hospitals log the door but not every chart (because that would be expensive) — same with CloudTrail: management events are free and on by default for the last 90 days in the Event history; data events are opt-in and billed per event because enabling them for every S3 object in a large bucket can be costly.

The compliance officer who walks around with a checklist every morning ("is the door locked? are the cabinet keys in the right drawer? are retention labels attached to every new chart?") is AWS Config. The machine that auto-scans new charts for Social Security numbers and credit card numbers is Amazon Macie. The filing cabinet that gets photocopied and sent to off-site storage each week is AWS Backup. And when a state regulator arrives for a HIPAA audit, the hospital's evidence binder — scanned logs, policy documents, incident reports — is what AWS Audit Manager assembles automatically.

Analogy 3: The Museum Archive with Climate-Controlled Storage

A museum archive captures the full AWS data governance compliance stack. The climate-controlled vault is Amazon S3 with encryption at rest. The archival boxes sealed with tamper-evident tape are S3 Object Lock objects in compliance mode — the tape cannot be removed without leaving visible evidence, and in compliance mode the tape physically cannot be removed until the retention period ends. The security camera recording every person who enters is CloudTrail. The weekly catalog inspection — "is every artifact in its correct location, in its correct storage class, with its correct humidity setting?" — is AWS Config. The art-historian's review of newly acquired items to flag pieces that need special handling is Amazon Macie classifying sensitive data. When a new accreditation cycle comes, the curator's team uses AWS Audit Manager to gather evidence across every controls framework — ISO 27001, NIST CSF, SOC 2 — and produces the report, while the museum's framed certification documents hanging in the director's office are pulled from AWS Artifact.

On exam day, when you see "WORM", "immutable", "retention", or "legal hold", mentally picture the bank safety deposit box with a time-release lock. When you see "who did what", picture the hospital sign-in log (CloudTrail). When you see "continuously evaluate", picture the museum weekly catalog inspection (AWS Config). Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock-overview.html

Data Classification and Retention — Regulatory Frameworks Overview

Every data governance compliance decision starts with classification (figuring out what kind of data you have) and retention (deciding how long you must keep it and when you must delete it).

Data classification categories

SAA-C03 does not test a specific classification taxonomy, but most real-world frameworks use three tiers: public, internal, and restricted/confidential. Restricted data — PHI (Protected Health Information), PCI (cardholder data), PII (Personally Identifiable Information), financial records, export-controlled data, student records — drives almost every compliance requirement you will see on the exam. Amazon Macie is the AWS service that automates classification by scanning Amazon S3 buckets with ML-backed identifiers for these categories.

Retention vs deletion

Retention has two directions. Minimum retention says you must keep the data at least N years (SEC 17a-4 requires 6 years for broker-dealer records; HIPAA requires 6 years for audit logs). Maximum retention says you must delete the data by a certain point (GDPR requires deletion of personal data once the lawful basis for processing ends). AWS services line up with both:

  • Minimum retention → S3 Object Lock with retain-until-date; S3 Versioning + MFA Delete; AWS Backup vault lock.
  • Maximum retention / scheduled deletion → S3 Lifecycle rules with expiration actions; AWS Backup lifecycle to cold storage then delete.
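
As a rough sketch of the maximum-retention side, the dictionary below mirrors the request shape that boto3's put_bucket_lifecycle_configuration accepts for its LifecycleConfiguration argument. The rule ID, prefix, and day count are hypothetical, and nothing here calls AWS; it only builds and prints the structure.

```python
import json


def expiration_rule(rule_id: str, prefix: str, days: int) -> dict:
    """Build one S3 Lifecycle rule that expires objects under a prefix.

    The values are illustrative; derive them from your own retention policy.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        # On a versioned bucket this adds a delete marker to the current version.
        "Expiration": {"Days": days},
        # Versions are permanently removed this many days after becoming noncurrent.
        "NoncurrentVersionExpiration": {"NoncurrentDays": days},
    }


# Hypothetical rule: personal data under leads/ must not outlive one year.
lifecycle_configuration = {"Rules": [expiration_rule("expire-leads", "leads/", 365)]}
print(json.dumps(lifecycle_configuration, indent=2))
```

Note that a lifecycle expiration cannot remove an object version that is still protected by Object Lock, so minimum-retention and maximum-retention settings on the same data need to agree with each other.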

Why frameworks matter

Common compliance frameworks tested at scenario level:

  • HIPAA (healthcare) — requires encryption at rest and in transit, audit logging, access controls, minimum 6-year audit log retention. Maps to S3 SSE-KMS + CloudTrail + S3 Object Lock.
  • PCI DSS (payment cards) — requires segmentation, encryption, quarterly vulnerability scans, logging, and audit log retention of at least 1 year with the most recent 3 months immediately available. Maps to VPC isolation + KMS + CloudTrail + AWS Config Conformance Pack for PCI DSS.
  • GDPR (EU personal data) — requires lawful basis, data subject rights, breach notification, data residency. Maps to region selection + Macie discovery + CloudTrail auditability + encryption.
  • ISO 27001 (general infosec management system) — broad controls catalog; maps to AWS Config conformance pack for ISO 27001 + Audit Manager framework.
  • SEC 17a-4(f) / FINRA 4511 (financial records) — requires WORM immutable storage. Maps to S3 Object Lock compliance mode specifically.

AWS maintains certifications, attestations, or compliance programs covering HIPAA, PCI DSS, GDPR, ISO 27001, SOC 1/2/3, FedRAMP, and many more, but those cover the AWS infrastructure (compliance of the cloud). You, the customer, are responsible for configuring your workloads to meet those same frameworks (compliance in the cloud). AWS Artifact gives you the AWS-side attestations; AWS Audit Manager helps you generate the customer-side evidence. Reference: https://docs.aws.amazon.com/artifact/latest/ug/what-is-aws-artifact.html

Amazon S3 Object Lock

Amazon S3 Object Lock is the AWS data governance feature that delivers Write-Once-Read-Many (WORM) storage in Amazon S3. Once an object version is locked, it cannot be overwritten or deleted for the duration of the lock. This is the single most-tested data governance control on SAA-C03 because it is the AWS-native answer for every "immutable log", "immutable financial record", and "ransomware-resistant backup" scenario.

Prerequisites

S3 Object Lock has two hard prerequisites:

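  • Object Lock must be enabled on the bucket. The simplest path is to enable it when the bucket is created; since late 2023 it can also be enabled on an existing bucket that already has versioning turned on (earlier this required contacting AWS Support).
  • S3 Versioning must be enabled on the bucket, because Object Lock operates at the object version level, not the object name level. A new PUT of the same key creates a new version that is locked only if it carries its own retention setting or the bucket has a default retention; the locked old version is untouched either way.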

Retention modes: governance vs compliance

S3 Object Lock has two retention modes, and the difference between them is the #1 exam trap for this topic.

  • Governance mode protects an object version from deletion unless the caller has the s3:BypassGovernanceRetention IAM permission. This is useful for internal change-control workflows where you want to prevent accidental deletion but keep an escape hatch for a privileged user with a justified reason.
  • Compliance mode protects an object version absolutely — not even the AWS account root user can delete the object version or shorten the retention period until the retain-until-date has passed. The retention period can be extended (lengthened) but never shortened or removed once set. This is the only mode that satisfies strict regulatory WORM requirements like SEC 17a-4(f).

A recurring SAA-C03 trap: the question describes a scenario requiring "no one, not even an administrator, can delete the data for 7 years" and the choices include both modes. Compliance mode is the only correct answer — governance mode allows bypass by a privileged IAM principal. If the scenario mentions "SEC 17a-4", "FINRA", "WORM compliance", or "even the root user cannot delete", it is compliance mode. If the scenario mentions "prevent accidental deletion but allow override with justification", it is governance mode. Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock-overview.html

Retention periods

A retention period is a specific number of days or years or a specific retain-until-date associated with an object version. Every PUT can carry an explicit retention setting, or the bucket can have a default retention (mode + days/years) that is applied automatically to every new object. Common SAA-C03 patterns:

  • Default retention of 2555 days (≈7 years) in compliance mode on a dedicated audit-log bucket.
  • Default retention of 365 days in governance mode on an invoices bucket with a planned override workflow.
  • Per-object retention written by the application for individual records with variable hold periods.
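
To make the first pattern concrete, here is a minimal sketch of a bucket default retention. The dictionary follows the ObjectLockConfiguration shape used by boto3's put_object_lock_configuration; the helper names are made up for illustration and no AWS call is made.

```python
from datetime import datetime, timedelta, timezone


def default_retention(mode: str, days: int) -> dict:
    """Return an ObjectLockConfiguration dict with a bucket default retention."""
    if mode not in ("GOVERNANCE", "COMPLIANCE"):
        raise ValueError("mode must be GOVERNANCE or COMPLIANCE")
    if days < 1:
        raise ValueError("days must be positive")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Days": days}},
    }


def retain_until(uploaded_at: datetime, days: int) -> datetime:
    """Compute the retain-until-date that a default of `days` would produce."""
    return uploaded_at + timedelta(days=days)


audit_log_lock = default_retention("COMPLIANCE", 7 * 365)  # 2555 days
uploaded = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(audit_log_lock)
print(retain_until(uploaded, 2555).date())
```

For an upload on 2024-01-01, 2555 days ends on 2030-12-30, two days short of seven calendar years because of the leap days in between. Where a regulation is stated in calendar years, the Years field of DefaultRetention may be the safer choice.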

A legal hold is a separate, indefinite protection flag. It is independent of retention — an object can have a legal hold with no retention, a retention with no legal hold, or both at once. A legal hold stays in place until a user with s3:PutObjectLegalHold permission explicitly removes it. Legal holds are the answer whenever a scenario says "a lawsuit, investigation, or subpoena requires the data be preserved indefinitely beyond its normal retention period" — you cannot predict when the hold will end, so a fixed retain-until-date is the wrong answer.

Retention period is a clock; legal hold is a flag. When the retention clock runs out, the object becomes deletable (subject to any legal hold). When a legal hold is active, the object is undeletable regardless of retention status. Both can coexist on the same object version. Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock-overview.html
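
The clock-versus-flag rule, combined with the two retention modes, can be written as a small decision function. This is a simplified model for study purposes, not S3's actual authorization logic: LockedVersion and can_delete are hypothetical names, and bypass_governance stands for a caller who both holds s3:BypassGovernanceRetention and explicitly requests the bypass.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class LockedVersion:
    mode: Optional[str]               # "GOVERNANCE", "COMPLIANCE", or None
    retain_until: Optional[datetime]  # None when no retention is set
    legal_hold: bool = False


def can_delete(version: LockedVersion, now: datetime, bypass_governance: bool = False) -> bool:
    """Simplified model of whether a permanent version delete would succeed."""
    if version.legal_hold:
        return False  # the flag blocks deletion regardless of the clock
    retention_active = (
        version.mode is not None
        and version.retain_until is not None
        and now < version.retain_until
    )
    if not retention_active:
        return True   # the clock has run out, or was never set
    if version.mode == "GOVERNANCE":
        return bypass_governance  # escape hatch for privileged callers only
    return False      # COMPLIANCE: nobody deletes before retain_until


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
future = datetime(2030, 1, 1, tzinfo=timezone.utc)
past = datetime(2020, 1, 1, tzinfo=timezone.utc)

print(can_delete(LockedVersion("GOVERNANCE", future), now, bypass_governance=True))
print(can_delete(LockedVersion("COMPLIANCE", future), now, bypass_governance=True))
print(can_delete(LockedVersion("COMPLIANCE", past, legal_hold=True), now))
```

The three calls cover the exam cases: a governance lock that a privileged caller bypasses, a compliance lock that ignores the same privilege, and an expired retention that a legal hold still protects.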

S3 Object Lock for ransomware resilience

A common SAA-C03 scenario is "protect backups from ransomware". The answer is a dedicated backup bucket with Object Lock compliance mode plus a reasonable retention period (30–90 days). Even if an attacker compromises AWS credentials with s3:* permissions, they cannot delete or overwrite the existing locked object versions; at worst they can add new versions on top, and the locked versions remain recoverable. This is why AWS Backup offers the analogous Vault Lock feature for backup vaults, and why many WORM-compliance scenarios overlap with ransomware recovery scenarios on SAA-C03.

AWS Backup — Centralized Policy-Driven Backup Across Services

AWS Backup is the AWS service that centralizes backup policy, execution, monitoring, and restoration across many AWS services from a single dashboard. It replaces a patchwork of service-specific snapshot scripts with a single, auditable control plane.

Services AWS Backup supports (for SAA-C03)

You do not need to memorize the full list, but know that AWS Backup covers all the major stateful AWS services tested on SAA-C03:

  • Amazon EBS (volume snapshots)
  • Amazon RDS (DB snapshots including Aurora)
  • Amazon DynamoDB (on-demand and continuous PITR backups)
  • Amazon EFS (file system backups)
  • Amazon FSx (Windows File Server, Lustre, NetApp ONTAP, OpenZFS)
  • Amazon EC2 (machine images)
  • AWS Storage Gateway (volume backups)
  • Amazon S3 (bucket-level backup with point-in-time recovery)
  • Amazon Redshift (cluster snapshots)

Backup plans, vaults, and policies

A backup plan is the reusable schedule-and-retention template — for example, "daily backup at 01:00 UTC, retain for 35 days, plus monthly backup on the 1st retained for 7 years". A backup vault is the encrypted container where recovery points are stored, with its own AWS KMS key and access policy. Tag-based resource assignment is how you map resources to a plan without listing them individually — tag every production RDS instance with Backup=daily-prod and the plan automatically picks them up.
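
The example plan from the paragraph above might look roughly like this when expressed in the shapes that boto3's create_backup_plan and create_backup_selection accept. The plan name, vault name, account ID, and role ARN are placeholders, and selected_by_tags is a toy helper that only imitates tag-based assignment locally.

```python
backup_plan = {
    "BackupPlanName": "prod-daily-and-monthly",
    "Rules": [
        {
            "RuleName": "daily-35d",
            "TargetBackupVaultName": "prod-vault",
            "ScheduleExpression": "cron(0 1 * * ? *)",  # every day at 01:00 UTC
            "Lifecycle": {"DeleteAfterDays": 35},
        },
        {
            "RuleName": "monthly-7y",
            "TargetBackupVaultName": "prod-vault",
            "ScheduleExpression": "cron(0 1 1 * ? *)",  # 1st of each month, 01:00 UTC
            "Lifecycle": {"DeleteAfterDays": 2555},
        },
    ],
}

backup_selection = {
    "SelectionName": "tagged-daily-prod",
    "IamRoleArn": "arn:aws:iam::111122223333:role/ExampleBackupRole",  # placeholder
    "ListOfTags": [
        {"ConditionType": "STRINGEQUALS", "ConditionKey": "Backup", "ConditionValue": "daily-prod"}
    ],
}


def selected_by_tags(resource_tags: dict, selection: dict) -> bool:
    """Toy check: does a resource's tag set satisfy any condition in the selection?"""
    return any(
        resource_tags.get(cond["ConditionKey"]) == cond["ConditionValue"]
        for cond in selection["ListOfTags"]
    )


print(selected_by_tags({"Backup": "daily-prod", "Team": "payments"}, backup_selection))
print(selected_by_tags({"Team": "payments"}, backup_selection))
```

The point of the tag condition is that a newly launched resource joins the plan as soon as it carries the tag, with no change to the plan itself.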

AWS Backup Vault Lock

Vault Lock is to AWS Backup what Object Lock is to Amazon S3: it enforces a minimum and maximum retention policy on a vault, and in compliance mode that policy cannot be modified or deleted once the lock takes effect. This is the AWS Backup answer to the "no one, not even root, can delete" scenario. A compliance-mode lock has a cooling-off period (grace time) of at least 3 days, after which the lock becomes immutable; until that window closes you can still change or remove it, so real-world deployments schedule a deliberate wait before finalizing.
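
A Vault Lock request can be sketched as follows. The parameter names match boto3's put_backup_vault_lock_configuration; the helper function, the vault name, and the local validation are illustrative assumptions, and the service performs its own checks.

```python
from typing import Optional


def vault_lock_request(
    vault_name: str,
    min_days: int,
    max_days: int,
    changeable_for_days: Optional[int] = None,
) -> dict:
    """Build keyword arguments for a Vault Lock call (no AWS call is made here).

    Leaving changeable_for_days unset requests a governance-style lock that
    permitted users can still remove; setting it starts the cooling-off
    countdown after which the lock becomes immutable.
    """
    if min_days > max_days:
        raise ValueError("min_days cannot exceed max_days")
    if changeable_for_days is not None and changeable_for_days < 3:
        raise ValueError("the cooling-off period must be at least 3 days")
    request = {
        "BackupVaultName": vault_name,
        "MinRetentionDays": min_days,
        "MaxRetentionDays": max_days,
    }
    if changeable_for_days is not None:
        request["ChangeableForDays"] = changeable_for_days
    return request


print(vault_lock_request("prod-vault", min_days=35, max_days=2555, changeable_for_days=3))
```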

AWS Backup vs manual snapshots

A common SAA-C03 trap pairs "enable AWS Backup" against "write Lambda functions to call each service's snapshot API". AWS Backup is the right answer whenever the scenario emphasizes one policy across many services, centralized monitoring, cross-account and cross-region copies, vault-level immutability, or compliance reporting. Hand-rolled snapshot scripts are an exam distractor because they skip every one of those requirements.

If the scenario mentions "centralized management across EBS + RDS + DynamoDB + EFS" or "consolidated backup policy for compliance", the answer is AWS Backup. Do not pick "AWS Lambda scheduled with EventBridge to call service-specific snapshot APIs" — that works technically but misses the centralization, vault lock, and audit features SAA-C03 is asking about. Reference: https://docs.aws.amazon.com/aws-backup/latest/devguide/whatisbackup.html

S3 Versioning and Cross-Region Replication for Data Recovery

S3 Versioning

S3 Versioning is a bucket-level setting that preserves every version of every object. A DELETE becomes a "delete marker" that hides the latest version, not an actual delete — you can restore by removing the delete marker. An overwrite PUT creates a new version instead of replacing the old one.
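
The delete-marker behavior is easier to remember after seeing it run. The class below is a toy in-memory model of these semantics, not an S3 client; VersionedBucket and its methods are invented for this illustration.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of S3 Versioning semantics (not an AWS client)."""

    def __init__(self):
        self._versions = {}           # key -> list of (version_id, body or None)
        self._ids = itertools.count(1)

    def _append(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def put(self, key, body):
        """An overwrite PUT stacks a new version; older versions stay intact."""
        return self._append(key, body)

    def delete(self, key):
        """A simple DELETE only stacks a delete marker (body None) on top."""
        return self._append(key, None)

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is None:
            raise KeyError(key)       # latest entry is a delete marker
        return versions[-1][1]

    def delete_version(self, key, version_id):
        """Permanently remove one specific version or delete marker."""
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]


bucket = VersionedBucket()
bucket.put("report.csv", b"first")
bucket.put("report.csv", b"second")
marker = bucket.delete("report.csv")       # object now appears deleted
bucket.delete_version("report.csv", marker)  # removing the marker restores it
print(bucket.get("report.csv"))
```

The same delete_version call that restores the object here is also what permanently destroys a real version, which is why versioning on its own is not an immutability control.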

Versioning is the prerequisite for:

  • S3 Object Lock (requires versioning because lock is per-version).
  • S3 Replication (requires versioning on both source and destination).
  • MFA Delete (an optional feature requiring an MFA token to permanently delete a version or change the bucket's versioning state).

Versioning alone is not a sufficient compliance control — a user with s3:DeleteObjectVersion can still permanently delete any version. You combine versioning with Object Lock, MFA Delete, and IAM deny policies for real immutability.

S3 Cross-Region Replication (CRR) and Same-Region Replication (SRR)

S3 Replication asynchronously copies objects from a source bucket to a destination bucket. Replication is one-way by default but supports two-way (bidirectional) replication rules. Common use cases:

  • Cross-Region Replication (CRR) — disaster recovery, compliance requirements for a geographically separate copy, latency reduction for readers in another region.
  • Same-Region Replication (SRR) — aggregating logs from many buckets into one central account, separating production and analytics buckets, meeting compliance requirements for a separate account copy.

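Replication does not replicate objects that existed before replication was enabled (use S3 Batch Replication for historical objects), and it does not replicate deletions performed by lifecycle rules. Delete markers created by user DELETE requests are replicated only if you enable delete marker replication, and deletes that target a specific version ID are never replicated. Both source and destination must have versioning enabled.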
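
A one-way replication rule can be sketched in the shape that boto3's put_bucket_replication takes for its ReplicationConfiguration argument. The role and bucket ARNs are placeholders and crr_configuration is a hypothetical helper; it builds the structure without contacting AWS.

```python
def crr_configuration(
    role_arn: str,
    destination_bucket_arn: str,
    replicate_delete_markers: bool = False,
) -> dict:
    """Build a single-rule replication configuration covering the whole bucket."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to copy objects
        "Rules": [
            {
                "ID": "replicate-everything",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {},  # empty filter: every object in the bucket
                "DeleteMarkerReplication": {
                    "Status": "Enabled" if replicate_delete_markers else "Disabled"
                },
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }


config = crr_configuration(
    "arn:aws:iam::111122223333:role/ExampleReplicationRole",  # placeholder
    "arn:aws:s3:::example-dr-bucket",                          # placeholder
)
print(config["Rules"][0]["DeleteMarkerReplication"])
```

Leaving delete marker replication disabled means an accidental or malicious delete on the source does not hide the object in the destination copy, which is usually what a compliance copy wants.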

  • Versioning = history of every object change.
  • Object Lock = immutability on top of versioning (governance or compliance mode).
  • Replication = asynchronous copy to another bucket (usually another region) for DR and compliance.
  • Versioning is the prerequisite for both Object Lock and Replication.
  • Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/replication.html

AWS CloudTrail — Audit Logging, Log File Integrity, and Governance

AWS CloudTrail is the AWS data governance compliance service that records every API call made in your AWS account. It is the primary answer to every "who did what, when, and from where" question on SAA-C03.

Management events vs data events

CloudTrail distinguishes two event categories, and the difference is heavily tested.

  • Management events (also called control-plane events) are API calls that manage resources — RunInstances, CreateBucket, PutBucketPolicy, AttachRolePolicy, CreateUser. Management events are on by default and free for the last 90 days in the CloudTrail Event history console. Long-term storage in Amazon S3 requires a trail.
  • Data events (also called data-plane events) are API calls that interact with the contents of resources — GetObject, PutObject, DeleteObject on S3 objects; Invoke on Lambda functions; item-level operations on DynamoDB tables. Data events are off by default because they are high-volume and are billed per event when enabled.

For S3, CloudTrail data events can be enabled on a specific bucket or prefix so you pay only for the data you actually care about. A classic SAA-C03 scenario: "a security team must log every GetObject and PutObject call on a specific bucket containing sensitive data" — the answer is enable CloudTrail S3 data events on that bucket, not "enable CloudTrail" (which already covers management events but not data events).

An exam question that describes "an auditor wants to know who downloaded a specific file from S3" cannot be answered by the default CloudTrail configuration — default CloudTrail captures the GetBucketPolicy call but not the GetObject call. You must enable CloudTrail data events on that bucket. Reference: https://docs.aws.amazon.com/awscloudtrail/latest/userguide/logging-management-and-data-events-with-cloudtrail.html
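
Scoping data events to one bucket or prefix might be expressed like this. The dictionary follows the basic event selector shape accepted by boto3's put_event_selectors; the bucket name and helper function are hypothetical, and no trail is modified.

```python
def s3_data_event_selector(bucket: str, prefix: str = "") -> dict:
    """Build a basic event selector that logs S3 object-level calls.

    An empty prefix covers the whole bucket; the trailing slash after the
    bucket name is part of the ARN format for that case.
    """
    return {
        "ReadWriteType": "All",           # GetObject as well as PutObject/DeleteObject
        "IncludeManagementEvents": True,  # keep control-plane logging on
        "DataResources": [
            {"Type": "AWS::S3::Object", "Values": [f"arn:aws:s3:::{bucket}/{prefix}"]}
        ],
    }


whole_bucket = s3_data_event_selector("example-sensitive-bucket")
only_payroll = s3_data_event_selector("example-sensitive-bucket", "payroll/")
print(whole_bucket["DataResources"][0]["Values"])
print(only_payroll["DataResources"][0]["Values"])
```

Narrowing to a prefix is the cost lever: only object-level calls under that prefix generate billable data events.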

Multi-region trails and organization trails

A single CloudTrail trail can be configured as multi-region, meaning it captures events from every AWS region into one S3 bucket. This is strongly recommended for data governance compliance because it prevents blind spots in regions your teams don't actively use. On top of multi-region, an AWS Organizations trail captures events across every AWS account in the organization and writes them into a single central S3 bucket — the standard pattern for a central security account.

Log file integrity validation

CloudTrail can optionally sign and hash log files, producing a digest file in S3 that lets you cryptographically verify no log file has been tampered with, deleted, or modified after delivery. This is a critical control for audit use cases where the integrity of the audit trail itself must be provable. The CloudTrail CLI command aws cloudtrail validate-logs verifies a time range's log files against the digest chain.
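
The idea behind a digest chain can be illustrated with a few lines of hashing. This is a deliberately simplified sketch: the field names and file names are invented, the real digest file format differs, and the digital signature that CloudTrail applies to each digest is left out.

```python
import hashlib
import json
from typing import Dict, Optional


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_digest(log_files: Dict[str, bytes], previous_digest: Optional[bytes]) -> bytes:
    """Record a hash per log file plus a hash of the previous digest."""
    digest = {
        "logFiles": {name: sha256_hex(body) for name, body in sorted(log_files.items())},
        "previousDigestHash": sha256_hex(previous_digest) if previous_digest else None,
    }
    return json.dumps(digest, sort_keys=True).encode()


def verify(log_files: Dict[str, bytes], digest: bytes, previous_digest: Optional[bytes]) -> bool:
    """Check the link to the previous digest, then every recorded file hash."""
    recorded = json.loads(digest)
    expected_previous = sha256_hex(previous_digest) if previous_digest else None
    if recorded["previousDigestHash"] != expected_previous:
        return False  # the chain is broken
    return all(
        name in log_files and sha256_hex(log_files[name]) == file_hash
        for name, file_hash in recorded["logFiles"].items()
    )


hour1 = {"log-0001.json.gz": b"event batch 1"}
digest1 = build_digest(hour1, None)
hour2 = {"log-0002.json.gz": b"event batch 2"}
digest2 = build_digest(hour2, digest1)

print(verify(hour2, digest2, digest1))                                   # untouched
print(verify({"log-0002.json.gz": b"edited"}, digest2, digest1))         # modified file
print(verify({}, digest2, digest1))                                      # deleted file
```

Because each digest embeds a hash of the one before it, altering or removing an older digest also breaks every later link, which is what lets a validator detect gaps across a time range.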

CloudTrail Lake

AWS CloudTrail Lake is a managed data lake specifically for CloudTrail events with support for long-term retention (up to 10 years) and SQL-based querying directly on the events without needing to load them into Athena. Think of it as CloudTrail events + a built-in query engine + long retention. CloudTrail Lake is the right answer when a scenario says "retain CloudTrail data for 7 years and support ad-hoc SQL queries by the security team" — it replaces the older pattern of writing to S3, building a Glue table, and querying with Athena.

Delivery targets

A CloudTrail trail delivers events to:

  • Amazon S3 — durable archival storage (required); this is where long-term retention happens.
  • Amazon CloudWatch Logs — optional, for real-time alerting via metric filters and alarms.
  • Amazon EventBridge — optional, for triggering Lambda functions or Step Functions on specific events.

CloudTrail at a glance:

  • Management events: on by default, free for last 90 days in Event history.
  • Data events: off by default, billed per event, enable per bucket or function.
  • Multi-region trail: one trail captures all regions; recommended for governance.
  • Organization trail: one trail captures all AWS accounts in the org.
  • Log file integrity validation: cryptographic proof that logs weren't tampered with.
  • CloudTrail Lake: up to 10-year retention with SQL querying built in.
  • Reference: https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html

AWS Config — Continuous Compliance Evaluation and Remediation

AWS Config is the AWS data governance compliance service that continuously records the configuration of your AWS resources and continuously evaluates them against rules you define. CloudTrail answers "who did what", Config answers "what does the world look like now, and is it compliant?".

Configuration items and configuration history

Every supported resource (EC2 instance, EBS volume, S3 bucket, security group, IAM role, etc.) produces a configuration item every time it changes. The stream of configuration items for a given resource is its configuration history — you can go back in time and see exactly what a security group's rules looked like three weeks ago.

Config rules

A Config rule is an evaluation function that runs against every resource of a target type and returns COMPLIANT, NON_COMPLIANT, or NOT_APPLICABLE. Two sources:

  • AWS-managed rules — AWS provides hundreds of pre-built rules: s3-bucket-public-read-prohibited, ec2-instance-no-public-ip, iam-user-mfa-enabled, rds-storage-encrypted, cloudtrail-enabled, and many more.
  • Custom rules — you write a Lambda function (or an AWS Config custom policy in Guard DSL) that implements the evaluation logic.

Config rules can be periodic (run on a schedule) or change-triggered (run on every configuration item). Change-triggered is the default for most managed rules.
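
The evaluation logic inside a custom rule is usually a small pure function. The sketch below assumes a pared-down configuration item with only resourceType and tags, and a hypothetical policy that every S3 bucket must carry a DataClassification tag; the Lambda handler wiring and the call that reports results back to AWS Config are omitted.

```python
REQUIRED_TAG = "DataClassification"  # hypothetical policy for this example


def evaluate_bucket_tag(configuration_item: dict) -> str:
    """Return a compliance verdict for one configuration item."""
    if configuration_item.get("resourceType") != "AWS::S3::Bucket":
        return "NOT_APPLICABLE"  # the rule only targets S3 buckets
    tags = configuration_item.get("tags") or {}
    if tags.get(REQUIRED_TAG):
        return "COMPLIANT"
    return "NON_COMPLIANT"


items = [
    {"resourceType": "AWS::S3::Bucket", "tags": {"DataClassification": "restricted"}},
    {"resourceType": "AWS::S3::Bucket", "tags": {}},
    {"resourceType": "AWS::EC2::Instance", "tags": {}},
]
for item in items:
    print(item["resourceType"], evaluate_bucket_tag(item))
```

A change-triggered rule would run a function like this once per new configuration item; a periodic rule would run it over all resources of the target type on a schedule.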

Conformance packs

A conformance pack is a collection of Config rules (and optionally remediation actions) packaged as a single deployable unit. AWS provides pre-built conformance packs for common compliance frameworks:

  • Operational Best Practices for HIPAA Security
  • Operational Best Practices for PCI DSS 3.2.1
  • Operational Best Practices for GDPR
  • Operational Best Practices for NIST 800-53
  • Operational Best Practices for ISO 27001
  • Operational Best Practices for CIS AWS Foundations Benchmark
  • Operational Best Practices for AWS Well-Architected Security Pillar

Deploying a conformance pack gives you a framework-level compliance dashboard in one step — AWS Config evaluates every rule in the pack against every resource, aggregates the compliance percentage, and produces a framework-aligned report. This is the SAA-C03 answer to "how can the security team quickly see our compliance status against HIPAA?" — deploy the HIPAA conformance pack in AWS Config.
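
The roll-up a conformance pack dashboard presents can be approximated in a few lines. This is an assumption-laden sketch: the rule names are examples, and the percentage here is simply compliant rules over evaluated rules, which may not match how AWS Config computes its own score.

```python
from collections import Counter
from typing import Dict


def compliance_summary(rule_results: Dict[str, str]) -> dict:
    """Aggregate per-rule verdicts into counts and a rough percentage."""
    counts = Counter(rule_results.values())
    evaluated = counts["COMPLIANT"] + counts["NON_COMPLIANT"]
    percent = round(100 * counts["COMPLIANT"] / evaluated, 1) if evaluated else None
    return {
        "compliant": counts["COMPLIANT"],
        "non_compliant": counts["NON_COMPLIANT"],
        "not_applicable": counts["NOT_APPLICABLE"],
        "percent_compliant": percent,
        "failing_rules": sorted(r for r, v in rule_results.items() if v == "NON_COMPLIANT"),
    }


results = {
    "s3-bucket-public-read-prohibited": "COMPLIANT",
    "rds-storage-encrypted": "COMPLIANT",
    "cloudtrail-enabled": "COMPLIANT",
    "iam-user-mfa-enabled": "NON_COMPLIANT",
    "ec2-instance-no-public-ip": "NOT_APPLICABLE",
}
print(compliance_summary(results))
```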

Remediation

Config supports remediation actions that run automatically when a resource is non-compliant. Remediations are AWS Systems Manager automation documents (SSM documents) — for example, "if an S3 bucket is found public, apply a PutBucketPolicy to block public access". Remediation can be automatic (triggered immediately on non-compliant finding) or manual (displayed in the console for a human to click). This closes the loop between detection (Config) and action (SSM Automation) without writing a custom Lambda function.

When an SAA-C03 scenario says "we need to continuously verify compliance with HIPAA / PCI DSS / ISO 27001 across all our AWS accounts", the answer is AWS Config with the appropriate conformance pack, optionally deployed at scale via AWS Organizations. You do not need to build per-service audit scripts; AWS has already done it. Reference: https://docs.aws.amazon.com/config/latest/developerguide/conformance-packs.html

Config vs CloudTrail

Both services record AWS activity, but they answer different questions:

Question | Service
Who made the API call and from where? | CloudTrail
What was the state of this resource 5 minutes ago? | Config
Is this resource compliant with my rules right now? | Config
Did anyone call DeleteObject on my bucket last week? | CloudTrail data events
Has my security group stayed non-public for the last 30 days? | Config

Most well-architected workloads run both CloudTrail (audit trail) and AWS Config (configuration compliance) together as the foundation of data governance.

Amazon Macie — PII Discovery and Data Classification

Amazon Macie is the AWS managed service that uses machine learning and pattern matching to automatically discover, classify, and protect sensitive data in Amazon S3. Macie is the AWS answer to "we need to know which S3 buckets contain PII, PHI, or PCI data" without writing custom scanners.

Managed data identifiers

Macie ships with managed data identifiers for common sensitive data types: names, email addresses, phone numbers, mailing addresses, credit card numbers, US SSNs, passport numbers, driver's licenses, AWS access keys, credentials in source files, medical codes, and many more. You can also create custom data identifiers using regex + keyword windows + maximum match distance for your organization's proprietary formats.
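
How a regex, a keyword list, and a maximum match distance combine can be approximated with the standard re module. This is a local imitation for intuition only: the employee ID format is invented, and Macie's own matching rules (including exactly how it measures distance) are defined by the service, not by this function.

```python
import re
from typing import List


def find_matches(text: str, pattern: str, keywords: List[str], max_distance: int = 50) -> List[str]:
    """Return regex matches that have a keyword within max_distance characters before them."""
    hits = []
    lowered = text.lower()
    for match in re.finditer(pattern, text):
        window_start = max(0, match.start() - max_distance)
        window = lowered[window_start:match.start()]
        if any(keyword.lower() in window for keyword in keywords):
            hits.append(match.group())
    return hits


# Hypothetical proprietary format: "EMP-" followed by six digits.
sample = "Employee ID: EMP-123456 was updated. Ticket EMP-654321 is unrelated to staff."
print(find_matches(sample, r"\bEMP-\d{6}\b", ["employee id"], max_distance=20))
```

Only the first ID is reported, because the second one has no keyword within the 20-character window. The keyword requirement is what keeps a loose pattern from flagging every similar-looking string.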

Sensitive data discovery jobs

You run a discovery job against one or more S3 buckets (one-time or scheduled). The job samples objects, applies the managed and custom identifiers, and emits findings classified by severity and sensitivity type. Findings stream to AWS Security Hub, Amazon EventBridge, and the Macie console.

Automated sensitive data discovery

Macie also offers automated sensitive data discovery, a continuous lightweight sampling of all S3 buckets in the account (and organization) that runs without explicit job configuration. This produces an ongoing risk score and a sensitivity map of your S3 footprint — the typical SAA-C03 answer to "the CISO wants a continuous PII inventory across all S3 buckets".

Macie vs GuardDuty vs Inspector

An exam trap: three "security AI" services sound similar but do different things.

Service | Focus
Amazon Macie | Sensitive data (PII, PHI, PCI) discovery in Amazon S3
Amazon GuardDuty | Threat detection via anomaly analysis on CloudTrail, VPC Flow Logs, DNS, EKS audit logs
Amazon Inspector | Vulnerability scanning for Amazon EC2, ECR container images, Lambda functions

If the scenario mentions "discover PII", "classify sensitive data", "find credit card numbers in S3", or "data subject access request (GDPR)", the answer is Amazon Macie. If it mentions "detect compromised IAM credentials" or "cryptocurrency mining traffic", it is GuardDuty. If it mentions "known CVEs in my EC2 instances", it is Inspector. Reference: https://docs.aws.amazon.com/macie/latest/user/what-is-macie.html

AWS Audit Manager — Evidence Collection for Compliance Audits

AWS Audit Manager automates the process of gathering evidence for audits against compliance frameworks. Where AWS Config continuously evaluates rules, Audit Manager packages the evaluation results plus other evidence (CloudTrail snippets, resource configurations, manual attestations) into audit-ready reports.

Frameworks and assessments

Audit Manager ships with pre-built frameworks for HIPAA, PCI DSS, GDPR, NIST 800-53, SOC 2, ISO 27001, FedRAMP, GxP, and many others. You create an assessment from a framework, scope it to the AWS accounts and services in the audit, and Audit Manager continuously collects evidence mapped to each control.

Evidence types

Evidence can be:

  • Automated from AWS services — AWS Config rule evaluations, CloudTrail events, AWS Security Hub findings, snapshots of resource configurations.
  • Manual uploads — policy documents, interview notes, training records, physical access logs.

The output is a controlled evidence folder per control, ready to hand to an auditor, plus a PDF assessment report.

Audit Manager vs Config vs Artifact

These three compliance services often confuse candidates. They are actually layered:

  • AWS Artifact — AWS's compliance attestations about AWS itself (SOC 1/2/3, ISO 27001 certification, PCI DSS AOC). You download these and give them to your auditor to prove AWS is a compliant subservice organization.
  • AWS Config (+ conformance packs) — continuous evaluation of your resources against framework-aligned rules. Output: current compliance percentage and non-compliant resource list.
  • AWS Audit Manager — continuous collection and packaging of evidence (from Config, CloudTrail, Security Hub, and manual sources) into an audit-ready report for each control in a framework. Output: a PDF/ZIP deliverable for your auditor.

Artifact = AWS's attestations. Config = "are my resources compliant right now". Audit Manager = "here is the evidence packet proving my controls worked over the audit period". Many SAA-C03 questions hinge on spotting the difference. Reference: https://docs.aws.amazon.com/audit-manager/latest/userguide/what-is.html

AWS Artifact — On-Demand Compliance Reports

AWS Artifact is the AWS service that lets you download AWS's own compliance reports and agreements on-demand. It is not about your workloads — it is about AWS as a subservice organization.

What's in Artifact

  • AWS SOC 1, SOC 2, SOC 3 reports — service organization controls issued by independent auditors.
  • PCI DSS Attestation of Compliance (AOC) and Responsibility Summary.
  • ISO 27001, ISO 27017, ISO 27018, ISO 9001, ISO 22301 certificates.
  • FedRAMP reports (moderate and high baselines).
  • HIPAA / HITECH compliance documentation including the Business Associate Addendum (BAA) which must be executed before storing PHI on AWS services.
  • GDPR Data Processing Addendum (DPA).
  • Global country-specific certifications (IRAP, C5, MTCS, OSPAR, etc.).

How to use Artifact for SAA-C03 scenarios

When an exam scenario says "the auditor has asked for AWS's SOC 2 report" or "we need the signed Business Associate Addendum before storing PHI", the answer is AWS Artifact — that is literally the one service for that use case. Artifact reports are downloadable in one click (after accepting the applicable NDA) and there is no additional AWS charge.

Data Lifecycle Policies — S3 Lifecycle Rules and Expiration

S3 Lifecycle rules automate tier transitions and deletions on Amazon S3 objects. They are a core data governance control because they implement both cost optimization and automatic deletion for GDPR/data minimization.

Transition actions

A transition action moves objects from one storage class to another after N days:

  • Standard → Standard-IA (after at least 30 days)
  • Standard → Intelligent-Tiering
  • Standard → One Zone-IA
  • Standard → Glacier Instant Retrieval
  • Standard → Glacier Flexible Retrieval
  • Standard → Glacier Deep Archive
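
A transition rule can be expressed as a plain dictionary in the shape the S3 lifecycle API accepts. The helper below is a minimal sketch: the rule ID, prefix, and day counts are illustrative assumptions, and the local check only enforces the 30-day minimum before an IA transition.

```python
MIN_DAYS_BEFORE_IA = 30
IA_CLASSES = {"STANDARD_IA", "ONEZONE_IA"}


def build_transition_rule(rule_id, prefix, transitions):
    """Build one lifecycle rule from (days, storage_class) pairs.

    Raises ValueError if an IA transition is scheduled before day 30.
    """
    ordered = sorted(transitions)
    for days, storage_class in ordered:
        if storage_class in IA_CLASSES and days < MIN_DAYS_BEFORE_IA:
            raise ValueError(
                f"{storage_class} transition needs at least "
                f"{MIN_DAYS_BEFORE_IA} days, got {days}"
            )
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in ordered
        ],
    }


# Hypothetical rule: logs move to Standard-IA, then Glacier Flexible
# Retrieval ("GLACIER"), then Glacier Deep Archive ("DEEP_ARCHIVE").
rule = build_transition_rule(
    "archive-logs",
    "logs/",
    [(365, "DEEP_ARCHIVE"), (30, "STANDARD_IA"), (90, "GLACIER")],
)
```

With boto3 the rule would go inside LifecycleConfiguration={"Rules": [rule]} on put_bucket_lifecycle_configuration.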

Expiration actions

An expiration action deletes current object versions or permanently deletes noncurrent versions after N days. Combined with versioning, this gives you a two-phase deletion: expiring the current version adds a delete marker and makes that version noncurrent, and the noncurrent version is then permanently deleted after the configured noncurrent period. Expiration is the right answer for "delete PII after 7 years to comply with GDPR data minimization".
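
The two-phase deletion maps onto two fields of a single lifecycle rule. This is a minimal sketch with assumed values: 7 years is approximated as 2,555 days, and the 30-day noncurrent window is a hypothetical choice.

```python
SEVEN_YEARS_IN_DAYS = 7 * 365  # 2555; leap days ignored for simplicity


def two_phase_expiration_rule(current_days, noncurrent_days):
    """Lifecycle rule for a versioned bucket.

    Phase 1: after `current_days` the current version is expired
             (it becomes noncurrent behind a delete marker).
    Phase 2: `noncurrent_days` later the noncurrent version is
             permanently removed.
    """
    return {
        "ID": "two-phase-expiration",
        "Filter": {"Prefix": ""},  # empty prefix applies to the whole bucket
        "Status": "Enabled",
        "Expiration": {"Days": current_days},
        "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
    }


rule = two_phase_expiration_rule(SEVEN_YEARS_IN_DAYS, 30)
```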

Lifecycle + Object Lock interaction

An important governance note: S3 Lifecycle cannot permanently delete an object version that is protected by Object Lock until its retention period expires and any legal hold is removed. Lifecycle and Object Lock coexist peacefully: a lifecycle action that would violate the lock is silently skipped instead of failing the rule.

Lifecycle for data minimization

A typical GDPR scenario: "the company must delete personal data 2 years after the account closes". The implementation is:

  1. Tag objects on ingestion with the customer's account ID.
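  2. A lifecycle rule filtered by tag deletes objects after 730 days. S3 counts those days from each object's creation date, not from the day the account closed, so a deadline measured from account closure needs an extra step (for example, a scheduled job that deletes the tagged objects when the deadline arrives).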
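  3. CloudTrail data events log every PutObject (for provenance). Lifecycle-driven deletes are performed internally by S3 and are not captured as CloudTrail data events, so record them with S3 server access logs or S3 Event Notifications (for audit proof).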
  4. AWS Config verifies the lifecycle rule is still attached and matches policy.
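
The tag-filtered rule from step 2 could look like the sketch below. The tag key and value are hypothetical, and the date helper only illustrates that the 730 days are counted from the object's creation date.

```python
from datetime import date, timedelta

RETENTION_DAYS = 730  # "2 years" from the scenario


def minimization_rule(tag_key, tag_value, days=RETENTION_DAYS):
    """Lifecycle rule that expires only objects carrying the given tag."""
    return {
        "ID": f"expire-{tag_key}-{tag_value}",
        "Filter": {"Tag": {"Key": tag_key, "Value": tag_value}},
        "Status": "Enabled",
        "Expiration": {"Days": days},
    }


def earliest_expiry(created_on, days=RETENTION_DAYS):
    """Earliest date an object created on `created_on` becomes eligible."""
    return created_on + timedelta(days=days)


# Hypothetical tag applied at ingestion.
rule = minimization_rule("customer-account-id", "cust-0042")
eligible_on = earliest_expiry(date(2024, 1, 1))
```

One rule per customer does not scale well because a bucket supports a limited number of lifecycle rules, so in practice a shared tag value for all objects in the same retention class is the more common design.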

Implementing Policies for Data Access and Protection

Data governance is not just about storage features — it is also about who is allowed to touch the data. The following controls intersect with the data governance layer on SAA-C03.

S3 bucket policies and IAM policies

Resource-based S3 bucket policies combined with identity-based IAM policies enforce who can read, write, and delete objects. A governance-grade bucket usually has a bucket policy that denies s3:DeleteObject from anyone except a specific role, denies unencrypted uploads, and denies requests without TLS.
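
A governance-grade bucket policy with those three denies can be sketched as a Python dict and serialized to JSON. The bucket name and role ARN are placeholders, and treating "unencrypted" as "not SSE-KMS" is an assumption that fits the SSE-KMS guidance in this guide.

```python
import json


def governance_bucket_policy(bucket, deleter_role_arn):
    """Build a bucket policy with three explicit Deny statements."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    objects_arn = f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Reject any request that does not use TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, objects_arn],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {   # Reject uploads that do not ask for SSE-KMS.
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": objects_arn,
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
            {   # Only the named role may delete objects or versions.
                "Sid": "DenyDeleteExceptRole",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
                "Resource": objects_arn,
                "Condition": {
                    "ArnNotEquals": {"aws:PrincipalArn": deleter_role_arn}
                },
            },
        ],
    }


policy = governance_bucket_policy(
    "audit-archive-example",
    "arn:aws:iam::111122223333:role/RecordsDeleter",
)
policy_json = json.dumps(policy, indent=2)
```

Explicit Deny statements win over any Allow, which is why this pattern holds even if an identity-based policy elsewhere grants broad S3 access.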

S3 Block Public Access

S3 Block Public Access is an account-level and bucket-level control that overrides any ACL or bucket policy that would otherwise make an object public. It is enabled by default on new buckets and must not be disabled without a strong reason. AWS Config has a managed rule (s3-account-level-public-access-blocks) that flags any account where it is off.

AWS KMS and encryption in governance contexts

Encryption by itself is a separate topic (Data Encryption and Key Management), but every data governance framework requires encryption at rest. Governance-grade buckets use SSE-KMS with a customer managed KMS key so that each use of the key (Decrypt, GenerateDataKey) is recorded in CloudTrail. You can then prove not just who accessed the object, but who used the key that decrypted it.

Resource tagging and data classification tags

Tagging objects, buckets, and resources with a data classification label (data-classification: restricted) lets AWS Config rules, Macie, and Audit Manager filter by sensitivity. A common pattern: "every bucket tagged data-classification: restricted must have Object Lock enabled" — implemented as a custom AWS Config rule.
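
The evaluation logic of such a custom rule is small. The sketch below shows only the decision function, not a full AWS Config Lambda handler, and the input dictionary shape (tags, object_lock_enabled) is a simplified assumption instead of the real configuration item format.

```python
def evaluate_bucket(bucket):
    """Return an AWS Config compliance type for one bucket.

    `bucket` is a simplified, hypothetical summary such as:
    {"tags": {"data-classification": "restricted"},
     "object_lock_enabled": True}
    """
    tags = bucket.get("tags", {})
    if tags.get("data-classification") != "restricted":
        # The rule only applies to buckets labeled restricted.
        return "NOT_APPLICABLE"
    if bucket.get("object_lock_enabled"):
        return "COMPLIANT"
    return "NON_COMPLIANT"


results = [
    evaluate_bucket({"tags": {"data-classification": "restricted"},
                     "object_lock_enabled": True}),
    evaluate_bucket({"tags": {"data-classification": "restricted"},
                     "object_lock_enabled": False}),
    evaluate_bucket({"tags": {"data-classification": "public"}}),
]
```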

Aligning AWS Services to Compliance Frameworks (HIPAA, PCI DSS, GDPR, ISO 27001)

Instead of memorizing every control in every framework, memorize the mapping pattern: "this framework demands X; AWS delivers X via services Y and Z". Then apply the pattern to any scenario.

HIPAA (Health Insurance Portability and Accountability Act)

HIPAA governs Protected Health Information (PHI) in US healthcare.

  • BAA required (from AWS Artifact) before storing PHI.
  • Encryption at rest — S3 SSE-KMS, EBS encryption, RDS encryption.
  • Encryption in transit — TLS 1.2+ with certificates from ACM; bucket policies that deny non-TLS requests.
  • Audit logging — CloudTrail multi-region trail with log file integrity validation; minimum 6-year retention via S3 Lifecycle to Glacier Deep Archive.
  • Access controls — IAM least privilege, MFA, IAM Identity Center.
  • Continuous compliance — AWS Config HIPAA conformance pack.
  • Sensitive data discovery — Amazon Macie scans S3 for PHI patterns.

PCI DSS (Payment Card Industry Data Security Standard)

PCI DSS governs cardholder data.

  • AOC available in AWS Artifact for the AWS-side attestation.
  • Segmentation — dedicated VPC for the Cardholder Data Environment (CDE); restrictive security groups and NACLs.
  • Encryption — KMS for data at rest; TLS for data in transit.
  • Logging — CloudTrail with data events for S3 buckets holding cardholder data; retain at least 12 months of audit log history, with the most recent 3 months immediately available for analysis.
  • Vulnerability management — Amazon Inspector for EC2 and ECR scans; AWS Systems Manager Patch Manager.
  • Continuous compliance — AWS Config PCI DSS conformance pack.

GDPR (General Data Protection Regulation)

GDPR governs personal data of EU residents.

  • DPA available in AWS Artifact.
  • Data residency — select EU AWS Regions for processing; block use of other Regions with IAM or SCP conditions such as aws:RequestedRegion.
  • Data subject rights — ability to export, amend, and delete personal data; tag + lifecycle + Macie discovery.
  • Breach notification — GuardDuty + Security Hub + EventBridge → automated notification within 72 hours.
  • Record of processing activities — CloudTrail + Config capture the history.
  • Data minimization — S3 Lifecycle expiration actions.
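
For the breach notification item, the EventBridge side can be sketched as an event pattern plus a deadline calculation. The severity threshold of 7 (GuardDuty's "High" range) is an assumed policy choice, and the 72-hour clock here simply starts at the detection timestamp you pass in.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def high_severity_guardduty_pattern(min_severity=7):
    """EventBridge pattern for GuardDuty findings at or above a severity."""
    return {
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Finding"],
        "detail": {"severity": [{"numeric": [">=", min_severity]}]},
    }


def notification_deadline(detected_at):
    """Latest time to notify, 72 hours after detection."""
    return detected_at + NOTIFICATION_WINDOW


pattern = high_severity_guardduty_pattern()
deadline = notification_deadline(
    datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)
)
```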

ISO 27001 (Information Security Management)

ISO 27001 defines an ISMS (Information Security Management System).

  • Certificate available in AWS Artifact for the AWS-side compliance.
  • Controls catalog — Annex A controls map to AWS services; AWS Config ISO 27001 conformance pack evaluates technical controls.
  • Audit — AWS Audit Manager ISO 27001 framework assembles evidence.

Every compliance scenario on SAA-C03 boils down to: (1) establish the AWS-side attestation via AWS Artifact, (2) implement technical controls with KMS + IAM + CloudTrail + S3 Object Lock + VPC isolation, (3) continuously evaluate with AWS Config conformance packs, (4) package evidence with AWS Audit Manager. Memorize this four-step pattern and you can deconstruct any framework question. Reference: https://docs.aws.amazon.com/artifact/latest/ug/what-is-aws-artifact.html

Data governance and its sibling topic, data encryption and key management, both live under Task 1.3 of the SAA-C03 exam guide, but they are complementary, not overlapping:

  • Data encryption and key management (sibling topic) covers KMS, CloudHSM, envelope encryption, SSE-S3 vs SSE-KMS vs SSE-C, TLS, and ACM — the cryptographic layer.
  • Data governance, backup, and compliance controls (this topic) covers Object Lock, Versioning, Backup, CloudTrail, Config, Macie, Audit Manager, Artifact, and compliance framework alignment — the policy, audit, and retention layer.

Many real-world compliance requirements need both: HIPAA requires encryption (covered in the encryption topic) and audit logging (covered here). SAA-C03 questions usually telegraph which topic is primary by emphasizing either "encrypt" (encryption topic) or "retain / immutable / auditable / classify" (governance topic).

Key Numbers to Memorize for Data Governance Compliance

This compact list covers the numeric and categorical facts most likely to show up in distractor answers on SAA-C03.

  • S3 Object Lock retention modes: governance (admin can bypass with s3:BypassGovernanceRetention) and compliance (no one, not even root, can delete or shorten).
  • S3 Object Lock legal hold: indefinite; separate from retention period; removed only with s3:PutObjectLegalHold.
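  • S3 Versioning + MFA Delete: versioning is required on both source and destination buckets for replication; MFA Delete can be turned on or off only by the root user, and while it is on, changing the versioning state or permanently deleting a version requires an MFA code.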
  • CloudTrail management events: on by default, free for last 90 days in Event history.
  • CloudTrail data events: off by default, billed per event, enabled per resource.
  • CloudTrail Lake retention: up to 10 years with SQL queries built in.
  • AWS Backup Vault Lock compliance mode: minimum 3-day (72-hour) cooling-off period before the lock becomes immutable.
  • S3 Lifecycle IA transitions: minimum 30 days in Standard before transitioning to Standard-IA.
  • Amazon Macie: classifies data in Amazon S3 only; not RDS, not DynamoDB, not EBS.
  • AWS Config conformance packs: pre-built for HIPAA, PCI DSS, GDPR, NIST 800-53, ISO 27001, CIS, WA Security Pillar.
  • AWS Artifact: AWS's own SOC, ISO, PCI, HIPAA BAA, GDPR DPA, FedRAMP reports — free download.
  • AWS Audit Manager: automates evidence collection mapped to framework controls.
  • Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock-overview.html
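
As one worked example of these numbers, the Vault Lock cooling-off period shows up as the ChangeableForDays parameter of the AWS Backup PutBackupVaultLockConfiguration call. The helper below is a sketch that only assembles and sanity-checks the parameters; the vault name and retention bounds are hypothetical.

```python
MIN_COOLING_OFF_DAYS = 3


def vault_lock_params(vault_name, min_days, max_days,
                      cooling_off_days=MIN_COOLING_OFF_DAYS):
    """Assemble parameters for a compliance-mode Vault Lock."""
    if cooling_off_days < MIN_COOLING_OFF_DAYS:
        raise ValueError("cooling-off period must be at least 3 days")
    if min_days > max_days:
        raise ValueError("min retention cannot exceed max retention")
    return {
        "BackupVaultName": vault_name,
        "MinRetentionDays": min_days,
        "MaxRetentionDays": max_days,
        # Supplying ChangeableForDays is what selects compliance mode;
        # after this many days the lock can no longer be changed.
        "ChangeableForDays": cooling_off_days,
    }


params = vault_lock_params("compliance-vault", 2555, 3650)
```

With boto3 these would be passed as keyword arguments to put_backup_vault_lock_configuration on the backup client.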

Common Exam Traps — Data Governance and Compliance

Expect at least two of these in every SAA-C03 attempt.

Trap 1: S3 Versioning vs S3 Object Lock

Versioning preserves deleted and overwritten versions, but any principal with s3:DeleteObjectVersion can still permanently delete a specific version. Versioning alone is not immutability and does not satisfy WORM compliance. The right answer for "no one can delete this data" is S3 Object Lock (compliance mode) on top of versioning, not versioning alone.

Trap 2: Governance mode vs compliance mode

Answer choices mixing governance and compliance mode are deliberately confusing. If the scenario mentions "administrator can override with justification", it is governance. If the scenario mentions "immutable for the full retention period", "even the root user", "SEC 17a-4", or "WORM", it is compliance.

Trap 3: AWS Backup vs manual snapshots

If the scenario emphasizes centralized policy across many AWS services, cross-account copies, vault lock, or compliance dashboards, the answer is AWS Backup. Lambda-on-schedule snapshot scripts are a distractor.

Trap 4: CloudTrail management events vs data events

"Who created the bucket" = management events (on by default). "Who downloaded a specific object" = data events (must be explicitly enabled). The cost model is the other clue: data events are billed per event while management events are included.
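
Turning on data events for one bucket is done with event selectors on a trail. The sketch below builds an advanced event selector as a plain Python structure; the bucket name and selector name are placeholders.

```python
def s3_data_event_selectors(bucket):
    """Advanced event selectors that log object-level events for one bucket."""
    return [
        {
            "Name": f"S3 data events for {bucket}",
            "FieldSelectors": [
                {"Field": "eventCategory", "Equals": ["Data"]},
                {"Field": "resources.type", "Equals": ["AWS::S3::Object"]},
                {
                    # The trailing slash limits the match to this bucket's objects.
                    "Field": "resources.ARN",
                    "StartsWith": [f"arn:aws:s3:::{bucket}/"],
                },
            ],
        }
    ]


selectors = s3_data_event_selectors("cardholder-data-example")
```

With boto3 this list would be passed as AdvancedEventSelectors to put_event_selectors on the cloudtrail client. Scoping to specific buckets keeps the per-event bill predictable.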

Trap 5: Macie vs GuardDuty vs Inspector

Macie = sensitive data discovery in S3. GuardDuty = threat detection via logs. Inspector = vulnerability scanning for EC2/ECR/Lambda. Mixing them up is the classic Domain 1 trap.

Trap 6: AWS Config vs AWS CloudTrail

Both are "logging" services in casual speech, but Config answers "what is the current state of my resources" and CloudTrail answers "what API calls were made". Most governance architectures run both.

Trap 7: AWS Artifact vs Audit Manager

Artifact = download AWS's own attestation reports. Audit Manager = assemble your customer-side evidence into auditor-ready packages. Do not confuse them.

Trap 8: Object Lock must be enabled at bucket creation (historical)

Historically S3 Object Lock could only be enabled at bucket creation, and turning it on for an existing bucket required contacting AWS Support. AWS has since made it possible to enable Object Lock on an existing bucket that has versioning turned on, but exam questions may still treat it as a create-time prerequisite. When in doubt and the scenario says "new bucket", assume it is enabled at creation.

Data Governance Architecture Pattern — The Five-Pillar Stack

A well-architected AWS data governance deployment combines every service above into a single stack. Memorize this pattern — it unlocks composite SAA-C03 questions.

  1. Storage with immutability — S3 with Versioning + Object Lock in compliance mode for audit-critical data; AWS Backup Vault Lock for backups.
  2. Centralized audit trail — CloudTrail organization trail (multi-region, log file integrity validation) writing to a dedicated archive account; data events on sensitive buckets; CloudTrail Lake for 7–10 year retention with SQL.
  3. Continuous compliance — AWS Config aggregator across all accounts; conformance packs for each required framework; remediation via SSM Automation.
  4. Data classification — Amazon Macie automated sensitive data discovery on every S3 bucket; findings sent to Security Hub.
  5. Evidence and reporting — AWS Audit Manager assessments for each framework; AWS Artifact downloads for the AWS-side attestations.

This stack satisfies HIPAA, PCI DSS, GDPR, ISO 27001, SOC 2, and FedRAMP concurrently — which is why it is the SAA-C03 reference architecture for "a regulated enterprise running on AWS".

FAQ — Data Governance and Compliance Top Questions

Q1: What is the difference between S3 Object Lock governance mode and compliance mode?

Governance mode protects an object version from deletion unless the caller has the s3:BypassGovernanceRetention IAM permission — it prevents accidental or unauthorized deletes but keeps a controlled escape hatch for a privileged admin with a legitimate reason. Compliance mode provides absolute immutability: no one — including the AWS account root user — can delete the object version or shorten the retention period until the retain-until-date has passed. You can extend a compliance-mode retention, but you can never shorten or remove it. Compliance mode is the only mode that satisfies strict regulatory WORM requirements like SEC 17a-4(f) and FINRA Rule 4511, and it is the correct answer on SAA-C03 whenever the scenario says "even an administrator cannot delete" or "regulatory WORM".
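
The difference between the two modes can be captured in a small decision function. This is a simplified model of the rules described above, not S3's code: it ignores the request header that a real governance bypass also needs, and the argument names are hypothetical.

```python
from datetime import datetime, timezone


def can_delete_version(mode, retain_until, now,
                       has_bypass_permission=False, legal_hold=False):
    """Decide whether a locked object version may be permanently deleted."""
    if legal_hold:
        # A legal hold blocks deletion regardless of mode or dates.
        return False
    if now >= retain_until:
        # Retention has expired, so the lock no longer applies.
        return True
    if mode == "GOVERNANCE":
        # Only callers with s3:BypassGovernanceRetention get through.
        return has_bypass_permission
    if mode == "COMPLIANCE":
        # Nobody, including the root user, can delete before expiry.
        return False
    raise ValueError(f"unknown Object Lock mode: {mode}")


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
later = datetime(2030, 1, 1, tzinfo=timezone.utc)

admin_governance = can_delete_version("GOVERNANCE", later, now,
                                      has_bypass_permission=True)
admin_compliance = can_delete_version("COMPLIANCE", later, now,
                                      has_bypass_permission=True)
```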

Q2: When should I use AWS Backup instead of service-specific snapshots?

Use AWS Backup whenever the scenario emphasizes centralized policy across multiple AWS services (EBS + RDS + DynamoDB + EFS + FSx), cross-account and cross-region copies, vault-level immutability (Vault Lock), tag-based resource assignment, or compliance reporting across the backup estate. Service-specific snapshots (EBS snapshots, RDS automated backups, DynamoDB PITR) still work for single-service workloads, but they lack the centralized policy and vault-lock features that SAA-C03 scenarios frequently require. If the question is "one dashboard, one policy, many services", AWS Backup is always the answer.

Q3: What does CloudTrail log by default, and what do I need to enable separately?

CloudTrail records management events by default, and they are free to view for the last 90 days in the CloudTrail Event history console. For long-term retention, create a trail that writes to S3. Data events (S3 GetObject/PutObject/DeleteObject, Lambda Invoke, DynamoDB item-level operations) are off by default because they are high-volume — you enable them per bucket, function, or table and pay per event. For SAA-C03, remember: "who created the bucket" → on by default; "who read this specific file" → requires data events. Also enable log file integrity validation for any governance-grade trail, and consider CloudTrail Lake if you need long retention with built-in SQL querying.

Q4: How does AWS Config help with compliance, and what is a conformance pack?

AWS Config continuously records the configuration of every supported resource and continuously evaluates resources against Config rules (AWS-managed or custom). A conformance pack is a pre-built bundle of Config rules aligned to a specific compliance framework — AWS publishes packs for HIPAA, PCI DSS, GDPR, NIST 800-53, ISO 27001, CIS, and more. Deploying a conformance pack gives you a framework-aligned compliance dashboard in one step. Paired with AWS Systems Manager Automation remediation actions, Config can auto-fix non-compliant resources (for example, re-applying S3 Block Public Access). Config is the continuous-compliance layer; CloudTrail is the audit-trail layer; together they answer "is my environment currently compliant, and how did it get this way?".

Q5: When do I use Amazon Macie vs AWS Audit Manager vs AWS Artifact?

They operate at three different levels. Amazon Macie is a data-layer service that uses ML to find PII, PHI, and PCI data in Amazon S3 buckets and classify sensitivity — use Macie whenever the scenario says "discover sensitive data", "identify credit card numbers", or "classify S3 contents". AWS Audit Manager is a control-layer service that continuously collects evidence across AWS services (Config rule results, CloudTrail events, configuration snapshots, manual uploads) and packages it into framework-aligned audit deliverables — use Audit Manager when the scenario says "prepare audit evidence for HIPAA / PCI DSS / ISO 27001". AWS Artifact is a provider-level service that lets you download AWS's own compliance attestations (SOC reports, ISO certificates, PCI AOC, HIPAA BAA, GDPR DPA) — use Artifact whenever the scenario says "auditor needs AWS's SOC 2 report" or "we must execute a BAA before storing PHI".

Q6: How do I satisfy HIPAA, PCI DSS, or GDPR on AWS at a high level?

Follow the four-step pattern: (1) Establish the AWS-side attestation via AWS Artifact (BAA for HIPAA, AOC for PCI DSS, DPA for GDPR). (2) Implement technical controls — encryption at rest with KMS, encryption in transit with ACM/TLS, IAM least privilege and MFA, VPC segmentation, CloudTrail with log file integrity validation, S3 Object Lock for audit-critical data, S3 Lifecycle for data minimization/GDPR erasure. (3) Continuously evaluate compliance with AWS Config and the framework-specific conformance pack (HIPAA, PCI DSS, GDPR packs are pre-built). (4) Collect and package evidence with AWS Audit Manager using the matching assessment framework. This same pattern — Artifact → technical controls → Config → Audit Manager — satisfies almost every compliance scenario on SAA-C03.

Q7: What is the relationship between S3 Versioning, Replication, and Object Lock?

S3 Versioning is the foundational feature that preserves every version of every object in a bucket: DELETE becomes a delete marker and an overwrite creates a new version. S3 Replication (Cross-Region or Same-Region) asynchronously copies object versions to another bucket for DR and compliance; it requires versioning on both source and destination. S3 Object Lock provides WORM immutability at the object-version level (governance or compliance mode) and requires versioning as a prerequisite. A governance-grade bucket typically has all three, plus lifecycle rules: versioning on, Object Lock in compliance mode for audit-critical data, CRR to a geographically separate Region for disaster recovery, and lifecycle rules for cost optimization. Each of the three layers solves a different problem (history from versioning, geographic redundancy from replication, immutability from Object Lock), and you pick the combination that matches the scenario's requirement.

Further Reading

Official sources