AWS migration tooling is the Pro-level discipline of picking, chaining, and sequencing the right service — AWS Application Migration Service (MGN), AWS Database Migration Service (DMS), AWS Schema Conversion Tool (SCT), DMS Fleet Advisor, the AWS Snow Family at fleet scale, AWS DataSync, AWS Transfer Family, AWS Direct Connect, and application-layer streaming replicators like Kafka MirrorMaker and Amazon MSK Replicator — so that a 600 TB on-prem estate lands inside AWS with the smallest possible cutover window, the smallest possible data-loss surface, and the smallest possible blast radius if any one component fails. SAP-C02 Domain 4 (Accelerate Workload Migration and Modernization, 20 percent) tests this topic more heavily than most candidates expect, and it tests it very differently from SAA-C03: the exam assumes you already know what DataSync is and instead asks whether you can orchestrate DMS Full Load + CDC behind a Direct Connect pipe, fan out a Snowball Edge fleet in parallel, validate data with DMS validation tasks and checksums, then flip DNS inside a 60-minute cutover window without breaking replication lag SLOs. This SAP-C02 study note drills AWS migration tooling at that depth — tool-chain level, not service-overview level — and ends by solving the signature exam scenario: 600 TB on-prem Oracle 19c → Amazon Aurora PostgreSQL with a less-than-one-hour application cutover.
This topic assumes you have already passed SAA-C03 or equivalent and do not need the "what is DataSync" lecture again. If you want the associate-level AWS data transfer solutions decision tree (days-of-online arithmetic, single-service picks), the SAA-C03 data-transfer-solutions note covers it. Here we stay at SAP-C02 depth: tool-chain orchestration, cutover engineering, replication topology, and validation strategy for enterprise AWS migration tooling programs.
The Pro-Level Framing — Migration Tooling as a System, Not a Service
SAA-C03 asks "which one AWS data transfer service fits this workload?" SAP-C02 asks "which combination of AWS migration tooling services, in which order, over which network, with which validation gate, gives you a cutover you can actually run on a Saturday night with the CFO on the call?" The mental model shifts from service-picker to system-designer.
A Pro-level AWS migration tooling design has six moving parts, all of which must be answered before you start touching the console:
- Discovery tooling — AWS Migration Hub, Application Discovery Service (agentless and agent-based), DMS Fleet Advisor for databases, AWS Migration Evaluator for TCO.
- Server-layer replication tooling — AWS Application Migration Service (MGN) for block-level VM replication; historically AWS Server Migration Service (SMS) and CloudEndure, both now superseded by MGN.
- Database-layer replication tooling — AWS Database Migration Service (DMS) with Full Load, CDC, and Full Load + CDC task modes; AWS Schema Conversion Tool (SCT) for heterogeneous engine conversion.
- File and object-layer data transfer tooling — AWS DataSync for NFS/SMB/HDFS → S3/EFS/FSx; AWS Transfer Family for partner SFTP/FTPS/AS2 ingestion; AWS Snow Family for offline bulk at fleet scale.
- Streaming-layer replication tooling — Kafka MirrorMaker 2 and Amazon MSK Replicator for event-bus parity; Kinesis Data Streams cross-region replication where applicable.
- Network transport — AWS Direct Connect (dedicated, private, predictable bandwidth) vs internet-plus-VPN vs offline physical shipping; these are the pipes under every AWS migration tooling choice above.
Bai-Hua-Wen Explanation — AWS Migration Tooling in Plain English
Let me land three different analogies on AWS migration tooling, because the chain is genuinely counter-intuitive the first time you meet it.
Analogy 1: Moving House With a Professional Mover
Imagine you are moving a six-bedroom house across town on a strict schedule. You do not pack every item yourself the night before — that is the amateur mistake, and it is how migrations blow through their cutover windows. A professional mover sends a surveyor first (discovery: Application Discovery Service, DMS Fleet Advisor, Migration Evaluator), produces an inventory and quote (portfolio assessment, 7 Rs), and then rolls in with specialist crews: the piano mover (MGN for critical servers), the rare-book crew (DMS for databases, because books must stay in order), the document-shredder team (Snowball Edge for bulk that cannot fit in the van), and a quality inspector who matches the inventory list item-for-item on arrival (DMS validation tasks, DataSync verification, Snow chain-of-custody).
The moving truck itself is your network — AWS Direct Connect is the rented 53-foot trailer, the public internet is a rented van (slower, more unpredictable), and Snowball Edge is a sealed shipping container you fill in your driveway and hand to a logistics company. You do not load the piano onto a bicycle. You do not lift-and-shift a 50 TB Oracle database over a 100 Mbps VPN. Pick the right tool for the right cargo.
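That bicycle arithmetic is worth making concrete. A back-of-envelope transfer-time calculator follows; the efficiency factor is an assumption covering protocol overhead, retransmits, and business-hours throttling, not an AWS figure:

```python
def transfer_days(data_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Days to push `data_tb` terabytes over a `link_mbps` megabit/s link,
    assuming a sustained utilization given by `efficiency` (assumed factor
    for protocol overhead, throttling, and retransmits)."""
    bits = data_tb * 1e12 * 8                       # decimal terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86_400

# The "piano on a bicycle": 50 TB over a 100 Mbps VPN
print(round(transfer_days(50, 100), 1))                       # ~57.9 days
# The signature scenario: 600 TB over 10 Gbps Direct Connect at ~50% effective
print(round(transfer_days(600, 10_000, efficiency=0.5), 1))   # ~11.1 days
```

Two months for 50 TB over a VPN is exactly why the cargo-to-vehicle match matters.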
Analogy 2: An Airport Control Tower (Cutover Day)
Cutover day is air-traffic control, not a race. A well-run AWS migration tooling cutover looks like an airport tower managing twenty aircraft: there is a sequence (servers via MGN finalize first, then databases via DMS drain CDC to zero, then DNS flips), there are holding patterns (applications stay in read-only while final CDC drains), there are go-arounds (pre-authored rollback runbooks that reverse-replicate from AWS back to on-prem), and there is radio discipline (one change window, one incident channel, one decision-maker per layer).
Migration tooling gives you this choreography for free only if you use the services end-to-end. MGN's test cutover is the dress rehearsal — launch staging EC2 instances from the continuous replication stream, run smoke tests, tear them down, keep replicating. DMS's CDC mode is the holding pattern — every transaction the source database accepts during the cutover window gets replayed against the target until you close writes on the source.
Analogy 3: A Kitchen Brigade Prepping for Service
In a professional kitchen, mise en place means every ingredient is chopped, every station is stocked, every sauce is held at temperature before service starts. AWS migration tooling is mise en place for a cutover:
- MGN is the walk-in fridge — it holds a continuously-updated copy of every server, ready to launch in minutes.
- DMS Full Load + CDC is the stockpot at low simmer — the target database is warm, data is flowing in, the chef only needs to taste once before plating.
- SCT is the prep cook who has already julienned the Oracle stored procedures into PostgreSQL PL/pgSQL — so when the orders fire, you are cooking, not translating.
- DataSync, Transfer Family, Snow Family are the delivery lanes for bulk ingredients — they get the produce to the kitchen; they do not cook it.
- Direct Connect is the dedicated service corridor — no customer wanders through it, no delivery truck can block it, bandwidth is predictable, latency is tight.
Plain conclusion: AWS migration tooling at Pro depth is a brigade, not a service-of-the-week. Design it like a kitchen — pre-stage every ingredient, pre-rehearse every motion, and service (cutover) becomes the easiest hour of the week instead of the scariest.
Discovery — AWS Migration Hub, Application Discovery Service, DMS Fleet Advisor
Every AWS migration tooling engagement starts with "what do we have?" — and guessing is a career-limiting move at 600-VM scale.
AWS Application Discovery Service
AWS Application Discovery Service captures on-premises server, process, and dependency data and pushes it into AWS Migration Hub. Two modes:
- Agentless discovery via the Application Discovery Service Agentless Collector OVA, which hooks into VMware vCenter and reads inventory plus performance (CPU, RAM, disk I/O, network) without touching guest OSes. Use this for vSphere-heavy environments where you cannot install agents broadly.
- Agent-based discovery via the Application Discovery Service Agent installed on each server. Captures more granular data including running processes and TCP connections — the raw material for dependency maps.
Data lands in Migration Hub, which groups servers into applications, tracks migration status through 7 Rs labels, and presents dashboards of wave readiness.
DMS Fleet Advisor — Database Discovery Without an Agent
AWS migration tooling for databases starts with DMS Fleet Advisor. It runs a lightweight data collector on a Windows or Linux host that has network reach to your database servers, connects using read-only credentials, and inventories: database engine and version, schema size, object counts, feature usage (Oracle partitioning, SQL Server Always On, etc.), and I/O profiles. It outputs an engine-compatibility report — "your 14 Oracle schemas have 82 percent automatic conversion to Aurora PostgreSQL via SCT, with 340 manual action items in triggers and stored procedures."
DMS Fleet Advisor is specifically designed for heterogeneous migration planning at fleet scale. Before DMS Fleet Advisor existed, Pro architects ran SCT assessment reports per database manually, which does not scale to 500-database estates.
AWS Migration Evaluator
AWS Migration Evaluator (formerly TSO Logic) produces the business case — 3-year TCO comparing on-prem run-cost to the AWS target-architecture run-cost. It consumes Application Discovery Service inventory plus performance data and generates pricing with Savings Plans and Reserved Instance assumptions baked in. Migration Evaluator output is the artifact you put in front of the CFO before the Migration Acceleration Program (MAP) funding conversation.
AWS Application Migration Service (MGN) — The Server-Layer Workhorse
AWS Application Migration Service (MGN) is the current AWS-recommended server migration tooling. It replaces both AWS Server Migration Service (SMS) (retired March 2022) and CloudEndure Migration (merged into MGN). If an SAP-C02 question offers SMS or CloudEndure as options in 2026, they are distractors — the answer is MGN.
MGN Replication Model
MGN performs continuous block-level replication from source servers (physical, virtual, or other-cloud VMs) to a staging area subnet inside your AWS VPC. The flow:
- Agent installation — the AWS Replication Agent is installed on each source server (Windows or Linux). The agent reads every block on every attached disk and streams changes to AWS over TLS on port 1500.
- Staging area — a low-cost subnet where MGN provisions Replication Servers (t3.small EC2 instances by default) and staging EBS volumes that receive the replicated blocks. These are intentionally cheap; you pay while replicating, not at full target-instance cost.
- Continuous sync — after the initial full sync, the agent streams only changed blocks. Replication lag is typically seconds.
- Test cutover — a non-disruptive launch of the target EC2 instances from the staging volumes. Source keeps replicating. You smoke-test, then terminate the test instances without data loss. Run this multiple times before real cutover.
- Actual cutover — the application is stopped on the source, MGN drains the last changed blocks (MGN is block-level replication, so there is no CDC stream to manage), target instances are launched for real, and the source agent is eventually uninstalled.
MGN Launch Templates and Post-Launch Actions
MGN lets you pre-define launch templates per source server: target instance type, subnet, security group, IAM instance profile, EBS volume types (gp3 upgrade from gp2, io2 where IOPS required), and post-launch actions. Post-launch actions are SSM Automation documents that run after cutover — common examples: uninstall the MGN agent, install CloudWatch agent, register with Systems Manager, join the domain, apply tag-based configuration.
MGN Replication Settings
- Bandwidth throttling on the agent to avoid saturating the WAN during business hours.
- Private connectivity via VPC endpoints so replication traffic rides Direct Connect, never the internet.
- EBS encryption at the staging area (SSE-KMS with your CMK).
- Right-sized replication server type — larger replication servers (c5.large) for high-churn workloads.
When MGN Is the Answer (and When It Is Not)
- MGN yes: Rehost / lift-and-shift of existing servers. Tight cutover windows. Large fleets (10s–1000s of VMs). Cross-region or other-cloud-to-AWS.
- MGN no: Database servers where the source engine ≠ target engine — that is DMS + SCT territory. File servers where the target is S3/EFS/FSx — that is DataSync territory. Applications being refactored to serverless or containers — that is modernization, not migration tooling.
AWS Database Migration Service (DMS) — The Database-Layer Workhorse
AWS Database Migration Service (DMS) is AWS migration tooling for live, low-downtime database migration. It handles homogeneous moves (Oracle → Oracle on RDS, MySQL → MySQL on Aurora) and heterogeneous moves (Oracle → Aurora PostgreSQL, SQL Server → MySQL) when paired with the AWS Schema Conversion Tool.
DMS Architecture Components
- Replication instance — a managed EC2 instance in your VPC, running the DMS engine. Sized by throughput need: dms.t3.medium for dev, dms.c5.4xlarge to dms.r5.8xlarge for production-grade Pro workloads. Always deploy Multi-AZ replication instance for production migrations so replication survives an AZ failure mid-migration.
- Source endpoint and target endpoint — connection definitions with credentials (via Secrets Manager integration), TLS settings, and extra connection attributes tuned per engine.
- Replication task — the unit of migration. Binds a source endpoint, a target endpoint, a migration type, table mappings, and task settings (error handling, LOB mode, validation).
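A replication task's table mappings are plain JSON. A minimal sketch follows: the rule structure matches the documented DMS selection/transformation rule format, though the schema names, rule IDs, and rule names here are illustrative.

```python
import json

# Minimal DMS table-mapping document: include every table in schema HR,
# and rename the target schema to lower-case "hr" (PostgreSQL convention).
table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-hr",                     # illustrative name
            "object-locator": {"schema-name": "HR", "table-name": "%"},
            "rule-action": "include",
        },
        {
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "lowercase-schema",               # illustrative name
            "rule-target": "schema",
            "object-locator": {"schema-name": "HR"},
            "rule-action": "rename",
            "value": "hr",
        },
    ]
}

print(json.dumps(table_mappings, indent=2))
```

This document is what you paste into the task's table-mappings field; task settings (LOB mode, error handling, validation) are a separate JSON document.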
DMS Task Types
- Full Load — one-time copy of existing data. Snapshot-style. No ongoing sync.
- Full Load + CDC — full load followed by Change Data Capture streaming. This is the Pro default for low-downtime cutovers. DMS reads source redo logs (Oracle), binlogs (MySQL/MariaDB), WAL (PostgreSQL), or transaction logs (SQL Server) and replays every DML operation against the target until you tell it to stop.
- CDC Only — apply ongoing changes only. Used when you have seeded the target by another method (e.g., a Snowball-shipped initial load) and only need to catch up the delta.
- Replication Ongoing — continuous replication, no defined end. Used for read-scaling patterns or cross-region live replication, not pure migration.
DMS Migration Type Selection Rule
The selection rule at Pro depth: use Full Load + CDC for every production cutover. Full Load alone guarantees downtime equal to the full-load duration (unacceptable for 600 TB). CDC Only requires you to pre-seed correctly. Full Load + CDC is the only option that both seeds the target and keeps it in sync while you plan a calm cutover window.
DMS Validation
DMS includes a validation mode that runs alongside the migration task or as a standalone task. For every row migrated, DMS re-reads the source and target, compares via row-level checksums, and reports mismatches as validation failures. This is the SAP-C02-level answer for "how do we prove the target matches the source." Do not trust row counts alone — validation tasks catch nulls, encoding drift, trigger-induced modifications, and precision loss.
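The shape of that row-level comparison can be sketched in a few lines. This is a simplified stand-in for what DMS validation does internally, not its actual algorithm:

```python
import hashlib

def row_checksum(row: tuple) -> str:
    """Stable per-row digest: canonicalize each column to text, then hash."""
    canonical = "|".join("" if c is None else str(c) for c in row)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def validate(source_rows: dict, target_rows: dict) -> list:
    """Compare keyed row sets; return primary keys whose checksums differ
    (or that exist on only one side)."""
    mismatched = []
    for pk in source_rows.keys() | target_rows.keys():
        s, t = source_rows.get(pk), target_rows.get(pk)
        if s is None or t is None or row_checksum(s) != row_checksum(t):
            mismatched.append(pk)
    return sorted(mismatched)

source = {1: ("Ada", 100.00), 2: ("Grace", None)}
target = {1: ("Ada", 100.00), 2: ("Grace", 0.0)}   # NULL silently became 0.0
print(validate(source, target))                     # [2]
```

Note that a plain row count would report 2 = 2 and pass; the checksum catches the NULL-to-zero drift, which is exactly the class of bug the note warns about.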
DMS Fleet Advisor (Migration Planning)
Covered earlier in discovery — the planning-phase companion to DMS. Produces an engine-compatibility report per source database so you can wave-plan heterogeneous migrations correctly.
DMS Serverless (Newer Option)
DMS Serverless eliminates the replication-instance sizing decision — you provide the source, target, and task config; DMS auto-scales capacity. Good for variable-throughput migrations; less predictable cost. For exam purposes, know it exists and that it simplifies the "which replication instance size" headache.
AWS Schema Conversion Tool (SCT) — Heterogeneous Engine Conversion
AWS Schema Conversion Tool (SCT) is the offline desktop application (Windows, macOS, Linux) that converts schema objects from one engine to another: tables, indexes, constraints, views, stored procedures, functions, triggers, packages, sequences.
SCT Conversion Model
SCT connects to the source database (read-only) and the target database (write) and performs:
- Assessment report — generates a per-object conversion rate. Example output: "82 percent of Oracle schema auto-converts to Aurora PostgreSQL; 18 percent (342 objects) require manual action, mostly in PL/SQL packages and trigger bodies."
- Automatic conversion — converts the mechanically-translatable DDL (tables, most constraints, simple views).
- Action items list — produces a ticket-able list of objects that need manual translation (complex PL/SQL → PL/pgSQL, Oracle `DBMS_*` packages, cross-schema references, datatype gaps).
- Apply to target — writes the converted DDL to the target database.
SCT Engine Pairs and Conversion Rates
Typical conversion success rates (rough guide — your mileage varies):
- Oracle → Aurora PostgreSQL: 70–90 percent automatic (the common exam scenario).
- Oracle → Aurora MySQL: 60–85 percent automatic.
- SQL Server → Aurora PostgreSQL: 75–90 percent.
- SQL Server → Aurora MySQL: 70–85 percent.
- MySQL → PostgreSQL: 85–95 percent.
- MongoDB → DocumentDB: mostly mechanical since DocumentDB is wire-protocol compatible.
- Cassandra → Amazon Keyspaces: mechanical mapping of CQL schemas.
SCT Extension Packs and Application Conversion
SCT ships extension packs that emulate source-engine features on the target — e.g., Oracle DBMS_OUTPUT or UTL_FILE equivalents for PostgreSQL. Critical for minimizing application code changes.
SCT also performs application SQL conversion — point it at your application source tree and it will rewrite embedded SQL from Oracle dialect to PostgreSQL dialect. Coverage varies; treat the output as a draft, not final.
The SCT-then-DMS Sequence
The correct order for heterogeneous migration with AWS migration tooling:
- SCT assessment report — quantify conversion work up-front.
- SCT automatic conversion — generate the target schema DDL.
- Manual remediation — developers fix action items.
- Apply schema to target — CREATE the target Aurora cluster's objects.
- DMS replication task (Full Load + CDC) — DMS now has a compatible target to write data into.
- DMS validation task — prove the data moved correctly.
- Cutover — drain CDC, stop source writes, repoint application.
Running DMS before SCT on a heterogeneous pair fails, because DMS only moves data, not schema. That is the single most common DMS-architecture question on the SAP-C02.
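For a sense of what the automatic conversion layer handles versus flags, here is an illustrative (not exhaustive) slice of the Oracle → PostgreSQL datatype mapping; the specific mappings follow common conversion guidance rather than SCT's internal table, and length/precision handling is elided:

```python
# Illustrative subset of an Oracle -> PostgreSQL datatype mapping.
ORACLE_TO_PG = {
    "VARCHAR2": "VARCHAR",
    "NVARCHAR2": "VARCHAR",
    "NUMBER": "NUMERIC",
    "DATE": "TIMESTAMP(0)",   # Oracle DATE carries time-of-day
    "CLOB": "TEXT",
    "BLOB": "BYTEA",
    "RAW": "BYTEA",
}

def convert_column(oracle_type: str) -> str:
    """Return the PostgreSQL type, or flag the column as a manual action item.
    Length/precision arguments are dropped in this sketch."""
    base = oracle_type.split("(")[0].upper()
    return ORACLE_TO_PG.get(base, f"MANUAL_ACTION_ITEM({oracle_type})")

print(convert_column("NUMBER(10,2)"))   # NUMERIC
print(convert_column("SDO_GEOMETRY"))   # MANUAL_ACTION_ITEM(SDO_GEOMETRY)
```

The flagged types — spatial, XMLType, object types — are where the 18-to-22-percent manual effort concentrates.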
AWS Snow Family at Fleet Scale
SAA-C03 treats Snowball Edge as a single-device offline transfer pick. SAP-C02 treats it as a fleet — tens to hundreds of devices dispatched in parallel to evacuate a datacenter in weeks rather than months.
Current Snow Family Lineup (2026)
- AWS Snowcone — 8 TB HDD or 14 TB SSD; 2 vCPU, 4 GB RAM; ships in a standard courier envelope. Use for edge / tactical / single-digit-TB one-offs.
- AWS Snowball Edge Storage Optimized — ~80 TB usable HDD, or 210 TB NVMe SSD on the newer variant. 40 vCPU, 80 GB RAM.
- AWS Snowball Edge Compute Optimized — ~42 TB usable; 52 vCPU, 208 GB RAM; optional NVIDIA V100 GPU. Edge compute at disconnected sites.
- AWS Snowmobile — retired for new orders. Historically a 45-foot truck with up to 100 PB. For any new 2026-era exabyte-scale scenario, the AWS-recommended answer is a parallel fleet of Snowball Edge Storage Optimized devices, not Snowmobile.
Snow Family Fleet Topology for 600 TB
For the signature 600 TB scenario, the Pro-level design is:
- Order 8 × Snowball Edge Storage Optimized (each ~80 TB usable). Total capacity ~640 TB. Ship in two waves so you always have devices on-site while others are in transit.
- Parallel load — spin up 8 load sessions, one per device, using the AWS OpsHub for Snow Family application or the `snowball` CLI. Target throughput per device ~1 Gbps on the local loading network.
- Chain of custody — each device is tracked via AWS tamper-evident seals and a KMS-wrapped encryption key (the key itself never resides on the device).
- Return shipping — AWS provides return labels; devices are imported to S3 at the AWS Region automatically on receipt.
- Post-import verification — compare the S3 ETags and object counts against the source inventory before marking the job done.
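The fleet-sizing arithmetic behind those numbers can be sketched directly; the 80 TB usable capacity and ~1 Gbps per-device loading rate are the figures assumed above:

```python
import math

def snowball_fleet(total_tb: float, device_tb: float = 80,
                   load_gbps_per_device: float = 1.0) -> dict:
    """Devices needed, and local loading time when all devices load in
    parallel. Assumes 80 TB usable per Snowball Edge Storage Optimized
    (HDD variant) and ~1 Gbps sustained per device on the loading LAN."""
    devices = math.ceil(total_tb / device_tb)
    tb_per_device = total_tb / devices
    load_days = (tb_per_device * 1e12 * 8) / (load_gbps_per_device * 1e9) / 86_400
    return {"devices": devices, "parallel_load_days": round(load_days, 1)}

print(snowball_fleet(600))   # {'devices': 8, 'parallel_load_days': 6.9}
```

Eight devices, roughly a week of parallel loading — which is why the two-wave shipping cadence works.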
Snow Edge Compute for Pre-Processing
Snowball Edge Compute Optimized can run EC2 instances and Lambda on-device. For media transcoding, PII redaction, or compression before shipment, you can mutate data on the Snow device itself so the bytes that land in S3 are already in the target format. Pro-level cost lever; rarely tested, but occasionally appears as a distractor.
AWS DataSync for Filer Migrations
AWS DataSync is the online AWS migration tooling for filer-to-cloud moves: NFS, SMB, HDFS, and object-store sources migrating to S3, Amazon EFS, Amazon FSx for Windows File Server, FSx for Lustre, FSx for OpenZFS, or FSx for NetApp ONTAP.
DataSync for NFS/SMB Migration — Pro Patterns
- Agent placement — a DataSync agent VM (VMware, Hyper-V, KVM, or EC2) inside the source network. For very large filers, deploy multiple agents in parallel, each pointed at a different prefix, to aggregate throughput.
- Scheduled incremental transfer — cron-like schedule copies only deltas. Use this to reduce the final-cutover sync to minutes.
- Bandwidth throttling — cap agent throughput during business hours; remove the cap overnight.
- Verification mode — `POINT_IN_TIME_CONSISTENT` (default) rescans and verifies the entire dataset at task end; `ONLY_FILES_TRANSFERRED` is faster for incrementals. For compliance-heavy migrations use the stricter mode.
- VPC endpoint routing — DataSync traffic over PrivateLink + Direct Connect keeps bytes off the public internet.
DataSync for HDFS
DataSync supports HDFS as a source — direct Hadoop-cluster-to-S3 migration without intermediate staging. The agent reads HDFS via Kerberos authentication and writes to S3 preserving metadata. This is the Pro answer for "migrate our on-prem Hadoop data lake to S3, queried via Athena."
DataSync Discovery
DataSync Discovery (a newer capability) scans on-premises storage arrays and profiles capacity, access patterns, and growth — then recommends AWS storage targets. Useful during assessment, rarely tested but worth knowing it exists.
DataSync vs Snow Family vs Direct Connect — Pro Decision
At SAA level the decision is "days-of-online > 7 → Snow." At Pro level you add:
- Hybrid use — DataSync over Direct Connect for continuous incrementals, Snow for the initial bulk. You do both, not either-or. Snow handles the 95 percent bulk that would saturate the WAN; DataSync over Direct Connect handles the 5 percent of daily changes during the cutover window.
- Cost optimization — Snow flat per-device fee vs DataSync per-GB. Above ~500 TB on a sub-1-Gbps link, Snow dominates. Above ~1 Gbps dedicated, DataSync catches up.
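The elapsed-time half of that decision reduces to a single comparison. In the sketch below, the 7-day Snow turnaround (ship plus import) and the 0.8 link-efficiency factor are assumptions for illustration, not AWS SLAs:

```python
def online_days(tb: float, gbps: float, eff: float = 0.8) -> float:
    """Days to move `tb` terabytes online at `gbps` gigabits/s sustained."""
    return tb * 1e12 * 8 / (gbps * 1e9 * eff) / 86_400

def choose(tb: float, gbps: float, snow_turnaround_days: float = 7) -> str:
    """Pick online (DataSync over the wire) vs offline (Snow fleet) on
    elapsed time alone; cost (per-device fee vs per-GB) is a separate axis."""
    return "online" if online_days(tb, gbps) <= snow_turnaround_days else "snow"

print(choose(600, 1))    # snow   (~69 days online at 1 Gbps)
print(choose(600, 10))   # online (~7 days at 10 Gbps sustained)
```

In practice the hybrid pattern above still wins: Snow for the bulk so the 10 Gbps pipe stays free for CDC and incrementals.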
AWS Transfer Family for Partner File Drops
AWS Transfer Family exposes managed SFTP, FTPS, FTP, and AS2 endpoints backed by S3 or EFS. At Pro scope it enters the AWS migration tooling conversation in two shapes:
- Migrating an existing on-prem SFTP/FTPS/AS2 gateway to AWS without breaking partner integrations. Partners keep the same credentials and directory layout (via logical directories). You get a managed endpoint with no self-managed SFTP server to patch.
- Post-migration steady state for partner-initiated file ingestion into the AWS landing zone.
Transfer Family Deployment Modes
- Service-managed users for small partner lists.
- AWS Managed Microsoft AD / AD Connector for enterprise identity federation.
- Custom Lambda-backed identity provider for Okta, Azure AD, or home-grown user stores.
- Managed workflows — event-driven pipelines on file arrival (decrypt, validate, route to downstream).
When Transfer Family Is the Answer
Transfer Family is not an AWS migration tooling service for your own data — for that you use MGN, DMS, DataSync, or Snow. Transfer Family is migration tooling for migrating a partner-facing file-exchange surface off on-premises into AWS without breaking external integrations. Trigger words: "300 external partners", "SFTP", "EDI", "AS2", "customers push files to us daily".
Direct Connect vs Internet for Cutover Bandwidth
Network transport is half of any AWS migration tooling decision at Pro scope.
Direct Connect for Migration
- Dedicated ports: 1 Gbps, 10 Gbps, 100 Gbps. Hosted connections from Direct Connect partners cover sub-1-Gbps.
- Private, predictable latency — migrations with tight CDC-lag SLOs (sub-second for DMS) need Direct Connect. Public-internet VPN lag spikes break CDC replication.
- Lower egress cost — for migrations that require reverse replication (AWS → on-prem rollback), per-GB Direct Connect egress is materially cheaper than internet egress.
- Virtual Interfaces (VIFs): Private VIF to a VPC, Public VIF to AWS public services (including S3), Transit VIF to Transit Gateway.
Direct Connect Redundancy for Migration Cutover
For the actual cutover weekend, you want two Direct Connect circuits in different locations or Direct Connect + Site-to-Site VPN backup. A circuit flap during the final CDC drain is a career-limiting event. Active/active via LAG or active/passive via two circuits; VPN as a tertiary fallback.
Internet-plus-VPN for Migration
Viable for:
- Small migrations (< 10 TB).
- Dev / staging migrations where cutover timing is loose.
- Environments where Direct Connect is not yet provisioned (provisioning can take weeks).
Not viable for:
- Large (> 50 TB) migrations with tight cutover windows.
- DMS CDC-heavy cutovers where latency jitter breaks replication SLOs.
- Regulated workloads where compliance mandates private transport.
Application-Layer Streaming Replication
Enterprise migrations almost always carry streaming systems — Kafka clusters, Kinesis streams — that must reach parity alongside databases and files.
Kafka MirrorMaker 2 for Self-Managed Kafka
Kafka MirrorMaker 2 (MM2) is the open-source Kafka-to-Kafka replicator. Runs as a Kafka Connect cluster. Replicates topics, consumer-group offsets, and ACLs from source to target. Used to migrate self-managed Kafka on-prem → self-managed Kafka on EC2 → eventually Amazon MSK.
Amazon MSK Replicator
Amazon MSK Replicator is the AWS-managed equivalent for MSK-to-MSK replication. Targets:
- Cross-region MSK replication for DR or read-local / write-central topologies.
- Same-region MSK migration between clusters (e.g., version upgrades without downtime).
- Active/active multi-region MSK topologies, with consumer-group offset synchronization so consumers can fail over between clusters.
MSK Replicator is simpler than rolling your own MM2 because there is no Kafka Connect cluster to operate. For migration into MSK from self-managed Kafka, MSK Connect (which runs Kafka Connect as a managed service) plus MM2 remains the pattern.
Kinesis Cross-Region Considerations
Kinesis Data Streams has no native cross-region replication; you replicate via a Kinesis Client Library (KCL) consumer that reads from Region A and writes to Region B, or via Amazon Data Firehose to S3 with cross-region replication. Rarely the primary migration answer but shows up in multi-region DR patterns.
Data Consistency Validation — Proving the Move Is Safe to Cut Over
At Pro scope, "we copied the data" is not sufficient; "we proved the data is identical and that the application works against the target" is. AWS migration tooling ships validation primitives per layer.
DMS Validation
DMS validation tasks re-read every migrated row from both source and target and compare column-by-column checksums. Mismatches surface as validation failures in the task's validation metrics. Run validation as part of the Full Load + CDC task (concurrent validation) rather than post-hoc; catches issues while replication is still running.
DataSync Verification
DataSync verification modes (covered earlier) compare object metadata and optionally content checksums. For compliance-heavy filer migrations, run POINT_IN_TIME_CONSISTENT mode as a final pre-cutover pass.
Snow Family Chain-of-Custody
Snow devices compute object-level SHA-256 checksums on load. On import to S3, AWS re-computes and compares. Any mismatch blocks the import and logs it to CloudTrail. You audit via the Snow job's import log.
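The Snow service performs this digest comparison itself; what follows is a sketch of the equivalent source-side audit you might keep for your own chain-of-custody records (streaming SHA-256 over a manifest of object keys — the key names and payload are invented):

```python
import hashlib
import io

def sha256_of(stream: io.BufferedIOBase) -> str:
    """Streaming SHA-256 so multi-GB objects never load fully into memory."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(1 << 20), b""):   # 1 MiB chunks
        h.update(chunk)
    return h.hexdigest()

def verify(manifest: dict, imported: dict) -> list:
    """Return keys whose post-import digest differs from the load-time digest."""
    return sorted(k for k in manifest if imported.get(k) != manifest[k])

payload = b"critical-archive-bytes"
manifest = {"archive/part-0001": sha256_of(io.BytesIO(payload))}   # at load time
imported = {"archive/part-0001": sha256_of(io.BytesIO(payload))}   # after S3 import
print(verify(manifest, imported))   # [] -> clean import
```

An empty diff is your audit evidence; any non-empty result maps to the mismatches the Snow import log would have blocked.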
MGN Drill Cutover as Validation
MGN's test cutover is functional validation — launch target instances non-disruptively, smoke-test at the application level, tear down, continue replicating. Run test cutovers at least twice before real cutover.
Application-Layer Smoke Tests
Beyond AWS-native validation, the Pro playbook mandates application-level validation:
- Read-only traffic replay — replay production traffic logs against the target database in read-only mode; compare response counts and latency distributions.
- Differential row counts per table — `SELECT COUNT(*)` per table, diff the results, investigate any drift > 0.
- Checksum-based reconciliation — application-computed hashes over critical business aggregates (total outstanding balance, total inventory count).
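The aggregate-reconciliation gate can be sketched as a simple drift check; the metric names and figures below are invented for illustration:

```python
from decimal import Decimal

def reconcile(source_aggs: dict, target_aggs: dict,
              tolerance: Decimal = Decimal("0")) -> dict:
    """Diff business-level aggregates between source and target; any drift
    beyond `tolerance` blocks the cutover gate."""
    drifts = {}
    for name, s in source_aggs.items():
        t = target_aggs.get(name)
        if t is None or abs(s - t) > tolerance:
            drifts[name] = (s, t)
    return drifts

# Hypothetical aggregates computed on each side just before cutover:
oracle = {"outstanding_balance": Decimal("1203991.45"),
          "row_count_orders": Decimal("884210")}
aurora = {"outstanding_balance": Decimal("1203991.45"),
          "row_count_orders": Decimal("884209")}
print(reconcile(oracle, aurora))   # flags row_count_orders: one row missing
```

Decimal (not float) matters here: a migration gate comparing money must not inherit floating-point rounding drift of its own.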
Signature Scenario — 600 TB On-Prem Oracle → Aurora PostgreSQL, Less-Than-One-Hour Cutover
This is the canonical SAP-C02 AWS migration tooling problem. Walk through it end-to-end.
Requirements Recap
- Source: Oracle 19c, single datacenter, 600 TB total database size (mix of hot and cold partitions).
- Target: Amazon Aurora PostgreSQL-compatible, Multi-AZ, in ap-northeast-1.
- Cutover budget: Less than one hour of application read/write downtime.
- Network: Existing 10 Gbps Direct Connect; option to provision a second 10 Gbps circuit for redundancy.
- Compliance: Data must not traverse the public internet; encryption in transit and at rest mandated.
Step 1 — Discovery and Planning
- Run DMS Fleet Advisor to inventory the Oracle schemas (tables, indexes, PL/SQL packages, triggers). Produces per-schema conversion-rate estimate.
- Run an AWS SCT assessment report on the specific databases in scope. Output: 78 percent automatic conversion; 22 percent manual action items, mostly PL/SQL and Oracle-specific `DBMS_*` usage.
- Decision: heterogeneous migration via SCT + DMS. Book 4 weeks of engineering time for manual conversion of PL/SQL to PL/pgSQL.
Step 2 — Network Preparation
- Provision a second 10 Gbps Direct Connect circuit in a different location for redundancy.
- Configure active/active via LAG so both circuits are live during migration.
- Add a Site-to-Site VPN as tertiary backup.
- VPC endpoint for DMS and S3 so all migration traffic stays on private transport.
Step 3 — Schema Conversion
- Use AWS SCT to generate target Aurora PostgreSQL DDL for all schemas.
- Apply auto-converted DDL to a staging Aurora cluster.
- Developers manually remediate action items (PL/SQL → PL/pgSQL, package-to-schema remapping, datatype adjustments).
- Unit-test the converted procedures against a sample dataset loaded via DMS Full Load into the staging cluster.
Step 4 — Initial Data Seeding (Option A: Full Load over Direct Connect)
- Provision a DMS replication instance in Multi-AZ mode, sized dms.r5.8xlarge for throughput.
- Deploy the production Aurora PostgreSQL cluster (Multi-AZ, appropriate instance size, encryption on, backup retention configured).
- Launch a Full Load + CDC task from Oracle source → Aurora PostgreSQL target.
- Over the 10 Gbps Direct Connect, throughput of ~5 Gbps effective means 600 TB completes the full load in roughly 10–14 days.
- During full load, DMS caches changes captured from the source and applies them to each table once that table's full load completes; ongoing CDC apply then continues until cutover.
Step 4 — Initial Data Seeding (Option B: Snowball Edge Fleet + CDC)
For even faster initial seed without saturating the Direct Connect for two weeks:
- Export Oracle snapshot (RMAN / Data Pump) to local staging storage.
- Load onto an 8-device Snowball Edge Storage Optimized fleet (total ~640 TB capacity).
- Ship to AWS; data imports to S3 in ~7 calendar days total.
- Use a DMS task with S3 as source endpoint to load the snapshot into Aurora.
- Simultaneously run a DMS CDC Only task from live Oracle against Aurora, using the SCN at snapshot time as starting point.
Option B is the Pro-preferred answer for a 600 TB move: offline bulk for the hot majority, online CDC for the trickling delta. Direct Connect stays free for CDC traffic instead of being saturated by full load.
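The fleet size in Option B is a ceiling division over per-device capacity. A sketch, assuming 80 TB usable per Snowball Edge Storage Optimized device (the HDD variant quoted in this note):

```python
import math

# Snow fleet sizing for Option B: devices needed to hold the payload.
# Assumption: 80 TB usable per Snowball Edge Storage Optimized device.

def fleet_size(payload_tb: float, per_device_tb: float = 80) -> int:
    """Number of devices required, rounded up."""
    return math.ceil(payload_tb / per_device_tb)

devices = fleet_size(600)
print(devices)        # 8 devices for 600 TB
print(devices * 80)   # 640 TB aggregate capacity
```

Eight devices in parallel, as the plan above specifies; the ~7-calendar-day end-to-end figure covers load, shipping, and S3 import, not just device fill time.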
Step 5 — CDC Steady State
- Once initial seed is complete, CDC replicates every Oracle transaction to Aurora with sub-second lag.
- Monitor the CDCLatencySource and CDCLatencyTarget CloudWatch metrics. Alert on lag > 30 seconds.
- Run concurrent DMS validation tasks — DMS re-reads every migrated row from both source and target and reports mismatches continuously.
- Resolve any validation failures before scheduling cutover.
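The cutover gate implied by Step 5 is a polling loop: do not proceed until the lag metric reaches zero. In production the reader would call CloudWatch (namespace AWS/DMS, metric CDCLatencyTarget); here the metric source is an injected callable so the gate logic itself is what the sketch shows:

```python
from typing import Callable, Iterator

# Cutover-gate sketch: block until CDC lag drains to zero, then allow the flip.
# read_lag is a stand-in for a CloudWatch CDCLatencyTarget query.

def wait_for_zero_lag(read_lag: Callable[[], int], max_polls: int = 10) -> bool:
    """Poll the lag metric; True once it hits 0, False if the budget runs out."""
    for _ in range(max_polls):
        if read_lag() == 0:
            return True
    return False

# Simulated drain: lag falls 25 -> 12 -> 4 -> 0 seconds across polls.
samples: Iterator[int] = iter([25, 12, 4, 0])
print(wait_for_zero_lag(lambda: next(samples)))  # True: safe to stop the task
```

The False branch matters operationally: a lag that refuses to drain inside the poll budget is the signal to invoke the rollback runbook rather than flip the connection string.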
Step 6 — Test Cutovers
- Launch test cutover of the application tier (via MGN if app servers are also migrating, or via blue-green deployment if app servers already live in AWS).
- Point the test application tier at the Aurora target.
- Run read-only smoke tests, reconciliation queries, and application-level integration tests.
- Tear down test environment. Keep DMS CDC running.
- Repeat test cutover at least twice, refining runbook each time.
Step 7 — Actual Cutover (Under the One-Hour Budget)
Cutover runbook, time-budgeted:
- T+00:00 — Change window opens. (Route 53 TTL was pre-lowered to 60 seconds 24 hours earlier so DNS caches have drained.)
- T+00:05 — Put application into read-only mode on source Oracle; block writes at the application layer.
- T+00:10 — Wait for DMS CDC lag to drain to zero: monitor until CDCLatencyTarget = 0.
- T+00:20 — Stop the DMS task cleanly. Capture the final SCN from the DMS task logs.
- T+00:22 — Run final reconciliation query (row counts per table, critical business aggregate checksums) against both Oracle and Aurora. Confirm identity.
- T+00:30 — Flip application DB connection string (via Secrets Manager rotation or config flag) from Oracle to Aurora PostgreSQL.
- T+00:40 — Enable writes on the application tier. Monitor application error rates and the Aurora WriteThroughput metric.
- T+00:50 — Run the post-cutover application smoke test suite.
- T+00:55 — Declare cutover complete. Keep Oracle source up (read-only) for 72 hours as rollback safety net.
- T+72:00 — Decommission Oracle source.
Total application write downtime: ~35 minutes (read-only mode at T+00:05 through writes enabled at T+00:40). Well inside the 1-hour budget.
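The runbook can be sanity-checked mechanically: confirm every milestone lands inside the 60-minute window and compute the write-unavailability span. A sketch with the minute marks copied from the runbook above (step names are illustrative labels):

```python
# Time-budget check for the cutover runbook.
# Downtime = write freeze (app read-only) to writes re-enabled.

runbook = [
    ("window opens",        0),
    ("app read-only",       5),
    ("lag drained",        10),
    ("DMS task stopped",   20),
    ("reconciliation",     22),
    ("connection flipped", 30),
    ("writes enabled",     40),
    ("smoke tests",        50),
    ("cutover declared",   55),
]

steps = dict(runbook)
downtime = steps["writes enabled"] - steps["app read-only"]
print(downtime)                       # 35 minutes of write unavailability
print(max(t for _, t in runbook))     # 55 minutes: inside the 60-minute window
```

Rehearsing the test cutovers (Step 6) is what makes these minute marks trustworthy; the arithmetic only proves the plan fits the window, not that the team can execute it.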
Step 8 — Rollback Safety Net
Pre-authored rollback runbook:
- If cutover fails at any step, re-enable writes on Oracle source.
- Stop the application.
- Re-flip the connection string back to Oracle.
- Investigate post-mortem.
- For reverse replication of any Aurora writes that occurred before rollback, configure a reverse DMS task (Aurora PostgreSQL → Oracle) during cutover-preparation weeks. This is the "break-glass" option that almost all SAP-C02 Pro answers include.
Orchestrating the AWS Migration Tooling Chain with Migration Hub
AWS Migration Hub is the single pane of glass for AWS migration tooling status across MGN, DMS, DataSync, and Snow jobs. At Pro scope it serves three purposes:
- Portfolio tracking — group servers and databases into applications, track each through 7 Rs status (Retire, Retain, Rehost, Relocate, Replatform, Repurchase, Refactor).
- Wave visibility — per-wave readiness dashboards.
- Cross-service event integration — MGN launch events, DMS task status, DataSync task completion all stream into Migration Hub.
Migration Hub Refactor Spaces
AWS Migration Hub Refactor Spaces provides a managed infrastructure for strangler-fig application-modernization migrations — incrementally peel services off a monolith into new microservices. Not strictly "data transfer" but appears in SAP-C02 modernization questions adjacent to migration tooling.
Security Considerations Across the Chain
Every AWS migration tooling service has a security posture that must be hardened for production:
- MGN replication traffic over VPC endpoint + Direct Connect; staging EBS volumes encrypted with KMS CMK; agent traffic always TLS 1.2+.
- DMS replication instance in private subnet; endpoint credentials in Secrets Manager; SSL to source and target; replication instance storage encrypted.
- SCT runs on a workstation — never commit SCT projects containing source credentials to version control; use the SCT credential vault.
- DataSync agents authenticate to AWS via activation key + IAM role; traffic TLS over VPC endpoint; destination encryption (SSE-KMS) mandatory.
- Snow Family — 256-bit encryption, keys in AWS KMS never on device, tamper-evident seal, TPM attestation on return.
- Transfer Family — SFTP (SSH), FTPS (TLS), AS2 (S/MIME); VPC endpoint for internal-only partner networks.
- Direct Connect — circuit is private but not encrypted by default; layer IPsec VPN over the public VIF for end-to-end encryption where compliance requires.
Common SAP-C02 Traps for AWS Migration Tooling
Trap 1: Treating MGN as Interchangeable with DMS
MGN migrates servers (block-level VM replication). DMS migrates databases (row-level replication). If a scenario says "migrate an Oracle database," MGN will technically copy the server, but you lose all the database-layer benefits (heterogeneous conversion, CDC, validation). DMS + SCT is correct. If the scenario says "lift-and-shift this Linux web server," MGN is correct and DMS is wrong.
Trap 2: Running DMS Without SCT for Heterogeneous Migrations
DMS moves data; SCT converts schema. Heterogeneous engines (Oracle → Aurora PostgreSQL) require SCT first to create the target schema and translate stored procedures. Skipping SCT means DMS errors on CREATE and your "migration" is dead at step one.
Trap 3: Picking Snowmobile in 2026
Snowmobile is retired for new orders. For exabyte-scale migrations the contemporary answer is a fleet of Snowball Edge Storage Optimized devices in parallel. Snowmobile appearing as an option is usually a distractor in 2026-dated questions.
Trap 4: Assuming Direct Connect Is the Data Transfer Service
Direct Connect is the pipe. DMS, DataSync, MGN, and Storage Gateway run over it. Any answer that says "use Direct Connect to migrate 600 TB" without naming the transfer service is incomplete. Pro answers always specify the stack: "DataSync over Direct Connect" or "DMS Full Load + CDC over Direct Connect."
Trap 5: Skipping the Validation Gate
Any answer that goes "copy, cut over, done" skips the validation phase and is wrong. DMS validation tasks, DataSync verification modes, Snow checksums, and application-level smoke tests are non-optional for Pro-grade migration. The SAP-C02 exam explicitly tests whether you include the validation gate.
Trap 6: Single-Service Answers for Multi-Layer Workloads
A 600-VM + 50-database + 400-TB-filer migration needs MGN + DMS/SCT + DataSync/Snow in parallel, not one service. Single-service answers for multi-layer workloads are distractors.
Trap 7: Underestimating Cutover Network Redundancy
One Direct Connect circuit during a cutover is a single point of failure. Pro answers always include redundant Direct Connect (LAG or dual-location) and often a VPN backup. Scenarios with "minimize cutover risk" are signalling this expectation.
Trap 8: Confusing DMS Fleet Advisor with SCT
DMS Fleet Advisor is fleet-scale discovery and planning (hundreds of databases). SCT is per-database schema conversion. You run DMS Fleet Advisor to triage the fleet, then run SCT on each database to convert. Using SCT alone at fleet scale is operationally infeasible; using DMS Fleet Advisor without SCT skips the actual conversion.
Trap 9: Running CDC Over the Public Internet
DMS CDC has tight latency SLOs. Public-internet jitter breaks CDC and produces ever-growing replication lag. Every Pro CDC cutover runs over Direct Connect.
Trap 10: Ignoring Kafka and Kinesis
Migrations with streaming layers (Kafka, Kinesis) need MSK Replicator or Kafka MirrorMaker 2 in the plan. Answers that migrate the database but forget the event bus are incomplete.
Key Numbers and Patterns to Memorize for SAP-C02 AWS Migration Tooling
- MGN replication — continuous block-level, TLS port 1500, sub-minute cutover RTO, test cutover is non-disruptive.
- DMS task types — Full Load (migrate existing data), Full Load + CDC (migrate existing data and replicate ongoing changes), CDC Only (replicate changes only).
- DMS production best practice — Multi-AZ replication instance, concurrent validation task, credentials via Secrets Manager.
- SCT — runs before DMS for heterogeneous; produces assessment report; converts schema plus extension packs for feature parity.
- DMS Fleet Advisor — fleet-scale database discovery; engine-compatibility report output.
- Snow Family 2026 — Snowcone 8/14 TB, Snowball Edge Storage Optimized ~80 TB HDD or ~210 TB SSD, Snowball Edge Compute Optimized ~42 TB + optional V100 GPU, Snowmobile retired for new orders.
- DataSync throughput — up to ~10 Gbps per agent; supports NFS, SMB, HDFS, S3-compatible; POINT_IN_TIME_CONSISTENT verification default.
- Transfer Family — SFTP / FTPS / FTP / AS2; S3 or EFS backends; logical directories to hide bucket paths; managed workflows for on-arrival processing.
- MSK Replicator — AWS-managed Kafka-to-Kafka replication; cross-region, same-region, active/active topologies.
- Direct Connect for migration — 1/10/100 Gbps ports; always redundant for cutover; not encrypted by default (layer VPN for compliance).
- Cutover arithmetic — for 600 TB over 10 Gbps at ~60 percent efficiency, full load takes ~10–14 days online; Snowball Edge fleet cuts this to ~7 calendar days end-to-end.
- Validation — DMS validation task, DataSync verification, Snow SHA-256 checksums, application-level smoke tests — all four layers in a Pro answer.
FAQ — AWS Migration Tooling at SAP-C02 Depth
1. When should I use AWS Application Migration Service (MGN) instead of DMS for a database-server migration?
Use MGN only when the source and target database engines are identical and you want to lift-and-shift the entire server (OS, database binaries, data files) rather than migrate at the database layer. For example, Oracle 19c on a Linux VM to Oracle 19c on EC2 with the same configuration — MGN works. The moment the target engine differs (Oracle → Aurora PostgreSQL) or the target is a managed database (RDS, Aurora), you need DMS + SCT because MGN cannot transform schema or translate stored procedures. The SAP-C02 distractor pattern is offering MGN for a heterogeneous database migration where DMS + SCT is correct.
2. How does DMS Fleet Advisor differ from AWS Schema Conversion Tool?
DMS Fleet Advisor is a fleet-scale discovery and planning tool — it inventories hundreds of databases across your on-premises estate, profiles engine versions and feature usage, and produces an engine-compatibility report that estimates automatic-conversion rates per schema. It answers "across my 200 databases, which are easy to migrate and which are hard?" AWS SCT is a per-database schema conversion tool — it connects to a specific source database, generates the target schema DDL, converts stored procedures, and produces an action-items list. You run DMS Fleet Advisor first to triage the fleet into waves, then run SCT on each database in each wave to actually convert. Treating them as interchangeable is a trap.
3. What is the correct task-type choice for a production Oracle-to-Aurora PostgreSQL migration with a 1-hour cutover?
Full Load + CDC. Full Load alone means your downtime equals the full-load duration (days to weeks for hundreds of TB). CDC Only requires pre-seeding by another method (e.g., Snowball-imported Oracle Data Pump dump). Full Load + CDC does both — seeds the target and keeps it in sync — so at cutover time, CDC lag is near zero and you freeze writes, drain the final CDC delta, and flip the application with minutes of downtime. Combine with a Multi-AZ replication instance and a concurrent DMS validation task for production safety.
4. For a 600 TB on-prem migration with an existing 10 Gbps Direct Connect, should I use DataSync, Snowball Edge fleet, or both?
Both. The Pro-preferred pattern is Snowball Edge Storage Optimized fleet for the initial 600 TB bulk (typically 8 devices in parallel, ~640 TB aggregate, ~7 calendar days end-to-end) plus DMS CDC Only (for databases) or DataSync scheduled incrementals over Direct Connect (for files) to catch up the delta between the snapshot-to-Snow load and the cutover moment. Snow handles the mass; Direct Connect handles the trickle. Saturating a 10 Gbps Direct Connect with 600 TB of full load would take ~10–14 days of dedicated bandwidth and starve the rest of your organization. Snow + CDC is faster and politer.
5. How do I guarantee near-zero data loss during an Oracle-to-Aurora cutover?
Four stacked guarantees: (1) DMS Full Load + CDC with a concurrent validation task that re-reads every migrated row from source and target and reports mismatches; (2) Read-only freeze on the source application at cutover start, so no new writes can create divergence; (3) Drain CDC to zero lag (CloudWatch CDCLatencyTarget = 0) before you flip the connection string; (4) Final reconciliation query against both Oracle and Aurora (row counts per table, critical business-aggregate checksums) as the last pre-flip gate. Combined, these produce minutes of read-only downtime and zero data loss at the target.
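Guarantee 4, the final reconciliation gate, reduces to comparing per-table (row count, checksum) pairs pulled from each side. A minimal sketch, with hypothetical table names and figures (real checksums would come from business-aggregate queries run against Oracle and Aurora):

```python
# Reconciliation-gate sketch: compare per-table row counts and checksums.
# Values are illustrative; in production they come from queries on both sides.

def reconcile(source: dict, target: dict) -> list:
    """Tables whose (row_count, checksum) tuples differ between the two sides."""
    tables = set(source) | set(target)
    return sorted(t for t in tables if source.get(t) != target.get(t))

oracle = {"orders": (1_204_551, "9f3a"), "customers": (88_310, "c017")}
aurora = {"orders": (1_204_551, "9f3a"), "customers": (88_310, "c017")}
print(reconcile(oracle, aurora))  # []: gate passes, safe to flip
```

A non-empty result at T+00:22 is a hard stop: the runbook holds the flip and the rollback path is invoked instead.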
6. What is MSK Replicator and when do I need it during a migration?
Amazon MSK Replicator is the AWS-managed Kafka-to-Kafka replicator for cross-region, cross-cluster, and active/active MSK topologies. You need it whenever a workload you are migrating has a Kafka event bus that must reach parity in the target AWS environment — a stateless consumer switch-over alone is not enough because consumer group offsets must replicate too. For migrating into MSK from self-managed on-prem Kafka, use Kafka MirrorMaker 2 running on MSK Connect (managed Kafka Connect). For migrating between MSK clusters (cluster upgrades, cross-region DR), use MSK Replicator. Ignoring the streaming layer is a frequent Pro-exam mistake; servers and databases alone do not a complete migration make.
7. Does Direct Connect encrypt data by default for migration traffic?
No. A Direct Connect circuit is private (it does not ride the public internet) but it is not encrypted at the link layer. For compliance requirements that mandate encryption in transit (HIPAA, PCI-DSS, GDPR), layer a Site-to-Site IPsec VPN over a public VIF on the Direct Connect circuit — this gives you both private transport and end-to-end encryption. Alternatively, rely on application-layer encryption (TLS to every AWS service endpoint, which all AWS migration tooling services enforce by default). The SAP-C02 tests this directly: a compliance-driven migration that requires encrypted transport over Direct Connect means "Direct Connect + VPN" or "TLS per service," not raw Direct Connect.
8. How do I handle rollback if the cutover fails mid-flight?
Pre-author a reverse replication path during cutover preparation. For a database migration: before cutover, configure a second DMS task in the opposite direction (Aurora PostgreSQL → Oracle) as a "break-glass" rollback path. Do not start it during normal cutover; start it only if rollback is declared. Keep the source Oracle database running in read-only standby mode for 72 hours after cutover; this is the simplest rollback (flip the connection string back). For server-layer rollback with MGN, retain the source VMs for at least 72 hours post-cutover — MGN's test-cutover pattern means you can relaunch from the staging volumes even if you have already performed "actual cutover." Every Pro migration plan has a rollback rehearsal, not just a forward rehearsal.
Summary — AWS Migration Tooling at SAP-C02 Depth
AWS migration tooling at Pro scope is a chain of services designed and sequenced as a system, not a single-service pick. The core chain is: discovery (Migration Hub + Application Discovery Service + DMS Fleet Advisor + Migration Evaluator) → schema conversion where heterogeneous (SCT) → replication per layer (MGN for servers, DMS for databases, DataSync for files, Snow for offline bulk, Transfer Family for partner drops, MSK Replicator for streams) → transport (Direct Connect with redundancy) → validation per layer (DMS validation tasks, DataSync verification, Snow checksums, application smoke tests) → rehearsed cutover with pre-authored rollback. The signature SAP-C02 scenario — 600 TB on-prem Oracle → Aurora PostgreSQL with sub-one-hour cutover — is solved by SCT for schema conversion, Snowball Edge fleet for initial 600 TB seed, DMS CDC Only from the SCN checkpoint, concurrent DMS validation, redundant Direct Connect for CDC traffic, two rehearsed test cutovers, a ~30-minute read-only freeze-drain-flip window, and a 72-hour reverse-DMS rollback path. Pick single-service answers only when the scenario is genuinely single-layer; pick chain answers every time the workload spans servers, databases, files, and streams — which is every real migration. Master the chain, not the services, and the AWS migration tooling question family collapses into a repeatable 7-step playbook.