
Amazon SageMaker — Training, Deployment, and MLOps

6,850 words · about 35 minutes of reading

The Amazon SageMaker platform is the end-to-end AWS machine learning service that lets a data scientist label data, engineer features, train custom models, tune hyperparameters, deploy inference endpoints, and monitor model drift — all from a single managed environment. On the AWS Certified AI Practitioner (AIF-C01) exam, Task Statement 3.1 asks you to describe design considerations for applications that use foundation models, and the Amazon SageMaker platform is the reference answer whenever the scenario says "build, train, deploy a custom model" or "fine-tune a model end-to-end with your own code." In 2024 AWS consolidated the expanding Amazon SageMaker family under the new banner Amazon SageMaker AI, which groups every capability covered in this study guide — Amazon SageMaker Studio, Amazon SageMaker Data Wrangler, Amazon SageMaker Ground Truth, Amazon SageMaker Feature Store, Amazon SageMaker Training Jobs, Amazon SageMaker JumpStart, Amazon SageMaker Canvas, Amazon SageMaker Autopilot, Amazon SageMaker Pipelines, Amazon SageMaker Model Registry, Amazon SageMaker inference options, Amazon SageMaker Shadow Testing, Amazon SageMaker Inference Recommender, and Amazon SageMaker Model Monitor.

This study guide walks through the full Amazon SageMaker platform in the order the ML lifecycle runs — prepare, label, engineer features, train, tune, deploy, monitor — and ends with a SageMaker-versus-Bedrock decision matrix, ten callouts, and a seven-question FAQ. Every subsection points to the exact trap AWS uses to set up AIF-C01 distractors.

What is the Amazon SageMaker Platform?

The Amazon SageMaker platform is a fully managed collection of services that covers the whole machine learning lifecycle. The Amazon SageMaker platform removes the infrastructure toil of building a custom ML pipeline: no EC2 fleet to size, no Kubernetes cluster to harden, no notebook server to patch. You open Amazon SageMaker Studio in a browser, point to your data in Amazon S3, pick a built-in algorithm or bring your own container, press train, press deploy, and the Amazon SageMaker platform handles the rest.

For AIF-C01, you do not need to write SageMaker SDK code. You need to recognize which piece of the Amazon SageMaker platform solves which part of the ML lifecycle. Questions are scenario-based: "a team needs to label 100,000 images," "a business analyst wants ML predictions in a spreadsheet UI," "the model serves unpredictable traffic and should scale to zero." Each of those maps to one specific Amazon SageMaker sub-service, and this topic is the catalog that makes the mapping automatic.

Amazon SageMaker vs Amazon SageMaker AI — the 2024 rebrand

At re:Invent in late 2024 AWS renamed the classic Amazon SageMaker service to Amazon SageMaker AI and introduced a larger umbrella called the next-generation Amazon SageMaker that also includes Amazon SageMaker Lakehouse (unified data), Amazon SageMaker Data and AI Governance, and the unified Amazon SageMaker Studio experience. For AIF-C01, treat Amazon SageMaker AI as the AI-and-ML portion you already learned — every feature in this guide (Studio, Training Jobs, endpoints, Model Monitor) lives inside Amazon SageMaker AI. The exam uses "Amazon SageMaker" and "Amazon SageMaker AI" interchangeably. If a question offers both names, the correct interpretation is still the Amazon SageMaker platform that trains and deploys ML models.

Why the Amazon SageMaker platform matters for AIF-C01

AIF-C01 Domain 3 carries 28 percent of the exam weight. Inside Domain 3, Task 3.1 explicitly lists "platform selection" as a topic, and community pain-point analysis shows that the Amazon SageMaker platform versus Amazon Bedrock choice is the single most-confused pair on AIF-C01. Expect at least two questions that ask which Amazon SageMaker sub-service handles a given lifecycle step, plus one or two questions on the Amazon SageMaker inference options (real-time, serverless, asynchronous, batch transform).

The Amazon SageMaker Platform in Plain Language

The Amazon SageMaker platform sounds overwhelming because it has roughly fifteen named sub-services. Three plain-language analogies make the whole Amazon SageMaker platform snap into place.

Analogy 1 — The professional kitchen

Think of the Amazon SageMaker platform as a Michelin-starred kitchen that trains new chefs and serves customers at the same time.

  • Amazon SageMaker Ground Truth is the ingredient-inspection station. Raw crates arrive and a human inspector labels each crate: "this is salmon, this is trout, this is cod." Without labels, no recipe training can happen.
  • Amazon SageMaker Data Wrangler is the prep counter — washing, chopping, normalizing, de-duplicating. Visual drag-and-drop, no scripting required.
  • Amazon SageMaker Feature Store is the mise-en-place rack. Every prepped ingredient (feature) goes into a labeled bin that both the training kitchen and the live service line can grab from — consistency across training and inference.
  • Amazon SageMaker Training Jobs are the test kitchens. Managed GPU stoves fire up only when a chef (script) says "start," and shut down the moment the bake is done. Managed Spot Training is the night-owl stove that runs when electricity is cheaper.
  • Amazon SageMaker JumpStart is the recipe library — hundreds of pre-tested recipes (pre-trained models) you can tweak instead of starting from raw ingredients.
  • Amazon SageMaker Canvas is the home-cook UI — no Python, no chef's knife. A business user drags a CSV in, clicks "predict," and the Amazon SageMaker platform does the heavy lifting behind the scenes.
  • Amazon SageMaker Autopilot is the sous-chef who automatically tastes dozens of recipe variations and picks the winner (AutoML).
  • Amazon SageMaker Studio is the whole open kitchen floor — the unified IDE where every cook can see every station.
  • Amazon SageMaker endpoints are the service lines. Real-time endpoints are the à la carte pass — a plate goes out in 100 ms. Serverless inference is the pop-up that scales up for brunch rushes and closes between services. Asynchronous inference is the catering order — drop the ticket, come back later. Batch transform is the banquet hall — cook five thousand portions in one sitting.
  • Amazon SageMaker Model Monitor is the quality inspector who tastes random plates leaving the pass and flags any drift in seasoning.

When an exam scenario says "the team labels images before training," the answer is Amazon SageMaker Ground Truth. "Business analyst with no Python skills" means Amazon SageMaker Canvas. "Unpredictable traffic, scale to zero" means Amazon SageMaker Serverless Inference. The kitchen map is the whole trick.

Analogy 2 — The Swiss Army knife

The Amazon SageMaker platform is a fifteen-blade Swiss Army knife — one tool, many blades, each blade solves one job.

  • The main blade is Amazon SageMaker Studio — the big blade you open first.
  • The scissors are Amazon SageMaker Ground Truth — they cut raw data into labeled data.
  • The saw is Amazon SageMaker Data Wrangler — it reshapes rough data into usable features.
  • The corkscrew is Amazon SageMaker Feature Store — pull out the exact feature you need at train time and at inference time.
  • The large blade is Amazon SageMaker Training Jobs — your heavy cutting.
  • The file is Amazon SageMaker Training Compiler — it polishes the training for speed.
  • The awl is Amazon SageMaker JumpStart — punches straight through to a pre-built hole.
  • The toothpick is Amazon SageMaker Canvas — small, harmless, non-expert use.
  • The tweezers are Amazon SageMaker Autopilot — picks the best AutoML candidate automatically.
  • The flathead screwdriver is Amazon SageMaker Pipelines — tightens the MLOps bolts.
  • The Phillips screwdriver is Amazon SageMaker Model Registry — versioned model artifacts.
  • The magnifying glass is Amazon SageMaker Model Monitor — watches for drift.
  • The nail file is Amazon SageMaker Inference Recommender — sizes instances for you.
  • The bottle opener is Amazon SageMaker Shadow Testing — safely compares a new variant against production.
  • The ruler is Amazon SageMaker endpoints — measured, predictable inference.

You do not need to build anything for AIF-C01. You just need to pick the right blade from the fifteen for the job in the question.

Analogy 3 — The factory assembly line

Imagine the Amazon SageMaker platform as a car factory, with data flowing left-to-right through stations.

  • Station 1 — Raw material intake: Amazon SageMaker Ground Truth labels incoming sheet metal (data) so the robot arms know what to weld.
  • Station 2 — Parts prep: Amazon SageMaker Data Wrangler stamps, drills, normalizes each part (feature engineering).
  • Station 3 — Parts warehouse: Amazon SageMaker Feature Store stores prepped parts on numbered shelves so both the assembly line (training) and the service garage (inference) can pull from the same catalog.
  • Station 4 — Main assembly: Amazon SageMaker Training Jobs welds the model together on managed GPU conveyors. Managed spot is the night shift running on cheaper electricity.
  • Station 5 — Pre-built chassis bay: Amazon SageMaker JumpStart rolls out pre-assembled chassis (pre-trained models) so you skip ahead to final assembly.
  • Station 6 — Quality inspection: Amazon SageMaker Autopilot and Amazon SageMaker Experiments run automated tests to pick the winner configuration.
  • Station 7 — Model storage: Amazon SageMaker Model Registry is the finished-car lot, each with a VIN (model version) and approval status.
  • Station 8 — Orchestration: Amazon SageMaker Pipelines is the factory control system that automates the whole line and triggers retraining on new data.
  • Station 9 — Dealerships: Amazon SageMaker endpoints are the dealerships — real-time lots (always open), serverless pop-ups (open on demand), asynchronous (callback when ready), batch transform (fleet sale).
  • Station 10 — Recall monitoring: Amazon SageMaker Model Monitor watches cars in the field and warns if one starts rusting early (drift). Amazon SageMaker Inference Recommender is the logistics planner that load-tests the delivery options and picks the right truck size (instance type) for each dealership.
  • Station 11 — Pilot testing: Amazon SageMaker Shadow Testing mirrors production orders to a new prototype to compare quality without risk.

Keep the factory image and every Amazon SageMaker platform exam question becomes a station-pick.

Amazon SageMaker Studio — The Unified IDE

Amazon SageMaker Studio is the single browser-based IDE at the center of the Amazon SageMaker platform. Amazon SageMaker Studio gives a data scientist one place to open notebooks, launch training jobs, track experiments, register models, deploy endpoints, and monitor drift — no SSH, no local Jupyter, no Conda.

Amazon SageMaker Studio runs on managed compute backed by Amazon Elastic File System so notebook state persists across sessions. The newer Amazon SageMaker Studio (next-generation) released alongside Amazon SageMaker AI adds the unified experience for queries over Amazon SageMaker Lakehouse, SQL on Amazon Redshift, and low-code ML, all behind the same web UI.

Key Amazon SageMaker Studio capabilities for AIF-C01:

  • JupyterLab notebooks and Code Editor (based on VS Code).
  • Launch Amazon SageMaker Training Jobs and Amazon SageMaker Processing Jobs from the UI.
  • Browse Amazon SageMaker JumpStart pre-trained models.
  • Inspect Amazon SageMaker Experiments, Amazon SageMaker Pipelines runs, and Amazon SageMaker Model Registry entries.
  • Open Amazon SageMaker Data Wrangler flows and Amazon SageMaker Canvas apps.
  • Collaborate across a team with shared spaces.

Amazon SageMaker Studio is the unified IDE for data scientists who write code. Amazon SageMaker Canvas is the no-code UI for business analysts who do not write code. Both live inside the Amazon SageMaker platform, but Amazon SageMaker Studio assumes Python literacy while Amazon SageMaker Canvas assumes spreadsheet literacy. On AIF-C01, questions about "code-centric ML IDE" point to Amazon SageMaker Studio; questions about "no-code ML for BI users" point to Amazon SageMaker Canvas. Source ↗

Amazon SageMaker Data Wrangler — Visual Data Preparation

Amazon SageMaker Data Wrangler is the visual, low-code data-preparation tool inside the Amazon SageMaker platform. Amazon SageMaker Data Wrangler connects to Amazon S3, Amazon Athena, Amazon Redshift, Amazon EMR, AWS Lake Formation, Snowflake, and Databricks, and it lets you transform data through a drag-and-drop flow with 300+ built-in transforms (fill missing values, one-hot encode, scale numeric features, detect bias with Amazon SageMaker Clarify, parse dates, and more).

Amazon SageMaker Data Wrangler outputs the flow as one of three artifacts:

  • A scheduled Amazon SageMaker Processing Job that batches the transformation at scale.
  • A Python script you can plug into an Amazon SageMaker Pipeline.
  • A feature definition pushed into Amazon SageMaker Feature Store.

AIF-C01 signal: "visual data prep," "no-code feature engineering before training," "built-in bias report on training data" → Amazon SageMaker Data Wrangler.

Amazon SageMaker Ground Truth — Data Labeling

Amazon SageMaker Ground Truth is the managed labeling service inside the Amazon SageMaker platform. Amazon SageMaker Ground Truth turns raw images, videos, text, and 3D point clouds into labeled training data through three workforces:

  • Amazon Mechanical Turk — public, low-cost, high-volume labeling crowd.
  • Private workforce — your own employees inside a VPC for sensitive data.
  • Vendor workforce — curated third-party labeling partners from the AWS Marketplace.

Amazon SageMaker Ground Truth supports auto-labeling — after ~1,000 human labels, an internal model labels the easy cases and forwards only the hard ones to humans, which cuts labeling cost by up to 70 percent. Built-in task types cover image classification, bounding boxes, semantic segmentation, text classification, named-entity recognition, video-frame object tracking, and 3D point-cloud labeling.

Amazon SageMaker Ground Truth produces labeled datasets; it does not train models. The common trap is picking Amazon SageMaker Ground Truth when the scenario says "automatically classify images" — that is inference, handled by a trained model (Amazon Rekognition Custom Labels or a SageMaker endpoint). Amazon SageMaker Ground Truth is the preparation phase. If the question says "we need humans to label 500,000 medical images before we can train," pick Amazon SageMaker Ground Truth. Source ↗

Amazon SageMaker Feature Store — Shared Feature Repository

Amazon SageMaker Feature Store is a purpose-built repository inside the Amazon SageMaker platform for storing, discovering, and serving ML features. A "feature" is a single input column computed from raw data — for example, customer_30d_purchase_count computed from an order log.

Amazon SageMaker Feature Store solves the hardest silent failure in MLOps: training/serving skew. If the training pipeline computes purchase_count one way and the online inference layer computes it a slightly different way, the model will silently degrade. Amazon SageMaker Feature Store offers two stores behind one API:

  • Online store — low-latency key-value reads for real-time inference.
  • Offline store — columnar parquet in Amazon S3 for training-batch reads and backfills.

Both stores stay in sync through the same ingestion API, so the feature used at training time is exactly the feature used at inference time.
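
Although AIF-C01 never asks for SDK code, a minimal sketch makes the "one ingestion, two stores" idea concrete. This is a hedged example built with the SageMaker Python SDK; the role ARN, bucket, and feature names are placeholders, not values from this guide.

```python
import time
import pandas as pd
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder role ARN

# A tiny illustrative DataFrame: one row per customer, plus the event-time column
# that Feature Store requires.
df = pd.DataFrame({
    "customer_id": [1, 2],
    "customer_30d_purchase_count": [4, 0],
    "event_time": [time.time(), time.time()],
})

# Infer feature definitions from the DataFrame, then create BOTH stores at once:
# the low-latency online store and the S3/Parquet offline store.
fg = FeatureGroup(name="customer-features", sagemaker_session=session)
fg.load_feature_definitions(data_frame=df)
fg.create(
    s3_uri="s3://example-bucket/feature-store-offline",   # placeholder offline location
    record_identifier_name="customer_id",
    event_time_feature_name="event_time",
    role_arn=role,
    enable_online_store=True,
)

# One ingest call feeds both stores, so training and inference read identical features.
fg.ingest(data_frame=df, max_workers=2, wait=True)
```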

When an AIF-C01 question mentions "the team needs to guarantee the same feature definition at training and inference," the answer is Amazon SageMaker Feature Store. When the question mentions "centralized feature catalog across multiple models and teams," the answer is still Amazon SageMaker Feature Store. It is the only Amazon SageMaker sub-service dedicated to the feature layer. Source ↗

Amazon SageMaker Training Jobs — Managed Model Training

Amazon SageMaker Training Jobs are the managed training engine of the Amazon SageMaker platform. You supply a training script (or pick a built-in algorithm), the input data location in Amazon S3, and the instance type; Amazon SageMaker provisions the compute, runs the job, writes artifacts back to Amazon S3, and tears the cluster down. You never touch the underlying EC2 instances.
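
As an illustration of that flow, here is a minimal sketch with the SageMaker Python SDK. The entry-point script, S3 paths, role ARN, and framework versions are placeholders, and a real job would pick them to match your code and data.

```python
from sagemaker.pytorch import PyTorch

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder role ARN

# You supply the script and the instance type; SageMaker provisions the compute,
# runs the job, uploads model.tar.gz to S3, and tears the instances down.
estimator = PyTorch(
    entry_point="train.py",                               # your training script (placeholder)
    role=role,
    instance_count=1,
    instance_type="ml.g5.xlarge",
    framework_version="2.1",
    py_version="py310",
    output_path="s3://example-bucket/model-artifacts/",    # placeholder bucket
)

# Channel names ("train", "validation") appear inside the container under
# /opt/ml/input/data/<channel>.
estimator.fit({
    "train": "s3://example-bucket/train/",
    "validation": "s3://example-bucket/val/",
})
```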

On-demand vs Managed Spot Training

Amazon SageMaker Training Jobs run on either on-demand instances or Managed Spot Training. Managed Spot Training uses Amazon EC2 Spot capacity and can cut training cost by up to 90 percent. Amazon SageMaker handles the Spot interruption and resumes the job automatically using checkpoints written to Amazon S3. The trade-off is longer wall-clock time because Spot capacity may not be available immediately.

Rule of thumb for AIF-C01: cost-sensitive training of large foundation models that can tolerate interruption → Managed Spot Training. Time-sensitive training that must finish by a deadline → on-demand.
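
A hedged sketch of how the Spot knobs look on the same kind of Estimator — the checkpoint path and timings are placeholders, and max_wait must be at least as large as max_run.

```python
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",                                  # placeholder training script
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.g5.xlarge",
    framework_version="2.1",
    py_version="py310",
    use_spot_instances=True,       # run on Spot capacity (up to ~90% cheaper)
    max_run=3600,                  # max seconds of actual training time
    max_wait=7200,                 # max total seconds, including waiting for Spot capacity
    checkpoint_s3_uri="s3://example-bucket/checkpoints/",    # resume point after interruption
)
estimator.fit({"train": "s3://example-bucket/train/"})
```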

Distributed Training

Amazon SageMaker supports two distributed-training paradigms for training large models:

  • Data parallelism — each replica holds a full copy of the model and the training batches are split across replicas. Use the SageMaker Distributed Data Parallel (SMDDP) library or Horovod/PyTorch DDP. Best when the model fits on a single GPU (a sketch follows below).
  • Model parallelism — a single model is too large for one GPU, so it is split across devices. Use the SageMaker model parallelism library (SMP) or Amazon SageMaker HyperPod for massive foundation-model training.

AWS also exposes Amazon SageMaker HyperPod, a purpose-built resilient cluster that auto-recovers from instance failures during multi-week foundation-model training runs.
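
A sketch of how data parallelism is switched on, assuming a training script that already follows the PyTorch DDP/SMDDP pattern; the script name and role are placeholders, and SMDDP only supports specific large multi-GPU instance types.

```python
from sagemaker.pytorch import PyTorch

# The `distribution` argument enables the SageMaker Distributed Data Parallel backend;
# the training script itself initializes torch.distributed as usual.
estimator = PyTorch(
    entry_point="train_ddp.py",                              # placeholder DDP training script
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    instance_count=2,                                        # two nodes; batches split across replicas
    instance_type="ml.p4d.24xlarge",                         # SMDDP requires supported GPU instances
    framework_version="2.1",
    py_version="py310",
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
)
estimator.fit({"train": "s3://example-bucket/train/"})
```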

SageMaker Training Compiler

Amazon SageMaker Training Compiler is an optional graph-level compiler that speeds up training of deep-learning models up to 50 percent by fusing operations and reducing memory traffic. It works with PyTorch and TensorFlow out of the box. You opt in with a single argument on the Estimator, no code rewrite needed.
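
The "single argument" looks roughly like this on a Hugging Face estimator — a hedged sketch; the script, role, and container versions are placeholders and must match a compiler-supported Deep Learning Container.

```python
from sagemaker.huggingface import HuggingFace, TrainingCompilerConfig

# compiler_config is the only change versus a normal estimator; the compiler fuses
# graph operations to shorten wall-clock training time.
estimator = HuggingFace(
    entry_point="train.py",                                  # placeholder fine-tuning script
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    transformers_version="4.21.1",
    pytorch_version="1.11.0",
    py_version="py38",
    compiler_config=TrainingCompilerConfig(),                # opt in to SageMaker Training Compiler
)
```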

Three Amazon SageMaker Training Jobs knobs reduce training cost: (1) Managed Spot Training cuts instance cost up to 90 percent; (2) Amazon SageMaker Training Compiler shortens wall-clock time up to 50 percent via graph compilation; (3) distributed training (data or model parallel) scales horizontally to finish sooner on larger fleets. On AIF-C01, "reduce training cost with minimal code change" = Managed Spot Training. "Speed up training without changing instance type" = SageMaker Training Compiler. Source ↗

Automatic Model Tuning

Amazon SageMaker Automatic Model Tuning (also called SageMaker hyperparameter tuning) runs multiple training jobs in parallel with different hyperparameter combinations, applies Bayesian optimization or random search, and surfaces the best model. For AIF-C01, recognize Automatic Model Tuning as the SageMaker feature that replaces manual grid search.
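
A hedged sketch of Automatic Model Tuning wrapped around the built-in XGBoost container; the metric name and ranges are illustrative, and the S3 paths and role are placeholders.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# Base estimator using the built-in XGBoost container; the tuner launches many copies of it.
xgb = Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.7-1"),
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/tuning-output/",
)
xgb.set_hyperparameters(objective="binary:logistic", num_round=200)

tuner = HyperparameterTuner(
    estimator=xgb,
    objective_metric_name="validation:auc",          # emitted by built-in XGBoost
    hyperparameter_ranges={
        "eta": ContinuousParameter(0.01, 0.3),
        "max_depth": IntegerParameter(3, 10),
    },
    strategy="Bayesian",                              # replaces manual grid search
    max_jobs=20,
    max_parallel_jobs=4,
)
tuner.fit({
    "train": TrainingInput("s3://example-bucket/train/", content_type="text/csv"),
    "validation": TrainingInput("s3://example-bucket/val/", content_type="text/csv"),
})
```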

Amazon SageMaker JumpStart — Pre-trained Model Hub

Amazon SageMaker JumpStart is the pre-trained model hub of the Amazon SageMaker platform. Amazon SageMaker JumpStart offers hundreds of foundation models and task-specific models (text classification, object detection, tabular regression) plus end-to-end solution templates (fraud detection, predictive maintenance, demand forecasting).

You can:

  • Deploy a JumpStart model to an Amazon SageMaker endpoint in two clicks.
  • Fine-tune a JumpStart foundation model (Meta Llama, Falcon, Amazon Titan, Stability AI Stable Diffusion, and more) on your own data via Amazon SageMaker Training Jobs.
  • Export the notebook that reproduces the deployment or fine-tuning pipeline.

Amazon SageMaker JumpStart is where the Amazon SageMaker platform touches the generative-AI world: it is the SageMaker-side door to foundation models. The Amazon Bedrock-side door is the Amazon Bedrock API. On AIF-C01 you may see a question pairing JumpStart with Bedrock — see the decision matrix later in this guide.
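
Expressed in the SDK, the "two-click deploy" looks roughly like this. The model ID shown is one of the public Llama IDs and may differ by SDK version; payload format also varies by model, so treat this as a hedged sketch.

```python
from sagemaker.jumpstart.model import JumpStartModel

# Deploy a pre-trained JumpStart foundation model to a managed SageMaker endpoint.
model = JumpStartModel(model_id="meta-textgeneration-llama-2-7b")   # illustrative model ID
predictor = model.deploy(accept_eula=True)                          # gated models require EULA acceptance

response = predictor.predict({"inputs": "Summarize what SageMaker JumpStart does."})
print(response)

predictor.delete_endpoint()   # endpoints bill per instance-hour, so clean up when done
```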

Amazon SageMaker Canvas — No-Code ML

Amazon SageMaker Canvas is the no-code ML interface for business analysts. You upload a CSV, pick the target column, click "Build," and Amazon SageMaker Canvas runs Amazon SageMaker Autopilot under the hood to produce a deployable model. Amazon SageMaker Canvas also natively connects to Amazon S3, Snowflake, Salesforce, Redshift, and AWS Lake Formation, and it surfaces predictions back into Amazon QuickSight dashboards.

Amazon SageMaker Canvas 2024+ supports Generative AI with foundation models: users can chat with Amazon Bedrock models, summarize documents, and run retrieval-augmented generation inside the Canvas UI without writing code.

The primary AIF-C01 signal for Amazon SageMaker Canvas is no Python, no notebooks, BI-style ML.

Amazon SageMaker Autopilot — AutoML

Amazon SageMaker Autopilot is the AutoML engine of the Amazon SageMaker platform. You point Autopilot at a tabular dataset and a target column; Autopilot automatically explores feature preprocessing, algorithms (XGBoost, linear learner, MLP), and hyperparameters, then produces a leaderboard of candidate models with full notebook transparency — you can see the exact code that built each candidate. Amazon SageMaker Canvas uses Amazon SageMaker Autopilot as its backend.
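
Autopilot is usually driven from Studio or Canvas, but the same AutoML job can be started from the SDK — a hedged sketch with placeholder paths, role, and column name.

```python
from sagemaker.automl.automl import AutoML

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# Point Autopilot at a tabular CSV and name the target column; it explores preprocessing,
# algorithms, and hyperparameters and keeps a leaderboard of candidates.
automl = AutoML(
    role=role,
    target_attribute_name="churned",                        # the column to predict (placeholder)
    max_candidates=10,
    output_path="s3://example-bucket/autopilot/",           # placeholder bucket
)
automl.fit(inputs="s3://example-bucket/churn.csv", wait=True)

best = automl.best_candidate()   # the winning candidate, with its metrics and pipeline steps
```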

AIF-C01 frame: Autopilot = "automatically try many models," Canvas = "no-code UI that uses Autopilot underneath," Studio = "expert IDE." All three are inside the Amazon SageMaker platform.

Amazon SageMaker Pipelines and Model Registry — MLOps

Amazon SageMaker Pipelines is the purpose-built CI/CD service for ML inside the Amazon SageMaker platform. A SageMaker Pipeline is a DAG of steps — processing, training, evaluation, model creation, batch transform, registration, deployment — defined in Python with the SageMaker SDK. Pipelines runs the DAG on managed compute, logs every artifact, caches redundant steps, and integrates with Amazon EventBridge for trigger-based retraining.

Amazon SageMaker Model Registry is the versioned catalog of trained models. Every model version has metadata (training job, metrics, lineage), an approval status (PendingManualApproval / Approved / Rejected), and a link to the model artifact in Amazon S3. A typical MLOps pattern is: Pipelines trains → registers a new version to Model Registry with PendingManualApproval → data science lead reviews metrics → approves → a downstream Pipelines step deploys the Approved version to the production endpoint.

On AIF-C01, if the scenario says "automate retraining with approval gates" or "version-control trained models with lineage," the answer combines Amazon SageMaker Pipelines (the workflow engine) and Amazon SageMaker Model Registry (the model catalog). Neither one alone is the MLOps answer — pair them. CodePipeline can orchestrate higher-level release cycles but the ML-specific approval and lineage live in Amazon SageMaker Model Registry. Source ↗
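
A compressed sketch of that train-then-register pattern with the SageMaker Python SDK. The estimator, inputs, and names are placeholders, and a production pipeline would add processing and evaluation steps before registration.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep
from sagemaker.workflow.step_collections import RegisterModel

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

xgb = Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.7-1"),
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/pipeline-artifacts/",
)
xgb.set_hyperparameters(objective="binary:logistic", num_round=200)

step_train = TrainingStep(
    name="TrainChurnModel",
    estimator=xgb,
    inputs={"train": TrainingInput("s3://example-bucket/train/", content_type="text/csv")},
)

# Register the trained artifact as a new model version that waits for human approval.
step_register = RegisterModel(
    name="RegisterChurnModel",
    estimator=xgb,
    model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.xlarge"],
    model_package_group_name="churn-models",
    approval_status="PendingManualApproval",
)

pipeline = Pipeline(name="ChurnTrainingPipeline", steps=[step_train, step_register])
pipeline.upsert(role_arn=role)   # create or update the pipeline definition
pipeline.start()                 # kick off one execution
```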

Amazon SageMaker Inference Options — The Four Endpoint Types

The Amazon SageMaker platform offers four distinct inference patterns. Matching the right one to a workload is the single most-tested Amazon SageMaker inference topic on AIF-C01.

Real-time endpoints

An Amazon SageMaker real-time endpoint is a persistent HTTPS endpoint backed by one or more EC2 instances running behind an auto-scaling group. Expected latency is milliseconds. You pay for instance-hours while the endpoint is up, even if traffic is zero. Use real-time endpoints for low-latency, steady-traffic online inference such as fraud scoring during checkout or a recommendation widget on a home page.
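
A hedged sketch of standing one up with the SDK; the container URI, artifact path, role, and endpoint name are placeholders (a model produced by the earlier training sketches would deploy the same way).

```python
from sagemaker.model import Model

model = Model(
    image_uri="<inference-container-image-uri>",             # placeholder inference container
    model_data="s3://example-bucket/model/model.tar.gz",      # placeholder trained artifact
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
)

# A persistent HTTPS endpoint on dedicated instances; you pay per instance-hour
# whether or not traffic arrives.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name="checkout-fraud-scoring",                   # placeholder endpoint name
)

result = predictor.predict(b"34.5,0,1,129.99")                # millisecond-scale response
predictor.delete_endpoint()                                    # delete when no longer needed
```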

Serverless Inference

Amazon SageMaker Serverless Inference provisions compute only when a request arrives and scales to zero between invocations. You pay per compute-second actually used, not per hour provisioned. Cold-start latency exists on the first request after a quiet period. Use serverless inference for workloads with unpredictable, intermittent, or spiky traffic where paying for idle capacity is wasteful — think internal tools, prototype apps, or B2B APIs with a few requests per minute.
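
The only change versus the real-time sketch above is the serverless config object; the memory size and concurrency shown are illustrative, and `model` is the same placeholder Model as before.

```python
from sagemaker.serverless import ServerlessInferenceConfig

# `model` is the placeholder Model object from the real-time sketch above.
predictor = model.deploy(
    serverless_inference_config=ServerlessInferenceConfig(
        memory_size_in_mb=2048,   # allowed values range from 1024 to 6144 MB
        max_concurrency=5,        # concurrent invocations before throttling
    )
)
# No instance_count / instance_type: capacity is provisioned per request and scales to zero.
```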

Asynchronous Inference

Amazon SageMaker Asynchronous Inference queues inbound requests — the payload itself is an object in Amazon S3 — and writes the result to an Amazon S3 output location once processing finishes, typically minutes later. The Amazon SageMaker endpoint auto-scales down to zero between queue drains, saving cost. Use asynchronous inference for large payloads (up to 1 GB) and long processing times (up to an hour) — typical examples are inference on long videos, large PDFs, or high-resolution medical scans.
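
A sketch of the asynchronous variant; the S3 paths are placeholders, `model` is the placeholder Model from the real-time sketch, and a real setup would typically add an SNS topic for completion notifications.

```python
from sagemaker.async_inference import AsyncInferenceConfig

# `model` is the placeholder Model from the real-time sketch above.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge",
    async_inference_config=AsyncInferenceConfig(
        output_path="s3://example-bucket/async-results/",    # results land here (placeholder)
    ),
)

# The request payload is itself an S3 object (up to 1 GB); the call returns immediately
# with the S3 location where the result will eventually appear.
response = predictor.predict_async(input_path="s3://example-bucket/async-inputs/scan-001.dcm")
print(response.output_path)
```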

Batch Transform

Amazon SageMaker Batch Transform is the job-based inference option. You give SageMaker a folder of input records in Amazon S3 and a model; SageMaker spins up a managed cluster, processes every record in parallel, writes results to Amazon S3, and tears the cluster down. There is no persistent endpoint and no per-hour cost. Use batch transform for periodic offline scoring — nightly churn prediction on the whole customer base, weekly lead scoring, or backfilling predictions onto historical data.
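
A sketch of batch transform on the same placeholder model; the input and output prefixes are placeholders.

```python
# `model` is the placeholder Model from the real-time sketch above.
transformer = model.transformer(
    instance_count=2,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/batch-scores/",         # placeholder output prefix
)

# Processes every object under the input prefix, writes one output per record,
# then tears the cluster down -- no endpoint is left running.
transformer.transform(
    data="s3://example-bucket/monthly-transactions/",        # placeholder input prefix
    content_type="text/csv",
    split_type="Line",
)
transformer.wait()
```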

The four Amazon SageMaker inference options map to very specific patterns, and AIF-C01 distractors swap them on purpose. Real-time = steady low-latency online. Serverless = intermittent low-volume online with scale-to-zero. Asynchronous = large payloads or long-running online requests queued via Amazon S3. Batch transform = job-based, no endpoint, whole dataset at once. "Scale to zero between requests" is the signal for serverless. "Payload larger than 6 MB" or "processing takes minutes per request" is the signal for asynchronous. "Score the whole customer table once a night" is the signal for batch transform. Source ↗

Multi-Model Endpoints (MME)

Amazon SageMaker Multi-Model Endpoints host many models behind a single endpoint, loading and unloading models from a shared pool of instances on demand. Best when you have thousands of similar models (per-customer recommendation models, per-region forecasts) and only a subset is hot at any given time. MME drastically reduces the per-endpoint cost at the price of slightly higher first-call latency for a cold model.
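
A hedged sketch of a multi-model endpoint; the model_data_prefix is a placeholder S3 prefix holding many model.tar.gz artifacts, and `model` (the placeholder Model above) supplies a shared container that must support multi-model hosting.

```python
from sagemaker.multidatamodel import MultiDataModel

# Every artifact under the prefix becomes addressable through one endpoint.
mme = MultiDataModel(
    name="per-customer-recommenders",
    model_data_prefix="s3://example-bucket/customer-models/",  # thousands of model.tar.gz files
    model=model,
)
predictor = mme.deploy(initial_instance_count=2, instance_type="ml.m5.2xlarge")

# The target model is loaded on demand; a cold model pays a one-time loading latency.
predictor.predict(data=b"1,34.5,0", target_model="customer-123.tar.gz")
```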

Multi-Container Endpoints (MCE)

Amazon SageMaker Multi-Container Endpoints host up to 15 distinct containers behind one endpoint, invoked directly by container name or chained serially. Good for consolidating heterogeneous models (e.g., one TensorFlow model plus one PyTorch model plus a pre-processor) behind one URL.

Amazon SageMaker Shadow Testing

Amazon SageMaker Shadow Testing duplicates live production traffic to a shadow variant running the candidate model. The shadow variant's responses are logged but not returned to the user, so you can compare latency, error rate, and output distribution against the current production model without any user impact. Shadow Testing is the AWS-native safe way to validate a new model before swapping it into production traffic.

AIF-C01 signal: "compare a new model against production with real traffic but without affecting users" → Amazon SageMaker Shadow Testing.

Amazon SageMaker Inference Recommender

Amazon SageMaker Inference Recommender runs benchmark load tests across dozens of instance types, batch sizes, and container configurations for a given model and returns the best cost/performance match. It replaces the guess-and-check of picking an instance family for a SageMaker real-time endpoint.

AIF-C01 signal: "which instance type should I pick for my endpoint?" → Amazon SageMaker Inference Recommender.

Amazon SageMaker Model Monitor — Drift Detection

Amazon SageMaker Model Monitor watches deployed endpoints for four classes of drift and alerts via Amazon CloudWatch when any drift crosses a threshold:

  • Data quality drift — input feature distributions differ from training baseline.
  • Model quality drift — prediction accuracy degrades (requires ground-truth labels).
  • Bias drift — fairness metrics (via Amazon SageMaker Clarify) degrade.
  • Feature attribution drift — the features driving predictions shift over time.

Amazon SageMaker Model Monitor runs on a schedule (hourly or daily). Results are logged to Amazon S3 and summarized in Amazon SageMaker Studio. Model Monitor is the drift-detection half of the MLOps story; Amazon SageMaker Pipelines is the retraining-orchestration half.
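
A hedged sketch of the data-quality monitor setup with the SDK; the endpoint name, baseline dataset, role, and S3 paths are placeholders.

```python
from sagemaker.model_monitor import DefaultModelMonitor, CronExpressionGenerator
from sagemaker.model_monitor.dataset_format import DatasetFormat

monitor = DefaultModelMonitor(
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Step 1: compute baseline statistics and constraints from the training data.
monitor.suggest_baseline(
    baseline_dataset="s3://example-bucket/train/train.csv",        # placeholder
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://example-bucket/monitor/baseline/",
)

# Step 2: compare sampled live endpoint traffic against the baseline on a daily schedule;
# violations surface as reports in S3 and Amazon CloudWatch metrics.
monitor.create_monitoring_schedule(
    monitor_schedule_name="credit-risk-data-quality",
    endpoint_input="credit-risk-endpoint",                          # placeholder endpoint name
    output_s3_uri="s3://example-bucket/monitor/reports/",
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.daily(),
)
```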

Amazon SageMaker Model Monitor watches drift on a live endpoint. Amazon SageMaker Clarify inspects bias and explains feature importance. Model Monitor can invoke Clarify for bias-drift monitoring, but the services have distinct primary purposes. On AIF-C01, "endpoint accuracy drops three months after deployment" = Amazon SageMaker Model Monitor. "Explain why the model predicted loan denial for this applicant" = Amazon SageMaker Clarify. Source ↗

Amazon SageMaker Platform vs Amazon Bedrock — Decision Matrix

This is the single most-asked comparison on AIF-C01. The Amazon SageMaker platform and Amazon Bedrock are the two AWS entry points for AI workloads but they answer very different questions.

  • Primary purpose — SageMaker platform: build, train, and deploy your own model. Bedrock: call someone else's foundation model via API.
  • Level of ML expertise required — SageMaker platform: high (data scientists, ML engineers). Bedrock: low to medium (developers, prompt engineers).
  • Customization — SageMaker platform: full (train from scratch, fine-tune, bring your own container). Bedrock: limited (prompts, fine-tuning on supported FMs, RAG via Knowledge Bases).
  • Pricing — SageMaker platform: per instance-hour for training and endpoints. Bedrock: per input/output token (on-demand) or provisioned throughput.
  • Infrastructure visibility — SageMaker platform: you pick instance types and sizes. Bedrock: fully abstracted, no instances.
  • Foundation model access — SageMaker platform: Amazon SageMaker JumpStart (deploy or fine-tune FMs as SageMaker endpoints). Bedrock: native Bedrock API.
  • Generative AI-specific tools — SageMaker platform: via Amazon SageMaker JumpStart. Bedrock: Guardrails, Knowledge Bases, Agents, Model Evaluation.

Use the Amazon SageMaker platform when:

  • You need a custom model trained on your proprietary data.
  • The problem is not a text or image generation task (tabular forecasting, fraud, recommendation).
  • You require full control over instance types, VPC configurations, and MLOps pipelines.
  • You want to fine-tune a foundation model with your own training loop and hyperparameters (JumpStart path).

Use Amazon Bedrock when:

  • You need generative AI — text, images, chat, summarization, code.
  • You want API-only access to pre-trained foundation models with no ML infrastructure.
  • You need Bedrock-native RAG, Agents, or Guardrails.

Both Amazon SageMaker JumpStart and Amazon Bedrock let you use foundation models, and AIF-C01 distractors lean on this overlap. The distinguishing rule: Amazon Bedrock exposes FMs as a serverless API with no instances; Amazon SageMaker JumpStart deploys FMs to SageMaker endpoints that you manage. Pick JumpStart when the scenario mentions SageMaker endpoints, custom fine-tuning loops, or existing SageMaker MLOps pipelines. Pick Bedrock when the scenario mentions serverless generative AI, Bedrock Knowledge Bases, Bedrock Agents, or Bedrock Guardrails. Source ↗
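
The API-level difference is easy to see side by side. Both calls below use real boto3 clients, but the model ID, endpoint name, and request bodies are illustrative placeholders.

```python
import json
import boto3

# Amazon Bedrock: a serverless foundation-model call -- no endpoint, no instances, pay per token.
bedrock = boto3.client("bedrock-runtime")
bedrock_response = bedrock.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",          # illustrative model ID
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 200,
        "messages": [{"role": "user", "content": "Summarize our refund policy."}],
    }),
)

# Amazon SageMaker: you call an endpoint *you* deployed (JumpStart or custom) and pay
# per instance-hour for as long as that endpoint exists.
smr = boto3.client("sagemaker-runtime")
sagemaker_response = smr.invoke_endpoint(
    EndpointName="jumpstart-llama-endpoint",                   # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps({"inputs": "Summarize our refund policy."}),
)
```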

Common Amazon SageMaker Platform Exam Traps

Five traps account for most wrong answers on Amazon SageMaker platform questions:

  1. Amazon SageMaker Studio vs Amazon SageMaker Canvas — code IDE vs no-code UI. "Business analyst without Python" is always Amazon SageMaker Canvas.
  2. Amazon SageMaker Ground Truth vs Amazon Rekognition — labeling training data vs running pre-trained image inference. Ground Truth produces labels for later training; Rekognition produces predictions directly.
  3. Real-time vs Serverless vs Asynchronous vs Batch Transform — the four Amazon SageMaker inference options map to distinct workload shapes (see Inference Options section).
  4. Amazon SageMaker JumpStart vs Amazon Bedrock — both access foundation models; JumpStart deploys to SageMaker endpoints while Bedrock is serverless API.
  5. Amazon SageMaker Model Monitor vs Amazon SageMaker Clarify — drift detection in production vs bias and explainability inspection. Both are responsible-AI tools but distinct entry points.

Key Numbers and Must-Memorize Amazon SageMaker Facts

  • Managed Spot Training: up to 90 percent training-cost reduction on interruptible workloads.
  • SageMaker Training Compiler: up to 50 percent training-time reduction via graph compilation.
  • Asynchronous Inference: payloads up to 1 GB, processing time up to 1 hour per request, scales to zero between batches.
  • Multi-Model Endpoints: thousands of models behind one endpoint, loaded on demand.
  • Multi-Container Endpoints: up to 15 distinct containers behind one endpoint.
  • Feature Store: low-latency online store for real-time reads, offline store as Parquet in Amazon S3.
  • Ground Truth auto-labeling: up to 70 percent labeling-cost reduction after ~1,000 human labels.
  • Model Registry statuses: PendingManualApproval, Approved, Rejected.
  • Model Monitor drift categories: data quality, model quality, bias, feature attribution.
  • Amazon SageMaker AI: the 2024+ brand for the classic SageMaker service inside the next-generation Amazon SageMaker umbrella.

Practice-Ready Scenarios — Task 3.1 Mapped Exercises

Scenario 1: A retail company must label 300,000 product images before training a custom classifier. Correct choice: Amazon SageMaker Ground Truth.

Scenario 2: A marketing analyst with no Python experience wants to predict monthly churn from a CSV. Correct choice: Amazon SageMaker Canvas.

Scenario 3: A data science team wants to fine-tune Meta Llama on internal documentation and deploy the result to a SageMaker endpoint with custom VPC isolation. Correct choice: Amazon SageMaker JumpStart (foundation-model fine-tuning) plus Amazon SageMaker Training Jobs.

Scenario 4: An ML engineer wants to train a multi-billion-parameter language model across a cluster of GPU instances without writing cluster-management code. Correct choice: Amazon SageMaker distributed training on Amazon SageMaker HyperPod.

Scenario 5: A team wants to reduce training cost of a non-urgent computer-vision model by tolerating occasional interruptions. Correct choice: SageMaker Managed Spot Training.

Scenario 6: An application processes customer requests sporadically (0–5 per minute) and must not pay for idle endpoint time. Correct choice: Amazon SageMaker Serverless Inference.

Scenario 7: In a medical imaging pipeline, inference takes 20 minutes per 400 MB scan. Correct choice: Amazon SageMaker Asynchronous Inference.

Scenario 8: A fraud team scores the full monthly transaction table in one nightly run. Correct choice: Amazon SageMaker Batch Transform.

Scenario 9: An ML platform hosts 5,000 per-customer recommendation models and only ~100 are active at any time. Correct choice: Amazon SageMaker Multi-Model Endpoints.

Scenario 10: Before swapping a new model into production, the team wants to compare its latency and output against the current model using live traffic but without affecting users. Correct choice: Amazon SageMaker Shadow Testing.

Scenario 11: The ML team wants to automatically retrain a model every month with approval gates and lineage tracking. Correct choice: Amazon SageMaker Pipelines plus Amazon SageMaker Model Registry.

Scenario 12: A deployed credit-risk model starts producing unusual prediction distributions three months after launch. Correct choice: Amazon SageMaker Model Monitor.

FAQ — Amazon SageMaker Platform Top 7 Questions

1. What is the difference between Amazon SageMaker and Amazon Bedrock?

Amazon SageMaker is the end-to-end ML platform for building, training, and deploying your own models — from tabular XGBoost to multi-billion-parameter LLMs. Amazon Bedrock is the serverless API that lets you call pre-trained foundation models (Anthropic Claude, Meta Llama, Amazon Titan, Stability AI) with no infrastructure. If the AIF-C01 scenario mentions training data, notebooks, hyperparameters, or SageMaker endpoints, the answer is Amazon SageMaker. If the scenario mentions foundation models, prompts, generative AI, or Bedrock Knowledge Bases, the answer is Amazon Bedrock. Both integrate: Amazon SageMaker JumpStart deploys foundation models to SageMaker endpoints for teams who need SageMaker-style infrastructure control.

2. When should I use Amazon SageMaker Canvas instead of Amazon SageMaker Studio?

Amazon SageMaker Canvas is designed for business analysts and BI users with no coding skills. You upload a CSV, pick the target column, and click "Build." Amazon SageMaker Studio is designed for data scientists and ML engineers who write Python. Both build models on the Amazon SageMaker platform, but the UI targets different personas. On AIF-C01, "no-code ML," "business user," or "no Python required" points to Canvas; "Jupyter notebooks," "SDK code," or "full ML IDE" points to Studio.

3. What is the difference between SageMaker Serverless Inference and Asynchronous Inference?

Both scale to zero between requests, but they serve different workloads. Serverless Inference is for small, low-latency online requests with intermittent or unpredictable traffic — payloads up to 4 MB, response time in seconds, caller waits synchronously. Asynchronous Inference is for large payloads (up to 1 GB) and long processing (up to 1 hour) — the caller submits, SageMaker queues the request, and the result is written to an Amazon S3 output location. If the question mentions payload size or processing time, pick Asynchronous. If it mentions scale-to-zero and low-latency HTTP, pick Serverless.

4. How does SageMaker Model Monitor detect drift?

Amazon SageMaker Model Monitor establishes a baseline during deployment (statistics of input features, predictions, and — if labels are available — accuracy). On a schedule (hourly or daily), Model Monitor samples live endpoint traffic, computes the same statistics, and compares them against the baseline. Divergence beyond a threshold fires an Amazon CloudWatch alarm. Model Monitor supports four monitor types: data quality, model quality, bias drift (via Amazon SageMaker Clarify), and feature attribution drift.

5. What is SageMaker JumpStart and how does it relate to Amazon Bedrock?

Amazon SageMaker JumpStart is the pre-trained model hub inside the Amazon SageMaker platform. JumpStart lets you deploy or fine-tune foundation models (Meta Llama, Amazon Titan, Falcon, Stable Diffusion) as Amazon SageMaker endpoints you manage. Amazon Bedrock, by contrast, exposes foundation models as a serverless API with no instances. Use JumpStart when you need SageMaker-style control (VPC isolation, custom fine-tuning loops, existing SageMaker MLOps). Use Bedrock when you want no infrastructure, Bedrock-native RAG via Knowledge Bases, or Bedrock Guardrails.

6. What are the four SageMaker inference options and when do I pick each?

The Amazon SageMaker platform offers four inference patterns. Real-time endpoint — persistent HTTPS endpoint for steady low-latency online traffic. Serverless Inference — scales to zero for intermittent/unpredictable traffic. Asynchronous Inference — queue-based via Amazon S3 for large payloads or long processing. Batch Transform — job-based whole-dataset scoring with no persistent endpoint. Also note Multi-Model Endpoints (many models behind one endpoint) and Multi-Container Endpoints (up to 15 containers behind one endpoint) as deployment variants.

7. Do I need Amazon SageMaker Pipelines if I already use AWS CodePipeline?

Usually yes, because they operate at different layers. AWS CodePipeline orchestrates general software release workflows (code → build → deploy). Amazon SageMaker Pipelines orchestrates the ML-specific lifecycle — data processing, training, evaluation, model registration, approval gates — with native SageMaker step types, automatic artifact lineage, and step caching. A common pattern is CodePipeline at the outer layer triggering a SageMaker Pipelines execution when new data arrives. On AIF-C01, questions about "ML-specific CI/CD with lineage and approval" map to Amazon SageMaker Pipelines plus Amazon SageMaker Model Registry.

Further Reading

Summary

The Amazon SageMaker platform covers the entire ML lifecycle in a single managed service family, now branded Amazon SageMaker AI under the next-generation Amazon SageMaker umbrella. Prepare data with Amazon SageMaker Data Wrangler and Amazon SageMaker Ground Truth. Store features in Amazon SageMaker Feature Store to eliminate training/serving skew. Train with Amazon SageMaker Training Jobs using Managed Spot Training for cost, Amazon SageMaker Training Compiler for speed, and distributed training (or Amazon SageMaker HyperPod) for scale. Start from pre-trained models with Amazon SageMaker JumpStart, or skip code entirely with Amazon SageMaker Canvas (backed by Amazon SageMaker Autopilot). Automate retraining with Amazon SageMaker Pipelines plus Amazon SageMaker Model Registry for versioning and approval gates. Serve inference through the four endpoint types — real-time, serverless, asynchronous, batch transform — plus Multi-Model and Multi-Container variants. De-risk releases with Amazon SageMaker Shadow Testing and right-size compute with Amazon SageMaker Inference Recommender. Watch for production drift via Amazon SageMaker Model Monitor. For AIF-C01, the single most valuable skill is the Amazon SageMaker platform versus Amazon Bedrock decision — custom model training on the Amazon SageMaker platform, serverless foundation-model API on Amazon Bedrock — followed by the four-inference-option mapping that AWS uses in almost every Task 3.1 scenario question.

Official Sources