examhub .cc 用最有效率的方法,考取最有價值的認證
Vol. I
本篇導覽 約 29 分鐘

Amazon Bedrock Guardrails 與內容控制

5,620 字 · 約 29 分鐘閱讀

Amazon Bedrock Guardrails are a configurable safety layer that sits between your application and a foundation model on Amazon Bedrock, filtering both user inputs and model outputs against five independent policy types. On the AIF-C01 exam, Task 5.1 almost always tests whether you can recognise which of the five Bedrock Guardrails policies (content filters, denied topics, word filters, sensitive information filters, or contextual grounding check) solves a described safety problem — and whether you know that Bedrock Guardrails are a separate, versioned policy object referenced by InvokeModel and Converse calls rather than a property of the model itself.

This page walks you through every feature of Bedrock Guardrails in the AIF-C01 blueprint, drills the five-policy taxonomy, contrasts Bedrock Guardrails with IAM and with prompt engineering, and covers the ApplyGuardrail standalone API, cross-region availability, versioning, pricing, and the integration with Amazon Bedrock Knowledge Bases. By the end you will be able to read any AIF-C01 safety scenario, identify the object being protected, and pick the correct Bedrock Guardrails policy in under ten seconds.

What Are Amazon Bedrock Guardrails?

Amazon Bedrock Guardrails are a managed, model-agnostic safety layer that applies five types of configurable policies to foundation-model interactions on Amazon Bedrock. A single Bedrock Guardrail is a standalone resource — it has an ARN, a version, and a set of policy configurations — and it is attached to an InvokeModel, InvokeModelWithResponseStream, or Converse call by passing the guardrailIdentifier and guardrailVersion parameters. You can also invoke Bedrock Guardrails directly through the ApplyGuardrail API without calling any foundation model at all.

Because Bedrock Guardrails are a separate resource, one guardrail definition can protect many different applications and many different foundation models (Claude, Titan, Llama, Mistral, Cohere, Stability AI) with identical policy enforcement. That decoupling is the most important architectural fact about Bedrock Guardrails and it drives several exam traps.

Why Bedrock Guardrails Exist

Foundation models on Amazon Bedrock ship with provider-side safety training, but provider-side safety is uniform across all customers and cannot express your organisation's specific risk posture. Bedrock Guardrails let you add your own policies on top of the model's built-in alignment. A bank needs to block investment advice; a healthcare company needs to block diagnosis statements; a children's product needs stricter violence filtering than an adult news product. Bedrock Guardrails are the mechanism that encodes those product-specific rules.

Where Bedrock Guardrails Fit in the AI Threat Model

Bedrock Guardrails are a defence layer, not the only defence. The AIF-C01 blueprint expects you to layer Bedrock Guardrails with IAM (who can invoke the model), VPC endpoints (network path), CloudTrail (audit), prompt engineering (in-prompt instructions), and responsible-AI practices (data governance, human review). Bedrock Guardrails specifically address the content safety layer — what the model is allowed to say and what the user is allowed to ask.

Why the AIF-C01 Exam Loves Bedrock Guardrails Questions

Domain 5 (Security, Compliance, and Governance for AI) carries 14% of AIF-C01, and Bedrock Guardrails account for roughly a third of Task 5.1 scenarios based on community reports. Because Bedrock Guardrails map neatly onto the five policy types, almost every question reduces to "which policy enforces the rule in this scenario." That makes Bedrock Guardrails one of the highest-return study targets in the exam — a narrow topic with predictable question patterns.

白話文解釋 Bedrock Guardrails

Think of Bedrock Guardrails as the different kinds of filters, doormen, and fact-checkers you can hire for a venue. Each one watches a different kind of risk and the AIF-C01 exam wants you to match the right filter to the right job.

Analogy 1 — The Nightclub Door Team (Hospitality Analogy)

Imagine a high-end nightclub. The venue is your application; the dance floor is the foundation model's output; the guests are the user prompts. Bedrock Guardrails are the door team and the floor supervisor.

Content filters are the main bouncers at the door. They watch for six standard categories of trouble — hate speech, insults, sexual content, violence, misconduct, and prompt attacks — and you dial up or down each bouncer's strictness (None, Low, Medium, High). Raise "violence" to High for a children's brand; lower "sexual" to Low for an adult streaming platform. Denied topics are the venue's custom "do-not-discuss" list — say the owner says "no political debates allowed" and "no discussion of competitor restaurants." Each denied topic has a name, a definition, and example phrases, and the door team turns away anything that walks into those topics. Word filters are the literal banned-word list at the door — profanity plus whatever custom terms the owner banned (competitor product names, insider code words). Sensitive information filters are the privacy team at the door who confiscate (redact) or turn away (block) anyone carrying ID cards, credit cards, or phone numbers — with the option to either black them out with a marker (MASK) or refuse entry entirely (BLOCK). Contextual grounding check is the floor supervisor listening to the DJ announcements — if the DJ announces facts that do not match the actual night's events, the supervisor cuts the mic.

The nightclub analogy also explains the architectural split: the policy object (the venue rulebook) is published once, signed and versioned, and then handed to every door team member. The bouncers do not memorise rules from conversation; they read from the book. That is why Bedrock Guardrails are a separate resource referenced by the InvokeModel call rather than being baked into the model.

Picture a courtroom. Evidence (user prompts) comes in, witnesses (the foundation model) give testimony, and the judge rules what the jury is allowed to hear. Bedrock Guardrails are the judge's standing orders on admissibility.

Content filters are the six standing categories in which the judge has already ruled evidence inadmissible — hateful, insulting, sexual, violent, misconduct-inducing, and prompt-attack content. Denied topics are the case-specific rulings — the judge in this trial has excluded discussion of the defendant's prior convictions and any reference to ongoing settlement talks; each exclusion has a description and sample phrasings. Word filters are the literal gag list — certain names cannot be spoken in court. Sensitive information filters are the redaction rules — SSNs, addresses, and minor children's names must be masked before the transcript is released, and sometimes the judge orders whole statements struck rather than redacted. Contextual grounding check is the fact-witness faithfulness rule — the witness (the LLM) is only allowed to claim things supported by the record (the retrieved context in a RAG system); anything unsupported gets struck.

The courtroom metaphor also illustrates input vs output filtering. The judge reviews both what the prosecutor asks (input) and what the witness answers (output). Similarly, Bedrock Guardrails can inspect the prompt before it reaches the model and inspect the completion before it reaches the user. The same guardrail applies to both directions, but you configure which policies run on input versus output separately for each filter.

Analogy 3 — The Postal System Sorting Office (Infrastructure / Postal Analogy)

Imagine your AI application as a national postal system, where prompts are outgoing letters and completions are incoming letters. Bedrock Guardrails are the sorting office that intercepts both.

Content filters are the X-ray scanners that flag letters containing contraband categories (hate, insults, sexual, violence, misconduct, prompt attacks). Denied topics are like a list of destinations you refuse to deliver to — you have defined "stock-picking advice" as an undeliverable destination, and any letter trying to go there is returned to sender with a canned refusal message. Word filters are the postal service's banned-term list applied to the letter text. Sensitive information filters are the privacy regulations that require the postal service to redact or refuse letters containing personal identifiers. Contextual grounding check is the factual-review desk that compares claims in outbound mail against the source documents attached to them; any claim without a supporting source is either rejected or flagged.

The postal analogy highlights cost and latency. The sorting office adds processing time (Bedrock Guardrails add milliseconds of latency per request) and has per-unit pricing (Bedrock Guardrails charge per text unit processed). It also highlights that the sorting office is regional — Bedrock Guardrails exist per AWS Region and must be created in every region where your applications run, exactly like most other Bedrock resources.

Bedrock Guardrails Architecture — A Separate Policy Object

The single most important architectural fact for AIF-C01 is that a Bedrock Guardrail is an independent AWS resource, not a property of a foundation model or an IAM role.

Guardrails Are Not Tied to a Model

A Bedrock Guardrail is created once and can be applied to any supported foundation model on Amazon Bedrock. You define the guardrail with CreateGuardrail, receive a guardrailIdentifier and a version (initially DRAFT), and then pass those two parameters to InvokeModel or Converse when you call the model. Switching foundation models (say, from Claude 3 Haiku to Amazon Titan Text Lite) does not require rebuilding the guardrail — the same Bedrock Guardrails policy protects both.

Guardrails Are Not IAM

Bedrock Guardrails control content. IAM controls access. A user with bedrock:InvokeModel permission but without permission to bypass Bedrock Guardrails must still operate within the guardrail's policies. Conversely, a user blocked by IAM never reaches the guardrail at all. This distinction is one of the most common AIF-C01 traps. Bedrock Guardrails do not decide "can this user call the model" — IAM does. Bedrock Guardrails decide "given that the call happened, is the content safe."

The Five Policy Types Are Independent

Each Bedrock Guardrails policy — content filters, denied topics, word filters, sensitive information filters, contextual grounding check — is enabled and configured independently. You can use only content filters, only denied topics, or any combination. The evaluation runs all enabled policies in parallel; if any one of them blocks, the request or response is blocked and a canned message is returned.

Bedrock Guardrails defined. A Bedrock Guardrail is a standalone, versioned AWS resource that evaluates user inputs and model outputs against up to five independent policy types. It is identified by a guardrailIdentifier plus a guardrailVersion and is attached to model invocations via InvokeModel, InvokeModelWithResponseStream, Converse, or applied directly to arbitrary text via the ApplyGuardrail API. Because it is separate from the model, one Bedrock Guardrails resource can protect multiple models and multiple applications with identical policy.

Policy 1 — Content Filters (Six Harm Categories with Configurable Strength)

Content filters are the first and most commonly used Bedrock Guardrails policy. They detect and block six pre-defined harm categories on a four-level strength scale.

The Six Content Filter Categories

The six categories covered by Bedrock Guardrails content filters are: hate (discriminatory content against protected groups), insults (demeaning language aimed at a person), sexual (sexually explicit content), violence (physical harm or threats), misconduct (criminal or harmful behaviour guidance such as weapons manufacture), and prompt attacks (a category available on the input side that detects attempts to override the system prompt, jailbreak, or inject instructions). Note that "prompt attacks" is an input-only filter — you cannot configure an output-side prompt-attack filter because output is the model, not the user.

The Four Strength Levels

Each content filter category can be set independently to None, Low, Medium, or High. Higher strength means the filter flags more borderline content. You typically set higher strength on customer-facing children's apps and lower strength on adult-facing creative-writing tools. Content filters score each category on both the input and the output, and you can choose different strength settings for input versus output on the same category.

Input Filtering vs Output Filtering

For each of the six categories, Bedrock Guardrails lets you decide whether to evaluate the user's input, the model's output, or both. Input filtering blocks unsafe user prompts before they reach the foundation model (cheaper, lower latency, but cannot catch model-generated issues). Output filtering blocks unsafe completions before they reach the user (catches model hallucinations and jailbreak successes, but the model has already run, so you pay for those tokens). Most production deployments use both directions.

Content filters use four strength levels (None, Low, Medium, High) across six categories, with independent settings for input and output. Memorise the six categories: hate, insults, sexual, violence, misconduct, and prompt attacks. Prompt attacks is input-only because the user is the source of injection. All other five categories can be applied to both input and output. Strength dialling — not binary on/off — is why this is called a filter, not a block list.

Policy 2 — Denied Topics (Custom Off-Limits Categories)

Denied topics are the policy type that lets you define your own off-limits subject areas in natural language. This is the Bedrock Guardrails feature most unique to your business.

How Denied Topics Work

You create a denied topic by supplying a name (short label, e.g. "InvestmentAdvice"), a definition (a 1-2 sentence description of what the topic covers), and sample phrases (up to five example utterances that fall inside the topic). Bedrock Guardrails uses a semantic classifier to decide whether a user prompt or model output falls into any of the denied topics you defined. If it does, the request or response is blocked.

Limits on Denied Topics

A single Bedrock Guardrails configuration can contain up to 30 denied topics, and each topic supports up to 5 example phrases. Denied topics evaluate on both input and output by default, and you can narrow that per topic. A common pattern for financial-services apps is denying "InvestmentAdvice," "TaxAdvice," and "LegalAdvice" as three separate denied topics so the application can return three different canned refusal messages.

Denied Topics vs Word Filters

Denied topics are semantic — they match meaning, not exact words. "Should I buy Apple stock?" and "Is NVDA a good long-term hold?" would both be classified as InvestmentAdvice without either sharing any particular keyword. Word filters are literal — they match specific strings. Use denied topics when the concept is what matters; use word filters when exact terms are the concern.

Denied topics are semantic, word filters are literal. If the exam scenario says "block any discussion of competitor pricing" without naming specific competitors, that is denied topics (semantic). If the scenario says "block any mention of the product name 'Cortex'" that is a word filter (literal). Mis-reading this distinction is the most common denied-topics trap.

Policy 3 — Word Filters (Profanity + Custom Blocklist)

Word filters are the simplest Bedrock Guardrails policy. They match literal strings on both input and output.

Two Sources of Word Filters

Word filters pull from two sources. First, Bedrock Guardrails provides a managed profanity list that you can enable with a single toggle — AWS curates and updates the list, covering common profanity across multiple languages. Second, you supply a custom blocklist of up to 10,000 phrases (words or short multi-word expressions). The custom blocklist typically contains competitor product names, internal project codenames, banned marketing claims, or phrases that must never appear in your brand voice.

What Word Filters Do Not Do

Word filters do not do stemming, semantic match, or paraphrase detection. If you block "gun" as a custom word filter, "firearm" still passes. If you block "buy Apple stock" as a phrase, "purchase AAPL shares" still passes. For paraphrase-robust blocking, use denied topics. For exact-match compliance requirements — "the word 'guaranteed' must never appear in output" — use word filters.

Case Sensitivity and Matching

Word filters are case-insensitive by default. Multi-word phrases are matched as contiguous sequences. Punctuation between words is handled loosely. When a word filter triggers, Bedrock Guardrails blocks the request or response and returns the configured canned message.

Policy 4 — Sensitive Information Filters (PII Detection with MASK or BLOCK)

Sensitive information filters detect personally identifiable information (PII) in user inputs and model outputs and let you choose what to do about each detected type.

Managed PII Entity Types

Bedrock Guardrails recognises a curated list of managed PII entity types including NAME, EMAIL, PHONE, ADDRESS, US_SOCIAL_SECURITY_NUMBER, US_PASSPORT_NUMBER, DRIVER_ID, CREDIT_DEBIT_CARD_NUMBER, CREDIT_DEBIT_CARD_CVV, CREDIT_DEBIT_CARD_EXPIRY, IP_ADDRESS, URL, USERNAME, PASSWORD, AGE, VEHICLE_IDENTIFICATION_NUMBER, plus financial identifiers like SWIFT_CODE and US_BANK_ACCOUNT_NUMBER, and healthcare identifiers. The full list grows over time as new managed types are added.

Two Actions — MASK or BLOCK

For each PII entity type you enable, you choose an action. MASK replaces the detected entity with a placeholder (e.g. {NAME} or {EMAIL}) and allows the conversation to continue. BLOCK refuses the entire request or response and returns a canned message. Different PII types can have different actions in the same Bedrock Guardrails configuration. A common pattern is MASK for NAME and EMAIL (so the assistant can still discuss the context abstractly) and BLOCK for CREDIT_DEBIT_CARD_NUMBER and US_SOCIAL_SECURITY_NUMBER (never permit these through in either direction).

Regex Custom Patterns

In addition to managed entity types, Bedrock Guardrails sensitive information filters support custom regex patterns. You define a name, the regex, and an action (MASK or BLOCK). Use regex for organisation-specific identifiers like employee IDs, internal case numbers, SKU codes, or proprietary record formats that no managed PII type covers.

Sensitive information filters = PII detection with two possible actions: MASK or BLOCK. MASK substitutes a placeholder and continues; BLOCK refuses the message entirely. Each PII type chooses its own action. The filter supports managed entity types (NAME, EMAIL, SSN, credit card, etc.) and custom regex patterns for organisation-specific identifiers. This is how Bedrock Guardrails handles privacy compliance within the content safety layer.

Policy 5 — Contextual Grounding Check (RAG Faithfulness and Relevance)

The contextual grounding check is the newest Bedrock Guardrails policy and specifically targets Retrieval-Augmented Generation (RAG) hallucinations. It is the only Bedrock Guardrails policy that requires additional inputs — the source documents and the user query.

Two Grounding Metrics

The contextual grounding check in Bedrock Guardrails evaluates the model's response on two dimensions. Grounding score measures how faithful the response is to the provided source passages — how much of the answer is actually supported by the retrieved context. Relevance score measures how relevant the response is to the user's query — does the response actually answer the question that was asked. Each metric produces a score between 0 and 1.

Configurable Thresholds

You set a threshold (between 0 and 1) for each metric independently. If the grounding score falls below the grounding threshold, or the relevance score falls below the relevance threshold, Bedrock Guardrails blocks the response. Typical starting thresholds are 0.75 for grounding and 0.5 for relevance, tuned based on your domain's tolerance for hallucination versus conservative refusal.

When Contextual Grounding Fires

Contextual grounding check only evaluates responses when source passages are supplied (through the Converse API's grounding source field, via Bedrock Knowledge Bases, or through the ApplyGuardrail API). Without source passages there is nothing to ground against, so the check is a no-op. For pure creative generation without retrieved context, contextual grounding is irrelevant; for RAG systems, it is often the single most valuable Bedrock Guardrails policy.

Contextual grounding check is NOT RAG. It is a faithfulness check that runs on top of RAG. Candidates confuse these because both involve retrieved context. RAG retrieves passages and injects them into the prompt so the model can answer. Contextual grounding check verifies, after the model responds, whether the response is supported by those passages. You need RAG (or equivalent source retrieval) before contextual grounding check can do anything. RAG does the retrieval; contextual grounding check does the verification.

Input vs Output Filtering in Bedrock Guardrails

Every enabled Bedrock Guardrails policy runs on one or both of two checkpoints: the input (user prompt) and the output (model response). Understanding which direction is which is a recurring AIF-C01 test pattern.

Input Filtering Stops Unsafe Prompts Early

Input filtering runs Bedrock Guardrails policies on the user's prompt before the foundation model is invoked. Benefits: lower cost (the model never runs on a blocked prompt), lower latency, and early defence against prompt injection and prompt attacks. Limitation: input filtering cannot catch issues that emerge only at generation time, such as a model hallucinating PII that was not in the prompt.

Output Filtering Stops Unsafe Completions

Output filtering runs Bedrock Guardrails policies on the model's completion before returning it to the user. Benefits: catches model-generated issues (hallucinated PII, generated violent content, prompt-injection success), catches responses to a prompt that slipped through input filtering. Cost: the model has already run and produced tokens, so you pay for those even when the output is blocked.

Per-Policy Direction Choice

Content filters and sensitive information filters let you choose per-category whether the direction is input, output, or both. Denied topics evaluate input and output by default. Word filters run on input and output. Contextual grounding check runs on output only (there is no "grounding" concept for input). The "prompt attacks" content-filter category runs on input only by definition.

Cross-Region Availability of Bedrock Guardrails

Bedrock Guardrails are regional resources, just like most Bedrock primitives.

Per-Region Creation

You create a Bedrock Guardrail in a specific AWS Region. The guardrail ARN includes the region. An application deploying in us-east-1 and eu-west-1 needs two separate guardrail resources — one in each region — even if the policy contents are identical. Infrastructure-as-code (CloudFormation, CDK, Terraform) is the standard way to keep the two in sync.

Region Availability and Feature Parity

New Bedrock Guardrails features (for example, contextual grounding check at general availability) roll out region by region. Not every Bedrock region supports every Bedrock Guardrails feature immediately. Always check the Bedrock Guardrails documentation for the region-feature matrix before designing an architecture that depends on a specific Bedrock Guardrails policy type.

Cross-Region Inference and Guardrails

Amazon Bedrock supports cross-region inference profiles that route requests to the region with available capacity. When you use cross-region inference, the guardrail is still attached to the invocation by ID, and Bedrock handles evaluating it in the appropriate region. You do not create a separate cross-region guardrail; you create the guardrail in the region where the inference profile originates.

Versioning in Bedrock Guardrails

Bedrock Guardrails version their policy definitions, and version management is part of the exam-worthy feature set.

Draft vs Numbered Versions

When you create a Bedrock Guardrail, it starts as a DRAFT version. You edit the draft freely — add denied topics, adjust content filter strengths, change sensitive information actions. At any point you publish the draft as a numbered version (1, 2, 3, ...) which is immutable. Applications call InvokeModel or Converse with a specific guardrailVersion — either DRAFT (for testing) or a numbered version (for production).

Why Versioning Matters

Versioning lets you roll back. If version 5 of your Bedrock Guardrails introduces an overly aggressive content filter that starts blocking legitimate traffic, you can flip production back to version 4 by changing the guardrailVersion parameter in your application. Versioning also provides an audit trail — each numbered version records who published it and when, so compliance reviews can trace what policy was active at any past time.

Publishing Workflow

A typical workflow: create the guardrail, iterate in DRAFT with the Bedrock console playground testing tool, publish version 1 when satisfied, point staging applications at version 1, after a bake period point production applications at version 1, continue editing DRAFT for version 2, publish version 2, migrate staging then production. Numbered versions are immutable — to change policy you always publish a new version.

Cost of Bedrock Guardrails

Bedrock Guardrails have their own pricing, separate from the foundation-model token cost.

Text Unit Pricing

Bedrock Guardrails charge per text unit processed. A text unit is a block of text up to 1,000 characters; the charge applies per text unit per policy type evaluated. Different policies have different per-text-unit prices — content filters, denied topics, word filters, and sensitive information filters are generally inexpensive; contextual grounding check is more expensive because it runs its own model comparison. Check the Bedrock pricing page for the current per-policy price in each region.

Cost Implications of Input vs Output Filtering

Because text units are priced per direction, enabling both input and output evaluation roughly doubles the Bedrock Guardrails portion of your bill for that policy. Most production deployments accept the cost for safety-critical policies (PII, content filters) and may save money by only running certain policies in one direction (denied topics on input only, for example).

Cost Relative to Model Tokens

For most workloads Bedrock Guardrails cost is a small fraction of the foundation-model token cost. A Claude 3 Sonnet call producing a 500-token response costs several cents; evaluating that response with all five policy types adds a fraction of a cent. The exception is very-low-cost models with large volumes — if you are running Amazon Titan Text Lite on high throughput, Bedrock Guardrails can meaningfully contribute to the bill and direction choice (input-only vs both) becomes a meaningful optimisation lever.

Bedrock Guardrails for Bedrock Knowledge Bases

Amazon Bedrock Knowledge Bases is the managed RAG pipeline on Bedrock. Bedrock Guardrails integrates natively with Knowledge Bases.

Attaching Guardrails to RetrieveAndGenerate

When you call RetrieveAndGenerate against a Bedrock Knowledge Base, you can supply a guardrailConfiguration containing the guardrail identifier and version. The entire RAG pipeline — retrieval, prompt construction, model invocation, response return — flows through the Bedrock Guardrails policies. For RAG workloads this is the simplest way to get all five policy types applied with zero additional application code.

Contextual Grounding Is the Natural Fit

Because Bedrock Knowledge Bases inherently supply retrieved passages, contextual grounding check is the Bedrock Guardrails policy that pairs most naturally with Knowledge Bases. You get RAG faithfulness checking for free beyond the base Knowledge Bases cost. Many teams start their Bedrock Guardrails journey by attaching contextual grounding check to an existing Knowledge Bases RetrieveAndGenerate workflow.

Guardrails and Bedrock Agents

Bedrock Agents are the orchestration layer for multi-step tool-using workflows on Bedrock. Agents accept guardrail configurations during agent creation, and the guardrail is applied at every LLM call the agent makes during its reasoning loop. This is especially valuable because agents can produce dozens of LLM calls per user turn, and each of those calls must be individually evaluated against safety policies.

Bedrock Guardrails plug into Knowledge Bases and Agents natively. For Knowledge Bases, pass guardrailConfiguration to RetrieveAndGenerate. For Agents, attach a guardrail during agent creation. This integration means the same guardrail resource protects direct InvokeModel calls, RAG calls through Knowledge Bases, and multi-step agentic workflows with no code duplication. If an exam scenario describes protecting a RAG application or an agentic workflow, the answer involves attaching a Bedrock Guardrails resource through the integration parameters, not rebuilding safety at each layer.

ApplyGuardrail API — Standalone Moderation Without Model Invocation

The ApplyGuardrail API is the Bedrock Guardrails feature that most often surprises AIF-C01 candidates.

What ApplyGuardrail Does

ApplyGuardrail runs a Bedrock Guardrails evaluation against arbitrary text without invoking any foundation model. You pass the guardrail identifier, guardrail version, a source type ("INPUT" or "OUTPUT"), and the text to evaluate. The API returns the evaluation outcome — blocked or passed — along with policy-specific details about which filter triggered.

Why ApplyGuardrail Exists

ApplyGuardrail enables two important patterns. First, it lets you apply Bedrock Guardrails to outputs from models that are not on Bedrock — for example, an in-house fine-tuned model running on SageMaker endpoints. You send the model's output through ApplyGuardrail for content safety review before returning it to users. Second, it lets you moderate user-generated content (forum posts, support-ticket submissions, chat messages) with the same Bedrock Guardrails policy used elsewhere in your AI stack, unifying the moderation layer across both AI and non-AI surfaces.

Pricing and Latency

ApplyGuardrail is priced the same per-text-unit as embedded guardrail evaluation. Latency is lower than full InvokeModel because no foundation model runs — it is just policy evaluation. Typical end-to-end latency is tens to low hundreds of milliseconds depending on policies enabled and text length.

ApplyGuardrail lets you use Bedrock Guardrails as a general-purpose content moderation API, independent of any foundation model. If a scenario describes needing content moderation on a model that is not hosted on Bedrock, or needing PII redaction on arbitrary user-generated content, the correct Bedrock Guardrails answer is ApplyGuardrail — not attaching a guardrail to InvokeModel, because there is no InvokeModel call to attach to.

Bedrock Guardrails vs Prompt Engineering — Complementary, Not Alternatives

A common AIF-C01 trap sets up Bedrock Guardrails and prompt engineering as competitors. They are not.

What Prompt Engineering Can Do

Prompt engineering shapes model behaviour through instructions in the system prompt or user prompt. Good prompt engineering can reduce unsafe outputs, constrain topic, and improve response style. Prompt engineering is cheap (no additional AWS resources) and tightly coupled to the application.

What Prompt Engineering Cannot Do

Prompt engineering is defeated by determined prompt injection — a user who says "ignore all prior instructions" can often override a carefully written system prompt. Prompt engineering also does not scale policy changes across applications; each application owns its own prompt. And prompt engineering has no audit artefact — you cannot easily prove to a compliance officer what safety policy was active at a given time.

Why Use Both

Bedrock Guardrails and prompt engineering work at different layers. Prompt engineering sets the default behaviour; Bedrock Guardrails enforce the non-negotiable policy. A well-designed application uses prompt engineering to specify tone, role, and typical task framing, and uses Bedrock Guardrails to enforce the hard rules — "never discuss competitors," "never output SSNs," "never generate violent content." The two layers together give defence in depth.

Common Bedrock Guardrails Exam Traps

The AIF-C01 exam rotates a small set of Bedrock Guardrails traps. Knowing them makes the topic almost automatic.

Trap 1 — Bedrock Guardrails vs IAM

Bedrock Guardrails are content safety; IAM is access control. A question describing "blocking users from invoking a specific model" is IAM. A question describing "blocking certain content types regardless of who the user is" is Bedrock Guardrails. Both can be present; do not let one distract you from the other.

Trap 2 — Contextual Grounding vs RAG

Contextual grounding check verifies RAG responses; it does not perform RAG. Scenarios describing "retrieving documents and giving them to the model" are RAG. Scenarios describing "making sure the model's answer is actually supported by the retrieved documents" are contextual grounding check.

Trap 3 — Denied Topics vs Word Filters

Denied topics are semantic (meaning-based). Word filters are literal (string-based). "Block discussion of mergers and acquisitions" is denied topics. "Block the phrase 'guaranteed returns'" is word filters.

Trap 4 — Content Filters Are Not Binary

Content filters have four strength levels (None, Low, Medium, High), not just on/off. Candidates who answer "enable the hate filter" are not wrong, but questions that ask for a specific strength level — "most aggressive filtering of hate content" — are asking for the High setting specifically.

Trap 5 — PII Actions Are MASK or BLOCK, Chosen Per Type

Sensitive information filters do not have a global action. Each PII type has its own MASK or BLOCK decision. A configuration might MASK names but BLOCK credit card numbers in the same guardrail. Scenarios expecting a single action across all PII are setting up a subtle wrong-answer option.

Trap 6 — ApplyGuardrail Is Standalone

ApplyGuardrail is easy to miss because it does not involve any model invocation. If a scenario describes moderating text that is not being sent through InvokeModel, the answer is ApplyGuardrail, not any of the integrated options.

The top Bedrock Guardrails mistake: conflating the five policy types. Content filters = six harm categories with strength levels. Denied topics = semantic subject blocks. Word filters = literal string matches. Sensitive information filters = PII with MASK or BLOCK action per type. Contextual grounding check = RAG faithfulness and relevance thresholds. Memorise the one-line purpose of each and you will get most Bedrock Guardrails questions right in under ten seconds.

Bedrock Guardrails Quick Reference Summary

Bedrock Guardrails are a separate, versioned AWS resource that evaluates input and output against up to five independent policy types. Content filters cover six harm categories (hate, insults, sexual, violence, misconduct, prompt attacks) at four strength levels. Denied topics are custom semantic subject blocks. Word filters match profanity and a custom literal blocklist. Sensitive information filters detect managed PII types plus custom regex patterns with MASK or BLOCK actions per type. Contextual grounding check verifies RAG responses against source documents on grounding and relevance thresholds. Apply via InvokeModel/Converse parameters, via Bedrock Knowledge Bases and Agents integrations, or standalone via ApplyGuardrail. Versioned with DRAFT plus numbered versions, priced per text unit per policy, regional like all other Bedrock resources.

FAQ — Amazon Bedrock Guardrails

Q1 — Can I use Bedrock Guardrails without calling a foundation model?

Yes. The ApplyGuardrail API evaluates arbitrary text against a Bedrock Guardrails configuration without invoking any foundation model. This is the pattern for moderating content from external (non-Bedrock) models or general user-generated content using the same Bedrock Guardrails policy used elsewhere in your AI stack.

Q2 — Are Bedrock Guardrails the same as IAM permissions for Bedrock?

No, and this is a common trap. Bedrock Guardrails are a content safety layer — they decide what content is allowed to flow through model invocations. IAM is an access control layer — it decides who is allowed to invoke which models in the first place. Both are typically required in production: IAM for authorisation, Bedrock Guardrails for content policy.

Q3 — Does the contextual grounding check replace RAG?

No. Contextual grounding check is a faithfulness check that runs on top of a RAG system or any workflow that supplies source passages. RAG retrieves passages and injects them into the prompt so the model can answer with external knowledge; contextual grounding check verifies that the model's answer is actually supported by those passages. You need both — RAG for the retrieval, contextual grounding check for the verification.

Q4 — How do I prevent prompt injection attacks with Bedrock Guardrails?

The most direct control is the "prompt attacks" category in content filters, which runs on input. Set it to Medium or High strength to catch common injection patterns. Layer this with denied topics for known sensitive subjects, with prompt engineering that defines clear system role boundaries, and with output filtering to catch any injection that slips through input. Remember that no single Bedrock Guardrails feature is a complete prompt-injection defence — defence in depth is required.

Q5 — Can one Bedrock Guardrails resource protect multiple foundation models?

Yes. A single Bedrock Guardrails resource is model-agnostic. You create it once and attach it to any supported foundation model via the guardrailIdentifier and guardrailVersion parameters on InvokeModel or Converse. Switching models does not require rebuilding the guardrail; the same Bedrock Guardrails policy evaluates inputs and outputs identically whether the model is Claude, Titan, Llama, Mistral, or Cohere.

Q6 — How do I roll back a Bedrock Guardrails change that is blocking legitimate traffic?

Use version management. Bedrock Guardrails versions (1, 2, 3, ...) are immutable once published. If version 5 starts blocking legitimate traffic, change your application's guardrailVersion parameter back to version 4 — a configuration change, no rebuild required. Then edit the DRAFT for a fixed version 6 and promote it when ready. This rollback pattern is one of the main reasons Bedrock Guardrails are a versioned resource rather than an inline call-time configuration.

Q7 — Do Bedrock Guardrails work across AWS Regions?

Bedrock Guardrails are regional resources. A guardrail created in us-east-1 is not accessible from eu-west-1. For multi-region applications you create a matching Bedrock Guardrails resource in each region (typically via infrastructure-as-code to keep the policy definitions synchronised). Cross-region inference profiles on Bedrock do handle guardrail attachment transparently — you attach to the profile's origin region and Bedrock evaluates the guardrail wherever the inference actually runs.

官方資料來源