examhub .cc The most efficient path to the most valuable certifications.
Vol. I
Integrating Claude Code into CI/CD Pipelines

6,400 words · ≈ 32 min read

Claude Code CI/CD pipeline integration is the focus of task statement 3.6, "Integrate Claude Code into CI/CD pipelines", inside Domain 3 (Claude Code Configuration & Workflows, 20% of the CCA-F exam). It is also the centerpiece of the sixth scenario cluster, Claude Code for Continuous Integration, one of the six scenarios from which four are randomly drawn for each sitting. A community pass report from a candidate who scored 893/1000 on the April 2026 exam called the -p flag "the single most-tested Claude Code configuration detail in the entire Domain 3 question pool." Getting Claude Code CI/CD pipeline integration right is not optional for a passing score; it is a line-item requirement.

This study note is a full architecture-level walkthrough of every outline bullet in task 3.6: where Claude Code fits in automated pipelines, the precise semantics of the -p / --print flag, the --output-format json envelope, --json-schema schema enforcement, headless mode without a TTY, pull-request review automation, test generation on commit, security-scan summarization, CLAUDE.md and .mcp.json placement for CI agents, exit-code semantics for pipeline failure propagation, and cost-management discipline. Every section ties Claude Code CI/CD pipeline integration back to the exam scenario and to the specific distractors CCA-F uses to separate candidates who memorized flags from candidates who understand the architecture.

CI/CD Integration Overview — Where Claude Code Fits in Automated Pipelines

A CI/CD pipeline is a scripted sequence of stages that runs on every code change: checkout, build, static analysis, test, package, deploy. Claude Code CI/CD pipeline integration means invoking the claude CLI from inside one of these stages so that Claude participates as a first-class automation step — not as a developer sitting at a terminal, but as a non-interactive process with deterministic inputs and outputs.

The Claude Code CI/CD pipeline integration surface spans five recurring automation shapes:

  • Pull-request review — Claude reads the diff, posts structured comments.
  • Test generation — Claude writes tests for newly added code paths.
  • Security-scan summarization — Claude converts raw SAST output into actionable prose.
  • Documentation maintenance — Claude updates README or docs when signatures change.
  • Release-note drafting — Claude turns a merged-PR list into a changelog.

All five shapes share the same mechanical requirements: the claude process must run without a TTY, must not prompt the human, must emit machine-readable output, and must propagate its exit code so the pipeline fails on error. These requirements collapse into one headline flag — -p — and a small cluster of companion flags that CCA-F expects you to know by sight.

Claude Code CI/CD pipeline integration is the pattern of invoking the claude CLI as a non-interactive automation step inside a CI/CD pipeline (GitHub Actions, GitLab CI, Jenkins, CircleCI, and similar). It requires running with the -p / --print flag for non-interactive mode, controlling output shape via --output-format, scoping Claude's tools through permission settings and the MCP servers declared in .mcp.json, supplying project conventions through CLAUDE.md, and propagating the CLI exit code to the pipeline runner so downstream stages fail correctly. Claude Code CI/CD pipeline integration is exam task statement 3.6 and the anchor of the Claude Code for Continuous Integration scenario cluster.

Why CCA-F Treats CI/CD as Its Own Scenario

Claude Code CI/CD pipeline integration is the only Domain 3 topic that gets its own scenario cluster. The exam guide dedicates one of the six scenarios — Claude Code for Continuous Integration — specifically to this surface, which means every sitting that draws the CI scenario has a cluster of three to five questions walking through the same invocation. Candidates who skip this topic because "it is just a CLI flag" lose the scenario entirely. The community April 2026 pass report explicitly lists -p awareness as a make-or-break concept for the CI scenario.

-p / --print Flag — Non-Interactive Single-Turn Execution for Scripted Use

The -p (also spelled --print) flag is the switch that turns claude from an interactive REPL into a scripted, non-interactive command. Running claude "write me a haiku" without -p opens the interactive TUI, prints a banner, and stays open waiting for more input. Running claude -p "write me a haiku" prints the answer to stdout and exits.

The -p (short) / --print (long) flag is the Claude Code CLI option that switches the claude binary out of interactive TUI mode and into a scripted, non-interactive, single-turn execution mode. With -p, Claude Code skips the terminal UI, reads the prompt from a positional argument or piped stdin, runs the agentic loop once, writes the final assistant message to stdout, and exits with a standard Unix exit code. -p is the single most-frequently-tested Claude Code flag on CCA-F task 3.6 and is a mandatory ingredient of any CI/CD invocation; it does NOT disable tool permissions, safety checks, or any other guardrail — those are configured separately.

What -p Actually Does

Concretely, -p causes the claude CLI to:

  1. Skip the TUI entirely. No banner, no status line, no keystroke handler.
  2. Execute one agentic-loop run from the prompt you passed on the command line (or piped on stdin).
  3. Emit the final assistant message to stdout and return an exit code.
  4. Exit without waiting for further input.

In a CI/CD context this is non-negotiable. CI runners attach no terminal and accept no keyboard input, so a process that blocks on stdin never gets an answer. Invoking claude without -p in CI either hangs until the runner's wall-clock timeout kills it, or prints TUI escape codes into your build log.

Where -p Reads the Prompt From

The -p flag accepts the prompt in any of three ways:

  • Positional argument: claude -p "review the diff in git".
  • Piped stdin: git diff | claude -p.
  • Combined: cat instructions.md | claude -p "follow the instructions on stdin".

CI pipelines almost always use the piped-stdin form because the prompt is dynamic (it depends on the current commit, PR number, or scan output).
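The combined form (data on stdin, instruction as the positional argument) can be sketched as a small shell function. This is a minimal illustration, not a canonical recipe: the function name review_range and the prompt wording are hypothetical, and it assumes both git and claude are on the runner's PATH.

```shell
# review_range BASE: pipe the diff between BASE and HEAD into one
# non-interactive Claude Code turn. The diff travels on stdin; the
# instruction is the positional argument after -p.
# (Hypothetical helper; only `claude -p` itself comes from the text.)
review_range() {
  git diff "$1...HEAD" | claude -p "Review the diff provided on stdin."
}
```

A pipeline step would call it as review_range origin/main. Because the prompt text is fixed and only the piped data changes, the same step works for every commit.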

Whenever you see "non-interactive" or "automated" or "scripted" in a CCA-F scenario, the first flag you reach for is -p. Distractor answers will propose --batch, --headless, --no-tty, or --silent; none of those flags exist as the canonical non-interactive switch. The canonical flag is -p (short) / --print (long).

What -p Does NOT Do

This is the single most-tested trap in the Domain 3 pool. -p is a mode switch for input/output plumbing. It does not:

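  • Disable permission prompts on destructive tools. Tools that require confirmation (file writes, shell commands outside the allowed list) still need permissions pre-approved through the permission rules in .claude/settings.json or the --allowedTools / --permission-mode flags. .mcp.json only decides which MCP servers are available, and CLAUDE.md is instruction text rather than an enforcement mechanism.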
  • Bypass safety checks. Claude Code's built-in safety rails (for example, the refusal to run obviously destructive commands) are unaffected.
  • Change the model, the context window, or the agentic loop. It changes how you talk to Claude Code, not what Claude Code is.
  • Automatically include the full repository as context. You still need CLAUDE.md, explicit file arguments, or tool calls for Claude to see anything beyond the prompt string.

A CI pipeline that uses -p but forgets to configure tool permissions will fail the first time Claude tries to write a file or run a shell command. Community pass reports repeatedly cite this as the highest-value trap in the scenario.
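One way to avoid that failure is to commit an explicit permission policy next to the code. The sketch below writes a read-only policy into .claude/settings.json. The helper name write_ci_settings is hypothetical, and the specific allow and deny entries are illustrative assumptions about what a review job needs, so check the rule syntax against the current settings reference before relying on it.

```shell
# write_ci_settings DIR: create DIR/.claude/settings.json with a read-only
# permission policy for a review job.
# The allow/deny entries are illustrative assumptions, not a required set.
write_ci_settings() {
  mkdir -p "$1/.claude" || return 1
  cat > "$1/.claude/settings.json" <<'EOF'
{
  "permissions": {
    "allow": ["Read", "Grep", "Glob", "Bash(git diff:*)"],
    "deny": ["Write", "Edit"]
  }
}
EOF
}
```

In practice the file would usually be committed once rather than generated on every run; the function form just keeps the example self-contained.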

--output-format json — Machine-Readable Structured Output for Pipeline Consumption

By default, claude -p prints the assistant's final text message to stdout as plain prose. That is readable for a human and almost useless for a pipeline. The --output-format json flag wraps the run's output in a machine-readable envelope so downstream shell scripts, GitHub Actions steps, or build tools can parse structured fields.

--output-format json is the Claude Code CLI option that wraps every non-interactive run in a machine-readable JSON envelope written to stdout — an object containing metadata fields such as session_id, num_turns, total_cost_usd, duration_ms, is_error, and result (the final assistant message as a string). It is the canonical way for CI/CD pipelines to consume Claude Code output programmatically. Critically, --output-format json controls the envelope wrapper only; it does NOT constrain the format of the content inside the result field. Schema enforcement of the content requires a separate mechanism (--json-schema or tool use with strict: true).

The JSON Envelope Shape

With --output-format json, Claude Code emits a single JSON object per run that typically includes:

  • type — a tag identifying the record as a run result.
  • session_id — the Claude Code session identifier (useful for correlating with logs).
  • result — the final assistant message content (what would otherwise be printed as prose).
  • num_turns — how many agentic-loop iterations the run consumed.
  • total_cost_usd — the approximate billed cost of the run.
  • duration_ms — wall-clock time.
  • is_error — whether the run terminated with an error condition.

Your pipeline code can then jq '.result' to extract the answer, jq '.is_error' to gate subsequent steps, or jq '.total_cost_usd' to enforce a budget.
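A minimal sketch of that extraction, using a hand-written envelope rather than real CLI output. The field values and the helper name envelope_field are made up for illustration; only the field names come from the list above.

```shell
# Illustrative envelope; a real run fills these fields with its own values.
ENVELOPE='{"type":"result","session_id":"abc123","result":"Looks good.","num_turns":3,"total_cost_usd":0.0421,"duration_ms":5120,"is_error":false}'

# envelope_field JSON FIELD: print one top-level field of the envelope.
envelope_field() {
  printf '%s' "$1" | jq -r --arg f "$2" '.[$f]'
}

envelope_field "$ENVELOPE" result      # prints: Looks good.
envelope_field "$ENVELOPE" is_error    # prints: false
```

In a real job the first argument would be the captured stdout of claude -p --output-format json instead of the sample string.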

What the Envelope Does Not Change

This is the second most-tested trap in the Domain 3 pool. --output-format json changes the envelope, not the content format of the assistant message itself. The result field still contains whatever Claude decided to emit — if you asked Claude for prose, the result contains prose; if you asked Claude for a JSON array, the result contains a JSON array as a string. The envelope is not a schema enforcer; it is a shipping container.

A pipeline that assumes --output-format json will turn a prose answer into structured JSON fields is designing on a misunderstanding. If you need the assistant's answer itself to conform to a schema, you need --json-schema (covered in the next section), tool use with strict mode, or explicit in-prompt instruction — not --output-format json.

--output-format json changes the envelope wrapper, not the format of the assistant's answer inside .result. A CCA-F distractor will claim that --output-format json alone guarantees Claude's output conforms to a structured schema. It does not. If you need schema-guaranteed output, combine --output-format json with either --json-schema, a tool-use definition using strict: true, or an in-prompt JSON contract with validation. Passing --output-format json in isolation gives you a predictable wrapper around unpredictable content.
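The gap between envelope and content is easy to demonstrate with jq. In this sketch (the helper name result_is_json_array is hypothetical, and both envelopes are hand-written samples) the check passes only when the string inside .result itself parses as a JSON array.

```shell
# result_is_json_array ENVELOPE: succeed only if .result is a string that
# itself parses as a JSON array. A prose answer makes fromjson fail, so the
# function returns non-zero even though the envelope is valid JSON.
result_is_json_array() {
  printf '%s' "$1" \
    | jq -e '.result | fromjson | type == "array"' >/dev/null 2>&1
}
```

Called on an envelope whose result is "The diff looks fine." it fails; called on one whose result is a stringified array it succeeds. A check like this only detects drift after the fact; it does not prevent it.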

Alternative Formats

Claude Code also supports --output-format stream-json for streaming envelopes (one JSON object per event, newline-delimited), useful when a pipeline wants to observe the run turn-by-turn instead of collecting a single result at the end. stream-json is the automation equivalent of the SDK's stream() entry point — it is the right choice when something downstream is watching progress in real time, for example a bot that posts incremental PR comments.
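A consumer of newline-delimited events can be sketched as a read loop. This assumes each event is one JSON object per line carrying a type field; the exact event shapes are an assumption here, and the helper name watch_stream is hypothetical.

```shell
# watch_stream: read newline-delimited JSON events on stdin and report each
# event's type as it arrives. Assumes every event has a "type" field.
watch_stream() {
  while IFS= read -r event; do
    kind=$(printf '%s' "$event" | jq -r '.type')
    echo "event: $kind"
  done
}
```

A bot posting incremental comments would replace the echo with its own handler and pipe the output of claude -p --output-format stream-json into the function.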

--json-schema — Enforcing Output Schema Compliance in CI Contexts

--json-schema goes one layer deeper than --output-format json. It instructs Claude Code to validate its final answer against a JSON Schema you provide on the command line or via file reference. If the answer does not conform, Claude Code either retries internally, returns an error, or both — depending on how strict you configured it.

When You Need Schema Enforcement

Schema enforcement in CI matters when the pipeline step immediately consumes the answer. Typical examples:

  • A PR-review step expects an array of comments with file, line, severity, and message.
  • A test-generation step expects a list of test-file writes with path and content.
  • A security-scan summarizer expects a triaged list with finding_id, risk, and action.

In every one of these shapes, the downstream step cannot tolerate prose drift. If Claude sometimes returns the array and sometimes returns a prose summary with a JSON block embedded, the pipeline breaks intermittently — the worst class of CI failure.

Pairing --json-schema With Tool Use

For the highest reliability, combine --json-schema at the CLI boundary with strict-tool-use in the prompt itself. The tool-use approach (a Claude tool with strict: true in its input_schema) guarantees Claude's generation is constrained to the schema at token time; the --json-schema CLI flag adds a second-stage validator that catches any drift the model layer somehow lets through. Belt-and-braces design is appropriate for CI because a failed run wakes up on-call; a false pass corrupts production.

What CCA-F Tests About --json-schema

Expect exam questions that ask you to pick the correct mechanism for "guarantee the JSON output conforms to this schema in a CI pipeline." Wrong answers will propose:

  • Stronger system prompts ("tell Claude in the CLAUDE.md to always return valid JSON").
  • --output-format json alone (envelope, not content).
  • Post-hoc jq validation with no retry loop.

The right answer pairs --json-schema (or tool use with strict: true) with a CI-friendly retry policy. The pain-point pp-01 from the research manifest — programmatic enforcement beats prompt guidance — is directly relevant here.
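A CI-friendly retry policy can be as small as a bounded loop around a command that both generates and validates. The sketch below is generic shell, not a Claude Code feature: retry_valid is a hypothetical helper, and the wrapped command is assumed to exit non-zero whenever the output violates the schema.

```shell
# retry_valid MAX CMD...: run CMD until it succeeds or MAX attempts are used.
# CMD is expected to generate output and validate it, exiting non-zero on a
# schema violation. Returns 1 if every attempt fails.
retry_valid() {
  max=$1
  shift
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt of $max failed validation" >&2
    attempt=$((attempt + 1))
  done
  return 1
}
```

Keeping MAX small (two or three) bounds the cost of a persistently failing prompt while still absorbing an occasional malformed answer.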

Headless Mode — Running Claude Code Without TTY for Automation

"Headless" is the operational term for running claude in an environment with no attached terminal — exactly the environment a CI runner provides. Headless mode is enabled by the combination of -p (non-interactive) plus ensuring no TUI features are invoked. Claude Code is designed to detect a non-TTY stdin/stdout pair and automatically adjust output rendering, but -p is still the explicit contract that says "do not expect a human."

Why Not Just Rely on TTY Detection?

Some CI runners simulate a TTY (for colored log output, progress rendering) even though no human is attached. Without -p, Claude Code may interpret the presence of a pseudo-TTY as a signal to launch the interactive UI, producing broken output or hangs. -p is the belt-and-braces way to say "non-interactive," regardless of what the runner's TTY emulation thinks.

Environment Variables That Matter in Headless Contexts

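  • ANTHROPIC_API_KEY: required for authentication; inject via CI secret storage.
  • CLAUDE_CODE_WORKSPACE: workspace root, typically set to the checkout directory. Confirm this variable name against the current settings reference; by default Claude Code works in the directory it is launched from.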
  • Various flags mirrored as env variables for settings that are awkward to pass on the command line.

The exact env-variable list is managed through the settings reference; CCA-F does not test the complete list, but does test the concept that env variables are how CI pipelines inject configuration that is too dynamic or too sensitive for a static .claude/settings.json file.

In headless mode, treat ANTHROPIC_API_KEY as a CI secret managed by the runner (GitHub Actions secrets, GitLab CI variables, Jenkins credentials). Never commit the key in CLAUDE.md, .mcp.json, settings.json, or any other version-controlled file. The exam does not test specific secret-management syntax per runner, but it does test the principle that API keys flow through runner secrets, not through version-controlled config.
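A job can verify that the secret actually arrived before spending any runner time on Claude. This is a minimal sketch with a hypothetical helper name; it checks only that the variable is non-empty and never prints its value.

```shell
# require_api_key: fail fast with a clear message when the secret was not
# injected into the job's environment. Never echoes the key itself.
require_api_key() {
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set; configure it as a runner secret" >&2
    return 1
  fi
}
```

Calling require_api_key || exit 1 at the top of the step turns a confusing authentication failure later in the run into an immediate, readable one.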

PR Review Automation — Auto-Commenting on Pull Requests with Claude Analysis

Pull-request review is the single most common Claude Code CI/CD pipeline integration shape in production. The pattern is:

  1. On a PR event, the CI runner checks out the branch.
  2. The runner computes the diff (git diff origin/main...HEAD).
  3. The runner pipes the diff into claude -p --output-format json with a review prompt and any supporting CLAUDE.md scopes.
  4. The runner extracts structured review comments from the JSON result.
  5. A GitHub Actions (or GitLab CI) step posts the comments back to the PR via the platform API.
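Step 4 of the pattern above might look like the following sketch. It assumes a contract that is not guaranteed by the CLI: that .result holds a stringified JSON array of objects with file, line, severity, and message keys. The helper name format_comments is hypothetical.

```shell
# format_comments: read an envelope on stdin whose .result is a JSON array
# of {file, line, severity, message} objects (an assumed contract) and print
# one "file:line [severity] message" line per comment.
format_comments() {
  jq -r '.result | fromjson | .[]
         | "\(.file):\(.line) [\(.severity)] \(.message)"'
}
```

The runner, not Claude, would then hand each line to the platform API in step 5. If .result is prose instead of an array, jq exits non-zero and the step fails visibly.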

Why This Shape Is Canonical

PR review in CI separates three responsibilities cleanly:

  • Claude Code is responsible for generating the review content.
  • The CI runner is responsible for gathering the diff and posting the comments.
  • The pull-request platform (GitHub, GitLab) is responsible for rendering comments inline.

Claude Code does not call the PR platform directly in this pattern. It only produces structured output. This is the right separation — it means the same claude -p invocation works in GitHub Actions, GitLab CI, or a local developer script, as long as the diff is piped in and the structured comments are handled by the caller.

Scoping Claude's Attention in PR Review

The prompt for PR review typically instructs Claude to focus on:

  • Correctness issues (bugs, logic errors).
  • Security concerns surfaced by the diff.
  • Style deviations from project conventions (loaded via CLAUDE.md).
  • Missing tests for new code paths.

CLAUDE.md plays a critical role here. When Claude Code runs in CI with -p, it still honors the CLAUDE.md hierarchy in the checked-out workspace. Project conventions, review checklists, and language-specific rules codified in CLAUDE.md become the implicit criteria for every CI review without bloating the command-line prompt.

The allowedTools Discipline for PR Review

For pure review (no file edits), PR-review CI jobs typically restrict Claude to read-only tools: Read, Grep, Glob, and Bash with a narrow allowlist. Write access is explicitly withheld because the pipeline is commenting on, not modifying, the proposed changes. The permissions model is enforced through the settings file's permission rules and the --allowedTools flag, with .mcp.json limiting which MCP servers exist at all, not by hoping Claude will behave.
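On the command line, the read-only discipline could be sketched like this. The function name, the prompt wording, and the exact tool list are illustrative assumptions; the diff is expected on stdin.

```shell
# review_read_only: run a review with only read-oriented tools pre-approved.
# The diff is expected on stdin. The prompt comes directly after -p so the
# tool list that follows --allowedTools cannot swallow it.
# (Tool list and prompt are illustrative assumptions.)
review_read_only() {
  claude -p "Review the diff on stdin. Do not modify files." \
    --output-format json \
    --allowedTools "Read,Grep,Glob"
}
```

Usage would be git diff origin/main...HEAD | review_read_only. Since no write tool is pre-approved and nobody is present to approve one, an attempted edit is refused rather than silently applied.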

Test Generation Workflow — Triggering Claude to Write Tests on Commit

Test generation is the second canonical Claude Code CI/CD pipeline integration pattern. A typical shape:

  1. A push or PR event triggers the pipeline.
  2. A pre-existing step identifies functions or files lacking test coverage (via a coverage report, or via a diff-aware heuristic).
  3. claude -p is invoked with a prompt that names the un-tested files, the project's test conventions (from CLAUDE.md), and instructions to write tests.
  4. Claude Code, with Read and Write tools enabled in a sandboxed path, produces test files.
  5. The pipeline runs the new tests; if they pass, it opens a PR with the changes.
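The diff-aware heuristic in step 2 can be very simple. The sketch below assumes a hypothetical layout in which src/NAME.py is tested by tests/test_NAME.py; both the layout and the helper name missing_tests are assumptions for illustration.

```shell
# missing_tests ROOT: read changed source paths (one per line, relative to
# ROOT) on stdin and print those with no matching tests/test_<name>.py file.
# The src/ and tests/ layout is a hypothetical convention.
missing_tests() {
  while IFS= read -r path; do
    case "$path" in
      src/*.py)
        name=$(basename "$path" .py)
        [ -f "$1/tests/test_$name.py" ] || echo "$path"
        ;;
    esac
  done
}
```

Its output is the list of files to name in the prompt for step 3, for example git diff --name-only origin/main...HEAD | missing_tests . piped into the prompt builder.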

Why Test Generation Is a Natural Fit for CI

Unlike interactive coding (where the developer is debugging tightly-coupled feature code), test authoring is high-volume, pattern-driven, and naturally scripted. The human is happy to review a PR containing only new tests; the cost of getting it wrong is low (the tests either pass, fail, or get revised).

CLAUDE.md Anchoring for Test Style

A project-level CLAUDE.md that documents "we use pytest, fixtures live in tests/conftest.py, we mock external HTTP with responses" is what keeps test-generation CI outputs consistent. Without the CLAUDE.md anchor, the same prompt produces different test styles on different runs — burning reviewer time on stylistic churn. This is the CI counterpart to the pain point pp-08: monolithic or missing CLAUDE.md instruction produces unreliable outputs.

Scoping Writes With Allowed Paths

For test generation, Claude Code is allowed to write — but only to the tests/ directory, for example. Write-scope enforcement is typically expressed through the permissions model rather than trusting Claude to stay in its lane. The CCA-F exam rewards the principle: restrict the blast radius of the tool allowlist to exactly what the CI job needs, never more.
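A pipeline can also verify the blast radius after the run, as a second check alongside the permissions model. In this sketch the helper name and the tests/ prefix are illustrative; the function reads changed paths on stdin and fails if any lie outside tests/.

```shell
# only_tests_changed: read changed paths on stdin; fail if any path lies
# outside tests/. Intended to run after Claude finishes and before the
# pipeline opens a PR. This complements permission rules, not replaces them.
only_tests_changed() {
  outside=$(grep -v '^tests/' || true)
  if [ -n "$outside" ]; then
    echo "unexpected writes outside tests/:" >&2
    echo "$outside" >&2
    return 1
  fi
}
```

One way to feed it is git ls-files --modified --others --exclude-standard | only_tests_changed, which covers both edited and newly created files.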

Security Scan Integration — Using Claude to Summarize Static Analysis Findings

Static analysis tools (Semgrep, CodeQL, Bandit, Brakeman, and the rest) produce volumes of raw findings that are hard for humans to triage. Claude Code CI/CD pipeline integration turns that raw stream into a triaged, prose-explained summary.

The Typical Shape

  1. A security-scan step runs and emits findings as JSON, SARIF, or CSV.
  2. The findings artifact is piped or referenced as input to claude -p --output-format json --json-schema <triage-schema>.
  3. The prompt instructs Claude to group findings by root cause, filter out known false positives (optionally matched against a .false-positive-allowlist.json in the repo), and emit a triaged list with severity, recommended action, and a short explanation.
  4. The pipeline posts the triage summary to the PR or to a security dashboard.

Why Not Have Claude Run the Scanner Itself?

The scanner is a specialized tool with decades of tuning; Claude is not a replacement. The value Claude adds is interpretation — translating dense SAST output into prose, grouping related findings, and routing the human's attention to the highest-risk items first. This separation mirrors the PR-review shape: Claude does not do the analysis, it explains the analysis.

Budget Discipline for Scan Summarization

A large scan can produce thousands of findings. Feeding all of them directly into Claude Code is both expensive and wasteful — Claude cannot usefully reason over ten thousand findings at once, and most of them are near-duplicates. The correct pattern is:

  • Pre-filter the scan output to deduplicate and bucket findings before invoking Claude.
  • Cap how many findings are summarized per run.
  • Summarize in passes when the volume is large, feeding only a bucket at a time.

This anticipates the cost-management section below, but is worth calling out inline: do not use Claude Code as a raw findings transform. Use it as a triage layer on already-preprocessed input.
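The pre-filter, deduplicate, and cap steps can be sketched in one jq expression. The field names severity, rule_id, and path are assumptions about the scanner's export format, and prefilter_findings is a hypothetical helper name.

```shell
# prefilter_findings MAX: read a JSON array of findings on stdin, keep only
# high/critical severity, drop duplicates of the same rule in the same file,
# and cap the list at MAX items.
# Field names are assumptions about the scanner's export format.
prefilter_findings() {
  jq -c --argjson max "$1" '
    [ .[] | select(.severity == "high" or .severity == "critical") ]
    | unique_by([.rule_id, .path])
    | .[:$max]'
}
```

Only the reduced array is then passed to claude -p, so the token cost scales with the cap rather than with the size of the raw scan.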

Environment Configuration — CLAUDE.md and .mcp.json for CI Agent Context

Claude Code CI/CD pipeline integration inherits from the same configuration machinery that interactive Claude Code uses — CLAUDE.md for prose instructions, .mcp.json for MCP server wiring, .claude/settings.json for model/tool configuration — but applied with two CI-specific disciplines.

CLAUDE.md in CI

The CI runner's working directory, after checkout, includes whatever CLAUDE.md files are committed to the repository. Claude Code running headless still walks the CLAUDE.md hierarchy: a global CLAUDE.md at ~/.claude/CLAUDE.md (rarely present on CI runners), the project-root CLAUDE.md, and directory-scoped CLAUDE.md files deeper in the tree.

CI-specific CLAUDE.md patterns:

  • Keep CI-relevant conventions (test framework, review checklist, security guardrails) in the committed project CLAUDE.md so the CI invocation sees them automatically.
  • Use @path imports to split CI-relevant fragments (for example @.claude/ci-review.md) from interactive-only content.
  • Avoid committing developer-local conventions to the project CLAUDE.md — those belong in the user-scoped CLAUDE.md that the CI runner does not have.

.mcp.json in CI

.mcp.json at the project root declares the MCP servers Claude Code may use. In CI, the policy should be minimal:

  • Enable only the MCP servers the CI job actually needs (typically git, sometimes a GitHub MCP server for PR comment posting, rarely an internal service MCP for knowledge lookup).
  • Do NOT enable experimental MCP servers in CI runs; CI is not the place to debug integrations.
  • Use CI-runner secrets (not committed files) for any MCP server credentials.
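A minimal .mcp.json for CI might declare a single server and take its credential from the environment. Everything specific in this sketch is hypothetical: the server name review-helper, its command, its argument, and the REVIEW_HELPER_TOKEN variable. It also assumes that variable references of the form ${NAME} in .mcp.json are expanded from the environment at load time, which should be confirmed against the MCP configuration documentation.

```shell
# write_ci_mcp_config DIR: write a one-server .mcp.json into DIR.
# "review-helper" and its command are hypothetical placeholders. The quoted
# heredoc keeps ${REVIEW_HELPER_TOKEN} literal, so the secret is supplied by
# the runner environment and never stored in the committed file.
write_ci_mcp_config() {
  cat > "$1/.mcp.json" <<'EOF'
{
  "mcpServers": {
    "review-helper": {
      "command": "review-helper-mcp",
      "args": ["--read-only"],
      "env": { "REVIEW_HELPER_TOKEN": "${REVIEW_HELPER_TOKEN}" }
    }
  }
}
EOF
}
```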

The --strict-mcp-config Flag

Claude Code's CLI reference describes a --strict-mcp-config flag that makes a run use only the MCP servers passed through --mcp-config and ignore every other MCP configuration source. In CI, this is a feature, not an annoyance: the job runs with exactly the servers the pipeline names, and a misconfigured MCP server should fail the job loudly, not silently start a run that is missing half its capabilities. The CCA-F exam does not test the exact flag spelling but does test the principle that CI runs should prefer strict configuration over silent degradation.

CLAUDE.md in CI runs is subject to the same hierarchy rules as interactive Claude Code — global, project, directory scopes, in that order. The CI runner typically has no global CLAUDE.md, so the committed project CLAUDE.md is the complete source of truth for conventions in CI. This amplifies the "modular CLAUDE.md" design principle: anything that matters to CI must be in a committed, scope-appropriate file, because there is no interactive fallback to repair missing instructions.

Exit Codes and Error Handling — Pipeline Failure Propagation From Claude Code

A CI/CD pipeline stage is a black box with exactly one output for flow control: its exit code. Zero means success, non-zero means failure, and the runner decides what to do next based only on that integer. Claude Code CI/CD pipeline integration is useful to the pipeline only to the degree that its exit codes correctly reflect whether the step succeeded or failed.

Exit Code Conventions

Claude Code follows standard Unix exit-code conventions:

  • 0 — successful run; the agentic loop terminated normally (end_turn) and no error occurred.
  • Non-zero — the run failed for some reason: authentication error, tool permission denied, schema validation failure (when --json-schema is enforced), iteration cap exceeded, Anthropic API error, or an explicit error signal from within Claude.

Why Exit Code Discipline Matters

In interactive use, a failed Claude Code run is obvious — the human sees the error message on screen. In CI, a failed run that returns exit code 0 pollutes the pipeline silently: the downstream step (post a comment, open a PR, deploy a build) proceeds as if everything worked. A passing pipeline that shipped broken output is the worst class of CI failure.

Conversely, a succeeded run that returns a non-zero exit code stops the pipeline unnecessarily, costing developer time on false alarms.

Parsing the JSON Envelope's is_error Field

When running with --output-format json, the exit code is one signal and the is_error field inside the JSON envelope is a second. A robust CI wrapper checks both:

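# Capture the envelope; the assignment preserves claude's exit status.
if ! OUTPUT=$(claude -p --output-format json "..."); then
  echo "claude returned non-zero exit code" >&2
  exit 1
fi
# Second signal: the envelope can report an error even after a clean exit.
if [ "$(printf '%s' "$OUTPUT" | jq -r '.is_error')" = "true" ]; then
  echo "claude returned is_error=true in envelope" >&2
  exit 1
fi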

Defense in depth — exit code plus envelope field — catches both hard failures (process error) and soft failures (Claude reached an internal error state but the process still exited cleanly).

Bubbling Claude's Structured Errors to the Runner

If Claude runs with --json-schema and the final output fails validation, the pipeline needs a deterministic failure. The correct pattern is: validate the schema at the claude layer, set is_error: true in the envelope, emit non-zero exit code. The CI wrapper then propagates the failure in the runner's native idiom (a ::error:: annotation in GitHub Actions, a failure stage in GitLab CI, an unstable build in Jenkins).
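For GitHub Actions, the propagation step can be sketched as a tiny helper that prints the ::error:: workflow command and returns a failing status. The helper name fail_step is hypothetical; other runners would substitute their own idiom.

```shell
# fail_step MESSAGE: emit a GitHub Actions error annotation on stdout and
# return 1 so the caller can stop the job with `fail_step "..." || exit 1`.
fail_step() {
  echo "::error::$1"
  return 1
}
```

The annotation makes the failure visible in the run summary, while the non-zero status is what actually stops downstream steps.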

Cost Management in CI — Limiting Scope, Caching, and Budgeting Token Usage

CI runs Claude Code a lot. A repository with 100 PRs per week, each triggering a Claude-powered review, is 100 agentic-loop runs per week at minimum. Without discipline, Claude Code CI/CD pipeline integration becomes the single largest line item in an Anthropic bill. Cost management is not a nice-to-have; it is a design requirement.

Scope Limitation

The cheapest token is the one you never send. Before invoking Claude Code in CI, narrow the input:

  • Only review changed files. Pipe the diff, not the whole codebase.
  • Only summarize findings that cross a severity threshold. Skip the noise.
  • Only generate tests for new or changed functions. Leave the rest alone.
  • Pre-filter with cheap tools before invoking Claude. Grep, jq, and a handful of awk lines eliminate 80 % of irrelevant input.

Caching

Where the underlying platform supports prompt caching, long stable preambles (CLAUDE.md content, review checklists, security guidance) cache cheaply across runs. The exact caching mechanics live inside Claude's platform and are out of scope for CCA-F (the exam guide's Appendix explicitly excludes "prompt caching implementation details beyond knowing it exists"), but the architectural fact that stable preambles cost less than volatile preambles is fair game.

Token Budget Per Run

--output-format json emits total_cost_usd and token counts in the envelope. A CI wrapper that enforces a per-run budget cap is straightforward:

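# Read the reported cost from the envelope captured earlier in $OUTPUT.
COST=$(printf '%s' "$OUTPUT" | jq -r '.total_cost_usd')
# bc prints 1 when the comparison holds; (( )) turns that into an exit status.
if (( $(echo "$COST > 0.50" | bc -l) )); then
  # \$ stops the shell from expanding $0 (the script name) in the message.
  echo "claude run exceeded \$0.50 budget: $COST"
  # alert, tag, or fail accordingly
fi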

Running Claude Code on Every Push vs on Every Merge

A common cost lever is when to run Claude Code. Running it on every push (every git push to a PR branch) gives fast feedback but multiplies the bill. Running it only on PR open, PR ready-for-review, or PR merge cuts the spend dramatically. The CCA-F exam does not prescribe a specific trigger policy, but it does test the principle that CI trigger policy is a cost lever, and the right answer often involves triggering on fewer events rather than on every event.
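A trigger policy can be expressed as a guard at the top of the job. The sketch below uses GitHub's pull_request event name and its opened and ready_for_review action types; the choice of which actions justify a run is an example policy only, and should_run_review is a hypothetical helper name.

```shell
# should_run_review EVENT ACTION: succeed only for the pull-request actions
# this example policy considers worth paying for. Pushes to an open PR
# (action "synchronize") are deliberately skipped.
should_run_review() {
  [ "$1" = "pull_request" ] || return 1
  case "$2" in
    opened|ready_for_review) return 0 ;;
    *) return 1 ;;
  esac
}
```

A step would call should_run_review "$EVENT" "$ACTION" || exit 0 so that skipped events end the job successfully instead of failing it.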

Claude Code CI/CD pipeline integration cheat sheet — six facts to lock in:

  1. -p (or --print) is the canonical non-interactive mode flag; without it, claude runs the TUI.
  2. -p does NOT disable permission prompts on destructive tools — permissions are separate.
  3. --output-format json wraps the output in an envelope; it does NOT enforce a schema on the content.
  4. --json-schema (or tool use with strict: true) is how you enforce schema on the content.
  5. Exit code + is_error field in the envelope are both signals a CI wrapper must check.
  6. CLAUDE.md and .mcp.json are the configuration surface in CI; there is no interactive fallback.

Plan Mode in CI — Why the Default Answer Is No

Plan mode (task 3.4) is the Claude Code feature that asks Claude to produce a plan for approval before executing. In interactive use, plan mode is often the right choice for risky or ambiguous tasks. In CI, the default answer is the opposite: do not use plan mode in CI.

Why Plan Mode Breaks CI

Plan mode produces a proposed plan and waits for approval before executing. In a non-interactive CI context, there is no human to approve; the run either hangs waiting or the pipeline times out. This is a concrete case where the community pain-point pp-06 ("plan mode is not always the safer choice") manifests — plan mode is the wrong choice in non-interactive pipelines, not the safer choice.

The Rare Exceptions

There are edge cases where a CI-like automation wants a plan produced but not executed — for example, a pipeline stage that merely prepares a plan for a human to review asynchronously in a separate tool. In those rare cases, Claude Code can be invoked with plan-mode flags, but the pipeline must parse the plan out of the envelope and explicitly not wait on approval. This is a niche shape; on CCA-F exam day, the default answer for "plan mode in CI" is "do not enable it."

A CCA-F distractor will sometimes offer "enable plan mode for safer CI execution" as a plausible answer. It is not safer — it is a pipeline deadlock waiting to happen. The correct mental model: plan mode is for interactive risky tasks where a human can approve, CI is by definition non-interactive, therefore plan mode and CI are architecturally incompatible by default.

Iterative Refinement and Multi-Agent Patterns Inside a CI Step

Two adjacent task statements bleed into task 3.6 whenever the CI job is more than a one-shot transform: task 3.5 (iterative refinement) and task 1.2/1.3 (multi-agent orchestration). Both are useful inside CI — but both need CI-specific discipline to stay cost-safe.

Iterative Refinement Inside a Single Run

A CI step that generates tests or writes code may need to iterate — write, run, observe failure, fix, re-run — inside a single claude -p invocation. The agentic loop inside Claude Code handles this natively: Claude calls Write, then Bash to run the test, observes failures, calls Edit to fix, runs again, until the test passes or the iteration cap fires.

CI-Specific Iteration Caps

The iteration cap for a CI step should be tight — typically ten or fewer turns — because an unbounded CI step steals runner time from every subsequent job. Combine with a wall-clock timeout at the runner level (for example, GitHub Actions' timeout-minutes on the step).
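A minimal sketch of the wall-clock half of that advice, for wrappers that launch the step from Python instead of relying only on the runner's own timeout. The function name run_with_deadline and the exit code 124 (borrowed from the GNU timeout convention) are illustrative choices, not part of Claude Code; cmd stands for whatever command line the job already uses.

```python
import subprocess

# Assumption: 124 mirrors the GNU `timeout` convention; any non-zero code
# that the pipeline treats as failure would work equally well.
TIMEOUT_EXIT_CODE = 124


def run_with_deadline(cmd: list[str], timeout_s: float) -> int:
    """Run cmd and return its exit code, or TIMEOUT_EXIT_CODE on deadline."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and reaps it before re-raising,
        # so no stray process keeps holding the runner.
        return TIMEOUT_EXIT_CODE
    return completed.returncode
```

The runner-level timeout still matters as the outer guard; a wrapper-level deadline like this one simply lets the step fail with a recognizable exit code before the runner has to kill it.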

The Stop-Sign: Repeated-State Detection

If Claude keeps proposing the same Edit → Bash cycle without progress, the loop is stuck. Structured error responses from Bash (non-zero exit codes with actionable stderr) plus Claude's own pattern recognition usually break the cycle, but a belt-and-braces CI job will exit on repeated identical tool calls.
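One way to sketch the belt-and-braces check, assuming the job has some log of tool calls available as a list of records. The record shape ({"tool": ..., "input": ...}) and the function names are hypothetical; how a job obtains that log is outside this sketch. Because re-running the same test command is normal in a write-run-fix loop, the check looks for a whole cycle repeating with identical inputs, not for a single repeated call.

```python
import json


def fingerprint(call: dict) -> tuple[str, str]:
    """Stable identity for a tool call: tool name plus canonical JSON input."""
    return call["tool"], json.dumps(call["input"], sort_keys=True)


def is_stuck(calls: list[dict], period: int = 2, repeats: int = 3) -> bool:
    """True if the last `period * repeats` calls are one block repeated.

    period=2 matches an Edit-then-Bash cycle; repeats is how many identical
    cycles in a row count as "no progress".
    """
    window = period * repeats
    if len(calls) < window:
        return False
    tail = [fingerprint(c) for c in calls[-window:]]
    block = tail[:period]
    return all(tail[i] == block[i % period] for i in range(window))
```

A job that sees is_stuck return True can stop the step and exit non-zero without waiting for the iteration cap.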

When Subagents Belong in a CI Step

Claude Code CI/CD pipeline integration can delegate to subagents for parallelizable work — reviewing each changed file in its own subagent, summarizing each scan-finding bucket independently, generating tests per module. The CI-specific discipline is:

  • Keep subagent counts small because each subagent spawns its own agentic loop and multiplies cost.
  • Use subagents only when the parallelism actually reduces wall-clock time (the runner has parallel capacity) or when context isolation is required.
  • Monitor cumulative cost across all subagents, not per subagent.
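The third bullet can be sketched as follows, assuming each subagent's run is available as a JSON envelope string carrying the total_cost_usd field. The function names and the idea of collecting envelopes into a list are illustrative assumptions, not a documented Claude Code workflow.

```python
import json


def cumulative_cost(envelopes: list[str]) -> float:
    """Sum total_cost_usd across every subagent envelope."""
    total = 0.0
    for raw in envelopes:
        envelope = json.loads(raw)
        # Assumption: an envelope without the field contributes zero.
        total += float(envelope.get("total_cost_usd", 0.0))
    return total


def over_budget(envelopes: list[str], budget_usd: float) -> bool:
    """Compare the combined spend, not any single subagent, to the budget."""
    return cumulative_cost(envelopes) > budget_usd
```

Three subagents that each look cheap in isolation can still push the step past its budget, which is why the comparison runs on the sum.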

The scenario cluster for CI may reference multi-agent shapes indirectly — recognize them as task-1 patterns deployed in a task-3.6 context.

Common Exam Traps

Claude Code CI/CD pipeline integration is one of the trap-densest topics in Domain 3. Five patterns recur across the question pool.

Trap 1: Believing -p Disables Safety Checks or Permissions

-p / --print is a mode switch for input/output plumbing and nothing more. It does NOT bypass the tool permission model, the refusal-to-run-destructive-commands safety rail, or any other protection. A CI job that relies on -p alone to "just do whatever Claude wants" will fail the first time Claude tries to write outside its allowlist. The correct pattern is -p for non-interactive execution plus explicit permission configuration via .claude/settings.json or the --allowedTools flag, with .mcp.json declaring which MCP servers the run can reach.

Trap 2: Expecting --output-format json to Constrain the Answer's Format

--output-format json changes the envelope. The .result field inside the envelope still holds whatever Claude generated — prose if asked for prose, a JSON array if asked for a JSON array. Schema enforcement of the content requires --json-schema, tool use with strict: true, or both in combination. Relying on --output-format json alone for schema-safe output is a pipeline bug waiting to happen.
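The envelope-versus-content distinction can be illustrated with a defensive downstream parser. This is a sketch, not a replacement for --json-schema: the envelope always parses, but .result has to be parsed and shape-checked separately. The function name extract_findings and the "file" and "message" keys are hypothetical stand-ins for whatever the pipeline actually expects.

```python
import json


def extract_findings(envelope_text: str) -> list[dict]:
    """Parse the envelope, then separately parse and check the answer."""
    envelope = json.loads(envelope_text)  # the envelope itself is JSON
    result = envelope["result"]           # the answer is just a string
    try:
        payload = json.loads(result)
    except json.JSONDecodeError as exc:
        raise ValueError("result is not JSON; got prose or mixed text") from exc
    # Hypothetical shape: a list of objects with "file" and "message" keys.
    well_formed = isinstance(payload, list) and all(
        isinstance(item, dict) and "file" in item and "message" in item
        for item in payload
    )
    if not well_formed:
        raise ValueError("result is JSON but does not match the expected shape")
    return payload
```

Both failure paths are invisible to a script that only confirms the envelope parsed, which is the gap this trap describes.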

Trap 3: Enabling Plan Mode in a Non-Interactive Context

Plan mode waits for approval. CI has no human to approve. The pipeline hangs until a timeout kills it. Plan mode is an interactive-only feature; the default answer in CI is "do not enable it." Distractor answers framing plan mode as a safety enhancement in CI are wrong.

Trap 4: Ignoring the Exit Code or the is_error Envelope Field

A robust CI wrapper checks both the process exit code and the is_error field in the JSON envelope. Soft failures (Claude hit an internal error but exited cleanly) and hard failures (process-level crash) are both possible; checking only one misses half the failure modes. Community pass reports cite single-signal error handling as a frequent "close but wrong" pattern.
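A minimal sketch of the two-signal check, assuming the wrapper already has the process exit code and the captured stdout of a run made with --output-format json. The function name run_succeeded is illustrative, and failing closed when the envelope is unparseable or lacks is_error is a design assumption of this sketch, not documented behavior.

```python
import json


def run_succeeded(exit_code: int, stdout_text: str) -> bool:
    """Succeed only if the exit code is zero AND the envelope says no error."""
    if exit_code != 0:
        return False  # hard failure at the process level
    try:
        envelope = json.loads(stdout_text)
    except json.JSONDecodeError:
        return False  # no readable envelope: treat as failure
    if not isinstance(envelope, dict):
        return False
    # Fail closed: only an explicit is_error == false counts as success.
    return envelope.get("is_error") is False
```

A wrapper would typically turn a False result into a non-zero exit of its own, so the runner sees the failure on the one signal it actually watches.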

Trap 5: Committing Secrets or Full Context Into Committed Config

ANTHROPIC_API_KEY, MCP server credentials, and any other secret belong in CI runner secret storage — never in a committed CLAUDE.md, .mcp.json, or .claude/settings.json. Additionally, do not assume CI runs automatically see the entire repository as Claude context; pipe the diff or specify files explicitly, both for security (minimum necessary disclosure) and cost (minimum necessary tokens).
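A rough sketch of two guards a pipeline might add around this rule: a pre-commit style scan for key-like strings in config text, and a helper that reads the key from the environment supplied by the runner's secret store. The function names are hypothetical, and the regular expression assumes Anthropic keys carry an sk-ant- prefix; it is a heuristic, not a complete secret scanner.

```python
import re
from typing import Mapping

# Assumption: API keys start with "sk-ant-"; adjust if the format differs.
KEY_LIKE = re.compile(r"sk-ant-[A-Za-z0-9_-]{8,}")


def contains_key_like_string(text: str) -> bool:
    """True if text (e.g. the contents of a committed config) holds a key-like token."""
    return KEY_LIKE.search(text) is not None


def api_key_from_env(env: Mapping[str, str]) -> str:
    """Read the key from the environment; refuse to continue if it is absent."""
    key = env.get("ANTHROPIC_API_KEY", "")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; configure it as a CI secret")
    return key
```

In a real job, api_key_from_env would be called with os.environ, and the scan would run over the text of each committed configuration file before the Claude step starts.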

Practice Anchors

Claude Code CI/CD pipeline integration is the anchor of scenario cluster six. Two anchor sub-scenarios are especially common on exam day.

The Failing PR-Review Pipeline

Scenario shape: a team has wired claude into GitHub Actions on PR events. The runs hang intermittently, or produce ANSI-escape noise in the build log, or fail with permission errors when Claude tries to edit a file. The exam asks you to identify the architectural fix. Expect the answer to involve adding -p, narrowing tool permissions, and often separating CLAUDE.md content into a CI-scoped fragment. Distractors will propose "increase timeout" (treats a symptom, not a cause), "use plan mode" (makes the hang worse), or "disable safety checks with -p" (mis-states what -p does).

The Schema-Drift Test-Generation Job

Scenario shape: a test-generation job has been stable for months. It suddenly starts failing because Claude occasionally returns prose alongside the test-file content, breaking the downstream file-write step. The exam asks you to fix this permanently. The right answer layers --json-schema (or tool use with strict: true) on top of --output-format json; distractors will propose strengthening the prompt ("tell Claude harder in the CLAUDE.md that only JSON is allowed"). The programmatic-vs-prompt pain-point (pp-01) is directly tested here — the right answer is the programmatic constraint, not the prompt constraint.

Preparing for the Full Scenario Cluster

The CI scenario cluster weaves together Domain 2 tool design (what tools Claude may call), Domain 3 configuration (CLAUDE.md, flags, permissions), Domain 4 structured output (schema enforcement), and Domain 5 reliability (exit-code propagation, error categorization). Treating the CI scenario as purely a Domain 3 exercise misses the cross-domain questions that cost a point or two each.

Plain-English Explanation

Claude Code in a CI/CD pipeline is a very specific engineering problem, and three everyday systems make its moving parts immediately intuitive.

The Kitchen Service Window — -p as the Order-Ticket Handoff

Imagine a kitchen with two ways of ordering. In the dining room (interactive Claude Code), a server walks up to the pass, describes the table's needs in words, the chef asks clarifying questions, cooks the dish, and walks it back. In the delivery window (claude -p), an order ticket comes through on a printer, the chef reads it, cooks the dish exactly as specified, and puts the finished plate in a warming box for the courier to grab. No conversation, no clarifying questions, no waiting for a diner to decide. The -p flag is the delivery-window printer. It tells the chef "this is a ticket, not a conversation — cook and go." And just like the delivery window, -p does not mean the chef ignores food-safety rules (permissions and safety checks still apply). It means the chef does not chat. A kitchen that accidentally prints delivery tickets onto the dining-room printer confuses everyone; a CI pipeline that forgets -p confuses the runner into waiting for a conversation that will never happen.

The Shipping Container — --output-format json vs --json-schema

A shipping company can put a standard container (the envelope) around anything: a standard shape, a standard address label, standard paperwork. That is --output-format json. The container's exterior is predictable; the cargo inside can be anything the shipper put in. If the receiver expected crates of oranges and got crates of scrap metal, the envelope did not help: the outside looked right, the inside was wrong. --json-schema is the customs inspection that opens the container and checks that the cargo matches the manifest before the truck leaves the port. If the cargo does not match, the shipment gets rejected. A CI pipeline that relies only on --output-format json is a receiver who only checks the outside of the container; the first day Claude ships prose instead of JSON, the downstream step expects oranges and unpacks scrap metal. Pairing --output-format json with --json-schema is the configuration that both ships and inspects.

The Factory Emergency Stop — Exit Codes as the Single Wire Back to the Runner

A factory production line is a sequence of machines each with one red "fault" wire that runs back to the main control panel. If any machine's fault wire goes hot, the whole line stops. Each machine has hundreds of internal signals — temperature, vibration, pressure — but the main panel only sees the one wire. A CI pipeline works the same way: each step is a machine, the exit code is the fault wire, and the runner is the main panel. A Claude Code step that always reports "I'm fine" on the fault wire regardless of what happened inside is a broken machine — the line keeps running with a jammed drill, and the finished products are all wrong. --output-format json with is_error gives you a second wire (a temperature sensor alongside the fault sensor), so the CI wrapper can compare and fail even when the primary fault wire forgets to fire. A robust CI integration is a factory machine that tells the control panel the whole truth, not a curated subset.

Which Analogy Fits Which Exam Question

  • Questions about -p and non-interactive mode → kitchen delivery-window analogy.
  • Questions about --output-format json vs --json-schema → shipping container analogy.
  • Questions about exit codes and pipeline propagation → factory emergency-stop analogy.

FAQ — Claude Code CI/CD Top 6 Questions

What does the -p flag actually do, and why is it required in CI?

-p (long form --print) is the Claude Code mode switch that disables the interactive TUI and runs a single non-interactive invocation. Without -p, claude launches its interactive REPL and waits for keyboard input that will never arrive in a CI runner, so the process either hangs until the runner's wall-clock timeout kills it or emits TUI escape codes into the build log. With -p, Claude executes one agentic-loop run (which may span several turns) from the prompt on the command line or stdin, emits the final result to stdout, and exits with a standard exit code. -p is the single most-tested flag on CCA-F task 3.6; memorize both spellings and the exact semantics.

Does -p disable permission checks or safety rails?

No. -p changes input/output plumbing: the TUI is skipped, input comes from the argument or stdin, and output goes to stdout. It does not disable the permission model on tools (file writes and shell commands outside the allowed list still require pre-approved permissions), it does not disable Claude's refusal to run obviously destructive commands, and it does not bypass any safety check. Permissions are controlled separately via .claude/settings.json, the --allowedTools flag, or the --permission-mode flag; .mcp.json declares which MCP servers are available rather than granting permissions. A CI job that expects -p to grant "do anything" privileges will fail the first time a tool crosses a permission boundary.

What is the difference between --output-format json and --json-schema?

--output-format json wraps every Claude Code run in a machine-readable JSON envelope — fields for result, session_id, num_turns, total_cost_usd, is_error, and similar metadata — so CI scripts can parse run metadata with jq. It does NOT constrain the format of the assistant's answer inside .result; whatever Claude generated goes in there as a string. --json-schema is a separate flag that validates the final answer against a user-supplied JSON Schema; if the answer does not conform, the run fails (or retries, depending on configuration). The two flags are complementary: --output-format json gives the envelope, --json-schema constrains the cargo. Schema-safe CI pipelines use both, or pair --json-schema with tool use in strict: true mode for belt-and-braces enforcement.

Should I enable plan mode in a CI pipeline?

Almost never. Plan mode produces a proposed plan and waits for human approval before executing; CI is by definition non-interactive and has no human to approve, so the pipeline hangs until the runner's wall-clock timeout. Plan mode is designed for interactive sessions on risky or ambiguous tasks. The rare exception is a CI stage that explicitly wants to produce a plan for asynchronous human review in a separate tool, in which case the pipeline must parse the plan out of the envelope without waiting on approval. On CCA-F, any answer that recommends enabling plan mode in CI "for safety" is almost certainly wrong — it is incompatible with non-interactive execution.

How should my CI wrapper decide whether a Claude Code run succeeded?

Check two signals. First, the process exit code: zero means success, non-zero means the CLI itself errored (authentication, API failure, permission denied, schema validation failure when --json-schema is enforced, iteration cap exceeded). Second, the is_error field inside the --output-format json envelope: true means Claude hit an internal error state even if the process exited cleanly. A robust CI wrapper fails the step if either signal indicates failure. Checking only the exit code misses soft failures where Claude surfaced a problem in the envelope but still exited cleanly; checking only the envelope misses hard CLI-level errors. Both together form the standard reliability pattern for Claude Code CI/CD pipeline integration.

How do I keep Claude Code CI costs under control?

Four levers, applied together. First, narrow the input — pipe diffs instead of full repos, pre-filter scan findings, only target changed files. Second, trigger on fewer events — PR-open and PR-ready-for-review instead of every push. Third, enforce a per-run token/cost budget by parsing total_cost_usd from the JSON envelope and failing or alerting when a run exceeds a threshold. Fourth, use tight iteration caps on the agentic loop inside Claude Code so a single run cannot spiral. Prompt caching of stable preambles (CLAUDE.md content, review checklists) reduces cost further but is platform-level rather than flag-level. The unifying principle: the cheapest token is the one you never send.
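The third lever, failing or alerting on a per-run budget, might look like the sketch below. It assumes the run's stdout is a JSON envelope containing total_cost_usd; the function name budget_verdict and the two-threshold design (warn below, fail above) are illustrative assumptions.

```python
import json


def budget_verdict(envelope_text: str, warn_usd: float, fail_usd: float) -> str:
    """Classify one run's spend as "ok", "warn", or "fail"."""
    cost = float(json.loads(envelope_text)["total_cost_usd"])
    if cost > fail_usd:
        return "fail"  # the wrapper should exit non-zero
    if cost > warn_usd:
        return "warn"  # the wrapper should alert but let the step pass
    return "ok"
```

Separating the warning threshold from the failing one lets a team notice cost creep early without blocking merges on the first slightly expensive run.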

Further Reading

Related ExamHub topics: Plan Mode vs Direct Execution, Built-in Tools Selection and Application, Iterative Refinement for Progressive Improvement, Batch Processing Strategies.

Official sources