D2 ~18% of Exam

Claude Certified Architect: Tool Design & MCP Integration

Build precise and well-differentiated tool interfaces that LLMs can reliably select, implement structured error handling that enables intelligent agent recovery, distribute tools efficiently across specialized agents, and integrate MCP servers and built-in tools into production workflows.

Task 2.1

Design Effective Tool Interfaces with Clear Descriptions and Boundaries

Why Tool Descriptions Are the Most Important Lever

When an LLM needs to decide which tool to invoke, the tool description is the primary signal it uses to make that selection. Unlike prompt instructions that nudge behavior probabilistically, tool descriptions directly determine how the model understands each tool's purpose, scope, and appropriate use cases. If two tools have vague or overlapping descriptions, the model will routinely misroute requests between them — no amount of prompt tuning will compensate for poorly differentiated tool metadata.

The Problem with Minimal Descriptions

Consider two tools with terse descriptions: get_customer — "Retrieves customer information" and lookup_order — "Gets order details." Both accept similar identifier formats (names, IDs, order numbers). When a user says "check on order #12345," the model may call get_customer instead because both descriptions are too sparse to establish clear boundaries. The model lacks the contextual information needed to reliably distinguish which tool handles which class of request.

What a High-Quality Tool Description Contains

  • Accepted input formats — specify exactly what identifiers, data types, and patterns the tool expects
  • Example queries — include 2-3 representative requests that should trigger this tool
  • Edge cases and limitations — describe what the tool does not do, preventing misuse
  • Boundary explanations — explicitly distinguish this tool from similar ones ("Use this for order lookups by order ID or tracking number, NOT for customer profile retrieval")
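These ingredients can be combined into a single definition. Below is a sketch of a well-differentiated tool definition in the Anthropic tool-use schema (name, description, input_schema); the tool name, identifier formats, and example queries are hypothetical illustrations, not part of any real API.

```python
# Hypothetical order-lookup tool with a high-quality description:
# input formats, example queries, explicit boundaries, and negative constraints.
lookup_order = {
    "name": "lookup_order",
    "description": (
        "Retrieves the status, items, and shipping details of a single order. "
        "Accepts an order ID (format: ORD-XXXXXX) or a carrier tracking number "
        "(10-30 alphanumeric characters). "
        "Example queries: 'check on order #12345', 'where is my package', "
        "'has ORD-998877 shipped yet'. "
        "Use this for order lookups ONLY, NOT for customer profile retrieval "
        "(use get_customer for that). Does not modify orders or issue refunds."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",
                "description": "Order ID (ORD-XXXXXX) or carrier tracking number",
            }
        },
        "required": ["order_id"],
    },
}
```

Note how the description names the sibling tool (get_customer) and states what this tool does not do, which is exactly the boundary information the model needs to route "order #12345" correctly.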

Ambiguous and Overlapping Descriptions Cause Misrouting

A common failure pattern arises when two tools share overlapping semantic territory. For example, analyze_content and analyze_document without clear differentiation will confuse the model. The fix is not to add more prompt instructions — it is to rename tools so their purpose is unambiguous, or to split a single generic tool into multiple purpose-specific tools with narrow, non-overlapping descriptions.

Hidden Risks: Keyword-Sensitive System Prompt Instructions

Including tool-selection hints in the system prompt (e.g., "always use analyze_content when the user mentions 'review'") can create unintended associations. The model may over-index on the keyword and trigger the tool even when the user's intent clearly points elsewhere. Tool selection should be driven by the tool definitions themselves, not fragile keyword rules injected into the prompt.

Practical Skills for This Task

  • Rename tools to eliminate semantic overlap (e.g., rename analyze_content to analyze_text_sentiment)
  • Split a generic multi-purpose tool into several purpose-specific tools with tight descriptions
  • Write descriptions that include input formats, example queries, scope boundaries, and negative constraints

Key Concept

Tool descriptions are the #1 lever for reliable tool selection. When models misroute between tools, expanding and differentiating tool descriptions is the highest-impact, lowest-cost fix — before adding few-shot examples, routing layers, or model upgrades.

Exam Trap

Wrong approach: Making tool descriptions longer without actually differentiating between similar tools. Simply adding more words to a description doesn't help if both tools still sound interchangeable. Length is not the goal — clear semantic boundaries between tools are what matters.

Practice Scenario

Your production agent keeps calling get_customer instead of lookup_order when users ask about their orders. Both tools have one-sentence descriptions and accept similar identifier formats. What is the most effective first step to fix this?

  • A Add 5-8 few-shot examples in the system prompt showing order queries routed to lookup_order
  • B Expand each tool's description to include accepted input formats, example queries, edge cases, and differentiation notes
  • C Implement a routing classifier that parses user input with keywords before each turn
  • D Merge both into a single lookup_entity tool that internally routes to the correct backend

Correct answer: B. Tool descriptions are the primary mechanism LLMs use for tool selection. When descriptions are too sparse, the model lacks the context to distinguish between similar tools. Expanding descriptions directly addresses the root cause with minimal effort and maximum impact. Few-shot examples (A) add token overhead without fixing the underlying problem; a routing classifier (C) is over-engineering that bypasses the model's native capabilities; merging tools (D) is too heavy-handed for a first step.

Task 2.2

Implement Structured Error Responses for MCP Tools

The MCP isError Pattern

The Model Context Protocol defines a standard way for tools to communicate failure back to the calling agent: the isError flag. When a tool execution fails, it should return a response with isError: true along with structured metadata that enables the agent to understand what went wrong, why it failed, and whether retrying could succeed. This is fundamentally different from simply returning a text message like "Operation failed."

Error Categories You Must Distinguish

Not all errors are the same, and treating them uniformly prevents intelligent recovery. A well-designed MCP tool classifies errors into distinct categories:

  • Transient errors — timeouts, rate limits, temporary network failures. These are retryable and may succeed on a subsequent attempt.
  • Validation errors — malformed input, missing required fields, type mismatches. The agent needs to fix the input before retrying.
  • Business rule violations — policy limits exceeded, unauthorized operations, logical constraints violated. Retrying with the same parameters will always fail.
  • Permission errors — insufficient access rights, expired tokens, forbidden resources. Requires credential refresh or escalation, not retry.

Why Generic Errors Are Dangerous

When every failure returns a uniform "Operation failed" message, the agent cannot make informed recovery decisions. Should it retry? Should it reformulate the request? Should it escalate to a human? Without error categorization, the agent either retries blindly (wasting resources on non-retryable failures) or gives up immediately (missing easy recoveries from transient issues).

Structured Error Metadata

An effective error response includes multiple pieces of actionable information:

  • isError: true — the MCP standard flag signaling a failure occurred
  • errorCategory — one of "transient", "validation", "business", "permission"
  • isRetryable: boolean — explicitly states whether the agent should attempt again
  • A human-readable description that explains the failure in enough detail for the agent to adjust
  • Contextual metadata (which field failed validation, what policy was violated, how long to wait before retry)
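Put together, a failing tool call might return something like the following sketch. The isError flag and the content array follow the MCP tool-result shape; the metadata field names (errorCategory, isRetryable, retryAfterSeconds) are the conventions described above carried in the result payload, not fields mandated by the protocol.

```python
import json

# Structured payload the agent can act on: category, retry guidance,
# human-readable message, and contextual metadata (a backoff hint).
error_payload = {
    "errorCategory": "transient",   # one of: transient, validation, business, permission
    "isRetryable": True,            # explicit retry guidance for the agent
    "message": "Upstream inventory service timed out after 5s",
    "retryAfterSeconds": 10,        # contextual metadata: how long to wait
}

# MCP tool-result shape: isError plus a content array of text blocks.
tool_result = {
    "isError": True,
    "content": [{"type": "text", "text": json.dumps(error_payload)}],
}
```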

Error Recovery in Multi-Agent Systems

In hub-and-spoke architectures, error handling follows a locality principle: subagents should attempt local recovery for transient failures (retries, alternative queries) and only propagate errors to the coordinator when they cannot be resolved locally. When propagating, the subagent should include partial results alongside the error context, so the coordinator can decide whether to retry with a different strategy, use a different subagent, or proceed with what's available.
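As an illustration, the coordinator's recovery decision over that propagated error context might be sketched as follows; the function, field names, and action labels are hypothetical.

```python
# Hypothetical coordinator-side recovery dispatch over propagated error context.
def decide_recovery(error_ctx: dict) -> str:
    """Map a subagent's propagated error to a coordinator action."""
    category = error_ctx["errorCategory"]
    if category == "transient" and error_ctx.get("isRetryable"):
        return "retry_with_backoff"
    if category == "validation":
        return "reformulate_request"   # fix the input, then retry
    if category == "permission":
        return "escalate_to_human"     # needs credentials, not retries
    if error_ctx.get("partialResults"):
        return "proceed_with_partial"  # salvage what the subagent did recover
    return "report_failure"
```

For example, a transient retryable timeout maps to "retry_with_backoff", while a business rule violation that arrived with partial results maps to "proceed_with_partial".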

Key Concept

Structured error responses enable intelligent agent recovery decisions. The combination of isError, errorCategory, and isRetryable gives the agent a decision framework: retry transient failures, fix validation issues, escalate permission problems, and report business rule violations.

Exam Trap

Wrong approach: Returning a generic "Operation failed" message for all error types. This prevents the agent from distinguishing between a temporary timeout (retry it) and a permanent permission denial (escalate it). Always categorize errors with structured metadata.

Exam Trap

Wrong approach: Silently returning empty results when an access failure occurs. This disguises a failure as a legitimate "no data found" response. The agent will proceed as if the query returned nothing, when in reality the data was inaccessible. Always distinguish access failures (isError: true) from genuine empty results (isError: false).

Practice Scenario

A web search subagent times out while researching a complex topic. You need to design how failure information propagates back to the coordinator. Which error propagation approach best supports intelligent recovery?

  • A Return structured error context to the coordinator including failure type, attempted queries, partial results, and suggested alternatives
  • B Implement automatic retries with exponential backoff inside the subagent, returning "search unavailable" only after all retries are exhausted
  • C Catch the timeout and return an empty result set marked as successful
  • D Let the timeout exception propagate directly to the top-level handler, terminating the entire workflow

Correct answer: A. Structured error context gives the coordinator everything it needs for intelligent recovery — whether to retry with modified queries, try an alternative source, or proceed with partial results. Option B hides useful context behind a generic status; option C disguises failure as success, preventing any recovery; option D unnecessarily kills the entire workflow when a recovery strategy might succeed.

Task 2.3

Distribute Tools Across Agents and Configure Tool Choice

Why Too Many Tools Degrade Selection Reliability

When a single agent has access to a large number of tools (18 or more), its ability to reliably pick the right tool for each request drops significantly. The model must evaluate every tool description against the current request, and as the list grows, semantic overlap increases and selection confidence decreases. Studies of agentic systems consistently show that agents with focused tool sets of 4-5 tools dramatically outperform those with sprawling toolkits.

Agents Misuse Tools Outside Their Specialization

Giving an agent tools that fall outside its core role creates a temptation for misuse. A customer service agent equipped with database administration tools may attempt direct data modifications when it should be routing through safe, business-logic-aware endpoints. The principle of scoped tool access dictates that each agent should only have access to the tools that are necessary and appropriate for its designated role.

The Distribution Strategy

Rather than loading every tool onto one monolithic agent, distribute tools across a team of specialized subagents:

  • A coordinator agent with 4-5 high-level tools (routing, delegation, aggregation)
  • Specialized subagents, each with 4-5 tools focused on a narrow domain (e.g., order management, customer lookup, inventory queries)
  • Context passed to each subagent should be minimal and task-specific — not the coordinator's full conversation history
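A minimal sketch of such a distribution, with a configuration-time check of the per-agent tool budget (agent names and tool names are illustrative, not a real API):

```python
# Hypothetical hub-and-spoke tool distribution: coordinator plus
# narrowly-scoped subagents, each holding at most 4-5 tools.
AGENT_TOOLS = {
    "coordinator":    ["route_request", "delegate_task", "aggregate_results"],
    "order_agent":    ["lookup_order", "list_orders", "track_shipment", "cancel_order"],
    "customer_agent": ["get_customer", "update_contact_info", "list_customer_notes"],
}

def validate_tool_budget(agent_tools: dict, max_tools: int = 5) -> bool:
    """Enforce the 4-5 tools-per-agent budget before deployment."""
    return all(len(tools) <= max_tools for tools in agent_tools.values())
```

Running this check in CI keeps tool sprawl from creeping back in as new tools are added.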

Understanding tool_choice Configuration

The tool_choice parameter controls how the model decides whether and which tool to call:

  • "auto" — the model decides on its own whether to call a tool or respond with text. It may choose not to use any tool if it determines the request can be answered directly.
  • "any" — the model must call a tool; it cannot respond with plain text. Use this when you need guaranteed tool invocation but want the model to choose which tool.
  • Forced / specific tool — the model must call one particular tool. Use this for workflows where the next step is deterministic (e.g., always validate output through a specific schema tool).
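For reference, these three modes correspond to the following tool_choice values in an Anthropic Messages API request body (the forced tool name here is a hypothetical example):

```python
# The three tool_choice shapes as they appear in a Messages API request.
tool_choice_auto   = {"type": "auto"}                          # model may answer in plain text
tool_choice_any    = {"type": "any"}                           # model must call SOME tool
tool_choice_forced = {"type": "tool", "name": "lookup_order"}  # model must call this exact tool

# These are passed as the tool_choice parameter alongside the tools list,
# e.g. client.messages.create(..., tools=[...], tool_choice=tool_choice_any)
```
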

Key Concept

Keep 4-5 tools per agent for optimal selection reliability. When your system needs more tools, distribute them across specialized subagents rather than overloading a single agent. The problem is structural, so the fix is architectural; description tuning alone cannot solve it.

Exam Trap

Wrong approach: Assigning 18+ tools to a single agent and attempting to fix selection errors by writing longer or more detailed descriptions. Tool description quality matters, but it cannot overcome the fundamental problem of cognitive overload from too many options. The correct fix is to split tools across multiple focused agents.

Practice Scenario

Your developer productivity agent has 18 tools and frequently selects the wrong one. You've already ensured all tool descriptions are detailed and differentiated. What is the most effective next step?

  • A Upgrade to a larger model with better tool selection capabilities
  • B Add few-shot examples demonstrating correct tool selection for common queries
  • C Redistribute tools across specialized subagents, keeping 4-5 tools per agent
  • D Implement a keyword-based routing classifier that pre-selects tools before each turn

Correct answer: C. When descriptions are already well-differentiated, the remaining problem is structural — too many tools on one agent. Distributing tools across specialized subagents with 4-5 tools each directly addresses the selection reliability issue. A larger model (A) doesn't solve the fundamental problem; few-shot examples (B) add token cost without addressing tool count; a keyword classifier (D) bypasses the model's reasoning and introduces a fragile routing layer.

Task 2.4

Integrate MCP Servers into Claude Code and Agent Workflows

MCP Server Scoping: Project-Level vs User-Level

MCP servers can be configured at two distinct levels, and choosing the right scope matters for team collaboration and security:

  • Project-level (.mcp.json) — lives in the project repository, committed to version control. Every developer who clones the repo gets the same MCP server configuration. Ideal for team-shared integrations like database connectors, CI systems, or project-specific APIs.
  • User-level (~/.claude.json) — lives in the developer's home directory. Personal MCP servers that apply across all projects (personal productivity tools, individual API keys for development sandboxes).

Environment Variable Expansion for Credential Management

MCP configuration files support environment variable expansion using the ${VARIABLE_NAME} syntax. This is the correct way to reference secrets and API keys in configuration files that will be committed to version control:

{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_TOKEN": "${GITHUB_TOKEN}"
      }
    }
  }
}

The actual token value is stored in the developer's environment (e.g., via .env files, shell profiles, or secret managers) — never in the configuration file itself.

MCP Resources: Reducing Exploratory Tool Calls

MCP servers can expose resources — read-only content catalogs that the agent can browse without making tool calls. Instead of the agent making repeated exploratory calls to discover what's available ("list all tables," "show all endpoints"), resources provide this information upfront as structured content. This reduces round-trips, lowers latency, and keeps the agent's context focused on its actual task.
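As an illustration, the catalog a server returns from a resources/list request might look like the following sketch; the field names follow the MCP resource shape, while the URIs, names, and entries are hypothetical.

```python
# Hypothetical resources/list result: the agent can browse this catalog
# upfront instead of making exploratory tool calls like "list all tables".
resources_list = {
    "resources": [
        {
            "uri": "db://schema/tables",
            "name": "Database tables",
            "description": "All tables with their columns and types",
            "mimeType": "application/json",
        },
        {
            "uri": "api://endpoints",
            "name": "API endpoint catalog",
            "mimeType": "text/markdown",
        },
    ]
}
```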

Community Servers vs Custom Implementations

For standard integrations (GitHub, Slack, databases, file systems), prefer established community MCP servers over building custom implementations. Community servers are battle-tested, maintained by the ecosystem, and follow MCP best practices. Reserve custom server development for proprietary APIs and unique business logic that no community server addresses.

Key Concept

Always use ${ENV_VAR} expansion in .mcp.json for credentials. Never hardcode secrets in configuration files that are committed to version control. Project-level config (.mcp.json) is for shared team integrations; user-level config (~/.claude.json) is for personal tools.

Exam Trap

Wrong approach: Hardcoding API keys directly in .mcp.json. Since this file is committed to version control, secrets will be exposed to anyone with repository access. Always use ${ENV_VAR} expansion and store actual values in environment variables or a secret manager.

Practice Scenario

Your team needs to integrate a GitHub MCP server into a shared project repository. The server requires a personal access token for authentication. Which configuration approach is correct?

  • A Add the token directly in .mcp.json with a comment reminding developers to rotate it quarterly
  • B Configure the server in each developer's ~/.claude.json with their individual tokens
  • C Use ${GITHUB_TOKEN} in .mcp.json and have each developer set the environment variable locally
  • D Create a separate .mcp.secrets.json file and add it to .gitignore

Correct answer: C. The ${ENV_VAR} pattern in .mcp.json provides the best of both worlds: the server configuration is shared through version control, while actual secrets stay in each developer's environment. Option A exposes secrets in the repo; option B loses the benefit of shared configuration; option D introduces a non-standard pattern that MCP doesn't natively support.

Task 2.5

Select and Apply Built-in Tools Effectively

The Built-in Tool Suite

Claude Code and the Agent SDK provide a set of built-in tools that cover fundamental file and code operations. Knowing exactly when to use each tool — and when not to — is essential for building efficient, reliable agent workflows.

Grep: Content Search by Pattern

Use Grep when you need to search for content inside files. This is the right tool for finding function names, error messages, configuration values, import statements, API endpoints, or any text pattern within the codebase. Grep searches through file contents using patterns and regular expressions.

Glob: File Path Pattern Matching

Use Glob when you need to find files by their name, extension, or directory structure — without looking inside them. Glob matches file paths against patterns like *.config.js, src/components/**/*.tsx, or **/test_*.py. It answers "which files exist that match this pattern?" rather than "which files contain this text?"

Read and Write: Full File Operations

Read loads the full content of a file. Use it when you need to examine an entire file's contents, understand its structure, or review its code. Write creates a new file or completely overwrites an existing one. Be cautious with Write on existing files — anything not included in the new content will be permanently lost.

Edit: Targeted Modifications

Edit makes targeted changes to specific sections of a file using unique text matching. It identifies the exact location to modify by matching a unique string, then replaces it with the new content. The key requirement is that the text to be replaced must be unique within the file. When Edit fails because the match text isn't unique, the fallback strategy is to use Read to get the full file, then Write the complete modified version.
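The unique-match rule and its fallback can be sketched as simple string logic (a conceptual illustration of the behavior, not Claude Code's actual implementation):

```python
# Conceptual model of Edit: the old text must occur exactly once;
# otherwise the caller should fall back to Read + Write of the whole file.
def apply_edit(file_text: str, old: str, new: str) -> str:
    occurrences = file_text.count(old)
    if occurrences == 0:
        raise ValueError("match text not found in file")
    if occurrences > 1:
        raise ValueError(
            "match text is not unique; Read the full file and "
            "Write the complete modified version instead"
        )
    return file_text.replace(old, new, 1)
```

For example, apply_edit("a = 1\nb = 2\n", "b = 2", "b = 3") succeeds because "b = 2" appears exactly once, while an edit targeting a repeated line raises and forces the Read-then-Write fallback.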

Building Codebase Understanding Incrementally

Effective agents don't try to read an entire codebase at once. Instead, they build understanding incrementally using a disciplined tool chain:

  • Start with Grep to find entry points (main functions, route definitions, exported symbols)
  • Use Read to examine the entry point files and follow import chains
  • Use Glob to discover related files by naming conventions or directory patterns
  • Use Grep again to trace specific function calls or references across the codebase

Key Concept

Know exactly when to use each built-in tool. Grep = content search inside files. Glob = file path pattern matching. Read = load file contents. Write = create or overwrite files. Edit = targeted modifications using unique text matching. Bash = system commands and processes (never use it when a dedicated tool exists for the task).

Exam Trap

Wrong approach: Using Bash('cat config.json') to read a configuration file. When a dedicated Read tool exists, always prefer it over shell commands. Reserve Bash for operations that genuinely require shell execution (running tests, installing packages, git operations) — never for file reading, writing, or searching when built-in tools exist.

Practice Scenario

You need to read the contents of a project's config.yaml file to understand the application's database settings. Which tool should you use?

  • A Read — the dedicated tool for loading file contents
  • B Bash('cat config.yaml') — shell command to display file contents
  • C Grep — search for "database" pattern inside the file
  • D Glob('config.yaml') — locate the config file first

Correct answer: A. The Read tool is purpose-built for loading file contents. Using Bash('cat ...') (B) works but is an anti-pattern when a dedicated tool exists. Grep (C) searches for patterns within files; it does not read the whole file. Glob (D) finds files by path pattern — useful if you don't know the exact location, but the question specifies the path is known.
