LangChain AI Agents Tools FAQ & Answers
25 expert LangChain AI Agents Tools answers researched from official documentation. Every answer cites authoritative sources you can verify.
Tool errors inform the LLM so it can adjust strategy. Pattern: from langchain_core.tools import ToolException; @tool def database_query(sql: str) -> str: '''Query database''' try: return db.execute(sql); except PermissionError: raise ToolException('Access denied. Try query without sensitive tables.'); except TimeoutError: raise ToolException('Query timeout. Use simpler query with LIMIT 100.'); except Exception as e: raise ToolException(f'Query failed: {str(e)[:100]}. Check SQL syntax.'). The ToolException message is sent to the LLM as an observation - the LLM can retry with corrected input. Generic exceptions stop agent execution. Use ToolException for: user errors (invalid input, permissions), transient errors (timeout, rate limit), recoverable failures (file not found, API error). Error message best practices: explain WHAT failed (not why), suggest HOW to fix ('Use LIMIT 100'), include relevant context (status code, error type), keep concise (<100 chars). Enable error handling on the tool itself: handle_tool_error=True (default error message) or a custom function. Error recovery improves success rate 25-40%.
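For illustration, a minimal runnable sketch of this pattern, assuming a local sqlite3 connection named db; the file name, error texts, and exception mapping are examples, not fixed strings:

```python
import sqlite3

from langchain_core.tools import ToolException, tool

db = sqlite3.connect("app.db")  # hypothetical local database


@tool
def database_query(sql: str) -> str:
    """Run a read-only SQL query and return the rows as text."""
    try:
        return str(db.execute(sql).fetchall())
    except sqlite3.OperationalError as e:
        # Sent back to the LLM as an observation so it can correct the SQL.
        raise ToolException(f"Query failed: {str(e)[:100]}. Check SQL syntax.")
    except sqlite3.DatabaseError as e:
        raise ToolException(f"Database error: {str(e)[:100]}. Use a simpler query with LIMIT 100.")
```

Keep in mind that a raised ToolException only becomes an observation when the tool is configured to handle errors (for example handle_tool_error=True when building the tool via StructuredTool.from_function); otherwise it propagates and stops the run.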
Tool validation prevents invalid inputs from reaching tool logic. Use Pydantic for schema validation: from pydantic import BaseModel, Field, validator; class EmailInput(BaseModel): to: str = Field(pattern=r'^[\w.-]+@[\w.-]+\.\w+$'); subject: str = Field(min_length=1, max_length=200); body: str; @validator('to') def validate_email(cls, v): if not v.endswith('@company.com'): raise ValueError('Only company emails allowed'); return v; @tool(args_schema=EmailInput) def send_email(to: str, subject: str, body: str) -> str: '''Send email'''. Pydantic validators: pattern (regex), min_length/max_length (strings), ge/le (numbers), validator decorator (custom logic). Validation errors raise ValidationError before tool execution - the agent receives the error and retries. For runtime validation: def send_email(...): if not check_rate_limit(): raise ToolException('Rate limit exceeded - wait 60s'). Use ToolException for recoverable errors (agent can retry/adapt), raise generic Exception for fatal errors (agent stops). Validate: input format (email, URL, JSON), business rules (permissions, rate limits), data ranges (dates, amounts). Validation reduces tool failures 40-60%.
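A runnable sketch of this validation pattern, written for pydantic v2 (field_validator); check_rate_limit and the @company.com rule are placeholders:

```python
from langchain_core.tools import ToolException, tool
from pydantic import BaseModel, Field, field_validator


def check_rate_limit() -> bool:
    return True  # placeholder for a real rate-limit lookup


class EmailInput(BaseModel):
    to: str = Field(pattern=r"^[\w.-]+@[\w.-]+\.\w+$", description="Recipient address")
    subject: str = Field(min_length=1, max_length=200)
    body: str

    @field_validator("to")
    @classmethod
    def company_only(cls, v: str) -> str:
        if not v.endswith("@company.com"):  # example business rule
            raise ValueError("Only company emails allowed")
        return v


@tool(args_schema=EmailInput)
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email to a company address."""
    if not check_rate_limit():  # runtime validation inside the tool body
        raise ToolException("Rate limit exceeded - wait 60s")
    return f"Sent '{subject}' to {to}"
```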
Parallel tool execution requires async tools + LangGraph or manual orchestration. Pattern with LangGraph: from langgraph.prebuilt import ToolNode; tool_node = ToolNode(tools); graph.add_node('tools', tool_node). When the LLM returns multiple tool calls in a single message, ToolNode executes the independent calls concurrently. Manual approach with asyncio: import asyncio; async def parallel_tools(tool_calls): tasks = [tools_by_name[call['name']].ainvoke(call['args']) for call in tool_calls]; results = await asyncio.gather(*tasks, return_exceptions=True); return results. AgentExecutor does NOT support parallel execution (sequential only). For parallel execution: 1) All tools must be async (@tool with async def). 2) LLM must support multi-tool calling (OpenAI parallel function calling, Anthropic tool use). 3) Tools must be independent (no shared state). Benefits: 2-5x faster for I/O-bound tools (API calls, database queries), no benefit for CPU-bound tools (GIL). Use max_concurrency to limit: config={'max_concurrency': 5}. Production: LangGraph with ToolNode is the recommended approach for parallel execution.
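A minimal sketch of the manual asyncio approach, assuming all tools are async and tool_calls is the list of tool-call dicts from the model's AIMessage; tools_by_name is a helper dict built from your tool list:

```python
import asyncio


async def run_tool_calls_in_parallel(tool_calls: list[dict], tools_by_name: dict) -> list:
    """Execute independent tool calls from one LLM message concurrently."""
    tasks = [
        tools_by_name[call["name"]].ainvoke(call["args"])  # ToolCall dicts carry name/args/id
        for call in tool_calls
    ]
    # return_exceptions=True keeps one failing tool from cancelling the others
    return await asyncio.gather(*tasks, return_exceptions=True)


# usage sketch: ai_message.tool_calls comes from an LLM that supports multi-tool calling
# results = asyncio.run(run_tool_calls_in_parallel(ai_message.tool_calls, {t.name: t for t in tools}))
```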
LangChain v0.2+ offers two agent patterns: AgentExecutor (legacy, simpler) vs LCEL/LangGraph (modern, flexible). AgentExecutor: Pre-built loop, agent = create_tool_calling_agent(llm, tools, prompt); executor = AgentExecutor(agent=agent, tools=tools). Pros: simple setup, batteries-included. Cons: limited customization, no persistence, basic error handling. LCEL pattern: agent = prompt | llm.bind_tools(tools) | parser; result = agent.invoke(input). Manually implement loop for tool calling. LangGraph pattern (recommended for production): StateGraph with nodes (agent, tools, human_review), edges (conditional routing), checkpoints (persistence). Pros: full control over execution, human-in-loop, state persistence, parallel tool execution, complex workflows. Cons: more code, steeper learning curve. Migration: LangChain v0.2 docs recommend LangGraph for new projects - AgentExecutor maintained for backward compatibility. Use AgentExecutor for: quick prototypes, simple tasks. Use LangGraph for: production apps, multi-step workflows, human oversight, debugging/replay.
Custom agents provide full control over tool selection and execution loop. Pattern: from langchain.agents import BaseSingleActionAgent; class CustomAgent(BaseSingleActionAgent): @property def input_keys(self): return ['input']; def plan(self, intermediate_steps, **kwargs): '''Decide next action''' if not intermediate_steps: return AgentAction(tool='search', tool_input='query', log='Starting'); last_tool, last_output = intermediate_steps[-1]; if 'error' in last_output: return AgentAction(tool='fallback', tool_input=last_output, log='Retry'); return AgentFinish(return_values={'output': last_output}, log='Done'); async def aplan(self, intermediate_steps, **kwargs): '''Async version'''. Override plan() to implement custom logic: tool selection based on state, conditional workflows, retry logic, tool chaining rules. Return AgentAction (continue with tool) or AgentFinish (stop with answer). Use with: AgentExecutor(agent=CustomAgent(), tools=tools). Custom agents enable: non-LLM agents (rule-based), hybrid approaches (LLM + heuristics), deterministic workflows. More control than LCEL but verbose. Modern alternative: LangGraph for complex workflows.
BaseTool provides maximum control for complex tools. Pattern: from langchain_core.tools import BaseTool; from pydantic import BaseModel, Field; class SearchInput(BaseModel): query: str; class SearchTool(BaseTool): name: str = 'search_db'; description: str = 'Search database'; args_schema: type[BaseModel] = SearchInput; return_direct: bool = False; def _run(self, query: str) -> str: '''Sync implementation''' return self.db.search(query); async def _arun(self, query: str) -> str: '''Async implementation''' return await self.db.async_search(query). Must override _run() for sync, _arun() for async. Use when: tool needs state (self.db), complex initialization, custom validation beyond Pydantic, integration with existing classes. Add custom initialization: def __init__(self, db_connection, **kwargs): super().__init__(**kwargs); self.db = db_connection. The args_schema defines input validation. Set return_direct=True to skip LLM processing of output. BaseTool is verbose but necessary for stateful tools or inheritance patterns.
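A self-contained sketch of a stateful BaseTool subclass; here the injected state is declared as a pydantic field (db) and a plain dict stands in for a real database client:

```python
from typing import Any

from langchain_core.tools import BaseTool
from pydantic import BaseModel, Field


class SearchInput(BaseModel):
    query: str = Field(description="Term to look up")


class SearchTool(BaseTool):
    name: str = "search_db"
    description: str = "Search the product database by keyword."
    args_schema: type[BaseModel] = SearchInput
    return_direct: bool = False
    db: Any = None  # injected state, e.g. a database client

    def _run(self, query: str) -> str:
        """Sync path used by invoke()."""
        return str(self.db.get(query, "no match"))

    async def _arun(self, query: str) -> str:
        """Async path used by ainvoke(); here it just reuses the sync logic."""
        return self._run(query)


search_tool = SearchTool(db={"widget": "in stock"})
print(search_tool.invoke({"query": "widget"}))  # -> "in stock"
```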
Agent security requires multiple layers: input sanitization, tool restrictions, execution sandboxing. Input validation: from pydantic import BaseModel, Field, validator; class SafeInput(BaseModel): query: str = Field(max_length=500); @validator('query') def no_sql_injection(cls, v): if any(kw in v.lower() for kw in ['drop', 'delete', 'truncate']): raise ValueError('Unsafe SQL keyword'); return v. Tool restrictions: pass only an approved subset of tools to the agent and executor: approved_tools = [search_tool, calculator_tool]; agent_executor = AgentExecutor(agent=agent, tools=approved_tools). Tool allowlisting is safer than blocklisting. Per-user permissions: class PermissionedTool(BaseTool): def _run(self, user_id: str, ...): if not has_permission(user_id, self.name): raise ToolException('Permission denied'). Sandboxing: use RestrictedPython for code execution tools, Docker containers for file system isolation, read-only database connections, network egress controls. Monitor: log all tool calls, alert on sensitive operations, rate limit per user. Production checklist: validate all inputs, allowlist tools, sandbox code execution, read-only by default, audit logging, rate limiting.
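A minimal sketch of layered input checks; the keyword denylist, user-id convention, and has_permission helper are illustrative only:

```python
from langchain_core.tools import ToolException, tool
from pydantic import BaseModel, Field, field_validator

BLOCKED_KEYWORDS = {"drop", "delete", "truncate", "alter"}  # illustrative denylist


class SafeQueryInput(BaseModel):
    query: str = Field(max_length=500)
    user_id: str

    @field_validator("query")
    @classmethod
    def no_destructive_sql(cls, v: str) -> str:
        if any(kw in v.lower() for kw in BLOCKED_KEYWORDS):
            raise ValueError("Unsafe SQL keyword")
        return v


def has_permission(user_id: str, tool_name: str) -> bool:
    return user_id.startswith("analyst_")  # placeholder permission check


@tool(args_schema=SafeQueryInput)
def read_only_query(query: str, user_id: str) -> str:
    """Run a read-only report query on behalf of a user."""
    if not has_permission(user_id, "read_only_query"):
        raise ToolException("Permission denied for this user.")
    return f"(read-only) would execute: {query}"
```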
Cost tracking monitors LLM spending during agent execution. Built-in tracking: from langchain.callbacks import get_openai_callback; with get_openai_callback() as cb: result = agent_executor.invoke(input); print(f'Tokens: {cb.total_tokens}, Cost: ${cb.total_cost}'). Tracks: prompt_tokens, completion_tokens, total_tokens, total_cost (USD). Supports: OpenAI, Anthropic (via specific callbacks). Custom tracking: class CostCallback(BaseCallbackHandler): total_cost = 0; def on_llm_end(self, response, **kwargs): tokens = response.llm_output['token_usage']['total_tokens']; self.total_cost += tokens * 0.000002; # $0.002 per 1K tokens. Token limits per call: llm = ChatOpenAI(max_tokens=500, temperature=0). Agent-level budget: track in callback, raise exception when exceeded. Production pattern: cb = CostCallback(); agent_executor.invoke(input, callbacks=[cb]); if cb.total_cost > budget: alert('Budget exceeded'). Monitor: cost per query, cost per user, cost per day. Optimization: shorter prompts (20% savings), caching (50% savings), smaller models for simple tools (70% savings). LangSmith provides built-in cost tracking dashboard.
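A short sketch of the built-in callback in use; it assumes an agent_executor already exists, and the budget value is arbitrary:

```python
from langchain.callbacks import get_openai_callback

BUDGET_USD = 0.50  # arbitrary per-request budget

with get_openai_callback() as cb:
    result = agent_executor.invoke({"input": "Summarize today's sales"})

print(f"prompt={cb.prompt_tokens} completion={cb.completion_tokens} cost=${cb.total_cost:.4f}")
if cb.total_cost > BUDGET_USD:
    print("Budget exceeded - alert or throttle this user")
```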
Callbacks hook into agent execution for logging, monitoring, tracing. Pattern: from langchain.callbacks.base import BaseCallbackHandler; class AgentLogger(BaseCallbackHandler): def on_llm_start(self, serialized, prompts, **kwargs): logger.info(f'LLM called with {len(prompts)} prompts'); def on_tool_start(self, serialized, input_str, **kwargs): logger.info(f'Tool {serialized["name"]} called'); def on_tool_end(self, output, **kwargs): logger.info(f'Tool returned: {output[:100]}'); def on_agent_action(self, action, **kwargs): logger.info(f'Action: {action.tool}, Input: {action.tool_input}'); def on_agent_finish(self, finish, **kwargs): logger.info(f'Agent finished: {finish.return_values}'). Use: AgentExecutor(callbacks=[AgentLogger()]). Built-in callbacks: StdOutCallbackHandler (console output), LangChainTracer (LangSmith tracing), WandbCallbackHandler (Weights & Biases). For cost tracking: use get_openai_callback context manager. Async callbacks: inherit AsyncCallbackHandler, implement async def methods. Callbacks enable: cost tracking, performance monitoring, error alerting, audit logs.
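A sketch of a logging callback handler; logger configuration and message formats are up to you:

```python
import logging

from langchain_core.callbacks import BaseCallbackHandler

logger = logging.getLogger("agent")


class AgentLogger(BaseCallbackHandler):
    """Log tool usage and agent decisions during a run."""

    def on_tool_start(self, serialized, input_str, **kwargs):
        logger.info("tool %s called with %s", serialized.get("name"), input_str)

    def on_tool_end(self, output, **kwargs):
        logger.info("tool returned: %.100s", output)

    def on_agent_action(self, action, **kwargs):
        logger.info("action: %s input=%s", action.tool, action.tool_input)

    def on_agent_finish(self, finish, **kwargs):
        logger.info("finished: %s", finish.return_values)


# attach per-call instead of at construction if you prefer:
# agent_executor.invoke({"input": "..."}, config={"callbacks": [AgentLogger()]})
```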
Conversational agents remember chat history across turns. Pattern: from langchain.agents import create_tool_calling_agent, AgentExecutor; from langchain.memory import ConversationBufferMemory; memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True); agent = create_tool_calling_agent(llm, tools, prompt); agent_executor = AgentExecutor(agent=agent, tools=tools, memory=memory, verbose=True). Prompt must include {chat_history} variable. Memory types: ConversationBufferMemory (stores all messages, grows unbounded), ConversationBufferWindowMemory (keeps last K messages), ConversationSummaryMemory (LLM-generated summary, token-efficient), ConversationSummaryBufferMemory (hybrid: recent messages + summary). CRITICAL: Set return_messages=True for chat models (returns Message objects), False for LLMs (returns strings). Access history: memory.load_memory_variables({})['chat_history']. Clear: memory.clear(). Use conversational agent when: multi-turn chat, context from previous questions needed, follow-up queries. Limitation: memory adds context to every LLM call (increases cost/latency). Production: use ConversationSummaryBufferMemory with max_token_limit=2000.
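An end-to-end sketch of the conversational wiring; the model name, system prompt, and return-policy tool are placeholders:

```python
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def get_return_policy(product_type: str) -> str:
    """Look up the return policy for a product type."""
    return "30-day returns on all items except clearance."  # placeholder data


tools = [get_return_policy]
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model name is illustrative

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful support assistant."),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),
])

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, memory=memory, verbose=True)

agent_executor.invoke({"input": "What is the return policy for shoes?"})
agent_executor.invoke({"input": "Does that apply to sale items too?"})  # follow-up uses history
```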
max_iterations controls maximum agent loops before forced termination. Each iteration = tool invocation + LLM processing. Default is 15 iterations. Pattern: AgentExecutor(agent=agent, tools=tools, max_iterations=10, early_stopping_method='generate'). Behavior: iteration count includes initial LLM call + each tool use. After max_iterations, early_stopping_method determines response: 'force' returns 'Agent stopped due to iteration limit or time limit.' (fails fast), 'generate' prompts LLM one final time to synthesize answer from partial results (best effort). Use max_iterations=5-10 for simple tasks (web search + answer), 15-25 for complex reasoning (multi-step calculations), infinite (None) only for human-supervised agents. Combine with max_execution_time=60 for hard timeout in seconds. Monitor with callbacks: on_agent_action fired on each iteration. Common issue: too low max_iterations causes premature stopping, too high causes expensive runaway loops. Production: set both max_iterations and max_execution_time.
AgentExecutor wraps the agent and tools in an execution loop. Basic pattern: from langchain.agents import AgentExecutor, create_tool_calling_agent; agent = create_tool_calling_agent(llm, tools, prompt); agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, max_iterations=15, max_execution_time=300, handle_parsing_errors=True, return_intermediate_steps=False). Key parameters: max_iterations (default 15) caps loops to prevent infinite runs, max_execution_time in seconds for hard timeout, handle_parsing_errors=True retries on LLM output parse failures, return_intermediate_steps=True includes agent trajectory in output (for debugging). Early stopping: early_stopping_method='force' returns error message, 'generate' prompts LLM for final response. Use callbacks=[CustomCallback()] for logging/monitoring. NOTE: LangGraph is now preferred over AgentExecutor for production (see the LangChain v0.2 migration guide) - provides better control, persistence, human-in-loop.
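A sketch of a fully configured executor; agent and tools are assumed to have been created as in the conversational example above, and the limit values are illustrative:

```python
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(
    agent=agent,                      # from create_tool_calling_agent(...)
    tools=tools,
    verbose=True,                     # print each step to stdout
    max_iterations=10,                # cap the reason/act loop
    max_execution_time=60,            # hard wall-clock timeout in seconds
    early_stopping_method="force",    # or "generate" for a best-effort final answer
    handle_parsing_errors=True,       # feed parse failures back to the LLM
    return_intermediate_steps=False,  # flip to True only while debugging
)

result = agent_executor.invoke({"input": "What was Q3 revenue?"})
print(result["output"])
```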
The @tool decorator is the simplest way to create tools in LangChain. Basic pattern: from langchain_core.tools import tool; @tool def search_database(query: str) -> str: '''Search database for query. Args: query: The search term. Returns: Search results.''' return db.search(query). CRITICAL: Docstring is REQUIRED - it becomes the tool description for the LLM. The function name becomes tool name by default, override with @tool('custom_name'). Type hints are mandatory for function arguments. The decorator automatically creates a structured tool with proper schema. Return type must be JSON-serializable (str, dict, list, int, float, bool). Use from langchain_core.tools import tool, NOT langchain.tools (deprecated in v0.2). For complex validation, use Pydantic BaseModel inputs. Always test tool.invoke({'query': 'test'}) before agent use.
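A minimal runnable sketch; the in-memory FAKE_DB dict stands in for a real backend:

```python
from langchain_core.tools import tool

FAKE_DB = {"blue widget": "in stock (42 units)"}  # placeholder data


@tool
def search_database(query: str) -> str:
    """Search the product database for a query.

    Args:
        query: The product name to look up.

    Returns:
        A short availability string.
    """
    return FAKE_DB.get(query.lower(), "no results")


print(search_database.name)         # "search_database"
print(search_database.description)  # taken from the docstring
print(search_database.invoke({"query": "Blue Widget"}))  # "in stock (42 units)"
```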
return_intermediate_steps=True includes the agent's reasoning trajectory in the output for debugging. Pattern: AgentExecutor(agent=agent, tools=tools, return_intermediate_steps=True); result = agent_executor.invoke({'input': 'query'}); print(result['intermediate_steps']). Output format: list of tuples [(AgentAction, tool_output), ...]. AgentAction contains: tool name, tool_input dict, log (LLM reasoning text). Example: [(AgentAction(tool='search', tool_input={'query': 'weather'}, log='I should search'), 'Result: 72°F'), (AgentAction(tool='calculator', tool_input={'expr': '(72-32)*5/9'}, log='Convert to Celsius'), '22.2')]. Use for: debugging tool selection errors, understanding agent reasoning, identifying which step failed, optimizing prompts. WARNING: intermediate_steps can be large (100+ KB for long chains), are only returned at the end of execution (not streamed), and may include sensitive data (sanitize before logging). Production: leave False (default) for performance, enable only when debugging specific issues. Access via callbacks for real-time monitoring: on_agent_action(action), on_tool_end(output).
Streaming provides real-time output during agent execution. Pattern: for chunk in agent_executor.stream({'input': 'query'}): if 'actions' in chunk: print(f'Action: {chunk["actions"]}'); elif 'steps' in chunk: print(f'Step: {chunk["steps"]}'); elif 'output' in chunk: print(f'Output: {chunk["output"]}'). Chunks contain: 'actions' (agent decisions), 'steps' (tool executions), 'output' (final answer), 'messages' (intermediate text). Async streaming: async for chunk in agent_executor.astream(input). For token-level streaming from LLM: use astream_events() API (Python 3.11+): async for event in agent_executor.astream_events(input, version='v1'): if event['event'] == 'on_chat_model_stream': print(event['data']['chunk']). Limitations: streaming doesn't work well with return_intermediate_steps=True (buffering conflict), JSON tool outputs not streamable (must be complete). Use streaming for: chatbots (user sees progress), long-running tasks (feedback), debugging (real-time logs). AgentExecutor.stream() yields complete steps, not individual tokens.
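A sketch of consuming the step-level stream from an existing agent_executor; the chunk keys follow the behavior described above:

```python
for chunk in agent_executor.stream({"input": "What's the weather in Paris?"}):
    if "actions" in chunk:
        for action in chunk["actions"]:
            print(f"-> calling {action.tool} with {action.tool_input}")
    elif "steps" in chunk:
        for step in chunk["steps"]:
            print(f"<- observation: {step.observation}")
    elif "output" in chunk:
        print(f"final: {chunk['output']}")
```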
OpenAI Functions agent vs Tools agent differ in LLM API used. OpenAI Functions (deprecated in v0.2): Uses function_call parameter (OpenAI API), agent = create_openai_functions_agent(llm, tools, prompt). Forces JSON responses with specific schema. Only works with OpenAI models (gpt-3.5-turbo, gpt-4). Tools agent (current standard): Uses tools parameter (OpenAI API, supported by Anthropic, Google), agent = create_tool_calling_agent(llm, tools, prompt). Uses native tool calling format. Works with OpenAI, Anthropic Claude, Google Gemini, others with tool support. Performance: Tools agent is faster (10-20% fewer tokens), more reliable parsing, better multi-tool handling. Migration: Replace create_openai_functions_agent with create_tool_calling_agent - API compatible. Use Tools agent for: all new projects, cross-model compatibility, better performance. Only use Functions agent for: legacy code, specific OpenAI function calling features. LangChain v0.2+ recommends Tools agent as default.
Tool descriptions are critical - LLMs use them to select correct tools. Best practices: 1) Start with verb ('Search for', 'Calculate', 'Fetch'). 2) Specify WHEN to use tool: 'Use when user asks about pricing' vs vague 'Gets pricing info'. 3) Specify input format: 'Takes ISO date string (YYYY-MM-DD)'. 4) Specify output format: 'Returns JSON with status and data fields'. 5) Add constraints: 'Only works for US addresses'. Example: @tool def search(query: str) -> str: '''Search product database. Use when user asks about product availability, pricing, or specifications. Takes product name or SKU as query. Returns JSON with {name, price, stock_count, specs}. Only searches active products.''' Good descriptions reduce tool selection errors by 60-80%. Use Field(description='...') for parameter-level descriptions. Keep total <200 words (fits in system prompt). Test by asking LLM 'which tool would you use for X?' Avoid: 'This tool does searching' (too vague), 'Advanced search algorithm' (implementation details irrelevant to LLM).
Timeouts and rate limiting prevent runaway tool execution. Timeout pattern: import asyncio; from langchain_core.tools import tool, ToolException; @tool async def api_call(url: str) -> str: '''Call external API''' try: async with httpx.AsyncClient(timeout=10.0) as client: response = await asyncio.wait_for(client.get(url), timeout=10.0); return response.text; except asyncio.TimeoutError: raise ToolException('API timeout after 10s - try again'). Set timeout on: HTTP clients (httpx/aiohttp timeout parameter), asyncio.wait_for() wrapper, AgentExecutor max_execution_time. Rate limiting: from langchain_core.rate_limiters import InMemoryRateLimiter; limiter = InMemoryRateLimiter(requests_per_second=2.0); llm = ChatOpenAI(rate_limiter=limiter). For tool-level rate limiting: use asyncio.Semaphore or token bucket algorithm. Pattern: semaphore = asyncio.Semaphore(5); async with semaphore: await tool_call(). Global rate limiter: share InMemoryRateLimiter instance across tools. Monitor: track requests_per_unit in callbacks. Production: combine timeouts (fail fast) + rate limiting (prevent API bans) + retries (handle transient errors).
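A sketch combining a client timeout, an asyncio timeout wrapper, and a semaphore for tool-level throttling; the concurrency limit of 5 and the 10-second timeouts are arbitrary:

```python
import asyncio

import httpx
from langchain_core.tools import ToolException, tool

MAX_CONCURRENT_CALLS = asyncio.Semaphore(5)  # arbitrary cap on in-flight requests


@tool
async def fetch_url(url: str) -> str:
    """Fetch a URL and return the response body."""
    try:
        async with MAX_CONCURRENT_CALLS:  # tool-level rate limiting
            async with httpx.AsyncClient(timeout=10.0) as client:
                response = await asyncio.wait_for(client.get(url), timeout=10.0)
                response.raise_for_status()
                return response.text
    except (asyncio.TimeoutError, httpx.TimeoutException):
        raise ToolException("Request timed out after 10s - try again or use a different URL.")
    except httpx.HTTPStatusError as e:
        raise ToolException(f"HTTP {e.response.status_code} from {url}.")
```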
StructuredTool.from_function() provides more control than @tool decorator. Pattern: from langchain_core.tools import StructuredTool; from pydantic import BaseModel, Field; class SearchSchema(BaseModel): query: str = Field(description='What to search'); filters: dict = Field(default_factory=dict); def search_func(query: str, filters: dict) -> str: return f'Searching: {query}'; tool = StructuredTool.from_function(func=search_func, name='search_db', description='Search database with filters', args_schema=SearchSchema, return_direct=False, handle_tool_error=True). Key parameters: return_direct=True returns tool output directly (skips LLM), handle_tool_error=True/str/callable for error recovery, coroutine=async_func for async tools. Use StructuredTool over @tool when: custom error handling needed, dynamic tool creation, tool metadata customization, integration with non-decorated functions. The args_schema creates JSON schema for OpenAI function calling.
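A runnable sketch mirroring the pattern above, with both sync and async implementations; the search logic is a placeholder:

```python
from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field


class SearchSchema(BaseModel):
    query: str = Field(description="What to search for")
    limit: int = Field(default=10, ge=1, le=100)


def search_func(query: str, limit: int = 10) -> str:
    return f"top {limit} results for: {query}"  # placeholder implementation


async def asearch_func(query: str, limit: int = 10) -> str:
    return search_func(query, limit)


search_tool = StructuredTool.from_function(
    func=search_func,
    coroutine=asearch_func,           # used by ainvoke()
    name="search_db",
    description="Search the product database. Use for availability or pricing questions.",
    args_schema=SearchSchema,
    return_direct=False,
    handle_tool_error=True,           # ToolException messages become observations
)

print(search_tool.invoke({"query": "widgets", "limit": 3}))
```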
handle_parsing_errors catches LLM output parsing failures and implements retry logic. Pattern: AgentExecutor(agent=agent, tools=tools, handle_parsing_errors=True). Options: True (sends generic error + parsing exception to LLM for retry), str (returns custom message to LLM: 'Could not parse. Try again with valid JSON.'), callable (custom function taking OutputParserException, returning str). Example custom handler: def handle_error(e: OutputParserException) -> str: return f'Invalid format. Expected JSON with action and action_input. Error: {str(e)[:100]}'. Common parsing errors: invalid JSON from LLM, missing required fields (action/action_input), tool name not in tool list. The error message is sent as observation to LLM, which retries with corrected format. Use True for development (verbose errors help debugging), custom string/function for production (cleaner prompts). Parsing errors count toward max_iterations. If retry fails, agent stops. Pair with verbose=True to see parsing failures in logs.
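A sketch of a custom handler; the message text is just an example, and agent/tools are assumed to exist already:

```python
from langchain.agents import AgentExecutor
from langchain_core.exceptions import OutputParserException


def handle_parse_error(error: OutputParserException) -> str:
    """Returned to the LLM as the observation for the failed step."""
    return (
        "Your last response could not be parsed. "
        f"Reply with valid JSON containing 'action' and 'action_input'. ({str(error)[:100]})"
    )


agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    handle_parsing_errors=handle_parse_error,
    verbose=True,
)
```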
Zero-shot vs few-shot differ in tool selection strategy. Zero-shot agent: Selects tools based only on tool descriptions (no examples). Pattern: agent = create_tool_calling_agent(llm, tools, prompt). Prompt contains tool descriptions only. Works when: tool names/descriptions are clear, LLM is capable (GPT-4), task is simple. Few-shot agent: Includes example tool uses in prompt to guide selection. Pattern: few_shot_prompt = FewShotPromptTemplate(examples=[{'input': 'weather in NYC', 'thought': 'Need weather tool', 'action': 'weather_tool', 'action_input': 'NYC'}], example_prompt=example_template, prefix='Answer questions using tools:', suffix='Question: {input}'); agent = create_react_agent(llm, tools, few_shot_prompt). Use few-shot when: complex tool selection logic, ambiguous tool names, weaker LLMs (GPT-3.5), novel domains. Examples improve tool selection accuracy 15-30%. Trade-off: few-shot uses more tokens (300-500 per example), slower, but more reliable. Modern approach: zero-shot with better tool descriptions + GPT-4 often beats few-shot with GPT-3.5. Few-shot essential for: domain-specific tools, non-obvious tool combinations.
Valid tool return types: str (most common, LLM-friendly), dict (auto-converted to JSON string), list (serialized to JSON), int/float/bool (converted to string), Pydantic BaseModel (serialized via model_dump_json()). Return str for: text responses, formatted output, error messages. Return dict/list for: structured data, multiple values, nested objects. Pattern: def tool() -> dict[str, Any]: return {'status': 'success', 'data': [...]}. CRITICAL: Avoid returning raw objects, file handles, or non-serializable types - causes JSON serialization errors. For binary data, return base64 string or file path. Tool output is passed back to LLM as observation, so format for readability. Use return_direct=True on StructuredTool if output should bypass LLM (e.g., calculator result). Maximum practical size: ~4000 characters per tool output (fits in LLM context). For large data, return summary + reference.
Use @tool with Pydantic BaseModel for multiple arguments. Pattern: from langchain_core.tools import tool; from pydantic import BaseModel, Field; class SearchInput(BaseModel): query: str = Field(description='Search query'); limit: int = Field(default=10, ge=1, le=100); filters: dict[str, str] = Field(default_factory=dict); @tool(args_schema=SearchInput) def search(query: str, limit: int = 10, filters: dict[str, str] | None = None) -> list[dict]: '''Search with filters'''. Pydantic Field() provides descriptions, defaults, and validation (ge=greater-equal, le=less-equal). Type hints support Python 3.10+ unions (str | None), generics (list[dict], dict[str, Any]), Optional[T]. The args_schema parameter creates proper JSON schema for LLM tool calling. Validation happens automatically - invalid inputs raise ValidationError before tool execution. Use Field(description='...') extensively - LLMs use these to select correct tools.
Define async tools with async def and the @tool decorator. Pattern: from langchain_core.tools import tool, ToolException; @tool async def fetch_api(url: str) -> str: '''Fetch data from API''' try: async with httpx.AsyncClient(timeout=10.0) as client: response = await client.get(url); response.raise_for_status(); return response.text; except httpx.TimeoutException: raise ToolException('API timeout after 10s'); except httpx.HTTPStatusError as e: raise ToolException(f'API error {e.response.status_code}'). Use ToolException for graceful errors - the message is sent to the LLM so it can try an alternative approach. The agent calls await tool.ainvoke() for async execution. Set handle_tool_error=True on the tool so ToolException becomes an observation instead of stopping the run. For rate limiting, use asyncio.Semaphore inside the tool. Always set explicit timeouts on external calls. Use try/except/finally pattern for cleanup. Async tools enable parallel execution with LangGraph - multiple tools run concurrently.
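A self-contained sketch of defining and calling an async tool; asyncio.sleep stands in for a real async API or database call, and the order data is made up:

```python
import asyncio

from langchain_core.tools import tool


@tool
async def lookup_order(order_id: str) -> str:
    """Look up the status of an order by id."""
    await asyncio.sleep(0.1)  # stands in for a real async API/database call
    return f"order {order_id}: shipped"


async def main() -> None:
    # a single async call
    print(await lookup_order.ainvoke({"order_id": "A123"}))
    # several independent calls run concurrently
    results = await asyncio.gather(
        lookup_order.ainvoke({"order_id": "A124"}),
        lookup_order.ainvoke({"order_id": "A125"}),
    )
    print(results)


asyncio.run(main())
```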
ReAct (Reasoning + Acting) alternates between reasoning and tool use until answer found. Pattern: from langchain.agents import create_react_agent; agent = create_react_agent(llm, tools, prompt); executor = AgentExecutor(agent=agent, tools=tools, verbose=True). ReAct prompt structure: Thought: (reasoning), Action: (tool name), Action Input: (tool input), Observation: (tool output), repeat until Final Answer. Custom ReAct prompt: from langchain import hub; prompt = hub.pull('hwchase17/react'); or define manually with {tools}, {tool_names}, {agent_scratchpad} variables. Behavior: agent explicitly shows reasoning steps (unlike tool-calling agents with implicit reasoning). Use ReAct when: debugging needed (see thought process), complex multi-step reasoning, educational/transparent agents. Comparison to tool-calling: ReAct has explicit Thought traces (verbose, more tokens), tool-calling has implicit reasoning (faster, fewer tokens). ReAct works with: any LLM (no tool-calling API required), legacy models. Modern approach: prefer create_tool_calling_agent for efficiency, ReAct for transparency.
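A runnable sketch of a ReAct agent using the hub prompt; it assumes the langchainhub package is installed, and the model name and word_count tool are illustrative:

```python
from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def word_count(text: str) -> str:
    """Count the number of words in a piece of text."""
    return str(len(text.split()))


tools = [word_count]
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # example model

prompt = hub.pull("hwchase17/react")  # standard Thought/Action/Observation template
agent = create_react_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)

result = executor.invoke({"input": "How many words are in 'the quick brown fox'?"})
print(result["output"])
```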