LangChain AI Agents Tools FAQ & Answers
25 expert LangChain AI Agents Tools answers researched from official documentation. Every answer cites authoritative sources you can verify.

The @tool decorator is the simplest way to create tools in LangChain. Basic pattern: from langchain_core.tools import tool; @tool def search_database(query: str) -> str: '''Search database for query. Args: query: The search term. Returns: Search results.''' return db.search(query). CRITICAL: Docstring is REQUIRED - it becomes the tool description for the LLM. The function name becomes the tool name by default; override with @tool('custom_name'). Type hints on function arguments are strongly recommended - they define the argument schema the LLM sees. The decorator automatically creates a structured tool with a proper schema. Return type must be JSON-serializable (str, dict, list, int, float, bool). Import from langchain_core.tools, NOT from langchain.tools (deprecated in v0.2). For complex validation, use Pydantic BaseModel inputs. Always test tool.invoke({'query': 'test'}) before agent use.
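A minimal runnable sketch of this pattern, with a hypothetical FakeDB class standing in for a real database client:

from langchain_core.tools import tool

# Hypothetical backend; any object with a .search(str) -> str method works here.
class FakeDB:
    def search(self, query: str) -> str:
        return f"3 results for '{query}'"

db = FakeDB()

@tool
def search_database(query: str) -> str:
    """Search the database for a query.

    Args:
        query: The search term.
    """
    return db.search(query)

# The decorator produces a structured tool; test it before handing it to an agent.
print(search_database.name)         # "search_database"
print(search_database.description)  # taken from the docstring
print(search_database.invoke({"query": "test"}))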
Use @tool with Pydantic BaseModel for multiple arguments. Pattern: from langchain_core.tools import tool; from pydantic import BaseModel, Field; class SearchInput(BaseModel): query: str = Field(description='Search query'); limit: int = Field(default=10, ge=1, le=100); filters: dict[str, str] = Field(default_factory=dict); @tool(args_schema=SearchInput) def search(query: str, limit: int = 10, filters: dict[str, str] | None = None) -> list[dict]: '''Search with filters'''. Pydantic Field() provides descriptions, defaults, and validation (ge=greater-equal, le=less-equal). Type hints support Python 3.10+ unions (str | None), generics (list[dict], dict[str, Any]), Optional[T]. The args_schema parameter creates proper JSON schema for LLM tool calling. Validation happens automatically - invalid inputs raise ValidationError before tool execution. Use Field(description='...') extensively - LLMs use these to select correct tools.
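A runnable sketch of the multi-argument pattern above; the tool body is a placeholder that just echoes its validated inputs:

from langchain_core.tools import tool
from pydantic import BaseModel, Field

class SearchInput(BaseModel):
    query: str = Field(description="Search query")
    limit: int = Field(default=10, ge=1, le=100, description="Maximum results to return")
    filters: dict[str, str] = Field(default_factory=dict, description="Field/value filters")

@tool(args_schema=SearchInput)
def search(query: str, limit: int = 10, filters: dict[str, str] | None = None) -> list[dict]:
    """Search the catalog with optional filters."""
    # Placeholder implementation: echo the validated arguments back.
    return [{"query": query, "limit": limit, "filters": filters or {}}]

print(search.invoke({"query": "laptops", "limit": 5}))
# search.invoke({"query": "laptops", "limit": 500}) would fail validation (le=100).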
Define async tools with async def and the @tool decorator. Pattern: from langchain_core.tools import tool, ToolException; @tool async def fetch_api(url: str) -> str: '''Fetch data from API''' try: async with httpx.AsyncClient(timeout=10.0) as client: response = await client.get(url); response.raise_for_status(); return response.json(); except httpx.TimeoutException: raise ToolException('API timeout after 10s'); except httpx.HTTPStatusError as e: raise ToolException(f'API error {e.response.status_code}'). Use ToolException for graceful errors - the message is sent to the LLM so it can try an alternative approach. Agent calls await tool.ainvoke() for async execution. Set handle_tool_error=True on the tool itself (it is a BaseTool/StructuredTool attribute, not an AgentExecutor parameter) so the ToolException message is returned as an observation instead of raised. For rate limiting, use asyncio.Semaphore inside the tool. Always set timeouts on external calls (default is infinite). Use try/except/finally pattern for cleanup. Async tools enable parallel execution with LangGraph - multiple tools run concurrently.
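A sketch of an async tool with graceful error handling, assuming httpx is installed; handle_tool_error is set on the tool object so ToolException messages flow back to the model:

import httpx
from langchain_core.tools import tool, ToolException

@tool
async def fetch_api(url: str) -> str:
    """Fetch the body of an HTTP API endpoint."""
    try:
        async with httpx.AsyncClient(timeout=10.0) as client:
            response = await client.get(url)
            response.raise_for_status()
            return response.text
    except httpx.TimeoutException:
        raise ToolException("API timeout after 10s")
    except httpx.HTTPStatusError as e:
        raise ToolException(f"API error {e.response.status_code}")

# Return ToolException messages to the LLM as observations instead of raising.
fetch_api.handle_tool_error = True

# Inside an async context: result = await fetch_api.ainvoke({"url": "https://example.com/api"})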
StructuredTool.from_function() provides more control than @tool decorator. Pattern: from langchain_core.tools import StructuredTool; from pydantic import BaseModel, Field; class SearchSchema(BaseModel): query: str = Field(description='What to search'); filters: dict = Field(default_factory=dict); def search_func(query: str, filters: dict) -> str: return f'Searching: {query}'; tool = StructuredTool.from_function(func=search_func, name='search_db', description='Search database with filters', args_schema=SearchSchema, return_direct=False, handle_tool_error=True). Key parameters: return_direct=True returns tool output directly (skips LLM), handle_tool_error=True/str/callable for error recovery, coroutine=async_func for async tools. Use StructuredTool over @tool when: custom error handling needed, dynamic tool creation, tool metadata customization, integration with non-decorated functions. The args_schema creates JSON schema for OpenAI function calling.
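A self-contained sketch of StructuredTool.from_function; search_func is a placeholder implementation:

from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field

class SearchSchema(BaseModel):
    query: str = Field(description="What to search for")
    filters: dict = Field(default_factory=dict, description="Optional field filters")

def search_func(query: str, filters: dict) -> str:
    # Placeholder implementation.
    return f"Searching for {query!r} with filters {filters}"

search_tool = StructuredTool.from_function(
    func=search_func,
    name="search_db",
    description="Search the database with optional filters.",
    args_schema=SearchSchema,
    return_direct=False,
    handle_tool_error=True,
)

print(search_tool.invoke({"query": "widgets", "filters": {"status": "active"}}))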
Valid tool return types: str (most common, LLM-friendly), dict (auto-converted to JSON string), list (serialized to JSON), int/float/bool (converted to string), Pydantic BaseModel (serialized via model_dump_json()). Return str for: text responses, formatted output, error messages. Return dict/list for: structured data, multiple values, nested objects. Pattern: def tool() -> dict[str, Any]: return {'status': 'success', 'data': [...]}. CRITICAL: Avoid returning raw objects, file handles, or non-serializable types - causes JSON serialization errors. For binary data, return base64 string or file path. Tool output is passed back to LLM as observation, so format for readability. Use return_direct=True on StructuredTool if output should bypass LLM (e.g., calculator result). Maximum practical size: ~4000 characters per tool output (fits in LLM context). For large data, return summary + reference.
Tool descriptions are critical - LLMs use them to select correct tools. Best practices: 1) Start with verb ('Search for', 'Calculate', 'Fetch'). 2) Specify WHEN to use tool: 'Use when user asks about pricing' vs vague 'Gets pricing info'. 3) Specify input format: 'Takes ISO date string (YYYY-MM-DD)'. 4) Specify output format: 'Returns JSON with status and data fields'. 5) Add constraints: 'Only works for US addresses'. Example: @tool def search(query: str) -> str: '''Search product database. Use when user asks about product availability, pricing, or specifications. Takes product name or SKU as query. Returns JSON with {name, price, stock_count, specs}. Only searches active products.''' Good descriptions reduce tool selection errors by 60-80%. Use Field(description='...') for parameter-level descriptions. Keep total <200 words (fits in system prompt). Test by asking LLM 'which tool would you use for X?' Avoid: 'This tool does searching' (too vague), 'Advanced search algorithm' (implementation details irrelevant to LLM).
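For illustration, a hypothetical product-search tool whose docstring follows the checklist above (verb, when to use, input format, output format, constraints):

from langchain_core.tools import tool

@tool
def search_products(query: str) -> str:
    """Search the product database.

    Use when the user asks about product availability, pricing, or specifications.
    Takes a product name or SKU as the query.
    Returns JSON with {name, price, stock_count, specs}.
    Only searches active products.
    """
    # Placeholder for the real catalog lookup.
    return '{"name": "Widget", "price": 9.99, "stock_count": 42, "specs": {}}'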
BaseTool provides maximum control for complex tools. Pattern: from langchain_core.tools import BaseTool; from pydantic import BaseModel, Field; class SearchInput(BaseModel): query: str; class SearchTool(BaseTool): name: str = 'search_db'; description: str = 'Search database'; args_schema: type[BaseModel] = SearchInput; return_direct: bool = False; def _run(self, query: str) -> str: '''Sync implementation''' return self.db.search(query); async def _arun(self, query: str) -> str: '''Async implementation''' return await self.db.async_search(query). Must override _run() for sync, _arun() for async. Use when: tool needs state (self.db), complex initialization, custom validation beyond Pydantic, integration with existing classes. Because BaseTool is a Pydantic model, declare stateful attributes as fields (e.g., db: Any = None) and pass them in the constructor: SearchTool(db=db_connection). The args_schema defines input validation. Set return_direct=True to skip LLM processing of output. BaseTool is verbose but necessary for stateful tools or inheritance patterns.
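A sketch of a stateful BaseTool subclass; the db dependency is hypothetical and declared as a Pydantic field so it can be passed to the constructor:

from typing import Any, Type
from langchain_core.tools import BaseTool
from pydantic import BaseModel, Field

class SearchInput(BaseModel):
    query: str = Field(description="What to search for")

class SearchTool(BaseTool):
    name: str = "search_db"
    description: str = "Search the internal database."
    args_schema: Type[BaseModel] = SearchInput
    return_direct: bool = False
    db: Any = None  # stateful dependency, declared as a field because BaseTool is a Pydantic model

    def _run(self, query: str) -> str:
        return self.db.search(query)

    async def _arun(self, query: str) -> str:
        return await self.db.async_search(query)

# tool = SearchTool(db=my_database_connection)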
AgentExecutor wraps the agent and tools in an execution loop. Basic pattern: from langchain.agents import AgentExecutor, create_tool_calling_agent; agent = create_tool_calling_agent(llm, tools, prompt); agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, max_iterations=15, max_execution_time=300, handle_parsing_errors=True, return_intermediate_steps=False). Key parameters: max_iterations (default 15) caps loops to prevent infinite runs, max_execution_time in seconds for hard timeout, handle_parsing_errors=True retries on LLM output parse failures, return_intermediate_steps=True includes agent trajectory in output (for debugging). Early stopping: early_stopping_method='force' returns error message, 'generate' prompts LLM for final response. Use callbacks=[CustomCallback()] for logging/monitoring. NOTE: LangGraph is now preferred over AgentExecutor for production (LangChain v0.2 migration guide) - provides better control, persistence, human-in-loop.
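A sketch of the full wiring, assuming an OpenAI API key is configured and `tools` is a list of tools defined elsewhere; the model name is illustrative:

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),   # required by create_tool_calling_agent
])

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    max_iterations=15,
    max_execution_time=300,
    handle_parsing_errors=True,
)

result = agent_executor.invoke({"input": "What is the weather in Paris?"})
print(result["output"])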
max_iterations controls maximum agent loops before forced termination. Each iteration = tool invocation + LLM processing. Default is 15 iterations. Pattern: AgentExecutor(agent=agent, tools=tools, max_iterations=10, early_stopping_method='generate'). Behavior: iteration count includes initial LLM call + each tool use. After max_iterations, early_stopping_method determines response: 'force' returns 'Agent stopped due to iteration limit or time limit.' (fails fast), 'generate' prompts LLM one final time to synthesize answer from partial results (best effort). Use max_iterations=5-10 for simple tasks (web search + answer), 15-25 for complex reasoning (multi-step calculations), infinite (None) only for human-supervised agents. Combine with max_execution_time=60 for hard timeout in seconds. Monitor with callbacks: on_agent_action fired on each iteration. Common issue: too low max_iterations causes premature stopping, too high causes expensive runaway loops. Production: set both max_iterations and max_execution_time.
handle_parsing_errors catches LLM output parsing failures and implements retry logic. Pattern: AgentExecutor(agent=agent, tools=tools, handle_parsing_errors=True). Options: True (sends generic error + parsing exception to LLM for retry), str (returns custom message to LLM: 'Could not parse. Try again with valid JSON.'), callable (custom function taking OutputParserException, returning str). Example custom handler: def handle_error(e: OutputParserException) -> str: return f'Invalid format. Expected JSON with action and action_input. Error: {str(e)[:100]}'. Common parsing errors: invalid JSON from LLM, missing required fields (action/action_input), tool name not in tool list. The error message is sent as observation to LLM, which retries with corrected format. Use True for development (verbose errors help debugging), custom string/function for production (cleaner prompts). Parsing errors count toward max_iterations. If retry fails, agent stops. Pair with verbose=True to see parsing failures in logs.
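A sketch of a custom handler; the returned string becomes the observation the LLM sees on its retry (agent and tools are assumed to exist already):

from langchain.agents import AgentExecutor
from langchain_core.exceptions import OutputParserException

def handle_error(error: OutputParserException) -> str:
    # Keep it short and actionable; it is fed back to the LLM verbatim.
    return ("Invalid format. Respond with JSON containing 'action' and 'action_input'. "
            f"Parser said: {str(error)[:100]}")

agent_executor = AgentExecutor(
    agent=agent,   # created elsewhere
    tools=tools,   # created elsewhere
    handle_parsing_errors=handle_error,
    verbose=True,
)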
return_intermediate_steps=True includes the agent's reasoning trajectory in the output for debugging. Pattern: AgentExecutor(agent=agent, tools=tools, return_intermediate_steps=True); result = agent_executor.invoke({'input': 'query'}); print(result['intermediate_steps']). Output format: list of tuples [(AgentAction, tool_output), ...]. AgentAction contains: tool name, tool_input dict, log (LLM reasoning text). Example: [((tool='search', tool_input={'query': 'weather'}, log='I should search'), 'Result: 72°F'), ((tool='calculator', tool_input={'expr': '(72-32)*5/9'}, log='Convert to Celsius'), '22.2')]. Use for: debugging tool selection errors, understanding agent reasoning, identifying which step failed, optimizing prompts. WARNING: intermediate_steps can be large (100+ KB for long chains), only return at end of execution (not streamed), includes sensitive data (sanitize before logging). Production: set False (default) for performance, enable only when debugging specific issues. Access via callbacks for real-time monitoring: on_agent_action(action), on_tool_end(output).
Callbacks hook into agent execution for logging, monitoring, tracing. Pattern: from langchain.callbacks.base import BaseCallbackHandler; class AgentLogger(BaseCallbackHandler): def on_llm_start(self, serialized, prompts, **kwargs): logger.info(f'LLM called with {len(prompts)} prompts'); def on_tool_start(self, serialized, input_str, **kwargs): logger.info(f'Tool {serialized["name"]} called'); def on_tool_end(self, output, **kwargs): logger.info(f'Tool returned: {output[:100]}'); def on_agent_action(self, action, **kwargs): logger.info(f'Action: {action.tool}, Input: {action.tool_input}'); def on_agent_finish(self, finish, **kwargs): logger.info(f'Agent finished: {finish.return_values}'). Use: AgentExecutor(callbacks=[AgentLogger()]). Built-in callbacks: StdOutCallbackHandler (console output), LangChainTracer (LangSmith tracing), WandbCallbackHandler (Weights & Biases). For cost tracking: use get_openai_callback context manager. Async callbacks: inherit AsyncCallbackHandler, implement async def methods. Callbacks enable: cost tracking, performance monitoring, error alerting, audit logs.
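A runnable sketch of a logging callback handler covering the tool and agent hooks listed above:

import logging
from langchain_core.callbacks import BaseCallbackHandler

logger = logging.getLogger("agent")

class AgentLogger(BaseCallbackHandler):
    def on_tool_start(self, serialized, input_str, **kwargs):
        logger.info("Tool %s called with %s", serialized.get("name"), input_str)

    def on_tool_end(self, output, **kwargs):
        logger.info("Tool returned: %s", str(output)[:100])

    def on_agent_action(self, action, **kwargs):
        logger.info("Action: %s, input: %s", action.tool, action.tool_input)

    def on_agent_finish(self, finish, **kwargs):
        logger.info("Agent finished: %s", finish.return_values)

# agent_executor = AgentExecutor(agent=agent, tools=tools, callbacks=[AgentLogger()])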
Streaming provides real-time output during agent execution. Pattern: for chunk in agent_executor.stream({'input': 'query'}): if 'actions' in chunk: print(f'Action: {chunk["actions"]}'); elif 'steps' in chunk: print(f'Step: {chunk["steps"]}'); elif 'output' in chunk: print(f'Output: {chunk["output"]}'). Chunks contain: 'actions' (agent decisions), 'steps' (tool executions), 'output' (final answer), 'messages' (intermediate text). Async streaming: async for chunk in agent_executor.astream(input). For token-level streaming from LLM: use astream_events() API (Python 3.11+): async for event in agent_executor.astream_events(input, version='v1'): if event['event'] == 'on_chat_model_stream': print(event['data']['chunk']). Limitations: streaming doesn't work well with return_intermediate_steps=True (buffering conflict), JSON tool outputs not streamable (must be complete). Use streaming for: chatbots (user sees progress), long-running tasks (feedback), debugging (real-time logs). AgentExecutor.stream() yields complete steps, not individual tokens.
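A sketch of the step-level streaming loop (complete actions and observations, not tokens), assuming agent_executor exists:

for chunk in agent_executor.stream({"input": "What's the weather in Berlin?"}):
    if "actions" in chunk:
        for action in chunk["actions"]:
            print(f"Calling {action.tool} with {action.tool_input}")
    elif "steps" in chunk:
        for step in chunk["steps"]:
            print(f"Observation: {step.observation}")
    elif "output" in chunk:
        print(f"Final answer: {chunk['output']}")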
LangChain v0.2+ offers two agent patterns: AgentExecutor (legacy, simpler) vs LCEL/LangGraph (modern, flexible). AgentExecutor: Pre-built loop, agent = create_tool_calling_agent(llm, tools, prompt); executor = AgentExecutor(agent=agent, tools=tools). Pros: simple setup, batteries-included. Cons: limited customization, no persistence, basic error handling. LCEL pattern: agent = prompt | llm.bind_tools(tools) | parser; result = agent.invoke(input). Manually implement loop for tool calling. LangGraph pattern (recommended for production): StateGraph with nodes (agent, tools, human_review), edges (conditional routing), checkpoints (persistence). Pros: full control over execution, human-in-loop, state persistence, parallel tool execution, complex workflows. Cons: more code, steeper learning curve. Migration: LangChain v0.2 docs recommend LangGraph for new projects - AgentExecutor maintained for backward compatibility. Use AgentExecutor for: quick prototypes, simple tasks. Use LangGraph for: production apps, multi-step workflows, human oversight, debugging/replay.
OpenAI Functions agent vs Tools agent differ in LLM API used. OpenAI Functions (deprecated in v0.2): Uses function_call parameter (OpenAI API), agent = create_openai_functions_agent(llm, tools, prompt). Forces JSON responses with specific schema. Only works with OpenAI models (gpt-3.5-turbo, gpt-4). Tools agent (current standard): Uses tools parameter (OpenAI API, supported by Anthropic, Google), agent = create_tool_calling_agent(llm, tools, prompt). Uses native tool calling format. Works with OpenAI, Anthropic Claude, Google Gemini, others with tool support. Performance: Tools agent is faster (10-20% fewer tokens), more reliable parsing, better multi-tool handling. Migration: Replace create_openai_functions_agent with create_tool_calling_agent - API compatible. Use Tools agent for: all new projects, cross-model compatibility, better performance. Only use Functions agent for: legacy code, specific OpenAI function calling features. LangChain v0.2+ recommends Tools agent as default.
ReAct (Reasoning + Acting) alternates between reasoning and tool use until answer found. Pattern: from langchain.agents import create_react_agent; agent = create_react_agent(llm, tools, prompt); executor = AgentExecutor(agent=agent, tools=tools, verbose=True). ReAct prompt structure: Thought: (reasoning), Action: (tool name), Action Input: (tool input), Observation: (tool output), repeat until Final Answer. Custom ReAct prompt: from langchain import hub; prompt = hub.pull('hwchase17/react'); or define manually with {tools}, {tool_names}, {agent_scratchpad} variables. Behavior: agent explicitly shows reasoning steps (unlike tool-calling agents with implicit reasoning). Use ReAct when: debugging needed (see thought process), complex multi-step reasoning, educational/transparent agents. Comparison to tool-calling: ReAct has explicit Thought traces (verbose, more tokens), tool-calling has implicit reasoning (faster, fewer tokens). ReAct works with: any LLM (no tool-calling API required), legacy models. Modern approach: prefer create_tool_calling_agent for efficiency, ReAct for transparency.
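A minimal ReAct setup using the prompt pulled from LangChain Hub (requires the langchainhub package); llm and tools are assumed to exist:

from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent

prompt = hub.pull("hwchase17/react")            # standard ReAct prompt with {tools}, {tool_names}, {agent_scratchpad}
agent = create_react_agent(llm, tools, prompt)  # llm and tools defined elsewhere
executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
executor.invoke({"input": "What is 23 * 7, and is it larger than 150?"})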
Conversational agents remember chat history across turns. Pattern: from langchain.agents import create_tool_calling_agent, AgentExecutor; from langchain.memory import ConversationBufferMemory; memory = ConversationBufferMemory(memory_key='chat_history', return_messages=True); agent = create_tool_calling_agent(llm, tools, prompt); agent_executor = AgentExecutor(agent=agent, tools=tools, memory=memory, verbose=True). Prompt must include {chat_history} variable. Memory types: ConversationBufferMemory (stores all messages, grows unbounded), ConversationBufferWindowMemory (keeps last K messages), ConversationSummaryMemory (LLM-generated summary, token-efficient), ConversationSummaryBufferMemory (hybrid: recent messages + summary). CRITICAL: Set return_messages=True for chat models (returns Message objects), False for LLMs (returns strings). Access history: memory.load_memory_variables({})['chat_history']. Clear: memory.clear(). Use conversational agent when: multi-turn chat, context from previous questions needed, follow-up queries. Limitation: memory adds context to every LLM call (increases cost/latency). Production: use ConversationSummaryBufferMemory with max_token_limit=2000.
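A sketch of the conversational wiring; note the {chat_history} placeholder in the prompt and return_messages=True on the memory (llm and tools assumed to exist):

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, memory=memory, verbose=True)

agent_executor.invoke({"input": "My name is Ada."})
agent_executor.invoke({"input": "What is my name?"})  # answered from chat_history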
Zero-shot vs few-shot differ in tool selection strategy. Zero-shot agent: Selects tools based only on tool descriptions (no examples). Pattern: agent = create_tool_calling_agent(llm, tools, prompt). Prompt contains tool descriptions only. Works when: tool names/descriptions are clear, LLM is capable (GPT-4), task is simple. Few-shot agent: Includes example tool uses in prompt to guide selection. Pattern: few_shot_prompt = FewShotPromptTemplate(examples=[{'input': 'weather in NYC', 'thought': 'Need weather tool', 'action': 'weather_tool', 'action_input': 'NYC'}], example_prompt=example_template, prefix='Answer questions using tools:', suffix='Question: {input}'); agent = create_react_agent(llm, tools, few_shot_prompt). Use few-shot when: complex tool selection logic, ambiguous tool names, weaker LLMs (GPT-3.5), novel domains. Examples improve tool selection accuracy 15-30%. Trade-off: few-shot uses more tokens (300-500 per example), slower, but more reliable. Modern approach: zero-shot with better tool descriptions + GPT-4 often beats few-shot with GPT-3.5. Few-shot essential for: domain-specific tools, non-obvious tool combinations.
Custom agents provide full control over tool selection and execution loop. Pattern: from langchain.agents import BaseSingleActionAgent; class CustomAgent(BaseSingleActionAgent): @property def input_keys(self): return ['input']; def plan(self, intermediate_steps, **kwargs): '''Decide next action''' if not intermediate_steps: return AgentAction(tool='search', tool_input='query', log='Starting'); last_tool, last_output = intermediate_steps[-1]; if 'error' in last_output: return AgentAction(tool='fallback', tool_input=last_output, log='Retry'); return AgentFinish(return_values={'output': last_output}, log='Done'); async def aplan(self, intermediate_steps, **kwargs): '''Async version'''. Override plan() to implement custom logic: tool selection based on state, conditional workflows, retry logic, tool chaining rules. Return AgentAction (continue with tool) or AgentFinish (stop with answer). Use with: AgentExecutor(agent=CustomAgent(), tools=tools). Custom agents enable: non-LLM agents (rule-based), hybrid approaches (LLM + heuristics), deterministic workflows. More control than LCEL but verbose. Modern alternative: LangGraph for complex workflows.
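A runnable sketch of a rule-based custom agent: always call a search tool first, then finish with its output (a simplified stand-in for real routing logic):

from typing import Any, List, Tuple, Union
from langchain.agents import AgentExecutor, BaseSingleActionAgent
from langchain_core.agents import AgentAction, AgentFinish

class RuleBasedAgent(BaseSingleActionAgent):
    """Deterministic agent: search once, then return the observation."""

    @property
    def input_keys(self) -> List[str]:
        return ["input"]

    def plan(self, intermediate_steps: List[Tuple[AgentAction, str]],
             **kwargs: Any) -> Union[AgentAction, AgentFinish]:
        if not intermediate_steps:
            return AgentAction(tool="search", tool_input=kwargs["input"], log="Starting search")
        _, last_output = intermediate_steps[-1]
        return AgentFinish(return_values={"output": last_output}, log="Done")

    async def aplan(self, intermediate_steps, **kwargs):
        return self.plan(intermediate_steps, **kwargs)

# executor = AgentExecutor(agent=RuleBasedAgent(), tools=[search_tool])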
Tool validation prevents invalid inputs reaching tool logic. Use Pydantic for schema validation: from pydantic import BaseModel, Field, field_validator; class EmailInput(BaseModel): to: str = Field(pattern=r'^[\w.-]+@[\w.-]+\.\w+$'); subject: str = Field(min_length=1, max_length=200); body: str; @field_validator('to') def validate_email(cls, v): if not v.endswith('@company.com'): raise ValueError('Only company emails allowed'); return v; @tool(args_schema=EmailInput) def send_email(to: str, subject: str, body: str) -> str: '''Send email'''. Pydantic constraints: pattern (regex), min_length/max_length (strings), ge/le (numbers), field_validator decorator for custom logic (use @validator in Pydantic v1). Validation errors raise ValidationError before tool execution - the agent receives the error and retries. For runtime validation: def send_email(...): if not check_rate_limit(): raise ToolException('Rate limit exceeded - wait 60s'). Use ToolException for recoverable errors (agent can retry/adapt), raise generic Exception for fatal errors (agent stops). Validate: input format (email, URL, JSON), business rules (permissions, rate limits), data ranges (dates, amounts). Validation reduces tool failures 40-60%.
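A sketch in Pydantic v2 style (field_validator, pattern=); the send_email body is a placeholder:

from langchain_core.tools import tool
from pydantic import BaseModel, Field, field_validator

class EmailInput(BaseModel):
    to: str = Field(pattern=r"^[\w.-]+@[\w.-]+\.\w+$", description="Recipient address")
    subject: str = Field(min_length=1, max_length=200)
    body: str

    @field_validator("to")
    @classmethod
    def company_only(cls, v: str) -> str:
        if not v.endswith("@company.com"):
            raise ValueError("Only company emails allowed")
        return v

@tool(args_schema=EmailInput)
def send_email(to: str, subject: str, body: str) -> str:
    """Send an internal email."""
    # Placeholder for the real mail client call.
    return f"Sent to {to}"

print(send_email.invoke({"to": "ops@company.com", "subject": "Hi", "body": "Report attached."}))
# A gmail.com recipient would fail validation before send_email runs.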
Timeouts and rate limiting prevent runaway tool execution. Timeout pattern: import asyncio; from langchain_core.tools import tool, ToolException; @tool async def api_call(url: str) -> str: '''Call external API''' try: async with httpx.AsyncClient(timeout=10.0) as client: response = await asyncio.wait_for(client.get(url), timeout=10.0); return response.text; except asyncio.TimeoutError: raise ToolException('API timeout after 10s - try again'). Set timeout on: HTTP clients (httpx/aiohttp timeout parameter), asyncio.wait_for() wrapper, AgentExecutor max_execution_time. Rate limiting: from langchain_core.rate_limiters import InMemoryRateLimiter; limiter = InMemoryRateLimiter(requests_per_second=2.0); llm = ChatOpenAI(rate_limiter=limiter). For tool-level rate limiting: use asyncio.Semaphore or token bucket algorithm. Pattern: semaphore = asyncio.Semaphore(5); async with semaphore: await tool_call(). Global rate limiter: share InMemoryRateLimiter instance across tools. Monitor: track requests_per_unit in callbacks. Production: combine timeouts (fail fast) + rate limiting (prevent API bans) + retries (handle transient errors).
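A sketch combining a client timeout with an asyncio.Semaphore for tool-level concurrency limiting (httpx assumed installed; the limit of 5 is illustrative):

import asyncio
import httpx
from langchain_core.tools import tool, ToolException

_semaphore = asyncio.Semaphore(5)  # at most 5 concurrent calls to the external API

@tool
async def rate_limited_fetch(url: str) -> str:
    """Fetch a URL with a 10-second timeout and bounded concurrency."""
    async with _semaphore:
        try:
            async with httpx.AsyncClient(timeout=10.0) as client:
                response = await client.get(url)
                response.raise_for_status()
                return response.text
        except httpx.TimeoutException:
            raise ToolException("Request timed out after 10s - retry or use a different URL")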
Tool errors inform the LLM to adjust strategy. Pattern: from langchain_core.tools import ToolException; @tool def database_query(sql: str) -> str: '''Query database''' try: return db.execute(sql); except PermissionError: raise ToolException('Access denied. Try query without sensitive tables.'); except TimeoutError: raise ToolException('Query timeout. Use simpler query with LIMIT 100.'); except Exception as e: raise ToolException(f'Query failed: {str(e)[:100]}. Check SQL syntax.'). The ToolException message is sent to the LLM as an observation - the LLM can retry with corrected input. Generic exceptions stop agent execution. Use ToolException for: user errors (invalid input, permissions), transient errors (timeout, rate limit), recoverable failures (file not found, API error). Error message best practices: explain WHAT failed (without leaking internals), suggest HOW to fix ('Use LIMIT 100'), include relevant context (status code, error type), keep concise (<100 chars). Enable on the tool itself: set handle_tool_error=True (returns the exception message), a custom string, or a handler function - it is a tool attribute, not an AgentExecutor parameter. Error recovery improves success rate 25-40%.
Parallel tool execution requires async tools + LangGraph or manual orchestration. Pattern with LangGraph: from langgraph.prebuilt import ToolNode; tool_node = ToolNode(tools); graph.add_node('tools', tool_node). ToolNode executes the multiple tool calls in a single AI message concurrently by default - no extra flag is needed. Manual approach with asyncio: invoke each tool's ainvoke() and gather the results (see the sketch below). AgentExecutor does NOT support parallel execution (sequential only). For parallel execution: 1) All tools must be async (@tool with async def). 2) LLM must support multi-tool calling (OpenAI parallel function calling, Anthropic tool use). 3) Tools must be independent (no shared state). Benefits: 2-5x faster for I/O-bound tools (API calls, database queries), no benefit for CPU-bound tools (GIL). Use max_concurrency to limit: config={'max_concurrency': 5}. Production: LangGraph with ToolNode is the recommended approach for parallel execution.
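A sketch of the manual asyncio approach, assuming tool_calls in the dict format produced by a tool-calling model's AIMessage.tool_calls:

import asyncio

async def run_tools_in_parallel(tool_calls, tools_by_name):
    """Invoke independent async tools concurrently; exceptions are returned, not raised."""
    tasks = [tools_by_name[call["name"]].ainvoke(call["args"]) for call in tool_calls]
    return await asyncio.gather(*tasks, return_exceptions=True)

# Example usage (ai_message from llm_with_tools.invoke(...)):
# results = asyncio.run(run_tools_in_parallel(ai_message.tool_calls,
#                                             {t.name: t for t in tools}))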
Agent security requires multiple layers: input sanitization, tool restrictions, execution sandboxing. Input validation: from pydantic import BaseModel, Field, field_validator; class SafeInput(BaseModel): query: str = Field(max_length=500); @field_validator('query') def no_sql_injection(cls, v): if any(kw in v.lower() for kw in ['drop', 'delete', 'truncate']): raise ValueError('Unsafe SQL keyword'); return v. Tool restrictions: pass only an allowlist of tools to the agent and executor: allowed = [t for t in all_tools if t.name in {'search', 'calculator'}]; agent_executor = AgentExecutor(agent=agent, tools=allowed) (AgentExecutor has no allowed_tools parameter). Allowlisting is safer than blocklisting. Per-user permissions: class PermissionedTool(BaseTool): def _run(self, user_id: str, ...): if not has_permission(user_id, self.name): raise ToolException('Permission denied'). Sandboxing: use RestrictedPython for code execution tools, Docker containers for file system isolation, read-only database connections, network egress controls. Monitor: log all tool calls, alert on sensitive operations, rate limit per user. Production checklist: validate all inputs, allowlist tools, sandbox code execution, read-only by default, audit logging, rate limiting.
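A sketch of a per-user permission check baked into a tool; the allowed_users set is a hypothetical permission store injected at construction time (never chosen by the LLM):

from typing import Set
from langchain_core.tools import BaseTool, ToolException

class PermissionedSearchTool(BaseTool):
    name: str = "search_db"
    description: str = "Search the internal database."
    user_id: str = ""                 # injected per request, not exposed to the LLM
    allowed_users: Set[str] = set()   # hypothetical permission store

    def _run(self, query: str) -> str:
        if self.user_id not in self.allowed_users:
            raise ToolException("Permission denied for this tool")
        # Placeholder for the real, read-only query.
        return f"Results for {query!r}"

# tool = PermissionedSearchTool(user_id="alice", allowed_users={"alice", "bob"})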
Cost tracking monitors LLM spending during agent execution. Built-in tracking: from langchain.callbacks import get_openai_callback; with get_openai_callback() as cb: result = agent_executor.invoke(input); print(f'Tokens: {cb.total_tokens}, Cost: ${cb.total_cost}'). Tracks: prompt_tokens, completion_tokens, total_tokens, total_cost (USD). Supports: OpenAI, Anthropic (via specific callbacks). Custom tracking: class CostCallback(BaseCallbackHandler): total_cost = 0; def on_llm_end(self, response, **kwargs): tokens = response.llm_output['token_usage']['total_tokens']; self.total_cost += tokens * 0.000002; # $0.002 per 1K tokens. Token limits per call: llm = ChatOpenAI(max_tokens=500, temperature=0). Agent-level budget: track in callback, raise exception when exceeded. Production pattern: cb = CostCallback(); agent_executor.invoke(input, callbacks=[cb]); if cb.total_cost > budget: alert('Budget exceeded'). Monitor: cost per query, cost per user, cost per day. Optimization: shorter prompts (20% savings), caching (50% savings), smaller models for simple tools (70% savings). LangSmith provides built-in cost tracking dashboard.
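A sketch of per-run cost tracking with the built-in OpenAI callback (agent_executor assumed to exist and to use an OpenAI model):

from langchain.callbacks import get_openai_callback

with get_openai_callback() as cb:
    result = agent_executor.invoke({"input": "Summarize today's sales."})

print(f"Prompt tokens:     {cb.prompt_tokens}")
print(f"Completion tokens: {cb.completion_tokens}")
print(f"Total tokens:      {cb.total_tokens}")
print(f"Total cost (USD):  ${cb.total_cost:.4f}")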