Middleware

Middleware is LangChain 1.0's main extension point: hook into fixed stages of the agent loop so you don't rewrite the loop for every production concern.


Why middleware?

Common production needs:

  • Summarization (context limits)
  • Long-term memory injection
  • Human-in-the-loop before sensitive tools
  • Subagent delegation
  • Dynamic prompts / model routing
  • Step and tool limits

Hooks sit around model and tool nodes; multiple middleware pieces compose (onion model).


Hook types (conceptual)

HookWhenExamples
before_modelBefore LLM callTrim/inject messages
modify_model_requestBuilding requestSwap model, temperature
wrap_model_callAround LLMRetry, cache, log
after_modelAfter LLMHITL interrupt, filter tools

Order: forward on the way in, reverse on the way out.


Example: step limit (sketch)

from langchain.agents import create_agent
from langchain.agents.middleware import AgentMiddleware  # see official docs

class StepLimitMiddleware(AgentMiddleware):
    def __init__(self, max_steps: int = 10):
        self.max_steps = max_steps

    def before_model(self, state, runtime):
        ...

agent = create_agent(
    "openai:gpt-4.1-mini",
    tools=[...],
    middleware=[StepLimitMiddleware(max_steps=15)],
)

Exact base classes and signatures depend on your installed version—see official middleware docs.


Built-in & community middleware

Examples: human-in-the-loop, summarization, model fallback, PII filters, prompt caching. Check LangChain blog and GitHub for community lists.


Middleware vs LangGraph nodes

MiddlewareCustom nodes
IntrusionLowHigh
Best forCross-cutting concernsNew control flow

Use middleware for single-agent enhancements; LangGraph for orchestration patterns.


Subagents

Middleware can spawn subagents with isolated context and merge results—see Deep Agents / delegation middleware.


Debugging

Use LangSmith state diffs; add one middleware at a time; avoid heavy sync IO in hooks.


Next steps