Summarization for Pydantic AI
Automatic conversation summarization for unlimited context
Three strategies for managing agent context: intelligent LLM-based summarization, zero-cost sliding window trimming, and real-time context manager middleware with token tracking.
Installation
```shell
pip install summarization-pydantic-ai
```

Features

Three strategies for keeping agent conversations within context limits: LLM-based summarization intelligently compresses older messages while preserving key information, triggered by message count, token count, or context fraction; zero-cost sliding window trimming simply drops the oldest messages with a safe cutoff that never breaks tool call/response pairs; and real-time context manager middleware tracks token usage live, truncates long tool outputs, and auto-detects model context windows.
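The safe cutoff used by sliding window trimming can be illustrated with a short sketch. This is not the package's internals: the dict-based message shape and the `safe_trim` helper are simplified assumptions for illustration only.

```python
def safe_trim(messages, keep_last):
    """Drop the oldest messages without orphaning a tool response.

    Simplified sketch: each message is a dict with a "role" key, and a
    "tool" message answers the nearest preceding assistant message that
    issued tool calls. Real pydantic-ai message types differ.
    """
    cutoff = max(len(messages) - keep_last, 0)
    # If the cutoff would start on a tool response, advance it past the
    # orphaned responses so call/response pairs are dropped together.
    while cutoff < len(messages) and messages[cutoff]["role"] == "tool":
        cutoff += 1
    return messages[cutoff:]


history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "calling tool", "tool_calls": ["t1"]},
    {"role": "tool", "content": "result"},
    {"role": "assistant", "content": "done"},
]

# Keeping 2 would put the cutoff on the tool response, so the trimmer
# advances past it, dropping the whole call/response pair.
trimmed = safe_trim(history, keep_last=2)
```

The point of the forward-moving cutoff is that trimming never leaves a tool response in the history whose originating tool call was dropped, which many model APIs reject.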
Quick Start
```python
from pydantic_ai import Agent
from pydantic_ai_summarization import create_summarization_processor

# Summarize once the history exceeds 100k tokens,
# keeping the 20 most recent messages verbatim.
processor = create_summarization_processor(
    trigger=("tokens", 100000),
    keep=("messages", 20),
)

agent = Agent(
    "openai:gpt-4o",
    history_processors=[processor],
)

result = await agent.run("Hello!")
```

Use Cases
Long Conversations
Keep agents running for hours without hitting context limits — older messages get summarized automatically.
Customer Support Bots
Preserve key customer details (name, issue, order ID) while discarding routine back-and-forth exchanges.
Research Assistants
Maintain research context across deep investigation sessions where accumulated findings would exceed the context window.
Cost-Sensitive Apps
Choose zero-cost sliding window for maximum throughput, or LLM summarization when quality matters more than speed.
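Whichever strategy you pick, the trigger tuples reduce to a simple predicate over the running history. The sketch below mirrors the `("tokens", 100000)`-style tuples from the Quick Start; the `should_compress` helper and its counting inputs are illustrative assumptions, not this package's API.

```python
def should_compress(trigger, message_count, token_count, context_window):
    """Decide whether to summarize or trim, given a (kind, threshold) trigger.

    Illustrative only: mirrors the ("tokens", 100000)-style tuples shown
    in the Quick Start; the actual trigger evaluation may differ.
    """
    kind, threshold = trigger
    if kind == "messages":
        return message_count >= threshold
    if kind == "tokens":
        return token_count >= threshold
    if kind == "fraction":
        # Threshold is a fraction of the model's context window.
        return token_count >= threshold * context_window
    raise ValueError(f"unknown trigger kind: {kind}")


# 90k tokens in a 128k-token window exceeds a 0.5 context-fraction trigger.
fired = should_compress(("fraction", 0.5), 40, 90_000, 128_000)
```

Context-fraction triggers are why auto-detecting the model's context window matters: the same `0.5` threshold means 64k tokens on a 128k model but 100k on a 200k model.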