Predictive AI: Give Your Agent a Docker Lab to Run Models
Every AI agent demo shows the same thing: “ask your agent about your data.” The agent queries a database, summarizes results, maybe makes a chart.
But try asking it to run a polynomial regression on 24 months of sales data and forecast the next 6 months. Suddenly your chat-based agent hits a wall. It can’t import sklearn. It can’t execute Python. It can reason about what model to use, but it can’t actually run one.
The obvious fix: give it a Python environment. But here’s the question nobody talks about - should that environment be the agent’s default runtime (like Claude Code), or should it be a tool the agent chooses to use?
We built a demo that answers this question.
TL;DR
- The “environment as a tool” pattern lets your AI agent selectively use a Docker sandbox only when needed, avoiding overhead on simple queries.
- A sub-agent delegation approach keeps the main agent clean - it describes what to predict, and the sub-agent figures out how to write the Python code.
- Structured Pydantic output for charts beats image generation - send data, let the frontend render with Chart.js.
- WebSocket streaming with `agent.iter()` gives real-time visibility into text, tool calls, and results.
- The hardest part is frontend integration - merging chart series with different date ranges, intercepting tool outputs, and streaming over WebSocket.
The Demo: Predictive Analytics Agent
We built a full-stack demo app - a chat-based analytics assistant that can:
- Query data - filter and aggregate monthly sales data (3 products, 3 regions, 24 months)
- Run predictions - a sub-agent writes and executes Python (sklearn, pandas) inside an isolated Docker container
- Generate charts - structured Pydantic output rendered as Chart.js line charts in the browser
The main agent has three tools. Two are simple (query JSON, return chart data). The third is where it gets interesting - it spins up a sub-agent that has full access to a Docker sandbox.
The Architecture: Environment as a Tool
Here’s the key design decision: the Docker sandbox is a tool, not the agent’s default environment.
```python
analytics_agent: Agent[AnalyticsDeps, str] = Agent(
    "openai:gpt-4.1",
    deps_type=AnalyticsDeps,
)
```

The main agent is a regular Pydantic AI agent. It doesn't live inside Docker. It has three tools registered with `@analytics_agent.tool`:
- `query_data` - reads a JSON file, filters records, returns results. No Docker needed.
- `predict` - creates a sub-agent with Docker access and delegates the prediction task.
- `generate_chart` - returns structured `LineChartData` (a Pydantic model) that the frontend renders as a Chart.js chart.
The agent decides when to use Docker. If you ask "show me total sales by product," it calls `query_data` - fast, no overhead. If you ask "predict Widget Alpha sales for the next 6 months," it calls `predict` - which spins up a sub-agent inside Docker.
The Predict Tool: Sub-Agent with a Docker Lab
This is the core pattern. The predict tool doesn’t execute code itself - it delegates to a sub-agent that has full access to a Docker container:
```python
@analytics_agent.tool
async def predict(
    ctx: RunContext[AnalyticsDeps],
    task_description: str,
) -> str:
    """Run a prediction using Python in a Docker sandbox."""
    sandbox = ctx.deps.sandbox

    # Write sales data into the Docker container
    sandbox.write("/workspace/sales_data.json", data_content)

    # Create a sub-agent with Docker tools
    console_toolset = create_console_toolset(
        include_execute=True,
        require_write_approval=False,
        require_execute_approval=False,
    )

    sub_agent: Agent[SandboxDeps, str] = Agent(
        "openai:gpt-4.1",
        system_prompt="You are a data science code executor...",
        deps_type=SandboxDeps,
        toolsets=[console_toolset],
    )

    result = await sub_agent.run(
        f"Perform this prediction task:\n\n{task_description}",
        deps=SandboxDeps(backend=sandbox),
    )
    return result.output
```

What happens step by step:
- Main agent receives: "Predict Widget Alpha sales for the next 6 months"
- Main agent calls `predict(task_description="...")`
- Sales data gets written into the Docker container at `/workspace/sales_data.json`
- A fresh sub-agent is created with `create_console_toolset()` - giving it `ls`, `read`, `write`, `execute`, and other file operations
- The sub-agent writes a Python script using pandas + sklearn
- The sub-agent executes the script inside Docker
- Results flow back to the main agent, which explains them to the user
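For a feel of what the sub-agent produces, here is a sketch of the kind of forecasting script it might write and execute inside the container. This uses numpy's `polyfit` as a stand-in for the sklearn pipeline, and the sales numbers are invented - the real script reads `/workspace/sales_data.json`:

```python
import json
import numpy as np

# Invented monthly sales; the real script loads /workspace/sales_data.json.
sales = [100, 104, 111, 115, 122, 130, 133, 141, 149, 152, 160, 167]

# Fit a degree-2 polynomial trend and extrapolate 6 months ahead.
t = np.arange(len(sales))
coeffs = np.polyfit(t, sales, deg=2)
future = np.arange(len(sales), len(sales) + 6)
forecast = np.polyval(coeffs, future)

print(json.dumps({"forecast": [round(float(v), 1) for v in forecast]}))
```

The script prints JSON to stdout, which the `execute` tool captures and returns to the sub-agent as the tool result.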
The sub-agent has no idea it’s a sub-agent. It just sees a system prompt saying “you’re a data science code executor” and tools to read/write/execute files. The Docker sandbox is completely transparent.
Structured Charts with Pydantic
The third tool - `generate_chart` - demonstrates structured output. Instead of returning raw text, it returns a Pydantic model:
```python
class DataPoint(BaseModel):
    x: str  # e.g. "2024-01"
    y: float

class ChartSeries(BaseModel):
    name: str
    data_points: list[DataPoint]

class LineChartData(BaseModel):
    title: str
    x_label: str
    y_label: str
    series: list[ChartSeries]
```

The `generate_chart` tool takes chart parameters from the LLM and returns a serialized `LineChartData` with a special prefix (`CHART_DATA:`). The server intercepts this prefix in the WebSocket stream and sends it to the frontend as a `chart_data` message:
```python
if result_str.startswith(CHART_DATA_PREFIX):
    chart_json = result_str[len(CHART_DATA_PREFIX):]
    await websocket.send_json(
        {"type": "chart_data", "data": json.loads(chart_json)}
    )
```

The frontend picks it up and renders it with Chart.js. No images, no base64, no matplotlib - just structured data flowing from agent to browser.
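On the tool side, producing that prefixed string is plain serialization. A minimal sketch - the `CHART_DATA:` prefix value and the payload shape come from the article, but the helper name and the sample chart are made up:

```python
import json

CHART_DATA_PREFIX = "CHART_DATA:"

def serialize_chart(chart: dict) -> str:
    # Hypothetical helper: builds the return value that the server-side
    # prefix check can intercept and forward as a chart_data message.
    return CHART_DATA_PREFIX + json.dumps(chart)

chart = {
    "title": "Widget Alpha Sales",
    "x_label": "Month",
    "y_label": "Units",
    "series": [
        {"name": "Historical", "data_points": [{"x": "2024-01", "y": 120.0}]},
    ],
}
print(serialize_chart(chart)[:40])
```

A prefix convention like this is a simple way to smuggle one structured payload through a tool interface whose return type is a string.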
The Environment Question: Tool vs. Default
This is the design question I mentioned at the start. There are two ways to give an agent a code execution environment:
Option A: Environment as a Tool (what we built)
The agent lives outside Docker. It has a predict tool that delegates to a sub-agent inside Docker. The agent decides when to use it.
Option B: Default Environment (like Claude Code)
The agent lives inside Docker. Every command it runs, every file it reads - it's all in the sandbox. The environment is always there.
Here’s when each makes sense:
| | Environment as Tool | Default Environment |
|---|---|---|
| Best for | Domain-specific tasks (predictions, data analysis, code review) | General-purpose coding agents |
| Agent control | Agent decides when to use the sandbox | Agent always runs in the sandbox |
| Overhead | Pays the Docker cost only when needed | Always running |
| Flexibility | Can mix tools freely | Everything goes through the sandbox |
| Complexity | Needs the sub-agent delegation pattern | Simpler - the agent just has tools |
For our predictive analytics demo, Option A is clearly right. The agent mostly answers questions about data (no Docker needed) and only runs Docker when it needs to execute sklearn code. Making Docker the default environment would add unnecessary latency to every interaction.
But for a coding agent like Claude Code, Option B makes sense - the agent’s entire job is reading, writing, and executing code. The environment is the product.
Real Results
Here’s what the demo actually produces. Ask it to “analyze Widget Beta’s seasonal patterns and predict the next 12 months”:
The sub-agent chose Holt-Winters exponential smoothing (appropriate for seasonal data), ran it inside Docker, and returned structured predictions. The main agent then called `generate_chart` with both historical and forecast data as separate series.
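Holt-Winters adds a seasonal component on top of Holt's linear-trend smoothing. The sub-agent used a proper library implementation, but the core recurrence is small enough to sketch. This simplified, non-seasonal version is for intuition only, not the sub-agent's actual script:

```python
def holt_forecast(series: list[float], alpha: float = 0.5,
                  beta: float = 0.3, horizon: int = 6) -> list[float]:
    """Holt's linear-trend smoothing: the non-seasonal core of Holt-Winters."""
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        # Smooth the level toward the new observation, then smooth the trend.
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]

print(holt_forecast([10, 12, 14, 16, 18, 20]))
```

On perfectly linear input like this, the method locks onto the trend and extrapolates it exactly; the seasonal variant repeats the same idea with a third smoothed component per season.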
The entire flow - from user message to rendered chart - happens over a single WebSocket connection with real-time streaming of text, tool calls, and chart data.
The WebSocket Streaming Protocol
The server uses Pydantic AI’s agent.iter() for real-time streaming. Every model token, tool call, and tool result is streamed to the frontend:
```python
async with analytics_agent.iter(
    user_message,
    deps=deps,
    message_history=message_history,
) as run:
    async for node in run:
        if Agent.is_model_request_node(node):
            # Stream text deltas and tool call deltas
            async with node.stream(run.ctx) as stream:
                async for event in stream:
                    if isinstance(event, PartDeltaEvent):
                        if isinstance(event.delta, TextPartDelta):
                            await ws.send_json({
                                "type": "text_delta",
                                "content": event.delta.content_delta,
                            })
        elif Agent.is_call_tools_node(node):
            # Stream tool execution events
            ...
```

The frontend shows tool cards that expand to show arguments and results, text streaming token by token, and charts rendered inline - all over one WebSocket.
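On the receiving end, a client dispatcher for this protocol stays tiny. A sketch handling just the two message types shown above - a real client would also route tool-call and tool-result events:

```python
def handle_message(msg: dict, state: dict) -> None:
    """Route one WebSocket protocol message into client-side state (sketch)."""
    if msg["type"] == "text_delta":
        # Append the streamed token to the assembled assistant reply.
        state["text"] = state.get("text", "") + msg["content"]
    elif msg["type"] == "chart_data":
        # Collect structured chart payloads for the renderer.
        state.setdefault("charts", []).append(msg["data"])
```

Because every event carries a `type` field, adding new event kinds later (tool cards, errors) is just another branch - the protocol stays flat and easy to debug in the browser console.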
One Gotcha: Chart.js Multi-Series with Different Ranges
We hit an interesting bug during development. When charting “Historical” (2024-01 to 2025-12) alongside “Forecast” (2026-01 to 2026-06), Chart.js only used labels from the first series. Forecast points were mapped to historical dates.
The fix: merge all unique x-labels across all series, then use a lookup map per series with null for missing dates:
```javascript
const allLabels = [...new Set(
  chartData.series.flatMap((s) => s.data_points.map((dp) => dp.x))
)].sort();

const datasets = chartData.series.map((s, i) => {
  const lookup = new Map(s.data_points.map((dp) => [dp.x, dp.y]));
  return {
    label: s.name,
    data: allLabels.map((x) => lookup.get(x) ?? null),
    spanGaps: false,
    // ...styling
  };
});
```

Small thing, but it's the kind of bug that makes your forecast look completely wrong while the data is actually correct.
Key Takeaways
- “Environment as a tool” is the right pattern when your agent only sometimes needs code execution. Don’t pay Docker overhead on every interaction.
- Sub-agent delegation keeps the main agent clean. The main agent describes what to predict. The sub-agent figures out how to write the Python code.
- Structured Pydantic output for charts beats generating images. Send data, let the frontend render. Easier to style, interactive, and no base64 blobs.
- WebSocket streaming with `agent.iter()` gives you real-time visibility into what the agent is doing - text, tool calls, and results.
- The hardest part isn't the agent - it's the frontend integration (merging chart series with different date ranges, intercepting tool outputs, streaming over WebSocket).
Try It Yourself
pydantic-ai-backend - Docker sandbox, console toolset, and backend abstractions for Pydantic AI agents
The full demo is in `examples/predictive_analytics/`:
```shell
pip install "pydantic-ai-backend[docker,console]"
export OPENAI_API_KEY=your-key
uvicorn examples.predictive_analytics.server:app --port 8000
```