Web Scraping Agent with Deep Agents
Build a web scraping agent with Deep Agents that fetches pages, extracts structured data, and handles pagination.
Tags: web scraping, data extraction, HTTP, parsing
Working Code
from deepagents import create_deep_agent
from langchain_core.tools import tool


@tool
def fetch_url(url: str) -> str:
    """Fetch a webpage and return its content as markdown."""
    import httpx
    from markdownify import markdownify

    response = httpx.get(
        url,
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=15,
        follow_redirects=True,  # httpx does not follow redirects by default
    )
    # Truncate to 5,000 characters to keep the tool output small
    return markdownify(response.text)[:5000]


@tool
def extract_data(text: str, instruction: str) -> str:
    """Extract structured data from text based on instruction."""
    # Stub: only echoes the request. The agent's model does the parsing, so no regex is needed.
    return f"Extracting from {len(text)} chars: {instruction}"


agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-5-20250929",
    tools=[fetch_url, extract_data],
    system_prompt=(
        "You are a web scraping agent. Fetch pages, extract the requested data, "
        "and return it in structured format. Respect robots.txt."
    ),
)

# Include the URL scheme: httpx rejects URLs without one
result = agent.invoke({
    "messages": [("user", "Scrape the pricing page at https://example.com/pricing and extract all plan names and prices")]
})
print(result["messages"][-1].content)

Step by Step
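1. Install dependencies
Install Deep Agents along with the libraries the tools import: langchain-core for the @tool decorator, httpx for HTTP requests, and markdownify for HTML-to-Markdown conversion.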
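As a quick way to confirm the environment is ready, the following sketch checks whether each module imported by the working code can be found and prints a pip command for whatever is missing. The package names are assumptions taken from the import names (with langchain_core published as langchain-core), and missing_packages is a helper introduced here for illustration.

```python
import importlib.util

# Import name -> pip package name. Assumed to match the imports in the
# working code; adjust if your environment uses different distributions.
REQUIRED = {
    "deepagents": "deepagents",
    "langchain_core": "langchain-core",
    "httpx": "httpx",
    "markdownify": "markdownify",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose top-level module is not importable."""
    return [
        package
        for module, package in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All dependencies are available.")
```

Running the script before the agent code gives a clearer message than an ImportError raised midway through a tool call.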
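2. Define your tools
Write the functions the agent can call and decorate each one with @tool; the docstring becomes the description the model sees. Here, fetch_url downloads a page and converts it to Markdown, and extract_data takes text plus an extraction instruction.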
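The description mentions pagination, which the two tools in the working code do not implement. One possible approach, sketched here with only the standard library, is a helper that looks for a next-page link in the raw HTML (before it is converted to Markdown). The names find_next_page and _NextLinkParser, and the list of link labels, are assumptions made for this example and are not part of Deep Agents.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin

# Link texts treated as "next page" (an illustrative, non-exhaustive list).
NEXT_LABELS = {"next", "next page", "next »", "»", "›"}


class _NextLinkParser(HTMLParser):
    """Record the first link marked rel="next" or labelled like a next-page link."""

    def __init__(self):
        super().__init__()
        self.next_href = None   # href of the detected next-page link
        self._open_href = None  # href of the <a> element currently being read
        self._text = []         # text collected inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link") or self.next_href:
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        if "next" in (attr.get("rel") or "").lower().split():
            self.next_href = href
        elif tag == "a":
            self._open_href, self._text = href, []

    def handle_data(self, data):
        if self._open_href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._open_href is not None:
            label = "".join(self._text).strip().lower()
            if self.next_href is None and label in NEXT_LABELS:
                self.next_href = self._open_href
            self._open_href = None


def find_next_page(html: str, base_url: str) -> Optional[str]:
    """Return the absolute URL of the next page, or None if no link is found."""
    parser = _NextLinkParser()
    parser.feed(html)
    if parser.next_href is None:
        return None
    return urljoin(base_url, parser.next_href)
```

A tool wrapping this helper could return the next URL alongside the page content, leaving the model to decide whether to keep fetching. Sites that paginate with JavaScript or infinite scroll would need a different strategy.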
3. Create the agent and run
Pass the model, tools, and system prompt to create_deep_agent, then call agent.invoke with a user message. The agent's final reply is the last entry in result["messages"].
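Because the system prompt asks for structured output, downstream code will usually want to parse the final reply. The sketch below assumes the reply is a string containing JSON, possibly wrapped in a Markdown code fence. Neither assumption is guaranteed by Deep Agents, and parse_json_reply is a hypothetical helper, not a library function.

```python
import json
import re
from typing import Any

# Matches a Markdown code fence (three backticks), optionally labelled "json".
_FENCE = re.compile(r"`{3}(?:json)?\s*(.*?)\s*`{3}", re.DOTALL)


def parse_json_reply(text: str) -> Any:
    """Parse JSON from a model reply, unwrapping a code fence if one is present.

    Raises json.JSONDecodeError when the payload is not valid JSON.
    """
    match = _FENCE.search(text)
    payload = match.group(1) if match else text
    return json.loads(payload.strip())


if __name__ == "__main__":
    fence = "`" * 3
    # Made-up reply for illustration; real output depends on the model and page.
    reply = (
        "Here are the plans:\n"
        + fence + "json\n"
        + '[{"plan": "Example", "price": "0"}]\n'
        + fence
    )
    print(parse_json_reply(reply))
```

In practice this would be called as parse_json_reply(result["messages"][-1].content), with a fallback for replies that are not valid JSON. Asking for JSON explicitly in the user message makes that case less likely.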