Web Scraping Agent with CrewAI

Build a web scraping agent with CrewAI that fetches pages and extracts structured data such as plan names and prices. The example below handles a single page; pagination can be added as another tool.

Tags: web scraping, data extraction, HTTP, parsing

Working Code

import httpx
from crewai import Agent, Crew, Task
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from markdownify import markdownify


@tool
def fetch_url(url: str) -> str:
    """Fetch a webpage and return its content as markdown."""
    response = httpx.get(
        url,
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=15,
        follow_redirects=True,  # httpx does not follow redirects by default
    )
    response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
    # Truncate to 5,000 characters to keep the prompt small
    return markdownify(response.text)[:5000]


@tool
def extract_data(text: str, instruction: str) -> str:
    """Extract structured data from text based on instruction."""
    # Placeholder: this only echoes the request. The LLM itself does the
    # extraction from the fetched text, so no regex parsing happens here.
    return f"Extracting from {len(text)} chars: {instruction}"


agent = Agent(
    role="Web Scraping Specialist",
    goal="Fetch pages, extract the requested data, and return it in a structured format. Respect robots.txt.",
    # Agent requires a backstory in addition to role and goal
    backstory="You are a web scraping agent that reads pages carefully and reports only data found on them.",
    tools=[fetch_url, extract_data],
    llm=ChatOpenAI(model="gpt-4o"),  # reads OPENAI_API_KEY from the environment
)

task = Task(
    description="Scrape the pricing page at https://example.com/pricing and extract all plan names and prices",
    expected_output="A list of plans, each with its name and price",
    agent=agent,
)

crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
print(result.raw)
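
The example above fetches a single page. One way to add pagination is to look for a link marked rel="next" in the raw HTML and follow it. The sketch below uses only the standard library; NextLinkParser and find_next_page are hypothetical names, and it assumes the site marks its next-page link with rel="next", which many sites do not.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class NextLinkParser(HTMLParser):
    """Records the href of the first <a> or <link> tag marked rel="next"."""

    def __init__(self) -> None:
        super().__init__()
        self.next_href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Stop looking once a match is found; ignore unrelated tags
        if self.next_href is not None or tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "next" in rel_values and attr_map.get("href"):
            self.next_href = attr_map["href"]


def find_next_page(html: str, base_url: str) -> Optional[str]:
    """Return the absolute URL of the next page, or None if there is none."""
    parser = NextLinkParser()
    parser.feed(html)
    if parser.next_href is None:
        return None
    # Resolve relative links such as "/pricing?page=2" against the current URL
    return urljoin(base_url, parser.next_href)
```

This needs the raw HTML (response.text), not the markdown that fetch_url returns, because the rel attribute is lost in conversion. A loop that follows these links should also cap the number of pages it visits.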

Step by Step

Install dependencies

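Install CrewAI and the packages the example imports: crewai, langchain-openai, langchain-core, httpx, and markdownify. Set the OPENAI_API_KEY environment variable so ChatOpenAI can authenticate.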
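
Before running the agent, it can help to confirm the environment is ready. The following is a minimal sketch, not part of CrewAI: missing_modules and check_environment are hypothetical helpers, and the module list is taken from the imports in the working code.

```python
import importlib.util
import os
from typing import List

# Top-level module names imported by the example
REQUIRED_MODULES = ["crewai", "langchain_openai", "langchain_core", "httpx", "markdownify"]


def missing_modules(names: List[str]) -> List[str]:
    """Return the module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment() -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = [f"module not installed: {name}" for name in missing_modules(REQUIRED_MODULES)]
    if not os.environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

Calling check_environment() and printing each entry gives a quick report of what still needs to be installed or configured.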

Define your tools

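Define the functions the agent can call and decorate each with @tool. Here, fetch_url downloads a page and converts it to markdown, and extract_data receives the text along with an extraction instruction. Each function's docstring becomes the tool description the agent reads when deciding which tool to use.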
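
The agent's goal asks it to respect robots.txt, but nothing in the tools enforces that. A check could be added along these lines using the standard library's urllib.robotparser. This is a sketch under assumptions: robots_url_for and is_allowed are hypothetical helpers, and the robots.txt content is expected to be downloaded separately (for example with httpx) and passed in as text.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url: str) -> str:
    """Build the robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def is_allowed(robots_txt: str, page_url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules permit fetching page_url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access happens here
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)
```

A tool such as fetch_url could call is_allowed first and return a short refusal message instead of the page when the result is False.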

Create the agent and run

Create the Agent with its role, goal, tools, and LLM, describe the work in a Task, then group both in a Crew and call kickoff(). The agent's final answer is available as result.raw.
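
result.raw is plain text, so turning it into structured data is left to the caller. A minimal sketch follows, assuming the task's expected_output is changed to request a JSON array of objects with "name" and "price" keys. Plan and parse_plans are hypothetical names, and a model may still return text that does not parse.

```python
import json
from dataclasses import dataclass
from typing import List

FENCE = "`" * 3  # markdown code fence that models often wrap JSON in


@dataclass
class Plan:
    name: str
    price: str


def parse_plans(raw: str) -> List[Plan]:
    """Parse a JSON array of {"name", "price"} objects into Plan records."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line and, if present, the closing fence
        lines = text.splitlines()[1:]
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
        text = "\n".join(lines)
    data = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    return [Plan(name=str(item["name"]), price=str(item["price"])) for item in data]
```

With this in place, parse_plans(result.raw) would return a list of Plan objects, and a JSONDecodeError or KeyError signals that the model's answer did not match the requested shape.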
