Multi-Source Research Agent
Autonomous AI research agent that searches the live web, scrapes full article content, and synthesises a downloadable PDF report — built on a 4-node LangGraph workflow with Gemini & Groq backends, deployed live on Hugging Face Spaces.
4 LangGraph Nodes · 2 AI Backends · Live Web Data Source · PDF Report Export
The Problem
Manual research is slow — opening dozens of browser tabs, skimming for relevance, copy-pasting fragments, then synthesising it all into a coherent report. Generic LLMs like ChatGPT help with the writing but only know what was in their training data — they never see live sources. Teams needed an agent that searches the real web, reads full article content (not just snippets), and produces a structured, citable report on demand.
The Solution
Built a LangGraph agent with a 4-node workflow: (1) the LLM decomposes the topic into 4 targeted research questions; (2) the Tavily API searches the web and returns 2 URLs per question; (3) BeautifulSoup scrapes the full article content from each URL; (4) a router decides whether enough material has been gathered or whether to loop back for more depth, after which a final LLM node synthesises a structured report. Dual-backend support — Google Gemini 2.5 Flash or Groq Llama 3.3 — selectable at runtime. ReportLab renders the synthesis as a professional PDF for download. Packaged as a Streamlit app and deployed on Hugging Face Spaces via Docker — always active, never sleeping.
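The 4-node loop above can be sketched as plain Python control flow. This is a minimal, library-free sketch with stubbed node functions — in the real app these are wired as LangGraph nodes with actual LLM, Tavily, and BeautifulSoup calls; all function names, thresholds, and URLs here are illustrative assumptions.

```python
# Library-free sketch of the 4-node research loop (all nodes stubbed).

def generate_questions(state):
    # Node 1: the LLM decomposes the topic into 4 targeted questions (stubbed).
    state["questions"] = [f"{state['topic']} — aspect {i}" for i in range(1, 5)]
    return state

def search_web(state):
    # Node 2: Tavily returns 2 URLs per question (stubbed placeholder URLs).
    state["urls"] = [f"https://example.com/q{i}/{n}"
                     for i, _ in enumerate(state["questions"]) for n in (1, 2)]
    return state

def scrape_articles(state):
    # Node 3: BeautifulSoup pulls full article text from each URL (stubbed).
    state["articles"] += [f"content of {u}" for u in state["urls"]]
    return state

def router(state):
    # Node 4a: loop back for more depth until enough material is gathered.
    # The >= 8 threshold and 2-loop cap are illustrative, not the app's values.
    if len(state["articles"]) >= 8 or state["loops"] >= 2:
        return "synthesise"
    return "search"

def synthesise(state):
    # Node 4b: a final LLM call turns the scraped corpus into a report (stubbed).
    state["report"] = f"Report on {state['topic']} from {len(state['articles'])} sources"
    return state

def run(topic):
    state = {"topic": topic, "articles": [], "loops": 0}
    state = generate_questions(state)
    while True:
        state = search_web(state)
        state = scrape_articles(state)
        state["loops"] += 1
        if router(state) == "synthesise":
            return synthesise(state)

result = run("quantum batteries")
print(result["report"])  # grounded in 8 scraped "articles" after one pass
```

In LangGraph itself, `router` would be registered via `add_conditional_edges` so the graph, not a `while` loop, decides whether to revisit the search node.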
Results & Metrics
- 4-node LangGraph workflow — question generation → web search → scraping → synthesis
- Answers grounded in live web sources via Tavily — not stale model training data
- Dual-backend support: Google Gemini 2.5 Flash and Groq Llama 3.3, switchable at runtime
- Smart router loops for more depth when initial results are insufficient
- Professional PDF reports generated via ReportLab — ready to share
- Deployed live on Hugging Face Spaces with Docker — always-on, zero cold starts