Learn Agentic AI

Building a Multi-Agent Research Lab: Scientist, Librarian, Analyst, and Writer Agents

Construct a multi-agent research system with four specialized agents — Scientist, Librarian, Analyst, and Writer — that collaborate on source discovery, analysis, and paper generation with complete Python code.

The Research Lab Concept

Research is inherently a multi-stage process: formulating questions, finding sources, analyzing evidence, and synthesizing findings into a coherent document. A single AI agent attempting all four stages produces shallow results because it cannot specialize — it must juggle search queries, citation tracking, statistical reasoning, and academic writing simultaneously.

A multi-agent research lab assigns each stage to a specialized agent. The Scientist formulates hypotheses and directs research. The Librarian discovers and manages sources. The Analyst evaluates evidence and finds patterns. The Writer synthesizes everything into a structured document. Each agent excels at its narrow responsibility, and the handoffs between them enforce quality gates.

Shared Data Structures

from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from enum import Enum
import uuid

@dataclass
class Source:
    source_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    title: str = ""
    url: str = ""
    content_summary: str = ""
    relevance_score: float = 0.0
    source_type: str = ""  # "paper", "article", "dataset", "book"
    metadata: Dict[str, Any] = field(default_factory=dict)

@dataclass
class ResearchQuestion:
    question: str
    sub_questions: List[str] = field(default_factory=list)
    hypothesis: Optional[str] = None
    priority: int = 1

@dataclass
class AnalysisFinding:
    claim: str
    supporting_sources: List[str]  # Source IDs
    confidence: float = 0.0  # 0.0 to 1.0
    evidence_summary: str = ""
    contradicting_sources: List[str] = field(default_factory=list)

@dataclass
class ResearchProject:
    topic: str
    questions: List[ResearchQuestion] = field(default_factory=list)
    sources: List[Source] = field(default_factory=list)
    findings: List[AnalysisFinding] = field(default_factory=list)
    draft: str = ""
    status: str = "initialized"

The Scientist Agent

The Scientist drives the research process. It formulates research questions, evaluates whether enough evidence has been gathered, and decides when the research is complete.

from openai import AsyncOpenAI
import json

client = AsyncOpenAI()

async def scientist_agent(
    topic: str, existing_findings: Optional[List[AnalysisFinding]] = None
) -> List[ResearchQuestion]:
    context = f"Research topic: {topic}\n"
    if existing_findings:
        context += "\nExisting findings:\n"
        for f in existing_findings:
            context += f"- {f.claim} (confidence: {f.confidence})\n"
        context += "\nIdentify gaps and generate follow-up questions.\n"
    else:
        context += "Generate initial research questions and hypotheses.\n"

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a research scientist. Generate structured research "
                    "questions with sub-questions and hypotheses. Return JSON: "
                    "questions (list of objects with question, sub_questions, "
                    "hypothesis, priority)."
                ),
            },
            {"role": "user", "content": context},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    return [ResearchQuestion(**q) for q in data["questions"]]
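One practical hazard here: `ResearchQuestion(**q)` raises a `TypeError` if the model returns any key the dataclass does not declare, which JSON-mode models occasionally do. A defensive constructor can drop unknown keys. Below is a sketch; the `Question` class is a minimal stand-in for `ResearchQuestion` so the example is self-contained:

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List, Optional

def coerce(cls, data: Dict[str, Any]):
    """Instantiate a dataclass, silently dropping keys it does not declare."""
    allowed = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in data.items() if k in allowed})

@dataclass
class Question:  # minimal stand-in for ResearchQuestion
    question: str
    sub_questions: List[str] = field(default_factory=list)
    hypothesis: Optional[str] = None
    priority: int = 1

raw = {"question": "Does X cause Y?", "priority": 2, "rationale": "extra key"}
q = coerce(Question, raw)
print(q.question, q.priority)  # → Does X cause Y? 2
```

The same helper works for `Source` and `AnalysisFinding`, since it inspects the target class's declared fields at call time.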

The Librarian Agent

The Librarian handles source discovery and management. It searches for relevant materials, deduplicates sources, and maintains a citation index.

async def librarian_agent(
    questions: List[ResearchQuestion],
    existing_sources: List[Source],
) -> List[Source]:
    existing_titles = {s.title for s in existing_sources}

    search_prompt = "Find relevant sources for these research questions:\n"
    for q in questions:
        search_prompt += f"- {q.question}\n"
        for sq in q.sub_questions:
            search_prompt += f"  - {sq}\n"

    if existing_sources:
        search_prompt += (
            f"\nAlready have {len(existing_sources)} sources. "
            "Find complementary sources that fill gaps."
        )

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a research librarian. For each research question, "
                    "suggest relevant academic papers, articles, and datasets. "
                    "Return JSON: sources (list of objects with title, url, "
                    "content_summary, relevance_score, source_type)."
                ),
            },
            {"role": "user", "content": search_prompt},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    new_sources = []
    for s in data["sources"]:
        if s["title"] not in existing_titles:
            new_sources.append(Source(**s))
    return new_sources
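Deduplicating on exact titles is brittle: the same paper can come back with different casing or spacing across rounds. A normalized key is more robust. A small sketch, with illustrative titles:

```python
def normalize_title(title: str) -> str:
    """Case- and whitespace-insensitive deduplication key."""
    return " ".join(title.lower().split())

existing = {normalize_title("Attention Is All You Need")}
candidates = [
    "attention is  all you need",  # duplicate, different case and spacing
    "BERT: Pre-training of Deep Bidirectional Transformers",
]
new_titles = [t for t in candidates if normalize_title(t) not in existing]
print(new_titles)  # → ['BERT: Pre-training of Deep Bidirectional Transformers']
```

Swapping `existing_titles` in `librarian_agent` for a set of normalized keys is a one-line change.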

The Analyst Agent

The Analyst evaluates evidence across sources, identifies patterns, and produces structured findings with confidence scores.


async def analyst_agent(
    questions: List[ResearchQuestion],
    sources: List[Source],
) -> List[AnalysisFinding]:
    analysis_prompt = "Analyze these sources against the research questions.\n"
    analysis_prompt += "\nQUESTIONS:\n"
    for q in questions:
        analysis_prompt += f"- {q.question} (hypothesis: {q.hypothesis})\n"
    analysis_prompt += "\nSOURCES:\n"
    for s in sources:
        analysis_prompt += (
            f"- [{s.source_id[:8]}] {s.title}: {s.content_summary}\n"
        )

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a research analyst. Cross-reference sources to "
                    "produce evidence-based findings. For each finding, cite "
                    "supporting source IDs and note any contradictions. Return "
                    "JSON: findings (list of objects with claim, "
                    "supporting_sources, confidence, evidence_summary, "
                    "contradicting_sources)."
                ),
            },
            {"role": "user", "content": analysis_prompt},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    return [AnalysisFinding(**f) for f in data["findings"]]
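The `contradicting_sources` field is recorded but never acted on. One simple heuristic, illustrative rather than part of the pipeline above, is to discount a finding's confidence by the share of contradicting sources:

```python
def adjusted_confidence(confidence: float, n_support: int, n_contra: int) -> float:
    """Discount confidence by the fraction of contradicting sources."""
    total = n_support + n_contra
    if total == 0:
        return 0.0
    return round(confidence * n_support / total, 3)

print(adjusted_confidence(0.9, 3, 1))  # → 0.675
print(adjusted_confidence(0.8, 2, 0))  # → 0.8
```

Applying this after the Analyst runs keeps contradicted claims from sailing through the orchestrator's high-confidence check unchallenged.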

The Writer Agent

The Writer synthesizes findings into a structured research document with proper citations.

async def writer_agent(
    project: ResearchProject,
) -> str:
    write_prompt = f"Topic: {project.topic}\n\n"
    write_prompt += "FINDINGS:\n"
    for f in project.findings:
        write_prompt += (
            f"- {f.claim} (confidence: {f.confidence})\n"
            f"  Evidence: {f.evidence_summary}\n"
        )
    write_prompt += "\nSOURCES:\n"
    for s in project.sources:
        write_prompt += f"- [{s.source_id[:8]}] {s.title} ({s.url})\n"

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are an academic writer. Synthesize the findings into "
                    "a structured research document with sections: Abstract, "
                    "Introduction, Methodology, Findings, Discussion, "
                    "Conclusion, References. Use inline citations [source_id]. "
                    "Write in a clear, evidence-based academic style."
                ),
            },
            {"role": "user", "content": write_prompt},
        ],
    )
    return response.choices[0].message.content
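Model-written citations like `[ab12cd34]` are hard to read in a finished document. A post-processing pass can renumber them against the source list. A sketch, with made-up IDs:

```python
import re

def number_citations(draft: str, source_ids: list) -> str:
    """Replace [8-char source id] citations with numbered references like [1]."""
    index = {sid[:8]: i + 1 for i, sid in enumerate(source_ids)}

    def repl(match):
        key = match.group(1)
        return f"[{index[key]}]" if key in index else match.group(0)

    return re.sub(r"\[([0-9a-f]{8})\]", repl, draft)

ids = ["deadbeefcafe", "0123456789ab"]
print(number_citations("Shown in [deadbeef] and [01234567].", ids))
# → Shown in [1] and [2].
```

Unknown IDs are left untouched so a hallucinated citation stays visible for review instead of being silently renumbered.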

The Research Orchestrator

The orchestrator runs the full research loop, allowing the Scientist to request additional rounds of source gathering and analysis.

async def run_research_lab(
    topic: str, max_rounds: int = 3
) -> ResearchProject:
    project = ResearchProject(topic=topic)

    for round_num in range(1, max_rounds + 1):
        print(f"\n--- Research Round {round_num} ---")

        # Scientist formulates questions
        questions = await scientist_agent(topic, project.findings or None)
        project.questions.extend(questions)

        # Librarian finds sources
        new_sources = await librarian_agent(questions, project.sources)
        project.sources.extend(new_sources)
        print(f"Found {len(new_sources)} new sources")

        # Analyst evaluates evidence
        findings = await analyst_agent(questions, project.sources)
        project.findings.extend(findings)

        # Check if we have sufficient high-confidence findings
        high_confidence = [
            f for f in project.findings if f.confidence >= 0.7
        ]
        if len(high_confidence) >= 5:
            print("Sufficient evidence gathered")
            break

    # Writer produces the final document
    project.draft = await writer_agent(project)
    project.status = "completed"
    return project

FAQ

How do I integrate real source retrieval instead of LLM-generated sources?

Replace the Librarian agent's LLM call with actual API calls to Google Scholar (via SerpAPI), Semantic Scholar, arXiv, or PubMed. Feed the retrieved abstracts and metadata into the Source dataclass. The Analyst then works with real evidence instead of synthesized summaries. You can also combine both: use the LLM to generate search queries, execute them against real APIs, then let the LLM rank and summarize the results.
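As a concrete starting point, arXiv exposes a free Atom API at `http://export.arxiv.org/api/query`. A query builder might look like the sketch below; the actual fetch and Atom parsing are left as comments since they need network access:

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def build_arxiv_query(terms, max_results=10):
    """Build an arXiv API query URL that ANDs the given search terms."""
    search = " AND ".join(f'all:"{t}"' for t in terms)
    params = {"search_query": search, "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urlencode(params)}"

url = build_arxiv_query(["multi-agent systems", "coordination"], max_results=5)
print(url)
# Fetch with urllib.request.urlopen(url) and parse the Atom feed with
# xml.etree.ElementTree to populate Source records from real metadata.
```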

How does the Scientist decide when research is "done"?

The Scientist evaluates two criteria: coverage (do the findings address all research questions?) and confidence (are the confidence scores above a threshold?). In the orchestrator above, the loop stops once there are at least five findings with confidence of 0.7 or higher. In production, you would also check that each research question has at least one finding addressing it.
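A coverage check is easy to express if each finding records which question it addresses. The `addresses` key below is an assumption; the `AnalysisFinding` dataclass above would need an extra field to carry it:

```python
def coverage_gaps(questions, findings):
    """Return research questions that no finding addresses."""
    addressed = {q for f in findings for q in f.get("addresses", [])}
    return [q for q in questions if q not in addressed]

qs = ["Does X cause Y?", "Is Z a confounder?"]
fs = [{"claim": "X correlates with Y", "addresses": ["Does X cause Y?"]}]
print(coverage_gaps(qs, fs))  # → ['Is Z a confounder?']
```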

Can I add a Peer Reviewer agent to improve quality?

Absolutely — add a Peer Reviewer between the Analyst and Writer stages. The Peer Reviewer checks findings for logical consistency, flags unsupported claims, and verifies that citations actually support the claims made. If the review fails, loop back to the Scientist with the reviewer's feedback to trigger another research round targeting the weaknesses identified.
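Before spending an LLM call on peer review, a cheap structural pass can catch the obvious problems. A sketch, with illustrative rules and threshold:

```python
def structural_review(findings, min_confidence=0.5):
    """Flag structurally weak findings before LLM-based peer review."""
    issues = []
    for f in findings:
        if not f["supporting_sources"]:
            issues.append(f"Unsupported claim: {f['claim']}")
        elif f["confidence"] < min_confidence:
            issues.append(f"Low confidence ({f['confidence']}): {f['claim']}")
    return issues

findings = [
    {"claim": "A", "supporting_sources": ["s1"], "confidence": 0.9},
    {"claim": "B", "supporting_sources": [], "confidence": 0.8},
    {"claim": "C", "supporting_sources": ["s2"], "confidence": 0.3},
]
print(structural_review(findings))
# → ['Unsupported claim: B', 'Low confidence (0.3): C']
```

Only findings that pass this filter need to reach the LLM reviewer, which keeps the review round cheap and focused.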


#ResearchAgents #MultiAgentLab #KnowledgeManagement #AIPaperGeneration #ResearchAutomation #AgenticAI #PythonAI #AIResearch
