Learn Agentic AI

Building a Multi-Agent Research Lab: Scientist, Librarian, Analyst, and Writer Agents

Construct a multi-agent research system with four specialized agents — Scientist, Librarian, Analyst, and Writer — that collaborate on source discovery, analysis, and paper generation with complete Python code.

The Research Lab Concept

Research is inherently a multi-stage process: formulating questions, finding sources, analyzing evidence, and synthesizing findings into a coherent document. A single AI agent attempting all four stages produces shallow results because it cannot specialize — it must juggle search queries, citation tracking, statistical reasoning, and academic writing simultaneously.

A multi-agent research lab assigns each stage to a specialized agent. The Scientist formulates hypotheses and directs research. The Librarian discovers and manages sources. The Analyst evaluates evidence and finds patterns. The Writer synthesizes everything into a structured document. Each agent excels at its narrow responsibility, and the handoffs between them enforce quality gates.

Shared Data Structures

from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from enum import Enum
import uuid

@dataclass
class Source:
    source_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    title: str = ""
    url: str = ""
    content_summary: str = ""
    relevance_score: float = 0.0
    source_type: str = ""  # "paper", "article", "dataset", "book"
    metadata: Dict[str, Any] = field(default_factory=dict)

@dataclass
class ResearchQuestion:
    question: str
    sub_questions: List[str] = field(default_factory=list)
    hypothesis: Optional[str] = None
    priority: int = 1

@dataclass
class AnalysisFinding:
    claim: str
    supporting_sources: List[str]  # Source IDs
    confidence: float = 0.0  # 0.0 to 1.0
    evidence_summary: str = ""
    contradicting_sources: List[str] = field(default_factory=list)

@dataclass
class ResearchProject:
    topic: str
    questions: List[ResearchQuestion] = field(default_factory=list)
    sources: List[Source] = field(default_factory=list)
    findings: List[AnalysisFinding] = field(default_factory=list)
    draft: str = ""
    status: str = "initialized"

The Scientist Agent

The Scientist drives the research process. It formulates research questions, evaluates whether enough evidence has been gathered, and decides when the research is complete.

from openai import AsyncOpenAI
import json

client = AsyncOpenAI()

async def scientist_agent(
    topic: str, existing_findings: Optional[List[AnalysisFinding]] = None
) -> List[ResearchQuestion]:
    context = f"Research topic: {topic}\n"
    if existing_findings:
        context += "\nExisting findings:\n"
        for f in existing_findings:
            context += f"- {f.claim} (confidence: {f.confidence})\n"
        context += "\nIdentify gaps and generate follow-up questions.\n"
    else:
        context += "Generate initial research questions and hypotheses.\n"

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a research scientist. Generate structured research "
                    "questions with sub-questions and hypotheses. Return JSON: "
                    "questions (list of objects with question, sub_questions, "
                    "hypothesis, priority)."
                ),
            },
            {"role": "user", "content": context},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    return [ResearchQuestion(**q) for q in data["questions"]]
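One practical hazard here: `ResearchQuestion(**q)` raises a `TypeError` if the model returns any key the dataclass does not declare, which JSON-mode models occasionally do. A defensive constructor can drop unknown keys. Below is a sketch; the `Question` class is a minimal stand-in for `ResearchQuestion` so the example is self-contained:

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List, Optional

def coerce(cls, data: Dict[str, Any]):
    """Instantiate a dataclass, silently dropping keys it does not declare."""
    allowed = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in data.items() if k in allowed})

@dataclass
class Question:  # minimal stand-in for ResearchQuestion
    question: str
    sub_questions: List[str] = field(default_factory=list)
    hypothesis: Optional[str] = None
    priority: int = 1

raw = {"question": "Does X cause Y?", "priority": 2, "rationale": "extra key"}
q = coerce(Question, raw)
print(q.question, q.priority)  # → Does X cause Y? 2
```

The same helper works for `Source` and `AnalysisFinding`, since it inspects the target class's declared fields at call time.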

The Librarian Agent

The Librarian handles source discovery and management. It searches for relevant materials, deduplicates sources, and maintains a citation index.

async def librarian_agent(
    questions: List[ResearchQuestion],
    existing_sources: List[Source],
) -> List[Source]:
    existing_titles = {s.title for s in existing_sources}

    search_prompt = "Find relevant sources for these research questions:\n"
    for q in questions:
        search_prompt += f"- {q.question}\n"
        for sq in q.sub_questions:
            search_prompt += f"  - {sq}\n"

    if existing_sources:
        search_prompt += (
            f"\nAlready have {len(existing_sources)} sources. "
            "Find complementary sources that fill gaps."
        )

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a research librarian. For each research question, "
                    "suggest relevant academic papers, articles, and datasets. "
                    "Return JSON: sources (list of objects with title, url, "
                    "content_summary, relevance_score, source_type)."
                ),
            },
            {"role": "user", "content": search_prompt},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    new_sources = []
    for s in data["sources"]:
        if s["title"] not in existing_titles:
            new_sources.append(Source(**s))
    return new_sources
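Deduplicating on exact titles is brittle: the same paper can come back with different casing or spacing across rounds. A normalized key is more robust. A small sketch, with illustrative titles:

```python
def normalize_title(title: str) -> str:
    """Case- and whitespace-insensitive deduplication key."""
    return " ".join(title.lower().split())

existing = {normalize_title("Attention Is All You Need")}
candidates = [
    "attention is  all you need",  # duplicate, different case and spacing
    "BERT: Pre-training of Deep Bidirectional Transformers",
]
new_titles = [t for t in candidates if normalize_title(t) not in existing]
print(new_titles)  # → ['BERT: Pre-training of Deep Bidirectional Transformers']
```

Swapping `existing_titles` in `librarian_agent` for a set of normalized keys is a one-line change.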

The Analyst Agent

The Analyst evaluates evidence across sources, identifies patterns, and produces structured findings with confidence scores.


async def analyst_agent(
    questions: List[ResearchQuestion],
    sources: List[Source],
) -> List[AnalysisFinding]:
    analysis_prompt = "Analyze these sources against the research questions.\n"
    analysis_prompt += "\nQUESTIONS:\n"
    for q in questions:
        analysis_prompt += f"- {q.question} (hypothesis: {q.hypothesis})\n"
    analysis_prompt += "\nSOURCES:\n"
    for s in sources:
        analysis_prompt += (
            f"- [{s.source_id[:8]}] {s.title}: {s.content_summary}\n"
        )

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a research analyst. Cross-reference sources to "
                    "produce evidence-based findings. For each finding, cite "
                    "supporting source IDs and note any contradictions. Return "
                    "JSON: findings (list of objects with claim, "
                    "supporting_sources, confidence, evidence_summary, "
                    "contradicting_sources)."
                ),
            },
            {"role": "user", "content": analysis_prompt},
        ],
        response_format={"type": "json_object"},
    )
    data = json.loads(response.choices[0].message.content)
    return [AnalysisFinding(**f) for f in data["findings"]]
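The `contradicting_sources` field is recorded but never acted on. One simple heuristic, illustrative rather than part of the pipeline above, is to discount a finding's confidence by the share of contradicting sources:

```python
def adjusted_confidence(confidence: float, n_support: int, n_contra: int) -> float:
    """Discount confidence by the fraction of contradicting sources."""
    total = n_support + n_contra
    if total == 0:
        return 0.0
    return round(confidence * n_support / total, 3)

print(adjusted_confidence(0.9, 3, 1))  # → 0.675
print(adjusted_confidence(0.8, 2, 0))  # → 0.8
```

Applying this after the Analyst runs keeps contradicted claims from sailing through the orchestrator's high-confidence check unchallenged.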

The Writer Agent

The Writer synthesizes findings into a structured research document with proper citations.

async def writer_agent(
    project: ResearchProject,
) -> str:
    write_prompt = f"Topic: {project.topic}\n\n"
    write_prompt += "FINDINGS:\n"
    for f in project.findings:
        write_prompt += (
            f"- {f.claim} (confidence: {f.confidence})\n"
            f"  Evidence: {f.evidence_summary}\n"
        )
    write_prompt += "\nSOURCES:\n"
    for s in project.sources:
        write_prompt += f"- [{s.source_id[:8]}] {s.title} ({s.url})\n"

    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are an academic writer. Synthesize the findings into "
                    "a structured research document with sections: Abstract, "
                    "Introduction, Methodology, Findings, Discussion, "
                    "Conclusion, References. Use inline citations [source_id]. "
                    "Write in a clear, evidence-based academic style."
                ),
            },
            {"role": "user", "content": write_prompt},
        ],
    )
    return response.choices[0].message.content
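Model-written citations like `[ab12cd34]` are hard to read in a finished document. A post-processing pass can renumber them against the source list. A sketch, with made-up IDs:

```python
import re

def number_citations(draft: str, source_ids: list) -> str:
    """Replace [8-char source id] citations with numbered references like [1]."""
    index = {sid[:8]: i + 1 for i, sid in enumerate(source_ids)}

    def repl(match):
        key = match.group(1)
        return f"[{index[key]}]" if key in index else match.group(0)

    return re.sub(r"\[([0-9a-f]{8})\]", repl, draft)

ids = ["deadbeefcafe", "0123456789ab"]
print(number_citations("Shown in [deadbeef] and [01234567].", ids))
# → Shown in [1] and [2].
```

Unknown IDs are left untouched so a hallucinated citation stays visible for review instead of being silently renumbered.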

The Research Orchestrator

The orchestrator runs the full research loop, allowing the Scientist to request additional rounds of source gathering and analysis.

async def run_research_lab(
    topic: str, max_rounds: int = 3
) -> ResearchProject:
    project = ResearchProject(topic=topic)

    for round_num in range(1, max_rounds + 1):
        print(f"\n--- Research Round {round_num} ---")

        # Scientist formulates questions
        questions = await scientist_agent(topic, project.findings or None)
        project.questions.extend(questions)

        # Librarian finds sources
        new_sources = await librarian_agent(questions, project.sources)
        project.sources.extend(new_sources)
        print(f"Found {len(new_sources)} new sources")

        # Analyst evaluates evidence
        findings = await analyst_agent(questions, project.sources)
        project.findings.extend(findings)

        # Check if we have sufficient high-confidence findings
        high_confidence = [
            f for f in project.findings if f.confidence >= 0.7
        ]
        if len(high_confidence) >= 5:
            print("Sufficient evidence gathered")
            break

    # Writer produces the final document
    project.draft = await writer_agent(project)
    project.status = "completed"
    return project

FAQ

How do I integrate real source retrieval instead of LLM-generated sources?

Replace the Librarian agent's LLM call with actual API calls to Google Scholar (via SerpAPI), Semantic Scholar, arXiv, or PubMed. Feed the retrieved abstracts and metadata into the Source dataclass. The Analyst then works with real evidence instead of synthesized summaries. You can also combine both: use the LLM to generate search queries, execute them against real APIs, then let the LLM rank and summarize the results.
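As a concrete starting point, arXiv exposes a free Atom API at `http://export.arxiv.org/api/query`. A query builder might look like the sketch below; the actual fetch and Atom parsing are left as comments since they need network access:

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def build_arxiv_query(terms, max_results=10):
    """Build an arXiv API query URL that ANDs the given search terms."""
    search = " AND ".join(f'all:"{t}"' for t in terms)
    params = {"search_query": search, "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urlencode(params)}"

url = build_arxiv_query(["multi-agent systems", "coordination"], max_results=5)
print(url)
# Fetch with urllib.request.urlopen(url) and parse the Atom feed with
# xml.etree.ElementTree to populate Source records from real metadata.
```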

How does the Scientist decide when research is "done"?

The Scientist evaluates two criteria: coverage (do the findings address all research questions?) and confidence (are the confidence scores above a threshold?). In the orchestrator above, the loop stops once there are at least five findings with confidence of 0.7 or higher. In production, you would also check that each research question has at least one finding addressing it.
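A coverage check is easy to express if each finding records which question it addresses. The `addresses` key below is an assumption; the `AnalysisFinding` dataclass above would need an extra field to carry it:

```python
def coverage_gaps(questions, findings):
    """Return research questions that no finding addresses."""
    addressed = {q for f in findings for q in f.get("addresses", [])}
    return [q for q in questions if q not in addressed]

qs = ["Does X cause Y?", "Is Z a confounder?"]
fs = [{"claim": "X correlates with Y", "addresses": ["Does X cause Y?"]}]
print(coverage_gaps(qs, fs))  # → ['Is Z a confounder?']
```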

Can I add a Peer Reviewer agent to improve quality?

Absolutely — add a Peer Reviewer between the Analyst and Writer stages. The Peer Reviewer checks findings for logical consistency, flags unsupported claims, and verifies that citations actually support the claims made. If the review fails, loop back to the Scientist with the reviewer's feedback to trigger another research round targeting the weaknesses identified.
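Before spending an LLM call on peer review, a cheap structural pass can catch the obvious problems. A sketch, with illustrative rules and threshold:

```python
def structural_review(findings, min_confidence=0.5):
    """Flag structurally weak findings before LLM-based peer review."""
    issues = []
    for f in findings:
        if not f["supporting_sources"]:
            issues.append(f"Unsupported claim: {f['claim']}")
        elif f["confidence"] < min_confidence:
            issues.append(f"Low confidence ({f['confidence']}): {f['claim']}")
    return issues

findings = [
    {"claim": "A", "supporting_sources": ["s1"], "confidence": 0.9},
    {"claim": "B", "supporting_sources": [], "confidence": 0.8},
    {"claim": "C", "supporting_sources": ["s2"], "confidence": 0.3},
]
print(structural_review(findings))
# → ['Unsupported claim: B', 'Low confidence (0.3): C']
```

Only findings that pass this filter need to reach the LLM reviewer, which keeps the review round cheap and focused.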


#ResearchAgents #MultiAgentLab #KnowledgeManagement #AIPaperGeneration #ResearchAutomation #AgenticAI #PythonAI #AIResearch
