199-biotech / claude-deep-research-skill
github.com/199-biotechnologies
⭐ 509
Claude Skill
MIT
8-phase pipeline: scope → plan → retrieve → triangulate → outline → synthesize → critique → refine → package. Disk-persisted citations survive context compaction.
[4]
Depth mechanism: 8 phases · critique loop-back · auto-continue past 18k words
Citation rigor: Disk-persisted · DOI/URL hallucination check
VERY HIGH
Same substrate as Scout — directly portable.
Steal: Disk-persisted citations + multi-persona critique + validate→fix→retry (max 3 cycles).
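The two stealable mechanics compose naturally. A minimal sketch (function names are mine, not the skill's actual code): citations get flushed to disk so they survive context compaction, and the draft goes through a validate→fix→retry loop capped at three cycles.

```python
import json

def persist_citations(citations: list, path: str) -> None:
    # Flush citations to disk so they survive context compaction:
    # the agent reloads this file instead of trusting its window.
    with open(path, "w") as f:
        json.dump(citations, f)

def validate_fix_retry(draft, check, fix, max_cycles: int = 3):
    # validate -> fix -> retry, capped at 3 cycles as in the card.
    for _ in range(max_cycles):
        problems = check(draft)
        if not problems:
            return draft, True
        draft = fix(draft, problems)
    return draft, not check(draft)
```

The cap matters: an unbounded fix loop on a hallucinated DOI burns tokens without converging.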
GPT-Researcher
github.com/assafelovic
⭐ 26.6k
OSS
Apache-2.0
The closest conceptual sibling to Scout. Three-role split: planner / executors / publisher. 2026 additions: tree-shaped Deep Research mode, ~5 min runs at ~$0.40 on o3-mini.
[2]
Depth mechanism: Planner + executor + publisher · tree DR mode · ~5 min
Citation rigor: Inline · 20+ sources per run
HIGH
Closest analogue — production-ready, since May 2023.
Steal: Three-role separation. The final writer sees only the evidence bundle, not the noisy search trajectory.
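That separation is easy to enforce structurally. A sketch under the assumption of a simple run-state object (names are mine, not GPT-Researcher's): the publisher's input is built from curated evidence only, so the trajectory physically cannot leak into the final report.

```python
from dataclasses import dataclass, field

@dataclass
class RunState:
    trajectory: list = field(default_factory=list)  # raw tool calls, dead ends
    evidence: list = field(default_factory=list)    # curated findings + sources

def publisher_input(state: RunState) -> dict:
    # The final writer sees only the evidence bundle.
    return {"evidence": list(state.evidence)}
```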
open_deep_research
github.com/langchain-ai
⭐ 11.2k
OSS
MIT
MCP-native, model-agnostic via init_chat_model(). LangGraph-based. #6 on Deep Research Bench with 0.4943 RACE on GPT-5.
[3]
Depth mechanism: LangGraph · any MCP server pluggable
Bench score: RACE 0.4943 · #6 on Deep Research Bench
HIGH
Most "configurable" of the OSS options — closest to a reference impl.
Steal: MCP-server-as-tool. Swap Tavily / SearXNG / Exa as config, not code.
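What that buys in practice: the search backend becomes a config entry. The registry keys and commands below are illustrative, not open_deep_research's actual schema.

```python
# Hypothetical backend registry; swapping providers is a lookup, not a code change.
SEARCH_BACKENDS = {
    "tavily":  {"transport": "stdio", "command": "tavily-mcp"},
    "searxng": {"transport": "sse",   "url": "http://localhost:8080/mcp"},
    "exa":     {"transport": "stdio", "command": "exa-mcp"},
}

def search_tool_config(name: str) -> dict:
    if name not in SEARCH_BACKENDS:
        raise ValueError(f"unknown search backend: {name}")
    return SEARCH_BACKENDS[name]
```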
Weizhena / Deep-Research-skills
github.com/Weizhena
⭐ 483
Claude Skill
MIT
Two phases: outline generation (user can expand it), then deep investigation per item in parallel. HITL checkpoints — approve outline before spending tokens.
[18]
Depth mechanism: Outline → parallel deep investigations
HITL: Approve outline before token spend
HIGH
HITL pattern translates to Scout's "self-review the outline first."
Steal: Approve-outline-before-investigation. Scout already has expedition plans; tighten the gate.
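The gate itself is one function. A sketch (callback names are mine): outline generation loops until the reviewer approves, and investigation, the expensive part, only starts afterwards.

```python
def run_with_outline_gate(generate_outline, review, investigate, max_revisions=3):
    # HITL checkpoint: no deep-investigation tokens are spent until
    # the outline passes review.
    outline = generate_outline(feedback=None)
    for _ in range(max_revisions):
        verdict = review(outline)          # e.g. "ok" or revision notes
        if verdict == "ok":
            return [investigate(item) for item in outline]
        outline = generate_outline(feedback=verdict)
    raise RuntimeError("outline never approved; no investigation tokens spent")
```

For Scout's self-review variant, `review` is just another model call instead of a human.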
Claude Managed Agents
platform.claude.com · 2026-04-01
BETA
API
commercial
Hosted agent harness behind the managed-agents-2026-04-01 header: sandbox, Bash, file ops, web search/fetch, MCP servers. The Environment / Session / Events model maps cleanly onto "one research run = one Session."
[12]
Depth mechanism: Harness-defined · Environment/Session/Events
Citation rigor: Depends on system prompt
HIGH
Hosting target if Scout outgrows GitHub Actions.
Steal: The migration path. Environment / Session / Events maps 1:1 onto one research run per Session.
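The mapping the card describes can be stated directly. The dataclass shapes below are mine, a sketch of the concepts rather than the beta API:

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    kind: str          # e.g. "tool_call", "file_write", "message"
    payload: dict

@dataclass
class Session:
    # One research run == one Session; Events are its append-only log.
    events: list = field(default_factory=list)
    def emit(self, kind: str, payload: dict) -> None:
        self.events.append(Event(kind, payload))

@dataclass
class Environment:
    # Persistent sandbox (files, tools, MCP servers) shared across runs.
    sessions: list = field(default_factory=list)
    def new_run(self) -> "Session":
        s = Session()
        self.sessions.append(s)
        return s
```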
STORM / Co-STORM
github.com/stanford-oval
⭐ 28.1k
OSS
MIT
Different shape: simulates conversations between writers with different perspectives + a topic-expert LLM grounded in web sources, then builds the outline from the transcript. +10% absolute coverage, +25% organization vs outline-then-RAG.
[14]
Depth mechanism: Persona-guided Q&A · Co-STORM HITL turns
Coverage gain: +10% absolute · +25% organization
MEDIUM
Long-form only · authors warn: not publication-ready.
Steal: Persona-guided Q&A · the basis for a future scout-researcher-perspectives specialist.
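A minimal sketch of the persona-guided Q&A shape (my simplification, not STORM's implementation): each perspective interrogates a grounded expert in turn, and later questions can see the transcript so far, which is what drives the coverage gain.

```python
def persona_qa_transcript(personas, next_question, ask_expert, turns=2):
    # Each persona asks follow-ups informed by the transcript so far;
    # the outline is then built from this transcript, not a one-shot prompt.
    transcript = []
    for persona in personas:
        for _ in range(turns):
            q = next_question(persona, transcript)
            transcript.append((persona, q, ask_expert(q)))
    return transcript
```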
Perplexity Sonar Deep Research
perplexity.ai · API
94.3% cite
SaaS · API
commercial
The only one of the "big five" with a production API. Fastest commercial option (2–3 min runs). ~$0.41 per typical query.
[8]
Pricing: $2/$8 per M tok + $5/1k searches
Citation rigor: 94.3% Sonar Pro vs ~87% GPT-5.2 DR
[6]
MEDIUM
API-driven — but opaque. No on-disk artifact.
Steal: Optional drop-in. Delegate the search+synth step from Scout if cost permits.
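The quoted ~$0.41 is reproducible from the listed prices. The run shape below (25k input tokens, 20k output, 40 searches) is a hypothetical that happens to land on that number, not Perplexity's published breakdown.

```python
def sonar_dr_cost(in_tok: int, out_tok: int, searches: int) -> float:
    # $2 / $8 per million tokens, plus $5 per 1k searches.
    return in_tok / 1e6 * 2 + out_tok / 1e6 * 8 + searches / 1000 * 5

# 25k in, 20k out, 40 searches -> 0.05 + 0.16 + 0.20 = $0.41
```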
local-deep-researcher
github.com/langchain-ai
offline
OSS
MIT
Fully local: any Ollama- or LMStudio-hosted model, SearXNG search, nothing leaves the machine. Loop: query → search → summarize → reflect for gaps → next query, for N cycles.
[17]
Depth mechanism: Reflect-and-requery loop · user-set cycles
Citation rigor: Inline markdown sources
MEDIUM
Reference for Scout's offline mode if that's ever on the table.
Steal: The reflect-and-requery loop. Cleaner than today's ad-hoc "did I miss anything" check.
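The loop is simple enough to state exactly (function names are mine, a sketch of the described flow):

```python
def reflect_and_requery(query, search, summarize, find_gaps, cycles=3):
    # query -> search -> summarize -> reflect for gaps -> next query,
    # repeated for a user-set number of cycles.
    notes = []
    for _ in range(cycles):
        notes.append(summarize(search(query)))
        gaps = find_gaps(notes)
        if not gaps:
            break                # nothing missing; stop early
        query = gaps[0]          # next query targets the top gap
    return notes
```

The explicit `find_gaps` step is the difference from an ad-hoc check: reflection produces the next query, not just a yes/no.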
smolagents · Open Deep Research
github.com/huggingface
⭐ 26.8k
OSS
Apache-2.0
Agents emit Python code instead of JSON tool calls — ~30% fewer steps. 55.15% on GAIA vs OpenAI DR's 67.36%. Known context-window blow-ups; demo unstable.
[16]
Depth mechanism: Code-agent · multimodal state handling
GAIA score: 55.15% · −12 pt vs OpenAI DR
LOW
Proof-of-concept. Production gap is real — browser tooling and vision.
Steal: The code-emission idea, not the agent itself. Less JSON ceremony.
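Why code emission saves steps: one emitted snippet composes several tool calls that a JSON agent would spend one round-trip each on. The sandbox below is a toy (`exec` against a dict is not real isolation):

```python
EMITTED = """
urls = search("deep research agents")[:3]
pages = [fetch(u) for u in urls]
summary = summarize(pages)
"""

def run_code_action(code: str, tools: dict):
    # One model turn executes a whole search->fetch->summarize chain
    # that would otherwise take ~5 JSON tool-call round-trips.
    env = dict(tools)
    exec(code, env)              # production needs real sandboxing
    return env["summary"]
```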
OpenAI Deep Research
openai.com · ChatGPT Pro
26.6% HLE
SaaS
commercial
o3-tuned agent. Longest runs (15–25 min) and the most essay-like reports. 26.6% on Humanity's Last Exam at launch — frontier in its cohort.
[7]
Citation rigor: 87% cited accuracy
LOW
Closed · no on-disk artifact · no steering hints.
Steal: Nothing portable. Reference frontier numbers only.
Claude Research / Advanced Research
anthropic.com · Claude Pro/Max
45 min
SaaS
commercial
"Advanced Research" runs up to 45 min across hundreds of sources autonomously. Anthropic renamed the Claude Code SDK to Claude Agent SDK — internal use was dominated by research, video, note-taking, not just coding.
[11][21]
Run length: Up to 45 min · hundreds of sources
Citation rigor: Closed metric
LOW
Not programmable as artifact. But: Claude-Code-as-research-platform is the sanctioned pattern.
Steal: Validation that Scout's substrate choice is right. No code to lift.
Gemini Deep Research
google.com · AI Pro
$19.99/mo
SaaS
commercial
Differentiates on Workspace: pulls from Gmail, Drive, and the public web simultaneously and drops multi-page reports back into Docs.
[10]
Pricing: Google AI Pro $19.99/mo
[9]
Differentiator: Gmail/Drive/Docs round-trip
LOW
Workspace-locked. Useless for a Scout-style standalone artifact.
Steal: Nothing portable.
Grok DeepSearch
x.ai · X Premium
X-native
SaaS
commercial
Real-time X / web synthesis. Direct X-timeline access is the one thing competitors can't match — only interesting when the topic is breaking news or social sentiment.
[13]
Differentiator: Real-time X/Twitter timeline access
API: None relevant to Scout
LOW
X-centric · breaking-news only.
Steal: Nothing portable.
Elicit
elicit.com · academic
99.4% extr.
SaaS
commercial
Best-in-class for academic literature: 138M papers + 545K clinical trials. 80% time saved on abstract screening, with a quote-level rationale per decision. Systematic Review reports cap at 80 papers.
[19]
Coverage: PubMed · ClinicalTrials.gov · 138M papers
Cap: SR reports max 80 papers
LOW
PubMed-only — not Scout's brief.
Steal: Per-claim rationale + source quote. Graded quality scoring, not binary in/out.
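The stealable record shape, sketched (field names are mine, not Elicit's schema): every screening decision carries its rationale, the exact supporting quote, and a graded score instead of a binary flag.

```python
from dataclasses import dataclass

@dataclass
class ScreeningDecision:
    paper_id: str
    decision: str       # "include" or "exclude"
    rationale: str      # why, in one sentence
    source_quote: str   # verbatim quote backing the rationale
    score: float        # graded quality, 0.0-1.0, not just in/out
```

Persisting these per claim is what makes an audit of the final report possible.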
FutureHouse · Crow / Falcon / Owl / Phoenix
futurehouse.org · science
science
SaaS · API
commercial
Scientific-discovery platform built on Claude [23]. Task-specialized: Crow extracts genes/markers, Falcon does background research, Owl checks whether a hypothesis was already investigated, Phoenix designs chemistry.
[20]
Pattern: One agent per epistemic move
Caveat: Phoenix not as deeply benchmarked
LOW
Science-only. Pattern is transferable; product isn't.
Steal: One agent per epistemic move with named roles. Scout already does this implicitly via Explore sub-agents.
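Making the implicit roles explicit is mostly a routing table. The agent names below are FutureHouse's; the task kinds and routing shape are an illustrative sketch, not their API.

```python
EPISTEMIC_ROLES = {
    "crow":    "extract entities (genes, markers) from papers",
    "falcon":  "broad background survey",
    "owl":     "check whether a hypothesis was already investigated",
    "phoenix": "design chemistry experiments",
}

def route(task_kind: str) -> str:
    # One agent per epistemic move, addressed by name.
    return {
        "extract": "crow",
        "survey": "falcon",
        "novelty_check": "owl",
        "design": "phoenix",
    }[task_kind]
```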