How can we disrupt traditional archival research with AI without losing reasoning control?

Spinosa | 2026 | Product Design & Developer Tools

Spinosa is a CLI-based research product for expert teams working with large, mixed research archives. It closes the gap between fragmented material and AI-assisted research by preparing corpora, structuring sources, and giving researchers an agent-based way to find evidence and analyse material without ceding judgment to a black-box model.

Problem

The apparent problem was how to give researchers an LLM over their archive. The real problem was how to let expert researchers navigate evidence, preserve source traceability, and reason across corpora without turning interpretation into a black box. Therefore I designed a CLI product with 7 specialist subagents - mapper, searcher, analyst, writer, verifier, serendippo, janitor - a read-only raw corpus, and a system layer that enforces supervised reasoning at every step.

UX research

Co-design with diverse profiles - design researchers, ethnographers, anthropologists, no-code users, data engineers - revealed actual needs: orientation, confidence, reversible actions, source inspection, and layered technical access.

Product response

Spinosa's agent architecture maps directly to four concrete research workflows:

Corpus Intelligence Pipeline - Mapper ingests raw files and builds thematic maps with cross-file connections; Searcher retrieves evidence via structured queries with alias resolution. It turns messy multi-format corpora into a navigable knowledge graph.
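As a rough illustration of alias-aware retrieval, the sketch below expands a query to every alias in its group before scanning documents line by line. The alias table, the function names, and the in-memory document dictionary are all assumptions made for the example; Searcher's actual query format and alias storage are not described here.

```python
import re

# Hypothetical alias table: canonical term -> alternative names or spellings.
ALIASES = {
    "participatory design": ["co-design", "codesign"],
}


def expand(term: str, aliases: dict[str, list[str]]) -> set[str]:
    """Return the term plus every alias that shares its canonical entry."""
    needle = term.lower()
    for canonical, alts in aliases.items():
        group = {canonical.lower(), *(a.lower() for a in alts)}
        if needle in group:
            return group
    return {needle}


def search(
    term: str,
    documents: dict[str, str],
    aliases: dict[str, list[str]],
) -> list[tuple[str, int, str]]:
    """Return (file, line number, line) for each line matching the term or an alias."""
    variants = expand(term, aliases)
    pattern = re.compile(
        "|".join(re.escape(v) for v in sorted(variants)), re.IGNORECASE
    )
    hits = []
    for name, text in documents.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((name, number, line.strip()))
    return hits
```

In this sketch, a query for "co-design" would also match lines that say "codesign" or "participatory design", and each hit keeps the file name and line number so it can be inspected at the source.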

Evidence Integrity & Claim Traceability - Verifier traces every claim to its source document; Searcher backs it with alias-aware retrieval. Every assertion anchors to a specific file, line, and context.
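One way to picture the file, line, and context anchor is as a small record that can be checked mechanically against the corpus. The sketch below is only an assumption about how such a check could work: the Anchor class, its fields, and the verify function are hypothetical names, not Verifier's real interface.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Anchor:
    """Hypothetical record tying one claim to a file, a line, and a quote."""

    file: str   # path relative to the corpus root
    line: int   # 1-indexed line number
    quote: str  # text expected to appear on that line


def verify(anchor: Anchor, corpus_root: Path) -> bool:
    """Return True only if the quoted text is found on the cited line."""
    path = corpus_root / anchor.file
    if not path.is_file():
        return False
    lines = path.read_text(encoding="utf-8").splitlines()
    if not 1 <= anchor.line <= len(lines):
        return False
    return anchor.quote in lines[anchor.line - 1]
```

A check like this only confirms that the cited text exists where the claim says it does. Whether the source actually supports the claim remains the researcher's judgment.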

Critical Analysis & Synthesis - Analyst reads project context and dictionary to identify coverage gaps and blind spots; Writer produces structured reports with navigation dashboards mapping what was consulted, scanned, and cited.

Serendipitous Cross-Source Discovery - Serendippo roams across siloed sources - interviews, worksheets, transcriptions - to surface unexpected connections that targeted queries miss. This is what differentiates Spinosa from a query tool: four distinct research capabilities, each grounded in a methodological need.

Onboarding

The user runs spinosa new. The CLI asks for a source folder, scans the corpus, and shows what can be imported before making any changes. The scan classifies material by type: text files; PDFs, which are routed through MarkItDown or RapidOCR; native files, which are copied unchanged; and unsupported formats, which are reported. This makes the corpus legible, bounded, and safe to reason with.
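The scan step can be sketched as a dry run that groups files by category without touching them. The extension sets, category names, and function names below are assumptions chosen for illustration; the rules Spinosa actually applies are not listed here, and the sketch does not call MarkItDown or RapidOCR.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical extension sets; the real classification rules may differ.
TEXT_EXTS = {".txt", ".md", ".csv"}
PDF_EXTS = {".pdf"}
NATIVE_EXTS = {".docx", ".xlsx", ".png", ".jpg"}


def classify(path: Path) -> str:
    """Return the import category for one file, based only on its extension."""
    ext = path.suffix.lower()
    if ext in TEXT_EXTS:
        return "text"
    if ext in PDF_EXTS:
        return "pdf"  # would be handed to a converter or OCR step
    if ext in NATIVE_EXTS:
        return "native"  # would be copied unchanged
    return "unsupported"  # would be reported, not imported


def scan(source: Path) -> dict[str, list[Path]]:
    """Dry-run scan: group files by category without modifying anything."""
    report: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(source.rglob("*")):
        if path.is_file():
            report[classify(path)].append(path.relative_to(source))
    return dict(report)
```

Because scan only reads directory entries, its report can be shown to the researcher for confirmation before any import happens.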

After import, the CLI creates a workspace boundary (raw/) and writes configuration files that record source policy and evidence rules. It then prints a startup prompt for Claude, Codex, or Opencode. That handoff teaches the AI to treat the prepared corpus as the sole source, refuse to edit originals, and build navigation indexes. Onboarding turns an ambiguous folder into an accountable, queryable research environment.

Architecture and usage

Every Spinosa workspace starts with a startup.md that teaches the AI to treat the prepared corpus as the sole source of evidence, refuse to edit originals, and build navigation indexes. A per-workspace dictionary.md centralizes definitions and agreed terminology - a shared lexicon the researcher controls. Context files record source policy, evidence rules, and methodological constraints.

The agent architecture implements separation of concerns: each subagent (Mapper, Searcher, Verifier, Analyst, Writer, Serendippo, Janitor) has a single responsibility and a constrained toolset. No agent can edit raw/ - it is mounted read-only by design. Every operation is logged in agent_reports/: what was searched, which files were consulted, which sources were cited. The researcher navigates these reports to inspect, validate, or redirect. This is supervised reasoning: agents move fast through material, but every output is an intermediate research object - provisional, situated, always the researcher's call to accept or reject.
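The two guarantees above, a raw/ folder that cannot be written to and a log of every operation, can be approximated in a few lines. This is a sketch under stated assumptions: it uses an application-level path check rather than the read-only mount the text describes, and the JSON Lines report format, the field names, and both function names are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def is_inside(path: Path, root: Path) -> bool:
    """True if path resolves to a location inside root (Python 3.9+)."""
    return path.resolve().is_relative_to(root.resolve())


def guarded_write(workspace: Path, target: Path, text: str) -> None:
    """Write a file, refusing any target under the workspace's raw/ folder."""
    if is_inside(target, workspace / "raw"):
        raise PermissionError(f"raw/ is read-only: {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")


def log_operation(
    workspace: Path,
    agent: str,
    query: str,
    consulted: list[str],
    cited: list[str],
) -> Path:
    """Append one JSON line describing an agent operation to agent_reports/."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "query": query,          # what was searched
        "consulted": consulted,  # which files were read
        "cited": cited,          # which sources ended up in the output
    }
    report = workspace / "agent_reports" / f"{agent}.jsonl"
    report.parent.mkdir(parents=True, exist_ok=True)
    with report.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return report
```

Resolving paths before comparing them means a target such as notes/../raw/file.txt is also refused. An append-only log keeps earlier operations intact, which is what lets a researcher retrace how an output was produced.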

The orchestration around Spinosa has two sides. The first works on the vault itself, keeping it tidy and up to date. The second is the search-and-find pipeline, with custom agents built specifically for that purpose.

Outcome

Spinosa turns messy material into auditable research. Not by automating analysis, but by providing a system where every operation - ingesting, mapping, searching, verifying, synthesising - is methodologically grounded and fully traceable. The corpus is bounded and read-only. Claims are anchored to specific files and lines. Reports document exactly what was done, consulted, and cited.

For the researcher, this means defensible claims, inspectable reasoning, and a process that holds up under scrutiny. Spinosa does not promise truth from AI. It promises a controlled environment where researchers can direct, constrain, and test LLM output as part of their own rigorous process.

All Spinosa workspaces are designed to be viewed and read in Obsidian.

The Opencode Integration

Outcomes

My role covered product design, UX, system architecture, and final development. I owned the CLI, agent skill protocol, system architecture, and implementation, while collaborating with around 20 médialab researchers across sociology, design, and engineering who tested Spinosa in their own projects. I did not own the theoretical frameworks applied by users.

Spinosa was designed CLI-first for speed, modularity, composability, and compatibility with expert researchers already working in terminal-based AI workflows. Its 7 specialist subagents each have a constrained role, making the system auditable, debuggable, and extensible. AGENTS.md files provide machine-readable instructions for agent behaviour, while ORT explores a future GUI layer for less technical users.

The system is built around calibrated trust. Agent outputs are provisional research objects, not final conclusions. Search can retrieve irrelevant evidence, synthesis can flatten contradiction, indexing can miss what is not named, and hallucination remains a risk. For this reason, every claim must trace back to source files, and final validation remains with the researcher.

Spinosa is now an open-source alpha: the CLI is released and installable, ORT is in early prototype, and médialab teams are testing it across active research projects. Done means a researcher can initialize a workspace, configure a corpus, run agent operations, and produce source-verified reports independently.

Spinosa's official website