Source: arXiv
Published: 17 June 2026

Authors

Yi Lu, Zhuofeng Li, Ping Nie, Haoxiang Zhang, Yuyu Zhang, Kai Zou, Wenhu Chen, Jimmy Lin, Dongfu Jiang, Yu Zhang

Dr-DCI: Scaling Direct Corpus Interaction via Dynamic Workspace Expansion

Source · arxiv.org/watch?v=2606.14885 ↗

§03

Synthesis

## The Problem: Retrieval Alone Isn't Enough

Current agentic search systems rely on retrieval models (BM25, ColBERT) to find relevant documents, then show agents ranked lists or snippets. This works for simple lookups, but falls apart when a task requires comparing information across multiple documents, verifying constraints, or reorganizing evidence. A retrieval result tells you *which* documents are relevant, not how to work *with* them flexibly.

Direct Corpus Interaction (DCI) solves this by giving agents shell-like commands—search, filter, compare—to manipulate corpus data directly. The catch: running these commands over millions of documents is slow and unstable. Full-corpus terminal queries degrade as scale increases, defeating the purpose of agentic search.

## The Solution: Retrieval as a Workspace Expander

Dr-DCI inverts the relationship between retrieval and direct interaction. Instead of treating retrieval as the final answer and DCI as a side tool, the authors treat retrieval as an agent-callable action that *expands a local workspace*: a manageable subset of documents the agent actually operates on.

Here's the workflow: when an agent needs to search, filter, or verify something, it first calls the retriever to pull relevant documents into its workspace. It then runs DCI operations (shell commands for search, comparison, etc.) within that workspace, not over the full corpus. The agent can expand the workspace at any point by calling the retriever again, pulling in new documents as needed.
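The workflow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy `CORPUS`, the term-overlap `retrieve` function, and the `Workspace` class with its `expand` and `grep` methods are all hypothetical names, and `grep` stands in for whatever shell-like commands the real system exposes.

```python
from dataclasses import dataclass, field

# Toy corpus standing in for a large document collection (hypothetical data).
CORPUS = {
    "d1": "Ada Lovelace wrote notes on the Analytical Engine in 1843.",
    "d2": "Charles Babbage designed the Analytical Engine.",
    "d3": "Grace Hopper worked on the Harvard Mark I in 1944.",
    "d4": "The Harvard Mark I was built by IBM.",
}


def retrieve(query: str, k: int = 2) -> list[str]:
    """Hypothetical retriever: rank documents by number of shared query terms."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in CORPUS.items():
        words = set(text.lower().replace(".", "").split())
        overlap = len(terms & words)
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


@dataclass
class Workspace:
    """Local subset of the corpus that direct-interaction commands run over."""
    docs: dict[str, str] = field(default_factory=dict)

    def expand(self, query: str, k: int = 2) -> list[str]:
        """Call the retriever and copy any new hits into the workspace."""
        added = []
        for doc_id in retrieve(query, k):
            if doc_id not in self.docs:
                self.docs[doc_id] = CORPUS[doc_id]
                added.append(doc_id)
        return added

    def grep(self, needle: str) -> list[str]:
        """Substring search restricted to workspace documents only."""
        return sorted(
            doc_id for doc_id, text in self.docs.items()
            if needle.lower() in text.lower()
        )


ws = Workspace()
ws.expand("analytical engine")   # workspace now holds d1 and d2
first = ws.grep("Mark")          # nothing yet: d3 and d4 are not local
ws.expand("harvard mark")        # second retrieval call widens the workspace
second = ws.grep("Mark")         # now matches d3 and d4
```

The point of the sketch is that `grep` never touches `CORPUS` directly; its cost depends on the workspace size, and the agent controls that size through its retrieval calls.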

This design leverages two complementary strengths. Retrieval handles scalability—it finds candidate documents efficiently even in massive corpora. DCI handles precision—once candidates are local, the agent can perform complex relational operations (comparing fields, cross-document filtering) that simple ranking cannot.
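To make "cross-document filtering" concrete, here is a small sketch of one relational operation a ranked list cannot express on its own: finding which documents in a workspace mention the same year. The function name `shared_years`, the year-matching pattern, and the sample documents are assumptions chosen for illustration; the source does not specify which operations the agent runs.

```python
import re
from collections import defaultdict


def shared_years(workspace: dict[str, str]) -> dict[str, list[str]]:
    """Group workspace documents by the four-digit years they mention,
    keeping only years that appear in more than one document."""
    by_year = defaultdict(set)
    for doc_id, text in workspace.items():
        # Non-capturing group so findall returns the whole matched year.
        for year in re.findall(r"\b(?:1[0-9]{3}|20[0-9]{2})\b", text):
            by_year[year].add(doc_id)
    return {
        year: sorted(ids)
        for year, ids in sorted(by_year.items())
        if len(ids) > 1
    }


# Hypothetical workspace contents.
docs = {
    "a": "Founded in 1998 in Paris.",
    "b": "The merger closed in 1998.",
    "c": "Opened 2005.",
}
links = shared_years(docs)  # {"1998": ["a", "b"]}
```

An operation like this is cheap over a few dozen local documents and becomes impractical if it has to scan every document in a multi-million-document corpus, which is why it benefits from retrieval narrowing the candidates first.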

## Why It Works

The key insight is that most corpus operations don't need the full dataset present; they need the *right subset* present. By dynamically expanding the workspace, Dr-DCI avoids the instability of full-corpus terminal commands while retaining the flexibility advantage over static ranked lists.

On BrowseComp-Plus (a complex reasoning benchmark), Dr-DCI achieves 71.2% accuracy, 8.3 points better than raw DCI and ablated variants, while using fewer tool calls and less wall-clock time. Adding workspace-preserving context reset (keeping relevant documents in the workspace across multiple turns) pushes accuracy to 73.3%.
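One way to picture a workspace-preserving context reset is as a state transition that discards the conversation history but carries the workspace forward. The sketch below is an assumption about the general shape of that idea, not the paper's mechanism: `AgentState`, `reset_context`, and the use of a one-line summary as the new history are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical agent state: message history plus local workspace."""
    history: list[str] = field(default_factory=list)
    workspace: dict[str, str] = field(default_factory=dict)


def reset_context(state: AgentState, summary: str) -> AgentState:
    """Start a fresh context seeded with a summary, keeping workspace docs."""
    return AgentState(history=[summary], workspace=dict(state.workspace))


old = AgentState(
    history=["search: engine", "grep: 1843", "compare d1 d2"],
    workspace={"d1": "Notes on the Analytical Engine, 1843."},
)
new = reset_context(old, "Found d1; still need the designer's name.")
```

After the reset, the agent's context is short again, but documents it already retrieved remain available to direct-interaction commands without another retrieval call.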

The scaling results are striking. As corpus size grows from 100K to 10M documents, Dr-DCI remains stable; raw DCI degrades sharply, and BM25 performance drops substantially. Dr-DCI also scales to 20M documents (Wiki-18 QA), reaching an average score of 63.0 across six benchmarks and outperforming both retrieval-only and learned search-agent baselines.

Ablations confirm that two design choices matter most: showing agents ranked previews of retrieval results (so they know what's available before pulling documents in), and enabling inter-document DCI operations (cross-document comparisons and filtering).
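A ranked preview can be as simple as a numbered list of document ids with truncated snippets, which the agent reads before deciding what to pull into its workspace. The following sketch assumes that format; the function name `preview`, the `width` parameter, and the output layout are illustrative choices, not details taken from the paper.

```python
def preview(ranked: list[tuple[str, str]], width: int = 40) -> list[str]:
    """Render retrieval hits, already in rank order, as short numbered lines."""
    lines = []
    for rank, (doc_id, text) in enumerate(ranked, start=1):
        # Truncate long documents so the preview stays cheap to read.
        snippet = text if len(text) <= width else text[:width].rstrip() + "..."
        lines.append(f"{rank}. [{doc_id}] {snippet}")
    return lines


# Hypothetical retrieval hits as (doc_id, text) pairs.
hits = [
    ("d2", "Charles Babbage designed the Analytical Engine."),
    ("d1", "Ada Lovelace wrote notes on the Analytical Engine in 1843."),
]
shown = preview(hits, width=20)
```

Keeping the preview separate from the documents themselves lets the agent skip irrelevant hits instead of filling the workspace with everything the retriever returns.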
