AI gets code authorship.
Context ownership stays with you.

Agents write the code. You carry the consequences. Unlost keeps the context that used to live in your head - and now lives in the agent - close to you.

curl -fsSL https://unlost.unfault.dev/install.sh | bash
For Windows, please download the binary from our releases page. See the setup guide.

Privacy First

Capsules and indexes stay on your machine.

Open-Source

Built in the open under an MIT license.

Non-Blocking

Journaling returns immediately; heavy work runs async.

Any agent. One shared memory.

Claude Code
Claude Cowork
OpenCode
GitHub Copilot

When it matters

It used to be that writing the code meant understanding it. That's changed. These are the moments where the gap shows.

A colleague asks why

unlost brief / unlost trace

You know it works. You're less sure you can explain it. The reasoning was in the chat, which is gone.

Six months later

unlost trace / unlost challenge

Someone needs to change it. Maybe it's you. The agent doesn't remember. The commit message says "feat: add retry logic."

Production is down

unlost trace / unlost brief

You're reading code under pressure that you didn't write. You don't know if the retry logic was intentional or a guess.

The PR is the handoff

unlost pr-comment

To your team, and to your future self. The diff shows what changed. It doesn't show what was tried and rejected, what constraint you were navigating, what's still open.

The context you didn't know you needed

unlost trace (proactive)

You can't query what you don't remember. When a new change echoes a past decision, the recurrence channel surfaces the dormant capsule — before you ask, before you repeat a mistake.

How It Works

01

Record

Unlost sits alongside your agent process. It captures the raw stream of thought—intent, reasoning, and decisions—without blocking your flow.

# Silent observation
Recording session ses_3a79... [active]
02

Extract & Structure

We don't just log text. Unlost distills messy conversations into Capsules: atomic units of memory containing the Why, the What, and the How.

# Extracted Capsule
{intent: "Refactor auth", decision: "Use PASETO tokens", rationale: "..."}
03

Ground

Every capsule is verified against the live code graph. We link decisions to specific files and symbols, creating a navigable map of your project's evolution.

# Graph Links
Linked: src/auth.rs (80% relevance) | Symbol: verify_token
04

Re-Surface

Dormant capsules don't sit idle. When a new movement of work echoes a past decision, the recurrence channel injects a SYSTEM NOTE into the agent's context — pointing back to the original reasoning, with source provenance attached. The agent decides whether to mention it to you.

# SYSTEM NOTE (auto-injected)
This continues a prior thread. On Jan 14, the decision was: "switch to HTTP/2 for upstream proxy." Source: claude+jsonl://...abc.jsonl#L47
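
To make the capsule shape from the Extract & Structure step concrete, here is a rough Python sketch of a capsule as a small record. Only intent, decision, and rationale come from the example above; the symbols and source fields are assumptions drawn from the Ground and Re-Surface steps, and the class itself is hypothetical rather than Unlost's actual schema.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class Capsule:
    """Hypothetical capsule record; not Unlost's real schema."""
    intent: str
    decision: str
    rationale: str
    symbols: list = field(default_factory=list)  # assumed: linked code symbols
    source: str = ""                             # assumed: provenance URI

    def to_jsonl(self) -> str:
        """Serialize to one JSON line, as a capsules.jsonl-style store might."""
        return json.dumps(asdict(self), sort_keys=True)


capsule = Capsule(
    intent="Refactor auth",
    decision="Use PASETO tokens",
    rationale="...",
    symbols=["verify_token"],
)
line = capsule.to_jsonl()
```

Keeping each capsule to a single JSON line is one simple way to make the store append-only and easy to re-index.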

One Memory. Many Lenses.

Once context is grounded, you can query it from any angle.

Look Back

Recover the context you lost. Whether it's a high-level staff engineer debrief or a specific decision trail, you get the why behind the code. Past context doesn't just wait for queries — it surfaces itself when it becomes relevant again.

trace brief

Challenge Present

Don't just approve; verify. Pressure-test decisions against the graph and monitor the agent's struggle in real time.

challenge metrics

Explore Future

Draft with memory. Use your established constraints and history to weigh new trade-offs before writing a single line of code.

explore

Review the Journey

Reflect on how you and the agent worked together. Get actionable coaching on collaboration habits, agent drift patterns, and which skills to add or tune.

reflect

Long Memory & the Story Arc

Capsules are not just a log. They encode a causal history - the sequence of decisions, constraints, and failures that explain why the code looks the way it does today.

The problem with standard RAG

Most retrieval systems return a ranked bag of hits. Ask "why is the timeout 30s?" and you get the five most similar capsules - but no sense of how you arrived there. Causality is lost. Sessions are mixed. Old decisions look the same as new ones.

Engineers don't think in bags of results. They think in chains: "because we switched to HTTP/2, we needed longer timeouts, which surfaced a bug in the keepalive logic, which we worked around by…" That's a causal chain, and it's what unlost trace reconstructs.

How the causal chain is built

1
Richer embeddings

Every capsule is embedded with its category, failure mode, top symbols, and the prior decision from the same work thread. This encodes trajectory into the vector; capsules from the same causal chain cluster together in embedding space, even across different agent sessions.

2
HyPE: question-first retrieval

When a capsule is extracted, the LLM also generates 2–3 questions the capsule answers: "Why is the connection timeout 30 seconds?", "How does the proxy route upstream requests?". At retrieval time, each command frames your input as a question matching its intent before embedding - recall asks "What happened with X?", challenge asks "Was the decision about X the right call?", explore asks "What are the alternatives for X?", and so on. This turns retrieval into a question-to-question match against the stored questions, which improves precision without adding an LLM call at query time.

Based on: Ma et al., "HyPE: Hypothetical Prompt Embeddings for Improved Retrieval" (2025). papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335

3
Seed → fan-out → threshold

A vector ANN search finds the closest seed capsules. For each seed, unlost fans out to all capsules that share symbols, traversing the existing LabelList index. Capsules below the similarity threshold are dropped to prevent the chain from sprawling. The survivors are sorted chronologically: oldest first, newest last.
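
The seed, fan-out, threshold, and sort steps can be sketched in a few lines of Python. This is a minimal illustration, not Unlost's implementation: a brute-force cosine ranking stands in for the ANN index, a plain set intersection stands in for the LabelList lookup, and the names build_chain, k, and threshold are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Capsule:
    id: str
    day: str            # ISO date, so string order is chronological order
    symbols: frozenset
    vector: tuple


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_chain(query_vec, capsules, k=2, threshold=0.5):
    # 1. Seed: brute-force nearest neighbours stand in for the ANN search.
    ranked = sorted(capsules, key=lambda c: cosine(query_vec, c.vector), reverse=True)
    seeds = ranked[:k]
    # 2. Fan-out: add every capsule that shares a symbol with any seed.
    seed_symbols = set().union(*(s.symbols for s in seeds))
    candidates = {c.id: c for c in seeds}
    for c in capsules:
        if c.symbols & seed_symbols:
            candidates[c.id] = c
    # 3. Threshold: drop candidates that are not similar enough to the query.
    kept = [c for c in candidates.values() if cosine(query_vec, c.vector) >= threshold]
    # 4. Sort: oldest first, newest last.
    return sorted(kept, key=lambda c: c.day)


capsules = [
    Capsule("a", "2026-01-14", frozenset({"proxy_request"}), (1.0, 0.0)),
    Capsule("b", "2026-01-21", frozenset({"proxy_request", "keepalive"}), (0.8, 0.6)),
    Capsule("c", "2026-02-03", frozenset({"proxy_request"}), (0.9, 0.1)),
    Capsule("d", "2026-01-30", frozenset({"proxy_request"}), (0.0, 1.0)),  # shared symbol, off-topic
    Capsule("e", "2026-01-10", frozenset({"render_page"}), (0.1, 0.9)),    # unrelated
]
chain_ids = [c.id for c in build_chain((1.0, 0.0), capsules)]
```

Capsule b enters through fan-out rather than as a seed, and capsule d shares a symbol but is filtered out by the threshold, which is the sprawl the threshold is there to prevent.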

Sessions don't slice neatly

In practice, engineers reuse the same agent session across multiple pieces of work. Session IDs are a poor proxy for "same work thread." Unlost sidesteps this by using semantic continuity instead: the prior-decision embedding means capsules that are conceptually related cluster together regardless of session boundaries. The chain reflects intent, not session structure.

# same chain, different sessions, months apart
2026-01-14 [ses_a] switched to HTTP/2 for upstream
2026-01-21 [ses_a] retry spiral on keepalive, increased timeout
2026-02-03 [ses_b] timeout 30s hardcoded in proxy_request
2026-02-18 [ses_b] ← you are here
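
One way to picture the prior-decision embedding is as a single text assembled from the capsule's context before it is embedded. The sketch below is an assumption about the general shape only: the field order, the labels, and the embedding_text name are hypothetical, and the category value is made up for illustration.

```python
def embedding_text(summary, category, failure_mode, top_symbols, prior_decision):
    """Concatenate capsule context into one string to embed (format is assumed)."""
    parts = [
        f"category: {category}",
        f"failure_mode: {failure_mode or 'none'}",
        f"symbols: {', '.join(top_symbols)}",
        f"prior_decision: {prior_decision or 'none'}",
        f"summary: {summary}",
    ]
    return "\n".join(parts)


text = embedding_text(
    summary="timeout 30s hardcoded in proxy_request",
    category="decision",              # hypothetical category label
    failure_mode="retry spiral",
    top_symbols=["proxy_request"],
    prior_decision="switched to HTTP/2 for upstream",
)
```

Because the prior decision travels inside the embedded text, two capsules from different sessions that continue the same thread end up with overlapping input, which is what lets them cluster without relying on session IDs.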

Install

Get Unlost

curl -fsSL https://unlost.unfault.dev/install.sh | bash
For Windows, please download the binary from our releases page.

Agent Integration

Hook unlost into your agent. This installs the Unlost agent skill and configures the necessary hooks.

# Claude Code - one command, works everywhere
unlost config agent claude --global

# Claude Cowork - per-project or global
unlost config agent cowork --path .
unlost config agent cowork --global

# OpenCode - opt-in per repository
cd your-project
unlost config agent opencode --path .

# GitHub Copilot CLI - opt-in per repository
cd your-project
unlost config agent copilot --path .

# Any MCP-aware agent (Claude Code, OpenCode, Copilot, Cowork, or any MCP host)
unlost config agent mcp --target claude   # or opencode / copilot / generic

Configure Your LLM

Unlost requires an LLM for capsule extraction and summarization.

unlost config llm anthropic --model claude-sonnet-4-5-20250929
unlost config llm openai --model gpt-4o-mini
Recommendation: Use a fast, cheap model like GPT-4o-mini or Claude 3.5 Haiku.

That's it. Next time you start your agent, unlost will automatically work alongside it. Happy coding!

Commands

unlost brief

Memory

A staff engineer's debrief on any codebase: what matters, what bites, where to start. Scans all recorded memory (conversations + git commits) and scores by importance, not recency.

unlost brief
unlost brief src/governor.rs
unlost brief TrajectoryController

unlost recall

Memory

Recall the story so far (a proactive overview) for a specific file or directory.

unlost recall src/http_proxy.rs
unlost recall src/

unlost trace

Memory

Reconstruct the causal chain of decisions that led to the current state of a file, symbol, or concept. Every result includes a Source: footer pointing back to the original conversation, commit, or changelog entry.

unlost trace src/governor.rs
unlost trace "why is the timeout 30s?"

unlost challenge

Memory

Pressure-test a past decision or technology choice using workspace memory and the live code graph.

unlost challenge "lancedb"

unlost explore

Memory

Explore future paths grounded in your workspace memory.

unlost explore "should we keep lancedb or move to sqlite+fts?"

unlost thread

Memory

Map when a topic was explored over time, across all registered projects. Results open with an LLM synthesis of the journey — what the topic meant, why it recurred, and what older notes change about the current reading — followed by a day-grouped, recent-first timeline with arc duration in the header and backward-in-time gap markers. Cross-workspace by default; falls back to extracted notes and a dim configuration hint if no LLM is configured.

unlost thread "embedding model performance"
unlost thread "retry logic" --since 6m
unlost thread "auth strategy" --no-llm --limit 20
unlost thread "timeout" --output plain

unlost reflect

Memory

Reflect on how you and the agent worked together. Reads per-turn evaluation telemetry (collected silently during sessions) and generates a structured narrative via LLM — no raw transcript required. Three modes:

  • coach — developer collaboration habits: clarity, scope discipline, verification rigor, session health.
  • tune — agent drift and failure patterns: repetition, hallucination, alignment debt, instruction drift. Audits installed skills and suggests behavioural gaps to fill.
  • both — combined developer and agent report.

Every output opens with a NEXT ACTIONS block — 3–5 scannable imperatives — before the full analysis. The SKILL ASSESSMENT section (tune/both) audits your installed agent skills against observed turn data and suggests behavioural gaps to address.

unlost reflect
unlost reflect --mode tune
unlost reflect --mode both --since 7d
unlost reflect --mode coach --session ses_3a79

unlost metrics

Monitor

Show workspace trajectory metrics and friction hotspots.

unlost metrics

unlost replay

Monitor

Replay and backfill agent transcripts.

unlost replay opencode --git-grounding --no-llm

unlost inspect

Monitor

Inspect stored capsules for this workspace.

unlost inspect

unlost reindex

Manage

Rebuild the LanceDB index from capsules.jsonl.

unlost reindex

unlost clear

Manage

Delete all generated data for the current workspace.

unlost clear

unlost where

Manage

Show where the workspace's files are stored.

unlost where

unlost config

Manage

Manage configuration (LLM provider, agent integrations, etc.).

unlost config

unlost note

Memory

Capture a manual note into workspace memory — a thought, a meeting decision, a constraint discovered outside the agent session. Fully queryable alongside recorded sessions. If no project is detected in the current directory, the note lands in a global workspace; --global forces this regardless. Each note gets a note+local:// source pointer, so it shows up with a manual note (YYYY-MM-DD) footer in trace and thread results.

unlost note "decided to keep lancedb — sqlite FTS doesn't support ANN"
echo "meeting notes..." | unlost note --stdin --source meeting
unlost note "cross-cutting constraint" --global

unlost mcp serve

Agent API

Start an MCP (Model Context Protocol) stdio server that gives agents structured, direct access to workspace memory, with no LLM narration and no ANSI prose. Seven tools: unlost_recall, unlost_trace_decision, unlost_challenge, unlost_thread, unlost_orient, unlost_capsule_get, and unlost_note. All read tools run on the no-LLM fast path. Write tools are opt-in via --allow-writes.

unlost mcp serve
unlost mcp serve --allow-writes
Wire automatically with unlost config agent mcp --target <claude|opencode|copilot|generic>.

The Cognitive Mirror

The unlost metrics command generates a Cognitive Mirror: a diagnostic report that reveals the structural health of your collaboration with AI agents. For a deeper, session-level narrative, see unlost reflect.

Emotional Dynamics

We analyze conversation patterns to detect emotional signals that indicate when interactions may be deteriorating:

  • Valence: Positive vs negative sentiment (-1.0 to +1.0)
  • Intensity: Strength of the emotional response (0.0 to 1.0)

The purpose is early detection of friction before it escalates. When emotion signals cross thresholds, Unlost can intervene with guidance.

The Three Basins of Friction

Unlost models interaction dynamics across three distinct "Basins of Friction" to provide proactive regulation:

Loop Basin (The Stall)

Detects repetitive failures, symbol stalls, and logic churn. Triggered by high EMA-smoothed repetition and low novelty.

Spec Basin (Misunderstanding)

Detects alignment debt (user corrections) and instruction staticness (verbatim repeats).

Drift Basin (Grounding Failure)

Detects grounding stalls (ignoring user file mentions) and symbol hallucinations. Validated against a live codebase graph via unfault-core.
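
As an illustration of the Loop Basin trigger, the sketch below smooths a per-turn repetition score with an exponential moving average and checks it against the latest novelty score. The gates, the smoothing factor, and the choice to leave novelty unsmoothed are all assumptions; the text above only says the basin fires on high EMA-smoothed repetition and low novelty.

```python
def ema(values, alpha=0.5):
    """Exponential moving average; returns the smoothed series."""
    smoothed, current = [], None
    for v in values:
        current = v if current is None else alpha * v + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


def in_loop_basin(repetition, novelty, rep_gate=0.7, novelty_gate=0.2, alpha=0.5):
    """True when smoothed repetition is high and the latest novelty is low."""
    if not repetition or not novelty:
        return False
    return ema(repetition, alpha)[-1] >= rep_gate and novelty[-1] <= novelty_gate


# Sustained repetition trips the basin; a single spike does not.
sustained = in_loop_basin([0.2, 0.8, 0.9, 1.0], [0.6, 0.3, 0.2, 0.1])
one_spike = in_loop_basin([0.1, 0.1, 0.1, 1.0], [0.6, 0.3, 0.2, 0.1])
```

Smoothing is what keeps one repeated tool call from being read as a stall: the signal has to persist across turns before it crosses the gate.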

Metrics at a Glance

friction rate

Measured in warnings per 1M tokens. This is your primary "Babysitting Tax" indicator. A rate under 5 is healthy exploration; over 10 indicates a session that is likely stalling or drifting.

avg interval

The average number of tokens processed between proactive interventions. Short intervals suggest you are fighting the agent's mental model turn-by-turn.
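
Both metrics are simple ratios. The sketch below shows the arithmetic under stated assumptions: the function names are hypothetical, and the label for the band between 5 and 10 is invented here because the text only names the bands under 5 and over 10.

```python
def friction_rate(warnings: int, tokens: int) -> float:
    """Warnings per 1M tokens."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return warnings * 1_000_000 / tokens


def classify(rate: float) -> str:
    """Under 5 is healthy, over 10 is stalling or drifting; 'watch' is an assumed middle band."""
    if rate < 5:
        return "healthy"
    if rate > 10:
        return "stalling-or-drifting"
    return "watch"


def avg_interval(tokens: int, interventions: int) -> float:
    """Average tokens processed between proactive interventions."""
    return tokens / interventions if interventions else float("inf")


rate = friction_rate(warnings=3, tokens=400_000)
band = classify(rate)
interval = avg_interval(tokens=400_000, interventions=4)
```

Three warnings over 400k tokens works out to 7.5 warnings per 1M tokens, which sits between the two bands named above.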

Friction vs Context Size

Unlost buckets friction by input token count to identify your agent's Context Inflection Point.

"For most current models, we observe a stability collapse between 8k and 12k tokens. The friction rate typically doubles past this point as the 'lost in the middle' phenomenon degrades grounding."

[Figure: Typical Inflection Pattern]
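
Bucketing friction by context size can be pictured as grouping turns by input token count and computing a friction rate per group. This is a sketch under assumptions: the 4k bucket width, the friction_by_bucket name, and normalizing by input tokens within each bucket are illustrative choices, and the turn data is made up.

```python
from collections import defaultdict


def friction_by_bucket(turns, bucket_size=4_000):
    """turns: iterable of (input_tokens, warnings). Returns {bucket_start: warnings per 1M tokens}."""
    tokens = defaultdict(int)
    warnings = defaultdict(int)
    for input_tokens, warned in turns:
        start = (input_tokens // bucket_size) * bucket_size
        tokens[start] += input_tokens
        warnings[start] += warned
    return {b: warnings[b] * 1_000_000 / tokens[b] for b in sorted(tokens) if tokens[b]}


# Made-up turns: (input token count, warnings raised on that turn).
turns = [(2_000, 0), (3_000, 0), (9_000, 1), (10_000, 0), (13_000, 1), (14_000, 1)]
rates = friction_by_bucket(turns)
```

Reading the buckets in order shows where the rate starts climbing, which is the inflection point the metric is meant to expose.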

Under the Hood

Every sensor, retrieval strategy, and storage primitive that makes Unlost work.

Recording: The Silent Observer

Unlost is designed to be invisible until you need it. The recording architecture prioritizes your flow above all else:

Process Isolation

The Unlost daemon runs as a separate sidecar process. Your agent (Claude, OpenCode) talks to it through a lightweight fire-and-forget shim. If Unlost crashes or hangs, your agent keeps working.

Async Processing

Heavy lifting (LLM extraction, embedding generation, graph analysis) happens asynchronously in the background. The shim returns control to the agent immediately.

Content-Addressed Deduplication

Flush jobs are hashed by content. If an agent loops or retries the same output, Unlost silently suppresses the duplicates (within a 45s window) to keep your history clean.
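
Content-addressed deduplication with a time window can be sketched as a hash lookup keyed by payload. The 45-second window comes from the text above; the FlushDeduper class, the use of SHA-256, and measuring the window from the last accepted flush are assumptions made for illustration.

```python
import hashlib


class FlushDeduper:
    """Suppress identical payloads seen again within a time window."""

    def __init__(self, window_seconds: float = 45.0):
        self.window = window_seconds
        self._last_seen = {}  # content hash -> timestamp of last accepted flush

    def should_process(self, payload: bytes, now: float) -> bool:
        key = hashlib.sha256(payload).hexdigest()
        last = self._last_seen.get(key)
        if last is not None and now - last < self.window:
            return False  # duplicate inside the window: drop it silently
        self._last_seen[key] = now
        return True


dedupe = FlushDeduper()
results = [
    dedupe.should_process(b"same output", now=0.0),   # first sighting
    dedupe.should_process(b"same output", now=10.0),  # retry inside the window
    dedupe.should_process(b"new output", now=10.0),   # different content
    dedupe.should_process(b"same output", now=50.0),  # window has elapsed
]
```

Passing the clock in as an argument keeps the sketch deterministic; a real daemon would more likely read a monotonic clock.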

Grounding: The Code Graph

Drift happens when an agent hallucinates symbols that don't exist. Unlost catches this by maintaining a live graph of your codebase.

unfault-core + petgraph

We use unfault-core to parse your code and build a dependency graph in milliseconds. It handles symbol resolution, identifying which functions call which, and calculating centrality scores.

Symbol Verification

Every time an agent mentions a file or function, we verify it against the graph. If it exists, we link the capsule to that node. If it doesn't, we flag it as potential drift.
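
The verification step amounts to a membership check against the set of files and symbols the graph knows about. In this sketch a plain Python set stands in for the code graph built by unfault-core, and verify_mentions and refresh_session are hypothetical names used only for illustration.

```python
def verify_mentions(mentions, known_symbols):
    """Split mentioned names into graph-backed links and potential drift."""
    linked = [m for m in mentions if m in known_symbols]
    drift = [m for m in mentions if m not in known_symbols]
    return linked, drift


# A set stands in for the live code graph.
known = {"src/auth.rs", "verify_token"}
linked, drift = verify_mentions(
    ["src/auth.rs", "verify_token", "refresh_session"],  # last name is not in the graph
    known,
)
```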

Git Provenance

We capture the git HEAD SHA at the start and end of every turn. This allows us to time-travel: we know exactly what the code looked like when a decision was made, even if it has changed since.

Storage & Retrieval

Apache Arrow / LanceDB

Capsules are stored locally as Arrow RecordBatches with three indexes: ANN (vector search), LabelList (tag filtering), and Scalar (time). It's fast, private, and zero-config.

HyPE Embeddings

We use Hypothetical Prompt Embeddings (HyPE). At indexing time, we generate questions the capsule answers. At retrieval time, we frame your query as a question. Matching questions-to-questions is significantly more accurate than matching queries-to-documents.
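
A toy version of question-to-question matching helps show why the framing matters. The sketch below is not Unlost's retrieval code: a bag-of-words vector stands in for a real embedding model, the index is a short in-memory list, and the capsule ids are hypothetical. The stored questions and the framing templates are the ones quoted earlier on this page.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a real setup would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Indexing time: each capsule is stored under a question it answers.
index = [
    ("cap_timeout", "Why is the connection timeout 30 seconds?"),
    ("cap_routing", "How does the proxy route upstream requests?"),
]

# Retrieval time: each command frames the input as a question first.
FRAMES = {
    "recall": "What happened with {}?",
    "challenge": "Was the decision about {} the right call?",
    "explore": "What are the alternatives for {}?",
}


def retrieve(command, topic):
    query = embed(FRAMES[command].format(topic))
    return max(index, key=lambda item: cosine(query, embed(item[1])))[0]


recalled = retrieve("recall", "the connection timeout")
explored = retrieve("explore", "proxy upstream requests")
```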

Source Pointers & the Recurrence Channel

Every capsule carries a source pointer: a URI back to the system of record where the conversation happened. Unlost never resolves it for content — that's the agent's job. This separation keeps unlost small while giving the agent a direct line back to original context.

The Pointer Registry

Every capsule is stamped with a provenance URI from its shim:

  • claude+jsonl:// (line-precise to transcript)
  • opencode+message:// (session + message ID)
  • copilot+events:// (byte-offset into events log)
  • git+commit:// (pinned to SHA)
  • changelog+version:// (pinned to version string)
  • note+local:// (timestamp-pinned to a manual note, renders as manual note (YYYY-MM-DD) in query footers)

A resolve_source_label helper renders these as human-readable footers in every query command.
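
To show the idea of rendering a pointer as a footer, here is a small sketch that maps a URI scheme to a label. It is not the resolve_source_label helper itself, whose signature and output format are not shown on this page; the source_label function, the label wording, and the footer layout are all assumptions. The schemes are the ones listed above.

```python
SCHEME_LABELS = {
    "claude+jsonl": "Claude Code transcript",
    "opencode+message": "OpenCode message",
    "copilot+events": "Copilot events log",
    "git+commit": "git commit",
    "changelog+version": "changelog entry",
    "note+local": "manual note",
}


def source_label(uri: str) -> str:
    """Render a provenance URI as a footer line (label text is assumed)."""
    scheme, sep, _rest = uri.partition("://")
    if not sep:
        raise ValueError(f"not a source pointer: {uri!r}")
    return f"Source: {SCHEME_LABELS.get(scheme, 'unknown source')} ({uri})"


label = source_label("claude+jsonl://...abc.jsonl#L47")
```

Note that the sketch only reads the scheme. It never opens the transcript, which matches the rule that resolving a pointer for content is left to the agent.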

The Recurrence Channel

Before every LLM call, unlost runs an ANN vector search across dormant capsules. If a past capsule scores above the similarity gate, a SYSTEM NOTE is injected — quietly pointing the agent back. The note includes the capsule's decision, rationale, and the source pointer. The agent mentions it to you if it's relevant, or ignores it if not. No noise, no notifications, no new UI.
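
The similarity gate can be sketched as a best-match search that returns nothing unless a dormant capsule clears a cutoff. This is an illustration under assumptions: a brute-force scan stands in for the ANN search, the 0.8 gate and the recurrence_note name are hypothetical, and the note wording follows the sample SYSTEM NOTE shown earlier.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recurrence_note(query_vec, dormant, gate=0.8):
    """Return a SYSTEM NOTE for the best dormant capsule at or above the gate, else None."""
    best, best_score = None, gate
    for capsule in dormant:
        score = cosine(query_vec, capsule["vector"])
        if score >= best_score:
            best, best_score = capsule, score
    if best is None:
        return None  # nothing relevant enough: stay silent
    return (
        "SYSTEM NOTE: This continues a prior thread. "
        f'On {best["date"]}, the decision was: "{best["decision"]}." '
        f'Rationale: {best["rationale"]} Source: {best["source"]}'
    )


dormant = [{
    "vector": (1.0, 0.0),
    "date": "Jan 14",
    "decision": "switch to HTTP/2 for upstream proxy",
    "rationale": "...",
    "source": "claude+jsonl://...abc.jsonl#L47",
}]
note = recurrence_note((0.9, 0.1), dormant)
silent = recurrence_note((0.0, 1.0), dormant)
```

Returning None below the gate is the code-level form of favoring silence: the agent's context is only touched when a match is strong.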

Cross-Workspace Memory

The recurrence channel searches across all workspaces, not just the current one. A decision made in project A that proves relevant to project B surfaces naturally — with a label showing where it came from. Your memory isn't siloed by cd boundaries.

Point, Don't Fetch

Unlost never fetches external content, never manages API credentials, and never builds source-specific connectors. The agent is the connector. When the agent fetched a Linear ticket via MCP or a GitHub PR during the original conversation, the URL lives inside the capsule's text. The source pointer points to the chat session where it was discussed. Six months later, the agent re-fetches using the same tools it used the first time.

Research-First Discipline

Unlost isn't just a collection of heuristics. It is built on a research-first discipline where every sensor and basin is validated against real-world "Marathon" datasets.

Precision-First

We only intervene when the trajectory signal is unambiguous. We favor silence over noise to protect your flow.

Temporal Awareness

The controller respects "Coffee Pauses." It decays state across breaks to avoid misattributing human pauses to agent stalls.
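
Decaying state across a break can be as simple as an exponential falloff on idle time. The sketch below assumes a half-life model with a five-minute half-life; the page does not say which decay function or constant the controller uses, so treat the decayed name and its parameters as hypothetical.

```python
def decayed(value: float, idle_seconds: float, half_life_seconds: float = 300.0) -> float:
    """Halve accumulated friction state for every half-life of idle time."""
    if idle_seconds <= 0:
        return value
    return value * 0.5 ** (idle_seconds / half_life_seconds)


# A ten-minute pause spans two assumed half-lives, so state drops to a quarter.
after_break = decayed(0.8, idle_seconds=600)
no_break = decayed(0.8, idle_seconds=0)
```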

Academic Alignment

Our basins are aligned with EASE '25 research on emotional strain and with a Scientific Reports (Nature) study on how conversational fluency can mask inaccuracy.

Codebase Grounding

We use unfault-core for sub-second symbol graph validation, ensuring drift detection is backed by factual codebase state.

Scientific References

Cristina Martinez Montes and Ranim Khojah. 2025. Emotional Strain and Frustration in LLM Interactions in Software Engineering. In Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering (EASE '25). DOI: 10.1145/3756681.3756951

Zhu, Y., Wu, Y., & Miller, J. 2024. Conversational presentation mode increases credibility judgements during information search with ChatGPT. Scientific Reports (Nature). DOI: 10.1038/s41598-024-67829-6

Ma, L., et al. 2025. HyPE: Hypothetical Prompt Embeddings for Improved Retrieval. SSRN preprint. papers.ssrn.com/sol3/papers.cfm?abstract_id=5139335. Demonstrates that pre-generating questions at indexing time and matching query-to-question rather than query-to-document significantly improves retrieval precision.

Who's behind this?

I built Unlost because the intimacy I had with code started slipping once agents were doing the writing. I still wanted to feel close to what was being built - to understand the tradeoffs, to own the decisions, not just approve the diff. Unlost is my attempt to keep that feeling alive in a world where I'm not the one typing every line.

If something feels rough, open an issue. I read them. github.com/unfault/unlost/issues

- Sylvain