How Claude Dreams: Background Memory Defragmentation

There's a module inside Claude Code called autoDream. Its prompt title reads "Dream: Memory Consolidation."

This isn't a metaphor. Claude Code actually spins up a background sub-agent that reviews transcripts from past sessions, consolidates scattered memories — merging, deduplicating, correcting — and writes them back to disk. The whole thing is invisible unless you dig into the background task list.

Sticky Notes vs. Notebook Rewrite

Claude Code has two memory mechanisms with a clean division of labor.

extractMemories runs at the end of each conversation turn. It only looks at recently added messages in the current session and decides if anything is worth remembering long-term. Tell it "we use bun, not npm" and it'll save that preference. Fast — 2 to 4 turns and it's done, like jotting something on a sticky note. If you've already had Claude write to memory during the conversation, it skips the extraction to avoid duplicate work.
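The gating logic that paragraph describes can be sketched in a few lines. Everything here is my invention, not the real API: the function name, the return shape, and the `remember:` prefix standing in for the LLM's "is this worth keeping?" judgment.

```python
def maybe_extract(messages: list[str], last_index: int,
                  already_wrote_memory: bool) -> tuple[list[str], int]:
    """Turn-end extraction: look only at messages added since the last
    pass, and skip entirely if memory was already written this session."""
    if already_wrote_memory:
        return [], last_index            # avoid duplicate work

    new = messages[last_index:]
    # Stand-in for the LLM judgment "is this worth remembering long-term?"
    facts = [m for m in new if m.lower().startswith("remember:")]
    return facts, len(messages)
```

The two parameters carry the state the paragraph implies: a cursor into the transcript so only new messages are examined, and a flag so explicit memory writes suppress the extraction pass.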

autoDream is a different beast entirely. It doesn't look at a single conversation. It waits until several sessions have accumulated, then reads through past transcripts and existing memory files, merging related information scattered across files, deleting stale memories, and trimming the index to a reasonable length.

One takes notes. The other reorganizes the notebook.

Three Gates

autoDream doesn't just fire whenever it wants. It has three layers of checks, ordered from cheapest to most expensive — any layer failing means the whole thing is skipped:

+------------------+     +------------------+     +------------------+
|  Gate 1: Time    |---->| Gate 2: Sessions |---->|  Gate 3: Lock    |
|                  |     |                  |     |                  |
|  >= 24h since    |     |  >= 5 sessions   |     |  No other dream  |
|  last dream?     |     |  touched since?  |     |  in progress?    |
|                  |     |                  |     |                  |
|  Cost: 1 stat()  |     |  Cost: dir scan  |     |  Cost: file I/O  |
+------------------+     +------------------+     +------------------+
       |                        |                        |
    No: skip               No: skip                 No: skip

The ordering itself is a design choice. The first gate reads a single file's modification time — near-zero cost. If a dream happened within 24 hours, everything else is skipped. The second gate scans the session directory, slightly more expensive, so it's second. The scan is also throttled to once every 10 minutes, preventing repeated directory traversals when the time gate passes but the session gate doesn't.
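Gates 1 and 2, with the throttle, might look like this sketch. The lock filename and the thresholds come from the article; the `*.jsonl` session layout, the `should_dream` name, and the module-level throttle state are my assumptions. Gate 3, the lock acquisition, is what the article walks through next.

```python
import time
from pathlib import Path

DREAM_INTERVAL = 24 * 60 * 60   # gate 1: at least 24h between dreams
MIN_SESSIONS = 5                # gate 2: sessions touched since last dream
SCAN_THROTTLE = 10 * 60         # gate 2 re-runs at most every 10 minutes

_last_scan = 0.0                # per-process throttle state

def should_dream(memory_dir: Path, sessions_dir: Path) -> bool:
    global _last_scan

    # Gate 1: one stat() on the lock file, near-zero cost.
    lock = memory_dir / ".consolidate-lock"
    last_dream = lock.stat().st_mtime if lock.exists() else 0.0
    if time.time() - last_dream < DREAM_INTERVAL:
        return False

    # Gate 2: a directory scan, throttled to once per 10 minutes.
    if time.time() - _last_scan < SCAN_THROTTLE:
        return False
    _last_scan = time.time()
    touched = [p for p in sessions_dir.glob("*.jsonl")
               if p.stat().st_mtime > last_dream]
    return len(touched) >= MIN_SESSIONS
```

Note that the throttle makes the function deliberately conservative: a throttled call reports "don't dream" rather than reusing a stale scan result.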

The third gate is the interesting one. Multiple Claude Code instances may be running for the same project, and they shouldn't dream simultaneously. The lock is a single .consolidate-lock file pulling triple duty: its modification time is the "when did we last dream" timestamp, eliminating the need for separate storage. Its content holds the owning process's PID — other processes that see a live PID yield. The lock has a 1-hour TTL: if more than an hour passes, it's forcibly reclaimed regardless of PID status, guarding against PID reuse by the OS causing a permanent deadlock. In Python (the actual source is TypeScript):

import os
import time
from pathlib import Path

LOCK_TTL = 60 * 60  # 1 hour

def is_process_running(pid: int) -> bool:
    try:
        os.kill(pid, 0)       # signal 0: existence check, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True           # exists, but owned by another user
    return True

def try_acquire_lock(memory_dir: Path):
    lock_path = memory_dir / ".consolidate-lock"

    # Read the existing lock's mtime and holder PID, if any
    mtime, holder_pid = None, None
    if lock_path.exists():
        mtime = lock_path.stat().st_mtime
        holder = lock_path.read_text().strip()
        holder_pid = int(holder) if holder.isdigit() else None

    # Lock is fresh and its holder is alive: yield
    if mtime and (time.time() - mtime) < LOCK_TTL:
        if holder_pid and is_process_running(holder_pid):
            return None

    # Write our PID (an absent or stale lock is simply reclaimed)
    lock_path.write_text(str(os.getpid()))

    # Double-check: if two processes wrote simultaneously, last write wins
    if lock_path.read_text() != str(os.getpid()):
        return None  # lost the race

    return mtime or 0  # old mtime, kept for rollback on failure

This write-then-verify pattern can't fully prevent races. But dreaming is idempotent — worst case, two processes dream simultaneously and one overwrites the other's results. The next dream will reconsolidate, so losing one round causes no lasting harm.

This "because the operation is idempotent, the lock can be imperfect" philosophy runs through the entire autoDream design. Dream failed? Roll back the lock file's mtime so the next session retries. Sub-agent garbled a memory file? The next dream will re-examine it. No step needs to be perfectly correct as long as the system converges to correctness over time.
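The rollback itself can be tiny. Assuming the acquire step returned the lock's old mtime, restoring it is a single `os.utime` call, which reopens the 24-hour gate so the next session retries (the function name is illustrative):

```python
import os

def rollback_lock(lock_path: str, old_mtime: float) -> None:
    # Restore the pre-dream timestamp so the time gate opens again
    # and the next session retries the failed dream.
    os.utime(lock_path, (old_mtime, old_mtime))
```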

Trade-offs Hidden in the Prompt

Once all three gates pass, autoDream launches a sub-agent with a four-phase prompt:

+-------------------------------------------------------+
|                  Dream Prompt                         |
|                                                       |
|  Phase 1: Orient                                      |
|  ls memory dir, read MEMORY.md index,                 |
|  skim existing topic files                            |
|           |                                           |
|           v                                           |
|  Phase 2: Gather                                      |
|  1. Check daily logs (highest priority)               |
|  2. Find drifted memories                             |
|  3. Grep transcripts for narrow terms (last resort)   |
|           |                                           |
|           v                                           |
|  Phase 3: Consolidate                                 |
|  Merge related memories, fix stale facts,             |
|  convert relative dates to absolute                   |
|           |                                           |
|           v                                           |
|  Phase 4: Prune                                       |
|  Update MEMORY.md index (keep < 200 lines),           |
|  remove stale pointers, resolve contradictions        |
+-------------------------------------------------------+

Phase 1 and Phase 4 are straightforward. Phase 1 surveys what's currently in the memory directory. Phase 4 trims the index file to under 200 lines and ~25KB.
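The real pruning is the sub-agent's semantic judgment about what to cut, but the budget itself is mechanical. A hypothetical guardrail for the ~200-line / ~25KB cap:

```python
MAX_LINES = 200
MAX_BYTES = 25_000

def over_budget(index_text: str) -> bool:
    # Phase 4's target: the MEMORY.md index stays under ~200 lines
    # and ~25KB of UTF-8 text.
    lines = index_text.count("\n") + 1
    return lines > MAX_LINES or len(index_text.encode("utf-8")) > MAX_BYTES
```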

Phase 2 is where it gets interesting. It tells the sub-agent to gather new information, but with an explicit priority ranking: daily logs first, memories that have drifted from the current codebase second, session transcripts last. The prompt says: "don't read transcripts end-to-end, only grep for terms you think matter."

This is a very practical token budget trade-off. Session transcripts can be enormous, and having the sub-agent read them from top to bottom would burn through input tokens. But ignoring them entirely risks missing important information. The compromise is letting the model decide what to search for, using grep to extract by keyword. This delegates the "how much to read" decision to the model itself, balancing cost against coverage.
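A sketch of that keyword extraction, under assumptions: the transcript layout and `*.jsonl` naming are guesses, and in the real system the model chooses the search terms itself.

```python
import re
from pathlib import Path

def grep_transcripts(transcript_dir: Path, terms: list[str],
                     context: int = 2) -> list[str]:
    """Return only the lines matching the chosen terms, with a little
    surrounding context, instead of reading whole transcripts."""
    pattern = re.compile("|".join(map(re.escape, terms)), re.IGNORECASE)
    hits = []
    for path in transcript_dir.glob("*.jsonl"):
        lines = path.read_text().splitlines()
        for i, line in enumerate(lines):
            if pattern.search(line):
                lo, hi = max(0, i - context), i + context + 1
                hits.append("\n".join(lines[lo:hi]))
    return hits
```

Token cost now scales with the number of matches rather than transcript length, which is the whole point of the "grep, don't read" instruction.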

Phase 3 has an easy-to-miss but critical rule: relative time references like "last week" and "yesterday" must be converted to absolute dates. Memory files might not be read again for weeks or months — by then, "last week" could mean anything.
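The conversion is mechanical once you know the date the dream runs on. A minimal sketch; the phrase list is obviously incomplete, and the real work is the model's:

```python
import datetime as dt
import re

def absolutize(text: str, today: dt.date) -> str:
    # "yesterday" and "last week" mean nothing months later; pin them down.
    subs = {
        r"\byesterday\b": str(today - dt.timedelta(days=1)),
        r"\blast week\b": f"the week of {today - dt.timedelta(days=7)}",
    }
    for pat, absolute in subs.items():
        text = re.sub(pat, absolute, text, flags=re.IGNORECASE)
    return text
```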

The sub-agent's permissions are worth noting briefly. It can read any file, search the codebase, grep session transcripts, but can only edit files inside the memory directory. No git, no npm, no MCP tools, no spawning other agents. Can't rm files, but can clear content via Write. This guarantees dreaming produces zero side effects on project code. It shares the main conversation's prompt cache, so cache reads cost about a tenth of full input — making each dream cheap in practice.
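The write-confinement rule reduces to a containment check. This is a sketch of how such a guard typically works, not Claude Code's actual code:

```python
from pathlib import Path

def can_edit(target: Path, memory_dir: Path) -> bool:
    # Resolve symlinks and ".." before comparing, so a path like
    # memory/../src/main.py can't escape the memory directory.
    try:
        target.resolve().relative_to(memory_dir.resolve())
        return True
    except ValueError:
        return False
```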

Bad Dreams

The dreaming sub-agent is itself an LLM, and LLMs hallucinate when synthesizing information. When merging two memories it might introduce details that weren't in the original records. When "correcting" a stale fact it might change something right into something wrong.

But circle back to that core principle: dreaming is idempotent. If this round's consolidation goes wrong, the next dream will re-examine the same files. A memory that was incorrectly modified has a chance of being corrected in the next cycle. Memory files are plain text — you can open them anytime, see exactly what's recorded, and fix anything that looks off.

This does resemble the unreliability of human memory. Are you sure that childhood event happened the way you remember it? The brain reconstructs each time it recalls. The difference is you can't open your hippocampus and check the diff.

Learning to Forget

The memory system has an explicit spec for what should not be stored: no code patterns, no architecture, no file structure, no git history. The reasoning is direct — this information can be derived in real time from the codebase. Storing it as memory is redundant.

This constraint applies during dreaming too. When the sub-agent is consolidating and finds a memory that records something derivable from the current code, it should delete it.

Dreaming isn't just organizing scattered sticky notes into a notebook. It's also throwing away the notes you don't need anymore. The engineering effort spent on deciding what to forget is no less than what's spent on deciding what to remember.