Finisky Garden

NLP, Software Engineering, Product Design

After OpenClaw crossed 350K stars, a narrative started forming in the community: since both run on Opus 4.6 under the hood, the open-source option should be on par with Claude Code. Anyone who has actually used both probably shares the same observation — in long sessions, OpenClaw starts losing context, forgetting files it already read, redoing work it already did. Claude Code does too, but noticeably later, and it recovers much better.

Same model, different experience. Why?

Cursor’s parent company Anysphere has about 150 employees. In November 2025, its ARR crossed $1 billion. OpenAI, as of early 2026, has 4,500 employees. Its 2025 revenue was $13.1 billion, but according to Fortune, it lost roughly $9 billion and doesn’t expect to turn profitable until 2028.

An application company that trains zero models is outproducing, per capita, the company that trains them. This is the most telling signal in AI for 2025.

In late November 2025, an open-source project called OpenClaw went live on GitHub. Four and a half months later, it had 350K stars, 70K forks, 81 releases, and sponsorships from OpenAI, NVIDIA, and Vercel. For comparison: Open WebUI took two and a half years to reach 130K stars; NextChat took three years to hit 88K. Growth like OpenClaw’s is rare in GitHub’s history.

It isn’t a new model, a training framework, or even a “technical breakthrough” in the traditional sense. It’s a personal AI assistant that runs on your own machine and talks to you through the chat apps you already use: WhatsApp, Telegram, Slack, Discord, WeChat, Feishu, iMessage, Matrix, and others, more than 25 platforms in total, all connected to a single backend.

Why did it break out of the developer bubble?

When you use Claude Code, there’s something you probably never notice: it has over 40 registered tools, but when you ask it to read a file or edit a few lines of code, it only uses three or four. The definitions for the remaining 30-plus tools, each around 500 tokens, add up to over 10,000 tokens of fixed overhead per request. You just want to change one line of CSS, but you’re paying for WebSearch, NotebookEdit, CronCreate, and a bunch of tools you’ll never touch.
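As a sanity check on that arithmetic, using the post's own rough estimates (about 40 registered tools, three or four actually used, roughly 500 tokens per definition):

```python
# Rough arithmetic behind the fixed-overhead claim above. The numbers are
# the post's estimates, not measured values.
n_unused_tools = 36          # ~40 registered minus the 3-4 actually used
tokens_per_definition = 500  # approximate size of one tool schema
overhead = n_unused_tools * tokens_per_definition
print(overhead)              # 18000 tokens of dead weight per request
```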

Claude Code’s Edit tool has a deceptively simple interface: give it an old_string, give it a new_string, and it finds the former in a file and replaces it with the latter. Sounds like nothing more than a str.replace(). But in the context of an LLM Agent, this seemingly trivial operation is backed by an entire engineering pipeline spanning everything from string sanitization to concurrency safety. The model stuffs line numbers into its replacement strings. It conjures curly quotes out of thin air. External tools modify the target file while the user is still reviewing the permission dialog. The Edit tool has to stay correct through all of this — far more than find-and-replace can handle.

From observing its behavior, the Edit tool’s execution breaks down into three phases: API-layer preprocessing (before the tool even receives input), input validation (before the permission dialog is shown), and the actual write (after the user approves). Each phase handles a distinct class of problems and maintains deliberate sync/async boundaries.
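As an illustration of the early phases, here is a minimal sketch of the kind of input normalization such a tool might run before attempting the replacement. The function names and rules are invented for this sketch, not Claude Code's actual code: strip line-number prefixes the model stuffs into strings, map curly quotes back to ASCII, and refuse any replacement whose target is not unique in the file.

```python
import re

def strip_line_numbers(s: str) -> str:
    """Remove 'N→' or 'N:' style line-number prefixes a model may emit."""
    return re.sub(r"(?m)^\s*\d+[→:]\s?", "", s)

def normalize_quotes(s: str) -> str:
    """Map curly quotes the model conjures back to their ASCII forms."""
    table = {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}
    return s.translate(str.maketrans(table))

def safe_replace(content: str, old: str, new: str) -> str:
    # Try the raw string first, then progressively sanitized variants;
    # only apply a replacement when the match is unambiguous.
    for fix in (lambda x: x, strip_line_numbers, normalize_quotes):
        candidate = fix(old)
        if content.count(candidate) == 1:
            return content.replace(candidate, fix(new))
    raise ValueError("old_string not found or not unique")
```

The uniqueness check is the important design choice: replacing an ambiguous match silently would corrupt the file, so a real tool would rather fail loudly and ask the model to provide more context.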

Claude Code has a mode that appears in no documentation whatsoever. When active, it systematically erases every trace of AI involvement. No Co-Authored-By trailer, no “Generated with Claude Code” footer, and the system prompt itself doesn’t even tell the model what it is. This mode is called Undercover Mode. It exists only in Anthropic’s internal builds; external users will never see it, because dead code elimination strips the entire feature out during public builds.

The behavioral implications are telling: this mechanism exists because Anthropic employees routinely use Claude Code to commit to public repositories. Without some form of protection, commit messages might contain unreleased model codenames, PR descriptions might expose internal project names, and model identifiers in the system prompt could leak through one vector or another. Undercover Mode is designed to plug all of these holes.

When tackling complex tasks, Claude Code spawns multiple sub-agents in parallel, each needing the full parent conversation context. If the parent has accumulated 100K tokens and three sub-agents are spawned, a naive implementation charges 300K tokens of input.

Anyone familiar with LLM inference optimization will recognize this immediately: it’s a KV Cache sharing problem. When multiple requests share the same prefix, the Attention layer’s Key/Value tensors can be reused, skipping redundant computation. Anthropic exposes this capability to API users as Prompt Cache, offering a 90% discount on cached prefix portions — but only if the prefix bytes are exactly identical across requests. Claude Code’s fork sub-agents are deliberately constructed so that over 99% of the bytes are identical, compressing the effective input cost of three sub-agents to roughly 120K token-equivalent (100K full price + 2 × 100K × 10%).
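The cost math above can be made concrete with a small model. The assumption here, matching the parenthetical, is that the first sub-agent request pays full price to write the cache, each later request reads the shared prefix at the 90% discount, and any cache-write surcharge is ignored:

```python
# Back-of-envelope model of the prompt-cache arithmetic described above.
def effective_input_tokens(prefix: int, n_agents: int,
                           cache_discount: float = 0.9) -> float:
    first = prefix                                  # cache write, full price
    rest = (n_agents - 1) * prefix * (1 - cache_discount)  # cached reads
    return first + rest

# Three sub-agents forked from a 100K-token parent context:
print(round(effective_input_tokens(100_000, 3)))    # 120000, not 300000
```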

A 200K context window sounds generous — until you’re in a moderately complex coding session. Read a few dozen files, run several rounds of grep, execute some bash commands, and you’ve already burned through most of it. Compaction is inevitable, but compaction itself costs money: you need an LLM call to generate a summary, and the input to that call is the very context you’re trying to compress. This creates a fascinating engineering trade-off: compact too early and you lose useful information; compact too late and the window overflows; and the cost of compaction itself can’t be ignored. Claude Code’s answer is a multi-layer cascade: avoid compaction if you can, do it cheaply if you must, and only call the LLM as a last resort.
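A minimal sketch of such a cascade follows. The thresholds and strategy names are invented for this sketch; the post only establishes the ordering (avoid compaction, then do it cheaply, then call the LLM as a last resort):

```python
# Illustrative decision function for a multi-layer compaction cascade.
# The 70%/90% cut-offs are assumptions, not Claude Code's real values.
def compaction_strategy(used_tokens: int, window: int = 200_000) -> str:
    ratio = used_tokens / window
    if ratio < 0.70:
        return "none"            # plenty of headroom: avoid compaction
    if ratio < 0.90:
        return "cheap"           # drop stale tool outputs, no LLM call
    return "llm_summarize"       # last resort: pay for a summary call

print(compaction_strategy(50_000))    # none
print(compaction_strategy(150_000))   # cheap
print(compaction_strategy(190_000))   # llm_summarize
```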

The security challenge of AI-executed Bash commands isn’t “should we trust the model” — it’s “how do we make sure a command actually means what it looks like.”

Claude Code lets AI execute Bash commands directly. Not through a structured interface like MCP; it literally gives the model a shell. MCP’s approach wraps tools into JSON schemas, which is safe enough, but you can’t realistically write adapters for thousands of CLI tools. The capability ceiling is obvious. A raw shell can do anything; the tradeoff is that the security problem shifts from “controlling interface permissions” to “figuring out what a command actually does.” In the previous post, I covered how the YOLO Classifier uses AI to review AI, but the Classifier works with the full command string and makes a semantic judgment: is this operation dangerous? Before that judgment even happens, there’s a deeper question that needs answering: does this command actually mean what it appears to mean?
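To make the “does it mean what it looks like” problem concrete, here is a hedged sketch of a pre-check that refuses to vouch for commands whose meaning can change at runtime. The marker list is illustrative, not exhaustive, and certainly not Claude Code's actual logic:

```python
import shlex

# Shell features like command substitution mean the text a reviewer sees
# is not necessarily the command that runs: `echo $(rm -rf /)` looks like
# an echo but executes a delete. This checker only trusts commands whose
# surface form is static.
DYNAMIC_MARKERS = ("$(", "`", "${", "eval", "|", ";", "&&")

def is_statically_analyzable(cmd: str) -> bool:
    """True only if the command has no constructs that rewrite its meaning at runtime."""
    if any(m in cmd for m in DYNAMIC_MARKERS):
        return False
    try:
        shlex.split(cmd)            # must also tokenize cleanly
        return True
    except ValueError:
        return False

print(is_statically_analyzable("ls -la src/"))        # True
print(is_statically_analyzable("echo $(rm -rf /)"))   # False
```

Anything flagged here would need deeper handling (parsing, expansion, or an outright refusal) before a semantic classifier can meaningfully judge it.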

Claude Code has an auto mode that executes operations without confirmation. But “auto” doesn’t mean “unreviewed” — there’s a classifier watching every action.

The Auto Mode Paradox

One of the most annoying things about using Claude Code is the permission popups. Every Bash command, every file write requires a confirmation click. Power users turn on auto mode, letting Claude execute everything autonomously without asking.

This creates an obvious problem: what if the model decides to rm -rf /, push to the production branch, or write a backdoor into .bashrc?
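One plausible shape for this gate, sketched under my own assumptions (the deny patterns below are examples, not Claude Code's real rules): a fast pattern pre-filter escalates obviously destructive commands back to a confirmation dialog, and everything else is handed to the semantic classifier.

```python
import re

# Hypothetical fast pre-filter for auto mode. Real systems would pair
# this with the model-based classifier the post describes.
DENY_PATTERNS = [
    r"\brm\s+-rf\s+/(\s|$)",                  # wipe the filesystem root
    r"git\s+push\s+.*\b(main|master|prod)\b", # pushing to protected branches
    r">>?\s*~?/?\.bashrc",                    # writing to shell startup files
]

def auto_mode_verdict(cmd: str) -> str:
    if any(re.search(p, cmd) for p in DENY_PATTERNS):
        return "ask_user"             # escalate to a confirmation dialog
    return "classify"                 # hand off to the semantic classifier

print(auto_mode_verdict("rm -rf / "))             # ask_user
print(auto_mode_verdict("echo evil >> ~/.bashrc")) # ask_user
print(auto_mode_verdict("ls -la"))                 # classify
```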

Claude Code has no vector database and no embedding index, yet it can pinpoint the exact file you need in a million-line codebase. Behind this is a retrieval architecture completely different from traditional RAG.

This Isn’t the RAG You Know

If you’ve used RAG before, the pipeline should be familiar: build an offline index, user asks a question, vector-search for Top-K chunks, inject into prompt, generate an answer. A straight line, one pass, done.

Claude Code doesn’t work like that at all. It has no offline index. The model itself drives the retrieval process.
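A toy version of the idea: instead of querying a prebuilt vector index, the agent calls grep/glob-style tools and lets the model decide the next search. The helper below is a stand-in for such a Grep tool; the name and signature are mine, not Claude Code's:

```python
import re
from pathlib import Path

def grep_repo(root: str, pattern: str, ext: str = ".py") -> list[str]:
    """A tiny Grep tool: files under root whose text matches pattern."""
    hits = []
    for path in Path(root).rglob(f"*{ext}"):
        try:
            if re.search(pattern, path.read_text(errors="ignore")):
                hits.append(str(path))
        except OSError:
            continue                  # unreadable file: skip, don't crash
    return hits
```

In a real agent loop, the model inspects the hits, refines the pattern, and repeats until it has the files it needs; no offline index ever exists.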

Nassim Taleb recounts a classic thought experiment in Fooled by Randomness: given infinite monkeys typing on infinite typewriters, one of them will eventually produce the complete text of the Iliad.

The more I think about it, the more I believe this story’s endgame is today’s large language models.

Developers who’ve used Claude Code probably share this experience: even in an ultra-long conversation where dozens of files have been modified, it seems to always “remember” what it did before. Even more remarkably, if you told it “I prefer bun over npm” in a previous session, it automatically follows that preference next time.

Behind this is a sophisticated memory management system. Let’s tear apart Claude Code’s memory mechanism layer by layer.
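Before that teardown, the core idea fits in a few lines: preferences persist as plain text in a memory file and are prepended to the next session's context. CLAUDE.md is Claude Code's documented memory file name; the helper functions here are my own sketch, not its implementation:

```python
from pathlib import Path

def remember(memory_file: Path, note: str) -> None:
    """Append a preference like 'I prefer bun over npm' to the memory file."""
    with memory_file.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")

def build_system_context(memory_file: Path) -> str:
    """Prepend stored preferences to the next session's system context."""
    if not memory_file.exists():
        return ""
    return "User preferences:\n" + memory_file.read_text(encoding="utf-8")
```

Because the memory is just a file, it survives across sessions for free, and the user can read or edit it directly.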

I haven’t updated my blog in over a year. Not out of laziness, nor because I’ve fallen behind on technology. It’s more of a conviction: once AI became powerful enough, the value of technical blogs dropped significantly. People shifted from searching and reading to learning through direct conversations with AI. On top of that, AI-generated content floods social media the moment anything happens, making me feel there’s little point in writing after the fact. Blog traffic has plummeted over the past year, which further killed my motivation to spend hours crafting a post. I miss the days when every article was painstakingly typed out, word by word, over hours or even days.

AI tools have made remarkable progress in the past six months. As a heavy user, I want to talk about this: if agents are supposed to be our helpers, why do we feel more exhausted than ever?

When compiling a LaTeX document that uses a custom font in TeX Live on Windows, you might encounter the following error:

kpathsea: Running mktextfm Fontin

The command name is F:\texlive\2025\bin\windows\mktextfm
name = Fontin, rootname = Fontin, pointsize =
mktexmf: empty or non-existent rootfile!

kpathsea: Running mktexmf Fontin.mf

The command name is F:\texlive\2025\bin\windows\mktexmf
Cannot find Fontin.mf.
kpathsea: Appending font creation commands to missfont.log.

kpathsea: Running mktextfm Fontin

The command name is F:\texlive\2025\bin\windows\mktextfm
name = Fontin, rootname = Fontin, pointsize =
mktexmf: empty or non-existent rootfile!

kpathsea: Running mktexmf Fontin.mf

The command name is F:\texlive\2025\bin\windows\mktexmf
Cannot find Fontin.mf.

This error occurs because the TeX system cannot find the necessary font files (specifically .mf or .tfm files) for Fontin, and fails to generate them. The puzzling part is that the font is already installed on the system.

MongoDB’s Aggregation Pipeline is a powerful tool for processing and analyzing data, suitable for both real-time queries and offline data analysis. It allows developers to use multiple stages to transform, filter, group, and sort data, enabling efficient execution of complex computations. This article will explore the basic concepts, application examples, performance analysis, and optimization strategies of the Aggregation Pipeline.
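To make the stage semantics concrete without a running MongoDB instance, here is the same computation in plain Python. It mirrors a two-stage pipeline, [{"$match": {"status": "paid"}}, {"$group": {"_id": "$region", "total": {"$sum": "$amount"}}}], on invented sample data:

```python
from collections import defaultdict

# Invented sample documents standing in for a MongoDB collection.
orders = [
    {"region": "EU", "status": "paid", "amount": 10},
    {"region": "EU", "status": "void", "amount": 99},
    {"region": "US", "status": "paid", "amount": 25},
    {"region": "EU", "status": "paid", "amount": 5},
]

# $match: keep only documents satisfying the predicate.
matched = [o for o in orders if o["status"] == "paid"]

# $group with $sum: accumulate amounts per region key.
totals: dict[str, int] = defaultdict(int)
for o in matched:
    totals[o["region"]] += o["amount"]

print(dict(totals))  # {'EU': 15, 'US': 25}
```

The real pipeline runs the same dataflow server-side, stage by stage, which is what makes stage ordering the main lever for performance.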

Today, when I opened Chrome, I suddenly received a prompt saying that SwitchyOmega “may soon no longer be supported because it doesn’t follow best practices for Chrome extensions.” It seems the extension was disabled after Chrome automatically updated. More bad news: the Stylish extension has also been rendered unusable for the same reason.

Moreover, the Chrome Web Store is inaccessible, preventing the installation of other extensions—deadlock.

The double-spending problem is a critical challenge in transaction systems, especially when managing account balances or funds. It occurs when a system allows the same funds to be spent multiple times due to concurrent operations or race conditions. In this article, we explore two approaches to resolving this issue using MongoDB: transaction-based handling and versioning-based handling.

This post is an in-depth discussion of the double-spending problem from the Building a Transaction System with MongoDB blog.
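A minimal sketch of the versioning approach, with an in-memory dict standing in for a MongoDB document; in practice the version check would live in the filter of a find_one_and_update call. The field names are illustrative:

```python
# Optimistic concurrency control: a debit succeeds only if the document's
# version is unchanged since it was read, so two concurrent debits based
# on the same read cannot both apply.
def debit(account: dict, amount: int, expected_version: int) -> bool:
    if account["version"] != expected_version:
        return False                 # someone else spent first: caller retries
    if account["balance"] < amount:
        return False                 # insufficient funds
    account["balance"] -= amount
    account["version"] += 1
    return True

acct = {"balance": 100, "version": 0}
v = acct["version"]                  # both "requests" read version 0
print(debit(acct, 60, v))            # True: first spend lands
print(debit(acct, 60, v))            # False: stale version, double-spend blocked
```

The transaction-based alternative reaches the same guarantee by wrapping read and write in a single multi-document transaction instead of retrying on version conflicts.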

After a Windows Update, RDP on Windows 11 might stop working correctly. Symptoms include a black screen upon connection, no visible mouse or interface, and an automatic disconnection after about a minute. This forces users to return to the local machine to investigate the issue.

If you’ve ever struggled to set the correct timezone for your cron jobs on Ubuntu 22.04, you’re not alone. In this blog, we’ll walk you through a troubleshooting journey that highlights common pitfalls and the ultimate solution.