Finisky Garden

NLP, Software Engineering, Product Design

After OpenClaw crossed 350K stars, a narrative started forming in the community: since both run on Opus 4.6 under the hood, the open-source option should be on par with Claude Code. Anyone who has actually used both probably shares the same observation — in long sessions, OpenClaw starts losing context, forgetting files it already read, redoing work it already did. Claude Code does too, but noticeably later, and it recovers much better.

Same model, different experience. Why?



Cursor's parent company Anysphere has about 150 employees. In November 2025, its ARR crossed $1 billion. OpenAI, as of early 2026, has 4,500 employees. Its 2025 revenue was $13.1 billion, but according to Fortune, it lost roughly $9 billion and doesn't expect to turn profitable until 2028.

An application company that trains zero models is outproducing, per capita, the company that trains them. This is the most telling signal in AI for 2025.
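The per-capita claim can be checked directly from the figures quoted above. This is a rough sketch: it treats Anysphere's ARR as comparable to OpenAI's annual revenue (an assumption, since ARR is a run rate rather than booked revenue) and uses the approximate headcounts as given.

```python
# Back-of-the-envelope revenue per employee, using the figures quoted above.
# Assumption: ARR (a run rate) is treated as comparable to annual revenue.

def revenue_per_employee(revenue_usd: float, employees: int) -> float:
    """Return revenue per employee in US dollars."""
    return revenue_usd / employees

anysphere = revenue_per_employee(1.0e9, 150)   # Cursor's parent, ~150 people
openai = revenue_per_employee(13.1e9, 4500)    # 2025 revenue, ~4,500 people

print(f"Anysphere: ${anysphere / 1e6:.1f}M per employee")
print(f"OpenAI:    ${openai / 1e6:.1f}M per employee")
print(f"Ratio:     {anysphere / openai:.1f}x")
```

On these numbers the application company brings in roughly $6.7M per employee against roughly $2.9M for the model company, a gap of about 2.3x.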



In late November 2025, an open-source project called OpenClaw went live on GitHub. Four and a half months later, it had 350K stars, 70K forks, 81 releases, and sponsorships from OpenAI, NVIDIA, and Vercel. For comparison: Open WebUI took two and a half years to reach 130K stars; NextChat took three years to hit 88K. Growth like OpenClaw's is rare in GitHub's history.

It isn't a new model, a training framework, or even a "technical breakthrough" in the traditional sense. It's a personal AI assistant that runs on your own machine and talks to you through the chat apps you already use: WhatsApp, Telegram, Slack, Discord, WeChat, Feishu, iMessage, Matrix, and others, more than 25 platforms in total, all connected to a single backend.

Why did it break out of the developer bubble?



Claude Code registers over 40 tools, yet the average request only uses 3–4 of them. Stuffing every tool's JSON Schema into the system prompt is expensive: each tool's description plus parameter definitions runs about 500 tokens, so 40 tools cost roughly 20K tokens. When all the user wants is to read a file, paying for WebSearch, NotebookEdit, and CronCreate — tools that have nothing to do with the task — is a bad deal.

Claude Code's answer is deferred loading: hide infrequently used tools, expose only their names, and let the model pull in full schemas on demand through a discovery tool. This turns the token cost of tool definitions from a fixed overhead into a pay-as-you-go expense.
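The idea can be sketched in a few lines. This is a minimal illustration, not Claude Code's implementation: `ToolRegistry`, `discover`, and the `TOKENS_PER_NAME` figure are hypothetical names and numbers introduced here, and only the 500-tokens-per-schema estimate comes from the text above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

TOKENS_PER_SCHEMA = 500  # rough per-tool cost quoted in the text
TOKENS_PER_NAME = 5      # assumption: a bare tool name costs a few tokens


@dataclass
class ToolRegistry:
    """Hypothetical registry: core tools ship full schemas, the rest only names."""
    schemas: dict[str, dict]                       # every tool's full schema
    core: set[str]                                 # tools always sent in full
    loaded: set[str] = field(default_factory=set)  # deferred tools loaded so far

    def deferred_names(self) -> list[str]:
        """Names the model can see whose schemas are not yet in the prompt."""
        return sorted(set(self.schemas) - self.core - self.loaded)

    def discover(self, name: str) -> dict:
        """The discovery tool: pull one schema into the prompt on demand."""
        if name not in self.schemas:
            raise KeyError(f"unknown tool: {name}")
        if name not in self.core:
            self.loaded.add(name)
        return self.schemas[name]

    def prompt_tokens(self) -> int:
        """Estimated token cost of tool definitions for the next request."""
        full = len(self.core | self.loaded)
        return full * TOKENS_PER_SCHEMA + len(self.deferred_names()) * TOKENS_PER_NAME


# Usage: 40 registered tools, 4 of them always loaded.
names = [f"Tool{i}" for i in range(40)]
registry = ToolRegistry(schemas={n: {"name": n} for n in names}, core=set(names[:4]))

eager_cost = len(names) * TOKENS_PER_SCHEMA  # every schema up front
deferred_cost = registry.prompt_tokens()     # 4 schemas + 36 bare names
registry.discover("Tool10")                  # model asks for one more tool
after_discovery = registry.prompt_tokens()   # 5 schemas + 35 bare names

print(eager_cost, deferred_cost, after_discovery)
```

Under these assumed numbers the fixed 20K-token overhead drops to about 2K, and each tool the model actually pulls in adds roughly one schema's worth of tokens.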



Claude Code's Edit tool has a deceptively simple interface: give it an old_string, give it a new_string, and it finds the former in a file and replaces it with the latter. Sounds like nothing more than a str.replace(). But in the context of an LLM agent, this seemingly trivial operation is backed by an entire engineering pipeline, spanning everything from string sanitization to concurrency safety. The model stuffs line numbers into its replacement strings. It conjures curly quotes out of thin air. External tools modify the target file while the user is still reviewing the permission dialog. The Edit tool has to stay correct through all of this, which is far more than a plain find-and-replace can handle.

From observing its behavior, the Edit tool's execution breaks down into three phases: API-layer preprocessing (before the tool even receives input), input validation (before the permission dialog is shown), and the actual write (after the user approves). Each phase handles a distinct class of problems and maintains deliberate sync/async boundaries.
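To make the three phases concrete, here is a rough sketch of what each might do. It is an illustration under stated assumptions, not the Edit tool's actual code: the function names, the line-number prefix format, the exactly-one-match rule, and the mtime check are all hypothetical stand-ins for the classes of problems described above.

```python
import os
import re
import tempfile

# Assumption: leaked line-number prefixes look like "  12<TAB>" or "12→".
LINE_PREFIX = re.compile(r"^ *\d+[\t→]", re.MULTILINE)
CURLY_QUOTES = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})


def preprocess(text: str) -> str:
    """Phase 1: strip leaked line-number prefixes and straighten curly quotes."""
    return LINE_PREFIX.sub("", text).translate(CURLY_QUOTES)


def validate(path: str, old_string: str) -> int:
    """Phase 2: require exactly one match; return the mtime seen at this point."""
    with open(path, encoding="utf-8") as f:
        count = f.read().count(old_string)
    if count != 1:
        raise ValueError(f"old_string must match exactly once, found {count}")
    return os.stat(path).st_mtime_ns


def write(path: str, old_string: str, new_string: str, seen_mtime: int) -> None:
    """Phase 3: refuse to write if the file changed while awaiting approval."""
    if os.stat(path).st_mtime_ns != seen_mtime:
        raise RuntimeError("file was modified while awaiting approval")
    with open(path, encoding="utf-8") as f:
        content = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(content.replace(old_string, new_string, 1))


# Usage: the model's old_string carries a line number and curly quotes.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "demo.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write('print("hello")\n')

    old = preprocess("   1\tprint(\u201chello\u201d)")
    new = preprocess('print("goodbye")')
    seen = validate(path, old)       # before the permission dialog
    write(path, old, new, seen)      # after the user approves

    with open(path, encoding="utf-8") as f:
        result = f.read()

print(result)
```

A real tool would need more care than this, for example with files that legitimately contain curly quotes, but the sketch shows why each phase exists: sanitize what the model sent, check it against the file before asking the user, and re-check the file before writing.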

