Understand three things:

Why did it appear? What new abstraction does it propose? What does it suggest for your own work?

I. When you are reading only to learn, what questions should you bring?

I suggest keeping six kinds of questions in mind.

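1. What is the core tension in this work?

Don't start by asking "how is it implemented?" Ask instead:

What pain point were people hitting before this work existed?

For example, when you look at work on agents, harnesses, or Kaggle automation, you can ask: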

Is it addressing the fact that agents cannot run autonomously for long periods?
Or unreliable tool calls?
Or state management?
Or evaluation / observability?
Or multi-agent collaboration?

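Good technical work is organized around a tension.

For example:

A typical agent demo can finish short tasks,
but real tasks involve long waits, changing external state, failure recovery, cost control, and lost context.
Which of these is the work trying to solve?

Once you have pinned down that tension, every later design decision becomes easier to understand.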

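2. What is its core insight?

In other words:

What is the one sentence the author most wants you to believe?

That sentence is usually not an API; it is a design judgment.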

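For example:

An agent should not be just a chat loop; it should be an event-driven system.

Or:

LLM capability is not the bottleneck; environment feedback and evaluation are.

Or:

A long-running task should not block the agent; it should be split into submit, observe, resume, and aggregate.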

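When you study a piece of work, try to compress it into one sentence:

The core insight of this work is: __________

If you cannot, you are still looking at surface features.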

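3. What new abstraction does it propose?

The value of technical work often lies in its abstractions rather than its code.

Ask:

Which concepts does it break the complex system into?
How do those concepts relate to each other?
Is there a way of carving up the problem that I had not thought of?

Common abstractions in agent systems include: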

Agent
Tool
Memory
State
Session
Task
Run
Trace
Event
Environment
Evaluator
Checkpoint
Policy
Skill
Subagent

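What you are looking for is:

Why is it carved up this way? What problem does this division solve? Does it have side effects?

For example, when you look at Claude Code, OpenAI Agents SDK, LangGraph, Hermes, or opencode, don't just compare who has which features. Compare where they draw their abstraction boundaries:

Claude Code is closer to a coding agent runtime
LangGraph is closer to a stateful workflow graph
OpenAI Agents SDK is closer to an agent orchestration + tracing framework
Hermes/opencode are closer to terminal-native autonomous coding agents

This matters more than memorizing feature lists.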

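4. Where are its system boundaries?

This is the skill most worth practicing.

Ask:

What does it solve?
What does it not solve?
What does it assume already exists?
Where does it move the complexity?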

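Many projects claim to "let an agent complete tasks automatically", but in practice they may assume that:

Tasks are short
The environment is stable
Tool calls are reliable
The user supervises continuously
A human steps in after failures
No long-term state is needed
No rigorous evaluation is needed

Look for these hidden assumptions deliberately.

For your Kaggle Agent / Research OS scenario, you can ask: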

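Can it handle long waits?
Can it handle multiple concurrent experiments?
Can it recover from interruptions?
Can it tell how trustworthy an experiment result is?
Can it avoid repeating the same trial and error?
Can it accumulate experience?
Can it let a later agent pick up the context?

If a piece of work does not answer these questions, it may be only partly useful for your scenario.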

5. What are its trade-offs?

Every design has a cost.

Ask:

What did it give up in order to gain which capability?

Common trade-offs:

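Flexibility vs controllability
Degree of automation vs explainability
Concurrency vs state consistency
General framework vs domain-specific optimization
LLM autonomy vs harness constraints
Quick demo vs production reliability

For example:

LangGraph's state graph makes the flow more controllable, but it also reduces the agent's freedom.
Claude Code is powerful, but many of its internal mechanisms are opaque.
Writing your own harness is flexible, but the development cost is high.
A fully autonomous agent is appealing, but it easily runs out of control, wastes budget, and is hard to debug.

When you study a piece of work, don't look only at its advantages; look at which costs it chose to pay.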

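6. What does it suggest for my own work?

Finally, bring it back to yourself.

After each piece of work you read, write four sentences:

The problem this work solves is:
Its core idea is:
Where it does not fit my case:
What I can borrow:

For example: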

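This work addresses observability for agents in complex tasks.
Its core idea is to trace every tool call, model decision, and state change.
It does not directly address long Kaggle kernel waits or experiment aggregation.
But I can borrow its trace schema to record every experimental decision my agent makes.

That is what it means to really understand a piece of work.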
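
As a hedged illustration of the last note in the example above, a trace record for one experimental decision might look like the sketch below. The `DecisionTrace` class and all of its field names are assumptions made up for this example, not the schema of any existing tracing tool.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass
class DecisionTrace:
    """One agent decision, recorded so a later session can see why it was made.

    Hypothetical schema for illustration; field names are not from any real tool.
    """
    experiment_id: str
    kind: str            # e.g. "tool_call", "model_decision", "state_change"
    hypothesis: str      # why the agent took this step
    detail: dict = field(default_factory=dict)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json_line(self) -> str:
        # One JSON object per line keeps the log easy to append to and search.
        return json.dumps(asdict(self), ensure_ascii=False, sort_keys=True)


trace = DecisionTrace(
    experiment_id="exp-001",
    kind="model_decision",
    hypothesis="Adding target encoding should help the categorical features",
    detail={"action": "submit_notebook"},
)
line = trace.to_json_line()
```

Appending one such line per decision to a log file would give a later session a readable record of what was tried and why.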

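II. If you are writing up your own work, what should you write?

If you want others to find your work interesting, the document cannot just say:

I built a Kaggle agent that runs experiments automatically.

That is too ordinary.

You need to write it as a technical story with tension:

Today's agents are good at writing code but cannot reliably run long-term experiments. What I am building is an agent OS for Kaggle / research automation that lets agents submit experiments, wait asynchronously, restore context, aggregate results, and keep improving.

The reader should see at a glance that this is not an ordinary wrapper; it addresses a real systems problem.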

1. Open with the pain point, not the features

Don't open like this:

This project is an AI agent framework for Kaggle competitions.

That is too flat.

A better opening is:

Current coding agents are good at writing code in short interactive sessions, but they struggle with long-running experimental workflows. In Kaggle-style research, an agent often needs to submit a notebook, wait for remote execution, inspect logs and scores, compare multiple experiments, and continue from prior decisions. Most agents either block while waiting or lose context after the run finishes.

This opening shows problem awareness right away.

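2. Then state your core claim

You need a one-sentence thesis statement.

For example:

My claim is that Kaggle automation should not just be one agent writing code in a loop; it should be designed as an event-driven research operating system.

Or: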

The key idea is to separate experiment execution from agent reasoning: agents propose and launch experiments, while an external watcher observes remote execution and feeds structured results back into a persistent research state.

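This sentence matters a great deal, because it determines how others see your work.

You are not saying "I called the Kaggle CLI". You are saying:

I separated the agent's reasoning, experiment execution, external waiting, and result aggregation.

That gives the work an architectural flavor.

3. Describe the system architecture instead of piling up a feature list

Your document should include a logical structure diagram: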

Agent / Planner
    ↓
Experiment Producer
    ↓
Kaggle Kernel Submitter
    ↓
Watcher / Poller
    ↓
Result Collector
    ↓
Research State Store
    ↓
Aggregator / Next-step Agent

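You can explain:

The Producer launches experiments
The Watcher observes remote kernel status without blocking
The Collector downloads logs / outputs / leaderboard score
The State Store keeps the experiment history, hypotheses, results, and failure causes
The Aggregator compares experiments and proposes the next direction

This is more compelling than writing "supports automatic submission, supports log download, supports state persistence".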
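
The division of roles above can be sketched in a few lines. This is a minimal in-memory sketch, not a real integration: `fetch_status` is a hypothetical callable standing in for whatever queries the remote kernel, and `Watcher`, `StateStore`, and the status strings are names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StateStore:
    """Keeps experiment history; a dict stands in for real persistent storage."""
    experiments: Dict[str, dict] = field(default_factory=dict)

    def record_launch(self, exp_id: str, hypothesis: str) -> None:
        self.experiments[exp_id] = {"hypothesis": hypothesis, "status": "running"}

    def record_result(self, exp_id: str, status: str) -> None:
        self.experiments[exp_id]["status"] = status


class Watcher:
    """Checks remote status once per call instead of blocking until completion."""

    def __init__(self, store: StateStore, fetch_status: Callable[[str], str]):
        self.store = store
        self.fetch_status = fetch_status  # hypothetical remote status query

    def poll_once(self) -> List[str]:
        finished = []
        for exp_id, info in self.store.experiments.items():
            if info["status"] != "running":
                continue
            status = self.fetch_status(exp_id)
            if status in ("complete", "error"):
                self.store.record_result(exp_id, status)
                finished.append(exp_id)
        return finished


# Fake remote: exp-a is done, exp-b is still running.
remote = {"exp-a": "complete", "exp-b": "running"}
store = StateStore()
store.record_launch("exp-a", "larger learning rate")
store.record_launch("exp-b", "more folds")
watcher = Watcher(store, fetch_status=remote.__getitem__)
finished_now = watcher.poll_once()
```

A scheduler or cron job could call `poll_once` periodically; the agent session itself does not need to stay alive between calls.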

4. Tell a concrete story

A technical document needs a scenario that shows why the system is useful.

For example:

A typical run looks like this:

1. The agent inspects the current solution and proposes three experiments.
2. It submits the first notebook to Kaggle and records the hypothesis.
3. Instead of blocking, the watcher tracks the remote kernel status.
4. When the run finishes, logs, outputs, and scores are collected.
5. The aggregator compares the result against previous experiments.
6. The next agent session resumes from the structured research state and decides what to try next.

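This works better than an abstract description because the reader can picture the system at work.

5. Spell out how you differ from an ordinary agent wrapper

This is the most important part.

Answer this question proactively:

How is this different from just letting Claude Code / opencode / Hermes run on its own?

You could write: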

Unlike a normal coding-agent wrapper, this system treats long-running experiments as external events rather than blocking tool calls. The agent does not need to stay alive while a Kaggle kernel is running. Instead, experiment metadata, hypotheses, logs, and results are persisted, allowing future agent sessions to resume from a shared research state.

This speaks directly to the problem you have cared about all along.
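
One way to picture "resume from a shared research state" is a file that outlives the agent process. The sketch below assumes a single JSON file and invented helper names (`save_state`, `load_state`); a real system would likely need locking and a sturdier store.

```python
import json
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    # Write to a temp file first, then rename, so a crash mid-write
    # does not corrupt the existing state file.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)


def load_state(path: Path) -> dict:
    if not path.exists():
        return {"experiments": []}
    return json.loads(path.read_text(encoding="utf-8"))


state_path = Path(tempfile.mkdtemp()) / "research_state.json"

# Session 1: the agent launches an experiment, records why, then exits.
state = load_state(state_path)
state["experiments"].append(
    {"id": "exp-001", "hypothesis": "drop noisy features", "status": "running"}
)
save_state(state_path, state)

# Session 2 (a different process, later): pick up where session 1 left off.
resumed = load_state(state_path)
pending = [e["id"] for e in resumed["experiments"] if e["status"] == "running"]
```

The second session learns which experiments are still pending, and why they were launched, without any memory of the first session's conversation.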

6. Describe the failure modes you address

This section makes your document look professional.

You could list:

Failure modes this system is designed for:

- Agent blocks while waiting for remote execution.
- Agent forgets why an experiment was launched.
- Multiple experiments finish out of order.
- Logs and scores are not linked to hypotheses.
- New agent sessions cannot continue prior reasoning.
- Experiments are repeated because previous attempts were not summarized.
- Human users cannot inspect why the agent chose a direction.

This is more persuasive than a feature list.

The reader will think:

You have actually hit these pitfalls, so you know why this system needs to exist.
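
Two of the failure modes listed above, out-of-order completion and repeated experiments, can be handled with plain bookkeeping. A sketch under assumptions: results arrive as an experiment id plus a score, and `config_fingerprint` and `ExperimentLog` are made-up helpers for this example.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Stable hash of a config, so the same experiment is recognized again."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


class ExperimentLog:
    def __init__(self):
        self.by_id = {}          # experiment id -> record
        self.seen_configs = {}   # fingerprint -> experiment id

    def launch(self, exp_id: str, hypothesis: str, config: dict) -> bool:
        """Register an experiment; return False if this config already ran."""
        fp = config_fingerprint(config)
        if fp in self.seen_configs:
            return False
        self.seen_configs[fp] = exp_id
        self.by_id[exp_id] = {"hypothesis": hypothesis, "score": None}
        return True

    def on_result(self, exp_id: str, score: float) -> None:
        # Results are matched by id, so arrival order does not matter.
        self.by_id[exp_id]["score"] = score


log = ExperimentLog()
log.launch("exp-1", "more trees", {"n_estimators": 500, "lr": 0.05})
log.launch("exp-2", "lower lr", {"n_estimators": 200, "lr": 0.01})
# Same config as exp-1 with keys in a different order: rejected as a repeat.
is_new = log.launch("exp-3", "more trees again", {"lr": 0.05, "n_estimators": 500})

# exp-2 finishes before exp-1.
log.on_result("exp-2", 0.812)
log.on_result("exp-1", 0.807)
```

Because every score is stored next to its hypothesis, a later session can see both what was tried and why, regardless of which kernel finished first.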

7. State your core design principles

To show that your work is backed by ideas, write down design principles.

For example:

Design principles:

1. Non-blocking by default  
Long-running external jobs should not occupy an agent session.

2. Persistent research state  
Every experiment should be linked to its hypothesis, code version, logs, outputs, score, and conclusion.

3. Agent-agnostic execution  
The system should be able to use Claude Code, opencode, Hermes, or other agents as interchangeable workers.

4. Event-driven continuation  
A finished kernel should trigger structured result collection and future reasoning, rather than relying on the original agent process to stay alive.

5. Minimal local state  
Whenever possible, live platform state such as kernel status and submission quota should be queried from Kaggle instead of duplicated locally.

These principles fit the line of thinking you described earlier.
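
Principle 3 can be made concrete with a small interface. The sketch below uses `typing.Protocol`; the `AgentWorker` interface and the two stub workers are hypothetical, and nothing here calls a real agent CLI.

```python
from typing import List, Protocol


class AgentWorker(Protocol):
    """Anything that can turn a research state summary into a proposal."""
    name: str

    def propose(self, state_summary: str) -> str: ...


class StubWorkerA:
    name = "stub-a"

    def propose(self, state_summary: str) -> str:
        return f"[{self.name}] try feature selection given: {state_summary}"


class StubWorkerB:
    name = "stub-b"

    def propose(self, state_summary: str) -> str:
        return f"[{self.name}] try a larger model given: {state_summary}"


def collect_proposals(workers: List[AgentWorker], state_summary: str) -> List[str]:
    # The orchestrator depends only on the interface, not on which agent is behind it.
    return [w.propose(state_summary) for w in workers]


proposals = collect_proposals([StubWorkerA(), StubWorkerB()], "best score 0.81")
```

Swapping one agent for another then means writing a new adapter class with the same `propose` method, while the orchestration code stays unchanged.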

8. State what you do not solve for now

This actually increases credibility.

For example:

Non-goals:

- This is not a new LLM.
- This is not a replacement for Claude Code or opencode.
- This does not try to fully constrain the agent with a rigid workflow.
- This does not assume every experiment will improve the score.
- This is not initially optimized for large-scale distributed training.

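This tells others that you know exactly where your boundaries are.

9. Write a roadmap

Don't describe only what is done; describe what comes next.

For example: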

Roadmap:

- Experiment registry
- Kaggle watcher
- Structured result collector
- Research memory / summary store
- Multi-agent experiment aggregator
- Langfuse-based tracing
- Automatic failure classification
- Self-improving skill updates

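This makes the project look like a continuously evolving system rather than a one-off script.

III. Your document can follow this structure

I suggest this structure for your README or technical document: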

# Project Name

## 1. Motivation
Why are existing coding agents a poor fit for long-running Kaggle/research workflows?

## 2. Core Idea
Decouple agent reasoning, experiment execution, remote waiting, and result aggregation.

## 3. System Overview
Draw the Producer / Watcher / Collector / State Store / Aggregator.

## 4. Example Workflow
Propose experiment → submit kernel → wait → collect results → summarize → next step.

## 5. Key Abstractions
Experiment, Run, Hypothesis, Result, Research State, Agent Session.

## 6. Failure Modes Addressed
Blocking, forgetting, repeated experiments, out-of-order concurrent results, and unrecoverable context.

## 7. Design Principles
Non-blocking, persistent state, agent-agnostic, event-driven, minimal local state.

## 8. Comparison with Existing Tools
Claude Code / opencode / Hermes / LangGraph / ML experiment trackers.

## 9. Current Status
What is implemented, and what is still only a design.

## 10. Roadmap
Planned next steps.

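IV. The most important point

When you read someone else's work, look for:

Problem → insight → abstraction → trade-off → what you can borrow

When you write up your own work, follow the same order:

Pain point → core claim → system abstraction → design trade-offs → concrete scenario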

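Don't write:

I support features A, B, C, and D.

Write:

Why is the existing approach not good enough?
What is my core judgment?
How do I re-carve the problem?
What capability does the system gain as a result?

That way your work reads as something with ideas behind it, not an ordinary automation script.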

张睿豪

How to Read Technical Articles Efficiently and Ask Questions
