diff --git a/docs/design/17-action-mail-type.md b/docs/design/17-action-mail-type.md
index 83706c5..629b171 100644
--- a/docs/design/17-action-mail-type.md
+++ b/docs/design/17-action-mail-type.md
@@ -5,6 +5,10 @@ version: v2.0
 status: draft
 ---
 
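+> ⚠️ **SUPERSEDED**: this document has been replaced by `17-toolchain-handler-enforcement.md`.
+> Change of direction: instead of adding an action type on the Mail side, toolchain events return to ToolchainHandler (the architecture already defined in §14).
+> This document is kept for historical reference only.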
+
 # action Mail 类型设计
 
 > **状态**: 草案 v2.0
diff --git a/docs/design/17-toolchain-handler-enforcement.md b/docs/design/17-toolchain-handler-enforcement.md
new file mode 100644
index 0000000..1c6e36f
--- /dev/null
+++ b/docs/design/17-toolchain-handler-enforcement.md
@@ -0,0 +1,1027 @@
+---
+title: "ToolchainHandler Hard-Constraint Design"
+created: 2026-06-13
+version: v1.0
+status: draft
+---
+
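+# ToolchainHandler Hard-Constraint Design
+
+> **Status**: Draft
+> **Author**: Pang Tong (庞统, deputy strategist) 🐦
+> **Date**: 2026-06-13
+> **Scope**: Route toolchain events through the existing ToolchainHandler (the §14 design) and add L2 engine-layer hard constraints (input / execution / output) inside ToolchainHandler, replacing the §16 stopgap that routed them through the MailHandler inform path
+> **Prerequisite documents**: §14 TaskTypeRegistry + Handler architecture; §13 toolchain and development workflow design
+> **Superseded document**: the original §17 action Mail type design (`17-action-mail-type.md`)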
+
+---
+
+## §1. Problem Statement
+
+### 1.1 Current State
+
+ToolchainHandler is fully implemented and registered (`src/daemon/toolchain_handler.py`, `virtual_project="_toolchain"`), yet the actual toolchain event flow has never gone through it:
+
+| Stage | Design (§14) | Actual (§16 stopgap) |
+|------|------------|----------------|
+| Create task | `project_id="_toolchain"` + `task_type="toolchain"` | `project_id="_mail"` + `task_type="mail"` |
+| Write to DB | `_toolchain/blackboard.db` | `_mail/blackboard.db` |
+| Routed handler | ToolchainHandler | MailHandler |
+| Completion check | verify action output | inform always passes |
+
+**Root cause**: when §16 designed the toolchain event hub it bypassed ToolchainHandler and instead created tasks with `task_type="mail"` + `type="inform"` via `_send_mail` (`src/api/toolchain_routes.py` L203). Decision D1 in §16 ("no third task type") was made on the basis of that stopgap, but §14 had already designed ToolchainHandler as an independent task type; it simply was never wired up.
+
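+### 1.2 Consequences
+
+Every toolchain event went down MailHandler's inform path:
+
+1. MailHandler's inform PromptSection says "just acknowledge" and "do not run any state-transition commands", so an Agent that receives a Review rejection treats it as a pure notification and marks it done
+2. verify_completion always returns `VerifyResult(True)` for inform, so nothing is verified
+3. The flow breaks: Review rejected → Agent does not fix → stuck in the rejected state forever; CI fails → Agent does not fix → stuck in the failed state forever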
+
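+### 1.3 Requirements from the Project Owner (主公)
+
+> "Input, execution, and output must all guarantee that the flow runs smoothly and carries the business content."
+> "The toolchain at the L2 engine layer must be hard-constrained."
+> "Without hard constraints the chain always breaks."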
+
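+### 1.4 Direction
+
+**Do not add a type to Mail; return toolchain events to ToolchainHandler.** ToolchainHandler already has its own DB, its own PromptSections, and its own verify logic. What remains is:
+
+1. Add a `_send_toolchain_task` function that creates tasks with `project_id="_toolchain"` + `task_type="toolchain"`
+2. Strengthen ToolchainHandler's three PromptSections to implement the three layers of L2 engine-layer hard constraints
+3. Strengthen verify_completion: instead of starting from "any comment ≥ 20 characters", check first for an `action_report` comment (the older checks remain only as transitional fallbacks, see §5.1)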
+
+---
+
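+## §2. Overview of the Three Constraint Layers
+
+The three layers the project owner asked for (input, execution, output) map onto three core stages of ToolchainHandler: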
+
+```
+┌──────────────────────────────────────────────────────────────────┐
+│                ToolchainHandler hard constraints                 │
+│                                                                  │
+│  ┌── Input ─────────┐   ┌── Execution ─────┐   ┌── Output ─────┐ │
+│  │                  │   │                  │   │               │ │
+│  │ must_haves JSON  │   │ PromptSection    │   │ verify_       │ │
+│  │ with structured  │ → │ strong wording + │ → │ completion    │ │
+│  │ fields           │   │ Red Flags        │   │ action_report │ │
+│  │                  │   │                  │   │ comment check │ │
+│  │ ToolchainContext │   │ ToolchainCon-    │   │               │ │
+│  │ Section renders  │   │ straintsSection: │   │ on_failure:   │ │
+│  │ numbered steps   │   │ must execute     │   │ mark failed + │ │
+│  │                  │   │                  │   │ notify        │ │
+│  └──────────────────┘   └──────────────────┘   └───────────────┘ │
+│                                                                  │
+│  Data flow: webhook → _send_toolchain_task → _toolchain DB       │
+│          → spawner (ToolchainHandler.build_prompt)               │
+│          → Agent executes → comment (action_report)              │
+│          → verify_completion → done / failed                     │
+└──────────────────────────────────────────────────────────────────┘
+```
+
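+| Layer | Goal | Mechanism | Guards against |
+|----|------|------|--------|
+| **Input constraint** | The Agent receives structured, numbered steps plus event context rather than a one-line summary | must_haves JSON (event_type, action_type, steps, context) rendered by ToolchainContextSection | "The Agent does not know what to do" |
+| **Execution constraint** | The Agent knows it must act and cannot rationalize skipping | ToolchainConstraintsSection with strong "must execute" wording + Red Flags table | "The Agent ignores it as a pure notification" |
+| **Output constraint** | The Agent's work leaves a verifiable artifact | verify_completion checks for an action_report comment | "The Agent pretends to have acted" |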
+
+---
+
+## §3. Input Constraint (What the Agent Receives)
+
+### 3.1 must_haves JSON Structure
+
+The `must_haves` field of a toolchain task carries the full structured payload:
+
+```json
+{
+  "event_type": "review_result",
+  "action_type": "review_result",
+  "steps": [
+    "Merge the PR (Gitea API: POST /repos/{repo}/pulls/{pr_number}/merge)",
+    "Submit the action report (POST http://localhost:8083/api/projects/_toolchain/tasks/{task_id}/comments, comment_type=action_report)"
+  ],
+  "context": {
+    "pr_number": 42,
+    "repo": "sanguo/sanguo_moziplus_v2",
+    "pr_title": "feat: add login page",
+    "result": "APPROVED",
+    "reviewer": "simayi-challenger",
+    "review_body": "Code quality is good; ready to merge"
+  },
+  "from": "system",
+  "source": "webhook"
+}
+```
+
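+Field reference:
+
+| Field | Type | Required | Description |
+|------|------|------|------|
+| `event_type` | string | ✅ | Event type; selects the ToolchainContextSection template |
+| `action_type` | string | ✅ | Action category; used for step selection and log statistics |
+| `steps` | string[] | ✅ | Structured, numbered step list rendered into the prompt |
+| `context` | object | ❌ | Event context data (PR number, repo name, review comments, etc.) |
+| `from` | string | ✅ | Sender identifier (`system` for webhook-generated tasks) |
+| `source` | string | ❌ | Source type (`webhook`) |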
+
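+### 3.2 ToolchainContextSection Rendering Enhancements
+
+The existing `ToolchainContextSection` renders event information through the template engine. After the enhancement it must contain three parts:
+
+**Part 1: event type + event context** (existing, kept)
+
+Rendered by the template engine in `toolchain_templates.py`; shows the core event information (PR title, review result, CI error summary, etc.).
+
+**Part 2: structured numbered steps** (new, read from the `steps` field of must_haves)
+
+```markdown
+### Required steps
+
+1. Merge the PR (Gitea API: POST /repos/sanguo/sanguo_moziplus_v2/pulls/42/merge)
+2. Submit the action report (POST http://localhost:8083/api/projects/_toolchain/tasks/<task_id>/comments, comment_type=action_report)
+```
+
+**Part 3: event-specific action guidance** (new, selected by action_type)
+
+Each event type has its own action guidance text. It opens by stating that the Agent has received an event that requires action, making it explicit that this is not a pure notification.
+
+Rendering logic (sketch):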
+
+```python
+class ToolchainContextSection:
+    def render(self, context: PromptContext) -> str:
+        # Part 1: event information (existing template engine; passing event_data is an assumption)
+        event_text = render_template(context.event_type, context.event_data)
+
+        # Part 2: structured steps (new)
+        steps = context.action_steps  # provided via PromptContext
+        steps_text = ""
+        if steps:
+            lines = ["", "### Required steps", ""]
+            for i, step in enumerate(steps, 1):
+                lines.append(f"{i}. {step}")
+            steps_text = "\n".join(lines)
+
+        # Part 3: action guidance (new)
+        action_hint = self._get_action_hint(context.action_type)
+
+        return f"{action_hint}\n\n{event_text}{steps_text}"
+
+    def _get_action_hint(self, action_type: str) -> str:
+        hints = {
+            "review_result": "You have received a Review result. This event requires you to take action (it is not a pure notification).",
+            "review_request": "You have received a Review request. This event requires you to review the PR and submit a Review.",
+            "ci_failure": "You have received a CI failure notice. This event requires you to fix the failing tests.",
+            # ... remaining action types elided
+        }
+        return hints.get(action_type, "You have received a toolchain event. This event requires you to take action.")
+```
+
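+### 3.3 Relationship to the Hard-Constraint Templates in §13 §15.5
+
+§13 §15.5 defines six hard-constraint Mail templates for the workflow, each with numbered steps and Gitea API call instructions. The template content is **unchanged**; what changes is how it is carried:
+
+| Aspect | §16 stopgap | This design |
+|------|------------|--------|
+| Template rendered into | the Mail task's description | the toolchain task's description (ToolchainContextSection template engine) |
+| Steps are fixed in | plain text inside the description | the `steps` field of must_haves JSON + PromptSection rendering |
+| Can the Agent see the steps | undermined by the MailHandler inform prompt ("just acknowledge") | reinforced by the ToolchainHandler prompt ("must execute") |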
+
+---
+
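+## §4. Execution Constraint (The Agent Knows It Must Act)
+
+### 4.1 Strengthening ToolchainConstraintsSection
+
+The existing `ToolchainConstraintsSection` has five constraints, but the wording is not forceful enough and there is no guard against self-rationalization. It is rewritten as: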
+
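+```markdown
+## Hard constraints (mandatory)
+
+⚠️ The following are mandatory requirements, not suggestions or references. Violating any one of them fails the task.
+
+### 1. Execute the steps in order
+- Check the "Required steps" list above
+- Execute every step one by one; do not skip any
+- Do not just read without acting: this is not a pure notification
+
+### 2. Submit an action report
+- After all steps are done, you must submit an action report
+- How to submit: POST a comment (comment_type='action_report')
+- Report content: briefly describe what you did and what the result was
+- ⚠️ A task without an action report will be marked failed
+
+### 3. Do not run any state-transition commands
+- Do not manually mark working/done/review/failed; the system handles this automatically
+
+### 4. Do not reply to this mail
+- Unlike the request type, no in_reply_to reply is needed
+- The action report is your proof of completion
+
+### Red Flags (if any of these thoughts come up, you are wrong)
+
+| Agent thought | Red Flag rebuttal |
+|------------|--------------|
+| "I only need to glance at this notification" | ❌ Wrong! This is an action instruction; every item in the step list must be executed |
+| "I don't need to do anything" | ❌ Wrong! Check the "Required steps" list; every step must be executed |
+| "I'll leave it and handle it later" | ❌ Wrong! Act immediately; do not postpone |
+| "I already know about this" | ❌ Knowing is not doing. The task is complete only after the steps are executed and the action report is submitted |
+| "Too many steps, doing a few is enough" | ❌ Wrong! Execute every step; none may be skipped |
+| "This step does not apply here" | ❌ If it truly does not apply, explain why in the action report, but the other steps must still be executed |
+```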
+
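+### 4.2 Rationale for the Red Flags Table
+
+The table borrows the Red Flags mechanism that the Superpowers pattern uses against self-rationalization:
+
+**Problem**: when an LLM Agent faces a prompt that requires action, the common self-rationalization patterns are:
+1. **Downgrading**: treating an action as an inform ("I only need to glance at this notification")
+2. **Postponing**: recognizing that action is needed but not acting ("I'll leave it and handle it later")
+3. **Partial execution**: selectively skipping steps ("doing a few is enough")
+4. **Confusing knowing with doing**: aware but inactive ("I already know about this")
+
+**Approach**: list these self-rationalization patterns in advance and present them in the Agent's prompt as a "Red Flag" table. When one of these patterns appears in the Agent's reasoning, the matching rebuttal is intended to interrupt it before the Agent continues down the wrong path.
+
+**Source**: the "Pre-commitment + Red Flag" pattern from the Superpowers framework: commit to a rule up front, then list the common excuses for breaking it so that the Agent recognizes its own self-rationalization while reasoning.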
+
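+### 4.3 Strong-Wording Design
+
+| Tone level | Wording | Intended effect | Where used |
+|---------|------|------|---------|
+| **Mandatory** | "must", "do not skip", "mandatory requirement" | Leaves the Agent no room to rationalize skipping | Step execution, action report submission |
+| **Prohibitive** | "do not", "violating ... will", "failed" | Keeps the Agent within bounds | State transitions, replying to mail |
+| **Alert** | "⚠️" | Visual emphasis | Prefix on key constraints |
+
+**Wording to avoid**: "suggest", "if needed", "consider", "reference", "recommended". In the Agent's reasoning these words are read as "optional".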
+
+### 4.4 ToolchainApiSection Adjustments
+
+The existing `ToolchainApiSection` tells the Agent to mark the task done, but in the hard-constraint design done is triggered automatically once verify_completion passes. Adjusted version:
+
+````markdown
+## API instructions
+
+Project ID: `_toolchain`
+Task ID: {task_id}
+
+### Submit an action report when finished
+
+After all steps are done, you must submit an action report:
+
+```bash
+curl -s -X POST "http://localhost:8083/api/projects/_toolchain/tasks/{task_id}/comments" \
+  -H "Content-Type: application/json" \
+  -d '{"author": "{agent_id}", "comment_type": "action_report", "body": "Briefly describe what you did and the result"}'
+```
+
+⚠️ A task without an action report will be marked failed.
+
+### Submitting outputs
+
+If you produced any output (e.g. review result, fix plan), submit it to the task outputs:
+
+```bash
+curl -s -X POST "http://localhost:8083/api/projects/_toolchain/tasks/{task_id}/outputs" \
+  -H "Content-Type: application/json" \
+  -d '{"content": "<your output>", "type": "text"}'
+```
+````
+
+**Change**: the curl example for manually marking done is removed (done is handled automatically by verify) and replaced with the action report submission instructions.
+
+---
+
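+## §5. Output Constraint (How Execution Is Verified)
+
+### 5.1 verify_completion Design
+
+**D17-1: verify_completion uses the action_report comment mechanism**
+
+Existing logic: pass if there is an output or a comment (any comment ≥ 20 characters long).
+
+Problem: the Agent can pass verification by writing any comment at all, so there is no way to confirm the steps were actually executed.
+
+New logic: check first whether a comment with `comment_type='action_report'` exists; the old checks stay only as fallbacks.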
+
+```python
+def verify_completion(self, task_id: str, db_path: Path) -> VerifyResult:
+    """Check for an action report (precise verification)."""
+    try:
+        conn = get_connection(db_path)
+        try:
+            # 1. Prefer an action_report comment
+            report_row = conn.execute(
+                "SELECT id FROM comments WHERE task_id=? "
+                "AND comment_type='action_report' LIMIT 1",
+                (task_id,)
+            ).fetchone()
+            if report_row:
+                return VerifyResult(True, "has_action_report", "action_report found")
+
+            # 2. Fallback: check outputs (backward compatible)
+            output_count = conn.execute(
+                "SELECT COUNT(*) FROM outputs WHERE task_id=?", (task_id,)
+            ).fetchone()[0]
+            if output_count > 0:
+                return VerifyResult(True, "has_output", f"output_count={output_count}")
+
+            # 3. Fallback: check for a substantive comment (backward compatible)
+            comment_count = conn.execute(
+                "SELECT COUNT(*) FROM comments WHERE task_id=? "
+                "AND author != 'system' AND LENGTH(body) >= 20",
+                (task_id,)
+            ).fetchone()[0]
+            if comment_count > 0:
+                return VerifyResult(True, "has_comment", f"comment_count={comment_count}")
+
+            return VerifyResult(False, "no_action_report", "no action_report, no output, no valid comment")
+        finally:
+            conn.close()
+    except Exception as e:
+        logger.error("Toolchain %s: verify error: %s", task_id, e)
+        return VerifyResult(False, "verify_error", str(e))
+```
+
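+**Verification priority**:
+1. `action_report` comment (preferred: the Agent explicitly reports that it executed the steps)
+2. output (fallback: the Agent may have submitted its work through the outputs API)
+3. a comment with real content (fallback: backward compatible with the existing behavior)
+
+The fallback tiers are kept for a smooth transition: early on, Agents may not yet be used to submitting action_report, and the fallbacks avoid "every task fails after the change". The trade-off is that until the fallbacks are removed, a task can still pass without an action_report.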
+
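+### 5.2 on_failure Handling
+
+What happens when verify fails (existing logic, kept):
+
+1. Mark the task `failed`
+2. Notify Pang Tong through the Mail API (`_notify_via_mail_api`)
+3. The notification includes: event type, event details, failure reason, Gitea link, and action guidance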
+
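+### 5.3 action_report Comment Format
+
+An action_report comment submitted by an Agent:
+
+```json
+{
+  "comment_type": "action_report",
+  "author": "zhangfei-dev",
+  "body": "Fixed the CI failure: corrected an import error (src/api/routes.py L23) and pushed to the feat/issue-42 branch. CI reran automatically."
+}
+```
+
+**Requirements for the comment body**:
+- Briefly describe what was done
+- If anything was modified, name the files and summarize the change
+- If external verification is involved (CI rerun, Review submission), state its status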
+
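+### 5.4 Guarding Against False Reports
+
+An Agent may write an action_report without actually doing the work. Mitigations:
+
+1. **Downstream events expose it**: CI will not pass, the Reviewer will not receive a Review, the deployment will not succeed
+2. **action_report is auditable**: Pang Tong can compare the action_report content against the actual Gitea state
+3. **Scope of verification**: verify primarily targets the Agent "not seeing or ignoring" the event (estimated at about 90% of failures), not malicious behavior (estimated at under 10%)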
+
+---
+
+## §6. Scenario Step Definitions
+
+### 6.1 Scenario Matrix
+
+| Scenario | action_type | Route | steps | Notes |
+|------|------------|------|-------|------|
+| Review APPROVED → PR author | review_result | toolchain | 2 steps | Merge PR + submit report |
+| Review REQUEST_CHANGES → PR author | review_result | toolchain | 4 steps | Change code + push + wait for Review + report |
+| Review request → reviewer | review_request | toolchain | 4 steps | Read diff + review + submit Review + report |
+| New commits under Review → reviewer | review_updated | toolchain | 4 steps | Read diff + check changes + submit Review + report |
+| Review comment → PR author | review_comment | toolchain | 3 steps | Read comment + respond (change/reply) + report |
+| CI failure → PR author | ci_failure | toolchain | 4 steps | Read CI log + fix tests + push + report |
+| Issue assigned → developer | issue_assigned | toolchain | 6 steps | Create branch + code + push + CI + PR + report |
+| Deploy failure → ops | deploy_failure | toolchain | 4 steps | Read logs + diagnose + fix and redeploy + report |
+| @mention → mentioned Agent | mention | toolchain | per guidance | Follow the mention template's response_guidance + report |
+| PR merged → PR author | review_merged | **mail (inform)** | n/a | Pure notification, goes through the _mail path |
+
+**D17-2: every toolchain scenario except the PR-merged notification goes through ToolchainHandler**
+
+### 6.2 Detailed Steps per Scenario
+
+#### Review APPROVED → PR author
+
+```
+event_type: review_result
+action_type: review_result
+steps:
+  1. Merge the PR (Gitea API: POST /repos/{repo}/pulls/{pr_number}/merge)
+  2. Submit the action report (POST .../comments, comment_type=action_report)
+context:
+  pr_number, repo, pr_title, result=APPROVED, reviewer, review_body
+```
+
+#### Review REQUEST_CHANGES → PR author
+
+```
+event_type: review_result
+action_type: review_result
+steps:
+  1. Address each review comment in the code
+  2. Push to the original branch → CI runs automatically
+  3. After CI passes, wait for re-Review
+  4. Submit the action report
+context:
+  pr_number, repo, pr_title, result=REQUEST_CHANGES, reviewer, review_body
+```
+
+#### Review request → reviewer
+
+```
+event_type: review_request
+action_type: review_request
+steps:
+  1. Read the PR diff (Gitea API: GET /repos/{repo}/pulls/{pr_number}.diff)
+  2. Review against the review checklist (see the code-review Skill)
+  3. Submit the Review (Gitea API: POST /repos/{repo}/pulls/{pr_number}/reviews): APPROVE or REQUEST_CHANGES
+  4. Submit the action report
+context:
+  pr_number, repo, pr_title, pr_author, branch, risk_level, changed_files
+```
+
+#### New commits under Review → reviewer
+
+```
+event_type: review_updated
+action_type: review_updated
+steps:
+  1. Read the PR diff (Gitea API: GET /repos/{repo}/pulls/{pr_number}.diff)
+  2. Focus on the changes made in response to the previous Review
+  3. Submit the Review (Gitea API: POST /repos/{repo}/pulls/{pr_number}/reviews)
+  4. Submit the action report
+context:
+  pr_number, repo, pr_title, pr_author, new_sha, reviewer
+```
+
+#### CI failure → PR author
+
+```
+event_type: ci_failure
+action_type: ci_failure
+steps:
+  1. Read the full CI log (PR page or Gitea Actions page)
+  2. Fix the failing tests
+  3. Push → CI reruns automatically
+  4. Submit the action report
+context:
+  pr_number, repo, branch, error_summary
+```
+
+#### Issue assigned → developer
+
+```
+event_type: issue_assigned
+action_type: issue_assigned
+steps:
+  1. Create the branch fix/{issue_number}-{brief}
+  2. Write the code + unit tests
+  3. Push → wait for CI
+  4. After CI passes, create the PR (Gitea API: POST /repos/{repo}/pulls)
+  5. Wait for Review
+  6. Submit the action report
+context:
+  issue_number, repo, issue_title, labels, issue_body, brief
+```
+
+#### Deploy failure → ops
+
+```
+event_type: deploy_failure
+action_type: deploy_failure
+steps:
+  1. Check the deploy log
+  2. Diagnose the cause of the failure
+  3. Fix and redeploy
+  4. Submit the action report
+context:
+  repo, commit_sha, reason
+```
+
+#### @mention → mentioned Agent
+
+```
+event_type: mention
+action_type: mention
+steps:
+  1. Follow the response_guidance in the mention template
+  2. Submit the action report
+context:
+  source_type, source_url, commenter, content_snippet, intent_hint, response_guidance
+```
+
+#### PR merged → PR author
+
+```
+Goes through the _mail path (inform), not toolchain.
+Reason: the PR is already merged and deployment has been triggered automatically, so the author has nothing to do. Pure FYI notification.
+```
+
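+### 6.3 Why the PR-Merged Notification Stays inform
+
+- The PR has already been merged into main
+- Deployment has already been triggered automatically (deploy workflow)
+- The author has nothing to do
+
+This is a genuine FYI notification, so inform is the right type. It keeps using the `_send_mail` function and is unaffected by this design.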
+
+---
+
+## §7. _send_toolchain_task Function Design
+
+### 7.1 Function Definition
+
+Add a `_send_toolchain_task` function (`src/api/toolchain_routes.py`) alongside `_send_mail`. Each has its own job:
+
+| Function | Purpose | project_id | task_type | DB |
+|------|------|-----------|-----------|-----|
+| `_send_mail` | Pure notifications (inform) and Agent-to-Agent communication (request) | `_mail` | `mail` | `_mail/blackboard.db` |
+| `_send_toolchain_task` | Toolchain action events (the Agent must execute steps) | `_toolchain` | `toolchain` | `_toolchain/blackboard.db` |
+
+```python
+def _toolchain_db_path() -> Path:
+    """Return the Toolchain database path, creating the directory if needed."""
+    root = get_data_root()
+    db = root / "_toolchain" / "blackboard.db"
+    db.parent.mkdir(parents=True, exist_ok=True)
+    init_db(db)
+    return db
+
+
+def _send_toolchain_task(
+    to_agent: str,
+    title: str,
+    description: str,
+    event_type: str,
+    action_type: str,
+    steps: list[str],
+    context_data: dict | None = None,
+    source: str = "webhook",
+) -> str:
+    """Create a Toolchain Task and write it to the _toolchain DB.
+
+    Args:
+        to_agent: recipient Agent ID
+        title: task title
+        description: task description (event information after template rendering)
+        event_type: event type (review_result / ci_failure / ...)
+        action_type: action category (used for step selection and log statistics)
+        steps: structured, numbered step list
+        context_data: event context data (PR number, repo name, etc.)
+        source: source identifier
+
+    Returns:
+        ID of the created Task (empty string if the agent is unknown)
+    """
+    if to_agent not in AGENT_IDS:
+        logger.warning("Unknown agent: %s, skipping toolchain task", to_agent)
+        return ""
+
+    task_id = f"tc-{int(datetime.now().timestamp() * 1000)}"
+    must_haves = json.dumps({
+        "event_type": event_type,
+        "action_type": action_type,
+        "steps": steps,
+        "context": context_data or {},
+        "from": "system",
+        "source": source,
+    }, ensure_ascii=False)
+
+    task = Task(
+        id=task_id,
+        title=title,
+        description=description,
+        assignee=to_agent,
+        assigned_by="system",
+        must_haves=must_haves,
+        task_type="toolchain",
+        status="pending",
+    )
+    bb = Blackboard(_toolchain_db_path())
+    bb.create_task(task)
+    logger.info(
+        "Toolchain task sent: %s → %s [%s] action_type=%s",
+        title[:40], to_agent, task_id, action_type,
+    )
+    return task_id
+```
+
+### 7.2 Handler Call-Site Changes
+
+Each handler in `toolchain_routes.py` switches from calling `_send_mail` to calling `_send_toolchain_task`.
+
+Taking `_handle_pr_opened` as an example (after the change):
+
+```python
+async def _handle_pr_opened(payload: Dict[str, Any]) -> None:
+    """PR opened → notify simayi-challenger (via ToolchainHandler)."""
+    pr = payload.get("pull_request") or {}  # tolerate a missing key
+    repo = _repo_fullname(payload)
+    pr_number = pr.get("number", 0)
+    pr_title = pr.get("title", "")
+    # ... fetch file list and risk level (existing logic unchanged; assumed to also set pr_author, branch, risk_level)
+
+    text = render_template("review_request", {...})
+
+    title = f"Review request: {pr_title} ({repo}#{pr_number})"
+    _send_toolchain_task(
+        to_agent="simayi-challenger",
+        title=title,
+        description=text,
+        event_type="review_request",
+        action_type="review_request",
+        steps=[
+            f"Read the PR diff (Gitea API: GET /repos/{repo}/pulls/{pr_number}.diff)",
+            "Review against the review checklist (see the code-review Skill)",
+            f"Submit the Review (Gitea API: POST /repos/{repo}/pulls/{pr_number}/reviews): APPROVE or REQUEST_CHANGES",
+            "Submit the action report (POST http://localhost:8083/api/projects/_toolchain/tasks/<task_id>/comments, comment_type=action_report)",
+        ],
+        context_data={
+            "pr_number": pr_number,
+            "repo": repo,
+            "pr_title": pr_title,
+            "pr_author": pr_author,
+            "branch": branch,
+            "risk_level": risk_level,
+        },
+    )
+```
+
+### 7.3 Handler Change Matrix
+
+| Handler function | New action_type | New call |
+|-------------|---------------|--------|
+| `_handle_pr_opened` | review_request | `_send_toolchain_task(...)` |
+| `_handle_pull_request_review` (APPROVED/REQUEST_CHANGES) | review_result | `_send_toolchain_task(...)` |
+| `_handle_pr_synchronize` | review_updated | `_send_toolchain_task(...)` |
+| `_handle_pull_request_review` (COMMENTED) | review_comment | `_send_toolchain_task(...)` |
+| `_handle_issue_comment` (CI failure) | ci_failure | `_send_toolchain_task(...)` |
+| `_handle_issues` (assigned) | issue_assigned | `_send_toolchain_task(...)` |
+| `_handle_pr_closed` (merged) | n/a | `_send_mail(...)` **unchanged** (pure inform notification) |
+| `_send_deploy_failure_mail` | deploy_failure | `_send_toolchain_task(...)` |
+| `_send_mention_mails` | mention | `_send_toolchain_task(...)` |
+
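+### 7.4 _send_mail Stays Unchanged
+
+The `_send_mail` function is kept exactly as is and serves only two scenarios:
+1. The PR-merged notification (pure inform notification)
+2. The Mail notification from ToolchainHandler on_failure (sent to Pang Tong through the Mail API)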
+
+---
+
+## §8. spawner / PromptContext Changes
+
+### 8.1 New PromptContext Fields
+
+```python
+@dataclass
+class PromptContext:
+    # ... existing fields unchanged ...
+
+    # toolchain-specific (extended)
+    event_type: str = ""
+    event_data: Dict = field(default_factory=dict)
+    action_type: str = ""           # new: action category
+    action_steps: list = field(default_factory=list)  # new: structured step list
+```
+
+### 8.2 spawner: Parsing More of must_haves
+
+When `spawner.py` builds the PromptContext on the handler path, it needs to extract the action fields from the must_haves JSON:
+
+```python
+handler = TaskTypeRegistry.get_by_project(project_id)
+if handler:
+    meta = json.loads(must_haves) if must_haves else {}
+    from_agent = meta.get("from", "")
+    mail_type = meta.get("performative", meta.get("type", ""))
+
+    # New: extract toolchain fields
+    event_type = meta.get("event_type", "")
+    event_data = meta.get("context", {})
+    action_type = meta.get("action_type", "")
+    action_steps = meta.get("steps", [])
+
+    ctx = PromptContext(
+        task_id=task_id, title=title, description=description or "",
+        must_haves=must_haves or "", project_id=project_id,
+        agent_id=agent_id, role=spawn_type,
+        spawn_type=spawn_type,
+        from_agent=from_agent, mail_type=mail_type,
+        event_type=event_type, event_data=event_data,
+        action_type=action_type, action_steps=action_steps,
+    )
+    return handler.build_prompt(ctx)
+```
+
+---
+
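+## §9. DB Isolation Design
+
+### 9.1 Two Independent DBs
+
+**D17-5: the _mail and _toolchain DBs are isolated** (original §14 design)
+
+```
+~/.sanguo_projects/sanguo_moziplus_v2/data/
+├── _mail/
+│   └── blackboard.db          # managed by MailHandler
+├── _toolchain/
+│   └── blackboard.db          # managed by ToolchainHandler
+└── projects/
+    └── ...
+```
+
+Reasons for isolation:
+1. §14 L41 states it explicitly: `_toolchain` has its own DB, isolated from `_mail`
+2. Tasks of different task_type do not interfere with each other
+3. The frontend shows them separately (Toolchain tab vs the 飞鸽传书 tab)
+4. Data volume stays manageable (toolchain events are frequent but each task is short-lived)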
+
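+### 9.2 CHECK Constraint on the comments Table
+
+**D17-3: handling the CHECK constraint on the comments table**
+
+The comments table in `_toolchain/blackboard.db` must accept comments of type `action_report`.
+
+Existing CHECK constraint (`db.py`):
+```sql
+CHECK (comment_type IN (
+    'general','handoff','observation','review',
+    'rebuttal','rebuttal_response','debate_argument',
+    'debate_rebuttal','debate_judgment'
+))
+```
+
+**Option A (recommended)**: drop the CHECK constraint. SQLite enforces CHECK constraints on insert and update, so with the current enum an `action_report` comment would be rejected. The application layer already validates comment_type, and once the constraint is gone future comment_type values need no schema migration.
+
+**Option B**: add `action_report` to the CHECK enum. This requires a migration (SQLite table rebuild).
+
+Option A is recommended: drop the CHECK and validate comment_type in the application layer (ToolchainApiSection + comments API). Note that a DB file created before the change keeps its old CHECK until its comments table is rebuilt, whichever option is chosen.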
+
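+### 9.3 ticker Scanning
+
+The ticker needs to scan the `_toolchain` virtual project. It currently discovers registered virtual projects automatically through `TaskTypeRegistry.virtual_projects()`, and ToolchainHandler is already registered with `virtual_project="_toolchain"`, so no extra change is needed.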
+
+---
+
+## §10. Impact Scope
+
+Files involved (development directory `~/.openclaw/sanguo_projects/sanguo_moziplus_v2/`):
+
+| File | Change type | Notes |
+|------|---------|------|
+| `src/daemon/toolchain_handler.py` | Modify | ToolchainContextSection adds steps rendering + action_hint; ToolchainApiSection switches to action_report instructions; ToolchainConstraintsSection adds Red Flags; verify_completion checks action_report first |
+| `src/api/toolchain_routes.py` | Modify | Add `_toolchain_db_path()` + `_send_toolchain_task()`; handlers switch to `_send_toolchain_task`; PR merged stays on `_send_mail` |
+| `src/daemon/spawner.py` | Modify | Extract `action_type` and `action_steps` when building PromptContext on the handler path |
+| `src/daemon/prompt_composer.py` | Modify | PromptContext gains `action_type` and `action_steps` fields |
+| `src/blackboard/db.py` | Modify | Handle the comments table CHECK constraint (drop CHECK or add action_report) |
+| `src/daemon/mail_notify.py` | Modify | `_REASON_MAP` gains a `no_action_report` reason |
+
+### Estimated Change Size
+
+| File | Size of change | Risk |
+|------|--------|------|
+| `src/daemon/toolchain_handler.py` | ~80 lines | Medium (core logic changes) |
+| `src/api/toolchain_routes.py` | ~120 lines | Medium (new functions + 8 handlers reworked) |
+| `src/daemon/spawner.py` | ~8 lines | Low (only adds field extraction) |
+| `src/daemon/prompt_composer.py` | ~3 lines | Low (new dataclass fields) |
+| `src/blackboard/db.py` | ~5 lines | Low (CHECK constraint handling) |
+| `src/daemon/mail_notify.py` | ~2 lines | Low (one new reason map entry) |
+| **Total** | **~218 lines** | |
+
+---
+
+## §11. Backward Compatibility
+
+### 11.1 The Mail System Is Unaffected
+
+| Scenario | Impact |
+|------|------|
+| Agents manually sending inform Mail | ✅ None |
+| Agents manually sending request Mail | ✅ None |
+| MailHandler verify / on_failure | ✅ None |
+| `_send_mail` function | ✅ Kept unchanged |
+
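+### 11.2 Toolchain Tasks Already in the _mail DB
+
+Toolchain tasks created before the change live in `_mail/blackboard.db` (task_type=mail, type=inform). After the change:
+
+- These old tasks remain under MailHandler (inform always passes verification)
+- No migration is needed: old tasks reach done or are cleaned up on expiry in the normal course
+- New tasks are created in `_toolchain/blackboard.db`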
+
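+### 11.3 Fallback verify in ToolchainHandler
+
+After the change verify_completion keeps a three-tier check:
+1. action_report comment (preferred)
+2. output (fallback)
+3. a comment with real content (fallback)
+
+So even if Agents are not yet used to submitting action_report, a task still passes verification as long as it has an output or a substantive comment. This gives a smooth transition.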
+
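+### 11.4 New PromptContext Fields
+
+`action_type` defaults to `""` and `action_steps` uses `field(default_factory=list)`, so both are empty when not passed. Existing PromptContext construction is unaffected.
+
+### 11.5 CHECK Constraint on the comments Table
+
+With the "drop CHECK" option, existing data is unaffected and comment_type values are still validated in the application layer; DB files created earlier keep the old CHECK until the table is rebuilt. With the "add action_report" option, a migration is required.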
+
+---
+
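+## §12. Design Decision Record
+
+### D17-1: verify uses the action_report comment mechanism
+
+**Decision**: verify_completion first checks whether a comment with `comment_type='action_report'` exists, ahead of the existing "any comment ≥ 20 characters" logic. Output and substantive comments remain as fallbacks.
+
+**Alternatives discussed**:
+- A (external state check): too complex; every action_type needs its own check logic, and it couples to the Gitea API
+- B (always pass): equivalent to inform, defeats the purpose
+- C (action_report only, no fallback): too aggressive for the early rollout
+
+**Rationale**: the action_report comment mechanism is the simplest and fits the existing architecture. Keeping the fallbacks ensures a smooth transition. If an Agent writes a report without doing the work, downstream events (CI will not pass, the Reviewer will not receive a Review) expose the problem.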
+
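+### D17-2: every toolchain scenario except the PR-merged notification goes through ToolchainHandler
+
+**Decision**: of the 9 toolchain action_type values, 8 go through ToolchainHandler (`_send_toolchain_task`); only `review_merged` goes through MailHandler (`_send_mail` + inform).
+
+**Rationale**:
+- All 8 require follow-up action from the Agent (fix code / review / merge / diagnose / respond to a mention)
+- A PR merge is a genuine FYI that needs no Agent action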
+
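+### D17-3: handling the CHECK constraint on the comments table
+
+**Decision**: the recommended option is to drop the comment_type CHECK constraint from the comments table and validate in the application layer. The alternative is to add action_report to the CHECK enum.
+
+**Rationale**: changing a CHECK constraint in SQLite requires rebuilding the table (CREATE → COPY → DROP → RENAME), which makes every future enum change a costly migration. Because the application layer already validates comment_type, dropping the CHECK removes that recurring cost without losing validation.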
+
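+### D17-4: overturn §16.0 D1 ("no third task type")
+
+**Decision**: D1 is overturned. Toolchain events no longer go through MailHandler (mail + inform); they go through ToolchainHandler (toolchain).
+
+**Rationale**:
+- D1 rested on the stopgap: ToolchainHandler was not wired up at the time
+- §14 had already designed ToolchainHandler as an independent task type with its own DB, PromptSections, and verify logic
+- Without hard constraints the chain always breaks; the project owner explicitly requires hard constraints at the L2 engine layer
+- MailHandler's inform semantics ("just acknowledge") fundamentally contradict the behavior expected for toolchain events ("must execute the steps")
+- ToolchainHandler does not need Mail's reply mechanism (request), nor should it be overridden by inform's "just acknowledge" semantics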
+
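+### D17-5: the _mail and _toolchain DBs are isolated
+
+**Decision**: toolchain tasks are written to `_toolchain/blackboard.db`, not `_mail/blackboard.db`.
+
+**Rationale**:
+- Original design in §14 L41
+- Tasks of different task_type do not interfere with each other
+- Different data lifecycles (toolchain tasks are short-lived; mail threads may need long-term retention)
+- The frontend shows them separately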
+
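+### D17-6: action_report is the only new comment_type
+
+**Decision**: introduce a single new comment_type, `action_report`, rather than a separate comment_type per action_type.
+
+**Rationale**:
+- A single type keeps the verify logic simple (only one type to look up)
+- The action_type distinction already exists in the must_haves JSON
+- The comment body is free-form and can describe exactly what was done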
+
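+### D17-7: ToolchainApiSection drops the "manually mark done" instruction
+
+**Decision**: ToolchainApiSection no longer tells the Agent to mark done manually; it tells the Agent to submit an action_report. done is triggered automatically after verify_completion passes.
+
+**Rationale**:
+- An Agent marking done manually can conflict with the done set by verify_completion
+- action_report + verify is a more reliable completion path
+- Fewer API operations for the Agent (from "mark done + submit output" down to "submit action_report")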
+
+---
+
+## §13. Implementation Plan
+
+### Step 1: Infrastructure (prompt_composer + spawner + db)
+
+| Sub-step | File | Content | Risk |
+|--------|------|------|------|
+| 1a | `prompt_composer.py` | Add `action_type` and `action_steps` fields to PromptContext | Very low |
+| 1b | `spawner.py` | Extract `action_type` and `action_steps` on the handler path | Very low |
+| 1c | `db.py` | Handle the comments table CHECK constraint | Low |
+
+### Step 2: Strengthen ToolchainHandler
+
+| Sub-step | File | Content | Risk |
+|--------|------|------|------|
+| 2a | `toolchain_handler.py` | ToolchainContextSection: add steps rendering + action_hint | Low |
+| 2b | `toolchain_handler.py` | ToolchainApiSection: switch to action_report instructions | Low |
+| 2c | `toolchain_handler.py` | ToolchainConstraintsSection: add the Red Flags table | Low |
+| 2d | `toolchain_handler.py` | verify_completion: check action_report first (fallbacks kept) | Medium |
+| 2e | `toolchain_handler.py` | on_failure: keep the existing logic (mark failed + notify Pang Tong) | None |
+
+### Step 3: Rework toolchain_routes
+
+| Sub-step | File | Content | Risk |
+|--------|------|------|------|
+| 3a | `toolchain_routes.py` | Add `_toolchain_db_path()` + `_send_toolchain_task()` | Low |
+| 3b | `toolchain_routes.py` | Switch 8 handlers to `_send_toolchain_task` (all except PR merged) | Medium |
+| 3c | `mail_notify.py` | Add `no_action_report` to `_REASON_MAP` | Very low |
+
+### Step 4: Testing + Verification
+
+| Sub-step | Content |
+|--------|------|
+| 4a | Unit test: ToolchainContextSection steps rendering |
+| 4b | Unit test: verify_completion action_report check + fallbacks |
+| 4c | Unit test: _send_toolchain_task writes to the _toolchain DB |
+| 4d | Integration test: webhook → toolchain task → Agent → action_report → done |
+| 4e | Regression test: the _mail path is unaffected (inform/request unchanged) |
+| 4f | Regression test: PR merged still goes through _send_mail (inform) |
+
+---
+
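+## §14. Risk Assessment
+
+| Risk | Likelihood | Impact | Mitigation |
+|------|------|------|----------|
+| Agent does not submit action_report | Medium | High | Prompt hard constraints + Red Flag table + verify failure marks failed + notify Pang Tong + fallbacks (output/comment) |
+| Agent submits a false action_report | Low | Medium | Downstream events expose it (CI does not pass, Reviewer does not receive a Review) |
+| Agent confuses toolchain and mail semantics | Low | Low | ToolchainContextSection states explicitly "this is an event that requires action" |
+| _toolchain DB not initialized | Low | Medium | `_toolchain_db_path()` calls `init_db()` to make sure the directory and tables exist |
+| ticker does not scan _toolchain | Low | Medium | TaskTypeRegistry.virtual_projects() already discovers it automatically |
+| A handler is missed during the rework | Medium | Low | Review each handler + integration test coverage |
+| steps content incomplete or inaccurate | Medium | Low | Code review + derive steps from the existing §13 §15.5 templates |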
+
+---
+
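+## §15. Prior Practice Referenced
+
+### 15.1 §13 §15.5 Hard-Constraint Workflow Templates
+
+§13 §15.5 already defines six hard-constraint templates, each with numbered steps, Gitea API call instructions, and time limits. This design builds on them by:
+- Extracting the steps from template plain text into the `steps` field of must_haves JSON (structured and programmable)
+- Using the hard constraints in ToolchainHandler's PromptSections to make sure the Agent knows it must act
+- Replacing "reply to this Mail to confirm" with "submit an action report" (better suited to automated verification)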
+
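+### 15.2 Superpowers: Red Flags Table
+
+Superpowers uses a Red Flags table to prevent Agent self-rationalization. This design adds Red Flag examples to ToolchainConstraintsSection (e.g. "I only need to glance at this notification" → ❌ Wrong!) to stop the Agent from downgrading an action to an inform.
+
+### 15.3 §14 Handler Architecture
+
+§14 defines the TaskTypeRegistry + Handler architecture, and ToolchainHandler is already implemented and registered. This design wires up ToolchainHandler and strengthens its constraints: it does not build a new system, it makes the existing one run as designed.
+
+### 15.4 Hermes: "Keep calling tools until complete AND verified"
+
+Hermes's verification philosophy is that execution and verification are inseparable. The action_report check in verify_completion applies that idea: it checks not what the Agent "said" but what "proof of execution" the Agent produced.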
+
+---
+
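+## §16. Future Evolution
+
+### 16.1 External State Verification (Enhancement)
+
+External state verification could later be added selectively for some action_type values as a supplement:
+- `review_request`: check whether the PR has a new Review
+- `ci_failure`: check whether the PR has a new push
+- `issue_assigned`: check whether a linked PR has been created
+
+This is an optional enhancement that layers more precise verification on top of action_report.
+
+### 16.2 Dynamic steps
+
+steps are currently hard-coded at each `_send_toolchain_task` call site. They could later be generated dynamically from the event content (e.g. fix suggestions based on the specific test files that failed in CI).
+
+### 16.3 Timeout Detection
+
+Following the timeout handling mechanism in §15.4:
+- No action_report within N hours of the toolchain task being created → ticker resends a reminder
+- On severe timeout, spawn Pang Tong to step in
+
+For now this is handled by the verify_completion fail → on_failure → notify mechanism.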
+
+---
+
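+## §17. Review Checklist
+
+- [ ] §1 Is the problem statement accurate (ToolchainHandler not wired up; toolchain events went through MailHandler inform)
+- [ ] §2 Is the three-layer hard-constraint design complete (input / execution / output)
+- [ ] §3 Input constraint: must_haves JSON structure + ToolchainContextSection rendering enhancements
+- [ ] §4 Execution constraint: does the Red Flags table cover the common self-rationalization patterns
+- [ ] §5 Output constraint: action_report verify mechanism + fallback design
+- [ ] §6 Are the scenario step definitions complete (8 action action_type values + 1 inform action_type)
+- [ ] §7 Is the _send_toolchain_task function design correct
+- [ ] §8 Are the PromptContext / spawner changes consistent with the §14 architecture
+- [ ] §9 Does the DB isolation match the original §14 design
+- [ ] §10 Is the impact scope complete
+- [ ] §11 Is backward compatibility sufficient
+- [ ] §12 Are design decisions D17-1 through D17-7 sound
+- [ ] §13 Is the implementation plan feasible (4 incremental steps)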