auto-sync: 2026-05-17 20:17:17

2026-05-17 20:17:17 +08:00
parent f598389a5c
commit 55d7a6b37a
1 changed files with 416 additions and 301 deletions
@@ -1,33 +1,21 @@
 # Agent Routing Mechanism Redesign Proposal

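-**Version**: v1.0  
+**Version**: v2.0  
 **Author**: 庞统 (Deputy Strategist) 🐦  
 **Date**: 2026-05-17  
-**Status**: Pending confirmation  
-**Trigger**: E2E testing showed the review stage dispatching the wrong Agent (张飞 was sent to review his own work); the root cause is hard-coded routing in the Daemon
+**Status**: Pending review  
+**Trigger**: E2E testing showed the review stage dispatching the wrong Agent (张飞 was sent to review his own work); the root cause is hard-coded routing in the Daemon  
+**Reviewer**: 司马懿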

 ---

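 ## 1. Problem Diagnosis

-### 1.1 Current Implementation
+### 1.1 Root Cause of the Bug

-```
-Ticker tick → dispatcher.decide(task, action_type) → returns agent_id → spawn
-```
+Over the task lifecycle, `assignee` is set only during the execution stage (张飞 claims the task → assignee="zhangfei-dev"). When the task reaches the review stage, `decide()` takes Level 2: `task.assignee` is in the registered-agent list → the task is dispatched to 张飞 again.

-Logic of `decide()`:
-1. action_type is a mechanical check → the Daemon runs it locally
-2. task.assignee is set and registered → spawn that agent (**uses assignee directly**)
-3. task.assignee is empty → look up capability_map → fall back to 庞统
-
-### 1.2 Root Cause of the Bug
-
-Over the task lifecycle, assignee is set only during the **execution stage** (张飞 claims the task → assignee="zhangfei-dev").
-
-At the **review stage**, the ticker calls `dispatcher.dispatch(task, action_type="review")`, but `decide()` takes Level 2: `task.assignee="zhangfei-dev"` is in the registered-agent list → the task is dispatched to 张飞 again.
-
-### 1.3 The Deeper Problem
+### 1.2 The Deeper Problem

 **The Daemon is making decisions that the AI should make.** The v2.6 architecture states explicitly: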

@@ -35,9 +23,9 @@ Ticker tick → dispatcher.decide(task, action_type) → 返回 agent_id → spa
 |------|-------------|---------|
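 | Decision maker | Agent (decides autonomously on the blackboard) | Daemon (hard-coded if-else) |
 | Daemon role | Courier (executes decisions posted on the blackboard) | Scheduler (decides who does what) |
-| Orchestration | AI agents claim work autonomously on the blackboard (dynamic collaboration) | Config-table driven (no AI judgment) |
+| Orchestration | AI agents claim work autonomously on the blackboard | Config-table driven (no AI judgment) |

-The T3-10 design text says "**config-table driven, not AI judgment**", which contradicts the core principle of v2.6.
+The T3-10 design text says "config-table driven, not AI judgment", which contradicts the core principle of v2.6.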

 ---

@@ -45,359 +33,486 @@ T3-10 的设计原文写着"**配置表驱动非 AI 判断**"——这和 v2.6

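 ### 2.1 Academic Frontier

-#### bMAS (Blackboard Multi-Agent System), arXiv 2507.01701
-
-**Core mechanism**: the Control Unit (LLM-driven) **dynamically selects** which Agent acts next, based on the current blackboard content.
-
-Key findings:
-- Not a fixed DAG: the Control Unit decides the next step from the blackboard state
-- Better token efficiency (smart routing spends nothing on irrelevant Agents)
-- Agents act in turn → update the blackboard → the Control Unit judges → repeat until consensus
-
-#### Self-Selection, arXiv 2510.01285
-
-**Core finding**: **tasks are not explicitly assigned to Agents.** Instead, a central Agent posts requirements to the blackboard and **each Agent decides on its own whether to participate**.
-
-> "Tasks are not explicitly assigned to helper agents; instead, each agent autonomously decides whether to participate based on its capabilities."
-
-This is the most AI-native pattern: it needs no routing rule table at all.
-
-#### MasRouter (Confidence-Aware Routing), arXiv 2601.04861
-
-Selects model size dynamically by task complexity and introduces a confidence mechanism:
-- Simple task → small model
-- Complex task → large model
-- Agent reliability scores are updated dynamically from past performance
-
-#### AgentGate, arXiv 2604.06696
-
-A structured routing engine that uses a 3B-7B small model for routing decisions, with a candidate-aware fine-tuning strategy. It validates that "routing itself can be AI".
+| Source | Core finding | Value to us |
+|------|---------|-------------|
+| **bMAS** arXiv 2507.01701 | The Control Unit (LLM-driven) dynamically selects an Agent based on the current blackboard content | Routing itself can be an LLM call rather than if-else |
+| **Self-Selection** arXiv 2510.01285 | Tasks are not explicitly assigned; each Agent decides from its own capabilities whether to participate | The most AI-native pattern and our evolution target |
+| **MasRouter** arXiv 2601.04861 | Selects model size dynamically by task complexity, plus a confidence mechanism | Confidence threshold + dynamic scoring from past performance |
+| **AgentGate** arXiv 2604.06696 | A 3B-7B small model makes structured routing decisions | Validates that "routing can be AI" |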

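 ### 2.2 Production Practice

-#### Microsoft Conductor (2026.05)
+| Project | Pattern | Takeaway |
+|------|------|------|
+| **Microsoft Conductor** (open-sourced 2026.05) | Deterministic YAML orchestration | Layered mix of deterministic flows and LLM dynamic routing |
+| **Azure Agent Patterns** | 5 patterns: sequential / concurrent / group chat / **Handoff** / Magentic | **Handoff**: when an Agent finishes, it decides for itself whom to hand off to |
+| **AWS dynamic dispatch** | Event-driven + context-aware routing | Routing becomes an event instead of polling |
+| **Claude Code Agent Teams** | Lead coordinator + context isolation | The Lead handles decomposition, assignment, and monitoring; subagents receive only the relevant context |

-A freshly open-sourced deterministic orchestration tool. Core idea: **workflows are defined in YAML and routing is deterministic**.
+### 2.3 Leads from Existing Research

-Its positioning, however, is that when a task is **not exploratory** (e.g. a code review pipeline), deterministic routing is more reliable than LLM dynamic routing. The key insights:
-- **Exploratory tasks** → LLM orchestration (dynamic)
-- **Deterministic flows** → declarative orchestration (YAML)
-- The two are not mutually exclusive; they form a **layered mix**
-
-#### AWS Dynamic Dispatch Pattern
-
-Event-driven architecture + dynamic dispatch: LLM calls become intelligently routed, context-aware events.
-
-#### Azure Agent Orchestration Patterns
-
-Five patterns: sequential, concurrent, group chat, handoff, Magentic.
-- **Handoff pattern**: after finishing its part, an Agent **decides for itself whom to hand off to**
-- Key point: control passes from one Agent to another, with no central scheduler
-
-### 2.3 Leads from Existing Research Reports
-
-| Source | Key insight |
-|------|---------|
-| shared-consciousness-research.md | The Control Unit is LLM-driven, not rule-based routing; Agent capability profiles are key |
-| v2.6-research-01 | Hermes does not trust an Agent's completion claim (system-level protection); the Claude Code Lead coordinates actively |
-| v2.6-research-02 | Event-driven: complete→auto-unlock is the core pattern |
-| architecture-v2.6.md | **"Agent decides, Daemon executes"**; the Daemon is a courier, not a decision maker |
+- architecture-v2.6.md: **"Agent decides, Daemon executes"**; the Daemon is a courier, not a decision maker
+- shared-consciousness-research.md: the Control Unit is LLM-driven, not rule-based routing
+- v2.6-research-01: Hermes hallucination gating, i.e. do not trust an Agent's completion claim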

 ---

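 ## 3. Design Principles

-Three core principles distilled from the research:
-
-### P1: Routing decisions live in the Agent layer, not the Daemon layer
-
-The Daemon only "delivers": it reads the blackboard, spawns Agents, and cleans up sessions. **The decision "who should do this task" is made by the Agent itself or driven by declarative data on the blackboard.**
-
-### P2: Agents declare their capabilities and intent through the blackboard
-
-Instead of the Daemon maintaining a capability_map, **each Agent registers its own capability profile on the blackboard**. The Daemon queries the blackboard to find a matching Agent.
-
-### P3: The executor declares what the next step needs
-
-When the executing Agent finishes and submits its output, it declares "which capability the next step needs". The Daemon reads that declaration, finds a matching Agent, and spawns it.
+| # | Principle | Description |
+|---|------|------|
+| P1 | Routing decisions live in the Agent layer, not the Daemon layer | "Who should do this task" is decided by the Agent itself or by an LLM; the Daemon only executes |
+| P2 | The current Agent knows best who is needed next | Whoever just finished the work knows best whom to hand off to (Azure Handoff) |
+| P3 | Routing is auditable | Every routing decision is recorded on the blackboard and can be traced back |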

 ---

-## 4. Solution Design
+## 4. Three Routing Modes

-### 4.1 Core Mechanism: Agent Capability Profiles + Declarative Routing
+### 4.1 Mode Overview

-#### Mechanism 1: Agent Capability Profile (Agent Profile)
-
-Each Agent registers its own capability profile on the blackboard (not hard-coded in the Daemon):
-
-```yaml
-# Stored in the blackboard's agents table or a separate agent_profiles table
-zhangfei-dev:
-  capabilities: [coding, implementation, scripting]
-  can_review: false        # 张飞 does not review
-  max_concurrent: 1
-  performance_score: 0.85  # dynamic score based on past performance
-
-simayi-challenger:
-  capabilities: [review, quality_check, debate]
-  can_review: true         # 司马懿 specializes in review
-  max_concurrent: 2
-  performance_score: 0.92
-
-pangtong-fujunshi:
-  capabilities: [planning, coordination, escalation, strategy]
-  can_review: true
-  is_fallback: true        # 庞统 is the final fallback
-  max_concurrent: 3
-  performance_score: 0.90
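+```
+┌───────────────────────────────────────────────────────────┐
+│                  Routing decision entry                   │
+│                  Dispatcher.decide(task)                  │
+└────────┬────────────────────┬────────────────────┬────────┘
+         │                    │                    │
+┌────────▼────────┐  ┌────────▼────────┐  ┌────────▼────────┐
+│     Mode A      │  │     Mode B      │  │     Mode C      │
+│   LLM routing   │  │  Agent handoff  │  │Agent self-claim │
+│  (centralized)  │  │ (decentralized) │  │ (decentralized) │
+└────────┬────────┘  └────────┬────────┘  └────────┬────────┘
+         │                    │                    │
+  LLM picks Agent     Executor says who     Agent claims it
 ```

-**Key point**: the capability profile is declarative and can evolve. An Agent's SOUL.md/IDENTITY.md already defines its capabilities. On startup the Daemon reads the Agent configuration and writes it to the blackboard.
+### 4.2 Mode A: LLM Routing (centralized)

-#### Mechanism 2: Declarative Flow of the Task Lifecycle
+**Scenario**: first assignment (pending → claimed), escalation on failure (failed/blocked), and any case without an explicit handoff instruction.

-The task's `status` field still drives the state machine, but **which capability each state needs is declared by metadata on the blackboard**, not hard-coded in the Daemon:
+**Mechanism**: the Daemon makes one lightweight LLM API call, passing the task information, the Agent capability profiles, and the load status; the LLM returns the selected Agent, a reason, and a confidence score.
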
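-```python
-# The task's metadata field stores the lifecycle declaration
-# Set at creation by the creator (the user or 庞统) or by a default template
-TASK_LIFECYCLE = {
-    "pending": {
-        "needs": "execution",      # the pending stage needs the execution capability
-        "capability": "auto",      # inferred from task_type, or declared explicitly
-    },
-    "review": {
-        "needs": "review",         # the review stage needs the review capability
-        "capability": "review",    # always look up Agents with the review capability
-        "exclude_assignee": True,  # exclude the executor (no self-review)
-    },
-    "failed": {
-        "needs": "escalation",     # a failure needs the escalation capability
-        "capability": "escalation",
-    }
-}
+**Key point**: this does not spawn an Agent session; it is a single API call of ~300 tokens (~1-2s, <¥0.01).
+
+```
+Input: task description + 6 Agent profiles + load
+Output: {"agent_id": "xxx", "reason": "...", "confidence": 0.9}
+Constraints: ~200 token response, temperature=0.1
 ```

-**This is not a template!** It is the semantics inherent in the task lifecycle itself. The difference:
-- **Template (v1.0)**: predefines the complete DAG flow, with every node fixed
-- **Declarative flow (this proposal)**: declares only which capability each state needs; who actually does it is matched dynamically from the capability profiles
+### 4.3 Mode B: Declarative Agent Handoff (decentralized) ⭐ most frequent

-#### Mechanism 3: The Executor Declares the Next Step
+**Scenario**: after finishing the current stage, the Agent states explicitly what the next step needs.

-When submitting its finished output, the Agent can declare what the next step needs:
+**Mechanism**: the Agent attaches a `next_capability` field to its POST /status call: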

 ```json
-// When the Agent calls POST /api/projects/{pid}/tasks/{id}/status
 {
  "status": "review",
  "agent": "zhangfei-dev",
-  "next_capability": "review",      // declares that the next step needs the review capability
+  "next_capability": "review",
  "handoff_note": "Code is implemented; please review quality and security"
 }
 ```

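-The Daemon reads `next_capability`, finds a matching Agent in the capability profiles (excluding the current assignee), and spawns it.
+The Daemon reads `next_capability`, looks up the Agent capability profiles for a match (excluding the current executor), and spawns it directly.

-If `next_capability` is not declared, the Daemon infers it from `TASK_LIFECYCLE[status].needs`.
+**This is the most direct realization of "Agent decides, Daemon executes"**: whoever just finished the work knows best who is needed next. No LLM call is required, so routing costs only a profile lookup (a few milliseconds at most).

-### 4.2 Rewriting the Daemon Routing Logic
+### 4.4 Mode C: Agent Self-Claim (decentralized), Future Evolution
+
+**Scenario**: the Daemon broadcasts the task requirement and each Agent decides for itself whether to claim it.
+
+**Not implemented in the current phase**; room is kept for it. The data structures (agent_profiles, capabilities) stay the same; only "the Daemon looks up the table and dispatches" changes to "the Daemon broadcasts and the Agent claims".
+
+### 4.5 Mode Selection Logic
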
 ```python
-class Dispatcher:
-    """Agent router: declarative routing based on capability profiles"""
+def decide(self, task, action_type=""):
+    # Deterministic fast path (no LLM call)
+    if self._is_deterministic(task, action_type):
+        return self._deterministic_route(task, action_type)
    
-    def decide(self, task: Task, action_type: str = "") -> dict:
-        # Level 1: purely mechanical check → run locally by the Daemon (unchanged)
-        if action_type in self.LOCAL_ACTIONS:
-            return {"level": DispatchLevel.LOCAL, ...}
-        
-        # Level 2: routing by capability profile (replaces the hard-coded assignee lookup)
-        needed_capability = self._resolve_needed_capability(task, action_type)
-        exclude = self._get_exclusions(task, action_type)
-        agent_id = self._find_agent_by_capability(
-            needed_capability, 
-            exclude_agents=exclude
-        )
-        
-        if agent_id:
-            return {
-                "level": DispatchLevel.FULL_AGENT,
-                "agent_id": agent_id,
-                "reason": f"Matched capability '{needed_capability}' → {agent_id}",
-            }
-        
-        # Level 3: no match → fall back to 庞统
-        return {
-            "level": DispatchLevel.FULL_AGENT,
-            "agent_id": "pangtong-fujunshi",
-            "reason": "No agent matched capability, fallback to coordinator",
-        }
-
-    def _resolve_needed_capability(self, task: Task, action_type: str) -> str:
-        """Infer which capability the current task stage needs"""
-        
-        # 1. Prefer the next_capability declared by the Agent (handoff_note on the blackboard)
-        if task.next_capability:
-            return task.next_capability
-        
-        # 2. Use the lifecycle requirement for the task's current status
-        lifecycle = TASK_LIFECYCLE.get(task.status)
-        if lifecycle:
-            return lifecycle["capability"]
-        
-        # 3. Fall back to the task type
-        return self._infer_from_task_type(task.task_type)
-
-    def _get_exclusions(self, task: Task, action_type: str) -> set:
-        """Collect the Agents that must be excluded"""
-        exclude = set()
-        lifecycle = TASK_LIFECYCLE.get(task.status, {})
-        
-        # review stage: exclude the executor (no self-review)
-        if lifecycle.get("exclude_assignee") and task.assignee:
-            exclude.add(task.assignee)
-        
-        return exclude
-
-    def _find_agent_by_capability(self, capability: str, 
-                                   exclude_agents: set = None) -> str | None:
-        """Find a matching Agent in the capability profiles"""
-        candidates = []
-        for agent_id, profile in self.agent_profiles.items():
-            if agent_id in (exclude_agents or set()):
-                continue
-            if capability in profile.get("capabilities", []):
-                candidates.append(agent_id)
-        
-        if not candidates:
-            return None
-        
-        # Several candidates: pick the one with the lowest load
-        if len(candidates) > 1:
-            return min(candidates, 
-                       key=lambda a: self.counter._active.get(a, 0))
-        
-        return candidates[0]
+    # Mode B: the Agent declared next_capability → match directly
+    if task.next_capability:
+        return self._match_capability(task.next_capability, 
+                                       exclude={task.assignee})
+    
+    # Mode A: no explicit handoff → LLM routing
+    return self._llm_route(task, action_type)
 ```

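-### 4.3 Semantic Change of the assignee Field
-
-Current: `assignee` is the "owner" (of the whole task); once set, it persists across the full lifecycle.
-
-**Change to**: `assignee` is the "executor of the current stage", updated on every status transition.
-
-```python
-# Update assignee automatically on status transition
-def transition_status(task_id, new_status, agent):
-    # ...
-    if lifecycle.get("exclude_assignee"):
-        # review stage: assignee becomes the reviewer
-        old_assignee = task.assignee  # keep the executor's identity
-        task.previous_assignee = old_assignee  # new field
-        task.assignee = new_agent_id  # set to the reviewer
-```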
-
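-### 4.4 Alignment with the v2.6 Architecture
-
-| v2.6 principle | How this proposal implements it |
-|-----------|----------|
-| Agent decides, Daemon executes | Routing decisions rest on the Agent's capability profile (capabilities the Agent declares); the Daemon only matches |
-| The Daemon is a courier, not a decision maker | The Daemon makes no value judgment about "who should do what"; it only matches capabilities |
-| Orchestration means AI agents claim work autonomously | Agents declare their own capabilities and what the next step needs |
-| The blackboard is the single source of truth | Capability profiles and task lifecycle declarations all live on the blackboard |
-
-### 4.5 Essential Difference from the Template Mechanism
-
-| Dimension | v1.0 template | Current capability_map | This proposal |
-|------|----------|--------------------| -------|
-| Where routing is defined | Template YAML | Daemon config YAML | Blackboard (Agent capability profiles) |
-| Who defines capabilities | User/developer | Developer | **The Agent itself** (SOUL.md → blackboard) |
-| Who handles each stage | Fixed by the template | Hard-coded in config | Declarative matching + exclusion rules |
-| Extensibility | Add a template | Change code | Register an Agent |
-| Degree of AI-nativeness | Low | Low | **Medium-high** (Agent self-declaration) |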
-
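-### 4.6 Evolution Roadmap
-
-This proposal is a **pragmatic first step**. It is not the final AI-native form but a **key stepping stone** between "hard-coded in the Daemon" and "Agents claim work autonomously":
-
-```
-Now: Daemon if-else hard-coding
-  ↓ this proposal
-Step 1: Agent capability profiles + declarative routing (the Daemon matches capabilities)
-  ↓ future
-Step 2: Agents claim work autonomously (the Daemon only broadcasts; Agents claim)
-  ↓ further out
-Step 3: bMAS Control Unit (LLM-driven dynamic selection)
-```
-
-Migrating from step 1 to step 2 is cheap: the capability profiles and the declarative routing mechanism stay the same; only "the Daemon finds a match → dispatches" becomes "the Daemon broadcasts the requirement → the Agent claims it". These are two ways of consuming the same data structure.
+**The deterministic fast path** covers:
+- Mechanical checks (L1_guardrail, format_check) → run locally by the Daemon
+- An existing assignee with no lifecycle transition (e.g. crashed → retry with the same agent) → use it directly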
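
The selection logic in 4.5 calls `_is_deterministic` and `_deterministic_route` without defining them. A minimal sketch of what they might look like, assuming the fast path is exactly the two cases listed above and that a retry is signalled by `action_type == "retry"`, as in the Dispatcher rewrite in 5.3. The standalone function names and the `Task` dataclass are illustrative stand-ins, not part of the proposal:

```python
from dataclasses import dataclass
from typing import Optional

# Action names taken from LOCAL_ACTIONS in the Dispatcher rewrite (section 5.3).
LOCAL_ACTIONS = {"L1_guardrail", "format_check", "file_exists_check"}


@dataclass
class Task:
    """Stand-in for the real task model (assumption: only these fields matter here)."""
    id: str
    status: str
    assignee: Optional[str] = None


def is_deterministic(task: Task, action_type: str) -> bool:
    """True when routing needs neither a capability match nor an LLM call."""
    if action_type in LOCAL_ACTIONS:
        return True
    # A retry goes back to the same agent, so an assignee must already exist.
    return action_type == "retry" and task.assignee is not None


def deterministic_route(task: Task, action_type: str) -> dict:
    """Build the routing result for a task that passed is_deterministic()."""
    if action_type in LOCAL_ACTIONS:
        return {"level": "local", "mode": "deterministic"}
    return {"level": "full_agent", "agent_id": task.assignee,
            "mode": "deterministic"}
```

A review transition never qualifies, so it always reaches Mode B or Mode A, which is what prevents the original self-review bug.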

 ---

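-## 5. Concrete Change List
+## 5. Core Component Design

-### 5.1 Data Model Changes
+### 5.1 Agent Capability Profile (Agent Profile)

-| Change | Description |
-|------|------|
-| New `agent_profiles` table (or extend the agents table) | Stores Agent capability profiles |
-| New `next_capability` column on tasks | The capability the Agent declares for the next step |
-| New `previous_assignee` column on tasks | Keeps the previous stage's executor across a status transition |
-| `assignee` semantic change | From "task owner" to "executor of the current stage" |
+Each Agent declares its capabilities in configuration (**not hard-coded in the Daemon**):

-### 5.2 Code Changes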
+```yaml
+# config/default.yaml → new agent_profiles section (the key the Dispatcher reads)
+agent_profiles:
+  zhangfei-dev:
+    capabilities: [coding, implementation, scripting]
+    can_review: false
+    max_concurrent: 1
+    
+  simayi-challenger:
+    capabilities: [review, quality_check, debate]
+    can_review: true
+    max_concurrent: 2
+    
+  guanyu-dev:
+    capabilities: [risk, compliance, position_check]
+    can_review: true
+    max_concurrent: 1
+    
+  zhaoyun-data:
+    capabilities: [data, acquisition, cleaning, verification]
+    can_review: false
+    max_concurrent: 1
+    
+  jiangwei-infra:
+    capabilities: [deploy, infrastructure, docker, vnpy]
+    can_review: false
+    max_concurrent: 1
+    
+  pangtong-fujunshi:
+    capabilities: [planning, coordination, escalation, strategy]
+    can_review: true
+    is_fallback: true
+    max_concurrent: 3
+```
+
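+On startup the Daemon reads the configuration and writes it to the blackboard's `agent_profiles` table. This can later evolve into Agents registering themselves.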
+
+### 5.2 LLM Router (LLMDriver)
+
+```python
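+import json
+from dataclasses import dataclass
+
+from openai import OpenAI
+
+
+@dataclass
+class RouteDecision:
+    agent_id: str
+    reason: str
+    confidence: float
+
+
+class LLMDriver:
+    """bMAS Control Unit: lightweight LLM routing decision"""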
+    
+    def __init__(self, model: str, api_base: str, api_key: str):
+        self.model = model
+        self.client = OpenAI(base_url=api_base, api_key=api_key)
+    
+    def route(self, task, agent_profiles, active_agents) -> RouteDecision:
+        prompt = self._build_prompt(task, agent_profiles, active_agents)
+        
+        response = self.client.chat.completions.create(
+            model=self.model,
+            messages=[{"role": "user", "content": prompt}],
+            response_format={"type": "json_object"},
+            max_tokens=200,
+            temperature=0.1,
+        )
+        
+        result = json.loads(response.choices[0].message.content)
+        return RouteDecision(
+            agent_id=result["agent_id"],
+            reason=result["reason"],
+            confidence=result.get("confidence", 0.5),
+        )
+```
+
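+**Routing prompt template**:
+
+```
+You are a task router. Pick the most suitable Agent based on the task requirements and the Agents' capabilities.
+
+## Current task
+- ID: {task_id}
+- Title: {task_title}  
+- Status: {task_status}
+- Description: {task_description}
+- Previous executor: {previous_assignee}
+
+## Available Agents
+{per Agent: ID, capability list, current load}
+
+## Constraints
+1. review/quality_check must not go to the previous executor
+2. Among equally capable Agents, prefer the one with the lowest load
+3. The Agent must match the capability the task requires
+
+## Output
+{"agent_id": "...", "reason": "...", "confidence": 0.0-1.0}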
+```
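
`LLMDriver.route` above trusts the model to return well-formed JSON. A minimal sketch of a stricter parsing step, assuming the fallback agent is `pangtong-fujunshi` and the threshold is the `confidence_threshold` of 0.7 from the routing config. The function `parse_route_decision` and the extra `mode` field on this `RouteDecision` are hypothetical, not part of the proposal:

```python
import json
from dataclasses import dataclass

FALLBACK_AGENT = "pangtong-fujunshi"  # coordinator used as the fallback in this proposal


@dataclass
class RouteDecision:
    agent_id: str
    reason: str
    confidence: float
    mode: str  # "llm_route" or "fallback"


def parse_route_decision(raw, agent_profiles, threshold=0.7):
    """Turn the raw LLM reply into a decision, falling back on any defect."""
    def fallback(why, confidence=0.0):
        return RouteDecision(FALLBACK_AGENT, why, confidence, "fallback")

    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return fallback("LLM reply was not valid JSON")
    if not isinstance(data, dict):
        return fallback("LLM reply was not a JSON object")

    agent_id = data.get("agent_id")
    try:
        confidence = float(data.get("confidence", 0.5))
    except (TypeError, ValueError):
        return fallback("confidence was not a number")

    # Reject agents the LLM invented as well as low-confidence picks.
    if not isinstance(agent_id, str) or agent_id not in agent_profiles:
        return fallback(f"unknown agent: {agent_id!r}", confidence)
    if confidence < threshold:
        return fallback(f"low confidence ({confidence})", confidence)
    return RouteDecision(agent_id, str(data.get("reason", "")), confidence, "llm_route")
```

Keeping this step a pure function makes the fallback rules testable without calling any LLM API.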
+
+### 5.3 Dispatcher Rewrite
+
+```python
+class Dispatcher:
+    def __init__(self, config, counter):
+        self.counter = counter
+        self.agent_profiles = config.get("agent_profiles", {})
+        self.llm = LLMDriver(
+            model=config.get("routing", {}).get("model", "zhipu/glm-5.1"),
+            api_base=config.get("routing", {}).get("api_base", ""),
+            api_key=config.get("routing", {}).get("api_key", ""),
+        )
+        self.LOCAL_ACTIONS = {"L1_guardrail", "format_check", "file_exists_check"}
+    
+    def decide(self, task, action_type="") -> dict:
+        # ── Fast path: deterministic routing ──
+        if action_type in self.LOCAL_ACTIONS:
+            return {"level": "local", "reason": "mechanical check, run locally by the Daemon"}
+        
+        # retry goes back to the same agent
+        if action_type == "retry" and task.assignee:
+            return {"level": "full_agent", "agent_id": task.assignee,
+                    "reason": "retry with the original executor", "mode": "deterministic"}
+        
+        # ── Mode B: declarative Agent handoff ──
+        if task.next_capability:
+            agent = self._match_capability(task.next_capability, 
+                                            exclude={task.assignee})
+            if agent:
+                return {"level": "full_agent", "agent_id": agent,
+                        "reason": f"executor handoff: needs {task.next_capability}",
+                        "mode": "agent_handoff"}
+        
+        # ── Mode A: LLM routing ──
+        decision = self.llm.route(task, self.agent_profiles, 
+                                   self.counter.active_agents)
+        
+        # Validity check: unknown agent or low confidence → fall back to the coordinator
+        if (decision.agent_id not in self.agent_profiles 
+            or decision.confidence < 0.7):
+            return {"level": "full_agent", "agent_id": "pangtong-fujunshi",
+                    "reason": f"LLM decision rejected (confidence={decision.confidence}): {decision.reason}",
+                    "mode": "fallback"}
+        
+        return {"level": "full_agent", "agent_id": decision.agent_id,
+                "reason": decision.reason, "mode": "llm_route",
+                "confidence": decision.confidence}
+    
+    def _match_capability(self, capability, exclude=None):
+        """Match an Agent from the capability profiles"""
+        candidates = [
+            aid for aid, prof in self.agent_profiles.items()
+            if aid not in (exclude or set())
+            and capability in prof.get("capabilities", [])
+        ]
+        if not candidates:
+            return None
+        if len(candidates) == 1:
+            return candidates[0]
+        return min(candidates, key=lambda a: self.counter.active_agents.get(a, 0))
+```
+
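+### 5.4 assignee Semantic Change
+
+| Dimension | Current | Changed to |
+|------|------|------|
+| Meaning of `assignee` | Task owner (persists across the full lifecycle) | **Executor of the current stage** (updated on each status transition) |
+| New `previous_assignee` | None | Keeps the previous stage's executor (for exclusion and auditing) |
+
+```python
+# Update on each status transition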
+task.previous_assignee = task.assignee
+task.assignee = new_agent_id
+```
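
A minimal sketch of the two-line update above wrapped in a transition helper, showing how `previous_assignee` keeps the executor available for exclusion at review time. The `Task` dataclass and the `hand_over` function are illustrative assumptions, not names from the codebase:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """Stand-in for the real task row (assumption: only routing fields shown)."""
    id: str
    status: str
    assignee: Optional[str] = None
    previous_assignee: Optional[str] = None


def hand_over(task: Task, new_status: str, new_agent_id: str) -> Task:
    """Move the task to the next stage and rotate the assignee fields."""
    task.previous_assignee = task.assignee  # remember who just finished
    task.assignee = new_agent_id            # executor of the new stage
    task.status = new_status
    return task


# Usage: the executor hands the task to a reviewer.
task = Task("test-e2e-001", "working", assignee="zhangfei-dev")
hand_over(task, "review", "simayi-challenger")
```

After the call, `task.assignee` is the reviewer and `task.previous_assignee` is the executor, so a later routing step can still exclude the executor.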
+
+---
+
+## 6. Routing Audit
+
+Every routing decision is written to the blackboard's `routing_decisions` table.
+
+### 6.1 Table Schema
+
+```sql
+CREATE TABLE IF NOT EXISTS routing_decisions (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    task_id TEXT NOT NULL,
+    from_status TEXT,           -- previous status
+    to_status TEXT,             -- target status
+    mode TEXT NOT NULL,         -- deterministic / agent_handoff / llm_route / fallback
+    selected_agent TEXT NOT NULL,
+    previous_agent TEXT,        -- executor of the previous stage
+    reason TEXT,                -- routing rationale
+    confidence REAL,            -- LLM confidence (Mode A only)
+    model TEXT,                 -- LLM model used (Mode A only)
+    latency_ms INTEGER,         -- routing latency
+    created_at TEXT DEFAULT (datetime('now')),
+    FOREIGN KEY (task_id) REFERENCES tasks(id)
+);
+
+CREATE INDEX idx_routing_task ON routing_decisions(task_id);
+```
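
A minimal sketch of writing and reading audit rows with Python's built-in `sqlite3`, assuming the blackboard is a SQLite database. The helper names `open_blackboard`, `record_decision`, and `audit_trail` are hypothetical, and the one-column `tasks` table is a stand-in so the foreign key has a target:

```python
import sqlite3

# Schema copied from 6.1; the minimal tasks table is an assumption for this sketch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (id TEXT PRIMARY KEY);
CREATE TABLE IF NOT EXISTS routing_decisions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id TEXT NOT NULL,
    from_status TEXT,
    to_status TEXT,
    mode TEXT NOT NULL,
    selected_agent TEXT NOT NULL,
    previous_agent TEXT,
    reason TEXT,
    confidence REAL,
    model TEXT,
    latency_ms INTEGER,
    created_at TEXT DEFAULT (datetime('now')),
    FOREIGN KEY (task_id) REFERENCES tasks(id)
);
CREATE INDEX IF NOT EXISTS idx_routing_task ON routing_decisions(task_id);
"""


def open_blackboard(path=":memory:"):
    """Open (or create) the database and make sure the audit schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_decision(conn, task_id, from_status, to_status, decision,
                    previous_agent=None, model=None, latency_ms=None):
    """Persist one dict shaped like the return value of Dispatcher.decide()."""
    conn.execute(
        "INSERT INTO routing_decisions (task_id, from_status, to_status, mode,"
        " selected_agent, previous_agent, reason, confidence, model, latency_ms)"
        " VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (task_id, from_status, to_status, decision["mode"], decision["agent_id"],
         previous_agent, decision.get("reason"), decision.get("confidence"),
         model, latency_ms),
    )
    conn.commit()


def audit_trail(conn, task_id):
    """Return the routing history of one task in insertion order."""
    rows = conn.execute(
        "SELECT from_status, to_status, mode, selected_agent"
        " FROM routing_decisions WHERE task_id = ? ORDER BY id", (task_id,))
    return rows.fetchall()
```

The `idx_routing_task` index serves exactly the `WHERE task_id = ?` lookup in `audit_trail`, which is the query a per-task trace would issue.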
+
+### 6.2 Audit Log Example
+
+```
+task=test-e2e-001 | pending→claimed | mode=llm_route
+  → zhangfei-dev (confidence=0.95, reason="coding task matches the coding capability")
+  → model=zhipu/glm-5.1, latency=1200ms
+
+task=test-e2e-001 | working→review | mode=agent_handoff
+  → simayi-challenger (reason="executor handoff: needs review")
+  → latency=2ms
+
+task=test-e2e-001 | review→done | mode=agent_handoff  
+  → pangtong-fujunshi (reason="review passed, handed to the coordinator to wrap up")
+  → latency=1ms
+```
+
+---
+
+## 7. Routing Model Configuration
+
+### 7.1 Backend Configuration
+
+```yaml
+# New in config/default.yaml
+routing:
+  model: "zhipu/glm-5.1"     # default routing model
+  api_base: ""                # empty = use the OpenClaw Gateway
+  api_key: ""                 # empty = use the OpenClaw default
+  confidence_threshold: 0.7   # fall back below this value
+  max_tokens: 200
+  temperature: 0.1
+```
+
+### 7.2 Frontend Configuration Entry
+
+Add a "Routing model" section at the top of the existing `ModelConfig.tsx` page:
+
+```
+┌─────────────────────────────────────────────┐
+│ 🎯 Routing model (Control Unit)             │
+│ ┌─────────────────────┐ ┌───────┐           │
+│ │ zhipu/glm-5.1     ▾ │ │ Apply │           │
+│ └─────────────────────┘ └───────┘           │
+│ LLM for task routing (prefer a light model) │
+├─────────────────────────────────────────────┤
+│ 🐦 庞统  pangtong-fujunshi                  │
+│ Current: zhipu/glm-5.1                      │
+│ ...                                         │
+```
+
+- The model dropdown reuses the `knownModels` already registered in OpenClaw (the same data source as the Agent model selector)
+- The selection is saved through the backend API `PATCH /api/config/routing`
+- As with `api.setModel`, the change goes through the Gateway model configuration
+
+### 7.3 API
+
+```python
+# New in blackboard_routes.py
+@api_route("GET", "/api/config/routing")
+def get_routing_config(request):
+    return {"model": config.routing.model, 
+            "confidence_threshold": config.routing.confidence_threshold}
+
+@api_route("PATCH", "/api/config/routing")  
+def set_routing_config(request):
+    new_model = request.json.get("model")
+    # TODO: validate that the model is in OpenClaw's registered model list
+    config.routing.model = new_model
+    config.save()
+    return {"ok": True}
+```
+
+---
+
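+## 8. Change List
+
+### 8.1 Data Model
+
+| Change | Type | Description |
+|------|------|------|
+| New `agent_profiles` config section | Config | Each Agent declares its capability list |
+| New `routing` config section | Config | Routing model + parameters |
+| New `next_capability` column on tasks | DDL | The capability the Agent declares for the next step |
+| New `previous_assignee` column on tasks | DDL | Keeps the previous stage's executor |
+| New `routing_decisions` table | DDL | Routing audit log |
+| `assignee` semantic change | Logic | From "task owner" to "executor of the current stage" |
+
+### 8.2 Code

 | File | Change |
 |------|------|
-| `dispatcher.py` | Rewrite `decide()`: capability matching replaces the assignee lookup |
-| `dispatcher.py` | Add `_resolve_needed_capability()`, `_find_agent_by_capability()`, `_get_exclusions()` |
-| `config/default.yaml` | `capability_map` becomes `agent_profiles` (each Agent declares its own capability list) |
-| `blackboard_routes.py` | The status API accepts a `next_capability` parameter |
-| `ticker.py` | `_dispatch_reviews()` uses the new dispatcher routing |
-| `blackboard/db.py` | Add the agent_profiles table / columns |
+| `dispatcher.py` | Rewrite: add LLMDriver + Mode A/B routing logic (Mode C is deferred) |
+| `config/default.yaml` | Add the `agent_profiles` + `routing` config sections |
+| `blackboard_routes.py` | The status API accepts `next_capability`; add the routing config API |
+| `ticker.py` | Use the new dispatcher; write routing results to routing_decisions |
+| `blackboard/db.py` | Add the routing_decisions table DDL; add the new columns on tasks |
+| `ModelConfig.tsx` | Add the routing model configuration section |

-### 5.3 What Stays the Same
+### 8.3 Unchanged

 | Unchanged | Reason |
 |------|------|
 | State machine (pending→claimed→working→review→done) | The state transition semantics are correct |
-| Frontend Dashboard | The frontend is unaware of routing logic |
-| Agent prompt template (S2) | Agents still follow the 4-step flow |
+| Agent prompt template (S2) | Agents still follow the 4-step flow and only pass one extra field on POST /status |
 | Spawner logic | The spawn mechanism is unchanged |
-| API contract (S1) | Transparent to Agents |
+| Frontend Dashboard core layout | Only one section is added to ModelConfig |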

 ---

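-## 6. Comparison with Existing Best Practice
+## 9. Comparison with Existing Practice

 | Practice | Counterpart in this proposal |
 |------|----------|
-| bMAS Control Unit (LLM-driven) | This proposal uses capability profiles for structured matching (cheaper, more deterministic) and can evolve to LLM-driven later |
-| Self-Selection (arXiv 2510.01285) | The evolution direction: Agents claim autonomously instead of being assigned |
-| Handoff pattern (Azure) | An Agent declaring `next_capability` is a Handoff |
-| Declarative orchestration (Conductor) | The TASK_LIFECYCLE declaration is declarative |
-| Capability profiles (OpenClaw RFC #35203) | agent_profiles implements capability profiles directly |
-| Hallucination gating (Hermes) | Unchanged; output verification is independent of routing |
+| bMAS Control Unit (LLM-driven) | Mode A: implemented by LLMDriver as a lightweight API call |
+| Azure Handoff (Agent handoff) | Mode B: next_capability + handoff_note |
+| Self-Selection (arXiv 2510.01285) | Mode C: future evolution, data structures reserved |
+| MasRouter (confidence) | Confidence threshold + fallback mechanism |
+| Microsoft Conductor (deterministic + dynamic mix) | Fast path (deterministic) layered with LLM routing (dynamic) |
+| Hallucination gating (Hermes) | Validity check on LLM output + confidence threshold |
+| "Agent decides, Daemon executes" (v2.6 principle) | Mode B is the most direct implementation: the Agent itself decides whom to hand off to |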

 ---

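-## 7. To Be Confirmed
+## 10. Evolution Roadmap

-1. **Source of `agent_profiles` data**: read from config/default.yaml (written to the blackboard on startup), or parsed dynamically from each Agent's SOUL.md?
-2. **Where `TASK_LIFECYCLE` is defined**: hard-coded in dispatcher.py, or also moved into config?
-3. **Impact of the `assignee` semantic change**: does the frontend Dashboard assume assignee = executor?
-4. **Go straight to "Agents claim work autonomously"** (step 2), or implement this proposal (step 1) first?
+```
+Phase 1 (this implementation): Mode A + Mode B
+  - LLMDriver routing (first assignment, failure scenarios)
+  - Declarative Agent handoff (the most frequent scenario)
+  - Routing audit table
+  - Frontend routing model configuration
+  
+Phase 2 (future): Mode C
+  - Same agent_profiles and capabilities data structures
+  - The Daemon broadcasts the requirement → the Agent claims it
+  - Very low migration cost (data structures unchanged, only the consumption style changes)
+  
+Phase 3 (further out): experience-driven routing
+  - Routing audit data feeds back into the LLM prompt (historical match success rate)
+  - Agent reliability scoring (see MasRouter)
+  - Dynamic capability discovery (profiles update automatically after an Agent completes a new type of task)
+```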

 ---

-## 8. References
+## 11. Review Points for 司马懿
+
+Please focus on:
+
+1. **LLMDriver error handling**: is the fallback strategy on API timeout or failure reasonable
+2. **Safety of Mode B**: does a declared `next_capability` need validation (to prevent malicious targeting)
+3. **Impact scope of the assignee semantic change**: do other modules depend on "assignee = task owner"
+4. **routing_decisions table design**: are the columns sufficient and the index reasonable
+5. **Security of the config API**: does changing the routing model require authentication
+6. **Performance impact**: is Mode A's ~2s latency acceptable within a tick cycle
+
+---
+
+## 12. References

 - bMAS: arXiv 2507.01701, Blackboard LLM Multi-Agent System
 - Self-Selection: arXiv 2510.01285, autonomous Agent self-selection pattern
 - MasRouter: arXiv 2601.04861, Confidence-Aware Routing
+- AgentGate: arXiv 2604.06696, structured routing engine
 - Microsoft Conductor: github.com/microsoft/conductor, deterministic orchestration
 - Azure Agent Patterns: learn.microsoft.com, Handoff pattern
-- OpenClaw RFC #35203, Capability Profiling + Shared Blackboard
 - v2.6 research report: docs/research/shared-consciousness-research.md
 - v2.6 architecture design: docs/design/architecture-v2.6.md
+- T3-10 dispatch criteria: docs/design/topic3-challenge-review-proposal.md §5.4
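
For review point 2, one possible shape of the validation: accept a handoff only if the sender is a registered Agent and some other Agent actually offers the requested capability. This is a sketch under those assumptions; `validate_handoff` is a hypothetical helper, and the profile shape follows the `capabilities` lists in 5.1:

```python
from typing import Optional


def validate_handoff(sender: str, next_capability, agent_profiles: dict) -> Optional[str]:
    """Return an error message, or None when the handoff request is acceptable."""
    if sender not in agent_profiles:
        return f"unknown sender: {sender!r}"
    if not isinstance(next_capability, str) or not next_capability:
        return "next_capability must be a non-empty string"
    # Only agents other than the sender count, so no one can hand off to itself.
    providers = [
        agent_id for agent_id, profile in agent_profiles.items()
        if agent_id != sender and next_capability in profile.get("capabilities", [])
    ]
    if not providers:
        return f"no other agent offers capability {next_capability!r}"
    return None


# Usage with a trimmed-down version of the profiles from section 5.1.
profiles = {
    "zhangfei-dev": {"capabilities": ["coding", "implementation", "scripting"]},
    "simayi-challenger": {"capabilities": ["review", "quality_check", "debate"]},
}
error = validate_handoff("zhangfei-dev", "review", profiles)
```

Because the Agent names a capability rather than a specific agent ID, the check only has to confirm the capability exists; whether that is enough against a misbehaving Agent remains the open question above.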