Stage C Step 1: Executor interface skeleton + HostExecutor in-process backend
- core/executor.py: Executor ABC + ExecCtx (user_id/task_id/working_dir/cancel_check) + ToolResult
- core/executor_host.py: HostExecutor wraps the original tools dict and normalizes the three error cases into ToolResult
- core/loop.py: AgentLoop takes an executor instead of tools; _execute_tool_call collapses into a single call_tool invocation
- core/agent_builder.py: once the tools are assembled, wrap them in HostExecutor(tools) and pass working_dir through to AgentLoop

The interface shape matches the DESIGN §7.5 #5 sketch (`call_tool(name, args, ctx)`) exactly and is backend-agnostic: when the docker backend lands in Step 3, AgentLoop needs zero changes and only the assembly layer is swapped.

No behavior change: all 4 smoke branches (unknown/bad args/happy/schemas) pass, unittest 1/1 PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent d6af9a59fe
commit 48f99cf66d
@@ -2,7 +2,7 @@
 
 > Companion to `DESIGN.md`. This file records only phase status, deviations from decisions, file counts, and next steps. Each entry is 1-2 sentences: what was done plus the key judgment call; for details see `git log` / `git diff` / `DESIGN §7.9`.
 
-Last updated: 2026-05-26 (added the patent skill + the REVISIONS.md revision-log mechanism covering the three artifact-producing skills proposal/patent/ppt)
+Last updated: 2026-05-26 (Stage C Step 1: Executor interface skeleton + HostExecutor in-process backend, no behavior change)
 
 ---
 
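@@ -23,6 +23,7 @@
 
 ### 2026-05-26
 
+- **Stage C Step 1: Executor interface skeleton + HostExecutor in-process backend (lands §7.5 #5)**: `core/executor.py` adds the `Executor` ABC + `ExecCtx` (user_id/task_id/working_dir/cancel_check) + `ToolResult` (content/exit_code); `core/executor_host.py` adds `HostExecutor`, which wraps the original tools dict. Its `call_tool` dispatches internally to the matching `Tool.execute` and normalizes the three error cases (unknown tool / TypeError / raised exception) into an `[Error] ...` content string, distinguished by exit_code. `AgentLoop.__init__` now takes an `executor` instead of the `tools` dict and gains a `working_dir` parameter; `_stream_llm` builds the LLM tools field from `executor.schemas()`; `_execute_tool_call` becomes a single `executor.call_tool(name, args, ctx)` call, and the three original error emits are deleted (unknown/TypeError/Exception are now absorbed by the executor into a ToolResult, leaving a single emit). `agent_builder.py` wraps the assembled tools dict in `HostExecutor(tools)` and passes it to `AgentLoop`. **The interface shape is deliberately backend-agnostic**: it exposes no Docker assumptions such as `docker exec` / `docker cp`, so switching to the docker backend in Step 3 requires zero changes to `AgentLoop`; only `HostExecutor` → `DockerExecutor(host_tools=..., docker_tools={shell, run_python})` in `agent_builder.py` changes. **No behavior change**: the sanity import passes and `unittest discover -s tests` reports 1/1 PASS. `DESIGN.md` untouched (a pure implementation of the existing §7.5 #5 protocol, no architectural drift); `RUN.md` untouched (no new env / CLI changes; the `ZCBOT_SANDBOX_BACKEND` env var is deferred until the Step 3 docker backend is introduced). Rejected: (a) skipping the Executor abstraction and putting `if backend=='docker'` directly in `shell.py/run_python.py`: this violates §7.5 #5, and a future switch to gVisor/Firecracker would scatter changes across the tool layer; (b) an `exec(cmd, ctx)` primitive on Executor instead of the `call_tool(name, args, ctx)` dispatcher: it does not match the DESIGN signature, and host tools (read/web_*/seedream) do not have "command" semantics; (c) relying on the `cancel_check` callable in place of rebuilding ExecCtx: cancel_check is currently assigned via a setter after build, so a cached ctx would point at a stale value, and building an ExecCtx per call is cheap because it is a dataclass.
 - **REVISIONS.md revision-log mechanism (covers the three artifact-producing skills proposal/patent/ppt)**: `<task_dir>/REVISIONS.md` serves as a compact changelog of an artifact's iterations. The task conversation history is a coarse running log (finding last week's change among 50 messages means scrolling), whereas REVISIONS is the list of substantive decisions distilled jointly by the user and the LLM (5 lines are enough to reconstruct "why was this chapter written this way last week"). It complements the spec: **spec = constitution (sets the tone once), REVISIONS = implementation log (appended at every sticking point)**. Each of the three SKILL.md files gains (a) an extra drafting step, "append one line after the user confirms a substantive change", and (b) a standalone "## 修订日志" (revision log) section (a when-to-record / when-not-to-record table + format conventions + examples + operations). The criteria for a "substantive change" are tailored per domain: proposal = technical approach / assessment metrics / innovation points / sub-project breakdown / key citations / budget structure; patent = distinguishing technical features / key parameters / formulas / embodiments / sections; ppt = layout / primary color / pages / icons / key copy points. Shared principles: do not record the first draft, typo-level tweaks, or the model's own back-and-forth edits; when in doubt, lean toward not recording, so the log does not turn into a running account. The chosen format is **single-line bullets in reverse chronological order** (timestamp first, then a file:section locator, then what changed and why); new lines are inserted with edit right after the header comment (not appended at the end, so the newest entry is visible at a glance). Rejected: (a) a soft constraint in the system prompt: it would impose an irrelevant constraint on non-artifact skills such as coding/research/documents/imagegen/videogen; (b) a new `record_revision` tool: during development, having the LLM append via edit is sufficient, a tool adds call overhead to every small change, and it can be promoted to a tool later if the LLM turns out to miss many entries; (c) splitting into one file per artifact (`<topic>.revisions.md`): a single file is easier to read and keeps one timeline across artifacts. `DESIGN.md` untouched (no architectural change); `RUN.md` untouched (no CLI/env change).
 - **New patent skill (technical disclosure document for Chinese invention patents)**: `skills/patent/` with a complete set of 6 files: `SKILL.md` as the main entry point (five-stage workflow: ingest → mine patent points → search → spec → chapter-by-chapter drafting → self-check and render, with the same BLOCKING cadence as proposal) + 4 guides under `references/{disclosure_structure,patent_point_taxonomy,prior_art_search,self_check}.md` + 2 templates under `templates/{spec,disclosure}.md`. **Key reuse to avoid reinventing**: ① source-material ingestion uses the `markitdown` CLI (no built-in docx/pptx→md); ② mermaid + docx rendering directly reuses `skills/proposal/scripts/{render_diagrams,render_docx}.py` (the arguments are compatible, so patent does not write its own); ③ prior-art search goes through the existing `web_search`/`web_fetch` (Bocha) + `documents` + `research`, with no CNIPA Playwright crawler (heavy anti-scraping, high maintenance cost; a formal search suitable for IDS submission should go through offline professional channels); ④ no revision log is implemented (the zcbot task conversation history already covers it). The 11 prompt files of the source repo `github.com/handsomestWei/patent-disclosure-skill` are folded into a single SKILL.md (consistent with the proposal/ppt style), and its 8 Python tools are reduced to 0 (all via reuse). Skill-specific content: a 7-chapter disclosure skeleton (technical field / background / summary of the invention / drawings / embodiments / beneficial effects / suggested claims) + a check of the three patentability criteria (novelty / inventiveness / practical applicability) + a 9-category list of excluded subject matter + a 6-category self-check list + desensitization boundaries (commercially sensitive terms are neutralized, technical parameters are not desensitized). `SkillRegistry` auto-discovery verified. `DESIGN.md` untouched (no architectural change, purely a new skill); `RUN.md` untouched (no CLI/env change).
 - **The 6-item §7.5 sandbox rollout checklist written into DESIGN (hard protocol for the Stage C implementation)**: before starting Stage C, distill "principles → concrete protocols" so nothing is missed during implementation. ① Hard-coded network blocklist ranges (`169.254.0.0/16` cloud metadata / loopback / the three private ranges / `100.64.0.0/10` CGNAT, **with the PG IP blocked once more separately**; the same attack vector as Capital One 2019); ② egress proxy model (container `HTTP_PROXY` env + iptables DROP except the proxy port so SDKs cannot bypass the env, with the host-side proxy doing domain allowlisting + byte metering + a `network_audit` audit log; the initial allowlist names PyPI / GitHub / npm and others); ③ process-group cleanup protocol (`docker exec` runs through `setsid` + `kill -- -PGID`, preventing `nohup &` / `disown` from persisting across execs and breaking the residual-risk assumption of "no internal isolation within the same user"); ④ the point at which disk quota must be hardened (before opening to external users, upgrade to xfs/ext4 project quota or zfs dataset quota, otherwise the shared fs can be filled between scans and take down the whole node); ⑤ the Executor interface goes through a backend driver + `ZCBOT_SANDBOX_RUNTIME` config injection (a future switch to gVisor/Firecracker/e2b needs zero application-layer changes and avoids leaking the shape of the Docker API into the interface layer); ⑥ tools are dispatched by a two-way trust-domain split: **host in-process**: `read/write/edit/glob/grep/load_skill/web_search/web_fetch` (these already hold credentials on the host / go through paths.py validation, so moving them into the container gains nothing and costs 200ms × N), **container exec**: `shell/run_python` (arbitrary code execution must be isolated). The upgrade trigger signals for the three tiers gVisor / Firecracker / in-container tool-runner are also fixed in writing, with the reverse backstop "no signal, no upgrade". Rejected: (a) writing the checklist into both DESIGN and PROGRESS as dual sources: that is a source of drift, so PROGRESS only points to DESIGN; (b) writing the checklist in a "tick the box" acceptance tone: DESIGN states the why + the protocol shape, and acceptance language goes into the DoD of the next-step candidates in PROGRESS; (c) starting implementation immediately: the design is settled first, and implementation is scheduled as next-step candidate #2 on its own cadence. `RUN.md` untouched (no change in how things run; Stage C is not implemented yet).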
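Checklist item ① above names the blocked ranges but not how a destination would be tested against them. The following is a minimal sketch using only the standard-library `ipaddress` module. The function name `is_blocked`, the `extra_hosts` parameter (standing in for the separately blocked PG IP), and the concrete CIDRs chosen for "loopback" and "the three private ranges" are assumptions for illustration, not the project's implementation.

```python
import ipaddress
from typing import Iterable

# Ranges named in the checklist. The CIDRs for loopback and the three
# private ranges are the standard IPv4 ones (an assumption here).
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "169.254.0.0/16",  # link-local, includes cloud metadata endpoints
        "127.0.0.0/8",     # loopback
        "10.0.0.0/8",      # private
        "172.16.0.0/12",   # private
        "192.168.0.0/16",  # private
        "100.64.0.0/10",   # CGNAT
    )
]


def is_blocked(addr: str, extra_hosts: Iterable[str] = ()) -> bool:
    """Return True if addr is inside a blocked range or equals an extra host."""
    ip = ipaddress.ip_address(addr)
    if any(ip in net for net in BLOCKED_NETWORKS):
        return True
    # extra_hosts is a hypothetical hook for single addresses such as the PG IP.
    return any(ip == ipaddress.ip_address(host) for host in extra_hosts)
```

A check like this would presumably run on the resolved address rather than the hostname, so that a DNS name pointing at a blocked range is still caught.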
@@ -26,6 +26,7 @@ import yaml
 from rich.console import Console
 
 from core.capabilities import ModelCapabilities
+from core.executor_host import HostExecutor
 from core.llm import LLM
 from core.loop import AgentLoop
 from core.memory import memory_block
@@ -438,7 +439,13 @@ def build_agent(
     tools[ws.name] = ws
 
     sink = ConsoleEventSink(console) if console else None
-    agent = AgentLoop(llm, tools, session, caps, user_id=uid, sink=sink)
+    # §7.5 #5 Executor abstraction: in this step everything runs on the host backend (in-process).
+    # Once the Step 3 docker backend lands, `ZCBOT_SANDBOX_BACKEND=docker` will dispatch shell/run_python to the container.
+    executor = HostExecutor(tools)
+    agent = AgentLoop(
+        llm, executor, session, caps,
+        user_id=uid, working_dir=working_dir_path, sink=sink,
+    )
     if cancel_check is not None:
         agent.cancel_check = cancel_check
     return agent, session, sid, task_state, working_dir_path
@@ -0,0 +1,66 @@
+"""Executor interface: the single entry point for tool calls (DESIGN §7.5 rollout checklist #5).
+
+`AgentLoop` does not call `tool.execute` directly; it calls `executor.call_tool(name, args, ctx)`.
+The backend dispatches internally:
+
+- `HostExecutor` (introduced in this step): all tools run in-process, keeping the original `Tool.execute` behavior
+- `DockerExecutor` (introduced in Step 3): `shell` / `run_python` go through `docker exec`,
+  the rest still run on the host per the §7.5 #6 trust-domain split; DockerExecutor then composes a HostExecutor internally
+
+The interface shape is deliberately backend-agnostic: `call_tool(name, args, ctx)` exposes no Docker assumptions such as
+`docker exec` / `docker cp` / `docker stats`. A future switch to gVisor / Firecracker / e2b needs
+zero application-layer changes; only the backend driver is swapped (§7.5 #5 / §7.9 upgrade-trigger table).
+"""
+from __future__ import annotations
+
+from abc import ABC, abstractmethod
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any, Callable, Dict, List, Optional
+from uuid import UUID
+
+
+@dataclass
+class ExecCtx:
+    """Execution context for each tool call.
+
+    Carries identity / scope / cancellation hook; arg-irrelevant information is stripped out of args.
+    The host backend currently uses only cancel_check; the docker backend will use:
+    - user_id → find / start the per-user container
+    - working_dir → build `docker exec --workdir /workspace/<wd_name>`
+    - task_id → temp-file namespace `/tmp/zcbot/<task_id>/`
+    - cancel_check → respond to the stop button while polling (actively `kill -- -PGID`)
+    """
+    user_id: UUID
+    task_id: UUID
+    working_dir: Path
+    cancel_check: Optional[Callable[[], bool]] = None
+
+
+@dataclass
+class ToolResult:
+    """Unified return value for tool calls.
+
+    Today every `Tool.execute` returns a str; the docker backend may later need separate stdout/stderr/
+    exit_code. For now there is a single content field (exactly the string the LLM receives), and exit_code is
+    a hint for backend-internal use (0=ok / 1=tool raised / 2=invalid arguments / 124=timeout, etc.)
+    that does not affect the LLM-facing interface.
+    """
+    content: str
+    exit_code: int = 0
+
+
+class Executor(ABC):
+    """Tool-dispatch abstraction; see the module docstring."""
+
+    @abstractmethod
+    def call_tool(self, name: str, args: Dict[str, Any], ctx: ExecCtx) -> ToolResult:
+        """Execute a single tool call. Always returns a ToolResult and never raises (exceptions are wrapped as exit_code=1)."""
+
+    @abstractmethod
+    def schemas(self) -> List[Dict[str, Any]]:
+        """List of OpenAI tool schemas exposed to the LLM; used by `AgentLoop._stream_llm`."""
+
+    @abstractmethod
+    def has_tool(self, name: str) -> bool:
+        """Whether name is a tool covered by the schema list; mainly for tests / diagnostics."""
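The module docstring above says a later `DockerExecutor` will compose a `HostExecutor` and route only `shell` / `run_python` to the container. The routing idea can be sketched without Docker at all. In the block below, `SplitExecutor` and `FakeBackend` are hypothetical names invented for illustration, `ToolResult` is a condensed copy of the dataclass from this commit, and `ctx` is passed through untyped. This is a sketch of the dispatch shape under those assumptions, not the Step 3 implementation.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass
class ToolResult:
    """Condensed copy of the ToolResult dataclass from this commit."""
    content: str
    exit_code: int = 0


class FakeBackend:
    """Hypothetical stand-in backend that tags each result with its label."""

    def __init__(self, label: str, names: Iterable[str]) -> None:
        self.label = label
        self.names = list(names)

    def has_tool(self, name: str) -> bool:
        return name in self.names

    def schemas(self) -> List[Dict[str, Any]]:
        return [{"name": n} for n in self.names]

    def call_tool(self, name: str, args: Dict[str, Any], ctx: Any) -> ToolResult:
        return ToolResult(content=f"{self.label}:{name}")


class SplitExecutor:
    """Hypothetical router: sandboxed names go to one backend, the rest to the host."""

    def __init__(self, host: FakeBackend, sandbox: FakeBackend, sandboxed: Iterable[str]) -> None:
        self._host = host
        self._sandbox = sandbox
        self._sandboxed = frozenset(sandboxed)

    def has_tool(self, name: str) -> bool:
        return self._host.has_tool(name) or self._sandbox.has_tool(name)

    def schemas(self) -> List[Dict[str, Any]]:
        # The caller sees one flat schema list regardless of where a tool runs.
        return self._host.schemas() + self._sandbox.schemas()

    def call_tool(self, name: str, args: Dict[str, Any], ctx: Any) -> ToolResult:
        backend = self._sandbox if name in self._sandboxed else self._host
        return backend.call_tool(name, args, ctx)


executor = SplitExecutor(
    host=FakeBackend("host", ["read", "web_search"]),
    sandbox=FakeBackend("sandbox", ["shell", "run_python"]),
    sandboxed={"shell", "run_python"},
)
```

Because the caller only ever sees `call_tool`, `schemas`, and `has_tool`, swapping the sandbox backend does not change anything on the loop side, which is the property the docstring is after.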
@@ -0,0 +1,46 @@
+"""HostExecutor: in-process tool calls that keep the original `Tool.execute` behavior.
+
+Uses:
+- local dogfooding / single tenant / default backend for Step 1
+- once the Step 3 docker backend lands, it serves the "trust domain = host" half (read/write/edit/glob/
+  grep/load_skill/web_*/seedream/seedance, §7.5 #6); DockerExecutor composes this class internally
+  + docker exec to handle shell/run_python.
+
+Behavioral compatibility: the error branches align semantically with the three original branches of
+`AgentLoop._execute_tool_call` (unknown tool / bad args / raised exception): each is wrapped into an
+`[Error] ...` content string and returned, with exit_code distinguishing them for internal use.
+"""
+from __future__ import annotations
+
+from typing import Any, Dict, List
+
+from .executor import ExecCtx, Executor, ToolResult
+from tools.base import Tool
+
+
+class HostExecutor(Executor):
+    def __init__(self, tools: Dict[str, Tool]) -> None:
+        self._tools = tools
+
+    def has_tool(self, name: str) -> bool:
+        return name in self._tools
+
+    def schemas(self) -> List[Dict[str, Any]]:
+        return [t.schema for t in self._tools.values()]
+
+    def call_tool(self, name: str, args: Dict[str, Any], ctx: ExecCtx) -> ToolResult:
+        tool = self._tools.get(name)
+        if tool is None:
+            return ToolResult(content=f"[Error] unknown tool: {name}", exit_code=2)
+        try:
+            result = tool.execute(**args)
+        except TypeError as e:
+            return ToolResult(content=f"[Error] bad arguments to {name}: {e}", exit_code=2)
+        except Exception as e:
+            return ToolResult(
+                content=f"[Error executing {name}] {type(e).__name__}: {e}",
+                exit_code=1,
+            )
+        if not isinstance(result, str):
+            result = str(result)
+        return ToolResult(content=result, exit_code=0)
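The commit message mentions a 4-branch smoke check (unknown / bad args / happy / schemas) without showing it. One way such a check might look is sketched below. To stay self-contained it uses a condensed copy of `HostExecutor` (no ABC, `ctx` defaulted to `None`) and a hypothetical `EchoTool`; neither the tool nor the `smoke` helper is part of the repository.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ToolResult:
    content: str
    exit_code: int = 0


class EchoTool:
    """Hypothetical tool used only for this smoke check."""
    schema = {"type": "function", "function": {"name": "echo"}}

    def execute(self, text: str) -> str:
        return text.upper()


class HostExecutor:
    """Condensed copy of the class in this commit (ctx made optional here)."""

    def __init__(self, tools: Dict[str, Any]) -> None:
        self._tools = tools

    def schemas(self) -> List[Dict[str, Any]]:
        return [t.schema for t in self._tools.values()]

    def call_tool(self, name: str, args: Dict[str, Any], ctx: Any = None) -> ToolResult:
        tool = self._tools.get(name)
        if tool is None:
            return ToolResult(f"[Error] unknown tool: {name}", 2)
        try:
            result = tool.execute(**args)
        except TypeError as e:
            return ToolResult(f"[Error] bad arguments to {name}: {e}", 2)
        except Exception as e:
            return ToolResult(f"[Error executing {name}] {type(e).__name__}: {e}", 1)
        return ToolResult(str(result), 0)


def smoke() -> Dict[str, Any]:
    """Exercise the four branches and collect what each one produced."""
    ex = HostExecutor({"echo": EchoTool()})
    return {
        "unknown": ex.call_tool("nope", {}).exit_code,
        "bad_args": ex.call_tool("echo", {"wrong": 1}).exit_code,
        "happy": ex.call_tool("echo", {"text": "hi"}).content,
        "schemas": len(ex.schemas()),
    }
```

Note that the unknown-tool and bad-arguments branches share exit_code 2, so a caller that needs to tell them apart would have to look at the content string.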
core/loop.py (43 lines changed)
@@ -11,13 +11,15 @@ from __future__ import annotations
 
 import json
 import time
-from typing import Any, Callable, Dict, List, Optional, Tuple
+from pathlib import Path
+from typing import Any, Callable, List, Optional, Tuple
 
 from uuid import UUID
 
 import litellm
 
 from .capabilities import ModelCapabilities
+from .executor import ExecCtx, Executor
 from .llm import LLM
 from .session import Session
 from .storage import record_chat_usage
@@ -60,19 +62,23 @@ class AgentLoop:
     def __init__(
         self,
         llm: LLM,
-        tools: Dict[str, Any],
+        executor: Executor,
         session: Session,
         capabilities: ModelCapabilities,
         user_id: UUID,
+        working_dir: Path,
         sink: Optional[Any] = None,
         max_iterations: Optional[int] = None,
         cancel_check: Optional[Callable[[], bool]] = None,
     ) -> None:
         self.llm = llm
-        self.tools = tools
+        self.executor = executor
         self.session = session
         self.caps = capabilities
         self.user_id = user_id  # usage_events are aggregated per user when written
+        # ExecCtx fields: user_id / task_id are already available; working_dir is passed separately, for the docker backend
+        # (Step 3) to build `--workdir /workspace/<wd_name>` and the temp-file namespace.
+        self.working_dir = working_dir
         self.max_iterations = max_iterations or capabilities.max_iterations
         self.sink = sink
         # Cooperative cancel: the web layer injects `lambda: broker.is_cancelled(task_id)`;
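The cancel hook on this class can be assigned after construction (the builder does `agent.cancel_check = cancel_check`), which is why the context is rebuilt for every tool call rather than cached once. The sketch below reduces that reasoning to a few lines; `MiniLoop` and `Ctx` are hypothetical stand-ins for `AgentLoop` and `ExecCtx`, keeping only the cancel wiring.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Ctx:
    """Reduced stand-in for ExecCtx: only the cancel hook is kept."""
    cancel_check: Optional[Callable[[], bool]] = None


class MiniLoop:
    """Hypothetical reduction of AgentLoop to the cancel_check wiring."""

    def __init__(self) -> None:
        self.cancel_check: Optional[Callable[[], bool]] = None
        # Built once at construction time: captures cancel_check=None.
        self.cached_ctx = Ctx(cancel_check=self.cancel_check)

    def fresh_ctx(self) -> Ctx:
        # Built per call: reads whatever cancel_check is set right now.
        return Ctx(cancel_check=self.cancel_check)


loop = MiniLoop()
loop.cancel_check = lambda: True  # assigned after build, as the builder does

stale = loop.cached_ctx.cancel_check   # the value captured at construction
fresh = loop.fresh_ctx().cancel_check  # the callable assigned afterwards
```

Since the context is a small dataclass, constructing it on each call costs little compared with the tool call itself.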
@@ -181,7 +187,7 @@ class AgentLoop:
         chunks: List[Any] = []
         stream = self.llm.chat_stream(
             messages=self.session.messages,
-            tools=[t.schema for t in self.tools.values()],
+            tools=self.executor.schemas(),
             reasoning_effort=self.caps.default_reasoning_effort or None,
         )
         cancelled = False
@@ -228,28 +234,13 @@ class AgentLoop:
             "args_preview": args_preview,
         })
 
-        tool = self.tools.get(name)
-        if tool is None:
-            err = f"[Error] unknown tool: {name}"
-            self._emit({"type": "tool_result", "name": name, "result": err,
-                        "preview": err, "truncated": False})
-            return err
-
-        try:
-            result = tool.execute(**args)
-        except TypeError as e:
-            err = f"[Error] bad arguments to {name}: {e}"
-            self._emit({"type": "tool_result", "name": name, "result": err,
-                        "preview": err, "truncated": False})
-            return err
-        except Exception as e:
-            err = f"[Error executing {name}] {type(e).__name__}: {e}"
-            self._emit({"type": "tool_result", "name": name, "result": err,
-                        "preview": err, "truncated": False})
-            return err
-
-        if not isinstance(result, str):
-            result = str(result)
+        ctx = ExecCtx(
+            user_id=self.user_id,
+            task_id=self.session.task_id,
+            working_dir=self.working_dir,
+            cancel_check=self.cancel_check,
+        )
+        result = self.executor.call_tool(name, args, ctx).content
 
         # Cap the size of the tool result returned to the model to avoid blowing up the context
         MAX_LEN = 16_000
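The hunk ends at `MAX_LEN = 16_000`, and the code that applies the cap is not shown on this page. As an illustration only, a cap of this kind could be applied as below. The helper name `cap_result`, the returned `truncated` flag (mirroring the `truncated` field seen in the emitted events), and the wording of the marker are all assumptions and may differ from what `core/loop.py` does.

```python
from typing import Tuple

MAX_LEN = 16_000  # same value as the constant in the hunk


def cap_result(result: str, max_len: int = MAX_LEN) -> Tuple[str, bool]:
    """Return (text, truncated); the marker wording is an assumption."""
    if len(result) <= max_len:
        return result, False
    omitted = len(result) - max_len
    # Keep the head of the output and say how much was dropped.
    return result[:max_len] + f"\n[truncated {omitted} chars]", True
```

Telling the model how many characters were dropped gives it a cue to re-run the tool with a narrower query instead of assuming the output was complete.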