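Stage C Step 3d: move fs tools into the container + rewrite DESIGN §7.5 #6 (physical boundary replaces code guardrails)

Ubuntu dogfooding exposed a hole in the host-side tools: base_dir=Path.cwd() had no user_root validation, so a model glob "*" listed host /home/lighthouse/zcbot/.git/.venv/..., i.e. zcbot's own source tree. DESIGN §7.5 #6 originally claimed "host tools go through paths.py::resolve_user_path validation", which was false (no such function exists in the code); absolute paths were not blocked at all.

Fix: the fs tools (read/write/edit/glob/grep) also go through docker exec, so a physical boundary replaces code guardrails (the Phase B path validator is not pursued: too fragile).

- core/sandbox/tool_runner.py (new): in-container helper; reads JSON args from stdin and calls the Tool subclasses in tools/fs.py; base_dir=cwd, user_root=/workspace
- DockerExecutor gains the FS_TOOLS trust domain + _exec_fs_tool: docker exec -i ... python /sandbox/tool_runner.py <name>, JSON args fed over stdin (CJK / quotes pass through, not split by shell metacharacters)
- _run_subprocess gains a stdin parameter + an is_fs_tool branch that returns stdout as-is (the original Tool return-string semantics are preserved); on exit≠0, stderr becomes the ToolResult content
- SandboxPool gains a repo_root field + a <repo>/skills:/sandbox/skills:ro mount so that in-container read can resolve SKILL references
- Dockerfile: COPY tools/ /sandbox/tools/ + tool_runner.py (build-time COPY rather than a mount: in-container code should not follow host repo changes)
- web/app.py passes ROOT through to init_pool
- Tools that stay on host: load_skill (in-memory SkillRegistry lookup) / web_search / web_fetch / seedream / seedance (they hold Bocha/ARK keys, which do not enter the container)
- DESIGN §7.5 #6 rewritten: "almost all tools go into the container; host keeps only key-holding + cross-user tools", with the origin of the false claim annotated as corrected on 2026-05-26

Cost: each fs tool call adds ~200ms of docker exec overhead; at conversation-level N≤15 that totals 1-3s, which is noise next to 5-30s of LLM inference. The upgrade trigger (§7.9 upgrade table) from docker exec → unix socket RPC keeps its original signals (overhead/total > 30% sustained / long-running service workflows).

Tests: test_executor_docker adds 4 fs-path tests (argv shape / CJK stdin JSON / exit≠0 stderr passthrough / timeout); the old read passthrough test becomes a load_skill passthrough test (read now runs in the container). unittest discover 35/35 PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>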
parent d93cc1a949
commit 23ff996d38
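@@ -402,10 +402,11 @@ create index on usage_events (model_profile, created_at);
 ```
 Container creation parameters come from config: `ZCBOT_SANDBOX_RUNTIME=runc|runsc|...` (default `runc`); per-user containers are started with `docker run --runtime=<runtime>`. **Rationale**: a future switch to gVisor / Firecracker / Kata / e2b then needs zero application-layer changes (swap the backend driver, change config, restart containers), and the interface shape does not leak Docker assumptions (`docker exec` / `docker cp` / `docker stats`) that would force a later rewrite.
 
-6. **Tools split in two by trust domain, dispatched inside the Executor**:
-   - **Host in-process backend**: `read` / `write` / `edit` / `glob` / `grep` / `load_skill` / `web_search` / `web_fetch`: these tools already hold credentials on host (Bocha API key) or go through `paths.py::resolve_user_path` validation (the user-rooted security boundary already exists and the `/v1/files` API reuses it), so moving them into the container gains no security and costs ~200ms exec overhead × N calls.
-   - **Container exec backend**: `shell` / `run_python`: these execute arbitrary model-generated code and must be container-isolated.
-   - The dispatcher splits internally; the caller (`AgentLoop`) is unaware. **The interface shape is kept ready for "everything moves into the container + an internal tool-runner later"**: only the host backend implementation changes to unix socket RPC, the interface stays.
+6. **Tools split in two by trust domain, dispatched inside the Executor** (corrected 2026-05-26: the original "host tools go through paths.py::resolve_user_path validation" was a false claim, since no such function exists in the code; the first switch to the docker backend during Ubuntu dogfooding showed the glob tool still listing the host repo's `.git/.venv/...`, so a physical boundary now replaces code guardrails):
+   - **Container exec backend**: `shell` / `run_python` / `read` / `write` / `edit` / `glob` / `grep`: all go through docker exec. shell/run_python run arbitrary code and must be isolated; the fs tools (read and the others, 5 in total) used to run on host with `base_dir = Path.cwd()` and no user_root validation, so they could read the whole host fs (`/etc/passwd` / zcbot source / `~/.ssh/`); once inside the container, `user_root=/workspace` is a physical boundary. fs tool call shape: `docker exec --user zcbot --workdir /workspace/<wd> -i <c> python /sandbox/tool_runner.py <name>` + JSON args fed over stdin (CJK / quotes / path separators pass through unchanged, not split by shell metacharacters). `tool_runner.py` lives in the image under `/sandbox/` and reuses the Tool subclasses from `tools/fs.py` (`COPY tools/ /sandbox/tools/`); skill references are exposed (read-only) through an additional `<repo>/skills:/sandbox/skills:ro` mount.
+   - **Host in-process backend**: `load_skill` / `web_search` / `web_fetch` / `seedream` / `seedance`: they hold Bocha/ARK API keys, which must not be put into the container env (larger key-leak surface under SaaS); `load_skill` is an in-memory SkillRegistry lookup with no possibility of out-of-bounds fs access. How to containerize these tools is revisited after the Step 4 egress proxy (having media tools call remote APIs through the proxy is more direct than putting keys into the container).
+   - The dispatcher (`DockerExecutor`) splits internally; the caller (`AgentLoop`) is unaware. **The interface shape is kept ready for "everything in the container + internal tool-runner over unix socket RPC later"**; see the table below for the upgrade trigger signals.
+   - **Cost**: each fs tool call adds ~200ms of docker exec overhead; at conversation-level N≤15 → 1-3s total, which is noise under 5-30s of LLM inference time. The image build gains a `COPY tools/` step, adding ~5s to incremental rebuilds.
 
 **Upgrade trigger signals (written down so they are not forgotten; the inverse also holds: no signal, no upgrade)**:
 
 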
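Why the old `base_dir / path` resolution gave no confinement can be seen with plain `pathlib`: joining onto an absolute path discards the base, and `..` segments escape it once normalized. This is a minimal sketch; `naive_resolve` is a hypothetical stand-in for the unvalidated resolver described above (not the project's code), and the base directory is just the example path from the dogfood report.

```python
import posixpath
from pathlib import PurePosixPath


def naive_resolve(base_dir: str, path: str) -> str:
    """Hypothetical stand-in for a `base_dir / path` resolver with no user_root check."""
    joined = PurePosixPath(base_dir) / path
    # normpath collapses ".." segments lexically (symlinks are ignored here)
    return posixpath.normpath(str(joined))


base = "/home/lighthouse/zcbot"
print(naive_resolve(base, "notes.md"))             # stays under base
print(naive_resolve(base, "/etc/passwd"))          # absolute path replaces base entirely
print(naive_resolve(base, "../../../etc/passwd"))  # ".." walks out of base
```

A validator would have to catch both cases (plus symlinks) on every tool, which is the fragility the rewrite avoids by relying on the bind mount instead.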
@@ -2,7 +2,7 @@
 
 > Companion to `DESIGN.md`. This file records only phase status, decision deviations, file counts, and next steps. Each entry is 1-2 sentences: what was done + the key judgment; for details see `git log` / `git diff` / `DESIGN §7.9`.
 
-Last updated: 2026-05-26 (Stage C Step 3 hotfix: exec_user switched to a username that automatically follows the image build_arg + Dockerfile adds chromium + nodejs + @mermaid-js/mermaid-cli so the proposal/patent skills can render diagrams)
+Last updated: 2026-05-26 (Stage C Step 3d: fs tools (read/write/edit/glob/grep) move into the container + DESIGN §7.5 #6 rewritten, physical boundary replaces code guardrails)
 
 ---
 
 
@ -23,6 +23,7 @@
|
|||
|
||||
### 2026-05-26
|
||||
|
||||
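- **Stage C Step 3d: fs tools (read/write/edit/glob/grep) move into the container + DESIGN §7.5 #6 rewritten**: after the first switch to the docker backend during Ubuntu dogfooding, the host tools turned out to leak through `Path.cwd()`: the model used glob `*` and listed host `/home/lighthouse/zcbot/.git/.venv/config/core/...`, i.e. zcbot's own source. Checking back, DESIGN §7.5 #6 said "host tools go through `paths.py::resolve_user_path` validation", but a grep of the code shows **that function does not exist at all**, so the claim was false; `Tool._resolve` is actually `base_dir / path` with base_dir=`Path.cwd()` (= the web launch directory = the zcbot repo root), absolute paths are not blocked at all, and the model can read `/etc/passwd` or write zcbot's own source. **Fix options compared**: Phase A (change cwd → working_dir, a 1-line hack) fixes UX but not security; Phase B (force user_root validation on host tools + a skills/ allowlist, ~80 lines) is secure but fragile (symlinks / `..` / Windows paths each need their own case, and one miss breaks it); **option 3 (fs tools into the container) replaces code guardrails with a physical boundary, and is the one chosen**. `core/sandbox/tool_runner.py` adds an in-container helper (~80 lines; reads JSON args from stdin and calls the `tools/fs.py` Tool subclasses; base_dir=cwd is passed in via docker exec --workdir, user_root=/workspace); `DockerExecutor` adds the `FS_TOOLS = {read,write,edit,glob,grep}` trust domain + an `_exec_fs_tool` method that runs `docker exec -i ... python /sandbox/tool_runner.py <name>` with JSON args fed over stdin (CJK paths pass through, not split by shell metacharacters); `_run_subprocess` gains a stdin parameter + an is_fs_tool path that returns stdout as-is (no [stdout]/[exit] wrapping, so what the model sees is unchanged), and on exit≠0 uses stderr as the ToolResult content. `SandboxPool` gains a `repo_root` field and `_docker_run` adds a `<repo>/skills:/sandbox/skills:ro` mount (so in-container read can resolve `references/foo.md` referenced inside SKILL.md); the `web/app.py` lifespan passes `ROOT` through; the `Dockerfile` does `COPY tools/ /sandbox/tools/ + tool_runner.py` so the image carries its own copy of the tools source (build-time COPY rather than a mount: in-container code should not follow host repo edits or restarts). **Tools that stay on host**: `load_skill` (in-memory SkillRegistry lookup, no out-of-bounds fs access) / `web_search` / `web_fetch` / `seedream` / `seedance` (they hold Bocha/ARK API keys, and keys do not enter the container env; revisited after the Step 4 egress proxy). **Tests**: `tests/test_executor_docker.py` switches to `test_load_skill_passthrough_to_host` (the old `test_read_passthrough_to_host` no longer holds because read moved into the container) + adds 4 fs-path tests (read argv shape / CJK path passed through as stdin JSON / grep exit≠0 stderr passthrough / glob timeout kills the docker CLI), `unittest discover 35/35 PASS`. **DESIGN §7.5 #6 rewritten**: from "tools split in two (host fs + container code)" to "almost all tools go into the container; host keeps only key-holding + cross-user tools" + an annotated 2026-05-26 correction record (tracing the original false claim). **Cost**: each fs tool call adds ~200ms of docker exec overhead, totaling 1-3s at conversation-level N≤15, which is noise next to 5-30s of LLM inference; the image build's COPY tools/ adds ~5s. **Upgrade trigger** (§7.9 upgrade table): if the metric `docker_exec_overhead / total_tool_time > 30%` holds for two weeks, or the model starts a "long-running service inside the container" workflow, enable the in-container tool-runner over unix socket RPC (which removes the per-exec cost). Rejected: (a) the Phase B path validator: this mirrors §7.9's "aesthetic uniformity ≠ a reason to upgrade", **security is only solid when it is "physical, not code"**; (b) `COPY core/ tools/ ...` to put all of zcbot core into the image: tool_runner only needs `tools/fs.py` + base.py, and extra code in the container enlarges the attack surface; (c) passing JSON args to tool_runner.py via argv: CJK / quotes / path separators all risk shell metacharacter splitting, so stdin is the stable choice; (d) keeping fs tools on the host backend with user_root validation as a "double safeguard": two sources = a source of drift; the docker backend is the whole basis of §7.5's argument, and since the host backend is not used in external-user scenarios, the docker backend alone is enough.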
|
||||
- **Stage C Step 3 hotfix:exec_user 改 username 跟随 build_arg + Dockerfile 加 Node/Chromium/mermaid-cli**:Ubuntu 上 dogfood 暴露两个真问题。① **uid 错配**:DockerExecutor 写死 `--user 1000:1000`,但镜像 `docker build --build-arg HOST_UID=$(id -u)` 跟随 host 实际 uid(腾讯云轻量 lighthouse 用户 uid=1001),docker exec 进容器 uid=1000 → bind mount `/workspace/<wd>/` owner 1001 → 写文件全 EACCES,文件落 `/tmp/`。改 `DEFAULT_EXEC_USER = "zcbot"`(username,docker 自动查容器 /etc/passwd 拿 uid),无论 HOST_UID build 成 1000/1001/其他都跟 bind mount owner 对齐,且未来切其他部署机不用改 env。② **proposal/patent skill 渲 mermaid 缺 Node**:`skills/proposal/scripts/render_diagrams.py` `render_via_mmdc` 调 `shutil.which("mmdc")`,容器没装 → 退到 mermaid.ink 公网 API → 但 sandbox 容器 `--internal` 默 deny outbound,API 也走不通 → ASCII fallback 出 docx 没图不能用。Dockerfile 加 `chromium nodejs npm` apt 装(Debian bookworm 自带 node 18.x 够新)+ `npm install -g @mermaid-js/mermaid-cli@latest`,镜像 +~400MB(接受)。容器内 chromium 缺 setuid sandbox + `/dev/shm` 不够大会跪,镜像落 `/sandbox/puppeteer-config.json`(`--no-sandbox` / `--disable-setuid-sandbox` / `--disable-dev-shm-usage` + executablePath=/usr/bin/chromium)+ ENV `MERMAID_PUPPETEER_CONFIG=/sandbox/puppeteer-config.json`,`render_via_mmdc` 改读 env 拼 `-p <config>` 注入 mmdc;host 上跑 env 没设行为零变化。`PUPPETEER_SKIP_DOWNLOAD=true` + `PUPPETEER_EXECUTABLE_PATH` 让 puppeteer 用容器 chromium 不再下载它自带的 Chrome(省 ~300MB build)。npm 源加 `--build-arg NPM_REGISTRY=https://mirrors.cloud.tencent.com/npm/`(腾讯云内网)防境内 build 慢。`DESIGN.md` 不动(纯实施层 bug fix + skill 依赖);`RUN.md` 加 NPM_REGISTRY 段 + 故障兜底 3 行(EACCES uid 错配 / mmdc 报 launch chromium / npm 慢)。否决:(a) 让 DockerExecutor 启动时探测 `os.getuid()` 自动取 host uid 作 `--user` —— 写死 username 让 docker 查 passwd 比应用层探测更直接,且 部署机 uid 偶尔变(从 1000 重装成 1001)不用改任何东西;(b) 容器走 NodeSource repo 装 Node 20 LTS —— Debian bookworm 自带 18.x 已满足 mermaid-cli 要求(>=14.x),多一步外网拖速度;(c) 不装 chromium 等 Step 4 egress proxy 后用 mermaid.ink —— proposal 是早期就要交付的能力,等 Step 4(还没动手)不现实;(d) puppeteer config 注入靠改 mmdc 启动脚本 —— mmdc 默支持 `-p`,改 render_diagrams.py 读 env 就够,不动 mmdc 内部。
|
||||
- **Stage C Step 5:`main.py sandbox check` 部署前置对账 + lifespan fs quota WARN**:外部用户开放是 §7.5 #4 magnetic 要求(xfs prjquota / ext4 project quota / zfs dataset quota,否则"扫描间隙打满共享 fs 拖死同节点"),且 docker backend 启动前置(daemon/镜像/HOST_UID 对齐)出错时 lifespan 直接 fail-fast、traceback 排查贵 —— 把"运维心智清单"沉淀成可执行命令。`main.py sandbox check` 跑 5 项独立探测:① docker daemon 可达(CLI 存在 + `docker version` rc=0)② `zcbot-sandbox:latest` 镜像存在 ③ `zcbot-sandbox-net` network 存在(缺也 OK,lifespan 自动 ensure,这一项 warn 不 err)④ 镜像内 zcbot uid 与 host uid 对齐(`docker run --rm --entrypoint id` 拿镜像 uid 比对 `os.getuid()`;Windows 自动 skip)⑤ workspace/users/ 所在 fs 类型可 quota(`findmnt --target ... -no FSTYPE,OPTIONS` 解析,识别 xfs+prjquota / ext4+project quota / zfs / btrfs / tmpfs / 其他)。`detect_fs_quota(path) -> (level, msg)` 抽出来给 lifespan 复用:`web/app.py` docker backend 启动时同样跑一次,WARN 打 stdout(不阻塞),应用层周期扫描仍生效。**err vs warn 分界**:err = docker backend 启动会 fail-fast 的根因(daemon/镜像/HOST_UID,exit 1);warn = 不阻塞启动但外部用户开放前要清(network 缺 / fs 不可 quota,exit 0)。`tests/test_sandbox_check.py` 19 测试覆盖各分支 + 汇总 exit code,mock subprocess 与 sys.platform(`run_sandbox_check` 改用 module-level lookup 而非固化 `CHECKS` 元组,让 unittest patch 生效);**全套 unittest discover 31/31 PASS**。RUN.md 加"部署前置对账"小节(`sandbox check` 5 项含义)+ "配额硬化"段重写(fs 类型 → 处理动作映射表 + xfs 升级 4 步)+ 故障兜底 3 行(sandbox init failed / fs quota warn / image not found)。否决:(a) lifespan 探测失败 → fail-fast 而非 WARN —— Step 5 阶段应用层周期扫描已有,OS 层 quota 是外部开放硬要求不是 dogfood 硬要求,fail-fast 会阻碍 dogfood 启动;(b) sandbox check 自带 `quota-set` 子命令直接调 `xfs_quota` —— `<pid>` 整数 ↔ user_uuid 映射要建表跟踪,且 sudo + /etc/projects 改动属于运维操作,Step 5 阶段只落 RUN.md 说明 + 命令清单,真要做时在外部开放前一步;(c) 在 sandbox check 里探测 egress proxy 状态 —— Step 4 未实施,占位会让人误以为已落地。`DESIGN.md` 不动(纯按 §7.5 #4 既有协议实施);`RUN.md` 更新如上。
|
||||
- **Stage C Step 3:DockerExecutor 集成 AgentLoop + web lifespan(`ZCBOT_SANDBOX_BACKEND=host|docker` env 切 backend)**:`core/executor_docker.py` `DockerExecutor` 组合 `HostExecutor` + `SandboxPool`,`call_tool` 按 §7.5 #6 信任域 dispatch:`shell` / `run_python` → `pool.ensure(user_id)` 拿容器名 + `docker exec --user 1000:1000 --workdir /workspace/<wd_name> -e PYTHONIOENCODING=utf-8 setsid bash -c <cmd>` / `python <script>`(`setsid` 走包一层进程组,§7.5 #3 PGID kill 协议留 Step 3b 启用);其他工具(read/write/edit/glob/grep/load_skill/web_*/seedream/seedance)直通 host。**run_python tmp .py 落 host 侧 `<user_root>/.zcbot_tmp/<task_id>/<rand>.py`**,容器内对应 `/workspace/.zcbot_tmp/<task_id>/<rand>.py`(bind mount 自动可见);dotfile 起头让 `/v1/files` API 天然过滤(`web/app.py:169` `startswith(".")` 已挡)。**Cancel limitation 接受**:Popen.kill() 杀 docker CLI 客户端,容器内 server 端进程不会因此终止(docker exec 设计如此);第一版靠 idle 5min reaper / 下次 `ensure` 时 `rm -f` 兜底,升级触发为"用户报取消但还在烧 CPU"。`core/sandbox/__init__.py` 暴露 module-level singleton `init_pool` / `get_pool`,`agent_builder._resolve_executor` 按 env 切 backend、docker 路径 pool 未初始化 → fail-fast(不静默退到 host 防止"以为有沙盒实则在裸跑"误判);`web/app.py` lifespan 启动钩子:`init_pool(workspace/users)` + `shutdown_all` 清前驱孤儿 + `asyncio.create_task(_reaper)`(每 60s `run_in_executor(pool.reap_idle)`),关闭钩子 cancel reaper + `shutdown_all`。**pool.py 顺手清债**:`asyncio.Lock` → `threading.Lock`(主使用方是 web BG 线程同步 tool call,asyncio.Lock 会被每次 `asyncio.run` 起的 ephemeral loop 绕过保护;reaper 改 async wrapper `loop.run_in_executor(pool.reap_idle)`,pool API 全 sync 更直)。**测试**:`tests/test_executor_docker.py` 11 测试覆盖 host 直通 / shell argv 形态 / run_python tmp 文件清理 / timeout / cancel / 未知工具 / caps.enable_run_python=False;`unittest discover -s tests` **12/12 PASS**(原 1 测试不变,新 11 测试加上)。**Windows dogfood 零变化**:默 `ZCBOT_SANDBOX_BACKEND=host`,本地不动 docker;切 docker 路径只在 Ubuntu 部署机有效,真起容器 smoke 仍按 RUN.md "Sandbox(Stage C,Ubuntu)" 段 5 条命令在部署机跑。`DESIGN.md` **不动**(纯按 §7.5 #5 #6 既有协议实施);`RUN.md` 加 `ZCBOT_SANDBOX_BACKEND` env 说明 + 切 docker backend 时的启动前置条件。否决:(a) 
DockerExecutor 用 `asyncio.run(pool.ensure)` 包 ephemeral loop —— 跨 loop 不共享 asyncio.Lock,失串行化保护,且每次 tool call 多 ~5ms loop 创建销毁噪声;改 pool 同步成本更低;(b) `run_python` tmp .py 放工作目录内 —— 污染用户视野,SKILL 教模型"列工作目录用 glob"时 tmp 文件干扰,crash 残留与产物混(详 §7.9 取舍记录会在下次有同款问题时考虑沉淀);(c) host 侧独立 bind mount `<workspace>/.sandbox_tmp/<uid>/` 挂成容器 `/tmp_scripts` —— 多挂一个 mount 复杂度上升,单 bind mount 协议保持更直;(d) docker backend 失败时退化到 host —— 沙盒缺失=安全模型崩,fail-fast 比"看起来在跑"重要,§7.5 硬协议"任一缺失视为部署未完成"。
|
||||
|
|
|
|||
|
|
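The upgrade trigger in the Step 3d entry (`docker_exec_overhead / total_tool_time > 30%` sustained for two weeks) can be checked mechanically. This is a rough sketch that assumes per-day totals are already being collected somewhere; `DailyToolStats` and `should_upgrade_to_rpc` are hypothetical names, and "sustained" is interpreted here as every one of the last 14 days exceeding the threshold, which the entry does not spell out.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DailyToolStats:
    """Hypothetical per-day aggregate; both fields are in seconds."""
    docker_exec_overhead: float
    total_tool_time: float


def should_upgrade_to_rpc(
    days: Sequence[DailyToolStats],
    threshold: float = 0.30,
    window: int = 14,
) -> bool:
    """True when the overhead ratio exceeded `threshold` on each of the last `window` days."""
    if len(days) < window:
        return False  # not enough history: no signal, no upgrade
    recent = days[-window:]
    return all(
        d.total_tool_time > 0
        and d.docker_exec_overhead / d.total_tool_time > threshold
        for d in recent
    )


# 15 fs calls at ~0.2 s overhead each within 20 s of tool time: ratio about 0.15
quiet = [DailyToolStats(15 * 0.2, 20.0)] * 14
# 8 s of overhead within 20 s of tool time: ratio 0.40
noisy = [DailyToolStats(8.0, 20.0)] * 14
print(should_upgrade_to_rpc(quiet), should_upgrade_to_rpc(noisy))
```

Requiring the full window keeps a single slow day from triggering the more complex RPC design.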
@@ -1,14 +1,21 @@
-"""DockerExecutor: `shell` / `run_python` go through docker exec, everything else runs in-process (§7.5 #6).
+"""DockerExecutor: fs / shell / run_python all go through docker exec; key-holding tools stay on host (§7.5 #6).
 
-Backend split (§7.5 #6 trust domains):
-- host in-process: `read/write/edit/glob/grep/load_skill/web_*/seedream/seedance`
-  already hold credentials on host (Bocha key / ARK key) or go through `paths.py::resolve_user_path` validation
-  (the user-rooted security boundary already exists); moving them into the container gains nothing and costs ~200ms exec overhead × N calls
-- container exec: `shell` / `run_python`, which execute arbitrary model-generated code and must be container-isolated
+Backend split (§7.5 #6 trust domains; corrected 2026-05-26: the `paths.py::resolve_user_path` validation
+was a false claim in DESIGN. In reality host tools used base_dir = Path.cwd() with no validation, so the model could read
+the entire host fs. A physical boundary now replaces code guardrails):
+- **container exec**: `shell` / `run_python` / `read` / `write` / `edit` / `glob` /
+  `grep`, all through docker exec; user_root=/workspace inside the container is the physical boundary
+- **host in-process**: `load_skill` / `web_*` / `seedream` / `seedance`, which hold
+  Bocha/ARK API keys that must not enter the container env (key-leak surface under SaaS); load_skill is an in-memory
+  SKILL registry lookup with no out-of-bounds fs access
 
 Container admission (per call):
 1. `pool.ensure(user_id)`: get / start the `zcbot-sandbox-<uid>` container (already serialized by the per-user lock)
-2. `docker exec --user 1000:1000 --workdir /workspace/<wd_name> <c> setsid bash -c '<cmd>'`
+2. Commands come in two kinds:
+   - shell/run_python: `docker exec --user zcbot --workdir /workspace/<wd> -e ... setsid bash -c '<cmd>'`
+   - read/write/edit/glob/grep: `docker exec --user zcbot --workdir /workspace/<wd>
+     <c> python /sandbox/tool_runner.py <tool_name>`, with JSON args over stdin
+     (not split by shell metacharacters; CJK paths pass through unchanged)
 3. on timeout → kill the docker CLI client (Popen.kill())
 4. on completion → `pool.mark_active(user_id)` refreshes the idle timer
 
@@ -33,12 +40,20 @@ from pathlib import Path
 from typing import Any, Dict, List, Optional
 from uuid import UUID
 
+import json
+
 from .executor import ExecCtx, Executor, ToolResult
 from .executor_host import HostExecutor
 from .sandbox import SandboxPool
 
 
-CONTAINER_TOOLS = frozenset({"shell", "run_python"})
+# Trust-domain classification (§7.5 #6, corrected 2026-05-26):
+# - SHELL_LIKE: runs arbitrary code; Popen is fed the cmd / script directly, wrapped in setsid
+# - FS_TOOLS: fs operations; docker exec → /sandbox/tool_runner.py + JSON args over stdin
+# Both go through docker exec but with different call shapes (setsid bash vs python tool_runner)
+SHELL_LIKE_TOOLS = frozenset({"shell", "run_python"})
+FS_TOOLS = frozenset({"read", "write", "edit", "glob", "grep"})
+CONTAINER_TOOLS = SHELL_LIKE_TOOLS | FS_TOOLS
 
 # Non-root user inside the container: a username lets docker resolve the uid from the container's /etc/passwd.
 # The Dockerfile's `useradd -u ${HOST_UID} zcbot` already aligns with the host uid, so "zcbot" is hard-coded here
@@ -91,13 +106,15 @@ class DockerExecutor(Executor):
         if name not in CONTAINER_TOOLS:
             return self.host.call_tool(name, args, ctx)
         if not self.host.has_tool(name):
-            # e.g. with caps.enable_run_python=False the host has no run_python installed → the schema does not expose it either
+            # e.g. with caps.enable_run_python=False the host has no such tool installed → the schema does not expose it either
             return ToolResult(content=f"[Error] unknown tool: {name}", exit_code=2)
         try:
             if name == "shell":
                 return self._exec_shell(args, ctx)
             if name == "run_python":
                 return self._exec_python(args, ctx)
+            if name in FS_TOOLS:
+                return self._exec_fs_tool(name, args, ctx)
         except Exception as e:
             return ToolResult(
                 content=f"[Error executing {name} via docker] {type(e).__name__}: {e}",
@@ -160,16 +177,50 @@ class DockerExecutor(Executor):
             except OSError:
                 pass
 
+    # ── fs tools (read/write/edit/glob/grep) ──────────────────
+
+    def _exec_fs_tool(
+        self, name: str, args: Dict[str, Any], ctx: ExecCtx
+    ) -> ToolResult:
+        """fs tools run via `python /sandbox/tool_runner.py <name>` with JSON args fed over stdin.
+
+        fs tools use a different timeout default from shell/run_python; cancel handling is unchanged:
+        - timeout is short (30s): fs operations should not run long, and a hang points to a stuck mount / a huge directory scan
+        - cancel is still polled (the model may grep all of user_root and the user then hits stop, so the response must be immediate)
+        """
+        timeout = int(args.get("timeout") or 30) if name == "grep" else 30
+
+        container = self.pool.ensure(self.user_id)
+        argv = self._docker_exec_argv(
+            container,
+            extra_env={"PYTHONIOENCODING": "utf-8"},
+            stdin_open=True,
+        ) + ["python", "/sandbox/tool_runner.py", name]
+
+        # tool_runner.py takes args (JSON) from stdin, so paths with CJK / quotes pass through unchanged
+        stdin_payload = json.dumps(args, ensure_ascii=False)
+        result = self._run_subprocess(
+            argv, timeout=timeout, ctx=ctx, stdin=stdin_payload
+        )
+        self.pool.mark_active(self.user_id)
+        return result
+
     # ── helpers ──────────────────────────────────────────────
 
     def _docker_exec_argv(
-        self, container: str, extra_env: Optional[Dict[str, str]] = None
+        self,
+        container: str,
+        extra_env: Optional[Dict[str, str]] = None,
+        stdin_open: bool = False,
     ) -> List[str]:
+        """With `stdin_open=True`, add `-i` so stdin reaches the container (used by the fs tool_runner)."""
         argv = [
             "docker", "exec",
             "--user", self.exec_user,
             "--workdir", self.container_workdir,
         ]
+        if stdin_open:
+            argv.append("-i")
         env: Dict[str, str] = {}
         if extra_env:
             env.update(extra_env)
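The call shape used by `_exec_fs_tool` can be reproduced outside the executor. This is a minimal sketch that assumes the container name and workdir are already known; `build_fs_tool_call` and the container name `zcbot-sandbox-demo` are hypothetical, and only `docker exec` options that appear in the diff (`--user`, `--workdir`, `-i`, `-e`) are used. Nothing is executed.

```python
import json
from typing import Any, Dict, List, Tuple


def build_fs_tool_call(
    container: str, workdir: str, name: str, args: Dict[str, Any]
) -> Tuple[List[str], str]:
    """Return (argv, stdin_payload) for one fs tool call; nothing is executed here."""
    argv = [
        "docker", "exec",
        "--user", "zcbot",
        "--workdir", workdir,
        "-i",  # keep stdin open so the payload reaches the runner
        "-e", "PYTHONIOENCODING=utf-8",
        container,
        "python", "/sandbox/tool_runner.py", name,
    ]
    # ensure_ascii=False keeps CJK readable in the payload; the tool arguments
    # travel as data on stdin and are never parsed as command-line words.
    payload = json.dumps(args, ensure_ascii=False)
    return argv, payload


argv, payload = build_fs_tool_call(
    "zcbot-sandbox-demo",
    "/workspace/demo",
    "write",
    {"path": "测试目录/中文文件.md", "content": 'say "hi"'},
)
print(argv[-3:])
print(payload)
```

Only the tool name travels on the command line; everything the model controls stays inside the JSON document.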
@@ -179,17 +230,30 @@ class DockerExecutor(Executor):
         return argv
 
     def _run_subprocess(
-        self, argv: List[str], timeout: int, ctx: ExecCtx
+        self,
+        argv: List[str],
+        timeout: int,
+        ctx: ExecCtx,
+        stdin: Optional[str] = None,
     ) -> ToolResult:
         """Run the docker exec subprocess with cooperative cancel polling.
 
+        When `stdin` is given it is fed to the in-container process through a PIPE (the fs tool_runner uses it for JSON args).
         On cancel / timeout → Popen.kill() kills the docker CLI client;
         the server-side process inside the container is an accepted limitation (see the module header comment).
+
+        Special handling of the fs tool_runner result shape:
+        - stdout is the direct Tool.execute result (plain text, no [stdout] wrapper)
+        - when exit_code != 0, stderr contains [Error executing ...] and is passed through to the LLM
         """
+        # Only shell/run_python get stdout/stderr wrapping; the fs tool_runner output is already
+        # the final string the LLM receives, so it is not wrapped in [stdout]/[exit N]
+        is_fs_tool = stdin is not None
         cancel_check = ctx.cancel_check
         try:
             proc = subprocess.Popen(
                 argv,
+                stdin=subprocess.PIPE if stdin is not None else None,
                 stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE,
                 text=True,
@@ -206,7 +270,7 @@ class DockerExecutor(Executor):
         stderr: str = ""
         while True:
             try:
-                stdout, stderr = proc.communicate(timeout=0.5)
+                stdout, stderr = proc.communicate(input=stdin, timeout=0.5)
                 break
             except subprocess.TimeoutExpired:
                 if cancel_check is not None and cancel_check():
@@ -231,6 +295,15 @@ class DockerExecutor(Executor):
                 exit_code=130,
             )
 
+        # fs tool_runner: return stdout directly; on exit != 0 pass stderr through as the [Error ...] text
+        if is_fs_tool:
+            if proc.returncode == 0:
+                return ToolResult(content=stdout, exit_code=0)
+            # tool_runner.py writes [Error] ... to stderr; exit 1 = exception / 2 = bad arguments or unknown tool
+            err_msg = stderr.strip() or f"tool_runner exit {proc.returncode}"
+            return ToolResult(content=err_msg, exit_code=proc.returncode)
+
+        # shell/run_python: the original [stdout]/[stderr]/[exit] wrapping
         parts: List[str] = []
         if stdout:
             parts.append(f"[stdout]\n{stdout.rstrip()}")
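The cancel-aware loop in `_run_subprocess` follows a common pattern: call `communicate()` with a short timeout, and on each `TimeoutExpired` check for cancellation and the overall deadline. Below is a self-contained sketch with the local Python interpreter standing in for `docker exec`; `run_with_cancel` is a hypothetical name, the tuple return is a simplification of `ToolResult`, and exit codes 124 / 130 mirror the ones that appear in the diff and its tests.

```python
import subprocess
import sys
import time
from typing import Callable, List, Optional, Tuple


def run_with_cancel(
    argv: List[str],
    timeout: float,
    stdin: Optional[str] = None,
    cancel_check: Optional[Callable[[], bool]] = None,
) -> Tuple[int, str, str]:
    """Run argv, waking every 0.5 s to check for cancellation or the deadline."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE if stdin is not None else None,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Retrying communicate() after TimeoutExpired does not lose output.
            out, err = proc.communicate(input=stdin, timeout=0.5)
            return proc.returncode, out, err
        except subprocess.TimeoutExpired:
            cancelled = cancel_check is not None and cancel_check()
            if cancelled or time.monotonic() >= deadline:
                proc.kill()         # kills only the local client process
                proc.communicate()  # reap it and drain the pipes
                return (130 if cancelled else 124), "", ""


echo = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
print(run_with_cancel(echo, timeout=10, stdin='{"path": "foo.txt"}'))
```

As in the diff, killing the client does not stop whatever `docker exec` started inside the container; the sketch only models the host-side half.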
@@ -35,14 +35,17 @@ __all__ = [
 _pool: Optional[SandboxPool] = None
 
 
-def init_pool(user_root_base: Path) -> SandboxPool:
+def init_pool(
+    user_root_base: Path, repo_root: Optional[Path] = None
+) -> SandboxPool:
     """Idempotently initialize the module-level pool. Returns the pool instance.
 
     Called once by lifespan; ensure_network is idempotent internally too. Repeated calls return the same instance (no rebuild).
+    `repo_root` feeds the ro mount for SKILL references now that fs tools run in the container (see pool.py).
     """
     global _pool
     if _pool is None:
-        _pool = setup_pool(user_root_base)
+        _pool = setup_pool(user_root_base, repo_root=repo_root)
     return _pool
 
 
@@ -76,6 +76,7 @@ class SandboxPool:
     def __init__(
         self,
         user_root_base: Path,
+        repo_root: Optional[Path] = None,
         image: Optional[str] = None,
         runtime: Optional[str] = None,
         idle_ttl: Optional[int] = None,
@@ -84,6 +85,11 @@ class SandboxPool:
         """
         user_root_base: parent directory of the per-user subtrees, typically `<workspace>/users`. bind mount source
             = `user_root_base / <user_id>`, target `/workspace`.
+        repo_root: zcbot repo root (`core/paths.py::ROOT`). **Now that the fs tools run in the container**
+            (read/write/edit/glob/grep), the `/sandbox/skills:ro` mount lets
+            in-container read resolve paths to a SKILL's internal references (skills
+            are repo code on the host while the container user_root holds user files; the two
+            are orthogonal). None → skills are not mounted and only the user_root boundary applies.
         image: sandbox image tag (default: env `ZCBOT_SANDBOX_IMAGE`)
         runtime: `docker run --runtime` value (runc / runsc / kata etc.); empty = default
             (env `ZCBOT_SANDBOX_RUNTIME`). §7.5 #5 / §7.9 upgrade table: switching
@@ -94,6 +100,7 @@ class SandboxPool:
             (env `ZCBOT_PG_IPS`). defense-in-depth, even when it falls inside the three private network ranges.
         """
         self.user_root_base = user_root_base
+        self.repo_root = repo_root
         self.image = image or os.getenv("ZCBOT_SANDBOX_IMAGE", DEFAULT_IMAGE)
         self.runtime = runtime or os.getenv("ZCBOT_SANDBOX_RUNTIME") or ""
         self.idle_ttl = idle_ttl if idle_ttl is not None else int(
@@ -151,6 +158,13 @@ class SandboxPool:
             "-e", f"ZCBOT_PG_IPS={self.pg_ips}",
             "--restart=no",
         ]
+        # Read-only mount of the repo skills: now that fs tools run in the container, read/glob/grep can access
+        # the references/*.md files cited inside SKILL.md. On the host, zcbot/skills/ is project code,
+        # orthogonal to the user's working_dir; read-only keeps in-container processes from modifying skill implementations.
+        if self.repo_root is not None:
+            skills_path = (self.repo_root / "skills").resolve()
+            if skills_path.is_dir():
+                cmd += ["-v", f"{skills_path}:/sandbox/skills:ro"]
         if self.runtime:
             cmd += ["--runtime", self.runtime]
         cmd.append(self.image)
@@ -204,13 +218,16 @@ class SandboxPool:
         return ids
 
 
-def setup_pool(user_root_base: Path) -> SandboxPool:
+def setup_pool(
+    user_root_base: Path, repo_root: Optional[Path] = None
+) -> SandboxPool:
     """Convenience entry point for app startup: ensure the network exists + return a pool instance.
 
     Typical usage (lifespan startup hook):
-        pool = setup_pool(workspace / "users")
+        from core.paths import ROOT
+        pool = setup_pool(workspace / "users", repo_root=ROOT)
         pool.shutdown_all()  # sweep orphans left by a previous run
         # a background reaper task periodically runs pool.reap_idle()
     """
     ensure_network()
-    return SandboxPool(user_root_base=user_root_base)
+    return SandboxPool(user_root_base=user_root_base, repo_root=repo_root)
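The conditional read-only mount can be isolated into a small pure function, which makes it easy to test without Docker. This sketch assumes that only the `-v <host>:/sandbox/skills:ro` argument matters; `skills_mount_args` is a hypothetical helper, not part of `SandboxPool`.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def skills_mount_args(repo_root: Optional[Path]) -> List[str]:
    """Return `docker run` args for the read-only skills mount, or [] when unavailable."""
    if repo_root is None:
        return []  # no repo known: rely on the user_root boundary alone
    skills_path = (repo_root / "skills").resolve()
    if not skills_path.is_dir():
        return []  # skip the mount rather than pass a non-existent source to docker
    return ["-v", f"{skills_path}:/sandbox/skills:ro"]


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    print(skills_mount_args(repo))  # skills/ does not exist yet
    (repo / "skills").mkdir()
    print(skills_mount_args(repo)[1].endswith(":/sandbox/skills:ro"))
```

Resolving the path first matters because `docker run -v` expects an absolute host path for a bind mount.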
@@ -0,0 +1,80 @@
+"""In-container fs tool helper (DockerExecutor invokes it via `docker exec python tool_runner.py`).
+
+Calling convention:
+- argv[1] = tool name (read / write / edit / glob / grep)
+- stdin = JSON-serialized args (stdin rather than argv so that shell metacharacters cannot split paths;
+  CJK / quotes / path separators all pass through unchanged)
+- stdout = text returned by the tool's execute (what the LLM receives)
+- exit code = 0 ok / 1 the tool raised internally / 2 bad arguments or unknown tool
+
+base_dir = `os.getcwd()`: docker exec --workdir /workspace/<wd> has already switched to the task working directory
+user_root = `/workspace`: the bind mount boundary; Tool._display renders relative paths against it
+
+Depends on nothing outside zcbot's own packages; it only uses the five Tool subclasses in `tools.fs`. In the container image
+`/sandbox/tools/` is a copy of the host repo's `tools/` directory (Dockerfile `COPY tools/`).
+"""
+from __future__ import annotations
+
+import json
+import os
+import sys
+import traceback
+from pathlib import Path
+
+
+# The image carries a copy of tools/ under /sandbox/, so imports resolve from /sandbox/
+sys.path.insert(0, "/sandbox")
+
+from tools.fs import EditTool, GlobTool, GrepTool, ReadTool, WriteTool  # noqa: E402
+
+
+TOOLS = {
+    "read": ReadTool,
+    "write": WriteTool,
+    "edit": EditTool,
+    "glob": GlobTool,
+    "grep": GrepTool,
+}
+
+
+def main() -> int:
+    if len(sys.argv) < 2:
+        print("[Error] tool_runner: missing tool name argv[1]", file=sys.stderr)
+        return 2
+    name = sys.argv[1]
+    if name not in TOOLS:
+        print(f"[Error] tool_runner: unknown tool: {name}", file=sys.stderr)
+        return 2
+
+    try:
+        args_raw = sys.stdin.read()
+        args = json.loads(args_raw) if args_raw.strip() else {}
+    except json.JSONDecodeError as e:
+        print(f"[Error] tool_runner: invalid JSON args: {e}", file=sys.stderr)
+        return 2
+
+    cls = TOOLS[name]
+    tool = cls(base_dir=Path(os.getcwd()), user_root=Path("/workspace"))
+    try:
+        result = tool.execute(**args)
+    except TypeError as e:
+        print(f"[Error] bad arguments to {name}: {e}", file=sys.stderr)
+        return 2
+    except Exception as e:
+        # A Tool raised inside the container (IO / permissions etc.): write to stderr + exit non-zero; DockerExecutor
+        # turns it into the ToolResult content. A traceback over 80 lines is cut to its first 40 to avoid flooding the LLM context
+        print(f"[Error executing {name}] {type(e).__name__}: {e}", file=sys.stderr)
+        tb = traceback.format_exc().splitlines()
+        if len(tb) > 80:
+            tb = tb[:40] + [f"... ({len(tb) - 40} lines truncated) ..."]
+        print("\n".join(tb), file=sys.stderr)
+        return 1
+
+    if not isinstance(result, str):
+        result = str(result)
+    sys.stdout.write(result)
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
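The runner's exit-code convention (0 ok / 1 tool raised / 2 bad input) is easy to exercise without Docker. The following is a simplified sketch with a stand-in tool registry; `run_tool` and `fake_read` are hypothetical and only mimic the control flow of `main()`, returning a triple instead of writing to real streams.

```python
import json
from typing import Any, Callable, Dict, Tuple

ToolFn = Callable[..., Any]


def run_tool(name: str, stdin_text: str, tools: Dict[str, ToolFn]) -> Tuple[int, str, str]:
    """Return (exit_code, stdout, stderr) following the runner's convention."""
    if name not in tools:
        return 2, "", f"[Error] tool_runner: unknown tool: {name}"
    try:
        args = json.loads(stdin_text) if stdin_text.strip() else {}
    except json.JSONDecodeError as e:
        return 2, "", f"[Error] tool_runner: invalid JSON args: {e}"
    try:
        result = tools[name](**args)
    except TypeError as e:  # wrong or missing keyword arguments
        return 2, "", f"[Error] bad arguments to {name}: {e}"
    except Exception as e:  # IO, permission, and similar failures
        return 1, "", f"[Error executing {name}] {type(e).__name__}: {e}"
    return 0, str(result), ""


def fake_read(path: str) -> str:
    """Stand-in for a read tool; returns a canned string instead of touching disk."""
    if path.startswith("/"):
        raise PermissionError(path)
    return f"<contents of {path}>"


tools = {"read": fake_read}
print(run_tool("read", '{"path": "测试.md"}', tools))
print(run_tool("read", '{"file": "x"}', tools)[0])
print(run_tool("read", '{"path": "/etc/shadow"}', tools)[0])
```

Because the caller treats any non-zero exit as "use stderr as the content", the distinction between 1 and 2 mainly helps when reading logs.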
@@ -75,6 +75,13 @@ RUN mkdir -p /sandbox && cat > /sandbox/puppeteer-config.json <<'EOF'
 }
 EOF
 
+# fs tools in the container (§7.5 #6, corrected 2026-05-26): tool_runner.py is invoked inside the container via
+# `python /sandbox/tool_runner.py <name>` and calls the Tool subclasses in tools/fs.py, so read/write/
+# edit/glob/grep all execute inside the container; a physical boundary replaces code guardrails. tools/ matches the host
+# as of build time (COPY, not a mount: in-container code should not follow host repo edits or restarts).
+COPY tools/ /sandbox/tools/
+COPY core/sandbox/tool_runner.py /sandbox/tool_runner.py
+
 COPY deploy/sandbox/init.sh /init.sh
 RUN chmod +x /init.sh
 
@@ -10,6 +10,8 @@ mock subprocess(`docker exec` 命令的实际跑由部署机 smoke 验,RUN.md
 """
 from __future__ import annotations
 
+import json
+import subprocess
 import sys
 import tempfile
 import unittest
@@ -93,13 +95,19 @@ def make_ctx(executor):
 
 
 class TestHostPassthrough(unittest.TestCase):
-    """Non-container tools pass straight through to the host backend, without touching pool / subprocess."""
+    """Non-container tools pass straight through to the host backend, without touching pool / subprocess.
 
-    def test_read_passthrough_to_host(self):
-        executor, pool, _ = make_executor()
+    Corrected 2026-05-26: the fs tools (read/write/edit/glob/grep) moved into the container too, so host passthrough
+    is left with load_skill / web_* / seedream / seedance (key holders). load_skill is used to test passthrough.
+    """
+
+    def test_load_skill_passthrough_to_host(self):
+        executor, pool, _ = make_executor(tools_dict={
+            "load_skill": FakeTool("load_skill", "LOAD_OUT"),
+        })
         ctx = make_ctx(executor)
-        result = executor.call_tool("read", {"file": "x"}, ctx)
-        self.assertEqual(result.content, "READ_OUT")
+        result = executor.call_tool("load_skill", {"name": "x"}, ctx)
+        self.assertEqual(result.content, "LOAD_OUT")
         self.assertEqual(result.exit_code, 0)
         self.assertEqual(pool.ensure_calls, [])
         self.assertEqual(pool.mark_active_calls, [])
@@ -264,6 +272,108 @@ class TestRunPython(unittest.TestCase):
         self.assertEqual(leftover, [])
 
 
+class TestFsToolsInContainer(unittest.TestCase):
+    """fs tools (read/write/edit/glob/grep) go through docker exec + tool_runner.py (§7.5 #6)."""
+
+    def _setup_fs_executor(self):
+        return make_executor(tools_dict={
+            "read": FakeTool("read"),
+            "write": FakeTool("write"),
+            "edit": FakeTool("edit"),
+            "glob": FakeTool("glob"),
+            "grep": FakeTool("grep"),
+        })
+
+    def test_read_invokes_tool_runner(self):
+        executor, pool, _ = self._setup_fs_executor()
+        ctx = make_ctx(executor)
+
+        proc = MagicMock()
+        proc.communicate.return_value = ("file content here", "")
+        proc.returncode = 0
+
+        with patch("core.executor_docker.subprocess.Popen", return_value=proc) as popen:
+            result = executor.call_tool("read", {"path": "foo.txt"}, ctx)
+
+        # fs tools: stdout is returned directly, with no [stdout]/[exit] wrapper
+        self.assertEqual(result.content, "file content here")
+        self.assertEqual(result.exit_code, 0)
+
+        argv = popen.call_args[0][0]
+        # last three argv items: python /sandbox/tool_runner.py read
+        self.assertEqual(argv[-3:], ["python", "/sandbox/tool_runner.py", "read"])
+        # -i is required (stdin must reach the container)
+        self.assertIn("-i", argv)
+        # workdir / user are as usual
+        self.assertEqual(argv[argv.index("--workdir") + 1], "/workspace/demo")
+
+        # the JSON args fed over stdin
+        kwargs = popen.call_args[1]
+        self.assertEqual(kwargs.get("stdin"), subprocess.PIPE)
+        stdin_payload = proc.communicate.call_args[1].get("input")
+        self.assertEqual(json.loads(stdin_payload), {"path": "foo.txt"})
+
+        # the pool was used
+        self.assertEqual(pool.ensure_calls, [executor.user_id])
+        self.assertEqual(pool.mark_active_calls, [executor.user_id])
+
+    def test_write_with_cjk_path(self):
+        """CJK paths are not split by shell metacharacters (the core argument for JSON over stdin)."""
+        executor, _, _ = self._setup_fs_executor()
+        ctx = make_ctx(executor)
+
+        proc = MagicMock()
+        proc.communicate.return_value = ("[wrote 100 chars to 测试.md]", "")
+        proc.returncode = 0
+
+        with patch("core.executor_docker.subprocess.Popen", return_value=proc):
+            result = executor.call_tool(
+                "write",
+                {"path": "测试目录/中文文件.md", "content": "你好"},
+                ctx,
+            )
+
+        self.assertIn("[wrote", result.content)
+        stdin_payload = proc.communicate.call_args[1].get("input")
+        parsed = json.loads(stdin_payload)
+        self.assertEqual(parsed["path"], "测试目录/中文文件.md")
+        self.assertEqual(parsed["content"], "你好")
+
+    def test_grep_error_to_stderr(self):
+        """When tool_runner.py exits != 0, stderr is passed through as the ToolResult content."""
+        executor, _, _ = self._setup_fs_executor()
+        ctx = make_ctx(executor)
+
+        proc = MagicMock()
+        proc.communicate.return_value = ("", "[Error] invalid regex: ...\n")
+        proc.returncode = 1
+
+        with patch("core.executor_docker.subprocess.Popen", return_value=proc):
+            result = executor.call_tool("grep", {"pattern": "["}, ctx)
+
+        self.assertIn("[Error]", result.content)
+        self.assertEqual(result.exit_code, 1)
+
+    def test_fs_tool_timeout(self):
+        executor, _, _ = self._setup_fs_executor()
+        ctx = make_ctx(executor)
+
+        proc = MagicMock()
+        proc.communicate.side_effect = [
+            subprocess.TimeoutExpired(cmd="docker", timeout=0.5),
+            ("", ""),
+        ]
+        proc.returncode = -9
+
+        with patch("core.executor_docker.subprocess.Popen", return_value=proc), \
+                patch("core.executor_docker.time.monotonic", side_effect=[0, 1000]):
+            result = executor.call_tool("glob", {"pattern": "**/*"}, ctx)
+
+        self.assertIn("timed out", result.content)
+        self.assertEqual(result.exit_code, 124)
+        proc.kill.assert_called_once()
+
+
 class TestUnknownTool(unittest.TestCase):
     def test_unknown_tool_goes_to_host(self):
         executor, _, _ = make_executor(tools_dict={})  # empty host → no tools at all
@@ -508,6 +508,7 @@ def create_app() -> FastAPI:
         sandbox_backend = os.getenv("ZCBOT_SANDBOX_BACKEND", "host").lower()
         sandbox_reaper_task = None
         if sandbox_backend == "docker":
+            from core.paths import ROOT
             from core.sandbox import init_pool
             from core.sandbox.check import detect_fs_quota
             workspace = resolve_workspace(None, _cfg)
@@ -520,7 +521,9 @@ def create_app() -> FastAPI:
             except Exception as e:
                 print(f"[startup] [warn] fs quota detect failed: {type(e).__name__}: {e}")
             try:
-                pool = init_pool(user_root_base)
+                # repo_root=ROOT lets SandboxPool mount <repo>/skills read-only into the container
+                # (needed for read on SKILL references now that fs tools run in the container)
+                pool = init_pool(user_root_base, repo_root=ROOT)
                 removed = pool.shutdown_all()
                 if removed:
                     print(f"[startup] swept {len(removed)} stale sandbox container(s)")