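fix(context): repair dangling tool_calls at send time so interrupted runs no longer break the protocol + bump 0.20.2

A run that is interrupted after assistant.tool_calls is written but before the tool results are persisted (upstream stream disconnect / user cancel / crash) leaves a message in history whose tool_calls have no matching tool results. When the user then sends another message, the next turn forwards the history unchanged and DeepSeek/OpenAI reject it (must be followed by tool messages), leaving the task stuck at run_status=error (confirmed on task 5c5d6d25 while investigating from the monitoring page).

Add _repair_dangling_tool_calls at the entry of prepare_messages_with_stats (before the early-return branch): for each assistant.tool_calls message, scan the tool results that immediately follow and add a placeholder tool message for every missing tool_call_id. The repair happens purely at send time and never touches the database, so it covers every interruption path and existing bad data heals itself; stats records the count as repaired_tool_calls.

This differs from the 06-06/06-12 fixes for corrupted arguments (those addressed argument poisoning; this is a structural dangling reference).

Adds 4 unit tests; all 14 tests in the context suite pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>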
This commit is contained in:
parent
f8d11a2491
commit
c55d0d11f0
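The failure described in the commit message can be illustrated with a small detection sketch. This is not part of the commit: `dangling_tool_call_ids` is a hypothetical helper name, and the message dicts simply follow the OpenAI-style chat format the commit message refers to. It reports every `tool_call_id` that has no tool result in the contiguous block right after its assistant message, which is the condition the provider rejects.

```python
from typing import Any


def dangling_tool_call_ids(messages: list[dict[str, Any]]) -> list[str]:
    """Return tool_call ids lacking a tool result directly after their assistant message.

    Hypothetical helper for illustration; not part of the commit.
    """
    missing: list[str] = []
    for index, msg in enumerate(messages):
        if msg.get("role") != "assistant" or not msg.get("tool_calls"):
            continue
        answered = set()
        # Only the contiguous run of tool messages right after the assistant counts.
        for follower in messages[index + 1:]:
            if follower.get("role") != "tool":
                break
            answered.add(follower.get("tool_call_id"))
        missing.extend(
            tc["id"] for tc in msg["tool_calls"]
            if tc.get("id") and tc["id"] not in answered
        )
    return missing


# History shape left behind by an interrupted run: tool_calls, then a user message.
broken = [
    {"role": "system", "content": "rules"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_x", "type": "function",
         "function": {"name": "run_python", "arguments": "{}"}},
    ]},
    {"role": "user", "content": "why did you stop responding?"},
]
# The same history with the tool result present.
healthy = broken[:2] + [
    {"role": "tool", "tool_call_id": "call_x", "content": "ok"},
    broken[2],
]

print(dangling_tool_call_ids(broken))   # ['call_x']
print(dangling_tool_call_ids(healthy))  # []
```

A check like this could serve as a before/after probe when verifying a stored task, under the assumption that the stored history uses the same message shape.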
@@ -2,7 +2,7 @@

 > Companion to `DESIGN.md`. This file records only phase status, deviations from decisions, file counts, and next steps. Each entry is 1-2 sentences: what was done + the key judgment; for details see `git log` / `git diff` / `DESIGN §7.9`.

-Last updated: 2026-06-18 (brief repositioned as "Key Literature at a Glance" + trimmed to three files + bump 0.20.0)
+Last updated: 2026-06-21 (send-time repair of dangling tool_calls so interrupted runs no longer break the protocol + bump 0.20.2)

 ---

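@@ -21,6 +21,13 @@

 ## Completed key capabilities

+### 2026-06-21 / Send-time repair of dangling tool_calls (bump 0.20.2)
+
+- Root cause (found while investigating an error task from the monitoring page; confirmed against the DB for task 5c5d6d25): a run was interrupted after `assistant.tool_calls` was written but before the tool results were persisted (upstream stream disconnect / user cancel / crash), leaving a message in history whose `assistant.tool_calls` has **no matching tool results**. When the user then sent another message, the next turn forwarded the history unchanged and DeepSeek/OpenAI rejected it with `An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'` → the task entered `run_status=error` and stayed stuck. Unlike the 06-06/06-12 fixes for corrupted/poisoned arguments (which addressed "arguments squashed into a marker"), this is a **structural dangling reference** that the earlier fixes do not cover.
+- Fix (option A, send-time safety net): `core/context.py` adds `_repair_dangling_tool_calls`, which runs at the entry of `prepare_messages_with_stats` (before the early-return branch). For each `assistant.tool_calls` message it scans the contiguous tool results that immediately follow and adds a placeholder tool message (`[interrupted: ...]`, carrying the original function name) for every **missing** `tool_call_id`. It works purely at send time and never modifies the database → it covers every interruption path, and existing bad data heals itself (repaired the next time a message is sent); `stats.repaired_tool_calls` counts the repairs. A was chosen over a write-time defense (option B): B would have to cover every interruption path, is easy to get wrong, and cannot rescue existing data.
+- Verification: on the real broken task 5c5d6d25, idx 19 had 1 dangling call before the fix → 0 dangling after, and the history is protocol-valid (both the compaction-on and compaction-skipped branches are covered); 4 new unit tests added, all 14 tests in the context suite pass.
+- Files: `core/context.py`, `tests/test_context_compaction.py`; `core/__init__.py` 0.20.1→0.20.2.
+
 ### 2026-06-18 / brief repositioned as "Key Literature at a Glance" + trimmed to three files (bump 0.20.0)

 - Requirement-drift convergence: the brief is repositioned from a "hot-topic clustering and trend-judgment brief" to a **"list of key papers + content summary" overview**: ① describe only, give no recommendations (drop implications/judgments/gaps and controversies); ② open with a list of key journal papers (from the major relevant journals, **Elsevier database first**), each with a short introduction/abstract overview; ③ an objective summary of that batch of papers is sufficient.
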
@@ -1,3 +1,3 @@
 # Single source of truth for the zcbot version: the FastAPI version in web/app.py, the /healthz response, and the frontend display all read from here.
 # To change the version, edit only this line.
-__version__ = "0.20.1"
+__version__ = "0.20.2"

@@ -49,6 +49,68 @@ def _message_chars(msg: dict[str, Any]) -> int:
     return len(str(msg))


+_INTERRUPTED_TOOL_RESULT = (
+    "[interrupted: tool result missing — run was cut off "
+    "(disconnect/cancel) before this tool finished]"
+)
+
+
+def _repair_dangling_tool_calls(
+    messages: List[dict[str, Any]],
+) -> tuple[List[dict[str, Any]], int]:
+    """Fill in dangling tool_calls left by an interrupted run; return (repaired messages, placeholder count).
+
+    When a run is interrupted after `assistant.tool_calls` is written but before the tool results
+    are written (upstream disconnect / user cancel / crash), history keeps an `assistant.tool_calls`
+    message with no matching tool results; once the user speaks again, OpenAI/DeepSeek reject the next turn:
+    "An assistant message with 'tool_calls' must be followed by tool messages
+    responding to each 'tool_call_id'" (hit on task 5c5d6d25, confirmed in the DB on 2026-06-18).
+
+    Before sending, add a placeholder tool message for each **missing** tool_call_id directly after
+    its assistant message, which satisfies the protocol without dropping context. Send-time only, no
+    database writes, so it covers every interruption path and bad data that already exists.
+    """
+    repaired: List[dict[str, Any]] = []
+    repaired_count = 0
+    n = len(messages)
+    i = 0
+    while i < n:
+        msg = messages[i]
+        repaired.append(msg)
+        tool_calls = msg.get("tool_calls") if isinstance(msg, dict) else None
+        if isinstance(msg, dict) and msg.get("role") == "assistant" and tool_calls:
+            id_to_name = {
+                tc.get("id"): (tc.get("function") or {}).get("name")
+                for tc in tool_calls
+                if isinstance(tc, dict) and tc.get("id")
+            }
+            # Collect the ids already answered by the contiguous tool messages that follow (the protocol requires tool results directly after the assistant message).
+            answered: set[Any] = set()
+            j = i + 1
+            while j < n and isinstance(messages[j], dict) and messages[j].get("role") == "tool":
+                cid = messages[j].get("tool_call_id")
+                if cid:
+                    answered.add(cid)
+                repaired.append(messages[j])
+                j += 1
+            # Add a placeholder tool message for each missing id (kept inside this assistant's tool-result block).
+            for cid, name in id_to_name.items():
+                if cid not in answered:
+                    synthetic: dict[str, Any] = {
+                        "role": "tool",
+                        "tool_call_id": cid,
+                        "content": _INTERRUPTED_TOOL_RESULT,
+                    }
+                    if name:
+                        synthetic["name"] = name
+                    repaired.append(synthetic)
+                    repaired_count += 1
+            i = j
+            continue
+        i += 1
+    return repaired, repaired_count
+
+
 def prepare_messages_for_llm(
     messages: List[dict[str, Any]],
     *,

@@ -87,6 +149,8 @@ def prepare_messages_with_stats(
     """
     if keep_recent < 0:
         keep_recent = 0
+    # First fill in dangling tool_calls left by an interrupted run (otherwise the model rejects the history as-is; see the function docstring).
+    messages, repaired_tool_calls = _repair_dangling_tool_calls(messages)
     original_chars = sum(_message_chars(m) for m in messages)

     # Below the context-pressure threshold → send as-is with zero compaction (cache stays fully warm + no information lost). Compaction is only for when the history "does not fit".

@@ -99,6 +163,7 @@ def prepare_messages_with_stats(
             "compacted_tool_messages": 0,
             "compacted_skill_messages": 0,
             "compaction_skipped": 1,
+            "repaired_tool_calls": repaired_tool_calls,
         }
         return prepared, stats

@@ -136,5 +201,6 @@ def prepare_messages_with_stats(
         "compacted_tool_messages": compacted_tool_messages,
         "compacted_skill_messages": compacted_skill_messages,
         "compaction_skipped": 0,
+        "repaired_tool_calls": repaired_tool_calls,
     }
     return prepared, stats

@@ -208,5 +208,97 @@ class ContextCompactionTests(unittest.TestCase):
         self.assertGreater(stats["saved_chars"], 0)

+
+    def test_repairs_dangling_tool_calls_followed_by_user(self) -> None:
+        # The run was interrupted after assistant.tool_calls (disconnect/cancel) and the tool result was never persisted; the user then kept talking.
+        # Sent as-is, DeepSeek/OpenAI reject it. A placeholder tool result must be added before sending. (Observed on task 5c5d6d25.)
+        messages = [
+            {"role": "system", "content": "rules"},
+            {
+                "role": "assistant",
+                "content": None,
+                "tool_calls": [{
+                    "id": "call_x",
+                    "type": "function",
+                    "function": {"name": "run_python", "arguments": "{}"},
+                }],
+            },
+            {"role": "user", "content": "why did you stop responding"},
+            {"role": "user", "content": "what are you doing"},
+        ]
+
+        prepared, stats = prepare_messages_with_stats(messages, keep_recent=12)
+
+        # The synthesized tool result follows assistant.tool_calls directly, and only then come the user messages.
+        self.assertEqual(prepared[1]["role"], "assistant")
+        self.assertEqual(prepared[2]["role"], "tool")
+        self.assertEqual(prepared[2]["tool_call_id"], "call_x")
+        self.assertEqual(prepared[2]["name"], "run_python")
+        self.assertIn("interrupted", prepared[2]["content"])
+        self.assertEqual(prepared[3]["role"], "user")
+        self.assertEqual(stats["repaired_tool_calls"], 1)
+
+    def test_does_not_touch_well_paired_tool_calls(self) -> None:
+        messages = [
+            {"role": "system", "content": "rules"},
+            {
+                "role": "assistant",
+                "content": None,
+                "tool_calls": [{"id": "call_x", "type": "function",
+                                "function": {"name": "shell", "arguments": "{}"}}],
+            },
+            {"role": "tool", "tool_call_id": "call_x", "name": "shell", "content": "ok"},
+            {"role": "user", "content": "next"},
+        ]
+
+        prepared, stats = prepare_messages_with_stats(messages, keep_recent=12)
+
+        self.assertEqual(stats["repaired_tool_calls"], 0)
+        self.assertEqual(len(prepared), len(messages))
+
+    def test_repairs_partial_multi_tool_call_block(self) -> None:
+        # One assistant message issued two tool_calls but only one was answered → only the missing one is filled in.
+        messages = [
+            {"role": "system", "content": "rules"},
+            {
+                "role": "assistant",
+                "content": None,
+                "tool_calls": [
+                    {"id": "a", "type": "function", "function": {"name": "shell", "arguments": "{}"}},
+                    {"id": "b", "type": "function", "function": {"name": "run_python", "arguments": "{}"}},
+                ],
+            },
+            {"role": "tool", "tool_call_id": "a", "name": "shell", "content": "ok"},
+            {"role": "user", "content": "next"},
+        ]
+
+        prepared, stats = prepare_messages_with_stats(messages, keep_recent=12)
+
+        self.assertEqual(stats["repaired_tool_calls"], 1)
+        tool_ids = [m["tool_call_id"] for m in prepared if m.get("role") == "tool"]
+        self.assertEqual(set(tool_ids), {"a", "b"})
+
+    def test_repair_runs_even_when_compaction_skipped(self) -> None:
+        # The repair must also run below the compaction threshold (it sits before the early-return branch).
+        messages = [
+            {"role": "system", "content": "rules"},
+            {
+                "role": "assistant",
+                "content": None,
+                "tool_calls": [{"id": "call_x", "type": "function",
+                                "function": {"name": "run_python", "arguments": "{}"}}],
+            },
+            {"role": "user", "content": "hello"},
+        ]
+
+        prepared, stats = prepare_messages_with_stats(
+            messages, keep_recent=12, compact_threshold_chars=10_000_000,
+        )
+
+        self.assertEqual(stats["compaction_skipped"], 1)
+        self.assertEqual(stats["repaired_tool_calls"], 1)
+        self.assertEqual(prepared[2]["role"], "tool")
+        self.assertEqual(prepared[2]["tool_call_id"], "call_x")
+

 if __name__ == "__main__":
     unittest.main()
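
Because the repair runs on every send and never writes back to the database, the same dangling history passes through it again on each turn, so it is worth checking that its output is stable. The sketch below is a condensed, standalone re-statement of the repair loop for experimentation only; `repair` and `PLACEHOLDER` are hypothetical names, the placeholder text is shortened, and the function-name field is omitted. It is not the code from `core/context.py`. It checks that a second pass over already repaired output adds nothing and that the input list is left unmodified.

```python
from typing import Any

PLACEHOLDER = "[interrupted: tool result missing]"  # shortened stand-in text


def repair(messages: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], int]:
    """Condensed re-statement of the send-time repair (illustration only)."""
    out: list[dict[str, Any]] = []
    added = 0
    i, n = 0, len(messages)
    while i < n:
        msg = messages[i]
        out.append(msg)
        i += 1
        if msg.get("role") != "assistant" or not msg.get("tool_calls"):
            continue
        wanted = [tc["id"] for tc in msg["tool_calls"] if tc.get("id")]
        answered = set()
        # Copy the contiguous run of tool results that follows the assistant message.
        while i < n and messages[i].get("role") == "tool":
            answered.add(messages[i].get("tool_call_id"))
            out.append(messages[i])
            i += 1
        # Append one placeholder per unanswered id, still inside the tool block.
        for call_id in wanted:
            if call_id not in answered:
                out.append({"role": "tool", "tool_call_id": call_id, "content": PLACEHOLDER})
                added += 1
    return out, added


history = [
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "a", "type": "function", "function": {"name": "shell", "arguments": "{}"}},
        {"id": "b", "type": "function", "function": {"name": "run_python", "arguments": "{}"}},
    ]},
    {"role": "tool", "tool_call_id": "a", "content": "ok"},
    {"role": "user", "content": "next"},
]

once, first_added = repair(history)
twice, second_added = repair(once)

print(first_added, second_added)   # 1 0
print([m["role"] for m in once])   # ['assistant', 'tool', 'tool', 'user']
print(once == twice)               # True
```

If a property like this were added to the suite, it would most naturally go through `prepare_messages_with_stats` itself rather than a re-implementation; the standalone form is used here only so the sketch runs without the project.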