model: 同 task 内切模型(c 模式 task 级 / A 粒度)+ usage_events v2 表(0006); GET /v1/models; 前端顶栏下拉 + 历史 model 切换点小标
- DB(0006): messages 加 model_profile 列(assistant 行有值); 重建 usage_events 表 v2 形态(event_id/user_id/task_id/message_id/kind/model_profile/units jsonb/cost_usd + 三索引), 0004 删的旧 schema 字段不够多态; tasks.tokens_prompt/completion/cost_usd 保留作粗概览 - ModelCapabilities 加 display_name; deepseek_v4.yaml flash/pro 各填名 - GET /v1/models: 扫 config/models/*.yaml 列可选项(profile/display_name/family/thinking_mode/is_default); POST /v1/tasks + PATCH 接受 model_profile(不传 → cfg["default_model"]; 校验走 ModelCapabilities.load 失败 400) - build_agent: resume 时优先 task.model_profile 而非 cfg default; AgentLoop 加 user_id 透传, 每轮 assistant 入库后调 record_chat_usage(litellm cost map 算钱, 失败吞掉 emit warn 不阻 loop) - core/storage/usage.py 新文件: record_chat_usage(双写 messages.tokens_in/out + model_profile + insert usage_events 一行) - session.append() 返回 message_id(供 usage 关联) - 前端 dev.html: chat-meta 加模型下拉(切了 PATCH + running 中提示"跑完后生效"); 新建对话框 modal 加 nt-model select; renderMessages 按 model_profile 切换点画小标 "── DeepSeek V4 Pro ──" - CLAUDE.md: 加"开发测试期 / 不删现有数据 / DROP COLUMN 两种情况"规则 - DESIGN §7.4 schema 加 messages.model_profile + usage_events v2 段; PROGRESS 加 0006 条目 + 文件清单 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
48924d0d56
commit
781a216ca6
|
|
@ -8,13 +8,14 @@
|
|||
|
||||
## 开发阶段心智
|
||||
|
||||
当前处于开发阶段(尚未发布给真实用户)。改需求 / 重构时,**以最优实现为准,不为旧数据 / 旧字段 / 旧 API 留兼容层**:
|
||||
- DB schema 变 → 直接改 model + 写一条干净的 migration(必要时清空旧 row,不写双向兼容代码)
|
||||
- 字段语义变 → 全量替换,不留 `legacy_xxx` / `*_v2` 并存
|
||||
当前处于**开发测试期**(开发自用 + 内部测试,DB 已有真实测试数据)。改需求 / 重构时,**以最优实现为准,不为旧数据 / 旧字段 / 旧 API 留兼容层**,但**不删现有数据**:
|
||||
- DB schema 变 → 直接改 model + 写一条干净的 migration:加列 / 改列结构 OK;**不要 truncate / DELETE FROM 现有表 —— 测试数据要保留**
|
||||
- 删字段(DROP COLUMN)前:若该列是当前唯一持有该信息(如累计型 tokens 列),先 backfill 到新位置再删;若纯冗余(从其他列能推出)直接删 OK
|
||||
- 字段语义变 → 全量替换 + migration 把旧值映射到新值(不留 `legacy_xxx` / `*_v2` 并存)
|
||||
- CLI / REPL 选项变 → 直接改,不留 deprecated 别名
|
||||
- 只有当用户明确说"这条要保留兼容"时才写兼容代码
|
||||
|
||||
理由:兼容层就是技术债,开发期写了之后忘记删反而拖累;真上线后再视情况补迁移路径。
|
||||
理由:兼容层是技术债;但测试数据是观察新代码行为的依据 —— 一次 truncate 后再回去查"上周那 task 烧了多少 token / 哪条消息触发的 bug",就只能瞎猜。
|
||||
|
||||
## 文档维护
|
||||
|
||||
|
|
|
|||
18
DESIGN.md
18
DESIGN.md
|
|
@ -324,12 +324,26 @@ create index on tasks (user_id, working_dir);
|
|||
-- 入口校验 validate_task_name():拒空 / 含 /\NUL / `.` 起头 / >255
|
||||
|
||||
messages(message_id uuid pk, task_id fk, idx int not null,
|
||||
payload jsonb not null, tokens_in, tokens_out, created_at,
|
||||
payload jsonb not null, tokens_in, tokens_out,
|
||||
model_profile text null, -- 0006:只在 assistant 行有值,标产生该 msg 的模型
|
||||
created_at,
|
||||
unique (task_id, idx));
|
||||
create index on messages using gin (payload jsonb_path_ops);
|
||||
|
||||
usage_events(event_id uuid pk, user_id fk, task_id fk on delete cascade,
|
||||
message_id fk on delete set null,
|
||||
kind text not null, -- chat / image / video / audio / ...(0006 起只 chat,媒体扩展位)
|
||||
model_profile text not null,
|
||||
units jsonb not null, -- chat: {tokens_in, tokens_out};image: {count, size};...
|
||||
cost_usd numeric(12,6) not null default 0,
|
||||
created_at);
|
||||
create index on usage_events (user_id, created_at); -- 用户级聚合走这条,JOIN-free
|
||||
create index on usage_events (task_id);
|
||||
create index on usage_events (model_profile, created_at);
|
||||
```
|
||||
|
||||
**0004 简化**:`runs` 表角色等价"task 当前 in-flight 状态",合并到 `tasks.run_status` + `run_error`;`usage_events` 是计费预付架构成本,真要计费再加。`run_id` 单活 run 形态下对客户端 / broker / cancel 全冗余 → 客户端只需 task_id。
|
||||
**0004 简化**:`runs` 表角色等价"task 当前 in-flight 状态",合并到 `tasks.run_status` + `run_error`;`run_id` 单活 run 形态下对客户端 / broker / cancel 全冗余 → 客户端只需 task_id。
|
||||
**0006 模型切换 + 用量统计**:`tasks.model_profile` 从 0001 起就有,本次开始真用 —— task 创建时 UI 选 / PATCH 切;`build_agent` resume 读它而非 `cfg["default_model"]`(A 粒度:下条 send 才生效,当前 run 不受影响)。`messages.model_profile` 新增,assistant 行落实际用的模型,前端按 model 切换点画小标。`usage_events` 表 0004 删掉的简陋版形态(id/user_id/task_id/run_id/kind/value/ts)字段不够多态,本次重建 v2 形态:per-event 一行,`units` JSONB 装多态用量(token / 张数 / 秒数),`cost_usd` 用 litellm cost map 算;chat 已接入(`core/loop.py` 在 assistant message 入库后调 `record_chat_usage`),媒体工具未来加 image/video kind 不动 schema。**`tasks.tokens_prompt/completion/cost_usd` 三列保留作粗 task 级概览**,继续由 `sync_task_tokens` 维护;`messages.tokens_in/out` 同时双写,查 message 详情不需 JOIN。统计真实 source-of-truth 走 `usage_events`,跨用户 / 跨模型 / 跨时间维度都按 `(user_id, created_at)` 索引直查。
|
||||
**run_status 终态语义**:`ok` / `cancelled` 收尾回 `idle`(用户视角等价),只有 `error` 持久(让用户能看到),起新 run 时由 `post_message` 清。
|
||||
|
||||
**No-subtask 校验**(`create_task`):同 user 下查 `new LIKE existing/%` 或 `existing LIKE new/%`,中一则拒;同 working_dir 允许。两侧先用 `from_db_path` 归一到 absolute posix 再比前缀(混合存储形态不漏判),数量小直接 Python 端比对,不在 SQL 里拼分隔符。
|
||||
|
|
|
|||
|
|
@ -2,7 +2,7 @@
|
|||
|
||||
> 配合 `DESIGN.md`。本文件只记 phase 状态、决策偏差、文件量、下一步。每条 2-4 句:做了啥 + 关键判断 + 没动什么;细节查 `git log` / `git diff`。
|
||||
|
||||
最后更新:2026-05-19(dev SPA 登录从"邀请码/uuid5"撤回 邮箱+密码 — `users.email/password_hash` + UNIQUE + `main.py user add` CLI + 登录页两 tab)
|
||||
最后更新:2026-05-19(0006 模型切换 + usage_events v2 表:task 级模型 PATCH / `GET /v1/models` / 前端顶栏下拉 + 历史小标 / chat usage 落 messages 双写 + usage_events 一行,A 粒度下条 send 生效)
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -23,6 +23,7 @@
|
|||
|
||||
### 2026-05-19
|
||||
|
||||
- **0006 模型切换(c 模式 task 级 A 粒度)+ usage_events v2 表**:`tasks.model_profile` 从死字段变 source-of-truth,顶栏下拉 → `PATCH /v1/tasks/{id}` 即换,**A 粒度下条 send 生效**(当前 run 不受影响;running 中切 UI 提示"跑完后生效")。`build_agent` resume 时优先 task.model_profile,新建 task POST body 加可选 `model_profile`(留空 → `cfg["default_model"]`)。`GET /v1/models` 扫 `config/models/*.yaml` 列可选项(含 display_name / thinking_mode / is_default),`ModelCapabilities` 加 `display_name` 字段,deepseek_v4.yaml 两 variant 各填名。**前端**:chat-meta 加下拉(切了 PATCH+提示)、新建对话框 modal 加 `<select id="nt-model">`、message 历史按 `messages.model_profile` 切换点画小标(`── DeepSeek V4 Pro ──`,连续同 model 不重复)。**统计 schema**:0004 删掉的简陋 usage_events 字段不够多态,本次重建 v2 形态(`event_id/user_id/task_id/message_id/kind/model_profile/units jsonb/cost_usd`),chat 已接入(`core/storage/usage.py::record_chat_usage`,`loop.py` 在 assistant message 入库后调,litellm cost map 算钱),媒体扩展位(image/video/audio kind)预留不动 schema。**双写**:同时回填 `messages.tokens_in/out/model_profile`,查 message 详情时不需 JOIN。**索引**:`(user_id, created_at)` / `(task_id)` / `(model_profile, created_at)`,用户级配额 query JOIN-free。**没动**:CLI / RUN.md(无 env / 命令变化)、`tasks.tokens_prompt/completion/cost_usd` 保留作 task 级粗概览。
|
||||
- **dev SPA 登录撤回 邮箱+密码,删 invites 表**:前两条"邀请码 env → invites 表(0005)"一日游撤回,复用 users 表本来就有的 email/password_hash 列(0001 schema)+ 0005 加 UNIQUE(email)。`bcrypt` 哈希,新 `/v1/auth/login_password` 路由,新 `main.py user add --email --password` CLI 发用户。dev SPA 登录两 tab(邮箱密码 默认 / UUID+PLATFORM_KEY 备用,last-used 持久化 LS)。判定:邀请码 uuid5(NS,name) 推导对外是黑盒(改 name = 换身份),复用 users 列语义清晰也对齐生产路径。**没动**:JWT 签发 / platform_key 路径 / DB users 表列结构。
|
||||
- **邀请码后端 env → invites 表(0005)** _(已撤,见上条;原条目已删,有需要看 git history)_
|
||||
- **SENTINEL user 彻底撤(数据 + 代码)**:`SENTINEL_USER_ID = UUID('00000000-...')` 在 web 必从 JWT 拿 user_id 后已无角色,按 CLAUDE.md "不写兼容层" 连根拔。DB CASCADE 删 sentinel user + workspace dotfile 目录;代码 10 处删 import / 默认参数 / fallback,`utils.py` 三函数和 `build_agent` 的 `user_id` 从可选变必填(`build_agent` 加 `*,` 转 KEYWORD_ONLY 规避默认参数顺序)。**Bonus**:把"操作 user 数据的函数必须显式传 user_id"作为 Python 必填参数固化,以后多 user 函数 typechecker 会拦到。
|
||||
|
|
@ -113,9 +114,10 @@ core/skills.py 81
|
|||
core/task.py 82 ← §7 B Step 3: PG-backed TaskState
|
||||
core/memory.py 81 ← per-user `.memory/` dotfile
|
||||
core/export_docx.py 383
|
||||
core/storage/__init__.py 27
|
||||
core/storage/__init__.py 29 ← record_chat_usage 出口(0006)
|
||||
core/storage/engine.py 80
|
||||
core/storage/models.py 98 ← 3 表(0004 删 runs/usage_events;0005 email UNIQUE)
|
||||
core/storage/models.py 130 ← 4 表(0004 删 runs;0005 email UNIQUE;0006 加 usage_events v2 + messages.model_profile)
|
||||
core/storage/usage.py 70 ← 0006:record_chat_usage(litellm cost map + 双写 messages + insert usage_events)
|
||||
core/storage/utils.py 136
|
||||
core/agent_builder.py 307 ← 装配 lib(原 main.py 内容,05-18 改名归位)
|
||||
tools/{base,fs,shell,run_python,skill_tool}.py ~440 行
|
||||
|
|
@ -127,6 +129,7 @@ db/migrations/versions/
|
|||
0003_task_name_and_working_dir.py 51
|
||||
0004_drop_runs_usage_events.py 77
|
||||
0005_users_email_unique.py 28 ← 0005 一日游 invites 已撤,接 users.email UNIQUE
|
||||
0006_usage_events_v2_and_message_model.py 60 ← messages.model_profile 列 + usage_events v2 表(多态 units jsonb)
|
||||
web/__init__.py 5
|
||||
web/app.py ~890 ← /v1 JSON API + user_id 隔离 + run lock + task-level cancel
|
||||
web/auth.py ~190 ← D' 过渡:邮箱密码 + platform_key → JWT
|
||||
|
|
|
|||
|
|
@ -4,6 +4,7 @@ family: deepseek_v4
|
|||
|
||||
variants:
|
||||
flash:
|
||||
display_name: DeepSeek V4 Flash
|
||||
model_id: deepseek/deepseek-v4-flash
|
||||
api_base: https://api.deepseek.com/v1
|
||||
api_key_env: DEEPSEEK_API_KEY
|
||||
|
|
@ -23,6 +24,7 @@ variants:
|
|||
extended_thinking: false
|
||||
|
||||
pro:
|
||||
display_name: DeepSeek V4 Pro
|
||||
model_id: deepseek/deepseek-v4-pro
|
||||
api_base: https://api.deepseek.com/v1
|
||||
api_key_env: DEEPSEEK_API_KEY
|
||||
|
|
|
|||
|
|
@ -187,9 +187,23 @@ def build_agent(
|
|||
web 入口从 JWT 拿到后透传;不允许无 user 的调用路径。
|
||||
"""
|
||||
cfg = load_config()
|
||||
model = model_name or cfg["default_model"]
|
||||
uid = user_id
|
||||
|
||||
# model 选择优先级:caller 传参 > resume 时 task.model_profile > cfg["default_model"]。
|
||||
# caller 传参为新建 task 时 web POST /v1/tasks 接收的 model_profile 字段;resume
|
||||
# 不传时读 tasks 表(由顶栏下拉切换 PATCH 维护)。整体满足 grill A 粒度:下条 send 生效。
|
||||
model = model_name
|
||||
if model is None and resume and session_id:
|
||||
from sqlalchemy import select as _select
|
||||
from core.storage import session_scope as _scope
|
||||
from core.storage.models import Task as _Task
|
||||
with _scope() as _s:
|
||||
model = _s.execute(
|
||||
_select(_Task.model_profile).where(_Task.task_id == UUID(session_id))
|
||||
).scalar_one_or_none() or None
|
||||
if not model:
|
||||
model = cfg["default_model"]
|
||||
|
||||
caps = ModelCapabilities.load(model, ROOT / cfg["models_dir"])
|
||||
llm = LLM(caps)
|
||||
|
||||
|
|
@ -283,7 +297,7 @@ def build_agent(
|
|||
tools[rp.name] = rp
|
||||
|
||||
sink = ConsoleEventSink(console, token_counter=lambda: llm.token_counter.total) if console else None
|
||||
agent = AgentLoop(llm, tools, session, caps, sink=sink)
|
||||
agent = AgentLoop(llm, tools, session, caps, user_id=uid, sink=sink)
|
||||
return agent, session, sid, task_state, working_dir_path
|
||||
|
||||
|
||||
|
|
|
|||
|
|
@ -13,6 +13,7 @@ class ModelCapabilities:
|
|||
model_id: str = ""
|
||||
family: str = ""
|
||||
variant: str = ""
|
||||
display_name: str = "" # UI 展示用,如 "DeepSeek V4 Flash";空时前端 fallback 拼 family.variant
|
||||
|
||||
# 上下文
|
||||
max_context: int = 128_000
|
||||
|
|
|
|||
23
core/loop.py
23
core/loop.py
|
|
@ -9,9 +9,12 @@ import json
|
|||
import time
|
||||
from typing import Any, Callable, Dict, Optional, Tuple
|
||||
|
||||
from uuid import UUID
|
||||
|
||||
from .capabilities import ModelCapabilities
|
||||
from .llm import LLM
|
||||
from .session import Session
|
||||
from .storage import record_chat_usage
|
||||
|
||||
|
||||
_CANCELLED_TOOL_PLACEHOLDER = "[cancelled by user]"
|
||||
|
|
@ -37,6 +40,7 @@ class AgentLoop:
|
|||
tools: Dict[str, Any],
|
||||
session: Session,
|
||||
capabilities: ModelCapabilities,
|
||||
user_id: UUID,
|
||||
sink: Optional[Any] = None,
|
||||
max_iterations: Optional[int] = None,
|
||||
cancel_check: Optional[Callable[[], bool]] = None,
|
||||
|
|
@ -45,6 +49,7 @@ class AgentLoop:
|
|||
self.tools = tools
|
||||
self.session = session
|
||||
self.caps = capabilities
|
||||
self.user_id = user_id # usage_events 写入时按 user 维度聚合
|
||||
self.max_iterations = max_iterations or capabilities.max_iterations
|
||||
self.sink = sink
|
||||
# 协作式 cancel:web 层注入 `lambda: broker.is_cancelled(run_id)`;
|
||||
|
|
@ -87,9 +92,25 @@ class AgentLoop:
|
|||
)
|
||||
elapsed = time.monotonic() - start
|
||||
msg = response.choices[0].message
|
||||
self.session.append(msg)
|
||||
asst_msg_id = self.session.append(msg)
|
||||
|
||||
pt, ct = _extract_usage(getattr(response, "usage", None))
|
||||
# 记账(0006):一行 usage_event + 回填 messages.tokens_in/out + model_profile。
|
||||
# 任何失败都吞掉(litellm cost map miss / DB 异常),不阻塞主 loop;
|
||||
# message 仍在 session/DB 里,后续重启不影响。
|
||||
model_profile = f"{self.caps.family}.{self.caps.variant}"
|
||||
try:
|
||||
record_chat_usage(
|
||||
task_id=self.session.task_id,
|
||||
user_id=self.user_id,
|
||||
message_id=asst_msg_id,
|
||||
model_profile=model_profile,
|
||||
prompt_tokens=pt,
|
||||
completion_tokens=ct,
|
||||
response=response,
|
||||
)
|
||||
except Exception as e:
|
||||
self._emit({"type": "warn", "msg": f"record_usage failed: {type(e).__name__}: {e}"})
|
||||
self._emit({
|
||||
"type": "llm_end",
|
||||
"prompt_tokens": pt,
|
||||
|
|
|
|||
|
|
@ -69,24 +69,31 @@ class Session:
|
|||
if system_prompt:
|
||||
self.messages.append({"role": "system", "content": system_prompt})
|
||||
|
||||
def append(self, msg: Any) -> None:
|
||||
"""追加消息;非 system 落 DB,system 仅内存。
|
||||
def append(self, msg: Any) -> Optional[UUID]:
|
||||
"""追加消息;非 system 落 DB,system 仅内存。返回新落库行的 message_id。
|
||||
|
||||
前置条件:tasks 行已由 web 入口(`POST /v1/tasks` → `ensure_local_task_row`)写入;
|
||||
Session 不再做 idempotent ensure(无 user 上下文,且 task 必先在,多余)。
|
||||
|
||||
返回值:非 system → 新 row 的 message_id(供 loop 给 usage_events 关联用);
|
||||
system 消息不入库,返 None。旧调用方忽略返回值不影响行为。
|
||||
"""
|
||||
msg_dict = _to_dict(msg)
|
||||
self.messages.append(msg_dict)
|
||||
if msg_dict.get("role") == "system":
|
||||
return
|
||||
return None
|
||||
|
||||
with session_scope() as s:
|
||||
s.add(Message(
|
||||
row = Message(
|
||||
task_id=self.task_id,
|
||||
idx=self._db_idx,
|
||||
payload=msg_dict,
|
||||
))
|
||||
)
|
||||
s.add(row)
|
||||
s.flush() # 触发 INSERT 拿到 server-default 生成的 message_id
|
||||
msg_id = row.message_id
|
||||
self._db_idx += 1
|
||||
return msg_id
|
||||
|
||||
def reset(self, keep_system: bool = True) -> None:
|
||||
"""清空消息。keep_system 仅影响内存(system 本来就不在 DB)。"""
|
||||
|
|
|
|||
|
|
@ -12,6 +12,7 @@ from .engine import (
|
|||
get_engine,
|
||||
session_scope,
|
||||
)
|
||||
from .usage import record_chat_usage
|
||||
from .utils import (
|
||||
NoSubtaskError,
|
||||
check_no_subtask,
|
||||
|
|
@ -27,6 +28,7 @@ __all__ = [
|
|||
"ensure_local_task_row",
|
||||
"get_engine",
|
||||
"get_task",
|
||||
"record_chat_usage",
|
||||
"session_scope",
|
||||
"update_task",
|
||||
"upsert_task",
|
||||
|
|
|
|||
|
|
@ -1,12 +1,16 @@
|
|||
"""SQLAlchemy 2.x ORM models,对应 DESIGN.md §7.4 schema。
|
||||
|
||||
3 张表:users / tasks / messages。
|
||||
4 张表:users / tasks / messages / usage_events。
|
||||
- users 行在 web 入口按需 INSERT(`/v1/auth/login_password` 实际创行 / `/v1/auth/login`
|
||||
platform_key 入口 ensure_user_row);email UNIQUE(0005)给 login lookup 用,
|
||||
password_hash 是 bcrypt(`bcrypt.hashpw`),只在邮箱密码登录时有值
|
||||
- messages.payload 用 jsonb,GIN 索引在 migration 里建
|
||||
- messages.payload 用 jsonb,GIN 索引在 migration 里建;messages.model_profile
|
||||
(0006)只在 assistant 行有值,标注产生该条 message 的模型
|
||||
- run 状态承载在 tasks.run_status / run_error 两列(0004 合并 runs 表);
|
||||
原 runs / usage_events 表 0004 删 — 详 DESIGN §7.4 取舍 / PROGRESS 05-18
|
||||
原 runs / 旧 usage_events 表 0004 删 — 详 DESIGN §7.4 取舍 / PROGRESS 05-18
|
||||
- usage_events(0006 v2 形态):per-event 多态用量行,kind=chat/image/video/...,
|
||||
units 列 JSONB(chat 装 tokens_in/out,image 装 count/size 等)。用户级聚合
|
||||
走 (user_id, created_at) 复合索引,JOIN-free。详 DESIGN §7.4 / PROGRESS 05-19
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
|
|
@ -90,6 +94,44 @@ class Message(Base):
|
|||
payload: Mapped[dict[str, Any]] = mapped_column(JSONB, nullable=False)
|
||||
tokens_in: Mapped[Optional[int]] = mapped_column(Integer, nullable=True)
|
||||
tokens_out: Mapped[Optional[int]] = mapped_column(Integer, nullable=True)
|
||||
# 0006:产生该 message 的模型(只在 assistant 行有值;user/tool/system 为 NULL)。
|
||||
# 跟 usage_events.model_profile 写入一致,JOIN-free 时按 message 直查也能拿到。
|
||||
model_profile: Mapped[Optional[str]] = mapped_column(Text, nullable=True)
|
||||
created_at: Mapped[datetime] = mapped_column(
|
||||
DateTime(timezone=True), server_default=func.now(), nullable=False
|
||||
)
|
||||
|
||||
|
||||
class UsageEvent(Base):
|
||||
"""per-event 用量记账(0006 v2 形态)。
|
||||
|
||||
一行 = 一次产生成本的调用。chat 类型由 loop 在 assistant message 入库后写入;
|
||||
未来的媒体工具(image/video/audio)在 tool execute 完后由 loop 顺手记账。
|
||||
units 列 polymorphic JSONB —— chat: {"tokens_in": N, "tokens_out": M};
|
||||
image: {"count": K, "size": "1024x1024"};按 kind 约定。
|
||||
|
||||
按 user 聚合的统计 query 走 (user_id, created_at) 索引,不 JOIN tasks 表。
|
||||
"""
|
||||
__tablename__ = "usage_events"
|
||||
|
||||
event_id: Mapped[UUID] = mapped_column(PG_UUID(as_uuid=True), primary_key=True, default=uuid4)
|
||||
user_id: Mapped[UUID] = mapped_column(
|
||||
PG_UUID(as_uuid=True), ForeignKey("users.user_id"), nullable=False
|
||||
)
|
||||
task_id: Mapped[UUID] = mapped_column(
|
||||
PG_UUID(as_uuid=True),
|
||||
ForeignKey("tasks.task_id", ondelete="CASCADE"),
|
||||
nullable=False,
|
||||
)
|
||||
message_id: Mapped[Optional[UUID]] = mapped_column(
|
||||
PG_UUID(as_uuid=True),
|
||||
ForeignKey("messages.message_id", ondelete="SET NULL"),
|
||||
nullable=True,
|
||||
)
|
||||
kind: Mapped[str] = mapped_column(Text, nullable=False) # chat / image / video / audio / ...
|
||||
model_profile: Mapped[str] = mapped_column(Text, nullable=False) # deepseek_v4.pro / dall-e-3 / ...
|
||||
units: Mapped[dict[str, Any]] = mapped_column(JSONB, nullable=False)
|
||||
cost_usd: Mapped[Decimal] = mapped_column(Numeric(12, 6), nullable=False, default=0)
|
||||
created_at: Mapped[datetime] = mapped_column(
|
||||
DateTime(timezone=True), server_default=func.now(), nullable=False
|
||||
)
|
||||
|
|
|
|||
|
|
@ -0,0 +1,77 @@
|
|||
"""用量记账(0006):一次产生成本的调用 = 一行 usage_events + 双写 messages 列。
|
||||
|
||||
chat 类型的入口由 loop.py 在 assistant message 入库后调用;未来的媒体工具
|
||||
(image/video/audio)在 tool execute 后由 loop 顺手记账。
|
||||
|
||||
成本计算依赖 litellm 的 cost map(litellm.cost_calculator.completion_cost)。
|
||||
未知 model 或 map 缺失时 cost=0(不阻塞主流程),emit warn 给 sink。
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
from decimal import Decimal
|
||||
from typing import Any, Optional
|
||||
from uuid import UUID
|
||||
|
||||
from sqlalchemy import update
|
||||
|
||||
from .engine import session_scope
|
||||
from .models import Message, UsageEvent
|
||||
|
||||
|
||||
def _safe_chat_cost(response: Any) -> Decimal:
|
||||
"""litellm.completion_cost(response) 包一层:任何异常都吞掉返 0。
|
||||
|
||||
未知 model / cost map 没收录 / response 结构变都不影响主流程 —— usage_events
|
||||
仍写入,只是 cost_usd=0,后续人工补算 OK。
|
||||
"""
|
||||
try:
|
||||
from litellm import completion_cost # type: ignore[import-not-found]
|
||||
cost = completion_cost(completion_response=response)
|
||||
if cost is None:
|
||||
return Decimal("0")
|
||||
return Decimal(str(cost))
|
||||
except Exception:
|
||||
return Decimal("0")
|
||||
|
||||
|
||||
def record_chat_usage(
|
||||
*,
|
||||
task_id: UUID,
|
||||
user_id: UUID,
|
||||
message_id: Optional[UUID],
|
||||
model_profile: str,
|
||||
prompt_tokens: int,
|
||||
completion_tokens: int,
|
||||
response: Any = None,
|
||||
) -> Decimal:
|
||||
"""记一次 chat 调用:写 usage_events 行 + 回填 messages.model_profile/tokens_in/out。
|
||||
|
||||
`message_id` 来自 `Session.append` 的返回值;若为 None(系统消息 / 旧路径未拿到)
|
||||
则 usage_events 仍写但 message_id=NULL,messages 列不回填。
|
||||
`model_profile` 形如 `"deepseek_v4.pro"`(family.variant)。
|
||||
返回算出的 cost_usd(已落库),调用方可用作 SSE 显示。
|
||||
"""
|
||||
cost = _safe_chat_cost(response)
|
||||
units = {"tokens_in": int(prompt_tokens), "tokens_out": int(completion_tokens)}
|
||||
|
||||
with session_scope() as s:
|
||||
s.add(UsageEvent(
|
||||
user_id=user_id,
|
||||
task_id=task_id,
|
||||
message_id=message_id,
|
||||
kind="chat",
|
||||
model_profile=model_profile,
|
||||
units=units,
|
||||
cost_usd=cost,
|
||||
))
|
||||
if message_id is not None:
|
||||
s.execute(
|
||||
update(Message)
|
||||
.where(Message.message_id == message_id)
|
||||
.values(
|
||||
tokens_in=int(prompt_tokens),
|
||||
tokens_out=int(completion_tokens),
|
||||
model_profile=model_profile,
|
||||
)
|
||||
)
|
||||
return cost
|
||||
|
|
@ -0,0 +1,62 @@
|
|||
"""usage_events 表(v2 形态)+ messages.model_profile 列。
|
||||
|
||||
Revision ID: 0006
|
||||
Revises: 0005
|
||||
Create Date: 2026-05-19
|
||||
|
||||
模型切换 + 用量统计(grill 共识):
|
||||
- task 默认模型走 tasks.model_profile(0001 起就存,本次开始真用)
|
||||
- 每条 assistant message 标注实际用的模型 → messages.model_profile(本次加列)
|
||||
- 用量按 message 记成一行 usage_event(JSONB polymorphic units),前期只 chat kind;
|
||||
未来扩 image/video/audio 不动 schema。0004 删掉的 v1 形态(id/user_id/task_id/run_id/
|
||||
kind/value/ts)字段不够,本次按 grill Q6 设计重建一张 v2 形态。
|
||||
- task_id ondelete=CASCADE(简单,跟 messages 一致);message_id ondelete=SET NULL
|
||||
(单条 message 不会主动删,留作未来防御);user_id 仍是 NOT NULL —— 计费 / 限额查询
|
||||
按 user 维度直出,不靠 JOIN。
|
||||
"""
|
||||
from typing import Sequence, Union
|
||||
|
||||
from alembic import op
|
||||
import sqlalchemy as sa
|
||||
from sqlalchemy.dialects import postgresql
|
||||
|
||||
revision: str = "0006"
|
||||
down_revision: Union[str, None] = "0005"
|
||||
branch_labels: Union[str, Sequence[str], None] = None
|
||||
depends_on: Union[str, Sequence[str], None] = None
|
||||
|
||||
|
||||
def upgrade() -> None:
|
||||
op.add_column(
|
||||
"messages",
|
||||
sa.Column("model_profile", sa.Text(), nullable=True),
|
||||
)
|
||||
|
||||
op.create_table(
|
||||
"usage_events",
|
||||
sa.Column("event_id", postgresql.UUID(as_uuid=True), primary_key=True,
|
||||
server_default=sa.text("gen_random_uuid()")),
|
||||
sa.Column("user_id", postgresql.UUID(as_uuid=True),
|
||||
sa.ForeignKey("users.user_id"), nullable=False),
|
||||
sa.Column("task_id", postgresql.UUID(as_uuid=True),
|
||||
sa.ForeignKey("tasks.task_id", ondelete="CASCADE"), nullable=False),
|
||||
sa.Column("message_id", postgresql.UUID(as_uuid=True),
|
||||
sa.ForeignKey("messages.message_id", ondelete="SET NULL"), nullable=True),
|
||||
sa.Column("kind", sa.Text(), nullable=False),
|
||||
sa.Column("model_profile", sa.Text(), nullable=False),
|
||||
sa.Column("units", postgresql.JSONB(), nullable=False),
|
||||
sa.Column("cost_usd", sa.Numeric(12, 6), nullable=False, server_default="0"),
|
||||
sa.Column("created_at", sa.DateTime(timezone=True),
|
||||
server_default=sa.text("now()"), nullable=False),
|
||||
)
|
||||
op.create_index("ix_usage_user_created", "usage_events", ["user_id", "created_at"])
|
||||
op.create_index("ix_usage_task", "usage_events", ["task_id"])
|
||||
op.create_index("ix_usage_model_created", "usage_events", ["model_profile", "created_at"])
|
||||
|
||||
|
||||
def downgrade() -> None:
|
||||
op.drop_index("ix_usage_model_created", table_name="usage_events")
|
||||
op.drop_index("ix_usage_task", table_name="usage_events")
|
||||
op.drop_index("ix_usage_user_created", table_name="usage_events")
|
||||
op.drop_table("usage_events")
|
||||
op.drop_column("messages", "model_profile")
|
||||
72
web/app.py
72
web/app.py
|
|
@ -233,6 +233,26 @@ def _sse_event(event_type: str, payload: dict) -> bytes:
|
|||
return f"event: {event_type}\ndata: {body}\n\n".encode("utf-8")
|
||||
|
||||
|
||||
def _resolve_model_profile(profile: str) -> tuple[str, str]:
|
||||
"""校验 model_profile 并返回 (profile, model_id)。
|
||||
|
||||
传空 → cfg["default_model"]。profile 走 ModelCapabilities.load:
|
||||
格式或文件错误一律 400。返 (profile_str, caps.model_id) —— 调 ensure_local_task_row
|
||||
时 model_profile / model 两列一起填,保持现有 schema 双列约定。
|
||||
"""
|
||||
from core.agent_builder import load_config
|
||||
from core.capabilities import ModelCapabilities
|
||||
from core.paths import ROOT
|
||||
|
||||
cfg = load_config()
|
||||
name = (profile or "").strip() or cfg["default_model"]
|
||||
try:
|
||||
caps = ModelCapabilities.load(name, ROOT / cfg["models_dir"])
|
||||
except (FileNotFoundError, ValueError) as e:
|
||||
raise HTTPException(400, f"invalid model_profile {name!r}: {e}")
|
||||
return name, caps.model_id
|
||||
|
||||
|
||||
# ────────────────────── Pydantic 请求体 ──────────────────────
|
||||
|
||||
class TaskCreateRequest(BaseModel):
|
||||
|
|
@ -240,6 +260,7 @@ class TaskCreateRequest(BaseModel):
|
|||
working_dir: str = "" # 工作目录名(可选,留空 → 用 name 作目录名)
|
||||
description: str = ""
|
||||
skill: str = ""
|
||||
model_profile: str = "" # `family.variant`,留空 → cfg["default_model"];必须存在于 config/models/
|
||||
|
||||
|
||||
class TaskPatchRequest(BaseModel):
|
||||
|
|
@ -247,6 +268,7 @@ class TaskPatchRequest(BaseModel):
|
|||
description: Optional[str] = None
|
||||
name: Optional[str] = None
|
||||
skill: Optional[str] = None
|
||||
model_profile: Optional[str] = None # 切模型(c 模式 task 层 / A 粒度 — 下条 send 生效)
|
||||
|
||||
|
||||
class MessageRequest(BaseModel):
|
||||
|
|
@ -339,6 +361,45 @@ def create_app() -> FastAPI:
|
|||
def healthz():
|
||||
return {"status": "ok"}
|
||||
|
||||
@app.get("/v1/models", tags=["misc"])
|
||||
def list_models(user_id: UUID = Depends(require_user)):
|
||||
"""列出所有可用 LLM 模型(扫 config/models/*.yaml)。
|
||||
|
||||
前端顶栏 / 新建对话框的模型下拉拉这个。is_default 标记 cfg["default_model"]
|
||||
命中项。开发期不缓存,每次扫一遍(几个文件 IO);改 YAML 立即生效。
|
||||
"""
|
||||
from core.agent_builder import load_config
|
||||
from core.capabilities import ModelCapabilities
|
||||
from core.paths import ROOT
|
||||
import yaml as _yaml
|
||||
cfg = load_config()
|
||||
default = cfg["default_model"]
|
||||
models_dir = ROOT / cfg["models_dir"]
|
||||
|
||||
out: list[dict] = []
|
||||
if models_dir.is_dir():
|
||||
for path in sorted(models_dir.glob("*.yaml")):
|
||||
try:
|
||||
data = _yaml.safe_load(path.read_text(encoding="utf-8")) or {}
|
||||
except Exception:
|
||||
continue
|
||||
family = data.get("family") or path.stem
|
||||
for variant in (data.get("variants") or {}).keys():
|
||||
profile = f"{family}.{variant}"
|
||||
try:
|
||||
caps = ModelCapabilities.load(profile, models_dir)
|
||||
except (ValueError, FileNotFoundError):
|
||||
continue
|
||||
out.append({
|
||||
"profile": profile,
|
||||
"display_name": caps.display_name or profile,
|
||||
"family": caps.family,
|
||||
"variant": caps.variant,
|
||||
"thinking_mode": caps.thinking_mode,
|
||||
"is_default": profile == default,
|
||||
})
|
||||
return {"models": out}
|
||||
|
||||
# ───────────── Auth ─────────────
|
||||
|
||||
@app.post("/v1/auth/login", tags=["auth"])
|
||||
|
|
@ -426,9 +487,11 @@ def create_app() -> FastAPI:
|
|||
# 工作目录立刻建出(同 working_dir 多 task 共享,exist_ok=True)
|
||||
fs_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
profile, model_id = _resolve_model_profile(body.model_profile)
|
||||
ensure_local_task_row(
|
||||
task_id=tid, name=name, working_dir=fs_dir_db, skill=skill,
|
||||
description=description, user_id=user_id,
|
||||
model=model_id, model_profile=profile,
|
||||
)
|
||||
with session_scope() as s:
|
||||
row = s.execute(select(Task).where(Task.task_id == tid)).scalar_one()
|
||||
|
|
@ -631,6 +694,12 @@ def create_app() -> FastAPI:
|
|||
updates["name"] = validate_task_name(body.name)
|
||||
except InvalidTaskName as e:
|
||||
raise HTTPException(400, f"name 不合法: {e}")
|
||||
if body.model_profile is not None:
|
||||
# 切模型:校验后双列同更(profile + model_id)。下条 send 才生效 — 当前
|
||||
# in-flight run 不受影响(build_agent resume 时下次重读)。
|
||||
profile, model_id = _resolve_model_profile(body.model_profile)
|
||||
updates["model_profile"] = profile
|
||||
updates["model"] = model_id
|
||||
if not updates:
|
||||
raise HTTPException(400, "no fields to update")
|
||||
with session_scope() as s:
|
||||
|
|
@ -668,7 +737,7 @@ def create_app() -> FastAPI:
|
|||
rows = s.execute(
|
||||
select(
|
||||
Message.idx, Message.payload, Message.tokens_in,
|
||||
Message.tokens_out, Message.created_at,
|
||||
Message.tokens_out, Message.model_profile, Message.created_at,
|
||||
).where(Message.task_id == tid).order_by(Message.idx)
|
||||
).all()
|
||||
return {
|
||||
|
|
@ -678,6 +747,7 @@ def create_app() -> FastAPI:
|
|||
"payload": dict(r.payload),
|
||||
"tokens_in": r.tokens_in,
|
||||
"tokens_out": r.tokens_out,
|
||||
"model_profile": r.model_profile, # 0006:assistant 行非空,标产生该 msg 的模型
|
||||
"created_at": _iso(r.created_at),
|
||||
}
|
||||
for r in rows
|
||||
|
|
|
|||
|
|
@ -491,6 +491,8 @@
|
|||
<select id="nt-skill">
|
||||
<option value="">(默认 · 不限定)</option>
|
||||
</select>
|
||||
<label for="nt-model">模型</label>
|
||||
<select id="nt-model"></select>
|
||||
<div class="err" id="nt-err"></div>
|
||||
<div class="actions">
|
||||
<button id="nt-cancel">取消</button>
|
||||
|
|
@ -530,6 +532,8 @@ const state = {
|
|||
taskPage: 1,
|
||||
taskPageSize: 20,
|
||||
taskTotal: 0,
|
||||
// 模型清单(GET /v1/models 一次缓存):新建对话框 + 顶栏切换下拉 + 历史小标显示名都用
|
||||
models: [],
|
||||
};
|
||||
|
||||
// ───── helpers ─────
|
||||
|
|
@ -748,6 +752,16 @@ function enterApp() {
|
|||
$("hd-who").title = state.userId;
|
||||
loadTaskList();
|
||||
loadFiles(); // 文件面板与 task 解耦 — 启动即拉 user_root
|
||||
loadModels(); // 模型清单缓存:chat-meta 下拉 + 新建对话框 + 历史小标
|
||||
}
|
||||
|
||||
async function loadModels() {
|
||||
try {
|
||||
const data = await api("GET", "/v1/models");
|
||||
state.models = data.models || [];
|
||||
} catch (e) {
|
||||
state.models = []; // 静默兜底:无模型清单时下拉不显示,不挡正常流程
|
||||
}
|
||||
}
|
||||
|
||||
async function loadTaskList() {
|
||||
|
|
@ -924,8 +938,11 @@ function renderChatMeta() {
|
|||
<span class="tid">${t.task_id.slice(0, 8)}</span>
|
||||
${t.description ? `<span class="muted">${escapeHtml(t.description)}</span>` : ""}
|
||||
<span class="spacer"></span>
|
||||
${renderModelDropdown(t)}
|
||||
<span class="muted small">${t.n_messages || 0} 条 · ${t.tokens || 0} tok</span>
|
||||
`;
|
||||
const sel = $("chat-model-sel");
|
||||
if (sel) sel.onchange = onChangeModel;
|
||||
const active = t.status === "active";
|
||||
$("chat-form").style.display = active ? "flex" : "none";
|
||||
$("btn-done").disabled = !active;
|
||||
|
|
@ -934,6 +951,35 @@ function renderChatMeta() {
|
|||
$("btn-export").disabled = (t.n_messages || 0) === 0;
|
||||
}
|
||||
|
||||
function renderModelDropdown(t) {
|
||||
// 模型清单未加载好(或为空)时不渲染下拉,但 task 仍可正常用(后端走 task.model_profile)
|
||||
if (!state.models || state.models.length === 0) return "";
|
||||
const cur = t.model_profile || "";
|
||||
const opts = state.models.map(m =>
|
||||
`<option value="${escapeHtml(m.profile)}" ${m.profile === cur ? "selected" : ""}>${escapeHtml(m.display_name)}</option>`
|
||||
).join("");
|
||||
return `<span class="muted small" style="display:inline-flex;align-items:center;gap:4px;">模型 <select id="chat-model-sel" class="small" style="width:auto;padding:1px 4px;font-size:12px;" title="切换 task 模型(下条消息生效)">${opts}</select></span>`;
|
||||
}
|
||||
|
||||
async function onChangeModel(ev) {
|
||||
const sel = ev.target;
|
||||
const newProfile = sel.value;
|
||||
const t = state.taskMeta;
|
||||
if (!t || !newProfile || newProfile === t.model_profile) return;
|
||||
const oldProfile = t.model_profile || "";
|
||||
try {
|
||||
const updated = await api("PATCH", `/v1/tasks/${t.task_id}`, { model_profile: newProfile });
|
||||
state.taskMeta = updated;
|
||||
const running = updated.run_status === "running" || updated.run_status === "cancelling";
|
||||
$("chat-hint").textContent = running
|
||||
? `已切到 ${newProfile} · 当前 run 跑完后生效`
|
||||
: `已切到 ${newProfile}`;
|
||||
} catch (e) {
|
||||
sel.value = oldProfile; // PATCH 失败 UI 回滚
|
||||
$("chat-hint").textContent = `切换失败:${e.message}`;
|
||||
}
|
||||
}
|
||||
|
||||
async function loadMessages() {
|
||||
const data = await api("GET", `/v1/tasks/${state.taskId}/messages`);
|
||||
renderMessages(data.messages);
|
||||
|
|
@ -946,10 +992,22 @@ function renderMessages(msgs) {
|
|||
wrap.innerHTML = `<div class="empty">(暂无消息 · 在下方输入开始对话)</div>`;
|
||||
return;
|
||||
}
|
||||
// 模型切换点小标:assistant 行的 model_profile 与上一个 assistant 不同就插一行分隔
|
||||
// (含首条);避免每条都标制造噪声。空 model_profile(历史旧数据)不画。
|
||||
let lastAsstModel = null;
|
||||
for (const m of msgs) {
|
||||
const p = m.payload || {};
|
||||
const role = p.role || "?";
|
||||
if (role === "system") continue; // 不显示 system
|
||||
if (role === "assistant" && m.model_profile && m.model_profile !== lastAsstModel) {
|
||||
const dn = (state.models.find(x => x.profile === m.model_profile) || {}).display_name || m.model_profile;
|
||||
const sep = document.createElement("div");
|
||||
sep.className = "model-switch muted";
|
||||
sep.style.cssText = "margin:8px 0;text-align:center;font-size:11px;letter-spacing:0.5px;";
|
||||
sep.textContent = `── ${dn} ──`;
|
||||
wrap.appendChild(sep);
|
||||
lastAsstModel = m.model_profile;
|
||||
}
|
||||
if (role === "tool") {
|
||||
// 嵌进上一个 assistant 的 tool_call(简化:直接独立显示)
|
||||
const card = document.createElement("div");
|
||||
|
|
@ -1650,19 +1708,33 @@ $("hd-new").onclick = async () => {
|
|||
$("nt-err").textContent = "";
|
||||
$("nt-wd-hint").textContent = "";
|
||||
$("new-task-modal").classList.add("show");
|
||||
await Promise.all([loadFolderSuggestions(), loadSkillOptions()]);
|
||||
await Promise.all([loadFolderSuggestions(), loadSkillOptions(), loadModels()]);
|
||||
populateModelSelect();
|
||||
$("nt-name").focus();
|
||||
};
|
||||
function populateModelSelect() {
|
||||
const sel = $("nt-model");
|
||||
const models = state.models || [];
|
||||
if (models.length === 0) {
|
||||
sel.innerHTML = `<option value="">(默认)</option>`;
|
||||
return;
|
||||
}
|
||||
sel.innerHTML = models.map(m =>
|
||||
`<option value="${escapeHtml(m.profile)}" ${m.is_default ? "selected" : ""}>${escapeHtml(m.display_name)}</option>`
|
||||
).join("");
|
||||
}
|
||||
$("nt-cancel").onclick = () => $("new-task-modal").classList.remove("show");
|
||||
$("nt-go").onclick = async () => {
|
||||
const name = $("nt-name").value.trim();
|
||||
const working_dir = $("nt-wd").value.trim();
|
||||
const desc = $("nt-desc").value.trim();
|
||||
const skill = $("nt-skill").value;
|
||||
const model_profile = $("nt-model").value;
|
||||
$("nt-err").textContent = "";
|
||||
if (!name) { $("nt-err").textContent = "任务名为必填项"; return; }
|
||||
try {
|
||||
const t = await api("POST", "/v1/tasks", { name, working_dir, description: desc, skill });
|
||||
const t = await api("POST", "/v1/tasks",
|
||||
{ name, working_dir, description: desc, skill, model_profile });
|
||||
$("new-task-modal").classList.remove("show");
|
||||
await loadTaskList();
|
||||
selectTask(t.task_id);
|
||||
|
|
|
|||
Loading…
Reference in New Issue