Merged
16 changes: 9 additions & 7 deletions doc/summary.md
@@ -90,8 +90,8 @@ src/
 ├── recovery.ts # LLM error classification and recovery decisions: backoff/compact/continue/fail
 ├── terminal.ts # Terminal I/O wrapper: shared readline (used by both the REPL and permission prompts)
 ├── debug-e2e.ts # End-to-end debug script (verifies Skill + TODO + SubAgent collaboration)
-├── foundation-models.test.ts # Profile registry, matching, and fallback tests (14 test cases)
-├── runtime-policy.test.ts # Policy resolution, env override, and invalid-value error tests (19 test cases)
+├── foundation-models.test.ts # Profile registry, matching, and fallback tests (17 test cases)
+├── runtime-policy.test.ts # Policy resolution, env override, and invalid-value error tests (20 test cases)
 ├── runtime-policy-store.test.ts # Override merge, reset, and snapshot tests (15 test cases)
 ├── context-budget.test.ts # Budget allocation formula, sum constraint, and override trimming tests (9 test cases)
 ├── llm-adapter.test.ts # Adapter request building, reasoning replay, and streaming aggregation tests (15 test cases)
@@ -437,7 +437,7 @@ skills/
 ### Foundation Model Profile (`foundation-models.ts`)

 - **Capability-driven, not model-name-driven**: `agent.ts` contains no model-specific branches such as `kimi`/`deepseek`; the business layer reads only the policy fields in `RuntimePolicy`
-- **Profile Registry**: includes profiles such as `generic-openai-compatible`, `kimi-k2.6`, `kimi-code`, `deepseek-v4`, `minimax-m2.7`, `minimax-m3`, `mimo-v2.5-pro`, `qwen3.7-max`, and `glm-5.1`
+- **Profile Registry**: includes profiles such as `generic-openai-compatible`, `kimi-k2.6`, `kimi-code`, `deepseek-v4`, `minimax-m2.7`, `minimax-m3`, `mimo-v2.5-pro`, `qwen3.7-max`, `glm-5.2`, and `glm-5.1`
 - **Matching priority**: explicit `LLM_MODEL_PROFILE` > exact model id > prefix > provider default > generic fallback
 - **Hard protocol fields vs. optimization hints**: hard fields such as maxTokensField, thinking requestShape, and reasoning responseFields must stay conservative; optimization hints such as context budget and compression mode may use reasonable defaults
 - **Profile grading**: `verified` / `experimental` / `needs_review`; a stale or high-risk profile produces a warning at startup but does not block it
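
The matching priority listed above can be sketched roughly as follows. This is a minimal illustration, not the implementation in `foundation-models.ts`: the `ProfileSketch` shape, the `isProviderDefault` flag, the provider filter, and the function name `resolveProfileSketch` are all assumptions made for the example; only the profile ids, the GLM-5.2 match lists, and the priority order come from this PR.

```typescript
// Hypothetical, simplified profile shape; the real FoundationModelProfile has many more fields.
interface ProfileSketch {
  id: string;
  provider: string;
  exactModelIds: string[];
  modelIdPrefixes: string[];
  isProviderDefault?: boolean; // assumption: how a provider default might be marked
}

const genericProfile: ProfileSketch = {
  id: "generic-openai-compatible",
  provider: "openai_compatible",
  exactModelIds: [],
  modelIdPrefixes: [],
};

const sketchProfiles: ProfileSketch[] = [
  {
    id: "glm-5.2",
    provider: "zhipuai_cn",
    exactModelIds: ["glm-5.2", "GLM-5.2", "glm5.2", "GLM5.2"],
    modelIdPrefixes: ["glm-5.2", "GLM-5.2"],
    isProviderDefault: true,
  },
];

function resolveProfileSketch(input: {
  provider: string;
  model: string;
  explicitProfileId?: string;
}): ProfileSketch {
  // 1. An explicit profile id wins, and an unknown id is an error rather than a silent fallback.
  if (input.explicitProfileId !== undefined) {
    const explicit = [genericProfile, ...sketchProfiles].find(
      (p) => p.id === input.explicitProfileId,
    );
    if (!explicit) {
      throw new Error(`Unknown model profile "${input.explicitProfileId}"`);
    }
    return explicit;
  }

  // Assumption: only profiles registered for the same provider are candidates.
  const candidates = sketchProfiles.filter((p) => p.provider === input.provider);

  // 2. Exact model id.
  const exact = candidates.find((p) => p.exactModelIds.includes(input.model));
  if (exact) return exact;

  // 3. Model id prefix.
  const byPrefix = candidates.find((p) =>
    p.modelIdPrefixes.some((prefix) => input.model.startsWith(prefix)),
  );
  if (byPrefix) return byPrefix;

  // 4. Provider default, then 5. generic fallback.
  return candidates.find((p) => p.isProviderDefault) ?? genericProfile;
}
```

Under these assumptions, `glm-5.2-long-context` resolves through the prefix step and `glm-custom-alias` through the provider default, which mirrors the behavior the new tests in this PR expect.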
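@@ -508,7 +508,7 @@ skills/

 ### LLM Provider Profile abstraction layer (`llm-providers.ts`)

-- **Centralized profile table**: declares the default endpoint, default model, key environment variables, and capability flags for 4 providers (`openai_compatible`, `minimax_cn`, `kimi_platform_cn`, `kimi_code_cn`)
+- **Centralized profile table**: declares the default endpoint, default model, key environment variables, and capability flags for 5 providers (`openai_compatible`, `minimax_cn`, `kimi_platform_cn`, `kimi_code_cn`, `zhipuai_cn`)
 - **Resolution priority**: `LLM_API_KEY` / `LLM_BASE_URL` / `LLM_MODEL` take precedence over provider defaults, which stays compatible with existing usage
 - **Resolved at startup**: `resolveLLMProviderConfig()` only reads env and makes no network requests; it returns a `ResolvedLLMConfig`
 - **Vendor differences do not leak into the Agent loop**: Agent, SubAgent, and Async Run depend only on `LLMClient.chat()`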
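
The resolution priority above could look roughly like the sketch below. It is an illustration under stated assumptions, not the body of `resolveLLMProviderConfig()`: the types `EnvSketch` and `ProviderDefaultsSketch`, the function `resolveConfigSketch`, and the exact error wording are hypothetical. The `zhipuai_cn` defaults and the key env names are taken from this PR.

```typescript
// Hypothetical types for illustration; the real LLMProviderProfile also carries protocol and capabilities.
type EnvSketch = Record<string, string | undefined>;

interface ProviderDefaultsSketch {
  id: string;
  defaultBaseURL: string;
  defaultModel: string;
  apiKeyEnvNames: string[];
}

// Values copied from the zhipuai_cn profile added in this PR.
const zhipuDefaults: ProviderDefaultsSketch = {
  id: "zhipuai_cn",
  defaultBaseURL: "https://open.bigmodel.cn/api/paas/v4/",
  defaultModel: "glm-5.2",
  apiKeyEnvNames: ["ZHIPUAI_API_KEY", "BIGMODEL_API_KEY"],
};

function resolveConfigSketch(env: EnvSketch, profile: ProviderDefaultsSketch) {
  // LLM_API_KEY is tried first, then the provider-specific key env names in declared order.
  const keyEnvNames = ["LLM_API_KEY", ...profile.apiKeyEnvNames];
  const keyName = keyEnvNames.find((name) => Boolean(env[name]));
  if (keyName === undefined) {
    // The message names the provider and the candidate env names, never a key value.
    throw new Error(
      `Missing API key for provider "${profile.id}"; set one of: ${keyEnvNames.join(", ")}`,
    );
  }
  return {
    provider: profile.id,
    apiKey: env[keyName] as string,
    // Generic LLM_* variables override provider defaults; empty strings fall back to the default.
    baseURL: env["LLM_BASE_URL"] || profile.defaultBaseURL,
    model: env["LLM_MODEL"] || profile.defaultModel,
  };
}
```

The function only reads the `env` object it is given and performs no I/O, which keeps it in line with the "reads env only, no network requests" rule stated above.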
@@ -725,6 +725,8 @@ skills/
 | `KIMI_CODE_API_KEY` | Dedicated key for Kimi Code CN | `sk-kimi-...` |
 | `MOONSHOT_API_KEY` | Dedicated key for Kimi Platform CN | `sk-moonshot-...` |
 | `MINIMAX_CN_API_KEY` | Dedicated key for MiniMax CN | `sk-minimax-...` |
+| `ZHIPUAI_API_KEY` | Dedicated key for ZhipuAI CN | `sk-zhipu-...` |
+| `BIGMODEL_API_KEY` | Fallback key for ZhipuAI CN | `sk-bigmodel-...` |
 | `LOG_LEVEL` | Log level | `info` |
 | `COMPRESS_TOOL_OUTPUT` | Token threshold for immediate compression | `2000` |
 | `COMPRESS_DECAY_THRESHOLD` | Round threshold for decay compression | `3` |
@@ -789,11 +791,11 @@ skills/
 | `src/session-events.test.ts` | 5 | drain clears, peek does not clear, order is preserved |
 | `src/transcript.test.ts` | 6 | message classification, event sequence, historySequence, timing metadata, search |
 | `src/cache-debug.test.ts` | 7 | inspect change detection, system prompt invariance, formatCacheDebugLog |
-| `src/llm-providers.test.ts` | 26 | provider resolution, defaults, override priority, error messages, capability flags |
+| `src/llm-providers.test.ts` | 31 | provider resolution, defaults, override priority, error messages, capability flags |
 | `src/config.test.ts` | 5 | loadConfig parsing of provider fields, compression/logLevel defaults, error messages do not leak keys |
 | `src/llm.test.ts` | 10 | non-streaming path, streaming content/tool_calls aggregation, llmLogger calls |
-| `src/foundation-models.test.ts` | 14 | Profile registry, exact/prefix/fallback matching, provider compatibility validation, explicit profile, stale warning |
-| `src/runtime-policy.test.ts` | 19 | Policy defaults, env overrides, invalid-override errors, protocol fallback, compression derivation |
+| `src/foundation-models.test.ts` | 17 | Profile registry, exact/prefix/fallback matching, provider compatibility validation, explicit profile, stale warning |
+| `src/runtime-policy.test.ts` | 20 | Policy defaults, env overrides, invalid-override errors, protocol fallback, compression derivation |
 | `src/context-budget.test.ts` | 9 | budget allocation in three modes, sum constraint, override handling, trimming priority, tiny-budget edge cases |
 | `src/runtime-policy-store.test.ts` | 15 | Override merge, reset, snapshot, invalid-update errors |
 | `src/llm-adapter.test.ts` | 15 | request building, reasoning placeholder, streaming aggregation, max token field, usage parsing |
38 changes: 38 additions & 0 deletions src/foundation-models.test.ts
@@ -74,6 +74,34 @@ describe("resolveFoundationModelProfile", () => {
     expect(result.id).toBe("deepseek-v4");
   });

+  it("matches exact model id for glm-5.2", () => {
+    const result = resolveFoundationModelProfile({
+      provider: "zhipuai_cn",
+      model: "glm-5.2",
+    });
+    expect(result.id).toBe("glm-5.2");
+    expect(result.provider).toBe("zhipuai_cn");
+    expect(result.limits.contextWindowTokens).toBe(1000000);
+    expect(result.limits.effectiveContextBudgetTokens).toBe(750000);
+    expect(result.thinking.defaultMode).toBe("adaptive");
+    expect(result.reasoning.responseFields).toEqual(["reasoning_content"]);
+    expect(result.cache.supported).toBe(false);
+  });
+
+  it("matches GLM-5.2 aliases and prefixes conservatively", () => {
+    const exactAlias = resolveFoundationModelProfile({
+      provider: "zhipuai_cn",
+      model: "GLM5.2",
+    });
+    const prefixAlias = resolveFoundationModelProfile({
+      provider: "zhipuai_cn",
+      model: "glm-5.2-long-context",
+    });
+
+    expect(exactAlias.id).toBe("glm-5.2");
+    expect(prefixAlias.id).toBe("glm-5.2");
+  });
+
   it("uses explicit profile id when provided", () => {
     const result = resolveFoundationModelProfile({
       provider: "kimi_platform_cn",
@@ -83,6 +111,15 @@ describe("resolveFoundationModelProfile", () => {
     expect(result.id).toBe("kimi-k2.6");
   });

+  it("returns zhipuai default profile before generic for unknown zhipu model", () => {
+    const result = resolveFoundationModelProfile({
+      provider: "zhipuai_cn",
+      model: "glm-custom-alias",
+    });
+    expect(result.id).toBe("glm-5.2");
+    expect(result.provider).toBe("zhipuai_cn");
+  });
+
   it("throws for unknown explicit profile id", () => {
     expect(() =>
       resolveFoundationModelProfile({
@@ -147,6 +184,7 @@ describe("getRegisteredModelProfileIds", () => {
     expect(ids).toContain("minimax-m3");
     expect(ids).toContain("mimo-v2.5-pro");
     expect(ids).toContain("qwen3.7-max");
+    expect(ids).toContain("glm-5.2");
     expect(ids).toContain("glm-5.1");
   });
 });
79 changes: 79 additions & 0 deletions src/foundation-models.ts
@@ -781,6 +781,85 @@ const modelProfiles: FoundationModelProfile[] = [
     },
   },

+  // -------------------------------------------------------------------------
+  // GLM-5.2
+  // -------------------------------------------------------------------------
+  {
+    id: "glm-5.2",
+    displayName: "GLM-5.2",
+    provider: "zhipuai_cn",
+    match: {
+      exactModelIds: ["glm-5.2", "GLM-5.2", "glm5.2", "GLM5.2"],
+      modelIdPrefixes: ["glm-5.2", "GLM-5.2"],
+    },
+    protocol: {
+      preferred: "openai-chat-completions",
+      fallbacks: [],
+      implemented: ["openai-chat-completions"],
+    },
+    limits: {
+      contextWindowTokens: 1000000,
+      effectiveContextBudgetTokens: 750000,
+      longContextThresholdTokens: 512000,
+      maxOutputTokens: 65536,
+      maxTokensField: "max_tokens",
+    },
+    thinking: {
+      supported: true,
+      defaultMode: "adaptive",
+      efforts: ["default"],
+      enableForAgenticTasks: true,
+      disableForSimpleChat: true,
+      requestShape: "extra_body_thinking",
+    },
+    reasoning: {
+      returned: true,
+      mustReplayWithToolCalls: false,
+      preserveRawAssistantMessage: true,
+      responseFields: ["reasoning_content"],
+      streamingDeltaFields: ["reasoning_content"],
+    },
+    tools: {
+      supported: true,
+      supportsToolChoiceRequired: false,
+      allowedToolChoiceModes: ["auto", "none"],
+      streamingArguments: true,
+      multimodalToolResults: false,
+    },
+    cache: {
+      supported: false,
+      automatic: false,
+      exposesUsage: false,
+      usageFields: {},
+    },
+    modalities: {
+      text: true,
+      image: false,
+      video: false,
+      audio: false,
+    },
+    optimizationHints: {
+      bestFor: ["coding", "long_horizon_agent", "large_context"],
+      defaultCompressionMode: "long_context",
+      prefersStreaming: true,
+      goodForSubagents: false,
+    },
+    knownQuirks: [
+      "Official materials confirm the 1M context and reasoning_content, but this repo has not yet run a live smoke test",
+      "max output and cache usage fields are not verified in the current materials; use a conservative cap and disable cache telemetry for now",
+      "The concrete thinking effort enum is unverified; non-default overrides such as low/medium/high are not exposed for now",
+    ],
+    documentation: {
+      sourceUrls: [
+        "https://github.com/zai-org/glm-5/blob/main/README.md",
+        "https://github.com/metaglm/zhipuai-sdk-python-v4",
+      ],
+      verifiedAt: "2026-06-26",
+      updateRisk: "high",
+      status: "experimental",
+    },
+  },
+
   // -------------------------------------------------------------------------
   // GLM-5.1
   // -------------------------------------------------------------------------
61 changes: 57 additions & 4 deletions src/llm-providers.test.ts
@@ -180,6 +180,49 @@ describe("minimax_cn provider", () => {
   });
 });

+// ============================================================
+// zhipuai_cn resolution
+// ============================================================
+
+describe("zhipuai_cn provider", () => {
+  it("resolves successfully when only ZHIPUAI_API_KEY is set", () => {
+    const config = resolveLLMProviderConfig({
+      LLM_PROVIDER: "zhipuai_cn",
+      ZHIPUAI_API_KEY: "sk-zhipu-test",
+    });
+    expect(config.provider).toBe("zhipuai_cn");
+    expect(config.apiKey).toBe("sk-zhipu-test");
+    expect(config.baseURL).toBe("https://open.bigmodel.cn/api/paas/v4/");
+    expect(config.model).toBe("glm-5.2");
+  });
+
+  it("accepts BIGMODEL_API_KEY as a fallback", () => {
+    const config = resolveLLMProviderConfig({
+      LLM_PROVIDER: "zhipuai_cn",
+      BIGMODEL_API_KEY: "sk-bigmodel-fallback",
+    });
+    expect(config.apiKey).toBe("sk-bigmodel-fallback");
+  });
+
+  it("prefers LLM_MODEL over the default GLM-5.2 model", () => {
+    const config = resolveLLMProviderConfig({
+      LLM_PROVIDER: "zhipuai_cn",
+      ZHIPUAI_API_KEY: "sk-zhipu-test",
+      LLM_MODEL: "glm-5.2-proxy",
+    });
+    expect(config.model).toBe("glm-5.2-proxy");
+  });
+
+  it("declares the supportsThinking and prefersStreaming capabilities", () => {
+    const config = resolveLLMProviderConfig({
+      LLM_PROVIDER: "zhipuai_cn",
+      ZHIPUAI_API_KEY: "sk-zhipu-test",
+    });
+    expect(config.capabilities.supportsThinking).toBe(true);
+    expect(config.capabilities.prefersStreaming).toBe(true);
+  });
+});
+
 // ============================================================
 // Heuristic inference
 // ============================================================
@@ -213,6 +256,16 @@ describe("baseURL heuristic inference", () => {
     expect(config.provider).toBe("kimi_platform_cn");
   });

+  it("infers zhipuai_cn from LLM_BASE_URL when LLM_PROVIDER is not set", () => {
+    const config = resolveLLMProviderConfig({
+      LLM_API_KEY: "sk-test",
+      LLM_BASE_URL: "https://open.bigmodel.cn/api/paas/v4/",
+      LLM_MODEL: "glm-5.2",
+    });
+    expect(config.provider).toBe("zhipuai_cn");
+    expect(config.capabilities.supportsThinking).toBe(true);
+  });
+
   it("falls back to openai_compatible for an unmatched baseURL", () => {
     const config = resolveLLMProviderConfig({
       LLM_API_KEY: "sk-test",
@@ -261,15 +314,15 @@ describe("error messages", () => {
   it("includes the provider id and candidate key env names in the missing-apiKey error", () => {
     expect(() =>
       resolveLLMProviderConfig({
-        LLM_PROVIDER: "kimi_code_cn",
+        LLM_PROVIDER: "zhipuai_cn",
       }),
-    ).toThrow('provider "kimi_code_cn"');
+    ).toThrow('provider "zhipuai_cn"');

     expect(() =>
       resolveLLMProviderConfig({
-        LLM_PROVIDER: "kimi_code_cn",
+        LLM_PROVIDER: "zhipuai_cn",
       }),
-    ).toThrow("LLM_API_KEY, KIMI_CODE_API_KEY");
+    ).toThrow("LLM_API_KEY, ZHIPUAI_API_KEY, BIGMODEL_API_KEY");
   });

   it("does not leak existing key values in error messages", () => {
19 changes: 18 additions & 1 deletion src/llm-providers.ts
@@ -23,7 +23,8 @@ export type LLMProviderId =
   | "openai_compatible"
   | "minimax_cn"
   | "kimi_platform_cn"
-  | "kimi_code_cn";
+  | "kimi_code_cn"
+  | "zhipuai_cn";

 /**
  * Provider capability flags
@@ -136,6 +137,22 @@ const providerProfiles: Record<LLMProviderId, LLMProviderProfile> = {
       supportsThinking: false,
     },
   },
+  zhipuai_cn: {
+    id: "zhipuai_cn",
+    displayName: "ZhipuAI CN",
+    protocol: "openai-chat-completions",
+    // The official Zhipu SDK docs use the v4 base path on open.bigmodel.cn.
+    // It serves as a runnable default here; enterprise proxies or self-hosted gateways can still override it via LLM_BASE_URL.
+    defaultBaseURL: "https://open.bigmodel.cn/api/paas/v4/",
+    defaultModel: "glm-5.2",
+    apiKeyEnvNames: ["ZHIPUAI_API_KEY", "BIGMODEL_API_KEY"],
+    capabilities: {
+      supportsTools: true,
+      supportsToolChoiceRequired: false,
+      prefersStreaming: true,
+      supportsThinking: true,
+    },
+  },
 };

 /**
23 changes: 23 additions & 0 deletions src/runtime-policy.test.ts
@@ -73,6 +73,29 @@ describe("resolveRuntimePolicy defaults", () => {
     expect(policy.context.compressionMode).toBe("balanced");
   });

+  it("derives correct defaults for glm-5.2", () => {
+    const profile = resolveFoundationModelProfile({
+      provider: "zhipuai_cn",
+      model: "glm-5.2",
+    });
+    const policy = resolveRuntimePolicy(profile, "glm-5.2");
+
+    expect(policy.modelProfileId).toBe("glm-5.2");
+    expect(policy.context.contextWindowTokens).toBe(1000000);
+    expect(policy.context.effectiveBudgetTokens).toBe(750000);
+    expect(policy.context.longContextThresholdTokens).toBe(512000);
+    expect(policy.context.compressionMode).toBe("long_context");
+    expect(policy.request.prefersStreaming).toBe(true);
+    expect(policy.request.thinkingMode).toBe("adaptive");
+    expect(policy.request.extraBody).toEqual({
+      thinking: { type: "auto" },
+    });
+    expect(policy.reasoning.responseFields).toEqual(["reasoning_content"]);
+    expect(policy.tools.streamingArguments).toBe(true);
+    expect(policy.cache.supported).toBe(false);
+    expect(policy.telemetry.recordCacheTokens).toBe(false);
+  });
+
   it("derives long_context compression with relaxed thresholds", () => {
     const profile = getProfile("deepseek-v4");
     const policy = resolveRuntimePolicy(profile, "deepseek-v4");