@vmoranv vmoranv commented Jan 25, 2026

Implement a Claude Skills-style loading mode for llmtool under the star class: the first round sends only name+desc to reduce context overhead, and the parameter schema (name+param) is returned only after the model decides to call a tool.

Modifications

internal.py: Use a lightweight tool set in the first round and store the full/param tool sets for the second query.

tool_loop_agent_runner.py: After tools are selected, requery with the "param-only" tool set, then execute with the full tool set; use getattr to silence the event type warning.

func_tool_manager.py: Add get_param_only_tool_set to generate a description-free parameter tool set.

tool.py: Allow schema generation to omit the description, supporting "name+params only".

utils.py: Add a note on the two-stage tool loading to the system prompt.
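The schema split behind these changes can be sketched as follows. This is an illustrative sketch using OpenAI-style tool schemas and hypothetical helper names, not the PR's actual API (which lives in func_tool_manager.py and tool.py):

```python
# Illustrative split of a full tool schema into the two stages
# (helper names are hypothetical, not the PR's real functions).

def make_light_schema(tool: dict) -> dict:
    """Stage 1: name + description only, keeping the first request small."""
    fn = tool["function"]
    return {
        "type": "function",
        "function": {"name": fn["name"], "description": fn.get("description", "")},
    }

def make_param_only_schema(tool: dict) -> dict:
    """Stage 2: name + parameters, sent after the model picks the tool."""
    fn = tool["function"]
    return {
        "type": "function",
        "function": {"name": fn["name"], "parameters": fn.get("parameters", {})},
    }

full = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

light = make_light_schema(full)            # no "parameters" key
param_only = make_param_only_schema(full)  # no "description" key
```

The token savings come from round one: with many registered tools, dropping every `parameters` object from the first request shrinks the schema payload substantially.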

  • This is NOT a breaking change.

Screenshots or Test Results

PixPin_2026-01-25_14-41-09

Tool calls tested and work correctly.


Checklist

  • 😊 If there are new features added in the PR, I have discussed them with the authors through issues/emails, etc.
  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Introduce two-stage tool schema loading to reduce initial context size for LLM tool calls while preserving full tool execution behavior.

New Features:

  • Add lazy loading mode for tool schemas where the initial request exposes only tool names and descriptions, with parameter schemas provided after the model chooses tools.
  • Support requerying the LLM with a parameter-only tool set once tools are selected, then executing with a narrowed full tool subset.

Enhancements:

  • Provide helper methods to construct light (name/description only) and parameter-only tool sets from existing tools, respecting tool activation flags.
  • Adjust tool schema generation to optionally omit descriptions and skip lightweight placeholder parameter schemas to support staged exposure across providers.
  • Extend agent runner and internal pipeline stages to manage per-event storage of full, light, and parameter-only tool sets and to swap them into requests as needed.
  • Update system/tool call prompt text to explain the new two-stage tool schema exposure and discourage argument guessing before schemas are seen.
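Putting the bullets above together, the request flow could look roughly like this — a minimal runnable sketch with stand-in names (AstrBot's real runner, request, and response types differ):

```python
# Hedged sketch of the two-stage flow; llm_chat and the dict-shaped
# response are placeholders, not AstrBot's actual interfaces.

def subset_by_name(tool_set, names):
    """Narrow a tool set to the tools the model chose."""
    return [t for t in tool_set if t["function"]["name"] in names]

def two_stage_call(llm_chat, full_set, light_set, param_set, messages):
    # Round 1: expose only names + descriptions.
    resp = llm_chat(messages, tools=light_set)
    chosen = resp.get("tools_call_name", [])
    if not chosen:
        return resp, None  # plain answer, no second round needed

    # Round 2: re-query with parameter schemas for just the chosen tools.
    resp = llm_chat(messages, tools=subset_by_name(param_set, chosen))

    # Execution uses the narrowed *full* subset (descriptions + parameters).
    return resp, subset_by_name(full_set, chosen)
```

A real implementation should also reconcile the tool names returned by the second call with the subset it was given, rather than assuming they match.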

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. area:core The bug / feature is about astrbot's core, backend labels Jan 25, 2026
@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • In _apply_tool_schema_mode, get_llm_tool_manager() is assumed to always return a valid manager; consider guarding against None or unexpected types to avoid hard-to-trace runtime errors when the plugin context is misconfigured.
  • The re-query path in ToolLoopAgentRunner.step always reuses the original llm_resp.tools_call_name when building the param-only tool subset; if the secondary LLM call changes the set of tools or decides not to call tools, this could lead to inconsistent behavior—consider validating or reconciling the tool call names from the re-query response.
  • ToolLoopAgentRunner.step temporarily mutates self.req.func_tool to a subset and then restores it; if the runner is ever used concurrently or the request object is shared across components, it may be safer to pass the subset explicitly into _handle_function_tools instead of mutating shared state.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `_apply_tool_schema_mode`, `get_llm_tool_manager()` is assumed to always return a valid manager; consider guarding against `None` or unexpected types to avoid hard-to-trace runtime errors when the plugin context is misconfigured.
- The re-query path in `ToolLoopAgentRunner.step` always reuses the original `llm_resp.tools_call_name` when building the param-only tool subset; if the secondary LLM call changes the set of tools or decides not to call tools, this could lead to inconsistent behavior—consider validating or reconciling the tool call names from the re-query response.
- `ToolLoopAgentRunner.step` temporarily mutates `self.req.func_tool` to a subset and then restores it; if the runner is ever used concurrently or the request object is shared across components, it may be safer to pass the subset explicitly into `_handle_function_tools` instead of mutating shared state.

## Individual Comments

### Comment 1
<location> `astrbot/core/agent/runners/tool_loop_agent_runner.py:329-330` </location>
<code_context>
+                        exec_tool_set = full_tool_set
+
             tool_call_result_blocks = []
+            original_tool_set = self.req.func_tool
+            if exec_tool_set is not None:
+                self.req.func_tool = exec_tool_set
             async for result in self._handle_function_tools(self.req, llm_resp):
</code_context>

<issue_to_address>
**issue (bug_risk):** func_tool is not restored if _handle_function_tools raises, which can leak the modified tool set.

Because `self.req.func_tool` is only reset on the success path, any exception in `_handle_function_tools` (or that loop) leaves it stuck as `exec_tool_set`, potentially affecting later steps or other uses of this `ProviderRequest`. Please wrap the reassignment in a `try`/`finally` (or a small context manager) so the original tool is always restored, even on errors.
</issue_to_address>
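The suggested fix can be demonstrated in miniature — a runnable sketch where Req and the handler are stand-ins for the runner's real classes:

```python
# Stand-in for ProviderRequest; only the mutated attribute matters here.
class Req:
    def __init__(self, func_tool):
        self.func_tool = func_tool

def handle_with_subset(req, subset, handler):
    """Swap in the subset, and restore the original even if handler raises."""
    original = req.func_tool
    if subset is not None:
        req.func_tool = subset
    try:
        return handler(req)
    finally:
        if subset is not None:
            req.func_tool = original

req = Req(func_tool=["full_set"])
try:
    handle_with_subset(req, ["subset"], lambda r: 1 / 0)  # handler raises
except ZeroDivisionError:
    pass
# req.func_tool is back to ["full_set"] despite the exception
```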

### Comment 2
<location> `astrbot/core/agent/runners/tool_loop_agent_runner.py:303` </location>
<code_context>

         # 如果有工具调用,还需处理工具调用
         if llm_resp.tools_call_name:
+            exec_tool_set = self.req.func_tool
+            event = getattr(self.run_context.context, "event", None)
</code_context>

<issue_to_address>
**issue (complexity):** Consider refactoring the tool execution path to avoid mutating `self.req.func_tool` and to move the nested tool-set/requery logic into a dedicated helper for clearer, linear control flow.

You can keep all functionality while reducing the complexity bump by:

1. **Stop mutating `self.req.func_tool`; pass the tool set explicitly**

Instead of temporarily overwriting `self.req.func_tool`, thread the chosen `ToolSet` into `_handle_function_tools`. This removes non-local side effects and the `original_tool_set` bookkeeping.

```python
# before
tool_call_result_blocks = []
original_tool_set = self.req.func_tool
if exec_tool_set is not None:
    self.req.func_tool = exec_tool_set

async for result in self._handle_function_tools(self.req, llm_resp):
    ...

if exec_tool_set is not None:
    self.req.func_tool = original_tool_set
```

Refactor `_handle_function_tools` to accept an optional `tool_set` parameter (defaulting to `req.func_tool` for existing callers):

```python
# in the class
async def _handle_function_tools(
    self,
    req: ProviderRequest,
    llm_resp: LLMResponse,
    tool_set: ToolSet | None = None,
):
    tool_set = tool_set or req.func_tool
    # existing logic uses `tool_set` instead of `req.func_tool`
    ...
```

Then `step` can call it without mutating `self.req`:

```python
tool_call_result_blocks = []

async for result in self._handle_function_tools(
    self.req,
    llm_resp,
    tool_set=exec_tool_set,
):
    if isinstance(result, list):
        tool_call_result_blocks = result
    elif isinstance(result, MessageChain):
        ...
```

2. **Extract the tool-set / requery logic into a helper with early returns**

The deeply nested block can be pulled into a small helper that returns both the chosen `ToolSet` and a possibly-updated `llm_resp`. This keeps `step` linear and readable.

Current block (simplified):

```python
if llm_resp.tools_call_name:
    exec_tool_set = self.req.func_tool
    event = getattr(self.run_context.context, "event", None)
    if event:
        full_tool_set = event.get_extra("_tool_schema_full_set")
        param_tool_set = event.get_extra("_tool_schema_param_set")
        if isinstance(full_tool_set, ToolSet):
            subset = self._build_tool_subset(
                full_tool_set, llm_resp.tools_call_name
            )
            if subset.tools:
                if isinstance(param_tool_set, ToolSet):
                    param_subset = self._build_tool_subset(
                        param_tool_set, llm_resp.tools_call_name
                    )
                    if param_subset.tools:
                        requery_resp = await self._requery_tool_calls(
                            param_subset, llm_resp.tools_call_name
                        )
                        if requery_resp:
                            llm_resp = requery_resp
                exec_tool_set = subset
            else:
                exec_tool_set = full_tool_set
    ...
```

Refactor into a helper to reduce nesting and clarify responsibilities:

```python
async def _resolve_tool_exec(
    self,
    llm_resp: LLMResponse,
) -> tuple[LLMResponse, ToolSet | None]:
    tool_names = llm_resp.tools_call_name
    if not tool_names:
        return llm_resp, self.req.func_tool

    event = getattr(self.run_context.context, "event", None)
    if not event:
        return llm_resp, self.req.func_tool

    full_tool_set = event.get_extra("_tool_schema_full_set")
    if not isinstance(full_tool_set, ToolSet):
        return llm_resp, self.req.func_tool

    # compute subset from full set
    subset = self._build_tool_subset(full_tool_set, tool_names)
    if not subset.tools:
        # no match, fall back to full set as before
        return llm_resp, full_tool_set

    # optional requery on parameter-only schema
    param_tool_set = event.get_extra("_tool_schema_param_set")
    if isinstance(param_tool_set, ToolSet):
        param_subset = self._build_tool_subset(param_tool_set, tool_names)
        if param_subset.tools:
            requery_resp = await self._requery_tool_calls(param_subset, tool_names)
            if requery_resp:
                llm_resp = requery_resp

    return llm_resp, subset
```

Then `step` becomes simpler and linear:

```python
if llm_resp.tools_call_name:
    llm_resp, exec_tool_set = await self._resolve_tool_exec(llm_resp)

    tool_call_result_blocks = []

    async for result in self._handle_function_tools(
        self.req,
        llm_resp,
        tool_set=exec_tool_set,
    ):
        if isinstance(result, list):
            tool_call_result_blocks = result
        elif isinstance(result, MessageChain):
            ...
```

This refactor:

- Keeps the lazy schema + requery behavior intact.
- Removes hidden mutation of `self.req.func_tool`.
- Flattens nested conditionals via early returns.
- Encapsulates the coupling between `_build_tool_subset` and `_requery_tool_calls` inside `_resolve_tool_exec`, making `step` easier to reason about.
</issue_to_address>



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 187173b078


Comment on lines 319 to 323

```python
requery_resp = await self._requery_tool_calls(
    param_subset, llm_resp.tools_call_name
)
if requery_resp:
    llm_resp = requery_resp
```


P2: Account for requery token usage

When tool schema lazy loading is enabled and a tool call is detected, this code issues a second LLM request via _requery_tool_calls and then replaces llm_resp with requery_resp. The additional request’s LLMResponse.usage is never added to self.stats.token_usage (only the initial response is counted earlier in the step), so runner_stats.token_usage.total and the stored conversation token usage will systematically undercount whenever requery happens. This affects cost accounting and the “trusted_token_usage” passed to context management. Consider adding the requery usage to the stats before continuing.
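The accumulation the comment asks for can be sketched like this — a runnable miniature where Usage and Stats stand in for the runner's real types (the field names are assumptions, not AstrBot's actual API):

```python
from dataclasses import dataclass, field

@dataclass
class Usage:
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total(self) -> int:
        return self.prompt_tokens + self.completion_tokens

@dataclass
class Stats:
    token_usage: Usage = field(default_factory=Usage)

    def add(self, usage: Usage) -> None:
        self.token_usage.prompt_tokens += usage.prompt_tokens
        self.token_usage.completion_tokens += usage.completion_tokens

stats = Stats()
stats.add(Usage(100, 20))  # initial response, already counted in the step
stats.add(Usage(40, 10))   # requery response — the part the comment says is missed
```

The key point is simply that both responses feed the same accumulator before llm_resp is replaced by requery_resp.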


