perf(tools): memoize get_tool_definitions + TTL-cache check_fn results (#17098)
Two amplifying optimizations to per-turn overhead in the gateway:

1. get_tool_definitions() memoization (model_tools.py)

   Keyed on (frozenset(enabled), frozenset(disabled), registry._generation,
   config.yaml mtime+size). Only active when quiet_mode=True (which is every
   hot-path caller: gateway, AIAgent.__init__); quiet_mode=False keeps the
   existing print side effects. Cached path returns a shallow-copy list
   sharing read-only schema dicts.

   Measured: 7.5 ms → 0.01 ms per call (~750× speedup). Gateway constructs
   a fresh AIAgent per message, so this saves ~7 ms/turn before any LLM work.

2. check_fn() TTL cache (tools/registry.py)

   check_fn callables like check_terminal_requirements probe external state
   (Docker daemon, Modal SDK, playwright binary). For a long-lived process,
   hitting them on every get_definitions() pass was pure waste, because
   external state changes on human timescales. 30 s TTL so env-var flips
   (hermes tools enable X) propagate within a turn or two without explicit
   invalidation.

   Measured: first call 7.5 ms → 1.6 ms (check_fn probes now dominate);
   subsequent calls ~0.01 ms via the upstream memoization.

Invalidation surface:
- registry._generation bumps on register/deregister/register_toolset_alias,
  invalidating the memoized definitions automatically.
- config.yaml mtime in the cache key captures user-visible config edits
  affecting dynamic schemas (execute_code mode, discord allowlist).
- invalidate_check_fn_cache() exposed for explicit flushes (e.g. after
  hermes tools enable/disable).
- tests/conftest.py autouse fixture clears both caches before every test so
  env-var monkeypatches don't see stale results.

Also fixes a regression from PR #17046 that I missed:
- tools/web_tools.py: Firecrawl was removed from module scope by the lazy
  import, breaking 8 tests that patch 'tools.web_tools.Firecrawl'. Applied
  the same _FirecrawlProxy pattern used in auxiliary_client/run_agent for
  OpenAI (module-level proxy that looks like the class but imports the SDK
  on first call/isinstance; patch() replaces the attribute as usual).

Verified:
- 49/49 tests/tools/test_web_tools_config.py pass (was 8 failing on main)
- 68/68 tests/tools/test_homeassistant_tool.py pass (was 1 failing in the
  full suite due to check_fn TTL cross-test pollution; fixed by the autouse
  fixture)
- 3887/3895 tests/tools/ (8 pre-existing fails: 2 delegate, 1 mcp dynamic
  discovery, 5 mcp structured content; all confirmed on main)
- 2973/2976 tests/agent/ + tests/run_agent/ (3 pre-existing fails)
- 868/868 tests/run_agent/ (excluding test_run_agent.py, which has
  pre-existing suite-level issues)
- Live smoke: 2 turns + /model switch + tool calls, zero errors in the
  agent.log session window.

Co-authored-by: teknium1 <teknium@users.noreply.github.com>
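The check_fn TTL cache described in the commit message lives in tools/registry.py, which is not among the hunks shown on this page. As a rough illustration of the idea only, a TTL cache around zero-argument probes might look like the sketch below. Apart from invalidate_check_fn_cache(), which the message names, every identifier here (cached_check, CHECK_FN_TTL_SECONDS, the injectable clock) is hypothetical, and treating a raising probe as "unavailable" is an assumption rather than something the commit states.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical constant; the commit message only says the TTL is 30 s.
CHECK_FN_TTL_SECONDS = 30.0

# Maps each probe callable to (timestamp of last probe, probe result).
_check_fn_cache: Dict[Callable[[], bool], Tuple[float, bool]] = {}


def cached_check(
    check_fn: Callable[[], bool],
    now: Callable[[], float] = time.monotonic,
) -> bool:
    """Return check_fn()'s result, reusing any value younger than the TTL."""
    t = now()
    hit = _check_fn_cache.get(check_fn)
    if hit is not None and t - hit[0] < CHECK_FN_TTL_SECONDS:
        return hit[1]
    try:
        ok = bool(check_fn())
    except Exception:
        # Assumption: a probe that raises is treated as "requirement not met".
        ok = False
    _check_fn_cache[check_fn] = (t, ok)
    return ok


def invalidate_check_fn_cache() -> None:
    """Explicit flush, e.g. after a tool is enabled or disabled."""
    _check_fn_cache.clear()
```

Using a monotonic clock keeps the TTL immune to wall-clock adjustments, and passing the clock as a parameter lets a test advance time without sleeping.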
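The commit message describes _FirecrawlProxy as a module-level proxy that looks like the class, imports the SDK on first call or isinstance, and can still be replaced by patch(). The actual implementation is not shown here, so the following is only a guess at the general shape of such a proxy. It uses the standard-library fractions.Fraction as a stand-in for the heavy SDK class, and the names _LazyClassProxyMeta and LazyFraction are invented for the illustration.

```python
import importlib


class _LazyClassProxyMeta(type):
    """Metaclass so isinstance(obj, Proxy) defers to the real class."""

    def __instancecheck__(cls, instance):
        return isinstance(instance, cls._resolve())


class LazyFraction(metaclass=_LazyClassProxyMeta):
    """Stand-in for fractions.Fraction that imports it only when needed."""

    _real = None  # filled in on first call or isinstance check

    @classmethod
    def _resolve(cls):
        if cls._real is None:
            # The deferred import; in the commit this would be the SDK import.
            cls._real = importlib.import_module("fractions").Fraction
        return cls._real

    def __new__(cls, *args, **kwargs):
        # Calling the proxy constructs an instance of the real class.
        return cls._resolve()(*args, **kwargs)
```

Because the proxy is an ordinary module attribute, a test that patches that attribute by its dotted path swaps it out like any other name, which is the property the 8 previously failing tests relied on.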
@@ -206,6 +206,27 @@ _LEGACY_TOOLSET_MAP = {
# get_tool_definitions (the main schema provider)
# =============================================================================

# Module-level memoization for get_tool_definitions(). Keyed on
# (frozenset(enabled_toolsets), frozenset(disabled_toolsets), registry._generation).
# Hot callers (gateway runner, AIAgent.__init__) invoke this on every turn
# with quiet_mode=True; caching avoids ~7 ms of registry walking + schema
# filtering + check_fn probing per call. Only active when quiet_mode=True
# because quiet_mode=False has stdout side effects (tool-selection prints).
#
# Invalidation happens transparently via the registry's _generation counter,
# which bumps on register() / deregister() / register_toolset_alias(). The
# inner check_fn TTL cache in registry.py handles environment drift (Docker
# daemon start/stop, env var changes, etc.) on a 30 s horizon.
_tool_defs_cache: Dict[tuple, List[Dict[str, Any]]] = {}


def _clear_tool_defs_cache() -> None:
    """Drop memoized get_tool_definitions() results. Called when dynamic
    schema dependencies change (e.g. discord capability cache reset,
    execute_code sandbox reconfigured)."""
    _tool_defs_cache.clear()


def get_tool_definitions(
    enabled_toolsets: List[str] = None,
    disabled_toolsets: List[str] = None,
@@ -224,6 +245,50 @@ def get_tool_definitions(
    Returns:
        Filtered list of OpenAI-format tool definitions.
    """
    # Fast path: memoized result when the caller doesn't need stdout prints.
    # The cache key captures every argument-level input; the registry
    # generation captures registry mutations (MCP refresh, plugin load).
    # check_fn results are TTL-cached one level down, inside
    # registry.get_definitions. The config-mtime fingerprint below captures
    # user-visible config edits that affect dynamic schemas (execute_code
    # mode, discord action allowlist, etc.) without needing an explicit
    # invalidate hook on every config-writer.
    if quiet_mode:
        try:
            from hermes_cli.config import get_config_path
            cfg_path = get_config_path()
            cfg_stat = cfg_path.stat()
            cfg_fp = (cfg_stat.st_mtime_ns, cfg_stat.st_size)
        except (FileNotFoundError, OSError, ImportError):
            cfg_fp = None
        cache_key = (
            frozenset(enabled_toolsets) if enabled_toolsets is not None else None,
            frozenset(disabled_toolsets) if disabled_toolsets else None,
            registry._generation,
            cfg_fp,
        )
        cached = _tool_defs_cache.get(cache_key)
        if cached is not None:
            # Update _last_resolved_tool_names so downstream callers see
            # consistent state even on a cache hit.
            global _last_resolved_tool_names
            _last_resolved_tool_names = [t["function"]["name"] for t in cached]
            # Return a shallow copy of the list but share the dict references;
            # schemas are treated as read-only by all known callers.
            return list(cached)

    result = _compute_tool_definitions(enabled_toolsets, disabled_toolsets, quiet_mode)
    if quiet_mode:
        _tool_defs_cache[cache_key] = result
    return result


def _compute_tool_definitions(
    enabled_toolsets: List[str] = None,
    disabled_toolsets: List[str] = None,
    quiet_mode: bool = False,
) -> List[Dict[str, Any]]:
    """Uncached implementation of :func:`get_tool_definitions`."""
    # Determine which tool names the caller wants
    tools_to_include: set = set()

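To make the cache-key design in the hunk above easier to follow in isolation, here is a small self-contained sketch of the same technique: memoization keyed on a registry generation counter plus a (mtime_ns, size) fingerprint of a config file. ToyRegistry, fingerprint, get_defs and the stats counter are all made up for the illustration and are not part of model_tools.py. One deliberate difference from the diff: the sketch stores a copy on the miss path as well, so the list handed back on a miss is never the object held in the cache.

```python
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple


class ToyRegistry:
    """Hypothetical registry whose _generation bumps on every mutation."""

    def __init__(self) -> None:
        self._generation = 0
        self._tools: Dict[str, Dict[str, Any]] = {}

    def register(self, name: str) -> None:
        self._tools[name] = {"type": "function", "function": {"name": name}}
        self._generation += 1

    def get_definitions(self) -> List[Dict[str, Any]]:
        return list(self._tools.values())


def fingerprint(path: Path) -> Optional[Tuple[int, int]]:
    """(mtime_ns, size) of the file, or None if it cannot be stat'ed."""
    try:
        st = path.stat()
    except OSError:
        return None
    return (st.st_mtime_ns, st.st_size)


_cache: Dict[tuple, List[Dict[str, Any]]] = {}
stats = {"computes": 0}  # counts cache misses, for demonstration only


def get_defs(registry: ToyRegistry, cfg_path: Path) -> List[Dict[str, Any]]:
    key = (registry._generation, fingerprint(cfg_path))
    cached = _cache.get(key)
    if cached is not None:
        # Shallow copy: callers may mutate the list, but the dicts are shared.
        return list(cached)
    stats["computes"] += 1
    result = registry.get_definitions()
    _cache[key] = list(result)  # keep a private copy of the list
    return result
```

Registering a tool or editing the config file changes the key, so the next call recomputes without any explicit invalidation hook. Entries under old keys are never evicted in this sketch, and mutating a schema dict returned from get_defs would still leak into later cache hits, which is why the read-only assumption in the diff's comment matters.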