hermes

Files

Bartok9 24c209f112 fix(auxiliary): detect quota exhaustion as payment error; allow capacity-error fallback for explicit providers

Closes #26803

Root causes:
1. _is_payment_error() checked for billing keywords (credits, insufficient
   funds, billing, payment required) but missed daily token quota exhaustion
   phrases used by Bedrock, Vertex AI, and LiteLLM proxies — e.g.
   'Too many tokens per day', 'quota exceeded', 'resource exhausted',
   'daily limit'. These are functionally identical to credit exhaustion
   (provider cannot serve the request) but did not trigger fallback.

2. The call_llm() fallback chain was gated on resolved_provider == 'auto'.
   When a task resolves to a specific provider (e.g. 'custom' for a LiteLLM
   proxy, or 'openrouter'), capacity failures (payment/quota/connection)
   silently raise instead of trying alternatives. This is overly conservative:
   capacity errors mean the provider *cannot* serve the request regardless of
   user intent, so alternatives should always be tried.

Fixes:
- Add quota-related keywords to _is_payment_error(): quota_exceeded,
  too many tokens per day, daily limit, tokens per day, daily quota,
  resource exhausted (Vertex AI gRPC code).
- Allow fallback for capacity errors (payment + connection) even when
  resolved_provider is not 'auto'. Rate-limit fallback stays gated on
  is_auto to honour explicit provider constraints for transient limits.
- Apply both fixes to sync call_llm() and async acall_llm() paths.
- Add 6 targeted tests for the new quota-error detection cases.
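
The two fixes above can be sketched roughly as follows. This is a minimal
illustration, not the code in auxiliary_client.py: the names
looks_like_payment_error, should_try_fallback, ErrorKind and PAYMENT_KEYWORDS
are hypothetical, the substring matching is an assumption about how the
keywords are checked, and the handling of unclassified errors is a guess.

```python
from enum import Enum

# Hypothetical keyword list: the billing and quota phrases named in this
# commit message. Both spellings of "quota exceeded" are included because the
# message uses both; the real list may differ.
PAYMENT_KEYWORDS = (
    "credits",
    "insufficient funds",
    "billing",
    "payment required",
    "quota_exceeded",
    "quota exceeded",
    "too many tokens per day",
    "daily limit",
    "tokens per day",
    "daily quota",
    "resource exhausted",
)


class ErrorKind(Enum):
    """Hypothetical error classes used only for this sketch."""
    PAYMENT = "payment"
    CONNECTION = "connection"
    RATE_LIMIT = "rate_limit"
    OTHER = "other"


def looks_like_payment_error(message: str) -> bool:
    """Case-insensitive substring match against the keyword list (assumed)."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in PAYMENT_KEYWORDS)


def should_try_fallback(kind: ErrorKind, resolved_provider: str) -> bool:
    """Decide whether to try alternative providers after a failure."""
    is_auto = resolved_provider == "auto"
    # Capacity errors: the provider cannot serve the request at all, so
    # alternatives are tried even for an explicitly chosen provider.
    if kind in {ErrorKind.PAYMENT, ErrorKind.CONNECTION}:
        return True
    # Transient rate limits stay gated on 'auto' to honour explicit choices.
    if kind is ErrorKind.RATE_LIMIT:
        return is_auto
    # Assumption for this sketch: anything else is re-raised without fallback.
    return False
```

Under these assumptions, a 'Too many tokens per day' error from a 'custom'
provider is classified as a payment error and falls back, while a rate limit
on an explicit 'openrouter' provider is still raised to the caller.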

2026-05-17 17:15:31 -07:00

lsp

chore: ruff auto-fix PLR6201 resweep — tuple → set in membership tests (#27355)

2026-05-17 02:29:41 -07:00

transports

fix(codex): allow kanban worker board writes

2026-05-17 11:50:43 -07:00

__init__.py

Refactor Terminal and AIAgent cleanup

2026-02-21 22:31:43 -08:00

account_usage.py

chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)

2026-05-11 11:13:25 -07:00

agent_init.py

fix(run_agent): guard memory provider init against empty/whitespace string

2026-05-16 23:43:09 -07:00

agent_runtime_helpers.py

fix(agent): reset _fallback_index at turn start even when no fallback activated

2026-05-16 23:41:45 -07:00

anthropic_adapter.py

style: move secrets import alongside other function-level imports

2026-05-16 02:38:02 -07:00

async_utils.py

fix(async): close unscheduled coroutines in all threadsafe bridges (#26584)

2026-05-15 14:00:01 -07:00

auxiliary_client.py

fix(auxiliary): detect quota exhaustion as payment error; allow capacity-error fallback for explicit providers

2026-05-17 17:15:31 -07:00

background_review.py

fix(run_agent): isolate background review fork from external memory plugins (#27190)

2026-05-16 23:42:49 -07:00

bedrock_adapter.py

chore(deps): lazy-install boto3/botocore for bedrock adapter

2026-05-17 02:31:18 -07:00

browser_provider.py

fix(browser): self-review pass — dead-import, log levels, future-proofing

2026-05-17 04:04:15 -07:00

browser_registry.py

fix(browser): self-review pass — dead-import, log levels, future-proofing

2026-05-17 04:04:15 -07:00

chat_completion_helpers.py

fix(xai): wire schema sanitizer into post-refactor build_api_kwargs

2026-05-17 13:13:22 -07:00

codex_responses_adapter.py

fix(xai-oauth): recover from prelude SSE errors, gate reasoning replay, surface entitlement 403s (#26644)

2026-05-15 16:35:12 -07:00

codex_runtime.py

fix(xai): surface provider 'error' SSE frame in Codex fallback stream (#27184)

2026-05-16 23:41:09 -07:00

context_compressor.py

Port from Kilo-Org/kilocode#9434: strip historical media after compression (#27189)

2026-05-16 17:18:25 -07:00

context_engine.py

fix(compression): keep default protect_first_n at 3 + align ABC

2026-05-13 22:25:16 -07:00

context_references.py

fix(agent): fall back when rg is blocked for @folder references

2026-04-20 01:56:41 -07:00

conversation_compression.py

fix(auxiliary): resolve xai oauth compression from pool — port to conversation_compression

2026-05-16 23:33:59 -07:00

conversation_loop.py

fix(copilot): GitHub Models 413 hint — port to extracted conversation_loop

2026-05-16 23:38:45 -07:00

copilot_acp_client.py

fix(copilot-acp): tighten deprecation detection + sharpen GitHub Models 413 hint

2026-05-16 02:24:48 -07:00

credential_pool.py

fix(auth) fix a few cases where refresh tokens were not rotated.

2026-05-17 16:56:37 -07:00

credential_sources.py

feat(xai-oauth): add xAI Grok OAuth (SuperGrok Subscription) provider

2026-05-15 12:11:32 -07:00

curator_backup.py

fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)

2026-05-02 01:29:57 -07:00

curator.py

feat(curator): hint at hermes curator pin in the rename block (#23212)

2026-05-10 06:44:53 -07:00

display.py

chore: remove Atropos RL environments and tinker-atropos integration (#26106)

2026-05-15 10:36:38 +05:30

error_classifier.py

chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)

2026-05-11 11:13:25 -07:00

file_safety.py

fix(security): apply file safety to copilot acp fs

2026-04-21 01:31:58 -07:00

gemini_cloudcode_adapter.py

fix(agent/gemini-cloudcode): seed delta defaults for reasoning-only stream chunks

2026-05-14 08:03:56 -07:00

gemini_native_adapter.py

fix(auxiliary): evict async wrappers on poisoned client (follow-up to #23482)

2026-05-11 11:13:20 -07:00

gemini_schema.py

chore: remove unused imports and dead locals (ruff F401, F841) (#17010)

2026-04-28 06:46:45 -07:00

google_code_assist.py

chore: remove unused imports and dead locals (ruff F401, F841) (#17010)

2026-04-28 06:46:45 -07:00

google_oauth.py

fix(google_oauth): close TOCTOU window when saving credentials

2026-05-04 03:16:19 -07:00

i18n.py

feat(i18n): localize all gateway commands + web dashboard, add 8 new locales (16 total) (#22914)

2026-05-10 07:14:14 -07:00

image_gen_provider.py

feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)

2026-04-21 21:30:10 -07:00

image_gen_registry.py

fix(plugins): filter resolution by is_available() in web + image_gen registries

2026-05-13 22:31:28 -07:00

image_routing.py

chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)

2026-05-11 11:13:25 -07:00

insights.py

Merge branch 'main' into feat/dashboard-skill-analytics

2026-04-20 05:25:49 -07:00

iteration_budget.py

refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget

2026-05-16 17:59:32 -07:00

lmstudio_reasoning.py

feat(agent): add lmstudio integration

2026-04-28 12:27:36 -07:00

manual_compression_feedback.py

fix(compression): include system prompt + tool schemas in token estimates (#18265)

2026-04-30 23:03:54 -07:00

markdown_tables.py

fix(cli): vertical fallback for markdown tables wider than terminal (#23948)

2026-05-11 16:49:13 -07:00

memory_manager.py

chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)

2026-05-11 11:13:25 -07:00

memory_provider.py

docs(agent): remove stale BuiltinMemoryProvider references from memory module docstrings

2026-05-05 13:33:49 -07:00

message_sanitization.py

refactor(run_agent): extract message sanitization to agent/message_sanitization.py

2026-05-16 17:41:09 -07:00

model_metadata.py

fix(metadata): qwen3.6-plus has a 1M context window (#27008)

2026-05-17 02:31:18 -07:00

models_dev.py

feat: add NovitaAI as LLM provider

2026-05-13 23:51:15 -07:00

moonshot_schema.py

fix(moonshot): strip $ref siblings and collapse tuple items in tool schemas (#27104)

2026-05-16 13:02:19 -07:00

nous_rate_guard.py

codebase: add encoding='utf-8' to all bare open() calls (PLW1514)

2026-05-08 14:27:40 -07:00

onboarding.py

docs(onboarding): lead OpenClaw residue banner with migrate, warn that cleanup breaks OpenClaw (#17507)

2026-04-29 08:08:36 -07:00

plugin_llm.py

feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194)

2026-05-10 07:09:28 -07:00

portal_tags.py

feat(nous): unified client=hermes-client-v<version> tag on every Portal request (#24779)

2026-05-12 20:49:20 -07:00

process_bootstrap.py

refactor(run_agent): extract OpenAI proxy, safe stdio, IterationBudget

2026-05-16 17:59:32 -07:00

prompt_builder.py

fix(prompt_builder): inject tool-use enforcement for GLM models

2026-05-12 18:46:28 -07:00

prompt_caching.py

fix(cache): kill long-lived prefix layout — system prompt is now byte-static within a session (#24778)

2026-05-12 20:46:04 -07:00

rate_limit_tracker.py

refactor: remove dead code — 1,784 lines across 77 files (#9180)

2026-04-13 16:32:04 -07:00

redact.py

chore: ruff auto-fix PLR6201 — tuple → set in membership tests (#23937)

2026-05-11 11:13:25 -07:00

retry_utils.py

feat(agent): add jittered retry backoff

2026-04-08 00:41:36 -07:00

shell_hooks.py

refactor(security): extract _block_message helper to unify block logic in _parse_response

2026-05-17 02:31:18 -07:00

skill_commands.py

fix(skills): return None instead of truthy stub when skill load fails

2026-05-16 22:52:22 -07:00

skill_preprocessing.py

fix(skills): apply inline shell in skill_view

2026-04-24 15:15:07 -07:00

skill_utils.py

perf(cli): cut ~19s from 'hermes' cold start (skills cache + lazy Feishu + no Nous HTTP) (#22138)

2026-05-08 16:39:32 -07:00

stream_diag.py

refactor(run_agent): extract stream diagnostics to agent/stream_diag.py

2026-05-16 18:28:17 -07:00

subdirectory_hints.py

fix(agent): catch PermissionError in subdirectory hint discovery

2026-04-09 03:10:30 -07:00

system_prompt.py

refactor(run_agent): extract system-prompt builder to agent/system_prompt.py

2026-05-16 18:16:20 -07:00

think_scrubber.py

fix(agent): stateful streaming scrubber for reasoning-block leaks (#17924) (#20184)

2026-05-05 04:33:38 -07:00

title_generator.py

fix: improve telegram topic mode setup

2026-05-04 12:07:17 -07:00

tool_dispatch_helpers.py

feat: add supports_parallel_tool_calls for MCP servers (#26825) — port to tool_dispatch_helpers

2026-05-16 23:36:37 -07:00

tool_executor.py

refactor(run_agent): extract tool execution to agent/tool_executor.py

2026-05-16 18:24:05 -07:00

tool_guardrails.py

fix: classify landed file mutations with diagnostics

2026-05-13 06:46:23 -07:00

tool_result_classification.py

fix: classify landed file mutations with diagnostics

2026-05-13 06:46:23 -07:00

trajectory.py

Refactor Terminal and AIAgent cleanup

2026-02-21 22:31:43 -08:00

usage_pricing.py

fix(pricing): add deepseek-v4-pro to official docs pricing table

2026-05-12 16:32:57 -07:00

video_gen_provider.py

feat(video_gen): unified video_generate tool with pluggable provider backends (#25126)

2026-05-13 16:39:41 -07:00

video_gen_registry.py

feat(video_gen): unified video_generate tool with pluggable provider backends (#25126)

2026-05-13 16:39:41 -07:00

web_search_provider.py

fix(web): align _LEGACY_PREFERENCE with legacy 7-provider order + doc cleanup

2026-05-13 22:31:28 -07:00

web_search_registry.py

fix(web): align _LEGACY_PREFERENCE with legacy 7-provider order + doc cleanup

2026-05-13 22:31:28 -07:00