hermes/hermes_cli at 595e906698c164d1b0e88148e8e1c38bc45902f8 - hermes - Zopu Git: Git solution

common/hermes

Sonic Chang b49a3f8474 fix(kanban): reap completed worker children in dispatch_once

The gateway-embedded dispatcher (the default since
`kanban.dispatch_in_gateway = true`) is the parent of every spawned kanban
worker. `_default_spawn` calls `subprocess.Popen(..., start_new_session=True)`
and returns the pid. `start_new_session` puts the worker in its own session,
detaching it from the controlling tty, but does not reparent it to init, so
each worker remains a child of the gateway until the gateway `wait()`s on it.
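
The parent/child relationship described above can be checked with a small probe. This is a sketch for POSIX systems, not code from the commit; `spawn_detached_probe` is a hypothetical helper name.

```python
import os
import subprocess
import sys


def spawn_detached_probe() -> tuple[int, int, int]:
    """Run a child in a new session and return its (ppid, sid, pid)."""
    # The child prints its parent pid, its session id, and its own pid.
    code = "import os; print(os.getppid(), os.getsid(0), os.getpid())"
    out = subprocess.run(
        [sys.executable, "-c", code],
        start_new_session=True,  # child calls setsid() before exec
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    ppid, sid, pid = (int(field) for field in out.split())
    return ppid, sid, pid


if __name__ == "__main__":
    ppid, sid, pid = spawn_detached_probe()
    # The child leads its own session, yet its parent is still this process.
    print("child is session leader:", sid == pid)
    print("still our child:", ppid == os.getpid())
```

Because the parent pid is unchanged, the exit status stays queued for the spawning process until that process waits on the child.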

Nothing in the dispatch loop ever calls `waitpid`. Result: every
completed worker becomes a `<defunct>` zombie that lingers until the
gateway exits. We hit ~430 zombies on a single hermes-agent container
after ~40 days of steady kanban traffic, approaching process-table
exhaustion on the host.

Fix: add a non-blocking reap loop at the top of `dispatch_once`, so
every dispatcher tick (default 60s) drains zombies that accumulated
since the last tick. WNOHANG keeps the call non-blocking; ChildProcessError
means no children to reap.
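
A minimal sketch of such a reap loop, assuming a POSIX host. The name `reap_finished_children` is illustrative and not taken from the commit; the actual code in `dispatch_once` may differ.

```python
import os
import subprocess
import sys


def reap_finished_children() -> list[int]:
    """Collect every already-exited child without blocking; return their pids."""
    reaped = []
    while True:
        try:
            # pid -1 means "any child". With WNOHANG the call returns (0, 0)
            # immediately when children exist but none has exited yet.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            # ECHILD: this process has no children left to wait on.
            break
        if pid == 0:
            break
        reaped.append(pid)
    return reaped


if __name__ == "__main__":
    # Spawn a short-lived worker the same way the commit message describes.
    worker = subprocess.Popen([sys.executable, "-c", "pass"], start_new_session=True)
    print("spawned", worker.pid)
    # A later tick would pick it up once it has exited.
    print("reaped", reap_finished_children())
```

One design consequence worth keeping in mind: `waitpid(-1, ...)` collects any child of the process, so other code in the same process that holds a `Popen` object for a child may no longer be able to read that child's real exit status.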

Why here, not a SIGCHLD handler:
- signal.signal requires the main thread; gateway threading model makes
  that placement non-trivial.
- Bounded staleness: at default interval=60s the maximum live zombie
  count is one tick's worth of worker completions.
- No interaction with detect_crashed_workers: that function only inspects
  rows where status='running', and rows reach 'done' (and stop being
  inspected) before their workers exit.
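
The main-thread restriction in the first bullet can be observed directly. The sketch below is illustrative only (the function names are hypothetical and POSIX is assumed for `SIGCHLD`); it tries to register a handler from a worker thread, which CPython rejects with `ValueError`.

```python
import signal
import threading


def try_install_sigchld_handler() -> str:
    """Attempt to register a no-op SIGCHLD handler; report the outcome."""
    try:
        signal.signal(signal.SIGCHLD, lambda signum, frame: None)
    except ValueError as exc:
        # Raised when called outside the main thread of the main interpreter.
        return f"refused: {exc}"
    return "installed"


def run_in_worker_thread() -> str:
    """Call try_install_sigchld_handler from a non-main thread."""
    results: list[str] = []
    thread = threading.Thread(
        target=lambda: results.append(try_install_sigchld_handler())
    )
    thread.start()
    thread.join()
    return results[0]


if __name__ == "__main__":
    print(run_in_worker_thread())
```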

2026-05-07 05:05:20 -07:00

__init__.py

fix(windows): enforce UTF-8 stdout/stderr to prevent UnicodeEncodeError crash

2026-05-03 16:58:25 -07:00

_parser.py

refactor(cli): derive relaunch flag table from argparse introspection

2026-04-29 20:33:29 -07:00

auth_commands.py

feat(nous): persist Nous OAuth across profiles via shared token store (#19712)

2026-05-04 04:54:55 -07:00

auth.py

fix(auth): fall back to global-root auth.json for providers missing in profile

2026-05-06 13:29:54 -07:00

azure_detect.py

chore: remove unused imports and dead locals (ruff F401, F841) (#17010)

2026-04-28 06:46:45 -07:00

backup.py

fix(backup): floor pre-update backup_keep to 1 so the new backup survives

2026-05-04 05:07:13 -07:00

banner.py

fix(banner): show correct update status on nix-built hermes (#17550)

2026-04-30 07:03:00 +05:30

browser_connect.py

fix(browser): address Copilot review on /browser connect

2026-04-28 22:11:10 -07:00

callbacks.py

fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902)

2026-04-14 16:11:37 -07:00

checkpoints.py

feat(checkpoints): v2 single-store rewrite with real pruning + disk guardrails (#20709)

2026-05-06 05:44:35 -07:00

claw.py

fix(claw): handle missing dir in _scan_workspace_state

2026-05-05 06:08:14 -07:00

cli_output.py

refactor: remove dead code - 1,784 lines across 77 files (#9180)

2026-04-13 16:32:04 -07:00

clipboard.py

feat: fix img pasting in new ink plus newline after tools

2026-04-11 13:14:32 -05:00

codex_models.py

feat(codex): add gpt-5.5 and wire live model discovery into picker (#14720)

2026-04-23 13:32:43 -07:00

colors.py

feat: respect NO_COLOR env var and TERM=dumb (#4079)

2026-03-30 17:07:21 -07:00

commands.py

feat(telegram): /topic off + help + auth gate + screenshot debounce

2026-05-04 12:07:17 -07:00

completion.py

fix: preserve profile name completion in dynamic shell completion

2026-04-14 10:45:42 -07:00

config.py

feat(web): add SearXNG as a native search-only backend

2026-05-06 10:05:29 -07:00

copilot_auth.py

fix(copilot): exchange raw GitHub token for Copilot API JWT

2026-04-24 05:09:08 -07:00

cron.py

feat(cron): add no_agent mode for script-only cron jobs (watchdog pattern) (#19709)

2026-05-04 12:31:01 -07:00

curator.py

feat(curator): add archive and prune subcommands (#20200)

2026-05-05 05:15:54 -07:00

curses_ui.py

fix: treat ctrl-c as curses cancel

2026-05-04 01:36:44 -07:00

debug.py

fix(debug): redact log content at upload time in hermes debug share

2026-05-03 11:42:20 -07:00

default_soul.py

fix: reset default SOUL.md to baseline identity text (#3159)

2026-03-26 01:34:27 -07:00

dingtalk_auth.py

chore: remove unused imports and dead locals (ruff F401, F841) (#17010)

2026-04-28 06:46:45 -07:00

doctor.py

docs: pluggable surfaces coverage - model-provider guide, full plugin map, opt-in fix (#20749)

2026-05-06 07:24:42 -07:00

dump.py

refactor(env): use shared Hermes dotenv loader

2026-05-05 10:13:13 -07:00

env_loader.py

refactor: consolidate symlink-safe atomic replace into shared helper

2026-04-28 04:58:22 -07:00

fallback_cmd.py

feat(cli): add 'hermes fallback' command to manage fallback providers (#16052)

2026-04-26 06:19:04 -07:00

gateway.py

fix(gateway): wait for systemd restart readiness

2026-05-06 18:12:35 -07:00

goals.py

feat: /goal - persistent cross-turn goals (Ralph loop) (#18262)

2026-04-30 23:10:20 -07:00

hooks.py

chore: remove unused imports and dead locals (ruff F401, F841) (#17010)

2026-04-28 06:46:45 -07:00

kanban_db.py

fix(kanban): reap completed worker children in dispatch_once

2026-05-07 05:05:20 -07:00

kanban_diagnostics.py

fix(kanban): unify failure counter across spawn/timeout/crash outcomes (#20410)

2026-05-05 13:55:37 -07:00

kanban.py

feat(kanban): surface task_runs.summary on dashboard cards + `kanban show`

2026-05-05 17:26:15 -07:00

logs.py

feat: component-separated logging with session context and filtering (#7991)

2026-04-11 17:23:36 -07:00

main.py

feat(profiles): --no-skills flag for empty profile creation (#20986)

2026-05-07 04:34:38 -07:00

mcp_config.py

refactor(config): migrate remaining 33 cfg_get call sites (#17311)

2026-04-29 04:03:03 -07:00

memory_setup.py

fix(cli): decode .env as UTF-8 to avoid GBK crash on Windows

2026-05-02 01:40:31 -07:00

model_catalog.py

chore: remove unused imports and dead locals (ruff F401, F841) (#17010)

2026-04-28 06:46:45 -07:00

model_normalize.py

fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802)

2026-05-06 09:08:33 -07:00

model_switch.py

fix(opencode-go): keep users on opencode-go instead of hijacking to native providers (#20802)

2026-05-06 09:08:33 -07:00

models.py

docs: pluggable surfaces coverage - model-provider guide, full plugin map, opt-in fix (#20749)

2026-05-06 07:24:42 -07:00

nous_subscription.py

feat(web): add SearXNG as a native search-only backend

2026-05-06 10:05:29 -07:00

oneshot.py

fix(tui): honor launch toolsets (#17623)

2026-04-29 16:55:27 -07:00

pairing.py

fix(pairing): handle null user_name in pairing list display

2026-04-23 02:34:11 -07:00

platforms.py

feat: complete plugin platform parity — all 12 integration points

2026-04-29 21:56:51 -07:00

plugins_cmd.py

feat(dashboard): add Plugins page with enable/disable, auth status, install/remove

2026-04-30 20:29:37 -04:00

plugins.py

feat(providers): make all 33 providers pluggable under plugins/model-providers/

2026-05-05 13:40:01 -07:00

profiles.py

feat(profiles): --no-skills flag for empty profile creation (#20986)

2026-05-07 04:34:38 -07:00

providers.py

fix: prevent bare 'custom' slug in model.provider (#17478)

2026-04-30 04:32:11 -07:00

pty_bridge.py

fix(pty): default TERM for resize probes

2026-05-04 02:38:54 -07:00

relaunch.py

remove relaunch_chat

2026-04-29 20:33:29 -07:00

runtime_provider.py

fix(fallback): let custom_providers shadow built-in aliases

2026-04-30 20:18:44 -07:00

setup.py

fix(gateway): don't dead-end setup wizard when only system-scope unit is installed

2026-05-06 15:58:02 -07:00

skills_config.py

refactor(config): migrate remaining 33 cfg_get call sites (#17311)

2026-04-29 04:03:03 -07:00

skills_hub.py

feat(skills): install skills from a direct HTTP(S) URL (#16323)

2026-04-26 20:57:10 -07:00

skin_engine.py

fix(tui): honor skin highlight colors (#20895)

2026-05-06 14:01:56 -07:00

slack_cli.py

fix(paths): route achievements plugin + profile-tui through HERMES_HOME

2026-04-30 23:21:54 -07:00

status.py

fix(status): add missing popular provider API keys to hermes status display

2026-05-04 05:14:13 -07:00

timeouts.py

refactor(timeouts): drop redundant ImportError in except clause

2026-04-26 20:48:20 -07:00

tips.py

docs: refresh stale platform/LOC/test counts; clarify gateway vs plugin platforms

2026-05-05 13:45:47 -07:00

tools_config.py

feat(web): add SearXNG as a native search-only backend

2026-05-06 10:05:29 -07:00

uninstall.py

feat(uninstall): offer to remove named profiles when uninstalling from default

2026-04-18 19:18:13 -07:00

vercel_auth.py

feat: add Vercel Sandbox backend

2026-04-29 07:22:33 -07:00

voice.py

fix(tui): restore voice push-to-talk parity (#20897)

2026-05-06 15:49:59 -07:00

web_server.py

feat(profiles): --no-skills flag for empty profile creation (#20986)

2026-05-07 04:34:38 -07:00

webhook.py

refactor(config): migrate remaining 33 cfg_get call sites (#17311)

2026-04-29 04:03:03 -07:00