Four fixes from PR #27248 review:
1. **__init__ forwarder is now keyword-forwarded** (daimon-nous review).
Previously the run_agent.AIAgent.__init__ wrapper forwarded all 64
params positionally to agent.agent_init.init_agent, so adding a
65th param on main would require three lockstep edits (signature,
init_agent signature, forwarder call) or silently shift every value.
Keyword forwarding makes this trivially safe — adding a param now
only needs the two signatures and one extra keyword line.
2. **Drop dead _ra() in agent/codex_runtime.py** (daimon-nous + Copilot).
The lazy run_agent reference was defined but never called inside
this module — the codex paths use agent.* accessors only.
3. **Drop unused imports in agent/codex_runtime.py** (Copilot):
contextvars, threading, time, uuid, Optional. Carried over from
run_agent.py during the original extraction.
4. **Tighten three source-introspection test guards** (Copilot):
- test_memory_nudge_counter_hydration.py — was scanning the
concatenated source of run_agent.py + agent/conversation_loop.py
and matching self.X or agent.X form. Now asserts the
hydration block lives in agent/conversation_loop.py specifically
with the agent.X form — the body never moves back, so if it
ever drifts a future re-introduction fails the guard.
- test_run_agent.py::TestMemoryNudgeCounterPersistence — anchor on
agent.iteration_budget = IterationBudget exactly (was just
iteration_budget = IterationBudget) so an unrelated identifier
ending in iteration_budget can't match.
- test_run_agent.py::TestMemoryProviderTurnStart — assert the
agent._user_turn_count form directly (the extracted body uses
agent.X, not self.X — accepting either was a transitional fudge).
- test_jsondecodeerror_retryable.py — scan agent/conversation_loop.py
only, not the concatenation.
Not addressed in this commit:
* Pre-existing bugs in agent/tool_executor.py (heartbeat index
mismatch when calls are blocked, _current_tool clobber in result
loop, blocked-counted-as-completed in spinner summary, dead
result_preview computation). These were preserved byte-for-byte from
the original _execute_tool_calls_concurrent — worth a separate
follow-up PR with proper tests.
* _OpenAIProxy.__instancecheck__ concern — pre-existing, not flagged
by any of the original test patches (nothing actually does
isinstance(x, OpenAI) against the proxy instance).
* agent_init.py:949 mem_config potential NameError — pre-existing;
only triggers if _agent_cfg.get('memory', {}) itself raises, which
it can't with a stock dict.
tests/run_agent/ + tests/agent/: 4313 passed, 1 pre-existing
test_auxiliary_client failure (unchanged).
run_agent.py: 3821 -> 3937 lines (+116 from the keyword-forwarded
init call's verbosity). Final: 16083 -> 3937 (-12146, 75% reduction).
The 3,877-line run_conversation body — the agent loop itself — moves out
of run_agent.py into a dedicated module. AIAgent.run_conversation is
now a thin forwarder that delegates to agent.conversation_loop.run_conversation
with the AIAgent instance as the first argument.
This is the largest single extraction in the run_agent.py refactor.
The body keeps all 163 self.X references intact (rewritten as agent.X),
all nested closures, all retry/backoff/compression machinery. Symbols
that tests or callers patch on run_agent (_set_interrupt,
handle_function_call, AIAgent class attrs) are resolved through _ra()
inside the extracted module so the patch surface is preserved.
Five tests doing inspect.getsource(AIAgent.run_conversation) updated to
scan agent.conversation_loop.run_conversation. Two source-introspection
tests (TestMemoryNudgeCounterPersistence, TestMemoryProviderTurnStart)
updated to accept either self.X (legacy) or agent.X (extracted
form) in the matched assertions.
Live E2E verified on three model paths:
* openai/gpt-5.4 (OpenAI chat completions via OpenRouter)
* anthropic/claude-sonnet-4.6 (Anthropic Messages via OpenRouter)
* moonshotai/kimi-k2-thinking (reasoning model, reasoning_content path)
Plus read_file tool execution, terminal tool, web_search.
tests/run_agent/ + tests/agent/: 4313 passed, 1 pre-existing failure
(test_auxiliary_client::test_custom_endpoint... — same as on main).
run_agent.py: 9800 -> 5944 lines (-3856).
Total reduction since baseline: 16083 -> 5944 (-10139, 63%).
Gateway creates a fresh AIAgent per inbound message in several common
scenarios: cache miss, idle eviction (1h TTL), config-signature
mismatch, process restart. A freshly-built AIAgent has
_turns_since_memory=0 and _user_turn_count=0, so the
memory.nudge_interval trigger ('_turns_since_memory >=
_memory_nudge_interval') can never be reached when these reconstructions
happen on roughly the cadence of the interval. A user can chat for hours
on Telegram without ever seeing a self-improvement review fire.
Reconstruct the counters from conversation_history at the top of
run_conversation(), right after the existing _hydrate_todo_store call.
Idempotent guard ('if self._user_turn_count == 0') means a cached agent
that already accumulated counters keeps them; only freshly-built agents
hydrate. Modulo arithmetic preserves the original 1-in-N cadence rather
than firing a review immediately on resume.
7 regression tests pinning the contract (mid-cycle history, modulo wrap,
idempotency, zero-interval skip, role==user filtering, production-code
anchor).
Closes#22357.