fix(gateway): prevent --replace race condition causing multiple instances

When starting the gateway with --replace, concurrent invocations could
leave multiple instances running simultaneously. This happened because
write_pid_file() used a plain overwrite, so the second racer would
silently replace the first process's PID record.

Changes:
- gateway/status.py: write_pid_file() now uses atomic O_CREAT|O_EXCL
  creation. If the file already exists, it raises FileExistsError,
  allowing exactly one process to win the race.
- gateway/run.py: before writing the PID file, re-check get_running_pid()
  and catch FileExistsError from write_pid_file(). In both cases, stop
  the runner and return False so the process exits cleanly.
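
A minimal sketch of the exclusive-create pattern described above, not the
actual gateway code. The write_pid_file() signature shown here, the
claim_pid_file() helper, and the gateway.pid path are assumptions for
illustration only.

```python
import os
import tempfile
from pathlib import Path


def write_pid_file(pid_path: Path, pid: int) -> None:
    """Create pid_path exclusively; raises FileExistsError if it already exists.

    Assumed signature for illustration; the real function may differ.
    """
    # O_CREAT | O_EXCL makes "check and create" a single atomic step,
    # so of several concurrent callers only one can succeed.
    fd = os.open(str(pid_path), os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    with os.fdopen(fd, "w") as handle:
        handle.write(str(pid))


def claim_pid_file(pid_path: Path, pid: int) -> bool:
    """Hypothetical caller: return False if another process won the race."""
    try:
        write_pid_file(pid_path, pid)
    except FileExistsError:
        # Another instance created the file first; the caller should shut down.
        return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "gateway.pid"  # hypothetical file name
        print(claim_pid_file(path, os.getpid()))  # first claim succeeds
        print(claim_pid_file(path, os.getpid()))  # second claim is refused
```

With a plain overwrite, both calls would succeed and the second would
clobber the first record. With exclusive creation, the loser gets a clear
signal and can exit instead of running alongside the winner.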

Fixes #11718
opriz
2026-04-18 02:24:35 +08:00
committed by Teknium
parent 328223576b
commit cbe29db774
3 changed files with 42 additions and 4 deletions

@@ -307,6 +307,7 @@ AUTHOR_MAP = {
"anthhub@163.com": "anthhub",
"shenuu@gmail.com": "shenuu",
"xiayh17@gmail.com": "xiayh0107",
"zhujianxyz@gmail.com": "opriz",
"asurla@nvidia.com": "anniesurla",
"limkuan24@gmail.com": "WideLee",
"aviralarora002@gmail.com": "AviArora02-commits",