Resolve Server-002, -004, -005, -006 code-review findings
Server-002: the gateway never terminated leftover MxGateway.Worker.exe processes at startup, contradicting gateway.md and CLAUDE.md. Added IRunningProcessInspector/SystemRunningProcessInspector, OrphanWorkerTerminator, and OrphanWorkerCleanupHostedService (best-effort, runs before sessions are accepted); updated gateway.md to describe the implemented behavior.

Server-004: API-key scopes were persisted verbatim with no validation. Added GatewayScopes.All/IsKnown; the CLI parser and dashboard create path now reject unknown scope strings.

Server-005: a non-SqlException/InvalidOperationException fault on the initial Galaxy hierarchy load faulted the BackgroundService. ExecuteAsync now catches all non-cancellation exceptions on first load and RefreshCoreAsync broadens its catch so the cache records Stale/Unavailable instead.

Server-006: OpenSessionAsync incremented the open-sessions gauge before alarm auto-subscribe; an auto-subscribe failure leaked the gauge. The catch path now calls SessionRemoved() when the gauge was incremented.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
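The Server-006 fix can be illustrated with a minimal sketch. It is not the gateway's actual code: `SessionMetricsSketch`, `SessionAdded()`, `OpenSessionSketch`, and the `autoSubscribeAlarms` delegate are hypothetical stand-ins, and only `SessionRemoved()` and `OpenSessionAsync` are names taken from the commit message. The point is the flag that records whether the gauge was already incremented, so the catch path only undoes work that actually happened.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the gateway's metrics object.
public sealed class SessionMetricsSketch
{
    private int _openSessions;

    public int OpenSessions => Volatile.Read(ref _openSessions);

    // Assumed name for the increment side of the gauge.
    public void SessionAdded() => Interlocked.Increment(ref _openSessions);

    // Name taken from the commit message.
    public void SessionRemoved() => Interlocked.Decrement(ref _openSessions);
}

public static class OpenSessionSketch
{
    // autoSubscribeAlarms is a hypothetical delegate standing in for
    // the alarm auto-subscribe step that may fail.
    public static async Task OpenSessionAsync(
        SessionMetricsSketch metrics,
        Func<Task> autoSubscribeAlarms)
    {
        bool gaugeIncremented = false;
        try
        {
            metrics.SessionAdded();
            gaugeIncremented = true;

            await autoSubscribeAlarms();
        }
        catch
        {
            // Roll the gauge back only if this call incremented it.
            if (gaugeIncremented)
            {
                metrics.SessionRemoved();
            }

            throw;
        }
    }
}
```

With this shape, a failing `autoSubscribeAlarms` still propagates its exception to the caller, but the open-sessions count returns to its previous value.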
+5 -2
@@ -579,8 +579,11 @@ Policy:
 - command exceptions return structured command fault with HRESULT if known,
 - stale sessions are closed by lease timeout,
 - stuck workers are killed by process id,
-- gateway restart should not attempt to reattach old workers unless explicitly
-designed; first version should terminate orphaned workers on startup.
+- gateway restart does not reattach old workers; `OrphanWorkerCleanupHostedService`
+runs `OrphanWorkerTerminator` once on startup — before the server accepts
+sessions — to kill leftover `MxGateway.Worker.exe` processes (matched by the
+configured worker executable path, or by image name when the x64 gateway cannot
+introspect the x86 worker's module) left behind by a previous unclean run.

 Because each client owns one worker, a crash or leak affects only that session.

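The matching rule in the new policy bullet (compare the executable path when it can be read, fall back to the image name otherwise) could look roughly like the sketch below. It is an illustration under assumptions, not the commit's `OrphanWorkerTerminator`: the class `OrphanWorkerSketch`, its method names, and the 5-second wait are hypothetical, and it assumes .NET 6 or later with C# 9 pattern syntax. Only `System.Diagnostics.Process` calls from the base class library are used.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.IO;

// Hypothetical sketch of the startup cleanup described in the policy.
public static class OrphanWorkerSketch
{
    // modulePath is null when the main module could not be read.
    public static bool LooksLikeWorker(
        string processName, string? modulePath, string configuredWorkerPath)
    {
        if (modulePath is not null)
        {
            // Preferred match: the full executable path.
            return string.Equals(
                Path.GetFullPath(modulePath),
                Path.GetFullPath(configuredWorkerPath),
                StringComparison.OrdinalIgnoreCase);
        }

        // Fallback match: image name only (process names omit ".exe").
        string expected = Path.GetFileNameWithoutExtension(configuredWorkerPath);
        return string.Equals(processName, expected, StringComparison.OrdinalIgnoreCase);
    }

    // Best-effort: failures on one process do not stop the sweep.
    public static int TerminateOrphans(string configuredWorkerPath)
    {
        int killed = 0;
        string imageName = Path.GetFileNameWithoutExtension(configuredWorkerPath);

        foreach (Process process in Process.GetProcessesByName(imageName))
        {
            using (process)
            {
                try
                {
                    string? modulePath = TryGetModulePath(process);
                    if (!LooksLikeWorker(process.ProcessName, modulePath, configuredWorkerPath))
                    {
                        continue;
                    }

                    process.Kill();
                    process.WaitForExit(5000); // assumed timeout
                    killed++;
                }
                catch (Exception ex) when (
                    ex is Win32Exception or InvalidOperationException or NotSupportedException)
                {
                    // The process already exited or cannot be terminated; skip it.
                }
            }
        }

        return killed;
    }

    private static string? TryGetModulePath(Process process)
    {
        try
        {
            return process.MainModule?.FileName;
        }
        catch (Win32Exception)
        {
            return null; // module list not readable from this process
        }
        catch (InvalidOperationException)
        {
            return null; // process has already exited
        }
    }
}
```

One trade-off worth noting: the image-name fallback can match an unrelated process that happens to share the worker's name, so a real implementation may want additional checks before killing in that branch.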