Implement graceful worker shutdown
@@ -175,6 +175,12 @@ Behavior:
`CloseSession` should be idempotent. Closing an already closed session should
return a successful close result with the final known state.

`WorkerClient.ShutdownAsync` sends `WorkerShutdown`, waits for the worker read,
write, and heartbeat loops to stop, and waits for the launched worker process to
exit within the same shutdown timeout. If the pipe loops or process exit exceed
the timeout, the close operation fails with `ShutdownTimeout`; `GatewaySession`
then kills the worker process tree before surfacing the close failure.
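
The close behavior described above can be sketched roughly as follows. This is
an illustrative Python model, not the gateway's implementation:
`FakeWorkerClient`, `FakeGatewaySession`, the state names, and the choice to
remember a `Faulted` state after a timed-out close are all assumptions made for
the example.

```python
import asyncio


class ShutdownTimeout(Exception):
    """The pipe loops or the process exit exceeded the shutdown timeout."""


class FakeWorkerClient:
    """Hypothetical stand-in for the worker client and its process handle."""

    def __init__(self, exit_delay: float) -> None:
        self.exit_delay = exit_delay  # seconds until loops stop and process exits
        self.shutdown_calls = 0
        self.killed = False

    async def shutdown(self, timeout: float) -> None:
        self.shutdown_calls += 1
        try:
            # A single timeout covers both the loops and the process exit.
            await asyncio.wait_for(asyncio.sleep(self.exit_delay), timeout)
        except asyncio.TimeoutError:
            raise ShutdownTimeout("worker did not stop in time") from None

    def kill_process_tree(self) -> None:
        self.killed = True


class FakeGatewaySession:
    """Hypothetical session wrapper showing idempotent close and kill-on-timeout."""

    def __init__(self, worker: FakeWorkerClient, timeout: float) -> None:
        self.worker = worker
        self.timeout = timeout
        self.state = "Open"

    async def close(self) -> str:
        if self.state != "Open":
            # Idempotent: report the final known state without a second shutdown.
            return self.state
        try:
            await self.worker.shutdown(self.timeout)
        except ShutdownTimeout:
            # Kill the process tree first, then surface the close failure.
            self.worker.kill_process_tree()
            self.state = "Faulted"  # assumed state name, not from the design doc
            raise
        self.state = "Closed"
        return self.state
```

Calling `close()` twice on a healthy session runs the worker shutdown once and
returns the same state both times. With a worker that outlives the timeout,
`close()` raises `ShutdownTimeout` only after `kill_process_tree()` has run.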

### Invoke

`Invoke` forwards one MXAccess command to the worker that owns the session.
@@ -515,6 +521,11 @@ It handles:
The write loop should fail the session if a pipe write fails outside normal
shutdown.

During shutdown the worker client treats `WorkerShutdownAck` as the protocol
close signal, but the process handle remains authoritative for process lifetime.
The client waits for both the protocol close and process exit before reporting a
clean shutdown to `GatewaySession`.
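
A minimal sketch of the "both signals" rule, assuming the acknowledgement and
the process exit are each modeled as an `asyncio.Event`. The function and
parameter names are hypothetical, and the real client is not written in Python.

```python
import asyncio


async def wait_for_clean_shutdown(
    ack_received: asyncio.Event,
    process_exited: asyncio.Event,
    timeout: float,
) -> bool:
    """Return True only if the ack AND the process exit arrive within the timeout."""
    try:
        # gather() completes only when both waits have completed.
        await asyncio.wait_for(
            asyncio.gather(ack_received.wait(), process_exited.wait()),
            timeout,
        )
    except asyncio.TimeoutError:
        return False
    return True


async def demo(ack_arrives: bool, process_exits: bool) -> bool:
    """Drive the helper with pre-set events (illustration only)."""
    ack, exited = asyncio.Event(), asyncio.Event()
    if ack_arrives:
        ack.set()
    if process_exits:
        exited.set()
    return await wait_for_clean_shutdown(ack, exited, timeout=0.05)
```

An acknowledgement without a process exit is therefore not reported as clean:
`asyncio.run(demo(True, False))` returns `False`, as does the reverse case.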

## Command Correlation

Each command gets:
@@ -321,6 +321,13 @@ If COM creation fails, the worker should send a structured fault with:
when the exception exposes one, and does not send `WorkerReady` after a failed
COM creation attempt.

After `WorkerReady`, `WorkerPipeSession` continues reading gateway frames for
the lifetime of the process. `WorkerCommand` frames are dispatched to
`MxAccessStaSession`, replies are written as `WorkerCommandReply`, and queued
worker events are drained after command replies. `WorkerShutdown` starts the
graceful shutdown path and returns `WorkerShutdownAck` only after the STA
cleanup path completes.
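
The frame handling order described above could look roughly like this. The
sketch models frames as `(kind, payload)` tuples and uses a hypothetical
`WorkerEvent` frame name for drained events; only `WorkerCommand`,
`WorkerCommandReply`, `WorkerShutdown`, and `WorkerShutdownAck` come from the
text.

```python
from collections import deque


def run_pipe_session(frames, execute, pending_events, cleanup):
    """Process gateway frames in order and return the frames written back."""
    written = []
    for kind, payload in frames:
        if kind == "WorkerCommand":
            # The reply is written first ...
            written.append(("WorkerCommandReply", execute(payload)))
            # ... then events queued in the meantime are drained.
            while pending_events:
                written.append(("WorkerEvent", pending_events.popleft()))
        elif kind == "WorkerShutdown":
            # The ack is written only after the cleanup path has completed.
            cleanup()
            written.append(("WorkerShutdownAck", None))
            break  # nothing is read after shutdown in this sketch
    return written


def demo():
    """Run one command followed by a shutdown (illustration only)."""
    events = deque()
    log = []

    def execute(payload):
        events.append("changed:" + payload)  # the command raises an event
        return payload.upper()

    frames = [
        ("WorkerCommand", "read"),
        ("WorkerShutdown", None),
        ("WorkerCommand", "late"),  # never processed
    ]
    written = run_pipe_session(frames, execute, events, lambda: log.append("cleanup"))
    return written, log
```

In `demo()` the reply precedes the drained event, and the cleanup callback has
already run by the time the `WorkerShutdownAck` frame is appended.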

## Event Sink

The worker must subscribe to every public MXAccess event family:
@@ -663,6 +670,29 @@ Graceful shutdown sequence:
If shutdown wedges, the gateway kills the process. The worker should be written
so process kill does not corrupt other sessions.

`MxAccessStaSession.ShutdownGracefullyAsync` implements the current cleanup
path. It first calls `StaCommandDispatcher.RequestShutdown()` so new commands
are rejected and queued commands that have not started receive
`ProtocolStatusCode.WorkerUnavailable`. The command already executing on the
STA is allowed to finish until the shutdown grace period expires.
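
A toy model of that dispatcher behavior, assuming a simple in-memory queue. The
class, its method names, and the `"Queued"` status are illustrative and do not
mirror `StaCommandDispatcher`'s actual API; the grace-period timer is not
modeled.

```python
from collections import deque

WORKER_UNAVAILABLE = "WorkerUnavailable"


class ToyDispatcher:
    """Hypothetical single-threaded command queue with a shutdown gate."""

    def __init__(self):
        self._queue = deque()
        self._accepting = True
        self.running = None  # name of the command currently executing, if any

    def submit(self, name):
        """Queue a command, or reject it once shutdown has been requested."""
        if not self._accepting:
            return WORKER_UNAVAILABLE
        self._queue.append(name)
        return "Queued"

    def start_next(self):
        """Move the oldest queued command into the executing slot."""
        self.running = self._queue.popleft()
        return self.running

    def request_shutdown(self):
        """Close the gate and fail every command that has not started yet."""
        self._accepting = False
        rejected = {name: WORKER_UNAVAILABLE for name in self._queue}
        self._queue.clear()
        return rejected  # the executing command is left alone to finish
```

If `a` is executing and `b`, `c` are queued when `request_shutdown()` is called,
`b` and `c` are rejected, `a` stays in the executing slot, and any later
`submit()` is rejected as well.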

After command dispatch is closed, cleanup runs on the STA in MXAccess handle
order:

1. one `UnAdvise` call per advised server/item pair,
2. `RemoveItem` for active item handles,
3. `Unregister` for active server handles,
4. event sink detach,
5. COM release.

Each cleanup call is best effort. A failed cleanup operation is recorded as an
`MxAccessShutdownFailure`, logged by `WorkerPipeSession`, and does not prevent
later cleanup calls from running. A shutdown with cleanup failures still returns
`WorkerShutdownAck` with `ProtocolStatusCode.Ok` because the worker reached the
controlled release path. If the grace period expires before cleanup can run or
finish, the worker reports `WorkerFaultCategory.ShutdownTimeout` when possible
and relies on the gateway to kill the process.
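
The ordered, best-effort cleanup above can be sketched as a loop that records
failures instead of aborting. The step callables and the tuple used to record a
failure are assumptions for illustration; the grace-period timeout path is not
modeled here.

```python
def run_cleanup(steps):
    """Run every cleanup step in order, recording failures instead of stopping."""
    failures = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:  # best effort: keep going after any failure
            failures.append((name, str(exc)))
    # Reaching the end of the list counts as a controlled release, so the
    # status stays "Ok" even when individual steps failed.
    return "Ok", failures


def demo():
    """Run the five steps with a failing RemoveItem (illustration only)."""
    ran = []

    def ok(name):
        return lambda: ran.append(name)

    def failing_remove_item():
        ran.append("RemoveItem")
        raise RuntimeError("item handle already invalid")

    steps = [
        ("UnAdvise", ok("UnAdvise")),
        ("RemoveItem", failing_remove_item),
        ("Unregister", ok("Unregister")),
        ("DetachEventSink", ok("DetachEventSink")),
        ("ReleaseCom", ok("ReleaseCom")),
    ]
    status, failures = run_cleanup(steps)
    return status, failures, ran
```

In `demo()` the failing `RemoveItem` step is recorded, the three later steps
still run in order, and the overall status remains `Ok`.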

## Fault Handling

Worker fault categories: