dc9c0c950c
Apply the ZB.MOM.WW. prefix to all gateway-side projects, folders,
.csproj/.sln contents, C# namespaces, using directives, generated proto
C# (csharp_namespace + checked-in generated files), InternalsVisibleTo
attributes, project-name string literals (LoadProject, .sln lookups,
worker exe paths, staticwebassets manifest), and the install/script/doc
references that point at any of the above. Migrate the solution from
.sln to .slnx via `dotnet sln migrate` and delete the old file.

External-runtime identifiers are intentionally NOT prefixed so external
configuration keeps working:

- GatewayMetrics.cs MeterName ("MxGateway.Server")
- DashboardAuthenticationDefaults Scheme/Policy ("MxGateway.Dashboard")
- GatewayRequestLoggingMiddleware logger category ("MxGateway.Request")
- StaRuntime thread name ("MxGateway.Worker.STA")
- appsettings.json root section "MxGateway" + env-var prefix
  MxGateway__... and secret-name MxGateway:ApiKeyPepper
- C:\ProgramData\MxGateway\ data dir paths

Also fixes two tests that were not rename-related but became visible
while validating the rename:

- WorkerLiveMxAccessSmokeTests.ShutDownAsync: cancellation that the
  gateway service correctly maps to RpcException(Cancelled) per gRPC
  convention was being misclassified as a stream fault. Added a sibling
  catch on RpcException with StatusCode.Cancelled.
- IntegrationTestEnvironment.ResolveRepositoryRoot: extracted IsRepositoryRoot
  and made it accept either a .git marker OR a .sln/.slnx next to src/
  so the worker-exe walker works in non-git working copies.

clients/proto/proto-inputs.json's protoRoot updated to point at
src/ZB.MOM.WW.MxGateway.Contracts/Protos.

Verified by `dotnet build` and a full `dotnet test` of the .slnx with
MXGATEWAY_RUN_LIVE_{MXACCESS,LDAP,GALAXY}_TESTS=1:

- Tests: 472/472 pass
- Worker.Tests: 280/280 pass (4 dev-rig [Fact(Skip=...)] skipped)
- IntegrationTests: 18/18 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

76 lines
2.8 KiB
Markdown
# Worker Process Launcher

The gateway uses `WorkerProcessLauncher` to validate and start one worker
process for a gateway session. The launcher owns process start semantics only;
pipe handshaking and `WorkerReady` validation remain part of the worker client
startup path.

## Launch Inputs

`WorkerProcessLaunchRequest` carries the per-session bootstrap values:

- `SessionId`,
- `PipeName`,
- `ProtocolVersion`,
- `Nonce`,
- optional `PipeReservation` cleanup handle.

The launcher passes `SessionId`, `PipeName`, and `ProtocolVersion` as command
line arguments:

```text
--session-id <sessionId> --pipe-name <pipeName> --protocol-version <version>
```

The launcher sets the nonce through the `MXGATEWAY_WORKER_NONCE` environment
variable. The nonce is not included in `WorkerProcessCommandLine` so logs and
diagnostics can report the launch command without exposing the secret.

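
The split between command-line arguments and the environment can be sketched
as follows. This is a minimal illustration, not the launcher's actual code:
`LaunchValues` and `WorkerStartInfoSketch` are hypothetical names, and treating
the protocol version as a string is an assumption. Only the argument names and
the `MXGATEWAY_WORKER_NONCE` variable come from the text above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-in for the bootstrap values listed above; the real
// WorkerProcessLaunchRequest may be shaped differently.
public sealed record LaunchValues(
    string SessionId, string PipeName, string ProtocolVersion, string Nonce);

public static class WorkerStartInfoSketch
{
    public const string NonceVariable = "MXGATEWAY_WORKER_NONCE";

    public static ProcessStartInfo Build(string workerPath, LaunchValues values)
    {
        var startInfo = new ProcessStartInfo(workerPath)
        {
            // Shell execution must be off for the Environment entries
            // below to be applied to the child process.
            UseShellExecute = false,
            CreateNoWindow = true,
        };

        // Only non-secret values go on the command line, so the command
        // can be logged as-is. ArgumentList handles quoting per argument.
        startInfo.ArgumentList.Add("--session-id");
        startInfo.ArgumentList.Add(values.SessionId);
        startInfo.ArgumentList.Add("--pipe-name");
        startInfo.ArgumentList.Add(values.PipeName);
        startInfo.ArgumentList.Add("--protocol-version");
        startInfo.ArgumentList.Add(values.ProtocolVersion);

        // The nonce travels in the child's environment instead.
        startInfo.Environment[NonceVariable] = values.Nonce;
        return startInfo;
    }
}
```
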
## Validation And Cleanup

Before starting the process, the launcher validates that the configured worker
path exists, has a `.exe` extension, contains a valid Windows Portable
Executable header, and matches the configured `RequiredArchitecture`.

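
As an illustration of what such a check involves, the sketch below reads the
COFF `Machine` field from a PE image. The offsets and machine values follow the
PE/COFF format; the class and method names are hypothetical, and how the
launcher maps `RequiredArchitecture` to a machine value is not shown in this
document, so the `requiredMachine` parameter is an assumption.

```csharp
using System;
using System.IO;
using System.Text;

public static class PeHeaderSketch
{
    // Machine values defined by the PE/COFF format.
    public const ushort MachineI386 = 0x014C;
    public const ushort MachineAmd64 = 0x8664;

    // Returns the COFF Machine field, or null when the stream does not
    // start with a well-formed DOS header followed by a PE signature.
    public static ushort? ReadMachine(Stream image)
    {
        if (image.Length < 0x40) return null;
        using var reader = new BinaryReader(image, Encoding.ASCII, leaveOpen: true);

        image.Position = 0;
        if (reader.ReadUInt16() != 0x5A4D) return null;      // "MZ"

        image.Position = 0x3C;
        int peOffset = reader.ReadInt32();                   // e_lfanew
        if (peOffset < 0 || (long)peOffset + 6 > image.Length) return null;

        image.Position = peOffset;
        if (reader.ReadUInt32() != 0x00004550) return null;  // "PE\0\0"
        return reader.ReadUInt16();                          // Machine
    }

    // Mirrors the four checks described above for a path on disk.
    public static bool LooksLikeWorker(string path, ushort requiredMachine)
    {
        if (!File.Exists(path)) return false;
        if (!string.Equals(Path.GetExtension(path), ".exe",
                StringComparison.OrdinalIgnoreCase)) return false;

        using var stream = File.OpenRead(path);
        return ReadMachine(stream) == requiredMachine;
    }
}
```
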
After the process starts, `IWorkerStartupProbe` waits for startup readiness.
The default probe only verifies that the worker did not exit immediately. The
worker client replaces this probe when pipe connection, hello, and
`WorkerReady` handling are implemented.

Startup probing uses a bounded Polly retry policy. The gateway starts the worker
process once, then retries only transient startup-probe failures while the
process remains alive. The policy is configured by
`WorkerOptions.StartupProbeRetryAttempts` and
`WorkerOptions.StartupProbeRetryDelayMilliseconds`; the retry counter is
recorded as `mxgateway.retries.attempted` with `area=worker_startup`.

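
The retry semantics described above (start once, retry only the probe, stop
when attempts run out or the process dies) can be sketched with a plain loop.
This deliberately uses a hand-rolled loop rather than Polly so the example
stays dependency-free. The delegate shapes, the class name, and the choice of
`TimeoutException` as the transient failure are assumptions made for
illustration. The returned retry count is what a caller could feed into a
counter such as `mxgateway.retries.attempted`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StartupProbeRetrySketch
{
    // Runs the probe until it succeeds and returns how many retries were
    // needed. The worker process itself is never restarted here.
    public static async Task<int> WaitForReadyAsync(
        Func<CancellationToken, Task> probe,   // hypothetical probe delegate
        Func<bool> processIsAlive,             // hypothetical liveness check
        int retryAttempts,
        TimeSpan retryDelay,
        CancellationToken cancellationToken)
    {
        int retries = 0;
        while (true)
        {
            try
            {
                await probe(cancellationToken);
                return retries;
            }
            // Retry only transient failures, only within the bound, and
            // only while the worker is still running. Anything else
            // propagates to the caller as a startup failure.
            catch (TimeoutException) when (retries < retryAttempts && processIsAlive())
            {
                retries++;
                await Task.Delay(retryDelay, cancellationToken);
            }
        }
    }
}
```
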
The launcher also passes
`MXGATEWAY_WORKER_PIPE_CONNECT_ATTEMPT_TIMEOUT_MS` to the worker process from
`WorkerOptions.PipeConnectAttemptTimeoutMilliseconds`. The worker uses that
value as the per-attempt named-pipe connect timeout inside its own bounded
Polly retry loop.

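
On the worker side, consuming that variable might look roughly like the sketch
below. Only the variable name comes from this document; the class name, the
fallback value, and the rule for rejecting malformed input are assumptions. The
surrounding retry loop is omitted, so the sketch covers a single connect
attempt.

```csharp
using System;
using System.Globalization;
using System.IO.Pipes;
using System.Threading;
using System.Threading.Tasks;

public static class PipeConnectTimeoutSketch
{
    public const string TimeoutVariable =
        "MXGATEWAY_WORKER_PIPE_CONNECT_ATTEMPT_TIMEOUT_MS";

    // Parses the raw environment value. Missing, malformed, or
    // non-positive input falls back to a caller-supplied default
    // (the fallback behavior is an assumption).
    public static int ReadTimeoutMilliseconds(string? rawValue, int fallbackMilliseconds)
    {
        bool ok = int.TryParse(
            rawValue, NumberStyles.None, CultureInfo.InvariantCulture, out int parsed);
        return ok && parsed > 0 ? parsed : fallbackMilliseconds;
    }

    // One connect attempt bounded by the per-attempt timeout.
    // ConnectAsync throws TimeoutException when the timeout elapses,
    // which an outer retry loop could treat as transient.
    public static async Task<NamedPipeClientStream> ConnectOnceAsync(
        string pipeName, int timeoutMilliseconds, CancellationToken cancellationToken)
    {
        var pipe = new NamedPipeClientStream(
            ".", pipeName, PipeDirection.InOut, PipeOptions.Asynchronous);
        try
        {
            await pipe.ConnectAsync(timeoutMilliseconds, cancellationToken);
            return pipe;
        }
        catch
        {
            pipe.Dispose();
            throw;
        }
    }
}
```
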
If startup fails or exceeds `WorkerOptions.StartupTimeoutSeconds`, the launcher
kills the worker process tree, disposes the process handle, disposes the
optional pipe reservation, records a worker kill metric, and reports a
`WorkerProcessLaunchException`.

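
The cleanup order above can be sketched as follows. The class name and the
`recordWorkerKill` callback are hypothetical, and recording the metric
unconditionally is an assumption. The caller is expected to report the launch
failure afterwards, which is left out here.

```csharp
using System;
using System.Diagnostics;

public static class FailedLaunchCleanupSketch
{
    public static void CleanUp(
        Process worker, IDisposable? pipeReservation, Action recordWorkerKill)
    {
        try
        {
            // Kill the worker and any children it spawned.
            if (!worker.HasExited)
            {
                worker.Kill(entireProcessTree: true);
            }
        }
        catch (InvalidOperationException)
        {
            // No process is associated with the handle, or it exited
            // between the check and the kill; nothing left to terminate.
        }

        // Release the handle and the optional reservation whether or not
        // the kill was needed, then record the metric.
        worker.Dispose();
        pipeReservation?.Dispose();
        recordWorkerKill();
    }
}
```
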
## Verification

Run the focused launcher tests after changing process launch behavior:

```bash
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter WorkerProcessLauncherTests
```

Run the gateway build because the launcher is part of `ZB.MOM.WW.MxGateway.Server`:

```bash
dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj
```

## Related Documentation

- [Gateway Process Detailed Design](./GatewayProcessDesign.md)
- [Worker Frame Protocol](./WorkerFrameProtocol.md)