mxaccessgw/docs/WorkerProcessLauncher.md

Worker Process Launcher

The gateway uses WorkerProcessLauncher to validate and start one worker process for a gateway session. The launcher owns process start semantics only; pipe handshaking and WorkerReady validation remain part of the worker client startup path.

Launch Inputs

WorkerProcessLaunchRequest carries the per-session bootstrap values:

  • SessionId,
  • PipeName,
  • ProtocolVersion,
  • Nonce,
  • optional PipeReservation cleanup handle.

The launcher passes SessionId, PipeName, and ProtocolVersion as command line arguments:

--session-id <sessionId> --pipe-name <pipeName> --protocol-version <version>

The launcher sets the nonce through the MXGATEWAY_WORKER_NONCE environment variable. The nonce is not included in WorkerProcessCommandLine so logs and diagnostics can report the launch command without exposing the secret.
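The split between command-line arguments and the environment can be sketched as follows. This is a minimal illustration, not the launcher's actual code: the helper name `BuildStartInfo`, the string type for the protocol version, and the sample values are assumptions; only the three argument names and `MXGATEWAY_WORKER_NONCE` come from this document.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: maps the per-session bootstrap values onto a ProcessStartInfo.
static ProcessStartInfo BuildStartInfo(
    string workerPath, string sessionId, string pipeName, string protocolVersion, string nonce)
{
    var startInfo = new ProcessStartInfo(workerPath)
    {
        UseShellExecute = false, // needed so the Environment override below is applied
        CreateNoWindow = true,
    };

    // Non-secret values travel on the command line, so they are safe to log.
    startInfo.ArgumentList.Add("--session-id");
    startInfo.ArgumentList.Add(sessionId);
    startInfo.ArgumentList.Add("--pipe-name");
    startInfo.ArgumentList.Add(pipeName);
    startInfo.ArgumentList.Add("--protocol-version");
    startInfo.ArgumentList.Add(protocolVersion);

    // The nonce goes only into the child's environment, never into the argument list.
    startInfo.Environment["MXGATEWAY_WORKER_NONCE"] = nonce;
    return startInfo;
}

// Sample values are placeholders for illustration only.
var info = BuildStartInfo(@"C:\path\to\worker.exe", "session-1", "pipe-1", "1", "secret-nonce");
Console.WriteLine(string.Join(" ", info.ArgumentList));
```

Because the nonce never enters `ArgumentList`, anything that prints the argument list, as the last line does, cannot leak it.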

Validation And Cleanup

Before starting the process, the launcher validates that the configured worker path exists, has a .exe extension, contains a valid Windows Portable Executable header, and matches the configured RequiredArchitecture.
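One way to perform these four checks is with `System.Reflection.PortableExecutable`. The sketch below is an assumption about how such a check could look, not the launcher's implementation: `ValidateWorkerExecutable` is a hypothetical name, and the use of `Machine` to represent RequiredArchitecture is a guess at the mapping.

```csharp
using System;
using System.IO;
using System.Reflection.PortableExecutable;

// Hypothetical helper: returns null when the file passes, otherwise a failure reason.
static string? ValidateWorkerExecutable(string path, Machine requiredMachine)
{
    if (!File.Exists(path))
        return "worker executable not found";

    if (!string.Equals(Path.GetExtension(path), ".exe", StringComparison.OrdinalIgnoreCase))
        return "worker path must have a .exe extension";

    try
    {
        using var stream = File.OpenRead(path);
        using var reader = new PEReader(stream);
        var headers = reader.PEHeaders;

        // PEHeaders also accepts bare COFF object files; those have no PE optional header.
        if (headers.PEHeader is null)
            return "file is not a valid Portable Executable";

        if (headers.CoffHeader.Machine != requiredMachine)
            return $"expected {requiredMachine} but found {headers.CoffHeader.Machine}";
    }
    catch (Exception ex) when (ex is BadImageFormatException or IOException)
    {
        return "file is not a valid Portable Executable";
    }

    return null;
}

Console.WriteLine(ValidateWorkerExecutable(@"C:\does\not\exist\worker.exe", Machine.I386) ?? "ok");
```

Returning a reason string rather than throwing keeps the sketch short; a real launcher would more likely surface the failure through its own exception type.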

After the process starts, IWorkerStartupProbe waits for startup readiness. The default probe only verifies that the worker did not exit immediately. The worker client replaces this probe when pipe connection, hello, and WorkerReady handling are implemented.

Startup probing uses a bounded Polly retry policy. The gateway starts the worker process once, then retries only transient startup-probe failures while the process remains alive. The policy is configured by WorkerOptions.StartupProbeRetryAttempts and WorkerOptions.StartupProbeRetryDelayMilliseconds; the retry counter is recorded as mxgateway.retries.attempted with area=worker_startup.
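The retry behavior described above can be approximated with a plain loop. This sketch deliberately does not use Polly, so it shows only the shape of the policy: the process is started once elsewhere, and only the probe is retried while the process stays alive. `ProbeWithRetryAsync` and its parameters are hypothetical names; `maxRetries` and `delay` stand in for the two WorkerOptions settings.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: retries a probe up to maxRetries times while the process is alive.
// A probe result of false stands in for a transient startup-probe failure.
static async Task<(bool Ready, int RetriesUsed)> ProbeWithRetryAsync(
    Func<Task<bool>> probeOnce, Func<bool> processIsAlive, int maxRetries, TimeSpan delay)
{
    var retries = 0;
    while (true)
    {
        // Retrying is pointless once the worker has exited.
        if (!processIsAlive()) return (false, retries);

        if (await probeOnce()) return (true, retries);

        // The retry budget is bounded; give up once it is spent.
        if (retries >= maxRetries) return (false, retries);

        retries++; // this is the point where a retry counter metric would be incremented
        await Task.Delay(delay);
    }
}

// Simulated probe that fails twice and succeeds on the third call.
var calls = 0;
var outcome = await ProbeWithRetryAsync(
    probeOnce: () => Task.FromResult(++calls >= 3),
    processIsAlive: () => true,
    maxRetries: 3,
    delay: TimeSpan.FromMilliseconds(10));

Console.WriteLine($"ready={outcome.Ready} retries={outcome.RetriesUsed}"); // ready=True retries=2
```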

The launcher also passes MXGATEWAY_WORKER_PIPE_CONNECT_ATTEMPT_TIMEOUT_MS to the worker process from WorkerOptions.PipeConnectAttemptTimeoutMilliseconds. The worker uses that value as the per-attempt named-pipe connect timeout inside its own bounded Polly retry loop.
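On the worker side, reading that variable might look like the sketch below. The helper name `ReadPipeConnectAttemptTimeout` and the fallback behavior for missing or malformed values are assumptions; this document does not say what the worker does when the variable is absent.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: parses the per-attempt timeout, falling back when the value is unusable.
static TimeSpan ReadPipeConnectAttemptTimeout(string? rawValue, TimeSpan fallback)
{
    // NumberStyles.None accepts plain digits only: no sign, whitespace, or separators.
    var parsed = int.TryParse(rawValue, NumberStyles.None, CultureInfo.InvariantCulture, out var milliseconds);
    return parsed && milliseconds > 0
        ? TimeSpan.FromMilliseconds(milliseconds)
        : fallback;
}

var fallbackTimeout = TimeSpan.FromSeconds(5); // assumed default, not taken from the gateway
var raw = Environment.GetEnvironmentVariable("MXGATEWAY_WORKER_PIPE_CONNECT_ATTEMPT_TIMEOUT_MS");
var attemptTimeout = ReadPipeConnectAttemptTimeout(raw, fallbackTimeout);
Console.WriteLine($"per-attempt pipe connect timeout: {attemptTimeout.TotalMilliseconds} ms");
```

The resulting value would then bound each individual connect attempt, for example as the timeout argument to `NamedPipeClientStream.ConnectAsync`, while the surrounding retry loop decides how many attempts to make.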

If startup fails or exceeds WorkerOptions.StartupTimeoutSeconds, the launcher kills the worker process tree, disposes the process handle, disposes the optional pipe reservation, records a worker kill metric, and reports a WorkerProcessLaunchException.
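The failure path can be sketched as a timeout-bounded wait followed by ordered cleanup. Everything named here is hypothetical: `TryStartWithinAsync` and the cleanup delegates are placeholders so the example runs without a real worker, and it returns false where the launcher would report a WorkerProcessLaunchException. In real code the first cleanup step would typically be `process.Kill(entireProcessTree: true)`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: waits for startup within a deadline and runs every cleanup step on failure.
static async Task<bool> TryStartWithinAsync(
    Func<CancellationToken, Task> waitForStartup, TimeSpan startupTimeout, IReadOnlyList<Action> cleanupSteps)
{
    // The token is cancelled automatically once the startup timeout elapses.
    using var timeoutSource = new CancellationTokenSource(startupTimeout);
    try
    {
        await waitForStartup(timeoutSource.Token);
        return true;
    }
    catch (Exception)
    {
        foreach (var step in cleanupSteps)
        {
            // One failing step must not prevent the remaining resources from being released.
            try { step(); } catch (Exception) { }
        }
        return false;
    }
}

var log = new List<string>();
var started = await TryStartWithinAsync(
    waitForStartup: token => Task.Delay(Timeout.Infinite, token), // simulates a worker that never becomes ready
    startupTimeout: TimeSpan.FromMilliseconds(50),
    cleanupSteps: new Action[]
    {
        () => log.Add("kill-tree"),
        () => log.Add("dispose-process"),
        () => log.Add("dispose-pipe-reservation"),
        () => log.Add("record-kill-metric"),
    });

Console.WriteLine($"started={started}; cleanup={string.Join(",", log)}");
```

Running each step inside its own try block reflects a design choice rather than documented behavior: it ensures the pipe reservation is still released even if killing the process throws.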

Verification

Run the focused launcher tests after changing process launch behavior:

dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter WorkerProcessLauncherTests

Run the gateway build because the launcher is part of ZB.MOM.WW.MxGateway.Server:

dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj