Joseph Doherty e541339c07 docs(audit): apply per-cluster judgment fixes across living docs
Resolve audit findings: correct WorkerEnvelope proto/route/metric/session
facts; rewrite auth (ZB.MOM.WW.Auth migration), dashboard (ZB.MOM.WW.Theme),
and StyleGuide (foreign-project copy-paste); document alarm subsystem, Ldap
options, and gateway alarm broker; fix client CLI flags and package paths.
2026-06-03 16:01:28 -04:00

# Worker Bootstrap
The bootstrap layer parses the command-line arguments and environment variables passed to the `ZB.MOM.WW.MxGateway.Worker` process, validates them against the gateway contract, and produces either a populated `WorkerOptions` instance or a structured failure that maps to a `WorkerExitCode`.
## Overview
The worker is a child process of the gateway: one worker is launched per session and lives for that session's lifetime. The gateway side of this contract is documented in [WorkerProcessLauncher](./WorkerProcessLauncher.md). On the worker side, `Program.cs` contains a single statement that delegates to `WorkerApplication.Run(args)`:
```csharp
using ZB.MOM.WW.MxGateway.Worker;
return WorkerApplication.Run(args);
```
`WorkerApplication.Run` constructs the bootstrap dependencies (`EnvironmentVariableWorkerEnvironment`, `WorkerConsoleLogger` writing to `Console.Error`, and a `WorkerPipeClient`), runs `WorkerOptionsParser`, and routes the resulting `WorkerBootstrapResult` either into the pipe client or into a non-zero exit. Splitting parsing from process wiring lets tests substitute fakes for the environment, logger, and pipe client without spawning a child process.
## Worker Options
`WorkerOptions` is the validated input contract for a worker session. The gateway hands every field to the worker; the worker never reads configuration files.
```csharp
public sealed class WorkerOptions
{
    public const string NonceEnvironmentVariableName = "MXGATEWAY_WORKER_NONCE";

    public WorkerOptions(
        string sessionId,
        string pipeName,
        uint protocolVersion,
        string nonce)
    {
        SessionId = sessionId;
        PipeName = pipeName;
        ProtocolVersion = protocolVersion;
        Nonce = nonce;
    }

    public string SessionId { get; }
    public string PipeName { get; }
    public uint ProtocolVersion { get; }
    public string Nonce { get; }
}
```
### Required inputs
All four fields are required. Three arrive on the command line and one arrives via environment variable:

| Source | Name | Maps to |
|--------|------|---------|
| Argument | `--session-id` | `SessionId` |
| Argument | `--pipe-name` | `PipeName` |
| Argument | `--protocol-version` | `ProtocolVersion` |
| Env var | `MXGATEWAY_WORKER_NONCE` | `Nonce` |

The nonce travels via environment variable rather than an argument because process command lines are visible to other users on Windows through `wmic`, `Get-CimInstance Win32_Process`, and the kernel object table; environment variables of another process are not. Treating the nonce as a credential keeps it off the command line.
Every option is required; none is optional. An unknown flag, a flag with no following argument, or a flag whose following argument starts with `--` is reported as an error rather than silently ignored.
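The split between arguments and environment can be illustrated from the launching side. The following is a minimal sketch, assuming the gateway builds a `System.Diagnostics.ProcessStartInfo`; the `WorkerLaunchSketch` class, its `BuildStartInfo` method, and the worker path are hypothetical names for illustration, and the actual launcher is described in [WorkerProcessLauncher](./WorkerProcessLauncher.md).

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

// Hypothetical helper for illustration; not the gateway's actual launcher code.
public static class WorkerLaunchSketch
{
    public const string NonceEnvironmentVariableName = "MXGATEWAY_WORKER_NONCE";

    public static ProcessStartInfo BuildStartInfo(
        string workerPath,
        string sessionId,
        string pipeName,
        uint protocolVersion,
        string nonce)
    {
        ProcessStartInfo startInfo = new(workerPath)
        {
            // Per-process environment variables require UseShellExecute = false.
            UseShellExecute = false,
            CreateNoWindow = true,
        };

        // Non-secret inputs go on the command line.
        startInfo.ArgumentList.Add("--session-id");
        startInfo.ArgumentList.Add(sessionId);
        startInfo.ArgumentList.Add("--pipe-name");
        startInfo.ArgumentList.Add(pipeName);
        startInfo.ArgumentList.Add("--protocol-version");
        startInfo.ArgumentList.Add(protocolVersion.ToString(CultureInfo.InvariantCulture));

        // The nonce is a credential, so it travels in the child's environment only.
        startInfo.Environment[NonceEnvironmentVariableName] = nonce;
        return startInfo;
    }
}
```

The sketch only builds the start info and does not start a process, so it can be exercised in a test without a worker binary on disk.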
## The Parser
`WorkerOptionsParser` walks `args` once, collects values into a case-insensitive dictionary, and accumulates errors so the caller sees every problem in a single failure rather than fixing them one at a time.
```csharp
for (int index = 0; index < args.Length; index++)
{
    string arg = args[index];
    if (!IsKnownOption(arg))
    {
        errors.Add($"Unknown option '{arg}'.");
        continue;
    }

    if (index + 1 >= args.Length || args[index + 1].StartsWith("--", StringComparison.Ordinal))
    {
        errors.Add($"Option '{arg}' requires a value.");
        continue;
    }

    values[arg] = args[index + 1];
    index++;
}
```
After argument scanning, the parser cross-checks the protocol version against `GatewayContractInfo.WorkerProtocolVersion`. A version that parses as a `uint` but does not match the contract value is a hard failure with `WorkerExitCode.InvalidProtocolVersion`, separate from `InvalidArguments`, so the gateway can distinguish a malformed launch from a version mismatch and report a useful upgrade message.
The nonce is read last so that argument-shape errors are reported before the parser asks the environment for a secret it might not need.
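The ordering of the two post-scan checks can be sketched as follows. This is an illustration under stated assumptions, not the parser's actual code: `PostScanChecksSketch`, `SketchExitCode`, and the `ContractProtocolVersion` value of `1` are hypothetical stand-ins (the real value of `GatewayContractInfo.WorkerProtocolVersion` is not shown in this document), and the use of `NumberStyles.None` is a guess at how strictly the version text is parsed.

```csharp
#nullable enable
using System;
using System.Globalization;

// Trimmed stand-in for WorkerExitCode; numeric values follow the Exit Codes table.
public enum SketchExitCode
{
    Success = 0,
    InvalidProtocolVersion = 3,
    MissingNonce = 4,
}

public static class PostScanChecksSketch
{
    // Hypothetical stand-in for GatewayContractInfo.WorkerProtocolVersion.
    public const uint ContractProtocolVersion = 1;

    public static SketchExitCode Check(
        string protocolVersionText,
        Func<string, string?> readEnvironmentVariable)
    {
        // A value that is not a uint, or that differs from the contract, is a version failure.
        bool parsed = uint.TryParse(
            protocolVersionText,
            NumberStyles.None,
            CultureInfo.InvariantCulture,
            out uint version);
        if (!parsed || version != ContractProtocolVersion)
        {
            return SketchExitCode.InvalidProtocolVersion;
        }

        // The environment is consulted only after the arguments have passed validation.
        string? nonce = readEnvironmentVariable("MXGATEWAY_WORKER_NONCE");
        return string.IsNullOrWhiteSpace(nonce)
            ? SketchExitCode.MissingNonce
            : SketchExitCode.Success;
    }
}
```

Passing the environment lookup as a delegate makes the ordering observable in a test: when the version check fails, the delegate is never invoked.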
## Bootstrap Result
`WorkerBootstrapResult` is a discriminated success/failure carrier. `Options` is non-null when `Succeeded` is true; `Errors` is populated only on failure.
```csharp
public static WorkerBootstrapResult Success(WorkerOptions options)
{
    return new WorkerBootstrapResult(WorkerExitCode.Success, options, []);
}

public static WorkerBootstrapResult Failure(WorkerExitCode exitCode, IEnumerable<string> errors)
{
    return new WorkerBootstrapResult(exitCode, null, errors.ToArray());
}
```
`Succeeded` is defined as `ExitCode == WorkerExitCode.Success` rather than as a separate flag, so the exit code and the success state cannot disagree.
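A self-contained sketch of this pattern is shown below. `BootstrapResultSketch<TOptions>` and `ResultExitCode` are hypothetical stand-ins for `WorkerBootstrapResult` and `WorkerExitCode`; the private constructor and the guard in `Failure` are assumptions added for illustration and are not confirmed by the source.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed stand-in for WorkerExitCode; only two members are needed here.
public enum ResultExitCode
{
    Success = 0,
    InvalidArguments = 2,
}

// Hypothetical stand-in for WorkerBootstrapResult, generic over the options type.
public sealed class BootstrapResultSketch<TOptions>
    where TOptions : class
{
    private BootstrapResultSketch(
        ResultExitCode exitCode,
        TOptions? options,
        IReadOnlyList<string> errors)
    {
        ExitCode = exitCode;
        Options = options;
        Errors = errors;
    }

    public ResultExitCode ExitCode { get; }
    public TOptions? Options { get; }
    public IReadOnlyList<string> Errors { get; }

    // Derived from ExitCode, so the two can never disagree.
    public bool Succeeded => ExitCode == ResultExitCode.Success;

    public static BootstrapResultSketch<TOptions> Success(TOptions options) =>
        new(ResultExitCode.Success, options, Array.Empty<string>());

    public static BootstrapResultSketch<TOptions> Failure(
        ResultExitCode exitCode,
        IEnumerable<string> errors)
    {
        // Assumption: reject a "failure" that carries the success code.
        if (exitCode == ResultExitCode.Success)
        {
            throw new ArgumentException("A failure needs a non-success exit code.", nameof(exitCode));
        }

        return new(exitCode, null, errors.ToArray());
    }
}
```

Because the only ways to build an instance are the two factories, a caller cannot produce a result whose `Options` is set while `Errors` is also populated.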
## Exit Codes
`WorkerExitCode` is the worker process's exit contract. The gateway-side launcher decodes these values to decide whether a relaunch is safe.

| Value | Numeric | Produced when |
|-------|---------|---------------|
| `Success` | 0 | The pipe session ran to a clean close. |
| `UnexpectedFailure` | 1 | Any unhandled exception not matched by a more specific catch. |
| `InvalidArguments` | 2 | One or more `--session-id`, `--pipe-name`, or `--protocol-version` errors (missing, empty, unknown flag, or no value). |
| `InvalidProtocolVersion` | 3 | `--protocol-version` is not a `uint` or does not equal `GatewayContractInfo.WorkerProtocolVersion`. |
| `MissingNonce` | 4 | The `MXGATEWAY_WORKER_NONCE` environment variable is null, empty, or whitespace. |
| `PipeConnectionFailed` | 5 | An `IOException` or `TimeoutException` escapes the pipe client. |
| `ProtocolViolation` | 6 | A `WorkerFrameProtocolException` escapes the pipe client. |

`InvalidArguments`, `InvalidProtocolVersion`, and `MissingNonce` originate in the parser; the others originate in `WorkerApplication.Run`'s `try/catch` around the pipe client.
## Environment Abstraction
`IWorkerEnvironment` exists so tests can supply a fake nonce without mutating the real process environment, which would be a shared mutable global across parallel test runs.
```csharp
public interface IWorkerEnvironment
{
    string? GetEnvironmentVariable(string name);
}

public sealed class EnvironmentVariableWorkerEnvironment : IWorkerEnvironment
{
    public string? GetEnvironmentVariable(string name)
    {
        return Environment.GetEnvironmentVariable(name);
    }
}
```
The production binding in `WorkerApplication.Run(string[])` is `EnvironmentVariableWorkerEnvironment`, which is a thin pass-through to `System.Environment.GetEnvironmentVariable`.
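A test double for this interface might look like the sketch below. `FakeWorkerEnvironment` and its `Set` method are hypothetical; the project's own tests may use a different fake. The interface is repeated so the block compiles on its own.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public interface IWorkerEnvironment
{
    string? GetEnvironmentVariable(string name);
}

// Hypothetical in-memory fake; it never touches the real process environment.
public sealed class FakeWorkerEnvironment : IWorkerEnvironment
{
    // Windows treats environment variable names as case-insensitive, so the fake does too.
    private readonly Dictionary<string, string?> _variables =
        new(StringComparer.OrdinalIgnoreCase);

    // Returns the same instance so calls can be chained in test setup.
    public FakeWorkerEnvironment Set(string name, string? value)
    {
        _variables[name] = value;
        return this;
    }

    public string? GetEnvironmentVariable(string name)
    {
        return _variables.TryGetValue(name, out string? value) ? value : null;
    }
}
```

Each test constructs its own instance, so parallel tests that need different nonce values do not interfere with one another.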
## Logging
The worker writes structured key/value lines to standard error. The launcher does not redirect either stream (`WorkerProcessLauncher` sets `UseShellExecute=false` and `CreateNoWindow=true` but leaves stdout and stderr inherited), so log output lands on the inherited console rather than a pipe the gateway reads. Standard error is used rather than standard output so that diagnostic logging stays clear of stdout, keeping that stream free for any future stdout-based channel.
### The logger contract
`IWorkerLogger` exposes only `Information` and `Error`. There is no `Debug` or `Trace` level, because the worker is launched per session and verbose tracing belongs to the gateway-side launcher.
```csharp
public interface IWorkerLogger
{
    void Information(string eventName, IReadOnlyDictionary<string, object?> fields);
    void Error(string eventName, IReadOnlyDictionary<string, object?> fields);
}
```
`WorkerConsoleLogger` formats each call as `level=<Level> event=<EventName> key=value key=value` after running the field dictionary through `WorkerLogRedactor`:
```csharp
private void Write(
    string level,
    string eventName,
    IReadOnlyDictionary<string, object?> fields)
{
    Dictionary<string, object?> redactedFields = WorkerLogRedactor.RedactFields(fields);
    string fieldText = string.Join(
        " ",
        redactedFields.Select(field => $"{field.Key}={FormatValue(field.Value)}"));
    _writer.WriteLine($"level={level} event={eventName} {fieldText}".TrimEnd());
}
```
### What the redactor redacts and why
The "Security" section of `gateway.md` requires that the worker never log raw credential values for `AuthenticateUser`, `WriteSecured`, or related secured operations. The bootstrap nonce is also a credential: anyone who reads it can impersonate the worker to the gateway pipe. `WorkerLogRedactor` enforces both rules by replacing the value of any field whose name contains one of these substrings (case-insensitive) with the literal `[redacted]`:
```csharp
private static readonly string[] SensitiveFieldNameParts =
[
    "nonce",
    "secret",
    "password",
    "token",
    "credential",
    "apikey",
    "api_key",
];
```
The match is on substrings of the field name rather than an exact list, so a field called `auth_token` or `user_password` is redacted automatically without each call site having to remember to opt in. `null` values pass through unchanged so the absence of a value is still visible in logs.
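The matching rule described above can be sketched as follows. `LogRedactorSketch` is a hypothetical stand-in for `WorkerLogRedactor`; only the substring list, the case-insensitive match, the `[redacted]` literal, and the `null` pass-through come from this document, and the rest of the method body is an assumption.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for WorkerLogRedactor.
public static class LogRedactorSketch
{
    private static readonly string[] SensitiveFieldNameParts =
    {
        "nonce", "secret", "password", "token", "credential", "apikey", "api_key",
    };

    public static Dictionary<string, object?> RedactFields(
        IReadOnlyDictionary<string, object?> fields)
    {
        Dictionary<string, object?> redacted = new(fields.Count);
        foreach (KeyValuePair<string, object?> field in fields)
        {
            // Null values pass through so a missing value stays visible in logs.
            bool hide = field.Value is not null && IsSensitive(field.Key);
            redacted[field.Key] = hide ? "[redacted]" : field.Value;
        }

        return redacted;
    }

    // True when the field name contains any sensitive substring, ignoring case.
    private static bool IsSensitive(string fieldName)
    {
        return SensitiveFieldNameParts.Any(
            part => fieldName.Contains(part, StringComparison.OrdinalIgnoreCase));
    }
}
```

Substring matching errs on the side of hiding too much: a harmless field such as `token_count` would also be redacted under this rule.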
## How `Program.cs` Consumes The Result
`WorkerApplication.Run` is the single consumer of `WorkerBootstrapResult`. On failure it logs a `WorkerBootstrapFailed` event and returns the numeric `ExitCode` directly:
```csharp
WorkerOptionsParser parser = new(environment);
WorkerBootstrapResult result = parser.Parse(args);
if (!result.Succeeded)
{
    logger.Error("WorkerBootstrapFailed", new Dictionary<string, object?>
    {
        ["exit_code"] = result.ExitCode,
        ["errors"] = string.Join(";", result.Errors),
    });
    return (int)result.ExitCode;
}
```
On success it logs `WorkerBootstrapSucceeded` with the session fields (the `nonce` field is redacted by `WorkerLogRedactor` because of its name), hands the `WorkerOptions` to `IWorkerPipeClient.RunAsync`, and waits synchronously. The `try/catch` around the pipe call maps `WorkerFrameProtocolException` to `ProtocolViolation`, `IOException`/`TimeoutException` to `PipeConnectionFailed`, and any other exception to `UnexpectedFailure`, so every code path through `Run` returns one of the values in `WorkerExitCode`.
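The exception-to-exit-code mapping can be sketched in isolation. This is a simplified illustration, not the body of `WorkerApplication.Run`: `PipeRunSketch` is hypothetical, the session is modelled as a synchronous `Action` rather than a call to `IWorkerPipeClient.RunAsync`, and `WorkerFrameProtocolException` is redeclared as a local stand-in whose real base class is not shown in this document. The enum values follow the Exit Codes table.

```csharp
using System;
using System.IO;

public enum WorkerExitCode
{
    Success = 0,
    UnexpectedFailure = 1,
    InvalidArguments = 2,
    InvalidProtocolVersion = 3,
    MissingNonce = 4,
    PipeConnectionFailed = 5,
    ProtocolViolation = 6,
}

// Local stand-in for the project's exception type.
public sealed class WorkerFrameProtocolException : Exception
{
    public WorkerFrameProtocolException(string message)
        : base(message)
    {
    }
}

public static class PipeRunSketch
{
    // Runs the session and converts every outcome into a WorkerExitCode value.
    public static int Run(Action runSession)
    {
        try
        {
            runSession();
            return (int)WorkerExitCode.Success;
        }
        catch (WorkerFrameProtocolException)
        {
            return (int)WorkerExitCode.ProtocolViolation;
        }
        catch (IOException)
        {
            return (int)WorkerExitCode.PipeConnectionFailed;
        }
        catch (TimeoutException)
        {
            return (int)WorkerExitCode.PipeConnectionFailed;
        }
        catch (Exception)
        {
            // Catch-all, so no exception escapes without an exit code.
            return (int)WorkerExitCode.UnexpectedFailure;
        }
    }
}
```

One detail the sketch sidesteps: blocking on a `Task` with `Wait()` or `Result` wraps failures in `AggregateException`, whereas `GetAwaiter().GetResult()` rethrows the original exception, so typed `catch` clauses like these only match in the latter case.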
## Related Documentation
- [Worker Process Launcher](./WorkerProcessLauncher.md)
- [Worker STA](./WorkerSta.md)
- [Worker Frame Protocol](./WorkerFrameProtocol.md)