Joseph Doherty e541339c07 docs(audit): apply per-cluster judgment fixes across living docs
Resolve audit findings: correct WorkerEnvelope proto/route/metric/session
facts; rewrite auth (ZB.MOM.WW.Auth migration), dashboard (ZB.MOM.WW.Theme),
and StyleGuide (foreign-project copy-paste); document alarm subsystem, Ldap
options, and gateway alarm broker; fix client CLI flags and package paths.
2026-06-03 16:01:28 -04:00

Worker Bootstrap

The bootstrap layer parses the command-line arguments and environment variables passed to the ZB.MOM.WW.MxGateway.Worker process, validates them against the gateway contract, and produces either a populated WorkerOptions instance or a structured failure that maps to a WorkerExitCode.

Overview

The worker process is a per-session child process of the gateway: one worker is launched per session and lives for that session's lifetime. The gateway side of this contract lives in WorkerProcessLauncher. On the worker side, Program.cs contains a single statement that delegates to WorkerApplication.Run(args):

using ZB.MOM.WW.MxGateway.Worker;

return WorkerApplication.Run(args);

WorkerApplication.Run constructs the bootstrap dependencies (EnvironmentVariableWorkerEnvironment, WorkerConsoleLogger writing to Console.Error, and a WorkerPipeClient), runs WorkerOptionsParser, and routes the resulting WorkerBootstrapResult either into the pipe client or into a non-zero exit. Splitting parsing from process wiring lets tests substitute fakes for the environment, logger, and pipe client without spawning a child process.

Worker Options

WorkerOptions is the validated input contract for a worker session. The gateway hands every field to the worker; the worker never reads configuration files.

public sealed class WorkerOptions
{
    public const string NonceEnvironmentVariableName = "MXGATEWAY_WORKER_NONCE";

    public WorkerOptions(
        string sessionId,
        string pipeName,
        uint protocolVersion,
        string nonce)
    {
        SessionId = sessionId;
        PipeName = pipeName;
        ProtocolVersion = protocolVersion;
        Nonce = nonce;
    }

    public string SessionId { get; }
    public string PipeName { get; }
    public uint ProtocolVersion { get; }
    public string Nonce { get; }
}

Required inputs

All four fields are required. Three arrive on the command line and one arrives via environment variable:

| Source | Name | Maps to |
| --- | --- | --- |
| Argument | --session-id | SessionId |
| Argument | --pipe-name | PipeName |
| Argument | --protocol-version | ProtocolVersion |
| Env var | MXGATEWAY_WORKER_NONCE | Nonce |

The nonce travels via environment variable rather than an argument because process command lines are visible to other users on Windows through tools such as wmic and Get-CimInstance Win32_Process, whereas the environment variables of another process are not exposed through those interfaces. Treating the nonce as a credential keeps it off the command line.

The parser defines no optional flags. An unknown flag, a flag without a value, or a flag whose value starts with -- is reported as an error rather than silently ignored.
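For illustration, a launcher could assemble these four inputs as follows. This is a minimal sketch and not the actual WorkerProcessLauncher: the WorkerLaunchSketch class, the BuildStartInfo method, and its parameters are hypothetical, and it assumes a modern .NET runtime where ProcessStartInfo.ArgumentList is available. Only the three flag names and the MXGATEWAY_WORKER_NONCE variable come from the contract above.

```csharp
using System.Diagnostics;
using System.Globalization;

// Hypothetical helper for illustration; the real launcher is WorkerProcessLauncher.
public static class WorkerLaunchSketch
{
    public static ProcessStartInfo BuildStartInfo(
        string workerPath,
        string sessionId,
        string pipeName,
        uint protocolVersion,
        string nonce)
    {
        ProcessStartInfo startInfo = new(workerPath)
        {
            // Per-process environment overrides require UseShellExecute = false.
            UseShellExecute = false,
            CreateNoWindow = true,
        };

        // The three non-secret inputs travel on the command line.
        startInfo.ArgumentList.Add("--session-id");
        startInfo.ArgumentList.Add(sessionId);
        startInfo.ArgumentList.Add("--pipe-name");
        startInfo.ArgumentList.Add(pipeName);
        startInfo.ArgumentList.Add("--protocol-version");
        startInfo.ArgumentList.Add(protocolVersion.ToString(CultureInfo.InvariantCulture));

        // The nonce is a credential, so it goes into the child's environment only.
        startInfo.Environment["MXGATEWAY_WORKER_NONCE"] = nonce;

        return startInfo;
    }
}
```

In this sketch the nonce never appears in ArgumentList, so it stays out of the child's command line.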

The Parser

WorkerOptionsParser walks args once, collects values into a case-insensitive dictionary, and accumulates errors so the caller sees every problem in a single failure rather than fixing them one at a time.

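for (int index = 0; index < args.Length; index++)
{
    string arg = args[index];

    // Reject anything that is not one of the known flags, but keep scanning
    // so that later problems are reported in the same failure.
    if (!IsKnownOption(arg))
    {
        errors.Add($"Unknown option '{arg}'.");
        continue;
    }

    // A known flag must be followed by a value that is not itself a flag.
    if (index + 1 >= args.Length || args[index + 1].StartsWith("--", StringComparison.Ordinal))
    {
        errors.Add($"Option '{arg}' requires a value.");
        continue;
    }

    // Record the value and skip past it.
    values[arg] = args[index + 1];
    index++;
}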
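To make the error accumulation concrete, the loop can be wrapped in a small self-contained method. This is a sketch under stated assumptions rather than the real WorkerOptionsParser: the ArgumentScanSketch class, the Scan method, and the KnownOptions set are hypothetical, and treating IsKnownOption as a case-insensitive lookup is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper around the scanning loop; not the real WorkerOptionsParser.
public static class ArgumentScanSketch
{
    // Assumption: known flags are matched case-insensitively.
    private static readonly HashSet<string> KnownOptions = new(StringComparer.OrdinalIgnoreCase)
    {
        "--session-id",
        "--pipe-name",
        "--protocol-version",
    };

    public static (Dictionary<string, string> Values, List<string> Errors) Scan(string[] args)
    {
        Dictionary<string, string> values = new(StringComparer.OrdinalIgnoreCase);
        List<string> errors = new();

        for (int index = 0; index < args.Length; index++)
        {
            string arg = args[index];
            if (!KnownOptions.Contains(arg))
            {
                errors.Add($"Unknown option '{arg}'.");
                continue;
            }

            if (index + 1 >= args.Length || args[index + 1].StartsWith("--", StringComparison.Ordinal))
            {
                errors.Add($"Option '{arg}' requires a value.");
                continue;
            }

            values[arg] = args[index + 1];
            index++;
        }

        return (values, errors);
    }
}
```

Scanning the arguments --session-id s1 --bogus --pipe-name with this sketch yields one collected value and two errors, one for the unknown flag and one for the flag with no value, in a single pass.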

After argument scanning, the parser checks the protocol version against GatewayContractInfo.WorkerProtocolVersion. A value that does not parse as a uint, or that parses but does not match the contract value, is a hard failure with WorkerExitCode.InvalidProtocolVersion. This code is kept separate from InvalidArguments so the gateway can distinguish a malformed launch from a version mismatch and report a useful upgrade message.

The nonce is read last so that argument-shape errors are reported before the parser asks the environment for a secret it might not need.
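The ordering described above can be sketched as a single decision function. This is an illustrative simplification: the ValidationOrderSketch class, the SketchExitCode enum, and the ExpectedProtocolVersion constant (a stand-in for GatewayContractInfo.WorkerProtocolVersion, whose real value is not given here) are all hypothetical, and the precedence of InvalidArguments over the other codes is an assumption. The real parser also accumulates error messages, which this sketch omits.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Globalization;

// Subset of exit codes, with the numeric values listed in this document.
public enum SketchExitCode
{
    Success = 0,
    InvalidArguments = 2,
    InvalidProtocolVersion = 3,
    MissingNonce = 4,
}

public static class ValidationOrderSketch
{
    // Hypothetical stand-in for GatewayContractInfo.WorkerProtocolVersion.
    public const uint ExpectedProtocolVersion = 1;

    public static SketchExitCode Validate(
        IReadOnlyList<string> argumentErrors,
        string protocolVersionText,
        string? nonce)
    {
        // Argument-shape errors are reported first (assumed precedence).
        if (argumentErrors.Count > 0)
        {
            return SketchExitCode.InvalidArguments;
        }

        // Both "not a uint" and "wrong version" map to the same exit code.
        bool parsed = uint.TryParse(
            protocolVersionText,
            NumberStyles.None,
            CultureInfo.InvariantCulture,
            out uint version);
        if (!parsed || version != ExpectedProtocolVersion)
        {
            return SketchExitCode.InvalidProtocolVersion;
        }

        // The nonce is only examined once everything else has passed.
        if (string.IsNullOrWhiteSpace(nonce))
        {
            return SketchExitCode.MissingNonce;
        }

        return SketchExitCode.Success;
    }
}
```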

Bootstrap Result

WorkerBootstrapResult is a discriminated success/failure carrier. Options is non-null when Succeeded is true; Errors is populated only on failure.

public static WorkerBootstrapResult Success(WorkerOptions options)
{
    return new WorkerBootstrapResult(WorkerExitCode.Success, options, []);
}

public static WorkerBootstrapResult Failure(WorkerExitCode exitCode, IEnumerable<string> errors)
{
    return new WorkerBootstrapResult(exitCode, null, errors.ToArray());
}

Succeeded is defined as ExitCode == WorkerExitCode.Success rather than as a separate flag, so the exit code and the success state cannot disagree.
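A minimal sketch of that derived property, assuming a private constructor behind the two factory methods. The BootstrapResultSketch and ResultExitCode names are hypothetical, and a plain string stands in for WorkerOptions to keep the example self-contained.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public enum ResultExitCode
{
    Success = 0,
    InvalidArguments = 2,
}

// Hypothetical stand-in for WorkerBootstrapResult; Options is a string here for brevity.
public sealed class BootstrapResultSketch
{
    private BootstrapResultSketch(ResultExitCode exitCode, string? options, IReadOnlyList<string> errors)
    {
        ExitCode = exitCode;
        Options = options;
        Errors = errors;
    }

    public ResultExitCode ExitCode { get; }
    public string? Options { get; }
    public IReadOnlyList<string> Errors { get; }

    // Derived from ExitCode, so there is no second flag that could drift out of sync.
    public bool Succeeded => ExitCode == ResultExitCode.Success;

    public static BootstrapResultSketch Success(string options)
    {
        return new BootstrapResultSketch(ResultExitCode.Success, options, Array.Empty<string>());
    }

    public static BootstrapResultSketch Failure(ResultExitCode exitCode, IEnumerable<string> errors)
    {
        return new BootstrapResultSketch(exitCode, null, errors.ToArray());
    }
}
```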

Exit Codes

WorkerExitCode is the worker process's exit contract. The gateway-side launcher decodes these values to decide whether a relaunch is safe.

| Value | Numeric | Produced when |
| --- | --- | --- |
| Success | 0 | The pipe session ran to a clean close. |
| UnexpectedFailure | 1 | Any unhandled exception not matched by a more specific catch. |
| InvalidArguments | 2 | One or more --session-id, --pipe-name, or --protocol-version errors (missing, empty, unknown flag, or no value). |
| InvalidProtocolVersion | 3 | --protocol-version is not a uint or does not equal GatewayContractInfo.WorkerProtocolVersion. |
| MissingNonce | 4 | The MXGATEWAY_WORKER_NONCE environment variable is null, empty, or whitespace. |
| PipeConnectionFailed | 5 | An IOException or TimeoutException escapes the pipe client. |
| ProtocolViolation | 6 | A WorkerFrameProtocolException escapes the pipe client. |

InvalidArguments, InvalidProtocolVersion, and MissingNonce originate in the parser; the others originate in WorkerApplication.Run's try/catch around the pipe client.
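Expressed as code, the exit contract might look like the following. The numeric values are taken from the exit code table; the WorkerExitCodeSketch and ExitCodeOriginSketch names and the IsBootstrapFailure helper are hypothetical and only illustrate the parser-versus-runtime split described above.

```csharp
// Sketch of the exit contract; values mirror the exit code table.
public enum WorkerExitCodeSketch
{
    Success = 0,
    UnexpectedFailure = 1,
    InvalidArguments = 2,
    InvalidProtocolVersion = 3,
    MissingNonce = 4,
    PipeConnectionFailed = 5,
    ProtocolViolation = 6,
}

public static class ExitCodeOriginSketch
{
    // Hypothetical helper: true for codes produced by the parser before any pipe work starts.
    public static bool IsBootstrapFailure(WorkerExitCodeSketch code)
    {
        switch (code)
        {
            case WorkerExitCodeSketch.InvalidArguments:
            case WorkerExitCodeSketch.InvalidProtocolVersion:
            case WorkerExitCodeSketch.MissingNonce:
                return true;
            default:
                return false;
        }
    }
}
```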

Environment Abstraction

IWorkerEnvironment exists so tests can supply a fake nonce without mutating the real process environment, which would be a shared mutable global across parallel test runs.

public interface IWorkerEnvironment
{
    string? GetEnvironmentVariable(string name);
}

public sealed class EnvironmentVariableWorkerEnvironment : IWorkerEnvironment
{
    public string? GetEnvironmentVariable(string name)
    {
        return Environment.GetEnvironmentVariable(name);
    }
}

The production binding in WorkerApplication.Run(string[]) is EnvironmentVariableWorkerEnvironment, which is a thin pass-through to System.Environment.GetEnvironmentVariable.
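A test double for this interface can be as small as a dictionary wrapper. The sketch below is an assumption about how such a fake might look, not the project's actual test code: FakeWorkerEnvironment and its Set method are hypothetical, the interface is redeclared only so the example compiles on its own, and the case-insensitive lookup mimics Windows behavior by choice.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Redeclared here only to keep the example self-contained.
public interface IWorkerEnvironment
{
    string? GetEnvironmentVariable(string name);
}

// Hypothetical in-memory fake; it never touches the real process environment.
public sealed class FakeWorkerEnvironment : IWorkerEnvironment
{
    // Assumption: names are matched case-insensitively, as on Windows.
    private readonly Dictionary<string, string?> _variables =
        new(StringComparer.OrdinalIgnoreCase);

    public FakeWorkerEnvironment Set(string name, string? value)
    {
        _variables[name] = value;
        return this;
    }

    public string? GetEnvironmentVariable(string name)
    {
        // Unset variables read as null, matching Environment.GetEnvironmentVariable.
        return _variables.TryGetValue(name, out string? value) ? value : null;
    }
}
```

Because each test owns its own FakeWorkerEnvironment instance, parallel tests can use different nonce values without interfering with one another.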

Logging

The worker writes structured key/value lines to standard error. The launcher does not redirect either stream (WorkerProcessLauncher sets UseShellExecute=false and CreateNoWindow=true but leaves stdout and stderr inherited), so log output lands on the inherited console rather than a pipe the gateway reads. Logging to standard error rather than standard output keeps stdout free for any future stdout-based channel.

The logger contract

IWorkerLogger exposes only Information and Error. There is no Debug or Trace level, because the worker is launched per session and verbose tracing belongs to the gateway-side launcher.

public interface IWorkerLogger
{
    void Information(string eventName, IReadOnlyDictionary<string, object?> fields);

    void Error(string eventName, IReadOnlyDictionary<string, object?> fields);
}

WorkerConsoleLogger formats each call as level=<Level> event=<EventName> key=value key=value after running the field dictionary through WorkerLogRedactor:

private void Write(
    string level,
    string eventName,
    IReadOnlyDictionary<string, object?> fields)
{
    Dictionary<string, object?> redactedFields = WorkerLogRedactor.RedactFields(fields);
    string fieldText = string.Join(
        " ",
        redactedFields.Select(field => $"{field.Key}={FormatValue(field.Value)}"));

    _writer.WriteLine($"level={level} event={eventName} {fieldText}".TrimEnd());
}

What the redactor redacts and why

The "Security" section of gateway.md requires that the worker never log raw credential values for AuthenticateUser, WriteSecured, or related secured operations. The bootstrap nonce is also a credential: anyone who reads it can impersonate the worker on the gateway pipe. WorkerLogRedactor enforces this by replacing the value of any field whose name contains one of these substrings (case-insensitive) with the literal [redacted]:

private static readonly string[] SensitiveFieldNameParts =
[
    "nonce",
    "secret",
    "password",
    "token",
    "credential",
    "apikey",
    "api_key",
];

The match is on substrings of the field name rather than an exact list, so a field called auth_token or user_password is redacted automatically without each call site having to remember to opt in. null values pass through unchanged so the absence of a value is still visible in logs.
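The matching rule can be sketched as follows. This is an illustration of the described behavior under assumptions, not the actual WorkerLogRedactor source: the LogRedactorSketch class is hypothetical, and the body of RedactFields is inferred from the prose above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical re-implementation of the described behavior.
public static class LogRedactorSketch
{
    private static readonly string[] SensitiveFieldNameParts =
    {
        "nonce", "secret", "password", "token", "credential", "apikey", "api_key",
    };

    public static Dictionary<string, object?> RedactFields(
        IReadOnlyDictionary<string, object?> fields)
    {
        Dictionary<string, object?> redacted = new(fields.Count);
        foreach (KeyValuePair<string, object?> field in fields)
        {
            // Case-insensitive substring match on the field name.
            bool sensitive = SensitiveFieldNameParts.Any(
                part => field.Key.IndexOf(part, StringComparison.OrdinalIgnoreCase) >= 0);

            // Null values pass through so that a missing value stays visible.
            redacted[field.Key] = sensitive && field.Value is not null
                ? "[redacted]"
                : field.Value;
        }

        return redacted;
    }
}
```

With this sketch, a field named auth_token is replaced by [redacted], a field named session_id is left untouched, and a nonce field holding null stays null.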

How Program.cs Consumes The Result

WorkerApplication.Run is the single consumer of WorkerBootstrapResult. On failure it logs a WorkerBootstrapFailed event and returns the numeric ExitCode directly:

WorkerOptionsParser parser = new(environment);
WorkerBootstrapResult result = parser.Parse(args);

if (!result.Succeeded)
{
    logger.Error("WorkerBootstrapFailed", new Dictionary<string, object?>
    {
        ["exit_code"] = result.ExitCode,
        ["errors"] = string.Join(";", result.Errors),
    });

    return (int)result.ExitCode;
}

On success it logs WorkerBootstrapSucceeded with the session fields (the nonce field is redacted by WorkerLogRedactor because of its name), hands the WorkerOptions to IWorkerPipeClient.RunAsync, and waits synchronously. The try/catch around the pipe call maps WorkerFrameProtocolException to ProtocolViolation, IOException/TimeoutException to PipeConnectionFailed, and any other exception to UnexpectedFailure, so every code path through Run returns one of the values in WorkerExitCode.
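The exception-to-exit-code mapping can be sketched in isolation. The names here are hypothetical: FrameProtocolExceptionSketch stands in for WorkerFrameProtocolException, an Action stands in for the pipe client call, and the integer return values are the numeric codes from the exit code table. How the real code waits on RunAsync is not shown in this document, so the sketch stays synchronous.

```csharp
using System;
using System.IO;

// Stand-in for WorkerFrameProtocolException.
public sealed class FrameProtocolExceptionSketch : Exception
{
    public FrameProtocolExceptionSketch(string message)
        : base(message)
    {
    }
}

public static class ExitCodeMappingSketch
{
    // Runs the supplied pipe session and converts its outcome to a numeric exit code.
    public static int RunAndMap(Action runPipeSession)
    {
        try
        {
            runPipeSession();
            return 0; // Success
        }
        catch (FrameProtocolExceptionSketch)
        {
            return 6; // ProtocolViolation
        }
        catch (IOException)
        {
            return 5; // PipeConnectionFailed
        }
        catch (TimeoutException)
        {
            return 5; // PipeConnectionFailed
        }
        catch (Exception)
        {
            return 1; // UnexpectedFailure
        }
    }
}
```

The most specific handlers come first and the final catch-all guarantees that no exception escapes, which is what lets every path return a defined exit code.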