# Gateway Diagnostics

The diagnostics subsystem provides structured logging, credential redaction, and request-scoped log enrichment for the gateway. It lives under `src/ZB.MOM.WW.MxGateway.Server/Diagnostics/` and is wired into the ASP.NET Core pipeline so every gRPC and HTTP request carries the same correlation fields.

## Goals

The subsystem exists to satisfy two security rules from `gateway.md`: never log passwords or raw credential values for `AuthenticateUser`, `WriteSecured`, or related secured operations, and never log full MXAccess values by default. Code paths that touch credentials or tag values must therefore route through `GatewayLogRedactor` rather than emitting them directly.

A second goal is parity-test diagnosability. Because MXAccess sessions, workers, correlation ids, and command methods are the units of comparison, every log entry produced inside a request scope must carry those identifiers without each call site having to format them.

## Log Scopes

`GatewayLogScope` is a record that captures the fields attached to a logger scope. It only emits keys whose values are non-null, so callers can supply just the identifiers they know about:

```csharp
public sealed record GatewayLogScope(
    string? SessionId = null,
    int? WorkerProcessId = null,
    ulong? CorrelationId = null,
    string? CommandMethod = null,
    string? ClientIdentity = null)
{
    public IReadOnlyDictionary<string, object?> ToDictionary()
    {
        Dictionary<string, object?> values = [];

        AddIfPresent(values, "SessionId", SessionId);
        AddIfPresent(values, "WorkerProcessId", WorkerProcessId);
        AddIfPresent(values, "CorrelationId", CorrelationId);
        AddIfPresent(values, "CommandMethod", CommandMethod);
        AddIfPresent(values, "ClientIdentity", GatewayLogRedactor.RedactClientIdentity(ClientIdentity));

        return values;
    }

    // AddIfPresent (not shown) adds a key only when its value is non-null.
}
```

`ClientIdentity` is passed through `GatewayLogRedactor.RedactClientIdentity` inside `ToDictionary` rather than at the call site. This guarantees that any logger scope built from a `GatewayLogScope` cannot accidentally surface a raw API key, even when a caller forgets to redact before constructing the scope.
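The non-null filtering can be illustrated with a small sketch. The `AddIfPresent` helper is not shown in this document, so the version below is a hypothetical stand-in with an assumed signature, and the values are made up:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the private helper; the real signature is an assumption.
static void AddIfPresent(Dictionary<string, object?> values, string key, object? value)
{
    // A null reference or an empty Nullable<T> boxes to null and is skipped.
    if (value is not null)
    {
        values[key] = value;
    }
}

Dictionary<string, object?> values = new();
string? sessionId = null;        // for example, a call made before a session exists
ulong? correlationId = 42UL;     // made-up correlation id

AddIfPresent(values, "SessionId", sessionId);
AddIfPresent(values, "CorrelationId", correlationId);

Console.WriteLine(string.Join(", ", values.Keys)); // prints "CorrelationId"
```

Only `CorrelationId` reaches the dictionary, so a logger scope built from it carries no empty `SessionId` field.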

### How scopes are pushed

`GatewayLoggerExtensions` exposes a single method that converts a `GatewayLogScope` into the dictionary form expected by `ILogger.BeginScope`:

```csharp
public static class GatewayLoggerExtensions
{
    public static IDisposable? BeginGatewayScope(
        this ILogger logger,
        GatewayLogScope scope)
    {
        ArgumentNullException.ThrowIfNull(logger);
        ArgumentNullException.ThrowIfNull(scope);

        return logger.BeginScope(scope.ToDictionary());
    }
}
```

The returned `IDisposable?` follows the standard `BeginScope` contract: callers wrap it in a `using` to bound the scope to a request, command, or worker interaction.
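As a rough sketch of that contract, the following standalone program nests two scopes with `using`. It assumes the `Microsoft.Extensions.Logging` and `Microsoft.Extensions.Logging.Console` packages are referenced, and it passes raw dictionaries to `BeginScope` as a stand-in for what `BeginGatewayScope` produces from a `GatewayLogScope`. The category name and field values are made up:

```csharp
using System.Collections.Generic;
using Microsoft.Extensions.Logging;

// JSON console output with scopes enabled, so scope fields appear on each entry.
using ILoggerFactory factory = LoggerFactory.Create(builder =>
    builder.AddJsonConsole(options => options.IncludeScopes = true));
ILogger logger = factory.CreateLogger("Sketch");

// Outer scope: roughly what a request-level scope would contribute.
using (logger.BeginScope(new Dictionary<string, object?> { ["CorrelationId"] = 42UL }))
{
    logger.LogInformation("request received");

    // Inner scope: a narrower scope pushed once more identifiers are known.
    using (logger.BeginScope(new Dictionary<string, object?> { ["WorkerProcessId"] = 4312 }))
    {
        logger.LogInformation("worker launched");
    }

    // The inner scope has been disposed; only the outer scope applies here.
    logger.LogInformation("request completed");
}
```

Disposing in reverse order of creation is what the `using` blocks guarantee, which keeps the fields from leaking into unrelated log entries.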

## Redaction Rules

`GatewayLogRedactor` centralizes every redaction decision so that policy changes live in one file. Three categories of input are handled differently because what is safe to log differs for each: command method names, API keys, and command values.

### Sensitive command methods

A static set names the MXAccess commands that are known to carry credentials in their payloads:

```csharp
private static readonly HashSet<string> SensitiveCommandMethods = new(StringComparer.OrdinalIgnoreCase)
{
    "AuthenticateUser",
    "WriteSecured",
    "WriteSecured2"
};

public static bool IsCredentialBearingCommand(string? commandMethod)
{
    return commandMethod is not null
        && SensitiveCommandMethods.Contains(commandMethod);
}
```

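The names match the MXAccess command list in `gateway.md` exactly, but lookups are case-insensitive (`StringComparer.OrdinalIgnoreCase`), so `writesecured` is treated the same as `WriteSecured`. A `null` method name is never credential-bearing. `Write` and `Write2` are not in the set because their payloads are tag values, not credentials, and are governed by the `valueLoggingEnabled` flag described below.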

### API key redaction

`RedactApiKey` is built around the `mxgw_` API key format issued by the gateway. It preserves the bearer scheme and the key id segment so that operators can correlate a log entry to a specific principal, but always strips the secret tail:

```csharp
public static string? RedactApiKey(string? authorizationHeader)
{
    if (string.IsNullOrWhiteSpace(authorizationHeader))
    {
        return authorizationHeader;
    }

    const string bearerPrefix = "Bearer ";
    if (!authorizationHeader.StartsWith(bearerPrefix, StringComparison.OrdinalIgnoreCase))
    {
        return RedactedValue;
    }

    string token = authorizationHeader[bearerPrefix.Length..].Trim();

    if (!token.StartsWith("mxgw_", StringComparison.OrdinalIgnoreCase))
    {
        return $"{bearerPrefix}{RedactedValue}";
    }

    string[] tokenParts = token.Split('_', 3, StringSplitOptions.RemoveEmptyEntries);
    if (tokenParts.Length < 2)
    {
        return $"{bearerPrefix}mxgw_{RedactedValue}";
    }

    return $"{bearerPrefix}mxgw_{tokenParts[1]}_{RedactedValue}";
}
```

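The split uses `count: 3` because the secret portion may itself contain underscores: the remainder after the key id lands in the third segment and is discarded. Only the key id is copied from the input; the `Bearer ` scheme and the `mxgw_` prefix are re-emitted as literals. Bearer tokens that do not start with `mxgw_` become `Bearer [redacted]`, and authorization headers that are not bearer tokens are reduced to `[redacted]` rather than passed through, since the gateway cannot reason about their structure.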
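The effect of the three-way split can be traced with a short standalone snippet. The token below is made up for illustration and is not a real key:

```csharp
using System;

// Made-up token whose secret tail contains an underscore.
string token = "mxgw_key123_s3cr_et";
string[] parts = token.Split('_', 3, StringSplitOptions.RemoveEmptyEntries);

Console.WriteLine(parts.Length); // 3
Console.WriteLine(parts[1]);     // key123
// parts[2] is "s3cr_et": the whole secret tail stays in one segment and is never emitted.
Console.WriteLine($"Bearer mxgw_{parts[1]}_[redacted]"); // Bearer mxgw_key123_[redacted]

// A token with no key id segment yields a single part, which selects the shorter fallback.
string[] bare = "mxgw_".Split('_', 3, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(bare.Length);  // 1
```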

`RedactClientIdentity` is the entry point used by `GatewayLogScope` and `DashboardRedactor`. It only invokes `RedactApiKey` when the input contains the `mxgw_` marker, leaving non-key identities (for example, Windows account names) untouched.

### Command value redaction

`RedactCommandValue` enforces the "values are opt-in and redacted by default" rule:

```csharp
public static object? RedactCommandValue(
    string? commandMethod,
    object? value,
    bool valueLoggingEnabled = false)
{
    if (value is null)
    {
        return null;
    }

    if (!valueLoggingEnabled || IsCredentialBearingCommand(commandMethod))
    {
        return RedactedValue;
    }

    return value;
}
```

Two rules combine here. First, when `valueLoggingEnabled` is `false` (the default), every value is replaced with `[redacted]`. Second, even when value logging is enabled, credential-bearing commands still redact. The credential check is therefore unconditional and cannot be overridden by configuration.

The shared `RedactedValue` constant is `"[redacted]"`. `DashboardRedactor` reuses it so that gateway logs and dashboard renders use the same placeholder.
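The combined effect of the two rules can be checked with a condensed stand-in. The local `Redact` function below mirrors the logic of `RedactCommandValue` as described in this section but is a sketch, not the gateway's code, and the sample values are made up:

```csharp
using System;
using System.Collections.Generic;

const string RedactedValue = "[redacted]";
HashSet<string> sensitive = new(StringComparer.OrdinalIgnoreCase)
{
    "AuthenticateUser", "WriteSecured", "WriteSecured2"
};

// Condensed stand-in: null passes through, everything else is redacted unless
// value logging is enabled AND the command is not credential-bearing.
object? Redact(string? method, object? value, bool valueLoggingEnabled = false)
{
    if (value is null) return null;
    bool credential = method is not null && sensitive.Contains(method);
    return !valueLoggingEnabled || credential ? RedactedValue : value;
}

Console.WriteLine(Redact("Write", 7));               // [redacted]  (default: logging off)
Console.WriteLine(Redact("Write", 7, true));         // 7           (opt-in, non-credential)
Console.WriteLine(Redact("WriteSecured", 7, true));  // [redacted]  (credential check wins)
Console.WriteLine(Redact("writesecured", 7, true));  // [redacted]  (case-insensitive match)
Console.WriteLine(Redact("Write", null, true) is null); // True
```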

## Request Logging Middleware

`GatewayRequestLoggingMiddlewareExtensions.UseGatewayRequestLoggingScope` registers the middleware that pushes a `GatewayLogScope` for the duration of every request:

```csharp
public static IApplicationBuilder UseGatewayRequestLoggingScope(this IApplicationBuilder app)
{
    ArgumentNullException.ThrowIfNull(app);

    return app.Use(async (context, next) =>
    {
        ILogger logger = context.RequestServices
            .GetRequiredService<ILoggerFactory>()
            .CreateLogger("ZB.MOM.WW.MxGateway.Request");

        using IDisposable? scope = logger.BeginGatewayScope(new GatewayLogScope(
            SessionId: ReadHeader(context, SessionIdHeaderName),
            WorkerProcessId: ReadInt32Header(context, WorkerProcessIdHeaderName),
            CorrelationId: ReadUInt64Header(context, CorrelationIdHeaderName),
            CommandMethod: ReadHeader(context, CommandMethodHeaderName),
            ClientIdentity: ReadHeader(context, "authorization")));

        await next(context);
    });
}
```

The scope is populated from four custom headers and the standard `authorization` header:

| Header | Scope field | Type |
|--------|-------------|------|
| `x-session-id` | `SessionId` | string |
| `x-worker-process-id` | `WorkerProcessId` | int |
| `x-correlation-id` | `CorrelationId` | ulong |
| `x-command-method` | `CommandMethod` | string |
| `authorization` | `ClientIdentity` | string (redacted) |

The numeric headers use `int.TryParse` and `ulong.TryParse`; missing or unparseable values become `null` and are dropped by `GatewayLogScope.ToDictionary`. This keeps the middleware tolerant of clients that do not yet emit every header, which matters because the earliest call in a session (`OpenSession`) has no `SessionId` to send.
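A minimal sketch of that parsing behaviour follows. `ReadInt32Header` and `ReadUInt64Header` are not shown in this document, so the helpers below are hypothetical: they take the raw header string directly rather than an `HttpContext`, and their names are assumptions:

```csharp
using System;

// Hypothetical helpers: missing (null) or unparseable input becomes null.
static int? ParseInt32OrNull(string? raw) =>
    int.TryParse(raw, out int value) ? value : null;

static ulong? ParseUInt64OrNull(string? raw) =>
    ulong.TryParse(raw, out ulong value) ? value : null;

Console.WriteLine(ParseInt32OrNull("4312"));                   // 4312
Console.WriteLine(ParseInt32OrNull(null).HasValue);            // False (header absent)
Console.WriteLine(ParseInt32OrNull("abc").HasValue);           // False (not a number)
Console.WriteLine(ParseUInt64OrNull("-1").HasValue);           // False (negative is out of range)
Console.WriteLine(ParseUInt64OrNull("18446744073709551615"));  // 18446744073709551615
```

Returning `null` instead of throwing means a malformed header costs one missing scope field rather than a failed request.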

The logger category is `ZB.MOM.WW.MxGateway.Request`, which lets operators filter the request scope events independently from per-component categories.

### Pipeline ordering

`GatewayApplication.Build` registers the middleware before authentication, authorization, and endpoint mapping:

```csharp
app.UseGatewayRequestLoggingScope();
app.UseStaticFiles();
app.UseAuthentication();
app.UseAuthorization();
app.UseAntiforgery();
app.MapGatewayEndpoints();
```

The order matters: putting the logging scope first ensures that authentication failures, authorization denials, and endpoint exceptions all run inside the request scope, so failure logs still carry the correlation id and session id that the caller sent in its headers. Because `GatewayLogScope.ToDictionary` redacts `ClientIdentity` before the scope is pushed, reading the `authorization` header at this stage does not leak an `mxgw_` key's secret into authentication failure logs.

## Consumers

`GatewayLoggerExtensions.BeginGatewayScope` is consumed by `GatewayRequestLoggingMiddlewareExtensions` to attach the per-request scope. Component-level call sites build narrower `GatewayLogScope` instances (for example, with a known `WorkerProcessId` after a worker launch) and push a nested scope on top of the request scope.

`GatewayLogRedactor` is consumed in three places:

- `GatewayLogScope.ToDictionary` redacts `ClientIdentity` whenever a scope is materialized.
- `DashboardRedactor.Redact` delegates to `RedactClientIdentity` for any value containing the `mxgw_` marker, then falls back to a keyword check for fields like `password` or `token`. This keeps dashboard renders aligned with log redaction.
- `ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorTests.cs` covers each redaction branch, including the assertion that `WriteSecured` values stay redacted even when `valueLoggingEnabled` is true.

## Related Documentation

- [Sessions](./Sessions.md)
- [gRPC](./Grpc.md)
- [Authentication](./Authentication.md)