mxaccessgw/docs/Authorization.md
Joseph Doherty dc9c0c950c rename: prefix gateway projects/namespaces with ZB.MOM.WW + sln→slnx
2026-05-23 16:22:23 -04:00

Gateway gRPC Authorization

The authorization subsystem has two layers. The gRPC interceptor enforces the verb scope required by the RPC. Service-layer constraint checks then narrow what an authenticated API key can browse, read, or write inside the Galaxy.

Overview

Authorization runs as a single gRPC server interceptor registered for every call on the gateway. It pulls the authenticated identity for the current request, derives the scope that the request type requires, and either lets the call continue or fails the call with a gRPC status. The pipeline keeps service classes free of cross-cutting checks, which matches the gateway.md "thin gRPC layer" rule that service handlers translate between contracts and domain code without owning policy.

The participating types live under src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/:

  • GatewayGrpcAuthorizationInterceptor runs the authenticate-then-authorize pipeline for unary and server-streaming calls.
  • GatewayGrpcScopeResolver maps a request message (and, for MxCommandRequest, the inner MxCommandKind) to the scope string that must be present on the caller.
  • GatewayScopes exposes the canonical scope constants used by the resolver and any downstream consumer.
  • GatewayRequestIdentityAccessor and IGatewayRequestIdentityAccessor expose the verified identity to handlers and any service code that runs inside the call.
  • IConstraintEnforcer applies optional API-key constraints against the cached Galaxy hierarchy from service bodies.
  • GrpcAuthorizationServiceCollectionExtensions wires the components into the DI container and the gRPC pipeline.

The ApiKeyIdentity consumed here is produced by the authentication layer; see Authentication for how it is built and how scopes are persisted.

Why an Interceptor

Centralizing the policy in GatewayGrpcAuthorizationInterceptor produces three concrete benefits:

  1. Every RPC defined in MxAccessGatewayService is covered by construction. A new RPC picks up its intended scope the moment its request type is added to GatewayGrpcScopeResolver, and until then the resolver's fallback requires the admin scope, so coverage never depends on a service method remembering to call an authorization helper.
  2. Verb-scope policy stays centralized. Request-specific constraints still run in service bodies because they need command payloads, item handles, and Galaxy metadata that the interceptor should not inspect.
  3. Authentication and authorization happen in one place, so the gRPC Status mapping is consistent. A failed key check always returns Unauthenticated, and a missing scope always returns PermissionDenied with the offending scope name.

Interceptor Flow

GatewayGrpcAuthorizationInterceptor overrides both UnaryServerHandler and ServerStreamingServerHandler. Each override calls the same private AuthenticateAndAuthorizeAsync helper, pushes the resolved identity (if any) onto the accessor, and then invokes the continuation inside that scope, so the identity is available for the duration of the call.

public override async Task<TResponse> UnaryServerHandler<TRequest, TResponse>(
    TRequest request,
    ServerCallContext context,
    UnaryServerMethod<TRequest, TResponse> continuation)
{
    ApiKeyIdentity? identity = await AuthenticateAndAuthorizeAsync(request, context).ConfigureAwait(false);
    IDisposable? identityScope = identity is null ? null : identityAccessor.Push(identity);
    using (identityScope)
    {
        return await continuation(request, context).ConfigureAwait(false);
    }
}

The shared helper performs the actual decision:

if (options.Value.Authentication.Mode == AuthenticationMode.Disabled)
{
    return null;
}

string? authorizationHeader = context.RequestHeaders.GetValue("authorization");
ApiKeyVerificationResult verificationResult = await apiKeyVerifier
    .VerifyAsync(authorizationHeader, context.CancellationToken)
    .ConfigureAwait(false);

if (!verificationResult.Succeeded || verificationResult.Identity is null)
{
    throw new RpcException(new Status(
        StatusCode.Unauthenticated,
        "Missing or invalid API key."));
}

string requiredScope = scopeResolver.ResolveRequiredScope(request);
if (!verificationResult.Identity.Scopes.Contains(requiredScope))
{
    throw new RpcException(new Status(
        StatusCode.PermissionDenied,
        $"API key is missing required scope '{requiredScope}'."));
}

return verificationResult.Identity;

The flow is:

  1. If GatewayOptions.Authentication.Mode is AuthenticationMode.Disabled, the helper returns null immediately. No identity is pushed onto the accessor and the continuation runs without scope enforcement. This matches the AuthenticationMode enum, which only defines ApiKey and Disabled.
  2. Otherwise, the authorization request header is read directly off ServerCallContext.RequestHeaders and handed to IApiKeyVerifier.VerifyAsync. A failed verification or a missing identity throws RpcException with StatusCode.Unauthenticated.
  3. GatewayGrpcScopeResolver.ResolveRequiredScope(request) produces the scope string. If the identity's Scopes set does not contain it, the helper throws RpcException with StatusCode.PermissionDenied and embeds the missing scope name in Status.Detail so callers can diagnose the failure.
  4. On success, the verified ApiKeyIdentity is returned and pushed onto IGatewayRequestIdentityAccessor for the lifetime of the call.

The status codes are deliberately distinct: Unauthenticated signals "we do not know who you are," and PermissionDenied signals "we know who you are, but you cannot do this." Treating the two as the same code would make troubleshooting harder for client implementations.
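
The decision order in the flow above can be condensed into a small pure function. This is a minimal sketch, not the gateway's code: AuthOutcome and AuthDecision.Decide are hypothetical names, the verified key is reduced to a nullable set of scope strings, and the real helper throws RpcException instead of returning an enum.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical outcome type used only for this illustration.
public enum AuthOutcome
{
    Anonymous,        // authentication disabled: no identity, no scope check
    Allowed,          // key verified and required scope present
    Unauthenticated,  // missing or invalid key
    PermissionDenied  // valid key that lacks the required scope
}

public static class AuthDecision
{
    // verifiedScopes is null when key verification failed (an assumption of this sketch).
    public static AuthOutcome Decide(
        bool authenticationDisabled,
        IReadOnlySet<string>? verifiedScopes,
        string requiredScope)
    {
        // Step 1: disabled mode short-circuits before any header is inspected.
        if (authenticationDisabled) return AuthOutcome.Anonymous;

        // Step 2: identity must be established before scopes are considered.
        if (verifiedScopes is null) return AuthOutcome.Unauthenticated;

        // Step 3: a known caller without the scope is denied, not unauthenticated.
        return verifiedScopes.Contains(requiredScope)
            ? AuthOutcome.Allowed
            : AuthOutcome.PermissionDenied;
    }
}
```

Keeping the identity check ahead of the scope check is what guarantees that a caller with a bad key never learns which scope an RPC requires.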

Scope Resolution

GatewayGrpcScopeResolver is a stateless singleton that switches on the runtime request type. Top-level RPC requests map directly:

public string ResolveRequiredScope(object request)
{
    return request switch
    {
        OpenSessionRequest => GatewayScopes.SessionOpen,
        CloseSessionRequest => GatewayScopes.SessionClose,
        StreamEventsRequest => GatewayScopes.EventsRead,
        MxCommandRequest commandRequest => ResolveCommandScope(commandRequest.Command?.Kind ?? MxCommandKind.Unspecified),
        AcknowledgeAlarmRequest => GatewayScopes.InvokeWrite,
        StreamAlarmsRequest => GatewayScopes.EventsRead,
        TestConnectionRequest or
        GetLastDeployTimeRequest or
        DiscoverHierarchyRequest or
        WatchDeployEventsRequest => GatewayScopes.MetadataRead,
        _ => GatewayScopes.Admin
    };
}

The _ => GatewayScopes.Admin fallback is intentional: any future request type that the resolver does not recognize fails closed, requiring the strongest scope until the resolver is updated. AcknowledgeAlarm is treated as a write because it mutates alarm state, mirroring the MxCommandKind.Write* kinds. StreamAlarms shares the alarm/event surface with StreamEvents and MxCommandKind.DrainEvents, so it carries events:read. Both alarm RPCs are session-less: the scope check is the only authorization gate, since there is no per-session ownership to enforce.

MxCommandRequest is special because it multiplexes many MxAccess operations through a single RPC. The resolver inspects the embedded MxCommandKind so each operation gets its own scope:

private static string ResolveCommandScope(MxCommandKind kind)
{
    return kind switch
    {
        MxCommandKind.Write or
        MxCommandKind.Write2 or
        MxCommandKind.WriteBulk or
        MxCommandKind.Write2Bulk => GatewayScopes.InvokeWrite,

        MxCommandKind.WriteSecured or
        MxCommandKind.WriteSecured2 or
        MxCommandKind.WriteSecuredBulk or
        MxCommandKind.WriteSecured2Bulk or
        MxCommandKind.AuthenticateUser => GatewayScopes.InvokeSecure,

        MxCommandKind.ArchestraUserToId or
        MxCommandKind.GetSessionState or
        MxCommandKind.GetWorkerInfo => GatewayScopes.MetadataRead,

        MxCommandKind.DrainEvents => GatewayScopes.EventsRead,
        MxCommandKind.ShutdownWorker => GatewayScopes.Admin,

        _ => GatewayScopes.InvokeRead
    };
}

Read-style kinds (Register, AddItem, Advise, ReadBulk) and any kind without an explicit arm, including MxCommandKind.Unspecified when the request carries no command, fall through to InvokeRead. This keeps the matrix small while still separating reads from writes, secured writes, metadata lookups, event drains, and worker shutdown. The four bulk-write kinds (WriteBulk, Write2Bulk, WriteSecuredBulk, WriteSecured2Bulk) are mapped explicitly so a missing arm cannot silently demote a bulk write to a read scope.
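
One way to guard the explicit write arms is a test that enumerates every kind whose name starts with "Write" and reports any that resolve to the read scope. The sketch below is only an illustration under stated assumptions: SketchKind is a trimmed-down stand-in for MxCommandKind, SketchResolver mirrors the shape of ResolveCommandScope rather than calling it, and the scope strings are the values listed in the Scope Catalog.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-in for MxCommandKind; the real enum has more members.
public enum SketchKind
{
    Unspecified, Register, AddItem, ReadBulk,
    Write, Write2, WriteBulk, Write2Bulk,
    WriteSecured, WriteSecured2, WriteSecuredBulk, WriteSecured2Bulk,
    DrainEvents, ShutdownWorker
}

public static class SketchResolver
{
    // Same switch shape as the resolver in the text, returning scope strings directly.
    public static string Resolve(SketchKind kind) => kind switch
    {
        SketchKind.Write or SketchKind.Write2 or
        SketchKind.WriteBulk or SketchKind.Write2Bulk => "invoke:write",

        SketchKind.WriteSecured or SketchKind.WriteSecured2 or
        SketchKind.WriteSecuredBulk or SketchKind.WriteSecured2Bulk => "invoke:secure",

        SketchKind.DrainEvents => "events:read",
        SketchKind.ShutdownWorker => "admin",

        _ => "invoke:read"
    };

    // Lists every write-named kind that would fall through to the read scope.
    public static IReadOnlyList<SketchKind> FindDemotedWrites() =>
        Enum.GetValues<SketchKind>()
            .Where(k => k.ToString().StartsWith("Write", StringComparison.Ordinal))
            .Where(k => Resolve(k) == "invoke:read")
            .ToList();
}
```

A test that asserts FindDemotedWrites() is empty would fail as soon as a new write-named kind is added to the enum without a matching arm.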

Constraint Enforcement

ApiKeyIdentity.Constraints is optional. Empty constraints preserve the previous behavior: the key is authorized only by its verb scopes. Non-empty constraints are stored as JSON in api_keys.constraints and are applied by IConstraintEnforcer after the interceptor succeeds.

Supported constraints are:

| Constraint | Meaning |
| --- | --- |
| read_subtrees | Contained-path globs allowed for read/subscription commands. |
| write_subtrees | Contained-path globs allowed for write commands. |
| read_tag_globs | Tag-address globs allowed for read/subscription commands. |
| write_tag_globs | Tag-address globs allowed for write commands. |
| max_write_classification | Maximum Galaxy attribute security_classification a key may write. |
| browse_subtrees | Contained-path globs used to filter Galaxy browse results and deploy-event counts. |
| read_alarm_only | Read/subscription commands must target objects with alarm-bearing attributes. |
| read_historized_only | Read/subscription commands must target objects with historized attributes. |

Glob matching is anchored, case-insensitive, and supports * and ?. Subtree and tag glob lists are alternatives: matching either list allows that scope dimension. Empty lists mean unconstrained for that dimension.
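
The matching rules for a single glob can be illustrated by translating it into an anchored regular expression. This is a minimal sketch, not the gateway's matcher: GlobSketch.IsMatch is a hypothetical name, and the way the subtree and tag lists are combined is deliberately left out.

```csharp
using System.Text.RegularExpressions;

public static class GlobSketch
{
    // Translates a glob into an anchored regex:
    //   *  matches any run of characters (including none)
    //   ?  matches exactly one character
    // Every other character, including '.', is matched literally.
    public static bool IsMatch(string glob, string value)
    {
        string pattern =
            @"\A" +
            Regex.Escape(glob).Replace(@"\*", ".*").Replace(@"\?", ".") +
            @"\z";

        return Regex.IsMatch(
            value,
            pattern,
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Singleline);
    }
}
```

Because the pattern is anchored at both ends, a glob such as Area1.* matches Area1.Pump01 but not Plant.Area1.Pump01; a leading * would be needed to match the longer path.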

Constraints are set when a key is created — through the apikey create-key flags (see Authentication) or the dashboard API Keys page create dialog (see Gateway Dashboard Design). The dashboard API Keys page also renders each key's effective constraints.

The service checks read constraints for AddItem, AddItem2, AddItemBulk, SubscribeBulk, AdviseItemBulk, and ReadBulk. It checks write constraints for Write, Write2, WriteSecured, WriteSecured2, WriteBulk, Write2Bulk, WriteSecuredBulk, and WriteSecured2Bulk. Bulk commands run through BulkConstraintPlan (ReadBulkConstraintPlan, WriteBulkConstraintPlan, SubscribeBulkConstraintPlan), which preserves the caller's input order. Each entry is evaluated against the constraint surface, and BulkConstraintPlan.MergeDeniedInto merges denied entries back into their original index positions, so the reply at entries[i] always corresponds to the request at entries[i]. Successful item registrations are tracked per session so that later item-handle commands resolve back to the original tag address. If a constrained key presents an unknown item handle, the gateway fails closed.

Non-bulk constraint failures return gRPC PermissionDenied. Bulk read commands preserve input order and return a failed SubscribeResult for each denied item while still forwarding allowed items to the worker. Every denial adds an api_key_audit entry with the key id, command kind, target, and blocking constraint; secured values and raw credentials are never logged.
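
The order-preserving behavior described above can be sketched as a partition followed by an index-based merge. This is an illustration under stated assumptions, not the BulkConstraintPlan implementation: BulkPlanSketch.Run and its delegate parameters are hypothetical, and the worker is assumed to return exactly one result per forwarded item, in the order the items were forwarded.

```csharp
using System;
using System.Collections.Generic;

public static class BulkPlanSketch
{
    // Evaluates each input, forwards only the allowed items, and returns
    // one result per input in the caller's original order.
    public static IReadOnlyList<TResult> Run<TItem, TResult>(
        IReadOnlyList<TItem> inputs,
        Func<TItem, bool> isAllowed,
        Func<IReadOnlyList<TItem>, IReadOnlyList<TResult>> forwardToWorker,
        Func<TItem, TResult> makeDenied)
    {
        var results = new TResult[inputs.Count];
        var allowedItems = new List<TItem>();
        var allowedIndexes = new List<int>();

        for (int i = 0; i < inputs.Count; i++)
        {
            if (isAllowed(inputs[i]))
            {
                // Remember where this item came from so its reply can be placed back.
                allowedItems.Add(inputs[i]);
                allowedIndexes.Add(i);
            }
            else
            {
                // Denied entries are answered locally and never reach the worker.
                results[i] = makeDenied(inputs[i]);
            }
        }

        IReadOnlyList<TResult> workerResults = forwardToWorker(allowedItems);
        if (workerResults.Count != allowedItems.Count)
        {
            throw new InvalidOperationException("Worker must return one result per forwarded item.");
        }

        // Place each worker reply back into the slot of the request that produced it.
        for (int j = 0; j < allowedIndexes.Count; j++)
        {
            results[allowedIndexes[j]] = workerResults[j];
        }

        return results;
    }
}
```

With inputs A, B, C where only B is denied, the worker sees A and C, and the caller still receives three results with the denial in the middle slot.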

Scope Catalog

GatewayScopes is the single source of truth for scope strings. Every entry is currently mapped by either the resolver or another security component:

| Constant | Value | Required For |
| --- | --- | --- |
| SessionOpen | session:open | OpenSessionRequest |
| SessionClose | session:close | CloseSessionRequest |
| EventsRead | events:read | StreamEventsRequest, StreamAlarmsRequest, MxCommandKind.DrainEvents |
| InvokeRead | invoke:read | MxCommandRequest for read-style command kinds (Register, AddItem, Advise, ReadBulk, and any kind not otherwise mapped) |
| InvokeWrite | invoke:write | AcknowledgeAlarmRequest, MxCommandKind.Write, MxCommandKind.Write2, MxCommandKind.WriteBulk, MxCommandKind.Write2Bulk |
| InvokeSecure | invoke:secure | MxCommandKind.WriteSecured, MxCommandKind.WriteSecured2, MxCommandKind.WriteSecuredBulk, MxCommandKind.WriteSecured2Bulk, MxCommandKind.AuthenticateUser |
| MetadataRead | metadata:read | MxCommandKind.ArchestraUserToId, MxCommandKind.GetSessionState, MxCommandKind.GetWorkerInfo, GalaxyRepository.TestConnection, GalaxyRepository.GetLastDeployTime, GalaxyRepository.DiscoverHierarchy, GalaxyRepository.WatchDeployEvents |
| Admin | admin | MxCommandKind.ShutdownWorker, the default for any unrecognized request type, and the dashboard authorization policy |

The Admin constant is also referenced by DashboardAuthenticator and DashboardAuthorizationHandler so that the dashboard and the gRPC layer agree on what "admin" means.

Identity Access for Downstream Layers

Once authorization passes, GatewayGrpcAuthorizationInterceptor calls identityAccessor.Push(identity) and disposes the returned scope when the continuation completes. GatewayRequestIdentityAccessor stores the active identity in an AsyncLocal<ApiKeyIdentity?>, so the value flows across await boundaries and child tasks belonging to the same request.

public sealed class GatewayRequestIdentityAccessor : IGatewayRequestIdentityAccessor
{
    private readonly AsyncLocal<ApiKeyIdentity?> currentIdentity = new();

    public ApiKeyIdentity? Current => currentIdentity.Value;

    public IDisposable Push(ApiKeyIdentity identity)
    {
        ArgumentNullException.ThrowIfNull(identity);

        ApiKeyIdentity? previousIdentity = currentIdentity.Value;
        currentIdentity.Value = identity;

        return new IdentityScope(this, previousIdentity);
    }
}

The returned IdentityScope restores the previous value on dispose rather than clearing it. This makes the accessor safe for nested pushes, even though the current interceptor only pushes once per call. Disposing twice is a no-op because of the disposed guard inside IdentityScope.
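
The IdentityScope class itself is not shown here. The following sketch illustrates the restore-on-dispose behavior just described using a generic stand-in; AmbientSketch<T> and its nested Scope are hypothetical names, with T playing the role of ApiKeyIdentity, and the real class may differ in detail.

```csharp
#nullable enable
using System;
using System.Threading;

// Generic stand-in for the accessor; T plays the role of ApiKeyIdentity.
public sealed class AmbientSketch<T> where T : class
{
    private readonly AsyncLocal<T?> current = new();

    public T? Current => current.Value;

    public IDisposable Push(T value)
    {
        ArgumentNullException.ThrowIfNull(value);

        T? previous = current.Value;
        current.Value = value;
        return new Scope(this, previous);
    }

    private sealed class Scope : IDisposable
    {
        private readonly AmbientSketch<T> owner;
        private readonly T? previous;
        private bool disposed;

        public Scope(AmbientSketch<T> owner, T? previous)
        {
            this.owner = owner;
            this.previous = previous;
        }

        public void Dispose()
        {
            if (disposed)
            {
                return; // a second Dispose is a no-op
            }

            disposed = true;
            owner.current.Value = previous; // restore the earlier value rather than clearing
        }
    }
}
```

Restoring the captured value is what makes nesting safe: disposing an inner scope brings back the outer identity, and only disposing the outermost scope returns Current to null.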

Downstream code consumes the accessor through the IGatewayRequestIdentityAccessor interface:

public interface IGatewayRequestIdentityAccessor
{
    ApiKeyIdentity? Current { get; }

    IDisposable Push(ApiKeyIdentity identity);
}

MxAccessGatewayService takes IGatewayRequestIdentityAccessor as a constructor dependency and reads Current whenever it needs to attach the calling identity to a domain operation, which keeps the service free of header parsing or scope checks.

When AuthenticationMode.Disabled is configured, no identity is pushed, so Current returns null. Downstream code must tolerate that, just as it tolerates the absence of a scope check.

Registration

GrpcAuthorizationServiceCollectionExtensions.AddGatewayGrpcAuthorization is the single entry point that registers every component and inserts the interceptor into the gRPC pipeline:

public static IServiceCollection AddGatewayGrpcAuthorization(this IServiceCollection services)
{
    services.AddSingleton<GatewayGrpcScopeResolver>();
    services.AddSingleton<IGatewayRequestIdentityAccessor, GatewayRequestIdentityAccessor>();
    services.AddSingleton<GatewayGrpcAuthorizationInterceptor>();
    services.AddGrpc(options => options.Interceptors.Add<GatewayGrpcAuthorizationInterceptor>());

    return services;
}

Singleton lifetimes are appropriate because none of the three classes holds per-request state in instance fields; the request-scoped value lives inside the AsyncLocal on GatewayRequestIdentityAccessor. GatewayApplication calls builder.Services.AddGatewayGrpcAuthorization() during startup, and the call also performs AddGrpc, so the gateway never registers gRPC without the interceptor attached.
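
The claim that a single shared instance can carry per-request values rests on how AsyncLocal behaves across concurrent async flows. The sketch below demonstrates that behavior in isolation; AsyncLocalIsolationSketch and the "key-a"/"key-b" labels are hypothetical and stand in for two overlapping gRPC calls.

```csharp
#nullable enable
using System.Threading;
using System.Threading.Tasks;

public static class AsyncLocalIsolationSketch
{
    // One shared instance, as a singleton registration would provide.
    private static readonly AsyncLocal<string?> Current = new();

    // Simulates one request: set the ambient value, await, then read it back.
    private static async Task<string?> HandleAsync(string caller)
    {
        Current.Value = caller;
        await Task.Delay(10).ConfigureAwait(false);
        return Current.Value;
    }

    // Runs two overlapping "requests" and also reports what the caller's own flow sees.
    public static async Task<(string? First, string? Second, string? Outer)> RunAsync()
    {
        Task<string?> first = HandleAsync("key-a");
        Task<string?> second = HandleAsync("key-b");

        string?[] seen = await Task.WhenAll(first, second).ConfigureAwait(false);

        // Values set inside the async methods do not leak back into this flow.
        return (seen[0], seen[1], Current.Value);
    }
}
```

Each flow reads back the value it set even though both share one AsyncLocal instance, and the calling flow sees nothing, which is the property that lets the accessor be registered as a singleton.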