Joseph Doherty e541339c07 docs(audit): apply per-cluster judgment fixes across living docs
Resolve audit findings: correct WorkerEnvelope proto/route/metric/session
facts; rewrite auth (ZB.MOM.WW.Auth migration), dashboard (ZB.MOM.WW.Theme),
and StyleGuide (foreign-project copy-paste); document alarm subsystem, Ldap
options, and gateway alarm broker; fix client CLI flags and package paths.
2026-06-03 16:01:28 -04:00

Gateway Authentication

The gateway authentication subsystem verifies inbound API key credentials against a SQLite-backed key store, hashes secrets with a configurable pepper, and records administrative and verification events to an audit trail.

The peppered-HMAC API-key pipeline — token format, parsing, secret generation and hashing, constant-time comparison, the SQLite schema, the stores, the verifier, and the migrator — lives in the shared ZB.MOM.WW.Auth.ApiKeys package (with abstractions in ZB.MOM.WW.Auth.Abstractions), of which this gateway is the donor. The gateway references the package and binds the library's ApiKeyOptions from its own MxGateway:Authentication section through AddSqliteAuthStore, then layers the gateway-specific pieces on top: constraint enforcement, the gRPC authorization interceptor, the admin CLI, the dashboard API Keys page, and canonical audit forwarding. Types whose code is shown below for reference are owned by the shared package unless noted; the gateway does not re-implement them.

Token Format

API keys travel in the HTTP Authorization header as a bearer token shaped mxgw_<keyId>_<secret>. The mxgw_ prefix scopes parsing to gateway tokens, the <keyId> segment is the public identifier used for lookup, and <secret> is the high-entropy portion that the gateway verifies against a stored hash.

The shared library's ApiKeyParser enforces the format and rejects malformed tokens before any database round-trip:

public bool TryParseAuthorizationHeader(string? authorizationHeader, out ParsedApiKey? apiKey)
{
    apiKey = null;

    if (string.IsNullOrWhiteSpace(authorizationHeader)
        || !authorizationHeader.StartsWith(BearerPrefix, StringComparison.OrdinalIgnoreCase))
    {
        return false;
    }

    string token = authorizationHeader[BearerPrefix.Length..].Trim();

    if (!token.StartsWith(TokenPrefix, StringComparison.OrdinalIgnoreCase))
    {
        return false;
    }

    // ... the remainder of the method is elided in this excerpt ...
}

A successful parse produces a ParsedApiKey(KeyId, Secret) record. The IApiKeyParser interface exists so verification consumers can be tested without depending on header-format details.
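
The excerpt above stops before the token is split into its two segments. A minimal sketch of how that split could work follows. It assumes key ids never contain an underscore (consistent with the key id character rule described under Admin CLI), so the first underscore after the prefix is the separator. ParsedTokenSketch and TokenSplitSketch are hypothetical names for illustration, not the shared library's ApiKeyParser or ParsedApiKey.

```csharp
using System;

// Hypothetical illustration only; not the shared library's ParsedApiKey.
public sealed record ParsedTokenSketch(string KeyId, string Secret);

public static class TokenSplitSketch
{
    private const string TokenPrefix = "mxgw_";

    // Splits "mxgw_<keyId>_<secret>" at the first underscore after the prefix.
    // Assumption: key ids never contain '_'. The secret may, because it is
    // URL-safe base64 and '_' is part of that alphabet.
    public static bool TrySplit(string token, out ParsedTokenSketch? parsed)
    {
        parsed = null;

        if (string.IsNullOrEmpty(token)
            || !token.StartsWith(TokenPrefix, StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        string remainder = token[TokenPrefix.Length..];
        int separator = remainder.IndexOf('_');

        // Reject a missing separator, an empty key id, or an empty secret.
        if (separator <= 0 || separator == remainder.Length - 1)
        {
            return false;
        }

        parsed = new ParsedTokenSketch(remainder[..separator], remainder[(separator + 1)..]);
        return true;
    }
}
```

Splitting on the first underscore rather than the last matters here: a generated secret can itself contain underscores, while the key id cannot under the stated assumption.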

Parsing and Secrets

Secret generation

ApiKeySecretGenerator.Generate() is the single source of new secret material. It uses 32 bytes from RandomNumberGenerator.Fill and encodes with URL-safe base64 (no padding) so secrets can be embedded in headers without escaping:

public static string Generate()
{
    Span<byte> bytes = stackalloc byte[32];
    RandomNumberGenerator.Fill(bytes);

    return Convert.ToBase64String(bytes)
        .TrimEnd('=')
        .Replace('+', '-')
        .Replace('/', '_');
}
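
As a worked check on the encoding: 32 bytes encode to 44 standard base64 characters including one trailing '=', so the trimmed secret is 43 characters long and carries 256 bits of entropy. The sketch below applies the same encoding steps to caller-supplied bytes and reverses them. SecretEncodingSketch is a hypothetical helper for illustration; the source does not say the library exposes a decoder.

```csharp
using System;

public static class SecretEncodingSketch
{
    // Same encoding steps as Generate(), applied to caller-supplied bytes.
    public static string Encode(ReadOnlySpan<byte> bytes) =>
        Convert.ToBase64String(bytes)
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');

    // Reverses the encoding: restore the standard alphabet, then re-pad
    // with '=' up to the next multiple of four characters.
    public static byte[] Decode(string secret)
    {
        string standard = secret.Replace('-', '+').Replace('_', '/');
        int paddedLength = (standard.Length + 3) / 4 * 4;

        return Convert.FromBase64String(standard.PadRight(paddedLength, '='));
    }
}
```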

Peppered hashing

The shared library's ApiKeySecretHasher (behind IApiKeySecretHasher) hashes secrets with HMACSHA256 keyed by a server-side pepper. The pepper lives outside the database and is resolved through an IApiKeyPepperProvider — the gateway wires the configuration-backed provider so the pepper comes from IConfiguration lookup against MxGateway:ApiKeyPepper (PepperSecretName):

public byte[] HashSecret(string secret)
{
    string pepper = GetPepper();
    byte[] pepperBytes = Encoding.UTF8.GetBytes(pepper);
    byte[] secretBytes = Encoding.UTF8.GetBytes(secret);

    using HMACSHA256 hmac = new(pepperBytes);

    return hmac.ComputeHash(secretBytes);
}

The pepper is intentionally not stored alongside the hash: an attacker who exfiltrates only the SQLite file holds the hashes but lacks the keying material needed to brute-force candidate secrets offline, even if the hashing algorithm is known. If the pepper is missing, the hasher throws ApiKeyPepperUnavailableException, which the verifier converts to a distinct failure code rather than treating it as a credential mismatch.
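
To illustrate why the pepper matters, the sketch below mirrors the HashSecret shape but takes the pepper as a parameter, so the same secret can be hashed under two different peppers. PepperedHashSketch is a hypothetical name, and passing the pepper in directly is a simplification; the library resolves it through IApiKeyPepperProvider.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PepperedHashSketch
{
    // HMAC-SHA256 of the secret, keyed by the pepper. The result is 32 bytes.
    public static byte[] Hash(string pepper, string secret)
    {
        using HMACSHA256 hmac = new(Encoding.UTF8.GetBytes(pepper));

        return hmac.ComputeHash(Encoding.UTF8.GetBytes(secret));
    }

    // Recomputes the hash for a presented secret and compares it with the
    // stored hash in constant time.
    public static bool Matches(byte[] storedHash, string pepper, string presentedSecret) =>
        CryptographicOperations.FixedTimeEquals(storedHash, Hash(pepper, presentedSecret));
}
```

A hash stored under one pepper does not match the same secret hashed under another, which is why a copy of the database alone is not enough to test candidate secrets.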

Verification

The shared library's IApiKeyVerifier.VerifyAsync(authorizationHeader, cancellationToken) owns the whole verification flow — the gateway interceptor hands it the raw authorization header value and never parses the token itself:

  1. Parse the Authorization header into the key id and secret.
  2. Look up the record by key id.
  3. Reject revoked records.
  4. Hash the presented secret with the configured pepper.
  5. Compare hashes with CryptographicOperations.FixedTimeEquals to avoid timing oracles.
  6. Stamp last_used_utc and return an identity.

VerifyAsync returns an ApiKeyVerification value with a Succeeded flag and a nullable Identity. On failure the result is discriminated so the caller can tell parse errors, missing pepper, missing or revoked keys, and secret mismatch apart for audit detail — without leaking which check failed to the client. The gateway interceptor treats any non-success uniformly as Unauthenticated (see Authorization):

ApiKeyVerification verification = await apiKeyVerifier
    .VerifyAsync(authorizationHeader ?? string.Empty, context.CancellationToken)
    .ConfigureAwait(false);

if (!verification.Succeeded || verification.Identity is null)
{
    throw new RpcException(new Status(StatusCode.Unauthenticated, "Missing or invalid API key."));
}

The shared verifier returns ZB.MOM.WW.Auth.Abstractions.ApiKeys.ApiKeyIdentity, which carries the persisted constraints as an opaque JSON string. The gateway's GatewayApiKeyIdentityMapper.ToGatewayIdentity projects it onto the gateway-local ApiKeyIdentity record, which exposes only non-secret fields (KeyId, KeyPrefix, DisplayName, Scopes) plus the deserialized Constraints, and is the type downstream authorization code consumes.
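
The six verification steps listed above can be sketched end to end against an in-memory dictionary. Everything in this block (VerifyOutcome, StoredKeySketch, VerifierSketch) is a hypothetical stand-in rather than the shared library's IApiKeyVerifier or ApiKeyVerification. The header parsing is simplified to a fixed "Bearer mxgw_" prefix with the key id ending at the first underscore, which is an assumption of the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public enum VerifyOutcome { Success, ParseError, PepperUnavailable, KeyNotFound, KeyRevoked, SecretMismatch }

public sealed class StoredKeySketch
{
    public StoredKeySketch(byte[] secretHash) => SecretHash = secretHash;

    public byte[] SecretHash { get; }
    public DateTimeOffset? RevokedUtc { get; set; }
    public DateTimeOffset? LastUsedUtc { get; set; }
}

public sealed class VerifierSketch
{
    private const string Prefix = "Bearer mxgw_";
    private readonly Dictionary<string, StoredKeySketch> keys;
    private readonly string? pepper;

    public VerifierSketch(Dictionary<string, StoredKeySketch> keys, string? pepper)
    {
        this.keys = keys;
        this.pepper = pepper;
    }

    public VerifyOutcome Verify(string authorizationHeader)
    {
        // 1. Parse the header into key id and secret.
        if (!authorizationHeader.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
        {
            return VerifyOutcome.ParseError;
        }

        string rest = authorizationHeader[Prefix.Length..];
        int split = rest.IndexOf('_');
        if (split <= 0 || split == rest.Length - 1)
        {
            return VerifyOutcome.ParseError;
        }

        // 2. Look up the record by key id.
        if (!keys.TryGetValue(rest[..split], out StoredKeySketch? key))
        {
            return VerifyOutcome.KeyNotFound;
        }

        // 3. Reject revoked records.
        if (key.RevokedUtc is not null)
        {
            return VerifyOutcome.KeyRevoked;
        }

        // 4. Hash the presented secret with the pepper.
        if (string.IsNullOrEmpty(pepper))
        {
            return VerifyOutcome.PepperUnavailable;
        }

        using HMACSHA256 hmac = new(Encoding.UTF8.GetBytes(pepper));
        byte[] presented = hmac.ComputeHash(Encoding.UTF8.GetBytes(rest[(split + 1)..]));

        // 5. Compare in constant time.
        if (!CryptographicOperations.FixedTimeEquals(presented, key.SecretHash))
        {
            return VerifyOutcome.SecretMismatch;
        }

        // 6. Stamp the last-used time and report success.
        key.LastUsedUtc = DateTimeOffset.UtcNow;
        return VerifyOutcome.Success;
    }
}
```

The distinct outcomes are useful for audit detail only. A caller facing clients would still collapse every non-success value into a single Unauthenticated response, as the interceptor does.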

Storage

The gateway keeps API key state in a dedicated SQLite database. SQLite is sufficient because credential volume is small, the gateway runs as a single process, and the file is straightforward to back up and rotate independently of the main application data.

Connection factory

The shared library's AuthSqliteConnectionFactory (registered by AddZbApiKeyAuth) reads the bound ApiKeyOptions.SqlitePath — which the gateway populates from MxGateway:Authentication:SqlitePath — ensures the parent directory exists, and builds a connection string in ReadWriteCreate mode so first-run installations can create the file without manual provisioning. Connection pooling is enabled and the connection string carries a non-zero DefaultTimeout:

SqliteConnectionStringBuilder builder = new()
{
    DataSource = sqlitePath,
    Mode = SqliteOpenMode.ReadWriteCreate,
    Pooling = true,
    DefaultTimeout = (int)BusyTimeout.TotalSeconds,
};

Every store opens its connection through OpenConnectionAsync, which opens the connection and then applies PRAGMA journal_mode=WAL and PRAGMA busy_timeout. WAL is a persistent database-level setting so re-applying it per connection is a cheap no-op; busy_timeout is per-connection state. Because MarkKeyUsedAsync runs on every authenticated request and the canonical audit writer appends to the same file, this lets concurrent readers and writers retry briefly instead of surfacing SQLITE_BUSY as a hard failure on the request path.

Schema

The shared library's SqliteAuthSchema declares the API-key table names and the current schema version as constants. Four tables live in the database file:

  • api_keys stores key_id, key_prefix, the secret_hash blob, display_name, serialized scopes, optional serialized constraints, and the created_utc, last_used_utc, and revoked_utc timestamps.
  • api_key_audit is the shared library's append-only audit log keyed by an autoincrement audit_id with key_id, event_type, remote_address, created_utc, and details columns. The gateway overrides the library audit store (see Audit trail), so this table is left in place but unused at runtime — nothing writes to it.
  • audit_event is the gateway-owned canonical audit table written by SqliteCanonicalAuditStore. It lives in the same SQLite file (reusing the library's AuthSqliteConnectionFactory) and is where every gateway audit event actually lands. See Audit trail.
  • schema_version carries a single row whose version column is matched against SqliteAuthSchema.CurrentVersion.

Read paths

The shared library's SqliteApiKeyStore (IApiKeyStore) handles the two reads needed at request time: FindByKeyIdAsync returns any record (so revoked keys can be reported distinctly) and FindActiveByKeyIdAsync filters to non-revoked rows. MarkKeyUsedAsync updates last_used_utc only for non-revoked rows so a freshly revoked key cannot have its timestamp refreshed by a racing verification.

ApiKeyRecord is the in-memory projection. ApiKeyRecordReader.Read is shared by every read path so column ordering is defined in one place:

public static ApiKeyRecord Read(SqliteDataReader reader)
{
    return new ApiKeyRecord(
        KeyId: reader.GetString(0),
        KeyPrefix: reader.GetString(1),
        SecretHash: (byte[])reader["secret_hash"],
        DisplayName: reader.GetString(3),
        Scopes: ApiKeyScopeSerializer.Deserialize(reader.GetString(4)),
        Constraints: ApiKeyConstraintSerializer.Deserialize(reader.IsDBNull(5) ? null : reader.GetString(5)),
        CreatedUtc: DateTimeOffset.Parse(reader.GetString(6), System.Globalization.CultureInfo.InvariantCulture),
        LastUsedUtc: ReadNullableDateTimeOffset(reader, 7),
        RevokedUtc: ReadNullableDateTimeOffset(reader, 8));
}

Write paths

The shared library's SqliteApiKeyAdminStore (IApiKeyAdminStore) implements the administrative mutations:

  • CreateAsync accepts an ApiKeyCreateRequest.
  • RevokeAsync sets revoked_utc only when the key is not already revoked.
  • RotateAsync replaces secret_hash, clears last_used_utc, and clears revoked_utc so a rotated key is immediately usable.
  • DeleteAsync permanently removes a row, but only when revoked_utc IS NOT NULL. Active keys are left untouched (the call returns false), so the revoke event lands in the audit log before the row disappears.

Because RotateAsync clears revoked_utc, rotating a previously revoked key reactivates it. The dashboard API Keys page therefore offers the Rotate (and Revoke) actions only for keys whose status is Active; revoked keys instead show a Delete action that calls DeleteAsync, so an operator can permanently remove a revoked row without ever risking un-revocation as a side effect of a rotation.
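
The revoke, rotate, and delete rules described above can be modelled as state transitions on a single row. KeyRowSketch is a hypothetical in-memory model for illustration; the real store applies these rules in SQL against the api_keys table.

```csharp
using System;

public sealed class KeyRowSketch
{
    public byte[] SecretHash { get; private set; } = Array.Empty<byte>();
    public DateTimeOffset? LastUsedUtc { get; set; }
    public DateTimeOffset? RevokedUtc { get; private set; }
    public bool Deleted { get; private set; }

    // Sets the revocation timestamp only when the row is not already revoked.
    public bool Revoke(DateTimeOffset now)
    {
        if (RevokedUtc is not null)
        {
            return false;
        }

        RevokedUtc = now;
        return true;
    }

    // Replaces the hash and clears both timestamps, so rotating a revoked
    // key makes it active again.
    public void Rotate(byte[] newSecretHash)
    {
        SecretHash = newSecretHash;
        LastUsedUtc = null;
        RevokedUtc = null;
    }

    // Deletes only revoked rows; an active row is left untouched.
    public bool Delete()
    {
        if (RevokedUtc is null)
        {
            return false;
        }

        Deleted = true;
        return true;
    }
}
```

The model makes the un-revocation side effect easy to see: after Revoke followed by Rotate, Delete returns false again because the row is active.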

Audit trail

All gateway audit flows through a single canonical AuditEvent written to the gateway-owned audit_event table, not the shared library's api_key_audit table. The gateway adopts ZB.MOM.WW.Audit and overrides the library's IApiKeyAuditStore registration with CanonicalForwardingApiKeyAuditStore. That adapter receives each library-emitted ApiKeyAuditEntry — including the library-internal admin-command verbs (create-key, revoke-key, rotate-key, init-db) the gateway cannot edit — canonicalizes it onto an AuditEvent, and forwards it through IAuditWriter (CanonicalAuditWriter), which persists to audit_event via SqliteCanonicalAuditStore.

Because the adapter is registered after AddZbApiKeyAuth, it is the IApiKeyAuditStore that the admin commands resolve and that the dashboard "recent audit" view reads through IApiKeyAuditStore.ListRecentAsync. The library's own SqliteApiKeyAuditStore and its api_key_audit table are therefore unused at runtime — the override is the only writer. Audit rows are kept even after the referenced key is revoked because the audit history is the durable record of administrative action; non-key-scoped events such as init-db carry no key id.

This canonical-forwarding wiring lives under src/ZB.MOM.WW.MxGateway.Server/Security/Audit/; the audit store override and writer are gateway types, while the entry shape and admin verbs originate in the shared library.

Migration

Schema bring-up for the API-key tables is owned by the shared library's SqliteAuthStoreMigrator, wired by AddZbApiKeyAuth along with its migration hosted service. It executes the migration inside a single transaction so a partial failure leaves the database untouched, refuses to start when the on-disk schema version is newer than the binary supports, and idempotently creates the schema:

if (existingVersion > SqliteAuthSchema.CurrentVersion)
{
    throw new AuthStoreMigrationException(
        $"Auth database schema version {existingVersion} is newer than supported version {SqliteAuthSchema.CurrentVersion}.");
}

await ApplyVersionOneAsync(connection, transaction, cancellationToken).ConfigureAwait(false);

await transaction.CommitAsync(cancellationToken).ConfigureAwait(false);

The library's migration hosted service runs the migrator at startup. Operators who manage schema out-of-band can use the admin CLI's init-db command instead.

Admin CLI

ApiKeyAdminCommandLineParser.Parse (a gateway type) recognises a leading apikey argument and dispatches to one of the subcommands declared by ApiKeyAdminCommandKind. Each parsed invocation produces an ApiKeyAdminCommand (or an ApiKeyAdminParseResult carrying an error). The parser validates requested --scopes against GatewayScopes.All (see Authorization) so a non-canonical scope string cannot be persisted on a key. ApiKeyAdminCliRunner then drives the shared library's ApiKeyAdminCommands — which the gateway registers over the already-wired stores, pepper provider, and migrator — to execute the command, and writes either text or JSON output via ApiKeyAdminOutput. The returned ApiKeyAdminListedKey projection deliberately omits the secret_hash so listing a database does not surface hash material.

The supported subcommands match ApiKeyAdminCommandKind exactly:

| Subcommand | Required options | Behaviour |
| --- | --- | --- |
| init-db | none | Runs the migrator and records an audit entry. |
| create-key | --key-id, --display-name | Generates a new secret, stores its peppered hash and optional constraints, and prints the assembled mxgw_<keyId>_<secret> token. |
| list-keys | none | Lists every stored key with its scopes, constraints, and revocation state. |
| revoke-key | --key-id | Sets revoked_utc if the key is currently active. |
| rotate-key | --key-id | Replaces the secret hash and prints the new token. |

Examples:

mxgateway apikey init-db
mxgateway apikey create-key --key-id ops.alice --display-name "Alice (ops)" --scopes invoke:read,invoke:write
mxgateway apikey create-key --key-id area1.reader --display-name "Area 1 reader" --scopes invoke:read,metadata:read --read-subtree "Area1/*" --browse-subtree "Area1/*"
mxgateway apikey list-keys --json
mxgateway apikey revoke-key --key-id ops.alice
mxgateway apikey rotate-key --key-id ops.alice

Constraint flags are optional. --read-subtree, --write-subtree, --read-tag-glob, --write-tag-glob, and --browse-subtree are repeatable. --max-write-classification accepts one integer. --read-alarm-only and --read-historized-only are boolean flags. Existing rows with null constraints remain fully unconstrained after migration.

Key ids are restricted by the parser to ASCII letters, digits, periods, and hyphens so they remain safe to embed in the token format and in URL paths used by administrative tooling.
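
A minimal sketch of that character rule as a regular expression follows. KeyIdRuleSketch is a hypothetical helper, not the gateway's parser, and the requirement that a key id be non-empty is an assumption of the sketch.

```csharp
using System.Text.RegularExpressions;

public static class KeyIdRuleSketch
{
    // ASCII letters, digits, periods, and hyphens only. The \z anchor is used
    // instead of $ so that a trailing newline is rejected as well.
    private static readonly Regex Allowed =
        new(@"^[A-Za-z0-9.-]+\z", RegexOptions.CultureInvariant);

    public static bool IsValid(string keyId) => Allowed.IsMatch(keyId);
}
```

Under this rule ops.alice and area1.reader are accepted, while ops_alice is rejected, which keeps the underscore free to act as the separator in the token format.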

The CLI is not the only management surface: the dashboard API Keys page creates, rotates, revokes, and deletes (revoked-only) keys through the same IApiKeyAdminStore. Every destructive dashboard action is gated by a confirmation dialog and emits its own audit event (dashboard-create-key, dashboard-rotate-key, dashboard-revoke-key, dashboard-delete-key). See Gateway Dashboard Design.

Scope Serialization

Scopes are persisted as a single TEXT column rather than a join table because the set is small, never queried by membership at the database level, and changes atomically with the owning row. The shared library's ApiKeyScopeSerializer.Serialize writes a JSON array sorted with StringComparer.Ordinal so equivalent scope sets produce byte-identical column values, which makes audit diffing and database comparisons deterministic:

public static string Serialize(IReadOnlySet<string> scopes)
{
    return JsonSerializer.Serialize(scopes.Order(StringComparer.Ordinal));
}

public static IReadOnlySet<string> Deserialize(string value)
{
    if (string.IsNullOrWhiteSpace(value))
    {
        return new HashSet<string>(StringComparer.Ordinal);
    }

    string[]? scopes = JsonSerializer.Deserialize<string[]>(value);

    return new HashSet<string>(scopes ?? [], StringComparer.Ordinal);
}

Deserialize tolerates an empty column by returning an empty set so older rows or hand-edited records do not crash the verifier.
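
The determinism claim can be checked with a small usage sketch: two scope sets built in different insertion orders should serialize to the same string. ScopeOrderingSketch is a hypothetical stand-in that follows the same approach as Serialize (ordinal sort, then a JSON array); it is not the library's ApiKeyScopeSerializer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class ScopeOrderingSketch
{
    // Ordinal sort first, then serialize as a JSON array of strings.
    public static string Serialize(IEnumerable<string> scopes) =>
        JsonSerializer.Serialize(scopes.OrderBy(s => s, StringComparer.Ordinal).ToArray());

    // True when both scope collections produce the same serialized column value.
    public static bool SameColumnValue(IEnumerable<string> left, IEnumerable<string> right) =>
        string.Equals(Serialize(left), Serialize(right), StringComparison.Ordinal);
}
```

Calling SameColumnValue with { "invoke:write", "invoke:read" } and { "invoke:read", "invoke:write" } returns true, because both inputs are sorted into the same order before serialization.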

The API-key model above guards the gRPC surface. Interactive dashboard requests use a separate LDAP-backed cookie scheme (see Gateway Dashboard Design). Two timeouts and a few configuration knobs govern that cookie:

  • Cookie idle timeout — 8 hours. DashboardServiceCollectionExtensions applies the shared ZbCookieDefaults.Apply hardened cookie defaults (HttpOnly, SameSite=Strict, secure policy, sliding expiration) but overrides the library's 30-minute default with an 8-hour idle timeout, so an active operator is not signed out mid-shift. The expiration is sliding, so each authenticated request resets the window.
  • Hub bearer token — 30 minutes. SignalR hub connections cannot always carry the HttpOnly cookie (the client SignalR JS may resolve the cookie scope to loopback), so the dashboard mints a short-lived data-protected bearer at /hubs/token via HubTokenService. The token lifetime is 30 minutes; the hubs accept either it or the cookie.
  • MxGateway:Dashboard:CookieName overrides the cookie name (default MxGatewayDashboard, from DashboardAuthenticationDefaults.CookieName). Two gateway instances on the same host but different ports share a cookie scope — host+path, not port — so giving each a distinct name keeps their dashboard sessions from clobbering each other. Changing it signs out existing sessions on next deploy.
  • MxGateway:Dashboard:RequireHttpsCookie (default true) restricts the cookie to HTTPS via CookieSecurePolicy.Always. Set it to false for plain-HTTP dev so the cookie uses SameAsRequest; leaving it true while serving the dashboard over plain HTTP from a non-localhost host breaks login, because browsers drop Secure cookies set over HTTP.

The dashboard issues claims through the shared ZB.MOM.WW.Auth.AspNetCore.ZbClaimTypes (e.g. ZbClaimTypes.Username = zb:username, ZbClaimTypes.Name = ClaimTypes.Name so Identity.Name resolves, ZbClaimTypes.Role = ClaimTypes.Role so IsInRole/[Authorize(Roles=...)] work). Cookie hardening defaults come from ZbCookieDefaults. Both live in the shared Auth packages, not the gateway.

Registration

AuthStoreServiceCollectionExtensions.AddSqliteAuthStore is the gateway entry point. It does not register the parser, hasher, verifier, stores, or migrator directly — those come from the shared package. Instead it delegates to the package's AddZbApiKeyAuth and then layers the gateway-specific audit and CLI services:

public static IServiceCollection AddSqliteAuthStore(
    this IServiceCollection services,
    IConfiguration configuration)
{
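    // effectiveConfig (construction elided in this excerpt) is the supplied configuration
    // with the gateway's fallback defaults layered underneath it.
    //
    // Register the shared API-key provider: binds ApiKeyOptions from MxGateway:Authentication,
    // wires up the SQLite stores, the configuration-backed pepper provider, the verifier, the
    // migrator and the migration hosted service.
    services.AddZbApiKeyAuth(effectiveConfig, AuthenticationSectionPath);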

    // Gateway-owned canonical audit (ZB.MOM.WW.Audit) in the same SQLite file.
    services.AddSingleton(sp =>
        new SqliteCanonicalAuditStore(sp.GetRequiredService<AuthSqliteConnectionFactory>()));
    services.AddSingleton<IAuditWriter>(sp => new CanonicalAuditWriter(/* ... */));

    // Override the library's IApiKeyAuditStore so every audit lands in audit_event.
    services.AddSingleton<IApiKeyAuditStore, CanonicalForwardingApiKeyAuditStore>();

    // The shared admin command set, driven by the gateway CLI and dashboard.
    services.AddSingleton(sp => new ApiKeyAdminCommands(/* ... */));
    services.AddSingleton<ApiKeyAdminCliRunner>();

    return services;
}

The gateway pins its own API-key contract — token prefix mxgw and the pepper key MxGateway:ApiKeyPepper — by layering those as fallback defaults under the supplied configuration before calling AddZbApiKeyAuth, because ApiKeyOptions is an init-only record that must be bound with those values present rather than mutated afterward. Explicit configuration still wins. AddZbApiKeyAuth binds ApiKeyOptions from the MxGateway:Authentication section and registers the connection factory, stores, pepper provider, verifier, migrator, and migration hosted service.

The audit-store override is registered after AddZbApiKeyAuth so it replaces the library's TryAddSingleton registration. The shared admin command set is not auto-registered by AddZbApiKeyAuth, so the gateway registers ApiKeyAdminCommands itself over the wired stores; the CLI and dashboard drive it. The library services are registered as singletons, which is safe because each operation opens its own short-lived SqliteConnection through the factory.