Compare commits

..

3 Commits

Author SHA1 Message Date
Joseph Doherty e13152f340 test: remove redundant HostBuildingCollection workaround (shared lib no longer installs a global frozen logger) 2026-06-01 08:47:21 -04:00
Joseph Doherty deba5ed115 refactor(logging): correlation scope + redaction on shared ILogRedactor seam
Move the per-request correlation context and secret redaction off the MEL
mechanism onto the Serilog primitives the shared bootstrap consumes.

- GatewayRequestLoggingMiddlewareExtensions now pushes the correlation
  properties (SessionId / WorkerProcessId / CorrelationId / CommandMethod /
  ClientIdentity) via Serilog LogContext.PushProperty for the request lifetime
  and pops them on completion, replacing MEL ILogger.BeginScope. Header parsing
  and property names are unchanged; GatewayLogScope remains the data holder.
- Add GatewayLogRedactorAdapter : ILogRedactor delegating to the existing
  GatewayLogRedactor policy (mxgw_ bearer tokens / credential-bearing command
  values), registered as a singleton so the shared RedactionEnricher masks
  secrets on every event. Remove the now-dead GatewayLoggerExtensions MEL helper.
- Tests: add GatewayLogRedactorAdapterTests; serialize the four host-building
  test classes into one non-parallel collection (HostBuildingCollection) so the
  process-wide Serilog bootstrap logger is not frozen by two concurrent host
  builds racing in parallel collections.

The net48/x86 worker is untouched.
2026-06-01 08:06:28 -04:00
Joseph Doherty 4bf71a0b2c refactor(logging): adopt ZB.MOM.WW.Telemetry.Serilog bootstrap
Swap the gateway process logging from the default Microsoft.Extensions.Logging
provider onto the shared ZB.MOM.WW.Telemetry.Serilog two-stage bootstrap.

- Add a cross-repo ProjectReference to ZB.MOM.WW.Telemetry.Serilog (transitively
  brings the Telemetry core package); the referenced project resolves its own
  Directory.Build.props / Directory.Packages.props so it does not perturb this build.
- Replace MEL wiring in GatewayApplication with builder.AddZbSerilog(ServiceName=
  "mxgateway"; SiteId/NodeRole read from MxGateway:Telemetry when present) and add
  app.UseSerilogRequestLogging().
- Add a Serilog section (Console + daily rolling File sinks, MinimumLevel) to
  appsettings.json and a MinimumLevel override to appsettings.Development.json,
  replacing the old MEL Logging sections.

The net48/x86 worker is untouched. Correlation scope + redaction move to the
shared ILogRedactor seam in the follow-up commit.
2026-06-01 08:03:49 -04:00
149 changed files with 3364 additions and 6660 deletions
-5
View File
@@ -147,8 +147,3 @@ generated-scratch/
# Keep empty directories with .gitkeep files when needed
!.gitkeep
# Documentation review artifacts (CommentChecker output)
*-docs-issues.md
*-docs-fixed.md
*-docs-final.md
+1 -1
View File
@@ -100,7 +100,7 @@ When source code changes, build and test the affected component before reporting
## Design Sources To Consult Before Non-Trivial Changes
- `gateway.md` — top-level architecture, command/event surface, IPC envelope, STA thread model, fault handling.
- `glauth.md`shared GLAuth LDAP server (`10.100.0.35:3893`, base DN `dc=zb,dc=local`, source of truth `scadaproj/infra/glauth/`) used for dev authn. Dashboard test users (`multi-role`/`password` = Administrator, `gw-viewer`/`password` = Viewer) and the role→capability mapping live there.
- `glauth.md`local LDAP server (GLAuth on `localhost:3893`, base DN `dc=lmxopcua,dc=local`) used for dev authn. Pre-provisioned users (`admin/admin123`, `readonly/readonly123`, etc.) and the role→capability mapping live there.
- `docs/DesignDecisions.md` — v1 choices (MXAccess COM target `LMXProxyServerClass` from `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll`, API-key-in-SQLite auth, fail-fast event backpressure, etc.).
- `docs/GatewayProcessDesign.md`, `docs/MxAccessWorkerInstanceDesign.md`, `docs/WorkerFrameProtocol.md`, `docs/WorkerProcessLauncher.md` — detailed component designs.
- `docs/GatewayConfiguration.md` — full `MxGateway:*` options bound by `GatewayOptions` and validated at startup by `GatewayOptionsValidator`.
-19
View File
@@ -107,7 +107,6 @@ public sealed class MxGatewayClientOptions
public required string ApiKey { get; init; }
public bool UseTls { get; init; }
public string? CaCertificatePath { get; init; }
public bool RequireCertificateValidation { get; init; }
public string? ServerNameOverride { get; init; }
public TimeSpan ConnectTimeout { get; init; } = TimeSpan.FromSeconds(10);
public TimeSpan DefaultCallTimeout { get; init; } = TimeSpan.FromSeconds(30);
@@ -125,24 +124,6 @@ or subscription changes because those calls can partially succeed in MXAccess.
API key may be loaded from `MXGATEWAY_API_KEY` by the CLI, not implicitly by the
library constructor unless a helper explicitly says it does that.
### TLS trust posture
The gateway can serve a self-signed certificate it generates itself (it has no
PKI). To make that usable, TLS is **lenient by default**: when `UseTls` is set
and `CaCertificatePath` is empty, `CreateHttpHandler` installs a
`RemoteCertificateValidationCallback` that returns `true`, so the gateway's
self-signed certificate is accepted without verification.
To verify the gateway instead:
- set `CaCertificatePath` to pin a CA — validated via a `CustomRootTrust`
`X509Chain` against that root, and the callback additionally rejects a
hostname/SAN mismatch (`RemoteCertificateNameMismatch`); or
- set `RequireCertificateValidation` to `true` to keep the default OS/system-trust
verification on a connection with no pinned CA.
Pinning a CA always wins over the lenient default.
## Auth Interceptor
Use a gRPC call credentials/interceptor layer to attach:
-11
View File
@@ -287,17 +287,6 @@ Use TLS options for a secured gateway:
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- smoke --endpoint https://ZB.MOM.WW.MxGateway.example.local:5001 --tls --ca-file C:\certs\mxgateway-ca.pem --server-name ZB.MOM.WW.MxGateway.example.local --api-key-env MXGATEWAY_API_KEY --item Area001.Pump001.Speed --json
```
### TLS trust
The gateway can auto-generate its own self-signed certificate (it has no PKI), so
the client is **lenient by default**: a TLS connection (`UseTls` / `--tls`) with
no pinned CA accepts whatever certificate the gateway presents. To verify
instead, pin a CA with `CaCertificatePath` / `--ca-file` (this path also enforces
the certificate hostname/SAN match), or set `RequireCertificateValidation` to
force OS/system-trust verification without pinning. Use `ServerNameOverride` /
`--server-name` when the dialed host differs from the certificate SAN. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
## Integration Checks
Run live checks only when a gateway and MXAccess-backed worker are available:
@@ -1,85 +0,0 @@
using System.Net.Http;
using System.Net.Security;
using ZB.MOM.WW.MxGateway.Client;
namespace ZB.MOM.WW.MxGateway.Client.Tests;
public sealed class MxGatewayClientTlsHandlerTests
{
/// <summary>
/// Verifies that when TLS is used with no pinned CA and RequireCertificateValidation is false (default),
/// the handler installs an accept-all callback so the gateway's self-signed cert is trusted.
/// The callback must return true regardless of chain errors.
/// </summary>
[Fact]
public void Handler_SkipsVerification_WhenTlsAndNoCaPinned()
{
MxGatewayClientOptions options = new()
{
Endpoint = new Uri("https://localhost:5120"),
ApiKey = "k",
UseTls = true,
};
using SocketsHttpHandler handler = MxGatewayClient.CreateHttpHandlerForTests(options);
Assert.NotNull(handler.SslOptions.RemoteCertificateValidationCallback);
Assert.True(handler.SslOptions.RemoteCertificateValidationCallback!(null!, null!, null, SslPolicyErrors.RemoteCertificateChainErrors));
}
/// <summary>
/// Verifies that when RequireCertificateValidation is true, the callback is left null
/// so the OS trust store performs validation.
/// </summary>
[Fact]
public void Handler_KeepsDefaultVerification_WhenRequireCertificateValidation()
{
MxGatewayClientOptions options = new()
{
Endpoint = new Uri("https://localhost:5120"),
ApiKey = "k",
UseTls = true,
RequireCertificateValidation = true,
};
using SocketsHttpHandler handler = MxGatewayClient.CreateHttpHandlerForTests(options);
Assert.Null(handler.SslOptions.RemoteCertificateValidationCallback);
}
}
public sealed class GalaxyRepositoryClientTlsHandlerTests
{
/// <summary>
/// Verifies that when TLS is used with no pinned CA and RequireCertificateValidation is false (default),
/// the Galaxy client handler installs an accept-all callback so the gateway's self-signed cert is trusted.
/// The callback must return true regardless of chain errors.
/// </summary>
[Fact]
public void Handler_SkipsVerification_WhenTlsAndNoCaPinned()
{
MxGatewayClientOptions options = new()
{
Endpoint = new Uri("https://localhost:5120"),
ApiKey = "k",
UseTls = true,
};
using SocketsHttpHandler handler = GalaxyRepositoryClient.CreateHttpHandlerForTests(options);
Assert.NotNull(handler.SslOptions.RemoteCertificateValidationCallback);
Assert.True(handler.SslOptions.RemoteCertificateValidationCallback!(null!, null!, null, SslPolicyErrors.RemoteCertificateChainErrors));
}
/// <summary>
/// Verifies that when RequireCertificateValidation is true, the Galaxy client callback is left null
/// so the OS trust store performs validation.
/// </summary>
[Fact]
public void Handler_KeepsDefaultVerification_WhenRequireCertificateValidation()
{
MxGatewayClientOptions options = new()
{
Endpoint = new Uri("https://localhost:5120"),
ApiKey = "k",
UseTls = true,
RequireCertificateValidation = true,
};
using SocketsHttpHandler handler = GalaxyRepositoryClient.CreateHttpHandlerForTests(options);
Assert.Null(handler.SslOptions.RemoteCertificateValidationCallback);
}
}
@@ -490,10 +490,7 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
.ConfigureAwait(false);
}
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options) =>
CreateHttpHandlerForTests(options);
internal static SocketsHttpHandler CreateHttpHandlerForTests(MxGatewayClientOptions options)
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options)
{
SocketsHttpHandler handler = new()
{
@@ -513,11 +510,6 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
X509Certificate2 trustedRoot = X509CertificateLoader.LoadCertificateFromFile(options.CaCertificatePath);
handler.SslOptions.RemoteCertificateValidationCallback = (_, certificate, chain, errors) =>
{
if ((errors & System.Net.Security.SslPolicyErrors.RemoteCertificateNameMismatch) != 0)
{
return false;
}
if (certificate is null)
{
return false;
@@ -533,10 +525,6 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
return customChain.Build(certificateToValidate);
};
}
else if (!options.RequireCertificateValidation)
{
handler.SslOptions.RemoteCertificateValidationCallback = (_, _, _, _) => true;
}
}
return handler;
@@ -315,10 +315,7 @@ public sealed class MxGatewayClient : IAsyncDisposable
.ConfigureAwait(false);
}
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options) =>
CreateHttpHandlerForTests(options);
internal static SocketsHttpHandler CreateHttpHandlerForTests(MxGatewayClientOptions options)
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options)
{
SocketsHttpHandler handler = new()
{
@@ -338,11 +335,6 @@ public sealed class MxGatewayClient : IAsyncDisposable
X509Certificate2 trustedRoot = X509CertificateLoader.LoadCertificateFromFile(options.CaCertificatePath);
handler.SslOptions.RemoteCertificateValidationCallback = (_, certificate, chain, errors) =>
{
if ((errors & System.Net.Security.SslPolicyErrors.RemoteCertificateNameMismatch) != 0)
{
return false;
}
if (certificate is null)
{
return false;
@@ -358,10 +350,6 @@ public sealed class MxGatewayClient : IAsyncDisposable
return customChain.Build(certificateToValidate);
};
}
else if (!options.RequireCertificateValidation)
{
handler.SslOptions.RemoteCertificateValidationCallback = (_, _, _, _) => true;
}
}
return handler;
@@ -27,14 +27,6 @@ public sealed class MxGatewayClientOptions
/// </summary>
public string? CaCertificatePath { get; init; }
/// <summary>
/// When true, TLS connections without a pinned <see cref="CaCertificatePath"/>
/// use the OS trust store. When false (default), the gateway certificate is
/// accepted without verification — appropriate for this internal tool's
/// auto-generated self-signed certificate. Pinning a CA always verifies.
/// </summary>
public bool RequireCertificateValidation { get; init; }
/// <summary>
/// Gets the server name override for SNI during TLS handshake.
/// </summary>
@@ -27,10 +27,4 @@
<None Include="..\README.md" Pack="true" PackagePath="\" />
</ItemGroup>
<ItemGroup>
<AssemblyAttribute Include="System.Runtime.CompilerServices.InternalsVisibleTo">
<_Parameter1>ZB.MOM.WW.MxGateway.Client.Tests</_Parameter1>
</AssemblyAttribute>
</ItemGroup>
</Project>
-17
View File
@@ -104,23 +104,6 @@ Support:
- `credentials.NewClientTLSFromFile`,
- custom `tls.Config` for advanced callers.
### Trust posture
The gateway can serve a self-signed certificate it generates itself (it has no
PKI). To make that usable, TLS is **lenient by default**: when `Plaintext` is
`false` and no `CACertFile`/`TLSConfig`/`TransportCredentials` is supplied,
`buildCredentials` dials with `tls.Config{InsecureSkipVerify: true}` (carrying
`ServerNameOverride` as the SNI when set), so the gateway's self-signed
certificate is accepted without verification.
To verify the gateway instead:
- set `CACertFile` to pin a CA (full verification against that root), or
- set `RequireCertificateValidation: true` to verify against the OS/system trust
roots without pinning.
Pinning a CA always wins over the lenient default.
## Streaming
`Events(ctx)` should return a receive channel of:
-8
View File
@@ -75,14 +75,6 @@ client, err := mxgateway.Dial(ctx, mxgateway.Options{
})
```
The gateway can auto-generate its own self-signed certificate (it has no PKI), so
the client is **lenient by default**: a TLS connection (`Plaintext: false`) with
no `CACertFile`/`TLSConfig` accepts whatever certificate the gateway presents
(`InsecureSkipVerify`, with `ServerNameOverride` as the SNI when set). To verify
instead, set `CACertFile` to pin a CA, or set `RequireCertificateValidation:
true` to verify against the OS/system trust roots without pinning. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
`Client.OpenSession` returns a `Session` with helpers for `Register`,
`AddItem`, `AddItem2`, `Advise`, `Write`, `Events`, and `Close`. Prefer
`SubscribeEvents` or `SubscribeEventsAfter` for long-running streams because the
+4 -16
View File
@@ -222,22 +222,10 @@ func resolveTransportCredentials(opts Options) (credentials.TransportCredentials
return credentials.NewTLS(cfg), nil
}
return credentials.NewTLS(tlsConfigForOptions(opts)), nil
}
// tlsConfigForOptions returns the *tls.Config for the no-CA, no-custom-config TLS path.
// It returns nil when the caller should use a different credentials path (CA file or custom TLSConfig).
// Exposed as an internal helper so unit tests can assert the InsecureSkipVerify posture.
func tlsConfigForOptions(opts Options) *tls.Config {
// CA file and custom TLSConfig take their own paths in resolveTransportCredentials.
if opts.CACertFile != "" || opts.TLSConfig != nil {
return nil
}
return &tls.Config{
MinVersion: tls.VersionTLS12,
ServerName: opts.ServerNameOverride,
InsecureSkipVerify: !opts.RequireCertificateValidation, //nolint:gosec // internal tool; self-signed gateway cert expected; opt-in strict via RequireCertificateValidation
}
return credentials.NewTLS(&tls.Config{
MinVersion: tls.VersionTLS12,
ServerName: opts.ServerNameOverride,
}), nil
}
// OpenSessionOptions describes fields used to create an OpenSessionRequest.
-59
View File
@@ -1,59 +0,0 @@
package mxgateway
import (
"crypto/tls"
"testing"
)
// tlsConfigFromOptions is the internal helper under test.
// It extracts the *tls.Config from the no-CA TLS path of resolveTransportCredentials.
// We exercise it directly to avoid needing a real dial target.
func TestTLSInsecureSkipVerify_DefaultTrue(t *testing.T) {
cfg := tlsConfigForOptions(Options{
Endpoint: "localhost:5120",
})
if cfg == nil {
t.Fatal("expected non-nil tls.Config")
}
if !cfg.InsecureSkipVerify {
t.Error("InsecureSkipVerify should be true by default when no CA is pinned")
}
}
func TestTLSInsecureSkipVerify_FalseWhenRequireCertificateValidation(t *testing.T) {
cfg := tlsConfigForOptions(Options{
Endpoint: "localhost:5120",
RequireCertificateValidation: true,
})
if cfg == nil {
t.Fatal("expected non-nil tls.Config")
}
if cfg.InsecureSkipVerify {
t.Error("InsecureSkipVerify should be false when RequireCertificateValidation is true")
}
}
func TestTLSInsecureSkipVerify_FalseWhenCACertFileSet(t *testing.T) {
// When a CA file is pinned, the CA-verification path is taken instead.
// tlsConfigForOptions should return nil (the CA path does not use our helper).
cfg := tlsConfigForOptions(Options{
Endpoint: "localhost:5120",
CACertFile: "/some/ca.pem",
})
if cfg != nil {
t.Error("expected nil tls.Config when CACertFile is set (CA path taken)")
}
}
func TestTLSInsecureSkipVerify_FalseWhenCustomTLSConfig(t *testing.T) {
// When TLSConfig is supplied explicitly, our default skip-verify must not overwrite it.
custom := &tls.Config{MinVersion: tls.VersionTLS13}
cfg := tlsConfigForOptions(Options{
Endpoint: "localhost:5120",
TLSConfig: custom,
})
if cfg != nil {
t.Error("expected nil tls.Config when TLSConfig is already set (custom config path taken)")
}
}
-4
View File
@@ -34,10 +34,6 @@ type Options struct {
TransportCredentials credentials.TransportCredentials
// DialOptions are appended to the gRPC dial options after the defaults.
DialOptions []grpc.DialOption
// RequireCertificateValidation forces TLS certificate verification even when
// no CACertFile is pinned. Default false: the gateway's self-signed cert is
// accepted without verification (internal-tool posture).
RequireCertificateValidation bool
}
// BrowseChildrenOptions configures lazy Galaxy hierarchy walks performed by
-17
View File
@@ -112,23 +112,6 @@ Support:
- custom CA certificate file,
- server name override for test environments.
### Trust posture
The gateway can serve a self-signed certificate it generates itself (it has no
PKI). To make that usable, TLS is **lenient by default**: when the channel is not
plaintext and no `caCertificatePath` is set, the client builds
`GrpcSslContexts.forClient().trustManager(InsecureTrustManagerFactory.INSTANCE)`
(grpc-netty-shaded), so the gateway's self-signed certificate is accepted without
verification.
To verify the gateway instead:
- set `caCertificatePath` to pin a CA (full verification against that root), or
- set `requireCertificateValidation` to `true` to verify against the JVM trust
store without pinning.
Pinning a CA always wins over the lenient default.
## Streaming
Support both:
-10
View File
@@ -57,16 +57,6 @@ try (MxGatewayClient client = MxGatewayClient.connect(options);
}
```
The gateway can auto-generate its own self-signed certificate (it has no PKI), so
the client is **lenient by default**: a TLS connection (`plaintext(false)`) with
no `caCertificatePath` accepts whatever certificate the gateway presents (via
grpc-netty-shaded's `InsecureTrustManagerFactory`). To verify instead, set
`caCertificatePath` to pin a CA, or set `requireCertificateValidation(true)` to
verify against the JVM trust store without pinning. Use `serverNameOverride` /
`--server-name-override` when the dialed host differs from the certificate SAN.
See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
Use `rawBlockingStub`, `rawFutureStub`, `rawAsyncStub`, `openSessionRaw`,
`closeSessionRaw`, `invoke`, and raw session helper methods when tests need the
underlying protobuf messages. `MxGatewayCommandException` and
-4
View File
@@ -9,10 +9,6 @@ pluginManagement {
}
}
plugins {
id 'org.gradle.toolchains.foojay-resolver-convention' version '1.0.0'
}
dependencyResolutionManagement {
repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
repositories {
@@ -384,15 +384,6 @@ public final class MxGatewayClient implements AutoCloseable {
} catch (SSLException error) {
throw new MxGatewayException("failed to configure gateway TLS", error);
}
} else if (!options.requireCertificateValidation()) {
try {
builder.sslContext(GrpcSslContexts.forClient()
.trustManager(io.grpc.netty.shaded.io.netty.handler.ssl.util
.InsecureTrustManagerFactory.INSTANCE)
.build());
} catch (SSLException error) {
throw new MxGatewayException("failed to configure lenient gateway TLS", error);
}
} else {
builder.useTransportSecurity();
}
@@ -402,19 +393,6 @@ public final class MxGatewayClient implements AutoCloseable {
return builder.build();
}
/**
* Package-visible test seam — creates a raw {@link ManagedChannel} from the
* given options without attaching auth interceptors. Used by TLS fixture
* tests to verify channel construction behaviour without a full
* {@link MxGatewayClient} wrapper.
*
* @param options the client options
* @return a new {@link ManagedChannel}
*/
static ManagedChannel createChannelForTests(MxGatewayClientOptions options) {
return createChannel(options);
}
private <T extends io.grpc.stub.AbstractStub<T>> T withDeadline(T stub) {
if (options.callTimeout().isNegative()) {
return stub;
@@ -20,7 +20,6 @@ public final class MxGatewayClientOptions {
private final String apiKey;
private final boolean plaintext;
private final Path caCertificatePath;
private final boolean requireCertificateValidation;
private final String serverNameOverride;
private final Duration connectTimeout;
private final Duration callTimeout;
@@ -32,7 +31,6 @@ public final class MxGatewayClientOptions {
apiKey = builder.apiKey == null ? "" : builder.apiKey;
plaintext = builder.plaintext;
caCertificatePath = builder.caCertificatePath;
requireCertificateValidation = builder.requireCertificateValidation;
serverNameOverride = builder.serverNameOverride == null ? "" : builder.serverNameOverride;
connectTimeout = builder.connectTimeout == null ? DEFAULT_CONNECT_TIMEOUT : builder.connectTimeout;
callTimeout = builder.callTimeout == null ? DEFAULT_CALL_TIMEOUT : builder.callTimeout;
@@ -97,18 +95,6 @@ public final class MxGatewayClientOptions {
return caCertificatePath;
}
/**
* Returns whether TLS certificate verification is required even when no CA is pinned.
* When {@code false} (default), the gateway's self-signed certificate is accepted
* without verification. When {@code true}, the OS trust store is used.
* Pinning a CA via {@link #caCertificatePath()} always verifies regardless of this flag.
*
* @return {@code true} if strict certificate verification is required
*/
public boolean requireCertificateValidation() {
return requireCertificateValidation;
}
/**
* Returns the TLS server-name override, or an empty string when none was supplied.
*
@@ -162,8 +148,6 @@ public final class MxGatewayClientOptions {
+ plaintext
+ ", caCertificatePath="
+ caCertificatePath
+ ", requireCertificateValidation="
+ requireCertificateValidation
+ ", serverNameOverride='"
+ serverNameOverride
+ '\''
@@ -193,7 +177,6 @@ public final class MxGatewayClientOptions {
private String apiKey;
private boolean plaintext;
private Path caCertificatePath;
private boolean requireCertificateValidation;
private String serverNameOverride;
private Duration connectTimeout;
private Duration callTimeout;
@@ -247,21 +230,6 @@ public final class MxGatewayClientOptions {
return this;
}
/**
* When {@code true}, TLS connections without a pinned CA use the OS trust store
* and will reject the gateway's self-signed certificate. When {@code false}
* (default), the gateway certificate is accepted without verification —
* appropriate for this internal tool's auto-generated self-signed certificate.
* Pinning a CA via {@link #caCertificatePath(Path)} always verifies.
*
* @param value {@code true} to require certificate validation, {@code false} to accept any cert
* @return this builder
*/
public Builder requireCertificateValidation(boolean value) {
requireCertificateValidation = value;
return this;
}
/**
* Overrides the TLS server name used during the handshake.
*
@@ -1,198 +0,0 @@
package com.zb.mom.ww.mxgateway.client;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
import io.grpc.ManagedChannel;
import io.grpc.Server;
import io.grpc.StatusRuntimeException;
import io.grpc.netty.shaded.io.grpc.netty.GrpcSslContexts;
import io.grpc.netty.shaded.io.grpc.netty.NettyServerBuilder;
import io.grpc.stub.StreamObserver;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.security.KeyStore;
import java.security.PrivateKey;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.util.Base64;
import java.util.concurrent.TimeUnit;
import javax.net.ssl.SSLException;
import mxaccess_gateway.v1.MxAccessGatewayGrpc;
import mxaccess_gateway.v1.MxaccessGateway.OpenSessionReply;
import mxaccess_gateway.v1.MxaccessGateway.OpenSessionRequest;
import mxaccess_gateway.v1.MxaccessGateway.ProtocolStatus;
import mxaccess_gateway.v1.MxaccessGateway.ProtocolStatusCode;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
/**
* Verifies that the Java client connects to a Netty TLS server with a
* self-signed certificate when no CA is pinned (lenient default), and that
* setting {@code requireCertificateValidation(true)} causes a TLS failure.
*
* <p>A self-signed certificate is generated using {@code keytool} (always
* available in the JDK) to avoid dependencies on internal JDK APIs or
* BouncyCastle, and so the test works on all JDK versions used by the project.
*/
final class MxGatewayClientTlsTests {
private Server server;
private int port;
private File certPemFile;
private File keyPemFile;
private File keystoreFile;
@BeforeEach
void startTlsServer() throws Exception {
keystoreFile = File.createTempFile("gw-test-ks", ".p12");
certPemFile = File.createTempFile("gw-test-cert", ".pem");
keyPemFile = File.createTempFile("gw-test-key", ".pem");
// keytool refuses to write to a pre-existing (even empty) file; delete it first.
keystoreFile.delete();
// Use keytool to generate a self-signed PKCS12 keystore.
String keytool = ProcessHandle.current().info().command()
.map(cmd -> cmd.replace("java", "keytool"))
.orElse("keytool");
// Fall back to just "keytool" on PATH if the resolved path doesn't exist.
if (!new File(keytool).exists()) {
keytool = "keytool";
}
Process p = new ProcessBuilder(
keytool,
"-genkeypair",
"-alias", "server",
"-keyalg", "RSA",
"-keysize", "2048",
"-sigalg", "SHA256withRSA",
"-validity", "1",
"-dname", "CN=localhost",
"-storetype", "PKCS12",
"-storepass", "changeit",
"-keypass", "changeit",
"-keystore", keystoreFile.getAbsolutePath())
.redirectErrorStream(true)
.start();
int exit = p.waitFor();
if (exit != 0) {
String out = new String(p.getInputStream().readAllBytes());
throw new IllegalStateException("keytool failed (exit " + exit + "): " + out);
}
// Export cert and private key from the PKCS12 keystore to PEM files.
KeyStore ks = KeyStore.getInstance("PKCS12");
try (var is = Files.newInputStream(keystoreFile.toPath())) {
ks.load(is, "changeit".toCharArray());
}
X509Certificate cert = (X509Certificate) ks.getCertificate("server");
PrivateKey privateKey = (PrivateKey) ks.getKey("server", "changeit".toCharArray());
try (FileOutputStream out = new FileOutputStream(certPemFile)) {
out.write("-----BEGIN CERTIFICATE-----\n".getBytes());
out.write(Base64.getMimeEncoder(64, new byte[]{'\n'}).encode(cert.getEncoded()));
out.write("\n-----END CERTIFICATE-----\n".getBytes());
}
try (FileOutputStream out = new FileOutputStream(keyPemFile)) {
out.write("-----BEGIN PRIVATE KEY-----\n".getBytes());
out.write(Base64.getMimeEncoder(64, new byte[]{'\n'}).encode(privateKey.getEncoded()));
out.write("\n-----END PRIVATE KEY-----\n".getBytes());
}
server = NettyServerBuilder
.forAddress(new InetSocketAddress("127.0.0.1", 0))
.sslContext(GrpcSslContexts.forServer(certPemFile, keyPemFile).build())
.addService(new MinimalGatewayService())
.build()
.start();
port = server.getPort();
}
@AfterEach
void stopTlsServer() throws InterruptedException {
if (server != null) {
server.shutdown();
server.awaitTermination(5, TimeUnit.SECONDS);
}
if (certPemFile != null) {
certPemFile.delete();
}
if (keyPemFile != null) {
keyPemFile.delete();
}
if (keystoreFile != null) {
keystoreFile.delete();
}
}
@Test
void connectsToSelfSignedServer_WhenRequireCertificateValidationIsFalse() throws SSLException {
// Default options — requireCertificateValidation defaults to false.
MxGatewayClientOptions options = MxGatewayClientOptions.builder()
.endpoint("127.0.0.1:" + port)
.apiKey("test-key")
.connectTimeout(Duration.ofSeconds(5))
.callTimeout(Duration.ofSeconds(5))
.build();
ManagedChannel channel = MxGatewayClient.createChannelForTests(options);
try {
MxAccessGatewayGrpc.MxAccessGatewayBlockingStub stub =
MxAccessGatewayGrpc.newBlockingStub(channel);
OpenSessionReply reply = stub.openSession(
OpenSessionRequest.newBuilder()
.setClientSessionName("tls-test")
.build());
assertTrue(reply.getProtocolStatus().getCode()
== ProtocolStatusCode.PROTOCOL_STATUS_CODE_OK);
} finally {
channel.shutdownNow();
}
}
@Test
void failsToConnect_WhenRequireCertificateValidationIsTrue() throws SSLException {
MxGatewayClientOptions options = MxGatewayClientOptions.builder()
.endpoint("127.0.0.1:" + port)
.apiKey("test-key")
.requireCertificateValidation(true)
.connectTimeout(Duration.ofSeconds(5))
.callTimeout(Duration.ofSeconds(5))
.build();
ManagedChannel channel = MxGatewayClient.createChannelForTests(options);
try {
MxAccessGatewayGrpc.MxAccessGatewayBlockingStub stub =
MxAccessGatewayGrpc.newBlockingStub(channel);
assertThrows(StatusRuntimeException.class, () ->
stub.openSession(OpenSessionRequest.newBuilder()
.setClientSessionName("tls-strict-test")
.build()));
} finally {
channel.shutdownNow();
}
}
/** Minimal gateway stub that succeeds any OpenSession call. */
private static final class MinimalGatewayService
extends MxAccessGatewayGrpc.MxAccessGatewayImplBase {
@Override
public void openSession(
OpenSessionRequest request,
StreamObserver<OpenSessionReply> responseObserver) {
responseObserver.onNext(OpenSessionReply.newBuilder()
.setSessionId("tls-test-session")
.setProtocolStatus(ProtocolStatus.newBuilder()
.setCode(ProtocolStatusCode.PROTOCOL_STATUS_CODE_OK)
.build())
.build());
responseObserver.onCompleted();
}
}
}
-22
View File
@@ -112,28 +112,6 @@ Support:
- TLS channel with default roots,
- custom root certificate file.
### Trust posture (trust-on-first-use)
The gateway can serve a self-signed certificate it generates itself (it has no
PKI). grpc-python exposes no per-channel skip-verify hook, so the client cannot
"accept any certificate" the way the other clients do. Instead, when the channel
is not plaintext and neither `ca_file` nor `require_certificate_validation` is
set, the TLS default is **trust-on-first-use**: the client fetches the server's
presented certificate once via `ssl.get_server_certificate` (an unverified
probe), pins it as the channel's only trust root, and — because the generated
certificate always carries a `localhost` SAN — defaults
`grpc.ssl_target_name_override` to `localhost` when no `server_name_override` was
supplied (tolerating dial-by-IP or a hostname mismatch). A failed probe is
surfaced as a transport error naming the endpoint.
To verify the gateway instead:
- set `ca_file` to verify against a specific CA, or
- set `require_certificate_validation=True` to verify against the system trust
roots.
Both bypass the TOFU path.
## Streaming
Expose `stream_events` as an async iterator. Canceling the task should cancel
-11
View File
@@ -230,17 +230,6 @@ The client supports plaintext channels for local development, TLS with system
roots, TLS with a custom `ca_file`, and an optional test server name override.
API keys are redacted from option repr output and CLI error output.
The gateway can auto-generate its own self-signed certificate (it has no PKI).
grpc-python has no per-channel skip-verify, so the lenient TLS default is
**trust-on-first-use**: with no `ca_file` and `require_certificate_validation`
left `False`, the client fetches the gateway's presented certificate once
(unverified) and pins it for the channel, defaulting the SNI/target-name override
to `localhost` (the generated certificate always carries a `localhost` SAN) when
none was supplied. To verify instead, pass `ca_file` to verify against a specific
CA, or set `require_certificate_validation=True` to verify against the system
trust roots. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
## CLI
The CLI emits deterministic JSON for automation:
@@ -2,7 +2,6 @@
from __future__ import annotations
import ssl
from collections.abc import Sequence
from dataclasses import dataclass, field
from pathlib import Path
@@ -10,7 +9,6 @@ from pathlib import Path
import grpc
from .auth import REDACTED, ApiKey
from .errors import MxGatewayTransportError
@dataclass(frozen=True)
@@ -21,7 +19,6 @@ class ClientOptions:
api_key: str | ApiKey | None = None
plaintext: bool = False
ca_file: str | None = None
require_certificate_validation: bool = False
server_name_override: str | None = None
call_timeout: float | None = 30.0
stream_timeout: float | None = None
@@ -48,7 +45,6 @@ class ClientOptions:
f"{type(self).__name__}(endpoint={self.endpoint!r}, "
f"api_key={api_key!r}, plaintext={self.plaintext!r}, "
f"ca_file={self.ca_file!r}, "
f"require_certificate_validation={self.require_certificate_validation!r}, "
f"server_name_override={self.server_name_override!r}, "
f"call_timeout={self.call_timeout!r}, "
f"stream_timeout={self.stream_timeout!r}, "
@@ -73,34 +69,8 @@ class BrowseChildrenOptions:
historized_only: bool = False
def _split_authority(endpoint: str) -> tuple[str, int]:
"""Split a gRPC target (optionally scheme-prefixed) into (host, port).
Handles bracketed IPv6 literals (e.g. ``[::1]:5120`` or bare ``[::1]``),
returning the host without brackets so it is safe to pass to
``ssl.get_server_certificate``.
"""
target = endpoint.split("://", 1)[-1]
if target.startswith("["):
# Bracketed IPv6: "[::1]:5120" or "[::1]"
bracket_end = target.find("]")
host = target[1:bracket_end] # strip surrounding brackets
remainder = target[bracket_end + 1 :] # ":5120" or ""
port_str = remainder.lstrip(":")
return (host, int(port_str) if port_str else 443)
host, _, port = target.rpartition(":")
return (host or "localhost", int(port) if port else 443)
def create_channel(options: ClientOptions) -> grpc.aio.Channel:
"""Create a plaintext or TLS `grpc.aio` channel from client options.
The TLS default is lenient: grpc-python has no per-channel skip-verify, so
the server's presented certificate is fetched once (unverified) and pinned
as the channel's only trust root (trust-on-first-use). Set
`require_certificate_validation=True` to force system-trust verification, or
pass `ca_file` to verify against a specific CA both bypass the TOFU path.
"""
"""Create a plaintext or TLS `grpc.aio` channel from client options."""
channel_options: list[tuple[str, str | int]] = [
("grpc.max_receive_message_length", options.max_grpc_message_bytes),
@@ -112,28 +82,11 @@ def create_channel(options: ClientOptions) -> grpc.aio.Channel:
if options.plaintext:
return grpc.aio.insecure_channel(options.endpoint, options=channel_options)
root_certificates = None
if options.ca_file:
root_certificates = Path(options.ca_file).read_bytes()
credentials = grpc.ssl_channel_credentials(root_certificates=root_certificates)
elif options.require_certificate_validation:
credentials = grpc.ssl_channel_credentials()
else:
# Lenient default: grpc-python has no per-channel skip-verify, so fetch the
# server's certificate (unverified) and pin it for this channel (TOFU).
host, port = _split_authority(options.endpoint)
try:
presented = ssl.get_server_certificate((host, port))
except OSError as error:
raise MxGatewayTransportError(
f"failed to fetch TLS certificate from {options.endpoint}: {error}"
) from error
credentials = grpc.ssl_channel_credentials(root_certificates=presented.encode("ascii"))
# The gateway self-signed cert always carries a "localhost" SAN, so default
# the SNI/target-name override to it when none was supplied, tolerating
# dial-by-IP or hostname mismatch.
if not options.server_name_override:
channel_options.append(("grpc.ssl_target_name_override", "localhost"))
credentials = grpc.ssl_channel_credentials(root_certificates=root_certificates)
return grpc.aio.secure_channel(
options.endpoint,
credentials,
+24 -187
View File
@@ -72,83 +72,27 @@ def test_create_channel_uses_plaintext_channel(monkeypatch: pytest.MonkeyPatch)
]
def test_create_channel_uses_tls_channel_tofu_default(monkeypatch: pytest.MonkeyPatch) -> None:
"""Default TLS (no ca_file, no require_certificate_validation) uses TOFU:
fetches the server cert unverified, pins it as root_certificates, and adds
grpc.ssl_target_name_override = "localhost" automatically.
"""
_DUMMY_PEM = "-----BEGIN CERTIFICATE-----\nZmFrZQ==\n-----END CERTIFICATE-----\n"
get_cert_calls: list[tuple[str, int]] = []
def test_create_channel_uses_tls_channel(monkeypatch: pytest.MonkeyPatch) -> None:
calls: list[tuple[str, object, object]] = []
def fake_get_server_certificate(addr: tuple[str, int]) -> str:
get_cert_calls.append(addr)
return _DUMMY_PEM
cred_calls: list[object] = []
def fake_credentials(*, root_certificates: object = None) -> str:
cred_calls.append(root_certificates)
def fake_credentials(*, root_certificates: object) -> str:
assert root_certificates is None
return "creds"
channel_calls: list[tuple[str, object, object]] = []
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
channel_calls.append((endpoint, credentials, options))
calls.append((endpoint, credentials, options))
return "tls-channel"
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
channel = create_channel(
ClientOptions(endpoint="gateway.example:5001"),
)
assert channel == "tls-channel"
# TOFU: should have fetched the cert from the server (host, port)
assert get_cert_calls == [("gateway.example", 5001)]
# Pinned the fetched PEM bytes as root_certificates
assert cred_calls == [_DUMMY_PEM.encode("ascii")]
# Auto-injected localhost override (no server_name_override supplied)
assert channel_calls == [
(
"gateway.example:5001",
"creds",
[
("grpc.max_receive_message_length", 16 * 1024 * 1024),
("grpc.max_send_message_length", 16 * 1024 * 1024),
("grpc.ssl_target_name_override", "localhost"),
],
),
]
def test_create_channel_uses_tls_channel_tofu_respects_server_name_override(
monkeypatch: pytest.MonkeyPatch,
) -> None:
"""When server_name_override is set, TOFU still runs but does NOT add the
auto-localhost override (the explicit override is already in channel_options).
"""
_DUMMY_PEM = "-----BEGIN CERTIFICATE-----\nZmFrZQ==\n-----END CERTIFICATE-----\n"
monkeypatch.setattr(
options_module.ssl,
"get_server_certificate",
lambda addr: _DUMMY_PEM,
options_module.grpc,
"ssl_channel_credentials",
fake_credentials,
)
monkeypatch.setattr(
options_module.grpc.aio,
"secure_channel",
fake_secure_channel,
)
cred_calls: list[object] = []
def fake_credentials(*, root_certificates: object = None) -> str:
cred_calls.append(root_certificates)
return "creds"
channel_calls: list[tuple[str, object, object]] = []
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
channel_calls.append((endpoint, credentials, options))
return "tls-channel"
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
channel = create_channel(
ClientOptions(
@@ -158,121 +102,14 @@ def test_create_channel_uses_tls_channel_tofu_respects_server_name_override(
)
assert channel == "tls-channel"
assert cred_calls == [_DUMMY_PEM.encode("ascii")]
assert channel_calls == [
(
"gateway.example:5001",
"creds",
[
("grpc.max_receive_message_length", 16 * 1024 * 1024),
("grpc.max_send_message_length", 16 * 1024 * 1024),
# Explicit override from ClientOptions — not the auto-localhost one
("grpc.ssl_target_name_override", "gateway.test"),
],
),
]
def test_create_channel_uses_tls_channel_require_cert_validation(
monkeypatch: pytest.MonkeyPatch,
) -> None:
"""require_certificate_validation=True uses system trust (no TOFU, no root_certificates)."""
get_cert_called = False
def fake_get_server_certificate(addr: object) -> str: # pragma: no cover
nonlocal get_cert_called
get_cert_called = True
return "SHOULD_NOT_BE_CALLED"
cred_calls: list[object] = []
def fake_credentials(**kwargs: object) -> str:
cred_calls.append(kwargs)
return "creds"
channel_calls: list[tuple[str, object, object]] = []
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
channel_calls.append((endpoint, credentials, options))
return "tls-channel"
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
channel = create_channel(
ClientOptions(
endpoint="gateway.example:5001",
require_certificate_validation=True,
),
)
assert channel == "tls-channel"
# Must NOT call TOFU prefetch
assert not get_cert_called
# ssl_channel_credentials() called with NO keyword args (system trust)
assert cred_calls == [{}]
assert channel_calls == [
(
"gateway.example:5001",
"creds",
[
("grpc.max_receive_message_length", 16 * 1024 * 1024),
("grpc.max_send_message_length", 16 * 1024 * 1024),
],
),
]
def test_create_channel_uses_tls_channel_ca_file(
monkeypatch: pytest.MonkeyPatch,
tmp_path: pytest.TempPathFactory,
) -> None:
"""ca_file path: reads the PEM file, passes bytes as root_certificates, skips TOFU."""
ca_pem = b"-----BEGIN CERTIFICATE-----\nY2FkYXRh\n-----END CERTIFICATE-----\n"
ca_file = tmp_path / "ca.pem"
ca_file.write_bytes(ca_pem)
get_cert_called = False
def fake_get_server_certificate(addr: object) -> str: # pragma: no cover
nonlocal get_cert_called
get_cert_called = True
return "SHOULD_NOT_BE_CALLED"
cred_calls: list[object] = []
def fake_credentials(*, root_certificates: object = None) -> str:
cred_calls.append(root_certificates)
return "creds"
channel_calls: list[tuple[str, object, object]] = []
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
channel_calls.append((endpoint, credentials, options))
return "tls-channel"
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
channel = create_channel(
ClientOptions(
endpoint="gateway.example:5001",
ca_file=str(ca_file),
),
)
assert channel == "tls-channel"
assert not get_cert_called
assert cred_calls == [ca_pem]
assert channel_calls == [
(
"gateway.example:5001",
"creds",
[
("grpc.max_receive_message_length", 16 * 1024 * 1024),
("grpc.max_send_message_length", 16 * 1024 * 1024),
],
),
]
assert calls == [
(
"gateway.example:5001",
"creds",
[
("grpc.max_receive_message_length", 16 * 1024 * 1024),
("grpc.max_send_message_length", 16 * 1024 * 1024),
("grpc.ssl_target_name_override", "gateway.test"),
],
),
]
-165
View File
@@ -1,165 +0,0 @@
"""TLS behaviour tests for ``create_channel``.
These spin up a real loopback ``grpc.aio`` server with a freshly generated
self-signed certificate (carrying a ``localhost`` SAN, mirroring the gateway's
auto-generated cert) and assert the lenient TOFU default lets a client connect
without any CA configured.
Marked ``tls`` and skipped unless ``MXGATEWAY_RUN_TLS_TESTS=1`` because loopback
TLS handshakes can be timing-flaky on shared CI runners. This mirrors how the
suite gates anything that depends on real sockets rather than fakes.
"""
from __future__ import annotations
import os
import shutil
import socket
import ssl
import subprocess
import tempfile
from collections.abc import AsyncIterator
from pathlib import Path
import grpc
import pytest
import pytest_asyncio
from zb_mom_ww_mxgateway import ClientOptions
from zb_mom_ww_mxgateway.errors import MxGatewayTransportError
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2 as pb
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2_grpc as pb_grpc
from zb_mom_ww_mxgateway.options import create_channel
pytestmark = pytest.mark.tls
_RUN_TLS_TESTS = os.environ.get("MXGATEWAY_RUN_TLS_TESTS") == "1"
_OPENSSL = shutil.which("openssl")
requires_tls = pytest.mark.skipif(
not _RUN_TLS_TESTS,
reason="set MXGATEWAY_RUN_TLS_TESTS=1 to run loopback TLS tests",
)
requires_openssl = pytest.mark.skipif(
_OPENSSL is None,
reason="openssl CLI is required to generate a self-signed test certificate",
)
def _generate_self_signed_cert(directory: Path) -> tuple[Path, Path]:
"""Generate a self-signed cert/key pair with a ``localhost`` SAN."""
key_path = directory / "server.key"
cert_path = directory / "server.crt"
subprocess.run(
[
str(_OPENSSL),
"req",
"-x509",
"-newkey",
"rsa:2048",
"-nodes",
"-keyout",
str(key_path),
"-out",
str(cert_path),
"-days",
"1",
"-subj",
"/CN=mxgateway-test",
"-addext",
"subjectAltName=DNS:localhost,IP:127.0.0.1",
],
check=True,
capture_output=True,
)
return cert_path, key_path
def _free_port() -> int:
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
sock.bind(("127.0.0.1", 0))
return int(sock.getsockname()[1])
class _StaticGatewayServicer(pb_grpc.MxAccessGatewayServicer):
"""Minimal servicer answering ``OpenSession`` with a fixed session id."""
async def OpenSession( # noqa: N802 - generated gRPC method name
self, request: pb.OpenSessionRequest, context: object
) -> pb.OpenSessionReply:
return pb.OpenSessionReply(session_id="tls-session-1")
@pytest_asyncio.fixture
async def tls_server() -> AsyncIterator[int]:
with tempfile.TemporaryDirectory() as tmp:
cert_path, key_path = _generate_self_signed_cert(Path(tmp))
credentials = grpc.ssl_server_credentials(
[(key_path.read_bytes(), cert_path.read_bytes())]
)
server = grpc.aio.server()
pb_grpc.add_MxAccessGatewayServicer_to_server(_StaticGatewayServicer(), server)
port = _free_port()
server.add_secure_port(f"127.0.0.1:{port}", credentials)
await server.start()
try:
yield port
finally:
await server.stop(grace=None)
@requires_tls
@requires_openssl
@pytest.mark.asyncio
async def test_default_tls_connects_via_tofu(tls_server: int) -> None:
"""Default TLS options (no CA) connect by pinning the presented cert."""
options = ClientOptions(
endpoint=f"127.0.0.1:{tls_server}",
api_key="mxgw_test_secret",
)
channel = create_channel(options)
try:
stub = pb_grpc.MxAccessGatewayStub(channel)
reply = await stub.OpenSession(pb.OpenSessionRequest(), timeout=10)
assert reply.session_id == "tls-session-1"
finally:
await channel.close()
def test_split_authority_parses_host_and_port() -> None:
from zb_mom_ww_mxgateway.options import _split_authority
assert _split_authority("https://10.0.0.5:5120") == ("10.0.0.5", 5120)
assert _split_authority("localhost:5120") == ("localhost", 5120)
assert _split_authority(":5120") == ("localhost", 5120)
def test_split_authority_strips_ipv6_brackets() -> None:
from zb_mom_ww_mxgateway.options import _split_authority
# Bracketed IPv6 with port — brackets must be removed for ssl.get_server_certificate
assert _split_authority("[::1]:5120") == ("::1", 5120)
# Bare bracketed IPv6 (no port) — default port 443
assert _split_authority("[::1]") == ("::1", 443)
# Scheme-prefixed bracketed IPv6
assert _split_authority("grpc://[::1]:5120") == ("::1", 5120)
def test_tofu_connect_failure_raises_transport_error() -> None:
"""A failed cert pre-fetch surfaces the client's transport error type."""
options = ClientOptions(endpoint=f"127.0.0.1:{_free_port()}")
with pytest.raises(MxGatewayTransportError) as excinfo:
create_channel(options)
assert options.endpoint in str(excinfo.value)
def test_require_certificate_validation_uses_system_trust() -> None:
"""``require_certificate_validation`` must not attempt a TOFU pre-fetch."""
# Pointing at a closed port: with system-trust the channel is created lazily
# (no eager pre-fetch), so create_channel must succeed without connecting.
options = ClientOptions(
endpoint=f"127.0.0.1:{_free_port()}",
require_certificate_validation=True,
)
channel = create_channel(options)
assert isinstance(channel, grpc.aio.Channel)
-13
View File
@@ -76,19 +76,6 @@ types.
cargo run -p mxgw-cli -- smoke --endpoint https://mxgateway.example.local:5001 --tls --ca-file C:\certs\mxgateway-ca.pem --server-name-override mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt --json
```
### TLS trust (pin-only)
The gateway can auto-generate its own self-signed certificate (it has no PKI).
Unlike the other clients, the Rust client is **not** lenient: tonic 0.13.1
exposes no public hook to inject a custom certificate verifier, so TLS over Rust
is pin-only. A TLS connection requires either `--ca-file` /
`ClientOptions::with_ca_file(...)` to pin a CA (export the gateway's self-signed
certificate and pin it), or `--require-certificate-validation` /
`with_require_certificate_validation(true)` to verify against the system trust
roots. TLS with neither set fails `connect` with a clear, actionable error rather
than accepting the certificate. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
## Library Surface
`ClientOptions` configures endpoint, API key, plaintext or TLS transport,
-19
View File
@@ -189,25 +189,6 @@ Support:
- custom CA file,
- domain override.
### Trust posture (pin-only)
The gateway can serve a self-signed certificate it generates itself (it has no
PKI). Rust is the **exception** to the lenient-by-default posture the other
clients use: tonic 0.13.1 exposes no public hook to inject a custom certificate
verifier, so the Rust client cannot accept an arbitrary certificate. TLS over the
Rust client is therefore **pin-only** — it requires either:
- `ClientOptions::with_ca_file(...)` to pin a CA (the supported path for the
gateway's self-signed certificate; export the certificate and pin it), or
- `ClientOptions::with_require_certificate_validation(true)` to verify against the
system trust roots.
With TLS enabled (`with_plaintext(false)`), no pinned CA, and certificate
validation not required, `GatewayClient::connect` rejects the connection with a
clear, actionable error pointing at `with_ca_file` /
`require_certificate_validation` rather than silently accepting the certificate.
The CLI exposes `--ca-file` and `--require-certificate-validation`.
## Streaming
Expose event streams as a `Stream<Item = Result<MxEvent, Error>>`. Dropping the
-8
View File
@@ -426,11 +426,6 @@ struct ConnectionArgs {
ca_file: Option<PathBuf>,
#[arg(long)]
server_name_override: Option<String>,
/// Verify the server certificate against the system trust roots even
/// without a pinned CA. The Rust client's default is to require a CA
/// file (see `--ca-file`); set this flag to use system roots instead.
#[arg(long)]
require_certificate_validation: bool,
#[arg(long, default_value_t = 10)]
connect_timeout_seconds: u64,
#[arg(long, default_value_t = 30)]
@@ -458,9 +453,6 @@ impl ConnectionArgs {
if let Some(server_name_override) = &self.server_name_override {
options = options.with_server_name_override(server_name_override);
}
if self.require_certificate_validation {
options = options.with_require_certificate_validation(true);
}
options
}
+16 -3
View File
@@ -6,8 +6,10 @@
//! code should prefer [`GatewayClient::open_session`] and the [`Session`]
//! handle it returns, rather than the `*_raw` methods.
use std::fs;
use tonic::codegen::InterceptedService;
use tonic::transport::Channel;
use tonic::transport::{Certificate, Channel, ClientTlsConfig};
use tonic::Request;
use crate::auth::AuthInterceptor;
@@ -19,7 +21,7 @@ use crate::generated::mxaccess_gateway::v1::{
OpenSessionReply, OpenSessionRequest, QueryActiveAlarmsRequest, StreamAlarmsRequest,
StreamEventsRequest,
};
use crate::options::{build_tls_config, ClientOptions};
use crate::options::ClientOptions;
use crate::session::Session;
/// Generated gateway client wrapped in the auth interceptor that
@@ -76,7 +78,18 @@ impl GatewayClient {
})?;
endpoint = endpoint.connect_timeout(options.connect_timeout());
if let Some(tls) = build_tls_config(&options)? {
if !options.plaintext() {
let mut tls = ClientTlsConfig::new();
if let Some(server_name) = options.server_name_override() {
tls = tls.domain_name(server_name.to_owned());
}
if let Some(ca_file) = options.ca_file() {
let certificate = fs::read(ca_file).map_err(|source| Error::InvalidEndpoint {
endpoint: options.endpoint().to_owned(),
detail: format!("failed to read CA file {}: {source}", ca_file.display()),
})?;
tls = tls.ca_certificate(Certificate::from_pem(certificate));
}
endpoint = endpoint.tls_config(tls)?;
}
+15 -3
View File
@@ -6,12 +6,13 @@
//! re-exported through [`crate::generated::galaxy_repository::v1`].
use std::collections::HashSet;
use std::fs;
use std::sync::Arc;
use prost_types::Timestamp;
use tokio::sync::Mutex as AsyncMutex;
use tonic::codegen::InterceptedService;
use tonic::transport::Channel;
use tonic::transport::{Certificate, Channel, ClientTlsConfig};
use tonic::Request;
use crate::auth::AuthInterceptor;
@@ -22,7 +23,7 @@ use crate::generated::galaxy_repository::v1::{
DiscoverHierarchyRequest, GalaxyObject, GetLastDeployTimeRequest, TestConnectionRequest,
WatchDeployEventsRequest,
};
use crate::options::{build_tls_config, ClientOptions};
use crate::options::ClientOptions;
const DISCOVER_HIERARCHY_PAGE_SIZE: i32 = 5000;
const BROWSE_CHILDREN_PAGE_SIZE: i32 = 500;
@@ -182,7 +183,18 @@ impl GalaxyClient {
})?;
endpoint = endpoint.connect_timeout(options.connect_timeout());
if let Some(tls) = build_tls_config(&options)? {
if !options.plaintext() {
let mut tls = ClientTlsConfig::new();
if let Some(server_name) = options.server_name_override() {
tls = tls.domain_name(server_name.to_owned());
}
if let Some(ca_file) = options.ca_file() {
let certificate = fs::read(ca_file).map_err(|source| Error::InvalidEndpoint {
endpoint: options.endpoint().to_owned(),
detail: format!("failed to read CA file {}: {source}", ca_file.display()),
})?;
tls = tls.ca_certificate(Certificate::from_pem(certificate));
}
endpoint = endpoint.tls_config(tls)?;
}
-94
View File
@@ -3,14 +3,10 @@
//! chain of `with_*` setters; the `Debug` impl redacts the API key.
use std::fmt;
use std::fs;
use std::path::PathBuf;
use std::time::Duration;
use tonic::transport::{Certificate, ClientTlsConfig};
use crate::auth::ApiKey;
use crate::error::Error;
const DEFAULT_MAX_GRPC_MESSAGE_BYTES: usize = 16 * 1024 * 1024;
@@ -26,7 +22,6 @@ pub struct ClientOptions {
api_key: Option<ApiKey>,
plaintext: bool,
ca_file: Option<PathBuf>,
require_certificate_validation: bool,
server_name_override: Option<String>,
connect_timeout: Duration,
call_timeout: Duration,
@@ -43,7 +38,6 @@ impl ClientOptions {
api_key: None,
plaintext: true,
ca_file: None,
require_certificate_validation: false,
server_name_override: None,
connect_timeout: Duration::from_secs(10),
call_timeout: Duration::from_secs(30),
@@ -73,22 +67,6 @@ impl ClientOptions {
self
}
/// Require TLS certificate verification even without a pinned CA. Default
/// false: the gateway's self-signed certificate is accepted (internal-tool
/// posture). Setting a CA file always verifies.
///
/// Note for Rust: tonic 0.13's `ClientTlsConfig` exposes no hook for a
/// custom rustls verifier, so the Rust client cannot accept an arbitrary
/// self-signed certificate the way the other clients do. With the default
/// (false) and no pinned CA, [`crate::client::GatewayClient::connect`]
/// rejects the TLS connection and asks for a CA file. Either pin a CA via
/// [`ClientOptions::with_ca_file`] (the supported lenient path on Rust) or
/// set this `true` to verify against the system trust roots.
pub fn with_require_certificate_validation(mut self, require: bool) -> Self {
self.require_certificate_validation = require;
self
}
/// Override the SNI/server name used during the TLS handshake. Useful
/// when the dial-target host name does not match the certificate.
pub fn with_server_name_override(mut self, server_name_override: impl Into<String>) -> Self {
@@ -143,12 +121,6 @@ impl ClientOptions {
self.ca_file.as_ref()
}
/// Whether TLS certificate verification is required even without a pinned
/// CA. See [`ClientOptions::with_require_certificate_validation`].
pub fn require_certificate_validation(&self) -> bool {
self.require_certificate_validation
}
/// Optional SNI / server-name override for TLS handshakes.
pub fn server_name_override(&self) -> Option<&str> {
self.server_name_override.as_deref()
@@ -175,68 +147,6 @@ impl ClientOptions {
}
}
/// Build the [`ClientTlsConfig`] for a non-plaintext connection described by
/// `options`, applying the lenient-default guard that is the **Rust
/// pin-only exception**.
///
/// Returns `Ok(None)` when `options.plaintext()` is `true` (no TLS needed).
/// Returns `Ok(Some(tls))` when a valid TLS config can be assembled.
/// Returns `Err(Error::InvalidEndpoint)` when TLS is requested but no pinned
/// CA was provided and `require_certificate_validation` is `false`.
///
/// # Why this guard exists
///
/// `tonic` 0.13's `ClientTlsConfig` builds its rustls verifier inside a
/// crate-private connector and exposes no hook for a custom
/// `ServerCertVerifier`. The Rust client therefore cannot accept an arbitrary
/// self-signed certificate the way the other language clients do. Rather than
/// silently falling back to system-root verification (which always fails
/// against a self-signed gateway certificate), we reject the configuration
/// early with an actionable error.
pub(crate) fn build_tls_config(options: &ClientOptions) -> Result<Option<ClientTlsConfig>, Error> {
if options.plaintext() {
return Ok(None);
}
let mut tls = ClientTlsConfig::new();
if let Some(server_name) = options.server_name_override() {
tls = tls.domain_name(server_name.to_owned());
}
if let Some(ca_file) = options.ca_file() {
let certificate = fs::read(ca_file).map_err(|source| Error::InvalidEndpoint {
endpoint: options.endpoint().to_owned(),
detail: format!("failed to read CA file {}: {source}", ca_file.display()),
})?;
tls = tls.ca_certificate(Certificate::from_pem(certificate));
} else if !options.require_certificate_validation() {
// Lenient-default fallback (Rust pin-only exception): tonic
// 0.13's `ClientTlsConfig` builds its rustls verifier inside a
// crate-private connector and exposes no hook for a custom
// `ServerCertVerifier`, so — unlike the other clients — the
// Rust client cannot accept an arbitrary self-signed cert. Pin
// the gateway's CA instead, or opt into strict verification
// against the system trust roots. We reject here rather than
// silently verifying against system roots (which would fail a
// self-signed gateway with a confusing handshake error).
//
// Note: a server-name override affects SNI (the hostname sent
// in the TLS ClientHello) but does NOT pin trust. Overriding
// the server name alone does not bypass certificate validation.
return Err(Error::InvalidEndpoint {
endpoint: options.endpoint().to_owned(),
detail: "TLS requested without a pinned CA. The Rust client cannot accept an \
arbitrary self-signed certificate (tonic 0.13 exposes no custom \
rustls verifier). Pin the gateway certificate with \
ClientOptions::with_ca_file, or call \
ClientOptions::with_require_certificate_validation(true) to verify \
against the system trust roots. Note: a server-name override \
affects SNI but does not pin trust."
.to_owned(),
});
}
Ok(Some(tls))
}
impl Default for ClientOptions {
fn default() -> Self {
Self::new("http://127.0.0.1:5000")
@@ -251,10 +161,6 @@ impl fmt::Debug for ClientOptions {
.field("api_key", &self.api_key.as_ref().map(|_| "<redacted>"))
.field("plaintext", &self.plaintext)
.field("ca_file", &self.ca_file)
.field(
"require_certificate_validation",
&self.require_certificate_validation,
)
.field("server_name_override", &self.server_name_override)
.field("connect_timeout", &self.connect_timeout)
.field("call_timeout", &self.call_timeout)
-137
View File
@@ -1,137 +0,0 @@
//! TLS posture coverage for the Rust client.
//!
//! tonic 0.13.1's `ClientTlsConfig` exposes no hook for a custom rustls
//! `ServerCertVerifier` (the verifier is built internally inside the
//! crate-private `TlsConnector`), so the Rust client cannot implement the
//! "accept any server certificate" lenient default the other clients use.
//! Rust is therefore the documented **pin-only exception**: TLS without a
//! pinned CA is rejected up front with a clear, actionable error, and
//! supplying a CA file is the supported path. These tests pin that contract.
use std::time::Duration;
use zb_mom_ww_mxgateway_client::{ClientOptions, Error, GalaxyClient, GatewayClient};
/// Drive `connect` to its error without requiring `GatewayClient: Debug`
/// (the success arm is dropped explicitly so `unwrap_err` is unnecessary).
async fn connect_err(options: ClientOptions) -> Error {
match GatewayClient::connect(options).await {
Ok(_client) => panic!("connect unexpectedly succeeded against a dead TLS address"),
Err(error) => error,
}
}
#[tokio::test]
async fn tls_without_ca_is_rejected_with_actionable_error_by_default() {
let options = ClientOptions::new("https://127.0.0.1:1")
.with_plaintext(false)
.with_connect_timeout(Duration::from_millis(200));
let error = connect_err(options).await;
let Error::InvalidEndpoint { detail, .. } = error else {
panic!("expected InvalidEndpoint, got {error:?}");
};
// The message must point the caller at the supported remedy (pin a CA)
// and name the opt-in escape hatch.
assert!(
detail.contains("ca_file") || detail.contains("CA"),
"error should instruct the user to pass a CA file: {detail}"
);
assert!(
detail.contains("require_certificate_validation"),
"error should mention the require_certificate_validation opt-in: {detail}"
);
}
#[tokio::test]
async fn tls_with_require_certificate_validation_does_not_short_circuit() {
// With strict verification opted in, the no-CA guard must not fire; the
// connect attempt instead proceeds to the transport (and fails to reach
// the dead address) rather than returning the "CA required" guard error.
let options = ClientOptions::new("https://127.0.0.1:1")
.with_plaintext(false)
.with_require_certificate_validation(true)
.with_connect_timeout(Duration::from_millis(200));
let error = connect_err(options).await;
assert!(
!matches!(&error, Error::InvalidEndpoint { detail, .. }
if detail.contains("require_certificate_validation")),
"strict verification must bypass the no-CA guard, got {error:?}"
);
}
#[tokio::test]
async fn tls_with_ca_file_is_permitted_and_proceeds_past_the_guard() {
// Pinning a CA is the supported TLS path: the no-CA guard must not fire.
// We hand it a readable PEM file; construction proceeds past the guard
// and only fails later at the transport (dead address / handshake).
let ca_path = std::env::temp_dir().join("mxgw-rust-tls-ca-fixture.pem");
std::fs::write(&ca_path, SELF_SIGNED_CA_PEM).unwrap();
let options = ClientOptions::new("https://127.0.0.1:1")
.with_plaintext(false)
.with_ca_file(&ca_path)
.with_connect_timeout(Duration::from_millis(200));
let error = connect_err(options).await;
let _ = std::fs::remove_file(&ca_path);
assert!(
!matches!(&error, Error::InvalidEndpoint { detail, .. }
if detail.contains("require_certificate_validation")),
"pinning a CA must bypass the no-CA guard, got {error:?}"
);
}
/// Drive `GalaxyClient::connect` to its error (mirrors `connect_err` above).
async fn galaxy_connect_err(options: ClientOptions) -> Error {
match GalaxyClient::connect(options).await {
Ok(_client) => {
panic!("GalaxyClient::connect unexpectedly succeeded against a dead TLS address")
}
Err(error) => error,
}
}
#[tokio::test]
async fn galaxy_tls_without_ca_is_rejected_with_actionable_error_by_default() {
// GalaxyClient::connect must apply the same TLS guard as GatewayClient —
// TLS without a pinned CA (and without require_certificate_validation)
// returns a clear, actionable InvalidEndpoint error.
let options = ClientOptions::new("https://127.0.0.1:1")
.with_plaintext(false)
.with_connect_timeout(Duration::from_millis(200));
let error = galaxy_connect_err(options).await;
let Error::InvalidEndpoint { detail, .. } = error else {
panic!("expected InvalidEndpoint, got {error:?}");
};
assert!(
detail.contains("ca_file") || detail.contains("CA"),
"error should instruct the user to pass a CA file: {detail}"
);
assert!(
detail.contains("require_certificate_validation"),
"error should mention the require_certificate_validation opt-in: {detail}"
);
}
/// A throwaway self-signed CA certificate (PEM). Only needs to parse as a
/// PEM trust root so the CA-pinning path is exercised past the guard.
const SELF_SIGNED_CA_PEM: &str = "-----BEGIN CERTIFICATE-----
MIIBhTCCASugAwIBAgIQIRi6zePL6mKjOipn+dNuaTAKBggqhkjOPQQDAjASMRAw
DgYDVQQKEwdBY21lIENvMB4XDTE3MTAyMDE5NDMwNloXDTE4MTAyMDE5NDMwNlow
EjEQMA4GA1UEChMHQWNtZSBDbzBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABD0d
7VNhbWvZLWPuj/RtHFjvtJBEwOkhbN/BnnE8rnZR8+sbwnc/KhCk3FhnpHZnQz7B
5aETbbIgmuvewdjvSBSjYzBhMA4GA1UdDwEB/wQEAwICpDATBgNVHSUEDDAKBggr
BgEFBQcDATAPBgNVHRMBAf8EBTADAQH/MCkGA1UdEQQiMCCCDmxvY2FsaG9zdDo1
NDUzgg4xMjcuMC4wLjE6NTQ1MzAKBggqhkjOPQQDAgNIADBFAiEA2zpJEPQyz6/l
Wf86aX6PepsntZv2GYlA5UpabfT2EZICICpJ5h/iI+i341gBmLiAFQOyTDT+/wQc
6MF9+Yw1Yy0t
-----END CERTIFICATE-----
";
-13
View File
@@ -51,19 +51,6 @@ The shared inputs are:
The commands in the matrix use `MXGATEWAY_API_KEY` through each CLI's
`api-key-env` flag. They must not embed bearer tokens or raw API keys.
### TLS variant
The matrix runs over plaintext (`h2c`) by default. A TLS variant exists but stays
a manual/opt-in run, consistent with the gate above, because it needs the gateway
started with an HTTPS endpoint (an `https://` `MXGATEWAY_ENDPOINT`) and each CLI
switched to its TLS flag (`--tls` / `-tls` / `--plaintext=false` /
`plaintext=False`). The clients are lenient by default and accept the gateway's
auto-generated self-signed certificate without extra trust setup, except the Rust
CLI, which is pin-only and needs `--ca-file` or `--require-certificate-validation`
(and Python uses trust-on-first-use). See
[Gateway Configuration — Automatic self-signed certificate](./GatewayConfiguration.md#automatic-self-signed-certificate)
and each client README for the per-client TLS flags.
## JSON Comparison
Every command in the matrix requests JSON output. A runner can compare the
-36
View File
@@ -375,42 +375,6 @@ deployment-heavy box, multiply per-session SQL connections, and complicate the
cold-start path. Wire-side laziness solves the actual pain (oversized gRPC
replies and a heavy DOM) without disturbing the materialization model.
## TLS Auto-Certificate and Lenient Client Trust
Decision: when a Kestrel `https://` endpoint is configured without a certificate
of its own (and no `Kestrel:Certificates:Default` is set), the gateway generates
and persists a self-signed certificate rather than failing to start. Clients
connecting over TLS without a pinned CA accept whatever certificate the server
presents by default; pinning a CA restores full verification.
Rationale: `mxaccessgw` is an internal tool with no PKI to issue or distribute
certificates. The prior behavior — an `https` endpoint with no certificate
fails at startup with Kestrel's opaque "no server certificate was specified"
error — pushed operators toward plaintext (`h2c`), exposing the API key and
request payloads on the wire. Auto-generating a long-lived, persisted, reused
certificate lets TLS "just work" with zero certificate management, while the
lenient client default means clients connect to that self-signed certificate
without a manual trust step. Both choices are deliberate, not oversights:
strict-by-default would force PKI work this tool does not warrant. Plaintext-only
deployments are untouched — no certificate or key material is written for them —
and an operator who supplies a real certificate transparently overrides the
generated one.
Two clients diverge from "accept any certificate" because their gRPC stacks lack
a per-channel skip-verify hook:
- Python uses trust-on-first-use: it fetches the server's presented certificate
over a separate unverified probe and pins it for the channel, and defaults the
SNI/target-name override to `localhost` (the generated certificate always
carries a `localhost` SAN).
- Rust is pin-only: tonic exposes no public hook to inject a custom certificate
verifier, so TLS over Rust requires either a pinned CA or an explicit opt-in to
system-trust verification; otherwise connecting returns a clear, actionable
error.
See [Gateway Configuration — Automatic self-signed certificate](./GatewayConfiguration.md#automatic-self-signed-certificate)
and the per-client READMEs for the as-built behavior.
## Later Revisit Items
These are explicit post-v1 revisit items, not open blockers:
-180
View File
@@ -148,7 +148,6 @@ the affected stream while the MXAccess session remains active.
| `MxGateway:Dashboard:Enabled` | `true` | Enables Blazor Server dashboard route mapping. The dashboard mounts at the host root (`/`); there is no separate path-base prefix. |
| `MxGateway:Dashboard:AllowAnonymousLocalhost` | `true` | Allows loopback dashboard requests to bypass the dashboard cookie requirement for local development. Remote requests still require dashboard authentication. |
| `MxGateway:Dashboard:RequireHttpsCookie` | `true` | Sets the dashboard auth cookie's secure policy. `true` keeps `CookieSecurePolicy.Always` — the cookie is only sent over HTTPS, which matches a production HTTPS deployment. Set to `false` for plain-HTTP dev deployments to use `CookieSecurePolicy.SameAsRequest`; the cookie is still flagged Secure on HTTPS requests, but it can round-trip over HTTP. Browsers drop Secure cookies set over HTTP from non-localhost hosts, so leaving this `true` while serving the dashboard over plain HTTP will break login from any remote browser. |
| `MxGateway:Dashboard:CookieName` | `MxGatewayDashboard` | Dashboard auth cookie name. Leave unset (null/blank) to use the default. Override it to give a distinct name to a gateway that shares a hostname with another gateway instance: browser cookies are scoped by host+path but **not** by port, so two instances on the same host would otherwise clobber each other's dashboard session under a shared cookie name. Changing it signs out existing dashboard sessions on next deploy. |
| `MxGateway:Dashboard:SnapshotIntervalMilliseconds` | `1000` | Dashboard snapshot refresh interval used by the snapshot SignalR hub and the pages that subscribe to it. |
| `MxGateway:Dashboard:RecentFaultLimit` | `100` | Maximum number of fault summaries projected into each dashboard snapshot. |
| `MxGateway:Dashboard:RecentSessionLimit` | `200` | Maximum number of session summaries projected into each dashboard snapshot. |
@@ -230,185 +229,6 @@ behavior.
The alarm monitor is independent of client sessions: `AcknowledgeAlarm` and
`StreamAlarms` are session-less RPCs served by the monitor.
## Host Endpoints and Transport Security (Kestrel)
The listening endpoints are **not** part of the `MxGateway` section. The gateway
uses the stock ASP.NET Core host (`WebApplication.CreateBuilder`) with no
`ConfigureKestrel` call in code, so endpoints come entirely from the standard
`Kestrel` configuration section. On the deployed hosts these values are supplied
as NSSM environment variables (`Kestrel__Endpoints__...`), not from
`appsettings.json`.
Two named endpoints are bound:
| Endpoint name | Purpose | Protocol requirement |
|---|---|---|
| `Http` | Public gRPC API (sessions, invoke, events, Galaxy browse) | HTTP/2 |
| `Dashboard` | Blazor dashboard and SignalR hubs | HTTP/1.1 (HTTP/2 optional) |
Both endpoints share one routing pipeline; the names only select which TCP port
serves which traffic. The gRPC endpoint must negotiate **HTTP/2**, which drives
the protocol settings below.
### Plaintext (current deployments)
Both running hosts (`10.100.0.48` and `wonder-app-vd03`) serve the gRPC port in
**cleartext HTTP/2 (`h2c`)**. Because cleartext HTTP/2 has no ALPN to negotiate
the protocol, the gRPC endpoint must be pinned to `Http2` with prior knowledge:
```text
Kestrel__Endpoints__Http__Url=http://0.0.0.0:5120
Kestrel__Endpoints__Http__Protocols=Http2
Kestrel__Endpoints__Dashboard__Url=http://0.0.0.0:5130
```
In this mode all client↔gateway traffic — including the
`authorization: Bearer mxgw_...` API key and any `WriteSecured` / `AuthenticateUser`
payloads — crosses the network **unencrypted**. This is acceptable only on a
trusted/isolated network segment. Prefer TLS for anything else.
### TLS
To encrypt the gRPC channel, give the `Http` endpoint an `https://` URL and a
certificate. Over TLS, ALPN negotiates HTTP/2, so the explicit `Protocols=Http2`
pin is no longer required (the default `Http1AndHttp2` works for gRPC over TLS).
`appsettings.json` form:
```json
{
"Kestrel": {
"Endpoints": {
"Http": {
"Url": "https://0.0.0.0:5120",
"Certificate": {
"Path": "C:\\ProgramData\\MxGateway\\certs\\gateway.pfx",
"Password": "<pfx-password>"
}
},
"Dashboard": {
"Url": "https://0.0.0.0:5130",
"Certificate": {
"Path": "C:\\ProgramData\\MxGateway\\certs\\gateway.pfx",
"Password": "<pfx-password>"
}
}
}
}
}
```
Equivalent NSSM environment-variable form (how config is delivered on the hosts —
see [server deploy mechanics in the project notes]):
```text
Kestrel__Endpoints__Http__Url=https://0.0.0.0:5120
Kestrel__Endpoints__Http__Certificate__Path=C:\ProgramData\MxGateway\certs\gateway.pfx
Kestrel__Endpoints__Http__Certificate__Password=<pfx-password>
Kestrel__Endpoints__Dashboard__Url=https://0.0.0.0:5130
Kestrel__Endpoints__Dashboard__Certificate__Path=C:\ProgramData\MxGateway\certs\gateway.pfx
Kestrel__Endpoints__Dashboard__Certificate__Password=<pfx-password>
```
Certificate sourcing options (any standard ASP.NET Core form is accepted):
| Form | Keys |
|---|---|
| PFX file | `Certificate:Path` (+ `Certificate:Password` if encrypted) |
| PEM pair | `Certificate:Path` (cert) + `Certificate:KeyPath` (private key) |
| Windows cert store | `Certificate:Subject`, `Certificate:Store` (e.g. `My`), `Certificate:Location` (`LocalMachine`), `Certificate:AllowInvalid` |
The certificate's CN/SAN must cover the host name clients dial (or clients must
set a server-name override — see below). The dashboard endpoint can keep its own
certificate independent of the gRPC endpoint; pair this with
`MxGateway:Dashboard:RequireHttpsCookie` (`true`) for production HTTPS.
### Automatic self-signed certificate
`mxaccessgw` is an internal tool with no PKI to issue certificates, so requiring
an operator to supply one before TLS works pushed deployments toward plaintext.
To avoid that, the gateway fills in a self-signed certificate when an HTTPS
endpoint is configured without one.
**Trigger.** At startup the gateway inspects `Kestrel:Endpoints:*`. If any
endpoint has an `https://` URL and no `Certificate` subsection of its own, and no
`Kestrel:Certificates:Default` is set, the gateway generates (or loads) a
persisted self-signed certificate and wires it in as the HTTPS *default* via
`ConfigureHttpsDefaults`. All-plaintext deployments are untouched: when no HTTPS
endpoint is configured, no certificate or key material is generated or written.
**Generated certificate.** ECDSA P-256, `serverAuth` EKU, validity ≈
`ValidityYears` (default 10 years, with one day of clock-skew slack before
`notBefore`). SANs cover `localhost`, the machine name (and its FQDN when
resolvable), each entry in `AdditionalDnsNames`, and the loopback addresses
`127.0.0.1` and `::1`.
**`MxGateway:Tls:*` options.** All optional; the zero-config path needs none of
them.
| Option | Default | Purpose |
|---|---|---|
| `Tls:SelfSignedCertPath` | `C:\ProgramData\MxGateway\certs\gateway-selfsigned.pfx` | Where the generated certificate is persisted |
| `Tls:ValidityYears` | `10` | Lifetime of the generated certificate (validated 1100) |
| `Tls:AdditionalDnsNames` | `[]` | Extra DNS SANs (e.g. a load-balancer name) |
| `Tls:RegenerateIfExpired` | `true` | Replace an expired persisted certificate instead of failing |
`ValidityYears` is validated by `GatewayOptionsValidator` (range 1100); the
"HTTPS endpoint configured but no certificate available" fail-fast lives in the
bootstrap/provider, because the validator only sees the `MxGateway` section, not
`Kestrel:Endpoints`.
**Persistence.** The PFX is written with an **empty** export password — a random
in-memory password could not be reused across restarts, which the
persist-and-reuse model requires. The private key is instead protected at rest by
filesystem permissions: a restrictive ACL on Windows (SYSTEM + Administrators,
inherited ACEs stripped) on the `certs` directory and file, and mode `0600` on
non-Windows. The write is atomic (hardened temp file, then move). The persisted
certificate is reused across restarts (stable thumbprint, so CA-pinning clients
keep working) and regenerated only when it is missing, expired (and
`RegenerateIfExpired` is `true`), or unreadable/corrupt. If the directory is not
writable or the ACL cannot be applied, the gateway fails fast with a diagnostic
naming the path rather than falling back to an in-memory certificate.
**Logging.** On generate or load, the gateway logs the certificate thumbprint,
SAN list, and `notAfter` at Information. The PFX bytes, export password, and
private key are never logged.
**Operator override.** The generated certificate is only the HTTPS *default*. To
use a real certificate, configure one explicitly — either per endpoint via
`Kestrel:Endpoints:<name>:Certificate` (`Path`/`Subject`/`Thumbprint`, etc., as
in the table above) or globally via `Kestrel:Certificates:Default`. An
explicitly-configured certificate takes precedence, and the gateway then writes
no self-signed material.
### Client side
Each official client opts into TLS explicitly. For the .NET client
(`MxGatewayClientOptions`):
| Option | Effect |
|---|---|
| `UseTls` (default `false`) | Enables TLS. Requires an `https://` endpoint; an `https://` endpoint without `UseTls` fails validation, and vice versa. |
| `CaCertificatePath` | Pins a custom root (self-signed / private CA) using `CustomRootTrust` chain validation instead of the OS trust store; the .NET client also enforces the certificate hostname/SAN match on this path. |
| `RequireCertificateValidation` (default `false`) | Forces OS/system-trust verification on a TLS connection with no pinned CA. Leave `false` for the lenient default. |
| `ServerNameOverride` | SNI / certificate host name override when the dialed host differs from the certificate CN/SAN. |
To pair with the auto-generated self-signed certificate above, the clients are
**lenient by default**: a TLS connection with no pinned CA accepts whatever
certificate the gateway presents. Pin `CaCertificatePath` to verify, or set
`RequireCertificateValidation` to force system-trust verification without
pinning. The other language clients expose the equivalent options; the exact
behavior differs per stack — Python uses trust-on-first-use and Rust is pin-only.
See each client README for the as-built behavior.
### Gateway↔worker IPC
Transport security here applies only to the public gRPC channel. The
gateway↔worker link is a per-session **named pipe**
(`mxaccess-gateway-{gatewayPid}-{sessionId}`), not a network socket. It is not
TLS-encrypted and does not need to be: it never leaves the local Windows host and
is secured by the OS pipe ACL. See [Worker Frame Protocol](./WorkerFrameProtocol.md).
## Related Documentation
- [Gateway Process Detailed Design](./GatewayProcessDesign.md)
-18
View File
@@ -243,27 +243,9 @@ services.AddGrpc(options => options.Interceptors.Add<GatewayGrpcAuthorizationInt
Because the interceptor runs before any handler, `MxAccessGatewayService` can safely assume the call has been authorized and that `IGatewayRequestIdentityAccessor.Current` is populated. The handler's only responsibility is to read the identity for `OpenSession` so the session is owned by the authenticated principal; it does not perform any authorization checks of its own. See [Authorization](./Authorization.md) for the policy and identity model.
## Transport Security
The gRPC endpoint runs over HTTP/2, in cleartext (`h2c`) or TLS depending on the
Kestrel endpoint configuration. The current deployments serve it in cleartext, so
the API key and request payloads cross the network unencrypted. The endpoint,
protocol pinning, and TLS certificate configuration — plus the corresponding
client `UseTls` / `CaCertificatePath` options — are documented in
[Host Endpoints and Transport Security](./GatewayConfiguration.md#host-endpoints-and-transport-security-kestrel).
To make TLS usable without PKI, the gateway can auto-generate and persist a
self-signed certificate when an HTTPS endpoint is configured without one, and the
language clients are lenient by default — a TLS connection with no pinned CA
accepts the presented certificate (with per-stack nuances: Python is
trust-on-first-use, Rust is pin-only). See
[Automatic self-signed certificate](./GatewayConfiguration.md#automatic-self-signed-certificate)
and each client README for the as-built behavior.
## Related Documentation
- [Contracts](./Contracts.md)
- [Sessions](./Sessions.md)
- [Authorization](./Authorization.md)
- [Gateway Configuration](./GatewayConfiguration.md)
- [Gateway Process Design](./GatewayProcessDesign.md)
+6 -6
View File
@@ -4,7 +4,7 @@ The metrics subsystem exposes counters, histograms, and observable gauges that d
## Overview
`GatewayMetrics` is a singleton (registered in `GatewayApplication.cs`) that owns a single `Meter` named `ZB.MOM.WW.MxGateway` and a set of synchronised counters, histograms, and observable gauges. Subsystems call typed mutator methods (`SessionOpened`, `CommandFailed`, `EventReceived`, etc.) rather than touching the `Meter` directly, which keeps the OpenTelemetry instrument names and tag conventions in one place. A `lock (_syncRoot)` block guards the scalar fields used by `GetSnapshot`, while per-event maps use `ConcurrentDictionary<string, long>` so the hot event path avoids the lock.
`GatewayMetrics` is a singleton (registered in `GatewayApplication.cs`) that owns a single `Meter` named `ZB.MOM.WW.MxGateway.Server` and a set of synchronised counters, histograms, and observable gauges. Subsystems call typed mutator methods (`SessionOpened`, `CommandFailed`, `EventReceived`, etc.) rather than touching the `Meter` directly, which keeps the OpenTelemetry instrument names and tag conventions in one place. A `lock (_syncRoot)` block guards the scalar fields used by `GetSnapshot`, while per-event maps use `ConcurrentDictionary<string, long>` so the hot event path avoids the lock.
## Meter and OpenTelemetry Compatibility
@@ -13,7 +13,7 @@ The meter name is exposed as a constant so that hosting code can register it wit
```csharp
public sealed class GatewayMetrics : IDisposable
{
public const string MeterName = "ZB.MOM.WW.MxGateway";
public const string MeterName = "ZB.MOM.WW.MxGateway.Server";
public GatewayMetrics()
{
@@ -50,12 +50,12 @@ All counters are `Counter<long>`. Tag values come from the call sites listed und
### Histograms
Histograms record durations in seconds (the `unit` argument on `CreateHistogram`):
Histograms record durations in milliseconds (the `unit` argument on `CreateHistogram`):
```csharp
_workerStartupLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.workers.startup.duration", "s");
_commandLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.commands.duration", "s");
_eventStreamSendLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.events.stream_send.duration", "s");
_workerStartupLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.workers.startup.duration", "ms");
_commandLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.commands.duration", "ms");
_eventStreamSendLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.events.stream_send.duration", "ms");
```
| Instrument | Tags | What it measures |
@@ -1,156 +0,0 @@
# Gateway TLS Auto-Certificate and Lenient Client Trust — Design
Date: 2026-06-01
Status: Approved (brainstorming), pending implementation plan
## Problem
The gateway can serve gRPC and the dashboard over TLS, but only if an operator
supplies a certificate via the Kestrel `https://` endpoint config. With no cert,
an `https` endpoint fails at startup with Kestrel's opaque "No server certificate
was specified" error. Both current deployments therefore run plaintext (`h2c`),
exposing the API key and request payloads on the wire.
`mxaccessgw` is an internal tool. The goal is for TLS to "just work" with zero PKI
management: the gateway fabricates its own long-lived certificate when an HTTPS
endpoint is configured without one, and clients accept whatever certificate is
presented unless an operator explicitly opts into pinning.
## Decisions
1. **Gateway = fill-missing-cert-only.** No new "enable TLS" switch. TLS is still
driven by configuring a Kestrel `https://` endpoint. New behavior: when an
HTTPS endpoint has no `Certificate` section, the gateway generates/loads a
persisted self-signed cert instead of failing. Plaintext-only hosts are
untouched — no certificate or key material is ever written for them.
2. **Persist & reuse.** The self-signed cert is saved as a PFX under
`C:\ProgramData\MxGateway\certs`, reused across restarts, regenerated only if
missing, expired, or unreadable. Stable thumbprint; survives restarts; any
CA-pinning client keeps working.
3. **Clients = lenient TLS, plaintext default.** When a client connects over TLS
without a pinned CA, it skips verification (accepts any cert). Pinning a CA file
restores full verification. The per-client connection default (mostly
plaintext/`http`) does not change — TLS is still opt-in via the endpoint scheme.
**Scope boundary:** the gateway↔worker named-pipe IPC is unchanged (local,
OS-secured by the pipe ACL). This work touches only the public gRPC/dashboard
transport and the five language clients.
## Gateway component
New type `SelfSignedCertificateProvider` in
`src/ZB.MOM.WW.MxGateway.Server/Security/Tls/`.
1. **Detect need.** Inspect `Kestrel:Endpoints:*` configuration at startup. If any
endpoint has an `https://` URL and no `Certificate` subsection, a default cert
is needed. If none do, the provider is a no-op (no file written).
2. **Load-or-create.** Look for the persisted PFX. If present, valid, and
unexpired, load it. Otherwise generate and persist.
3. **Generate.** `CertificateRequest` with **ECDSA P-256**, `notBefore = now - 1
day` (clock-skew slack), `notAfter = now + ValidityYears`. SANs: `DNS=localhost`,
`DNS=<MachineName>`, `DNS=<MachineName.FQDN>` when resolvable, plus
`IP=127.0.0.1` and `IP=::1`. Server-auth EKU.
4. **Persist securely.** Write the PFX with an **empty** export password (a random
in-memory password cannot be reused across restarts, which the persist-and-reuse
decision requires); protect the private key with a restrictive ACL (SYSTEM +
Administrators + service account) on the `certs` directory and file on Windows,
and `0600` on non-Windows; atomic write (temp + rename). After generating, the
cert is reloaded from the persisted PFX so Kestrel always serves the on-disk key.
5. **Wire into Kestrel.** In `GatewayApplication.CreateBuilder`, add
`builder.WebHost.ConfigureKestrel(o => o.ConfigureHttpsDefaults(h =>
h.ServerCertificate = cert))`. `ConfigureHttpsDefaults` supplies the cert only
for HTTPS endpoints that did not specify their own, so an operator-configured
`Kestrel:Endpoints:*:Certificate` transparently overrides it. One hook covers
both the gRPC and dashboard ports.
### New config block `MxGateway:Tls`
All optional; the zero-config path needs none of them.
| Option | Default | Purpose |
|---|---|---|
| `Tls:SelfSignedCertPath` | `C:\ProgramData\MxGateway\certs\gateway-selfsigned.pfx` | Where the generated cert lives |
| `Tls:ValidityYears` | `10` | Lifetime of the generated cert |
| `Tls:AdditionalDnsNames` | `[]` | Extra SANs (e.g. a load-balancer name) |
| `Tls:RegenerateIfExpired` | `true` | Auto-replace an expired persisted cert |
Validated by `GatewayOptionsValidator`: `ValidityYears` in 1100,
`SelfSignedCertPath` is a valid path shape when non-blank, and
`AdditionalDnsNames` entries are non-blank. (The "https endpoint exists but cert
path is blank" fail-fast lives in the bootstrap/provider, not the validator,
because the validator only sees the `MxGateway` section, not `Kestrel:Endpoints`.)
**Logging:** on generate/load, log thumbprint + SAN list + `notAfter` at
Information. Never log the PFX password or private key.
## Client lenient-TLS behavior
Uniform rule: **TLS on + no CA pinned ⇒ skip verification; CA pinned ⇒ full
verification.** No transport default changes. Each client also exposes an explicit
switch to force-disable leniency (strict-without-pinning) for the future.
| Client | Mechanism | Effort |
|---|---|---|
| .NET | In `CreateHttpHandler`, when `UseTls` and `CaCertificatePath` empty, set `SslOptions.RemoteCertificateValidationCallback = (_,_,_,_) => true`. CA path keeps existing custom-root validation. | trivial |
| Go | In `buildCredentials`, when TLS and no `CACertFile`/`TLSConfig`, use `tls.Config{InsecureSkipVerify: true, ServerName: override}`. | trivial |
| Java | grpc-netty-shaded 1.76.0 ships `InsecureTrustManagerFactory`. When TLS and no CA, build `GrpcSslContexts.forClient().trustManager(InsecureTrustManagerFactory.INSTANCE)`. | easy |
| Python | grpc-python has no per-channel skip-verify. Fetch the server leaf cert at connect via `ssl.get_server_certificate((host, port))`, pass it as `root_certificates` to `ssl_channel_credentials`, plus `grpc.ssl_target_name_override`. Effectively trusts what is presented (TOFU). | moderate, special-cased |
| Rust | tonic 0.13.1 + rustls (`tls-ring`). Implement a custom `rustls::client::danger::ServerCertVerifier` that accepts everything, build a `rustls::ClientConfig` via `.dangerous().with_custom_certificate_verifier(...)`, feed it to the channel. May require a custom hyper-rustls connector if `ClientTlsConfig` will not take a raw rustls config. **Needs an API spike.** | highest |
### Honesty caveats
- **Python** is not literally "ignore the cert"; it pins whatever the server
presents on first contact via a separate unverified TLS probe. For a self-signed
internal cert this is the intended outcome. Documented as a difference.
- **Rust** leniency depends on the tonic 0.13 TLS surface. If a custom verifier is
disproportionately invasive, the fallback is to require a CA file for Rust TLS
(pin-only) and document Rust as the exception.
## Error handling
Gateway:
- Cert dir not writable / ACL fails ⇒ fail fast at startup with a diagnostic naming
the path and required permission. No silent in-memory fallback.
- Persisted PFX corrupt/unreadable ⇒ warn, regenerate, overwrite.
- Persisted cert expired ⇒ regenerate if `RegenerateIfExpired` (default), else fail
fast instructing the operator to delete it or enable regeneration.
- HTTPS endpoint configured but generation disabled / path empty ⇒ validator
rejects at startup rather than letting Kestrel throw its opaque error.
Clients: surface unchanged. Skip-verify cannot itself raise. Python's pre-fetch
wraps connect failure into the existing connect-error type with the endpoint in the
message. Rust pin-only fallback surfaces the existing CA-file error.
## Documentation (same commit as source, per CLAUDE.md)
- `docs/GatewayConfiguration.md` — extend the TLS section: auto-generation, the
`MxGateway:Tls:*` block, persistence location/ACL, thumbprint logging, operator
override via `Kestrel:Endpoints:*:Certificate`.
- Each client README + `*ClientDesign.md` — "TLS is lenient by default; pin a CA to
verify," with Python TOFU and any Rust caveat noted.
- `docs/DesignDecisions.md` — record both posture choices and the why (internal
tool, no PKI) so they are not mistaken for an oversight.
## Testing
Gateway (`MxGateway.Tests`, no MXAccess):
- `SelfSignedCertificateProvider`: SANs, server-auth EKU, `notAfter ≈ now +
ValidityYears`, ECDSA P-256.
- Load-or-create: valid persisted PFX reused (same thumbprint); expired regenerates
when enabled; corrupt regenerates with a warning.
- Detection: HTTPS-without-cert engages; all-plaintext no-ops and writes no file;
endpoint with its own cert is not overridden.
- `GatewayOptionsValidator`: new `Tls:*` rules.
- Host integration: `Kestrel:Endpoints:Http:Url=https://127.0.0.1:0` builds and
binds (today it throws "no certificate specified").
Clients: each test project gets a lenient-TLS test against a throwaway self-signed
cert — connect with no CA succeeds; pinning a wrong CA fails (proves pinning still
verifies). Python exercises the pre-fetch path; mark opt-in if loopback timing is
flaky. Standard (non-live) tests; no MXAccess or external services.
Cross-language: add a TLS variant note to `docs/CrossLanguageSmokeMatrix.md`;
running the matrix over TLS stays manual/opt-in, consistent with the existing gate.
Per-component verification follows CLAUDE.md's source-update table (build + test
each touched component independently).
File diff suppressed because it is too large Load Diff
@@ -1,18 +0,0 @@
{
"planPath": "docs/plans/2026-06-01-gateway-cert-autogen-implementation.md",
"tasks": [
{"id": 1, "subject": "Task 1: Add TlsOptions config + bind into GatewayOptions", "status": "pending"},
{"id": 2, "subject": "Task 2: Validate MxGateway:Tls in GatewayOptionsValidator", "status": "pending", "blockedBy": [1]},
{"id": 3, "subject": "Task 3: SelfSignedCertificateProvider.GenerateCertificate", "status": "pending", "blockedBy": [1]},
{"id": 4, "subject": "Task 4: SelfSignedCertificateProvider.LoadOrCreate (persist/reuse/regenerate/ACL)", "status": "pending", "blockedBy": [3]},
{"id": 5, "subject": "Task 5: KestrelTlsInspector (detect HTTPS-without-cert)", "status": "pending"},
{"id": 6, "subject": "Task 6: Wire auto-cert into GatewayApplication.CreateBuilder", "status": "pending", "blockedBy": [1, 4, 5]},
{"id": 7, "subject": "Task 7: .NET client lenient TLS by default", "status": "pending"},
{"id": 8, "subject": "Task 8: Go client lenient TLS by default", "status": "pending"},
{"id": 9, "subject": "Task 9: Java client lenient TLS by default", "status": "pending"},
{"id": 10, "subject": "Task 10: Python client lenient TLS via TOFU pre-fetch", "status": "pending"},
{"id": 11, "subject": "Task 11: Rust client lenient TLS via rustls verifier (spike + fallback)", "status": "pending"},
{"id": 12, "subject": "Task 12: Documentation", "status": "pending", "blockedBy": [6, 7, 8, 9, 10, 11]}
],
"lastUpdated": "2026-06-01"
}
+57 -104
View File
@@ -1,37 +1,28 @@
# GLAuth — LDAP authn reference for mxaccessgw
> **UPDATED 2026-06-04 — mxaccessgw no longer uses a per-box GLAuth at `C:\publish\glauth`.
> Dev/test LDAP is now the SHARED GLAuth on `10.100.0.35:3893` (`dc=zb,dc=local`);
> the single source of truth is `scadaproj/infra/glauth/` (`config.toml` + `README`).
> The localhost/NSSM/`glauth.cfg` procedures below are RETIRED, kept for reference/rollback.**
GLAuth is a lightweight LDAP server installed on this dev box at
`C:\publish\glauth\` and run as a Windows service via NSSM. It already
backs the LmxOpcUa OPC UA server's UserName-token authn and the LmxOpcUa
Admin UI's cookie login; this doc captures everything mxaccessgw needs
to consume the same directory so a single set of dev credentials covers
both stacks.
GLAuth is a lightweight LDAP server. It already backs all three sister apps (MxAccessGateway,
OtOpcUa, ScadaBridge) through a **shared container** (`zb-shared-glauth`) running on the Linux
docker host at **`10.100.0.35:3893`**. This doc captures everything mxaccessgw needs to consume
that directory so a single set of dev credentials covers all stacks.
~~GLAuth is installed on this dev box at `C:\publish\glauth\` and run as a Windows service via
NSSM.~~ *(RETIRED — the per-box Windows service has been stopped and set to Manual startup;
kept only as a rollback option. Do not edit or restart it for new work.)*
The single source of truth for the shared GLAuth is
**`~/Desktop/scadaproj/infra/glauth/config.toml`** (deploy/verify runbook:
`scadaproj/infra/glauth/README.md`). This doc is a redistilled view tailored to mxaccessgw —
what users + groups are provisioned, how to bind against them, and what's needed to add a
gw-specific role.
The authoritative copy of LmxOpcUa's reference lives at
`C:\publish\glauth\auth.md`. This doc is a redistilled view tailored to
mxaccessgw — what users + groups are already provisioned, how to bind
against them, and what's needed to add a gw-specific role.
## Connection details
| Setting | Value |
|---|---|
| Protocol | LDAP (unencrypted) |
| Host | **`10.100.0.35`** (shared docker host — ~~`localhost`~~ retired) |
| Host | `localhost` |
| Port | `3893` |
| LDAPS | disabled in dev (`Transport=None`, `AllowInsecure=true`) |
| Base DN | `dc=zb,dc=local` |
| Bind DN format | `cn={username},dc=zb,dc=local` |
| Service account DN | `cn=serviceaccount,dc=zb,dc=local` / `serviceaccount123` |
| Group OU | `ou=<groupname>,ou=groups,dc=zb,dc=local` |
| LDAPS | disabled in dev (set `[ldaps]` block to enable) |
| Base DN | `dc=lmxopcua,dc=local` |
| Bind DN format | `cn={username},dc=lmxopcua,dc=local` |
| Group OU | `ou=<groupname>,ou=groups,dc=lmxopcua,dc=local` |
| Failed-bind throttle | 3 fails → 10-minute IP lockout (per `[behaviors]`) |
## Pre-existing groups (LmxOpcUa role taxonomy)
@@ -42,11 +33,11 @@ LmxOpcUa write rights doesn't need a second account for the gw.
| Group | GID | DN | LmxOpcUa meaning | Suggested mxgw mapping |
|---|---|---|---|---|
| ReadOnly | 5501 | `ou=ReadOnly,ou=groups,dc=zb,dc=local` | Browse + read OPC UA nodes | `Browse` + `Subscribe` (read paths only) |
| WriteOperate | 5502 | `ou=WriteOperate,ou=groups,dc=zb,dc=local` | Write FreeAccess / Operate attrs | `Write` (plain) |
| WriteTune | 5504 | `ou=WriteTune,ou=groups,dc=zb,dc=local` | Write Tune attrs | `WriteSecured` (Tune only) |
| WriteConfigure | 5505 | `ou=WriteConfigure,ou=groups,dc=zb,dc=local` | Write Configure attrs | `WriteSecured` (Configure) |
| AlarmAck | 5503 | `ou=AlarmAck,ou=groups,dc=zb,dc=local` | Acknowledge alarms | gw alarm-ack RPC, when added |
| ReadOnly | 5501 | `ou=ReadOnly,ou=groups,dc=lmxopcua,dc=local` | Browse + read OPC UA nodes | `Browse` + `Subscribe` (read paths only) |
| WriteOperate | 5502 | `ou=WriteOperate,ou=groups,dc=lmxopcua,dc=local` | Write FreeAccess / Operate attrs | `Write` (plain) |
| WriteTune | 5504 | `ou=WriteTune,ou=groups,dc=lmxopcua,dc=local` | Write Tune attrs | `WriteSecured` (Tune only) |
| WriteConfigure | 5505 | `ou=WriteConfigure,ou=groups,dc=lmxopcua,dc=local` | Write Configure attrs | `WriteSecured` (Configure) |
| AlarmAck | 5503 | `ou=AlarmAck,ou=groups,dc=lmxopcua,dc=local` | Acknowledge alarms | gw alarm-ack RPC, when added |
**A user can be in multiple groups** — `othergroups = [...]` in the
config is a list. `admin` is the canonical example (in every role
@@ -68,26 +59,20 @@ For mxaccessgw dev, `admin` covers every gw-side capability test;
`readonly` is the right "negative" case for proving Browse-OK /
Write-denied.
The gateway dashboard uses two gateway-specific groups beyond the LmxOpcUa taxonomy:
`GwAdmin` (gid 5610 → role `Administrator`) and `GwReader` (gid 5611 → role `Viewer`).
These are already provisioned in the shared `scadaproj/infra/glauth/config.toml`.
The dashboard test users are **`multi-role`/`password`** (Administrator) and
**`gw-viewer`/`password`** (Viewer). `LdapOptions.RequiredGroup` defaults to `GwAdmin`.
See [Provisioning the GwAdmin group](#provisioning-the-gwadmin-group) below for the
(now-retired) per-box procedure and for the shared-config equivalent.
> **Dashboard role value (Task 1.7):** the LDAP `GwAdmin` group now maps to
> the canonical dashboard role **`Administrator`** (was `Admin`); `GwReader`
> maps to `Viewer`. This is a pure value rename via
> `MxGateway:Dashboard:GroupToRole` — same operations are authorized. (This
> dashboard role is distinct from the lowercase gRPC `admin` *API-key scope*.)
The gateway dashboard adds one role beyond this LmxOpcUa taxonomy:
`GwAdmin`. `LdapOptions.RequiredGroup` defaults to `GwAdmin`, so the
dashboard login and `DashboardLdapLiveTests` require `admin` to be a
member of a `GwAdmin` group. `GwAdmin` is **not** in the baseline
GLAuth config — it must be provisioned before dashboard authn or the
LDAP live tests work. See [Provisioning the GwAdmin
group](#provisioning-the-gwadmin-group) below.
## Two bind patterns
### 1. Direct bind (simplest)
```
DN: cn=admin,dc=zb,dc=local
DN: cn=admin,dc=lmxopcua,dc=local
Password: admin123
```
@@ -99,9 +84,9 @@ by `sAMAccountName`, not `cn`. Use this only for dev convenience.
### 2. Bind-then-search (production-grade)
```
1. Bind as the service account (cn=serviceaccount,dc=zb,dc=local
1. Bind as the service account (cn=serviceaccount,dc=lmxopcua,dc=local
/ serviceaccount123).
2. Search under dc=zb,dc=local with filter
2. Search under dc=lmxopcua,dc=local with filter
(uid=<entered-username>) — or any attribute the deployment
identifies users by. GLAuth populates uid + cn.
3. Read the returned entry's DN + memberOf list (groups).
@@ -127,12 +112,12 @@ record:
```yaml
ldap:
enabled: true
server: 10.100.0.35 # shared GLAuth on docker host (was localhost)
server: localhost
port: 3893
useTls: false
allowInsecureLdap: true # dev only
searchBase: "dc=zb,dc=local"
serviceAccountDn: "cn=serviceaccount,dc=zb,dc=local"
searchBase: "dc=lmxopcua,dc=local"
serviceAccountDn: "cn=serviceaccount,dc=lmxopcua,dc=local"
serviceAccountPassword: "serviceaccount123"
userNameAttribute: "uid" # GLAuth populates this; AD uses sAMAccountName
displayNameAttribute: "cn"
@@ -146,35 +131,19 @@ ldap:
```
`groupAttribute` returns full DNs like
`ou=ReadOnly,ou=groups,dc=zb,dc=local` — the authenticator
`ou=ReadOnly,ou=groups,dc=lmxopcua,dc=local` — the authenticator
should strip the leading `ou=` (or `cn=` against AD) RDN value and
look that up in `groupToRole`.
## Provisioning the GwAdmin group
> **UPDATED 2026-06-04 — RETIRED per-box procedure.** `GwAdmin` (gid 5610) and `GwReader`
> (gid 5611) are already present in the shared GLAuth. To add or modify users/groups,
> edit **`~/Desktop/scadaproj/infra/glauth/config.toml`** on host `10.100.0.35` and run:
>
> ```bash
> cd ~/Desktop/scadaproj/infra/glauth
> docker compose up -d --force-recreate
> ```
>
> The per-box `C:\publish\glauth\glauth.cfg` + NSSM procedure below is kept for
> rollback reference only — do not use it for new provisioning.
`GwAdmin` is the gateway-specific dashboard-admin role. It is the
default `LdapOptions.RequiredGroup`, so the dashboard cookie login and
`DashboardLdapLiveTests` (`MXGATEWAY_RUN_LIVE_LDAP_TESTS=1`) reject
logins unless the user is a member of `GwAdmin`.
The `GwAdmin` (gid 5610) and `GwReader` (gid 5611) groups already exist in the shared
config at `scadaproj/infra/glauth/config.toml`. Dashboard test users are
`multi-role`/`password` (Administrator) and `gw-viewer`/`password` (Viewer).
---
**RETIRED — per-box provisioning (reference/rollback only):**
`admin` until a `GwAdmin` group exists and `admin` is a member.
GLAuth's baseline config ships only the five LmxOpcUa role groups, so
`GwAdmin` must be added to GLAuth rather than run from a separate LDAP
server:
1. Edit `C:\publish\glauth\glauth.cfg`
2. Append the group:
@@ -203,7 +172,7 @@ config at `scadaproj/infra/glauth/config.toml`. Dashboard test users are
4. `nssm restart GLAuth`
After the restart, `admin`'s `memberOf` includes
`ou=GwAdmin,ou=groups,dc=zb,dc=local`, which the authenticator
`ou=GwAdmin,ou=groups,dc=lmxopcua,dc=local`, which the authenticator
strips to `GwAdmin` and matches against `RequiredGroup`. The same
pattern applies to any future permission that doesn't fit the existing
five roles.
@@ -224,16 +193,15 @@ echo -n "yourpassword" | openssl dgst -sha256
## Quick verification
From mxaccessgw's dev box, prove the shared directory is reachable:
From mxaccessgw's dev box, prove the directory is reachable:
```powershell
# Plain bind via PowerShell + System.DirectoryServices.Protocols
# (shared GLAuth on 10.100.0.35 — was localhost, now the docker host)
$ldap = New-Object System.DirectoryServices.Protocols.LdapConnection("10.100.0.35:3893")
$ldap = New-Object System.DirectoryServices.Protocols.LdapConnection("localhost:3893")
$ldap.AuthType = [System.DirectoryServices.Protocols.AuthType]::Basic
$ldap.SessionOptions.ProtocolVersion = 3
$ldap.SessionOptions.SecureSocketLayer = $false
$cred = New-Object System.Net.NetworkCredential("cn=multi-role,dc=zb,dc=local","password")
$cred = New-Object System.Net.NetworkCredential("cn=admin,dc=lmxopcua,dc=local","admin123")
$ldap.Bind($cred)
"Bind OK"
```
@@ -241,32 +209,17 @@ $ldap.Bind($cred)
Or via `ldapsearch` if you have OpenLDAP CLI tools:
```bash
ldapsearch -x -H ldap://10.100.0.35:3893 \
-D "cn=serviceaccount,dc=zb,dc=local" -w serviceaccount123 \
-b "dc=zb,dc=local" "(uid=multi-role)"
ldapsearch -x -H ldap://localhost:3893 \
-D "cn=admin,dc=lmxopcua,dc=local" -w admin123 \
-b "dc=lmxopcua,dc=local" "(uid=admin)"
```
The response should list `multi-role`'s entry with `memberOf` including
`ou=GwAdmin,ou=groups,dc=zb,dc=local`.
The response should list `admin`'s entry with `memberOf` populated for
all five role groups — plus `GwAdmin` once the gateway-specific group
is provisioned.
## Service management
> **RETIRED — per-box NSSM service (reference/rollback only).** The shared GLAuth is
> managed via `docker compose` on `10.100.0.35` (`scadaproj/infra/glauth/`). The
> Windows NSSM `GLAuth` service on the dev box has been stopped and set to
> `StartupType=Manual`; only restart it if you need to roll back to a local directory.
>
> **Active (shared) management:**
> ```bash
> ssh 10.100.0.35
> cd ~/Desktop/scadaproj/infra/glauth
> docker compose ps # check container status
> docker compose up -d --force-recreate # apply config.toml changes
> docker compose logs -f # tail logs
> ```
**RETIRED — per-box NSSM commands (rollback reference):**
```powershell
# Status / start / stop / restart
nssm status GLAuth
@@ -300,12 +253,12 @@ applies to mxaccessgw verbatim. Keys that change:
| Field | GLAuth dev value | AD production value |
|---|---|---|
| `Server` | `10.100.0.35` (shared docker host) | a domain controller FQDN, or the domain itself |
| `Server` | `localhost` | a domain controller FQDN, or the domain itself |
| `Port` | `3893` | `636` (LDAPS) — AD increasingly rejects plain bind under LDAP-signing enforcement |
| `UseTls` | `false` | `true` |
| `AllowInsecureLdap` | `true` | `false` |
| `SearchBase` | `dc=zb,dc=local` | `DC=corp,DC=example,DC=com` |
| `ServiceAccountDn` | `cn=serviceaccount,dc=zb,dc=local` | `CN=MxGwSvc,OU=Service Accounts,DC=corp,...` |
| `SearchBase` | `dc=lmxopcua,dc=local` | `DC=corp,DC=example,DC=com` |
| `ServiceAccountDn` | `cn=serviceaccount,dc=lmxopcua,dc=local` | `CN=MxGwSvc,OU=Service Accounts,DC=corp,...` |
| `UserNameAttribute` | `uid` | `sAMAccountName` (or `userPrincipalName`) |
| `GroupAttribute` | `memberOf` (unchanged) | `memberOf` (unchanged) |
@@ -316,12 +269,12 @@ add a `tokenGroups` query as an enhancement.
## Security notes for production
- **Plaintext passwords in `config.toml` are dev-only.** The shared config is in
`scadaproj/infra/glauth/config.toml` (unencrypted); restrict filesystem access on
`10.100.0.35` accordingly. Treat the dev creds as throwaway. Production LDAP is Active
Directory. *(The retired per-box `C:\publish\glauth\glauth.cfg` has the same caveat.)*
- **Plaintext passwords in `glauth.cfg` are dev-only.** The config is
unencrypted on disk; anyone with read access to `C:\publish\glauth\`
can SHA256-rainbow-table the entries. Treat the dev creds as
throwaway. Production LDAP is Active Directory.
- The 3-fail / 10-minute lockout is per source IP, not per user — a
shared NAT can lock out a whole office. Tunable in `[behaviors]`.
- LDAPS isn't enabled in dev; binding sends passwords cleartext on the
wire. The shared GLAuth listens only on the LAN (`10.100.0.35`); never
expose port 3893 externally without enabling TLS first.
wire. Fine for `localhost`, never expose port 3893 off-box without
enabling TLS first.
-26
View File
@@ -1,26 +0,0 @@
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<packageSources>
<clear />
<add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
<add key="dohertj2-gitea" value="https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json" />
</packageSources>
<!-- nuget.org serves everything; the Gitea feed serves only the ZB.MOM.WW.* shared libs.
Credentials are NOT committed: they are provided per-developer at the user level. -->
<packageSourceMapping>
<packageSource key="nuget.org">
<package pattern="*" />
</packageSource>
<packageSource key="dohertj2-gitea">
<package pattern="ZB.MOM.WW.Health" />
<package pattern="ZB.MOM.WW.Health.*" />
<package pattern="ZB.MOM.WW.Telemetry" />
<package pattern="ZB.MOM.WW.Telemetry.*" />
<package pattern="ZB.MOM.WW.Configuration" />
<package pattern="ZB.MOM.WW.Auth" />
<package pattern="ZB.MOM.WW.Auth.*" />
<package pattern="ZB.MOM.WW.Audit" />
<package pattern="ZB.MOM.WW.Theme" />
</packageSource>
</packageSourceMapping>
</configuration>
@@ -1,11 +1,8 @@
using System.Security.Claims;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.Auth.Abstractions.Ldap;
using ZB.MOM.WW.Auth.Ldap;
using ZB.MOM.WW.MxGateway.Server.Configuration;
using ZB.MOM.WW.MxGateway.Server.Dashboard;
using LibraryLdapOptions = ZB.MOM.WW.Auth.Abstractions.Ldap.LdapOptions;
namespace ZB.MOM.WW.MxGateway.IntegrationTests;
@@ -31,11 +28,12 @@ public sealed class DashboardLdapLiveTests
claim.Type == DashboardAuthenticationDefaults.LdapGroupClaimType
&& claim.Value.Contains("GwAdmin", StringComparison.OrdinalIgnoreCase));
// IntegrationTests-023: DashboardAuthenticator builds the principal with a
// ClaimTypes.Role claim resolved from the LDAP groups via the
// DashboardGroupRoleMapper. The seeded GroupToRole map (GwAdmin -> Admin)
// means the admin principal must carry Role=Admin alongside the raw LDAP-group
// claim. A regression in the group→role mapping would fail this assertion.
// IntegrationTests-023: DashboardAuthenticator.CreatePrincipal emits a
// ClaimTypes.Role claim derived from MapGroupsToRoles. The seeded
// GroupToRole map (GwAdmin -> Admin) means the admin principal must
// carry Role=Admin alongside the raw LDAP-group claim. A regression in
// MapGroupsToRoles (returning an empty list, missing the RDN fallback)
// would silently pass without this assertion.
Assert.Contains(result.Principal.Claims, claim =>
claim.Type == ClaimTypes.Role
&& claim.Value == DashboardRoles.Admin);
@@ -61,7 +59,7 @@ public sealed class DashboardLdapLiveTests
[LiveLdapFact]
public async Task AuthenticateAsync_AdminWithWrongPassword_FailsWithoutLeakingPassword()
{
// Exercises the user-bind-failure branch: the user exists and the service
// Exercises the LdapException branch: the user exists and the service
// account search succeeds, but the candidate bind is rejected.
const string wrongPassword = "definitely-not-the-admin-password";
DashboardAuthenticator authenticator = CreateAuthenticator();
@@ -80,8 +78,8 @@ public sealed class DashboardLdapLiveTests
[LiveLdapFact]
public async Task AuthenticateAsync_UnknownUsername_Fails()
{
// Exercises the user-not-found branch: the service-account search returns no
// entry, so no candidate bind is attempted.
// Exercises the `candidate is null` branch: the service-account search
// returns no entry, so no candidate bind is attempted.
DashboardAuthenticator authenticator = CreateAuthenticator();
DashboardAuthenticationResult result = await authenticator.AuthenticateAsync(
@@ -98,13 +96,18 @@ public sealed class DashboardLdapLiveTests
public async Task AuthenticateAsync_ServerUnreachable_FailsWithoutThrowing()
{
// Exercises the connect-failure path: a closed loopback port produces a
// connection error that the shared LdapAuthService must absorb into a Fail
// connection error that DashboardAuthenticator must absorb into a Fail
// result rather than propagating an exception to the dashboard.
DashboardAuthenticator authenticator = CreateAuthenticator(LibraryOptions() with
{
// 1 is a reserved port number that no LDAP server listens on.
Port = 1,
});
DashboardAuthenticator authenticator = new(
Options.Create(new GatewayOptions
{
Ldap = new LdapOptions
{
// 1 is a reserved port number that no LDAP server listens on.
Port = 1,
},
}),
NullLogger<DashboardAuthenticator>.Instance);
DashboardAuthenticationResult result = await authenticator.AuthenticateAsync(
"admin",
@@ -115,48 +118,19 @@ public sealed class DashboardLdapLiveTests
Assert.Null(result.Principal);
}
private static DashboardAuthenticator CreateAuthenticator() => CreateAuthenticator(LibraryOptions());
private static DashboardAuthenticator CreateAuthenticator(LibraryLdapOptions ldapOptions)
private static DashboardAuthenticator CreateAuthenticator()
{
GatewayOptions gatewayOptions = new()
{
Dashboard = new DashboardOptions
{
GroupToRole = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
["GwAdmin"] = DashboardRoles.Admin,
},
},
};
return new DashboardAuthenticator(
new LdapAuthService(ldapOptions),
new DashboardGroupRoleMapper(Options.Create(gatewayOptions)),
Options.Create(new GatewayOptions
{
Dashboard = new DashboardOptions
{
GroupToRole = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
["GwAdmin"] = DashboardRoles.Admin,
},
},
}),
NullLogger<DashboardAuthenticator>.Instance);
}
/// <summary>
/// Builds the shared library <see cref="LibraryLdapOptions"/> from the gateway's
/// default LDAP settings so the live tests exercise the same seeded directory the
/// gateway connects to (localhost:3893, plaintext, with AllowInsecure for dev).
/// </summary>
private static LibraryLdapOptions LibraryOptions()
{
ZB.MOM.WW.MxGateway.Server.Configuration.LdapOptions gateway = new();
return new LibraryLdapOptions
{
Enabled = gateway.Enabled,
Server = gateway.Server,
Port = gateway.Port,
Transport = gateway.Transport,
AllowInsecure = gateway.AllowInsecure,
SearchBase = gateway.SearchBase,
ServiceAccountDn = gateway.ServiceAccountDn,
ServiceAccountPassword = gateway.ServiceAccountPassword,
UserNameAttribute = gateway.UserNameAttribute,
DisplayNameAttribute = gateway.DisplayNameAttribute,
GroupAttribute = gateway.GroupAttribute,
};
}
}
@@ -21,17 +21,6 @@ public sealed class DashboardOptions
/// </summary>
public bool RequireHttpsCookie { get; init; } = true;
/// <summary>
/// Dashboard auth cookie name. When null/blank (the default) the canonical
/// <see cref="ZB.MOM.WW.MxGateway.Server.Dashboard.DashboardAuthenticationDefaults.CookieName"/>
/// is used. Override it (<c>MxGateway:Dashboard:CookieName</c>) to give a distinct name to a
/// gateway that shares a hostname with another gateway instance — browser cookies are scoped
/// by host+path but NOT by port, so two instances on the same host would otherwise clobber
/// each other's dashboard session under a shared cookie name. Changing this signs out
/// existing dashboard sessions on next deploy.
/// </summary>
public string? CookieName { get; init; }
/// <summary>Gets the dashboard snapshot update interval in milliseconds.</summary>
public int SnapshotIntervalMilliseconds { get; init; } = 1_000;
@@ -4,8 +4,8 @@ public sealed record EffectiveLdapConfiguration(
bool Enabled,
string Server,
int Port,
string Transport,
bool AllowInsecure,
bool UseTls,
bool AllowInsecureLdap,
string SearchBase,
string ServiceAccountDn,
string ServiceAccountPassword,
@@ -23,8 +23,8 @@ public sealed class GatewayConfigurationProvider(IOptions<GatewayOptions> option
Enabled: value.Ldap.Enabled,
Server: value.Ldap.Server,
Port: value.Ldap.Port,
Transport: value.Ldap.Transport.ToString(),
AllowInsecure: value.Ldap.AllowInsecure,
UseTls: value.Ldap.UseTls,
AllowInsecureLdap: value.Ldap.AllowInsecureLdap,
SearchBase: value.Ldap.SearchBase,
ServiceAccountDn: value.Ldap.ServiceAccountDn,
ServiceAccountPassword: RedactedValue,
@@ -1,5 +1,4 @@
using Microsoft.Extensions.Configuration;
using ZB.MOM.WW.Configuration;
using Microsoft.Extensions.Options;
namespace ZB.MOM.WW.MxGateway.Server.Configuration;
@@ -7,14 +6,15 @@ public static class GatewayConfigurationServiceCollectionExtensions
{
/// <summary>Registers gateway configuration services in the dependency injection container.</summary>
/// <param name="services">The service collection.</param>
/// <param name="configuration">The configuration to bind gateway options from.</param>
/// <returns>The service collection for chaining.</returns>
public static IServiceCollection AddGatewayConfiguration(
this IServiceCollection services, IConfiguration configuration)
public static IServiceCollection AddGatewayConfiguration(this IServiceCollection services)
{
services.AddValidatedOptions<GatewayOptions, GatewayOptionsValidator>(
configuration, GatewayOptions.SectionName);
services
.AddOptions<GatewayOptions>()
.BindConfiguration(GatewayOptions.SectionName)
.ValidateOnStart();
services.AddSingleton<IValidateOptions<GatewayOptions>, GatewayOptionsValidator>();
services.AddSingleton<IGatewayConfigurationProvider, GatewayConfigurationProvider>();
return services;
@@ -43,7 +43,4 @@ public sealed class GatewayOptions
/// behaviour (alarms disabled).
/// </summary>
public AlarmsOptions Alarms { get; init; } = new();
/// <summary>Gets self-signed TLS certificate auto-generation options.</summary>
public TlsOptions Tls { get; init; } = new();
}
@@ -1,10 +1,9 @@
using ZB.MOM.WW.Auth.Abstractions.Ldap;
using ZB.MOM.WW.Configuration;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.MxGateway.Contracts;
namespace ZB.MOM.WW.MxGateway.Server.Configuration;
public sealed class GatewayOptionsValidator : OptionsValidatorBase<GatewayOptions>
public sealed class GatewayOptionsValidator : IValidateOptions<GatewayOptions>
{
private const int MinimumMaxMessageBytes = 1024;
private const int MaximumMaxMessageBytes = 256 * 1024 * 1024;
@@ -12,26 +11,32 @@ public sealed class GatewayOptionsValidator : OptionsValidatorBase<GatewayOption
/// <summary>
/// Validates gateway configuration options.
/// </summary>
/// <param name="builder">The accumulator to record failures on.</param>
/// <param name="name">Options name.</param>
/// <param name="options">Gateway options to validate.</param>
protected override void Validate(ValidationBuilder builder, GatewayOptions options)
/// <returns>Validation result.</returns>
public ValidateOptionsResult Validate(string? name, GatewayOptions options)
{
ValidateAuthentication(options.Authentication, builder);
ValidateLdap(options.Ldap, builder);
ValidateWorker(options.Worker, builder);
ValidateSessions(options.Sessions, builder);
ValidateEvents(options.Events, builder);
ValidateDashboard(options.Dashboard, builder);
ValidateProtocol(options.Protocol, builder);
ValidateAlarms(options.Alarms, builder);
ValidateTls(options.Tls, builder);
List<string> failures = [];
ValidateAuthentication(options.Authentication, failures);
ValidateLdap(options.Ldap, failures);
ValidateWorker(options.Worker, failures);
ValidateSessions(options.Sessions, failures);
ValidateEvents(options.Events, failures);
ValidateDashboard(options.Dashboard, failures);
ValidateProtocol(options.Protocol, failures);
ValidateAlarms(options.Alarms, failures);
return failures.Count == 0
? ValidateOptionsResult.Success
: ValidateOptionsResult.Fail(failures);
}
private static void ValidateAuthentication(AuthenticationOptions options, ValidationBuilder builder)
private static void ValidateAuthentication(AuthenticationOptions options, List<string> failures)
{
if (!Enum.IsDefined(options.Mode))
{
builder.Add("MxGateway:Authentication:Mode must be a supported authentication mode.");
failures.Add("MxGateway:Authentication:Mode must be a supported authentication mode.");
return;
}
@@ -40,67 +45,67 @@ public sealed class GatewayOptionsValidator : OptionsValidatorBase<GatewayOption
AddIfBlank(
options.SqlitePath,
"MxGateway:Authentication:SqlitePath is required when API-key authentication is enabled.",
builder);
failures);
AddIfInvalidPath(
options.SqlitePath,
"MxGateway:Authentication:SqlitePath must be a valid filesystem path.",
builder);
failures);
AddIfBlank(
options.PepperSecretName,
"MxGateway:Authentication:PepperSecretName is required when API-key authentication is enabled.",
builder);
failures);
}
}
private static void ValidateLdap(LdapOptions options, ValidationBuilder builder)
private static void ValidateLdap(LdapOptions options, List<string> failures)
{
if (!options.Enabled)
{
return;
}
AddIfBlank(options.Server, "MxGateway:Ldap:Server is required when LDAP login is enabled.", builder);
AddIfBlank(options.SearchBase, "MxGateway:Ldap:SearchBase is required when LDAP login is enabled.", builder);
AddIfBlank(options.Server, "MxGateway:Ldap:Server is required when LDAP login is enabled.", failures);
AddIfBlank(options.SearchBase, "MxGateway:Ldap:SearchBase is required when LDAP login is enabled.", failures);
AddIfBlank(
options.ServiceAccountDn,
"MxGateway:Ldap:ServiceAccountDn is required when LDAP login is enabled.",
builder);
failures);
AddIfBlank(
options.ServiceAccountPassword,
"MxGateway:Ldap:ServiceAccountPassword is required when LDAP login is enabled.",
builder);
failures);
AddIfBlank(
options.UserNameAttribute,
"MxGateway:Ldap:UserNameAttribute is required when LDAP login is enabled.",
builder);
failures);
AddIfBlank(
options.DisplayNameAttribute,
"MxGateway:Ldap:DisplayNameAttribute is required when LDAP login is enabled.",
builder);
failures);
AddIfBlank(
options.GroupAttribute,
"MxGateway:Ldap:GroupAttribute is required when LDAP login is enabled.",
builder);
builder.Port(options.Port, "MxGateway:Ldap:Port");
failures);
AddIfNotPositive(options.Port, "MxGateway:Ldap:Port must be greater than zero.", failures);
if (options.Transport == LdapTransport.None && !options.AllowInsecure)
if (!options.UseTls && !options.AllowInsecureLdap)
{
builder.Add("MxGateway:Ldap:AllowInsecure must be true when Transport is None (plaintext).");
failures.Add("MxGateway:Ldap:AllowInsecureLdap must be true when UseTls is false.");
}
}
private static void ValidateWorker(WorkerOptions options, ValidationBuilder builder)
private static void ValidateWorker(WorkerOptions options, List<string> failures)
{
AddIfBlank(options.ExecutablePath, "MxGateway:Worker:ExecutablePath is required.", builder);
AddIfBlank(options.ExecutablePath, "MxGateway:Worker:ExecutablePath is required.", failures);
AddIfInvalidPath(
options.ExecutablePath,
"MxGateway:Worker:ExecutablePath must be a valid filesystem path.",
builder);
failures);
if (!string.IsNullOrWhiteSpace(options.ExecutablePath)
&& !string.Equals(Path.GetExtension(options.ExecutablePath), ".exe", StringComparison.OrdinalIgnoreCase))
{
builder.Add("MxGateway:Worker:ExecutablePath must point to a .exe file.");
failures.Add("MxGateway:Worker:ExecutablePath must point to a .exe file.");
}
if (!string.IsNullOrWhiteSpace(options.WorkingDirectory))
@@ -108,94 +113,94 @@ public sealed class GatewayOptionsValidator : OptionsValidatorBase<GatewayOption
AddIfInvalidPath(
options.WorkingDirectory,
"MxGateway:Worker:WorkingDirectory must be a valid filesystem path.",
builder);
failures);
}
if (!Enum.IsDefined(options.RequiredArchitecture))
{
builder.Add("MxGateway:Worker:RequiredArchitecture must be a supported worker architecture.");
failures.Add("MxGateway:Worker:RequiredArchitecture must be a supported worker architecture.");
}
AddIfNotPositive(
options.StartupTimeoutSeconds,
"MxGateway:Worker:StartupTimeoutSeconds must be greater than zero.",
builder);
failures);
AddIfNotPositive(
options.StartupProbeRetryAttempts,
"MxGateway:Worker:StartupProbeRetryAttempts must be greater than zero.",
builder);
failures);
AddIfNotPositive(
options.StartupProbeRetryDelayMilliseconds,
"MxGateway:Worker:StartupProbeRetryDelayMilliseconds must be greater than zero.",
builder);
failures);
AddIfNotPositive(
options.PipeConnectAttemptTimeoutMilliseconds,
"MxGateway:Worker:PipeConnectAttemptTimeoutMilliseconds must be greater than zero.",
builder);
failures);
AddIfNotPositive(
options.ShutdownTimeoutSeconds,
"MxGateway:Worker:ShutdownTimeoutSeconds must be greater than zero.",
builder);
failures);
AddIfNotPositive(
options.HeartbeatIntervalSeconds,
"MxGateway:Worker:HeartbeatIntervalSeconds must be greater than zero.",
builder);
failures);
AddIfNotPositive(
options.HeartbeatGraceSeconds,
"MxGateway:Worker:HeartbeatGraceSeconds must be greater than zero.",
builder);
failures);
if (options.HeartbeatGraceSeconds < options.HeartbeatIntervalSeconds)
{
builder.Add(
failures.Add(
"MxGateway:Worker:HeartbeatGraceSeconds must be greater than or equal to HeartbeatIntervalSeconds.");
}
if (options.MaxMessageBytes is < MinimumMaxMessageBytes or > MaximumMaxMessageBytes)
{
builder.Add(
failures.Add(
$"MxGateway:Worker:MaxMessageBytes must be between {MinimumMaxMessageBytes} and {MaximumMaxMessageBytes}.");
}
}
private static void ValidateSessions(SessionOptions options, ValidationBuilder builder)
private static void ValidateSessions(SessionOptions options, List<string> failures)
{
AddIfNotPositive(
options.DefaultCommandTimeoutSeconds,
"MxGateway:Sessions:DefaultCommandTimeoutSeconds must be greater than zero.",
builder);
AddIfNotPositive(options.MaxSessions, "MxGateway:Sessions:MaxSessions must be greater than zero.", builder);
failures);
AddIfNotPositive(options.MaxSessions, "MxGateway:Sessions:MaxSessions must be greater than zero.", failures);
AddIfNotPositive(
options.MaxPendingCommandsPerSession,
"MxGateway:Sessions:MaxPendingCommandsPerSession must be greater than zero.",
builder);
failures);
AddIfNotPositive(
options.DefaultLeaseSeconds,
"MxGateway:Sessions:DefaultLeaseSeconds must be greater than zero.",
builder);
failures);
AddIfNotPositive(
options.LeaseSweepIntervalSeconds,
"MxGateway:Sessions:LeaseSweepIntervalSeconds must be greater than zero.",
builder);
failures);
if (options.AllowMultipleEventSubscribers)
{
builder.Add(
failures.Add(
"MxGateway:Sessions:AllowMultipleEventSubscribers is not supported until event fan-out is implemented.");
}
}
private static void ValidateEvents(EventOptions options, ValidationBuilder builder)
private static void ValidateEvents(EventOptions options, List<string> failures)
{
AddIfNotPositive(options.QueueCapacity, "MxGateway:Events:QueueCapacity must be greater than zero.", builder);
AddIfNotPositive(options.QueueCapacity, "MxGateway:Events:QueueCapacity must be greater than zero.", failures);
if (!Enum.IsDefined(options.BackpressurePolicy))
{
builder.Add("MxGateway:Events:BackpressurePolicy must be a supported backpressure policy.");
failures.Add("MxGateway:Events:BackpressurePolicy must be a supported backpressure policy.");
}
}
private static void ValidateDashboard(DashboardOptions options, ValidationBuilder builder)
private static void ValidateDashboard(DashboardOptions options, List<string> failures)
{
// GroupToRole shape is validated even when the dashboard is disabled so
// misconfiguration surfaces at startup; emptiness is allowed, with the
@@ -206,13 +211,13 @@ public sealed class GatewayOptionsValidator : OptionsValidatorBase<GatewayOption
{
if (string.IsNullOrWhiteSpace(entry.Key))
{
builder.Add("MxGateway:Dashboard:GroupToRole keys (LDAP group names) must be non-blank.");
failures.Add("MxGateway:Dashboard:GroupToRole keys (LDAP group names) must be non-blank.");
}
if (!string.Equals(entry.Value, Dashboard.DashboardRoles.Admin, StringComparison.Ordinal)
&& !string.Equals(entry.Value, Dashboard.DashboardRoles.Viewer, StringComparison.Ordinal))
{
builder.Add(
failures.Add(
$"MxGateway:Dashboard:GroupToRole['{entry.Key}'] must be '{Dashboard.DashboardRoles.Admin}' or '{Dashboard.DashboardRoles.Viewer}'.");
}
}
@@ -220,18 +225,18 @@ public sealed class GatewayOptionsValidator : OptionsValidatorBase<GatewayOption
AddIfNotPositive(
options.SnapshotIntervalMilliseconds,
"MxGateway:Dashboard:SnapshotIntervalMilliseconds must be greater than zero.",
builder);
failures);
AddIfNegative(
options.RecentFaultLimit,
"MxGateway:Dashboard:RecentFaultLimit must be greater than or equal to zero.",
builder);
failures);
AddIfNegative(
options.RecentSessionLimit,
"MxGateway:Dashboard:RecentSessionLimit must be greater than or equal to zero.",
builder);
failures);
}
private static void ValidateAlarms(AlarmsOptions options, ValidationBuilder builder)
private static void ValidateAlarms(AlarmsOptions options, List<string> failures)
{
if (!options.Enabled)
{
@@ -245,79 +250,58 @@ public sealed class GatewayOptionsValidator : OptionsValidatorBase<GatewayOption
if (string.IsNullOrWhiteSpace(options.SubscriptionExpression)
&& string.IsNullOrWhiteSpace(options.DefaultArea))
{
builder.Add(
failures.Add(
"MxGateway:Alarms requires either a non-blank SubscriptionExpression or a non-blank DefaultArea when Enabled is true.");
}
if (!string.IsNullOrWhiteSpace(options.SubscriptionExpression)
&& !options.SubscriptionExpression.StartsWith(@"\\", StringComparison.Ordinal))
{
builder.Add(
failures.Add(
@"MxGateway:Alarms:SubscriptionExpression must start with '\\' (canonical \\<host>\Galaxy!<area> shape).");
}
}
private const int MinimumCertValidityYears = 1;
private const int MaximumCertValidityYears = 100;
private static void ValidateTls(TlsOptions options, ValidationBuilder builder)
{
if (options.ValidityYears is < MinimumCertValidityYears or > MaximumCertValidityYears)
{
builder.Add(
$"MxGateway:Tls:ValidityYears must be between {MinimumCertValidityYears} and {MaximumCertValidityYears}.");
}
// The default is non-blank, so this only catches an explicitly-blanked path.
AddIfBlank(
options.SelfSignedCertPath,
"MxGateway:Tls:SelfSignedCertPath must not be blank.",
builder);
AddIfInvalidPath(
options.SelfSignedCertPath,
"MxGateway:Tls:SelfSignedCertPath must be a valid filesystem path.",
builder);
foreach (string dns in options.AdditionalDnsNames)
{
if (string.IsNullOrWhiteSpace(dns))
{
builder.Add("MxGateway:Tls:AdditionalDnsNames entries must be non-blank.");
}
}
}
private static void ValidateProtocol(ProtocolOptions options, ValidationBuilder builder)
private static void ValidateProtocol(ProtocolOptions options, List<string> failures)
{
if (options.WorkerProtocolVersion != GatewayContractInfo.WorkerProtocolVersion)
{
builder.Add(
failures.Add(
$"MxGateway:Protocol:WorkerProtocolVersion must be {GatewayContractInfo.WorkerProtocolVersion}.");
}
if (options.MaxGrpcMessageBytes is < MinimumMaxMessageBytes or > MaximumMaxMessageBytes)
{
builder.Add(
failures.Add(
$"MxGateway:Protocol:MaxGrpcMessageBytes must be between {MinimumMaxMessageBytes} and {MaximumMaxMessageBytes}.");
}
}
private static void AddIfBlank(string? value, string message, ValidationBuilder builder)
private static void AddIfBlank(string? value, string message, List<string> failures)
{
builder.RequireThat(!string.IsNullOrWhiteSpace(value), message);
if (string.IsNullOrWhiteSpace(value))
{
failures.Add(message);
}
}
private static void AddIfNotPositive(int value, string message, ValidationBuilder builder)
private static void AddIfNotPositive(int value, string message, List<string> failures)
{
builder.RequireThat(value > 0, message);
if (value <= 0)
{
failures.Add(message);
}
}
private static void AddIfNegative(int value, string message, ValidationBuilder builder)
private static void AddIfNegative(int value, string message, List<string> failures)
{
builder.RequireThat(value >= 0, message);
if (value < 0)
{
failures.Add(message);
}
}
private static void AddIfInvalidPath(string? value, string message, ValidationBuilder builder)
private static void AddIfInvalidPath(string? value, string message, List<string> failures)
{
if (string.IsNullOrWhiteSpace(value))
{
@@ -330,19 +314,15 @@ public sealed class GatewayOptionsValidator : OptionsValidatorBase<GatewayOption
}
catch (ArgumentException)
{
builder.Add(message);
failures.Add(message);
}
catch (NotSupportedException)
{
builder.Add(message);
failures.Add(message);
}
catch (PathTooLongException)
{
builder.Add(message);
}
catch (IOException)
{
builder.Add(message);
failures.Add(message);
}
}
}
@@ -1,32 +1,5 @@
using ZB.MOM.WW.Auth.Abstractions.Ldap;
namespace ZB.MOM.WW.MxGateway.Server.Configuration;
/// <summary>
/// Gateway-side view of the <c>MxGateway:Ldap</c> section. This is a SHADOW of the
/// shared <see cref="ZB.MOM.WW.Auth.Abstractions.Ldap.LdapOptions"/> type and is NOT
/// used to perform LDAP authentication at runtime — runtime bind/search is done by the
/// shared <c>ZB.MOM.WW.Auth.Ldap</c> provider, whose options are bound directly from the
/// same <c>MxGateway:Ldap</c> section by <c>AddZbLdapAuth</c> (see
/// <see cref="ZB.MOM.WW.MxGateway.Server.Dashboard.DashboardServiceCollectionExtensions"/>).
/// <para>
/// This shadow exists for three things only: (1) startup validation via
/// <see cref="GatewayOptionsValidator"/>; (2) the redacted effective-config display
/// (<see cref="EffectiveLdapConfiguration"/> / <see cref="GatewayConfigurationProvider"/>);
/// and (3) it is the single home of the gateway's dev/default LDAP values, which the
/// integration live-test helper copies onto the shared options.
/// </para>
/// <para>
/// Review C2 — DRIFT WARNING: this class MUST stay field-compatible with the shared
/// <see cref="ZB.MOM.WW.Auth.Abstractions.Ldap.LdapOptions"/> so the one config section
/// binds cleanly onto both. The two are intentionally NOT merged because their defaults
/// differ on purpose: this shadow ships dev-friendly defaults (plaintext localhost,
/// <c>AllowInsecure=true</c>, populated <c>SearchBase</c>/<c>ServiceAccount*</c>), whereas
/// the shared type is secure-by-default (<c>Transport=Ldaps</c>, <c>AllowInsecure=false</c>,
/// empty DN fields). If you add/rename/remove a field on the shared type, mirror it here
/// (and in the validator + effective-config) so the section keeps binding to both.
/// </para>
/// </summary>
public sealed class LdapOptions
{
/// <summary>Gets a value indicating whether LDAP authentication is enabled.</summary>
@@ -38,24 +11,17 @@ public sealed class LdapOptions
/// <summary>Gets the LDAP server port.</summary>
public int Port { get; init; } = 3893;
/// <summary>
/// Gets the transport/TLS mode for the LDAP connection. Replaces the former
/// boolean <c>UseTls</c> (true ≈ <see cref="LdapTransport.Ldaps"/>, false =
/// <see cref="LdapTransport.None"/>). <see cref="LdapTransport.StartTls"/> upgrades
/// a plaintext connection to TLS. Matches the shared
/// <see cref="ZB.MOM.WW.Auth.Abstractions.Ldap.LdapOptions.Transport"/> field so the
/// <c>MxGateway:Ldap</c> section binds straight onto the shared options.
/// </summary>
public LdapTransport Transport { get; init; } = LdapTransport.None;
/// <summary>Gets a value indicating whether TLS is required for the connection.</summary>
public bool UseTls { get; init; }
/// <summary>Gets a value indicating whether insecure (plaintext) LDAP connections are allowed.</summary>
public bool AllowInsecure { get; init; } = true;
/// <summary>Gets a value indicating whether insecure LDAP connections are allowed.</summary>
public bool AllowInsecureLdap { get; init; } = true;
/// <summary>Gets the LDAP search base distinguished name.</summary>
public string SearchBase { get; init; } = "dc=zb,dc=local";
public string SearchBase { get; init; } = "dc=lmxopcua,dc=local";
/// <summary>Gets the service account distinguished name.</summary>
public string ServiceAccountDn { get; init; } = "cn=serviceaccount,dc=zb,dc=local";
public string ServiceAccountDn { get; init; } = "cn=serviceaccount,dc=lmxopcua,dc=local";
/// <summary>Gets the service account password.</summary>
public string ServiceAccountPassword { get; init; } = "serviceaccount123";
@@ -1,22 +0,0 @@
namespace ZB.MOM.WW.MxGateway.Server.Configuration;
/// <summary>
/// Options controlling the gateway's self-signed certificate auto-generation.
/// Only consulted when a Kestrel HTTPS endpoint is configured without its own
/// certificate; plaintext deployments never trigger generation.
/// </summary>
public sealed class TlsOptions
{
/// <summary>Path to the persisted self-signed PFX. Reused across restarts.</summary>
public string SelfSignedCertPath { get; init; } =
@"C:\ProgramData\MxGateway\certs\gateway-selfsigned.pfx";
/// <summary>Lifetime in years of a freshly generated certificate.</summary>
public int ValidityYears { get; init; } = 10;
/// <summary>Extra DNS SANs to embed (e.g. a load-balancer name).</summary>
public IReadOnlyList<string> AdditionalDnsNames { get; init; } = [];
/// <summary>Regenerate the persisted certificate when it has expired.</summary>
public bool RegenerateIfExpired { get; init; } = true;
}
@@ -5,14 +5,14 @@
<meta name="viewport" content="width=device-width, initial-scale=1" />
<base href="/" />
<link rel="stylesheet" href="/lib/bootstrap/css/bootstrap.min.css" />
<ThemeHead />
<link rel="stylesheet" href="/css/theme.css" />
<link rel="stylesheet" href="/css/site.css" />
<HeadOutlet @rendermode="InteractiveServer" />
</head>
<body class="dashboard-body">
<Routes @rendermode="InteractiveServer" />
<script src="/lib/bootstrap/js/bootstrap.bundle.min.js"></script>
<ThemeScripts />
<script src="/js/nav-state.js"></script>
<script src="/_framework/blazor.web.js"></script>
</body>
</html>
@@ -1,6 +0,0 @@
@inherits LayoutComponentBase
@* Minimal layout for the login page: no side rail, no brand block. The page
renders its own centred card via the shared kit's <LoginCard>. Mirrors
OtOpcUa AdminUI's LoginLayout. *@
@Body
@@ -1,40 +1,210 @@
@using System.Linq
@using Microsoft.AspNetCore.Components.Routing
@using Microsoft.JSInterop
@implements IDisposable
@inherits LayoutComponentBase
@inject NavigationManager Navigation
@inject IJSRuntime JS
@* Thin layout: delegates the side-rail chassis (hamburger, brand, responsive
collapse) to the shared ZB.MOM.WW.Theme <ThemeShell>. The nav is reproduced
with the kit's NavRailSection / NavRailItem; section expand-state persistence
is owned by the kit's <details> + ThemeScripts (no JS interop here). *@
<ThemeShell Product="MXAccess Gateway" Accent="#2f5fd0">
<Nav>
<NavRailItem Href="/" Text="Dashboard" Match="NavLinkMatch.All" />
<NavRailSection Title="Runtime" Key="runtime">
<NavRailItem Href="/sessions" Text="Sessions" />
<NavRailItem Href="/workers" Text="Workers" />
<NavRailItem Href="/events" Text="Events" />
<NavRailItem Href="/alarms" Text="Alarms" />
</NavRailSection>
<NavRailSection Title="Galaxy" Key="galaxy">
<NavRailItem Href="/galaxy" Text="Repository" />
<NavRailItem Href="/browse" Text="Browse" />
</NavRailSection>
<NavRailSection Title="Admin" Key="admin">
<NavRailItem Href="/apikeys" Text="API Keys" />
<NavRailItem Href="/settings" Text="Settings" />
</NavRailSection>
</Nav>
<RailFooter>
<AuthorizeView>
<Authorized Context="authState">
<span class="rail-user">@authState.User.Identity?.Name</span>
<form method="post" action="/logout" data-enhance="false">
<AntiforgeryToken />
<button class="rail-btn" type="submit">Sign Out</button>
</form>
</Authorized>
<NotAuthorized>
<a class="rail-btn" href="/login">Sign In</a>
</NotAuthorized>
</AuthorizeView>
</RailFooter>
<ChildContent>@Body</ChildContent>
</ThemeShell>
<div class="d-flex flex-column flex-lg-row" style="min-height: 100vh;">
@* Hamburger toggle: visible only on viewports <lg. Bootstrap collapse JS
lives in bootstrap.bundle.min.js (loaded in App.razor). *@
<button class="btn btn-outline-secondary btn-sm d-lg-none m-2 align-self-start"
type="button"
data-bs-toggle="collapse"
data-bs-target="#sidebar-collapse"
aria-controls="sidebar-collapse"
aria-expanded="false"
aria-label="Toggle navigation">
&#9776;
</button>
<div class="collapse d-lg-block" id="sidebar-collapse">
<nav class="sidebar d-flex flex-column">
<a class="brand" href="/"><span class="mark">&#9646;</span> MXAccess Gateway</a>
<div style="overflow-y:auto; flex:1 1 auto; min-height:0;">
<ul class="nav flex-column">
<li class="nav-item">
<NavLink class="nav-link" href="/" Match="NavLinkMatch.All">Dashboard</NavLink>
</li>
<NavSection Title="Runtime"
Expanded="@_expanded.Contains("runtime")"
OnToggle="@(() => ToggleAsync("runtime"))">
<li class="nav-item">
<NavLink class="nav-link" href="/sessions" Match="NavLinkMatch.Prefix">Sessions</NavLink>
</li>
<li class="nav-item">
<NavLink class="nav-link" href="/workers" Match="NavLinkMatch.Prefix">Workers</NavLink>
</li>
<li class="nav-item">
<NavLink class="nav-link" href="/events" Match="NavLinkMatch.Prefix">Events</NavLink>
</li>
<li class="nav-item">
<NavLink class="nav-link" href="/alarms" Match="NavLinkMatch.Prefix">Alarms</NavLink>
</li>
</NavSection>
<NavSection Title="Galaxy"
Expanded="@_expanded.Contains("galaxy")"
OnToggle="@(() => ToggleAsync("galaxy"))">
<li class="nav-item">
<NavLink class="nav-link" href="/galaxy" Match="NavLinkMatch.Prefix">Repository</NavLink>
</li>
<li class="nav-item">
<NavLink class="nav-link" href="/browse" Match="NavLinkMatch.Prefix">Browse</NavLink>
</li>
</NavSection>
<NavSection Title="Admin"
Expanded="@_expanded.Contains("admin")"
OnToggle="@(() => ToggleAsync("admin"))">
<li class="nav-item">
<NavLink class="nav-link" href="/apikeys" Match="NavLinkMatch.Prefix">API Keys</NavLink>
</li>
<li class="nav-item">
<NavLink class="nav-link" href="/settings" Match="NavLinkMatch.Prefix">Settings</NavLink>
</li>
</NavSection>
</ul>
</div>
<AuthorizeView>
<Authorized Context="authState">
<div class="border-top px-3 py-2">
<div class="d-flex justify-content-between align-items-center">
<span class="text-body-secondary small">@authState.User.Identity?.Name</span>
<form method="post" action="/logout" data-enhance="false">
<AntiforgeryToken />
<button type="submit" class="btn btn-outline-secondary btn-sm py-0 px-2">Sign Out</button>
</form>
</div>
</div>
</Authorized>
<NotAuthorized>
<div class="border-top px-3 py-2">
<a href="/login" class="btn btn-outline-secondary btn-sm py-0 px-2 w-100">Sign In</a>
</div>
</NotAuthorized>
</AuthorizeView>
</nav>
</div>
<main class="page flex-grow-1">
@Body
</main>
</div>
@code {
// Sections whose collapsed/expanded state we persist. Acts as the allow-list
// when parsing the cookie so stale or attacker-supplied ids are ignored.
private static readonly string[] SectionIds = { "runtime", "galaxy", "admin" };
// The currently-expanded sections. Populated from the cookie on first
// render; mutated by ToggleAsync and by navigating into a section.
private readonly HashSet<string> _expanded = new(StringComparer.Ordinal);
protected override void OnInitialized()
{
Navigation.LocationChanged += OnLocationChanged;
}
protected override async Task OnAfterRenderAsync(bool firstRender)
{
if (!firstRender)
{
return;
}
// Hydrate from the cookie. Until this completes the sidebar paints
// collapsed, matching the CentralUI behaviour.
string saved;
try
{
saved = await JS.InvokeAsync<string>("navState.get") ?? string.Empty;
}
catch (JSDisconnectedException)
{
return;
}
foreach (var id in saved.Split(
',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries))
{
if (Array.IndexOf(SectionIds, id) >= 0)
{
_expanded.Add(id);
}
}
// The section of the page we loaded on is always expanded.
if (EnsureCurrentSectionExpanded())
{
await PersistAsync();
}
StateHasChanged();
}
private void OnLocationChanged(object? sender, LocationChangedEventArgs e)
{
if (EnsureCurrentSectionExpanded())
{
_ = PersistAsync();
_ = InvokeAsync(StateHasChanged);
}
}
private async Task ToggleAsync(string id)
{
if (!_expanded.Remove(id))
{
_expanded.Add(id);
}
await PersistAsync();
}
// Adds the current page's section to _expanded; returns true if it changed.
private bool EnsureCurrentSectionExpanded()
{
var section = CurrentSection();
return section is not null && _expanded.Add(section);
}
// Maps the current URL's first path segment to a section id, or null for
// sectionless pages (Dashboard, Login).
private string? CurrentSection()
{
var relative = Navigation.ToBaseRelativePath(Navigation.Uri);
var firstSegment = relative.Split('?', '#')[0]
.Split('/', StringSplitOptions.RemoveEmptyEntries)
.FirstOrDefault();
return firstSegment switch
{
"sessions" or "workers" or "events" or "alarms" => "runtime",
"galaxy" or "browse" => "galaxy",
"apikeys" or "settings" => "admin",
_ => null,
};
}
private async Task PersistAsync()
{
try
{
await JS.InvokeVoidAsync("navState.set", string.Join(',', _expanded));
}
catch (JSDisconnectedException)
{
// The circuit is gone — nothing to persist to.
}
}
public void Dispose()
{
Navigation.LocationChanged -= OnLocationChanged;
}
}
@@ -0,0 +1,35 @@
@* A collapsible sidebar nav section. The header is a full-width button that
toggles ChildContent visibility. Pattern lifted from ScadaLink CentralUI
(Components/Layout/NavSection.razor) — see [[project-deployed-service]]. *@
<li class="nav-item">
<button type="button"
class="nav-section-toggle"
@onclick="OnToggle"
aria-expanded="@(Expanded ? "true" : "false")">
<span class="chevron" aria-hidden="true">@(Expanded ? "▾" : "▸")</span>
<span>@Title</span>
</button>
</li>
@if (Expanded)
{
@ChildContent
}
@code {
/// <summary>Section label shown in the header (e.g. "Runtime").</summary>
[Parameter, EditorRequired]
public string Title { get; set; } = string.Empty;
/// <summary>Whether the section is expanded — its items rendered.</summary>
[Parameter]
public bool Expanded { get; set; }
/// <summary>Raised when the header button is clicked.</summary>
[Parameter]
public EventCallback OnToggle { get; set; }
/// <summary>The section's nav items, rendered only while expanded.</summary>
[Parameter]
public RenderFragment? ChildContent { get; set; }
}
@@ -1,33 +0,0 @@
@page "/login"
@layout LoginLayout
@using Microsoft.AspNetCore.Authorization
@* Login MUST stay anonymously reachable — [AllowAnonymous] overrides the
RequireAuthorization(ViewerPolicy) that MapRazorComponents<App>() applies, so the
cookie scheme's LoginPath="/login" redirect lands here for unauthenticated users.
The card is the shared kit's <LoginCard>: it renders a NATIVE static
<form method="post" action="/auth/login"> (username/password + hidden returnUrl). A native
form submit is not a Blazor event, so it reaches the minimal-API POST /auth/login endpoint
regardless of this app's InteractiveServer render mode. <AntiforgeryToken/> supplies the
token that PostLoginAsync's antiforgery.ValidateRequestAsync checks.
NOTE: the POST target is /auth/login, NOT /login. This @page lives at "/login" and the
Razor Components endpoint matches ALL methods, so a POST to /login collided with the
minimal-API MapPost("/login") and threw AmbiguousMatchException (HTTP 500). Posting to a
distinct /auth/login path (mirroring ScadaBridge) keeps the GET page and POST handler from
sharing a route. *@
@attribute [AllowAnonymous]
<LoginCard Product="MXAccess Gateway" Action="/auth/login" ReturnUrl="@ReturnUrl" Error="@Error">
<AntiforgeryToken />
</LoginCard>
@code {
/// <summary>Original protected URL the operator was bounced from; round-tripped to POST /login.</summary>
[SupplyParameterFromQuery(Name = "returnUrl")]
private string? ReturnUrl { get; set; }
/// <summary>Failure message surfaced by POST /login after a failed authentication.</summary>
[SupplyParameterFromQuery(Name = "error")]
private string? Error { get; set; }
}
@@ -26,7 +26,7 @@ else
<tr><th scope="row">Run migrations</th><td>@Snapshot.Configuration.Authentication.RunMigrationsOnStartup</td></tr>
<tr><th scope="row">LDAP enabled</th><td>@Snapshot.Configuration.Ldap.Enabled</td></tr>
<tr><th scope="row">LDAP server</th><td>@Snapshot.Configuration.Ldap.Server:@Snapshot.Configuration.Ldap.Port</td></tr>
<tr><th scope="row">LDAP transport</th><td>@Snapshot.Configuration.Ldap.Transport</td></tr>
<tr><th scope="row">LDAP TLS</th><td>@Snapshot.Configuration.Ldap.UseTls</td></tr>
<tr><th scope="row">LDAP search base</th><td><code>@Snapshot.Configuration.Ldap.SearchBase</code></td></tr>
<tr><th scope="row">LDAP service account</th><td><code>@Snapshot.Configuration.Ldap.ServiceAccountDn</code></td></tr>
<tr><th scope="row">LDAP service password</th><td>@Snapshot.Configuration.Ldap.ServiceAccountPassword</td></tr>
@@ -1,16 +1,16 @@
@* Thin adapter: maps MxGateway runtime state text → kit StatusPill state.
The bespoke .chip rendering now lives in the kit; only the app's domain
text→state vocabulary remains here. Call sites (<StatusBadge Text="..."/>) unchanged. *@
<StatusPill State="MapState(Text)">@Text</StatusPill>
@code {
[Parameter] public string? Text { get; set; }
<span class="chip @CssClass">@Text</span>
private static StatusState MapState(string? text) => text switch
@code {
[Parameter]
public string? Text { get; set; }
private string CssClass => Text switch
{
"Ready" or "Healthy" or "Active" => StatusState.Ok,
"Creating" or "StartingWorker" or "WaitingForPipe" or "InitializingWorker" or "Closing"
or "Stale" or "Degraded" => StatusState.Warn,
"Faulted" or "Unavailable" => StatusState.Bad,
_ => StatusState.Idle,
"Ready" or "Healthy" or "Active" => "chip-ok",
"Creating" or "StartingWorker" or "WaitingForPipe" or "InitializingWorker" or "Closing" => "chip-warn",
"Stale" or "Degraded" => "chip-warn",
"Faulted" or "Unavailable" => "chip-bad",
"Closed" or "Revoked" or "Unknown" => "chip-idle",
_ => "chip-idle"
};
}
@@ -10,5 +10,4 @@
@using ZB.MOM.WW.MxGateway.Server.Dashboard.Components.Shared
@using ZB.MOM.WW.MxGateway.Server.Security.Authorization
@using ZB.MOM.WW.MxGateway.Server.Workers
@using ZB.MOM.WW.Theme
@using static Microsoft.AspNetCore.Components.Web.RenderMode
@@ -1,10 +1,5 @@
using System.Security.Claims;
using System.Text.Json;
using Microsoft.Data.Sqlite;
using ZB.MOM.WW.Audit;
using ZB.MOM.WW.Auth.Abstractions.ApiKeys;
using ZB.MOM.WW.Auth.ApiKeys.Admin;
using ZB.MOM.WW.MxGateway.Server.Security.Audit;
using ZB.MOM.WW.MxGateway.Server.Security.Authentication;
using ZB.MOM.WW.MxGateway.Server.Security.Authorization;
@@ -12,13 +7,12 @@ namespace ZB.MOM.WW.MxGateway.Server.Dashboard;
public sealed class DashboardApiKeyManagementService(
DashboardApiKeyAuthorization authorization,
ApiKeyAdminCommands adminCommands,
IApiKeyAdminStore adminStore,
IAuditWriter auditWriter,
IApiKeyAuditStore auditStore,
IApiKeySecretHasher hasher,
IHttpContextAccessor httpContextAccessor) : IDashboardApiKeyManagementService
{
private const string UnauthorizedMessage = "Sign in with an authorized LDAP account to manage API keys.";
private const string PepperUnavailableMarker = "pepper unavailable";
/// <summary>Determines whether the user can manage API keys.</summary>
/// <param name="user">The authenticated user principal.</param>
@@ -48,31 +42,28 @@ public sealed class DashboardApiKeyManagementService(
}
string keyId = request.KeyId.Trim();
string secret = ApiKeySecretGenerator.Generate();
string apiKey = FormatApiKey(keyId, secret);
try
{
// The shared command set generates the secret, hashes it with the pepper, persists the
// record and assembles the mxgw_<id>_<secret> token (shown once). It also appends its own
// "create-key" audit entry (now canonicalized through the IApiKeyAuditStore->IAuditWriter
// adapter); the dashboard layers a richer "dashboard-create-key" canonical AuditEvent
// (Target + CorrelationId + remote address) on top via IAuditWriter to preserve the
// dashboard audit vocabulary — both rows land in the canonical audit_event store.
CreateKeyResult created = await adminCommands.CreateKeyAsync(
keyId,
request.DisplayName.Trim(),
request.Scopes,
ApiKeyConstraintSerializer.Serialize(request.Constraints),
RemoteAddress(),
await adminStore.CreateAsync(
new ApiKeyCreateRequest(
KeyId: keyId,
KeyPrefix: $"mxgw_{keyId}",
SecretHash: hasher.HashSecret(secret),
DisplayName: request.DisplayName.Trim(),
Scopes: request.Scopes,
Constraints: request.Constraints,
CreatedUtc: DateTimeOffset.UtcNow),
cancellationToken)
.ConfigureAwait(false);
await WriteDashboardAuditAsync(user, keyId, "dashboard-create-key", null, cancellationToken).ConfigureAwait(false);
await AppendAuditAsync(keyId, "dashboard-create-key", null, cancellationToken).ConfigureAwait(false);
return DashboardApiKeyManagementResult.Success(
"API key created. Copy the key now; it will not be shown again.",
created.Token);
return DashboardApiKeyManagementResult.Success("API key created. Copy the key now; it will not be shown again.", apiKey);
}
catch (InvalidOperationException exception) when (IsPepperUnavailable(exception))
catch (ApiKeyPepperUnavailableException)
{
return DashboardApiKeyManagementResult.Fail("API key pepper is not configured.");
}
@@ -103,19 +94,18 @@ public sealed class DashboardApiKeyManagementService(
}
string normalizedKeyId = keyId.Trim();
KeyActionResult result = await adminCommands
.RevokeKeyAsync(normalizedKeyId, RemoteAddress(), cancellationToken)
bool revoked = await adminStore
.RevokeAsync(normalizedKeyId, DateTimeOffset.UtcNow, cancellationToken)
.ConfigureAwait(false);
await WriteDashboardAuditAsync(
user,
await AppendAuditAsync(
normalizedKeyId,
"dashboard-revoke-key",
result.Succeeded ? "revoked" : "not-found-or-already-revoked",
revoked ? "revoked" : "not-found-or-already-revoked",
cancellationToken)
.ConfigureAwait(false);
return result.Succeeded
return revoked
? DashboardApiKeyManagementResult.Success("API key revoked.")
: DashboardApiKeyManagementResult.Fail("API key was not found or is already revoked.");
}
@@ -141,30 +131,27 @@ public sealed class DashboardApiKeyManagementService(
}
string normalizedKeyId = keyId.Trim();
string secret = ApiKeySecretGenerator.Generate();
string apiKey = FormatApiKey(normalizedKeyId, secret);
try
{
CreateKeyResult rotated = await adminCommands
.RotateKeyAsync(normalizedKeyId, RemoteAddress(), cancellationToken)
bool rotated = await adminStore
.RotateAsync(normalizedKeyId, hasher.HashSecret(secret), DateTimeOffset.UtcNow, cancellationToken)
.ConfigureAwait(false);
bool succeeded = rotated.Token is not null;
await WriteDashboardAuditAsync(
user,
await AppendAuditAsync(
normalizedKeyId,
"dashboard-rotate-key",
succeeded ? "rotated" : "not-found",
rotated ? "rotated" : "not-found",
cancellationToken)
.ConfigureAwait(false);
return succeeded
? DashboardApiKeyManagementResult.Success(
"API key rotated. Copy the key now; it will not be shown again.",
rotated.Token)
return rotated
? DashboardApiKeyManagementResult.Success("API key rotated. Copy the key now; it will not be shown again.", apiKey)
: DashboardApiKeyManagementResult.Fail("API key was not found.");
}
catch (InvalidOperationException exception) when (IsPepperUnavailable(exception))
catch (ApiKeyPepperUnavailableException)
{
return DashboardApiKeyManagementResult.Fail("API key pepper is not configured.");
}
@@ -195,8 +182,7 @@ public sealed class DashboardApiKeyManagementService(
.DeleteAsync(normalizedKeyId, cancellationToken)
.ConfigureAwait(false);
await WriteDashboardAuditAsync(
user,
await AppendAuditAsync(
normalizedKeyId,
"dashboard-delete-key",
deleted ? "deleted" : "not-found-or-active",
@@ -208,92 +194,22 @@ public sealed class DashboardApiKeyManagementService(
: DashboardApiKeyManagementResult.Fail("API key was not found, or is still active. Revoke it before deleting.");
}
private string? RemoteAddress() =>
httpContextAccessor.HttpContext?.Connection.RemoteIpAddress?.ToString();
/// <summary>
/// Resolves the operator's username from the authenticated dashboard principal.
/// </summary>
/// <remarks>
/// The passed <paramref name="user"/> is preferred over the ambient HTTP context because it
/// is already in scope at every call site (the callers gate on <see cref="CanManage"/> using
/// it) and is unambiguous. Falls back to <see cref="IAuditActorAccessor.CurrentActor"/> for
/// defensive coverage, then to <c>"unknown"</c> when neither is available.
/// </remarks>
private static string ResolveOperatorActor(ClaimsPrincipal user)
{
// ZbClaimTypes.Username = "zb:username" — the canonical LDAP login name.
string? username = user.FindFirstValue(ZB.MOM.WW.Auth.AspNetCore.ZbClaimTypes.Username);
if (!string.IsNullOrWhiteSpace(username))
{
return username;
}
// Framework fallback: Identity.Name is driven by the nameClaimType on the ClaimsIdentity
// (set to ZbClaimTypes.Name = ClaimTypes.Name by DashboardAuthenticator → display name).
string? identityName = user.Identity?.Name;
if (!string.IsNullOrWhiteSpace(identityName))
{
return identityName;
}
return "unknown";
}
/// <summary>
/// Emits the dashboard's own canonical <see cref="AuditEvent"/> for a <c>dashboard-*</c> op
/// directly through the best-effort <see cref="IAuditWriter"/> (Task 2.3 #6). This is in
/// addition to the <c>create/revoke/rotate-key</c> event that <see cref="ApiKeyAdminCommands"/>
/// emits via the canonical-forwarding <c>IApiKeyAuditStore</c> adapter — the doubled-audit
/// behaviour is preserved, both rows now land in the canonical <c>audit_event</c> store.
/// </summary>
/// <remarks>
/// Phase 3 (Actor = operator principal): <c>Actor</c> is the LDAP operator who performed the
/// action (resolved from the <paramref name="user"/> principal); <c>Target</c> is the managed
/// API key id. This fixes the pre-Phase-3 semantic gap where both fields held the keyId.
/// </remarks>
private async Task WriteDashboardAuditAsync(
ClaimsPrincipal user,
string keyId,
string action,
string? detail,
private async Task AppendAuditAsync(
string? keyId,
string eventType,
string? details,
CancellationToken cancellationToken)
{
AuditEvent auditEvent = new()
{
EventId = Guid.NewGuid(),
OccurredAtUtc = DateTimeOffset.UtcNow,
Actor = ResolveOperatorActor(user),
Action = action,
Outcome = AuditOutcome.Success,
Category = CanonicalForwardingApiKeyAuditStore.ApiKeyCategory,
Target = keyId,
SourceNode = RemoteAddress(),
CorrelationId = ParseCorrelationId(),
DetailsJson = WrapDetail(detail),
};
await auditWriter.WriteAsync(auditEvent, cancellationToken).ConfigureAwait(false);
await auditStore.AppendAsync(
new ApiKeyAuditEntry(
KeyId: keyId,
EventType: eventType,
RemoteAddress: httpContextAccessor.HttpContext?.Connection.RemoteIpAddress?.ToString(),
Details: details),
cancellationToken)
.ConfigureAwait(false);
}
/// <summary>
/// Derives a correlation id from the ASP.NET Core request trace identifier when it is a
/// well-formed GUID; otherwise null (the default <c>HttpContext.TraceIdentifier</c> is the
/// connection:request form, not a GUID, so it correlates to null rather than fabricating one).
/// </summary>
private Guid? ParseCorrelationId() =>
Guid.TryParse(httpContextAccessor.HttpContext?.TraceIdentifier, out Guid correlationId)
? correlationId
: null;
private static string? WrapDetail(string? detail) =>
detail is null
? null
: JsonSerializer.Serialize(new Dictionary<string, string> { ["detail"] = detail });
private static bool IsPepperUnavailable(InvalidOperationException exception) =>
exception.Message.Contains(PepperUnavailableMarker, StringComparison.OrdinalIgnoreCase);
private static string? ValidateCreateRequest(DashboardApiKeyManagementRequest request)
{
string? keyIdValidation = ValidateKeyId(request.KeyId);
@@ -332,4 +248,9 @@ public sealed class DashboardApiKeyManagementService(
? null
: "API key id may contain only letters, numbers, periods, and hyphens.";
}
private static string FormatApiKey(string keyId, string secret)
{
return $"mxgw_{keyId}_{secret}";
}
}
@@ -1,26 +1,14 @@
using System.Security.Claims;
using ZB.MOM.WW.Auth.Abstractions.Ldap;
using ZB.MOM.WW.Auth.Abstractions.Roles;
using ZB.MOM.WW.Auth.AspNetCore;
using System.Text;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.MxGateway.Server.Configuration;
using ZB.MOM.WW.MxGateway.Server.Security.Authorization;
using Novell.Directory.Ldap;
namespace ZB.MOM.WW.MxGateway.Server.Dashboard;
/// <summary>
/// Authenticates interactive dashboard logins against LDAP. The bind/search
/// mechanics are delegated to the shared <see cref="ILdapAuthService"/>
/// (<c>ZB.MOM.WW.Auth.Ldap</c>), which performs bind-then-search, fails closed,
/// and never throws — returning the user's display name and LDAP groups on
/// success. This class keeps the dashboard-specific policy: groups are resolved
/// to dashboard roles via <see cref="IGroupRoleMapper{TRole}"/>, a login with no
/// matching role is denied, and the resulting <see cref="ClaimsPrincipal"/> is
/// shaped exactly as before (see <see cref="CreatePrincipal"/>).
/// </summary>
/// <param name="ldapAuthService">Shared LDAP bind-then-search provider.</param>
/// <param name="roleMapper">Maps LDAP groups to dashboard roles (Task 1.1 seam).</param>
/// <param name="logger">Logger for diagnostic, credential-free login outcomes.</param>
public sealed class DashboardAuthenticator(
ILdapAuthService ldapAuthService,
IGroupRoleMapper<string> roleMapper,
IOptions<GatewayOptions> options,
ILogger<DashboardAuthenticator> logger) : IDashboardAuthenticator
{
private const string GenericFailureMessage = "The username or password is invalid, or the user is not authorized.";
@@ -31,72 +19,240 @@ public sealed class DashboardAuthenticator(
string? password,
CancellationToken cancellationToken)
{
if (string.IsNullOrWhiteSpace(username) || string.IsNullOrWhiteSpace(password))
LdapOptions ldapOptions = options.Value.Ldap;
DashboardOptions dashboardOptions = options.Value.Dashboard;
if (!ldapOptions.Enabled
|| string.IsNullOrWhiteSpace(username)
|| string.IsNullOrWhiteSpace(password))
{
return DashboardAuthenticationResult.Fail(GenericFailureMessage);
}
if (!ldapOptions.UseTls && !ldapOptions.AllowInsecureLdap)
{
return DashboardAuthenticationResult.Fail(GenericFailureMessage);
}
string normalizedUsername = username.Trim();
// The shared service owns connect/bind/search and the fail-closed contract:
// it returns Fail(Disabled) when LDAP is off, enforces TLS-or-AllowInsecure via
// its startup validator, and never throws. We only translate its outcome into a
// dashboard principal here.
LdapAuthResult ldapResult = await ldapAuthService
.AuthenticateAsync(normalizedUsername, password, cancellationToken)
.ConfigureAwait(false);
if (!ldapResult.Succeeded)
try
{
return DashboardAuthenticationResult.Fail(GenericFailureMessage);
using LdapConnection connection = new();
connection.SecureSocketLayer = ldapOptions.UseTls;
await Task.Run(
() => connection.Connect(ldapOptions.Server, ldapOptions.Port),
cancellationToken)
.ConfigureAwait(false);
await BindServiceAccountAsync(connection, ldapOptions, cancellationToken).ConfigureAwait(false);
LdapEntry? candidate = await SearchUserAsync(
connection,
ldapOptions,
normalizedUsername,
cancellationToken)
.ConfigureAwait(false);
if (candidate is null)
{
return DashboardAuthenticationResult.Fail(GenericFailureMessage);
}
await Task.Run(
() => connection.Bind(candidate.Dn, password),
cancellationToken)
.ConfigureAwait(false);
await BindServiceAccountAsync(connection, ldapOptions, cancellationToken).ConfigureAwait(false);
LdapEntry? authenticatedEntry = await SearchUserAsync(
connection,
ldapOptions,
normalizedUsername,
cancellationToken)
.ConfigureAwait(false);
if (authenticatedEntry is null)
{
return DashboardAuthenticationResult.Fail(GenericFailureMessage);
}
string displayName = ReadAttribute(authenticatedEntry, ldapOptions.DisplayNameAttribute)
?? normalizedUsername;
IReadOnlyList<string> groups = ReadAttributeValues(authenticatedEntry, ldapOptions.GroupAttribute);
IReadOnlyList<string> roles = MapGroupsToRoles(groups, dashboardOptions.GroupToRole);
if (roles.Count == 0)
{
logger.LogInformation(
"LDAP dashboard login denied for user {User}: no GroupToRole mapping matched their LDAP groups.",
normalizedUsername);
return DashboardAuthenticationResult.Fail(GenericFailureMessage);
}
return DashboardAuthenticationResult.Success(CreatePrincipal(
normalizedUsername,
displayName,
groups,
roles));
}
GroupRoleMapping<string> mapping = await roleMapper
.MapAsync(ldapResult.Groups, cancellationToken)
.ConfigureAwait(false);
IReadOnlyList<string> roles = mapping.Roles;
if (roles.Count == 0)
catch (OperationCanceledException)
{
throw;
}
catch (LdapException ex)
{
// Preserve the long-standing "no roles matched -> login denied" rule.
logger.LogInformation(
"LDAP dashboard login denied for user {User}: no GroupToRole mapping matched their LDAP groups.",
ldapResult.Username);
"LDAP dashboard login rejected for user {User}: result code {ResultCode}.",
normalizedUsername,
ex.ResultCode);
return DashboardAuthenticationResult.Fail(GenericFailureMessage);
}
catch (Exception ex)
{
logger.LogError(ex, "Unexpected LDAP dashboard login error for user {User}.", normalizedUsername);
return DashboardAuthenticationResult.Success(CreatePrincipal(
ldapResult.Username,
ldapResult.DisplayName,
ldapResult.Groups,
roles));
return DashboardAuthenticationResult.Fail(GenericFailureMessage);
}
}
/// <summary>Escapes special characters in LDAP filter strings.</summary>
/// <param name="value">The string value to escape.</param>
internal static string EscapeLdapFilter(string value)
{
StringBuilder builder = new(value.Length);
foreach (char character in value)
{
builder.Append(character switch
{
'\\' => @"\5c",
'*' => @"\2a",
'(' => @"\28",
')' => @"\29",
'\0' => @"\00",
_ => character.ToString()
});
}
return builder.ToString();
}
/// <summary>
/// Builds the dashboard <see cref="ClaimsPrincipal"/> from the LDAP outcome.
/// Maps the user's LDAP groups to dashboard roles. A user can pick up
/// multiple roles; Admin and Viewer are the only legal values. Returns
/// an empty list when no group matches (caller rejects the login).
/// </summary>
/// <param name="username">
/// The (trimmed) login name. Emitted as <see cref="ClaimTypes.NameIdentifier"/> (kept for
/// back-compat reads) and as the canonical <see cref="ZbClaimTypes.Username"/> ("zb:username").
/// </param>
/// <param name="displayName">
/// The user's display name. Emitted as <see cref="ZbClaimTypes.Name"/> (= <see cref="ClaimTypes.Name"/>
/// so <c>Identity.Name</c> resolves) and as <see cref="ZbClaimTypes.DisplayName"/> ("zb:displayname")
/// for cross-app consistency.
/// </param>
/// <param name="groups">
/// The user's LDAP groups, as returned by <see cref="ILdapAuthService"/>. NOTE
/// (review C1): these are <b>already-normalized short RDN names</b> (e.g.
/// <c>GwAdmin</c>), not raw distinguished names. The shared
/// <c>ZB.MOM.WW.Auth.Ldap</c> provider strips each group DN to its first RDN
/// value before returning it, so the <see cref="DashboardAuthenticationDefaults.LdapGroupClaimType"/>
/// claim carries the short name. This differs from the pre-cutover behaviour,
/// which surfaced the raw <c>memberOf</c> values (full DNs) on the claim; the
/// claim is informational only (no policy or UI reads its value — authorization
/// is role-based), so the shape change is non-breaking for dashboard consumers.
/// </param>
/// <param name="roles">The dashboard roles resolved from <paramref name="groups"/>.</param>
/// <param name="groups">The collection of LDAP groups the user belongs to.</param>
/// <param name="groupToRole">The mapping from group names to dashboard role names.</param>
internal static IReadOnlyList<string> MapGroupsToRoles(
IEnumerable<string> groups,
IReadOnlyDictionary<string, string> groupToRole)
{
if (groupToRole.Count == 0)
{
return [];
}
HashSet<string> roles = new(StringComparer.Ordinal);
foreach (string group in groups)
{
string normalizedGroup = group.Trim();
// Lookup precedence (Server-040): the full literal group string is
// tried first; only if that misses do we fall back to the leading
// RDN value (e.g. "GwAdmin" extracted from
// "ou=GwAdmin,ou=groups,..."). The map's comparer is
// OrdinalIgnoreCase (see DashboardOptions.GroupToRole), so e.g.
// "GwAdmin" and "gwadmin" both match.
if (groupToRole.TryGetValue(normalizedGroup, out string? mapped)
|| groupToRole.TryGetValue(ExtractFirstRdnValue(normalizedGroup), out mapped))
{
roles.Add(mapped);
}
}
return [.. roles];
}
/// <summary>Extracts the first RDN value from a distinguished name.</summary>
/// <param name="distinguishedName">The LDAP distinguished name.</param>
internal static string ExtractFirstRdnValue(string distinguishedName)
{
int equalsIndex = distinguishedName.IndexOf('=');
if (equalsIndex < 0)
{
return distinguishedName;
}
int valueStart = equalsIndex + 1;
int commaIndex = distinguishedName.IndexOf(',', valueStart);
return commaIndex > valueStart
? distinguishedName[valueStart..commaIndex]
: distinguishedName[valueStart..];
}
private static Task BindServiceAccountAsync(
LdapConnection connection,
LdapOptions ldapOptions,
CancellationToken cancellationToken)
{
return Task.Run(
() => connection.Bind(ldapOptions.ServiceAccountDn, ldapOptions.ServiceAccountPassword),
cancellationToken);
}
private static async Task<LdapEntry?> SearchUserAsync(
LdapConnection connection,
LdapOptions ldapOptions,
string username,
CancellationToken cancellationToken)
{
string filter = $"({ldapOptions.UserNameAttribute}={EscapeLdapFilter(username)})";
ILdapSearchResults results = await Task.Run(
() => connection.Search(
ldapOptions.SearchBase,
LdapConnection.ScopeSub,
filter,
attrs: null,
typesOnly: false),
cancellationToken)
.ConfigureAwait(false);
LdapEntry? entry = null;
while (results.HasMore())
{
LdapEntry next = results.Next();
if (entry is not null)
{
return null;
}
entry = next;
}
return entry;
}
private static string? ReadAttribute(LdapEntry entry, string attributeName)
{
return ReadLdapAttribute(entry, attributeName)?.StringValue;
}
private static IReadOnlyList<string> ReadAttributeValues(LdapEntry entry, string attributeName)
{
LdapAttribute? attribute = ReadLdapAttribute(entry, attributeName);
return attribute?.StringValueArray ?? [];
}
private static LdapAttribute? ReadLdapAttribute(LdapEntry entry, string attributeName)
{
return entry.GetAttribute(attributeName)
?? entry.GetAttribute(attributeName.ToLowerInvariant())
?? entry.GetAttribute(attributeName.ToUpperInvariant());
}
private static ClaimsPrincipal CreatePrincipal(
string username,
string displayName,
@@ -105,21 +261,11 @@ public sealed class DashboardAuthenticator(
{
List<Claim> claims =
[
// Keep NameIdentifier so any existing read-site that uses it continues to work.
new Claim(ClaimTypes.NameIdentifier, username),
// Canonical login-username claim (Task 1.5).
new Claim(ZbClaimTypes.Username, username),
// ZbClaimTypes.Name == ClaimTypes.Name — drives Identity.Name resolution.
new Claim(ZbClaimTypes.Name, displayName),
// Canonical display-name claim for cross-app consistency (Task 1.5).
new Claim(ZbClaimTypes.DisplayName, displayName),
new Claim(ClaimTypes.Name, displayName),
];
// ZbClaimTypes.Role == ClaimTypes.Role — drives IsInRole and [Authorize(Roles=...)].
claims.AddRange(roles.Select(role => new Claim(ZbClaimTypes.Role, role)));
// Groups are short RDN names from ILdapAuthService (see param doc above), so
// this claim value is the short group name, not the original DN.
// LdapGroupClaimType is MxGateway-specific ("mxgateway:ldap_group") — no ZbClaimType for groups.
claims.AddRange(roles.Select(role => new Claim(ClaimTypes.Role, role)));
claims.AddRange(groups.Select(group => new Claim(
DashboardAuthenticationDefaults.LdapGroupClaimType,
group)));
@@ -127,8 +273,8 @@ public sealed class DashboardAuthenticator(
ClaimsIdentity claimsIdentity = new(
claims,
DashboardAuthenticationDefaults.AuthenticationScheme,
ZbClaimTypes.Name,
ZbClaimTypes.Role);
ClaimTypes.Name,
ClaimTypes.Role);
return new ClaimsPrincipal(claimsIdentity);
}
@@ -1,6 +1,7 @@
using System.Text.Encodings.Web;
using Microsoft.AspNetCore.Antiforgery;
using Microsoft.AspNetCore.Authentication;
using Microsoft.AspNetCore.Http.HttpResults;
using ZB.MOM.WW.MxGateway.Server.Configuration;
using ZB.MOM.WW.MxGateway.Server.Dashboard.Components;
using ZB.MOM.WW.MxGateway.Server.Dashboard.Hubs;
@@ -24,19 +25,14 @@ public static class DashboardEndpointRouteBuilderExtensions
return endpoints;
}
// GET /login is served by the [AllowAnonymous] Blazor <Login> component
// (Components/Pages/Login.razor → @page "/login"), which renders the shared
// kit's <LoginCard>. Its [AllowAnonymous] attribute overrides the
// RequireAuthorization(ViewerPolicy) that MapRazorComponents<App>() applies,
// so the cookie scheme's LoginPath="/login" redirect resolves for anonymous users.
//
// The credential POST is mapped to /auth/login, NOT /login. The @page "/login"
// Razor Components endpoint matches ALL HTTP methods, so a MapPost("/login") shared
// the "/login" route with it and every POST threw AmbiguousMatchException (HTTP 500).
// A distinct /auth/login path (as ScadaBridge does) keeps the GET page and the POST
// handler on separate routes. The <LoginCard Action="/auth/login"> form posts here.
endpoints.MapGet(
"/login",
(HttpContext httpContext, IAntiforgery antiforgery) => GetLoginAsync(httpContext, antiforgery))
.AllowAnonymous()
.WithName("DashboardLogin");
endpoints.MapPost(
"/auth/login",
"/login",
(HttpContext httpContext, IAntiforgery antiforgery, IDashboardAuthenticator authenticator) =>
PostLoginAsync(httpContext, antiforgery, authenticator))
.AllowAnonymous()
@@ -96,6 +92,17 @@ public static class DashboardEndpointRouteBuilderExtensions
return endpoints;
}
private static Task<ContentHttpResult> GetLoginAsync(
HttpContext httpContext,
IAntiforgery antiforgery)
{
string returnUrl = SanitizeReturnUrl(httpContext.Request.Query["returnUrl"].ToString());
return Task.FromResult(TypedResults.Content(
RenderLoginPage(httpContext, antiforgery, returnUrl, failureMessage: null),
"text/html"));
}
private static async Task<IResult> PostLoginAsync(
HttpContext httpContext,
IAntiforgery antiforgery,
@@ -117,13 +124,10 @@ public static class DashboardEndpointRouteBuilderExtensions
if (!result.Succeeded || result.Principal is null)
{
// Round-trip the failure back to the anonymous Blazor /login page, carrying
// the (sanitized) returnUrl so a successful retry still lands on the target.
string failureMessage = result.FailureMessage
?? "The username or password is invalid, or the user is not authorized.";
return Results.Redirect(
$"/login?error={Uri.EscapeDataString(failureMessage)}"
+ $"&returnUrl={Uri.EscapeDataString(returnUrl)}");
return TypedResults.Content(
RenderLoginPage(httpContext, antiforgery, returnUrl, result.FailureMessage),
"text/html",
statusCode: StatusCodes.Status401Unauthorized);
}
await httpContext
@@ -154,6 +158,42 @@ public static class DashboardEndpointRouteBuilderExtensions
return Results.LocalRedirect("/login");
}
private static string RenderLoginPage(
HttpContext httpContext,
IAntiforgery antiforgery,
string returnUrl,
string? failureMessage)
{
AntiforgeryTokenSet tokens = antiforgery.GetAndStoreTokens(httpContext);
string requestToken = tokens.RequestToken ?? string.Empty;
string alert = string.IsNullOrWhiteSpace(failureMessage)
? string.Empty
: $"<p class=\"alert alert-danger\" role=\"alert\">{HtmlEncoder.Default.Encode(failureMessage)}</p>";
string body = $"""
<section class="dashboard-login">
{alert}
<form method="post" action="/login" class="card login-card">
<div class="card-body">
<input name="{tokens.FormFieldName}" type="hidden" value="{HtmlEncoder.Default.Encode(requestToken)}" />
<input name="returnUrl" type="hidden" value="{HtmlEncoder.Default.Encode(returnUrl)}" />
<div class="mb-3">
<label for="username" class="form-label">Username</label>
<input id="username" name="username" type="text" autocomplete="username" class="form-control" />
</div>
<div class="mb-3">
<label for="password" class="form-label">Password</label>
<input id="password" name="password" type="password" autocomplete="current-password" class="form-control" />
</div>
<button type="submit" class="btn btn-primary">Sign in</button>
</div>
</form>
</section>
""";
return RenderPage("Dashboard Sign In", heading: null, body);
}
private static string RenderPage(string title, string body)
=> RenderPage(title, heading: title, body);
@@ -175,8 +215,7 @@ public static class DashboardEndpointRouteBuilderExtensions
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>{HtmlEncoder.Default.Encode(title)}</title>
<link rel="stylesheet" href="/lib/bootstrap/css/bootstrap.min.css" />
<link rel="stylesheet" href="/_content/ZB.MOM.WW.Theme/css/theme.css" />
<link rel="stylesheet" href="/_content/ZB.MOM.WW.Theme/css/layout.css" />
<link rel="stylesheet" href="/css/theme.css" />
<link rel="stylesheet" href="/css/site.css" />
</head>
<body class="dashboard-body">
@@ -1,31 +0,0 @@
using Microsoft.Extensions.Options;
using ZB.MOM.WW.Auth.Abstractions.Roles;
using ZB.MOM.WW.MxGateway.Server.Configuration;
namespace ZB.MOM.WW.MxGateway.Server.Dashboard;
/// <summary>
/// Shared-Auth <see cref="IGroupRoleMapper{TRole}"/> seam over the dashboard's
/// LDAP-group → role mapping. Roles are plain strings
/// (<see cref="DashboardRoles.Admin"/> / <see cref="DashboardRoles.Viewer"/>),
/// so <c>TRole</c> is <see cref="string"/>. The mapping rules (full-DN first,
/// leading-RDN fallback, case-insensitive) live in
/// <see cref="DashboardGroupRoleMapping"/>, shared with
/// <see cref="DashboardAuthenticator"/> so behaviour stays identical.
/// </summary>
/// <param name="options">Gateway options supplying the dashboard GroupToRole map.</param>
public sealed class DashboardGroupRoleMapper(IOptions<GatewayOptions> options)
: IGroupRoleMapper<string>
{
/// <inheritdoc />
public Task<GroupRoleMapping<string>> MapAsync(
IReadOnlyList<string> groups,
CancellationToken ct)
{
IReadOnlyList<string> roles = DashboardGroupRoleMapping.MapGroupsToRoles(
groups,
options.Value.Dashboard.GroupToRole);
return Task.FromResult(new GroupRoleMapping<string>(roles, Scope: null));
}
}
@@ -1,79 +0,0 @@
namespace ZB.MOM.WW.MxGateway.Server.Dashboard;
/// <summary>
/// Single source of truth for mapping a user's LDAP groups to dashboard roles.
/// Both <see cref="DashboardAuthenticator"/> (the existing login flow) and
/// <see cref="DashboardGroupRoleMapper"/> (the shared-Auth
/// <see cref="ZB.MOM.WW.Auth.Abstractions.Roles.IGroupRoleMapper{TRole}"/> seam)
/// delegate here so the precedence and case rules stay identical.
/// </summary>
internal static class DashboardGroupRoleMapping
{
/// <summary>
/// Maps the user's LDAP groups to dashboard roles. A user can pick up
/// multiple roles; Admin and Viewer are the only legal values. Returns
/// an empty list when no group matches (caller rejects the login).
/// </summary>
/// <param name="groups">The collection of LDAP groups the user belongs to.</param>
/// <param name="groupToRole">The mapping from group names to dashboard role names.</param>
internal static IReadOnlyList<string> MapGroupsToRoles(
IEnumerable<string> groups,
IReadOnlyDictionary<string, string> groupToRole)
{
if (groupToRole.Count == 0)
{
return [];
}
HashSet<string> roles = new(StringComparer.Ordinal);
foreach (string group in groups)
{
string normalizedGroup = group.Trim();
// Lookup precedence (Server-040): the full literal group string is
// tried first; only if that misses do we fall back to the leading
// RDN value (e.g. "GwAdmin" extracted from
// "ou=GwAdmin,ou=groups,..."). The map's comparer is
// OrdinalIgnoreCase (see DashboardOptions.GroupToRole), so e.g.
// "GwAdmin" and "gwadmin" both match.
//
// Review C1: with the shared ZB.MOM.WW.Auth.Ldap provider, groups
// arrive here already stripped to short RDN names (the library calls
// FirstRdnValue before returning them). So through the live login path
// the full-string branch only ever sees short names and the RDN
// fallback is effectively a no-op — they collapse to the same key.
// The fallback is retained because this mapping is also reachable
// directly via the IGroupRoleMapper<string> seam (DashboardGroupRoleMapper),
// where a caller could still pass a full DN. CONSEQUENCE: configuring a
// full-DN GroupToRole *key* (e.g. "ou=GwAdmin,ou=groups,...") is
// UNSUPPORTED with the shared library — the incoming group is a short
// name, so it will never equal a full-DN key. Keep GroupToRole keys as
// short group names.
if (groupToRole.TryGetValue(normalizedGroup, out string? mapped)
|| groupToRole.TryGetValue(ExtractFirstRdnValue(normalizedGroup), out mapped))
{
roles.Add(mapped);
}
}
return [.. roles];
}
/// <summary>Extracts the first RDN value from a distinguished name.</summary>
/// <param name="distinguishedName">The LDAP distinguished name.</param>
internal static string ExtractFirstRdnValue(string distinguishedName)
{
int equalsIndex = distinguishedName.IndexOf('=');
if (equalsIndex < 0)
{
return distinguishedName;
}
int valueStart = equalsIndex + 1;
int commaIndex = distinguishedName.IndexOf(',', valueStart);
return commaIndex > valueStart
? distinguishedName[valueStart..commaIndex]
: distinguishedName[valueStart..];
}
}
@@ -8,10 +8,8 @@ public static class DashboardRoles
{
/// <summary>
/// Read-write access: API-key CRUD, settings, any state-changing UI.
/// Canonical role value (Task 1.7); formerly <c>"Admin"</c> — pure value
/// rename, the operations this role authorizes are unchanged.
/// </summary>
public const string Admin = "Administrator";
public const string Admin = "Admin";
/// <summary>
/// Read-only access: all pages render but write affordances are hidden.
@@ -1,12 +1,8 @@
using Microsoft.AspNetCore.Authentication;
using Microsoft.AspNetCore.Authentication.Cookies;
using Microsoft.AspNetCore.Authorization;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.Auth.Abstractions.Roles;
using ZB.MOM.WW.Auth.AspNetCore;
using ZB.MOM.WW.MxGateway.Server.Configuration;
using ZB.MOM.WW.MxGateway.Server.Security.Audit;
namespace ZB.MOM.WW.MxGateway.Server.Dashboard;
@@ -19,25 +15,11 @@ public static class DashboardServiceCollectionExtensions
/// Registers all dashboard services, authentication, and Razor components.
/// </summary>
/// <param name="services">Service collection to register services.</param>
/// <param name="configuration">
/// Application configuration, used to bind the shared LDAP provider's options
/// from the <c>MxGateway:Ldap</c> section.
/// </param>
public static IServiceCollection AddGatewayDashboard(
this IServiceCollection services,
IConfiguration configuration)
public static IServiceCollection AddGatewayDashboard(this IServiceCollection services)
{
// Dashboard logins delegate bind/search to the shared ZB.MOM.WW.Auth.Ldap
// provider. Its LdapOptions bind straight from MxGateway:Ldap (the gateway's
// LdapOptions field names match the shared options: Transport / AllowInsecure /
// SearchBase / ServiceAccount* / *Attribute). AddZbLdapAuth also adds a
// ValidateOnStart() so an insecure-transport misconfiguration fails fast at boot.
services.AddZbLdapAuth(configuration, "MxGateway:Ldap");
services.AddSingleton<IDashboardSnapshotService, DashboardSnapshotService>();
services.AddSingleton<IDashboardLiveDataService, DashboardLiveDataService>();
services.AddSingleton<IDashboardAuthenticator, DashboardAuthenticator>();
services.AddSingleton<IGroupRoleMapper<string>, DashboardGroupRoleMapper>();
services.AddSingleton<DashboardApiKeyAuthorization>();
services.AddSingleton<IDashboardApiKeyManagementService, DashboardApiKeyManagementService>();
services.AddSingleton<IDashboardSessionAdminService, DashboardSessionAdminService>();
@@ -48,7 +30,6 @@ public static class DashboardServiceCollectionExtensions
services.AddHostedService<Hubs.DashboardSnapshotPublisher>();
services.AddHostedService<Hubs.AlarmsHubPublisher>();
services.AddHttpContextAccessor();
services.AddSingleton<IAuditActorAccessor, HttpAuditActorAccessor>();
services.AddAntiforgery();
services.AddCascadingAuthenticationState();
services.AddRazorComponents()
@@ -59,42 +40,29 @@ public static class DashboardServiceCollectionExtensions
.AddAuthentication(DashboardAuthenticationDefaults.AuthenticationScheme)
.AddCookie(DashboardAuthenticationDefaults.AuthenticationScheme, cookieOptions =>
{
// Hardened defaults (HttpOnly, SameSite=Strict, SecurePolicy, SlidingExpiration,
// ExpireTimeSpan) via the shared ZbCookieDefaults.Apply. requireHttps is set to
// its default (true / Always) here and overridden per-environment by the
// PostConfigure below; the 8-hour idle timeout is preserved (not the 30-min default).
ZbCookieDefaults.Apply(cookieOptions, requireHttps: true, idleTimeout: TimeSpan.FromHours(8));
// Cookie name, path, and redirect paths are MxGateway-specific — set after Apply
// so they are never overwritten by the shared helper (Apply intentionally skips name).
// This is the canonical default; it is overridden per-environment from
// DashboardOptions.CookieName by the PostConfigure below.
cookieOptions.Cookie.Name = DashboardAuthenticationDefaults.CookieName;
cookieOptions.Cookie.HttpOnly = true;
cookieOptions.Cookie.SameSite = SameSiteMode.Strict;
// SecurePolicy is bound via PostConfigure below so it can honour
// DashboardOptions.RequireHttpsCookie (default Always; dev HTTP
// deployments set RequireHttpsCookie=false to use SameAsRequest).
cookieOptions.Cookie.Path = "/";
cookieOptions.LoginPath = "/login";
cookieOptions.LogoutPath = "/logout";
cookieOptions.AccessDeniedPath = "/denied";
cookieOptions.ExpireTimeSpan = TimeSpan.FromHours(8);
cookieOptions.SlidingExpiration = true;
})
.AddScheme<AuthenticationSchemeOptions, HubTokenAuthenticationHandler>(
DashboardAuthenticationDefaults.HubAuthenticationScheme,
_ => { });
// Honour DashboardOptions.RequireHttpsCookie (default true / Always; set false for dev
// HTTP deployments → SameAsRequest) and the optional per-environment cookie-name
// override. Both run after the inline AddCookie config above, so they win.
services.AddOptions<CookieAuthenticationOptions>(DashboardAuthenticationDefaults.AuthenticationScheme)
.Configure<IOptions<GatewayOptions>>((cookieOptions, gatewayOptions) =>
{
cookieOptions.Cookie.SecurePolicy = gatewayOptions.Value.Dashboard.RequireHttpsCookie
? CookieSecurePolicy.Always
: CookieSecurePolicy.SameAsRequest;
// Config-driven cookie name (MxGateway:Dashboard:CookieName). Null/blank keeps
// the canonical default set above, so a misconfiguration cannot unname the cookie.
var cookieName = gatewayOptions.Value.Dashboard.CookieName;
if (!string.IsNullOrWhiteSpace(cookieName))
{
cookieOptions.Cookie.Name = cookieName;
}
});
services.AddAuthorization(authorization =>
@@ -2,7 +2,6 @@ using System.Runtime.CompilerServices;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.Auth.Abstractions.ApiKeys;
using ZB.MOM.WW.MxGateway.Server.Configuration;
using ZB.MOM.WW.MxGateway.Server.Galaxy;
using ZB.MOM.WW.MxGateway.Server.Metrics;
@@ -243,7 +242,7 @@ public sealed class DashboardSnapshotService : IDashboardSnapshotService
KeyId: key.KeyId,
DisplayName: key.DisplayName,
Scopes: key.Scopes,
Constraints: ApiKeyConstraintSerializer.Deserialize(key.ConstraintsJson),
Constraints: key.Constraints,
CreatedUtc: key.CreatedUtc,
LastUsedUtc: key.LastUsedUtc,
RevokedUtc: key.RevokedUtc))
@@ -1,40 +0,0 @@
using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using ZB.MOM.WW.Auth.ApiKeys.Sqlite;
namespace ZB.MOM.WW.MxGateway.Server.Diagnostics;
/// <summary>
/// Readiness probe: verifies the SQLite authentication store is reachable. The gateway
/// authenticates every gRPC call against this store, so its reachability gates readiness.
/// </summary>
public sealed class AuthStoreHealthCheck : IHealthCheck
{
private readonly AuthSqliteConnectionFactory _connectionFactory;
public AuthStoreHealthCheck(AuthSqliteConnectionFactory connectionFactory) =>
_connectionFactory = connectionFactory ?? throw new ArgumentNullException(nameof(connectionFactory));
public async Task<HealthCheckResult> CheckHealthAsync(
HealthCheckContext context,
CancellationToken cancellationToken = default)
{
try
{
await using SqliteConnection connection =
await _connectionFactory.OpenConnectionAsync(cancellationToken).ConfigureAwait(false);
await using SqliteCommand command = connection.CreateCommand();
command.CommandText = "SELECT 1;";
await command.ExecuteScalarAsync(cancellationToken).ConfigureAwait(false);
return HealthCheckResult.Healthy("Auth store is reachable.");
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
throw;
}
catch (Exception ex)
{
return HealthCheckResult.Unhealthy("Auth store is unreachable.", ex);
}
}
}
@@ -0,0 +1,64 @@
using ZB.MOM.WW.Telemetry.Serilog;
namespace ZB.MOM.WW.MxGateway.Server.Diagnostics;
/// <summary>
/// Bridges the gateway's <see cref="GatewayLogRedactor"/> policy onto the shared
/// <see cref="ILogRedactor"/> seam consumed by <c>ZB.MOM.WW.Telemetry.Serilog</c>'s redaction
/// enricher. Applied to every Serilog log event before it reaches a sink, it masks the same
/// secrets the original MEL-scope path masked: API-key bearer tokens / client identities
/// (<c>mxgw_*</c>) and command values for credential-bearing MXAccess commands. All masking
/// decisions delegate to <see cref="GatewayLogRedactor"/> — this type adds no new policy.
/// </summary>
public sealed class GatewayLogRedactorAdapter : ILogRedactor
{
/// <summary>Property name carrying a client identity / authorization header value.</summary>
private const string ClientIdentityProperty = "ClientIdentity";
/// <summary>Property name carrying a raw authorization header value.</summary>
private const string AuthorizationProperty = "Authorization";
/// <summary>Property name carrying the MXAccess command method, used to gate value redaction.</summary>
private const string CommandMethodProperty = "CommandMethod";
/// <summary>Property name carrying a command payload value that may bear credentials.</summary>
private const string CommandValueProperty = "CommandValue";
/// <summary>
/// Masks any sensitive values in <paramref name="properties"/> in place using the shared
/// <see cref="GatewayLogRedactor"/> policy. Identity/authorization properties have their API-key
/// secret stripped; a command value is redacted when its associated command method bears
/// credentials.
/// </summary>
/// <param name="properties">The mutable log-event property dictionary for the current event.</param>
public void Redact(IDictionary<string, object?> properties)
{
ArgumentNullException.ThrowIfNull(properties);
RedactIdentity(properties, ClientIdentityProperty);
RedactIdentity(properties, AuthorizationProperty);
RedactCommandValue(properties);
}
private static void RedactIdentity(IDictionary<string, object?> properties, string propertyName)
{
if (properties.TryGetValue(propertyName, out object? value) && value is string identity)
{
properties[propertyName] = GatewayLogRedactor.RedactClientIdentity(identity);
}
}
private static void RedactCommandValue(IDictionary<string, object?> properties)
{
if (!properties.TryGetValue(CommandValueProperty, out object? value) || value is null)
{
return;
}
string? commandMethod = properties.TryGetValue(CommandMethodProperty, out object? method)
? method as string
: null;
properties[CommandValueProperty] = GatewayLogRedactor.RedactCommandValue(commandMethod, value);
}
}
@@ -1,28 +0,0 @@
using ZB.MOM.WW.Telemetry.Serilog;
namespace ZB.MOM.WW.MxGateway.Server.Diagnostics;
/// <summary>
/// Adapts the static <see cref="GatewayLogRedactor"/> to the shared <see cref="ILogRedactor"/> seam
/// so the telemetry RedactionEnricher masks API-key/credential material on every log event.
/// </summary>
public sealed class GatewayLogRedactorSeam : ILogRedactor
{
private static readonly string[] IdentityKeys = ["ClientIdentity", "authorization", "Authorization"];
/// <summary>
/// Masks API-key/credential material in known identity-bearing log properties.
/// </summary>
/// <param name="properties">The log event property dictionary to redact in place.</param>
public void Redact(IDictionary<string, object?> properties)
{
ArgumentNullException.ThrowIfNull(properties);
foreach (var key in IdentityKeys)
{
if (properties.TryGetValue(key, out var value) && value is string s)
{
properties[key] = GatewayLogRedactor.RedactClientIdentity(s);
}
}
}
}
@@ -1,20 +0,0 @@
using Microsoft.Extensions.Logging;
namespace ZB.MOM.WW.MxGateway.Server.Diagnostics;
public static class GatewayLoggerExtensions
{
/// <summary>Begins a gateway log scope with the specified scope properties.</summary>
/// <param name="logger">Logger used for diagnostic output.</param>
/// <param name="scope">Scope properties to apply.</param>
/// <returns>A disposable that ends the scope when disposed.</returns>
public static IDisposable? BeginGatewayScope(
this ILogger logger,
GatewayLogScope scope)
{
ArgumentNullException.ThrowIfNull(logger);
ArgumentNullException.ThrowIfNull(scope);
return logger.BeginScope(scope.ToDictionary());
}
}
@@ -1,4 +1,5 @@
using Microsoft.Extensions.Primitives;
using Serilog.Context;
namespace ZB.MOM.WW.MxGateway.Server.Diagnostics;
@@ -17,7 +18,12 @@ public static class GatewayRequestLoggingMiddlewareExtensions
/// <summary>Header name for the command method name.</summary>
public const string CommandMethodHeaderName = "x-command-method";
/// <summary>Adds gateway request logging scope middleware that reads correlation headers and redacts sensitive data.</summary>
/// <summary>
/// Adds gateway request logging middleware that reads the correlation headers and pushes them
/// as Serilog <see cref="LogContext"/> properties for the duration of the request. The pushed
/// properties (SessionId / WorkerProcessId / CorrelationId / CommandMethod / ClientIdentity)
/// are disposed when the request completes; the shared redaction enricher masks any secrets.
/// </summary>
/// <param name="app">Application builder.</param>
public static IApplicationBuilder UseGatewayRequestLoggingScope(this IApplicationBuilder app)
{
@@ -25,21 +31,56 @@ public static class GatewayRequestLoggingMiddlewareExtensions
return app.Use(async (context, next) =>
{
ILogger logger = context.RequestServices
.GetRequiredService<ILoggerFactory>()
.CreateLogger("MxGateway.Request");
using IDisposable? scope = logger.BeginGatewayScope(new GatewayLogScope(
GatewayLogScope scope = new(
SessionId: ReadHeader(context, SessionIdHeaderName),
WorkerProcessId: ReadInt32Header(context, WorkerProcessIdHeaderName),
CorrelationId: ReadUInt64Header(context, CorrelationIdHeaderName),
CommandMethod: ReadHeader(context, CommandMethodHeaderName),
ClientIdentity: ReadHeader(context, "authorization")));
ClientIdentity: ReadHeader(context, "authorization"));
using IDisposable correlationScope = PushCorrelationProperties(scope);
await next(context);
});
}
/// <summary>
/// Pushes the populated <paramref name="scope"/> properties onto the Serilog
/// <see cref="LogContext"/>, returning a single disposable that pops them all when the request
/// completes. Only the properties present in <see cref="GatewayLogScope.ToDictionary"/> (which
/// already applies the client-identity redaction policy) are pushed.
/// </summary>
/// <param name="scope">The correlation properties for the current request.</param>
/// <returns>A disposable that removes the pushed properties on disposal.</returns>
private static IDisposable PushCorrelationProperties(GatewayLogScope scope)
{
Stack<IDisposable> pushed = new();
foreach (KeyValuePair<string, object?> property in scope.ToDictionary())
{
pushed.Push(LogContext.PushProperty(property.Key, property.Value));
}
return new CorrelationPropertyScope(pushed);
}
/// <summary>
/// Disposes the pushed <see cref="LogContext"/> property bindings in reverse order, restoring
/// the ambient context to its pre-request state.
/// </summary>
private sealed class CorrelationPropertyScope(Stack<IDisposable> bindings) : IDisposable
{
private readonly Stack<IDisposable> _bindings = bindings;
public void Dispose()
{
while (_bindings.Count > 0)
{
_bindings.Pop().Dispose();
}
}
}
private static string? ReadHeader(HttpContext context, string headerName)
{
return context.Request.Headers.TryGetValue(headerName, out StringValues values)
@@ -1,8 +1,5 @@
using System.Security.Cryptography.X509Certificates;
using Microsoft.AspNetCore.Hosting.StaticWebAssets;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Configuration;
using ZB.MOM.WW.Health;
using Serilog;
using ZB.MOM.WW.MxGateway.Contracts;
using ZB.MOM.WW.MxGateway.Server.Alarms;
using ZB.MOM.WW.MxGateway.Server.Configuration;
@@ -15,7 +12,6 @@ using ZB.MOM.WW.MxGateway.Server.Security.Authentication;
using ZB.MOM.WW.MxGateway.Server.Security.Authorization;
using ZB.MOM.WW.MxGateway.Server.Sessions;
using ZB.MOM.WW.MxGateway.Server.Workers;
using ZB.MOM.WW.Telemetry;
using ZB.MOM.WW.Telemetry.Serilog;
namespace ZB.MOM.WW.MxGateway.Server;
@@ -37,7 +33,10 @@ public static class GatewayApplication
WebApplicationBuilder builder = CreateBuilder(args);
WebApplication app = builder.Build();
// Push the per-request correlation properties (via Serilog LogContext) before the
// request-logging middleware emits its completion event, so those properties appear on it.
app.UseGatewayRequestLoggingScope();
app.UseSerilogRequestLogging();
app.UseStaticFiles();
app.UseAuthentication();
app.UseAuthorization();
@@ -61,68 +60,47 @@ public static class GatewayApplication
});
StaticWebAssetsLoader.UseStaticWebAssets(builder.Environment, builder.Configuration);
ConfigureSelfSignedTls(builder);
ConfigureSerilog(builder);
builder.AddZbSerilog(o => o.ServiceName = "mxgateway");
builder.Services.AddGatewayConfiguration(builder.Configuration);
builder.Services.AddSqliteAuthStore(builder.Configuration);
builder.Services.AddGatewayConfiguration();
builder.Services.AddSqliteAuthStore();
builder.Services.AddGatewayGrpcAuthorization();
builder.Services.AddHealthChecks()
.AddTypeActivatedCheck<AuthStoreHealthCheck>(
"auth-store",
failureStatus: null,
tags: new[] { ZbHealthTags.Ready });
builder.Services.AddHealthChecks();
builder.Services.AddSingleton<GatewayMetrics>();
builder.AddZbTelemetry(o =>
{
o.ServiceName = "mxgateway";
o.Meters = [GatewayMetrics.MeterName]; // "MxGateway.Server" — name unchanged
if (Enum.TryParse<ZbExporter>(builder.Configuration["MxGateway:Telemetry:Exporter"], ignoreCase: true, out var exporter))
o.Exporter = exporter;
var otlp = builder.Configuration["MxGateway:Telemetry:OtlpEndpoint"];
if (!string.IsNullOrWhiteSpace(otlp))
o.OtlpEndpoint = otlp;
});
builder.Services.AddSingleton<ILogRedactor, GatewayLogRedactorSeam>();
builder.Services.AddSingleton<MxAccessGrpcMapper>();
builder.Services.AddSingleton<MxAccessGrpcRequestValidator>();
builder.Services.AddSingleton<IEventStreamService, EventStreamService>();
builder.Services.AddWorkerProcessLauncher();
builder.Services.AddGatewaySessions();
builder.Services.AddGatewayAlarms();
builder.Services.AddGatewayDashboard(builder.Configuration);
builder.Services.AddGatewayDashboard();
builder.Services.AddGalaxyRepository();
return builder;
}
private static void ConfigureSelfSignedTls(WebApplicationBuilder builder)
/// <summary>
/// Replaces the default Microsoft.Extensions.Logging provider with the shared
/// <c>ZB.MOM.WW.Telemetry.Serilog</c> bootstrap (<see cref="ZbSerilogExtensions.AddZbSerilog"/>).
/// Sinks and minimum level come from the <c>Serilog</c> configuration section; identity
/// (<c>SiteId</c>/<c>NodeRole</c>) is read from <c>MxGateway:Telemetry</c> when present.
/// Also registers the project's <see cref="ILogRedactor"/> adapter so the shared redaction
/// enricher masks gateway secrets on every event.
/// </summary>
/// <param name="builder">The web application builder being configured.</param>
private static void ConfigureSerilog(WebApplicationBuilder builder)
{
if (!Security.Tls.KestrelTlsInspector.RequiresGeneratedCertificate(builder.Configuration))
{
return;
}
string? siteId = builder.Configuration["MxGateway:Telemetry:SiteId"];
string? nodeRole = builder.Configuration["MxGateway:Telemetry:NodeRole"];
Configuration.TlsOptions tlsOptions =
builder.Configuration.GetSection("MxGateway:Tls").Get<Configuration.TlsOptions>()
?? new Configuration.TlsOptions();
builder.Services.AddSingleton<ILogRedactor, GatewayLogRedactorAdapter>();
using ILoggerFactory loggerFactory = LoggerFactory.Create(logging =>
builder.AddZbSerilog(options =>
{
logging.AddConfiguration(builder.Configuration.GetSection("Logging"));
logging.AddConsole();
options.ServiceName = "mxgateway";
options.SiteId = string.IsNullOrWhiteSpace(siteId) ? null : siteId;
options.NodeRole = string.IsNullOrWhiteSpace(nodeRole) ? null : nodeRole;
});
Security.Tls.SelfSignedCertificateProvider provider = new(
tlsOptions,
loggerFactory.CreateLogger<Security.Tls.SelfSignedCertificateProvider>(),
TimeProvider.System);
X509Certificate2 certificate = provider.LoadOrCreate();
builder.WebHost.ConfigureKestrel(options =>
// The certificate is intentionally owned by Kestrel for the application
// lifetime; it is not disposed here.
options.ConfigureHttpsDefaults(https => https.ServerCertificate = certificate));
}
private static string ResolveContentRootPath()
@@ -189,8 +167,13 @@ public static class GatewayApplication
{
endpoints.MapStaticAssets(ResolveStaticAssetsManifestPath());
endpoints.MapZbHealth();
endpoints.MapZbMetrics();
endpoints.MapGet(
"/health/live",
() => Results.Ok(new GatewayHealthReply(
Status: "Healthy",
DefaultBackend: GatewayContractInfo.DefaultBackendName,
WorkerProtocolVersion: GatewayContractInfo.WorkerProtocolVersion)))
.WithName("LiveHealth");
endpoints.MapGrpcService<MxAccessGatewayService>();
endpoints.MapGrpcService<GalaxyRepositoryGrpcService>();
@@ -5,7 +5,7 @@ namespace ZB.MOM.WW.MxGateway.Server.Metrics;
public sealed class GatewayMetrics : IDisposable
{
public const string MeterName = "ZB.MOM.WW.MxGateway";
public const string MeterName = "MxGateway.Server";
private readonly object _syncRoot = new();
private readonly Meter _meter;
@@ -68,9 +68,9 @@ public sealed class GatewayMetrics : IDisposable
_heartbeatFailuresCounter = _meter.CreateCounter<long>("mxgateway.heartbeats.failed");
_streamDisconnectsCounter = _meter.CreateCounter<long>("mxgateway.grpc.streams.disconnected");
_retryAttemptsCounter = _meter.CreateCounter<long>("mxgateway.retries.attempted");
_workerStartupLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.workers.startup.duration", "s");
_commandLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.commands.duration", "s");
_eventStreamSendLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.events.stream_send.duration", "s");
_workerStartupLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.workers.startup.duration", "ms");
_commandLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.commands.duration", "ms");
_eventStreamSendLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.events.stream_send.duration", "ms");
_meter.CreateObservableGauge("mxgateway.sessions.open", GetOpenSessions);
_meter.CreateObservableGauge("mxgateway.workers.running", GetWorkersRunning);
@@ -144,7 +144,7 @@ public sealed class GatewayMetrics : IDisposable
_workersRunning++;
}
_workerStartupLatencyHistogram.Record(startupDuration.TotalSeconds);
_workerStartupLatencyHistogram.Record(startupDuration.TotalMilliseconds);
}
/// <summary>
@@ -208,7 +208,7 @@ public sealed class GatewayMetrics : IDisposable
KeyValuePair<string, object?> methodTag = new("method", method);
_commandsSucceededCounter.Add(1, methodTag);
_commandLatencyHistogram.Record(duration.TotalSeconds, methodTag);
_commandLatencyHistogram.Record(duration.TotalMilliseconds, methodTag);
}
/// <summary>
@@ -228,7 +228,7 @@ public sealed class GatewayMetrics : IDisposable
KeyValuePair<string, object?> methodTag = new("method", method);
KeyValuePair<string, object?> categoryTag = new("category", category);
_commandsFailedCounter.Add(1, methodTag, categoryTag);
_commandLatencyHistogram.Record(duration.TotalSeconds, methodTag, categoryTag);
_commandLatencyHistogram.Record(duration.TotalMilliseconds, methodTag, categoryTag);
}
/// <summary>
@@ -255,7 +255,7 @@ public sealed class GatewayMetrics : IDisposable
public void RecordEventStreamSend(string family, TimeSpan duration)
{
_eventStreamSendLatencyHistogram.Record(
duration.TotalSeconds,
duration.TotalMilliseconds,
new KeyValuePair<string, object?>("family", family));
}
@@ -1,42 +0,0 @@
using ZB.MOM.WW.Audit;
namespace ZB.MOM.WW.MxGateway.Server.Security.Audit;
/// <summary>
/// Best-effort <see cref="IAuditWriter"/> over the MxGateway-owned
/// <see cref="SqliteCanonicalAuditStore"/>. It honours the canonical
/// <see cref="IAuditWriter"/> contract: a failed audit write is swallowed and logged
/// rather than propagated, so it can never abort the user-facing action that produced it.
/// </summary>
/// <remarks>
/// This is the single sink through which ALL MxGateway audit flows — the library admin
/// verbs (via <see cref="CanonicalForwardingApiKeyAuditStore"/>) and the gateway's own
/// dashboard / constraint-denial producers, which write canonical events directly. The
/// best-effort wrapping here also closes the gap that the library's
/// <c>SqliteApiKeyAuditStore.AppendAsync</c> propagated exceptions.
/// </remarks>
public sealed class CanonicalAuditWriter(
SqliteCanonicalAuditStore store,
ILogger<CanonicalAuditWriter> logger) : IAuditWriter
{
/// <inheritdoc />
public async Task WriteAsync(AuditEvent auditEvent, CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(auditEvent);
try
{
await store.InsertAsync(auditEvent, cancellationToken).ConfigureAwait(false);
}
catch (Exception exception)
{
// Best-effort: a failed audit write must never abort the action that produced it.
// Swallow everything (including OperationCanceledException) and log for diagnosis.
logger.LogWarning(
exception,
"Failed to persist audit event {EventId} (action {Action}); audit write is best-effort and was suppressed.",
auditEvent.EventId,
auditEvent.Action);
}
}
}
@@ -1,143 +0,0 @@
using System.Text.Json;
using ZB.MOM.WW.Audit;
using ZB.MOM.WW.Auth.Abstractions.ApiKeys;
namespace ZB.MOM.WW.MxGateway.Server.Security.Audit;
/// <summary>
/// Adapter that overrides the shared library's <see cref="IApiKeyAuditStore"/> so that
/// library-emitted API-key audit events (CLI / admin verbs from
/// <c>ApiKeyAdminCommands</c>) are canonicalized onto <see cref="AuditEvent"/> and routed
/// through the gateway's <see cref="IAuditWriter"/> into the canonical
/// <c>audit_event</c> store.
/// </summary>
/// <remarks>
/// Overriding the registered <see cref="IApiKeyAuditStore"/> is the ONLY way to
/// canonicalize the library-internal <c>ApiKeyAdminCommands</c> events, since that type
/// cannot be edited. <see cref="ListRecentAsync"/> reads back from the canonical store
/// and maps each <see cref="AuditEvent"/> to an <see cref="ApiKeyAuditEntry"/> so the
/// existing dashboard "recent audit" view (and the CLI/store tests) keep working through
/// this same seam, unchanged.
/// <para>
/// The library's own <c>api_key_audit</c> table is left in place but UNUSED — nothing
/// writes to it once this adapter overrides the library's <c>SqliteApiKeyAuditStore</c>
/// registration.
/// </para>
/// </remarks>
public sealed class CanonicalForwardingApiKeyAuditStore(
IAuditWriter auditWriter,
SqliteCanonicalAuditStore store) : IApiKeyAuditStore
{
/// <summary>The canonical <see cref="AuditEvent.Category"/> assigned to API-key events.</summary>
public const string ApiKeyCategory = "ApiKey";
/// <summary>Actor used for the library's keyless <c>init-db</c> event.</summary>
private const string SystemActor = "system";
/// <summary>Actor used for any other keyless (CLI-originated) library event.</summary>
private const string CliActor = "cli";
/// <summary>The library event type that denotes a constraint denial.</summary>
private const string ConstraintDeniedEventType = "constraint-denied";
/// <summary>The library's keyless schema-init event type.</summary>
private const string InitDbEventType = "init-db";
/// <inheritdoc />
public async Task AppendAsync(ApiKeyAuditEntry entry, CancellationToken ct)
{
ArgumentNullException.ThrowIfNull(entry);
AuditEvent auditEvent = new()
{
EventId = Guid.NewGuid(),
OccurredAtUtc = entry.CreatedUtc,
// Keyless library events: init-db is system-originated; any other keyless event
// is a CLI/admin verb run without an authenticated principal.
Actor = entry.KeyId
?? (entry.EventType == InitDbEventType ? SystemActor : CliActor),
Action = entry.EventType,
Outcome = entry.EventType == ConstraintDeniedEventType
? AuditOutcome.Denied
: AuditOutcome.Success,
Category = ApiKeyCategory,
Target = entry.KeyId,
SourceNode = entry.RemoteAddress,
CorrelationId = null,
DetailsJson = WrapDetails(entry.Details),
};
// Best-effort: IAuditWriter swallows/logs failures, so this never throws.
await auditWriter.WriteAsync(auditEvent, ct).ConfigureAwait(false);
}
/// <inheritdoc />
public async Task<IReadOnlyList<ApiKeyAuditEntry>> ListRecentAsync(int limit, CancellationToken ct)
{
IReadOnlyList<AuditEvent> events = await store.ListRecentAsync(limit, ct).ConfigureAwait(false);
ApiKeyAuditEntry[] entries = new ApiKeyAuditEntry[events.Count];
for (int index = 0; index < events.Count; index++)
{
AuditEvent auditEvent = events[index];
entries[index] = new ApiKeyAuditEntry(
KeyId: auditEvent.Actor switch
{
// Keyless library events were mapped to the system/cli sentinel actors on the
// way in; map them back to a null KeyId so the dashboard view is faithful.
SystemActor or CliActor => null,
string actor => actor,
},
EventType: auditEvent.Action,
RemoteAddress: auditEvent.SourceNode,
CreatedUtc: auditEvent.OccurredAtUtc,
Details: UnwrapDetails(auditEvent.DetailsJson));
}
return entries;
}
/// <summary>
/// Wraps a free-form library detail string into the canonical
/// <c>{"detail": "&lt;escaped&gt;"}</c> JSON envelope, or null when there is no detail.
/// </summary>
private static string? WrapDetails(string? details)
{
if (details is null)
{
return null;
}
return JsonSerializer.Serialize(new Dictionary<string, string> { ["detail"] = details });
}
/// <summary>
/// Unwraps the canonical detail envelope back to the original free-form string. Falls
/// back to the raw JSON when it is not a recognised <c>{"detail": ...}</c> envelope, so
/// directly-emitted canonical events (whose DetailsJson is richer) still surface text.
/// </summary>
private static string? UnwrapDetails(string? detailsJson)
{
if (string.IsNullOrEmpty(detailsJson))
{
return null;
}
try
{
using JsonDocument document = JsonDocument.Parse(detailsJson);
if (document.RootElement.ValueKind == JsonValueKind.Object
&& document.RootElement.TryGetProperty("detail", out JsonElement detail)
&& detail.ValueKind == JsonValueKind.String)
{
return detail.GetString();
}
}
catch (JsonException)
{
// Not JSON we recognise; surface the raw payload below.
}
return detailsJson;
}
}
@@ -1,51 +0,0 @@
using System.Security.Claims;
using ZB.MOM.WW.Auth.AspNetCore;
namespace ZB.MOM.WW.MxGateway.Server.Security.Audit;
/// <summary>
/// HTTP-context-backed implementation of <see cref="IAuditActorAccessor"/> that reads the
/// dashboard operator's identity from the current <see cref="IHttpContextAccessor"/>.
/// </summary>
/// <remarks>
/// Claim resolution order:
/// <list type="number">
/// <item><see cref="ZbClaimTypes.Username"/> ("zb:username") — the canonical LDAP login name.</item>
/// <item><see cref="ClaimsPrincipal.Identity"/>.<see cref="System.Security.Principal.IIdentity.Name"/> — framework fallback (= <see cref="ZbClaimTypes.Name"/> = <see cref="ClaimTypes.Name"/> = display name).</item>
/// <item><see cref="ZbClaimTypes.Name"/> — explicit fallback matching the claim emitted by <c>DashboardAuthenticator.CreatePrincipal</c>.</item>
/// </list>
/// Returns <see langword="null"/> when there is no HTTP context or the user is not authenticated.
/// </remarks>
public sealed class HttpAuditActorAccessor(IHttpContextAccessor httpContextAccessor) : IAuditActorAccessor
{
/// <inheritdoc />
public string? CurrentActor
{
get
{
ClaimsPrincipal? user = httpContextAccessor.HttpContext?.User;
if (user?.Identity?.IsAuthenticated != true)
{
return null;
}
// Prefer the canonical login-username claim (set by DashboardAuthenticator).
string? username = user.FindFirstValue(ZbClaimTypes.Username);
if (!string.IsNullOrWhiteSpace(username))
{
return username;
}
// Framework fallback: Identity.Name is driven by the ClaimsIdentity nameClaimType,
// which DashboardAuthenticator sets to ZbClaimTypes.Name (= ClaimTypes.Name = display name).
string? identityName = user.Identity?.Name;
if (!string.IsNullOrWhiteSpace(identityName))
{
return identityName;
}
// Final explicit fallback — ZbClaimTypes.Name claim value directly.
return user.FindFirstValue(ZbClaimTypes.Name);
}
}
}
@@ -1,18 +0,0 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Audit;
/// <summary>
/// Returns the current actor name for use in audit events.
/// </summary>
/// <remarks>
/// Implementations resolve the actor from the ambient request context. For the dashboard
/// this is the authenticated LDAP operator; for non-HTTP contexts (gRPC, CLI) the caller
/// provides the actor directly and this seam is not used.
/// </remarks>
public interface IAuditActorAccessor
{
/// <summary>
/// Gets the current actor's username, or <see langword="null"/> when there is no
/// authenticated principal in scope (e.g. an anonymous or unauthenticated request).
/// </summary>
string? CurrentActor { get; }
}
@@ -1,138 +0,0 @@
using System.Globalization;
using Microsoft.Data.Sqlite;
using ZB.MOM.WW.Audit;
using ZB.MOM.WW.Auth.ApiKeys.Sqlite;
namespace ZB.MOM.WW.MxGateway.Server.Security.Audit;
/// <summary>
/// MxGateway-owned, append-only SQLite store for canonical
/// <see cref="AuditEvent"/>s. It writes to a NEW <c>audit_event</c> table in the
/// SAME database file as the shared <c>ZB.MOM.WW.Auth.ApiKeys</c> stores: both share
/// the library's <see cref="AuthSqliteConnectionFactory"/> (so they target the same
/// <c>ApiKeyOptions.SqlitePath</c> with the same WAL/busy-timeout connection config).
/// </summary>
/// <remarks>
/// This store is the canonical sink for ALL MxGateway audit. The library's own
/// <c>api_key_audit</c> table is left in place but UNUSED after adoption — the library's
/// <c>IApiKeyAuditStore</c> registration is overridden by
/// <see cref="CanonicalForwardingApiKeyAuditStore"/>, which forwards onto this store via
/// <see cref="CanonicalAuditWriter"/>. The library's <c>schema_version</c> /
/// <c>api_key_audit</c> tables are not touched here; the <c>audit_event</c> table is
/// created idempotently (<c>CREATE TABLE IF NOT EXISTS</c>) on each write so it
/// self-bootstraps regardless of migration ordering.
/// </remarks>
public sealed class SqliteCanonicalAuditStore(AuthSqliteConnectionFactory connectionFactory)
{
private const string CreateTableSql =
"""
CREATE TABLE IF NOT EXISTS audit_event (
event_id TEXT PRIMARY KEY,
occurred_at_utc TEXT NOT NULL,
actor TEXT NOT NULL,
action TEXT NOT NULL,
outcome TEXT NOT NULL,
category TEXT NULL,
target TEXT NULL,
source_node TEXT NULL,
correlation_id TEXT NULL,
details_json TEXT NULL
);
""";
/// <summary>Inserts a canonical audit event into the <c>audit_event</c> table.</summary>
/// <param name="auditEvent">The canonical event to persist.</param>
/// <param name="cancellationToken">Token to observe for cancellation.</param>
public async Task InsertAsync(AuditEvent auditEvent, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(auditEvent);
await using SqliteConnection connection =
await connectionFactory.OpenConnectionAsync(cancellationToken).ConfigureAwait(false);
await EnsureTableAsync(connection, cancellationToken).ConfigureAwait(false);
await using SqliteCommand command = connection.CreateCommand();
command.CommandText =
"""
INSERT INTO audit_event
(event_id, occurred_at_utc, actor, action, outcome,
category, target, source_node, correlation_id, details_json)
VALUES
($event_id, $occurred_at_utc, $actor, $action, $outcome,
$category, $target, $source_node, $correlation_id, $details_json);
""";
command.Parameters.AddWithValue("$event_id", auditEvent.EventId.ToString());
command.Parameters.AddWithValue("$occurred_at_utc", auditEvent.OccurredAtUtc.ToString("O", CultureInfo.InvariantCulture));
command.Parameters.AddWithValue("$actor", auditEvent.Actor);
command.Parameters.AddWithValue("$action", auditEvent.Action);
command.Parameters.AddWithValue("$outcome", auditEvent.Outcome.ToString());
command.Parameters.AddWithValue("$category", (object?)auditEvent.Category ?? DBNull.Value);
command.Parameters.AddWithValue("$target", (object?)auditEvent.Target ?? DBNull.Value);
command.Parameters.AddWithValue("$source_node", (object?)auditEvent.SourceNode ?? DBNull.Value);
command.Parameters.AddWithValue("$correlation_id", (object?)auditEvent.CorrelationId?.ToString() ?? DBNull.Value);
command.Parameters.AddWithValue("$details_json", (object?)auditEvent.DetailsJson ?? DBNull.Value);
await command.ExecuteNonQueryAsync(cancellationToken).ConfigureAwait(false);
}
/// <summary>Returns the most recent canonical audit events, newest first.</summary>
/// <param name="limit">Maximum number of events to return.</param>
/// <param name="cancellationToken">Token to observe for cancellation.</param>
public async Task<IReadOnlyList<AuditEvent>> ListRecentAsync(int limit, CancellationToken cancellationToken)
{
if (limit <= 0)
{
return [];
}
await using SqliteConnection connection =
await connectionFactory.OpenConnectionAsync(cancellationToken).ConfigureAwait(false);
await EnsureTableAsync(connection, cancellationToken).ConfigureAwait(false);
await using SqliteCommand command = connection.CreateCommand();
command.CommandText =
"""
SELECT event_id, occurred_at_utc, actor, action, outcome,
category, target, source_node, correlation_id, details_json
FROM audit_event
ORDER BY rowid DESC
LIMIT $limit;
""";
command.Parameters.AddWithValue("$limit", limit);
List<AuditEvent> events = [];
await using SqliteDataReader reader = await command.ExecuteReaderAsync(cancellationToken).ConfigureAwait(false);
while (await reader.ReadAsync(cancellationToken).ConfigureAwait(false))
{
events.Add(new AuditEvent
{
EventId = Guid.Parse(reader.GetString(0)),
OccurredAtUtc = ParseUtc(reader.GetString(1)),
Actor = reader.GetString(2),
Action = reader.GetString(3),
Outcome = Enum.Parse<AuditOutcome>(reader.GetString(4)),
Category = reader.IsDBNull(5) ? null : reader.GetString(5),
Target = reader.IsDBNull(6) ? null : reader.GetString(6),
SourceNode = reader.IsDBNull(7) ? null : reader.GetString(7),
CorrelationId = reader.IsDBNull(8) ? null : Guid.Parse(reader.GetString(8)),
DetailsJson = reader.IsDBNull(9) ? null : reader.GetString(9),
});
}
return events;
}
private static async Task EnsureTableAsync(SqliteConnection connection, CancellationToken cancellationToken)
{
await using SqliteCommand command = connection.CreateCommand();
command.CommandText = CreateTableSql;
await command.ExecuteNonQueryAsync(cancellationToken).ConfigureAwait(false);
}
private static DateTimeOffset ParseUtc(string value) =>
DateTimeOffset.Parse(value, CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind);
}
@@ -1,19 +1,15 @@
using System.Text.Json;
using ZB.MOM.WW.Auth.Abstractions.ApiKeys;
using ZB.MOM.WW.Auth.ApiKeys.Admin;
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
/// <summary>
/// Executes API key administration commands from the CLI.
/// </summary>
/// <remarks>
/// The create/revoke/rotate/list/init-db verbs (secret generation, peppered hashing, token
/// assembly and per-action audit) are delegated to the shared
/// <see cref="ApiKeyAdminCommands"/>. This runner adapts the gateway's strongly-typed command and
/// output DTOs (which carry <see cref="ApiKeyConstraints"/>) onto the library's JSON-based contract.
/// </remarks>
public sealed class ApiKeyAdminCliRunner(ApiKeyAdminCommands commands)
public sealed class ApiKeyAdminCliRunner(
IAuthStoreMigrator migrator,
IApiKeyAdminStore adminStore,
IApiKeyAuditStore auditStore,
IApiKeySecretHasher hasher)
{
private static readonly JsonSerializerOptions JsonOptions = new()
{
@@ -48,7 +44,8 @@ public sealed class ApiKeyAdminCliRunner(ApiKeyAdminCommands commands)
private async Task<ApiKeyAdminOutput> InitDbAsync(CancellationToken cancellationToken)
{
await commands.InitDbAsync(remoteAddress: null, cancellationToken).ConfigureAwait(false);
await migrator.MigrateAsync(cancellationToken).ConfigureAwait(false);
await AppendAuditAsync(null, "init-db", null, cancellationToken).ConfigureAwait(false);
return new ApiKeyAdminOutput("init-db", "initialized", null, []);
}
@@ -57,26 +54,33 @@ public sealed class ApiKeyAdminCliRunner(ApiKeyAdminCommands commands)
ApiKeyAdminCommand command,
CancellationToken cancellationToken)
{
// The shared command set requires the schema to exist; init-db is idempotent.
await commands.InitDbAsync(remoteAddress: null, cancellationToken).ConfigureAwait(false);
await migrator.MigrateAsync(cancellationToken).ConfigureAwait(false);
string keyId = Required(command.KeyId);
CreateKeyResult created = await commands.CreateKeyAsync(
keyId,
Required(command.DisplayName),
command.Scopes,
ApiKeyConstraintSerializer.Serialize(command.Constraints),
remoteAddress: null,
string secret = ApiKeySecretGenerator.Generate();
string apiKey = FormatApiKey(keyId, secret);
await adminStore.CreateAsync(
new ApiKeyCreateRequest(
KeyId: keyId,
KeyPrefix: $"mxgw_{keyId}",
SecretHash: hasher.HashSecret(secret),
DisplayName: Required(command.DisplayName),
Scopes: command.Scopes,
Constraints: command.Constraints,
CreatedUtc: DateTimeOffset.UtcNow),
cancellationToken)
.ConfigureAwait(false);
await AppendAuditAsync(keyId, "create-key", null, cancellationToken).ConfigureAwait(false);
return new ApiKeyAdminOutput("create-key", "created", created.Token, []);
return new ApiKeyAdminOutput("create-key", "created", apiKey, []);
}
private async Task<ApiKeyAdminOutput> ListKeysAsync(CancellationToken cancellationToken)
{
await commands.InitDbAsync(remoteAddress: null, cancellationToken).ConfigureAwait(false);
IReadOnlyList<ApiKeyListItem> keys = await commands.ListKeysAsync(cancellationToken).ConfigureAwait(false);
await migrator.MigrateAsync(cancellationToken).ConfigureAwait(false);
IReadOnlyList<ApiKeyRecord> keys = await adminStore.ListAsync(cancellationToken).ConfigureAwait(false);
await AppendAuditAsync(null, "list-keys", null, cancellationToken).ConfigureAwait(false);
return new ApiKeyAdminOutput(
"list-keys",
@@ -89,28 +93,35 @@ public sealed class ApiKeyAdminCliRunner(ApiKeyAdminCommands commands)
ApiKeyAdminCommand command,
CancellationToken cancellationToken)
{
await commands.InitDbAsync(remoteAddress: null, cancellationToken).ConfigureAwait(false);
await migrator.MigrateAsync(cancellationToken).ConfigureAwait(false);
string keyId = Required(command.KeyId);
KeyActionResult result = await commands.RevokeKeyAsync(keyId, remoteAddress: null, cancellationToken)
bool revoked = await adminStore.RevokeAsync(keyId, DateTimeOffset.UtcNow, cancellationToken)
.ConfigureAwait(false);
return new ApiKeyAdminOutput("revoke-key", result.Succeeded ? "revoked" : "not-found-or-already-revoked", null, []);
await AppendAuditAsync(keyId, "revoke-key", revoked ? "revoked" : "not-found-or-already-revoked", cancellationToken)
.ConfigureAwait(false);
return new ApiKeyAdminOutput("revoke-key", revoked ? "revoked" : "not-found-or-already-revoked", null, []);
}
private async Task<ApiKeyAdminOutput> RotateKeyAsync(
ApiKeyAdminCommand command,
CancellationToken cancellationToken)
{
await commands.InitDbAsync(remoteAddress: null, cancellationToken).ConfigureAwait(false);
await migrator.MigrateAsync(cancellationToken).ConfigureAwait(false);
string keyId = Required(command.KeyId);
CreateKeyResult rotated = await commands.RotateKeyAsync(keyId, remoteAddress: null, cancellationToken)
string secret = ApiKeySecretGenerator.Generate();
string apiKey = FormatApiKey(keyId, secret);
bool rotated = await adminStore.RotateAsync(keyId, hasher.HashSecret(secret), DateTimeOffset.UtcNow, cancellationToken)
.ConfigureAwait(false);
bool succeeded = rotated.Token is not null;
await AppendAuditAsync(keyId, "rotate-key", rotated ? "rotated" : "not-found", cancellationToken)
.ConfigureAwait(false);
return new ApiKeyAdminOutput("rotate-key", succeeded ? "rotated" : "not-found", rotated.Token, []);
return new ApiKeyAdminOutput("rotate-key", rotated ? "rotated" : "not-found", rotated ? apiKey : null, []);
}
private static async Task WriteOutputAsync(
@@ -139,19 +150,40 @@ public sealed class ApiKeyAdminCliRunner(ApiKeyAdminCommands commands)
}
}
private static ApiKeyAdminListedKey ToListedKey(ApiKeyListItem key)
private async Task AppendAuditAsync(
string? keyId,
string eventType,
string? details,
CancellationToken cancellationToken)
{
await auditStore.AppendAsync(
new ApiKeyAuditEntry(
KeyId: keyId,
EventType: eventType,
RemoteAddress: null,
Details: details),
cancellationToken)
.ConfigureAwait(false);
}
private static ApiKeyAdminListedKey ToListedKey(ApiKeyRecord key)
{
return new ApiKeyAdminListedKey(
KeyId: key.KeyId,
KeyPrefix: key.KeyPrefix,
DisplayName: key.DisplayName,
Scopes: key.Scopes,
Constraints: ApiKeyConstraintSerializer.Deserialize(key.ConstraintsJson),
Constraints: key.Constraints,
CreatedUtc: key.CreatedUtc,
LastUsedUtc: key.LastUsedUtc,
RevokedUtc: key.RevokedUtc);
}
private static string FormatApiKey(string keyId, string secret)
{
return $"mxgw_{keyId}_{secret}";
}
private static string Required(string? value)
{
return value ?? throw new InvalidOperationException("Required command value was not provided.");
@@ -0,0 +1,7 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public sealed record ApiKeyAuditEntry(
string? KeyId,
string EventType,
string? RemoteAddress,
string? Details);
@@ -0,0 +1,9 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public sealed record ApiKeyAuditRecord(
long AuditId,
string? KeyId,
string EventType,
string? RemoteAddress,
DateTimeOffset CreatedUtc,
string? Details);
@@ -0,0 +1,10 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public sealed record ApiKeyCreateRequest(
string KeyId,
string KeyPrefix,
byte[] SecretHash,
string DisplayName,
IReadOnlySet<string> Scopes,
ApiKeyConstraints Constraints,
DateTimeOffset CreatedUtc);
@@ -0,0 +1,49 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public sealed class ApiKeyParser : IApiKeyParser
{
private const string BearerPrefix = "Bearer ";
private const string TokenPrefix = "mxgw_";
/// <summary>Attempts to parse a Bearer token from an Authorization header and extract the API key ID and secret.</summary>
/// <param name="authorizationHeader">Authorization header value to parse.</param>
/// <param name="apiKey">Parsed API key with ID and secret, or null if parsing failed.</param>
/// <returns>True if the header was successfully parsed; otherwise, false.</returns>
public bool TryParseAuthorizationHeader(string? authorizationHeader, out ParsedApiKey? apiKey)
{
apiKey = null;
if (string.IsNullOrWhiteSpace(authorizationHeader)
|| !authorizationHeader.StartsWith(BearerPrefix, StringComparison.OrdinalIgnoreCase))
{
return false;
}
string token = authorizationHeader[BearerPrefix.Length..].Trim();
if (!token.StartsWith(TokenPrefix, StringComparison.OrdinalIgnoreCase))
{
return false;
}
string keyPayload = token[TokenPrefix.Length..];
int separatorIndex = keyPayload.IndexOf('_', StringComparison.Ordinal);
if (separatorIndex <= 0 || separatorIndex == keyPayload.Length - 1)
{
return false;
}
string keyId = keyPayload[..separatorIndex];
string secret = keyPayload[(separatorIndex + 1)..];
if (string.IsNullOrWhiteSpace(keyId) || string.IsNullOrWhiteSpace(secret))
{
return false;
}
apiKey = new ParsedApiKey(keyId, secret);
return true;
}
}
@@ -0,0 +1,4 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public sealed class ApiKeyPepperUnavailableException(string pepperSecretName)
: InvalidOperationException($"API key pepper secret '{pepperSecretName}' is not configured.");
@@ -0,0 +1,12 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public sealed record ApiKeyRecord(
string KeyId,
string KeyPrefix,
byte[] SecretHash,
string DisplayName,
IReadOnlySet<string> Scopes,
ApiKeyConstraints Constraints,
DateTimeOffset CreatedUtc,
DateTimeOffset? LastUsedUtc,
DateTimeOffset? RevokedUtc);
@@ -0,0 +1,31 @@
using Microsoft.Data.Sqlite;
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
/// <summary>Reads API key records from SQLite query results.</summary>
public static class ApiKeyRecordReader
{
/// <summary>Deserializes a row from the API key table into an ApiKeyRecord.</summary>
/// <param name="reader">The data reader positioned at the API key row.</param>
/// <returns>The deserialized API key record.</returns>
public static ApiKeyRecord Read(SqliteDataReader reader)
{
return new ApiKeyRecord(
KeyId: reader.GetString(0),
KeyPrefix: reader.GetString(1),
SecretHash: (byte[])reader["secret_hash"],
DisplayName: reader.GetString(3),
Scopes: ApiKeyScopeSerializer.Deserialize(reader.GetString(4)),
Constraints: ApiKeyConstraintSerializer.Deserialize(reader.IsDBNull(5) ? null : reader.GetString(5)),
CreatedUtc: DateTimeOffset.Parse(reader.GetString(6), System.Globalization.CultureInfo.InvariantCulture),
LastUsedUtc: ReadNullableDateTimeOffset(reader, 7),
RevokedUtc: ReadNullableDateTimeOffset(reader, 8));
}
private static DateTimeOffset? ReadNullableDateTimeOffset(SqliteDataReader reader, int ordinal)
{
return reader.IsDBNull(ordinal)
? null
: DateTimeOffset.Parse(reader.GetString(ordinal), System.Globalization.CultureInfo.InvariantCulture);
}
}
@@ -0,0 +1,29 @@
using System.Text.Json;
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public static class ApiKeyScopeSerializer
{
/// <summary>Serializes scopes to JSON string.</summary>
/// <param name="scopes">The scopes to serialize.</param>
/// <returns>JSON string representation.</returns>
public static string Serialize(IReadOnlySet<string> scopes)
{
return JsonSerializer.Serialize(scopes.Order(StringComparer.Ordinal));
}
/// <summary>Deserializes scopes from JSON string.</summary>
/// <param name="value">The JSON string to deserialize.</param>
/// <returns>Deserialized scopes set.</returns>
public static IReadOnlySet<string> Deserialize(string value)
{
if (string.IsNullOrWhiteSpace(value))
{
return new HashSet<string>(StringComparer.Ordinal);
}
string[]? scopes = JsonSerializer.Deserialize<string[]>(value);
return new HashSet<string>(scopes ?? [], StringComparer.Ordinal);
}
}
@@ -0,0 +1,19 @@
using System.Security.Cryptography;
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
/// <summary>Generates cryptographically secure API key secrets.</summary>
public static class ApiKeySecretGenerator
{
/// <summary>Generates a new random API key secret string.</summary>
public static string Generate()
{
Span<byte> bytes = stackalloc byte[32];
RandomNumberGenerator.Fill(bytes);
return Convert.ToBase64String(bytes)
.TrimEnd('=')
.Replace('+', '-')
.Replace('/', '_');
}
}
@@ -0,0 +1,38 @@
using System.Security.Cryptography;
using System.Text;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.MxGateway.Server.Configuration;
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public sealed class ApiKeySecretHasher(
IConfiguration configuration,
IOptions<GatewayOptions> options) : IApiKeySecretHasher
{
/// <summary>Hashes an API key secret with pepper using HMAC-SHA256.</summary>
/// <param name="secret">The secret to hash.</param>
/// <returns>The hashed secret.</returns>
public byte[] HashSecret(string secret)
{
string pepper = GetPepper();
byte[] pepperBytes = Encoding.UTF8.GetBytes(pepper);
byte[] secretBytes = Encoding.UTF8.GetBytes(secret);
using HMACSHA256 hmac = new(pepperBytes);
return hmac.ComputeHash(secretBytes);
}
private string GetPepper()
{
string pepperSecretName = options.Value.Authentication.PepperSecretName;
string? pepper = configuration[pepperSecretName];
if (string.IsNullOrWhiteSpace(pepper))
{
throw new ApiKeyPepperUnavailableException(pepperSecretName);
}
return pepper;
}
}
@@ -0,0 +1,11 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public enum ApiKeyVerificationFailure
{
None,
MissingOrMalformedCredentials,
PepperUnavailable,
KeyNotFound,
KeyRevoked,
SecretMismatch
}
@@ -0,0 +1,33 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public sealed record ApiKeyVerificationResult(
bool Succeeded,
ApiKeyIdentity? Identity,
ApiKeyVerificationFailure Failure)
{
/// <summary>
/// Creates a successful verification result.
/// </summary>
/// <param name="identity">API key identity.</param>
/// <returns>Success result.</returns>
public static ApiKeyVerificationResult Success(ApiKeyIdentity identity)
{
return new ApiKeyVerificationResult(
Succeeded: true,
Identity: identity,
Failure: ApiKeyVerificationFailure.None);
}
/// <summary>
/// Creates a failed verification result.
/// </summary>
/// <param name="failure">Verification failure reason.</param>
/// <returns>Failure result.</returns>
public static ApiKeyVerificationResult Fail(ApiKeyVerificationFailure failure)
{
return new ApiKeyVerificationResult(
Succeeded: false,
Identity: null,
Failure: failure);
}
}
@@ -0,0 +1,64 @@
using System.Security.Cryptography;
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public sealed class ApiKeyVerifier(
IApiKeyParser parser,
IApiKeySecretHasher hasher,
IApiKeyStore keyStore) : IApiKeyVerifier
{
/// <summary>
/// Verifies an API key from an authorization header asynchronously.
/// </summary>
/// <param name="authorizationHeader">Authorization header value.</param>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>Verification result.</returns>
public async Task<ApiKeyVerificationResult> VerifyAsync(
string? authorizationHeader,
CancellationToken cancellationToken)
{
if (!parser.TryParseAuthorizationHeader(authorizationHeader, out ParsedApiKey? parsedKey)
|| parsedKey is null)
{
return ApiKeyVerificationResult.Fail(ApiKeyVerificationFailure.MissingOrMalformedCredentials);
}
ApiKeyRecord? storedKey = await keyStore.FindByKeyIdAsync(parsedKey.KeyId, cancellationToken)
.ConfigureAwait(false);
if (storedKey is null)
{
return ApiKeyVerificationResult.Fail(ApiKeyVerificationFailure.KeyNotFound);
}
if (storedKey.RevokedUtc is not null)
{
return ApiKeyVerificationResult.Fail(ApiKeyVerificationFailure.KeyRevoked);
}
byte[] presentedHash;
try
{
presentedHash = hasher.HashSecret(parsedKey.Secret);
}
catch (ApiKeyPepperUnavailableException)
{
return ApiKeyVerificationResult.Fail(ApiKeyVerificationFailure.PepperUnavailable);
}
if (!CryptographicOperations.FixedTimeEquals(presentedHash, storedKey.SecretHash))
{
return ApiKeyVerificationResult.Fail(ApiKeyVerificationFailure.SecretMismatch);
}
await keyStore.MarkKeyUsedAsync(storedKey.KeyId, DateTimeOffset.UtcNow, cancellationToken)
.ConfigureAwait(false);
return ApiKeyVerificationResult.Success(new ApiKeyIdentity(
KeyId: storedKey.KeyId,
KeyPrefix: storedKey.KeyPrefix,
DisplayName: storedKey.DisplayName,
Scopes: storedKey.Scopes,
Constraints: storedKey.Constraints));
}
}
@@ -0,0 +1,80 @@
using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.MxGateway.Server.Configuration;
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
/// <summary>
/// Factory for creating SQLite connections to the authentication store.
/// </summary>
public sealed class AuthSqliteConnectionFactory(IOptions<GatewayOptions> options)
{
/// <summary>
/// Busy timeout applied to every auth-store connection. SQLite retries a busy
/// database for this long before surfacing <c>SQLITE_BUSY</c>, so the concurrent
/// <c>MarkKeyUsedAsync</c> / audit-append writers degrade gracefully under load
/// instead of failing the request path.
/// </summary>
private static readonly TimeSpan BusyTimeout = TimeSpan.FromSeconds(5);
/// <summary>
/// Creates an unopened SQLite connection to the auth database. Prefer
/// <see cref="OpenConnectionAsync"/>, which also applies WAL journaling and the
/// busy timeout.
/// </summary>
public SqliteConnection CreateConnection()
{
string sqlitePath = options.Value.Authentication.SqlitePath;
string? directory = Path.GetDirectoryName(sqlitePath);
if (!string.IsNullOrWhiteSpace(directory))
{
Directory.CreateDirectory(directory);
}
SqliteConnectionStringBuilder builder = new()
{
DataSource = sqlitePath,
Mode = SqliteOpenMode.ReadWriteCreate,
Pooling = true,
DefaultTimeout = (int)BusyTimeout.TotalSeconds,
};
return new SqliteConnection(builder.ToString());
}
/// <summary>
/// Creates a SQLite connection, opens it, and configures WAL journaling and a
/// non-zero busy timeout so concurrent readers and writers degrade gracefully
/// rather than surfacing <c>SQLITE_BUSY</c> as a hard failure.
/// </summary>
/// <param name="cancellationToken">Cancellation token for the operation.</param>
/// <returns>An opened and configured SQLite connection.</returns>
public async Task<SqliteConnection> OpenConnectionAsync(CancellationToken cancellationToken)
{
SqliteConnection connection = CreateConnection();
try
{
await connection.OpenAsync(cancellationToken).ConfigureAwait(false);
await ConfigureConnectionAsync(connection, cancellationToken).ConfigureAwait(false);
return connection;
}
catch
{
await connection.DisposeAsync().ConfigureAwait(false);
throw;
}
}
private static async Task ConfigureConnectionAsync(
SqliteConnection connection,
CancellationToken cancellationToken)
{
// WAL is a persistent, database-level setting; re-applying it per connection
// is cheap and a no-op once set. busy_timeout is per-connection state.
await using SqliteCommand command = connection.CreateCommand();
command.CommandText =
$"PRAGMA journal_mode=WAL; PRAGMA busy_timeout={(int)BusyTimeout.TotalMilliseconds};";
await command.ExecuteNonQueryAsync(cancellationToken).ConfigureAwait(false);
}
}
@@ -0,0 +1,3 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public sealed class AuthStoreMigrationException(string message) : InvalidOperationException(message);
@@ -0,0 +1,29 @@
using Microsoft.Extensions.Options;
using ZB.MOM.WW.MxGateway.Server.Configuration;
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
/// <summary>
/// Hosted service that runs authentication store migrations on startup.
/// </summary>
public sealed class AuthStoreMigrationHostedService(
IOptions<GatewayOptions> options,
IAuthStoreMigrator migrator) : IHostedService
{
/// <inheritdoc />
public async Task StartAsync(CancellationToken cancellationToken)
{
AuthenticationOptions authentication = options.Value.Authentication;
if (authentication.Mode == AuthenticationMode.ApiKey && authentication.RunMigrationsOnStartup)
{
await migrator.MigrateAsync(cancellationToken).ConfigureAwait(false);
}
}
/// <inheritdoc />
public Task StopAsync(CancellationToken cancellationToken)
{
return Task.CompletedTask;
}
}
@@ -1,109 +1,27 @@
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Options;
using ZB.MOM.WW.Audit;
using ZB.MOM.WW.Auth.Abstractions.ApiKeys;
using ZB.MOM.WW.Auth.ApiKeys;
using ZB.MOM.WW.Auth.ApiKeys.Admin;
using ZB.MOM.WW.Auth.ApiKeys.DependencyInjection;
using ZB.MOM.WW.Auth.ApiKeys.Sqlite;
using ZB.MOM.WW.MxGateway.Server.Security.Audit;
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
/// <summary>
/// Extension methods for configuring the SQLite authentication store.
/// </summary>
/// <remarks>
/// The peppered-HMAC API-key pipeline (token format, hashing, constant-time compare, SQLite
/// schema, stores, verifier and migration) is provided by the shared
/// <c>ZB.MOM.WW.Auth.ApiKeys</c> library, of which this gateway is the donor. This wiring binds
/// the library's <see cref="ApiKeyOptions"/> from the gateway's <c>MxGateway:Authentication</c>
/// section and layers the gateway-specific constraint enforcement, gRPC interceptor, CLI and
/// dashboard on top.
/// </remarks>
public static class AuthStoreServiceCollectionExtensions
{
/// <summary>The configuration section the gateway binds API-key options from.</summary>
public const string AuthenticationSectionPath = "MxGateway:Authentication";
/// <summary>The gateway API-key token prefix (token format <c>mxgw_&lt;id&gt;_&lt;secret&gt;</c>).</summary>
public const string TokenPrefix = "mxgw";
/// <summary>The configuration key the API-key pepper is resolved from.</summary>
public const string PepperSecretName = "MxGateway:ApiKeyPepper";
/// <summary>
/// Adds the SQLite authentication store and related services to the dependency container.
/// </summary>
/// <param name="services">Service collection to configure.</param>
/// <param name="configuration">Application configuration carrying the API-key options.</param>
/// <returns>The service collection for chaining.</returns>
public static IServiceCollection AddSqliteAuthStore(
this IServiceCollection services,
IConfiguration configuration)
public static IServiceCollection AddSqliteAuthStore(this IServiceCollection services)
{
ArgumentNullException.ThrowIfNull(services);
ArgumentNullException.ThrowIfNull(configuration);
// Pin the gateway's API-key contract (token prefix "mxgw"; pepper resolved from
// MxGateway:ApiKeyPepper) by layering fallback defaults UNDER the supplied configuration:
// an in-memory source provides TokenPrefix/PepperSecretName only when the bound
// MxGateway:Authentication section omits them (the section has no TokenPrefix, and the pepper
// is intentionally not in appsettings — it is supplied at runtime). Explicit config wins
// because it is added last. ApiKeyOptions is an init-only record, so the values must be
// present at bind time rather than mutated post-configure.
IConfiguration effectiveConfig = new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string?>
{
[$"{AuthenticationSectionPath}:TokenPrefix"] = TokenPrefix,
[$"{AuthenticationSectionPath}:PepperSecretName"] = PepperSecretName,
})
.AddConfiguration(configuration)
.Build();
// Register the shared API-key provider: binds ApiKeyOptions from MxGateway:Authentication,
// wires up the SQLite stores, the configuration-backed pepper provider, the verifier, the
// migrator and the migration hosted service.
services.AddZbApiKeyAuth(effectiveConfig, AuthenticationSectionPath);
// Canonical audit (Task 2.3 — DEEP adopt ZB.MOM.WW.Audit). All MxGateway audit flows as a
// canonical AuditEvent through the library IAuditWriter, persisted in a NEW gateway-owned
// audit_event table that lives in the SAME SQLite DB file as the api-key stores (it reuses
// the library's AuthSqliteConnectionFactory, registered by AddZbApiKeyAuth above).
services.AddSingleton(sp =>
new SqliteCanonicalAuditStore(sp.GetRequiredService<AuthSqliteConnectionFactory>()));
// Resolve the logger defensively: the production host always registers ILogger<T>, but the
// DI-only auth/CLI/dashboard unit tests build a bare ServiceCollection without AddLogging().
// Fall back to NullLogger there so the audit writer (and the IApiKeyAuditStore override that
// depends on it) still resolve. The write path is best-effort regardless.
services.AddSingleton<IAuditWriter>(sp =>
new CanonicalAuditWriter(
sp.GetRequiredService<SqliteCanonicalAuditStore>(),
sp.GetService<ILogger<CanonicalAuditWriter>>()
?? Microsoft.Extensions.Logging.Abstractions.NullLogger<CanonicalAuditWriter>.Instance));
// OVERRIDE the library's IApiKeyAuditStore (AddZbApiKeyAuth registered the library's
// SqliteApiKeyAuditStore via TryAddSingleton) with an adapter that canonicalizes every
// library-emitted ApiKeyAuditEntry onto AuditEvent and forwards it through IAuditWriter.
// This is the only way to canonicalize the library-internal ApiKeyAdminCommands verbs
// (create/revoke/rotate/init-db, etc.), which we cannot edit. The adapter is registered
// after AddZbApiKeyAuth so it (last registration) is what ApiKeyAdminCommands resolves and
// what the dashboard "recent audit" view reads via IApiKeyAuditStore.ListRecentAsync.
// The library's api_key_audit table is left in place but is now UNUSED — nothing writes to
// it once this adapter replaces the library's audit store.
services.AddSingleton<IApiKeyAuditStore, CanonicalForwardingApiKeyAuditStore>();
// The shared admin command set (create/revoke/rotate/list/init-db with audit) is not
// auto-registered by AddZbApiKeyAuth; the gateway CLI and dashboard drive it, so register
// it here over the already-wired stores, pepper provider and migrator.
services.AddSingleton(sp => new ApiKeyAdminCommands(
sp.GetRequiredService<IOptions<ApiKeyOptions>>().Value,
sp.GetRequiredService<IApiKeyAdminStore>(),
sp.GetRequiredService<IApiKeyAuditStore>(),
sp.GetRequiredService<IApiKeyPepperProvider>(),
sp.GetRequiredService<SqliteAuthStoreMigrator>()));
services.AddSingleton<IApiKeyParser, ApiKeyParser>();
services.AddSingleton<IApiKeySecretHasher, ApiKeySecretHasher>();
services.AddSingleton<IApiKeyVerifier, ApiKeyVerifier>();
services.AddSingleton<ApiKeyAdminCliRunner>();
services.AddSingleton<AuthSqliteConnectionFactory>();
services.AddSingleton<IAuthStoreMigrator, SqliteAuthStoreMigrator>();
services.AddSingleton<IApiKeyStore, SqliteApiKeyStore>();
services.AddSingleton<IApiKeyAdminStore, SqliteApiKeyAdminStore>();
services.AddSingleton<IApiKeyAuditStore, SqliteApiKeyAuditStore>();
services.AddHostedService<AuthStoreMigrationHostedService>();
return services;
}
@@ -1,43 +0,0 @@
using LibApiKeyIdentity = ZB.MOM.WW.Auth.Abstractions.ApiKeys.ApiKeyIdentity;
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
/// <summary>
/// Maps the shared <see cref="ZB.MOM.WW.Auth.Abstractions.ApiKeys.ApiKeyIdentity"/> (which
/// carries the key's scopes plus the opaque constraints JSON blob) onto the gateway's
/// <see cref="ApiKeyIdentity"/> (which exposes the deserialized
/// <see cref="ApiKeyConstraints"/> the downstream authorization code enforces).
/// </summary>
/// <remarks>
/// The shared verifier does not interpret the constraints column; it returns the stored
/// JSON verbatim in <see cref="ZB.MOM.WW.Auth.Abstractions.ApiKeys.ApiKeyIdentity.Constraints"/>.
/// This mapper re-hydrates it via <see cref="ApiKeyConstraintSerializer"/> so the gateway's
/// constraint enforcement (<c>ConstraintEnforcer</c>) and request-identity accessor continue
/// to operate on the strongly-typed model unchanged.
/// </remarks>
public static class GatewayApiKeyIdentityMapper
{
/// <summary>
/// Converts a shared API-key identity into the gateway identity, deserializing the opaque
/// constraints JSON into <see cref="ApiKeyConstraints"/>.
/// </summary>
/// <param name="identity">The shared identity returned by the library verifier.</param>
/// <returns>The gateway identity carrying the effective constraints.</returns>
public static ApiKeyIdentity ToGatewayIdentity(LibApiKeyIdentity identity)
{
ArgumentNullException.ThrowIfNull(identity);
// The library stores the opaque constraints blob in Constraints as the ConstraintsJson
// string (or null when the key has no constraints).
string? constraintsJson = identity.Constraints as string;
return new ApiKeyIdentity(
KeyId: identity.KeyId,
// The gateway token prefix is fixed ("mxgw"); the key id is its own field. KeyPrefix
// is retained only for surface compatibility with the gateway identity record.
KeyPrefix: "mxgw",
DisplayName: identity.DisplayName,
Scopes: identity.Scopes,
Constraints: ApiKeyConstraintSerializer.Deserialize(constraintsJson));
}
}
@@ -0,0 +1,53 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
public interface IApiKeyAdminStore
{
/// <summary>
/// Creates a new API key asynchronously.
/// </summary>
/// <param name="request">API key creation request.</param>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>Completed task.</returns>
Task CreateAsync(ApiKeyCreateRequest request, CancellationToken cancellationToken);
/// <summary>
/// Lists all API keys asynchronously.
/// </summary>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>List of API key records.</returns>
Task<IReadOnlyList<ApiKeyRecord>> ListAsync(CancellationToken cancellationToken);
/// <summary>
/// Revokes an API key asynchronously.
/// </summary>
/// <param name="keyId">Key identifier.</param>
/// <param name="revokedUtc">Revocation timestamp.</param>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>True if revoked; otherwise false.</returns>
Task<bool> RevokeAsync(string keyId, DateTimeOffset revokedUtc, CancellationToken cancellationToken);
/// <summary>
/// Rotates an API key secret asynchronously.
/// </summary>
/// <param name="keyId">Key identifier.</param>
/// <param name="secretHash">New secret hash.</param>
/// <param name="rotatedUtc">Rotation timestamp.</param>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>True if rotated; otherwise false.</returns>
Task<bool> RotateAsync(
string keyId,
byte[] secretHash,
DateTimeOffset rotatedUtc,
CancellationToken cancellationToken);
/// <summary>
/// Permanently deletes an API key, but only if it is already revoked. Active keys are
/// untouched (returns false) so an admin cannot delete a working credential without
/// first revoking it — that preserves the audit trail and forces the revoke event to
/// land in the audit log before the row disappears.
/// </summary>
/// <param name="keyId">Key identifier.</param>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>True if a revoked key was deleted; false if the key is missing or active.</returns>
Task<bool> DeleteAsync(string keyId, CancellationToken cancellationToken);
}
@@ -0,0 +1,23 @@
namespace ZB.MOM.WW.MxGateway.Server.Security.Authentication;
/// <summary>
/// Stores and retrieves audit events for API key operations.
/// </summary>
public interface IApiKeyAuditStore
{
/// <summary>
/// Appends an audit entry to the audit log.
/// </summary>
/// <param name="entry">Audit entry to record.</param>
/// <param name="cancellationToken">Token to cancel the asynchronous operation.</param>
/// <returns>Asynchronous task representing the append operation.</returns>
Task AppendAsync(ApiKeyAuditEntry entry, CancellationToken cancellationToken);
/// <summary>
/// Lists the most recent audit entries, up to the specified count.
/// </summary>
/// <param name="count">Maximum number of entries to return.</param>
/// <param name="cancellationToken">Token to cancel the asynchronous operation.</param>
/// <returns>Asynchronous task returning the list of audit records.</returns>
Task<IReadOnlyList<ApiKeyAuditRecord>> ListRecentAsync(int count, CancellationToken cancellationToken);
}

Some files were not shown because too many files have changed in this diff Show More