fix(probe): Galaxy gRPC ping — drop invalid Retry, treat MxGatewayAuth exceptions as reachable (live /run)

Two bugs caught by live verification against the mxaccessgw at 10.100.0.48:5120:
- MaxAttempts=1 mapped to 0 Polly retries, which Polly rejects as an invalid
  RetryStrategyOptions -> the probe failed on every real gateway. Removed the Retry
  override (matches GalaxyDriver); fail-fast is already guaranteed by the TCP preflight +
  the per-call deadline.
- A rejected key surfaces as a typed MxGatewayAuthenticationException, not a raw
  RpcException, so the 'auth-rejection = reachable' branch was bypassed. Catch the typed
  authentication/authorization exceptions -> Ok=true.
Adds DriverProbeHandshakeE2eTests: direct-probe, skip-gated cross-protocol green/red
discrimination (Modbus, OpcUaClient, Galaxy + a local real OPC UA server).
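
The "auth rejection = reachable" rule can be sketched in isolation. This is a minimal illustration, not the probe's actual code: `StubAuthenticationException`, `StubAuthorizationException`, `ProbeOutcome`, and `ProbeSketch.Classify` are hypothetical stand-ins, and the real mxaccessgw client types are not used here.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-ins for the typed exceptions thrown by the gateway client.
public sealed class StubAuthenticationException : Exception { }
public sealed class StubAuthorizationException : Exception { }

// Hypothetical result shape; the real probe's result type is not shown in this commit.
public sealed record ProbeOutcome(bool Ok, string Message, TimeSpan? Latency);

public static class ProbeSketch
{
    // Runs one ping and maps the outcome: success and auth rejection both mean
    // "a live server answered"; any other exception is reported as unreachable.
    public static ProbeOutcome Classify(Action ping)
    {
        var sw = Stopwatch.StartNew();
        try
        {
            ping();
            sw.Stop();
            return new ProbeOutcome(true, "gateway reachable", sw.Elapsed);
        }
        catch (Exception ex) when (ex is StubAuthenticationException or StubAuthorizationException)
        {
            // A rejection is still an answer from a live server.
            sw.Stop();
            return new ProbeOutcome(true, "gateway reachable (auth not checked)", sw.Elapsed);
        }
        catch (Exception ex)
        {
            sw.Stop();
            return new ProbeOutcome(false, $"gateway unreachable: {ex.Message}", null);
        }
    }
}
```

The exception filter keeps the typed auth cases ahead of the generic handler, so a rejected key never falls through to the failure branch.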
Joseph Doherty
2026-06-16 07:32:59 -04:00
parent af280af842
commit 1164d423b6
3 changed files with 205 additions and 4 deletions
@@ -101,6 +101,15 @@ public sealed class GalaxyDriverProbe : IDriverProbe
var (ok, message) = ClassifyRpc(ex.StatusCode, host, port);
return new(ok, message, ok ? sw.Elapsed : null);
}
+catch (Exception ex) when (ex is MxGatewayAuthenticationException or MxGatewayAuthorizationException)
+{
+// The gateway checked our call's credentials and rejected the (unresolved /
+// placeholder) key; the mxaccessgw client surfaces this as a typed exception, NOT a
+// raw RpcException. It still PROVES a live gateway gRPC server answered, so auth
+// rejection counts as reachable (the probe never resolves the real secret).
+sw.Stop();
+return new(true, "gateway reachable & speaking gRPC (auth not checked)", sw.Elapsed);
+}
catch (OperationCanceledException) when (ct.IsCancellationRequested)
{
// The caller cancelled (their own timeout / shutdown) — surface a timeout message.
@@ -169,9 +178,11 @@ public sealed class GalaxyDriverProbe : IDriverProbe
CaCertificatePath = gw.CaCertificatePath,
ConnectTimeout = budget,
DefaultCallTimeout = budget,
-// One shot: the probe must not spin on transient (Unavailable/DeadlineExceeded)
-// retries; the linked deadline above bounds the whole call regardless.
-Retry = new MxGatewayClientRetryOptions { MaxAttempts = 1 },
+// Leave Retry at the client default (as GalaxyDriver does): an explicit
+// MaxAttempts=1 maps to 0 Polly retries, which Polly rejects as an invalid
+// RetryStrategyOptions. Fast-fail is already guaranteed: the TCP preflight rejects
+// unreachable hosts before the gRPC call, and the linked deadline caps the call to
+// the probe budget regardless of retries.
};
}
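
The attempts-to-retries mapping described in the comment above can be illustrated with a small guard. This is a sketch under stated assumptions, not the mxaccessgw client's implementation: `RetryMappingSketch.ToRetryCount` is a hypothetical helper, and it assumes the client derives the retry count as `MaxAttempts - 1`, which is what the commit message implies.

```csharp
using System;

public static class RetryMappingSketch
{
    // Total attempts include the first call, so retries = attempts - 1.
    // Returns null when no retry strategy should be configured at all,
    // because a retry count of 0 is what the retry options reject.
    public static int? ToRetryCount(int maxAttempts)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(
                nameof(maxAttempts), "At least one attempt is required.");
        }

        int retries = maxAttempts - 1;
        return retries >= 1 ? retries : null;
    }
}
```

A caller would add the retry strategy only when the result is non-null, which makes a single-attempt configuration valid instead of an error. This commit takes the simpler route of leaving Retry at the client default.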