Code-review 2026-05-20 sweep: re-review at 1cd51bb, resolve 72 findings across all 11 modules

Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).

Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
  GatewayGrpcScopeResolver so non-admin keys can use them; document
  the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
  CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
  in generated tonic code by reformatting the ReadBulkCommand proto
  comment and scoping a #![allow(...)] to the generated submodules.

Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
  make DisposeAsync race-safe against in-flight CloseAsync (-016);
  add constraint-enforcement test coverage for the bulk-plan path
  (-021).
- Worker: introduce StaRuntimeShutdownException so RunAlarmPollLoop
  can distinguish graceful shutdown from a real STA-affinity
  violation (-016); have the watchdog skip StaHung while
  CurrentCommandCorrelationId is non-empty so a legitimate slow
  ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
  11 GatewaySession bulk methods (-013); replace the real TCP probe
  in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
  (-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
  test and assert OnWriteComplete (-012); add live tests for
  Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
  abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
  CreateForTesting factory (-016); cover WorkerCancel and
  unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
  beforeStart() (-014); return a CancellingCompletableFuture that
  actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
  the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
  histograms with failed-call durations (-015); add coverage for
  the five MalformedReply paths, the bulk-write helpers, the
  Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
  command family (-009).

Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
  WorkerAlarmRpcDispatcher missing-session handling; drop the
  duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
  XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
  subscriptionExpression / ExecutingCommand arms; preserve
  factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
  three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
  FakeWorkerProcess.WaitForExitAsync against a real
  TaskCompletionSource; switch the heartbeat-expires test to
  ManualTimeProvider; add InvariantCulture to the remaining
  DateTimeOffset.Parse sites; document GalaxyFilterInputSafetyTests
  in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
  IDisposable, class-level [Trait], single-source ZB default
  connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
  so absent env vars SKIP not pass; PascalCase rename of probe
  [Fact]s; deterministic deadline test; new frame-protocol error
  tests; ComputeTransitions diff-coverage; relocate dev-rig probes
  to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
  Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
  TreatWarningsAsErrors / analysers apply; document
  DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
  bulk-read handles in CLI; surface AcknowledgeAlarm transport
  faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
  runWriteBulkVariant; document the six new subcommands in
  writeUsage; drain galaxy-watch events on limit; switch io.EOF
  comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
  option; regex-based credential redaction; Long.toUnsignedString
  for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
  _percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
  _api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
  stop hard-coding correlation IDs; resync RustClientDesign.md
  with the current Session / Error surface and CLI subcommand set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-20 09:46:47 -04:00
parent 1cd51bbda3
commit a0203503a7
122 changed files with 8723 additions and 757 deletions
@@ -0,0 +1,17 @@
+<Project>
+<!--
+Mirrors src/Directory.Build.props for the .NET client projects under
+clients/dotnet/ so they share the same enforcement floor (warnings-as-
+errors, latest analyzers, code-style enforcement, deterministic builds)
+even though they live outside src/.
+-->
+<PropertyGroup>
+<LangVersion>latest</LangVersion>
+<Nullable>enable</Nullable>
+<ImplicitUsings>enable</ImplicitUsings>
+<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
+<AnalysisLevel>latest</AnalysisLevel>
+<EnforceCodeStyleInBuild>true</EnforceCodeStyleInBuild>
+<Deterministic>true</Deterministic>
+</PropertyGroup>
+</Project>
@@ -3,6 +3,12 @@ using MxGateway.Contracts.Proto.Galaxy;
namespace MxGateway.Client.Cli;
+/// <summary>
+/// Minimal transport surface the CLI talks to. Exposes only the gateway and
+/// Galaxy Repository RPCs the CLI needs so tests can substitute an in-process
+/// fake without standing up a real gRPC channel. The production binding is a
+/// thin adapter over <see cref="MxGatewayClient"/> and <see cref="GalaxyRepositoryClient"/>.
+/// </summary>
public interface IMxGatewayCliClient : IAsyncDisposable
{
/// <summary>
@@ -635,7 +635,7 @@ public static class MxGatewayClientCli
}),
cancellationToken)
.ConfigureAwait(false);
-int serverHandle = registerReply.Register?.ServerHandle ?? registerReply.ReturnValue.Int32Value;
+int serverHandle = RequireRegisterServerHandle(registerReply);
SubscribeBulkCommand subscribe = new() { ServerHandle = serverHandle };
subscribe.TagAddresses.Add(tags);
@@ -893,7 +893,7 @@ public static class MxGatewayClientCli
}),
cancellationToken)
.ConfigureAwait(false);
-int serverHandle = registerReply.Register?.ServerHandle ?? registerReply.ReturnValue.Int32Value;
+int serverHandle = RequireRegisterServerHandle(registerReply);
SubscribeBulkCommand subscribe = new() { ServerHandle = serverHandle };
subscribe.TagAddresses.Add(tags);
@@ -941,11 +941,16 @@ public static class MxGatewayClientCli
continue;
}
-if (firstSteadyEventUtc is null)
+// Guarded by latencyLock so parallel sessions can't tear a 64-bit
+// DateTime? read or stomp an already-set firstSteadyEventUtc with
+// a later timestamp from a slower-to-start session. The lock is
+// already held by the latency append a few lines below, so the
+// extra cost is one uncontended lock acquisition per event.
+lock (latencyLock)
{
-firstSteadyEventUtc = nowUtc;
+firstSteadyEventUtc ??= nowUtc;
+lastSteadyEventUtc = nowUtc;
}
-lastSteadyEventUtc = nowUtc;
Interlocked.Increment(ref steadyEvents);
if (mxEvent.Family == MxEventFamily.OnDataChange)
{
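Review note: the hunk above moves both timestamp writes under `latencyLock`, using `??=` so the first writer wins for the start of the window while the last writer wins for the end. A minimal standalone sketch of that pattern follows. The `SteadyWindow` class and its member names are hypothetical and are not part of the CLI; the CLI keeps these values as locals inside the bench loop.

```csharp
#nullable enable
using System;

// Hypothetical helper (not in the commit): records the first and the most
// recent timestamp observed by any thread.
public sealed class SteadyWindow
{
    private readonly object _gate = new();
    private DateTime? _firstUtc;
    private DateTime? _lastUtc;

    public void Observe(DateTime nowUtc)
    {
        lock (_gate)
        {
            // Only assigns when _firstUtc is still null, so a later caller
            // cannot overwrite the start of the window.
            _firstUtc ??= nowUtc;

            // Always assigned: the most recent caller to take the lock
            // defines the end of the window.
            _lastUtc = nowUtc;
        }
    }

    public (DateTime? FirstUtc, DateTime? LastUtc) Snapshot()
    {
        // Read both values under the same lock so they form a consistent pair.
        lock (_gate)
        {
            return (_firstUtc, _lastUtc);
        }
    }
}
```

Calling `Observe` twice and then `Snapshot` should return the first timestamp as `FirstUtc` and the second as `LastUtc`.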
@@ -1258,7 +1263,7 @@ public static class MxGatewayClientCli
Kind = MxCommandKind.Register,
Register = new RegisterCommand { ClientName = arguments.GetOptional("client-name") ?? "mxgw-dotnet-smoke" },
},
-reply => reply.Register?.ServerHandle ?? reply.ReturnValue.Int32Value,
+RequireRegisterServerHandle,
commandReplies,
cancellationToken)
.ConfigureAwait(false);
@@ -1276,7 +1281,7 @@ public static class MxGatewayClientCli
ItemDefinition = arguments.GetRequired("item"),
},
},
-reply => reply.AddItem?.ItemHandle ?? reply.ReturnValue.Int32Value,
+RequireAddItemItemHandle,
commandReplies,
cancellationToken)
.ConfigureAwait(false);
@@ -1408,6 +1413,41 @@ public static class MxGatewayClientCli
return reply;
}
+/// <summary>
+/// Returns the server handle from a successful <c>register</c> reply, or throws
+/// <see cref="MxGatewayException"/> when the typed <see cref="MxCommandReply.Register"/>
+/// payload is absent. Mirrors the SDK-level <see cref="MxGatewaySession.RegisterAsync"/>
+/// contract: a successful reply without the typed payload is a gateway protocol
+/// error, not a license to fall through to <c>ReturnValue.Int32Value</c> (which is 0
+/// when the reply carries no return value).
+/// </summary>
+private static int RequireRegisterServerHandle(MxCommandReply reply)
+{
+return reply.Register?.ServerHandle
+?? throw CreateMissingPayloadException(reply, "register");
+}
+/// <summary>
+/// Returns the item handle from a successful <c>add_item</c> reply, or throws
+/// <see cref="MxGatewayException"/> when the typed <see cref="MxCommandReply.AddItem"/>
+/// payload is absent. See <see cref="RequireRegisterServerHandle"/> for the rationale.
+/// </summary>
+private static int RequireAddItemItemHandle(MxCommandReply reply)
+{
+return reply.AddItem?.ItemHandle
+?? throw CreateMissingPayloadException(reply, "add_item");
+}
+private static MxGatewayException CreateMissingPayloadException(
+MxCommandReply reply,
+string expectedPayload)
+{
+return new MxGatewayException(
+$"Gateway reply for command kind={reply.Kind} reported success but is missing "
++ $"the required '{expectedPayload}' payload; cannot resolve a handle. "
++ $"session={reply.SessionId}; correlation={reply.CorrelationId}");
+}
private static MxCommandRequest CreateCommandRequest(
string sessionId,
MxCommand command)
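Review note: the `Require*` helpers above replace a `?? ReturnValue.Int32Value` fallback with a `?? throw`. The difference can be sketched in isolation as below. `RegisterPayload`, `Reply`, and `Handles` are hypothetical stand-ins for the generated proto types, and `InvalidOperationException` stands in for `MxGatewayException`; none of these names come from the commit.

```csharp
#nullable enable
using System;

// Hypothetical stand-ins for the generated reply types.
public sealed record RegisterPayload(int ServerHandle);
public sealed record Reply(RegisterPayload? Register, int ReturnValue);

public static class Handles
{
    // Old shape: when the typed payload is absent this silently yields
    // ReturnValue, which is 0 if the reply carried no return value.
    public static int Lenient(Reply reply) =>
        reply.Register?.ServerHandle ?? reply.ReturnValue;

    // New shape: a missing payload surfaces as an error instead of handle 0.
    public static int Strict(Reply reply) =>
        reply.Register?.ServerHandle
        ?? throw new InvalidOperationException(
            "reply reported success but is missing the 'register' payload");
}
```

With a payload-less reply, `Lenient` returns 0 and any later call would target handle 0, while `Strict` fails at the point where the protocol violation is detected.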
@@ -216,7 +216,7 @@ internal sealed class FakeGatewayTransport(MxGatewayClientOptions options) : IMx
AcknowledgeAlarmCalls.Add((request, callOptions));
if (AcknowledgeAlarmExceptions.TryDequeue(out Exception? exception))
{
-throw exception;
+throw Translate(exception, callOptions);
}
return Task.FromResult(_acknowledgeReplies.Count > 0
@@ -73,19 +73,17 @@ public sealed class MxGatewayClientAlarmsTests
}
[Fact]
-public async Task AcknowledgeAlarmAsync_MapsUnauthenticated_RpcException_ToTypedException()
+public async Task AcknowledgeAlarmAsync_SurfacesRpcExceptionFromFakeTransportVerbatim_WhenMappingDisabled()
{
+// Default FakeGatewayTransport.MapTransportExceptions is false, matching the
+// historical pass-through shape: a thrown RpcException reaches the caller as
+// RpcException rather than being mapped to a typed MxGatewayException. This
+// test pins that shape so a future change can't silently flip it.
FakeGatewayTransport transport = CreateTransport();
transport.AcknowledgeAlarmExceptions.Enqueue(
new RpcException(new Status(StatusCode.Unauthenticated, "expired key")));
await using MxGatewayClient client = CreateClient(transport);
-// Note: the FakeGatewayTransport surfaces RpcException directly (it does not run
-// through GrpcMxGatewayClientTransport's mapping); the fake's contract here is to
-// pass the exception verbatim. RpcException → typed exception mapping is covered
-// in the GrpcMxGatewayClientTransport-level tests; the SDK-level test pins the
-// pass-through shape so a future migration to direct mapping won't silently
-// change observable behaviour.
var ex = await Assert.ThrowsAsync<RpcException>(
() => client.AcknowledgeAlarmAsync(new AcknowledgeAlarmRequest
{
@@ -97,6 +95,32 @@ public sealed class MxGatewayClientAlarmsTests
Assert.Equal(StatusCode.Unauthenticated, ex.StatusCode);
}
+[Fact]
+public async Task AcknowledgeAlarmAsync_MapsUnauthenticated_RpcException_ToTypedException()
+{
+// Production parity: GrpcMxGatewayClientTransport.AcknowledgeAlarmAsync runs
+// every thrown RpcException through RpcExceptionMapper.Map, so callers see
+// MxGatewayAuthenticationException (for Unauthenticated) rather than the raw
+// RpcException. The fake transport reproduces that mapping when
+// MapTransportExceptions is set, letting this SDK-level test cover the same
+// observable behaviour without standing up a real gRPC channel.
+FakeGatewayTransport transport = CreateTransport();
+transport.MapTransportExceptions = true;
+transport.AcknowledgeAlarmExceptions.Enqueue(
+new RpcException(new Status(StatusCode.Unauthenticated, "expired key")));
+await using MxGatewayClient client = CreateClient(transport);
+var ex = await Assert.ThrowsAsync<MxGatewayAuthenticationException>(
+() => client.AcknowledgeAlarmAsync(new AcknowledgeAlarmRequest
+{
+SessionId = "session-fixture",
+AlarmFullReference = "Tank01.Level.HiHi",
+Comment = string.Empty,
+OperatorUser = "alice",
+}));
+Assert.Equal(StatusCode.Unauthenticated, ex.StatusCode);
+}
[Fact]
public async Task QueryActiveAlarmsAsync_StreamsEnqueuedSnapshots()
{
@@ -1,24 +1,67 @@
namespace MxGateway.Client;
+/// <summary>
+/// Server-side filters and shape options for
+/// <see cref="GalaxyRepositoryClient.DiscoverHierarchyAsync(DiscoverHierarchyOptions, System.Threading.CancellationToken)"/>.
+/// Each property maps directly to the corresponding field on the
+/// <c>DiscoverHierarchyRequest</c> proto so the gateway can narrow the
+/// hierarchy walk before serializing it back to the client.
+/// </summary>
public sealed record DiscoverHierarchyOptions
{
+/// <summary>
+/// Root Galaxy object id to start the walk from. When set, takes
+/// precedence over <see cref="RootTagName"/> and <see cref="RootContainedPath"/>.
+/// </summary>
public int? RootGobjectId { get; init; }
+/// <summary>
+/// Root tag (assigned) name to start the walk from. Used when
+/// <see cref="RootGobjectId"/> is null.
+/// </summary>
public string? RootTagName { get; init; }
+/// <summary>
+/// Root contained-name dotted path to start the walk from. Used when
+/// neither <see cref="RootGobjectId"/> nor <see cref="RootTagName"/> are set.
+/// </summary>
public string? RootContainedPath { get; init; }
+/// <summary>
+/// Maximum traversal depth below the root, inclusive. Leave null for the
+/// server default (unbounded).
+/// </summary>
public int? MaxDepth { get; init; }
+/// <summary>
+/// Galaxy category ids to include. Empty means all categories.
+/// </summary>
public IReadOnlyList<int> CategoryIds { get; init; } = Array.Empty<int>();
+/// <summary>
+/// Template tag names that must appear somewhere in each returned
+/// object's template chain. Empty means no template filter.
+/// </summary>
public IReadOnlyList<string> TemplateChainContains { get; init; } = Array.Empty<string>();
+/// <summary>
+/// Optional glob (e.g. <c>"Tank*"</c>) matched against each object's tag name.
+/// </summary>
public string? TagNameGlob { get; init; }
+/// <summary>
+/// When set, overrides whether each returned <c>GalaxyObject</c> includes
+/// its dynamic attribute list. Leave null to use the server default.
+/// </summary>
public bool? IncludeAttributes { get; init; }
+/// <summary>
+/// When true, restrict results to objects that bear at least one configured alarm.
+/// </summary>
public bool AlarmBearingOnly { get; init; }
+/// <summary>
+/// When true, restrict results to objects that have at least one historized attribute.
+/// </summary>
public bool HistorizedOnly { get; init; }
}
@@ -23,7 +23,7 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
private readonly GrpcChannel? _channel;
private readonly IGalaxyRepositoryClientTransport _transport;
private readonly ResiliencePipeline _safeUnaryRetryPipeline;
-private bool _disposed;
+private int _disposed;
/// <summary>
/// Initializes a Galaxy Repository client with custom transport and options.
@@ -182,6 +182,17 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
return await DiscoverHierarchyAsync(new DiscoverHierarchyOptions(), cancellationToken).ConfigureAwait(false);
}
+/// <summary>
+/// Enumerates the deployed Galaxy object hierarchy with caller-supplied
+/// server-side filters. Each returned <see cref="GalaxyObject"/> may include
+/// its dynamic attributes (controlled by <see cref="DiscoverHierarchyOptions.IncludeAttributes"/>),
+/// so callers can determine which tag references they may subscribe to via
+/// the MxAccessGateway service. The client transparently follows the
+/// gateway's pagination cursor until the hierarchy is fully drained.
+/// </summary>
+/// <param name="options">Server-side filter and shape options.</param>
+/// <param name="cancellationToken">Cancellation token.</param>
+/// <returns>The filtered collection of Galaxy objects.</returns>
public async Task<IReadOnlyList<GalaxyObject>> DiscoverHierarchyAsync(
DiscoverHierarchyOptions options,
CancellationToken cancellationToken = default)
@@ -338,12 +349,11 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
/// </summary>
public ValueTask DisposeAsync()
{
-if (_disposed)
+if (Interlocked.Exchange(ref _disposed, 1) != 0)
{
return ValueTask.CompletedTask;
}
-_disposed = true;
_channel?.Dispose();
return ValueTask.CompletedTask;
}
@@ -444,6 +454,6 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
private void ThrowIfDisposed()
{
-ObjectDisposedException.ThrowIf(_disposed, this);
+ObjectDisposedException.ThrowIf(Volatile.Read(ref _disposed) != 0, this);
}
}
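Review note: the last three hunks switch `_disposed` from `bool` to an `int` flag so that `DisposeAsync` can claim disposal atomically with `Interlocked.Exchange`. A minimal sketch of that dispose-once guard follows, assuming a hypothetical `OnceDisposable` class with a cleanup counter added purely so the behaviour is observable; the real client disposes its gRPC channel at that point instead.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical class (not in the commit) showing the int-flag dispose guard.
public sealed class OnceDisposable : IAsyncDisposable
{
    private int _disposed;  // 0 = live, 1 = disposed
    private int _cleanups;  // illustration only: counts how often cleanup ran

    public int CleanupCount => Volatile.Read(ref _cleanups);

    public ValueTask DisposeAsync()
    {
        // Exchange returns the previous value, so exactly one caller sees 0
        // and runs the cleanup; concurrent or repeated callers see 1.
        if (Interlocked.Exchange(ref _disposed, 1) != 0)
        {
            return ValueTask.CompletedTask;
        }

        Interlocked.Increment(ref _cleanups); // stand-in for releasing resources
        return ValueTask.CompletedTask;
    }

    public void ThrowIfDisposed()
    {
        // Volatile.Read pairs with the Exchange above so the flag is not
        // read from a stale cached value.
        ObjectDisposedException.ThrowIf(Volatile.Read(ref _disposed) != 0, this);
    }
}
```

Disposing twice should leave `CleanupCount` at 1, and `ThrowIfDisposed` should throw `ObjectDisposedException` afterwards. The sketch assumes .NET 7 or later, where `ObjectDisposedException.ThrowIf` is available.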