Code-review 2026-05-20 sweep: re-review at 1cd51bb, resolve 72 findings across all 11 modules

Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).

Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
  GatewayGrpcScopeResolver so non-admin keys can use them; document
  the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
  CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
  in generated tonic code by reformatting the ReadBulkCommand proto
  comment and scoping a #![allow(...)] to the generated submodules.

Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
  make DisposeAsync race-safe against in-flight CloseAsync (-016);
  add constraint-enforcement test coverage for the bulk-plan path
  (-021).
- Worker: introduce StaRuntimeShutdownException so RunAlarmPollLoop
  can distinguish graceful shutdown from a real STA-affinity
  violation (-016); have the watchdog skip StaHung while
  CurrentCommandCorrelationId is non-empty so a legitimate slow
  ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
  11 GatewaySession bulk methods (-013); replace the real TCP probe
  in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
  (-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
  test and assert OnWriteComplete (-012); add live tests for
  Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
  abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
  CreateForTesting factory (-016); cover WorkerCancel and
  unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
  beforeStart() (-014); return a CancellingCompletableFuture that
  actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
  the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
  histograms with failed-call durations (-015); add coverage for
  the five MalformedReply paths, the bulk-write helpers, the
  Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
  command family (-009).

Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
  WorkerAlarmRpcDispatcher missing-session handling; drop the
  duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
  XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
  subscriptionExpression / ExecutingCommand arms; preserve
  factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
  three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
  FakeWorkerProcess.WaitForExitAsync against a real
  TaskCompletionSource; switch the heartbeat-expires test to
  ManualTimeProvider;
  add InvariantCulture to the remaining DateTimeOffset.Parse sites;
  document GalaxyFilterInputSafetyTests in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
  IDisposable, class-level [Trait], single-source ZB default
  connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
  so tests with absent env vars are skipped rather than passing;
  PascalCase rename of probe [Fact]s; deterministic deadline test;
  new frame-protocol error tests; ComputeTransitions diff-coverage;
  relocate dev-rig probes to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
  Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
  TreatWarningsAsErrors / analysers apply; document
  DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
  bulk-read handles in CLI; surface AcknowledgeAlarm transport
  faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
  runWriteBulkVariant; document the six new subcommands in
  writeUsage; drain galaxy-watch events on limit; switch io.EOF
  comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
  option; regex-based credential redaction; Long.toUnsignedString
  for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
  _percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
  _api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
  stop hard-coding correlation IDs; resync RustClientDesign.md
  with the current Session / Error surface and CLI subcommand set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-20 09:46:47 -04:00
parent 1cd51bbda3
commit a0203503a7
122 changed files with 8723 additions and 757 deletions
@@ -56,7 +56,6 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
 private wwAlarmConsumerClass? client;
 private wwAlarmConsumerClass? ackClient;
-private string subscriptionExpression = string.Empty;
 private bool subscribed;
 private bool disposed;
@@ -157,8 +156,28 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
 // also breaks AlarmAckByName on the same consumer (rejects with
 // -55), so a separate ack-only consumer is provisioned below
 // that gets only Initialize/Register/Subscribe (no SetXmlAlarmQuery).
+//
+// The wnwrap interop signature is `void SetXmlAlarmQuery(string)`
+// — there is no integer return code to gate on like the other v1
+// lifecycle calls in this method. A genuine failure surfaces as a
+// COM exception (mapped from the underlying HRESULT). Wrap the
+// call so a failure becomes an InvalidOperationException with
+// diagnostic context, matching the other call-gates' failure
+// shape rather than letting an opaque COMException escape with
+// no indication that the alarm subscription is now misconfigured
+// and the next GetXmlCurrentAlarms2 poll will fail with E_FAIL.
 string xmlQuery = ComposeXmlAlarmQuery(subscription);
-com.SetXmlAlarmQuery(xmlQuery);
+try
+{
+com.SetXmlAlarmQuery(xmlQuery);
+}
+catch (COMException ex)
+{
+throw new InvalidOperationException(
+$"wwAlarmConsumer.SetXmlAlarmQuery failed with HRESULT 0x{ex.HResult:X8}; " +
+"subsequent GetXmlCurrentAlarms2 polls would return E_FAIL.",
+ex);
+}
 // Provision a parallel COM consumer for ack calls. It runs the
 // v1 lifecycle (Initialize/Register/Subscribe) only; without
@@ -185,7 +204,6 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
 $"Ack consumer setup returned non-zero status: " +
 $"Initialize={ackInit}, Register={ackReg}, Subscribe={ackSub}.");
 }
-subscriptionExpression = subscription;
 subscribed = true;
 }
@@ -303,23 +321,10 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
 Dictionary<Guid, MxAlarmSnapshotRecord> next = ParseSnapshotXml(xml);
-List<MxAlarmTransitionEvent> transitions = new List<MxAlarmTransitionEvent>();
+IReadOnlyList<MxAlarmTransitionEvent> transitions;
 lock (syncRoot)
 {
-foreach (KeyValuePair<Guid, MxAlarmSnapshotRecord> kv in next)
-{
-MxAlarmStateKind previousState = MxAlarmStateKind.Unspecified;
-if (latestSnapshot.TryGetValue(kv.Key, out MxAlarmSnapshotRecord? prev))
-{
-previousState = prev.State;
-if (previousState == kv.Value.State) continue; // no transition
-}
-transitions.Add(new MxAlarmTransitionEvent
-{
-Record = kv.Value,
-PreviousState = previousState,
-});
-}
+transitions = ComputeTransitions(latestSnapshot, next);
 latestSnapshot.Clear();
 foreach (KeyValuePair<Guid, MxAlarmSnapshotRecord> kv in next)
 {
@@ -336,6 +341,52 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
 }
 }
+/// <summary>
+/// Pure snapshot-to-transitions diff. Compares the previous polled
+/// snapshot to the next snapshot and produces one
+/// <see cref="MxAlarmTransitionEvent"/> per state change. Used by
+/// <see cref="PollOnce"/> after a successful
+/// <c>GetXmlCurrentAlarms2</c> call; exposed as <c>internal static</c>
+/// so the diff rules can be unit-tested without driving the
+/// wnwrapConsumer COM object (Worker.Tests-022).
+/// </summary>
+/// <remarks>
+/// <para>Rules:</para>
+/// <list type="bullet">
+/// <item><description>A GUID present in <paramref name="next"/> but not in <paramref name="previous"/> produces a transition with <see cref="MxAlarmStateKind.Unspecified"/> as the previous state — first sighting.</description></item>
+/// <item><description>A GUID present in both with the same <see cref="MxAlarmSnapshotRecord.State"/> produces no transition.</description></item>
+/// <item><description>A GUID present in both with a different <see cref="MxAlarmSnapshotRecord.State"/> produces a transition carrying the prior state.</description></item>
+/// <item><description>A GUID present in <paramref name="previous"/> but absent from <paramref name="next"/> produces no transition. AVEVA drops cleared alarms from the active set; the snapshot simply stops mentioning them.</description></item>
+/// </list>
+/// </remarks>
+/// <param name="previous">The snapshot from the previous poll (or empty on first call).</param>
+/// <param name="next">The snapshot just parsed from <c>GetXmlCurrentAlarms2</c>.</param>
+/// <returns>One transition per state change in <paramref name="next"/>.</returns>
+internal static IReadOnlyList<MxAlarmTransitionEvent> ComputeTransitions(
+Dictionary<Guid, MxAlarmSnapshotRecord> previous,
+Dictionary<Guid, MxAlarmSnapshotRecord> next)
+{
+if (previous is null) throw new ArgumentNullException(nameof(previous));
+if (next is null) throw new ArgumentNullException(nameof(next));
+List<MxAlarmTransitionEvent> transitions = new List<MxAlarmTransitionEvent>();
+foreach (KeyValuePair<Guid, MxAlarmSnapshotRecord> kv in next)
+{
+MxAlarmStateKind previousState = MxAlarmStateKind.Unspecified;
+if (previous.TryGetValue(kv.Key, out MxAlarmSnapshotRecord? prev))
+{
+previousState = prev.State;
+if (previousState == kv.Value.State) continue; // no transition
+}
+transitions.Add(new MxAlarmTransitionEvent
+{
+Record = kv.Value,
+PreviousState = previousState,
+});
+}
+return transitions;
+}
 /// <summary>
 /// Parse the XML payload returned by <c>GetXmlCurrentAlarms2</c>
 /// into a GUID-keyed dictionary. Records with malformed GUIDs are