Code-review 2026-05-20 sweep: re-review at 1cd51bb, resolve 72 findings across all 11 modules

Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit 1cd51bb, filed 72 new findings, and
fixed them in three priority waves (3 High, 17 Medium, 52 Low).

Highs
- Server-017: enumerate AcknowledgeAlarm / QueryActiveAlarms in
  GatewayGrpcScopeResolver so non-admin keys can use them; document
  the mapping in docs/Authorization.md; add interceptor tests.
- Client.Java-013: add the five missing bulk-method stubs to the
  CLI FakeSession so the test module compiles on a clean tree.
- Client.Rust-013: fix the clippy::doc_lazy_continuation regression
  in generated tonic code by reformatting the ReadBulkCommand proto
  comment and scoping a #![allow(...)] to the generated submodules.

Mediums (highlights)
- Server: unify GatewaySession state-lock discipline (-015) and
  make DisposeAsync race-safe against in-flight CloseAsync (-016);
  add constraint-enforcement test coverage for the bulk-plan path
  (-021).
- Worker: introduce StaRuntimeShutdownException so
  RunAlarmPollLoopAsync can distinguish graceful shutdown from a real
  STA-affinity violation (-016); have the watchdog skip StaHung while
  CurrentCommandCorrelationId is non-empty so a legitimate slow
  ReadBulk no longer self-faults (-017).
- Tests: add per-method round-trip + cancellation coverage for the
  11 GatewaySession bulk methods (-013); replace the real TCP probe
  in GalaxyHierarchyCacheTests with an IGalaxyRepository fake
  (-016).
- IntegrationTests: drive the StreamEvents writer in the live Write
  test and assert OnWriteComplete (-012); add live tests for
  Unadvise/RemoveItem/Unregister ordering, WriteSecured, and
  abnormal worker exit (-014).
- Worker.Tests: replace MxAccessSession reflection with an internal
  CreateForTesting factory (-016); cover WorkerCancel and
  unexpected-body envelope branches (-017).
- Client.Java: cancel MxEventStream when close() races
  beforeStart() (-014); return a CancellingCompletableFuture that
  actually forwards cancellation through .thenApply chains (-015).
- Client.Python: drop the silent localhost-plaintext downgrade in
  the CLI; require explicit --plaintext (-013).
- Client.Rust: stop bench-read-bulk from polluting success-latency
  histograms with failed-call durations (-015); add coverage for
  the five MalformedReply paths, the bulk-write helpers, the
  Error::Unavailable mapping, and the unary-fault path (-016).
- Contracts: extend docs/Contracts.md with the bulk read/write
  command family (-009).

Lows (highlights)
- Server: cap GalaxyGlobMatcher.RegexCache; align
  WorkerAlarmRpcDispatcher missing-session handling; drop the
  duplicate dashboard @page routes; refresh IAlarmRpcDispatcher
  XML doc.
- Worker: surface SetXmlAlarmQuery COM failures; remove dead
  subscriptionExpression / ExecutingCommand arms; preserve
  factory-supplied runtime sessions; split MxAlarmSnapshot.cs into
  three files.
- Tests: dispose the WebApplication in seven test classes; rebuild
  FakeWorkerProcess.WaitForExitAsync against a real
  TaskCompletionSource; switch the heartbeat-expires test to
  ManualTimeProvider; add InvariantCulture to the remaining
  DateTimeOffset.Parse sites; document GalaxyFilterInputSafetyTests
  in GatewayTesting.md.
- IntegrationTests: comment fixes, RecordingServerStreamWriter
  IDisposable, class-level [Trait], single-source ZB default
  connection string.
- Worker.Tests: replace silent-return gating with LiveMxAccessFact
  so absent env vars SKIP not pass; PascalCase rename of probe
  [Fact]s; deterministic deadline test; new frame-protocol error
  tests; ComputeTransitions diff-coverage; relocate dev-rig probes
  to Probes/.
- Contracts: add round-trip coverage and per-field redaction /
  Galaxy-identifier comments to the protos.
- Client.Dotnet: introduce clients/dotnet/Directory.Build.props so
  TreatWarningsAsErrors / analysers apply; document
  DiscoverHierarchyOptions and IMxGatewayCliClient; require typed
  bulk-read handles in CLI; surface AcknowledgeAlarm transport
  faults through Translate().
- Client.Go: kill dead code in alarms_test / fakeGalaxyServer /
  runWriteBulkVariant; document the six new subcommands in
  writeUsage; drain galaxy-watch events on limit; switch io.EOF
  comparisons to errors.Is.
- Client.Java: shared shutdown helpers + new shutdownTimeout
  option; regex-based credential redaction; Long.toUnsignedString
  for uint64 sequence; doc fixes.
- Client.Python: combine duplicate imports; add coverage for
  _percentile / bench-read-bulk / MAX_AGGREGATE_EVENTS /
  _api_key_from_env; populate pyproject metadata and ship py.typed.
- Client.Rust: expose next_correlation_id() so CLI ping/close
  stop hard-coding correlation IDs; resync RustClientDesign.md
  with the current Session / Error surface and CLI subcommand set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-20 09:46:47 -04:00
parent 1cd51bbda3
commit a0203503a7
122 changed files with 8723 additions and 757 deletions
+121 -12
@@ -4,25 +4,27 @@
|---|---|
| Module | `src/MxGateway.Worker` |
| Reviewer | Claude Code |
| Review date | 2026-05-18 |
| Commit reviewed | `6c64030` |
| Review date | 2026-05-20 |
| Commit reviewed | `1cd51bb` |
| Status | Reviewed |
| Open findings | 0 |
## Checklist coverage
This table reflects the 2026-05-20 re-review at commit `1cd51bb`. Worker-001..015 are all closed, so the rows summarise only the new findings filed against this branch.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issues found: heartbeat loop sleeps before first beat (Worker-002), `ProcessCommandAsync` state race drops replies (Worker-003), watchdog/heartbeat state inconsistency (Worker-004), double-dispose path (Worker-006), plus Worker-010/011/015. |
| 2 | mxaccessgw conventions | Issue found: Worker-007 (reflection-based COM invocation bypasses the typed interface contract). |
| 3 | Concurrency & thread safety | Issues found: Worker-001 (`WnWrapAlarmConsumer` timer fires COM off the STA), Worker-008 (consumer factory STA-affinity not enforced). |
| 4 | Error handling & resilience | Issue found: Worker-005 (`OnPoll` silently swallows all poll failures). |
| 5 | Security | No secret logging (redaction applied); inbound frame validation reasonable. No issues found. |
| 6 | Performance & resource management | Issue found: Worker-009 (per-frame `byte[]` allocations on the hot event path). COM release is correct. |
| 7 | Design-document adherence | Code matches `WorkerSta.md`/`WorkerFrameProtocol.md`; stale alarm-path docs (Worker-012). |
| 8 | Code organization & conventions | Issue found: Worker-014 (`AlarmCommandHandler.cs` declares two public types in one file). |
| 9 | Testing coverage | Issue found: Worker-013 (`StaMessagePump` has no direct tests; poll-loop lifecycle untested). |
| 10 | Documentation & comments | Issue found: Worker-012 (stale "future PR / A.3" comments now describe shipped code). |
| 1 | Correctness & logic bugs | Issues found: Worker-018 (`SetXmlAlarmQuery` return code ignored), Worker-019 (`subscriptionExpression` is write-only dead state), Worker-020 (dead `ExecutingCommand` arm in `ProcessCommandAsync` state check), Worker-021 (`InitializeMxAccessAsync` can overwrite an already-set `_runtimeSession`). |
| 2 | mxaccessgw conventions | Issue found: Worker-022 (`MxAlarmSnapshot.cs` declares three public types in one file). |
| 3 | Concurrency & thread safety | Issue found: Worker-016 (`RunAlarmPollLoopAsync` swallows the `EnsureOnAlarmConsumerThread` assertion as part of its generic `InvalidOperationException` catch, defeating Worker-008's invariant). |
| 4 | Error handling & resilience | Issue found: Worker-017 (long-running commands like `ReadBulk` cannot mark STA activity, so the heartbeat watchdog can fire `StaHung` while a command is legitimately executing — `CurrentCommandCorrelationId` is non-empty in the heartbeat but ignored by the watchdog). |
| 5 | Security | No secret logging (redaction applied); inbound frame validation reasonable; secured-write user IDs do not leak through reply diagnostics. No new issues found. |
| 6 | Performance & resource management | Frame I/O uses pooled buffers (Worker-009 resolved); STA ownership and COM final-release are correct. No new issues found. |
| 7 | Design-document adherence | Code matches `gateway.md` / `MxAccessWorkerInstanceDesign.md` / `WorkerFrameProtocol.md`. No new design drift. |
| 8 | Code organization & conventions | Issue found: Worker-022 (see row 2). |
| 9 | Testing coverage | `RunAlarmPollLoop_WhenPollOnceThrows_RecordsFaultOnEventQueue` exists but uses a `COMException`; the `InvalidOperationException` arm raised by Worker-016 is not exercised. No standalone finding (subsumed by Worker-016's recommendation to add a regression test). |
| 10 | Documentation & comments | `RunAlarmPollLoopAsync`'s "STA runtime shutting down — stop the loop gracefully" comment is misleading once Worker-016 is considered (the catch also swallows STA-affinity violations). Noted in Worker-016. |
## Findings
@@ -258,3 +260,110 @@
**Recommendation:** Add a brief comment in `EnqueueEvent` clarifying that an overflow exception is expected and already self-records its fault, so the catch is intentionally a near no-op.
**Resolution:** 2026-05-18 — Added a comment in `MxAccessBaseEventSink.EnqueueEvent`'s catch block (per the finding's recommendation) explaining that two distinct fail-fast failures land there: a conversion failure from `createEvent()` (recorded here as an `MxaccessEventConversionFailed` fault) and an `MxAccessEventQueueOverflowException` from `Enqueue` at capacity, which — per the fail-fast backpressure design in `docs/DesignDecisions.md` — drops the event and has *already* self-recorded a `QueueOverflow` fault inside `Enqueue`. Because `MxAccessEventQueue.RecordFault` keeps only the first fault, the catch's `RecordFault` call is then a deliberate near no-op rather than a second, conflicting fault. Pure comment change as recommended — no behavior altered. `docs/DesignDecisions.md` already documents the fail-fast event backpressure rule, so no doc change was required.
### Worker-016
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | `src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs:261-265` |
| Status | Resolved |
**Description:** `RunAlarmPollLoopAsync` catches `InvalidOperationException` and silently returns with the rationale "STA runtime shutting down — stop the loop gracefully". The same catch arm, however, also swallows the `InvalidOperationException` thrown by `EnsureOnAlarmConsumerThread()` / `AssertOnAlarmConsumerThread()` — the STA-affinity guard added under Worker-008. If the alarm poll ever ran on the wrong thread (a regression of the STA-affinity invariant), the assertion would fire, the loop would silently stop, no fault would be recorded, and the only observable symptom would be alarms no longer flowing. The assertion exists to catch a programming error early; this catch defeats it.
**Recommendation:** Either tighten the `InvalidOperationException` catch so it only swallows the STA-runtime-shutting-down sentinel (e.g. match on the exception message produced by `StaRuntime.InvokeAsync`, or have the STA runtime throw a dedicated exception type for shutdown), or rethrow / record-a-fault for `InvalidOperationException`s whose message does not match the shutdown sentinel. Add a regression test that drives `RunAlarmPollLoopAsync` with a handler that throws `InvalidOperationException` from `PollOnce` and asserts the loop records a fault rather than silently exiting.
**Resolution:** 2026-05-20 — Introduced a dedicated `StaRuntimeShutdownException` (`src/MxGateway.Worker/Sta/StaRuntimeShutdownException.cs`) that `StaRuntime.InvokeAsync` and the queue-enqueue path now throw in place of a generic `InvalidOperationException` when `shutdownRequested` is set. `RunAlarmPollLoopAsync` in `MxAccessStaSession.cs:258-291` now catches `StaRuntimeShutdownException` (graceful stop, returns silently) separately from the generic `Exception` arm, which records the fault on the event queue. An STA-affinity `InvalidOperationException` from `EnsureOnAlarmConsumerThread` therefore now falls through to the fault path and becomes observable on the IPC fault path instead of silently terminating alarm delivery. Verified: `dotnet build src/MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86` clean (0 warnings). Regression coverage in `MxAccessStaSessionTests.cs` exercises both the graceful-shutdown and the affinity-violation paths.
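The catch ordering described in this resolution can be illustrated with a minimal, self-contained sketch. It is not the worker's actual code: `AlarmPollLoopSketch`, its `Run` signature, and the `faults` list are hypothetical stand-ins for `RunAlarmPollLoopAsync` and the event-queue fault recorder, and the exception class below only mimics the shape of the real `StaRuntimeShutdownException`.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the dedicated shutdown exception named in the resolution.
public sealed class StaRuntimeShutdownException : Exception
{
    public StaRuntimeShutdownException() : base("STA runtime is shutting down.") { }
}

public static class AlarmPollLoopSketch
{
    // Calls pollOnce repeatedly until it throws.
    // Returns true for a graceful shutdown; records a fault and returns
    // false for any other exception, including an STA-affinity
    // InvalidOperationException.
    public static bool Run(Action pollOnce, List<string> faults)
    {
        try
        {
            while (true)
            {
                pollOnce();
            }
        }
        catch (StaRuntimeShutdownException)
        {
            // Graceful stop: nothing to record.
            return true;
        }
        catch (Exception ex)
        {
            // Everything else is a real failure and must be observable.
            faults.Add(ex.GetType().Name + ": " + ex.Message);
            return false;
        }
    }
}
```

Because the shutdown case has its own type, the generic arm no longer needs to guess from an exception message whether the loop stopped for a benign reason.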
### Worker-017
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `src/MxGateway.Worker/Sta/StaRuntime.cs:280-288`, `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:602-631` |
| Status | Resolved |
**Description:** `StaRuntime.ProcessQueuedCommands` calls `MarkActivity()` only before and after `workItem.Execute()`. For a command that synchronously holds the STA for longer than `WorkerPipeSessionOptions.HeartbeatGrace` (default 15s) — e.g. `ReadBulk` with many uncached tags, each waiting up to its per-tag `TimeoutMs` (default 1000 ms) — no `MarkActivity()` runs during the wait, `LastActivityUtc` stays frozen, and `ReportWatchdogFaultIfNeededAsync` fires an `StaHung` fault. The heartbeat itself reports `WorkerState.ExecutingCommand` with the live `CurrentCommandCorrelationId`, so the worker actually knows it is executing a command rather than hung — but the watchdog branch only checks `staleFor > HeartbeatGrace` and ignores the in-flight command. A legitimate slow bulk read then self-faults and tears the session down.
**Recommendation:** Either (a) extend `WorkerPipeSession.ReportWatchdogFaultIfNeededAsync` to skip the `StaHung` fault when the snapshot's `CurrentCommandCorrelationId` is non-empty (the worker is executing a command, not hung), or (b) thread a `MarkActivity`-style callback into the bulk-read `pumpStep` so long synchronous STA operations periodically refresh `LastActivityUtc`. Option (a) is the smaller surface — the heartbeat already carries enough signal for the gateway to decide the command is just slow. Either way, the design intent (watchdog catches a hung STA, not a slow command) should be documented on `ReportWatchdogFaultIfNeededAsync`.
**Resolution:** 2026-05-20 — Applied option (a): `WorkerPipeSession.ReportWatchdogFaultIfNeededAsync` (`src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:602-645`) now returns early when `snapshot.CurrentCommandCorrelationId` is non-empty — the STA is busy executing a known command, not hung, and the heartbeat already surfaces the correlation id so the gateway can decide whether the command is too slow against its own per-command timeout. The next `MarkActivity()` after the command returns lifts `LastActivityUtc` and the watchdog resumes normal operation. A new XML doc comment on the method records the design intent (watchdog catches a hung STA, not a slow command). Verified: `dotnet build src/MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86` clean. Regression coverage added in `WorkerPipeSessionTests.cs`.
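The early return in option (a) amounts to a small predicate. The sketch below is an assumption-laden illustration rather than the real `ReportWatchdogFaultIfNeededAsync`: `HeartbeatSnapshotSketch`, `WatchdogSketch`, and `ShouldReportStaHung` are hypothetical names, and only `CurrentCommandCorrelationId`, `LastActivityUtc`, and the 15-second `HeartbeatGrace` default come from the finding.

```csharp
using System;

// Hypothetical, reduced view of the heartbeat snapshot the watchdog inspects.
public sealed class HeartbeatSnapshotSketch
{
    public DateTime LastActivityUtc { get; set; }
    public string CurrentCommandCorrelationId { get; set; } = string.Empty;
}

public static class WatchdogSketch
{
    // Returns true only when the STA looks hung: no command is in flight
    // and the last recorded activity is older than the grace period.
    public static bool ShouldReportStaHung(
        HeartbeatSnapshotSketch snapshot,
        DateTime nowUtc,
        TimeSpan heartbeatGrace)
    {
        // A non-empty correlation id means a known command is executing,
        // so a stale LastActivityUtc is expected rather than a hang.
        if (!string.IsNullOrEmpty(snapshot.CurrentCommandCorrelationId))
        {
            return false;
        }

        TimeSpan staleFor = nowUtc - snapshot.LastActivityUtc;
        return staleFor > heartbeatGrace;
    }
}
```

With this shape, a 30-second `ReadBulk` is left alone while it still carries a correlation id, and the staleness check applies again once the id is cleared.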
### Worker-018
| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs:160-161` |
| Status | Resolved |
**Description:** `Subscribe` calls `com.SetXmlAlarmQuery(xmlQuery)` and discards the return value. The block-level comment immediately above states that this call is empirically required for subsequent `GetXmlCurrentAlarms2` to succeed — i.e. it is on the critical path of the alarm subscription. Every other AVEVA-COM call in the same method (`InitializeConsumer`, `RegisterConsumer`, `Subscribe`, `AlarmAckByName`, etc.) is gated on a `!= 0` return-code check and throws `InvalidOperationException` on failure. If `SetXmlAlarmQuery` ever returns non-zero (or otherwise fails non-fatally), the consumer reaches `subscribed = true` with the wnwrap state misconfigured, and the next `PollOnce` fails with the same `E_FAIL` the comment warns about — without any indication where the regression lies.
**Recommendation:** Either (a) check the `SetXmlAlarmQuery` return code and treat a non-zero value as a subscription failure (matching the other call-gates in the method) or (b) document explicitly in the comment that `SetXmlAlarmQuery`'s return code is meaningless on this AVEVA build (referencing `docs/AlarmClientDiscovery.md` if so). At minimum capture the return value in a local for diagnostic purposes so a future failure is easier to triage.
**Re-triage:** The finding's framing assumed an integer return code; inspection of the `Interop.WNWRAPCONSUMERLib` assembly confirmed `SetXmlAlarmQuery` is declared `Void SetXmlAlarmQuery(System.String)` on all three flavors (`IwwAlarmConsumer`, `IwwAlarmConsumer2`, `wwAlarmConsumerClass`). There is no integer return code to gate on. A genuine failure can only surface as a `COMException` mapped from the underlying HRESULT, so the fix wraps the call to translate that into the same `InvalidOperationException` failure-shape used by every other call-gate in `Subscribe`, with the HRESULT included in the diagnostic message.
**Resolution:** 2026-05-20 — `WnWrapAlarmConsumer.Subscribe` now wraps the `com.SetXmlAlarmQuery(xmlQuery)` call in a `try`/`catch (COMException ex)` that throws an `InvalidOperationException` carrying the HRESULT (`$"wwAlarmConsumer.SetXmlAlarmQuery failed with HRESULT 0x{ex.HResult:X8}; subsequent GetXmlCurrentAlarms2 polls would return E_FAIL."`) with the original `COMException` as `InnerException`. A previously silent failure that left `subscribed = true` with misconfigured wnwrap state — and produced an opaque `E_FAIL` from the next `PollOnce` with no indication where the regression lay — now surfaces as a subscription failure at the `Subscribe` call-site, matching the existing v1-lifecycle failure shape. The block comment was extended to record that the interop signature returns `void` (no integer return code to gate on like the sibling v1 calls) so a future maintainer doesn't try to add one. No new regression test was added in this agent because Worker.Tests is being modified by a concurrent agent; the change is structurally analogous to the existing `Initialize/Register/Subscribe` call-gates and is exercised end-to-end by the live alarm smoke path.
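For readers unfamiliar with the pattern, here is a minimal sketch of translating a `COMException` into the `InvalidOperationException` failure shape described above. The `SetXmlAlarmQueryGateSketch` wrapper and its `Action` parameter are hypothetical; the real change lives inline in `WnWrapAlarmConsumer.Subscribe` and calls the COM object directly. The message text is copied from the resolution.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SetXmlAlarmQueryGateSketch
{
    // Invokes a void COM call and converts an HRESULT failure into the
    // same exception type the integer-returning call-gates throw.
    public static void Invoke(Action setXmlAlarmQuery)
    {
        try
        {
            setXmlAlarmQuery();
        }
        catch (COMException ex)
        {
            // X8 renders the signed HResult as eight hex digits,
            // e.g. 80004005 for E_FAIL.
            throw new InvalidOperationException(
                $"wwAlarmConsumer.SetXmlAlarmQuery failed with HRESULT 0x{ex.HResult:X8}; " +
                "subsequent GetXmlCurrentAlarms2 polls would return E_FAIL.",
                ex);
        }
    }
}
```

Keeping the original exception as `InnerException` preserves the COM stack and error code for triage while callers only need to handle one exception type.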
### Worker-019
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs:59`, `:188` |
| Status | Resolved |
**Description:** `WnWrapAlarmConsumer` declares `private string subscriptionExpression = string.Empty;` and assigns it once inside `Subscribe` (line 188), but never reads it. It is dead state — neither `PollOnce`, `AcknowledgeByName`, `AcknowledgeByGuid`, `SnapshotActiveAlarms`, nor `Dispose` consults it. Either it is genuinely unused (delete it) or it was intended to support a not-yet-implemented feature (e.g. re-subscribing after a transient failure, or echoing the subscription back through `IsSubscribed`/`SubscriptionExpression`), in which case the intent should be wired up or documented.
**Recommendation:** Delete the field; this is the safest option. The build under `TreatWarningsAsErrors=true` continues to accept the field as long as it is written to, so leaving it in place breaks nothing. Alternatively, expose it through a read-only `SubscriptionExpression` property so smoke tests can assert which subscription is active without touching wnwrap state. If a future use is expected, file a follow-up issue.
**Resolution:** 2026-05-20 — Deleted the dead `private string subscriptionExpression = string.Empty;` field declaration and its sole assignment inside `Subscribe` (`subscriptionExpression = subscription;`). The field had no readers and was pure write-only state. Pure cleanup — no behaviour change, no public API surface affected. The worker build remains clean with zero warnings under `TreatWarningsAsErrors=true`.
### Worker-020
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:405`, `:423` |
| Status | Resolved |
**Description:** `ProcessCommandAsync` decides whether to write a command reply with `if (_state is not WorkerState.Ready and not WorkerState.ExecutingCommand)`. The `ExecutingCommand` arm is dead: `_state` is only ever assigned `Starting`, `Handshaking`, `InitializingSta`, `Ready`, `ShuttingDown`, `Faulted`, or `Stopped`, and `WorkerState.ExecutingCommand` never appears on the right-hand side of a `_state = ...` assignment. The `WorkerState.ExecutingCommand` value is synthesized only in `CreateHeartbeat` (line 811) when a command is in flight, so it never leaks back into `_state`. The check is therefore effectively `_state is not WorkerState.Ready`. The intent is unclear: either the check should also accept the live "is executing" condition (which today is implicit via `_state == Ready` plus a non-empty `CurrentCommandCorrelationId` from the dispatcher), or the dead arm should be removed for clarity.
**Recommendation:** Simplify the check to `if (_state != WorkerState.Ready)` to match the actual state machine, and update the dropped-reply log fields accordingly. Alternatively, introduce an explicit `WorkerState.ExecutingCommand` transition (set when a command starts dispatching, restored to `Ready` on completion) so the check matches its name. The simpler fix is the former.
**Resolution:** 2026-05-20 — Both occurrences of the `_state is not WorkerState.Ready and not WorkerState.ExecutingCommand` check in `ProcessCommandAsync` (the post-`DispatchAsync` success path and the exception path) were simplified to `_state != WorkerState.Ready`. The `ExecutingCommand` arm was dead — `_state` is never written that value; only `CreateHeartbeat` synthesizes it on the wire when `CurrentCommandCorrelationId` is non-empty. A comment was added at the success-path site documenting the assignment-set of `_state` and why `Ready` is the only command-serving state. No behavioural change — `_state` could never be `ExecutingCommand` at that read, so the simplification preserves the same effective decision while removing the misleading dead arm. No new regression test was added in this agent because Worker.Tests is being modified by a concurrent agent.
### Worker-021
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:111-118`, `:790-805`, `:136-139` |
| Status | Resolved |
**Description:** `RunAsync` constructs the runtime session through `_runtimeSession = _runtimeSessionFactory()` (line 111) and immediately calls `CompleteStartupHandshakeAsync(token => _runtimeSession.StartAsync(...))`. That path is fine. However the public parameterless `CompleteStartupHandshakeAsync()` (line 136) routes through `InitializeMxAccessAsync` (line 790), which unconditionally reassigns `_runtimeSession = new MxAccessStaSession(eq => new AlarmCommandHandler(eq));` — overwriting whatever the factory put there. If anything ever calls `CompleteStartupHandshakeAsync()` after `RunAsync` has already begun, the factory-supplied session is leaked (no `Dispose` is called on the old instance) and a fresh hard-coded `MxAccessStaSession` is started instead. Today no production code path triggers this, but the API surface is public and dangerous — a test or a refactor could trip it.
**Recommendation:** Either (a) make `InitializeMxAccessAsync` a no-op if `_runtimeSession` is already non-null (treat the existing instance as authoritative and only call its `StartAsync`), or (b) make the parameterless `CompleteStartupHandshakeAsync()` and `InitializeMxAccessAsync` `internal` / remove them, since the production path is the factory-driven one in `RunAsync`. Option (b) is cleaner: the parameterless overload is dead in production.
**Resolution:** 2026-05-20 — Applied option (a): `InitializeMxAccessAsync` now uses `_runtimeSession ??= new MxAccessStaSession(eq => new AlarmCommandHandler(eq));`, so the existing factory-supplied instance from `RunAsync` is treated as authoritative and only the fall-back direct-invocation path (where the parameterless `CompleteStartupHandshakeAsync` is called without a prior factory call) constructs the hard-coded `MxAccessStaSession`. The `StartAsync` call and the `catch`-and-dispose path now operate on a local `session` captured from `_runtimeSession`, so a startup failure still disposes the runtime regardless of which path supplied it. A comment in `InitializeMxAccessAsync` documents the reasoning. Option (a) was preferred over (b) because the parameterless `CompleteStartupHandshakeAsync` overload is part of the existing public API surface and tightening it to `internal` would be a contract change with no production driver requesting it. No new regression test was added in this agent because Worker.Tests is being modified by a concurrent agent; the change is exercised end-to-end by the existing `RunAsync` factory path which now goes through the null-coalescing assignment instead of an unconditional `new`.
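The null-coalescing assignment and the dispose-on-failure path can be sketched as follows. All type names here (`RuntimeSessionSketch`, `SessionHostSketch`) are hypothetical simplifications; the real code is asynchronous and works with `MxAccessStaSession`, which this sketch does not model.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for a runtime session with observable state.
public sealed class RuntimeSessionSketch : IDisposable
{
    private readonly bool _failOnStart;

    public RuntimeSessionSketch(bool failOnStart = false)
    {
        _failOnStart = failOnStart;
    }

    public bool Started { get; private set; }
    public bool Disposed { get; private set; }

    public void Start()
    {
        if (_failOnStart)
        {
            throw new InvalidOperationException("startup failed");
        }

        Started = true;
    }

    public void Dispose() => Disposed = true;
}

public sealed class SessionHostSketch
{
    private readonly Func<RuntimeSessionSketch> _fallbackFactory;
    private RuntimeSessionSketch? _runtimeSession;

    public SessionHostSketch(
        Func<RuntimeSessionSketch> fallbackFactory,
        RuntimeSessionSketch? supplied = null)
    {
        _fallbackFactory = fallbackFactory;
        _runtimeSession = supplied;
    }

    public RuntimeSessionSketch Initialize()
    {
        // Keep an already-supplied session; build the fallback only when
        // nothing has been assigned yet.
        RuntimeSessionSketch session = _runtimeSession ??= _fallbackFactory();
        try
        {
            session.Start();
            return session;
        }
        catch
        {
            // Dispose whichever session was in use, then surface the failure.
            session.Dispose();
            throw;
        }
    }
}
```

Capturing the session in a local before `Start()` means the cleanup path acts on the same instance regardless of whether it came from the caller or from the fallback.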
### Worker-022
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/MxGateway.Worker/MxAccess/MxAlarmSnapshot.cs:12`, `:26`, `:49` |
| Status | Resolved |
**Description:** `MxAlarmSnapshot.cs` declares three public types in one file: the `MxAlarmStateKind` enum, the `MxAlarmSnapshotRecord` class, and the `MxAlarmTransitionEvent` class. The C# style guide (`docs/style-guides/CSharpStyleGuide.md:68`) requires one public type per file unless a small nested type is clearer. The recently resolved Worker-014 split `IAlarmCommandHandler` out of `AlarmCommandHandler.cs` for exactly this reason — the same convention applies here.
**Recommendation:** Move `MxAlarmStateKind` and `MxAlarmTransitionEvent` into their own files (`MxAlarmStateKind.cs`, `MxAlarmTransitionEvent.cs`) and leave `MxAlarmSnapshotRecord` in `MxAlarmSnapshot.cs` (or rename the file to `MxAlarmSnapshotRecord.cs` to match the surviving type). Pure file-organization change; no behaviour or namespace impact.
**Resolution:** 2026-05-20 — Split `MxAlarmSnapshot.cs` into three files, each declaring one public type and keeping the original `MxGateway.Worker.MxAccess` namespace so existing usages are unaffected: `MxAlarmStateKind.cs` (the enum, with its XML doc), `MxAlarmTransitionEvent.cs` (the `EventArgs` subclass, with its `PreviousState` doc), and `MxAlarmSnapshot.cs` (now containing only `MxAlarmSnapshotRecord` plus its XML doc). Matches the one-public-type-per-file convention re-affirmed by Worker-014's `IAlarmCommandHandler` split. Pure file-organization change — no API, namespace, or behaviour change; build is clean.