code-reviews: 2026-06-16 re-review of all 11 modules at 8df5ab3
Re-review of the 99-commit delta since the 410acc9 baseline (session-resilience
epic, dashboard disable-login, galaxy browse fixes, and stillpending §8).
44 new Open findings, no Critical/High:
- Server 2 (incl. Medium design-doc drift), Worker 0 (026/027/028 confirmed
resolved), Contracts 3, Tests 3, Worker.Tests 3, IntegrationTests 4
- Client.Dotnet 4 (Medium env-var key redaction), Client.Go 5 (Medium watch
drain), Client.Java 9 (Medium overflow race), Client.Python 5 (Medium README
API), Client.Rust 6 (Medium --tls/--plaintext downgrade)
README regenerated; regen-readme.py --check passes.
This commit is contained in:
@@ -4,10 +4,10 @@
|
||||
|---|---|
|
||||
| Module | `src/ZB.MOM.WW.MxGateway.Server` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-06-15 |
|
||||
| Commit reviewed | `410acc9` |
|
||||
| Review date | 2026-06-16 |
|
||||
| Commit reviewed | `8df5ab3` |
|
||||
| Status | Re-reviewed |
|
||||
| Open findings | 0 |
|
||||
| Open findings | 2 |
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
@@ -69,6 +69,23 @@ findings (Server-001 through Server-032) are unchanged by this pass.
|
||||
| 9 | Testing coverage | Issues found: Server-037 (no test for the corrupt-snapshot restore path or for `PersistSnapshot = false` at the cache level). |
|
||||
| 10 | Documentation & comments | No issues found — XML docs match behavior; the `GalaxyRepository.md` "On-disk snapshot" section documents the Stale-on-restore lifecycle. |
|
||||
|
||||
### 2026-06-16 re-review (commit 8df5ab3)
|
||||
|
||||
Re-review of the session-resilience epic + §8 delta (`git diff 410acc9..8df5ab3`): `SessionEventDistributor` multi-subscriber fan-out, replay-on-reconnect, detach-grace retention, bounded worker-ready wait, dashboard auto-login.
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | Server-055 |
|
||||
| 2 | mxaccessgw conventions | No issues found |
|
||||
| 3 | Concurrency & thread safety | No issues found (replay/handoff atomicity, reconnect-vs-sweep, single-clock ready-wait all verified sound) |
|
||||
| 4 | Error handling & resilience | No issues found |
|
||||
| 5 | Security | No issues found (DisableLogin auto-login is intentional/config-gated/documented) |
|
||||
| 6 | Performance & resource management | No issues found |
|
||||
| 7 | Design-document adherence | Server-054 |
|
||||
| 8 | Code organization & conventions | No issues found |
|
||||
| 9 | Testing coverage | No issues found |
|
||||
| 10 | Documentation & comments | No issues found |
|
||||
|
||||
### 2026-05-24 re-review (commit 42b0037)
|
||||
|
||||
Re-review pass at `42b0037` scoped to the dashboard destructive-action wave on
|
||||
@@ -1022,3 +1039,33 @@ Additionally, `GatewayAlarmMonitor.ApplyProviderModeChangeAsync` increments the
|
||||
**Recommendation:** Add resolver tests for (a) cancellation propagation and (b) an include that is also excluded; and a `GatewayAlarmMonitorProviderMode` test pinning the provider-switch counter behaviour for a same-mode repeat event (whichever semantics the team intends). These lock down the contracts the Server-051/052 findings expose.
|
||||
|
||||
**Resolution:** Resolved 2026-06-15. Added all three missing tests: (a) `AlarmWatchListResolverTests.ResolveAsync_RepositoryCancelled_PropagatesOperationCanceled` (cancellation propagation, also covers Server-051); (b) `AlarmWatchListResolverTests.ResolveAsync_ExcludeAlsoSuppressesMatchingExplicitInclude` (exclude-vs-include precedence, also Server-052 item 2); and (c) `GatewayAlarmMonitorProviderModeTests.ProviderModeChange_RepeatedSameMode_RecordsASwitchForEachEvent`, which pins the existing semantics — each worker-reported `OnAlarmProviderModeChanged` event records a `provider_switches` increment (and resets `_providerSince`) even when `toMode` equals the current mode, since the worker is the authority on when a mode change occurred and the gateway does not synthesize or suppress it.
|
||||
|
||||
### Server-054
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Design-document adherence |
|
||||
| Location | `docs/DesignDecisions.md` (Session Reconnect / Event Subscribers / Later Revisit Items §470-471), `CLAUDE.md` (Repository-Specific Conventions) |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** The session-resilience epic shipped multi-subscriber fan-out (`SessionEventDistributor`), reconnectable sessions with replay (`AttachEventSubscriberWithReplay`/`ReplayGap`), and detach-grace retention — but `docs/DesignDecisions.md` still states "no reconnectable sessions for v1" and "one active StreamEvents subscriber per session for v1", and still files both as post-v1 "Later Revisit Items". `CLAUDE.md` likewise still says these are "explicitly out of scope". This is the stale-prose-vs-shipped-behavior drift the "update docs in the same change as the source" rule prohibits.
|
||||
|
||||
**Recommendation:** Update both `DesignDecisions.md` sections and the revisit list to describe the shipped behavior (gated by `AllowMultipleEventSubscribers`, `DetachGraceSeconds`, replay options), and amend the CLAUDE.md convention bullet.
|
||||
|
||||
**Resolution:** _(empty until closed)_
|
||||
|
||||
### Server-055
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:842-851,1841-1871` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** When `AttachEventSubscriber`/`AttachEventSubscriberWithReplay` fails inside `StartDistributorAndRegister`, the catch calls `DetachEventSubscriber()`, which decrements the active count back to 0 and — because the session is still `Ready` and detach-grace is enabled — stamps `_detachedAtUtc = now`. A freshly-`Ready` session that never had a successful subscriber thus enters the detach-grace window on a failed first attach, making it sweep-eligible after `DetachGraceSeconds` even though no client ever streamed. Impact is minor (the lease still protects it; a later successful attach clears the stamp) but the "last subscriber dropped" semantics are violated.
|
||||
|
||||
**Recommendation:** Only stamp `_detachedAtUtc` on a detach that mirrors a prior successful attach — roll the failure path back without entering grace, or guard the stamp on "a subscriber had previously been registered."
|
||||
|
||||
**Resolution:** _(empty until closed)_
|
||||
|
||||
Reference in New Issue
Block a user