Commit Graph

51 Commits

Author SHA1 Message Date
Joseph Doherty b062cc0440 code-reviews: 2026-06-25 re-review of Galaxy-adoption delta at 3cd7776
Re-review Server + Tests modules over 88915c3..3cd7776 (the
ZB.MOM.WW.GalaxyRepository 0.2.0 adoption + array-write fixes).
Security-critical browse-scope wiring verified sound. 3 new findings:

- Server-059 (Medium): dashboard Galaxy summary memoized only on
  cache Sequence -> status/timestamps freeze during a Galaxy SQL outage.
- Server-060 (Low): no DI test asserts IGalaxyBrowseScopeProvider
  resolves to GatewayBrowseScopeProvider (registration-order invariant).
- Tests-041 (Medium): memoization invalidation path untested.
2026-06-25 13:21:23 -04:00
Joseph Doherty af0f9e80af code-reviews: mark Client.Java-049/050/051 windev-verified (gradle test green, JDK 21) 2026-06-18 11:32:13 -04:00
Joseph Doherty e51377c933 code-reviews: regenerate index after resolving all 2026-06-18 findings (0 pending) 2026-06-18 10:58:43 -04:00
Joseph Doherty 2671639250 fix(gateway): resolve 2026-06-18 array-write review findings
- Server-057: extend []-suffix normalization to AddItemBulk/AddBufferedItem so bulk-added
  array tags bind write-capable handles (authz check, worker bind, and registration kept
  consistent); update gateway.md + client READMEs. Tests: AddItemBulk/AddBufferedItem wiring.
- Server-058: assert []-fallback-resolved bare array names are still denied when out of
  read/write scope and that MaxWriteClassification is enforced on suffixed array registrations.
- Contracts-023/024/025: round-trip + field-19 descriptor pin for MxSparseArray; document
  MxSparseArray in docs/Contracts.md; enumerate it in the protocol-version-3 test summary.
- Tests-040: add wiring tests for the six uncovered sparse-write arms (WriteSecured, Write2,
  WriteSecured2, Write2Bulk, WriteSecuredBulk, WriteSecured2Bulk).

dotnet build + targeted tests green (184 passed).
2026-06-18 10:58:42 -04:00
Joseph Doherty 6c853b43af fix(clients): resolve 2026-06-18 array-write review findings
- Client.Dotnet-030: add advise-supervisory to IsKnownGatewayCommand (was dead/unreachable, exit 2)
- Client.Go-035/036/037: usage+README list advise-supervisory; add session-id guard test; fix write2 README wording
- Client.Python-037/038: drop regressed 'scaffold' from pyproject; add advise-supervisory CLI tests
- Client.Rust-039/040: document write_array_elements/advise-supervisory in design doc; pin outer MxValue data_type==0
- Client.Java-049/050/051: sync CLIENT_VERSION to 0.1.2; add advise-supervisory test; guard negative uint32 inputs (pending windev gradle verification)

Client READMEs also updated for Server-057 add-family normalization wording.
.NET/Go/Python/Rust verified green locally; Java pending windev.
2026-06-18 10:58:33 -04:00
Joseph Doherty 85ef453d0d code-reviews: 2026-06-18 re-review of array-write-ergonomics feature at 88915c3
Re-reviewed the 10 modules touched by the MxSparseArray / array-write
ergonomics work (8df5ab3..88915c3). 16 new findings:

- Server-057 (Medium): [] AddItem normalization skips AddItemBulk/AddBufferedItem
- Client.Dotnet-030 (Medium): advise-supervisory missing from IsKnownGatewayCommand (dead command)
- 14 Low: MxSparseArray doc/test gaps, advise-supervisory CLI gaps across clients,
  Client.Java-049 / Client.Python-037 version-bump consistency misses

Worker.Tests and IntegrationTests clean. Worker unchanged by the feature, not re-reviewed.
2026-06-18 10:33:47 -04:00
Joseph Doherty bed647ca2c docs(code-reviews): mark Client.Java + Worker.Tests findings Resolved (windev-verified)
Client.Java-040..048 and Worker.Tests-034/035/036 flipped In Progress -> Resolved
after windev verification:
- Java: gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests -> BUILD SUCCESSFUL
- Worker.Tests: dotnet test -p:Platform=x86 -> 344 passed, 0 failed
All 11 modules now report 0 open findings; README regenerated (--check clean).
2026-06-17 05:33:30 -04:00
Joseph Doherty 8df0479b99 fix: resolve Client.Java + Worker.Tests findings (pending windev verification)
Client.Java-040..048, Worker.Tests-034/035/036. Edits applied on the Mac,
which has no JRE and cannot build the x86+MXAccess worker tests; findings are
marked In Progress pending gradle + x86 build verification on windev. Do not
mark Resolved until verified there.
2026-06-17 05:23:14 -04:00
Joseph Doherty 6b5fe6aa82 fix: resolve code-review findings (locally verified)
Server-054/055/056, Contracts-020/021/022, Tests-036/038/039,
IntegrationTests-030/031/032 (+033 deferred to live rig),
Client.Dotnet-026/028/029 (+027 won't-fix), Client.Go-030..034,
Client.Python-032..036, Client.Rust-033..038.

Key fix: SessionEventDistributor orphaned a subscriber that registered after
the pump completed but before disposal (Server-056) -> register paths now
complete late registrants under _lifecycleLock; regression test added. The
racy dashboard-mirror gRPC test made deterministic (Tests-039).

Verified green locally: gateway Tests targeted classes (GatewaySession,
SessionEventDistributor, GatewayOptionsValidator, ProtobufContractRoundTrip,
GatewaySessionDashboardMirror) + dotnet/go/python/rust client suites.
2026-06-17 05:23:14 -04:00
Joseph Doherty 25d04ec37e code-reviews: 2026-06-16 re-review of all 11 modules at 8df5ab3
Re-review of the 99-commit delta since the 410acc9 baseline (session-resilience
epic, dashboard disable-login, galaxy browse fixes, and stillpending §8).

44 new Open findings, no Critical/High:
- Server 2 (incl. Medium design-doc drift), Worker 0 (026/027/028 confirmed
  resolved), Contracts 3, Tests 3, Worker.Tests 3, IntegrationTests 4
- Client.Dotnet 4 (Medium env-var key redaction), Client.Go 5 (Medium watch
  drain), Client.Java 9 (Medium overflow race), Client.Python 5 (Medium README
  API), Client.Rust 6 (Medium --tls/--plaintext downgrade)

README regenerated; regen-readme.py --check passes.
2026-06-16 18:57:56 -04:00
Joseph Doherty 7c957908f8 docs(code-review): regenerate index — all re-review findings resolved 2026-06-15 03:04:37 -04:00
Joseph Doherty 6659653673 fix(client/java): handle PROVIDER_STATUS alarm-feed arm — CLI build break (Client.Java-039) 2026-06-15 03:01:13 -04:00
Joseph Doherty 75a39f5a8c fix(client/java): correct browseChildrenRaw README; CLI --require-certificate-validation (Client.Java-037,038) 2026-06-15 02:56:15 -04:00
Joseph Doherty cebe67e9bd fix(worker): resilient failover switch; FIPS-safe synthetic GUID; dup-reference guard + tests (Worker-026..028, Worker.Tests-031..033) 2026-06-15 02:56:15 -04:00
Joseph Doherty ddf2d84fbc contracts: round-trip degraded provenance/watch-list/mode-changed; proto doc (Contracts-018,019) 2026-06-15 02:46:06 -04:00
Joseph Doherty 56dd56954b test(gateway): cover failback reason, FromFeed/SinceUtc badge paths; style + bounded drain (Tests-032..035) 2026-06-15 02:46:06 -04:00
Joseph Doherty b57d02cc4d fix(client/rust): handle provider_status arm (build break); real system-roots TLS; design doc (Client.Rust-030..032) 2026-06-15 02:39:11 -04:00
Joseph Doherty 47062c1a6e fix(client/python): reachable cert-validation flag; bounded off-loop TOFU probe; license/marker fixes (Client.Python-027..031) 2026-06-15 02:39:11 -04:00
Joseph Doherty d0d1dcef15 fix(client/go): correct tag-go-module dirty-tree guard; GOPRIVATE docs (Client.Go-028,029) 2026-06-15 02:39:11 -04:00
Joseph Doherty fb2b1a4a52 fix(client/dotnet): restore warnings-as-errors floor; license metadata; LazyBrowseNode publication (Client.Dotnet-022..025) 2026-06-15 02:39:11 -04:00
Joseph Doherty d2c776901b fix(integrationtests): repair GatewayAlarmMonitor ctor build break; LDAP bind + docs (IntegrationTests-026..029) 2026-06-15 02:39:11 -04:00
Joseph Doherty 258e09e0de fix(server): propagate watch-list cancellation; doc + test gaps (Server-051..053) 2026-06-15 02:39:11 -04:00
Joseph Doherty 581b541801 code-reviews: regenerate after batch 2 resolutions
All 41 findings from the 42b0037 re-review are now Resolved across
8 modules. Open count = 0 for every module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:29:36 -04:00
Joseph Doherty d3cb311aae Resolve Client.Java-032..036: shared subscription base, batch tokenizer
Client.Java-032  README CLI examples for stream-alarms and
                 acknowledge-alarm now use the correct picocli flags
                 (--filter-prefix and --reference); two regression
                 tests parse each documented invocation.
Client.Java-033  StreamAlarmsCommand publishes an
                 AtomicReference<MxGatewayAlarmFeedSubscription> and
                 mirrors MxEventStream's overflow branch: a failed
                 queue.offer cancels the subscription, queues an
                 IllegalStateException, then queues the END sentinel
                 — preserving the fail-fast contract.
Client.Java-034  BatchCommand routes through a new
                 MxGatewayCli.tokenizeBatchLine POSIX-style shell
                 tokenizer that respects double-quoted, single-quoted,
                 and backslash-escaped arguments.
Client.Java-035  Added streamAlarmsForwardsRequestAndStreamsAlarmFeedMessages
                 to MxGatewayClientSessionTests; asserts request shape,
                 message ordering, and cancellation propagation.
Client.Java-036  Extracted MxGatewayStreamSubscription<TRequest,TResponse>
                 abstract base; the four subscription classes
                 (MxGatewayEventSubscription, MxGatewayAlarmFeedSubscription,
                 MxGatewayActiveAlarmsSubscription, DeployEventSubscription)
                 collapse to ~10-line subclasses. A new contract test
                 runs identical lifecycle / cancellation assertions
                 across all four subclasses.

All resolved at 2026-05-24; gradle build + gradle test BUILD SUCCESSFUL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:29:27 -04:00
Joseph Doherty 186d03e5cc Resolve IntegrationTests-025: stopBoundary for repo-root walker
ResolveRepositoryRoot accepts an optional stopBoundary parameter that
caps the upward walk; production callers pass null and behavior is
unchanged. The two repository-marker tests now seal their walkers
inside their own temp directories, so a redirected TMP or a co-located
C:\src checkout no longer leaks ambient marker-bearing ancestors into
the assertion.

Regression test ResolveRepositoryRoot_StopBoundary_IsolatesWalkerFromAmbientAncestorMarkers
constructs an outer ancestor that carries src/ + .git, confirms the
walker leaks into it without the boundary, then asserts the same call
throws with the boundary supplied.

Resolved at 2026-05-24; IntegrationTestEnvironmentTests 5/5 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:29:09 -04:00
Joseph Doherty 6bae5ea3a3 Resolve Tests-027..031: flake root cause + coverage gaps
Tests-027  GatewayMetrics exposes its internal Meter; the
           StreamEvents_WhenEventIsWritten_RecordsSendDuration listener
           now filters by ReferenceEquals(instrument.Meter, metrics.Meter)
           instead of Meter.Name, so parallel tests with their own
           GatewayMetrics no longer cross-contaminate the families list.
Tests-028  FakeWorkerClient.Kill now captures LastKillReason;
           SessionManager.KillWorkerAsync tests pin the reason
           propagation end-to-end and cover the blank/null guard. The
           DashboardSessionAdminService kill test pins the literal
           dashboard-admin-kill reason.
Tests-029  Added CloseSessionAsync_BlankSessionId_ReturnsFailure to mirror
           the existing KillWorkerAsync blank-id coverage.
Tests-030  DeleteAsync_WhenStoreRefuses_ReportsFriendlyError renamed and
           extended to assert the dashboard-delete-key audit row with
           Details = not-found-or-active. Added
           DeleteAsync_BlankKeyId_ReturnsFailure.
Tests-031  DashboardSnapshotPublisher reconnect test now measures the
           gap from the first throw inside the fake (firstThrowAt) to
           secondSubscribeAt, isolating Task.Delay from StartAsync /
           scheduling overhead.

All resolved at 2026-05-24; 512/512 gateway tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:28:54 -04:00
Joseph Doherty 430187c28b code-reviews: regenerate after batch 1 resolutions
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:50:39 -04:00
Joseph Doherty f5b50c4484 Resolve Client.Python-022..026: TLS-by-default, batch CLI, README
Client.Python-022  README CLI examples for stream-alarms and
                   acknowledge-alarm now use the correct flags;
                   regression test parses every documented line through
                   Click.
Client.Python-023  Re-applied Client.Python-013 — _use_plaintext drops
                   the silent localhost / 127.0.0.1 auto-downgrade
                   branch; --plaintext and --tls are mutually exclusive
                   and TLS is the default.
Client.Python-024  batch dispatch routes through main.main(...,
                   standalone_mode=False) under a redirected stdout
                   instead of click.testing.CliRunner; recursive batch
                   lines are rejected outright.
Client.Python-025  Added behavioural tests for the five bulk SDK methods,
                   stream_alarms, and the new CLI subcommands.
Client.Python-026  _bench_read_bulk hoists 'import time' to module scope
                   and logs cleanup failures instead of swallowing them.

All resolved at 2026-05-24; python -m pytest is 65/65 green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:50:27 -04:00
Joseph Doherty 4a0f88b17d Resolve Client.Rust-022..029: MalformedReply, correlation ids, clippy
Client.Rust-022  Restored Error::MalformedReply for register / add_item /
                 add_item2 and the bulk-subscribe / read-bulk / write-bulk
                 dispatch arms so malformed-but-OK replies fail loudly
                 instead of returning Vec::new().
Client.Rust-023  Restored next_correlation_id and routed every CLI close /
                 stream-alarms / acknowledge-alarm / bench-read-bulk call
                 through it so each call carries a unique opaque token.
Client.Rust-024  Added round-trip tests for read_bulk / write_bulk /
                 write2_bulk / write_secured_bulk / write_secured2_bulk
                 plus stream_alarms and percentile_summary unit tests.
Client.Rust-025  RustClientDesign.md re-synced — new bulk SDK, alarms
                 surface, Error variants, CLI command list, and the
                 Windows stack workaround.
Client.Rust-026  Session::read_bulk now borrows a tag slice; bench-read-
                 bulk binds tags once outside the warm-up / steady-state
                 loops.
Client.Rust-027  .cargo/config.toml selector tightened to
                 cfg(all(windows, target_env = "msvc")) and comment
                 rewritten to match reality (release + debug ship the
                 8 MB reservation).
Client.Rust-028  run_batch removed the empty-line break; stdin EOF is
                 the only terminator.
Client.Rust-029  Re-applied Client.Rust-001 / 002 / 012 — added the
                 missing doc comments, renamed BulkReplyKind variants,
                 and replaced the clone-on-copy with a deref under lock
                 so cargo clippy -D warnings is clean.

All resolved at 2026-05-24; cargo fmt + check + clippy + test all green
(55 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:50:15 -04:00
Joseph Doherty 82996aa8e6 Resolve Client.Go-022..027: bulk flags, bench cancel, batch loop
Client.Go-022  Re-applied Client.Go-015 shape — runWriteBulkVariant drops
               the unused secured param and gates -current-user-id /
               -verifier-user-id / -user-id behind the secured-only
               variants.
Client.Go-023  Re-applied Client.Go-018 shape — bench warm-up and steady-
               state loops respect ctx.Err().
Client.Go-024  Added SDK-level tests for WriteBulk / Write2Bulk /
               WriteSecuredBulk / WriteSecured2Bulk / ReadBulk and
               StreamAlarms via the existing bufconn fake gateway pattern.
Client.Go-025  Five bulk SDK methods short-circuit on empty input without
               an RPC round-trip and document the behavior.
Client.Go-026  runBatch widens scanner.Buffer to 16 MiB and emits an
               error-with-sentinel if a longer line still arrives, rather
               than aborting the session silently.
Client.Go-027  runBatch treats blank lines as skip-and-continue; only EOF
               ends the session.

All resolved at 2026-05-24; gofmt + go vet + go build + go test ./... all
green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:49:58 -04:00
Joseph Doherty 712cb06442 Resolve Client.Dotnet-018..021: README + bench-read-bulk hardening
Client.Dotnet-018  README CLI examples for stream-alarms / acknowledge-alarm
                   replaced with parser-correct flags; new theory test
                   parses each documented README example through the CLI.
Client.Dotnet-019  BenchReadBulkAsync routes through new
                   RequireRegisterServerHandle helper that fails loudly when
                   the OK register reply has no typed payload.
Client.Dotnet-020  Bench steady-state catch is now
                   catch (Exception ex) when (ex is not OperationCanceledException)
                   so user-driven cancellation exits promptly.
Client.Dotnet-021  --timeout-ms now flows through ParseTimeoutMs which
                   rejects negatives with a clear error in both read-bulk
                   and bench-read-bulk.

All resolved at 2026-05-24; 67/67 .NET client tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:49:45 -04:00
Joseph Doherty 4d77279e7e Resolve Server-044..050: KillWorker accounting + admin service hardening
Server-044  KillWorkerAsync catch path now calls _metrics.SessionRemoved
            so the open-session gauge does not leak when KillWorker throws.
Server-045  KillWorkerAsync routes through a new
            GatewaySession.KillWorkerWithCloseGateAsync that takes the
            per-session close lock, so concurrent kills count SessionsClosed
            exactly once.
Server-046  CloseSessionCoreAsync's SessionCloseStartedException branch and
            ShutdownAsync's kill fallback both increment SessionsClosed (not
            just the gauge), so the counter and gauge stay consistent.
Server-047  ApiKeysPage.ConfirmPendingAsync holds PendingAction across the
            awaited action and clears it in finally, matching the sessions
            pages.
Server-048  Closed: the 044/045 regression tests cover the previously-
            untested kill paths.
Server-049  IDashboardSessionAdminService + DashboardSessionAdminService
            now carry XML docs that pin the Admin gate, missing-session
            return-Fail semantics, and the dashboard-admin-kill reason.
Server-050  CloseSessionAsync and KillWorkerAsync catch unexpected
            exceptions after the SessionManagerException catches and return
            a friendly Fail; OperationCanceledException tied to the caller
            token still propagates.

All resolved at 2026-05-24; 503/503 gateway tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:49:34 -04:00
Joseph Doherty 6079c62709 code-reviews: regenerate index at 42b0037
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:56 -04:00
Joseph Doherty 37ef27e8ed code-reviews: bump Worker + Worker.Tests headers to 42b0037
No source changes since d692232 — header bumped to track the latest
reviewed commit, all prior findings remain closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:56 -04:00
Joseph Doherty db2218f395 code-reviews: re-review Client.Java at 42b0037
Append 5 new findings (Client.Java-032..036): README flags for new
alarm subcommands do not exist; StreamAlarmsCommand bounded queue
silently drops alarms instead of fail-fast; BatchCommand whitespace
tokenisation shreds quoted args; no library-side stream_alarms test;
MxGatewayAlarmFeedSubscription is the fourth duplicate subscription
class.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:55 -04:00
Joseph Doherty bc28fee641 code-reviews: re-review Client.Python at 42b0037
Append 5 new findings (Client.Python-022..026): README flags for new
alarm subcommands do not exist; Client.Python-013 regression — the
silent localhost auto-plaintext branch is still present (the prior
Resolution did not survive the rename); production batch path uses
the click.testing.CliRunner helper; no behavioural tests for new SDK
+ CLI; bench cleanup swallows exceptions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:55 -04:00
Joseph Doherty 15fceed536 code-reviews: re-review Client.Rust at 42b0037
Append 8 new findings (Client.Rust-022..029): MalformedReply path
absent from the new bulk SDK methods, hard-coded client_correlation_id
in new CLI commands, no tests for stream_alarms / bulk SDK / bench,
RustClientDesign.md silent on new surface, run_bench_read_bulk clones
tags inside the loop, .cargo/config.toml comment is wrong, run_batch
exits on blank line, and cargo clippy fails at HEAD with three
warnings the prior reviewer punted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:41 -04:00
Joseph Doherty afa82e0989 code-reviews: re-review Client.Go at 42b0037
Append 6 new findings (Client.Go-022..027): regressions of Client.Go-015
(runWriteBulkVariant secured flag) and Client.Go-018 (bench loop
ignores ctx.Err()) reintroduced by the bulk port; no SDK tests for the
new WriteBulk/ReadBulk/StreamAlarms methods; bulk SDK accepts empty
slices; bufio.Scanner buffer + blank-line sentinel can abort the batch
session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:41 -04:00
Joseph Doherty b9ef09d26e code-reviews: re-review Client.Dotnet at 42b0037
Append 4 new findings (Client.Dotnet-018..021): README flags for the
new stream-alarms/acknowledge-alarm subcommands cite options that do
not exist on the CLI; BenchReadBulkAsync reinstates the silent
register-handle fallback and swallows OperationCanceledException;
both new --timeout-ms consumers cast int32 to uint without bounds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:40 -04:00
Joseph Doherty 7d66967122 code-reviews: re-review IntegrationTests at 42b0037
Append 1 new finding (IntegrationTests-025): the
ResolveRepositoryRoot_NoMarkers test walks up from Path.GetTempPath()
through unisolated ancestors; a redirected TMP or co-located checkout
at C:\src silently breaks the assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:26 -04:00
Joseph Doherty 2f8404d2ef code-reviews: re-review Contracts at 42b0037
No new findings — the only Contracts commit in window (bd1d1f1) is a
comment-only proto edit; field numbering remains additive with no
reuse, renumbering, or type narrowing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:25 -04:00
Joseph Doherty 2b92be02b9 code-reviews: re-review Tests at 42b0037
Append 5 new findings (Tests-027..031) covering the
StreamEvents_WhenEventIsWritten_RecordsSendDuration flake root cause
(shared MeterListener by meter name), missing kill-path coverage
(reason propagation + concurrent-kill double-count), asymmetric guard
coverage between Close and Kill, missing audit-failure-path coverage
for ApiKey Delete, and the DashboardSnapshotPublisher reconnect-window
timer sensitivity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:15 -04:00
Joseph Doherty 056f0d8808 code-reviews: re-review Server at 42b0037
Append 7 new findings (Server-044..050) covering the destructive-action
wave: KillWorkerAsync metric/state leaks, ShutdownAsync kill-fallback
gauge leak, inconsistent ConfirmDialog cleanup across pages, missing
XML docs on the new DashboardSessionAdmin surface, and unhandled
RemoveSessionAsync exception paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:15 -04:00
Joseph Doherty a68f0cf222 code-review: regen README, all 21 open findings resolved
Closes the 2026-05-24 resolution sweep at HEAD `83eba4b`. All 21 open
findings from the d692232 re-review are now Resolved:

  Server          327e9c5  Server-031, -032, -038, -039, -040, -041, -042, -043
  Contracts       bd1d1f1  Contracts-016, -017
  Tests           d48099f  Tests-025, -026
  IntegrationTests 865c22a IntegrationTests-022, -023, -024
  Client.Java     10bd0c0  Client.Java-027, -028, -029, -030, -031
  Client.Rust     83eba4b  Client.Rust-021

Also fixes stale "Open findings" header counts in Client.Java and
IntegrationTests findings.md that survived the resolve passes
(the agents updated each finding's Status but missed the header
sum). `regen-readme.py --check` is now green.

Module status: 11 / 11 reviewed, 0 / 276 total open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 03:21:39 -04:00
Joseph Doherty 83eba4bec5 Resolve Client.Rust-021
Client.Rust-021 (Design adherence): RustClientDesign.md "Crate layout"
section now describes the actual flat workspace structure instead of
the aspirational nested form. The replacement text states that the
workspace root is clients/rust/, the top-level crate
zb-mom-ww-mxgateway-client is declared in clients/rust/Cargo.toml
directly, and crates/mxgw-cli/ is the sole [workspace.members] entry.
The accompanying tree lists the real files on disk (Cargo.toml,
Cargo.lock, build.rs, README.md, RustClientDesign.md,
src/{lib,client,session,galaxy,options,auth,error,value,version,generated}.rs
plus the src/generated/ tonic-build output dir, tests/, and
crates/mxgw-cli/).

Doc-only change. cargo build --workspace + cargo test --workspace clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 03:21:07 -04:00
Joseph Doherty 10bd0c0e4d Resolve Client.Java-027..031
Client.Java-027 (Documentation): Updated 17 Gradle task references in
clients/java/README.md (lines 37, 108-110, 160-161, 169-176, 186, 206,
221) and 3 in clients/java/JavaClientDesign.md from the retired short
subproject names to the canonical zb-mom-ww-mxgateway-client /
zb-mom-ww-mxgateway-cli names. Copy-pasting any documented command now
matches the subproject names declared in settings.gradle.

Client.Java-028 (Design adherence): Build-layout block in
JavaClientDesign.md lines 23-27 updated to show the actual package
paths com/zb/mom/ww/mxgateway/{client,cli}/ instead of the retired
com/dohertylan/mxgateway/{client,cli}/ paths.

Client.Java-029 (Documentation): README.md line 210 corrected from
"zb-mom-ww-mxgateway-cli/build/install/mxgateway-cli" to
"zb-mom-ww-mxgateway-cli/build/install/zb-mom-ww-mxgateway-cli" — Gradle
installDist produces a directory whose name matches the project name,
not the short suffix. The e2e script already used the correct path.

Client.Java-030 (Testing coverage): Added
queryActiveAlarmsForwardsRequestAndStreamsSnapshots to
MxGatewayClientSessionTests. The test pushes a QueryActiveAlarmsRequest
carrying session_id / client_correlation_id / alarm_filter_prefix
through an InProcessGateway + TestGatewayService and asserts the server
observed all three request fields, two ActiveAlarmSnapshots stream in
order, and onError is never called. TDD red→green confirmed via a
deliberately-wrong session_id assertion. The re-triage note in
Client.Java-030's resolution clarifies that the finding's reference to
"the existing acknowledgeAlarm test" was aspirational — the alarm RPC
surface had zero coverage before this commit.

Client.Java-031 (Conventions): README.md prose lines 17, 22, 26 updated
to use the canonical zb-mom-ww-mxgateway-client / zb-mom-ww-mxgateway-cli
names so the layout description matches Gradle / IDE project names.

Verification: gradle build BUILD SUCCESSFUL; all Java unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 03:21:06 -04:00
Joseph Doherty 865c22a884 Resolve IntegrationTests-022..024
IntegrationTests-022 (Conventions): ResolveRepositoryRoot now throws
InvalidOperationException when the walk exhausts without finding a root
marker, with a message naming the start directory, the expected markers
(src/, .git, *.sln, *.slnx), and the MXGATEWAY_LIVE_MXACCESS_WORKER_EXE
escape hatch. Replaces the silent fallback to
Directory.GetCurrentDirectory() that previously masked misconfiguration.
New regression test
ResolveRepositoryRoot_NoMarkers_ThrowsInvalidOperationExceptionNamingStartAndMarkers
in IntegrationTestEnvironmentTests asserts the throw and the message
contents. TDD red→green confirmed.

IntegrationTests-023 (Testing coverage): DashboardLdapLiveTests's
AuthenticateAsync_AdminInGwAdminGroup_Succeeds now asserts that the
authenticated principal carries a ClaimTypes.Role claim with value
DashboardRoles.Admin in addition to the existing LdapGroupClaimType
assertion. A regression in MapGroupsToRoles (returning an empty list or
missing the RDN fallback) would now surface here. Gated by
MXGATEWAY_RUN_LIVE_LDAP_TESTS.

IntegrationTests-024 (Conventions): Option (b) — extracted within
IntegrationTests. New file TestSupport/NullDashboardEventBroadcaster.cs
(public type, private ctor, singleton Instance). The inline class at
the bottom of WorkerLiveMxAccessSmokeTests is gone; the file now imports
the shared type. Matches the unit-test project's Tests-007 / Tests-021 /
Tests-025 pattern while keeping the two test projects independently
buildable (no shared test-helpers project crossing module boundaries).

Verification: dotnet build src/ZB.MOM.WW.MxGateway.IntegrationTests
clean; 19/19 integration tests passing (live MxAccess + LDAP + Galaxy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 03:20:40 -04:00
Joseph Doherty d48099f0d0 Resolve Tests-025..026
Tests-025 (Conventions): Extracted the previously-duplicated
NullDashboardEventBroadcaster into TestSupport/NullDashboardEventBroadcaster.cs
(singleton Instance, private ctor). The two nested copies in
EventStreamServiceTests and GatewayEndToEndFakeWorkerSmokeTests were
removed; both files now use the shared type via
'using ZB.MOM.WW.MxGateway.Tests.TestSupport;'. The Server-041 regression
test's ThrowingDashboardEventBroadcaster is intentionally left nested —
single-file usage doesn't warrant promotion to TestSupport. The third
copy in IntegrationTests/WorkerLiveMxAccessSmokeTests was handled by
IntegrationTests-024 in its own commit.

Tests-026 (Testing coverage): Added a new RecordingDashboardEventBroadcaster
test double in TestSupport — a thread-safe (ConcurrentQueue<DashboardEventCapture>)
recorder. New fixture
StreamEventsAsync_PublishesEachEventToDashboardBroadcaster in
EventStreamServiceTests pushes two events through the fake session and
asserts the broadcaster received both with the correct sessionId and
WorkerSequence. TDD red→green confirmed: the deliberately-wrong
"Expected 3, Actual 2" red phase proved the recording fake was actually
invoked by the production code path.

Verification: 486/486 server tests passing (485 previous + 1 new).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 03:20:40 -04:00
Joseph Doherty bd1d1f1c0e Resolve Contracts-016..017
Contracts-016 (Conventions): QueryActiveAlarmsRequest.session_id header
replaced with the unambiguous "Clients may leave session_id empty; the
gateway currently ignores it and serves the session-less central-monitor
cache. A future version may use it to scope the snapshot to one
session." Removes the ambiguity that the prior "reserved for future
use" wording introduced.

Contracts-017 (Documentation): The rpc QueryActiveAlarms comment now
includes the alarm_filter_prefix description: "QueryActiveAlarmsRequest.alarm_filter_prefix
optionally narrows the snapshot to alarms whose alarm_full_reference
starts with the given prefix; an empty prefix returns the full set."

Both are proto-comment-only changes — no wire-format impact, no field
renumbering, and the regenerated MxaccessGateway.cs / MxaccessGatewayGrpc.cs
carry only the doc-comment delta. Added the additive-only regression
guard QueryActiveAlarmsRequest_PinsFieldNumbersAndRoundTripsPrefixFilter
to ProtobufContractRoundTripTests — pins
session_id=1 / client_correlation_id=2 / alarm_filter_prefix=3 by
descriptor lookup and round-trips the message with and without the
filter populated.

Verification: dotnet build src/ZB.MOM.WW.MxGateway.slnx clean;
ProtobufContractRoundTripTests 40/40 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 03:20:13 -04:00
Joseph Doherty 327e9c5f94 Resolve Server-031..032 (re-triaged) + Server-038..043
Server-031: re-triaged. The recommended gateway-side
"skip-while-command-in-flight" guard is already in place at
WorkerClient.HeartbeatLoopAsync via WorkerClientOptions.HeartbeatStuckCeiling
(default 75s = 5× HeartbeatGrace). Two regression tests pin the
behaviour. Recommendation #1 (decouple worker-side _writeLock) is a
Worker-module concern (Worker-017 / Worker-023) and out of scope here.

Server-032: re-triaged. Recommendation #2 (rich diagnostic) is already
in EnqueueWorkerEventAsync, with #3 (overflow grace) absorbed by the
TryWrite → WriteAsync-with-timeout fall-through. Test
EnqueueWorkerEvent_WhenChannelFullPastTimeout_FaultsWithRichDiagnostic
pins the diagnostic string. Recommendation #1 (prose contract in
gateway.md / docs) is deferred — outside this pass's edit scope.

Server-038 (Security): EventsHub.SubscribeSession's missing per-session
ACL is documented with a TODO(per-session-acl) and a <remarks> block
explaining the v1 acceptance (any dashboard role can subscribe to any
session — non-secret metadata, redacted value logging). The per-session
ACL design lands in a follow-up once a session-scoped role exists.

Server-039 (Error handling): HubTokenService.Validate now rejects a
deserialized payload where both Name and NameIdentifier are null/empty.
New test file HubTokenServiceTests.cs covers the regression and five
sanity cases. TDD confirmed.

Server-040 (Conventions): MapGroupsToRoles gains a precedence comment
explaining "full literal match first, leading-RDN fallback;
OrdinalIgnoreCase via DashboardOptions.GroupToRole". Documentation-only.

Server-041 (Design adherence): EventStreamService.ProduceEventsAsync
wraps the broadcaster.Publish call in try/catch (Exception). The
producer loop and gRPC stream are no longer at the mercy of the
broadcaster's never-throw discipline. New regression test
StreamEventsAsync_WhenDashboardBroadcasterThrows_StillYieldsEventsAndDoesNotFaultSession.

Server-042 (Performance): DashboardSnapshotPublisher.ExecuteAsync now
mirrors AlarmsHubPublisher's reconnect loop — wraps the await foreach
in a while-not-cancelled, catches general exceptions, and Task.Delays
5s before retrying. An internal ctor accepts a shorter delay for the
test. New test file DashboardSnapshotPublisherTests.cs covers the
throw-then-yield reconnect path and the normal-completion case.

Server-043 (Documentation): HubTokenService class XML doc gains a
<remarks> describing the singleton lifetime, the two consumer scopes
(DashboardHubConnectionFactory scoped, HubTokenAuthenticationHandler
transient), and the thread-safety contract.

Verification: dotnet build src/ZB.MOM.WW.MxGateway.slnx clean
(0 warnings / 0 errors); src/ZB.MOM.WW.MxGateway.Tests 486/486 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 03:18:52 -04:00