Commit Graph

248 Commits

Author SHA1 Message Date
Joseph Doherty ba157b4b4f grpc: implement BrowseChildren handler + metadata:read scope 2026-05-28 13:08:45 -04:00
Joseph Doherty 87e22dd529 galaxy: add GalaxyBrowseProjector for direct-children projection 2026-05-28 12:58:07 -04:00
Joseph Doherty d9eaf4b056 galaxy: add ChildrenByParent index for level-at-a-time browse 2026-05-28 12:51:48 -04:00
Joseph Doherty 2c5c5e5c7e contracts: add BrowseChildren RPC for lazy hierarchy browse 2026-05-28 12:47:02 -04:00
Joseph Doherty b3ebf583ad docs: implementation plan for lazy-browse BrowseChildren RPC
12-task bite-sized plan executing the approved design.
Includes native task persistence file.
2026-05-28 12:41:11 -04:00
Joseph Doherty edb812d859 docs: design for lazy-browse BrowseChildren RPC
OPC UA-style level-at-a-time browse across gRPC, dashboard, and the
shared cache projector. Server still loads the full Galaxy hierarchy;
laziness is wire-side and UI-side only.
2026-05-28 12:34:37 -04:00
Joseph Doherty 795eee72e3 client/dotnet: backfill XML doc comments to satisfy analyzers
Adds missing <summary>/<param> docs across the .NET client library and its
test suite so CommentChecker reports zero issues. TreatWarningsAsErrors
requires the analyzer surface clean before publishing the NuGet package.
2026-05-27 14:30:53 -04:00
Joseph Doherty 615b487a77 docs+ui: backfill XML doc comments and finish dashboard layout pass
Adds missing <summary>/<param> XML docs across 99 server, worker, and test
files so CommentChecker reports zero issues (TreatWarningsAsErrors needs the
analyzer clean). Bundles in WIP dashboard work: NavSection extraction,
MainLayout/site.css/js styling alignment, and DashboardOptions/Auth tweaks.
2026-05-27 14:20:10 -04:00
Joseph Doherty 382861c602 build: add NonWindows.slnx for macOS/Linux dev hosts
The Worker + Worker.Tests projects pull in the Windows-only ArchestrA.MxAccess
COM stack and can't be built off Windows. Add a sibling .slnx that lists only
the cross-platform projects (Contracts, Server, IntegrationTests, Tests) so
non-Windows hosts can restore + build the rest of the solution with:

    dotnet build src/ZB.MOM.WW.MxGateway.NonWindows.slnx

The canonical solution on Windows remains ZB.MOM.WW.MxGateway.slnx.
2026-05-26 01:18:29 -04:00
Joseph Doherty ba2b936609 ui: align dashboard styling with ScadaLink master conventions
- Rename DashboardLayout.razor -> MainLayout.razor; dashboard.css -> site.css
- Sidebar 218 -> 220px; add hamburger + Bootstrap collapse for <lg viewports
- Rename .metric-* KPI classes to .agg-* (matches shared theme tokens)
- Rebuild ApiKeysPage create form as card + h6 subsections + bottom Save/Cancel
2026-05-26 01:12:54 -04:00
Joseph Doherty 7fc1955287 Dashboard: handle GET /logout (was 405) by signing out + redirecting to /login
Browsers that navigate directly to /logout via the address bar issued a GET
against a POST-only route and got 405 Method Not Allowed. Logout is
self-destructive, so the GET path can skip antiforgery; the existing POST
form (used by the layout's Sign out button) is unchanged and still
antiforgery-protected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 23:40:39 -04:00
Joseph Doherty 54480dde61 Add review-process + glauth design docs, bench scripts; ignore install/
Picks up the missing glauth.md referenced by CLAUDE.md, captures the
review workflow alongside the bench-read-bulk and review-readme helper
scripts, and excludes the local install/ deployment tree from source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 23:26:21 -04:00
Joseph Doherty 581b541801 code-reviews: regenerate after batch 2 resolutions
All 41 findings from the 42b0037 re-review are now Resolved across
8 modules. Open count = 0 for every module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:29:36 -04:00
Joseph Doherty d3cb311aae Resolve Client.Java-032..036: shared subscription base, batch tokenizer
Client.Java-032  README CLI examples for stream-alarms and
                 acknowledge-alarm now use the correct picocli flags
                 (--filter-prefix and --reference); two regression
                 tests parse each documented invocation.
Client.Java-033  StreamAlarmsCommand publishes an
                 AtomicReference<MxGatewayAlarmFeedSubscription> and
                 mirrors MxEventStream's overflow branch: a failed
                 queue.offer cancels the subscription, queues an
                 IllegalStateException, then queues the END sentinel
                 — preserving the fail-fast contract.
Client.Java-034  BatchCommand routes through a new
                 MxGatewayCli.tokenizeBatchLine POSIX-style shell
                 tokenizer that respects double-quoted, single-quoted,
                 and backslash-escaped arguments.
Client.Java-035  Added streamAlarmsForwardsRequestAndStreamsAlarmFeedMessages
                 to MxGatewayClientSessionTests; asserts request shape,
                 message ordering, and cancellation propagation.
Client.Java-036  Extracted MxGatewayStreamSubscription<TRequest,TResponse>
                 abstract base; the four subscription classes
                 (MxGatewayEventSubscription, MxGatewayAlarmFeedSubscription,
                 MxGatewayActiveAlarmsSubscription, DeployEventSubscription)
                 collapse to ~10-line subclasses. A new contract test
                 runs identical lifecycle / cancellation assertions
                 across all four subclasses.

All resolved at 2026-05-24; gradle build + gradle test BUILD SUCCESSFUL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:29:27 -04:00
Joseph Doherty 186d03e5cc Resolve IntegrationTests-025: stopBoundary for repo-root walker
ResolveRepositoryRoot accepts an optional stopBoundary parameter that
caps the upward walk; production callers pass null and behavior is
unchanged. The two repository-marker tests now seal their walkers
inside their own temp directories, so a redirected TMP or a co-located
C:\src checkout no longer leaks ambient marker-bearing ancestors into
the assertion.

Regression test ResolveRepositoryRoot_StopBoundary_IsolatesWalkerFromAmbientAncestorMarkers
constructs an outer ancestor that carries src/ + .git, confirms the
walker leaks into it without the boundary, then asserts the same call
throws with the boundary supplied.

Resolved at 2026-05-24; IntegrationTestEnvironmentTests 5/5 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:29:09 -04:00
Joseph Doherty 6bae5ea3a3 Resolve Tests-027..031: flake root cause + coverage gaps
Tests-027  GatewayMetrics exposes its internal Meter; the
           StreamEvents_WhenEventIsWritten_RecordsSendDuration listener
           now filters by ReferenceEquals(instrument.Meter, metrics.Meter)
           instead of Meter.Name, so parallel tests with their own
           GatewayMetrics no longer cross-contaminate the families list.
Tests-028  FakeWorkerClient.Kill now captures LastKillReason;
           SessionManager.KillWorkerAsync tests pin the reason
           propagation end-to-end and cover the blank/null guard. The
           DashboardSessionAdminService kill test pins the literal
           dashboard-admin-kill reason.
Tests-029  Added CloseSessionAsync_BlankSessionId_ReturnsFailure to mirror
           the existing KillWorkerAsync blank-id coverage.
Tests-030  DeleteAsync_WhenStoreRefuses_ReportsFriendlyError renamed and
           extended to assert the dashboard-delete-key audit row with
           Details = not-found-or-active. Added
           DeleteAsync_BlankKeyId_ReturnsFailure.
Tests-031  DashboardSnapshotPublisher reconnect test now measures the
           gap from the first throw inside the fake (firstThrowAt) to
           secondSubscribeAt, isolating Task.Delay from StartAsync /
           scheduling overhead.

All resolved at 2026-05-24; 512/512 gateway tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 09:28:54 -04:00
Joseph Doherty 430187c28b code-reviews: regenerate after batch 1 resolutions
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:50:39 -04:00
Joseph Doherty f5b50c4484 Resolve Client.Python-022..026: TLS-by-default, batch CLI, README
Client.Python-022  README CLI examples for stream-alarms and
                   acknowledge-alarm now use the correct flags;
                   regression test parses every documented line through
                   Click.
Client.Python-023  Re-applied Client.Python-013 — _use_plaintext drops
                   the silent localhost / 127.0.0.1 auto-downgrade
                   branch; --plaintext and --tls are mutually exclusive
                   and TLS is the default.
Client.Python-024  batch dispatch routes through main.main(...,
                   standalone_mode=False) under a redirected stdout
                   instead of click.testing.CliRunner; recursive batch
                   lines are rejected outright.
Client.Python-025  Added behavioural tests for the five bulk SDK methods,
                   stream_alarms, and the new CLI subcommands.
Client.Python-026  _bench_read_bulk hoists 'import time' to module scope
                   and logs cleanup failures instead of swallowing them.

All resolved at 2026-05-24; python -m pytest is 65/65 green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:50:27 -04:00
Joseph Doherty 4a0f88b17d Resolve Client.Rust-022..029: MalformedReply, correlation ids, clippy
Client.Rust-022  Restored Error::MalformedReply for register / add_item /
                 add_item2 and the bulk-subscribe / read-bulk / write-bulk
                 dispatch arms so malformed-but-OK replies fail loudly
                 instead of returning Vec::new().
Client.Rust-023  Restored next_correlation_id and routed every CLI close /
                 stream-alarms / acknowledge-alarm / bench-read-bulk call
                 through it so each call carries a unique opaque token.
Client.Rust-024  Added round-trip tests for read_bulk / write_bulk /
                 write2_bulk / write_secured_bulk / write_secured2_bulk
                 plus stream_alarms and percentile_summary unit tests.
Client.Rust-025  RustClientDesign.md re-synced — new bulk SDK, alarms
                 surface, Error variants, CLI command list, and the
                 Windows stack workaround.
Client.Rust-026  Session::read_bulk now borrows a tag slice; bench-read-
                 bulk binds tags once outside the warm-up / steady-state
                 loops.
Client.Rust-027  .cargo/config.toml selector tightened to
                 cfg(all(windows, target_env = "msvc")) and comment
                 rewritten to match reality (release + debug ship the
                 8 MB reservation).
Client.Rust-028  run_batch removed the empty-line break; stdin EOF is
                 the only terminator.
Client.Rust-029  Re-applied Client.Rust-001 / 002 / 012 — added the
                 missing doc comments, renamed BulkReplyKind variants,
                 and replaced the clone-on-copy with a deref under lock
                 so cargo clippy -D warnings is clean.

All resolved at 2026-05-24; cargo fmt + check + clippy + test all green
(55 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:50:15 -04:00
Joseph Doherty 82996aa8e6 Resolve Client.Go-022..027: bulk flags, bench cancel, batch loop
Client.Go-022  Re-applied Client.Go-015 shape — runWriteBulkVariant drops
               the unused secured param and gates -current-user-id /
               -verifier-user-id / -user-id behind the secured-only
               variants.
Client.Go-023  Re-applied Client.Go-018 shape — bench warm-up and steady-
               state loops respect ctx.Err().
Client.Go-024  Added SDK-level tests for WriteBulk / Write2Bulk /
               WriteSecuredBulk / WriteSecured2Bulk / ReadBulk and
               StreamAlarms via the existing bufconn fake gateway pattern.
Client.Go-025  Five bulk SDK methods short-circuit on empty input without
               an RPC round-trip and document the behavior.
Client.Go-026  runBatch widens scanner.Buffer to 16 MiB and emits an
               error-with-sentinel if a longer line still arrives, rather
               than aborting the session silently.
Client.Go-027  runBatch treats blank lines as skip-and-continue; only EOF
               ends the session.

All resolved at 2026-05-24; gofmt + go vet + go build + go test ./... all
green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:49:58 -04:00
Joseph Doherty 712cb06442 Resolve Client.Dotnet-018..021: README + bench-read-bulk hardening
Client.Dotnet-018  README CLI examples for stream-alarms / acknowledge-alarm
                   replaced with parser-correct flags; new theory test
                   parses each documented README example through the CLI.
Client.Dotnet-019  BenchReadBulkAsync routes through new
                   RequireRegisterServerHandle helper that fails loudly when
                   the OK register reply has no typed payload.
Client.Dotnet-020  Bench steady-state catch is now
                   catch (Exception ex) when (ex is not OperationCanceledException)
                   so user-driven cancellation exits promptly.
Client.Dotnet-021  --timeout-ms now flows through ParseTimeoutMs which
                   rejects negatives with a clear error in both read-bulk
                   and bench-read-bulk.

All resolved at 2026-05-24; 67/67 .NET client tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:49:45 -04:00
Joseph Doherty 4d77279e7e Resolve Server-044..050: KillWorker accounting + admin service hardening
Server-044  KillWorkerAsync catch path now calls _metrics.SessionRemoved
            so the open-session gauge does not leak when KillWorker throws.
Server-045  KillWorkerAsync routes through a new
            GatewaySession.KillWorkerWithCloseGateAsync that takes the
            per-session close lock, so concurrent kills count SessionsClosed
            exactly once.
Server-046  CloseSessionCoreAsync's SessionCloseStartedException branch and
            ShutdownAsync's kill fallback both increment SessionsClosed (not
            just the gauge), so the counter and gauge stay consistent.
Server-047  ApiKeysPage.ConfirmPendingAsync holds PendingAction across the
            awaited action and clears it in finally, matching the sessions
            pages.
Server-048  Closed: the 044/045 regression tests cover the previously-
            untested kill paths.
Server-049  IDashboardSessionAdminService + DashboardSessionAdminService
            now carry XML docs that pin the Admin gate, missing-session
            return-Fail semantics, and the dashboard-admin-kill reason.
Server-050  CloseSessionAsync and KillWorkerAsync catch unexpected
            exceptions after the SessionManagerException catches and return
            a friendly Fail; OperationCanceledException tied to the caller
            token still propagates.

All resolved at 2026-05-24; 503/503 gateway tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:49:34 -04:00
Joseph Doherty 6079c62709 code-reviews: regenerate index at 42b0037
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:56 -04:00
Joseph Doherty 37ef27e8ed code-reviews: bump Worker + Worker.Tests headers to 42b0037
No source changes since d692232 — header bumped to track the latest
reviewed commit, all prior findings remain closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:56 -04:00
Joseph Doherty db2218f395 code-reviews: re-review Client.Java at 42b0037
Append 5 new findings (Client.Java-032..036): README flags for new
alarm subcommands do not exist; StreamAlarmsCommand bounded queue
silently drops alarms instead of fail-fast; BatchCommand whitespace
tokenisation shreds quoted args; no library-side stream_alarms test;
MxGatewayAlarmFeedSubscription is the fourth duplicate subscription
class.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:55 -04:00
Joseph Doherty bc28fee641 code-reviews: re-review Client.Python at 42b0037
Append 5 new findings (Client.Python-022..026): README flags for new
alarm subcommands do not exist; Client.Python-013 regression — the
silent localhost auto-plaintext branch is still present (the prior
Resolution did not survive the rename); production batch path uses
the click.testing.CliRunner helper; no behavioural tests for new SDK
+ CLI; bench cleanup swallows exceptions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:55 -04:00
Joseph Doherty 15fceed536 code-reviews: re-review Client.Rust at 42b0037
Append 8 new findings (Client.Rust-022..029): MalformedReply path
absent from the new bulk SDK methods, hard-coded client_correlation_id
in new CLI commands, no tests for stream_alarms / bulk SDK / bench,
RustClientDesign.md silent on new surface, run_bench_read_bulk clones
tags inside the loop, .cargo/config.toml comment is wrong, run_batch
exits on blank line, and cargo clippy fails at HEAD with three
warnings the prior reviewer punted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:41 -04:00
Joseph Doherty afa82e0989 code-reviews: re-review Client.Go at 42b0037
Append 6 new findings (Client.Go-022..027): regressions of Client.Go-015
(runWriteBulkVariant secured flag) and Client.Go-018 (bench loop
ignores ctx.Err()) reintroduced by the bulk port; no SDK tests for the
new WriteBulk/ReadBulk/StreamAlarms methods; bulk SDK accepts empty
slices; bufio.Scanner buffer + blank-line sentinel can abort the batch
session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:41 -04:00
Joseph Doherty b9ef09d26e code-reviews: re-review Client.Dotnet at 42b0037
Append 4 new findings (Client.Dotnet-018..021): README flags for the
new stream-alarms/acknowledge-alarm subcommands cite options that do
not exist on the CLI; BenchReadBulkAsync reinstates the silent
register-handle fallback and swallows OperationCanceledException;
both new --timeout-ms consumers cast int32 to uint without bounds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:40 -04:00
Joseph Doherty 7d66967122 code-reviews: re-review IntegrationTests at 42b0037
Append 1 new finding (IntegrationTests-025): the
ResolveRepositoryRoot_NoMarkers test walks up from Path.GetTempPath()
through unisolated ancestors; a redirected TMP or co-located checkout
at C:\src silently breaks the assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:26 -04:00
Joseph Doherty 2f8404d2ef code-reviews: re-review Contracts at 42b0037
No new findings — the only Contracts commit in window (bd1d1f1) is a
comment-only proto edit; field numbering remains additive with no
reuse, renumbering, or type narrowing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:25 -04:00
Joseph Doherty 2b92be02b9 code-reviews: re-review Tests at 42b0037
Append 5 new findings (Tests-027..031) covering the
StreamEvents_WhenEventIsWritten_RecordsSendDuration flake root cause
(shared MeterListener by meter name), missing kill-path coverage
(reason propagation + concurrent-kill double-count), asymmetric guard
coverage between Close and Kill, missing audit-failure-path coverage
for ApiKey Delete, and the DashboardSnapshotPublisher reconnect-window
timer sensitivity.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:15 -04:00
Joseph Doherty 056f0d8808 code-reviews: re-review Server at 42b0037
Append 7 new findings (Server-044..050) covering the destructive-action
wave: KillWorkerAsync metric/state leaks, ShutdownAsync kill-fallback
gauge leak, inconsistent ConfirmDialog cleanup across pages, missing
XML docs on the new DashboardSessionAdmin surface, and unhandled
RemoveSessionAsync exception paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 08:28:15 -04:00
Joseph Doherty 42b0037376 Dashboard: replace inline fully-qualified type refs with @using
ApiKeysPage and GalaxyPage referenced ApiKeyConstraints and
GalaxyRepositoryOptions by full namespace inside @code. Add the
appropriate @using directive at the top of each file and let the
short type names resolve through that.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 07:51:00 -04:00
Joseph Doherty de7639a3e9 Dashboard: fix 500 on / from duplicate endpoint mapping
GatewayApplication.MapGatewayEndpoints registered MapGet("/", ...)
that redirected to /health/live. After the dashboard added a Blazor
component at @page "/", both endpoints matched GET / and the matcher
threw AmbiguousMatchException, surfacing as a 500. The redirect
predated the dashboard home page — drop it so / lands on Overview.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 07:48:43 -04:00
Joseph Doherty 8738735f0d clients: document StreamAlarms + AcknowledgeAlarm in each README
Each client's README now covers the alarms surface in both the SDK
section (StreamAlarms / AcknowledgeAlarm beside the existing
QueryActiveAlarms entry, with the streaming-cancellation note) and
the CLI examples (stream-alarms / acknowledge-alarm invocations
mirroring the in-tree implementations across .NET, Go, Rust, Python,
and Java).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 07:40:23 -04:00
Joseph Doherty e80f3c70b6 docs: cover admin dashboard actions + API key Delete
Update the design docs so they match the implemented Admin-only
dashboard surface. GatewayDashboardDesign now documents the Close
session / Kill worker controls and the new Delete action on revoked
API keys, plus the ConfirmDialog gate for every destructive action.
Sessions.md adds the SessionManager.KillWorkerAsync entry alongside
CloseSessionAsync and explains the immediate-kill semantics. Authentication.md adds the IApiKeyAdminStore.DeleteAsync write path
and the dashboard-delete-key audit event. DashboardInterfaceDesign
drops the "read-only until admin workflows have a separate design"
line in favor of the confirm-before-act invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 07:35:25 -04:00
Joseph Doherty 24cc5fd0f0 Dashboard: delete revoked API keys + confirm Rotate/Revoke/Delete
Add IApiKeyAdminStore.DeleteAsync that only deletes already-revoked
rows (active keys must be revoked first so the revoke event lands in
the audit log before the row disappears) and a matching admin-gated
DashboardApiKeyManagementService.DeleteAsync. ApiKeysPage now shows
Delete on revoked rows in place of the old "No actions" stub, and
Rotate/Revoke/Delete all route through ConfirmDialog so each
destructive action requires an explicit confirmation step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 07:30:30 -04:00
Joseph Doherty c5153d68bb Dashboard: fix API keys page stuck on "Loading"
ApiKeysPage.OnInitializedAsync overrode the base without chaining, so
DashboardPageBase's Snapshot seed and hub connect never ran and the
page rendered the null-snapshot empty state forever.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 07:22:38 -04:00
Joseph Doherty 0e56b5befb Dashboard: confirm before Close session / Kill worker
Add a shared ConfirmDialog component and route Sessions, Workers, and
SessionDetails Close/Kill buttons through it. The dialog shows the
target session id and a color-matched confirm button (yellow Close,
red Kill); Cancel dismisses without invoking the admin service.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 07:17:32 -04:00
Joseph Doherty c5e7479ee4 Dashboard: admin-only Close session / Kill worker
Add IDashboardSessionAdminService (Admin-role gate, friendly errors,
audit logging) wrapping a new ISessionManager.KillWorkerAsync that
skips graceful shutdown and cleans up registry/metrics. Sessions,
Workers, and SessionDetails pages render Close / Kill buttons only
when CanManage; the service re-checks the role on every call so
forged clicks return Unauthenticated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 07:10:32 -04:00
Joseph Doherty 8a0c59d7e8 Java client: port stream-alarms and acknowledge-alarm
Adds the session-less alarm CLI subcommands to the Java CLI. stream-alarms
attaches to the gateway's central alarm feed (--filter-prefix, --limit, --json
— NDJSON, one AlarmFeedMessage per line); acknowledge-alarm is a unary ack
(--reference required, --comment, --operator). streamAlarms joins
queryActiveAlarms on MxGatewayClient and uses a new
MxGatewayAlarmFeedSubscription cancellable handle. Batch dispatch re-enters the
picocli command line per stdin line, so registering the two new subcommands
suffices.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 06:46:03 -04:00
Joseph Doherty 828e3e6cf6 Python client: port stream-alarms and acknowledge-alarm
Adds the session-less alarm CLI subcommands to mxgw-py. stream-alarms reads a
bounded slice of the gateway's central alarm feed (--filter-prefix,
--max-messages, --timeout, --json; aggregate `{messages: [...]}`);
acknowledge-alarm is a unary ack (--reference required, --comment, --operator).
GatewayClient.stream_alarms joins query_active_alarms via a
_canceling_alarm_feed_iterator helper mirroring the existing
_canceling_active_alarms_iterator pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 06:45:54 -04:00
Joseph Doherty 7de4efeb02 Rust client: port stream-alarms and acknowledge-alarm + fix stream-events family + 8MB Windows stack
Adds the session-less alarm CLI subcommands to mxgw. stream-alarms attaches to
the gateway's central alarm feed (--filter-prefix, --max-events, --json/--jsonl;
aggregate shape `{messageCount, messages: [...]}`); acknowledge-alarm is a unary
ack (--reference required, --comment, --operator). stream_alarms joins
query_active_alarms on GatewayClient and re-exports AlarmFeedStream.

Also extends stream-events JSON to emit a full `events` array (itemHandle, value
projected to protojson-shaped `*Value` keys, etc.) instead of just `eventCount`,
matching the other four CLIs, and renders MxEvent.family as the protobuf enum
NAME (MX_EVENT_FAMILY_ON_WRITE_COMPLETE) rather than the raw i32 so the e2e
write round-trip can recognise the OnWriteComplete echo.

Adds clients/rust/.cargo/config.toml bumping the Windows main-thread stack to
8 MB via /STACK:8388608. clap-derive's Command enum (one variant per subcommand)
overflowed the default 1 MB stack in debug builds after the new variants
landed; release builds were unaffected but the e2e matrix runs Rust via
`cargo run` (debug).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 06:45:46 -04:00
Joseph Doherty 6f0d142639 Go client: port stream-alarms and acknowledge-alarm
Adds the session-less alarm CLI subcommands to mxgw-go. stream-alarms attaches
to the gateway's central alarm feed (--filter-prefix, --limit, --json — NDJSON,
one AlarmFeedMessage per line); acknowledge-alarm is a unary ack (--reference
required, --comment, --operator). StreamAlarms joins QueryActiveAlarms on the
public Client and is wired through the existing batch dispatcher via runWithIO.
SDK type aliases for StreamAlarmsRequest / AlarmFeedMessage / StreamAlarmsClient
land alongside the existing alarm types.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 06:45:32 -04:00
Joseph Doherty 11cc6715ed .NET client: port stream-alarms and acknowledge-alarm + fix stream-events OCE
Adds the session-less alarm CLI subcommands. stream-alarms attaches to the
gateway's central alarm feed (--filter-prefix, --max-events, --json/--jsonl);
acknowledge-alarm is a unary ack (--reference required, --comment, --operator).
StreamAlarmsAsync joins QueryActiveAlarmsAsync on MxGatewayClient and the
transport interface; the CLI client interface, adapter, and FakeGatewayTransport
follow.

Also fixes the OCE bug exposed by -VerifyWrite in the cross-language e2e:
StreamEventsAsync's await foreach now swallows OperationCanceledException when
the supplied cancellation token is the one that fired (graceful end-of-window),
and RunBatchAsync no longer excludes OCE from its outer catch — so a streaming
command that hits its --timeout reports a JSON error inside its EOR-delimited
record instead of killing the long-lived batch process.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 06:45:24 -04:00
Joseph Doherty f90bff01db Java client: port bulk read/write SDK methods + CLI subcommands
Final language in the bulk-CLI port wave. HEAD's MxGatewaySession had
only the subscribe-style bulks; this commit adds the value-bulks plus
matching picocli subcommands and a bench-read-bulk harness.

SDK (MxGatewaySession.java):
- List<BulkWriteResult> writeBulk(int serverHandle, List<WriteBulkEntry> entries)
- List<BulkWriteResult> write2Bulk(int serverHandle, List<Write2BulkEntry> entries)
- List<BulkWriteResult> writeSecuredBulk(int serverHandle, List<WriteSecuredBulkEntry> entries)
- List<BulkWriteResult> writeSecured2Bulk(int serverHandle, List<WriteSecured2BulkEntry> entries)
- List<BulkReadResult> readBulk(int serverHandle, List<String> tagAddresses, Duration timeout)

readBulk uses java.time.Duration for the timeout parameter (idiomatic
Java) and internally converts to the timeoutMs proto field;
Duration.ZERO / null both delegate to the worker default. Per-entry
secured user ids stay on each WriteSecured(2)BulkEntry to match the
proto's per-row shape.

CLI (MxGatewayCli.java):
- read-bulk / write-bulk / write2-bulk / write-secured-bulk /
  write-secured2-bulk as picocli @Command subcommands. Write families
  share value-parsing logic; gating of --current-user-id /
  --verifier-user-id / --timestamp matches the cross-language flag
  contract.
- bench-read-bulk: --iterations / --warmup loop with avg/min/max ms
  reporting plus a --json mode that emits the cross-language bench
  JSON schema.

A small fixture in MxGatewayCliTests.FakeSession adds stub
implementations of the five new interface methods so the test module
compiles.

Verification: gradle build BUILD SUCCESSFUL (4 tasks executed, all
tests pass); gradle :zb-mom-ww-mxgateway-cli:installDist BUILD
SUCCESSFUL. Manual smoke against live gateway on localhost:5120:
open-session → register → read-bulk cold (wasCached=false both tags)
→ subscribe-bulk → read-bulk warm (wasCached=true both tags) →
write-bulk int32 111,222 (both wasSuccessful=true) → write2-bulk
timestamped (both wasSuccessful=true) → write-secured-bulk and
write-secured2-bulk return per-entry MXAccess "Value does not fall
within the expected range" failures with the configured user/verifier
ids (0,0) — confirming the SDK does NOT throw on per-entry MXAccess
failures and surfaces them through BulkWriteResult exactly as the
.NET and Go ports do → bench-read-bulk iterations=20 avg=9.5 ms
last_success=2/2 cached=2/2 → close-session SESSION_STATE_CLOSED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 04:50:34 -04:00
Joseph Doherty 6add4b4acc Python client: port bulk read/write SDK methods + CLI subcommands
Mirrors the .NET / Go ports of divergent branch commit f220908. HEAD's
Session class had only the subscribe-style bulks; this commit adds the
value-bulk SDK surface plus matching CLI subcommands and a
bench-read-bulk harness.

SDK (zb_mom_ww_mxgateway/session.py):
- async def write_bulk(server_handle, entries, *, correlation_id="")
  → list[pb.BulkWriteResult]
- async def write2_bulk(server_handle, entries, *, correlation_id="")
  → list[pb.BulkWriteResult]
- async def write_secured_bulk(server_handle, entries, *, correlation_id="")
  → list[pb.BulkWriteResult]
- async def write_secured2_bulk(server_handle, entries, *, correlation_id="")
  → list[pb.BulkWriteResult]
- async def read_bulk(server_handle, tag_addresses, *, timeout_ms=0,
  correlation_id="") → list[pb.BulkReadResult]

All five reuse the existing _ensure_bulk_size validator and route
through the existing invoke() pipeline. read_bulk additionally enforces
timeout_ms >= 0.

CLI (zb_mom_ww_mxgateway_cli/commands.py):
- read-bulk / write-bulk / write2-bulk / write-secured-bulk /
  write-secured2-bulk registered as click @main.command(...). The
  write families share a _build_write_bulk_entries() helper that parses
  --item-handles and --values with a single --type, validates count
  match, converts via to_mx_value, and assembles the correct per-entry
  proto message.
- bench-read-bulk: opens its own session, subscribes to --bulk-size
  TestMachine_NNN.TestChangingInt tags, runs warmup then steady-state
  ReadBulk for --duration-seconds with time.perf_counter() latency
  capture, and emits the shared JSON schema (language, durationMs,
  totalCalls, successfulCalls, failedCalls, totalReadResults,
  cachedReadResults, callsPerSecond, latencyMs:{p50,p95,p99,max,mean})
  so scripts/bench-read-bulk.ps1 collates Python alongside the four
  other clients. _percentile_summary + linear-interpolation
  _percentile helper match the Go / .NET implementations.

to_mx_value is added to the existing values-module import line in
commands.py since the bulk-write commands need it.

Verification: python -m pip install -e . --quiet --no-deps; pytest
42/42 passing. Manual smoke against live gateway on localhost:5120:
open-session → register → subscribe-bulk on two
TestMachine_NNN.TestChangingInt tags (both wasSuccessful=true) →
read-bulk (both wasSuccessful=true / wasCached=true / int32 values
present) → close-session SESSION_STATE_CLOSED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 04:50:10 -04:00
Joseph Doherty 325106920f Rust client: port BenchReadBulk subcommand + session.rs tightening
The bulk-write/read SDK methods (read_bulk, write_bulk, write2_bulk,
write_secured_bulk, write_secured2_bulk) and the matching clap
subcommands (ReadBulk, WriteBulk, Write2Bulk, WriteSecuredBulk,
WriteSecured2Bulk) were already on HEAD from a prior session — they
were the only bulk family that HEAD shipped before the .NET / Go /
Python / Java parallel ports. The one missing piece from the divergent
branch (commit f220908) was the BenchReadBulk benchmark harness.

mxgw-cli/src/main.rs adds:
- BenchReadBulk clap variant with flags --client-name,
  --duration-seconds, --warmup-seconds, --bulk-size, --tag-start,
  --tag-prefix, --tag-attribute, --timeout-ms, --json — defaults match
  the .NET and Go benches.
- run_bench_read_bulk(): open-session → register → subscribe_bulk on
  the synthesized TestMachine_NNN.TestChangingInt tags to populate the
  worker value cache → warmup → steady-state loop with per-call
  std::time::Instant capture → unsubscribe → close-session.
- BenchStats + LatencySummary structs and a percentile()
  helper (nearest-rank with linear interpolation, matching the Go and
  .NET implementations) so the cross-language JSON output is byte-for-
  byte comparable. JSON schema: language / command / endpoint /
  clientName / bulkSize / durationSeconds / warmupSeconds / durationMs
  / tags / totalCalls / successfulCalls / failedCalls /
  totalReadResults / cachedReadResults / callsPerSecond /
  latencyMs:{p50,p95,p99,max,mean}. scripts/bench-read-bulk.ps1 will
  pick up the Rust line on its next run.

session.rs picks up minor tightening tied to the bulk SDK methods that
were already in the file (per-entry validation paths, BulkReplyKind
dispatch coverage) — no public-surface change.

Verification: cargo build --workspace clean (the 2 pre-existing
options.rs missing_docs warnings remain — out of scope); cargo test
--workspace 34/34 passing; cargo clippy --workspace --all-targets has
only the 3 pre-existing tolerated warnings (enum_variant_names on
BulkReplyKind, missing_docs on options.rs, clone_on_copy on
galaxy.rs:282). Manual smoke against live gateway on localhost:5120:
read-bulk on two TestMachine tags returned wasCached=true,
wasSuccessful=true; bench-read-bulk --duration-seconds 2
--warmup-seconds 1 --bulk-size 2 --json ran 363 calls / 181.35 calls
per second / p50=5.3 ms / p99=7.8 ms / 726 of 726 cached reads, all
emitting valid JSON in the shared bench schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 04:50:09 -04:00
Joseph Doherty 8aaab82287 Go client: port bulk read/write SDK methods + CLI subcommands
Mirrors the .NET addition: HEAD's session.go had only the subscribe-style
bulks (AddItemBulk / AdviseItemBulk / RemoveItemBulk / UnAdviseItemBulk /
SubscribeBulk / UnsubscribeBulk). This commit ports the value-bulk SDK
surface and CLI subcommands from divergent branch commit f220908.

SDK (clients/go/mxgateway/session.go):
- WriteBulk(ctx, serverHandle int32, entries []*WriteBulkEntry)
- Write2Bulk(ctx, ..., entries []*Write2BulkEntry)
- WriteSecuredBulk(ctx, ..., entries []*WriteSecuredBulkEntry)
- WriteSecured2Bulk(ctx, ..., entries []*WriteSecured2BulkEntry)
- ReadBulk(ctx, serverHandle int32, tagAddresses []string, timeout time.Duration)
  → []*BulkReadResult

types.go gains public re-exports of the generated proto types
(WriteBulkCommand, WriteBulkEntry, Write2BulkCommand, Write2BulkEntry,
WriteSecuredBulkCommand, WriteSecuredBulkEntry, WriteSecured2BulkCommand,
WriteSecured2BulkEntry, ReadBulkCommand, BulkWriteReply, BulkWriteResult,
BulkReadReply, BulkReadResult) so external callers can construct entries
through the public `mxgateway` package without dipping into the internal
generated path.

CLI (clients/go/cmd/mxgw-go/main.go):
- read-bulk, write-bulk, write2-bulk, write-secured-bulk,
  write-secured2-bulk routed through runWithIO. write families share a
  runWriteBulkVariant helper that gates per-variant flags
  (--current-user-id, --verifier-user-id, --timestamp) so the
  Client.Go-015 flag-gating contract is preserved.
- bench-read-bulk: percentile + timing helpers; JSON output schema
  identical to the .NET / Rust / Python / Java benches.

parseInt32List was changed from panic-on-error to ([]int32, error) so
the new write-bulk commands surface parse errors gracefully; the
existing runUnsubscribeBulk caller is updated accordingly.

Verification: go build ./... + go vet ./... + go test ./... all clean.
Manual smoke against live gateway on localhost:5120: open-session →
register → subscribe-bulk on 3 TestMachine_NNN.TestChangingInt tags
(all wasSuccessful=true) → read-bulk (all wasSuccessful=true /
wasCached=true) → write-bulk int32 100/200/300 (all wasSuccessful=true)
→ close-session SESSION_STATE_CLOSED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 04:49:33 -04:00