mxaccessgw/docs/plans/2026-06-15-stillpending-completion.md
Joseph Doherty 883557fc8a docs: implementation plan for stillpending.md completion
28 tasks across 5 workstreams (A worker control cmds, B worker COM cmds,
C audit CorrelationId, D client CLI parity, E docs). Zero proto changes;
worker net48/x86 + Java on windev, rest local.
2026-06-15 09:35:50 -04:00


Still-Pending Completion Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans (or subagent-driven-development) to implement this plan task-by-task.

Goal: Close the actionable items in stillpending.md — 11 unimplemented worker command kinds (§1.1), audit CorrelationId threading (§1.2), client CLI/helper parity (§4), and doc hygiene (§7).

Architecture: Two-process gateway/worker design is unchanged. All 11 worker commands already have proto request+reply messages, gateway validation, scope mapping, and generic pass-through routing — so the work is worker executor arms + 6 new COM-wrapper methods + a gateway constraint-path CorrelationId thread + client CLI additions. Zero .proto changes, therefore no codegen and no net48 regen risk.

Tech Stack: .NET 10 (gateway, x64), .NET Framework 4.8 (worker, x86, MXAccess COM on STA), Go/Python/Rust/Java/.NET clients. Worker net48/x86 + Java client build/test on Windows host windev (10.100.0.48, passwordless ssh, PowerShell); everything else builds locally on macOS.

Design source: docs/plans/2026-06-15-stillpending-completion-design.md.

Branch: feat/stillpending-completion (already created).


Cross-platform build reference (read before any worker/Java task)

  • Worker (net48/x86) + Worker.Tests + Java client do NOT build on macOS. Build/test them on windev:
    • Copy the working tree (or use the existing build worktree pattern) to windev, git fetch && git reset --hard origin/<branch> in the build worktree (NEVER trust a stale local main — see memory project_deploy_mechanics).
    • Build: dotnet build src/ZB.MOM.WW.MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86
    • Test: dotnet test src/ZB.MOM.WW.MxGateway.Worker.Tests/MxGateway.Worker.Tests.csproj -p:Platform=x86
    • Live MXAccess: set $env:MXGATEWAY_RUN_LIVE_MXACCESS_TESTS = "1" then run the IntegrationTests filter.
    • Nested ssh→PowerShell mangles quotes; scp a .ps1 and run powershell -NoProfile -ExecutionPolicy Bypass -File. Wrap git in cmd /c "git ... 2>&1".
  • net48 worker C#: no init-only props / positional records (no IsExternalInit); use { get; set; } or ctors (memory project_net48_worker_csharp).
  • Gateway, .NET client, Go, Rust, Python build+test locally on macOS.

Workstream A — Worker control/lifecycle commands (5)

These add arms to MxAccessCommandExecutor.Execute (src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs:90-129). They need runtime state the executor does not currently hold. Task A0 establishes how the executor reaches that state; do it first.

Task A0: Decide & wire control-command collaborators into the executor

Classification: standard · Estimated implement time: ~5 min · Parallelizable with: none (A1–A5 depend on it)

Files:

  • Modify: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs (constructor + fields)
  • Read first: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessEventQueue.cs (Drain(uint):168, Count:58), WorkerRuntimeHeartbeatSnapshot.cs, src/ZB.MOM.WW.MxGateway.Worker/Sta/StaRuntime.cs (IsRunning:80, Shutdown()), MxAccessInteropInfo.cs (progid/clsid), the executor's existing construction site (grep new MxAccessCommandExecutor()

What to do: The 5 control commands need: the event queue (DrainEvents), a session-state source (GetSessionState), worker identity — pid/version/progid/clsid (GetWorkerInfo), and a shutdown signal (ShutdownWorker). Determine the cleanest seam:

  • Preferred: inject the collaborators the executor lacks (event queue reference, a Func<SessionState> or the session object, MxAccessInteropInfo, and a shutdown delegate/Action) via the constructor, matching how its existing COM collaborator is passed.
  • If the executor's construction site shows control commands are better intercepted one layer up (where StaRuntime/session context already lives), surface that to the controller before proceeding — do NOT silently relocate the dispatch.

Acceptance: executor compiles on windev with new collaborators available to A1–A5; no behavior change yet (arms still fall through). Commit.
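The preferred constructor-injection seam can be sketched as follows. This is a minimal, language-neutral illustration written in Python rather than the worker's C#; the class name CommandExecutor, its parameter names, and the string status values are hypothetical stand-ins, not the real MxAccessCommandExecutor signature.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque


@dataclass
class InteropInfo:
    """Hypothetical stand-in for MxAccessInteropInfo."""
    progid: str
    clsid: str


class CommandExecutor:
    """Hypothetical stand-in for MxAccessCommandExecutor after Task A0."""

    def __init__(
        self,
        com_server: object,
        event_queue: Deque[object],
        get_session_state: Callable[[], str],
        interop_info: InteropInfo,
        request_shutdown: Callable[[], None],
    ) -> None:
        # Collaborators are stored only; A1-A5 will consume them.
        self.com_server = com_server
        self.event_queue = event_queue
        self.get_session_state = get_session_state
        self.interop_info = interop_info
        self.request_shutdown = request_shutdown

    def execute(self, kind: str) -> str:
        # A0 adds no arms: control kinds still fall through unchanged.
        return "INVALID_REQUEST"


executor = CommandExecutor(
    com_server=object(),
    event_queue=deque(),
    get_session_state=lambda: "READY",
    interop_info=InteropInfo(
        "LMXProxy.LMXProxyServer.1",
        "{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}",
    ),
    request_shutdown=lambda: None,
)
```

Passing delegates instead of the owning objects keeps the executor testable with plain fakes, which is the property A1–A5 rely on.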

Note: A1–A5 are sequential edits to the same Execute switch + helper region of one file, so they are NOT parallelizable with each other. Bundle their review.

Task A1: Ping

Classification: small · Estimated implement time: ~3 min · Parallelizable with: none (same file as A0/A2-A5)

Files:

  • Modify: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs (add MxCommandKind.Ping arm + ExecutePing)
  • Test: src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/MxAccessCommandExecutorTests.cs (or the existing executor test file — grep to confirm name)

Step 1 — failing test: assert Execute with a Ping command (PingCommand { Message = "hi" }) returns ProtocolStatusCode.Ok, Hresult == 0, and echoes the message (via reply diagnostic or base reply — Ping has no dedicated reply message, so assert OK status). Build/test on windev.

Step 2 — run, expect FAIL (currently INVALID_REQUEST).

Step 3 — implement: add MxCommandKind.Ping => ExecutePing(command), to the switch (:99-126 region). ExecutePing returns CreateOkReply(command) (helper at :784).

Step 4 — run, expect PASS on windev.

Step 5 — commit: feat(worker): implement Ping command

Task A2: DrainEvents

Classification: small · Estimated implement time: ~4 min · Parallelizable with: none

Files:

  • Modify: MxAccessCommandExecutor.cs (DrainEvents arm + ExecuteDrainEvents)
  • Test: executor test file

Steps (TDD): test that DrainEvents { MaxEvents = N } drains up to N from the injected MxAccessEventQueue and returns DrainEventsReply { events = [...] } (reply field 102). MaxEvents == 0 drains all. Map each WorkerEvent → MxEvent using the existing event-mapping path (grep how the live event loop converts WorkerEvent → MxEvent; reuse, do not duplicate). Build/test windev. Commit feat(worker): implement DrainEvents command.
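The drain semantics described above (at most N events, with 0 meaning "drain everything") can be sketched as below. This is a Python illustration under stated assumptions: drain_events is a hypothetical helper, a plain deque stands in for MxAccessEventQueue, and no locking is shown even though the real queue is shared with the event loop.

```python
from collections import deque
from typing import Deque, List


def drain_events(queue: Deque[object], max_events: int) -> List[object]:
    """Remove and return up to max_events items in FIFO order; 0 drains all."""
    if max_events < 0:
        raise ValueError("max_events must be non-negative")
    available = len(queue)
    limit = available if max_events == 0 else min(max_events, available)
    return [queue.popleft() for _ in range(limit)]


pending = deque(["e1", "e2", "e3", "e4", "e5"])
first_batch = drain_events(pending, 2)   # bounded drain
rest = drain_events(pending, 0)          # 0 drains whatever is left
empty = drain_events(pending, 3)         # draining an empty queue is not an error
```

The three calls correspond to the cases the executor test should cover: a bounded drain, the drain-all case, and an empty queue.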

Task A3: GetSessionState

Classification: small · Estimated implement time: ~3 min · Parallelizable with: none

Files: MxAccessCommandExecutor.cs + executor test.

Steps: test that GetSessionState returns SessionStateReply { State = <current> } (reply field 100) mapping the worker's lifecycle to the proto SessionState enum (READY when the STA is running). Build/test windev. Commit feat(worker): implement GetSessionState command.

Task A4: GetWorkerInfo

Classification: small · Estimated implement time: ~3 min · Parallelizable with: none

Files: MxAccessCommandExecutor.cs + executor test.

Steps: test that GetWorkerInfo returns WorkerInfoReply { WorkerProcessId, WorkerVersion, MxaccessProgid, MxaccessClsid } (reply field 101) sourced from Process.GetCurrentProcess().Id, the worker assembly version, and MxAccessInteropInfo (progid LMXProxy.LMXProxyServer.1, clsid {C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}). Build/test windev. Commit feat(worker): implement GetWorkerInfo command.

Task A5: ShutdownWorker

Classification: standard · Estimated implement time: ~5 min · Parallelizable with: none

Files: MxAccessCommandExecutor.cs + executor test.

Steps: test that ShutdownWorker { GracePeriod } returns a base OK reply and triggers the injected shutdown signal after the reply is produced (must not deadlock the STA — signal shutdown, return reply, let the pump drain). Verify the grace period is honored (or documented as best-effort). Build/test windev. Commit feat(worker): implement ShutdownWorker command.
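The "signal shutdown, return reply, let the pump drain" ordering can be sketched as follows. This is a Python illustration only: WorkerPump and its members are hypothetical, and the grace period is merely echoed here rather than enforced.

```python
import threading
from typing import Callable, Dict, List, Tuple


class WorkerPump:
    """Hypothetical message pump that defers shutdown until the reply is sent."""

    def __init__(self) -> None:
        self.shutdown_requested = threading.Event()
        self.sent: List[Tuple[Dict[str, object], bool]] = []
        self._after_reply: List[Callable[[], None]] = []

    def execute_shutdown(self, grace_period_s: float) -> Dict[str, object]:
        # Queue the signal instead of raising it inline, so the reply exists first.
        self._after_reply.append(self.shutdown_requested.set)
        return {"status": "OK", "grace_period_s": grace_period_s}

    def handle_shutdown(self, grace_period_s: float) -> Dict[str, object]:
        reply = self.execute_shutdown(grace_period_s)
        # Record whether shutdown had already been signalled when the reply went out.
        self.sent.append((reply, self.shutdown_requested.is_set()))
        for action in self._after_reply:
            action()
        self._after_reply.clear()
        return reply


pump = WorkerPump()
reply = pump.handle_shutdown(grace_period_s=2.0)
```

Because the executor only queues the signal, nothing blocks on the thread that produced the reply, which is the deadlock the task warns about.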

Task A6: Make FakeWorkerHarness respond to control commands

Classification: standard · Estimated implement time: ~5 min · Parallelizable with: none (depends on A1–A5 reply shapes)

Files:

  • Modify: src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/Fakes/FakeWorkerHarness.cs
  • Test: a gateway-side test that invokes Ping/GetWorkerInfo/DrainEvents through the harness and asserts the reply (builds locally on macOS).

Why: the audit (§1.1) flagged that control kinds were "exercised only through FakeWorkerHarness" but the harness is a passive relay that does not auto-respond — so gateway tests could not actually cover them. Add canned responses so the gateway↔worker round-trip for these commands is verified in the default (no-MXAccess) suite. Commit test(gateway): fake worker responds to control commands.
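The difference between a passive relay and a harness with canned responses can be sketched as below. This is a Python illustration, not the real FakeWorkerHarness API: the FakeWorker class, its handle method, and the reply dictionaries are all hypothetical.

```python
from typing import Dict, List, Optional


class FakeWorker:
    """Hypothetical fake worker: records commands, optionally answers them."""

    def __init__(self, canned: Optional[Dict[str, dict]] = None) -> None:
        self.canned: Dict[str, dict] = dict(canned or {})
        self.received: List[str] = []

    def handle(self, kind: str) -> Optional[dict]:
        self.received.append(kind)
        # None models the old passive behavior: the command is seen, never answered.
        return self.canned.get(kind)


passive = FakeWorker()
responsive = FakeWorker({
    "Ping": {"status": "OK"},
    "GetWorkerInfo": {"status": "OK", "worker_process_id": 1234},
    "DrainEvents": {"status": "OK", "events": []},
})
```

A gateway-side test would then assert on the reply from the responsive fake, which is exactly what the passive one cannot provide.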


Workstream B — Worker MXAccess COM commands (6)

Suspend, Activate, AuthenticateUser, ArchestrAUserToId, AddBufferedItem, SetBufferedUpdateInterval. Task B0 (windev interop inspection) MUST run first — the native interface exposing each method is unknown until inspected.

Task B0: Resolve native COM signatures on windev

Classification: standard · Estimated implement time: ~5 min (investigation) · Parallelizable with: A-workstream tasks (different files/host activity)

Files:

  • Read on windev: the generated interop for ArchestrA.MXAccess.dll (the ILMXProxyServer / ILMXProxyServer3 / ILMXProxyServer4 RCW definitions), C:\Users\dohertj2\Desktop\mxaccess\docs\MXAccess-Public-API.md (method list/signatures).
  • Output: a short note appended to this plan (or a comment block) recording, for each of the 6 methods, which interface version exposes it and its exact signature.

What to do: Confirm the exact native signatures for Suspend(int serverHandle, int itemHandle), Activate(int serverHandle, int itemHandle), AuthenticateUser(int serverHandle, string verifyUser, string verifyUserPassword) → user id, ArchestrAUserToId(int serverHandle, string userIdGuid) → user id, AddBufferedItem(int serverHandle, string itemDefinition, string itemContext) → item handle, SetBufferedUpdateInterval(int serverHandle, int intervalMs). If any method is not present on the installed interop (mirroring the §3.4/§3.5 vendor-stub pattern for alarms), STOP and surface it — implement only the available ones and record the rest as vendor-gated residuals. Commit the note.
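The bookkeeping for "implement only the available ones and record the rest as vendor-gated residuals" can be sketched as below. This is a Python illustration under stated assumptions: probe_methods and PartialInterop are hypothetical, and a real inspection would read the generated .NET interop on windev rather than a Python object.

```python
from typing import Iterable, List, Tuple

REQUIRED_METHODS = (
    "Suspend",
    "Activate",
    "AuthenticateUser",
    "ArchestrAUserToId",
    "AddBufferedItem",
    "SetBufferedUpdateInterval",
)


def probe_methods(interop: object,
                  required: Iterable[str] = REQUIRED_METHODS) -> Tuple[List[str], List[str]]:
    """Split the required method names into (available, missing) for one interop object."""
    available: List[str] = []
    missing: List[str] = []
    for name in required:
        target = available if callable(getattr(interop, name, None)) else missing
        target.append(name)
    return available, missing


class PartialInterop:
    """Hypothetical interop exposing only two of the six methods."""

    def Suspend(self, server_handle: int, item_handle: int) -> None:
        return None

    def Activate(self, server_handle: int, item_handle: int) -> None:
        return None


available, missing = probe_methods(PartialInterop())
```

The missing list is what would be written into the B0 note as vendor-gated residuals.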

Task B1: Add 6 wrapper methods to IMxAccessServer + MxAccessComServer

Classification: high-risk · Estimated implement time: ~5 min · Parallelizable with: none (blocked by B0)

Files:

  • Modify: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/IMxAccessServer.cs (add 6 method declarations)
  • Modify: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessComServer.cs (dispatch to the interface version resolved in B0, mirroring existing methods like Write2:173, AddItem2:84)
  • Modify: any fake/test IMxAccessServer implementation (grep : IMxAccessServer) to add the 6 methods (return canned values).
  • Test: src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/... for MxAccessComServer if one exists.

Steps (TDD): add the 6 declarations; implement dispatch following the existing version-selection pattern; update fakes so the solution compiles. Build on windev -p:Platform=x86. Commit feat(worker): add MXAccess COM wrappers for suspend/activate/auth/buffered.

B2–B7 are sequential edits to the same Execute switch; not parallelizable with each other. Bundle review.

Task B2: Suspend arm

Classification: small · ~3 min · Parallelizable with: none

Files: MxAccessCommandExecutor.cs + executor test.

TDD: Suspend { ServerHandle, ItemHandle } calls the wrapper and returns SuspendReply { Status = MxStatusProxy } (reply field 24). Use a fake IMxAccessServer asserting the call. Build/test windev. Commit feat(worker): implement Suspend command.

Task B3: Activate arm

Classification: small · ~3 min · Parallelizable with: none

Files: MxAccessCommandExecutor.cs + executor test.

TDD: Activate { ServerHandle, ItemHandle } → ActivateReply { Status } (field 25). Build/test windev. Commit feat(worker): implement Activate command.

Task B4: AuthenticateUser arm

Classification: standard · ~4 min · Parallelizable with: none

Files: MxAccessCommandExecutor.cs + executor test.

TDD: AuthenticateUser { ServerHandle, VerifyUser, VerifyUserPassword } → AuthenticateUserReply { UserId } (field 26). Credentials must never be logged (standing rule): assert that no log statement includes the password. AuthenticateUser is allowed to fail (surface the native HResult, do not throw). Build/test windev. Commit feat(worker): implement AuthenticateUser command.
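The two rules for this arm (never log the password; report a native failure in the reply instead of throwing) can be sketched as below. This is a Python illustration only: ComError, RejectingServer, the reply dictionary shape, and the choice of 0x80070005 (E_ACCESSDENIED) as the failing HRESULT are assumptions for the example, not observed MXAccess behavior.

```python
import io
import logging
from typing import Dict


class ComError(Exception):
    """Hypothetical stand-in for a failed native call carrying an HRESULT."""

    def __init__(self, hresult: int) -> None:
        super().__init__(f"HRESULT 0x{hresult & 0xFFFFFFFF:08X}")
        self.hresult = hresult


def authenticate_user(server, server_handle: int, user: str, password: str,
                      log: logging.Logger) -> Dict[str, int]:
    # Neither the user name nor the password is passed to the logger.
    log.info("AuthenticateUser server_handle=%d", server_handle)
    try:
        user_id = server.authenticate_user(server_handle, user, password)
    except ComError as exc:
        # Surface the native failure in the reply instead of propagating it.
        log.warning("AuthenticateUser failed: %s", exc)
        return {"hresult": exc.hresult, "user_id": 0}
    return {"hresult": 0, "user_id": user_id}


class RejectingServer:
    """Hypothetical fake that always fails with E_ACCESSDENIED."""

    def authenticate_user(self, server_handle: int, user: str, password: str) -> int:
        raise ComError(-2147024891)  # signed form of 0x80070005


stream = io.StringIO()
logger = logging.getLogger("auth-sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

reply = authenticate_user(RejectingServer(), 1, "operator", "s3cret", logger)
```

Capturing the log output in a buffer, as done here, is one way to express the "no log statement includes the password" assertion in the executor test.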

Task B5: ArchestrAUserToId arm

Classification: small · ~3 min · Parallelizable with: none

Files: MxAccessCommandExecutor.cs + executor test.

TDD: ArchestrAUserToId { ServerHandle, UserIdGuid } → ArchestrAUserToIdReply { UserId } (field 27). Build/test windev. Commit feat(worker): implement ArchestrAUserToId command.

Task B6: AddBufferedItem arm

Classification: standard · ~4 min · Parallelizable with: none

Files: MxAccessCommandExecutor.cs + executor test.

TDD: AddBufferedItem { ServerHandle, ItemDefinition, ItemContext } → AddBufferedItemReply { ItemHandle } (field 23). Build/test windev. Commit feat(worker): implement AddBufferedItem command.

Task B7: SetBufferedUpdateInterval arm

Classification: small · ~3 min · Parallelizable with: none

Files: MxAccessCommandExecutor.cs + executor test.

TDD: SetBufferedUpdateInterval { ServerHandle, UpdateIntervalMilliseconds } → base OK reply (no dedicated reply message). Build/test windev. Commit feat(worker): implement SetBufferedUpdateInterval command.

Task B8: Live COM smoke + buffered capture on windev

Classification: high-risk · Estimated implement time: ~5 min (authoring; live run is manual) · Parallelizable with: none (blocked by B1–B7)

Files:

  • Modify: src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs (the existing AuthenticateUser send at ~line 919/931 should now get an OK/typed reply instead of INVALID_REQUEST; add Suspend/Activate/AddBufferedItem+SetBufferedUpdateInterval sends).
  • Possibly: src/ZB.MOM.WW.MxGateway.Worker.Tests/Probes/ for a buffered-capture probe.

Steps: Under MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1 on windev: verify the 4 unambiguous COM commands round-trip; then AddBufferedItem + SetBufferedUpdateInterval on a real tag and capture a multi-sample OnBufferedDataChange batch to validate the §3.2 VariantConverter path. If the buffered conversion proves correct, record it; if it surfaces a conversion bug, STOP and report (do not silently ship). If a live buffered sample cannot be elicited on the rig, record buffered round-trip as the documented residual (close the command gap, leave §3.2 open). Commit test(integration): live COM command + buffered capture smoke.


Workstream C — §1.2 gateway audit CorrelationId

Task C1: Thread ClientCorrelationId into constraint-denial audit records

Classification: high-risk · Estimated implement time: ~5 min · Parallelizable with: all A/B/D tasks (gateway-only files, builds locally)

Files:

  • Modify: src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/IConstraintEnforcer.cs (add string? correlationId param to RecordDenialAsync, signature at :49-54)
  • Modify: src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/ConstraintEnforcer.cs (:124-158): accept the param, Guid.TryParse it into the Guid? CorrelationId audit field (was hardcoded null at :147); remove TODO(Task 2.3) at :134-136.
  • Modify: src/ZB.MOM.WW.MxGateway.Server/Grpc/MxAccessGatewayService.cs: thread request.ClientCorrelationId from Invoke (:96) → ApplyConstraintsAsync (:279) → the 6 filter helpers (EnforceReadTagAsync:427, EnforceWriteHandleAsync:448, FilterTagBulkAsync:474, FilterReadBulkAsync:529, FilterWriteBulkAsync:584, FilterHandleBulkAsync:656) → RecordDenialAsync.
  • Test: src/ZB.MOM.WW.MxGateway.Tests/... constraint-enforcer / gateway-service test.

Step 1 — failing test: a denied operation with ClientCorrelationId = "<a real GUID>" persists an audit record whose CorrelationId equals that GUID; a non-GUID correlation id persists null (documented behavior). Run locally: dotnet test src/ZB.MOM.WW.MxGateway.Tests/MxGateway.Tests.csproj --filter <name>.

Step 2 — FAIL (currently always null).

Step 3 — implement the threading + Guid.TryParse.

Step 4 — PASS locally + full gateway suite green.

Step 5 — commit: feat(gateway): thread ClientCorrelationId into constraint-denial audit (§1.2)
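The parse-or-null rule from Step 1 can be sketched as below. This is a Python analog, not the gateway code: parse_correlation_id is a hypothetical helper, and Python's uuid.UUID does not accept exactly the same input formats as .NET's Guid.TryParse, so only the overall behavior (a valid GUID is kept, anything else becomes null) carries over.

```python
import uuid
from typing import Optional


def parse_correlation_id(raw: Optional[str]) -> Optional[uuid.UUID]:
    """Return the parsed GUID, or None when the client value is absent or not a GUID."""
    if not raw:
        return None
    try:
        return uuid.UUID(raw)
    except ValueError:
        # Non-GUID correlation ids are persisted as null (documented behavior).
        return None


guid_text = "3f2504e0-4f89-11d3-9a0c-0305e82c3301"
parsed = parse_correlation_id(guid_text)
not_a_guid = parse_correlation_id("req-42")
absent = parse_correlation_id(None)
```

The failing test in Step 1 covers the first two cases; the absent case is the pre-existing behavior and should stay null.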


Workstream D — Client CLI/helper parity (5 clients)

All D tasks touch disjoint client trees and are parallelizable across languages. Each builds/tests on its own toolchain (Java on windev; the rest local).

Task D1: Go single-shot Write2 helper

Classification: small · ~3 min · Parallelizable with: D2–D9

Files:

  • Modify: clients/go/mxgateway/session.go (add Write2/Write2Raw after Write:559, modeled on Write + the Write2Bulk:427 payload shape)
  • Test: clients/go/mxgateway/session_test.go (or nearest)

TDD: Write2(ctx, serverHandle, itemHandle, value, timestampValue *MxValue, userID int32) error issues MX_COMMAND_KIND_WRITE2 with Write2Command{ServerHandle,ItemHandle,Value,TimestampValue,UserId}. Verify: gofmt, go build ./..., go test ./... from clients/go. Commit feat(go): add single-shot Write2 session helper (§4.1).

Task D2: Python galaxy-* CLI commands (4)

Classification: standard · ~5 min · Parallelizable with: D1, D3–D9

Files:

  • Modify: clients/python/src/zb_mom_ww_mxgateway_cli/commands.py (add galaxy-test-connection, galaxy-last-deploy, galaxy-discover, galaxy-watch Click commands wrapping galaxy.py test_connection/get_last_deploy_time/discover_hierarchy/watch_deploy_events; mirror the existing ping command structure at :221)
  • Modify: clients/python/README.md:217 (correct the understated galaxy CLI claim)
  • Test: clients/python/tests/ CLI test

TDD, then python -m pytest from clients/python. Commit feat(python): add galaxy-* CLI commands (§4.2).

Task D3: ping CLI in Go

Classification: small · ~3 min · Parallelizable with: others

Files: clients/go/cmd/mxgw-go/main.go (add ping case to the switch ~:77-130/:1199, modeled on an existing simple command) + test.

TDD; gofmt, go build ./..., go test ./.... Commit feat(go): add ping CLI subcommand (§4.3).

Task D4: ping CLI in Java

Classification: small · ~3 min · Parallelizable with: others (build on windev)

Files: clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java (register a ping subcommand ~:126-149) + test.

TDD; gradle test on windev. Commit feat(java): add ping CLI subcommand (§4.3).

Task D5: browse CLI — Go

Classification: standard · ~4 min · Parallelizable with: others

Files: clients/go/cmd/mxgw-go/main.go (new browse command wrapping GalaxyClient.Browse:398 / LazyBrowseNode.Expand:337) + test.

go build/test. Commit feat(go): add browse CLI (§4.6).

Task D6: browse CLI — Python

Classification: standard · ~4 min · Parallelizable with: others

Files: clients/python/src/zb_mom_ww_mxgateway_cli/commands.py (new browse command wrapping galaxy.py browse:163) + test.

pytest. Commit feat(python): add browse CLI (§4.6).

Task D7: browse CLI — Rust

Classification: standard · ~4 min · Parallelizable with: others

Files: clients/rust/crates/mxgw-cli/src/main.rs (new Browse command variant wrapping the galaxy browse helper in galaxy.rs) + test.

cargo fmt, cargo test --workspace, cargo clippy --all-targets -- -D warnings. Commit feat(rust): add browse CLI (§4.6).

Task D8: browse CLI — Java + dotnet

Classification: standard · ~5 min · Parallelizable with: others (Java builds on windev)

Files: clients/java/.../MxGatewayCli.java (browse subcommand wrapping GalaxyRepositoryClient.browse) + clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs (browse command wrapping LazyBrowseNode.ExpandAsync:63) + tests.

gradle test (windev), dotnet test (local). Commit feat(dotnet,java): add browse CLI (§4.6). This completes 5/5 browse parity.

Task D9: Java galaxy-name aliases + verify dotnet version

Classification: small · ~4 min · Parallelizable with: others (Java builds on windev)

Files:

  • Modify: clients/java/.../MxGatewayCli.java:145-146 — add canonical galaxy-test-connection/galaxy-last-deploy as the primary names; keep galaxy-test/galaxy-deploy-time as deprecated aliases (picocli @Command(name=..., aliases={...}) or equivalent).
  • Verify: clients/dotnet/.../MxGatewayClientCli.cs — the explorer found a version path at :85 that conflicts with audit §4.4. Read it: if a version subcommand genuinely works, no change (note it in the §7 update); if it's only a --version flag and IsKnownGatewayCommand lacks version, add the subcommand. Do not add what already exists.
  • Test: Java CLI test asserting both names resolve. gradle test (windev), dotnet build/test (local). Commit feat(java): galaxy command aliases; chore(dotnet): verify version subcommand (§4.4,§4.5).
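The "canonical name plus deprecated alias" behavior that the Java test asserts can be sketched as below. This swaps in Python's standard argparse for picocli purely as an analog: the program name mxgw and the canonical default attribute are hypothetical, and the real change belongs in MxGatewayCli.java.

```python
import argparse

# (canonical name, deprecated alias) pairs taken from the task description.
GALAXY_COMMANDS = (
    ("galaxy-test-connection", "galaxy-test"),
    ("galaxy-last-deploy", "galaxy-deploy-time"),
)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mxgw")  # hypothetical program name
    subcommands = parser.add_subparsers(dest="command", required=True)
    for canonical, deprecated in GALAXY_COMMANDS:
        sub = subcommands.add_parser(canonical, aliases=[deprecated])
        # Both spellings resolve to the same canonical handler name.
        sub.set_defaults(canonical=canonical)
    return parser


parser = build_parser()
via_new_name = parser.parse_args(["galaxy-test-connection"])
via_old_name = parser.parse_args(["galaxy-test"])
```

Resolving both spellings to one canonical value is what lets the old names stay as thin aliases instead of duplicated commands.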

Workstream E — Docs/hygiene + residual recording

Task E1: Doc hygiene + dead-code removal

Classification: small · ~5 min · Parallelizable with: all (mostly doc-only; one code deletion)

Files:

  • docs/plans/2026-06-14-deferred-followups.md:4 — change "Plan only — NOT yet executed" to reflect D1 done (4af24b9).
  • docs/AlarmClientDiscovery.md:765-774 — rewrite stale STA "production fix needed" prose (alarms now run through worker STA / GatewayAlarmMonitor).
  • src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/EventsHub.cs:9-17 — remove/update stale "publisher side is a follow-up" comment (broadcaster shipped).
  • CLAUDE.md: fix project-name drift src/MxGateway.* → src/ZB.MOM.WW.MxGateway.* throughout.
  • src/ZB.MOM.WW.MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs:350-360: remove dead IDE0051-suppressed MapSqlException.

Verify: dotnet build src/ZB.MOM.WW.MxGateway.Server locally (the only code change is the deletion). Commit docs+chore: fix stale prose, project names, remove dead MapSqlException (§7).

Task E2: Record §1.3 and §1.4 residuals + refresh stillpending.md

Classification: trivial · ~3 min · Parallelizable with: all (doc-only)

Files:

  • docs/plans/2026-06-14-deferred-followups.md — record §1.3 (provider_switches counter live-exercise unproven; rig can't drive a real failover) as an explicit documented residual.
  • Add a short note (in the worker alarm code's existing comment near WnWrapAlarmConsumer.cs:261 or the design doc) that §1.4's 8-arg ack drops domain/full-name because the AVEVA AlarmAckByName v2 is a vendor stub (-55) — already partly noted; make it explicit and cross-referenced.
  • stillpending.md: mark §1.1, §1.2, §4.1/§4.2/§4.3/§4.6 (and §4.4/§4.5 per outcome) as Resolved with commit refs; keep the documented residuals.

Commit docs: record §1.3/§1.4 residuals and refresh stillpending.md (§7).

Final integration review

After all workstreams: run the full local suite (dotnet test gateway + .NET client, go test, pytest, cargo test+clippy) and the windev suite (worker net48/x86 + Java + live MXAccess smoke). Then use superpowers-extended-cc:finishing-a-development-branch.

Dependency summary

  • A0 → A1..A5 → A6
  • B0 → B1 → B2..B7 → B8
  • C1 independent (gateway-only, local)
  • D1..D9 independent of A/B/C and of each other (disjoint client trees)
  • E1, E2 last (reflect what closed); E1 mostly independent
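The dependency summary above can be checked mechanically with a topological sort. The sketch below uses Python's standard graphlib and encodes two readings that go slightly beyond the bullets: A1–A5 and B2–B7 are chained strictly in order (per the "sequential edits" notes), and E2 waits for every A/B/C/D leaf while E1 has no hard predecessor.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def chain(tasks: List[str], deps: Dict[str, Set[str]]) -> None:
    """Make each task depend on the one listed just before it."""
    for earlier, later in zip(tasks, tasks[1:]):
        deps.setdefault(later, set()).add(earlier)


deps: Dict[str, Set[str]] = {}
a_tasks = [f"A{i}" for i in range(0, 7)]    # A0..A6
b_tasks = [f"B{i}" for i in range(0, 9)]    # B0..B8
d_tasks = [f"D{i}" for i in range(1, 10)]   # D1..D9

chain(a_tasks, deps)
chain(b_tasks, deps)
for task in ["A0", "B0", "C1", "E1", *d_tasks]:
    deps.setdefault(task, set())            # no predecessors
deps["E2"] = {"A6", "B8", "C1", *d_tasks}   # assumption: E2 reflects what closed

# static_order() raises CycleError if the plan ever becomes circular.
order = list(TopologicalSorter(deps).static_order())
position = {task: index for index, task in enumerate(order)}
```

Any order produced this way is one valid execution sequence; the 28 entries match the task count stated in the commit message.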