Commit Graph

35 Commits

Author SHA1 Message Date
Joseph Doherty 4e02927f01 A.3 (alarm-ack-by-name): public AcknowledgeAlarm now accepts Provider!Group.Tag references
Closes the gap where the public AcknowledgeAlarm RPC required canonical
GUIDs but OnAlarmTransitionEvent.AlarmFullReference is "Provider!Group.Tag".
Adds an AVEVA AlarmAckByName path that wraps wwAlarmConsumerClass.AlarmAckByName
so callers can ack with the natural reference.

Proto:
- New MxCommandKind.AcknowledgeAlarmByName (=29).
- New AcknowledgeAlarmByNameCommand(alarm_name, provider_name, group_name,
  comment, operator_user/node/domain/full_name) on MxCommand oneof.
- AcknowledgeAlarmReplyPayload (existing) carries the AVEVA native
  status; reused for the by-name path.

Worker:
- IMxAccessAlarmConsumer + WnWrapAlarmConsumer + AlarmDispatcher +
  AlarmCommandHandler all gain an AcknowledgeByName(name, provider,
  group, comment, operator-identity) overload that maps to
  wwAlarmConsumerClass.AlarmAckByName.
- MxAccessCommandExecutor: new switch arm routes
  MxCommandKind.AcknowledgeAlarmByName to the handler. Empty alarm_name
  yields InvalidRequest; handler exceptions surface as MxaccessFailure.

Gateway:
- WorkerAlarmRpcDispatcher.TryParseAlarmReference: parses
  "Provider!Group.Tag" with the convention that the FIRST '!' separates
  provider, the FIRST '.' after '!' separates group; tag may contain
  more dots.
- AcknowledgeAsync now branches: GUID input → AcknowledgeAlarm command
  (existing path); reference input → AcknowledgeAlarmByName command
  (new path); neither parses → InvalidRequest with a clear diagnostic.

Tests: 13 new unit tests cover each layer end-to-end:
- WorkerAlarmRpcDispatcher.TryParseAlarmReference (3 valid + 8 invalid
  forms) including the realistic 4-component "Galaxy!TestArea.
  TestMachine_001.TestAlarm001" reference.
- WorkerAlarmRpcDispatcher.AcknowledgeAsync routes references through
  AcknowledgeAlarmByName + propagates the full operator tuple.
- Executor switch arm carries the by-name tuple and rejects empty
  alarm_name.
- AlarmDispatcher.AcknowledgeByName forwards to consumer.
- Existing fakes extended for the new overload.

Counts: server 308/0, worker 195/3 skip / 1 pre-existing structure-fail
(untouched). Solution builds clean.

End-to-end alarms-over-gateway now serves the full lmxopcua flow:
client.AcknowledgeAlarm(reference="Galaxy!TestArea.TestMachine_001.TestAlarm001",
operator_user="alice") → gateway parses → IPC AcknowledgeAlarmByName →
worker AlarmAckByName → AVEVA history. The remaining piece for full
parity is a live dev-rig smoke test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 11:17:15 -04:00
Joseph Doherty 47b1fd422c A.3 (auto-subscribe): SessionManager issues SubscribeAlarms on session open
Adds the missing trigger that activates the worker's wnwrap consumer.
Without this, every session opened in OK state but the consumer never
started, so AcknowledgeAlarm/QueryActiveAlarms returned "alarm consumer
not configured" forever.

New AlarmsOptions config block (under MxGateway:Alarms):
  - Enabled (default false): gates the auto-subscribe path so existing
    deployments without alarm configuration are unaffected.
  - SubscriptionExpression: explicit AVEVA expression like
    \<machine>\Galaxy!<area>.
  - DefaultArea: fallback used when SubscriptionExpression is empty;
    composes \$(MachineName)\Galaxy!$(DefaultArea).
  - RequireSubscribeOnOpen (default false): when true, an auto-subscribe
    failure faults the session; when false, the failure is logged and
    the session stays Ready (data subscriptions keep working, alarms
    return "not subscribed" until the operator retries).

SessionManager.OpenSessionAsync gains a TryAutoSubscribeAlarmsAsync hook
that runs after MarkReady. Skips when alarms are disabled; otherwise
builds a SubscribeAlarmsCommand, invokes it on the session's worker
client, and either logs the resulting status or escalates per
RequireSubscribeOnOpen. SessionManagerException is the failure mode for
the strict path so callers in MxAccessGatewayService surface it as
session-open-failed.

Tests: 7 new unit tests cover the disabled lane, expression-driven
subscribe, DefaultArea fallback, success path, soft-failure (require
off), strict-failure (require on), and missing-config-strict-throw.
Server suite total: 295 pass / 0 fail. Solution builds clean.

End-to-end alarms-over-gateway path is now live (with config). Open a
session against a gateway with Alarms.Enabled=true + a valid
SubscriptionExpression; the worker's wnwrap consumer auto-subscribes;
QueryActiveAlarms streams snapshots; AcknowledgeAlarm acks by GUID.
Reference→GUID resolution (AlarmAckByName worker command) and the live
dev-rig smoke test remain follow-ups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 11:10:13 -04:00
Joseph Doherty 9b21ca3554 A.3 (gateway dispatcher): WorkerAlarmRpcDispatcher routes alarm RPCs over the worker pipe
Replaces NotWiredAlarmRpcDispatcher in DI with a production
implementation that issues the new MxCommandKind.{AcknowledgeAlarm,
QueryActiveAlarms} commands across the IPC and unwraps the resulting
MxCommandReply into the public RPC types.

QueryActiveAlarms is fully wired: builds the QueryActiveAlarmsCommand
(forwarding alarm_filter_prefix), invokes it on the resolved
GatewaySession's worker client, and yields each ActiveAlarmSnapshot
from the QueryActiveAlarmsReplyPayload as the RPC stream. Worker
failures + missing sessions yield an empty stream — matches the
ConditionRefresh contract clients already speak to.

AcknowledgeAlarm is partially wired: the public RPC takes
AlarmFullReference (Provider!Group.Tag), but the worker's wnwrap
consumer acks by GUID. Strategy:
- If AlarmFullReference parses as a canonical GUID, forward it
  directly through MxCommandKind.AcknowledgeAlarm. Native status
  flows back via MxCommandReply.Hresult and the dedicated
  AcknowledgeAlarmReplyPayload.NativeStatus.
- Otherwise, return InvalidRequest with a clear diagnostic naming the
  follow-up — reference→GUID lookup needs a worker-side AlarmAckByName
  command wrapping wwAlarmConsumerClass.AlarmAckByName.

DI: SessionServiceCollectionExtensions registers WorkerAlarmRpcDispatcher
as the default IAlarmRpcDispatcher; MxAccessGatewayService picks it up
via constructor injection. NotWiredAlarmRpcDispatcher is retained for
test fixtures that want the no-side-effect fake.

Tests: 7 new unit tests cover session-not-found short-circuit, GUID-vs-
reference branching, native-status propagation, worker MxaccessFailure
diagnostic propagation, and snapshot-stream yielding. Server test
suite total: 288/0 fail. Solution builds clean.

End-to-end alarms-over-gateway pipeline status:
  consumer → sink → queue (A.2 + A.3 in-process slice)
  worker IPC commands (A.3 worker slice)
  gateway dispatcher (this slice)

Remaining for full E2E:
  - Auto-issue SubscribeAlarms on session open (or add a public
    SubscribeAlarms RPC). Without this trigger the consumer never
    starts and Acknowledge/Query return "not subscribed".
  - AlarmAckByName worker command for ack-by-reference.
  - End-to-end live test against the dev rig.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 10:58:40 -04:00
Joseph Doherty 6b3c117d1e gateway: alarm-RPC dispatcher seam (PRs A.6 + A.7)
Replaces the inline diagnostic strings in PR A.3's AcknowledgeAlarm
+ QueryActiveAlarms handlers with an IAlarmRpcDispatcher seam.

- IAlarmRpcDispatcher (new) — gateway-side abstraction over the
  worker-RPC path that fronts AlarmClient.AlarmAckByGUID and the
  active-alarm walk. AcknowledgeAsync returns the
  AcknowledgeAlarmReply directly; QueryActiveAlarmsAsync yields an
  IAsyncEnumerable<ActiveAlarmSnapshot>.
- NotWiredAlarmRpcDispatcher (new, default impl) — returns
  PROTOCOL_STATUS_OK with a structured worker-pending diagnostic
  on Acknowledge, yields an empty stream on QueryActiveAlarms.
  Same observable shape as PR A.3, but the integration seam is
  now in code instead of hardcoded inside the handler.
- MxAccessGatewayService — handlers delegate to the dispatcher.
  Constructor accepts an optional IAlarmRpcDispatcher (default
  NotWiredAlarmRpcDispatcher); a future WorkerAlarmRpcDispatcher
  registration in DI swaps in the live worker-IPC routing without
  changing the public RPC surface.
- 2 new dispatcher tests pin the not-wired contract; 279 → 281
  total tests, all green.

Worker-side dispatch (translating Acknowledge / QueryActiveAlarms
to the IPC method that calls IMxAccessAlarmConsumer from PR A.5)
is the dev-rig follow-up — it depends on validating the AVEVA
GetAlarmChangesCompleted event subscription against a live alarm
provider before pinning a wire format.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 22:47:42 -04:00
Joseph Doherty 4f0f03fca5 gateway: AcknowledgeAlarm + QueryActiveAlarms RPC handlers (PR A.3)
Twelfth PR of the alarms-over-gateway epic
(docs/plans/alarms-over-gateway.md). Lands the public RPC handler
surface that PR A.1's proto introduced. The actual worker-side
ack call + active-alarm walk depend on PR A.2 (worker MxAccess
subscription); this PR ensures clients can call the RPCs and
receive a meaningful response without UNIMPLEMENTED at the gRPC
layer.

- AcknowledgeAlarm — validates session_id + alarm_full_reference,
  resolves the session (NotFound on miss), returns a successful
  reply with a structured DiagnosticMessage indicating worker
  dispatch is pending PR A.2. Once A.2 ships, the body translates
  the request into a WorkerCommand and forwards through
  SessionManager.InvokeAsync.
- QueryActiveAlarms — validates session_id, returns an empty
  stream. PR A.4 layers the actual ConditionRefresh implementation
  once the worker's QueryActiveAlarmsCommand is available.
- OpenSessionReply.Capabilities advertises both new RPCs
  (unary-acknowledge-alarm, server-stream-active-alarms) so
  clients can negotiate against the contract surface.

OnAlarmTransition events flow through the existing StreamEvents
path automatically — EventStreamService and MxAccessGrpcMapper
forward whatever family the worker emits without filtering, so
no changes are needed there for A.3.

Tests: full 273-test suite still green. Per-handler unit tests
ship with PR A.4's expanded surface; A.3's stub handlers are
narrow enough that the existing parity-fixture tests cover the
contract round-trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:03:58 -04:00
Joseph Doherty ddad573b75 Merge origin/main with local pending work and update AGENTS.md references
- Resolve 14 conflicts from popping local stash on top of origin's
  eed1e88 + 8d3352f doc-comment additions (11 mechanical, plus
  version.rs, DashboardAuthenticatorTests.cs, DashboardGalaxyProjector.cs)
- Fix 4 test files that used AGENTS.md as the repo-root sentinel
  (now use CLAUDE.md, since AGENTS.md was removed in 4731ab5)
- Redirect 10 doc citations from AGENTS.md to the matching gateway.md
  sections (Value Model, Status Model, Security, STA Worker Thread
  Model, gRPC Layer rule, cancellation rule)

Verified: solution build clean, x86 worker build clean, 266/266
gateway tests passing, 121/121 worker tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 14:13:33 -04:00
Joseph Doherty eed1e88a37 Add XML documentation across gateway, worker, and .NET client 2026-04-30 11:49:58 -04:00
Joseph Doherty 133c83029b Add Galaxy repository API and clients 2026-04-29 07:27:00 -04:00
Joseph Doherty 047d875fe6 Fix remaining reliability findings 2026-04-28 06:38:05 -04:00
Joseph Doherty b0041c5d18 Fix reliability findings 2026-04-28 06:27:01 -04:00
Joseph Doherty 907aa49aea Improve gateway reliability and client e2e coverage 2026-04-28 06:11:18 -04:00
Joseph Doherty 4fc355b357 Improve gateway reliability and dashboard docs 2026-04-28 00:13:22 -04:00
Joseph Doherty bd4a09a35e Add Polly resilience policies 2026-04-27 15:37:56 -04:00
Joseph Doherty d431ff9660 Fix dashboard static assets and add client e2e scripts 2026-04-27 12:10:40 -04:00
Joseph Doherty 3d11ac3316 Add bulk MXAccess subscription commands 2026-04-26 22:29:27 -04:00
Joseph Doherty d890eff862 Implement graceful worker shutdown 2026-04-26 19:36:22 -04:00
Joseph Doherty 6a40d26366 Publish stable client proto inputs 2026-04-26 18:52:39 -04:00
Joseph Doherty 56886c3b4e Implement Blazor Server dashboard 2026-04-26 18:37:16 -04:00
Joseph Doherty 2e4ba11a9f Merge remote-tracking branch 'origin/main' into agent-1/issue-17-implement-dashboard-authentication 2026-04-26 18:15:38 -04:00
Joseph Doherty ff86b3f0b0 Implement dashboard authentication 2026-04-26 18:15:22 -04:00
Joseph Doherty 77eac95f33 Issue #14: implement event streaming and backpressure 2026-04-26 17:59:37 -04:00
Joseph Doherty 3661420f0a Issue #15: implement dashboard snapshot service 2026-04-26 17:49:59 -04:00
Joseph Doherty 8d6d3f6188 Issue #13: implement public grpc service 2026-04-26 17:42:46 -04:00
Joseph Doherty d496f1fd75 Issue #12: implement session manager and registry 2026-04-26 17:29:47 -04:00
Joseph Doherty fce9e99553 Issue #11: implement gateway workerclient 2026-04-26 17:09:51 -04:00
Joseph Doherty fad0ac9948 Issue #8: add grpc authentication and scope authorization 2026-04-26 17:01:59 -04:00
Joseph Doherty da9ffe0e11 Issue #7: implement local api key admin cli 2026-04-26 16:56:12 -04:00
Joseph Doherty c1188c6957 Issue #10: implement worker process launcher 2026-04-26 16:45:42 -04:00
Joseph Doherty 696be17139 Issue #6: implement api key hashing and verification 2026-04-26 16:40:46 -04:00
Joseph Doherty ec1155de6d Issue #5: implement sqlite auth store and migrations 2026-04-26 16:29:38 -04:00
Joseph Doherty a5098e6815 Issue #9: implement worker frame protocol 2026-04-26 16:20:59 -04:00
Joseph Doherty a25f09e795 Merge remote-tracking branch 'origin/main' into agent-3/issue-4-add-structured-logging-and-metrics-foundation
# Conflicts:
#	src/MxGateway.Server/GatewayApplication.cs
#	src/MxGateway.Tests/Gateway/GatewayApplicationTests.cs
2026-04-26 16:17:22 -04:00
Joseph Doherty 91ea71b0b7 Issue #3: add gateway configuration and validation 2026-04-26 16:11:30 -04:00
Joseph Doherty 7dfec6dc8c Issue #4: add structured logging and metrics foundation 2026-04-26 16:10:58 -04:00
Joseph Doherty a45f439029 Issue #1: scaffold gateway solution and projects 2026-04-26 15:49:05 -04:00