a0203503a7b7d473afe3c2d206366ee1e7827dcd
6 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
a0203503a7 |
Code-review 2026-05-20 sweep: re-review at 1cd51bb, resolve 72 findings across all 11 modules
Re-reviewed every module/client against the 10-category checklist
(REVIEW-PROCESS.md) at commit
|
||
|
|
53e3973209 |
Resolve Worker-001, Worker-002, Worker-003 code-review findings
Worker-001: WnWrapAlarmConsumer armed a System.Threading.Timer whose OnPoll callback ran GetXmlCurrentAlarms2 on a thread-pool thread against the Apartment-threaded wnwrap COM object, which can deadlock on cross-apartment marshaling. Removed the pollTimer/pollIntervalMs fields, OnPoll, the poll-interval constructor parameter, and the timer arm/disposal. Polls are driven externally by the STA via StaRuntime.InvokeAsync(PollOnce).
Worker-002: RunHeartbeatLoopAsync delayed a full HeartbeatInterval before the first heartbeat. Restructured so the first beat is sent immediately on entering the loop and the delay applies only between subsequent beats.
Worker-003: ProcessCommandAsync silently returned without a reply when _state was not a command-serving state after dispatch. Both drop sites now log a WorkerCommandResultDropped diagnostic with correlation_id via IWorkerLogger; _state is now volatile.
Three pre-existing tests that asserted strict frame ordering were updated to tolerate an interleaved first heartbeat (Worker-002 consequence).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
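The Worker-002 restructuring (first beat immediately, delay only between later beats) can be illustrated with a minimal Python sketch. This is not the worker's C# code: `run_heartbeat_loop`, `send_heartbeat`, `interval_s`, and `stop` are hypothetical names chosen for the illustration, and the early wake-up on shutdown is an assumption.

```python
import asyncio


async def run_heartbeat_loop(send_heartbeat, interval_s, stop):
    """Send one beat immediately, then wait `interval_s` between later beats."""
    while not stop.is_set():
        await send_heartbeat()
        try:
            # Sleep for the interval, but wake early if shutdown is requested.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            pass  # Interval elapsed without a stop request; send the next beat.


async def demo():
    beats = []
    stop = asyncio.Event()

    async def send_heartbeat():
        # Hypothetical sender: records the beat and stops after three beats.
        beats.append(len(beats))
        if len(beats) == 3:
            stop.set()

    await run_heartbeat_loop(send_heartbeat, interval_s=0.01, stop=stop)
    return beats
```

Placing the send before the wait is what makes the first heartbeat arrive without an initial delay, which is also why tests asserting strict frame ordering may see an interleaved first beat.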
||
|
|
a67a5a4857 |
fix(worker): wire alarm command handler and STA poll loop (Gap 1 + Gap 2)
Gap 1 — WorkerPipeSession now passes `eq => new AlarmCommandHandler(eq)` as
the alarmCommandHandlerFactory in all three places it constructs
MxAccessStaSession (two convenience constructors and InitializeMxAccessAsync).
Previously the parameterless MxAccessStaSession() set the factory to null,
so every SubscribeAlarms / AcknowledgeAlarm / QueryActiveAlarms command
returned "alarm consumer not configured" in a deployed worker.
- Added internal `MxAccessStaSession(Func<MxAccessEventQueue, IAlarmCommandHandler>?)`
constructor that builds all defaults but accepts a factory.
- Added public `MxAccessStaSession(StaRuntime, factory, eventQueue, alarmFactory?)`
4-arg overload to complete the constructor chain.
Gap 2 — WnWrapAlarmConsumer now disables its internal threadpool Timer
(pollIntervalMilliseconds=0 in the default constructor). MxAccessStaSession
starts a `RunAlarmPollLoopAsync` background task that sleeps off-STA then
calls `staRuntime.InvokeAsync(() => handler.PollOnce())` at 500ms intervals.
This satisfies the ThreadingModel=Apartment requirement of wwAlarmConsumerClass:
every GetXmlCurrentAlarms2 call now runs on the worker's STA.
- Added `PollOnce()` to `IMxAccessAlarmConsumer`, `AlarmDispatcher`,
`IAlarmCommandHandler`, and `AlarmCommandHandler`.
- Poll loop cancelled and awaited before alarm handler disposal in both
ShutdownGracefullyAsync and Dispose.
Tests: 4 new tests in MxAccessStaSessionTests verify that
- SubscribeAlarms reaches the handler when the factory is wired (Gap 1)
- SubscribeAlarms returns InvalidRequest without a factory (regression guard)
- PollOnce is called on the STA thread within 3s (Gap 2)
- The poll loop stops after Dispose (Gap 2 lifecycle)
All fake IMxAccessAlarmConsumer / IAlarmCommandHandler test implementations
updated with no-op PollOnce() to satisfy the new interface member.
Worker tests: 199 passed / 1 pre-existing failure / 4 skipped (was 195/1/4).
Server tests: 308 passed / 0 failures (unchanged).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
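The Gap 2 pattern in a67a5a4857 (sleep off the STA, then marshal each poll onto the single apartment thread) can be sketched in Python. A one-worker `ThreadPoolExecutor` stands in for the worker's StaRuntime here; that substitution, and the names `run_poll_loop`, `sta`, and `poll_once`, are assumptions made for the illustration and not the repository's code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_poll_loop(sta, poll_once, interval_s, stop):
    """Sleep on the caller's thread, then run each poll on the `sta` thread."""
    # stop.wait() returns False on timeout and True once stop is set.
    while not stop.wait(interval_s):
        # submit() hands the call to the single worker thread;
        # result() blocks until that poll has finished.
        sta.submit(poll_once).result()


def demo():
    sta = ThreadPoolExecutor(max_workers=1)  # stand-in for the single STA thread
    sta_thread_id = sta.submit(threading.get_ident).result()
    stop = threading.Event()
    poll_threads = []

    def poll_once():
        # Hypothetical poll: records which thread ran it, stops after three polls.
        poll_threads.append(threading.get_ident())
        if len(poll_threads) == 3:
            stop.set()

    loop = threading.Thread(target=run_poll_loop, args=(sta, poll_once, 0.01, stop))
    loop.start()
    loop.join()      # poll loop is stopped and awaited first...
    sta.shutdown()   # ...and only then is the apartment stand-in torn down
    return sta_thread_id, poll_threads
```

In `demo`, the loop is joined before the executor is shut down, mirroring the ordering the commit describes (poll loop cancelled and awaited before handler disposal).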
||
|
|
a4ed605f74 |
A.3 (live smoke): full alarms-over-gateway pipeline verified end-to-end
Skip-gated AlarmsLiveSmokeTests.Alarms_full_pipeline_round_trip ran
against the dev rig with the flip script firing
TestMachine_001.TestAlarm001 every 10s. Verified:
- Subscribe + 1st PollOnce yield real transition events
- Field-by-field decode correct (provider, group, tag, severity,
UTC timestamp, comment, type)
- SnapshotActiveAlarms reflects current state
- AcknowledgeByName(real identity) -> rc=0
- Pipeline keeps streaming transitions on the 10s cadence post-ack
Three production quirks surfaced and were fixed in
WnWrapAlarmConsumer:
1. SetXmlAlarmQuery is mandatory for reads. Skipping it (per the
earlier discovery-doc recommendation) makes the first
GetXmlCurrentAlarms2 fail with E_FAIL. The doc's claim that the
call is unnecessary because the round-trip echo is mangled was
wrong — mangled echo or not, the call is required.
2. SetXmlAlarmQuery breaks AlarmAckByName on the same consumer
instance (returns -55). Workaround: provision a parallel
"ack-only" wnwrap consumer that runs Initialize → Register →
Subscribe via the v1-prefixed methods, no SetXmlAlarmQuery.
Production WnWrapAlarmConsumer now holds two COM clients;
AcknowledgeByName always dispatches through the ack-only one.
3. AlarmAckByName has v2 (8-arg) and v1 (6-arg) overloads. The v2
8-arg overload returns -55 on this AVEVA build (apparently a
stub); the v1 6-arg overload works. Production now calls the
6-arg overload, discarding the proto's operator_domain and
operator_full_name fields. The proto contract keeps both for
forward-compat if AVEVA fixes the v2 method.
Bonus finding (not fixed here): AlarmAckByGUID throws
NotImplementedException on wnwrap. Reference→GUID lookup that we
initially planned to plumb is therefore not viable; all acks must
go through AlarmAckByName. WorkerAlarmRpcDispatcher.AcknowledgeAsync
already routes references through the by-name path, so this only
affects the GUID-input branch (which the worker tries first if the
input parses as a GUID — that branch will surface
NotImplementedException as MxaccessFailure if a client supplies one).
Threading caveat: wnwrap is ThreadingModel=Apartment, so the
consumer's internal Timer (firing on threadpool threads) blocks on
cross-apartment marshaling without an STA message pump. The smoke
test sidesteps this with pollIntervalMilliseconds=0 (Timer disabled)
+ manual PollOnce calls from the test STA. Production hosting will
route polls through the worker's StaRuntime in a follow-up; PollOnce
is now public so the wire-up is straightforward.
Test counts after this slice:
Worker: 195 pass / 4 skipped (live probes incl. new live smoke) /
1 pre-existing structure-fail (untouched)
Server: 308 pass / 0 fail
Solution builds clean.
docs/AlarmClientDiscovery.md "Live smoke-test discoveries" section
records all five findings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
4e02927f01 |
A.3 (alarm-ack-by-name): public AcknowledgeAlarm now accepts Provider!Group.Tag references
Closes the gap where the public AcknowledgeAlarm RPC required canonical GUIDs but OnAlarmTransitionEvent.AlarmFullReference is "Provider!Group.Tag". Adds an AVEVA AlarmAckByName path that wraps wwAlarmConsumerClass.AlarmAckByName so callers can ack with the natural reference.
Proto:
- New MxCommandKind.AcknowledgeAlarmByName (=29).
- New AcknowledgeAlarmByNameCommand(alarm_name, provider_name, group_name, comment, operator_user/node/domain/full_name) on MxCommand oneof.
- AcknowledgeAlarmReplyPayload (existing) carries the AVEVA native status; reused for the by-name path.
Worker:
- IMxAccessAlarmConsumer + WnWrapAlarmConsumer + AlarmDispatcher + AlarmCommandHandler all gain an AcknowledgeByName(name, provider, group, comment, operator-identity) overload that maps to wwAlarmConsumerClass.AlarmAckByName.
- MxAccessCommandExecutor: new switch arm routes MxCommandKind.AcknowledgeAlarmByName to the handler. Empty alarm_name yields InvalidRequest; handler exceptions surface as MxaccessFailure.
Gateway:
- WorkerAlarmRpcDispatcher.TryParseAlarmReference: parses "Provider!Group.Tag" with the convention that the FIRST '!' separates provider, the FIRST '.' after '!' separates group; tag may contain more dots.
- AcknowledgeAsync now branches: GUID input → AcknowledgeAlarm command (existing path); reference input → AcknowledgeAlarmByName command (new path); neither parses → InvalidRequest with a clear diagnostic.
Tests: 13 new unit tests cover each layer end-to-end:
- WorkerAlarmRpcDispatcher.TryParseAlarmReference (3 valid + 8 invalid forms) including the realistic 4-component "Galaxy!TestArea.TestMachine_001.TestAlarm001" reference.
- WorkerAlarmRpcDispatcher.AcknowledgeAsync routes references through AcknowledgeAlarmByName + propagates the full operator tuple.
- Executor switch arm carries the by-name tuple and rejects empty alarm_name.
- AlarmDispatcher.AcknowledgeByName forwards to consumer.
- Existing fakes extended for the new overload.
Counts: server 308/0, worker 195/3 skip / 1 pre-existing structure-fail (untouched). Solution builds clean.
End-to-end alarms-over-gateway now serves the full lmxopcua flow: client.AcknowledgeAlarm(reference="Galaxy!TestArea.TestMachine_001.TestAlarm001", operator_user="alice") → gateway parses → IPC AcknowledgeAlarmByName → worker AlarmAckByName → AVEVA history. The remaining piece for full parity is a live dev-rig smoke test.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
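The "Provider!Group.Tag" convention stated in 4e02927f01 (first '!' ends the provider, first '.' after it ends the group, the tag keeps any further dots) can be sketched in Python as follows. `AlarmReference` and `try_parse_alarm_reference` are hypothetical names, and rejecting empty components is an assumption: the commit does not list which eight invalid forms its tests cover.

```python
from typing import NamedTuple, Optional


class AlarmReference(NamedTuple):
    provider: str
    group: str
    tag: str


def try_parse_alarm_reference(reference: str) -> Optional[AlarmReference]:
    """Split "Provider!Group.Tag"; return None when the shape does not match."""
    # The FIRST '!' separates the provider from the rest.
    provider, bang, rest = reference.partition("!")
    # The FIRST '.' after the '!' separates the group; the tag keeps later dots.
    group, dot, tag = rest.partition(".")
    # Assumption: both separators must be present and no component may be empty.
    if not (bang and dot and provider and group and tag):
        return None
    return AlarmReference(provider, group, tag)
```

With the reference quoted in the commit, `try_parse_alarm_reference("Galaxy!TestArea.TestMachine_001.TestAlarm001")` yields provider `Galaxy`, group `TestArea`, and tag `TestMachine_001.TestAlarm001`.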
||
|
|
f711a55be4 |
A.2: replace AlarmClientConsumer with wnwrap-based polling consumer
Switch the worker's alarm-consumer surface from `aaAlarmManagedClient.AlarmClient` to `WNWRAPCONSUMERLib.wwAlarmConsumerClass` (CLSID 7AB52E5F-…) hosted by `wnwrapConsumer.dll`. The new path returns alarm records as a BSTR XML payload via `GetXmlCurrentAlarms2`, bypassing the FILETIME→DateTime auto-marshaling that crashed `GetHighPriAlarm` with ArgumentOutOfRangeException on every poll.
Live captured 60/60 polls clean against `\DESKTOP-6JL3KKO\Galaxy!DEV` while a System Platform script flipped TestMachine_001.TestAlarm001 every 10s; the GUID, priority, state (UNACK_ALM ↔ UNACK_RTN), and ASCII-formatted timestamps arrived end-to-end.
Implementation:
- `Interop.WNWRAPCONSUMERLib.dll` generated via tlbimp, checked in under `lib/` so dev boxes don't need the SDK to build.
- New `WnWrapAlarmConsumer` (replaces `AlarmClientConsumer`): owns a 500ms polling timer, parses `GetXmlCurrentAlarms2` output, diffs the snapshot keyed by alarm GUID, and raises one `MxAlarmTransitionEvent` per state change. Includes the Initialize→Register-before-Subscribe ordering fix found during Discovery probe runs.
- New library-agnostic types `MxAlarmSnapshotRecord` / `MxAlarmStateKind` / `MxAlarmTransitionEvent` so the proto-build path is testable without an AVEVA install.
- `AlarmRecordTransitionMapper` retired the COM-coupled `MapTransitionKind(eAlmTransitions)`; new pure helpers `ParseStateKind`, `MapTransition(prev, curr)`, and `ParseTransitionTimestampUtc` cover XML decode + state-delta logic.
- `IMxAccessAlarmConsumer` event surface changed from `EventHandler<AlarmRecord>` to `EventHandler<MxAlarmTransitionEvent>` and `SnapshotActiveAlarms()` returns `MxAlarmSnapshotRecord`, decoupling the interface from any specific COM library.
- Worker csproj drops `aaAlarmManagedClient` / `IAlarmMgrDataProvider` refs; adds `Interop.WNWRAPCONSUMERLib`.
Tests:
- 36 new unit tests (state-string mapping, prev/current → proto kind decision table, timestamp UTC reassembly, XML payload parser, 32-char hex GUID round-trip) covering everything that doesn't touch the live COM surface, all passing.
- Skip-gated `WnWrapConsumerProbeTests.ProbeWnWrapConsumer` archives the live capture flow for regression / future probes.
Docs:
- `docs/AlarmClientDiscovery.md` "Option A — captured" section records sample XML payloads, the mangled `SetXmlAlarmQuery` round-trip (prefer `Subscribe` for filtering), the `GetStatistics` AccessViolationException quirk, and the worker-integration outline.
Pre-existing failure noted (separate): `MxAccessInteropReference_ExistsOnlyInWorkerProject` was already failing on HEAD because the test project still references `ArchestrA.MxAccess` for the Skip-gated discovery probes. Not regressed by this change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
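The snapshot-diff step described in f711a55be4 (diff the polled snapshot keyed by alarm GUID and raise one event per state change) might look roughly like the Python sketch below. It is an illustration under stated assumptions: `diff_snapshots` and the `Transition` tuple are hypothetical, states are plain strings, and alarms that vanish from the snapshot are not reported because the commit does not say how those are handled.

```python
from typing import Dict, List, Optional, Tuple

# (alarm GUID, previous state or None if newly seen, current state)
Transition = Tuple[str, Optional[str], str]


def diff_snapshots(previous: Dict[str, str], current: Dict[str, str]) -> List[Transition]:
    """Return one transition per alarm whose state differs from the last poll."""
    transitions: List[Transition] = []
    for guid, state in current.items():
        prev_state = previous.get(guid)  # None when the alarm is newly seen
        if prev_state != state:
            transitions.append((guid, prev_state, state))
    # Assumption: alarms present in `previous` but absent from `current`
    # are ignored in this sketch.
    return transitions
```

Keying by GUID means an unchanged alarm produces no event across polls, so a flip such as UNACK_ALM to UNACK_RTN surfaces exactly once.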