Compare commits

...

6 Commits

Author SHA1 Message Date
Joseph Doherty 5e795aeeb8 Regenerate code-reviews index after High/Critical resolution batch
Reflects the resolution of Tests-001/002, IntegrationTests-001/002,
Client.Go-001, Worker-001/002/003 and Worker.Tests-001/002.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 21:08:06 -04:00
Joseph Doherty 1b4dcf32d5 Resolve Worker.Tests-001 and Worker.Tests-002 code-review findings
Worker.Tests-001: StaMessagePump had no direct unit test. Added
Sta/StaMessagePumpTests.cs — 8 STA-thread facts covering WaitForWorkOrMessages
(wake-event signalled before/during the wait, timeout expiry, zero-timeout
fast path, the QS_ALLINPUT posted-message wake path) and PumpPendingMessages
drain counting.

Worker.Tests-002: no test drove a COM event through the integrated
sink -> mapper -> queue path. Added MxAccess/MxAccessBaseEventSinkTests.cs —
5 facts driving OnDataChange, OnWriteComplete, OperationComplete and
OnBufferedDataChange through a real MxAccessBaseEventSink + mapper + queue and
asserting the converted WorkerEvent lands in MxAccessEventQueue. The four COM
event handlers were widened private -> internal and InternalsVisibleTo for
MxGateway.Worker.Tests was added, mirroring MxAccessAlarmEventSink's existing
test seam; no worker behavior changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 21:07:48 -04:00
Joseph Doherty 53e3973209 Resolve Worker-001, Worker-002, Worker-003 code-review findings
Worker-001: WnWrapAlarmConsumer armed a System.Threading.Timer whose OnPoll
callback ran GetXmlCurrentAlarms2 on a thread-pool thread against the
Apartment-threaded wnwrap COM object, which can deadlock on cross-apartment
marshaling. Removed the pollTimer/pollIntervalMs fields, OnPoll, the
poll-interval constructor parameter, and the timer arm/disposal. Polls are
driven externally by the STA via StaRuntime.InvokeAsync(PollOnce).

Worker-002: RunHeartbeatLoopAsync delayed a full HeartbeatInterval before
the first heartbeat. Restructured so the first beat is sent immediately on
entering the loop and the delay applies only between subsequent beats.

Worker-003: ProcessCommandAsync silently returned without a reply when
_state was not a command-serving state after dispatch. Both drop sites now
log a WorkerCommandResultDropped diagnostic with correlation_id via
IWorkerLogger; _state is now volatile.

Three pre-existing tests that asserted strict frame ordering were updated to
tolerate an interleaved first heartbeat (Worker-002 consequence).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:59:46 -04:00
Joseph Doherty e967e85973 Resolve Client.Go-001 code-review finding
MxAccessError.Unwrap returned e.Command directly; on the HRESULT-only path
Command is a nil *CommandError, so Unwrap returned a non-nil error wrapping
a typed nil and errors.As bound a nil *CommandError. Unwrap now returns an
untyped nil when Command is nil. Added errors_test.go regression coverage
for the HRESULT-only and populated-Command paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:46:12 -04:00
Joseph Doherty bc55396334 Resolve IntegrationTests-001 and IntegrationTests-002 code-review findings
IntegrationTests-001: documented the live Galaxy Repository test suite and
its MXGATEWAY_RUN_LIVE_GALAXY_TESTS / MXGATEWAY_LIVE_GALAXY_CONN gating in
docs/GatewayTesting.md.

IntegrationTests-002: documented the live LDAP test suite in
docs/GatewayTesting.md and added a concrete "Provisioning the GwAdmin group"
step to glauth.md so DashboardLdapLiveTests' GwAdmin-membership assumption
is reproducible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:46:09 -04:00
Joseph Doherty b381bfcaf1 Resolve Tests-001 and Tests-002 code-review findings
Tests-001: FakeSessionManager.TryGetSession unconditionally synthesized a
session, so Invoke_WhenSessionMissing_ThrowsNotFound did not actually
verify the missing-session path. Added ResolveOnlySeededSessions/SeedSession
to the fake, rewrote the missing-session test, and added seeded-resolution
and alarm-RPC missing-session coverage.

Tests-002: re-triaged. GalaxyRepository issues only constant SQL; filters
are applied in-memory by GalaxyHierarchyProjector/GalaxyGlobMatcher. Kept
as a valid coverage gap and added GalaxyFilterInputSafetyTests exercising
filter/glob input safety directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 20:46:02 -04:00
23 changed files with 1377 additions and 150 deletions
+5 -1
View File
@@ -85,8 +85,12 @@ func (e *MxAccessError) Error() string {
}
// Unwrap returns the wrapped CommandError, when one is present.
//
// When Command is nil (the HRESULT / MxStatusProxy path) it returns an
// untyped nil rather than a typed-nil *CommandError, so errors.As does not
// bind a nil pointer that a caller would then panic on.
func (e *MxAccessError) Unwrap() error {
if e == nil {
if e == nil || e.Command == nil {
return nil
}
return e.Command
+42
View File
@@ -0,0 +1,42 @@
package mxgateway
import (
"errors"
"testing"
)
// TestMxAccessErrorUnwrapHResultPathNoTypedNilCommandError reproduces
// Client.Go-001: an MxAccessError built via the HRESULT / MxStatusProxy path
// leaves Command nil. Unwrap must not hand back a typed-nil *CommandError,
// because errors.As would then succeed while binding a nil pointer and a
// caller dereferencing it would panic.
func TestMxAccessErrorUnwrapHResultPathNoTypedNilCommandError(t *testing.T) {
hresult := int32(-2147467259) // 0x80004005, a failing HRESULT.
reply := &MxCommandReply{Hresult: &hresult}
err := EnsureMxAccessSuccess("invoke", reply)
if err == nil {
t.Fatal("expected MxAccessError for a failing HRESULT, got nil")
}
var ce *CommandError
if errors.As(err, &ce) {
t.Fatalf("errors.As bound *CommandError from an HRESULT-only MxAccessError (ce=%v); "+
"a caller dereferencing ce.Status would panic", ce)
}
}
// TestMxAccessErrorUnwrapPopulatedCommand confirms the non-nil Command path
// still unwraps to the wrapped *CommandError.
func TestMxAccessErrorUnwrapPopulatedCommand(t *testing.T) {
command := &CommandError{Op: "invoke"}
err := &MxAccessError{Command: command}
var ce *CommandError
if !errors.As(err, &ce) {
t.Fatal("errors.As failed to bind the populated *CommandError")
}
if ce != command {
t.Fatalf("errors.As bound an unexpected *CommandError: got %v want %v", ce, command)
}
}
+3 -3
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-18 |
| Commit reviewed | `3cc53a8` |
| Status | Reviewed |
| Open findings | 10 |
| Open findings | 9 |
## Checklist coverage
@@ -33,13 +33,13 @@
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `clients/go/mxgateway/errors.go:88-93`, `clients/go/mxgateway/errors.go:117-128` |
| Status | Open |
| Status | Resolved |
**Description:** `MxAccessError.Unwrap` returns `e.Command` directly. `EnsureMxAccessSuccess` constructs `&MxAccessError{Reply: reply}` with `Command` left nil (the HRESULT / failing-`MxStatusProxy` path). When `Command` is a nil `*CommandError`, `Unwrap()` returns a non-nil `error` interface wrapping a nil pointer. Consequently `errors.As(err, &ce)` for `*CommandError` returns `true` while setting `ce` to nil — a caller writing the idiomatic `if errors.As(err, &commandErr) { use commandErr.Status }` nil-dereferences and panics. Verified empirically; the existing test only exercises the populated-`Command` path.
**Recommendation:** Make `Unwrap` return an untyped nil when `Command` is nil: `if e == nil || e.Command == nil { return nil }; return e.Command`. Add a test for the HRESULT-only `MxAccessError` asserting `errors.As(err, &ce)` is `false`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-18: `MxAccessError.Unwrap` now returns an untyped nil when `Command` is nil, so `errors.As` no longer binds a typed-nil `*CommandError`; added `errors_test.go` regression coverage for the HRESULT-only and populated-`Command` paths.
### Client.Go-002
+5 -5
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-18 |
| Commit reviewed | `6c64030` |
| Status | Reviewed |
| Open findings | 10 |
| Open findings | 8 |
## Checklist coverage
@@ -33,13 +33,13 @@
| Severity | High |
| Category | Design-document adherence |
| Location | `src/MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs:7`, `src/MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs` |
| Status | Open |
| Status | Resolved |
**Description:** The Galaxy Repository live test suite and its gating env var `MXGATEWAY_RUN_LIVE_GALAXY_TESTS` (plus connection-string override `MXGATEWAY_LIVE_GALAXY_CONN`) are completely absent from `docs/GatewayTesting.md`. CLAUDE.md mandates updating docs in the same change as the source. The opt-in matrix documents only the MXAccess and LDAP env vars, so an operator running the documented matrix has no way to know these tests exist or how to enable them.
**Recommendation:** Add a "Live Galaxy Repository" section to `docs/GatewayTesting.md` documenting `MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1`, `MXGATEWAY_LIVE_GALAXY_CONN`, the `ZB` database prerequisite, and the covered RPCs, mirroring the existing "Live MXAccess Smoke" section.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-18: Added a "Live Galaxy Repository" section to `docs/GatewayTesting.md` documenting `MXGATEWAY_RUN_LIVE_GALAXY_TESTS`, `MXGATEWAY_LIVE_GALAXY_CONN`, the deployed-`ZB` prerequisite, and the covered `GalaxyRepository` RPCs.
### IntegrationTests-002
@@ -48,13 +48,13 @@
| Severity | High |
| Category | Design-document adherence |
| Location | `src/MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:13`, `src/MxGateway.Server/Configuration/LdapOptions.cs:27` |
| Status | Open |
| Status | Resolved |
**Description:** `DashboardLdapLiveTests` builds the authenticator with `new GatewayOptions()`, so it relies on `LdapOptions.RequiredGroup` defaulting to `GwAdmin` and asserts the `admin` user is a member of a `GwAdmin` LDAP group. `glauth.md` does not list `GwAdmin` as a provisioned group — it lists `admin` only in the five role groups and describes `GwAdmin` as a group to add "when reuse isn't enough." If GLAuth has only the documented baseline groups, `AuthenticateAsync_AdminInGwAdminGroup_Succeeds` fails (not skips) on any box where the env var is set. This is an undocumented hard prerequisite beyond "LDAP is up."
**Recommendation:** Either document the required `GwAdmin` GLAuth provisioning step in `glauth.md` and `GatewayTesting.md`, or have the test set `RequiredGroup` to a baseline group `glauth.md` guarantees `admin` belongs to (e.g. `WriteOperate`).
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-18: Took the documentation fix — promoted the `glauth.md` "Adding a gw-specific group" section into a concrete "Provisioning the GwAdmin group" step that grants `GwAdmin` to `admin`, cross-referenced it from the groups/verification sections, and added a "Live LDAP" section to `docs/GatewayTesting.md` calling out `GwAdmin` as a hard prerequisite. Alternative considered: weaken the test to a baseline group (`WriteOperate`) — rejected because `GwAdmin` is the real default `LdapOptions.RequiredGroup` and the test should exercise it.
### IntegrationTests-003
+15 -15
View File
@@ -11,16 +11,16 @@ Each module's `findings.md` is the source of truth; this file is generated from
| Module | Reviewer | Date | Commit | Status | Open | Total |
|---|---|---|---|---|---|---|
| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-05-18 | `3cc53a8` | Reviewed | 8 | 8 |
| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-05-18 | `3cc53a8` | Reviewed | 10 | 10 |
| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-05-18 | `3cc53a8` | Reviewed | 9 | 10 |
| [Client.Java](Client.Java/findings.md) | Claude Code | 2026-05-18 | `3cc53a8` | Reviewed | 12 | 12 |
| [Client.Python](Client.Python/findings.md) | Claude Code | 2026-05-18 | `3cc53a8` | Reviewed | 12 | 12 |
| [Client.Rust](Client.Rust/findings.md) | Claude Code | 2026-05-18 | `3cc53a8` | Reviewed | 0 | 12 |
| [Contracts](Contracts/findings.md) | Claude Code | 2026-05-18 | `6c64030` | Reviewed | 8 | 8 |
| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-05-18 | `6c64030` | Reviewed | 10 | 10 |
| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-05-18 | `6c64030` | Reviewed | 8 | 10 |
| [Server](Server/findings.md) | Claude Code | 2026-05-18 | `6c64030` | Reviewed | 12 | 14 |
| [Tests](Tests/findings.md) | Claude Code | 2026-05-18 | `6c64030` | Reviewed | 12 | 12 |
| [Worker](Worker/findings.md) | Claude Code | 2026-05-18 | `6c64030` | Reviewed | 15 | 15 |
| [Worker.Tests](Worker.Tests/findings.md) | Claude Code | 2026-05-18 | `6c64030` | Reviewed | 15 | 15 |
| [Tests](Tests/findings.md) | Claude Code | 2026-05-18 | `6c64030` | Reviewed | 10 | 12 |
| [Worker](Worker/findings.md) | Claude Code | 2026-05-18 | `6c64030` | Reviewed | 12 | 15 |
| [Worker.Tests](Worker.Tests/findings.md) | Claude Code | 2026-05-18 | `6c64030` | Reviewed | 13 | 15 |
## Pending findings
@@ -28,16 +28,6 @@ Findings with status `Open` or `In Progress`, ordered by severity.
| ID | Severity | Category | Location | Description |
|---|---|---|---|---|
| Client.Go-001 | High | Correctness & logic bugs | `clients/go/mxgateway/errors.go:88-93`, `clients/go/mxgateway/errors.go:117-128` | `MxAccessError.Unwrap` returns `e.Command` directly. `EnsureMxAccessSuccess` constructs `&MxAccessError{Reply: reply}` with `Command` left nil (the HRESULT / failing-`MxStatusProxy` path). When `Command` is a nil `*CommandError`, `Unwrap()… |
| IntegrationTests-001 | High | Design-document adherence | `src/MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs:7`, `src/MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs` | The Galaxy Repository live test suite and its gating env var `MXGATEWAY_RUN_LIVE_GALAXY_TESTS` (plus connection-string override `MXGATEWAY_LIVE_GALAXY_CONN`) are completely absent from `docs/GatewayTesting.md`. CLAUDE.md mandates updating… |
| IntegrationTests-002 | High | Design-document adherence | `src/MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:13`, `src/MxGateway.Server/Configuration/LdapOptions.cs:27` | `DashboardLdapLiveTests` builds the authenticator with `new GatewayOptions()`, so it relies on `LdapOptions.RequiredGroup` defaulting to `GwAdmin` and asserts the `admin` user is a member of a `GwAdmin` LDAP group. `glauth.md` does not lis… |
| Tests-001 | High | Testing coverage | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:483-489` | `FakeSessionManager.TryGetSession` unconditionally returns `true` and synthesizes a session for any id. As a result, `Invoke_WhenSessionMissing_ThrowsNotFound` (line 52) only passes because `InvokeException` is pre-seeded — it does not ver… |
| Tests-002 | High | Security | `src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:198-210` | The Galaxy Repository RPCs browse a SQL Server database (`ZB`). Every test injects a `StubGalaxyHierarchyCache`, so actual SQL query construction, parameterization, and filter/glob translation are never exercised. No test demonstrates that… |
| Worker-001 | High | Concurrency & thread safety | `src/MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs:204-207` | When constructed with `pollIntervalMilliseconds > 0`, `Subscribe` starts a `System.Threading.Timer` whose `OnPoll` callback runs `PollOnce()` — which calls `wwAlarmConsumerClass.GetXmlCurrentAlarms2` — on a thread-pool thread. The wnwrap C… |
| Worker-002 | High | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:545-549` | `RunHeartbeatLoopAsync` calls `await Task.Delay(_sessionOptions.HeartbeatInterval, ...)` before sending the first heartbeat. The gateway therefore receives no heartbeat for the first full interval (default 5s) after the worker reaches `Rea… |
| Worker-003 | High | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:399-403`, `:416-419` | `ProcessCommandAsync` checks `_state` after `DispatchAsync` completes and silently `return`s without writing a `WorkerCommandReply` (or fault) when `_state` is not `Ready`/`ExecutingCommand`. `_state` is a plain field mutated from multiple… |
| Worker.Tests-001 | High | Testing coverage | `src/MxGateway.Worker.Tests/Sta/` (no `StaMessagePumpTests.cs`) | `StaMessagePump` — whose entire reason for existing is pumping Windows messages so MXAccess COM event sink calls deliver onto the STA — has no direct unit test. `WaitForWorkOrMessages` (timeout conversion, the `MsgWaitForMultipleObjectsEx`… |
| Worker.Tests-002 | High | Testing coverage | `src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs`, `src/MxGateway.Worker.Tests/MxAccess/MxAccessEventMapperTests.cs` | No test verifies that a COM event raised on the STA thread is converted to protobuf and lands in the `MxAccessEventQueue`. `MxAccessEventMapperTests` exercises the mapper directly with hand-built fakes, and `AlarmDispatcherTests` covers th… |
| Client.Dotnet-001 | Medium | Error handling & resilience | `clients/dotnet/MxGateway.Client/GrpcMxGatewayClientTransport.cs:190-199`, `clients/dotnet/MxGateway.Client/GrpcGalaxyRepositoryClientTransport.cs:131-140` | `MapRpcException` only produces typed exceptions for `Unauthenticated` and `PermissionDenied`. Every other gRPC status — `NotFound`, `InvalidArgument`, `ResourceExhausted`, `FailedPrecondition`, `Unavailable`, `Internal` — collapses into t… |
| Client.Dotnet-002 | Medium | Testing coverage | `clients/dotnet/MxGateway.Client.Tests/FakeGatewayTransport.cs:145-148`, `clients/dotnet/MxGateway.Client.Tests/MxGatewayClientSessionTests.cs:236-256` | The retry predicate `MxGatewayClientRetryPolicy.IsTransientGrpcFailure` handles two shapes: a raw `RpcException` and an `MxGatewayException { InnerException: RpcException }`. In production the transport always maps `RpcException``MxGate… |
| Client.Dotnet-003 | Medium | Concurrency & thread safety | `clients/dotnet/MxGateway.Client/MxGatewaySession.cs:659-663`, `clients/dotnet/MxGateway.Client/MxGatewayClient.cs:230-240` | `DisposeAsync` calls `CloseAsync()` (no token) then unconditionally `_closeLock.Dispose()`. If another thread is concurrently awaiting `CloseAsync(token)` — legal, since the type exposes public async methods and no single-threaded contract… |
@@ -150,11 +140,21 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| ID | Severity | Status | Category | Location |
|---|---|---|---|---|
| Server-001 | Critical | Resolved | Security | `src/MxGateway.Server/GatewayApplication.cs:147-149`, `src/MxGateway.Server/Dashboard/DashboardEndpointRouteBuilderExtensions.cs:55-58`, `src/MxGateway.Server/Dashboard/Components/Routes.razor:1-15` |
| Client.Go-001 | High | Resolved | Correctness & logic bugs | `clients/go/mxgateway/errors.go:88-93`, `clients/go/mxgateway/errors.go:117-128` |
| Client.Rust-001 | High | Resolved | mxaccessgw conventions | `clients/rust/src/options.rs:98,143` |
| Client.Rust-002 | High | Resolved | mxaccessgw conventions | `clients/rust/src/session.rs:522` |
| Client.Rust-003 | High | Resolved | Correctness & logic bugs | `clients/rust/crates/mxgw-cli/src/main.rs:1051` |
| Client.Rust-012 | High | Resolved | mxaccessgw conventions | `clients/rust/src/galaxy.rs:282` |
| IntegrationTests-001 | High | Resolved | Design-document adherence | `src/MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs:7`, `src/MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs` |
| IntegrationTests-002 | High | Resolved | Design-document adherence | `src/MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:13`, `src/MxGateway.Server/Configuration/LdapOptions.cs:27` |
| Server-003 | High | Resolved | Security | `src/MxGateway.Server/Dashboard/DashboardAuthorizationHandler.cs:39,54-59`, `src/MxGateway.Server/Dashboard/DashboardAuthenticator.cs:236-258` |
| Tests-001 | High | Resolved | Testing coverage | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:483-489` |
| Tests-002 | High | Resolved | Security | `src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:198-210` |
| Worker-001 | High | Resolved | Concurrency & thread safety | `src/MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs:204-207` |
| Worker-002 | High | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:545-549` |
| Worker-003 | High | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:399-403`, `:416-419` |
| Worker.Tests-001 | High | Resolved | Testing coverage | `src/MxGateway.Worker.Tests/Sta/` (no `StaMessagePumpTests.cs`) |
| Worker.Tests-002 | High | Resolved | Testing coverage | `src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs`, `src/MxGateway.Worker.Tests/MxAccess/MxAccessEventMapperTests.cs` |
| Client.Rust-005 | Medium | Resolved | Correctness & logic bugs | `clients/rust/src/session.rs:489-520` |
| Client.Rust-006 | Medium | Resolved | Error handling & resilience | `clients/rust/src/session.rs:531-555` |
| Client.Rust-004 | Low | Resolved | Documentation & comments | `clients/rust/src/version.rs:7` |
+7 -5
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-18 |
| Commit reviewed | `6c64030` |
| Status | Reviewed |
| Open findings | 12 |
| Open findings | 10 |
## Checklist coverage
@@ -33,13 +33,13 @@
| Severity | High |
| Category | Testing coverage |
| Location | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:483-489` |
| Status | Open |
| Status | Resolved |
**Description:** `FakeSessionManager.TryGetSession` unconditionally returns `true` and synthesizes a session for any id. As a result, `Invoke_WhenSessionMissing_ThrowsNotFound` (line 52) only passes because `InvokeException` is pre-seeded — it does not verify that the gateway service maps a genuinely missing session to `NotFound`. No test exercises the real gateway path where `TryGetSession` returns `false` (for `StreamEvents`, `CloseSession`, alarm RPCs). A regression dropping the missing-session check would not be caught.
**Recommendation:** Make `FakeSessionManager.TryGetSession` return `false` for unknown ids (return only seeded sessions), then assert `NotFound`/`InvalidArgument` is produced by the service's own lookup logic rather than an injected exception.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-18: confirmed root cause — added `ResolveOnlySeededSessions`/`SeedSession` to `FakeSessionManager` so `TryGetSession` returns `false` for unseeded ids, rewrote `Invoke_WhenSessionMissing_ThrowsNotFound` to drop the injected `InvokeException` and exercise the service's own `ResolveSession` lookup (asserts `InvokeCount == 0`), and added `Invoke_WhenSessionSeeded_ResolvesAndInvokes`, `AcknowledgeAlarm_WhenSessionMissing_ThrowsNotFound`, and `QueryActiveAlarms_WhenSessionMissing_ThrowsNotFound`.
### Tests-002
@@ -48,13 +48,15 @@
| Severity | High |
| Category | Security |
| Location | `src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:198-210` |
| Status | Open |
| Status | Resolved |
**Description:** The Galaxy Repository RPCs browse a SQL Server database (`ZB`). Every test injects a `StubGalaxyHierarchyCache`, so actual SQL query construction, parameterization, and filter/glob translation are never exercised. No test demonstrates that `TagNameGlob`, `RootTagName`, `AlarmFilterPrefix`, etc. are passed as parameters rather than concatenated into SQL. SQL-injection resistance of the Galaxy layer has zero coverage.
**Recommendation:** Add tests for the `GalaxyRepository` query-building layer (against SQLite or an in-memory abstraction, or by asserting parameter objects), covering glob/prefix inputs containing `'`, `%`, `_`, and `;`. At minimum add a unit test over the SQL `LIKE`-pattern escaping helper.
**Resolution:** _(open)_
**Re-triage note:** The finding's premise is partly misframed. `GalaxyRepository` issues only four *constant* SQL statements (`HierarchySql`, `AttributesSql`, `SELECT 1`, `SELECT time_of_last_deploy FROM galaxy`) — no `DiscoverHierarchyRequest` field is ever concatenated into SQL, so there is no dynamic SQL-injection surface and no `LIKE`-escaping helper to test. `AlarmFilterPrefix` belongs to the worker alarm path, not the Galaxy SQL layer. All filters (`TagNameGlob`, `RootTagName`, template-chain, category, contained-path) are applied **in memory** by `GalaxyHierarchyProjector`/`GalaxyGlobMatcher` against the cached snapshot. The genuine, testable concern — that adversarial filter strings are treated as opaque literals (no wildcard behaviour, no ReDoS, no exceptions) — remains valid and was previously uncovered. Severity left at High: an unsafe in-memory filter would still be a real security gap.
**Resolution:** Resolved 2026-05-18: added `src/MxGateway.Tests/Galaxy/GalaxyFilterInputSafetyTests.cs` (10 test methods, mostly `[Theory]` over adversarial inputs `'`, `' OR '1'='1`, `'; DROP TABLE gobject;--`, `%`, `_`, `100%_off`, `[abc]`, `Pump'001`) covering `GalaxyGlobMatcher` literal-treatment / `LIKE`-wildcard / pathological-input (ReDoS) behaviour and `GalaxyHierarchyProjector` + `DiscoverHierarchy` RPC handling of adversarial `TagNameGlob`, `RootTagName`, and `TemplateChainContains`. No product bug found — the in-memory filter layer treats all metacharacters as literals; the passing tests resolve the coverage gap.
### Tests-003
+5 -5
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-18 |
| Commit reviewed | `6c64030` |
| Status | Reviewed |
| Open findings | 15 |
| Open findings | 13 |
## Checklist coverage
@@ -33,13 +33,13 @@
| Severity | High |
| Category | Testing coverage |
| Location | `src/MxGateway.Worker.Tests/Sta/` (no `StaMessagePumpTests.cs`) |
| Status | Open |
| Status | Resolved |
**Description:** `StaMessagePump` — whose entire reason for existing is pumping Windows messages so MXAccess COM event sink calls deliver onto the STA — has no direct unit test. `WaitForWorkOrMessages` (timeout conversion, the `MsgWaitForMultipleObjectsEx` failure path) and `PumpPendingMessages` (drain count) are exercised only indirectly via `StaRuntime`, which never asserts the pump returns/throws correctly. The `MsgWaitFailed` error branch and `ToTimeoutMilliseconds` edge cases (`InfiniteTimeSpan`, `<= Zero`, `>= uint.MaxValue`) are completely uncovered.
**Recommendation:** Add `StaMessagePumpTests` that post a Windows message to the STA thread and assert `PumpPendingMessages` returns the expected count; cover `WaitForWorkOrMessages` waking on a signaled event vs timeout; cover `ToTimeoutMilliseconds` boundaries through an internals-visible seam.
**Resolution:** _(open)_
**Resolution:** 2026-05-18 — Added `src/MxGateway.Worker.Tests/Sta/StaMessagePumpTests.cs` (8 `[Fact]` tests, run on dedicated STA threads). Covers `WaitForWorkOrMessages` null-argument validation, returning immediately when the wake event is pre-signalled, waking when the event is signalled mid-wait, returning on timeout when never signalled, the `TimeSpan.Zero` (`<= Zero`) conversion branch, and waking on a `WM_NULL` Windows message posted to the STA thread (the `QS_ALLINPUT` path). `PumpPendingMessages` is covered for both an empty queue (returns 0) and three posted messages (returns 3). Boundary noted in the file: the `MsgWaitFailed` branch is not exercised because forcing `MsgWaitForMultipleObjectsEx` to fail needs a deliberately invalid native handle, which is unsafe to construct in-process; `ToTimeoutMilliseconds` is `private static` and is covered indirectly through wait-latency assertions rather than reflection.
### Worker.Tests-002
@@ -48,13 +48,13 @@
| Severity | High |
| Category | Testing coverage |
| Location | `src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs`, `src/MxGateway.Worker.Tests/MxAccess/MxAccessEventMapperTests.cs` |
| Status | Open |
| Status | Resolved |
**Description:** No test verifies that a COM event raised on the STA thread is converted to protobuf and lands in the `MxAccessEventQueue`. `MxAccessEventMapperTests` exercises the mapper directly with hand-built fakes, and `AlarmDispatcherTests` covers the alarm sink, but the non-alarm COM-event path (`MxAccessBaseEventSink`/`MxAccessComServer` event handlers → `MxAccessEventMapper` → queue, triggered by an actual sink callback) is never end-to-end tested. Given the worker's core purpose is to convert COM events to protobuf, this is a significant gap.
**Recommendation:** Add a test that invokes the base event sink's data-change handler (via an internal seam or a fake COM event source) and asserts a converted `WorkerEvent` with correct family/sequence appears in the queue.
**Resolution:** _(open)_
**Resolution:** 2026-05-18 — Added `src/MxGateway.Worker.Tests/MxAccess/MxAccessBaseEventSinkTests.cs` (5 `[Fact]` tests). The four `MxAccessBaseEventSink` COM event handlers (`OnDataChange`, `OnWriteComplete`, `OperationComplete`, `OnBufferedDataChange`) — the exact delegate targets the MXAccess COM runtime invokes — were widened from `private` to `internal` (with XML-doc notes that this is a unit-test seam), and `[assembly: InternalsVisibleTo("MxGateway.Worker.Tests")]` was added to `MxGateway.Worker.csproj`. The tests construct a real `MxAccessBaseEventSink` over a real `MxAccessEventMapper` and `MxAccessEventQueue`, invoke each handler with COM-style arguments, and assert a correctly-converted protobuf `WorkerEvent` (family, body case, server/item handle, value, quality, source timestamp, monotonic `WorkerSequence`) lands in the queue. Boundary noted in the file: the COM `+=` wire-up in `Attach`/`Detach` casts to the sealed `LMXProxyServerClass` RCW and cannot run without a live MXAccess COM object, so it is not exercised; invoking the handlers directly reproduces an STA-thread COM callback and exercises the genuine conversion + enqueue path.
### Worker.Tests-003
+7 -7
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-18 |
| Commit reviewed | `6c64030` |
| Status | Reviewed |
| Open findings | 15 |
| Open findings | 12 |
## Checklist coverage
@@ -33,13 +33,13 @@
| Severity | High |
| Category | Concurrency & thread safety |
| Location | `src/MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs:204-207` |
| Status | Open |
| Status | Resolved |
**Description:** When constructed with `pollIntervalMilliseconds > 0`, `Subscribe` starts a `System.Threading.Timer` whose `OnPoll` callback runs `PollOnce()` — which calls `wwAlarmConsumerClass.GetXmlCurrentAlarms2` — on a thread-pool thread. The wnwrap CLSID is registered `ThreadingModel=Apartment`; calling its methods off the owning STA violates the hard rule that all COM calls happen on the dedicated STA thread, and can deadlock on cross-apartment marshaling when the STA is not pumping. The production path (default constructor, interval 0) is safe, but the public 3-arg constructor leaves this footgun callable, and tests/live-smoke use it.
**Recommendation:** Remove the internal `Timer` entirely (production already drives `PollOnce` from the STA), or document and gate it so it can only be used from an STA thread. At minimum, make the timer-driven mode unreachable from any production wiring.
**Resolution:** _(open)_
**Resolution:** 2026-05-18 — Removed the off-STA timer infrastructure from `WnWrapAlarmConsumer`: the `Timer? pollTimer` and `pollIntervalMs` fields, the `DefaultPollIntervalMilliseconds` constant, the `OnPoll` callback, the timer-arming arm in `Subscribe`, and the timer disposal block in `Dispose`. The `pollIntervalMilliseconds` parameter is gone from both public constructors (the test-seam ctor is now 2-arg: `wwAlarmConsumerClass` + `maxAlarmsPerFetch`), so the off-STA footgun is structurally unreachable. `PollOnce()` remains the public STA-driven entry point. The stale "poll … on a timer below" comment was corrected. Verified by the regression tests `WnWrapAlarmConsumer_has_no_internal_timer_field` and `WnWrapAlarmConsumer_exposes_no_poll_interval_constructor_parameter`; the `AlarmsLiveSmokeTests` call site was updated to the 2-arg constructor.
### Worker-002
@@ -48,13 +48,13 @@
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:545-549` |
| Status | Open |
| Status | Resolved |
**Description:** `RunHeartbeatLoopAsync` calls `await Task.Delay(_sessionOptions.HeartbeatInterval, ...)` before sending the first heartbeat. The gateway therefore receives no heartbeat for the first full interval (default 5s) after the worker reaches `Ready`. If the gateway's liveness watchdog expects a heartbeat sooner, a healthy worker can be misclassified as hung at startup.
**Recommendation:** Send an initial heartbeat immediately on entering the loop, or move the `Task.Delay` to the end of the loop body.
**Resolution:** _(open)_
**Resolution:** 2026-05-18 — Restructured `RunHeartbeatLoopAsync` so the `Task.Delay(HeartbeatInterval)` is applied between beats only, not before the first. A `firstBeat` guard skips the delay on the initial iteration, so the gateway sees a heartbeat as soon as the worker is `Ready`; cancellation behavior is preserved (the loop still observes the token and the delay still throws on cancellation). Verified by the regression test `RunAsync_SendsFirstHeartbeatImmediatelyOnEnteringLoop`. Three pre-existing tests (`WorkerPipeClientTests.RunAsync_ConnectsToPipeAndCompletesHandshake`, `WorkerPipeClientTests.RunAsync_RetriesUntilPipeServerAppears`, `WorkerPipeSessionTests.RunAsync_WhenCommandThrowsAfterShutdown_DropsLateFaultAndWritesShutdownAck`) assumed strict frame ordering and were updated to skip the now-interleaved first heartbeat while still asserting the same shutdown-ack behavior.
### Worker-003
@@ -63,13 +63,13 @@
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:399-403`, `:416-419` |
| Status | Open |
| Status | Resolved |
**Description:** `ProcessCommandAsync` checks `_state` after `DispatchAsync` completes and silently `return`s without writing a `WorkerCommandReply` (or fault) when `_state` is not `Ready`/`ExecutingCommand`. `_state` is a plain field mutated from multiple tasks (heartbeat loop, event-drain loop, shutdown). A command that completes successfully while `_state` has transitioned will have its reply dropped with no diagnostic, and the gateway's correlation-id wait then hangs until its own timeout. The `_state` read is also not synchronized.
**Recommendation:** Always attempt to write the reply/fault for an in-flight command, or explicitly reject in-flight commands with a `Canceled`/`WorkerUnavailable` reply during state transitions. Make `_state` access thread-safe (volatile or locked).
**Resolution:** _(open)_
**Resolution:** 2026-05-18 — Both silent-drop `return` sites in `ProcessCommandAsync` (the post-`DispatchAsync` success path and the exception path) now call a new `LogCommandResultDropped` helper before returning. The helper logs an Information event named `WorkerCommandResultDropped` via the session's `IWorkerLogger`, carrying the command's `correlation_id` plus `command_method` and `worker_state`, so a stuck gateway correlation-id wait is now traceable. The `_state` field was made `volatile` (`WorkerState` is an int-backed protobuf enum, so volatile is valid) so cross-thread reads observe the latest value without tearing; this is a low-risk, non-behavioral change and did not destabilize any test. Verified by the regression test `RunAsync_WhenReplyIsDroppedAfterShutdown_LogsDiagnostic`.
### Worker-004
+52
View File
@@ -74,6 +74,58 @@ The test output includes session id, worker process id, command status,
HRESULT/status diagnostics, event sequence and handles, close status, and worker
stdout/stderr lines emitted during the run.
## Live Galaxy Repository
`GalaxyRepositoryLiveTests` in `src/MxGateway.IntegrationTests/Galaxy/` exercises
`GalaxyRepository` directly against the `ZB` Galaxy Repository SQL database. It is
skipped unless `MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1` is set because it depends on a
reachable SQL Server instance and deployed Galaxy state — fake-worker tests cannot
cover the SQL browse RPCs.
The suite covers `TestConnectionAsync`, `GetLastDeployTimeAsync`,
`GetHierarchyAsync`, and `GetAttributesAsync`. `GetHierarchyAsync` and
`GetAttributesAsync` assert a non-empty result, so the connected `ZB` database
must contain a deployed Galaxy, not just an empty schema.
Run the Galaxy live tests explicitly:
```bash
$env:MXGATEWAY_RUN_LIVE_GALAXY_TESTS = "1"
dotnet test src/MxGateway.IntegrationTests/MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~GalaxyRepositoryLiveTests
```
Optional live Galaxy variables:
| Variable | Default | Description |
|----------|---------|-------------|
| `MXGATEWAY_LIVE_GALAXY_CONN` | `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` | Galaxy Repository connection string. Set this when the `ZB` database is on a non-default instance or needs SQL authentication. |
The default connection string targets `ZB` on `localhost` with Windows
authentication, which matches the Galaxy Repository conventions in CLAUDE.md.
## Live LDAP
`DashboardLdapLiveTests` in `src/MxGateway.IntegrationTests/` exercises
`DashboardAuthenticator` against the live GLAuth directory. It is skipped unless
`MXGATEWAY_RUN_LIVE_LDAP_TESTS=1` is set because it binds against the GLAuth
service described in `glauth.md`.
The suite builds the authenticator with a default `GatewayOptions`, so
`LdapOptions.RequiredGroup` keeps its `GwAdmin` default. `GwAdmin` is the
gateway-specific dashboard-admin role and is **not** part of the five baseline
GLAuth role groups — it must be provisioned before the LDAP live tests pass.
`AuthenticateAsync_AdminInGwAdminGroup_Succeeds` fails (rather than skips) when
GLAuth has only the baseline groups, so this is a hard prerequisite beyond "LDAP
is up." See the "Adding a gw-specific group" section of `glauth.md` for the
provisioning step that adds `GwAdmin` and grants it to `admin`.
Run the LDAP live tests explicitly:
```bash
$env:MXGATEWAY_RUN_LIVE_LDAP_TESTS = "1"
dotnet test src/MxGateway.IntegrationTests/MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~DashboardLdapLiveTests
```
## Client E2E Scripts
`scripts/discover-testmachine-tags.ps1` queries the ZB Galaxy Repository for the
+28 -8
View File
@@ -59,6 +59,14 @@ For mxaccessgw dev, `admin` covers every gw-side capability test;
`readonly` is the right "negative" case for proving Browse-OK /
Write-denied.
The gateway dashboard adds one role beyond this LmxOpcUa taxonomy:
`GwAdmin`. `LdapOptions.RequiredGroup` defaults to `GwAdmin`, so the
dashboard login and `DashboardLdapLiveTests` require `admin` to be a
member of a `GwAdmin` group. `GwAdmin` is **not** in the baseline
GLAuth config — it must be provisioned before dashboard authn or the
LDAP live tests work. See [Provisioning the GwAdmin
group](#provisioning-the-gwadmin-group) below.
## Two bind patterns
### 1. Direct bind (simplest)
@@ -127,14 +135,18 @@ ldap:
should strip the leading `ou=` (or `cn=` against AD) RDN value and
look that up in `groupToRole`.
## Adding a gw-specific group (when reuse isn't enough)
## Provisioning the GwAdmin group
If mxaccessgw needs a permission that doesn't fit the existing five
roles (e.g. `GwAdmin` for shutdown/recycle commands), add it to
GLAuth rather than running a separate LDAP server:
`GwAdmin` is the gateway-specific dashboard-admin role. It is the
default `LdapOptions.RequiredGroup`, so the dashboard cookie login and
`DashboardLdapLiveTests` (`MXGATEWAY_RUN_LIVE_LDAP_TESTS=1`) reject
`admin` until a `GwAdmin` group exists and `admin` is a member.
GLAuth's baseline config ships only the five LmxOpcUa role groups, so
`GwAdmin` must be added to GLAuth rather than run from a separate LDAP
server:
1. Edit `C:\publish\glauth\glauth.cfg`
2. Append:
2. Append the group:
```toml
[[groups]]
@@ -142,8 +154,9 @@ GLAuth rather than running a separate LDAP server:
gidnumber = 5510 # pick the next free GID
```
3. Add the group to whichever existing user(s) should have it via
`othergroups = [..., 5510]`. Or create a new user:
3. Add `5510` to `admin`'s `othergroups` list so `admin` resolves the
`GwAdmin` role. Add it to any other user that needs dashboard-admin
rights. Or create a dedicated user:
```toml
[[users]]
@@ -158,6 +171,12 @@ GLAuth rather than running a separate LDAP server:
4. `nssm restart GLAuth`
After the restart, `admin`'s `memberOf` includes
`ou=GwAdmin,ou=groups,dc=lmxopcua,dc=local`, which the authenticator
strips to `GwAdmin` and matches against `RequiredGroup`. The same
pattern applies to any future permission that doesn't fit the existing
five roles.
Generate `passsha256` from a plaintext password:
```powershell
@@ -196,7 +215,8 @@ ldapsearch -x -H ldap://localhost:3893 \
```
The response should list `admin`'s entry with `memberOf` populated for
all five role groups.
all five role groups — plus `GwAdmin` once the gateway-specific group
is provisioned.
## Service management
@@ -0,0 +1,331 @@
using System.Diagnostics;
using Grpc.Core;
using Microsoft.Extensions.Logging.Abstractions;
using MxGateway.Contracts.Proto.Galaxy;
using MxGateway.Server.Dashboard;
using MxGateway.Server.Galaxy;
using MxGateway.Server.Grpc;
using MxGateway.Server.Security.Authorization;
namespace MxGateway.Tests.Galaxy;
/// <summary>
/// Adversarial-input coverage for the Galaxy Repository browse filter layer.
/// <para>
/// Re-triage note (finding Tests-002): the Galaxy Repository's SQL surface
/// (<c>HierarchySql</c>, <c>AttributesSql</c>, <c>SELECT 1</c>,
/// <c>SELECT time_of_last_deploy FROM galaxy</c>) is entirely constant — no
/// <see cref="DiscoverHierarchyRequest"/> field is ever concatenated into a SQL
/// string. All filters (<c>TagNameGlob</c>, <c>RootTagName</c>, category ids,
/// template-chain filters, contained-path roots) are applied in memory by
/// <see cref="GalaxyHierarchyProjector"/> against the cached snapshot, so there is
/// no SQL-injection surface and no <c>LIKE</c>-escaping helper to test.
/// </para>
/// <para>
/// The genuine, testable concern is that adversarial filter strings — SQL
/// metacharacters (<c>'</c>, <c>;</c>) and <c>LIKE</c>-wildcards (<c>%</c>,
/// <c>_</c>) — are treated as opaque literals by the in-memory filter layer:
/// they must never act as wildcards, never throw, and never trigger catastrophic
/// regex backtracking in <see cref="GalaxyGlobMatcher"/>.
/// </para>
/// </summary>
public sealed class GalaxyFilterInputSafetyTests
{
private static readonly string[] AdversarialInputs =
[
"'",
"' OR '1'='1",
"'; DROP TABLE gobject;--",
"%",
"_",
"100%_off",
"[abc]",
"Pump'001",
];
public static TheoryData<string> AdversarialInputCases()
{
TheoryData<string> data = [];
foreach (string input in AdversarialInputs)
{
data.Add(input);
}
return data;
}
/// <summary>
/// Verifies <see cref="GalaxyGlobMatcher"/> treats SQL metacharacters and
/// <c>LIKE</c>-wildcards as literals — a glob equal to the literal value matches,
/// and the same glob does not spuriously match an unrelated value.
/// </summary>
[Theory]
[MemberData(nameof(AdversarialInputCases))]
public void GlobMatcher_TreatsSqlMetacharactersAsLiterals(string input)
{
Assert.True(
GalaxyGlobMatcher.IsMatch(input, input),
$"A glob equal to the literal value should match: {input}");
Assert.False(
GalaxyGlobMatcher.IsMatch("UnrelatedTagName", input),
$"Adversarial glob must not behave as a wildcard against unrelated text: {input}");
}
/// <summary>
/// Verifies the SQL <c>LIKE</c> wildcards <c>%</c> and <c>_</c> are NOT treated as
/// wildcards by the glob matcher; only <c>*</c> and <c>?</c> are glob wildcards.
/// </summary>
[Fact]
public void GlobMatcher_DoesNotTreatLikeWildcardsAsWildcards()
{
// '%' would match anything if interpreted as a SQL LIKE wildcard.
Assert.False(GalaxyGlobMatcher.IsMatch("Pump_001", "%"));
// '_' would match a single character if interpreted as a SQL LIKE wildcard.
Assert.False(GalaxyGlobMatcher.IsMatch("A", "_"));
Assert.True(GalaxyGlobMatcher.IsMatch("_", "_"));
// '*' and '?' remain glob wildcards.
Assert.True(GalaxyGlobMatcher.IsMatch("Pump_001", "Pump*"));
Assert.True(GalaxyGlobMatcher.IsMatch("Pump_001", "Pump_00?"));
}
/// <summary>
/// Verifies a pathological glob does not cause catastrophic regex backtracking —
/// <see cref="GalaxyGlobMatcher"/> escapes every literal character and applies a
/// 100 ms regex timeout, so a long adversarial input completes promptly.
/// </summary>
[Fact]
public void GlobMatcher_WithPathologicalInput_DoesNotHang()
{
string pathologicalGlob = new string('a', 5000) + "!";
string pathologicalValue = new string('a', 5000);
Stopwatch stopwatch = Stopwatch.StartNew();
bool matched = GalaxyGlobMatcher.IsMatch(pathologicalValue, pathologicalGlob);
stopwatch.Stop();
Assert.False(matched);
Assert.True(
stopwatch.Elapsed < TimeSpan.FromSeconds(2),
$"Glob matching took {stopwatch.ElapsedMilliseconds} ms — expected sub-second.");
}
/// <summary>
/// Verifies the <see cref="GalaxyHierarchyProjector"/> <c>TagNameGlob</c> filter
/// treats an adversarial glob as a literal: it never wildcard-matches the whole
/// hierarchy and never throws.
/// </summary>
[Theory]
[MemberData(nameof(AdversarialInputCases))]
public void Projector_TagNameGlob_WithAdversarialInput_DoesNotMatchEverything(string glob)
{
GalaxyHierarchyCacheEntry entry = CreateEntry(CreateObjects());
GalaxyHierarchyQueryResult result = GalaxyHierarchyProjector.Project(
entry,
new DiscoverHierarchyRequest { TagNameGlob = glob });
// None of the seeded tag names equal an adversarial string, so a correctly
// literal filter returns zero matches rather than the whole hierarchy.
Assert.Equal(0, result.TotalObjectCount);
Assert.Empty(result.Objects);
}
/// <summary>
/// Verifies an adversarial <c>RootTagName</c> resolves through the projector as a
/// literal — an exact-match lookup that finds nothing and surfaces NotFound,
/// never matching unrelated objects or throwing an unexpected exception.
/// </summary>
[Theory]
[MemberData(nameof(AdversarialInputCases))]
public void Projector_RootTagName_WithAdversarialInput_ThrowsNotFound(string rootTagName)
{
GalaxyHierarchyCacheEntry entry = CreateEntry(CreateObjects());
RpcException exception = Assert.Throws<RpcException>(
() => GalaxyHierarchyProjector.Project(
entry,
new DiscoverHierarchyRequest { RootTagName = rootTagName }));
Assert.Equal(StatusCode.NotFound, exception.StatusCode);
}
/// <summary>
/// Verifies an adversarial <c>TemplateChainContains</c> filter is a literal
/// substring test — it never matches unrelated template chains and never throws.
/// </summary>
[Theory]
[MemberData(nameof(AdversarialInputCases))]
public void Projector_TemplateChainContains_WithAdversarialInput_MatchesNothing(string filter)
{
GalaxyHierarchyCacheEntry entry = CreateEntry(CreateObjects());
DiscoverHierarchyRequest request = new();
request.TemplateChainContains.Add(filter);
GalaxyHierarchyQueryResult result = GalaxyHierarchyProjector.Project(entry, request);
Assert.Equal(0, result.TotalObjectCount);
}
/// <summary>
/// Verifies the <see cref="GalaxyRepositoryGrpcService.DiscoverHierarchy"/> RPC
/// handles an adversarial <c>TagNameGlob</c> end-to-end: the request succeeds with
/// zero matches rather than returning the whole hierarchy or faulting.
/// </summary>
[Theory]
[MemberData(nameof(AdversarialInputCases))]
public async Task DiscoverHierarchy_WithAdversarialTagNameGlob_ReturnsZeroMatches(string glob)
{
GalaxyRepositoryGrpcService service = CreateService(CreateEntry(CreateObjects()));
DiscoverHierarchyReply reply = await service.DiscoverHierarchy(
new DiscoverHierarchyRequest { TagNameGlob = glob, PageSize = 100 },
new TestServerCallContext());
Assert.Equal(0, reply.TotalObjectCount);
Assert.Empty(reply.Objects);
}
/// <summary>
/// Verifies the <see cref="GalaxyRepositoryGrpcService.DiscoverHierarchy"/> RPC
/// maps an adversarial <c>RootTagName</c> to NotFound rather than executing it as
/// a query fragment or matching unrelated objects.
/// </summary>
[Theory]
[MemberData(nameof(AdversarialInputCases))]
public async Task DiscoverHierarchy_WithAdversarialRootTagName_ReturnsNotFound(string rootTagName)
{
GalaxyRepositoryGrpcService service = CreateService(CreateEntry(CreateObjects()));
RpcException exception = await Assert.ThrowsAsync<RpcException>(
async () => await service.DiscoverHierarchy(
new DiscoverHierarchyRequest { RootTagName = rootTagName, PageSize = 100 },
new TestServerCallContext()));
Assert.Equal(StatusCode.NotFound, exception.StatusCode);
}
private static GalaxyRepositoryGrpcService CreateService(GalaxyHierarchyCacheEntry entry)
{
GalaxyRepositoryOptions options = new()
{
ConnectionString = "Server=localhost;Database=ZB;Integrated Security=True;Encrypt=False;",
};
return new GalaxyRepositoryGrpcService(
new MxGateway.Server.Galaxy.GalaxyRepository(options),
new StubGalaxyHierarchyCache(entry),
new GalaxyDeployNotifier(),
new GatewayRequestIdentityAccessor(),
NullLogger<GalaxyRepositoryGrpcService>.Instance);
}
private static GalaxyHierarchyCacheEntry CreateEntry(IReadOnlyList<GalaxyObject> objects)
{
return GalaxyHierarchyCacheEntry.Empty with
{
Status = GalaxyCacheStatus.Healthy,
Sequence = 1,
LastSuccessAt = DateTimeOffset.UtcNow,
Objects = objects,
Index = GalaxyHierarchyIndex.Build(objects),
DashboardSummary = DashboardGalaxySummary.Unknown with
{
Status = DashboardGalaxyStatus.Healthy,
ObjectCount = objects.Count,
},
ObjectCount = objects.Count,
};
}
private static IReadOnlyList<GalaxyObject> CreateObjects()
{
return
[
new GalaxyObject
{
GobjectId = 1,
TagName = "Area1",
ContainedName = "Area1",
BrowseName = "Area1",
IsArea = true,
CategoryId = 13,
},
new GalaxyObject
{
GobjectId = 2,
TagName = "Pump_001",
ContainedName = "Pump",
BrowseName = "Pump_001",
ParentGobjectId = 1,
CategoryId = 10,
TemplateChain = { "$Pump", "$Base" },
},
new GalaxyObject
{
GobjectId = 3,
TagName = "Valve_001",
ContainedName = "Valve",
BrowseName = "Valve_001",
ParentGobjectId = 1,
CategoryId = 11,
TemplateChain = { "$Valve" },
},
];
}
private sealed class StubGalaxyHierarchyCache(GalaxyHierarchyCacheEntry current) : IGalaxyHierarchyCache
{
public GalaxyHierarchyCacheEntry Current { get; } = current;
public Task RefreshAsync(CancellationToken cancellationToken) => Task.CompletedTask;
public Task WaitForFirstLoadAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}
private sealed class TestServerCallContext(CancellationToken cancellationToken = default) : ServerCallContext
{
private readonly Metadata requestHeaders = [];
private readonly Metadata responseTrailers = [];
private readonly Dictionary<object, object> userState = [];
private Status status;
private WriteOptions? writeOptions;
protected override string MethodCore => "/galaxy_repository.v1.GalaxyRepository/DiscoverHierarchy";
protected override string HostCore => "localhost";
protected override string PeerCore => "ipv4:127.0.0.1:5000";
protected override DateTime DeadlineCore => DateTime.UtcNow.AddMinutes(1);
protected override Metadata RequestHeadersCore => requestHeaders;
protected override CancellationToken CancellationTokenCore => cancellationToken;
protected override Metadata ResponseTrailersCore => responseTrailers;
protected override Status StatusCore
{
get => status;
set => status = value;
}
protected override WriteOptions? WriteOptionsCore
{
get => writeOptions;
set => writeOptions = value;
}
protected override AuthContext AuthContextCore { get; } = new(
string.Empty,
new Dictionary<string, List<AuthProperty>>(StringComparer.Ordinal));
protected override IDictionary<object, object> UserStateCore => userState;
protected override Task WriteResponseHeadersAsyncCore(Metadata responseHeaders) => Task.CompletedTask;
protected override ContextPropagationToken CreatePropagationTokenCore(ContextPropagationOptions? options)
{
throw new NotSupportedException();
}
}
}
@@ -47,16 +47,17 @@ public sealed class MxAccessGatewayServiceTests
Assert.Equal("operator-session", sessionManager.LastOpenRequest?.ClientSessionName);
}
/// <summary>Verifies that Invoke throws NotFound when the session does not exist.</summary>
/// <summary>
/// Verifies that Invoke maps a genuinely missing session to NotFound via the
/// service's own <c>ResolveSession</c> lookup. No <c>InvokeException</c> is
/// injected — <see cref="FakeSessionManager.ResolveOnlySeededSessions"/> makes
/// <c>TryGetSession</c> return false, so this test fails if the service drops
/// its missing-session check rather than passing for the wrong reason.
/// </summary>
[Fact]
public async Task Invoke_WhenSessionMissing_ThrowsNotFound()
{
FakeSessionManager sessionManager = new()
{
InvokeException = new SessionManagerException(
SessionManagerErrorCode.SessionNotFound,
"Session session-missing was not found."),
};
FakeSessionManager sessionManager = new() { ResolveOnlySeededSessions = true };
MxAccessGatewayService service = CreateService(sessionManager);
RpcException exception = await Assert.ThrowsAsync<RpcException>(
@@ -66,6 +67,76 @@ public sealed class MxAccessGatewayServiceTests
Assert.Equal(StatusCode.NotFound, exception.StatusCode);
Assert.Contains("session-missing", exception.Status.Detail, StringComparison.Ordinal);
// The service must reject before delegating to the session manager.
Assert.Equal(0, sessionManager.InvokeCount);
}
/// <summary>
/// Verifies that Invoke resolves a session that was seeded into the session
/// manager when <see cref="FakeSessionManager.ResolveOnlySeededSessions"/> is on,
/// confirming the missing-session test above is gated on a real lookup.
/// </summary>
[Fact]
public async Task Invoke_WhenSessionSeeded_ResolvesAndInvokes()
{
FakeSessionManager sessionManager = new() { ResolveOnlySeededSessions = true };
sessionManager.SeedSession(CreateSession("session-1", processId: 1234));
MxAccessGatewayService service = CreateService(sessionManager);
MxCommandReply reply = await service.Invoke(
CreatePingRequest("session-1"),
new TestServerCallContext());
Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
Assert.Equal(1, sessionManager.InvokeCount);
}
/// <summary>
/// Verifies that AcknowledgeAlarm maps a genuinely missing session to NotFound via
/// the service's own <c>ResolveSession</c> lookup rather than an injected exception.
/// </summary>
[Fact]
public async Task AcknowledgeAlarm_WhenSessionMissing_ThrowsNotFound()
{
FakeSessionManager sessionManager = new() { ResolveOnlySeededSessions = true };
MxAccessGatewayService service = CreateService(sessionManager);
RpcException exception = await Assert.ThrowsAsync<RpcException>(
async () => await service.AcknowledgeAlarm(
new AcknowledgeAlarmRequest
{
SessionId = "session-missing",
AlarmFullReference = "Tank01.Level.HiHi",
OperatorUser = "alice",
},
new TestServerCallContext()));
Assert.Equal(StatusCode.NotFound, exception.StatusCode);
Assert.Contains("session-missing", exception.Status.Detail, StringComparison.Ordinal);
}
/// <summary>
/// Verifies that QueryActiveAlarms maps a genuinely missing session to NotFound via
/// the service's own <c>ResolveSession</c> lookup rather than an injected exception.
/// </summary>
[Fact]
public async Task QueryActiveAlarms_WhenSessionMissing_ThrowsNotFound()
{
FakeSessionManager sessionManager = new() { ResolveOnlySeededSessions = true };
MxAccessGatewayService service = CreateService(sessionManager);
RpcException exception = await Assert.ThrowsAsync<RpcException>(
async () => await service.QueryActiveAlarms(
new QueryActiveAlarmsRequest
{
SessionId = "session-missing",
AlarmFilterPrefix = "Tank01.",
},
new RecordingStreamWriter<ActiveAlarmSnapshot>(),
new TestServerCallContext()));
Assert.Equal(StatusCode.NotFound, exception.StatusCode);
Assert.Contains("session-missing", exception.Status.Detail, StringComparison.Ordinal);
}
/// <summary>Verifies that Invoke throws InvalidArgument and does not invoke the session manager when payload is mismatched.</summary>
@@ -425,9 +496,26 @@ public sealed class MxAccessGatewayServiceTests
private sealed class FakeSessionManager : ISessionManager
{
private readonly Dictionary<string, GatewaySession> seededSessions = new(StringComparer.Ordinal);
/// <summary>The session to return from OpenSessionAsync.</summary>
public GatewaySession? OpenSessionResult { get; init; }
/// <summary>
/// When true, <see cref="TryGetSession"/> only resolves sessions that have been
/// explicitly seeded via <see cref="SeedSession"/> (or <see cref="OpenSessionResult"/>),
/// and returns false for any other id. This exercises the gateway service's own
/// missing-session handling instead of masking it with a synthesized session.
/// </summary>
public bool ResolveOnlySeededSessions { get; init; }
/// <summary>Registers a session so <see cref="TryGetSession"/> resolves its id.</summary>
/// <param name="session">Session to register by its <see cref="GatewaySession.SessionId"/>.</param>
public void SeedSession(GatewaySession session)
{
seededSessions[session.SessionId] = session;
}
/// <summary>The last OpenSessionAsync request captured.</summary>
public SessionOpenRequest? LastOpenRequest { get; private set; }
@@ -484,6 +572,18 @@ public sealed class MxAccessGatewayServiceTests
string sessionId,
out GatewaySession session)
{
if (seededSessions.TryGetValue(sessionId, out GatewaySession? seeded))
{
session = seeded;
return true;
}
if (ResolveOnlySeededSessions)
{
session = null!;
return false;
}
session = OpenSessionResult ?? CreateSession(sessionId, processId: 1234);
return true;
}
@@ -1,7 +1,6 @@
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using MxGateway.Contracts.Proto;
using MxGateway.Worker.MxAccess;
@@ -77,13 +76,11 @@ public sealed class AlarmsLiveSmokeTests
Log($"Pump duration: {PumpDuration.TotalSeconds:F0}s; transition wait timeout: {TransitionWaitTimeout.TotalSeconds:F0}s");
MxAccessEventQueue queue = new MxAccessEventQueue();
// pollIntervalMs=0 disables the internal Timer; we drive PollOnce
// manually from the STA below to avoid threadpool→STA marshaling
// (the wnwrap COM is ThreadingModel=Apartment, and this test
// doesn't run a Win32 message pump on its STA).
// The consumer owns no internal timer; we drive PollOnce manually
// from the STA below (the wnwrap COM is ThreadingModel=Apartment,
// and this test doesn't run a Win32 message pump on its STA).
WnWrapAlarmConsumer consumer = new WnWrapAlarmConsumer(
new WNWRAPCONSUMERLib.wwAlarmConsumerClass(),
pollIntervalMilliseconds: 0,
maxAlarmsPerFetch: 1024);
MxAccessAlarmEventSink sink = new MxAccessAlarmEventSink(queue, new MxAccessEventMapper());
using AlarmDispatcher dispatcher = new AlarmDispatcher(consumer, sink, SessionId);
@@ -92,13 +89,10 @@ public sealed class AlarmsLiveSmokeTests
dispatcher.Subscribe(SubscriptionExpression);
Log("Subscribe -> ok. Driving PollOnce manually from this STA...");
// The wnwrap COM object is ThreadingModel=Apartment. The consumer's
// internal Timer would fire on a threadpool thread and deadlock on
// cross-apartment marshaling without a Win32 message pump. For the
// smoke test we constructed the consumer with pollIntervalMs=0
// (Timer disabled) and drive PollOnce manually here on the STA.
// Production hosting will route polls through the worker's
// StaRuntime in a follow-up PR.
// The wnwrap COM object is ThreadingModel=Apartment. The consumer
// owns no internal timer, so we drive PollOnce manually here on the
// STA. Production hosting routes polls through the worker's
// StaRuntime.
// 1. Wait for the first transition (any kind), then keep waiting
// for one with kind=Raise so the alarm is currently Active when
@@ -77,7 +77,9 @@ public sealed class WorkerPipeClientTests
},
});
WorkerEnvelope shutdownAck = await reader.ReadAsync();
WorkerEnvelope shutdownAck = await ReadUntilAsync(
reader,
WorkerEnvelope.BodyOneofCase.WorkerShutdownAck);
Assert.Equal(WorkerEnvelope.BodyOneofCase.WorkerShutdownAck, shutdownAck.BodyCase);
await clientTask;
}
@@ -120,7 +122,9 @@ public sealed class WorkerPipeClientTests
Assert.Equal(WorkerEnvelope.BodyOneofCase.WorkerReady, (await reader.ReadAsync()).BodyCase);
await writer.WriteAsync(CreateShutdown());
Assert.Equal(WorkerEnvelope.BodyOneofCase.WorkerShutdownAck, (await reader.ReadAsync()).BodyCase);
Assert.Equal(
WorkerEnvelope.BodyOneofCase.WorkerShutdownAck,
(await ReadUntilAsync(reader, WorkerEnvelope.BodyOneofCase.WorkerShutdownAck)).BodyCase);
await clientTask;
}
@@ -143,6 +147,25 @@ public sealed class WorkerPipeClientTests
await Assert.ThrowsAsync<TimeoutException>(async () => await client.RunAsync(workerOptions));
}
/// <summary>
/// Reads frames until one matching the expected body case is found,
/// skipping interleaved heartbeats (the first heartbeat is emitted
/// immediately on entering the heartbeat loop — see Worker-002).
/// </summary>
private static async Task<WorkerEnvelope> ReadUntilAsync(
WorkerFrameReader reader,
WorkerEnvelope.BodyOneofCase expectedBody)
{
while (true)
{
WorkerEnvelope envelope = await reader.ReadAsync();
if (envelope.BodyCase == expectedBody)
{
return envelope;
}
}
}
private static WorkerPipeSession CreateSession(
Stream stream,
WorkerFrameProtocolOptions options)
@@ -383,16 +383,126 @@ public sealed class WorkerPipeSessionTests
await pipePair.GatewayWriter
.WriteAsync(CreateShutdownEnvelope(), cancellation.Token);
WorkerEnvelope firstEnvelopeAfterShutdown = await pipePair.GatewayReader
.ReadAsync(cancellation.Token);
// The first heartbeat is emitted immediately on entering the loop
// (Worker-002), so skip any interleaved heartbeats; the late fault
// must still be dropped — no WorkerFault may precede the ack.
WorkerEnvelope envelopeAfterShutdown;
do
{
envelopeAfterShutdown = await pipePair.GatewayReader.ReadAsync(cancellation.Token);
Assert.NotEqual(
WorkerEnvelope.BodyOneofCase.WorkerFault,
envelopeAfterShutdown.BodyCase);
}
while (envelopeAfterShutdown.BodyCase == WorkerEnvelope.BodyOneofCase.WorkerHeartbeat);
Assert.Equal(WorkerEnvelope.BodyOneofCase.WorkerShutdownAck, firstEnvelopeAfterShutdown.BodyCase);
Assert.Equal(ProtocolStatusCode.Ok, firstEnvelopeAfterShutdown.WorkerShutdownAck.Status.Code);
Assert.Equal(WorkerEnvelope.BodyOneofCase.WorkerShutdownAck, envelopeAfterShutdown.BodyCase);
Assert.Equal(ProtocolStatusCode.Ok, envelopeAfterShutdown.WorkerShutdownAck.Status.Code);
Task completedTask = await Task.WhenAny(runTask, Task.Delay(TimeSpan.FromSeconds(2), cancellation.Token));
Assert.Same(runTask, completedTask);
await runTask;
}
/// <summary>
/// Worker-002 regression: the first heartbeat must be emitted
/// immediately on entering the heartbeat loop, not after a full
/// HeartbeatInterval. A long interval is configured so a delay-first
/// loop would fail to deliver a heartbeat inside the assertion window.
/// </summary>
[Fact]
public async Task RunAsync_SendsFirstHeartbeatImmediatelyOnEnteringLoop()
{
using CancellationTokenSource cancellation = new(TimeSpan.FromSeconds(10));
using PipePair pipePair = await PipePair.CreateAsync(cancellation.Token);
FakeRuntimeSession runtime = new();
WorkerPipeSession session = CreatePipeSession(
pipePair.WorkerStream,
runtime,
new WorkerPipeSessionOptions
{
// A deliberately long interval: a delay-before-first-beat
// loop would not produce a heartbeat for 30s.
HeartbeatInterval = TimeSpan.FromSeconds(30),
HeartbeatGrace = TimeSpan.FromSeconds(60),
});
Task runTask = session.RunAsync(cancellation.Token);
await CompleteGatewayHandshakeAsync(pipePair, cancellation.Token);
DateTimeOffset start = DateTimeOffset.UtcNow;
using CancellationTokenSource heartbeatWait = CancellationTokenSource
.CreateLinkedTokenSource(cancellation.Token);
heartbeatWait.CancelAfter(TimeSpan.FromSeconds(5));
WorkerEnvelope heartbeat = await ReadUntilAsync(
pipePair.GatewayReader,
WorkerEnvelope.BodyOneofCase.WorkerHeartbeat,
heartbeatWait.Token);
TimeSpan elapsed = DateTimeOffset.UtcNow - start;
Assert.Equal(WorkerEnvelope.BodyOneofCase.WorkerHeartbeat, heartbeat.BodyCase);
Assert.True(
elapsed < TimeSpan.FromSeconds(5),
$"First heartbeat took {elapsed}, expected well under the 30s interval.");
await SendShutdownAndWaitAsync(pipePair, runTask, cancellation.Token);
}
/// <summary>
/// Worker-003 regression: when a command completes after the worker
/// has transitioned out of a command-serving state, the dropped
/// reply must be logged with a diagnostic rather than discarded
/// silently, so a stuck gateway correlation wait can be traced.
/// </summary>
[Fact]
public async Task RunAsync_WhenReplyIsDroppedAfterShutdown_LogsDiagnostic()
{
using CancellationTokenSource cancellation = new(TimeSpan.FromSeconds(10));
using PipePair pipePair = await PipePair.CreateAsync(cancellation.Token);
FakeRuntimeSession runtime = new()
{
BlockDispatch = true,
};
RecordingWorkerLogger logger = new();
WorkerFrameProtocolOptions options = CreateOptions();
WorkerPipeSession session = new(
new WorkerFrameReader(pipePair.WorkerStream, options),
new WorkerFrameWriter(pipePair.WorkerStream, options),
options,
() => 1234,
new WorkerPipeSessionOptions
{
HeartbeatInterval = TimeSpan.FromSeconds(1),
HeartbeatGrace = TimeSpan.FromSeconds(5),
},
() => runtime,
logger);
Task runTask = session.RunAsync(cancellation.Token);
await CompleteGatewayHandshakeAsync(pipePair, cancellation.Token);
await pipePair.GatewayWriter.WriteAsync(
CreateCommandEnvelope("command-dropped-after-shutdown"),
cancellation.Token);
Assert.True(runtime.DispatchStarted.Wait(TimeSpan.FromSeconds(2)));
await pipePair.GatewayWriter
.WriteAsync(CreateShutdownEnvelope(), cancellation.Token);
WorkerEnvelope shutdownAck = await ReadUntilAsync(
pipePair.GatewayReader,
WorkerEnvelope.BodyOneofCase.WorkerShutdownAck,
cancellation.Token);
Assert.Equal(ProtocolStatusCode.Ok, shutdownAck.WorkerShutdownAck.Status.Code);
Task completedTask = await Task.WhenAny(runTask, Task.Delay(TimeSpan.FromSeconds(3), cancellation.Token));
Assert.Same(runTask, completedTask);
await runTask;
Assert.Contains(
logger.Events,
entry => entry.EventName == "WorkerCommandResultDropped"
&& entry.Fields.TryGetValue("correlation_id", out object? correlationId)
&& (string?)correlationId == "command-dropped-after-shutdown");
}
private static WorkerPipeSession CreateSession(
Stream inbound,
Stream outbound,
@@ -619,6 +729,69 @@ public sealed class WorkerPipeSessionTests
buffer[3] = (byte)(value >> 24);
}
private sealed class RecordingWorkerLogger : MxGateway.Worker.Bootstrap.IWorkerLogger
{
private readonly object gate = new();
private readonly List<LogEntry> events = new();
/// <summary>Gets a snapshot of the recorded log entries.</summary>
public IReadOnlyList<LogEntry> Events
{
get
{
lock (gate)
{
return new List<LogEntry>(events);
}
}
}
/// <summary>Records an informational log event.</summary>
public void Information(string eventName, IReadOnlyDictionary<string, object?> fields)
{
Record(eventName, fields);
}
/// <summary>Records an error log event.</summary>
public void Error(string eventName, IReadOnlyDictionary<string, object?> fields)
{
Record(eventName, fields);
}
private void Record(string eventName, IReadOnlyDictionary<string, object?> fields)
{
Dictionary<string, object?> copy = new();
foreach (KeyValuePair<string, object?> field in fields)
{
copy[field.Key] = field.Value;
}
lock (gate)
{
events.Add(new LogEntry(eventName, copy));
}
}
/// <summary>A single recorded log entry.</summary>
public sealed class LogEntry
{
/// <summary>Initializes a recorded log entry.</summary>
/// <param name="eventName">The log event name.</param>
/// <param name="fields">The log event fields.</param>
public LogEntry(string eventName, IReadOnlyDictionary<string, object?> fields)
{
EventName = eventName;
Fields = fields;
}
/// <summary>Gets the log event name.</summary>
public string EventName { get; }
/// <summary>Gets the log event fields.</summary>
public IReadOnlyDictionary<string, object?> Fields { get; }
}
}
private sealed class FakeRuntimeSession : IWorkerRuntimeSession
{
private readonly ManualResetEventSlim releaseDispatch = new(false);
@@ -0,0 +1,167 @@
using System;
using ArchestrA.MxAccess;
using MxGateway.Contracts.Proto;
using MxGateway.Worker.MxAccess;
using ComMxDataType = ArchestrA.MxAccess.MxDataType;
namespace MxGateway.Worker.Tests.MxAccess;
/// <summary>
/// Integrated tests for <see cref="MxAccessBaseEventSink"/>: drive an MXAccess COM
/// event through the real sink → <see cref="MxAccessEventMapper"/> →
/// <see cref="MxAccessEventQueue"/> pipeline and assert a correctly-converted
/// protobuf <see cref="WorkerEvent"/> lands in the queue.
/// </summary>
/// <remarks>
/// Boundary: the COM-side <c>+=</c> subscription performed in
/// <see cref="MxAccessBaseEventSink.Attach"/> casts the supplied object to the
/// sealed <c>LMXProxyServerClass</c> RCW and cannot run without a live MXAccess COM
/// object, so <c>Attach</c>/<c>Detach</c> are not exercised here. The event
/// handlers themselves (<c>OnDataChange</c>, <c>OnWriteComplete</c>,
/// <c>OperationComplete</c>, <c>OnBufferedDataChange</c>) are the exact delegate
/// targets the COM runtime invokes; calling them directly reproduces an STA-thread
/// COM callback and exercises the genuine conversion + enqueue path. The
/// <c>sessionId</c> normally set by <c>Attach</c> defaults to empty here, which the
/// assertions account for. The COM-event-conversion fault branch is left to
/// <see cref="MxAccessEventMapperTests"/> and the queue's own fault tests.
/// </remarks>
public sealed class MxAccessBaseEventSinkTests
{
/// <summary>
/// Verifies that an OnDataChange COM callback converts to a protobuf event and lands in the queue.
/// </summary>
[Fact]
public void OnDataChange_ComCallback_ConvertedEventLandsInQueue()
{
MxAccessEventQueue queue = new();
MxAccessBaseEventSink sink = new(queue, new MxAccessEventMapper());
DateTime timestamp = new(2026, 5, 18, 9, 15, 0, DateTimeKind.Utc);
MXSTATUS_PROXY[] statuses = Array.Empty<MXSTATUS_PROXY>();
sink.OnDataChange(
hLMXServerHandle: 7,
phItemHandle: 21,
pvItemValue: 1234,
pwItemQuality: 192,
pftItemTimeStamp: timestamp,
ref statuses);
Assert.Equal(1, queue.Count);
Assert.Equal(1UL, queue.LastEventSequence);
Assert.True(queue.TryDequeue(out WorkerEvent? workerEvent));
Assert.NotNull(workerEvent);
MxEvent mxEvent = workerEvent!.Event;
Assert.Equal(MxEventFamily.OnDataChange, mxEvent.Family);
Assert.Equal(MxEvent.BodyOneofCase.OnDataChange, mxEvent.BodyCase);
Assert.Equal(7, mxEvent.ServerHandle);
Assert.Equal(21, mxEvent.ItemHandle);
Assert.Equal(1234, mxEvent.Value.Int32Value);
Assert.Equal(192, mxEvent.Quality);
Assert.Equal(timestamp, mxEvent.SourceTimestamp.ToDateTime());
Assert.Equal(1UL, mxEvent.WorkerSequence);
Assert.NotNull(mxEvent.WorkerTimestamp);
}
/// <summary>
/// Verifies that consecutive OnDataChange callbacks land in the queue with monotonic sequences.
/// </summary>
[Fact]
public void OnDataChange_MultipleComCallbacks_QueueAssignsMonotonicSequences()
{
MxAccessEventQueue queue = new();
MxAccessBaseEventSink sink = new(queue, new MxAccessEventMapper());
MXSTATUS_PROXY[] statuses = Array.Empty<MXSTATUS_PROXY>();
sink.OnDataChange(1, 10, 100, 192, DateTime.UtcNow, ref statuses);
sink.OnDataChange(1, 11, 200, 192, DateTime.UtcNow, ref statuses);
sink.OnDataChange(1, 12, 300, 192, DateTime.UtcNow, ref statuses);
Assert.Equal(3, queue.Count);
Assert.Equal(3UL, queue.LastEventSequence);
Assert.True(queue.TryDequeue(out WorkerEvent? first));
Assert.True(queue.TryDequeue(out WorkerEvent? second));
Assert.True(queue.TryDequeue(out WorkerEvent? third));
Assert.Equal(1UL, first!.Event.WorkerSequence);
Assert.Equal(2UL, second!.Event.WorkerSequence);
Assert.Equal(3UL, third!.Event.WorkerSequence);
Assert.Equal(10, first.Event.ItemHandle);
Assert.Equal(12, third.Event.ItemHandle);
}
/// <summary>
/// Verifies that an OnWriteComplete COM callback lands in the queue with the correct family.
/// </summary>
[Fact]
public void OnWriteComplete_ComCallback_ConvertedEventLandsInQueue()
{
MxAccessEventQueue queue = new();
MxAccessBaseEventSink sink = new(queue, new MxAccessEventMapper());
MXSTATUS_PROXY[] statuses = Array.Empty<MXSTATUS_PROXY>();
sink.OnWriteComplete(hLMXServerHandle: 3, phItemHandle: 9, ref statuses);
Assert.Equal(1, queue.Count);
Assert.True(queue.TryDequeue(out WorkerEvent? workerEvent));
MxEvent mxEvent = workerEvent!.Event;
Assert.Equal(MxEventFamily.OnWriteComplete, mxEvent.Family);
Assert.Equal(MxEvent.BodyOneofCase.OnWriteComplete, mxEvent.BodyCase);
Assert.Equal(3, mxEvent.ServerHandle);
Assert.Equal(9, mxEvent.ItemHandle);
Assert.Equal(1UL, mxEvent.WorkerSequence);
}
/// <summary>
/// Verifies that an OperationComplete COM callback lands in the queue with the correct family.
/// </summary>
[Fact]
public void OperationComplete_ComCallback_ConvertedEventLandsInQueue()
{
MxAccessEventQueue queue = new();
MxAccessBaseEventSink sink = new(queue, new MxAccessEventMapper());
MXSTATUS_PROXY[] statuses = Array.Empty<MXSTATUS_PROXY>();
sink.OperationComplete(hLMXServerHandle: 4, phItemHandle: 8, ref statuses);
Assert.Equal(1, queue.Count);
Assert.True(queue.TryDequeue(out WorkerEvent? workerEvent));
MxEvent mxEvent = workerEvent!.Event;
Assert.Equal(MxEventFamily.OperationComplete, mxEvent.Family);
Assert.Equal(MxEvent.BodyOneofCase.OperationComplete, mxEvent.BodyCase);
Assert.Equal(4, mxEvent.ServerHandle);
Assert.Equal(8, mxEvent.ItemHandle);
}
/// <summary>
/// Verifies that an OnBufferedDataChange COM callback converts the value and lands in the queue.
/// </summary>
[Fact]
public void OnBufferedDataChange_ComCallback_ConvertedEventLandsInQueue()
{
MxAccessEventQueue queue = new();
MxAccessBaseEventSink sink = new(queue, new MxAccessEventMapper());
MXSTATUS_PROXY[] statuses = Array.Empty<MXSTATUS_PROXY>();
// Raw MXAccess data-type code 2 == Integer (see MxAccessEventMapper.MapMxDataType).
const int integerDataTypeCode = 2;
sink.OnBufferedDataChange(
hLMXServerHandle: 5,
phItemHandle: 13,
dtDataType: (ComMxDataType)integerDataTypeCode,
pvItemValue: 77,
pwItemQuality: 192,
pftItemTimeStamp: DateTime.UtcNow,
ref statuses);
Assert.Equal(1, queue.Count);
Assert.True(queue.TryDequeue(out WorkerEvent? workerEvent));
MxEvent mxEvent = workerEvent!.Event;
Assert.Equal(MxEventFamily.OnBufferedDataChange, mxEvent.Family);
Assert.Equal(MxEvent.BodyOneofCase.OnBufferedDataChange, mxEvent.BodyCase);
Assert.Equal(5, mxEvent.ServerHandle);
Assert.Equal(13, mxEvent.ItemHandle);
Assert.Equal(integerDataTypeCode, mxEvent.OnBufferedDataChange.RawDataType);
}
}
@@ -1,5 +1,7 @@
using System;
using System.Linq;
using System.Reflection;
using System.Threading;
using MxGateway.Worker.MxAccess;
namespace MxGateway.Worker.Tests.MxAccess;
@@ -109,4 +111,42 @@ public sealed class WnWrapAlarmConsumerXmlTests
Assert.False(WnWrapAlarmConsumer.TryParseHexGuid(hex, out Guid guid));
Assert.Equal(Guid.Empty, guid);
}
/// <summary>
/// Worker-001 regression: the consumer must own no internal
/// <see cref="Timer"/>. A thread-pool timer calling the
/// apartment-threaded wnwrap COM object off its owning STA can
/// deadlock on cross-apartment marshaling, so the timer field and
/// callback must not exist on the type.
/// </summary>
[Fact]
public void WnWrapAlarmConsumer_has_no_internal_timer_field()
{
FieldInfo[] fields = typeof(WnWrapAlarmConsumer)
.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
Assert.DoesNotContain(fields, field => field.FieldType == typeof(Timer));
Assert.Null(typeof(WnWrapAlarmConsumer).GetMethod(
"OnPoll",
BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic));
}
/// <summary>
/// Worker-001 regression: no public constructor may accept a
/// poll-interval parameter. A non-zero poll interval was the only
/// way to arm the off-STA timer; removing the parameter makes the
/// footgun structurally unreachable.
/// </summary>
[Fact]
public void WnWrapAlarmConsumer_exposes_no_poll_interval_constructor_parameter()
{
foreach (ConstructorInfo constructor in typeof(WnWrapAlarmConsumer)
.GetConstructors(BindingFlags.Instance | BindingFlags.Public))
{
Assert.DoesNotContain(
constructor.GetParameters(),
parameter => parameter.Name is not null
&& parameter.Name.IndexOf("poll", StringComparison.OrdinalIgnoreCase) >= 0);
}
}
}
@@ -0,0 +1,260 @@
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using MxGateway.Worker.Sta;
namespace MxGateway.Worker.Tests.Sta;
/// <summary>
/// Tests for <see cref="StaMessagePump"/>.
/// </summary>
/// <remarks>
/// Boundary: the <c>MsgWaitFailed</c> failure branch of <c>WaitForWorkOrMessages</c>
/// is not exercised. Forcing <c>MsgWaitForMultipleObjectsEx</c> to fail requires
/// passing a deliberately invalid native handle, which is unsafe to construct in a
/// managed test and can corrupt the thread's wait state. The other behavior — null
/// argument validation, waking on a signalled event, returning on timeout, the
/// timeout conversion edge cases observable through wait latency, and the
/// pump's drain count — is covered directly here.
/// </remarks>
public sealed class StaMessagePumpTests
{
/// <summary>
/// Verifies that WaitForWorkOrMessages throws ArgumentNullException for a null wake event.
/// </summary>
[Fact]
public void WaitForWorkOrMessages_NullWakeEvent_ThrowsArgumentNullException()
{
StaMessagePump pump = new();
ArgumentNullException exception = Assert.Throws<ArgumentNullException>(
() => pump.WaitForWorkOrMessages(null!, TimeSpan.FromMilliseconds(10)));
Assert.Equal("commandWakeEvent", exception.ParamName);
}
/// <summary>
/// Verifies that WaitForWorkOrMessages returns promptly when the wake event is already signalled.
/// </summary>
[Fact]
public async Task WaitForWorkOrMessages_WakeEventAlreadySignalled_ReturnsImmediately()
{
StaMessagePump pump = new();
using ManualResetEventSlim wakeEvent = new(initialState: true);
await RunOnStaThreadAsync(() =>
{
Stopwatch stopwatch = Stopwatch.StartNew();
pump.WaitForWorkOrMessages(wakeEvent.WaitHandle, TimeSpan.FromSeconds(30));
stopwatch.Stop();
// A 30s timeout was supplied; returning quickly proves the signalled
// wake handle — not the timeout — ended the wait.
Assert.True(
stopwatch.Elapsed < TimeSpan.FromSeconds(5),
$"Wait took {stopwatch.Elapsed}; a pre-signalled wake event should return immediately.");
});
}
/// <summary>
/// Verifies that WaitForWorkOrMessages wakes when the wake event is signalled from another thread.
/// </summary>
[Fact]
public async Task WaitForWorkOrMessages_WakeEventSignalledDuringWait_Returns()
{
StaMessagePump pump = new();
using ManualResetEventSlim wakeEvent = new(initialState: false);
Task signalTask = Task.Run(async () =>
{
await Task.Delay(150, CancellationToken.None);
wakeEvent.Set();
});
await RunOnStaThreadAsync(() =>
{
Stopwatch stopwatch = Stopwatch.StartNew();
pump.WaitForWorkOrMessages(wakeEvent.WaitHandle, TimeSpan.FromSeconds(30));
stopwatch.Stop();
Assert.True(
stopwatch.Elapsed < TimeSpan.FromSeconds(10),
$"Wait took {stopwatch.Elapsed}; signalling the wake event should end the 30s wait early.");
});
await signalTask;
}
/// <summary>
/// Verifies that WaitForWorkOrMessages returns on timeout when the wake event is never signalled.
/// </summary>
[Fact]
public async Task WaitForWorkOrMessages_WakeEventNeverSignalled_ReturnsAfterTimeout()
{
StaMessagePump pump = new();
using ManualResetEventSlim wakeEvent = new(initialState: false);
await RunOnStaThreadAsync(() =>
{
Stopwatch stopwatch = Stopwatch.StartNew();
pump.WaitForWorkOrMessages(wakeEvent.WaitHandle, TimeSpan.FromMilliseconds(150));
stopwatch.Stop();
// The wait must end of its own accord (timeout). Lower bound is loose
// because the timeout converts via Math.Ceiling and the OS scheduler
// adds slack; upper bound proves it is not waiting indefinitely.
Assert.True(
stopwatch.Elapsed < TimeSpan.FromSeconds(10),
$"Wait took {stopwatch.Elapsed}; a 150ms timeout should end the wait without a signal.");
});
}
/// <summary>
/// Verifies that a zero timeout (the TimeSpan.Zero conversion branch) returns without blocking.
/// </summary>
[Fact]
public async Task WaitForWorkOrMessages_ZeroTimeout_ReturnsWithoutBlocking()
{
StaMessagePump pump = new();
using ManualResetEventSlim wakeEvent = new(initialState: false);
await RunOnStaThreadAsync(() =>
{
Stopwatch stopwatch = Stopwatch.StartNew();
// TimeSpan.Zero exercises the "<= Zero -> 0 ms" conversion branch:
// MsgWaitForMultipleObjectsEx polls and returns immediately.
pump.WaitForWorkOrMessages(wakeEvent.WaitHandle, TimeSpan.Zero);
stopwatch.Stop();
Assert.True(
stopwatch.Elapsed < TimeSpan.FromSeconds(2),
$"Wait took {stopwatch.Elapsed}; a zero timeout must not block.");
});
}
/// <summary>
/// Verifies that PumpPendingMessages returns zero when the STA thread message queue is empty.
/// </summary>
[Fact]
public async Task PumpPendingMessages_NoMessagesPosted_ReturnsZero()
{
StaMessagePump pump = new();
int pumped = await RunOnStaThreadAsync(() =>
{
// Drain anything the apartment/thread start posted, then measure a clean queue.
pump.PumpPendingMessages();
return pump.PumpPendingMessages();
});
Assert.Equal(0, pumped);
}
/// <summary>
/// Verifies that PumpPendingMessages dispatches and counts messages posted to the STA thread.
/// </summary>
[Fact]
public async Task PumpPendingMessages_MessagesPostedToStaThread_ReturnsCountProcessed()
{
StaMessagePump pump = new();
int pumped = await RunOnStaThreadAsync(() =>
{
// Clear any startup messages so the count reflects only what we post.
pump.PumpPendingMessages();
uint threadId = GetCurrentThreadId();
Assert.True(PostThreadMessage(threadId, WmNull, UIntPtr.Zero, IntPtr.Zero));
Assert.True(PostThreadMessage(threadId, WmNull, UIntPtr.Zero, IntPtr.Zero));
Assert.True(PostThreadMessage(threadId, WmNull, UIntPtr.Zero, IntPtr.Zero));
return pump.PumpPendingMessages();
});
Assert.Equal(3, pumped);
}
/// <summary>
/// Verifies that WaitForWorkOrMessages returns once a Windows message is posted to the STA thread.
/// </summary>
[Fact]
public async Task WaitForWorkOrMessages_WindowsMessagePosted_ReturnsForInputAvailable()
{
StaMessagePump pump = new();
using ManualResetEventSlim wakeEvent = new(initialState: false);
using ManualResetEventSlim threadReady = new(initialState: false);
uint staThreadId = 0;
Task staTask = RunOnStaThreadAsync(() =>
{
staThreadId = GetCurrentThreadId();
pump.PumpPendingMessages();
threadReady.Set();
Stopwatch stopwatch = Stopwatch.StartNew();
// The wake event is never signalled. Only the posted Windows message
// (QS_ALLINPUT wake mask) can end this 30s wait early.
pump.WaitForWorkOrMessages(wakeEvent.WaitHandle, TimeSpan.FromSeconds(30));
stopwatch.Stop();
Assert.True(
stopwatch.Elapsed < TimeSpan.FromSeconds(10),
$"Wait took {stopwatch.Elapsed}; a posted Windows message should wake the pump.");
});
Assert.True(threadReady.Wait(TimeSpan.FromSeconds(5)), "STA thread did not start.");
await Task.Delay(100, CancellationToken.None);
Assert.True(
PostThreadMessage(staThreadId, WmNull, UIntPtr.Zero, IntPtr.Zero),
"Failed to post a Windows message to the STA thread.");
await staTask;
}
private const uint WmNull = 0x0000;
/// <summary>Runs an action on a dedicated STA thread and returns when it completes.</summary>
private static Task RunOnStaThreadAsync(Action action)
{
return RunOnStaThreadAsync(() =>
{
action();
return 0;
});
}
/// <summary>Runs a function on a dedicated STA thread and returns its result.</summary>
private static Task<T> RunOnStaThreadAsync<T>(Func<T> function)
{
TaskCompletionSource<T> completion = new();
Thread thread = new(() =>
{
try
{
completion.SetResult(function());
}
catch (Exception exception)
{
completion.SetException(exception);
}
})
{
IsBackground = true,
};
thread.SetApartmentState(ApartmentState.STA);
thread.Start();
return completion.Task;
}
[System.Runtime.InteropServices.DllImport("kernel32.dll")]
private static extern uint GetCurrentThreadId();
[System.Runtime.InteropServices.DllImport("user32.dll", SetLastError = true)]
private static extern bool PostThreadMessage(
uint threadId,
uint message,
UIntPtr wParam,
IntPtr lParam);
}
+38 -2
View File
@@ -29,7 +29,11 @@ public sealed class WorkerPipeSession
private readonly HashSet<Task> _activeCommandTasks = new();
private IWorkerRuntimeSession? _runtimeSession;
private long _nextSequence;
private WorkerState _state = WorkerState.Starting;
// Mutated from the message loop, command tasks, the heartbeat loop and the
// shutdown path; volatile so cross-thread reads observe the latest state
// without tearing (WorkerState is an int-backed protobuf enum).
private volatile WorkerState _state = WorkerState.Starting;
private bool _acceptingCommands = true;
private bool _watchdogFaultSent;
private bool _shutdownTimedOut;
@@ -398,6 +402,7 @@ public sealed class WorkerPipeSession
MxCommandReply reply = await runtimeSession.DispatchAsync(staCommand).ConfigureAwait(false);
if (_state is not WorkerState.Ready and not WorkerState.ExecutingCommand)
{
LogCommandResultDropped(envelope.CorrelationId, staCommand.MethodName);
return;
}
@@ -415,6 +420,7 @@ public sealed class WorkerPipeSession
{
if (_state is not WorkerState.Ready and not WorkerState.ExecutingCommand)
{
LogCommandResultDropped(envelope.CorrelationId, staCommand.MethodName);
return;
}
@@ -428,6 +434,25 @@ public sealed class WorkerPipeSession
}
}
/// <summary>
/// Logs that a completed command result was dropped because the
/// worker is no longer in a command-serving state (typically a
/// shutdown that raced the command's completion). Without this
/// diagnostic the gateway's correlation-id wait blocks until its own
/// timeout with no trace of why no reply arrived.
/// </summary>
private void LogCommandResultDropped(string correlationId, string commandMethod)
{
_logger?.Information(
"WorkerCommandResultDropped",
new Dictionary<string, object?>
{
["correlation_id"] = correlationId,
["command_method"] = commandMethod,
["worker_state"] = _state.ToString(),
});
}
private async Task ShutdownAsync(
WorkerShutdown shutdown,
CancellationToken cancellationToken)
@@ -544,9 +569,20 @@ public sealed class WorkerPipeSession
private async Task RunHeartbeatLoopAsync(CancellationToken cancellationToken)
{
// The first heartbeat is sent immediately on entering the loop so the
// gateway's liveness watchdog sees a beat as soon as the worker is
// Ready; the delay is applied between subsequent beats only. A
// delay-before-first-beat loop would leave the gateway without a
// heartbeat for a full HeartbeatInterval after startup.
bool firstBeat = true;
while (!cancellationToken.IsCancellationRequested)
{
await Task.Delay(_sessionOptions.HeartbeatInterval, cancellationToken).ConfigureAwait(false);
if (!firstBeat)
{
await Task.Delay(_sessionOptions.HeartbeatInterval, cancellationToken).ConfigureAwait(false);
}
firstBeat = false;
IWorkerRuntimeSession? runtimeSession = _runtimeSession;
if (runtimeSession is null)
{
@@ -42,8 +42,8 @@ public interface IMxAccessAlarmConsumer : IDisposable
/// Subscription string follows AVEVA's canonical format:
/// <c>\\&lt;node&gt;\Galaxy!&lt;area&gt;</c>. The literal "Galaxy" is
/// the provider name (regardless of the configured Galaxy database
/// name). Calling Subscribe also begins polling on the consumer's
/// internal timer.
/// name). Subscribe does not start any polling of its own; the caller
/// drives polls explicitly via <see cref="PollOnce"/>.
/// </summary>
void Subscribe(string subscription);
@@ -88,10 +88,8 @@ public interface IMxAccessAlarmConsumer : IDisposable
/// <summary>
/// Drives a single synchronous poll of the underlying alarm source.
/// Implementations that use an internal <see cref="System.Threading.Timer"/>
/// are constructed with <c>pollIntervalMilliseconds=0</c> in production so
/// the timer is disabled; the worker's STA drives polls via
/// <c>StaRuntime.InvokeAsync</c> instead, satisfying the
/// The production consumer owns no internal timer; the worker's STA
/// drives polls via <c>StaRuntime.InvokeAsync</c>, satisfying the
/// <c>ThreadingModel=Apartment</c> requirement of
/// <c>wwAlarmConsumerClass</c>. Fake implementations should no-op.
/// This method must be invoked on the thread that created the consumer
@@ -65,7 +65,14 @@ public sealed class MxAccessBaseEventSink : IMxAccessEventSink
sessionId = string.Empty;
}
private void OnDataChange(
/// <summary>
/// Handles the MXAccess <c>OnDataChange</c> COM event: converts the
/// event arguments to a protobuf <see cref="Proto.MxEvent"/> and enqueues
/// it. Subscribed to the COM object's event in <see cref="Attach"/>.
/// Exposed <c>internal</c> so unit tests can drive the integrated
/// sink → mapper → queue path without a live MXAccess COM event source.
/// </summary>
internal void OnDataChange(
int hLMXServerHandle,
int phItemHandle,
object pvItemValue,
@@ -84,7 +91,11 @@ public sealed class MxAccessBaseEventSink : IMxAccessEventSink
statuses));
}
private void OnWriteComplete(
/// <summary>
/// Handles the MXAccess <c>OnWriteComplete</c> COM event. Exposed
/// <c>internal</c> as a unit-test seam; see <see cref="OnDataChange"/>.
/// </summary>
internal void OnWriteComplete(
int hLMXServerHandle,
int phItemHandle,
ref MXSTATUS_PROXY[] pVars)
@@ -97,7 +108,11 @@ public sealed class MxAccessBaseEventSink : IMxAccessEventSink
statuses));
}
private void OperationComplete(
/// <summary>
/// Handles the MXAccess <c>OperationComplete</c> COM event. Exposed
/// <c>internal</c> as a unit-test seam; see <see cref="OnDataChange"/>.
/// </summary>
internal void OperationComplete(
int hLMXServerHandle,
int phItemHandle,
ref MXSTATUS_PROXY[] pVars)
@@ -110,7 +125,11 @@ public sealed class MxAccessBaseEventSink : IMxAccessEventSink
statuses));
}
private void OnBufferedDataChange(
/// <summary>
/// Handles the MXAccess <c>OnBufferedDataChange</c> COM event. Exposed
/// <c>internal</c> as a unit-test seam; see <see cref="OnDataChange"/>.
/// </summary>
internal void OnBufferedDataChange(
int hLMXServerHandle,
int phItemHandle,
MxDataType dtDataType,
@@ -2,7 +2,6 @@ using System;
using System.Collections.Generic;
using System.Globalization;
using System.Runtime.InteropServices;
using System.Threading;
using System.Xml;
using WNWRAPCONSUMERLib;
@@ -31,15 +30,16 @@ namespace MxGateway.Worker.MxAccess;
/// <strong>Threading.</strong> The wnwrap CLSID is registered with
/// <c>ThreadingModel=Apartment</c>. The consumer must be created
/// and operated from an STA thread; the worker's
/// <see cref="MxAccessStaSession"/> already runs an STA pump that
/// is the natural host. Polling cadence is governed by
/// <see cref="PollIntervalMilliseconds"/> on a dedicated timer the
/// consumer owns; in production the worker's STA dispatcher should
/// marshal each callback onto the STA thread before invoking
/// <c>GetXmlCurrentAlarms2</c>. For now (test-grade), this consumer
/// calls the COM API on whichever thread the timer fires it on —
/// the worker bootstrap will gain a thin "run-on-STA" wrapper as
/// part of A.3 dispatcher wiring.
/// <see cref="MxAccessStaSession"/> runs an STA pump that hosts it.
/// The consumer owns <em>no</em> internal timer: every COM call
/// (<c>Subscribe</c>, <c>PollOnce</c>, <c>AcknowledgeBy*</c>) must
/// be invoked on the STA that created the consumer. Polling cadence
/// is driven externally by the worker's STA via
/// <c>StaRuntime.InvokeAsync(() =&gt; consumer.PollOnce())</c>, which
/// keeps every <c>GetXmlCurrentAlarms2</c> call on the apartment that
/// owns the COM object. A thread-pool timer would call the COM API
/// off the owning STA and can deadlock on cross-apartment marshaling
/// when the STA is not pumping messages, so no such timer exists.
/// </para>
/// </remarks>
public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
@@ -47,52 +47,39 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
private const string DefaultProductName = "OtOpcUa.MxGateway";
private const string DefaultApplicationName = "OtOpcUa.MxGateway.Worker";
private const string DefaultVersion = "1.0";
private const int DefaultPollIntervalMilliseconds = 500;
private const int DefaultMaxAlarmsPerFetch = 1024;
private readonly object syncRoot = new object();
private readonly Dictionary<Guid, MxAlarmSnapshotRecord> latestSnapshot =
new Dictionary<Guid, MxAlarmSnapshotRecord>();
private readonly int pollIntervalMs;
private readonly int maxAlarmsPerFetch;
private wwAlarmConsumerClass? client;
private wwAlarmConsumerClass? ackClient;
private string subscriptionExpression = string.Empty;
private Timer? pollTimer;
private bool subscribed;
private bool disposed;
/// <summary>
/// Production constructor — creates the wnwrap COM object on the
/// current thread (must be the worker's STA) and disables the
/// internal <see cref="Timer"/> (<c>pollIntervalMilliseconds=0</c>).
/// Polling is driven externally by the STA via
/// <c>StaRuntime.InvokeAsync(() =&gt; consumer.PollOnce())</c> so
/// that every COM call stays on the STA that owns the apartment.
/// current thread (which must be the worker's STA). Polling is driven
/// externally by the STA via
/// <c>StaRuntime.InvokeAsync(() =&gt; consumer.PollOnce())</c> so that
/// every COM call stays on the STA that owns the apartment.
/// </summary>
public WnWrapAlarmConsumer()
: this(new wwAlarmConsumerClass(), pollIntervalMilliseconds: 0, DefaultMaxAlarmsPerFetch)
: this(new wwAlarmConsumerClass(), DefaultMaxAlarmsPerFetch)
{
}
/// <summary>
/// Test seam / explicit construction — inject a pre-created COM
/// client and tune the poll cadence. <c>pollIntervalMilliseconds == 0</c>
/// disables the internal <see cref="Timer"/> entirely; the caller
/// must drive <see cref="PollOnce"/> manually (used by hosts that
/// marshal polls onto a foreign STA, and by live smoke tests that
/// pump from the STA they own).
/// Test seam / explicit construction.
/// </summary>
public WnWrapAlarmConsumer(
wwAlarmConsumerClass client,
int pollIntervalMilliseconds,
int maxAlarmsPerFetch)
{
this.client = client ?? throw new ArgumentNullException(nameof(client));
this.pollIntervalMs = pollIntervalMilliseconds < 0
? DefaultPollIntervalMilliseconds
: pollIntervalMilliseconds;
this.maxAlarmsPerFetch = maxAlarmsPerFetch > 0
? maxAlarmsPerFetch
: DefaultMaxAlarmsPerFetch;
@@ -101,8 +88,6 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
/// <inheritdoc />
public event EventHandler<MxAlarmTransitionEvent>? AlarmTransitionEmitted;
public int PollIntervalMilliseconds => pollIntervalMs;
/// <inheritdoc />
public void Subscribe(string subscription)
{
@@ -136,7 +121,9 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
}
// hWnd=0: wnwrap supports a pull-based model — no message pump
// is required. We poll GetXmlCurrentAlarms2 on a timer below.
// is required. GetXmlCurrentAlarms2 is polled by the worker's STA
// via StaRuntime.InvokeAsync(() => consumer.PollOnce()); this type
// owns no internal timer.
int reg = com.IwwAlarmConsumer_RegisterConsumer(
hWnd: 0,
szProductName: DefaultProductName,
@@ -201,10 +188,6 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
subscriptionExpression = subscription;
subscribed = true;
if (pollIntervalMs > 0)
{
pollTimer = new Timer(OnPoll, state: null, dueTime: 0, period: pollIntervalMs);
}
}
}
@@ -294,31 +277,14 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
}
}
private void OnPoll(object? _)
{
if (disposed) return;
try
{
PollOnce();
}
catch (Exception ex)
{
// Swallow — the poll loop must not propagate exceptions out of
// the timer callback, or the worker process tears down. The
// EventQueue fault counter (wired in by the future A.3 dispatcher)
// is the right place to surface poll failures; for now the
// exception is intentionally silent so the timer keeps firing.
_ = ex;
}
}
/// <summary>
/// Synchronously poll the wnwrap consumer once and dispatch any
/// transitions. Public so STA-bound hosts can drive polling from
/// the thread that owns the COM object instead of relying on the
/// internal <see cref="Timer"/> (which fires on a thread-pool
/// thread and blocks indefinitely on cross-apartment marshaling
/// when the host STA isn't pumping messages).
/// transitions. STA-bound hosts drive polling by calling this from
/// the thread that owns the COM object. The consumer deliberately
/// owns no internal timer: a thread-pool timer would call the
/// apartment-threaded COM object off its owning STA and can block
/// indefinitely on cross-apartment marshaling when the STA is not
/// pumping messages.
/// </summary>
public void PollOnce()
{
@@ -524,21 +490,17 @@ public sealed class WnWrapAlarmConsumer : IMxAccessAlarmConsumer
/// <inheritdoc />
public void Dispose()
{
Timer? timerToDispose;
wwAlarmConsumerClass? clientToDispose;
wwAlarmConsumerClass? ackClientToDispose;
lock (syncRoot)
{
if (disposed) return;
disposed = true;
timerToDispose = pollTimer;
pollTimer = null;
clientToDispose = client;
client = null;
ackClientToDispose = ackClient;
ackClient = null;
}
timerToDispose?.Dispose();
ReleaseConsumerCom(clientToDispose);
ReleaseConsumerCom(ackClientToDispose);
}
@@ -14,6 +14,10 @@
<PackageReference Include="Polly.Core" Version="8.6.6" />
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="MxGateway.Worker.Tests" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\MxGateway.Contracts\MxGateway.Contracts.csproj" />
</ItemGroup>