Code-review 2026-05-20 sweep #2: re-review at a020350, resolve 48 findings

A second re-review pass at commit a020350 found 48 new findings, including
one High-severity regression I introduced in the prior sweep. All 48 are
fixed here in one parallel wave.

High (1)
- Client.Python-018: prior sweep set `license = "Proprietary"` in
  pyproject.toml. setuptools >= 77 enforces PEP 639 and rejects the
  string (it must be a valid SPDX expression), so `pip wheel .` and
  `pip install -e .` both fail before any source compiles. Tests
  still pass because pytest bypasses the build backend via
  `pythonpath`. Dropped the invalid license string, kept the
  `License :: Other/Proprietary License` classifier, and added
  `tests/test_packaging.py` so a future regression of the same shape
  is caught in CI.
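
A minimal sketch of the kind of static check that flags this class of
mistake before the build backend does. It is not the contents of
`tests/test_packaging.py`; `license_problem` and `KNOWN_SPDX_IDS` are
hypothetical names, the identifier set is a tiny illustrative subset of the
SPDX list, and only the `AND`/`OR` operators are handled.

```python
import re

# Illustrative subset only; the real SPDX license list is much larger.
KNOWN_SPDX_IDS = {"MIT", "Apache-2.0", "BSD-3-Clause"}
# PEP 639 permits custom identifiers that carry the LicenseRef- prefix.
LICENSE_REF = re.compile(r"LicenseRef-[A-Za-z0-9.-]+")


def license_problem(expression):
    """Return a message for the first suspicious token, or None if all pass."""
    # Split on whitespace and parentheses so grouped expressions are covered.
    for token in re.findall(r"[^\s()]+", expression):
        if token in ("AND", "OR"):
            continue
        if token in KNOWN_SPDX_IDS or LICENSE_REF.fullmatch(token):
            continue
        return f"{token!r} is not a recognised SPDX identifier"
    return None


print(license_problem("Proprietary"))       # reports the offending token
print(license_problem("MIT OR Apache-2.0"))  # None
```

A check like this only complements a real wheel build; the backend remains
the authority on what it accepts.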

Mediums (6)
- Worker-023: `HeartbeatStuckCeiling` (default 75s = 5x HeartbeatGrace)
  on WorkerPipeSessionOptions bounds the in-flight-command watchdog
  suppression so a truly stuck COM call still triggers StaHung
  instead of permanently defeating the watchdog.
- Client.Rust-018: reverted Rust's `latencyMs` split so the
  cross-language bench comparison is apples-to-apples again;
  `failureLatencyMs` kept as Rust-only enrichment.
- Client.Java-021: applied Client.Java-002's terminal-state
  serialisation pattern to DeployEventStream so close() arriving
  after queue-overflow can't erase the overflow exception.
- IntegrationTests-017: teardown-parity test now uses a two-window
  stability check after UnAdvise instead of strict equality against
  the pre-UnAdvise count (which raced against in-flight events).
- IntegrationTests-019: new RecordingTestOutputHelper wraps every
  log sink the WriteSecured live test owns (worker stdout/stderr,
  gateway logs, direct WriteLine) so the credential is proven
  absent from the full output buffer, not just the diagnostic
  message.
- Tests-020: added MxAccessGatewayServiceConstraintTests coverage
  for the previously-uncovered Write2Bulk and WriteSecured2Bulk
  arms of WriteBulkConstraintPlan.SetPayload.
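
The two-window stability check from IntegrationTests-017 can be sketched
language-neutrally as below. This is an illustration of the idea, not the
C# test itself: `is_quiet_after_teardown`, `count_events`, and the
injectable `sleep` are hypothetical names, and the 0.5 s default only
mirrors the 500 ms settle delay used in the test.

```python
import time


def is_quiet_after_teardown(count_events, settle_seconds=0.5, sleep=time.sleep):
    """Return True when no new events arrive during the second settle window."""
    sleep(settle_seconds)        # window 1: let in-flight events drain
    baseline = count_events()    # baseline is taken AFTER the drain
    sleep(settle_seconds)        # window 2: nothing new may arrive here
    return count_events() == baseline


# Usage with a fake event source: one in-flight event lands during the
# first window only, so the check passes even though the count changed
# relative to the moment teardown was issued.
events = ["before-teardown"]


def fake_sleep(_seconds):
    if len(events) < 2:
        events.append("in-flight")


print(is_quiet_after_teardown(lambda: len(events), sleep=fake_sleep))
```

Comparing against a count captured before the teardown call would have
failed in this scenario, which is the race the finding describes.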

Lows (41 — highlights)
- Server: Galaxy glob cache eviction is race-free (Server-024);
  GalaxyRepositoryGrpcService takes IGalaxyRepository (Server-025);
  AlarmsOptions validated at startup (Server-026); Authorization.md
  Constraint Enforcement snippet/prose enumerate the bulk write/read
  family (Server-027); bulk-read-commands and bulk-write-commands
  capability tokens added to OpenSession (Server-029);
  NotWiredAlarmRpcDispatcher XML doc rewritten and the missing
  scope-resolver and state-machine tests added (Server-023, Server-028).
- Worker: AlarmCommandHandler now invokes the same STA-affinity
  guard the poll path uses, at every command entry (Worker-024);
  RunAsync null-checks the runtime-session factory result
  (Worker-025).
- Worker.Tests: shared LiveMxAccessOptInVariableName lives on
  GatewayContractInfo (Worker.Tests-025); MxAccessSession.CreateForTesting
  rejects production sinks (Worker.Tests-026); FakeRuntimeSession's
  CancelCommandReturnValue serialised under lock (Worker.Tests-027);
  Probes namespace lifted to MxGateway.Worker.Tests.Probes
  (Worker.Tests-029); cancel-envelope sequence numbers monotonised
  (Worker.Tests-030); docs/GatewayTesting.md gains a "Dev-rig Probes"
  section (Worker.Tests-028).
- Tests: ManualTimeProvider consolidated into one TestSupport/ copy
  (Tests-021); SessionManagerBulkTests adds a mid-flight cancellation
  test backed by a TaskCompletionSource fake (Tests-022); companion
  FakeWorkerProcess.WaitForExitAsync no longer fakes its exit signal
  (Tests-023); constraint plan reply-count divergence pinned
  (Tests-024).
- IntegrationTests: TryGetSession chain carries [MaybeNullWhen(false)]
  end-to-end (IntegrationTests-018); abnormal-exit keyword set
  tightened to pipe-disconnected/end-of-stream and the test now
  asserts streamTask.IsFaulted (020, 021).
- Client.Dotnet: bench commands added to isLongRunning so the
  default 30s wall-clock budget doesn't kill them (015);
  BenchStreamEventsAsync observes the inner stream task on every
  exit path (016).
- Client.Go: parseValue wraps strconv errors with flag context and
  %w (017); bench loops honour ctx.Done() (018); galaxy-watch parses
  RFC3339Nano with fractional seconds (019); runStreamEvents installs
  signal.NotifyContext like runGalaxyWatch (020); five new CLI-level
  table-driven tests cover the bulk/bench subcommands (021).
- Client.Java: toCompletable Javadoc rewritten to match the actual
  cancellation contract Client.Java-015 established (022); stream-events
  text path uses Long.toUnsignedString for worker_sequence (023);
  bench-read-bulk no longer pollutes success-latency histogram with
  failure durations (024); --shutdown-timeout CLI option propagates
  through to ClientOptions (025); seven new MxGatewayCliTests cover
  the bulk and bench commands (026).
- Client.Python: mxgateway_cli ships its own py.typed marker (019);
  wheel-build smoke test added under tests/test_packaging.py (020);
  README documents the Galaxy CLI parity gap explicitly (021).
- Client.Rust: RustClientDesign.md signatures match session.rs and
  document that read_bulk is generic over AsRef<str> (019);
  next_correlation_id re-exported at the crate root, with a
  property-style doc contract and an explicit disclaimer that the
  literal textual format is not part of the contract (020).
- Contracts: BulkWriteResult comment names the actual
  IConstraintEnforcer mechanism instead of "tag-allowlist filter"
  (014); BulkReadResult gains explicit per-arm payload-population
  documentation for the success vs failure cases (015).
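
As an illustration of the Client.Java-024 item above, a minimal sketch of
keeping failure durations out of the success-latency summary. The names
`summarise`, `success_ms`, and `failure_ms` are hypothetical and do not
correspond to the bench tools' actual output fields; the median stands in
for whatever histogram the real benches use.

```python
from statistics import median


def summarise(samples):
    """Summarise (ok, elapsed_ms) samples without mixing the two populations."""
    success = [ms for ok, ms in samples if ok]
    failure = [ms for ok, ms in samples if not ok]
    return {
        "success_count": len(success),
        "failure_count": len(failure),
        # A single slow timeout would otherwise skew the success figure.
        "success_ms": median(success) if success else None,
        "failure_ms": median(failure) if failure else None,
    }


report = summarise([(True, 10.0), (True, 20.0), (False, 5000.0)])
print(report)
```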

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-20 10:28:54 -04:00
parent a0203503a7
commit 1aafd6bde4
74 changed files with 3349 additions and 395 deletions
@@ -13,4 +13,14 @@ public static class GatewayContractInfo
public const uint WorkerProtocolVersion = 1;
public const string DefaultBackendName = "mxaccess-worker";
/// <summary>
/// Environment variable name that opts an xUnit suite into running live
/// MXAccess COM tests. Single source of truth shared by both
/// <c>MxGateway.IntegrationTests.LiveMxAccessFactAttribute</c> and
/// <c>MxGateway.Worker.Tests.TestSupport.LiveMxAccessFactAttribute</c>
/// so any future opt-in tweak does not silently leave one project
/// behind — see Worker.Tests-025.
/// </summary>
public const string LiveMxAccessOptInVariableName = "MXGATEWAY_RUN_LIVE_MXACCESS_TESTS";
}
@@ -19753,9 +19753,11 @@ namespace MxGateway.Contracts.Proto {
/// <summary>
/// Per-item result for the four bulk write families. `item_handle` mirrors the
/// request entry's item_handle so callers can correlate inputs to outputs even
/// when the gateway's tag-allowlist filter dropped some entries before reaching
/// the worker. Per-item failures populate `error_message` + `hresult` and never
/// raise — callers iterate and inspect each entry.
/// when the gateway's per-entry `IConstraintEnforcer.CheckWriteHandleAsync`
/// filter (see `MxAccessGatewayService.ReplaceWriteBulkEntries` and
/// `docs/Authorization.md`) dropped some entries before reaching the worker.
/// Per-item failures populate `error_message` + `hresult` and never raise —
/// callers iterate and inspect each entry.
/// </summary>
[global::System.Diagnostics.DebuggerDisplayAttribute("{ToString(),nq}")]
public sealed partial class BulkWriteResult : pb::IMessage<BulkWriteResult>
@@ -20338,6 +20340,20 @@ namespace MxGateway.Contracts.Proto {
/// an existing live subscription's last OnDataChange (the worker did not touch
/// the subscription); false when the worker took the AddItem + Advise + wait +
/// UnAdvise + RemoveItem snapshot lifecycle itself.
///
/// On `was_successful = true`, `value`, `quality`, `source_timestamp`, and
/// `statuses` carry the read data (from the cached subscription or the snapshot
/// lifecycle, depending on `was_cached`) and `error_message` is empty. On
/// `was_successful = false`, only `server_handle`, `tag_address`, `item_handle`
/// (when allocated), `was_cached`, and `error_message` are populated; `value`,
/// `quality`, `source_timestamp`, and `statuses` are left at their proto3
/// defaults (null / 0 / null / empty) and must not be read as data — they are
/// wire-indistinguishable from "value is null with quality bad" data and serve
/// only as absent markers. ReadBulk has no `hresult` field by design (its
/// outcomes are timeout / cache / lifecycle states, not MXAccess COM return
/// codes — see `docs/DesignDecisions.md` "Bulk Command Family"). Per-tag
/// failures populate `error_message` and never raise — callers iterate and
/// inspect each entry.
/// </summary>
[global::System.Diagnostics.DebuggerDisplayAttribute("{ToString(),nq}")]
public sealed partial class BulkReadResult : pb::IMessage<BulkReadResult>
@@ -548,9 +548,11 @@ message BulkSubscribeReply {
// Per-item result for the four bulk write families. `item_handle` mirrors the
// request entry's item_handle so callers can correlate inputs to outputs even
// when the gateway's tag-allowlist filter dropped some entries before reaching
// the worker. Per-item failures populate `error_message` + `hresult` and never
// raise — callers iterate and inspect each entry.
// when the gateway's per-entry `IConstraintEnforcer.CheckWriteHandleAsync`
// filter (see `MxAccessGatewayService.ReplaceWriteBulkEntries` and
// `docs/Authorization.md`) dropped some entries before reaching the worker.
// Per-item failures populate `error_message` + `hresult` and never raise —
// callers iterate and inspect each entry.
message BulkWriteResult {
int32 server_handle = 1;
int32 item_handle = 2;
@@ -568,6 +570,20 @@ message BulkWriteReply {
// an existing live subscription's last OnDataChange (the worker did not touch
// the subscription); false when the worker took the AddItem + Advise + wait +
// UnAdvise + RemoveItem snapshot lifecycle itself.
//
// On `was_successful = true`, `value`, `quality`, `source_timestamp`, and
// `statuses` carry the read data (from the cached subscription or the snapshot
// lifecycle, depending on `was_cached`) and `error_message` is empty. On
// `was_successful = false`, only `server_handle`, `tag_address`, `item_handle`
// (when allocated), `was_cached`, and `error_message` are populated; `value`,
// `quality`, `source_timestamp`, and `statuses` are left at their proto3
// defaults (null / 0 / null / empty) and must not be read as data — they are
// wire-indistinguishable from "value is null with quality bad" data and serve
// only as absent markers. ReadBulk has no `hresult` field by design (its
// outcomes are timeout / cache / lifecycle states, not MXAccess COM return
// codes — see `docs/DesignDecisions.md` "Bulk Command Family"). Per-tag
// failures populate `error_message` and never raise — callers iterate and
// inspect each entry.
message BulkReadResult {
int32 server_handle = 1;
string tag_address = 2;
@@ -1,8 +1,16 @@
using MxGateway.Contracts;
namespace MxGateway.IntegrationTests;
public static class IntegrationTestEnvironment
{
public const string LiveMxAccessVariableName = "MXGATEWAY_RUN_LIVE_MXACCESS_TESTS";
/// <summary>
/// Sourced from <see cref="GatewayContractInfo.LiveMxAccessOptInVariableName"/>
/// so the env-var literal is shared with
/// <c>MxGateway.Worker.Tests.TestSupport.LiveMxAccessFactAttribute</c>
/// (Worker.Tests-025).
/// </summary>
public const string LiveMxAccessVariableName = GatewayContractInfo.LiveMxAccessOptInVariableName;
public const string LiveMxAccessWorkerExecutableVariableName = "MXGATEWAY_LIVE_MXACCESS_WORKER_EXE";
public const string LiveMxAccessItemVariableName = "MXGATEWAY_LIVE_MXACCESS_ITEM";
public const string LiveMxAccessClientNameVariableName = "MXGATEWAY_LIVE_MXACCESS_CLIENT_NAME";
@@ -1,5 +1,7 @@
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Diagnostics.CodeAnalysis;
using System.Text;
using Google.Protobuf.WellKnownTypes;
using Grpc.Core;
using Microsoft.Extensions.Logging;
@@ -357,14 +359,6 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
.ConfigureAwait(false);
LogEvent(firstDataChange);
// RecordingServerStreamWriter.Messages returns a snapshot copy under its own
// lock, so iterating after each teardown step is safe without external sync.
int dataChangeCountBeforeUnadvise = CountMatchingEvents(
eventWriter,
e => e.Family == MxEventFamily.OnDataChange
&& e.ServerHandle == serverHandle
&& e.ItemHandle == itemHandle);
// 1) UnAdvise — must reply Ok; the worker must stop emitting OnDataChange
// for this (server, item) pair after this returns.
MxCommandReply unadviseReply = await fixture.Service.Invoke(
@@ -390,21 +384,33 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
Assert.Equal(ProtocolStatusCode.Ok, unregisterReply.ProtocolStatus.Code);
Assert.Equal(MxCommandKind.Unregister, unregisterReply.Kind);
// Allow a short settle window for any in-flight OnDataChange to drain, then
// assert no further events arrived for the un-advised (serverHandle, itemHandle).
// MXAccess parity: after UnAdvise the provider must stop publishing OnDataChange
// for this item — a regression that left a stale subscription alive would surface
// as additional events after this delay.
// Parity rule: after UnAdvise returns Ok the worker must stop emitting
// OnDataChange for this (server, item) pair. Events the provider already
// published before that ack are in-flight and not a regression — the rule
// only constrains events generated AFTER the teardown returned. So the
// "before" baseline is taken *after* a first settle window drains those
// in-flight events, not before UnAdvise was issued (which races against
// the round-trip + STA dispatch + pipe send window — see IntegrationTests-017).
//
// RecordingServerStreamWriter.Messages returns a snapshot copy under its
// own lock, so iterating after each settle window is safe without external
// sync.
await Task.Delay(TimeSpan.FromMilliseconds(500)).ConfigureAwait(false);
int dataChangeCountAfterFirstSettle = CountMatchingEvents(
eventWriter,
e => e.Family == MxEventFamily.OnDataChange
&& e.ServerHandle == serverHandle
&& e.ItemHandle == itemHandle);
int dataChangeCountAfterTeardown = CountMatchingEvents(
await Task.Delay(TimeSpan.FromMilliseconds(500)).ConfigureAwait(false);
int dataChangeCountAfterSecondSettle = CountMatchingEvents(
eventWriter,
e => e.Family == MxEventFamily.OnDataChange
&& e.ServerHandle == serverHandle
&& e.ItemHandle == itemHandle);
output.WriteLine(
$"DataChange count before UnAdvise={dataChangeCountBeforeUnadvise} after teardown+settle={dataChangeCountAfterTeardown}");
Assert.Equal(dataChangeCountBeforeUnadvise, dataChangeCountAfterTeardown);
$"DataChange count after first settle={dataChangeCountAfterFirstSettle} after second settle={dataChangeCountAfterSecondSettle}");
Assert.Equal(dataChangeCountAfterFirstSettle, dataChangeCountAfterSecondSettle);
// A RemoveItem against the just-freed item handle must not silently succeed —
// the worker has to relay MXAccess's invalid-handle response. Closing the
@@ -438,8 +444,16 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
File.Exists(workerExecutablePath),
$"Live MXAccess worker executable was not found at {workerExecutablePath}. Build the worker or set {IntegrationTestEnvironment.LiveMxAccessWorkerExecutableVariableName}.");
TestWorkerProcessFactory processFactory = new(output);
await using GatewayServiceFixture fixture = new(workerExecutablePath, processFactory, output);
// IntegrationTests-019: CLAUDE.md's credential-redaction rule covers every log
// surface the test sees, not just the reply's DiagnosticMessage. Wire a buffering
// wrapper around output and route the worker stdout/stderr echo and the gateway
// ILogger sink through it so the post-run assertion covers the accumulated test
// output. A regression that logged the request body, the WorkerCommandRequest
// envelope, or printed the credential from inside the worker is caught here
// even if the bare DiagnosticMessage check still passes.
RecordingTestOutputHelper recordedOutput = new(output);
TestWorkerProcessFactory processFactory = new(recordedOutput);
await using GatewayServiceFixture fixture = new(workerExecutablePath, processFactory, recordedOutput);
// Stream events so a regression that emitted an OperationComplete or
// OnWriteComplete with wrong handles would still be observable via the test
// output (we don't assert a specific event here — the docs note successful
@@ -450,6 +464,7 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
string? sessionId = null;
Task? streamTask = null;
using CancellationTokenSource streamCancellation = new();
(string verifyUser, string verifyPassword) = ResolveLiveMxAccessSecuredCredentials();
try
{
@@ -473,32 +488,31 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
MxCommandReply registerReply = await fixture.Service.Invoke(
CreateRegisterRequest(sessionId),
new TestServerCallContext()).ConfigureAwait(false);
LogReply("Register", registerReply);
LogReplyTo(recordedOutput, "Register", registerReply);
Assert.Equal(ProtocolStatusCode.Ok, registerReply.ProtocolStatus.Code);
int serverHandle = registerReply.Register.ServerHandle;
MxCommandReply addItemReply = await fixture.Service.Invoke(
CreateAddItemRequest(sessionId, serverHandle),
new TestServerCallContext()).ConfigureAwait(false);
LogReply("AddItem", addItemReply);
LogReplyTo(recordedOutput, "AddItem", addItemReply);
Assert.Equal(ProtocolStatusCode.Ok, addItemReply.ProtocolStatus.Code);
int itemHandle = addItemReply.AddItem.ItemHandle;
MxCommandReply adviseReply = await fixture.Service.Invoke(
CreateAdviseRequest(sessionId, serverHandle, itemHandle),
new TestServerCallContext()).ConfigureAwait(false);
LogReply("Advise", adviseReply);
LogReplyTo(recordedOutput, "Advise", adviseReply);
Assert.Equal(ProtocolStatusCode.Ok, adviseReply.ProtocolStatus.Code);
// AuthenticateUser resolves an ArchestrA user id for the WriteSecured call.
// Credentials are env-overridable so the test honors the gateway's "do not
// log secrets" rule and works against either MXAccess's own user store or
// the LmxOpcUa-baseline GLAuth-bridged ArchestrA identity (admin/admin123).
(string verifyUser, string verifyPassword) = ResolveLiveMxAccessSecuredCredentials();
MxCommandReply authReply = await fixture.Service.Invoke(
CreateAuthenticateUserRequest(sessionId, serverHandle, verifyUser, verifyPassword),
new TestServerCallContext()).ConfigureAwait(false);
output.WriteLine(
recordedOutput.WriteLine(
$"AuthenticateUser status={authReply.ProtocolStatus.Code} hresult={authReply.Hresult} user_id={authReply.AuthenticateUser?.UserId}");
// AuthenticateUser is allowed to fail (the underlying provider may reject
@@ -518,7 +532,7 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
currentUserId,
verifierUserId: 0),
new TestServerCallContext()).ConfigureAwait(false);
LogReply("WriteSecured", writeSecuredReply);
LogReplyTo(recordedOutput, "WriteSecured", writeSecuredReply);
// Parity: the command itself completed its round-trip — the reply kind is
// WriteSecured and the gateway protocol status is set. The MXAccess outcome
@@ -538,6 +552,13 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
streamCancellation.Cancel();
await ShutDownAsync(fixture, processFactory, sessionId, streamTask).ConfigureAwait(false);
}
// CLAUDE.md credential contract: passwords and WriteSecured payloads must never
// reach logs. The buffered output covers the gateway ILogger sink, worker
// stdout/stderr, and every direct WriteLine the test body issued. A regression
// that dumped the request envelope, the AuthenticateUserCommand body, or any
// command-level WriteSecured payload would land here and trip this assertion.
Assert.DoesNotContain(verifyPassword, recordedOutput.Captured, StringComparison.Ordinal);
}
/// <summary>
@@ -611,15 +632,50 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
// The fault classification must come from a known worker-client error code so
// operators get an actionable cause string rather than an opaque exception
// trace. We accept any of the abnormal-exit classifications WorkerClient
// routes through SetFaulted on a killed worker.
// trace. We accept the classifications WorkerClient actually drives on an
// abnormal exit (kill-the-process path): the read loop hits EndOfStream and
// calls SetFaulted with WorkerClientErrorCode.PipeDisconnected and the
// message "Worker pipe disconnected." (see WorkerClient.cs:378-381). The
// earlier broad list (including "worker") matched every WorkerClient fault
// message (they all begin with "Worker"); tighten to the pipe/disconnect/
// end-of-stream classifications that match THIS path, so a regression that
// routed an unrelated fault here would surface as a test failure rather
// than silently passing (see IntegrationTests-020). "heartbeat" is dropped
// because HeartbeatGraceSeconds (15s) exceeds the StreamShutdownTimeout
// (10s) poll window, so a heartbeat-expired transition can never be
// observed inside this test.
Assert.True(
observedFault!.Contains("disconnect", StringComparison.OrdinalIgnoreCase)
|| observedFault.Contains("pipe", StringComparison.OrdinalIgnoreCase)
|| observedFault.Contains("heartbeat", StringComparison.OrdinalIgnoreCase)
|| observedFault.Contains("worker", StringComparison.OrdinalIgnoreCase)
observedFault!.Contains("pipe disconnected", StringComparison.OrdinalIgnoreCase)
|| observedFault.Contains("end of stream", StringComparison.OrdinalIgnoreCase),
$"Fault description '{observedFault}' did not match a known worker-exit classification.");
$"Fault description '{observedFault}' did not match a known abnormal-exit classification "
+ "(expected 'pipe disconnected' or 'end of stream' from WorkerClient's EndOfStream path).");
// IntegrationTests-021: also assert the StreamEvents call observed the fault
// — the chain that puts the session into Faulted goes through ReadEventsAsync
// propagating a WorkerClientException into EventStreamService, which calls
// session.MarkFaulted. The gateway then maps the WorkerClientException to an
// RpcException at the public boundary (MxAccessGatewayService.MapException →
// MapWorkerClientException). Polling session.State alone would silently pass
// if a future refactor moved MarkFaulted off the stream-consumption path —
// assert the streamTask itself terminated with a fault so the test couples
// to the actual fault-propagation path. Compare to the inverse assertion in
// the Write parity test (line 217: Assert.False(streamTask.IsFaulted, ...)).
try
{
await streamTask.WaitAsync(StreamShutdownTimeout).ConfigureAwait(false);
}
catch (Exception streamException)
{
output.WriteLine($"StreamEvents task terminated with: {streamException.GetType().Name}: {streamException.Message}");
}
Assert.True(
streamTask.IsCompleted,
"StreamEvents task did not complete within the shutdown timeout after the worker was killed.");
Assert.True(
streamTask.IsFaulted,
"StreamEvents task must fault on abnormal worker exit, not complete cleanly — "
+ "the fault-propagation path from WorkerClient.SetFaulted through ReadEventsAsync is the contract.");
}
finally
{
@@ -948,12 +1004,20 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
string method,
MxCommandReply reply)
{
output.WriteLine(
LogReplyTo(output, method, reply);
}
private static void LogReplyTo(
ITestOutputHelper sink,
string method,
MxCommandReply reply)
{
sink.WriteLine(
$"{method} status={reply.ProtocolStatus.Code} hresult={reply.Hresult} diagnostic={reply.DiagnosticMessage}");
foreach (MxStatusProxy status in reply.Statuses)
{
output.WriteLine(
sink.WriteLine(
$"{method} mxstatus success={status.Success} category={status.Category} detail={status.Detail} text={status.DiagnosticText}");
}
}
@@ -1034,7 +1098,7 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
/// transitions it to Faulted, which the public gRPC API only exposes indirectly via
/// CloseSession's reply (and not before a graceful close completes).
/// </summary>
public bool TryGetSession(string sessionId, out GatewaySession session)
public bool TryGetSession(string sessionId, [MaybeNullWhen(false)] out GatewaySession session)
{
return _registry.TryGet(sessionId, out session);
}
@@ -1439,6 +1503,56 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
}
}
/// <summary>
/// Buffering wrapper around an <see cref="ITestOutputHelper"/> that mirrors every line
/// written through it into a <see cref="StringBuilder"/> the test owns. The WriteSecured
/// parity test (IntegrationTests-019) uses this to make CLAUDE.md's "passwords and
/// <c>WriteSecured</c> payloads must never reach logs" rule a property of the entire
/// test output stream — gateway <see cref="ILogger"/> entries (echoed via
/// <see cref="TestOutputLoggerProvider"/>), worker stdout/stderr (echoed via
/// <see cref="TestWorkerProcessFactory.WriteWorkerOutput"/>), and direct
/// <c>output.WriteLine</c> calls all land in the same buffer, so a future maintenance
/// change that prints a credential through any of those channels is caught by the
/// assertion rather than slipping past the existing <c>DiagnosticMessage</c> check.
/// </summary>
private sealed class RecordingTestOutputHelper(ITestOutputHelper inner) : ITestOutputHelper
{
private readonly StringBuilder buffer = new();
private readonly object syncRoot = new();
public string Captured
{
get
{
lock (syncRoot)
{
return buffer.ToString();
}
}
}
public void WriteLine(string message)
{
lock (syncRoot)
{
buffer.AppendLine(message);
}
inner.WriteLine(message);
}
public void WriteLine(string format, params object[] args)
{
string formatted = string.Format(System.Globalization.CultureInfo.InvariantCulture, format, args);
lock (syncRoot)
{
buffer.AppendLine(formatted);
}
inner.WriteLine(format, args);
}
}
private sealed class AllowAllConstraintEnforcer : IConstraintEnforcer
{
public Task<ConstraintFailure?> CheckReadTagAsync(
@@ -25,6 +25,7 @@ public sealed class GatewayOptionsValidator : IValidateOptions<GatewayOptions>
ValidateEvents(options.Events, failures);
ValidateDashboard(options.Dashboard, failures);
ValidateProtocol(options.Protocol, failures);
ValidateAlarms(options.Alarms, failures);
return failures.Count == 0
? ValidateOptionsResult.Success
@@ -228,6 +229,33 @@ public sealed class GatewayOptionsValidator : IValidateOptions<GatewayOptions>
failures);
}
private static void ValidateAlarms(AlarmsOptions options, List<string> failures)
{
if (!options.Enabled)
{
return;
}
// When the alarm auto-subscribe hook is enabled, the gateway needs either a
// canonical SubscriptionExpression or a DefaultArea to compose one from. Both
// empty is the configuration mistake SessionManager.TryAutoSubscribeAlarmsAsync
// currently surfaces per-session — pulling it up to startup validation makes
// the misconfiguration fail-fast at boot, in line with every other section.
if (string.IsNullOrWhiteSpace(options.SubscriptionExpression)
&& string.IsNullOrWhiteSpace(options.DefaultArea))
{
failures.Add(
"MxGateway:Alarms requires either a non-blank SubscriptionExpression or a non-blank DefaultArea when Enabled is true.");
}
if (!string.IsNullOrWhiteSpace(options.SubscriptionExpression)
&& !options.SubscriptionExpression.StartsWith(@"\\", StringComparison.Ordinal))
{
failures.Add(
@"MxGateway:Alarms:SubscriptionExpression must start with '\\' (canonical \\<host>\Galaxy!<area> shape).");
}
}
private static void ValidateProtocol(ProtocolOptions options, List<string> failures)
{
if (options.WorkerProtocolVersion != GatewayContractInfo.WorkerProtocolVersion)
@@ -65,15 +65,20 @@ public static class GalaxyGlobMatcher
RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | RegexOptions.Compiled,
TimeSpan.FromMilliseconds(100));
if (RegexCache.TryAdd(glob, compiled))
// GetOrAdd atomically returns whichever instance is in the cache after the
// call — either the locally-compiled regex (we won the race) or the regex
// another thread inserted (we lost). It also avoids the TryAdd-then-indexer
// pattern where the key could be evicted between the failed TryAdd and the
// indexer read, producing a KeyNotFoundException under contention near the
// cap (Server-024).
Regex result = RegexCache.GetOrAdd(glob, compiled);
if (ReferenceEquals(result, compiled))
{
// We were the inserter — track for FIFO eviction and bound the cache.
InsertionOrder.Enqueue(glob);
EvictIfOverCapacity();
return compiled;
}
// Another thread won the race — use its compiled regex.
return RegexCache[glob];
return result;
}
private static void EvictIfOverCapacity()
@@ -26,7 +26,7 @@ public sealed class EventStreamService(
StreamEventsRequest request,
[EnumeratorCancellation] CancellationToken cancellationToken)
{
if (!sessionManager.TryGetSession(request.SessionId, out GatewaySession session))
if (!sessionManager.TryGetSession(request.SessionId, out GatewaySession? session) || session is null)
{
throw new SessionManagerException(
SessionManagerErrorCode.SessionNotFound,
@@ -17,7 +17,7 @@ namespace MxGateway.Server.Grpc;
/// direct SQL probe since callers use it as a health check.
/// </summary>
public sealed class GalaxyRepositoryGrpcService(
GalaxyDb.GalaxyRepository repository,
GalaxyDb.IGalaxyRepository repository,
GalaxyDb.IGalaxyHierarchyCache cache,
GalaxyDb.IGalaxyDeployNotifier notifier,
IGatewayRequestIdentityAccessor identityAccessor,
@@ -54,6 +54,8 @@ public sealed class MxAccessGatewayService(
reply.Capabilities.Add("unary-invoke");
reply.Capabilities.Add("server-stream-events");
reply.Capabilities.Add("bulk-subscribe-commands");
reply.Capabilities.Add("bulk-read-commands");
reply.Capabilities.Add("bulk-write-commands");
reply.Capabilities.Add("unary-acknowledge-alarm");
reply.Capabilities.Add("server-stream-active-alarms");
@@ -253,7 +255,7 @@ public sealed class MxAccessGatewayService(
private GatewaySession ResolveSession(string sessionId)
{
if (!sessionManager.TryGetSession(sessionId, out GatewaySession session))
if (!sessionManager.TryGetSession(sessionId, out GatewaySession? session) || session is null)
{
throw new SessionManagerException(
SessionManagerErrorCode.SessionNotFound,
@@ -1,3 +1,4 @@
using System.Diagnostics.CodeAnalysis;
using MxGateway.Contracts.Proto;
namespace MxGateway.Server.Sessions;
@@ -20,7 +21,7 @@ public interface ISessionManager
/// <returns>True if the session exists; otherwise false.</returns>
bool TryGetSession(
string sessionId,
out GatewaySession session);
[MaybeNullWhen(false)] out GatewaySession session);
/// <summary>Invokes a command on the worker for the specified session.</summary>
/// <param name="sessionId">Identifier of the session.</param>
@@ -1,3 +1,5 @@
using System.Diagnostics.CodeAnalysis;
namespace MxGateway.Server.Sessions;
/// <summary>
@@ -28,7 +30,7 @@ public interface ISessionRegistry
/// <param name="sessionId">Identifier of the session.</param>
/// <param name="session">The retrieved session, if found.</param>
/// <returns>True if found; false otherwise.</returns>
bool TryGet(string sessionId, out GatewaySession session);
bool TryGet(string sessionId, [MaybeNullWhen(false)] out GatewaySession session);
/// <summary>
/// Attempts to remove a session by ID; returns false if not found.
@@ -36,7 +38,7 @@ public interface ISessionRegistry
/// <param name="sessionId">Identifier of the session to remove.</param>
/// <param name="session">The removed session, if found.</param>
/// <returns>True if removed; false if not found.</returns>
bool TryRemove(string sessionId, out GatewaySession session);
bool TryRemove(string sessionId, [MaybeNullWhen(false)] out GatewaySession session);
/// <summary>
/// Returns a snapshot of all sessions in the registry.
@@ -8,20 +8,19 @@ using MxGateway.Server.Grpc;
namespace MxGateway.Server.Sessions;
/// <summary>
/// PR A.6 / A.7 — default <see cref="IAlarmRpcDispatcher"/> shipped while
/// the worker-side AlarmClient event subscription is gated on dev-rig
/// validation. Acknowledges with a structured "worker-pending"
/// Null fallback <see cref="IAlarmRpcDispatcher"/> used when no dispatcher
/// is registered in the DI container (DI omission or standalone tests).
/// Acknowledges with a structured "alarm dispatcher not registered"
/// diagnostic and yields an empty active-alarm stream.
/// </summary>
/// <remarks>
/// <para>
/// Replaces the inline diagnostic strings in
/// <c>MxAccessGatewayService.AcknowledgeAlarm</c> /
/// <c>QueryActiveAlarms</c> from PR A.3 with an injectable seam.
/// When the worker dispatcher (PR A.6/A.7 dev-rig follow-up) lands,
/// <c>WorkerAlarmRpcDispatcher</c> replaces this implementation in
/// the DI container and the same handler shape comes alive without
/// further changes to the public RPC surface.
/// Production wires <see cref="WorkerAlarmRpcDispatcher"/> as the
/// default <see cref="IAlarmRpcDispatcher"/> via
/// <c>SessionServiceCollectionExtensions.AddGatewaySessions</c>, so
/// clients that hit this fallback are running against an
/// intentionally minimal service composition rather than the full
/// gateway.
/// </para>
/// </remarks>
public sealed class NotWiredAlarmRpcDispatcher : IAlarmRpcDispatcher
@@ -35,8 +34,8 @@ public sealed class NotWiredAlarmRpcDispatcher : IAlarmRpcDispatcher
{
SessionId = request.SessionId,
CorrelationId = request.ClientCorrelationId,
ProtocolStatus = MxAccessGrpcMapper.Ok("AcknowledgeAlarm accepted; worker dispatch pending dev-rig wiring."),
DiagnosticMessage = "Gateway-side AcknowledgeAlarm accepted; the worker-side AlarmClient consumer (PR A.5) is in place but the dispatcher hookup is gated on validating the AVEVA alarm-provider event subscription on the dev rig.",
ProtocolStatus = MxAccessGrpcMapper.Ok("AcknowledgeAlarm accepted; alarm dispatcher is not registered."),
DiagnosticMessage = "Alarm dispatcher is not registered.",
});
}
@@ -1,3 +1,4 @@
using System.Diagnostics.CodeAnalysis;
using System.Security.Cryptography;
using Google.Protobuf.WellKnownTypes;
using Microsoft.Extensions.Logging;
@@ -132,7 +133,7 @@ public sealed class SessionManager : ISessionManager
/// <returns>True if session found; otherwise false.</returns>
public bool TryGetSession(
string sessionId,
out GatewaySession session)
[MaybeNullWhen(false)] out GatewaySession session)
{
return _registry.TryGet(sessionId, out session);
}
@@ -297,7 +298,7 @@ public sealed class SessionManager : ISessionManager
private GatewaySession GetRequiredSession(string sessionId)
{
if (!_registry.TryGet(sessionId, out GatewaySession session))
if (!_registry.TryGet(sessionId, out GatewaySession? session) || session is null)
{
throw new SessionManagerException(
SessionManagerErrorCode.SessionNotFound,
@@ -1,4 +1,5 @@
using System.Collections.Concurrent;
using System.Diagnostics.CodeAnalysis;
using MxGateway.Contracts.Proto;
namespace MxGateway.Server.Sessions;
@@ -38,9 +39,9 @@ public sealed class SessionRegistry : ISessionRegistry
/// <param name="session">The retrieved session if found.</param>
public bool TryGet(
string sessionId,
out GatewaySession session)
[MaybeNullWhen(false)] out GatewaySession session)
{
return _sessions.TryGetValue(sessionId, out session!);
return _sessions.TryGetValue(sessionId, out session);
}
/// <summary>
@@ -50,9 +51,9 @@ public sealed class SessionRegistry : ISessionRegistry
/// <param name="session">The removed session if found.</param>
public bool TryRemove(
string sessionId,
out GatewaySession session)
[MaybeNullWhen(false)] out GatewaySession session)
{
return _sessions.TryRemove(sessionId, out session!);
return _sessions.TryRemove(sessionId, out session);
}
/// <summary>
@@ -76,7 +76,7 @@ public sealed class WorkerAlarmRpcDispatcher(
{
ArgumentNullException.ThrowIfNull(request);
if (!sessionRegistry.TryGet(request.SessionId, out GatewaySession session))
if (!sessionRegistry.TryGet(request.SessionId, out GatewaySession? session) || session is null)
{
return new AcknowledgeAlarmReply
{
@@ -186,7 +186,7 @@ public sealed class WorkerAlarmRpcDispatcher(
{
ArgumentNullException.ThrowIfNull(request);
if (!sessionRegistry.TryGet(request.SessionId, out GatewaySession session))
if (!sessionRegistry.TryGet(request.SessionId, out GatewaySession? session) || session is null)
{
// Server-019: align with AcknowledgeAsync's missing-session handling and
// surface a SessionNotFound error rather than yielding an empty stream.
@@ -1,5 +1,6 @@
using MxGateway.Server.Galaxy;
using MxGateway.Contracts.Proto.Galaxy;
using MxGateway.Tests.TestSupport;
namespace MxGateway.Tests.Galaxy;
@@ -156,17 +157,4 @@ public sealed class GalaxyHierarchyCacheTests
}
}
private sealed class ManualTimeProvider(DateTimeOffset start = default) : TimeProvider
{
private DateTimeOffset _now = start == default ? DateTimeOffset.UtcNow : start;
/// <inheritdoc />
public override DateTimeOffset GetUtcNow() => _now;
/// <summary>
/// Advances the current time by the specified duration.
/// </summary>
/// <param name="duration">Time duration to advance.</param>
public void Advance(TimeSpan duration) => _now += duration;
}
}
@@ -346,6 +346,180 @@ public sealed class MxAccessGatewayServiceConstraintTests
Assert.All(reply.WriteSecuredBulk.Results, r => Assert.False(r.WasSuccessful));
}
/// <summary>
/// Tests-020: <c>Write2Bulk</c> takes the third <c>GetPayload</c>/<c>SetPayload</c>
/// switch arm in <c>WriteBulkConstraintPlan</c>. The merge logic is shared with
/// <c>WriteBulk</c>, but a full denial through the <c>CreateDeniedReply</c> path
/// proves the <c>Write2Bulk</c> arm of the per-kind <c>SetPayload</c> switch fires
/// (and not, say, <c>WriteBulk</c> by mistake) — guarding against a refactor that
/// drops or misroutes the <c>Write2Bulk</c> case.
/// </summary>
[Fact]
public async Task Invoke_Write2Bulk_WhenAllHandlesDenied_ShortCircuitsWithDeniedOnlyReply()
{
PredicateConstraintEnforcer enforcer = new() { DenyWriteHandle = (_, _) => true };
FakeSessionManager sessionManager = CreateSessionManagerWithSeed();
MxAccessGatewayService service = CreateService(sessionManager, enforcer);
MxCommandReply reply = await service.Invoke(
CreateWrite2BulkRequest(7, [10, 11]),
new TestServerCallContext());
Assert.Equal(0, sessionManager.InvokeCount);
Assert.Equal(MxCommandKind.Write2Bulk, reply.Kind);
Assert.Equal(2, reply.Write2Bulk.Results.Count);
Assert.All(reply.Write2Bulk.Results, r => Assert.False(r.WasSuccessful));
// Sibling reply slots must remain empty — pin the SetPayload arm fired
// for Write2Bulk and not for one of the other three Write*Bulk kinds.
Assert.Empty(reply.WriteBulk?.Results ?? new Google.Protobuf.Collections.RepeatedField<BulkWriteResult>());
Assert.Empty(reply.WriteSecuredBulk?.Results ?? new Google.Protobuf.Collections.RepeatedField<BulkWriteResult>());
Assert.Empty(reply.WriteSecured2Bulk?.Results ?? new Google.Protobuf.Collections.RepeatedField<BulkWriteResult>());
}
/// <summary>
/// Tests-020: <c>WriteSecured2Bulk</c> takes the fourth <c>GetPayload</c>/<c>SetPayload</c>
/// switch arm in <c>WriteBulkConstraintPlan</c>. Same reasoning as
/// <c>Write2Bulk</c> — assert the <c>WriteSecured2Bulk</c> reply slot is populated
/// to prove that arm of the switch fires.
/// </summary>
[Fact]
public async Task Invoke_WriteSecured2Bulk_WhenAllHandlesDenied_ShortCircuitsWithDeniedOnlyReply()
{
PredicateConstraintEnforcer enforcer = new() { DenyWriteHandle = (_, _) => true };
FakeSessionManager sessionManager = CreateSessionManagerWithSeed();
MxAccessGatewayService service = CreateService(sessionManager, enforcer);
MxCommandReply reply = await service.Invoke(
CreateWriteSecured2BulkRequest(7, [10, 11]),
new TestServerCallContext());
Assert.Equal(0, sessionManager.InvokeCount);
Assert.Equal(MxCommandKind.WriteSecured2Bulk, reply.Kind);
Assert.Equal(2, reply.WriteSecured2Bulk.Results.Count);
Assert.All(reply.WriteSecured2Bulk.Results, r => Assert.False(r.WasSuccessful));
// Sibling reply slots must remain empty — pin the SetPayload arm fired
// for WriteSecured2Bulk and not for one of the other three Write*Bulk kinds.
Assert.Empty(reply.WriteBulk?.Results ?? new Google.Protobuf.Collections.RepeatedField<BulkWriteResult>());
Assert.Empty(reply.Write2Bulk?.Results ?? new Google.Protobuf.Collections.RepeatedField<BulkWriteResult>());
Assert.Empty(reply.WriteSecuredBulk?.Results ?? new Google.Protobuf.Collections.RepeatedField<BulkWriteResult>());
}
// === Worker reply-count divergence (Tests-024) ===
/// <summary>
/// Tests-024: <c>WriteBulkConstraintPlan.MergeDeniedInto</c> dequeues from
/// <c>allowedResults</c> per non-denied slot via <c>Queue.TryDequeue</c>,
/// which silently returns <c>false</c> when the queue is empty. Pin the
/// observable behaviour when the worker returns FEWER allowed results than
/// the gateway forwarded: the merged reply is truncated — denied entries
/// keep their slots, but the trailing allowed slot for which no worker
/// result arrived is dropped (no synthetic failure result is fabricated).
/// This fixture makes that "silent truncate" behaviour explicit so a future
/// change either fills the gap with a synthetic failure or fails this test.
/// </summary>
[Fact]
public async Task Invoke_WriteBulk_WhenWorkerReturnsFewerResultsThanAllowed_MergedReplyIsTruncated()
{
PredicateConstraintEnforcer enforcer = new()
{
DenyWriteHandle = (_, itemHandle) => itemHandle == 902,
};
FakeSessionManager sessionManager = CreateSessionManagerWithSeed();
// Gateway forwards 2 allowed handles (901, 903) but the worker returns only
// 1 result. The merge logic should keep denied entry 902 at index 1, place
// the single worker result at index 0, and drop the trailing slot (truncate).
sessionManager.InvokeReply = new WorkerCommandReply
{
Reply = new MxCommandReply
{
SessionId = SessionId,
Kind = MxCommandKind.WriteBulk,
ProtocolStatus = MxAccessGrpcMapper.Ok(),
WriteBulk = new BulkWriteReply
{
Results =
{
new BulkWriteResult { ServerHandle = 7, ItemHandle = 901, WasSuccessful = true },
},
},
},
};
MxAccessGatewayService service = CreateService(sessionManager, enforcer);
MxCommandReply reply = await service.Invoke(
CreateWriteBulkRequest(7, [901, 902, 903]),
new TestServerCallContext());
Assert.Equal(1, sessionManager.InvokeCount);
BulkWriteReply merged = reply.WriteBulk;
// Current behaviour: the merged reply is shorter than OriginalCount when
// the worker under-supplies. Two slots survive — the worker result at
// index 0 and the denied entry at index 1 — and the trailing slot is
// silently dropped via Queue.TryDequeue returning false.
Assert.Equal(2, merged.Results.Count);
Assert.True(merged.Results[0].WasSuccessful);
Assert.Equal(901, merged.Results[0].ItemHandle);
Assert.False(merged.Results[1].WasSuccessful);
Assert.Equal(902, merged.Results[1].ItemHandle);
}
/// <summary>
/// Tests-024: when the worker returns MORE allowed results than the
/// gateway forwarded, the extras must be silently ignored — the merged
/// reply length stays at <c>OriginalCount</c>. This pins the
/// <c>for index &lt; OriginalCount</c> loop bound so a regression that
/// accidentally surfaces extras as trailing results is caught.
/// </summary>
[Fact]
public async Task Invoke_WriteBulk_WhenWorkerReturnsExtraResults_IgnoresExtras()
{
PredicateConstraintEnforcer enforcer = new()
{
DenyWriteHandle = (_, itemHandle) => itemHandle == 902,
};
FakeSessionManager sessionManager = CreateSessionManagerWithSeed();
// Gateway forwards 2 allowed handles (901, 903) but the worker returns 4.
sessionManager.InvokeReply = new WorkerCommandReply
{
Reply = new MxCommandReply
{
SessionId = SessionId,
Kind = MxCommandKind.WriteBulk,
ProtocolStatus = MxAccessGrpcMapper.Ok(),
WriteBulk = new BulkWriteReply
{
Results =
{
new BulkWriteResult { ServerHandle = 7, ItemHandle = 901, WasSuccessful = true },
new BulkWriteResult { ServerHandle = 7, ItemHandle = 903, WasSuccessful = true },
new BulkWriteResult { ServerHandle = 7, ItemHandle = 999, WasSuccessful = true },
new BulkWriteResult { ServerHandle = 7, ItemHandle = 1000, WasSuccessful = true },
},
},
},
};
MxAccessGatewayService service = CreateService(sessionManager, enforcer);
MxCommandReply reply = await service.Invoke(
CreateWriteBulkRequest(7, [901, 902, 903]),
new TestServerCallContext());
Assert.Equal(1, sessionManager.InvokeCount);
BulkWriteReply merged = reply.WriteBulk;
// Merged reply length stays at OriginalCount (3); the two extra worker
// results (item handles 999, 1000) are silently discarded by the
// OriginalCount-bounded loop.
Assert.Equal(3, merged.Results.Count);
Assert.Equal(901, merged.Results[0].ItemHandle);
Assert.True(merged.Results[0].WasSuccessful);
Assert.Equal(902, merged.Results[1].ItemHandle);
Assert.False(merged.Results[1].WasSuccessful);
Assert.Equal(903, merged.Results[2].ItemHandle);
Assert.True(merged.Results[2].WasSuccessful);
Assert.DoesNotContain(merged.Results, r => r.ItemHandle == 999);
Assert.DoesNotContain(merged.Results, r => r.ItemHandle == 1000);
}
// === Unary write-handle enforcement (EnforceWriteHandleAsync) ===
/// <summary>
@@ -547,6 +721,48 @@ public sealed class MxAccessGatewayServiceConstraintTests
};
}
private static MxCommandRequest CreateWrite2BulkRequest(int serverHandle, IReadOnlyList<int> itemHandles)
{
Write2BulkCommand cmd = new() { ServerHandle = serverHandle };
foreach (int handle in itemHandles)
{
cmd.Entries.Add(new Write2BulkEntry
{
ItemHandle = handle,
Value = new MxValue { StringValue = "v" },
TimestampValue = new MxValue { Int64Value = 1234567890L },
});
}
return new MxCommandRequest
{
SessionId = SessionId,
Command = new MxCommand { Kind = MxCommandKind.Write2Bulk, Write2Bulk = cmd },
};
}
private static MxCommandRequest CreateWriteSecured2BulkRequest(int serverHandle, IReadOnlyList<int> itemHandles)
{
WriteSecured2BulkCommand cmd = new() { ServerHandle = serverHandle };
foreach (int handle in itemHandles)
{
cmd.Entries.Add(new WriteSecured2BulkEntry
{
ItemHandle = handle,
CurrentUserId = 1,
VerifierUserId = 2,
Value = new MxValue { StringValue = "v" },
TimestampValue = new MxValue { Int64Value = 1234567890L },
});
}
return new MxCommandRequest
{
SessionId = SessionId,
Command = new MxCommand { Kind = MxCommandKind.WriteSecured2Bulk, WriteSecured2Bulk = cmd },
};
}
private static MxCommandRequest CreateWriteRequest(int serverHandle, int itemHandle)
{
return new MxCommandRequest
@@ -344,9 +344,9 @@ public sealed class MxAccessGatewayServiceTests
Assert.Equal(StatusCode.InvalidArgument, exception.StatusCode);
}
/// <summary>Verifies AcknowledgeAlarm returns OK with a worker-pending diagnostic for valid input.</summary>
/// <summary>Verifies AcknowledgeAlarm returns OK with a "dispatcher not registered" diagnostic when DI omits the dispatcher.</summary>
[Fact]
public async Task AcknowledgeAlarm_WithValidRequest_ReturnsOkWithWorkerPendingDiagnostic()
public async Task AcknowledgeAlarm_WithValidRequest_ReturnsOkWithNotRegisteredDiagnostic()
{
MxAccessGatewayService service = CreateService(new FakeSessionManager());
@@ -364,7 +364,7 @@ public sealed class MxAccessGatewayServiceTests
Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
Assert.Equal("session-1", reply.SessionId);
Assert.Equal("corr-1", reply.CorrelationId);
Assert.Contains("worker", reply.DiagnosticMessage, StringComparison.OrdinalIgnoreCase);
Assert.Contains("not registered", reply.DiagnosticMessage, StringComparison.OrdinalIgnoreCase);
}
/// <summary>Verifies QueryActiveAlarms rejects empty session_id.</summary>
@@ -68,6 +68,43 @@ public sealed class GatewaySessionTests
await session.DisposeAsync();
}
/// <summary>
/// Server-028 regression. A <see cref="GatewaySession.MarkFaulted"/> issued
/// while <see cref="GatewaySession.CloseAsync"/> is parked between its
/// <c>Closing</c> and <c>Closed</c> writes must not break the close path's
/// terminal contract: the in-flight close runs to <c>Closed</c>, the fault
/// reason is preserved on <see cref="GatewaySession.FinalFault"/>, and the
/// session does not get stuck in <see cref="SessionState.Faulted"/>. The
/// state machine documents "Closing only allows a transition to Closed or
/// Faulted" — this test pins the resolved end state so a future tightening
/// of <c>MarkFaulted</c> cannot silently regress it.
/// </summary>
[Fact]
public async Task MarkFaulted_DuringInFlightClose_PreservesFaultButYieldsToClose()
{
BlockingShutdownWorkerClient workerClient = new();
GatewaySession session = CreateReadySession(workerClient);
Task<SessionCloseResult> closeTask = session.CloseAsync("test-close", CancellationToken.None);
await workerClient.WaitForShutdownStartAsync();
// Close has set _state = Closing under _syncRoot and is parked inside
// worker.ShutdownAsync. Fault the session concurrently while close is parked.
Assert.Equal(SessionState.Closing, session.State);
session.MarkFaulted("concurrent-fault");
workerClient.ReleaseShutdown();
SessionCloseResult result = await closeTask;
// Close still wins — Closed is terminal — but the fault reason is preserved
// so observers see the original cause once the session settles.
Assert.Equal(SessionState.Closed, result.FinalState);
Assert.Equal(SessionState.Closed, session.State);
Assert.Equal("concurrent-fault", session.FinalFault);
await session.DisposeAsync();
}
/// <summary>
/// Server-016 regression. <see cref="GatewaySession.DisposeAsync"/> must wait
/// for an in-flight <see cref="GatewaySession.CloseAsync"/> before disposing
@@ -4,16 +4,16 @@ using MxGateway.Server.Sessions;
namespace MxGateway.Tests.Gateway.Sessions;
/// <summary>
/// PR A.6 / A.7 — pins the not-yet-wired dispatcher's behaviour:
/// AcknowledgeAsync returns OK with a worker-pending diagnostic and
/// QueryActiveAlarmsAsync yields an empty stream. Production
/// <c>WorkerAlarmRpcDispatcher</c> (dev-rig follow-up) replaces this
/// impl in DI without changing the gateway handler shape.
/// Pins the null-fallback dispatcher's behaviour: AcknowledgeAsync
/// returns OK with a "dispatcher not registered" diagnostic and
/// QueryActiveAlarmsAsync yields an empty stream. Production binds
/// <c>WorkerAlarmRpcDispatcher</c> in DI; this fallback is only used
/// when no dispatcher is registered (DI omission / standalone tests).
/// </summary>
public sealed class NotWiredAlarmRpcDispatcherTests
{
[Fact]
public async Task AcknowledgeAsync_WhenNotWired_ReturnsOkWithWorkerPendingDiagnostic()
public async Task AcknowledgeAsync_WhenNotWired_ReturnsOkWithNotRegisteredDiagnostic()
{
IAlarmRpcDispatcher dispatcher = new NotWiredAlarmRpcDispatcher();
@@ -31,7 +31,7 @@ public sealed class NotWiredAlarmRpcDispatcherTests
Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
Assert.Equal("session-1", reply.SessionId);
Assert.Equal("corr-1", reply.CorrelationId);
Assert.Contains("worker", reply.DiagnosticMessage, StringComparison.OrdinalIgnoreCase);
Assert.Contains("not registered", reply.DiagnosticMessage, StringComparison.OrdinalIgnoreCase);
}
[Fact]
@@ -433,6 +433,53 @@ public sealed class SessionManagerBulkTests
cts.Token));
}
/// <summary>
/// Tests-022: Pin mid-flight cancellation behaviour for at least one bulk
/// path. Unlike the pre-cancel <c>WriteSecuredBulkAsync_PropagatesCancellation</c>
/// above, this fake's <see cref="MidFlightBulkWorkerClient.InvokeAsync"/>
/// returns a <see cref="TaskCompletionSource"/>-backed task that does NOT
/// complete until the registered token fires. The session call therefore
/// reaches <c>InvokeBulkInternalAsync</c> → <c>InvokeAsync</c> →
/// <c>workerClient.InvokeAsync</c> and parks on an in-flight await; only
/// after that does <c>cts.CancelAsync()</c> fire. This is the path a real
/// client closing its stream would hit, which the pre-cancel pattern can't
/// exercise.
/// </summary>
[Fact]
public async Task WriteSecuredBulkAsync_WhenCancelledMidFlight_ThrowsOperationCanceledForRequestToken()
{
MidFlightBulkWorkerClient workerClient = new();
GatewaySession session = await OpenSessionAsync(workerClient);
using CancellationTokenSource cts = new();
Task<IReadOnlyList<BulkWriteResult>> writeTask = session.WriteSecuredBulkAsync(
12,
new[]
{
new WriteSecuredBulkEntry
{
ItemHandle = 1,
CurrentUserId = 7,
VerifierUserId = 8,
Value = new MxValue { DataType = MxDataType.Integer, Int32Value = 0 },
},
},
cts.Token);
// Wait until the gateway has descended into the worker's InvokeAsync and
// registered its cancellation continuation — only then is this a true
// mid-flight cancel.
await workerClient.InvokeStarted.Task.WaitAsync(TimeSpan.FromSeconds(5));
Assert.False(writeTask.IsCompleted);
await cts.CancelAsync();
OperationCanceledException exception = await Assert.ThrowsAnyAsync<OperationCanceledException>(
async () => await writeTask);
Assert.Equal(cts.Token, exception.CancellationToken);
Assert.Equal(1, workerClient.InvokeCount);
}
[Fact]
public async Task WriteSecured2BulkAsync_ForwardsOneWriteSecured2BulkCommandAndPreservesCredentialAndTimestampPayload()
{
@@ -587,12 +634,17 @@ public sealed class SessionManagerBulkTests
}
private static async Task<GatewaySession> OpenSessionAsync(FakeBulkWorkerClient workerClient)
{
return await OpenSessionAsync((IWorkerClient)workerClient);
}
private static async Task<GatewaySession> OpenSessionAsync(IWorkerClient workerClient)
{
SessionManager manager = CreateManager(workerClient);
return await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
}
private static SessionManager CreateManager(FakeBulkWorkerClient workerClient)
private static SessionManager CreateManager(IWorkerClient workerClient)
{
return new SessionManager(
new SessionRegistry(),
@@ -708,4 +760,87 @@ public sealed class SessionManagerBulkTests
/// <inheritdoc />
public ValueTask DisposeAsync() => ValueTask.CompletedTask;
}
/// <summary>
/// Mid-flight cancellation fake for Tests-022.
/// <see cref="InvokeAsync"/> signals <see cref="InvokeStarted"/>, registers
/// a cancellation continuation on the caller's <see cref="CancellationToken"/>,
/// and parks on a <see cref="TaskCompletionSource{TResult}"/> that completes
/// only when the token fires or the fake is shut down. This lands the
/// <see cref="OperationCanceledException"/> on the async continuation
/// rather than on the synchronous fast path inside
/// <c>ThrowIfCancellationRequested</c>.
/// </summary>
private sealed class MidFlightBulkWorkerClient : IWorkerClient
{
private readonly TaskCompletionSource<WorkerCommandReply> _invokeCompletion =
new(TaskCreationOptions.RunContinuationsAsynchronously);
/// <inheritdoc />
public string SessionId { get; init; } = "session-1";
/// <inheritdoc />
public int? ProcessId { get; init; } = 1234;
/// <inheritdoc />
public WorkerClientState State { get; set; } = WorkerClientState.Ready;
/// <inheritdoc />
public DateTimeOffset LastHeartbeatAt { get; init; } = DateTimeOffset.UtcNow;
/// <summary>Gets the number of times <see cref="InvokeAsync"/> was entered.</summary>
public int InvokeCount { get; private set; }
/// <summary>Signals when <see cref="InvokeAsync"/> first enters — the test
/// awaits this before triggering mid-flight cancellation.</summary>
public TaskCompletionSource InvokeStarted { get; } =
new(TaskCreationOptions.RunContinuationsAsynchronously);
/// <inheritdoc />
public Task StartAsync(CancellationToken cancellationToken) => Task.CompletedTask;
/// <inheritdoc />
public Task<WorkerCommandReply> InvokeAsync(
WorkerCommand command,
TimeSpan timeout,
CancellationToken cancellationToken)
{
InvokeCount++;
// Register cancellation BEFORE signalling start so the test can be
// certain the continuation is wired the moment InvokeStarted resolves.
cancellationToken.Register(() => _invokeCompletion.TrySetCanceled(cancellationToken));
InvokeStarted.TrySetResult();
return _invokeCompletion.Task;
}
/// <inheritdoc />
public async IAsyncEnumerable<WorkerEvent> ReadEventsAsync(
[System.Runtime.CompilerServices.EnumeratorCancellation] CancellationToken cancellationToken)
{
await Task.CompletedTask;
yield break;
}
/// <inheritdoc />
public Task ShutdownAsync(TimeSpan timeout, CancellationToken cancellationToken)
{
State = WorkerClientState.Closed;
_invokeCompletion.TrySetCanceled(cancellationToken);
return Task.CompletedTask;
}
/// <inheritdoc />
public void Kill(string reason)
{
State = WorkerClientState.Faulted;
_invokeCompletion.TrySetCanceled();
}
/// <inheritdoc />
public ValueTask DisposeAsync()
{
_invokeCompletion.TrySetCanceled();
return ValueTask.CompletedTask;
}
}
}
@@ -5,6 +5,7 @@ using MxGateway.Server.Configuration;
using MxGateway.Server.Metrics;
using MxGateway.Server.Sessions;
using MxGateway.Server.Workers;
using MxGateway.Tests.TestSupport;
namespace MxGateway.Tests.Gateway.Sessions;
@@ -24,7 +25,7 @@ public sealed class SessionManagerTests
GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
Assert.True(manager.TryGetSession(session.SessionId, out GatewaySession registered));
Assert.True(manager.TryGetSession(session.SessionId, out GatewaySession? registered));
Assert.Same(session, registered);
Assert.Equal(SessionState.Ready, session.State);
Assert.Equal("client-1", session.ClientIdentity);
@@ -763,10 +764,4 @@ public sealed class SessionManagerTests
}
}
private sealed class ManualTimeProvider(DateTimeOffset start) : TimeProvider
{
private DateTimeOffset _now = start;
public override DateTimeOffset GetUtcNow() => _now;
}
}
@@ -330,18 +330,28 @@ public sealed class SessionWorkerClientFactoryFakeWorkerTests : IAsyncDisposable
DateTimeOffset.UtcNow);
}
/// <summary>Fake worker process for testing process lifecycle.</summary>
/// <summary>
/// Fake worker process for testing process lifecycle. <see cref="WaitForExitAsync"/>
/// awaits a <see cref="TaskCompletionSource"/> completed only by
/// <see cref="Kill"/> or <see cref="MarkExited"/>, so a caller observing
/// completion can trust that exit actually happened — bringing this fake into
/// parity with the smoke-test variant in <c>GatewayEndToEndFakeWorkerSmokeTests</c>
/// (Tests-015 / Tests-023). This removes the latent regression vector where a
/// future <c>Assert.True(launcher.Process.HasExited)</c> in this file would
/// pass spuriously regardless of whether the worker truly exited.
/// </summary>
private sealed class FakeWorkerProcess(int processId) : IWorkerProcess
{
private readonly TaskCompletionSource _exited = new(TaskCreationOptions.RunContinuationsAsynchronously);
private bool _disposed;
/// <inheritdoc />
public int Id { get; } = processId;
/// <summary>Gets or sets a value indicating whether the process has exited.</summary>
/// <summary>Gets a value indicating whether the process has exited.</summary>
public bool HasExited { get; private set; }
/// <summary>Gets or sets the process exit code.</summary>
/// <summary>Gets the process exit code, or null if the process has not exited.</summary>
public int? ExitCode { get; private set; }
/// <summary>Gets the number of times the Kill method was called.</summary>
@@ -350,17 +360,14 @@ public sealed class SessionWorkerClientFactoryFakeWorkerTests : IAsyncDisposable
/// <inheritdoc />
public ValueTask WaitForExitAsync(CancellationToken cancellationToken)
{
HasExited = true;
ExitCode = 0;
return ValueTask.CompletedTask;
return new ValueTask(_exited.Task.WaitAsync(cancellationToken));
}
/// <inheritdoc />
public void Kill(bool entireProcessTree)
{
KillCount++;
HasExited = true;
ExitCode = -1;
MarkExited(-1);
}
/// <inheritdoc />
@@ -371,5 +378,14 @@ public sealed class SessionWorkerClientFactoryFakeWorkerTests : IAsyncDisposable
/// <summary>Gets a value indicating whether this process has been disposed.</summary>
public bool IsDisposed => _disposed;
/// <summary>Marks the process as exited with the specified exit code.</summary>
/// <param name="exitCode">The process exit code.</param>
public void MarkExited(int exitCode)
{
HasExited = true;
ExitCode = exitCode;
_exited.TrySetResult();
}
}
}
@@ -2,6 +2,7 @@ using MxGateway.Contracts;
using MxGateway.Contracts.Proto;
using MxGateway.Server.Workers;
using MxGateway.Tests.Gateway.Workers.Fakes;
using MxGateway.Tests.TestSupport;
namespace MxGateway.Tests.Gateway.Workers;
@@ -222,16 +223,4 @@ public sealed class FakeWorkerHarnessTests
}
}
/// <summary>Time provider with a manually advanced clock for deterministic timestamp tests.</summary>
private sealed class ManualTimeProvider(DateTimeOffset start) : TimeProvider
{
private DateTimeOffset _now = start;
/// <inheritdoc />
public override DateTimeOffset GetUtcNow() => _now;
/// <summary>Advances the manual clock by the given amount.</summary>
/// <param name="delta">Amount of time to add to the current clock value.</param>
public void Advance(TimeSpan delta) => _now += delta;
}
}
@@ -4,6 +4,7 @@ using MxGateway.Contracts;
using MxGateway.Contracts.Proto;
using MxGateway.Server.Metrics;
using MxGateway.Server.Workers;
using MxGateway.Tests.TestSupport;
namespace MxGateway.Tests.Gateway.Workers;
@@ -616,19 +617,6 @@ public sealed class WorkerClientTests
}
}
/// <summary>Time provider with a manually advanced clock for deterministic timestamp tests.</summary>
private sealed class ManualTimeProvider(DateTimeOffset start) : TimeProvider
{
private DateTimeOffset _now = start;
/// <inheritdoc />
public override DateTimeOffset GetUtcNow() => _now;
/// <summary>Advances the manual clock by the given amount.</summary>
/// <param name="delta">Amount of time to add to the current clock value.</param>
public void Advance(TimeSpan delta) => _now += delta;
}
private sealed class FakeWorkerProcess : IWorkerProcess
{
private readonly TaskCompletionSource _exited = new(TaskCreationOptions.RunContinuationsAsynchronously);
@@ -18,6 +18,7 @@ public sealed class GatewayGrpcScopeResolverTests
[InlineData(typeof(TestConnectionRequest), GatewayScopes.MetadataRead)]
[InlineData(typeof(GetLastDeployTimeRequest), GatewayScopes.MetadataRead)]
[InlineData(typeof(DiscoverHierarchyRequest), GatewayScopes.MetadataRead)]
[InlineData(typeof(WatchDeployEventsRequest), GatewayScopes.MetadataRead)]
public void ResolveRequiredScope_KnownRpcRequest_ReturnsExpectedScope(
Type requestType,
string expectedScope)
@@ -0,0 +1,22 @@
namespace MxGateway.Tests.TestSupport;
/// <summary>
/// <see cref="TimeProvider"/> with a manually advanced clock for deterministic
/// timestamp / heartbeat / lease tests. Tests inject one of these instead of
/// <see cref="TimeProvider.System"/> so timing assertions don't depend on the
/// wall clock. Constructed without arguments (or with <c>default</c>), it seeds
/// from <see cref="DateTimeOffset.UtcNow"/>; for fully deterministic tests, pass
/// an explicit start instant.
/// </summary>
/// <param name="start">Initial clock value. When <c>default</c>, the clock seeds from <see cref="DateTimeOffset.UtcNow"/>.</param>
public sealed class ManualTimeProvider(DateTimeOffset start = default) : TimeProvider
{
private DateTimeOffset _now = start == default ? DateTimeOffset.UtcNow : start;
/// <inheritdoc />
public override DateTimeOffset GetUtcNow() => _now;
/// <summary>Advances the manual clock by the given amount.</summary>
/// <param name="delta">Amount of time to add to the current clock value.</param>
public void Advance(TimeSpan delta) => _now += delta;
}
@@ -442,6 +442,78 @@ public sealed class WorkerPipeSessionTests
await SendShutdownAndWaitAsync(pipePair, runTask, cancellation.Token);
}
/// <summary>
/// Worker-023 regression: the in-flight-command suppression on the
/// <c>StaHung</c> watchdog (Worker-017) is bounded by
/// <c>WorkerPipeSessionOptions.HeartbeatStuckCeiling</c>. A truly
/// stuck synchronous STA command (e.g. a dead MXAccess provider) would
/// otherwise keep <c>CurrentCommandCorrelationId</c> non-empty forever
/// and permanently defeat the watchdog. Once <c>LastStaActivityUtc</c>
/// has been stale for longer than <c>HeartbeatStuckCeiling</c> the
/// watchdog DOES fire <c>StaHung</c> even with a command in flight.
/// </summary>
[Fact]
public async Task RunAsync_WhenStaActivityIsStaleBeyondCeilingWithCommandInFlight_WritesWatchdogFault()
{
using CancellationTokenSource cancellation = new(TimeSpan.FromSeconds(5));
using PipePair pipePair = await PipePair.CreateAsync(cancellation.Token);
FakeRuntimeSession runtime = new();
// Stale by 5s, which exceeds the configured 200 ms ceiling — the
// watchdog must fire even with a command in flight.
runtime.SetSnapshot(new WorkerRuntimeHeartbeatSnapshot(
DateTimeOffset.UtcNow - TimeSpan.FromSeconds(5),
pendingCommandCount: 0,
outboundEventQueueDepth: 0,
lastEventSequence: 0,
currentCommandCorrelationId: "stuck-command"));
WorkerPipeSession session = CreatePipeSession(
pipePair.WorkerStream,
runtime,
new WorkerPipeSessionOptions
{
HeartbeatInterval = TimeSpan.FromMilliseconds(20),
HeartbeatGrace = TimeSpan.FromMilliseconds(50),
HeartbeatStuckCeiling = TimeSpan.FromMilliseconds(200),
});
Task runTask = session.RunAsync(cancellation.Token);
await CompleteGatewayHandshakeAsync(pipePair, cancellation.Token);
WorkerEnvelope fault = await ReadUntilAsync(
pipePair.GatewayReader,
WorkerEnvelope.BodyOneofCase.WorkerFault,
cancellation.Token);
Assert.Equal(WorkerFaultCategory.StaHung, fault.WorkerFault.Category);
Assert.Contains("STA activity is stale", fault.WorkerFault.DiagnosticMessage);
await SendShutdownAndWaitAsync(pipePair, runTask, cancellation.Token);
}
/// <summary>
/// Worker-025 regression: <c>RunAsync</c> must throw a diagnostic
/// exception if the runtime-session factory returns null, rather than
/// deferring the failure to an NRE on the next dereference.
/// </summary>
[Fact]
public async Task RunAsync_WhenRuntimeSessionFactoryReturnsNull_ThrowsDiagnosticException()
{
using CancellationTokenSource cancellation = new(TimeSpan.FromSeconds(5));
using PipePair pipePair = await PipePair.CreateAsync(cancellation.Token);
WorkerFrameProtocolOptions options = CreateOptions();
WorkerPipeSession session = new(
new WorkerFrameReader(pipePair.WorkerStream, options),
new WorkerFrameWriter(pipePair.WorkerStream, options),
options,
() => 1234,
new WorkerPipeSessionOptions(),
() => null!);
InvalidOperationException exception = await Assert.ThrowsAsync<InvalidOperationException>(
() => session.RunAsync(cancellation.Token));
Assert.Contains("factory returned null", exception.Message);
}
/// <summary>
/// Worker-006 regression: when graceful shutdown times out, RunAsync
/// must still dispose the runtime session in its finally block.
@@ -818,15 +890,23 @@ public sealed class WorkerPipeSessionTests
Nonce);
}
// Inbound-envelope sequence numbers below are documentation-only: the
// worker has no inbound monotonicity check, so the literal values do
// not affect dispatch. Each helper exposes a sequence parameter with a
// small default (Hello = 1, Command = 2, Cancel = 2, Shutdown = 3).
// The defaults are not unique, so a multi-frame test that interleaves
// the helpers should pass explicit, monotonically increasing values to
// produce a wire trace that reads in ascending order (see
// Worker.Tests-030).
private static WorkerEnvelope CreateGatewayHelloEnvelope(
string nonce = Nonce,
uint supportedProtocolVersion = GatewayContractInfo.WorkerProtocolVersion)
uint supportedProtocolVersion = GatewayContractInfo.WorkerProtocolVersion,
ulong sequence = 1)
{
return new WorkerEnvelope
{
ProtocolVersion = GatewayContractInfo.WorkerProtocolVersion,
SessionId = SessionId,
Sequence = 1,
Sequence = sequence,
GatewayHello = new GatewayHello
{
SupportedProtocolVersion = supportedProtocolVersion,
@@ -836,13 +916,13 @@ public sealed class WorkerPipeSessionTests
};
}
private static WorkerEnvelope CreateCommandEnvelope(string correlationId)
private static WorkerEnvelope CreateCommandEnvelope(string correlationId, ulong sequence = 2)
{
return new WorkerEnvelope
{
ProtocolVersion = GatewayContractInfo.WorkerProtocolVersion,
SessionId = SessionId,
Sequence = 2,
Sequence = sequence,
CorrelationId = correlationId,
WorkerCommand = new WorkerCommand
{
@@ -859,13 +939,13 @@ public sealed class WorkerPipeSessionTests
};
}
private static WorkerEnvelope CreateCancelEnvelope(string correlationId)
private static WorkerEnvelope CreateCancelEnvelope(string correlationId, ulong sequence = 2)
{
return new WorkerEnvelope
{
ProtocolVersion = GatewayContractInfo.WorkerProtocolVersion,
SessionId = SessionId,
Sequence = 4,
Sequence = sequence,
CorrelationId = correlationId,
WorkerCancel = new WorkerCancel
{
@@ -874,13 +954,13 @@ public sealed class WorkerPipeSessionTests
};
}
private static WorkerEnvelope CreateShutdownEnvelope()
private static WorkerEnvelope CreateShutdownEnvelope(ulong sequence = 3)
{
return new WorkerEnvelope
{
ProtocolVersion = GatewayContractInfo.WorkerProtocolVersion,
SessionId = SessionId,
Sequence = 3,
Sequence = sequence,
WorkerShutdown = new WorkerShutdown
{
GracePeriod = Duration.FromTimeSpan(TimeSpan.FromSeconds(1)),
@@ -189,6 +189,81 @@ public sealed class AlarmCommandHandlerTests
() => handler.Subscribe("x", "y"));
}
/// <summary>
/// Worker-024 regression: every method that touches the underlying
/// <see cref="IMxAccessAlarmConsumer"/> must invoke the configured
/// STA-affinity guard exactly once per call. The guard here only
/// counts invocations; propagation of a throwing guard is covered by
/// <c>EveryCommandPathEntry_PropagatesAffinityGuardException</c>.
/// </summary>
[Fact]
public void EveryCommandPathEntry_InvokesThreadAffinityGuard()
{
FakeConsumer consumer = new FakeConsumer();
int guardInvocations = 0;
AlarmCommandHandler handler = new AlarmCommandHandler(
new MxAccessEventQueue(),
() => consumer,
() => guardInvocations++);
// Assert the running count after every call so that a missed guard is
// pinned to the entry point that skipped it, rather than surfacing as
// a single wrong total at the end of the test.
handler.Subscribe(@"\\HOST\Galaxy!A", "s1");
Assert.Equal(1, guardInvocations);
handler.Acknowledge(Guid.NewGuid(), "c", "u", "n", "d", "F");
Assert.Equal(2, guardInvocations);
handler.AcknowledgeByName("a", "p", "g", "c", "u", "n", "d", "F");
Assert.Equal(3, guardInvocations);
_ = handler.QueryActive(null);
Assert.Equal(4, guardInvocations);
handler.PollOnce();
Assert.Equal(5, guardInvocations);
handler.Unsubscribe();
Assert.Equal(6, guardInvocations);
}
/// <summary>
/// Worker-024 regression: a guard that throws must propagate from
/// every command-path entry point — proving the guard is not
/// swallowed by an inner try/catch.
/// </summary>
[Fact]
public void EveryCommandPathEntry_PropagatesAffinityGuardException()
{
FakeConsumer consumer = new FakeConsumer();
AlarmCommandHandler handler = new AlarmCommandHandler(
new MxAccessEventQueue(),
() => consumer,
threadAffinityCheck: () =>
throw new InvalidOperationException("off-STA"));
// Subscribe: guard runs before the dispatcher is constructed.
Assert.Throws<InvalidOperationException>(
() => handler.Subscribe(@"\\HOST\Galaxy!A", "s1"));
// The remaining entry points reuse the same handler, which never
// subscribed because the guard threw inside Subscribe. Each of them
// invokes the guard on entry, before any dispatcher lookup, so the
// guard's exception must surface from every call.
Assert.Throws<InvalidOperationException>(
() => handler.Acknowledge(Guid.Empty, "", "", "", "", ""));
Assert.Throws<InvalidOperationException>(
() => handler.AcknowledgeByName("", "", "", "", "", "", "", ""));
Assert.Throws<InvalidOperationException>(() => handler.QueryActive(null));
Assert.Throws<InvalidOperationException>(() => handler.PollOnce());
Assert.Throws<InvalidOperationException>(() => handler.Unsubscribe());
}
private static MxAlarmSnapshotRecord NewRecord(string provider, string group, string tag)
{
return new MxAlarmSnapshotRecord
@@ -200,7 +200,7 @@ public sealed class MxAccessStaSessionTests
factory,
eventSink,
new MxAccessEventQueue(),
_eq => handler);
(_eq, _affinity) => handler);
await session.StartAsync("session-1", workerProcessId: 1);
@@ -279,7 +279,7 @@ public sealed class MxAccessStaSessionTests
factory,
eventSink,
new MxAccessEventQueue(),
_eq => handler);
(_eq, _affinity) => handler);
await session.StartAsync("session-1", workerProcessId: 1);
@@ -320,7 +320,7 @@ public sealed class MxAccessStaSessionTests
factory,
eventSink,
new MxAccessEventQueue(),
_eq => handler);
(_eq, _affinity) => handler);
await session.StartAsync("session-1", workerProcessId: 1);
@@ -369,7 +369,7 @@ public sealed class MxAccessStaSessionTests
factory,
eventSink,
eventQueue,
_eq => handler);
(_eq, _affinity) => handler);
await session.StartAsync("session-1", workerProcessId: 1);
@@ -416,7 +416,7 @@ public sealed class MxAccessStaSessionTests
factory,
eventSink,
eventQueue,
_eq => handler);
(_eq, _affinity) => handler);
await session.StartAsync("session-1", workerProcessId: 1);
@@ -11,7 +11,7 @@ using aaAlarmManagedClient;
using ArchestrA.MxAccess;
using Xunit.Abstractions;
namespace MxGateway.Worker.Tests;
namespace MxGateway.Worker.Tests.Probes;
/// <summary>
/// Runtime probe — registers as an AlarmClient consumer with a real
@@ -6,7 +6,7 @@ using MxGateway.Contracts.Proto;
using MxGateway.Worker.MxAccess;
using Xunit.Abstractions;
namespace MxGateway.Worker.Tests;
namespace MxGateway.Worker.Tests.Probes;
/// <summary>
/// Live dev-rig smoke test for the alarms-over-gateway pipeline.
@@ -7,7 +7,7 @@ using System.Threading;
using WNWRAPCONSUMERLib;
using Xunit.Abstractions;
namespace MxGateway.Worker.Tests;
namespace MxGateway.Worker.Tests.Probes;
/// <summary>
/// Runtime probe — instantiate AVEVA's standalone wnwrapConsumer COM
@@ -166,12 +166,34 @@ internal sealed class FakeRuntimeSession : IWorkerRuntimeSession
}
}
private bool cancelCommandReturnValue;
/// <summary>
/// Optional return value yielded by <see cref="CancelCommand"/>.
/// Defaults to <c>false</c> (the runtime had no matching in-flight
/// command), matching the previous test-double behaviour.
/// command), matching the previous test-double behaviour. Mutated
/// and read under <c>lock(gate)</c> to match the locking convention
/// the rest of this fake uses for <c>cancelledCorrelationIds</c>,
/// <c>snapshot</c>, and <c>events</c> (Worker.Tests-027).
/// </summary>
public bool CancelCommandReturnValue { get; set; }
public bool CancelCommandReturnValue
{
get
{
lock (gate)
{
return cancelCommandReturnValue;
}
}
set
{
lock (gate)
{
cancelCommandReturnValue = value;
}
}
}
/// <summary>Cancels command by correlation ID.</summary>
/// <param name="correlationId">The command correlation ID.</param>
@@ -181,9 +203,8 @@ internal sealed class FakeRuntimeSession : IWorkerRuntimeSession
lock (gate)
{
cancelledCorrelationIds.Add(correlationId);
return cancelCommandReturnValue;
}
return CancelCommandReturnValue;
}
/// <summary>Requests graceful shutdown.</summary>
@@ -1,17 +1,18 @@
using System;
using MxGateway.Contracts;
namespace MxGateway.Worker.Tests.TestSupport;
/// <summary>
/// Marks an xUnit test as requiring installed MXAccess COM and live
/// provider state. When the opt-in environment variable
/// <c>MXGATEWAY_RUN_LIVE_MXACCESS_TESTS</c> is not set to <c>1</c>, the
/// test is reported as <c>Skipped</c> by xUnit rather than silently
/// returning early (which xUnit would otherwise report as
/// <c>Passed</c>). Mirrors
/// <c>MxGateway.IntegrationTests.LiveMxAccessFactAttribute</c>; the
/// copy avoids a cross-project reference and keeps the Worker.Tests
/// net48/x86 build self-contained.
/// provider state. When the opt-in environment variable named by
/// <see cref="GatewayContractInfo.LiveMxAccessOptInVariableName"/> is
/// not set to <c>1</c>, the test is reported as <c>Skipped</c> by
/// xUnit rather than silently returning early (which xUnit would
/// otherwise report as <c>Passed</c>). Mirrors
/// <c>MxGateway.IntegrationTests.LiveMxAccessFactAttribute</c>; both
/// copies bind to the same <c>GatewayContractInfo</c> constant so the
/// env-var name has a single literal source of truth (Worker.Tests-025).
/// </summary>
public sealed class LiveMxAccessFactAttribute : FactAttribute
{
@@ -19,8 +20,10 @@ public sealed class LiveMxAccessFactAttribute : FactAttribute
/// The environment variable that opts the suite into running live
/// MXAccess COM tests. Must be set to <c>1</c> on a machine with the
/// installed MXAccess runtime and a reachable Galaxy provider.
/// Sourced from <see cref="GatewayContractInfo.LiveMxAccessOptInVariableName"/>
/// so a single constant gates both Worker.Tests and IntegrationTests.
/// </summary>
public const string LiveMxAccessVariableName = "MXGATEWAY_RUN_LIVE_MXACCESS_TESTS";
public const string LiveMxAccessVariableName = GatewayContractInfo.LiveMxAccessOptInVariableName;
/// <summary>Initializes the attribute, skipping the test unless the env var is set.</summary>
public LiveMxAccessFactAttribute()
@@ -51,7 +51,7 @@ public sealed class WorkerPipeSession
options,
() => Process.GetCurrentProcess().Id,
new WorkerPipeSessionOptions(),
() => new MxAccessStaSession(eq => new AlarmCommandHandler(eq)),
() => new MxAccessStaSession((eq, affinity) => new AlarmCommandHandler(eq, () => new WnWrapAlarmConsumer(), affinity)),
logger)
{
}
@@ -72,7 +72,7 @@ public sealed class WorkerPipeSession
options,
processIdProvider,
new WorkerPipeSessionOptions(),
() => new MxAccessStaSession(eq => new AlarmCommandHandler(eq)),
() => new MxAccessStaSession((eq, affinity) => new AlarmCommandHandler(eq, () => new WnWrapAlarmConsumer(), affinity)),
logger: null)
{
}
@@ -108,7 +108,16 @@ public sealed class WorkerPipeSession
/// <param name="cancellationToken">Token to cancel the asynchronous operation.</param>
public async Task RunAsync(CancellationToken cancellationToken = default)
{
_runtimeSession = _runtimeSessionFactory();
// Worker-025: the factory delegate itself is null-checked in the
// constructor, but its return value is not — a factory that returned
// null would NRE on the StartAsync lambda below. Throw a diagnostic
// exception instead so the failure is unambiguous (and so the
// finally block's _runtimeSession?.Dispose() can't silently no-op
// on a torn half-initialized session). Mirrors the same pattern
// AlarmCommandHandler.Subscribe uses for its consumerFactory().
_runtimeSession = _runtimeSessionFactory()
?? throw new InvalidOperationException(
"Worker runtime session factory returned null.");
try
{
await CompleteStartupHandshakeAsync(
@@ -625,6 +634,18 @@ public sealed class WorkerPipeSession
/// STA (no command in flight and no activity), which is the only case
/// the watchdog can usefully distinguish from a slow command.
/// </summary>
/// <remarks>
/// Worker-023: the in-flight-command suppression is itself bounded by
/// <c>WorkerPipeSessionOptions.HeartbeatStuckCeiling</c>. A truly stuck
/// synchronous COM call (e.g. against a dead MXAccess provider whose
/// cross-apartment marshaler is permanently blocked) leaves
/// <c>CurrentCommandCorrelationId</c> non-empty forever; without an
/// upper bound the worker-side <c>StaHung</c> watchdog would be
/// permanently defeated and only the gateway's per-command timeout
/// would catch the hang. Once <c>LastActivityUtc</c> has been stale
/// for longer than <c>HeartbeatStuckCeiling</c> the watchdog fires
/// <c>StaHung</c> regardless of whether a command is in flight.
/// </remarks>
private async Task ReportWatchdogFaultIfNeededAsync(
WorkerRuntimeHeartbeatSnapshot snapshot,
CancellationToken cancellationToken)
@@ -636,14 +657,22 @@ public sealed class WorkerPipeSession
return;
}
if (!string.IsNullOrEmpty(snapshot.CurrentCommandCorrelationId))
if (!string.IsNullOrEmpty(snapshot.CurrentCommandCorrelationId)
&& staleFor <= _sessionOptions.HeartbeatStuckCeiling)
{
// A command is in flight — the STA is busy executing it, not
// A command is in flight and we are still within the defensive
// suppression ceiling — the STA is busy executing it, not
// hung. The next MarkActivity() in StaRuntime.ProcessQueuedCommands
// will refresh LastActivityUtc once the command returns, at which
// point this branch stops being taken. The heartbeat already
// surfaces the in-flight correlation id so the gateway can apply
// its own per-command timeout if it considers the command too slow.
//
// Worker-023: once staleFor exceeds HeartbeatStuckCeiling we fall
// through to the fault path even with a command in flight — a
// truly stuck synchronous COM call would otherwise keep
// CurrentCommandCorrelationId non-empty indefinitely and the
// worker-side watchdog would never fire.
return;
}
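Review note: the suppression-with-ceiling rule can be read as a pure decision function. The sketch below is an illustration under assumptions, not the method in this diff: `WatchdogSketch.ShouldFault` is a hypothetical name, and the first check (no fault while `staleFor` is within `HeartbeatGrace`) is inferred from the early `return` whose condition is outside the hunk.

```csharp
using System;

TimeSpan grace = TimeSpan.FromSeconds(15);   // DefaultHeartbeatGrace
TimeSpan ceiling = TimeSpan.FromSeconds(75); // DefaultHeartbeatStuckCeiling

// Within grace: never a fault.
Console.WriteLine(WatchdogSketch.ShouldFault(TimeSpan.FromSeconds(10), grace, ceiling, "cmd-1")); // False
// Past grace, command in flight, still under the ceiling: suppressed.
Console.WriteLine(WatchdogSketch.ShouldFault(TimeSpan.FromSeconds(30), grace, ceiling, "cmd-1")); // False
// Past grace with no command in flight: fault.
Console.WriteLine(WatchdogSketch.ShouldFault(TimeSpan.FromSeconds(30), grace, ceiling, ""));      // True
// Past the ceiling: fault even with a command in flight.
Console.WriteLine(WatchdogSketch.ShouldFault(TimeSpan.FromSeconds(80), grace, ceiling, "cmd-1")); // True

public static class WatchdogSketch
{
    public static bool ShouldFault(
        TimeSpan staleFor,
        TimeSpan grace,
        TimeSpan ceiling,
        string currentCommandCorrelationId)
    {
        // Assumed precondition: activity within the grace period is healthy.
        if (staleFor <= grace)
        {
            return false;
        }

        // Same condition as the diff: suppress only while a command is in
        // flight AND the staleness is still at or under the ceiling.
        bool commandInFlight = !string.IsNullOrEmpty(currentCommandCorrelationId);
        if (commandInFlight && staleFor <= ceiling)
        {
            return false;
        }

        return true;
    }
}
```

Note that the comparison against the ceiling is inclusive, so a staleness exactly equal to `HeartbeatStuckCeiling` is still suppressed.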
@@ -837,7 +866,8 @@ public sealed class WorkerPipeSession
// is preserved for the legacy direct-invocation path where the
// parameterless CompleteStartupHandshakeAsync is used without a
// prior factory call.
_runtimeSession ??= new MxAccessStaSession(eq => new AlarmCommandHandler(eq));
_runtimeSession ??= new MxAccessStaSession(
(eq, affinity) => new AlarmCommandHandler(eq, () => new WnWrapAlarmConsumer(), affinity));
IWorkerRuntimeSession session = _runtimeSession;
try
{
@@ -9,12 +9,21 @@ public sealed class WorkerPipeSessionOptions
public static readonly TimeSpan DefaultHeartbeatInterval = TimeSpan.FromSeconds(5);
/// <summary>Default heartbeat grace period (15 seconds).</summary>
public static readonly TimeSpan DefaultHeartbeatGrace = TimeSpan.FromSeconds(15);
/// <summary>
/// Default defensive ceiling beyond which the watchdog fires
/// <see cref="MxGateway.Contracts.Proto.WorkerFaultCategory.StaHung"/>
/// even while a command is in flight (75 seconds = 5 ×
/// <see cref="DefaultHeartbeatGrace"/>). See <see cref="HeartbeatStuckCeiling"/>
/// for the rationale.
/// </summary>
public static readonly TimeSpan DefaultHeartbeatStuckCeiling = TimeSpan.FromSeconds(75);
/// <summary>Initializes a new instance of the WorkerPipeSessionOptions class with default values.</summary>
public WorkerPipeSessionOptions()
{
HeartbeatInterval = DefaultHeartbeatInterval;
HeartbeatGrace = DefaultHeartbeatGrace;
HeartbeatStuckCeiling = DefaultHeartbeatStuckCeiling;
}
/// <summary>Gets or sets the heartbeat interval.</summary>
@@ -23,6 +32,27 @@ public sealed class WorkerPipeSessionOptions
/// <summary>Gets or sets the heartbeat grace period.</summary>
public TimeSpan HeartbeatGrace { get; set; }
/// <summary>
/// Gets or sets the defensive upper bound on how long the watchdog
/// will suppress its <c>StaHung</c> fault while a command is in
/// flight. Worker-017 suppresses the watchdog when the heartbeat
/// snapshot's <c>CurrentCommandCorrelationId</c> is non-empty so a
/// legitimately slow command (e.g. <c>ReadBulk</c> against many
/// uncached tags) does not self-fault — but a truly stuck
/// synchronous COM call against a dead MXAccess provider leaves
/// <c>CurrentCommandCorrelationId</c> non-empty forever and would
/// permanently defeat the watchdog. <c>HeartbeatStuckCeiling</c> is
/// the upper bound on that suppression: once
/// <c>LastStaActivityUtc</c> has been stale for longer than this
/// ceiling, the watchdog DOES fire <c>StaHung</c> even with a
/// command in flight, on the assumption that no legitimate STA
/// command should run that long without periodically refreshing
/// activity. Default is <see cref="DefaultHeartbeatStuckCeiling"/>
/// (75 seconds = 5 × <see cref="DefaultHeartbeatGrace"/>); raise
/// for deployments that run very long bulk operations.
/// </summary>
public TimeSpan HeartbeatStuckCeiling { get; set; }
/// <summary>Validates the session options.</summary>
public void Validate()
{
@@ -39,5 +69,20 @@ public sealed class WorkerPipeSessionOptions
nameof(HeartbeatGrace),
"Worker heartbeat grace must be greater than zero.");
}
if (HeartbeatStuckCeiling <= TimeSpan.Zero)
{
throw new ArgumentOutOfRangeException(
nameof(HeartbeatStuckCeiling),
"Worker heartbeat stuck ceiling must be greater than zero.");
}
if (HeartbeatStuckCeiling <= HeartbeatGrace)
{
throw new ArgumentOutOfRangeException(
nameof(HeartbeatStuckCeiling),
"Worker heartbeat stuck ceiling must be greater than HeartbeatGrace; "
+ "otherwise it would fire before the in-flight-command suppression had any effect.");
}
}
}
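Review note: a short usage sketch of the new validation rule. `OptionsSketch` is a hypothetical, cut-down stand-in for `WorkerPipeSessionOptions` that carries only the two properties involved and a shortened exception message, so the example compiles on its own.

```csharp
using System;

// The values used by the Worker-023 test: ceiling well above grace.
var ok = new OptionsSketch
{
    HeartbeatGrace = TimeSpan.FromMilliseconds(50),
    HeartbeatStuckCeiling = TimeSpan.FromMilliseconds(200),
};
ok.Validate(); // does not throw

// A ceiling equal to the grace period is rejected.
var bad = new OptionsSketch
{
    HeartbeatGrace = TimeSpan.FromSeconds(15),
    HeartbeatStuckCeiling = TimeSpan.FromSeconds(15),
};

try
{
    bad.Validate();
}
catch (ArgumentOutOfRangeException ex)
{
    Console.WriteLine(ex.ParamName); // HeartbeatStuckCeiling
}

// Hypothetical stand-in; defaults match the 15 s / 75 s values in the diff.
public sealed class OptionsSketch
{
    public TimeSpan HeartbeatGrace { get; set; } = TimeSpan.FromSeconds(15);

    public TimeSpan HeartbeatStuckCeiling { get; set; } = TimeSpan.FromSeconds(75);

    public void Validate()
    {
        if (HeartbeatGrace <= TimeSpan.Zero)
        {
            throw new ArgumentOutOfRangeException(
                nameof(HeartbeatGrace),
                "Grace must be greater than zero.");
        }

        if (HeartbeatStuckCeiling <= HeartbeatGrace)
        {
            throw new ArgumentOutOfRangeException(
                nameof(HeartbeatStuckCeiling),
                "Ceiling must be greater than grace.");
        }
    }
}
```

Once the grace period is known to be positive, the `<= HeartbeatGrace` comparison also rejects a zero or negative ceiling; the separate zero check in the diff mainly yields a more specific message.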
@@ -36,12 +36,13 @@ public sealed class AlarmCommandHandler : IAlarmCommandHandler
{
private readonly MxAccessEventQueue eventQueue;
private readonly Func<IMxAccessAlarmConsumer> consumerFactory;
private readonly Action? threadAffinityCheck;
private readonly object syncRoot = new object();
private AlarmDispatcher? dispatcher;
private bool disposed;
public AlarmCommandHandler(MxAccessEventQueue eventQueue)
: this(eventQueue, () => new WnWrapAlarmConsumer())
: this(eventQueue, () => new WnWrapAlarmConsumer(), threadAffinityCheck: null)
{
}
@@ -49,9 +50,32 @@ public sealed class AlarmCommandHandler : IAlarmCommandHandler
public AlarmCommandHandler(
MxAccessEventQueue eventQueue,
Func<IMxAccessAlarmConsumer> consumerFactory)
: this(eventQueue, consumerFactory, threadAffinityCheck: null)
{
}
/// <summary>
/// Worker-024: production constructor that also injects an
/// STA-affinity guard. <paramref name="threadAffinityCheck"/> is
/// invoked at the entry of every method that touches the underlying
/// <see cref="IMxAccessAlarmConsumer"/> (or the wnwrap COM object
/// through it) — <see cref="Subscribe"/>, <see cref="Unsubscribe"/>,
/// <see cref="Acknowledge"/>, <see cref="AcknowledgeByName"/>,
/// <see cref="QueryActive"/>, <see cref="PollOnce"/> — so an
/// off-STA call raises a programming-error diagnostic instead of
/// deadlocking on cross-apartment marshaling to the
/// <c>ThreadingModel=Apartment</c> wnwrap CLSID. The guard is
/// optional: tests that already drive the handler on a single
/// thread can pass <c>null</c>.
/// </summary>
public AlarmCommandHandler(
MxAccessEventQueue eventQueue,
Func<IMxAccessAlarmConsumer> consumerFactory,
Action? threadAffinityCheck)
{
this.eventQueue = eventQueue ?? throw new ArgumentNullException(nameof(eventQueue));
this.consumerFactory = consumerFactory ?? throw new ArgumentNullException(nameof(consumerFactory));
this.threadAffinityCheck = threadAffinityCheck;
}
public bool IsSubscribed
@@ -64,6 +88,7 @@ public sealed class AlarmCommandHandler : IAlarmCommandHandler
{
if (disposed) throw new ObjectDisposedException(nameof(AlarmCommandHandler));
if (subscription is null) throw new ArgumentNullException(nameof(subscription));
threadAffinityCheck?.Invoke();
lock (syncRoot)
{
@@ -94,6 +119,7 @@ public sealed class AlarmCommandHandler : IAlarmCommandHandler
/// <inheritdoc />
public void Unsubscribe()
{
threadAffinityCheck?.Invoke();
AlarmDispatcher? toDispose;
lock (syncRoot)
{
@@ -112,6 +138,7 @@ public sealed class AlarmCommandHandler : IAlarmCommandHandler
string operatorDomain,
string operatorFullName)
{
threadAffinityCheck?.Invoke();
AlarmDispatcher? d = GetDispatcherOrThrow();
return d.Acknowledge(
alarmGuid,
@@ -133,6 +160,7 @@ public sealed class AlarmCommandHandler : IAlarmCommandHandler
string operatorDomain,
string operatorFullName)
{
threadAffinityCheck?.Invoke();
AlarmDispatcher? d = GetDispatcherOrThrow();
return d.AcknowledgeByName(
alarmName ?? string.Empty,
@@ -148,6 +176,7 @@ public sealed class AlarmCommandHandler : IAlarmCommandHandler
/// <inheritdoc />
public IReadOnlyList<ActiveAlarmSnapshot> QueryActive(string? alarmFilterPrefix)
{
threadAffinityCheck?.Invoke();
AlarmDispatcher? d = GetDispatcherOrThrow();
IReadOnlyList<ActiveAlarmSnapshot> all = d.SnapshotActiveAlarms();
if (string.IsNullOrEmpty(alarmFilterPrefix)) return all;
@@ -165,6 +194,7 @@ public sealed class AlarmCommandHandler : IAlarmCommandHandler
/// <inheritdoc />
public void PollOnce()
{
threadAffinityCheck?.Invoke();
AlarmDispatcher? d;
lock (syncRoot) d = dispatcher;
// No-op when not yet subscribed or already disposed.
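Review note: the body of `EnsureOnAlarmConsumerThread` is not part of this diff, so the following is only a sketch of what a thread-affinity guard of this kind could look like, assuming it compares `Environment.CurrentManagedThreadId` against an id captured on the owning thread. All names in the sketch are hypothetical.

```csharp
using System;
using System.Threading;

// Capture the owner thread id once, on the thread that "owns" the resource.
int ownerThreadId = Environment.CurrentManagedThreadId;

// The guard is a plain Action, matching the Action? parameter the handler takes.
Action guard = () =>
{
    int current = Environment.CurrentManagedThreadId;
    if (current != ownerThreadId)
    {
        throw new InvalidOperationException(
            $"Called on thread {current}; expected owner thread {ownerThreadId}.");
    }
};

guard(); // passes: we are on the owner thread

// Invoke the same guard from a second thread and record whether it threw.
bool threwOffThread = false;
var other = new Thread(() =>
{
    try
    {
        guard();
    }
    catch (InvalidOperationException)
    {
        threwOffThread = true;
    }
});
other.Start();
other.Join();

Console.WriteLine(threwOffThread); // True
```

Failing fast with an exception turns an off-thread call into an immediate, attributable error instead of a potential cross-apartment deadlock, which is the motivation the Worker-024 comments give.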
@@ -65,12 +65,23 @@ public sealed class MxAccessSession : IDisposable
/// session methods without touching MXAccess COM. This is exposed via
/// <c>InternalsVisibleTo("MxGateway.Worker.Tests")</c>; production code
/// must use the <see cref="Create"/> factory.
///
/// A runtime guard rejects an <see cref="MxAccessBaseEventSink"/> —
/// the production sink wired by <see cref="Create"/> — because the
/// <c>new object()</c> stand-in this factory uses for the COM object
/// would silently bypass <see cref="System.Runtime.InteropServices.Marshal.IsComObject(object)"/>
/// during disposal and mask lifetime regressions (Worker.Tests-026).
/// </summary>
/// <param name="mxAccessServer">The server abstraction to drive.</param>
/// <param name="eventSink">The event sink to attach to the session.</param>
/// <param name="handleRegistry">Optional handle registry; a fresh one is created when null.</param>
/// <param name="valueCache">Optional value cache; a fresh one is created when null.</param>
/// <param name="creationThreadId">Optional creation thread id; defaults to the current managed thread id.</param>
/// <exception cref="ArgumentException">
/// Thrown when <paramref name="eventSink"/> is the production
/// <see cref="MxAccessBaseEventSink"/>. Tests must pass a test
/// double sink — production code must use <see cref="Create"/>.
/// </exception>
internal static MxAccessSession CreateForTesting(
IMxAccessServer mxAccessServer,
IMxAccessEventSink eventSink,
@@ -78,6 +89,14 @@ public sealed class MxAccessSession : IDisposable
MxAccessValueCache? valueCache = null,
int? creationThreadId = null)
{
if (eventSink is MxAccessBaseEventSink)
{
throw new ArgumentException(
"CreateForTesting must not be used with the production MxAccessBaseEventSink. "
+ "Use MxAccessSession.Create for production code; pass a test-double IMxAccessEventSink here.",
nameof(eventSink));
}
return new MxAccessSession(
new object(),
mxAccessServer,
@@ -17,7 +17,13 @@ public sealed class MxAccessStaSession : IWorkerRuntimeSession
private readonly IMxAccessEventSink eventSink;
private readonly MxAccessEventQueue eventQueue;
private readonly StaRuntime staRuntime;
private readonly Func<MxAccessEventQueue, IAlarmCommandHandler>? alarmCommandHandlerFactory;
// Worker-024: the factory takes an Action so MxAccessStaSession can hand
// the alarm handler its STA-affinity guard (a closure over
// alarmConsumerThreadId captured at the factory call site). The handler
// then invokes the guard at the entry of every method that touches the
// wnwrap consumer, matching the STA-affinity invariant already enforced
// for the poll path via EnsureOnAlarmConsumerThread.
private readonly Func<MxAccessEventQueue, Action, IAlarmCommandHandler>? alarmCommandHandlerFactory;
private StaCommandDispatcher? commandDispatcher;
private MxAccessSession? session;
private IAlarmCommandHandler? alarmCommandHandler;
@@ -44,7 +50,7 @@ public sealed class MxAccessStaSession : IWorkerRuntimeSession
/// <see cref="StartAsync(string, int, CancellationToken)"/>; pass <c>null</c> to opt out
/// of alarm-side commands.
/// </summary>
internal MxAccessStaSession(Func<MxAccessEventQueue, IAlarmCommandHandler>? alarmCommandHandlerFactory)
internal MxAccessStaSession(Func<MxAccessEventQueue, Action, IAlarmCommandHandler>? alarmCommandHandlerFactory)
: this(
new StaRuntime(),
new MxAccessComObjectFactory(),
@@ -96,7 +102,7 @@ public sealed class MxAccessStaSession : IWorkerRuntimeSession
StaRuntime staRuntime,
IMxAccessComObjectFactory factory,
MxAccessEventQueue eventQueue,
Func<MxAccessEventQueue, IAlarmCommandHandler>? alarmCommandHandlerFactory)
Func<MxAccessEventQueue, Action, IAlarmCommandHandler>? alarmCommandHandlerFactory)
: this(staRuntime, factory, new MxAccessBaseEventSink(eventQueue), eventQueue, alarmCommandHandlerFactory)
{
}
@@ -129,7 +135,7 @@ public sealed class MxAccessStaSession : IWorkerRuntimeSession
IMxAccessComObjectFactory factory,
IMxAccessEventSink eventSink,
MxAccessEventQueue eventQueue,
Func<MxAccessEventQueue, IAlarmCommandHandler>? alarmCommandHandlerFactory)
Func<MxAccessEventQueue, Action, IAlarmCommandHandler>? alarmCommandHandlerFactory)
{
this.staRuntime = staRuntime ?? throw new ArgumentNullException(nameof(staRuntime));
this.factory = factory ?? throw new ArgumentNullException(nameof(factory));
@@ -189,7 +195,17 @@ public sealed class MxAccessStaSession : IWorkerRuntimeSession
// thread id; RunAlarmPollLoopAsync then asserts each
// PollOnce executes on the same thread.
alarmConsumerThreadId = Environment.CurrentManagedThreadId;
alarmCommandHandler = alarmCommandHandlerFactory(eventQueue);
// Worker-024: hand the handler an affinity guard so each
// of its command-path entries (Subscribe / Acknowledge /
// AcknowledgeByName / QueryActive / Unsubscribe / PollOnce)
// asserts the same STA-affinity invariant the poll path
// already enforced. Without this the command path relied
// on convention alone; a future refactor that let a
// command run off-STA would silently deadlock on
// cross-apartment marshaling against the wnwrap consumer.
alarmCommandHandler = alarmCommandHandlerFactory(
eventQueue,
EnsureOnAlarmConsumerThread);
}
commandDispatcher = new StaCommandDispatcher(
staRuntime,