mxaccessgw/src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs
Joseph Doherty 1aafd6bde4 Code-review 2026-05-20 sweep #2: re-review at a020350, resolve 48 findings
Second re-review pass at commit a020350 caught 48 new findings — including
one High-severity regression I introduced in the prior sweep — and fixed
them all in one parallel wave.

High (1)
- Client.Python-018: prior sweep set `license = "Proprietary"` in
  pyproject.toml. setuptools >= 77 enforces PEP 639 and rejects the
  string (it must be a valid SPDX expression), so `pip wheel .` and
  `pip install -e .` both fail before any source compiles. Tests
  still pass because pytest bypasses the build backend via
  `pythonpath`. Dropped the invalid license string, kept the
  `License :: Other/Proprietary License` classifier, and added
  `tests/test_packaging.py` so a future regression of the same shape
  is caught in CI.

Mediums (6)
- Worker-023: `HeartbeatStuckCeiling` (default 75s = 5x HeartbeatGrace)
  on WorkerPipeSessionOptions bounds the in-flight-command watchdog
  suppression so a truly stuck COM call still triggers StaHung
  instead of permanently defeating the watchdog.
- Client.Rust-018: reverted Rust's `latencyMs` split so the
  cross-language bench comparison is apples-to-apples again;
  `failureLatencyMs` kept as Rust-only enrichment.
- Client.Java-021: applied Client.Java-002's terminal-state
  serialisation pattern to DeployEventStream so close() arriving
  after queue-overflow can't erase the overflow exception.
- IntegrationTests-017: teardown-parity test now uses a two-window
  stability check after UnAdvise instead of strict equality against
  the pre-UnAdvise count (which raced against in-flight events).
- IntegrationTests-019: new RecordingTestOutputHelper wraps every
  log sink the WriteSecured live test owns (worker stdout/stderr,
  gateway logs, direct WriteLine) so the credential is proven
  absent from the full output buffer, not just the diagnostic
  message.
- Tests-020: added MxAccessGatewayServiceConstraintTests coverage
  for the previously-uncovered Write2Bulk and WriteSecured2Bulk
  arms of WriteBulkConstraintPlan.SetPayload.
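
The two-window stability check described for IntegrationTests-017 can be sketched roughly as below. This is a minimal illustration, not the repository's code: `TwoWindowStability`, `IsStableAsync`, and the simulated late event are hypothetical names, and the window length is an arbitrary choice.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the MxGateway test suite.
public static class TwoWindowStability
{
    // Waits one settle window so events already in flight can drain, samples the
    // counter, waits a second window, and samples again. The counter counts as
    // stable only if nothing new arrived during the second window.
    public static async Task<bool> IsStableAsync(
        Func<int> readCount,
        TimeSpan settleWindow,
        CancellationToken cancellationToken = default)
    {
        await Task.Delay(settleWindow, cancellationToken).ConfigureAwait(false);
        int afterFirstWindow = readCount();

        await Task.Delay(settleWindow, cancellationToken).ConfigureAwait(false);
        int afterSecondWindow = readCount();

        return afterFirstWindow == afterSecondWindow;
    }
}

public static class Program
{
    public static async Task Main()
    {
        int delivered = 0;

        // Simulate one event that was already in flight when the teardown returned.
        Task lateEvent = Task.Run(async () =>
        {
            await Task.Delay(50);
            Interlocked.Increment(ref delivered);
        });

        bool stable = await TwoWindowStability.IsStableAsync(
            () => Volatile.Read(ref delivered),
            TimeSpan.FromMilliseconds(500));

        await lateEvent;
        Console.WriteLine($"stable={stable} delivered={delivered}");
    }
}
```

Comparing the two post-teardown samples, rather than a pre-teardown count against a post-teardown count, tolerates events that were published before the acknowledgement while still failing if the source keeps emitting afterwards.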

Lows (41 — highlights)
- Server: Galaxy glob cache eviction is race-free (Server-024);
  GalaxyRepositoryGrpcService takes IGalaxyRepository (Server-025);
  AlarmsOptions validated at startup (Server-026); Authorization.md
  Constraint Enforcement snippet/prose enumerate the bulk write/read
  family (Server-027); bulk-read-commands and bulk-write-commands
  capability tokens added to OpenSession (Server-029);
  NotWiredAlarmRpcDispatcher XML doc cleaned up and the missing
  scope-resolver and state-machine tests added (023, 028).
- Worker: AlarmCommandHandler now invokes the same STA-affinity
  guard the poll path uses, at every command entry (Worker-024);
  RunAsync null-checks the runtime-session factory result
  (Worker-025).
- Worker.Tests: shared LiveMxAccessOptInVariableName lives on
  GatewayContractInfo (Worker.Tests-025); MxAccessSession.CreateForTesting
  rejects production sinks (Worker.Tests-026); FakeRuntimeSession's
  CancelCommandReturnValue serialised under lock (Worker.Tests-027);
  Probes namespace lifted to MxGateway.Worker.Tests.Probes
  (Worker.Tests-029); cancel-envelope sequence numbers monotonised
  (Worker.Tests-030); docs/GatewayTesting.md gains a "Dev-rig Probes"
  section (Worker.Tests-028).
- Tests: ManualTimeProvider consolidated into one TestSupport/ copy
  (Tests-021); SessionManagerBulkTests adds a mid-flight cancellation
  test backed by a TaskCompletionSource fake (Tests-022); companion
  FakeWorkerProcess.WaitForExitAsync no longer fakes its exit signal
  (Tests-023); constraint plan reply-count divergence pinned
  (Tests-024).
- IntegrationTests: TryGetSession chain carries [MaybeNullWhen(false)]
  end-to-end (IntegrationTests-018); abnormal-exit keyword set
  tightened to pipe-disconnected/end-of-stream and the test now
  asserts streamTask.IsFaulted (020, 021).
- Client.Dotnet: bench commands added to isLongRunning so the
  default 30s wall-clock budget doesn't kill them (015);
  BenchStreamEventsAsync observes the inner stream task on every
  exit path (016).
- Client.Go: parseValue wraps strconv errors with flag context and
  %w (017); bench loops honour ctx.Done() (018); galaxy-watch parses
  RFC3339Nano with fractional seconds (019); runStreamEvents installs
  signal.NotifyContext like runGalaxyWatch (020); five new CLI-level
  table-driven tests cover the bulk/bench subcommands (021).
- Client.Java: toCompletable Javadoc rewritten to match the actual
  cancellation contract Client.Java-015 established (022); stream-events
  text path uses Long.toUnsignedString for worker_sequence (023);
  bench-read-bulk no longer pollutes success-latency histogram with
  failure durations (024); --shutdown-timeout CLI option propagates
  through to ClientOptions (025); seven new MxGatewayCliTests cover
  the bulk and bench commands (026).
- Client.Python: mxgateway_cli ships its own py.typed marker (019);
  wheel-build smoke test added under tests/test_packaging.py (020);
  README documents the Galaxy CLI parity gap explicitly (021).
- Client.Rust: RustClientDesign.md signatures match session.rs and
  document that read_bulk is generic over AsRef<str> (019);
  next_correlation_id re-exported at the crate root, with a
  property-style doc contract and an explicit disclaimer that the
  literal textual format is not part of the contract (020).
- Contracts: BulkWriteResult comment names the actual
  IConstraintEnforcer mechanism instead of "tag-allowlist filter"
  (014); BulkReadResult gains explicit per-arm payload-population
  documentation for the success vs failure cases (015).
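
As a rough illustration of the "TaskCompletionSource fake" idea mentioned for Tests-022, the sketch below shows one way a fake can hold a command in flight until the caller cancels it. `BlockingFakeWorker`, `Started`, and `InvokeAsync` are hypothetical names chosen for this example and are not the repository's types.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical fake for illustration; names do not come from the repository.
public sealed class BlockingFakeWorker
{
    private readonly TaskCompletionSource<bool> _started =
        new(TaskCreationOptions.RunContinuationsAsynchronously);

    // Completes once a command has entered the fake, so a test can cancel mid-flight.
    public Task Started => _started.Task;

    public async Task<int> InvokeAsync(CancellationToken cancellationToken)
    {
        TaskCompletionSource<int> reply =
            new(TaskCreationOptions.RunContinuationsAsynchronously);

        // The reply is never completed normally; cancelling the token cancels it.
        using CancellationTokenRegistration registration =
            cancellationToken.Register(() => reply.TrySetCanceled(cancellationToken));

        _started.TrySetResult(true);
        return await reply.Task.ConfigureAwait(false);
    }
}

public static class Program
{
    public static async Task Main()
    {
        BlockingFakeWorker worker = new();
        using CancellationTokenSource cancellation = new();

        Task<int> inFlight = worker.InvokeAsync(cancellation.Token);
        await worker.Started; // the command is now pending inside the fake
        cancellation.Cancel();

        try
        {
            await inFlight;
            Console.WriteLine("completed");
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("cancelled mid-flight");
        }
    }
}
```

Waiting on `Started` before cancelling is what makes the cancellation mid-flight rather than a pre-cancelled call, which is the distinction such a test is meant to pin down.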

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 10:28:54 -04:00

1585 lines
67 KiB
C#

using System.Collections.Concurrent;
using System.Diagnostics;
using System.Diagnostics.CodeAnalysis;
using System.Text;
using Google.Protobuf.WellKnownTypes;
using Grpc.Core;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using MxGateway.Contracts;
using MxGateway.Contracts.Proto;
using MxGateway.Server.Configuration;
using MxGateway.Server.Grpc;
using MxGateway.Server.Metrics;
using MxGateway.Server.Security.Authentication;
using MxGateway.Server.Security.Authorization;
using MxGateway.Server.Sessions;
using MxGateway.Server.Workers;
using Xunit.Abstractions;

namespace MxGateway.IntegrationTests;

[Collection(LiveResourcesCollection.Name)]
[Trait("Category", "LiveMxAccess")]
public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
{
    private static readonly TimeSpan CommandTimeout = TimeSpan.FromSeconds(15);
    private static readonly TimeSpan StreamShutdownTimeout = TimeSpan.FromSeconds(10);

    /// <summary>
    /// Verifies that a gateway session can register, add item, advise, and stream events from live MXAccess.
    /// </summary>
    [LiveMxAccessFact]
    public async Task GatewaySession_WithLiveWorker_RegistersAdvisesStreamsDataAndCloses()
    {
        string workerExecutablePath = IntegrationTestEnvironment.ResolveLiveMxAccessWorkerExecutablePath();
        Assert.True(
            File.Exists(workerExecutablePath),
            $"Live MXAccess worker executable was not found at {workerExecutablePath}. Build the worker or set {IntegrationTestEnvironment.LiveMxAccessWorkerExecutableVariableName}.");

        TestWorkerProcessFactory processFactory = new(output);
        await using GatewayServiceFixture fixture = new(workerExecutablePath, processFactory, output);
        using RecordingServerStreamWriter<MxEvent> eventWriter = new();
        string? sessionId = null;
        Task? streamTask = null;
        using CancellationTokenSource streamCancellation = new();

        try
        {
            OpenSessionReply openReply = await fixture.Service.OpenSession(
                new OpenSessionRequest
                {
                    ClientSessionName = "live-mxaccess-smoke",
                    ClientCorrelationId = "live-open",
                    CommandTimeout = Duration.FromTimeSpan(CommandTimeout),
                },
                new TestServerCallContext()).ConfigureAwait(false);
            sessionId = openReply.SessionId;
            output.WriteLine($"OpenSession status={openReply.ProtocolStatus.Code} session={sessionId} worker_pid={openReply.WorkerProcessId}");
            Assert.Equal(ProtocolStatusCode.Ok, openReply.ProtocolStatus.Code);
            Assert.True(openReply.WorkerProcessId > 0);

            streamTask = fixture.Service.StreamEvents(
                new StreamEventsRequest { SessionId = sessionId },
                eventWriter,
                new TestServerCallContext(streamCancellation.Token));

            MxCommandReply registerReply = await fixture.Service.Invoke(
                CreateRegisterRequest(sessionId),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("Register", registerReply);
            Assert.Equal(ProtocolStatusCode.Ok, registerReply.ProtocolStatus.Code);
            Assert.True(registerReply.Register.ServerHandle > 0);

            MxCommandReply addItemReply = await fixture.Service.Invoke(
                CreateAddItemRequest(sessionId, registerReply.Register.ServerHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("AddItem", addItemReply);
            Assert.Equal(ProtocolStatusCode.Ok, addItemReply.ProtocolStatus.Code);
            Assert.True(addItemReply.AddItem.ItemHandle > 0);

            MxCommandReply adviseReply = await fixture.Service.Invoke(
                CreateAdviseRequest(
                    sessionId,
                    registerReply.Register.ServerHandle,
                    addItemReply.AddItem.ItemHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("Advise", adviseReply);
            Assert.Equal(ProtocolStatusCode.Ok, adviseReply.ProtocolStatus.Code);

            // A live MXAccess provider can deliver an initial registration-state
            // or bad-quality bootstrap event before the OnDataChange the worker
            // is contracted to emit. Match on the family rather than trusting
            // whatever event arrives first so a genuine ordering defect cannot
            // pass spuriously or leave a later wrong event unverified.
            MxEvent dataChange = await eventWriter
                .WaitForMessageAsync(
                    candidate => candidate.Family == MxEventFamily.OnDataChange,
                    IntegrationTestEnvironment.LiveMxAccessEventTimeout,
                    streamCancellation.Token)
                .ConfigureAwait(false);
            LogEvent(dataChange);
            Assert.Equal(MxEventFamily.OnDataChange, dataChange.Family);
            Assert.Equal(sessionId, dataChange.SessionId);
            Assert.Equal(registerReply.Register.ServerHandle, dataChange.ServerHandle);
            Assert.Equal(addItemReply.AddItem.ItemHandle, dataChange.ItemHandle);
        }
        finally
        {
            await ShutDownAsync(fixture, processFactory, sessionId, streamTask).ConfigureAwait(false);
        }
    }

    /// <summary>
    /// Verifies that a Write command round-trips through live MXAccess against an advised item
    /// and that the worker emits a matching <see cref="MxEventFamily.OnWriteComplete"/> event
    /// — the proof of round-trip the cross-language client e2e runner relies on.
    /// </summary>
    [LiveMxAccessFact]
    public async Task GatewaySession_WithLiveWorker_WritesValueToAdvisedItem()
    {
        string workerExecutablePath = IntegrationTestEnvironment.ResolveLiveMxAccessWorkerExecutablePath();
        Assert.True(
            File.Exists(workerExecutablePath),
            $"Live MXAccess worker executable was not found at {workerExecutablePath}. Build the worker or set {IntegrationTestEnvironment.LiveMxAccessWorkerExecutableVariableName}.");

        TestWorkerProcessFactory processFactory = new(output);
        await using GatewayServiceFixture fixture = new(workerExecutablePath, processFactory, output);
        using RecordingServerStreamWriter<MxEvent> eventWriter = new();
        string? sessionId = null;
        Task? streamTask = null;
        using CancellationTokenSource streamCancellation = new();

        try
        {
            OpenSessionReply openReply = await fixture.Service.OpenSession(
                new OpenSessionRequest
                {
                    ClientSessionName = "live-mxaccess-write",
                    ClientCorrelationId = "live-open-write",
                    CommandTimeout = Duration.FromTimeSpan(CommandTimeout),
                },
                new TestServerCallContext()).ConfigureAwait(false);
            sessionId = openReply.SessionId;
            Assert.Equal(ProtocolStatusCode.Ok, openReply.ProtocolStatus.Code);

            streamTask = fixture.Service.StreamEvents(
                new StreamEventsRequest { SessionId = sessionId },
                eventWriter,
                new TestServerCallContext(streamCancellation.Token));

            MxCommandReply registerReply = await fixture.Service.Invoke(
                CreateRegisterRequest(sessionId),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("Register", registerReply);
            Assert.Equal(ProtocolStatusCode.Ok, registerReply.ProtocolStatus.Code);

            MxCommandReply addItemReply = await fixture.Service.Invoke(
                CreateAddItemRequest(sessionId, registerReply.Register.ServerHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("AddItem", addItemReply);
            Assert.Equal(ProtocolStatusCode.Ok, addItemReply.ProtocolStatus.Code);
            Assert.True(addItemReply.AddItem.ItemHandle > 0);

            MxCommandReply adviseReply = await fixture.Service.Invoke(
                CreateAdviseRequest(
                    sessionId,
                    registerReply.Register.ServerHandle,
                    addItemReply.AddItem.ItemHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("Advise", adviseReply);
            Assert.Equal(ProtocolStatusCode.Ok, adviseReply.ProtocolStatus.Code);

            MxCommandReply writeReply = await fixture.Service.Invoke(
                CreateWriteRequest(
                    sessionId,
                    registerReply.Register.ServerHandle,
                    addItemReply.AddItem.ItemHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("Write", writeReply);

            // Happy-path Write: the worker COM call succeeded so HResultConverter
            // produces ProtocolStatusCode.Ok. An MXAccess rejection (a write to a
            // bad item, a secured-item failure) would surface as
            // ProtocolStatusCode.MxaccessFailure with a non-zero hresult — never
            // as an RpcException / transport fault, because the command still
            // completed its round-trip to the worker and back.
            Assert.Equal(ProtocolStatusCode.Ok, writeReply.ProtocolStatus.Code);
            Assert.Equal(MxCommandKind.Write, writeReply.Kind);

            // Proof of round-trip: MXAccess fires OnWriteComplete (event id 2)
            // after the underlying provider acknowledges the write — that is
            // the event the cross-language client e2e runner asserts on. We
            // scan the recorded stream (so an interleaving OnDataChange does
            // not preempt the match) for an OnWriteComplete carrying the same
            // server/item handles the Write command targeted.
            MxEvent writeComplete = await eventWriter
                .WaitForMessageAsync(
                    candidate => candidate.Family == MxEventFamily.OnWriteComplete
                        && candidate.ServerHandle == registerReply.Register.ServerHandle
                        && candidate.ItemHandle == addItemReply.AddItem.ItemHandle,
                    IntegrationTestEnvironment.LiveMxAccessEventTimeout,
                    streamCancellation.Token)
                .ConfigureAwait(false);
            LogEvent(writeComplete);
            Assert.Equal(MxEventFamily.OnWriteComplete, writeComplete.Family);
            Assert.Equal(sessionId, writeComplete.SessionId);
            Assert.Equal(registerReply.Register.ServerHandle, writeComplete.ServerHandle);
            Assert.Equal(addItemReply.AddItem.ItemHandle, writeComplete.ItemHandle);

            // The stream task must not be in a faulted state. ShutDownAsync's
            // broad catch would otherwise swallow the fault and silently let
            // this Write-parity coverage pass against a broken event pipeline.
            Assert.False(
                streamTask.IsFaulted,
                streamTask.Exception?.ToString() ?? "Event stream task faulted without an exception.");
        }
        finally
        {
            // Cancel the stream call before draining so StreamEvents observes
            // cancellation rather than blocking on the channel. Any unhandled
            // stream-task fault is rethrown from ShutDownAsync into the test.
            streamCancellation.Cancel();
            await ShutDownAsync(fixture, processFactory, sessionId, streamTask, propagateStreamFaults: true).ConfigureAwait(false);
        }
    }

    /// <summary>
    /// Verifies that an AddItem against an invalid server handle surfaces the MXAccess failure
    /// without faulting the gateway transport, exercising the invalid-handle parity path.
    /// </summary>
    [LiveMxAccessFact]
    public async Task GatewaySession_WithLiveWorker_InvalidHandleCommand_SurfacesFailureWithoutTransportFault()
    {
        string workerExecutablePath = IntegrationTestEnvironment.ResolveLiveMxAccessWorkerExecutablePath();
        Assert.True(
            File.Exists(workerExecutablePath),
            $"Live MXAccess worker executable was not found at {workerExecutablePath}. Build the worker or set {IntegrationTestEnvironment.LiveMxAccessWorkerExecutableVariableName}.");

        TestWorkerProcessFactory processFactory = new(output);
        await using GatewayServiceFixture fixture = new(workerExecutablePath, processFactory, output);
        string? sessionId = null;

        try
        {
            OpenSessionReply openReply = await fixture.Service.OpenSession(
                new OpenSessionRequest
                {
                    ClientSessionName = "live-mxaccess-invalid-handle",
                    ClientCorrelationId = "live-open-invalid",
                    CommandTimeout = Duration.FromTimeSpan(CommandTimeout),
                },
                new TestServerCallContext()).ConfigureAwait(false);
            sessionId = openReply.SessionId;
            Assert.Equal(ProtocolStatusCode.Ok, openReply.ProtocolStatus.Code);

            // Deliberately skip Register: server handle 0x7FFFFFFF was never
            // issued by MXAccess. The worker must invoke COM and relay the
            // invalid-handle failure rather than the gateway short-circuiting.
            MxCommandReply addItemReply = await fixture.Service.Invoke(
                CreateAddItemRequest(sessionId, serverHandle: int.MaxValue),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("AddItem(invalid-handle)", addItemReply);

            // MXAccess parity: an invalid handle is an MXAccess-level failure.
            // The command still completed its worker round-trip, so the gateway
            // must reply with ProtocolStatusCode.MxaccessFailure and a non-zero
            // hresult carrying the COM failure (per HResultConverter) — never a
            // gRPC transport fault. The assertion below just checks the status
            // is not Ok; the failure detail lives in hresult / the status proxies.
            Assert.NotEqual(ProtocolStatusCode.Ok, addItemReply.ProtocolStatus.Code);
            Assert.True(
                addItemReply.AddItem is null || addItemReply.AddItem.ItemHandle <= 0,
                "Invalid-handle AddItem must not yield a usable item handle.");
        }
        finally
        {
            await ShutDownAsync(fixture, processFactory, sessionId, streamTask: null).ConfigureAwait(false);
        }
    }

    /// <summary>
    /// Verifies the MXAccess teardown chain: Unadvise then RemoveItem then Unregister
    /// each return <see cref="ProtocolStatusCode.Ok"/>, and the worker stops emitting
    /// OnDataChange events for the un-advised item. Exercises the lifecycle-ordering
    /// parity CLAUDE.md singles out as a "do not synthesize" rule.
    /// </summary>
    [LiveMxAccessFact]
    public async Task GatewaySession_WithLiveWorker_UnadviseRemoveItemUnregister_TeardownOrderingParity()
    {
        string workerExecutablePath = IntegrationTestEnvironment.ResolveLiveMxAccessWorkerExecutablePath();
        Assert.True(
            File.Exists(workerExecutablePath),
            $"Live MXAccess worker executable was not found at {workerExecutablePath}. Build the worker or set {IntegrationTestEnvironment.LiveMxAccessWorkerExecutableVariableName}.");

        TestWorkerProcessFactory processFactory = new(output);
        await using GatewayServiceFixture fixture = new(workerExecutablePath, processFactory, output);
        using RecordingServerStreamWriter<MxEvent> eventWriter = new();
        string? sessionId = null;
        Task? streamTask = null;
        using CancellationTokenSource streamCancellation = new();

        try
        {
            OpenSessionReply openReply = await fixture.Service.OpenSession(
                new OpenSessionRequest
                {
                    ClientSessionName = "live-mxaccess-teardown",
                    ClientCorrelationId = "live-open-teardown",
                    CommandTimeout = Duration.FromTimeSpan(CommandTimeout),
                },
                new TestServerCallContext()).ConfigureAwait(false);
            sessionId = openReply.SessionId;
            Assert.Equal(ProtocolStatusCode.Ok, openReply.ProtocolStatus.Code);

            streamTask = fixture.Service.StreamEvents(
                new StreamEventsRequest { SessionId = sessionId },
                eventWriter,
                new TestServerCallContext(streamCancellation.Token));

            MxCommandReply registerReply = await fixture.Service.Invoke(
                CreateRegisterRequest(sessionId),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("Register", registerReply);
            Assert.Equal(ProtocolStatusCode.Ok, registerReply.ProtocolStatus.Code);
            int serverHandle = registerReply.Register.ServerHandle;

            MxCommandReply addItemReply = await fixture.Service.Invoke(
                CreateAddItemRequest(sessionId, serverHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("AddItem", addItemReply);
            Assert.Equal(ProtocolStatusCode.Ok, addItemReply.ProtocolStatus.Code);
            int itemHandle = addItemReply.AddItem.ItemHandle;

            MxCommandReply adviseReply = await fixture.Service.Invoke(
                CreateAdviseRequest(sessionId, serverHandle, itemHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("Advise", adviseReply);
            Assert.Equal(ProtocolStatusCode.Ok, adviseReply.ProtocolStatus.Code);

            // Wait for an OnDataChange to prove the subscription is live before tearing it down.
            MxEvent firstDataChange = await eventWriter
                .WaitForMessageAsync(
                    candidate => candidate.Family == MxEventFamily.OnDataChange
                        && candidate.ServerHandle == serverHandle
                        && candidate.ItemHandle == itemHandle,
                    IntegrationTestEnvironment.LiveMxAccessEventTimeout,
                    streamCancellation.Token)
                .ConfigureAwait(false);
            LogEvent(firstDataChange);

            // 1) UnAdvise — must reply Ok; the worker must stop emitting OnDataChange
            // for this (server, item) pair after this returns.
            MxCommandReply unadviseReply = await fixture.Service.Invoke(
                CreateUnAdviseRequest(sessionId, serverHandle, itemHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("UnAdvise", unadviseReply);
            Assert.Equal(ProtocolStatusCode.Ok, unadviseReply.ProtocolStatus.Code);
            Assert.Equal(MxCommandKind.UnAdvise, unadviseReply.Kind);

            // 2) RemoveItem — must reply Ok against the same handles.
            MxCommandReply removeItemReply = await fixture.Service.Invoke(
                CreateRemoveItemRequest(sessionId, serverHandle, itemHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("RemoveItem", removeItemReply);
            Assert.Equal(ProtocolStatusCode.Ok, removeItemReply.ProtocolStatus.Code);
            Assert.Equal(MxCommandKind.RemoveItem, removeItemReply.Kind);

            // 3) Unregister — closes the client session inside the worker.
            MxCommandReply unregisterReply = await fixture.Service.Invoke(
                CreateUnregisterRequest(sessionId, serverHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("Unregister", unregisterReply);
            Assert.Equal(ProtocolStatusCode.Ok, unregisterReply.ProtocolStatus.Code);
            Assert.Equal(MxCommandKind.Unregister, unregisterReply.Kind);

            // Parity rule: after UnAdvise returns Ok the worker must stop emitting
            // OnDataChange for this (server, item) pair. Events the provider already
            // published before that ack are in-flight and not a regression — the rule
            // only constrains events generated AFTER the teardown returned. So the
            // "before" baseline is taken *after* a first settle window drains those
            // in-flight events, not before UnAdvise was issued (which races against
            // the round-trip + STA dispatch + pipe send window — see IntegrationTests-017).
            //
            // RecordingServerStreamWriter.Messages returns a snapshot copy under its
            // own lock, so iterating after each settle window is safe without external
            // sync.
            await Task.Delay(TimeSpan.FromMilliseconds(500)).ConfigureAwait(false);
            int dataChangeCountAfterFirstSettle = CountMatchingEvents(
                eventWriter,
                e => e.Family == MxEventFamily.OnDataChange
                    && e.ServerHandle == serverHandle
                    && e.ItemHandle == itemHandle);

            await Task.Delay(TimeSpan.FromMilliseconds(500)).ConfigureAwait(false);
            int dataChangeCountAfterSecondSettle = CountMatchingEvents(
                eventWriter,
                e => e.Family == MxEventFamily.OnDataChange
                    && e.ServerHandle == serverHandle
                    && e.ItemHandle == itemHandle);

            output.WriteLine(
                $"DataChange count after first settle={dataChangeCountAfterFirstSettle} after second settle={dataChangeCountAfterSecondSettle}");
            Assert.Equal(dataChangeCountAfterFirstSettle, dataChangeCountAfterSecondSettle);

            // A RemoveItem against the just-freed item handle must not silently succeed —
            // the worker has to relay MXAccess's invalid-handle response. Closing the
            // session is enough for parity, but we sanity-check that re-using the freed
            // pair does not accidentally appear Ok.
            MxCommandReply secondRemoveItemReply = await fixture.Service.Invoke(
                CreateRemoveItemRequest(sessionId, serverHandle, itemHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReply("RemoveItem(stale)", secondRemoveItemReply);
            Assert.NotEqual(ProtocolStatusCode.Ok, secondRemoveItemReply.ProtocolStatus.Code);
        }
        finally
        {
            streamCancellation.Cancel();
            await ShutDownAsync(fixture, processFactory, sessionId, streamTask).ConfigureAwait(false);
        }
    }

    /// <summary>
    /// Verifies the MXAccess <c>WriteSecured</c> path: <c>AuthenticateUser</c> resolves a
    /// user id, then <c>WriteSecured</c> against the advised item completes its round-trip
    /// to the worker and back. CLAUDE.md singles out <c>WriteSecured</c> ordering as a
    /// parity surface the gateway must not "fix" — the test asserts the reply kind and
    /// protocol status, not a fabricated outcome.
    /// </summary>
    [LiveMxAccessFact]
    public async Task GatewaySession_WithLiveWorker_WriteSecured_AuthenticatedRoundTripParity()
    {
        string workerExecutablePath = IntegrationTestEnvironment.ResolveLiveMxAccessWorkerExecutablePath();
        Assert.True(
            File.Exists(workerExecutablePath),
            $"Live MXAccess worker executable was not found at {workerExecutablePath}. Build the worker or set {IntegrationTestEnvironment.LiveMxAccessWorkerExecutableVariableName}.");

        // IntegrationTests-019: CLAUDE.md's credential-redaction rule covers every log
        // surface the test sees, not just the reply's DiagnosticMessage. Wire a buffering
        // wrapper around output and route the worker stdout/stderr echo and the gateway
        // ILogger sink through it so the post-run assertion covers the accumulated test
        // output. A regression that logged the request body, the WorkerCommandRequest
        // envelope, or printed the credential from inside the worker is caught here
        // even if the bare DiagnosticMessage check still passes.
        RecordingTestOutputHelper recordedOutput = new(output);
        TestWorkerProcessFactory processFactory = new(recordedOutput);
        await using GatewayServiceFixture fixture = new(workerExecutablePath, processFactory, recordedOutput);

        // Stream events so a regression that emitted an OperationComplete or
        // OnWriteComplete with wrong handles would still be observable via the test
        // output (we don't assert a specific event here — the docs note successful
        // writes raise only OnWriteComplete, but WriteSecured against an unprotected
        // item commonly fails with 0x80004021 in this provider, which raises no event).
        using RecordingServerStreamWriter<MxEvent> eventWriter = new();
        string? sessionId = null;
        Task? streamTask = null;
        using CancellationTokenSource streamCancellation = new();
        (string verifyUser, string verifyPassword) = ResolveLiveMxAccessSecuredCredentials();

        try
        {
            OpenSessionReply openReply = await fixture.Service.OpenSession(
                new OpenSessionRequest
                {
                    ClientSessionName = "live-mxaccess-write-secured",
                    ClientCorrelationId = "live-open-write-secured",
                    CommandTimeout = Duration.FromTimeSpan(CommandTimeout),
                },
                new TestServerCallContext()).ConfigureAwait(false);
            sessionId = openReply.SessionId;
            Assert.Equal(ProtocolStatusCode.Ok, openReply.ProtocolStatus.Code);

            streamTask = fixture.Service.StreamEvents(
                new StreamEventsRequest { SessionId = sessionId },
                eventWriter,
                new TestServerCallContext(streamCancellation.Token));

            MxCommandReply registerReply = await fixture.Service.Invoke(
                CreateRegisterRequest(sessionId),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReplyTo(recordedOutput, "Register", registerReply);
            Assert.Equal(ProtocolStatusCode.Ok, registerReply.ProtocolStatus.Code);
            int serverHandle = registerReply.Register.ServerHandle;

            MxCommandReply addItemReply = await fixture.Service.Invoke(
                CreateAddItemRequest(sessionId, serverHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReplyTo(recordedOutput, "AddItem", addItemReply);
            Assert.Equal(ProtocolStatusCode.Ok, addItemReply.ProtocolStatus.Code);
            int itemHandle = addItemReply.AddItem.ItemHandle;

            MxCommandReply adviseReply = await fixture.Service.Invoke(
                CreateAdviseRequest(sessionId, serverHandle, itemHandle),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReplyTo(recordedOutput, "Advise", adviseReply);
            Assert.Equal(ProtocolStatusCode.Ok, adviseReply.ProtocolStatus.Code);

            // AuthenticateUser resolves an ArchestrA user id for the WriteSecured call.
            // Credentials are env-overridable so the test honors the gateway's "do not
            // log secrets" rule and works against either MXAccess's own user store or
            // the LmxOpcUa-baseline GLAuth-bridged ArchestrA identity (admin/admin123).
            MxCommandReply authReply = await fixture.Service.Invoke(
                CreateAuthenticateUserRequest(sessionId, serverHandle, verifyUser, verifyPassword),
                new TestServerCallContext()).ConfigureAwait(false);
            recordedOutput.WriteLine(
                $"AuthenticateUser status={authReply.ProtocolStatus.Code} hresult={authReply.Hresult} user_id={authReply.AuthenticateUser?.UserId}");

            // AuthenticateUser is allowed to fail (the underlying provider may reject
            // the credential pair); we use the returned user id if non-zero and fall
            // back to 0 ("operator only" / no verifier) so the parity assertion holds.
            int currentUserId = authReply.ProtocolStatus.Code == ProtocolStatusCode.Ok
                && authReply.AuthenticateUser is not null
                && authReply.AuthenticateUser.UserId != 0
                    ? authReply.AuthenticateUser.UserId
                    : 0;

            MxCommandReply writeSecuredReply = await fixture.Service.Invoke(
                CreateWriteSecuredRequest(
                    sessionId,
                    serverHandle,
                    itemHandle,
                    currentUserId,
                    verifierUserId: 0),
                new TestServerCallContext()).ConfigureAwait(false);
            LogReplyTo(recordedOutput, "WriteSecured", writeSecuredReply);

            // Parity: the command itself completed its round-trip — the reply kind is
            // WriteSecured and the gateway protocol status is set. The MXAccess outcome
            // (Ok for an unprotected provider, MxaccessFailure with hresult 0x80004021
            // when the item is not WriteSecured-eligible) lives in protocol_status +
            // hresult, never as a transport fault. The diagnostic message must never
            // contain the credential.
            Assert.Equal(MxCommandKind.WriteSecured, writeSecuredReply.Kind);
            Assert.True(
                writeSecuredReply.ProtocolStatus.Code is ProtocolStatusCode.Ok
                    or ProtocolStatusCode.MxaccessFailure,
                $"Unexpected WriteSecured protocol status {writeSecuredReply.ProtocolStatus.Code}.");
            Assert.DoesNotContain(verifyPassword, writeSecuredReply.DiagnosticMessage ?? string.Empty, StringComparison.Ordinal);
        }
        finally
        {
            streamCancellation.Cancel();
            await ShutDownAsync(fixture, processFactory, sessionId, streamTask).ConfigureAwait(false);
        }

        // CLAUDE.md credential contract: passwords and WriteSecured payloads must never
        // reach logs. The buffered output covers the gateway ILogger sink, worker
        // stdout/stderr, and every direct WriteLine the test body issued. A regression
        // that dumped the request envelope, the AuthenticateUserCommand body, or any
        // command-level WriteSecured payload would land here and trip this assertion.
        Assert.DoesNotContain(verifyPassword, recordedOutput.Captured, StringComparison.Ordinal);
    }

    /// <summary>
    /// Verifies that killing the worker process marks the session
    /// <see cref="SessionState.Faulted"/> with a clean fault classification — the gateway
    /// must observe the abnormal exit, transition the session, and surface a non-empty
    /// fault description rather than hanging or crashing.
    /// </summary>
    [LiveMxAccessFact]
    public async Task GatewaySession_WithLiveWorker_AbnormalWorkerExit_MarksSessionFaulted()
    {
        string workerExecutablePath = IntegrationTestEnvironment.ResolveLiveMxAccessWorkerExecutablePath();
        Assert.True(
            File.Exists(workerExecutablePath),
            $"Live MXAccess worker executable was not found at {workerExecutablePath}. Build the worker or set {IntegrationTestEnvironment.LiveMxAccessWorkerExecutableVariableName}.");

        TestWorkerProcessFactory processFactory = new(output);
        await using GatewayServiceFixture fixture = new(workerExecutablePath, processFactory, output);
        using RecordingServerStreamWriter<MxEvent> eventWriter = new();
        string? sessionId = null;
        Task? streamTask = null;
        using CancellationTokenSource streamCancellation = new();

        try
        {
            OpenSessionReply openReply = await fixture.Service.OpenSession(
                new OpenSessionRequest
                {
                    ClientSessionName = "live-mxaccess-abnormal-exit",
                    ClientCorrelationId = "live-open-abnormal",
                    CommandTimeout = Duration.FromTimeSpan(CommandTimeout),
                },
                new TestServerCallContext()).ConfigureAwait(false);
            sessionId = openReply.SessionId;
            Assert.Equal(ProtocolStatusCode.Ok, openReply.ProtocolStatus.Code);

            streamTask = fixture.Service.StreamEvents(
                new StreamEventsRequest { SessionId = sessionId },
                eventWriter,
                new TestServerCallContext(streamCancellation.Token));

            // Kill the worker process directly. WorkerClient's read loop hits an
            // end-of-stream on the named pipe and routes through SetFaulted; the
            // session manager then marks the session Faulted. We avoid CloseSession
            // so the transition is driven by the abnormal exit, not a graceful path.
            processFactory.KillAllAndDetach();

            DateTimeOffset waitDeadline = DateTimeOffset.UtcNow + StreamShutdownTimeout;
            SessionState observedState = SessionState.Unspecified;
            string? observedFault = null;
            while (DateTimeOffset.UtcNow < waitDeadline)
            {
                if (fixture.TryGetSession(sessionId, out GatewaySession? session))
                {
                    observedState = session.State;
                    observedFault = session.FinalFault;
                    if (observedState == SessionState.Faulted)
                    {
                        break;
                    }
                }

                await Task.Delay(TimeSpan.FromMilliseconds(50)).ConfigureAwait(false);
            }

            output.WriteLine($"AbnormalExit observed_state={observedState} fault={observedFault}");
            Assert.Equal(SessionState.Faulted, observedState);
            Assert.False(string.IsNullOrWhiteSpace(observedFault), "Faulted session must carry a non-empty fault description.");

            // The fault classification must come from a known worker-client error code so
            // operators get an actionable cause string rather than an opaque exception
            // trace. We accept the classifications WorkerClient actually drives on an
            // abnormal exit (kill-the-process path): the read loop hits EndOfStream and
            // calls SetFaulted with WorkerClientErrorCode.PipeDisconnected and the
            // message "Worker pipe disconnected." (see WorkerClient.cs:378-381). The
            // earlier broad list (including "worker") matched every WorkerClient fault
            // message (they all begin with "Worker"); tighten to the pipe/disconnect/
            // end-of-stream classifications that match THIS path, so a regression that
            // routed an unrelated fault here would surface as a test failure rather
// than silently passing (see IntegrationTests-020). "heartbeat" is dropped
// because HeartbeatGraceSeconds (15s) exceeds the StreamShutdownTimeout
// (10s) poll window, so a heartbeat-expired transition can never be
// observed inside this test.
Assert.True(
observedFault!.Contains("pipe disconnected", StringComparison.OrdinalIgnoreCase)
|| observedFault.Contains("end of stream", StringComparison.OrdinalIgnoreCase),
$"Fault description '{observedFault}' did not match a known abnormal-exit classification "
+ "(expected 'pipe disconnected' or 'end of stream' from WorkerClient's EndOfStream path).");
// IntegrationTests-021: also assert the StreamEvents call observed the fault
// — the chain that puts the session into Faulted goes through ReadEventsAsync
// propagating a WorkerClientException into EventStreamService, which calls
// session.MarkFaulted. The gateway then maps the WorkerClientException to an
// RpcException at the public boundary (MxAccessGatewayService.MapException →
// MapWorkerClientException). Polling session.State alone would silently pass
// if a future refactor moved MarkFaulted off the stream-consumption path —
// assert the streamTask itself terminated with a fault so the test couples
// to the actual fault-propagation path. Compare to the inverse assertion in
// the Write parity test (line 217: Assert.False(streamTask.IsFaulted, ...)).
try
{
await streamTask.WaitAsync(StreamShutdownTimeout).ConfigureAwait(false);
}
catch (Exception streamException)
{
output.WriteLine($"StreamEvents task terminated with: {streamException.GetType().Name}: {streamException.Message}");
}
Assert.True(
streamTask.IsCompleted,
"StreamEvents task did not complete within the shutdown timeout after the worker was killed.");
Assert.True(
streamTask.IsFaulted,
"StreamEvents task must fault on abnormal worker exit, not complete cleanly — "
+ "the fault-propagation path from WorkerClient.SetFaulted through ReadEventsAsync is the contract.");
}
finally
{
streamCancellation.Cancel();
// sessionId is intentionally passed as null here: the session is already faulted
// and a CloseSession round-trip would just log a cleanup failure. We still wait
// for the worker process exit so the next test starts with a clean state.
await ShutDownAsync(fixture, processFactory, sessionId: null, streamTask).ConfigureAwait(false);
}
}
/// <summary>
/// Closes the session and drains the event stream / worker processes without letting a
/// cleanup timeout mask the original failure from the test body.
/// </summary>
/// <param name="propagateStreamFaults">
/// When <see langword="true"/>, a faulted <paramref name="streamTask"/> is rethrown so the
/// test fails on a silent stream-task exception (the Write parity test relies on this so
/// stream-side defects in event delivery are visible). When <see langword="false"/>, all
/// cleanup exceptions are logged and swallowed so a real test-body assertion failure is not
/// masked by a shutdown timeout (the original IntegrationTests-004 fix).
/// </param>
private async Task ShutDownAsync(
GatewayServiceFixture fixture,
TestWorkerProcessFactory processFactory,
string? sessionId,
Task? streamTask,
bool propagateStreamFaults = false)
{
Exception? streamFault = null;
try
{
if (!string.IsNullOrWhiteSpace(sessionId))
{
await CloseSessionAsync(fixture, sessionId).ConfigureAwait(false);
}
}
catch (Exception ex)
{
output.WriteLine($"Cleanup error during session close: {ex}");
}
if (streamTask is not null)
{
try
{
await streamTask.WaitAsync(StreamShutdownTimeout).ConfigureAwait(false);
}
catch (OperationCanceledException ex)
{
// A linked CancellationToken on the streaming TestServerCallContext is the
// intended way to stop StreamEvents promptly — treat the resulting
// OperationCanceledException as a clean shutdown, not a fault.
output.WriteLine($"Event stream task cancelled during shutdown: {ex.Message}");
}
catch (Exception ex)
{
// Cleanup runs in a finally block. By default a faulted StreamEvents task is
// logged and swallowed so a test-body assertion failure is not masked. When
// the caller opts into propagateStreamFaults (the Write parity test), we
// rethrow the fault after the worker-process wait so a silent stream-side
// defect actually fails the test.
output.WriteLine($"Event stream task faulted during shutdown: {ex}");
if (propagateStreamFaults)
{
streamFault = ex;
}
}
}
try
{
await processFactory.WaitForProcessesAsync(StreamShutdownTimeout).ConfigureAwait(false);
}
catch (Exception ex)
{
output.WriteLine($"Cleanup error while waiting for worker processes to exit: {ex}");
}
if (streamFault is not null)
{
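// Rethrow via ExceptionDispatchInfo so the stream task's original stack trace
// is preserved; a bare "throw streamFault;" would reset it to this frame.
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(streamFault).Throw();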
}
}
private static MxCommandRequest CreateRegisterRequest(string sessionId)
{
return new MxCommandRequest
{
SessionId = sessionId,
ClientCorrelationId = "live-register",
Command = new MxCommand
{
Kind = MxCommandKind.Register,
Register = new RegisterCommand
{
ClientName = IntegrationTestEnvironment.LiveMxAccessClientName,
},
},
};
}
private static MxCommandRequest CreateAddItemRequest(
string sessionId,
int serverHandle)
{
return new MxCommandRequest
{
SessionId = sessionId,
ClientCorrelationId = "live-add-item",
Command = new MxCommand
{
Kind = MxCommandKind.AddItem,
AddItem = new AddItemCommand
{
ServerHandle = serverHandle,
ItemDefinition = IntegrationTestEnvironment.LiveMxAccessItem,
},
},
};
}
private static MxCommandRequest CreateAdviseRequest(
string sessionId,
int serverHandle,
int itemHandle)
{
return new MxCommandRequest
{
SessionId = sessionId,
ClientCorrelationId = "live-advise",
Command = new MxCommand
{
Kind = MxCommandKind.Advise,
Advise = new AdviseCommand
{
ServerHandle = serverHandle,
ItemHandle = itemHandle,
},
},
};
}
private static MxCommandRequest CreateWriteRequest(
string sessionId,
int serverHandle,
int itemHandle)
{
return new MxCommandRequest
{
SessionId = sessionId,
ClientCorrelationId = "live-write",
Command = new MxCommand
{
Kind = MxCommandKind.Write,
Write = new WriteCommand
{
ServerHandle = serverHandle,
ItemHandle = itemHandle,
Value = new MxValue
{
DataType = MxDataType.Integer,
Int32Value = 1,
},
},
},
};
}
private static MxCommandRequest CreateUnAdviseRequest(
string sessionId,
int serverHandle,
int itemHandle)
{
return new MxCommandRequest
{
SessionId = sessionId,
ClientCorrelationId = "live-unadvise",
Command = new MxCommand
{
Kind = MxCommandKind.UnAdvise,
UnAdvise = new UnAdviseCommand
{
ServerHandle = serverHandle,
ItemHandle = itemHandle,
},
},
};
}
private static MxCommandRequest CreateRemoveItemRequest(
string sessionId,
int serverHandle,
int itemHandle)
{
return new MxCommandRequest
{
SessionId = sessionId,
ClientCorrelationId = "live-remove-item",
Command = new MxCommand
{
Kind = MxCommandKind.RemoveItem,
RemoveItem = new RemoveItemCommand
{
ServerHandle = serverHandle,
ItemHandle = itemHandle,
},
},
};
}
private static MxCommandRequest CreateUnregisterRequest(
string sessionId,
int serverHandle)
{
return new MxCommandRequest
{
SessionId = sessionId,
ClientCorrelationId = "live-unregister",
Command = new MxCommand
{
Kind = MxCommandKind.Unregister,
Unregister = new UnregisterCommand
{
ServerHandle = serverHandle,
},
},
};
}
private static MxCommandRequest CreateAuthenticateUserRequest(
string sessionId,
int serverHandle,
string verifyUser,
string verifyPassword)
{
return new MxCommandRequest
{
SessionId = sessionId,
ClientCorrelationId = "live-authenticate-user",
Command = new MxCommand
{
Kind = MxCommandKind.AuthenticateUser,
AuthenticateUser = new AuthenticateUserCommand
{
ServerHandle = serverHandle,
VerifyUser = verifyUser,
VerifyUserPassword = verifyPassword,
},
},
};
}
private static MxCommandRequest CreateWriteSecuredRequest(
string sessionId,
int serverHandle,
int itemHandle,
int currentUserId,
int verifierUserId)
{
return new MxCommandRequest
{
SessionId = sessionId,
ClientCorrelationId = "live-write-secured",
Command = new MxCommand
{
Kind = MxCommandKind.WriteSecured,
WriteSecured = new WriteSecuredCommand
{
ServerHandle = serverHandle,
ItemHandle = itemHandle,
CurrentUserId = currentUserId,
VerifierUserId = verifierUserId,
Value = new MxValue
{
DataType = MxDataType.Integer,
Int32Value = 2,
},
},
},
};
}
private static (string VerifyUser, string VerifyPassword) ResolveLiveMxAccessSecuredCredentials()
{
string verifyUser = Environment.GetEnvironmentVariable("MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_USER")
?? "admin";
string verifyPassword = Environment.GetEnvironmentVariable("MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_PASSWORD")
?? "admin123";
return (verifyUser, verifyPassword);
}
private static int CountMatchingEvents(
RecordingServerStreamWriter<MxEvent> writer,
Func<MxEvent, bool> predicate)
{
int count = 0;
foreach (MxEvent message in writer.Messages)
{
if (predicate(message))
{
count++;
}
}
return count;
}
private async Task CloseSessionAsync(
GatewayServiceFixture fixture,
string sessionId)
{
CloseSessionReply closeReply = await fixture.Service.CloseSession(
new CloseSessionRequest
{
SessionId = sessionId,
ClientCorrelationId = "live-close",
},
new TestServerCallContext()).ConfigureAwait(false);
output.WriteLine($"CloseSession status={closeReply.ProtocolStatus.Code} final_state={closeReply.FinalState}");
}
private void LogReply(
string method,
MxCommandReply reply)
{
LogReplyTo(output, method, reply);
}
private static void LogReplyTo(
ITestOutputHelper sink,
string method,
MxCommandReply reply)
{
sink.WriteLine(
$"{method} status={reply.ProtocolStatus.Code} hresult={reply.Hresult} diagnostic={reply.DiagnosticMessage}");
foreach (MxStatusProxy status in reply.Statuses)
{
sink.WriteLine(
$"{method} mxstatus success={status.Success} category={status.Category} detail={status.Detail} text={status.DiagnosticText}");
}
}
private void LogEvent(MxEvent dataChange)
{
output.WriteLine(
$"Event family={dataChange.Family} worker_sequence={dataChange.WorkerSequence} server_handle={dataChange.ServerHandle} item_handle={dataChange.ItemHandle} quality={dataChange.Quality}");
output.WriteLine(
$"Event value_type={dataChange.Value?.DataType} raw_status={dataChange.RawStatus}");
}
/// <summary>
/// Test fixture that assembles the gateway service with a worker process factory for live MXAccess testing.
/// </summary>
private sealed class GatewayServiceFixture : IAsyncDisposable
{
private readonly GatewayMetrics _metrics = new();
private readonly SessionRegistry _registry = new();
private readonly ILoggerFactory _loggerFactory;
/// <summary>
/// Initializes the fixture with worker executable path, factory, and test output helper.
/// </summary>
/// <param name="workerExecutablePath">Path to the worker process executable.</param>
/// <param name="processFactory">Factory for creating worker processes.</param>
/// <param name="output">Test output helper for logging.</param>
public GatewayServiceFixture(
string workerExecutablePath,
IWorkerProcessFactory processFactory,
ITestOutputHelper output)
{
IOptions<GatewayOptions> options = Options.Create(CreateOptions(workerExecutablePath));
_loggerFactory = LoggerFactory.Create(builder => builder.AddProvider(new TestOutputLoggerProvider(output)));
WorkerProcessLauncher launcher = new(
options,
processFactory,
new WorkerProcessStartedProbe(),
_metrics);
SessionWorkerClientFactory workerClientFactory = new(
launcher,
options,
_metrics,
_loggerFactory);
SessionManager sessionManager = new(
_registry,
workerClientFactory,
options,
_metrics,
logger: _loggerFactory.CreateLogger<SessionManager>());
MxAccessGrpcMapper mapper = new();
EventStreamService eventStreamService = new(
sessionManager,
options,
mapper,
_metrics,
_loggerFactory.CreateLogger<EventStreamService>());
Service = new MxAccessGatewayService(
sessionManager,
new GatewayRequestIdentityAccessor(),
new AllowAllConstraintEnforcer(),
new MxAccessGrpcRequestValidator(),
mapper,
eventStreamService,
_metrics,
_loggerFactory.CreateLogger<MxAccessGatewayService>());
}
/// <summary>
/// The assembled gateway service instance.
/// </summary>
public MxAccessGatewayService Service { get; }
/// <summary>
/// Looks up a session by id directly against the in-process registry. The abnormal
/// worker-exit test needs to observe the session's State / FinalFault as the gateway
/// transitions it to Faulted, which the public gRPC API only exposes indirectly via
/// CloseSession's reply (and not before a graceful close completes).
/// </summary>
public bool TryGetSession(string sessionId, [MaybeNullWhen(false)] out GatewaySession session)
{
return _registry.TryGet(sessionId, out session);
}
/// <summary>
/// Disposes the fixture resources and closes all sessions.
/// </summary>
public async ValueTask DisposeAsync()
{
foreach (GatewaySession session in _registry.Snapshot())
{
await session.DisposeAsync().ConfigureAwait(false);
}
_loggerFactory.Dispose();
_metrics.Dispose();
}
private static GatewayOptions CreateOptions(string workerExecutablePath)
{
return new GatewayOptions
{
Worker = new WorkerOptions
{
ExecutablePath = workerExecutablePath,
StartupTimeoutSeconds = 30,
ShutdownTimeoutSeconds = 15,
HeartbeatIntervalSeconds = 5,
HeartbeatGraceSeconds = 15,
MaxMessageBytes = WorkerFrameProtocolOptions.DefaultMaxMessageBytes,
RequiredArchitecture = WorkerArchitecture.X86,
},
Sessions = new SessionOptions
{
DefaultCommandTimeoutSeconds = 15,
MaxSessions = 1,
},
Events = new EventOptions
{
QueueCapacity = 32,
},
};
}
}
/// <summary>
/// Gathers messages written to a server stream for test inspection.
/// </summary>
private sealed class RecordingServerStreamWriter<T> : IServerStreamWriter<T>, IDisposable
{
private readonly object syncRoot = new();
private readonly List<T> messages = [];
private readonly SemaphoreSlim messageArrived = new(0);
/// <summary>
/// All messages that have been written to the stream.
/// </summary>
public IReadOnlyList<T> Messages
{
get
{
lock (syncRoot)
{
return messages.ToArray();
}
}
}
/// <summary>
/// Write options required by <see cref="IAsyncStreamWriter{T}"/>; stored but not
/// otherwise used by this recorder.
/// </summary>
public WriteOptions? WriteOptions { get; set; }
/// <summary>
/// Records the message and signals any pending waiter.
/// </summary>
/// <param name="message">The message to write.</param>
public Task WriteAsync(T message)
{
lock (syncRoot)
{
messages.Add(message);
}
messageArrived.Release();
return Task.CompletedTask;
}
/// <summary>
/// Waits for the first recorded message that satisfies <paramref name="predicate"/>,
/// up to the specified timeout. Earlier non-matching messages (for example a
/// registration-state bootstrap event) are skipped rather than treated as the result.
/// </summary>
/// <param name="predicate">Filter the awaited message must satisfy.</param>
/// <param name="timeout">The maximum total time to wait.</param>
/// <param name="cancellationToken">
/// Token observed alongside the timeout so a per-test cancellation (for example the
/// gRPC call context's token) aborts the wait promptly instead of hanging until the
/// timeout elapses.
/// </param>
/// <returns>The first message that satisfies the predicate.</returns>
public async Task<T> WaitForMessageAsync(
Func<T, bool> predicate,
TimeSpan timeout,
CancellationToken cancellationToken = default)
{
using CancellationTokenSource timeoutCancellation = new(timeout);
using CancellationTokenSource linkedCancellation =
CancellationTokenSource.CreateLinkedTokenSource(timeoutCancellation.Token, cancellationToken);
int scanned = 0;
while (true)
{
T[] snapshot;
lock (syncRoot)
{
snapshot = messages.ToArray();
}
for (; scanned < snapshot.Length; scanned++)
{
if (predicate(snapshot[scanned]))
{
return snapshot[scanned];
}
}
try
{
await messageArrived.WaitAsync(linkedCancellation.Token).ConfigureAwait(false);
}
catch (OperationCanceledException) when (timeoutCancellation.IsCancellationRequested)
{
throw new TimeoutException(
$"No stream message satisfied the predicate within {timeout}. Recorded {scanned} message(s).");
}
}
}
/// <summary>
/// Releases the wait handle backing <c>messageArrived</c>. The writer owns an
/// <see cref="IDisposable"/> field so it must be disposable itself; the leak
/// is otherwise bounded only by how many opt-in live tests run.
/// </summary>
public void Dispose()
{
messageArrived.Dispose();
}
}
/// <summary>
/// Minimal <see cref="ServerCallContext"/> stub for invoking the gRPC service
/// in-process. It is a hand-written fake with no verification behavior — it
/// only supplies the context values the service reads during a call.
/// </summary>
private sealed class TestServerCallContext(CancellationToken cancellationToken = default) : ServerCallContext
{
private readonly Metadata requestHeaders = [];
private readonly Metadata responseTrailers = [];
private readonly Dictionary<object, object> userState = [];
private Status status;
private WriteOptions? writeOptions;
/// <inheritdoc />
protected override string MethodCore => "/mxaccess_gateway.v1.MxAccessGateway/Test";
/// <inheritdoc />
protected override string HostCore => "localhost";
/// <inheritdoc />
protected override string PeerCore => "ipv4:127.0.0.1:5000";
/// <inheritdoc />
protected override DateTime DeadlineCore => DateTime.UtcNow.AddMinutes(1);
/// <inheritdoc />
protected override Metadata RequestHeadersCore => requestHeaders;
/// <inheritdoc />
protected override CancellationToken CancellationTokenCore => cancellationToken;
/// <inheritdoc />
protected override Metadata ResponseTrailersCore => responseTrailers;
/// <inheritdoc />
protected override Status StatusCore
{
get => status;
set => status = value;
}
/// <inheritdoc />
protected override WriteOptions? WriteOptionsCore
{
get => writeOptions;
set => writeOptions = value;
}
/// <inheritdoc />
protected override AuthContext AuthContextCore { get; } = new(
string.Empty,
new Dictionary<string, List<AuthProperty>>(StringComparer.Ordinal));
/// <inheritdoc />
protected override IDictionary<object, object> UserStateCore => userState;
/// <inheritdoc />
protected override Task WriteResponseHeadersAsyncCore(Metadata responseHeaders)
{
return Task.CompletedTask;
}
/// <inheritdoc />
protected override ContextPropagationToken CreatePropagationTokenCore(
ContextPropagationOptions? options)
{
throw new NotSupportedException();
}
}
/// <summary>
/// Factory that launches worker processes and records their outputs for testing.
/// </summary>
private sealed class TestWorkerProcessFactory(ITestOutputHelper output) : IWorkerProcessFactory
{
private readonly ConcurrentBag<TestWorkerProcess> processes = [];
/// <inheritdoc />
public IWorkerProcess Start(ProcessStartInfo startInfo)
{
startInfo.RedirectStandardError = true;
startInfo.RedirectStandardOutput = true;
startInfo.UseShellExecute = false;
Process process = new()
{
StartInfo = startInfo,
EnableRaisingEvents = true,
};
process.OutputDataReceived += (_, args) => WriteWorkerOutput("stdout", args.Data);
process.ErrorDataReceived += (_, args) => WriteWorkerOutput("stderr", args.Data);
if (!process.Start())
{
process.Dispose();
throw new InvalidOperationException("Worker process failed to start.");
}
process.BeginOutputReadLine();
process.BeginErrorReadLine();
TestWorkerProcess workerProcess = new(process);
processes.Add(workerProcess);
output.WriteLine($"WorkerProcess started pid={workerProcess.Id} path={startInfo.FileName}");
return workerProcess;
}
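/// <summary>
/// Waits for every recorded worker process to exit. The timeout applies to each
/// still-running process individually, not to the loop as a whole.
/// </summary>
/// <param name="timeout">Maximum time to wait for each process that has not yet exited.</param>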
public async Task WaitForProcessesAsync(TimeSpan timeout)
{
foreach (TestWorkerProcess process in processes)
{
if (process.HasExited)
{
output.WriteLine($"WorkerProcess exited pid={process.Id} exit_code={process.ExitCode}");
continue;
}
using CancellationTokenSource timeoutCancellation = new(timeout);
await process.WaitForExitAsync(timeoutCancellation.Token).ConfigureAwait(false);
output.WriteLine($"WorkerProcess exited pid={process.Id} exit_code={process.ExitCode}");
}
}
/// <summary>
/// Kills every recorded worker process tree so the abnormal-exit test can simulate a
/// crashed worker without going through the graceful shutdown handshake. Failures to
/// kill an already-dead process are tolerated.
/// </summary>
public void KillAllAndDetach()
{
foreach (TestWorkerProcess process in processes)
{
if (process.HasExited)
{
continue;
}
try
{
process.Kill(entireProcessTree: true);
output.WriteLine($"WorkerProcess killed pid={process.Id} (abnormal-exit simulation)");
}
catch (InvalidOperationException ex)
{
output.WriteLine($"WorkerProcess kill skipped pid={process.Id}: {ex.Message}");
}
}
}
private void WriteWorkerOutput(
string streamName,
string? line)
{
if (!string.IsNullOrWhiteSpace(line))
{
output.WriteLine($"worker_{streamName}: {line}");
}
}
}
/// <summary>
/// Adapter wrapping a System.Diagnostics.Process as IWorkerProcess for testing.
/// </summary>
private sealed class TestWorkerProcess(Process process) : IWorkerProcess
{
/// <inheritdoc />
public int Id => process.Id;
/// <inheritdoc />
public bool HasExited => process.HasExited;
/// <inheritdoc />
public int? ExitCode => process.HasExited ? process.ExitCode : null;
/// <inheritdoc />
public async ValueTask WaitForExitAsync(CancellationToken cancellationToken)
{
await process.WaitForExitAsync(cancellationToken).ConfigureAwait(false);
}
/// <inheritdoc />
public void Kill(bool entireProcessTree)
{
process.Kill(entireProcessTree);
}
/// <inheritdoc />
public void Dispose()
{
process.Dispose();
}
}
/// <summary>
/// Logger provider that writes all output to the test output helper.
/// </summary>
private sealed class TestOutputLoggerProvider(ITestOutputHelper output) : ILoggerProvider
{
/// <inheritdoc />
public ILogger CreateLogger(string categoryName)
{
return new TestOutputLogger(output, categoryName);
}
/// <inheritdoc />
public void Dispose()
{
}
}
/// <summary>
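/// Logger that writes messages at <see cref="LogLevel.Information"/> and above to the
/// test output helper, followed by the exception text when one is supplied.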
/// </summary>
private sealed class TestOutputLogger(
ITestOutputHelper output,
string categoryName) : ILogger
{
/// <inheritdoc />
public IDisposable? BeginScope<TState>(TState state)
where TState : notnull
{
return null;
}
/// <inheritdoc />
public bool IsEnabled(LogLevel logLevel)
{
return logLevel >= LogLevel.Information;
}
/// <inheritdoc />
public void Log<TState>(
LogLevel logLevel,
EventId eventId,
TState state,
Exception? exception,
Func<TState, Exception?, string> formatter)
{
if (!IsEnabled(logLevel))
{
return;
}
output.WriteLine($"{logLevel} {categoryName}: {formatter(state, exception)}");
if (exception is not null)
{
output.WriteLine(exception.ToString());
}
}
}
/// <summary>
/// Buffering wrapper around an <see cref="ITestOutputHelper"/> that mirrors every line
/// written through it into a <see cref="StringBuilder"/> the test owns. The WriteSecured
/// parity test (IntegrationTests-019) uses this to make CLAUDE.md's "passwords and
/// <c>WriteSecured</c> payloads must never reach logs" rule a property of the entire
/// test output stream — gateway <see cref="ILogger"/> entries (echoed via
/// <see cref="TestOutputLoggerProvider"/>), worker stdout/stderr (echoed via
/// <see cref="TestWorkerProcessFactory.WriteWorkerOutput"/>), and direct
/// <c>output.WriteLine</c> calls all land in the same buffer, so a future maintenance
/// change that prints a credential through any of those channels is caught by the
/// assertion rather than slipping past the existing <c>DiagnosticMessage</c> check.
/// </summary>
private sealed class RecordingTestOutputHelper(ITestOutputHelper inner) : ITestOutputHelper
{
private readonly StringBuilder buffer = new();
private readonly object syncRoot = new();
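/// <summary>
/// Snapshot of everything written through this helper so far, one line per call.
/// </summary>
public string Captured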
{
get
{
lock (syncRoot)
{
return buffer.ToString();
}
}
}
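/// <summary>
/// Appends the message to the buffer, then forwards it to the wrapped helper.
/// </summary>
/// <param name="message">The line to record and forward.</param>
public void WriteLine(string message)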
{
lock (syncRoot)
{
buffer.AppendLine(message);
}
inner.WriteLine(message);
}
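/// <summary>
/// Formats the message with the invariant culture for the buffer, then forwards the
/// original format string and arguments to the wrapped helper.
/// </summary>
/// <param name="format">Composite format string.</param>
/// <param name="args">Arguments substituted into <paramref name="format"/>.</param>
public void WriteLine(string format, params object[] args)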
{
string formatted = string.Format(System.Globalization.CultureInfo.InvariantCulture, format, args);
lock (syncRoot)
{
buffer.AppendLine(formatted);
}
inner.WriteLine(format, args);
}
}
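/// <summary>
/// Constraint enforcer stub that allows every read and write check and records nothing
/// when asked to log a denial.
/// </summary>
private sealed class AllowAllConstraintEnforcer : IConstraintEnforcer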
{
public Task<ConstraintFailure?> CheckReadTagAsync(
ApiKeyIdentity? identity,
string tagAddress,
CancellationToken cancellationToken) => Task.FromResult<ConstraintFailure?>(null);
public Task<ConstraintFailure?> CheckReadHandleAsync(
ApiKeyIdentity? identity,
GatewaySession session,
int serverHandle,
int itemHandle,
CancellationToken cancellationToken) => Task.FromResult<ConstraintFailure?>(null);
public Task<ConstraintFailure?> CheckWriteHandleAsync(
ApiKeyIdentity? identity,
GatewaySession session,
int serverHandle,
int itemHandle,
CancellationToken cancellationToken) => Task.FromResult<ConstraintFailure?>(null);
public Task RecordDenialAsync(
ApiKeyIdentity? identity,
string commandKind,
string target,
ConstraintFailure failure,
CancellationToken cancellationToken) => Task.CompletedTask;
}
}