Compare commits

...

26 Commits

Author SHA1 Message Date
Joseph Doherty 88915c3d9a chore(clients): bump all five clients 0.1.1 -> 0.1.2 for MxSparseArray release 2026-06-18 04:20:54 -04:00
Joseph Doherty e7b8aa6114 feat(client-java): add writeArrayElements default-fill helper and document semantics 2026-06-18 03:32:20 -04:00
Joseph Doherty 8a1f037d5a fix(gateway): resolve array attribute constraints by bare name via [] fallback 2026-06-18 03:25:48 -04:00
Joseph Doherty e328758c53 fix(client-python): regenerate only MxSparseArray delta, drop spurious grpcio version churn
Commit 0702551 regenerated all three protos with grpcio-tools 1.81.1 /
protobuf 6.33.5, stamping every _pb2_grpc.py with GRPC_GENERATED_VERSION
'1.81.1' and every _pb2.py with 'Protobuf Python Version: 6.33.5'.  The
project pins grpcio>=1.80,<2 and the installed runtime is 1.80.0, so the
1.81.1 stamp caused a version error at import time, breaking pytest.

Fix: restore all generated files to the main baseline (grpcio 1.80.0,
protobuf 6.31.1 stamps), then regenerate only mxaccess_gateway.proto
using an isolated venv pinned to grpcio==1.80.0, grpcio-tools==1.80.0,
protobuf==6.31.1.  The net diff vs main is a single-file change to
mxaccess_gateway_pb2.py whose DESCRIPTOR blob now encodes the new
MxSparseArray / MxSparseElement messages and sparse_array_value field.
No _pb2_grpc.py and no galaxy/worker _pb2.py files change.
2026-06-18 03:19:12 -04:00
Joseph Doherty 72cf2f4091 fix(client-rust): correct outer MxValue data_type and add end-to-end write_array_elements test 2026-06-18 03:14:14 -04:00
Joseph Doherty 474b7bd0ff test(client-go): strengthen WriteArrayElements assertions and add README example 2026-06-18 03:12:57 -04:00
Joseph Doherty 437d29f19e docs: correct MxDataType default-table labels and note non-negative sparse indices
Fix the MxSparseArray type-defaults table in gateway.md to use the real MxDataType
enum member names (Integer, not Integer/LongInteger; Time, not Time/Timestamp) and
clarify the int64 note. Add a docstring sentence to write_array_elements in the
Python client noting that indices and total_length must be non-negative.
2026-06-18 03:12:56 -04:00
Joseph Doherty f0ef7ea0a8 feat(gateway): normalize array AddItem suffix and expand sparse writes at the worker boundary 2026-06-18 03:10:13 -04:00
Joseph Doherty 3a8f2bed4e feat(client-rust): add write_array_elements default-fill helper and document semantics
Handles the new MxSparseArray wire type (proto field 19 on MxValue::Kind):
- value.rs: map SparseArrayValue to MxValueProjection::Unset (write-only; never emitted on read path)
- session.rs: add write_array_elements() that builds the sparse proto value and delegates to write()
- tests: three unit tests asserting proto shape, empty-elements case, and read-path Unset projection
- README: document write_array_elements default-fill semantics and bare-name [] normalisation
2026-06-18 03:02:15 -04:00
Joseph Doherty b7f29f3048 feat(client-go): add WriteArrayElements default-fill helper and document semantics
Regenerate Go proto types from mxaccess_gateway.proto so MxSparseArray,
MxSparseElement, and MxValue_SparseArrayValue appear in the generated
package; add MxSparseArray/MxSparseElement type aliases to types.go;
add Session.WriteArrayElements and the unexported buildSparseArrayValue
builder; add three unit tests covering the sparse oneof structure,
empty-map case, and the round-trip through WriteArrayElements; update
README with default-fill reset semantics and auto-normalize note.
2026-06-18 03:01:55 -04:00
Joseph Doherty 0702551c25 feat(client-python): add write_array_elements default-fill helper and document semantics
Regenerate Python proto bindings to pick up MxSparseArray/MxSparseElement/
sparse_array_value from the shared mxaccess_gateway.proto. Add
Session.write_array_elements which builds an MxValue(sparse_array_value=…)
from a {index→scalar} dict and delegates to the existing write(). Add 8 pytest
tests covering builder correctness and full round-trip wire shape. Update
README with a default-fill semantics paragraph and bare-name array-write note.
2026-06-18 03:01:45 -04:00
Joseph Doherty db9c68ca9c docs: document MxSparseArray default-fill writes and bare-name array AddItem 2026-06-18 03:00:19 -04:00
Joseph Doherty 95b5b09a67 feat(client-dotnet): add WriteArrayElementsAsync default-fill helper and document semantics
Adds a public WriteArrayElementsAsync helper on MxGatewaySession that builds
an MxValue{SparseArrayValue} and delegates to the existing WriteAsync. Extracts
the MxValue construction into an internal static BuildSparseArray builder for
unit-testability. Two new tests cover builder output shape and the full write
command path. README documents the reset (not preserve) semantics alongside
the existing whole-array guidance.
2026-06-18 02:59:43 -04:00
Joseph Doherty 627c17fae1 fix(gateway): reject oversized sparse array total_length with InvalidArgument
Guard against proto uint32 total_length values that exceed Array.MaxLength
before casting; the previous checked cast threw OverflowException (gRPC
Internal) instead of the intended InvalidArgument. Adds tests for the new
guard, for the null-value ArgumentNullException path, and removes the
checked keyword (redundant after the guard).
2026-06-18 02:58:03 -04:00
Joseph Doherty 34a99c783b feat(gateway): add SparseArrayExpander for default-fill partial array writes 2026-06-18 02:52:33 -04:00
Joseph Doherty 52cd0da9f5 feat(gateway): add ArrayAddressNormalizer for bare-name array AddItem 2026-06-18 02:51:37 -04:00
Joseph Doherty 8ac9a33d91 feat(contracts): add MxSparseArray write-only value for default-fill partial writes 2026-06-18 02:48:18 -04:00
Joseph Doherty dd35ae1fe6 docs(plans): implementation plan for array write ergonomics + default-fill partial writes 2026-06-18 02:46:09 -04:00
Joseph Doherty 4a6a79d02e docs(plans): design for array write ergonomics and default-fill partial writes 2026-06-18 02:39:00 -04:00
Joseph Doherty 9eedf9d6a9 clients: document supervisory/array-write parity gotchas and add advise-supervisory to all CLIs
A consuming project hit two MXAccess parity surprises: a plain Write only
records its user_id when the item has an active supervisory advise (the path
to take when not authenticating), and array writes replace the whole array
rather than patching individual elements. Document both across the five client
READMEs and gateway.md's compatibility baseline, and expose the missing
advise-supervisory subcommand in the go/python/rust/java CLIs (plus the .NET
help text) so callers can establish the supervisory advise without dropping to
the raw command API.
2026-06-17 20:14:48 -04:00
Joseph Doherty bed647ca2c docs(code-reviews): mark Client.Java + Worker.Tests findings Resolved (windev-verified)
Client.Java-040..048 and Worker.Tests-034/035/036 flipped In Progress -> Resolved
after windev verification:
- Java: gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests -> BUILD SUCCESSFUL
- Worker.Tests: dotnet test -p:Platform=x86 -> 344 passed, 0 failed
All 11 modules now report 0 open findings; README regenerated (--check clean).
2026-06-17 05:33:30 -04:00
Joseph Doherty 8cebe431e1 fix(client-java): repair illegal unicode escape in MxGatewayCliTests comment
Same \u-in-comment issue as the production file (MxGatewayCliTests.java:69).
Reworded to plain ASCII. Scanned all agent-touched Java files; no other stray
unicode-escape sequences remain.
2026-06-17 05:29:43 -04:00
Joseph Doherty bdb7e1439e fix(client-java): repair illegal unicode escape in jsonString comment
The Client.Java-041 escaping fix left a literal \u00XX in a // comment; Java
processes \u unicode escapes even in comments, so it failed to compile on
windev (MxGatewayCli.java:2223 illegal unicode escape). Reworded the comment to
plain ASCII. The escaping code itself (String.format with the doubled-backslash
format string) was already correct.
2026-06-17 05:27:57 -04:00
Joseph Doherty 8df0479b99 fix: resolve Client.Java + Worker.Tests findings (pending windev verification)
Client.Java-040..048, Worker.Tests-034/035/036. Edits applied on the Mac,
which has no JRE and cannot build the x86+MXAccess worker tests; findings are
marked In Progress pending gradle + x86 build verification on windev. Do not
mark Resolved until verified there.
2026-06-17 05:23:14 -04:00
Joseph Doherty 6b5fe6aa82 fix: resolve code-review findings (locally verified)
Server-054/055/056, Contracts-020/021/022, Tests-036/038/039,
IntegrationTests-030/031/032 (+033 deferred to live rig),
Client.Dotnet-026/028/029 (+027 won't-fix), Client.Go-030..034,
Client.Python-032..036, Client.Rust-033..038.

Key fix: SessionEventDistributor orphaned a subscriber that registered after
the pump completed but before disposal (Server-056) -> register paths now
complete late registrants under _lifecycleLock; regression test added. The
racy dashboard-mirror gRPC test made deterministic (Tests-039).

Verified green locally: gateway Tests targeted classes (GatewaySession,
SessionEventDistributor, GatewayOptionsValidator, ProtobufContractRoundTrip,
GatewaySessionDashboardMirror) + dotnet/go/python/rust client suites.
2026-06-17 05:23:14 -04:00
Joseph Doherty 25d04ec37e code-reviews: 2026-06-16 re-review of all 11 modules at 8df5ab3
Re-review of the 99-commit delta since the 410acc9 baseline (session-resilience
epic, dashboard disable-login, galaxy browse fixes, and stillpending §8).

44 new Open findings, no Critical/High:
- Server 2 (incl. Medium design-doc drift), Worker 0 (026/027/028 confirmed
  resolved), Contracts 3, Tests 3, Worker.Tests 3, IntegrationTests 4
- Client.Dotnet 4 (Medium env-var key redaction), Client.Go 5 (Medium watch
  drain), Client.Java 9 (Medium overflow race), Client.Python 5 (Medium README
  API), Client.Rust 6 (Medium --tls/--plaintext downgrade)

README regenerated; regen-readme.py --check passes.
2026-06-16 18:57:56 -04:00
84 changed files with 17101 additions and 1512 deletions
+1 -1
@@ -73,7 +73,7 @@ powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1
- **Style guides** in `docs/style-guides/` are authoritative. Follow `CSharpStyleGuide.md` for gateway/worker/.NET-client code: file-scoped namespaces, `sealed` by default, `Async` suffix on Task-returning methods, MXAccess-aligned names (`MxStatusProxy`, `ServerHandle`, `ItemHandle`, `HResult`).
- **MXAccess parity is the contract.** Don't "fix" surprising MXAccess behavior (e.g., `WriteSecured` failing before a value-bearing NMX body, distinct `OperationComplete` semantics, invalid-handle exceptions) unless the client explicitly opts into a non-parity mode. The installed MXAccess COM component is the baseline.
- **Don't synthesize events.** The gateway forwards only events the worker emits; it never invents `OperationComplete` from write completion or command replies.
-- **One worker per session, one event subscriber per session** (v1). Multi-subscriber fan-out and reconnectable sessions are explicitly out of scope — see `docs/DesignDecisions.md`.
+- **One worker per session** (invariant). Multi-subscriber event fan-out and reconnect-with-replay have shipped and are config-gated: `AllowMultipleEventSubscribers` (default `false`) enables fan-out up to `MaxEventSubscribersPerSession` (default `8`); `DetachGraceSeconds` (default `30`) retains a session after its last subscriber drops so clients can reconnect; `ReplayBufferCapacity` / `ReplayRetentionSeconds` control how much event history the replay ring keeps. Default config preserves the original single-subscriber, no-retention behavior. See `docs/DesignDecisions.md` and `docs/Sessions.md`.
- **Gateway restart does not reattach orphan workers.** The first version terminates orphaned workers on startup; do not design code paths that assume reattachment.
- **No Blazor UI component libraries.** Dashboard uses local Bootstrap CSS/JS only — do not introduce MudBlazor, Radzen, FluentUI, etc.
- **Don't log secrets or full tag values by default.** API keys, passwords, `WriteSecured` payloads, and `AuthenticateUser` credentials must never reach logs. Value logging is opt-in and redacted.
+1 -1
@@ -29,7 +29,7 @@
as a packaged license file instead. -->
<PackageLicenseFile>LICENSE.txt</PackageLicenseFile>
<!-- Versioning: bump per release. Symbols ship as snupkg. -->
-<Version>0.1.1</Version>
+<Version>0.1.2</Version>
<IncludeSymbols>true</IncludeSymbols>
<SymbolPackageFormat>snupkg</SymbolPackageFormat>
<!-- Default: do NOT pack. Each project opts in. -->
+71
@@ -121,6 +121,77 @@ can keep the full `MxCommandReply`, HRESULT, and status array when MXAccess
itself rejects a command. `MxAccessException.Reply` contains the raw generated
reply.
## Write Semantics And Common Pitfalls
These are MXAccess parity behaviors that surprise new callers. The gateway
forwards them unchanged — it does not paper over them.
### Attributing a write to a user without `AuthenticateUser`
MXAccess only stamps a plain `Write`/`Write2` with a Galaxy user id when the
item carries an active *supervisory* advise. If you are **not** using the
verified/secured path (`AuthenticateUser`, then `WriteSecured`/`WriteSecured2`) but
still need the write attributed to a user id, you must first advise the item
supervisory and then pass that user id on the write. Without the supervisory
advise the `userId` on a plain write is ignored.
The library exposes `Advise`/`UnAdvise` as named helpers but not supervisory
advise, so send it through the generic command channel:
```csharp
await session.InvokeAsync(new MxCommandRequest
{
SessionId = session.SessionId,
Command = new MxCommand
{
Kind = MxCommandKind.AdviseSupervisory,
AdviseSupervisory = new AdviseSupervisoryCommand
{
ServerHandle = serverHandle,
ItemHandle = itemHandle,
},
},
});
await session.WriteAsync(serverHandle, itemHandle, value.ToMxValue(), userId);
```
The CLI exposes the same command as `advise-supervisory`, and `write` /
`write2` take `--user-id`.
### Array writes replace the whole array
A write to an array attribute **replaces the entire array**; it is not an
element-wise patch. To change a subset of elements, send the full array with
the unchanged elements included. For example, to change 2 elements of a
20-element array, build the `MxValue` from all 20 values (the 18 unchanged plus
the 2 new ones). Sending only the 2 changed values overwrites the attribute
with a 2-element array.
When only a few indices need changing and the rest should be reset to the
element type's default, use `WriteArrayElementsAsync` instead of building the
full array manually:
```csharp
await session.WriteArrayElementsAsync(
serverHandle, itemHandle,
elementDataType: MxDataType.Integer,
totalLength: 20,
elements: new Dictionary<uint, MxValue>
{
[2] = 42.ToMxValue(),
[7] = 99.ToMxValue(),
});
```
The gateway expands the sparse descriptor into a full `totalLength`-element
array before forwarding to the worker. Indices not listed in `elements` are
written as the element type's default — this is a **reset**, not a preserve;
current values at those positions are discarded. `totalLength` is required and
must match the declared length of the array attribute. Bare-name array items
(`Area001.Pump001.Speed`) are auto-normalized to the `[]` form at `AddItem`
so the array attribute accepts the write.
## CLI Usage
The test CLI supports deterministic JSON output for automation:
@@ -3,6 +3,13 @@ using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
namespace ZB.MOM.WW.MxGateway.Client.Cli;
/// <summary>
/// Transport seam used by the CLI to drive gateway and Galaxy Repository
/// RPCs, exposing only the operations the command surface needs. The
/// production binding is <see cref="MxGatewayCliClientAdapter"/> (wrapping a
/// real <c>MxGatewayClient</c>); tests substitute an in-memory fake so the
/// command routing can be exercised without a live gateway.
/// </summary>
public interface IMxGatewayCliClient : IAsyncDisposable
{
/// <summary>
@@ -110,6 +110,8 @@ public static class MxGatewayClientCli
.ConfigureAwait(false),
"advise" => await AdviseAsync(arguments, client, standardOutput, cancellation.Token)
.ConfigureAwait(false),
"advise-supervisory" => await AdviseSupervisoryAsync(arguments, client, standardOutput, cancellation.Token)
.ConfigureAwait(false),
"subscribe-bulk" => await SubscribeBulkAsync(arguments, client, standardOutput, cancellation.Token)
.ConfigureAwait(false),
"unsubscribe-bulk" => await UnsubscribeBulkAsync(arguments, client, standardOutput, cancellation.Token)
@@ -153,7 +155,10 @@ public static class MxGatewayClientCli
}
catch (Exception exception) when (exception is not OperationCanceledException)
{
-string? apiKey = arguments.GetOptional("api-key");
+// Client.Dotnet-028: redact the *effective* key — from --api-key or the
+// --api-key-env environment variable — so an env-var-sourced key echoed
+// in a transport error never reaches stderr unredacted.
+string? apiKey = TryResolveApiKey(arguments);
string message = MxGatewayCliSecretRedactor.Redact(exception.Message, apiKey);
if (forceJsonErrors || arguments.HasFlag("json"))
@@ -278,6 +283,29 @@ public static class MxGatewayClientCli
}
private static string ResolveApiKey(CliArguments arguments)
{
string? apiKey = TryResolveApiKey(arguments);
if (!string.IsNullOrWhiteSpace(apiKey))
{
return apiKey;
}
string apiKeyEnvironmentName = arguments.GetOptional("api-key-env")
?? "MXGATEWAY_API_KEY";
throw new ArgumentException(
$"Gateway API key is required. Pass --api-key or set {apiKeyEnvironmentName}.");
}
/// <summary>
/// Resolves the effective API key from <c>--api-key</c> or, failing that,
/// the <c>--api-key-env</c>-named environment variable (default
/// <c>MXGATEWAY_API_KEY</c>), returning <see langword="null"/> when neither
/// is set. Unlike <see cref="ResolveApiKey"/> this never throws, so the
/// error-redaction catch block can strip the env-var-sourced key from
/// output (Client.Dotnet-028) without re-raising on the absent-key path.
/// </summary>
private static string? TryResolveApiKey(CliArguments arguments)
{
string? apiKey = arguments.GetOptional("api-key");
if (!string.IsNullOrWhiteSpace(apiKey))
@@ -288,14 +316,7 @@ public static class MxGatewayClientCli
string apiKeyEnvironmentName = arguments.GetOptional("api-key-env")
?? "MXGATEWAY_API_KEY";
-apiKey = Environment.GetEnvironmentVariable(apiKeyEnvironmentName);
-if (!string.IsNullOrWhiteSpace(apiKey))
-{
-return apiKey;
-}
-throw new ArgumentException(
-$"Gateway API key is required. Pass --api-key or set {apiKeyEnvironmentName}.");
+return Environment.GetEnvironmentVariable(apiKeyEnvironmentName);
}
private static CancellationTokenSource CreateCancellation(CliArguments arguments, string command)
@@ -303,7 +324,7 @@ public static class MxGatewayClientCli
var cancellation = new CancellationTokenSource();
// Long-running streaming commands run until Ctrl+C / cancellation by default;
// a caller-supplied --timeout still applies if present.
-bool isLongRunning = command is "galaxy-watch";
+bool isLongRunning = command is "galaxy-watch" or "galaxy-browse";
string? rawTimeout = arguments.GetOptional("timeout");
if (isLongRunning && string.IsNullOrWhiteSpace(rawTimeout))
{
@@ -432,6 +453,28 @@ public static class MxGatewayClientCli
cancellationToken);
}
private static Task<int> AdviseSupervisoryAsync(
CliArguments arguments,
IMxGatewayCliClient client,
TextWriter output,
CancellationToken cancellationToken)
{
return InvokeAndWriteAsync(
arguments,
client,
output,
new MxCommand
{
Kind = MxCommandKind.AdviseSupervisory,
AdviseSupervisory = new AdviseSupervisoryCommand
{
ServerHandle = arguments.GetInt32("server-handle"),
ItemHandle = arguments.GetInt32("item-handle"),
},
},
cancellationToken);
}
private static Task<int> SubscribeBulkAsync(
CliArguments arguments,
IMxGatewayCliClient client,
@@ -2047,6 +2090,7 @@ public static class MxGatewayClientCli
writer.WriteLine("mxgw-dotnet register --session-id <id> --client-name <name> [--json]");
writer.WriteLine("mxgw-dotnet add-item --session-id <id> --server-handle <n> --item <ref> [--json]");
writer.WriteLine("mxgw-dotnet advise --session-id <id> --server-handle <n> --item-handle <n> [--json]");
writer.WriteLine("mxgw-dotnet advise-supervisory --session-id <id> --server-handle <n> --item-handle <n> [--json]");
writer.WriteLine("mxgw-dotnet subscribe-bulk --session-id <id> --server-handle <n> --items <ref,ref> [--json]");
writer.WriteLine("mxgw-dotnet unsubscribe-bulk --session-id <id> --server-handle <n> --item-handles <n,n> [--json]");
writer.WriteLine("mxgw-dotnet read-bulk --session-id <id> --server-handle <n> --items <ref,ref> [--timeout-ms <n>] [--json]");
@@ -106,6 +106,48 @@ public sealed class MxGatewayClientCliTests
Assert.Contains("[redacted]", error.ToString());
}
/// <summary>
/// Client.Dotnet-028: when the API key is sourced from the env var
/// (<c>--api-key-env</c> path, no <c>--api-key</c> flag), the error-redaction
/// catch block must still resolve and redact the effective key. Regression
/// guard for the catch block reverting to <c>GetOptional("api-key")</c> only,
/// which is null on the env-var path and leaves the key unredacted.
/// </summary>
[Fact]
public async Task RunAsync_ErrorOutput_RedactsApiKey_WhenSourcedFromEnvironmentVariable()
{
const string envName = "MXGATEWAY_TEST_API_KEY_028";
const string secret = "env-sourced-secret-key";
string? previousKey = Environment.GetEnvironmentVariable(envName);
Environment.SetEnvironmentVariable(envName, secret);
try
{
using var output = new StringWriter();
using var error = new StringWriter();
int exitCode = await MxGatewayClientCli.RunAsync(
[
"open-session",
"--endpoint",
"http://localhost:5000",
"--api-key-env",
envName,
],
output,
error,
_ => throw new InvalidOperationException($"boom {secret}"));
Assert.Equal(1, exitCode);
Assert.DoesNotContain(secret, error.ToString());
Assert.Contains("[redacted]", error.ToString());
}
finally
{
Environment.SetEnvironmentVariable(envName, previousKey);
}
}
/// <summary>Verifies that stream-events with max-events limit stops output in non-JSON format.</summary>
[Fact]
public async Task RunAsync_StreamEvents_WithMaxEventsStopsNonJsonOutput()
@@ -303,6 +303,69 @@ public sealed class MxGatewayClientSessionTests
Assert.Equal(cancellation.Token, Assert.Single(transport.InvokeCalls).CallOptions.CancellationToken);
}
/// <summary>Verifies that BuildSparseArray produces a SparseArrayValue MxValue with the correct total length and elements.</summary>
[Fact]
public void BuildSparseArray_ProducesSparseArrayValueWithCorrectTotalLengthAndElements()
{
MxValue element0 = 42.ToMxValue();
MxValue element3 = 99.ToMxValue();
Dictionary<uint, MxValue> elements = new()
{
[0u] = element0,
[3u] = element3,
};
MxValue result = MxGatewaySession.BuildSparseArray(MxDataType.Integer, totalLength: 10, elements);
Assert.Equal(MxValue.KindOneofCase.SparseArrayValue, result.KindCase);
Assert.Equal(10u, result.SparseArrayValue.TotalLength);
Assert.Equal(MxDataType.Integer, result.SparseArrayValue.ElementDataType);
Assert.Equal(2, result.SparseArrayValue.Elements.Count);
MxSparseElement el0 = Assert.Single(result.SparseArrayValue.Elements, e => e.Index == 0u);
Assert.Same(element0, el0.Value);
MxSparseElement el3 = Assert.Single(result.SparseArrayValue.Elements, e => e.Index == 3u);
Assert.Same(element3, el3.Value);
}
/// <summary>Verifies that WriteArrayElementsAsync builds a write command whose value is a SparseArrayValue.</summary>
[Fact]
public async Task WriteArrayElementsAsync_BuildsWriteCommandWithSparseArrayValue()
{
FakeGatewayTransport transport = CreateTransport();
transport.AddInvokeReply(new MxCommandReply
{
SessionId = "session-fixture",
Kind = MxCommandKind.Write,
ProtocolStatus = new ProtocolStatus { Code = ProtocolStatusCode.Ok },
});
await using MxGatewayClient client = CreateClient(transport);
MxGatewaySession session = await client.OpenSessionAsync();
Dictionary<uint, MxValue> elements = new() { [1u] = 7.ToMxValue() };
await session.WriteArrayElementsAsync(
serverHandle: 12,
itemHandle: 34,
elementDataType: MxDataType.Integer,
totalLength: 5,
elements: elements,
userId: 56);
MxCommandRequest request = Assert.Single(transport.InvokeCalls).Request;
Assert.Equal(MxCommandKind.Write, request.Command.Kind);
Assert.Equal(12, request.Command.Write.ServerHandle);
Assert.Equal(34, request.Command.Write.ItemHandle);
Assert.Equal(56, request.Command.Write.UserId);
MxValue written = request.Command.Write.Value;
Assert.Equal(MxValue.KindOneofCase.SparseArrayValue, written.KindCase);
Assert.Equal(5u, written.SparseArrayValue.TotalLength);
Assert.Equal(MxDataType.Integer, written.SparseArrayValue.ElementDataType);
MxSparseElement el = Assert.Single(written.SparseArrayValue.Elements);
Assert.Equal(1u, el.Index);
Assert.Equal(7, el.Value.Int32Value);
}
private static MxGatewayClient CreateClient(FakeGatewayTransport transport)
{
return new MxGatewayClient(transport.Options, transport);
@@ -12,6 +12,12 @@ public sealed class LazyBrowseNode
{
private readonly GalaxyRepositoryClient _client;
private readonly BrowseChildrenOptions _options;
// Client.Dotnet-027 (Won't Fix): this gate is used only via WaitAsync/Release and
// never via AvailableWaitHandle, so SemaphoreSlim allocates no kernel wait handle —
// it holds no unmanaged/OS handle to leak. It is pure managed memory whose lifetime
// is the node's, so the type is intentionally not IDisposable: making it disposable
// would push per-node disposal onto every tree consumer (thousands of nodes) for no
// resource benefit.
private readonly SemaphoreSlim _expandLock = new(1, 1);
// Published once, under _expandLock, when expansion completes. Lock-free readers
@@ -687,6 +687,63 @@ public sealed class MxGatewaySession : IAsyncDisposable
reply.EnsureProtocolSuccess().EnsureMxAccessSuccess();
}
/// <summary>
/// Writes specific array indices to an item using default-fill semantics.
/// </summary>
/// <remarks>
/// The gateway expands the sparse descriptor into a full <c>totalLength</c>-element array
/// before forwarding to the worker. Indices not listed in <paramref name="elements"/> are
/// written as the element type's default value — this is a RESET, not a preserve. The
/// current values at those positions are discarded. <paramref name="totalLength"/> is
/// required and must match the declared length of the array attribute.
/// </remarks>
/// <param name="serverHandle">The ServerHandle from register.</param>
/// <param name="itemHandle">The ItemHandle from add-item.</param>
/// <param name="elementDataType">The MXAccess data type of each element.</param>
/// <param name="totalLength">The total declared length of the target array attribute.</param>
/// <param name="elements">Map of zero-based array index to scalar <see cref="MxValue"/>.</param>
/// <param name="userId">User ID context for the write.</param>
/// <param name="cancellationToken">Cancellation token.</param>
public Task WriteArrayElementsAsync(
int serverHandle,
int itemHandle,
MxDataType elementDataType,
uint totalLength,
IReadOnlyDictionary<uint, MxValue> elements,
int userId = 0,
CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(elements);
MxValue value = BuildSparseArray(elementDataType, totalLength, elements);
return WriteAsync(serverHandle, itemHandle, value, userId, cancellationToken);
}
/// <summary>
/// Builds an <see cref="MxValue"/> whose <see cref="MxValue.SparseArrayValue"/> describes a
/// default-fill partial array write.
/// </summary>
/// <param name="elementDataType">The MXAccess data type of each element.</param>
/// <param name="totalLength">The total declared length of the target array attribute.</param>
/// <param name="elements">Map of zero-based array index to scalar <see cref="MxValue"/>.</param>
/// <returns>An <see cref="MxValue"/> with <see cref="MxValue.KindOneofCase.SparseArrayValue"/> set.</returns>
internal static MxValue BuildSparseArray(
MxDataType elementDataType,
uint totalLength,
IReadOnlyDictionary<uint, MxValue> elements)
{
MxSparseArray sparse = new()
{
ElementDataType = elementDataType,
TotalLength = totalLength,
};
foreach (KeyValuePair<uint, MxValue> kv in elements)
{
sparse.Elements.Add(new MxSparseElement { Index = kv.Key, Value = kv.Value });
}
return new MxValue { SparseArrayValue = sparse };
}
/// <summary>
/// Writes a value to an item on the MXAccess server without error checking.
/// </summary>
+126
@@ -99,6 +99,78 @@ call returns a `StreamAlarmsClient`; cancel its context to terminate the
stream. All three pass straight through to the gateway's central alarm
monitor.
## Write Semantics And Common Pitfalls
These are MXAccess parity behaviors that surprise new callers. The gateway
forwards them unchanged — it does not paper over them.
### Attributing a write to a user without `AuthenticateUser`
MXAccess only stamps a plain `Write`/`Write2` with a Galaxy user id when the
item carries an active *supervisory* advise. If you are **not** using the
verified/secured path (`AuthenticateUser`, then `WriteSecured`/`WriteSecured2`) but
still need the write attributed to a user id, you must first advise the item
supervisory and then pass that user id on the write. Without the supervisory
advise the `userID` on a plain write is ignored.
The session exposes `Advise`/`UnAdvise` but not supervisory advise, so send it
through the generic command channel:
```go
_, err := client.Invoke(ctx, &pb.MxCommandRequest{
SessionId: session.ID(),
Command: &pb.MxCommand{
Kind: pb.MxCommandKind_MX_COMMAND_KIND_ADVISE_SUPERVISORY,
Payload: &pb.MxCommand_AdviseSupervisory{
AdviseSupervisory: &pb.AdviseSupervisoryCommand{
ServerHandle: serverHandle,
ItemHandle: itemHandle,
},
},
},
})
err = session.Write(ctx, serverHandle, itemHandle, value, userID)
```
The CLI exposes the same command as `advise-supervisory`, and `write` /
`write2` take `--user-id`.
### Array writes replace the whole array
A write to an array attribute **replaces the entire array**; it is not an
element-wise patch. To change a subset of elements, send the full array with
the unchanged elements included. For example, to change 2 elements of a
20-element array, build the `MxValue` from all 20 values (the 18 unchanged plus
the 2 new ones). Sending only the 2 changed values overwrites the attribute
with a 2-element array.
`Session.WriteArrayElements` offers a default-fill shorthand: pass only the
indices you want to set along with a `totalLength`. The gateway expands the
sparse representation into a full array before forwarding to MXAccess — every
unmentioned index receives the element type's zero value (boolean `false`,
integer `0`, float `0.0`, string `""`, time = Unix epoch). This is a **RESET**
of unmentioned indices, not a preserve of existing values. Use the full-array
form (read-modify-write) when existing element values must be preserved.
```go
// Set element [3] of a 10-element float array; all other indices reset to 0.0.
err = session.WriteArrayElements(
ctx,
serverHandle, itemHandle,
mxgateway.DataTypeFloat,
10, // totalLength
map[uint32]*mxgateway.MxValue{
3: mxgateway.FloatValue(1.5),
},
userID,
)
```
`AddItem` (and `AddItem2`) now auto-normalize a bare attribute name to the `[]`
array address form expected by MXAccess, so callers do not need to append `[]`
themselves. Both forms are accepted; duplicates are deduplicated by the gateway.
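The difference between the default-fill reset and a value-preserving read-modify-write can be illustrated with a small standalone sketch. This is not the gateway's `SparseArrayExpander` and it does not call the client library: `expandSparse` and `patchPreserving` are hypothetical helpers over plain `float64` slices, and rejecting out-of-range indices is an assumption of this sketch rather than documented gateway behavior.

```go
package main

import "fmt"

// expandSparse mimics default-fill semantics: it returns a slice of
// totalLength elements in which the listed indices carry their value and
// every other index holds the zero value. Hypothetical helper, not part
// of the mxgateway client.
func expandSparse(totalLength uint32, elements map[uint32]float64) ([]float64, error) {
	out := make([]float64, totalLength) // zero-filled: unmentioned indices reset to 0.0
	for idx, v := range elements {
		if idx >= totalLength {
			return nil, fmt.Errorf("index %d out of range for length %d", idx, totalLength)
		}
		out[idx] = v
	}
	return out, nil
}

// patchPreserving is the read-modify-write alternative: it copies the
// current array and overwrites only the listed indices, so all other
// values survive. Also hypothetical.
func patchPreserving(current []float64, elements map[uint32]float64) ([]float64, error) {
	out := make([]float64, len(current))
	copy(out, current)
	for idx, v := range elements {
		if int(idx) >= len(current) {
			return nil, fmt.Errorf("index %d out of range for length %d", idx, len(current))
		}
		out[idx] = v
	}
	return out, nil
}

func main() {
	current := []float64{9, 9, 9, 9, 9}
	changes := map[uint32]float64{3: 1.5}

	reset, _ := expandSparse(5, changes)
	kept, _ := patchPreserving(current, changes)

	fmt.Println(reset) // [0 0 0 1.5 0]
	fmt.Println(kept)  // [9 9 9 1.5 9]
}
```

In terms of the API above, `expandSparse` corresponds to what `WriteArrayElements` asks the gateway to do, while `patchPreserving` corresponds to reading the array, editing it locally, and sending the full array with a plain `Write`.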
## Galaxy Repository browse
The `GalaxyRepository` service (proto package `galaxy_repository.v1`) is a
@@ -247,24 +319,78 @@ one line per event in text mode or one JSON object per event with `-json`.
The `mxgw-go` CLI emits JSON with redacted API keys for commands that connect to
the gateway:
### Subcommand reference
Every subcommand wired into the CLI. All accept the common flags
(`-endpoint`, `-plaintext`, `-api-key` / `-api-key-env`, `-ca-cert`,
`-server-name-override`, `-call-timeout`) and most accept `-json`.
| Command | Purpose |
|---|---|
| `version` | Print client/contract versions. |
| `open-session` | Open a gateway session and print its id. |
| `close-session` | Close a session by id. |
| `ping` | Round-trip a `PING` command (`-session-id`, `-message`). |
| `register` | Register a client name on a session (`-session-id`, `-client-name`). |
| `add-item` | Add an item handle (`-session-id`, `-server-handle`, `-item`). |
| `advise` | Advise (subscribe) one item (`-session-id`, `-server-handle`, `-item-handle`). |
| `subscribe-bulk` | Advise many items in one call. |
| `unsubscribe-bulk` | Unadvise many item handles in one call. |
| `read-bulk` | Read snapshots for many item handles in one call. |
| `write` | Write one value (`-type`, `-value`). |
| `write-bulk` | Write many values (`-item-handles`, `-values`, counts must match). |
| `write2-bulk` | `write-bulk` with a shared `-timestamp-value` (RFC 3339). |
| `write-secured-bulk` | Secured bulk write (`-current-user-id`, `-verifier-user-id`). |
| `write-secured2-bulk` | Secured bulk write with a shared timestamp. |
| `bench-read-bulk` | Throughput benchmark (`-duration-seconds`, `-warmup-seconds`, `-bulk-size`). |
| `stream-events` | Stream item-value events for a session (`-session-id`, `-limit`). |
| `stream-alarms` | Stream the alarm feed (`-filter-prefix`, `-limit`). |
| `acknowledge-alarm` | Acknowledge an alarm reference. |
| `smoke` | End-to-end smoke workflow against one item. |
| `galaxy-test-connection` | Probe the Galaxy Repository RPC connection. |
| `galaxy-last-deploy` | Print the most recent deploy event. |
| `galaxy-discover` | Discover deployed objects. |
| `galaxy-watch` | Stream deploy events until Ctrl+C or `-limit`. |
| `galaxy-browse` | Lazy/eager browse of the Galaxy object tree. |
| `batch` | Read commands from stdin (see below). |
```powershell
go run ./cmd/mxgw-go version -json
go run ./cmd/mxgw-go open-session -endpoint localhost:5000 -plaintext -json
go run ./cmd/mxgw-go ping -session-id <id> -plaintext -json
go run ./cmd/mxgw-go register -session-id <id> -client-name mxgw-go -plaintext -json
go run ./cmd/mxgw-go add-item -session-id <id> -server-handle 1 -item Area001.Tag.Value -plaintext -json
go run ./cmd/mxgw-go advise -session-id <id> -server-handle 1 -item-handle 1 -plaintext -json
go run ./cmd/mxgw-go write -session-id <id> -server-handle 1 -item-handle 1 -type int32 -value 123 -plaintext -json
go run ./cmd/mxgw-go write-bulk -session-id <id> -server-handle 1 -item-handles 1,2 -values 10,20 -type int32 -plaintext -json
go run ./cmd/mxgw-go read-bulk -session-id <id> -item-handles 1,2 -plaintext -json
go run ./cmd/mxgw-go stream-events -session-id <id> -plaintext -json
go run ./cmd/mxgw-go stream-alarms -plaintext -json
go run ./cmd/mxgw-go smoke -item Area001.Tag.Value -plaintext -json
go run ./cmd/mxgw-go galaxy-test-connection -plaintext -json
go run ./cmd/mxgw-go galaxy-last-deploy -plaintext -json
go run ./cmd/mxgw-go galaxy-discover -plaintext -json
go run ./cmd/mxgw-go galaxy-watch -plaintext -json
go run ./cmd/mxgw-go galaxy-browse -plaintext -json
```
Use `-api-key-env MXGATEWAY_API_KEY` or `-api-key <key>` when authentication is
enabled. CLI output redacts the key value and never writes the raw secret.
### `batch` mode
`batch` reads one command line at a time from stdin and dispatches each through
the same routing as the standalone subcommands; it is the interface the
cross-language E2E harness drives. After every command's output it writes the
end-of-result sentinel line `__MXGW_BATCH_EOR__` to stdout and flushes, so the
harness can frame each result. Blank/whitespace-only lines are skipped; only
stdin EOF ends the session. Command errors are serialised as a JSON object
(`{"error":...,"type":"error"}`) to stdout (not stderr) and still followed by the
sentinel, so a failing command does not abort the batch. The input scanner
buffer is widened to 16 MiB so a single long command line (e.g. a bulk write with
thousands of handles) does not trip bufio's default 64 KiB token-too-long limit;
a line that still exceeds 16 MiB surfaces as a framed error and ends the session.
Use TLS options for a secured gateway:
```powershell
+62 -7
@@ -21,6 +21,7 @@ import (
"syscall"
"time"
pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated"
"gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/mxgateway"
"google.golang.org/protobuf/encoding/protojson"
"google.golang.org/protobuf/reflect/protoreflect"
@@ -87,6 +88,8 @@ func runWithIO(ctx context.Context, args []string, stdout, stderr io.Writer) err
return runAddItem(ctx, args[1:], stdout, stderr)
case "advise":
return runAdvise(ctx, args[1:], stdout, stderr)
case "advise-supervisory":
return runAdviseSupervisory(ctx, args[1:], stdout, stderr)
case "subscribe-bulk":
return runSubscribeBulk(ctx, args[1:], stdout, stderr)
case "unsubscribe-bulk":
@@ -358,6 +361,43 @@ func runAdvise(ctx context.Context, args []string, stdout, stderr io.Writer) err
return writeCommandOutput(stdout, *jsonOutput, "advise", options, reply, err)
}
func runAdviseSupervisory(ctx context.Context, args []string, stdout, stderr io.Writer) error {
flags := flag.NewFlagSet("advise-supervisory", flag.ContinueOnError)
flags.SetOutput(stderr)
common := bindCommonFlags(flags)
jsonOutput := flags.Bool("json", false, "write JSON output")
sessionID := flags.String("session-id", "", "gateway session id")
serverHandle := flags.Int("server-handle", 0, "MXAccess server handle")
itemHandle := flags.Int("item-handle", 0, "MXAccess item handle")
if err := flags.Parse(args); err != nil {
return err
}
if *sessionID == "" {
return errors.New("session-id is required")
}
client, options, err := dialForCommand(ctx, common)
if err != nil {
return err
}
defer client.Close()
reply, err := client.Invoke(ctx, &pb.MxCommandRequest{
SessionId: *sessionID,
Command: &pb.MxCommand{
Kind: pb.MxCommandKind_MX_COMMAND_KIND_ADVISE_SUPERVISORY,
Payload: &pb.MxCommand_AdviseSupervisory{
AdviseSupervisory: &pb.AdviseSupervisoryCommand{
ServerHandle: int32(*serverHandle),
ItemHandle: int32(*itemHandle),
},
},
},
})
return writeCommandOutput(stdout, *jsonOutput, "advise-supervisory", options, reply, err)
}
func runSubscribeBulk(ctx context.Context, args []string, stdout, stderr io.Writer) error {
flags := flag.NewFlagSet("subscribe-bulk", flag.ContinueOnError)
flags.SetOutput(stderr)
@@ -837,7 +877,14 @@ func runStreamEvents(ctx context.Context, args []string, stdout, stderr io.Write
defer client.Close()
session := mxgateway.NewSessionForID(client, *sessionID)
streamCtx, cancelStream := context.WithCancel(ctx)
// Ctrl+C on a long-running stream-events command cancels the gRPC stream
// cleanly (the gateway sees codes.Canceled rather than a torn TCP
// connection) and the deferred subscription.Close()/client.Close() run.
signalCtx, stopSignals := signal.NotifyContext(ctx, os.Interrupt, syscall.SIGTERM)
defer stopSignals()
streamCtx, cancelStream := context.WithCancel(signalCtx)
defer cancelStream()
subscription, err := session.SubscribeEventsAfter(streamCtx, *after)
if err != nil {
@@ -1035,15 +1082,17 @@ func runSmoke(ctx context.Context, args []string, stdout, stderr io.Writer) erro
}
func closeSmokeSession(ctx context.Context, session *mxgateway.Session, primaryErr error) error {
closeCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
// Compute the close timeout once so a single context (and a single
// deferred cancel) is allocated: default 5s, shortened to the caller's
// remaining deadline when that is sooner.
closeTimeout := 5 * time.Second
if deadline, ok := ctx.Deadline(); ok {
if until := time.Until(deadline); until > 0 && until < 5*time.Second {
cancel()
closeCtx, cancel = context.WithTimeout(context.Background(), until)
defer cancel()
if until := time.Until(deadline); until > 0 && until < closeTimeout {
closeTimeout = until
}
}
closeCtx, cancel := context.WithTimeout(context.Background(), closeTimeout)
defer cancel()
_, closeErr := session.Close(closeCtx)
if primaryErr != nil {
@@ -1490,6 +1539,12 @@ func runGalaxyWatch(ctx context.Context, args []string, stdout, stderr io.Writer
count++
if *limit > 0 && count >= *limit {
cancelStream()
// Drain so the WatchDeployEvents goroutine can exit instead
// of blocking on a send into the buffered events channel
// while the deferred client.Close() tears the stream down
// underneath it (mirrors the signal-cancel branch below).
for range events {
}
return nil
}
case streamErr, ok := <-errs:
+40
@@ -537,3 +537,43 @@ func TestRunBatchHandlesLongCommandLine(t *testing.T) {
t.Fatalf("EOR sentinel count = %d, want 2 (one per command, even when first is too long); out length = %d", count, len(out))
}
}
// TestRunBenchReadBulkRejectsNonPositiveDuration pins the -duration-seconds
// positivity guard so the bench window cannot be configured to zero/negative.
func TestRunBenchReadBulkRejectsNonPositiveDuration(t *testing.T) {
var stdout, stderr bytes.Buffer
err := runWithIO(t.Context(), []string{"bench-read-bulk", "-duration-seconds", "0"}, &stdout, &stderr)
if err == nil || !strings.Contains(err.Error(), "duration-seconds must be positive") {
t.Fatalf("bench-read-bulk -duration-seconds 0 error = %v", err)
}
}
// TestRunStreamEventsRequiresSessionID pins the session-id guard so stream-events
// fails fast before dialing when no session id is supplied.
func TestRunStreamEventsRequiresSessionID(t *testing.T) {
var stdout, stderr bytes.Buffer
err := runWithIO(t.Context(), []string{"stream-events", "-plaintext", "-api-key", "test"}, &stdout, &stderr)
if err == nil || !strings.Contains(err.Error(), "session-id is required") {
t.Fatalf("stream-events without -session-id error = %v", err)
}
}
// TestRunWriteBulkVariantRejectsMismatchedHandlesAndValues pins the len-mismatch
// guard so a write-bulk with unequal item-handles / values counts fails fast
// before any dial.
func TestRunWriteBulkVariantRejectsMismatchedHandlesAndValues(t *testing.T) {
var stdout, stderr bytes.Buffer
err := runWithIO(t.Context(), []string{
"write-bulk",
"-session-id", "s1",
"-server-handle", "1",
"-item-handles", "1,2",
"-values", "10",
"-type", "int32",
"-plaintext",
"-api-key", "test",
}, &stdout, &stderr)
if err == nil || !strings.Contains(err.Error(), "does not match values count") {
t.Fatalf("write-bulk mismatched handles/values error = %v", err)
}
}
File diff suppressed because it is too large
+121
@@ -666,3 +666,124 @@ func authorizationFromContext(ctx context.Context) string {
}
return values[0]
}
// ---------------------------------------------------------------------------
// WriteArrayElements / buildSparseArrayValue unit tests
// ---------------------------------------------------------------------------
func TestBuildSparseArrayValueSetsSparseOneof(t *testing.T) {
elements := map[uint32]*MxValue{
2: Int32Value(99),
0: Int32Value(10),
}
v := buildSparseArrayValue(DataTypeInteger, 5, elements)
sa, ok := v.Kind.(*pb.MxValue_SparseArrayValue)
if !ok {
t.Fatalf("Kind is %T, want *pb.MxValue_SparseArrayValue", v.Kind)
}
got := sa.SparseArrayValue
if got.GetElementDataType() != DataTypeInteger {
t.Errorf("ElementDataType = %v, want DataTypeInteger", got.GetElementDataType())
}
if got.GetTotalLength() != 5 {
t.Errorf("TotalLength = %d, want 5", got.GetTotalLength())
}
if len(got.GetElements()) != 2 {
t.Fatalf("len(Elements) = %d, want 2", len(got.GetElements()))
}
// Elements must be sorted by index (ascending).
if got.GetElements()[0].GetIndex() != 0 {
t.Errorf("Elements[0].Index = %d, want 0", got.GetElements()[0].GetIndex())
}
if got.GetElements()[0].GetValue().GetInt32Value() != 10 {
t.Errorf("Elements[0].Value = %v, want 10", got.GetElements()[0].GetValue())
}
if got.GetElements()[1].GetIndex() != 2 {
t.Errorf("Elements[1].Index = %d, want 2", got.GetElements()[1].GetIndex())
}
if got.GetElements()[1].GetValue().GetInt32Value() != 99 {
t.Errorf("Elements[1].Value = %v, want 99", got.GetElements()[1].GetValue())
}
}
func TestBuildSparseArrayValueEmptyMapProducesEmptyElements(t *testing.T) {
v := buildSparseArrayValue(DataTypeBoolean, 4, map[uint32]*MxValue{})
sa, ok := v.Kind.(*pb.MxValue_SparseArrayValue)
if !ok {
t.Fatalf("Kind is %T, want *pb.MxValue_SparseArrayValue", v.Kind)
}
if len(sa.SparseArrayValue.GetElements()) != 0 {
t.Errorf("len(Elements) = %d, want 0", len(sa.SparseArrayValue.GetElements()))
}
if sa.SparseArrayValue.GetTotalLength() != 4 {
t.Errorf("TotalLength = %d, want 4", sa.SparseArrayValue.GetTotalLength())
}
}
func TestWriteArrayElementsSendsWriteCommandWithSparseOneof(t *testing.T) {
fake := &fakeGatewayServer{
invokeReply: &pb.MxCommandReply{
SessionId: "session-1",
Kind: pb.MxCommandKind_MX_COMMAND_KIND_WRITE,
ProtocolStatus: &pb.ProtocolStatus{
Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK,
},
},
}
client, cleanup := newBufconnClient(t, fake)
defer cleanup()
session := NewSessionForID(client, "session-1")
err := session.WriteArrayElements(
context.Background(),
1, 2,
DataTypeFloat,
10,
map[uint32]*MxValue{3: FloatValue(1.5)},
42,
)
if err != nil {
t.Fatalf("WriteArrayElements() error = %v", err)
}
cmd := fake.invokeRequest.GetCommand()
if cmd.GetKind() != pb.MxCommandKind_MX_COMMAND_KIND_WRITE {
t.Fatalf("command kind = %s, want WRITE", cmd.GetKind())
}
w := cmd.GetWrite()
if w.GetServerHandle() != 1 {
t.Fatalf("server handle = %d, want 1", w.GetServerHandle())
}
if w.GetItemHandle() != 2 {
t.Fatalf("item handle = %d, want 2", w.GetItemHandle())
}
if w.GetUserId() == 0 {
t.Fatal("user id = 0, want non-zero")
}
if w.GetUserId() != 42 {
t.Fatalf("user id = %d, want 42", w.GetUserId())
}
val := w.GetValue()
sa, ok := val.Kind.(*pb.MxValue_SparseArrayValue)
if !ok {
t.Fatalf("value kind is %T, want *pb.MxValue_SparseArrayValue", val.Kind)
}
if sa.SparseArrayValue.GetTotalLength() != 10 {
t.Errorf("TotalLength = %d, want 10", sa.SparseArrayValue.GetTotalLength())
}
if sa.SparseArrayValue.GetElementDataType() != DataTypeFloat {
t.Errorf("ElementDataType = %v, want DataTypeFloat", sa.SparseArrayValue.GetElementDataType())
}
if len(sa.SparseArrayValue.GetElements()) != 1 {
t.Fatalf("len(Elements) = %d, want 1", len(sa.SparseArrayValue.GetElements()))
}
elem := sa.SparseArrayValue.GetElements()[0]
if elem.GetIndex() != 3 {
t.Errorf("element index = %d, want 3", elem.GetIndex())
}
if elem.GetValue().GetFloatValue() != 1.5 {
t.Errorf("element float value = %v, want 1.5", elem.GetValue().GetFloatValue())
}
}
+52
@@ -7,6 +7,7 @@ import (
"errors"
"fmt"
"io"
"sort"
"sync"
"time"
@@ -580,6 +581,57 @@ func (s *Session) WriteRaw(ctx context.Context, serverHandle, itemHandle int32,
})
}
// WriteArrayElements writes a sparse, default-filled array: only the given
// elements (index → scalar value) are set; every unmentioned index up to
// totalLength is written as the element type's default (false / 0 / "" / Unix
// epoch for time). The gateway expands the sparse representation into a full
// array write before forwarding to MXAccess — this is a RESET of unmentioned
// indices, not a preserve: the values previously stored at those indices are
// not retained.
//
// elementDataType must be a scalar MXAccess type (Boolean, Integer, Float,
// Double, String, or Time). totalLength must be at least as large as the
// highest index in elements plus one.
func (s *Session) WriteArrayElements(
ctx context.Context,
serverHandle, itemHandle int32,
elementDataType MxDataType,
totalLength uint32,
elements map[uint32]*MxValue,
userID int32,
) error {
return s.Write(ctx, serverHandle, itemHandle, buildSparseArrayValue(elementDataType, totalLength, elements), userID)
}
// buildSparseArrayValue constructs the MxValue carrying an MxSparseArray oneof
// arm from a map of index → scalar MxValue. Keys are visited in ascending
// order so the produced slice is deterministic (important for test assertions).
func buildSparseArrayValue(elementDataType MxDataType, totalLength uint32, elements map[uint32]*MxValue) *MxValue {
indices := make([]uint32, 0, len(elements))
for idx := range elements {
indices = append(indices, idx)
}
sort.Slice(indices, func(i, j int) bool { return indices[i] < indices[j] })
sparseElements := make([]*MxSparseElement, 0, len(elements))
for _, idx := range indices {
sparseElements = append(sparseElements, &MxSparseElement{
Index: idx,
Value: elements[idx],
})
}
return &MxValue{
Kind: &pb.MxValue_SparseArrayValue{
SparseArrayValue: &MxSparseArray{
ElementDataType: elementDataType,
TotalLength: totalLength,
Elements: sparseElements,
},
},
}
}
// PingRaw sends a diagnostic PING command and returns the raw reply.
// The message is echoed back by the gateway in the reply's DiagnosticMessage field.
func (s *Session) PingRaw(ctx context.Context, message string) (*MxCommandReply, error) {
+7
@@ -36,6 +36,13 @@ type (
Value = pb.MxValue
// MxArray is the protobuf representation of an MXAccess array value.
MxArray = pb.MxArray
// MxSparseArray is the write-only protobuf type for default-fill partial
// array writes. The gateway expands it to a full array before forwarding
// to MXAccess: unmentioned indices receive the element type's default value
// (boolean false, integer 0, float 0.0, string "", time = Unix epoch).
MxSparseArray = pb.MxSparseArray
// MxSparseElement is one index/value pair inside an MxSparseArray.
MxSparseElement = pb.MxSparseElement
// MxStatusProxy mirrors the MXAccess MXSTATUS_PROXY structure.
MxStatusProxy = pb.MxStatusProxy
// ProtocolStatus is the gateway-level status carried on every reply.
+63
@@ -84,6 +84,69 @@ yields alarm-feed messages from the gateway's central monitor), and
`acknowledgeAlarm` (ack by full alarm reference with an optional comment and
ack target). Close the subscription to cancel the underlying gRPC stream.
## Write Semantics And Common Pitfalls
These are MXAccess parity behaviors that surprise new callers. The gateway
forwards them unchanged — it does not paper over them.
### Attributing a write to a user without `authenticateUser`
MXAccess only stamps a plain `write`/`write2` with a Galaxy user id when the
item carries an active *supervisory* advise. If you are **not** using the
verified/secured path (`authenticateUser` followed by `writeSecured`/`writeSecured2`) but
still need the write attributed to a user id, you must first advise the item
supervisory and then pass that user id on the write. Without the supervisory
advise the `userId` on a plain write is ignored.
The session exposes `advise`/`unAdvise` but not supervisory advise, so send it
through the generic command channel:
```java
session.invokeCommand(MxCommand.newBuilder()
.setKind(MxCommandKind.MX_COMMAND_KIND_ADVISE_SUPERVISORY)
.setAdviseSupervisory(AdviseSupervisoryCommand.newBuilder()
.setServerHandle(serverHandle)
.setItemHandle(itemHandle))
.build());
session.write(serverHandle, itemHandle, value, userId);
```
The CLI exposes the same command as `advise-supervisory`, and `write` /
`write2` take `--user-id`.
### Array writes replace the whole array
A write to an array attribute **replaces the entire array**; it is not an
element-wise patch. To change a subset of elements, send the full array with
the unchanged elements included. For example, to change 2 elements of a
20-element array, build the `MxValue` from all 20 values (the 18 unchanged plus
the 2 new ones). Sending only the 2 changed values overwrites the attribute
with a 2-element array.
When only a few indices need changing and the rest should be reset to the
element type's default, use `writeArrayElements` instead of building the full
array manually:
```java
session.writeArrayElements(
serverHandle, itemHandle,
MxDataType.MX_DATA_TYPE_INTEGER,
20, // totalLength
Map.of(
2, MxValues.int32Value(42),
7, MxValues.int32Value(99)),
userId);
```
The gateway expands the sparse descriptor into a full `totalLength`-element
array before forwarding to the worker. Indices not listed in the map are
written as the element type's default — this is a **reset**, not a preserve;
current values at those positions are discarded. `totalLength` is required and
must match the declared length of the array attribute. Bare-name array items
(`Area001.Pump001.Speed`) are auto-normalized to the `[]` form at `AddItem` so
the array attribute accepts the write.
## Galaxy Repository Browse
The Galaxy Repository service is a separate metadata-only gRPC service exposed
+1 -1
@@ -13,7 +13,7 @@ ext {
subprojects {
group = 'com.zb.mom.ww.mxgateway'
version = '0.1.1'
version = '0.1.2'
pluginManager.withPlugin('java') {
java {
File diff suppressed because it is too large
@@ -39,7 +39,10 @@ import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import mxaccess_gateway.v1.MxaccessGateway.AcknowledgeAlarmReply;
import mxaccess_gateway.v1.MxaccessGateway.AcknowledgeAlarmRequest;
import mxaccess_gateway.v1.MxaccessGateway.ActiveAlarmSnapshot;
@@ -105,8 +108,14 @@ public final class MxGatewayCli implements Callable<Integer> {
}
/**
* Test-friendly entry point that runs the CLI against the supplied
* {@link PrintWriter} pair instead of the system streams.
* Entry point that runs the CLI against the supplied {@link PrintWriter}
* pair instead of the system streams. This overload wires the production
* {@link GrpcMxGatewayCliClientFactory} (a real gRPC channel), so it is
* suitable for embedding the CLI but not for unit tests that need to stub
* the gateway. Tests should use the package-private
* {@link #execute(MxGatewayCliClientFactory, PrintWriter, PrintWriter, String...)}
* / {@link #commandLine(MxGatewayCliClientFactory)} overloads, which accept
* an injectable client factory.
*
* @param out writer that receives standard output
* @param err writer that receives standard error
@@ -144,6 +153,8 @@ public final class MxGatewayCli implements Callable<Integer> {
commandLine.addSubcommand("register", new RegisterCommand(clientFactory));
commandLine.addSubcommand("add-item", new AddItemCommand(clientFactory));
commandLine.addSubcommand("advise", new AdviseCommand(clientFactory));
commandLine.addSubcommand(
"advise-supervisory", new AdviseSupervisoryCommand(clientFactory));
commandLine.addSubcommand("subscribe-bulk", new SubscribeBulkCommand(clientFactory));
commandLine.addSubcommand("unsubscribe-bulk", new UnsubscribeBulkCommand(clientFactory));
commandLine.addSubcommand("read-bulk", new ReadBulkCommand(clientFactory));
@@ -1035,6 +1046,34 @@ public final class MxGatewayCli implements Callable<Integer> {
}
}
@Command(
name = "advise-supervisory",
description = "Invokes MXAccess AdviseSupervisory.")
static final class AdviseSupervisoryCommand extends GatewayCommand {
@Option(names = "--session-id", required = true, description = "Gateway session id.")
String sessionId;
@Option(names = "--server-handle", required = true, description = "MXAccess server handle.")
int serverHandle;
@Option(names = "--item-handle", required = true, description = "MXAccess item handle.")
int itemHandle;
AdviseSupervisoryCommand(MxGatewayCliClientFactory clientFactory) {
super(clientFactory);
}
@Override
public Integer call() {
try (MxGatewayCliClient client = clientFactory.connect(common.resolved())) {
MxCommandReply reply =
client.session(sessionId).adviseSupervisoryRaw(serverHandle, itemHandle);
writeOutput("advise-supervisory", common, json, reply, () -> reply.getKind().name());
}
return 0;
}
}
@Command(name = "subscribe-bulk", description = "Invokes MXAccess SubscribeBulk.")
static final class SubscribeBulkCommand extends GatewayCommand {
@Option(names = "--session-id", required = true, description = "Gateway session id.")
@@ -1536,50 +1575,74 @@ public final class MxGatewayCli implements Callable<Integer> {
StreamAlarmsRequest request = StreamAlarmsRequest.newBuilder()
.setAlarmFilterPrefix(filterPrefix)
.build();
// Client.Java-033 fail-fast on overflow. A bare
// queue.offer(value) silently drops messages past capacity,
// which violates the JavaStyleGuide "do not drop events"
// contract and lets the CLI exit 0 on a truncated feed.
// Mirrors MxEventStream's overflow branch: detect a failed
// offer, cancel the subscription, drain the buffer, then
// queue an explicit overflow exception followed by the END
// sentinel so the drain loop surfaces a non-zero exit.
// Client.Java-033/040/042 fail-fast on overflow and on
// transport errors. A bare queue.offer(value) silently drops
// messages past capacity (violating the JavaStyleGuide "do not
// drop events" contract and letting the CLI exit 0 on a
// truncated feed), and a bare queue.offer(error) on a full
// queue would drop the terminal item and deadlock the drain on
// queue.take().
//
// Terminal transitions (overflow, transport error, clean
// completion) are now serialised through a single AtomicBoolean
// guard plus a dedicated `terminal` slot rather than
// re-clearing the shared queue. The first terminal condition
// wins; a concurrent onNext on the gRPC I/O thread can no
// longer displace it (Client.Java-040). The drain reads the
// terminal slot independently of the bounded queue, so a full
// queue can never strand the terminal item (Client.Java-042).
AtomicReference<MxGatewayAlarmFeedSubscription> subscriptionRef = new AtomicReference<>();
AtomicBoolean terminated = new AtomicBoolean();
AtomicReference<Object> terminal = new AtomicReference<>();
Consumer<Object> terminate = item -> {
if (terminated.compareAndSet(false, true)) {
terminal.set(item);
MxGatewayAlarmFeedSubscription sub = subscriptionRef.get();
if (sub != null) {
sub.cancel();
}
}
};
MxGatewayAlarmFeedSubscription subscription =
client.streamAlarms(request, new StreamObserver<>() {
@Override
public void onNext(AlarmFeedMessage value) {
if (terminated.get()) {
return;
}
if (!queue.offer(value)) {
MxGatewayAlarmFeedSubscription sub = subscriptionRef.get();
if (sub != null) {
sub.cancel();
}
queue.clear();
queue.offer(new IllegalStateException(
terminate.accept(new IllegalStateException(
"stream-alarms queue overflowed (capacity 1024); consumer too slow"));
queue.offer(ALARM_FEED_END);
}
}
@Override
public void onError(Throwable error) {
queue.offer(error);
terminate.accept(error);
}
@Override
public void onCompleted() {
queue.offer(ALARM_FEED_END);
terminate.accept(ALARM_FEED_END);
}
});
subscriptionRef.set(subscription);
try {
int count = 0;
while (true) {
Object item = queue.take();
if (item == ALARM_FEED_END) {
break;
}
if (item instanceof Throwable error) {
// Poll with a short timeout so the dedicated terminal
// slot is observed even when the bounded queue is full
// of normal messages the consumer has not yet drained.
Object item = queue.poll(50, TimeUnit.MILLISECONDS);
if (item == null) {
Object end = terminal.get();
if (end == null) {
continue;
}
if (end == ALARM_FEED_END) {
break;
}
Throwable error = (Throwable) end;
throw new IllegalStateException(
"gateway stream alarms failed: " + error.getMessage(), error);
}
@@ -1797,6 +1860,8 @@ public final class MxGatewayCli implements Callable<Integer> {
MxCommandReply adviseRaw(int serverHandle, int itemHandle);
MxCommandReply adviseSupervisoryRaw(int serverHandle, int itemHandle);
MxCommandReply writeRaw(int serverHandle, int itemHandle, MxValue value, int userId);
List<SubscribeResult> subscribeBulk(int serverHandle, List<String> items);
@@ -1915,6 +1980,17 @@ public final class MxGatewayCli implements Callable<Integer> {
return session.adviseRaw(serverHandle, itemHandle);
}
@Override
public MxCommandReply adviseSupervisoryRaw(int serverHandle, int itemHandle) {
return session.invokeCommand(MxCommand.newBuilder()
.setKind(MxCommandKind.MX_COMMAND_KIND_ADVISE_SUPERVISORY)
.setAdviseSupervisory(
mxaccess_gateway.v1.MxaccessGateway.AdviseSupervisoryCommand.newBuilder()
.setServerHandle(serverHandle)
.setItemHandle(itemHandle))
.build());
}
@Override
public MxCommandReply writeRaw(int serverHandle, int itemHandle, MxValue value, int userId) {
return session.writeRaw(serverHandle, itemHandle, value, userId);
@@ -2184,13 +2260,37 @@ public final class MxGatewayCli implements Callable<Integer> {
return jsonString(value.toString());
}
private static String jsonString(String value) {
return '"'
+ value.replace("\\", "\\\\")
.replace("\"", "\\\"")
.replace("\r", "\\r")
.replace("\n", "\\n")
+ '"';
// Package-private for the Client.Java-041 escaping regression test.
static String jsonString(String value) {
// RFC 8259 requires the two-character escapes for the named control
// characters and six-character uXXXX escapes for the remaining
// U+0000-U+001F (and U+007F) range. The old implementation escaped only
// backslash, quote, CR, and LF, so a
// value containing a tab, backspace, form-feed, or any other control
// character produced malformed JSON (Client.Java-041).
StringBuilder builder = new StringBuilder(value.length() + 2);
builder.append('"');
for (int i = 0; i < value.length(); i++) {
char c = value.charAt(i);
switch (c) {
case '\\' -> builder.append("\\\\");
case '"' -> builder.append("\\\"");
case '\r' -> builder.append("\\r");
case '\n' -> builder.append("\\n");
case '\t' -> builder.append("\\t");
case '\b' -> builder.append("\\b");
case '\f' -> builder.append("\\f");
default -> {
if (c < 0x20 || c == 0x7f) {
builder.append(String.format("\\u%04x", (int) c));
} else {
builder.append(c);
}
}
}
}
builder.append('"');
return builder.toString();
}
private record RawJson(String value) {
@@ -46,6 +46,22 @@ import mxaccess_gateway.v1.MxaccessGateway.SessionState;
* instance uses a unique server name so harnesses do not collide. The
* {@code directExecutor()} wiring keeps all dispatch on the calling thread, so
* no background threads are leaked.
*
* <p><strong>Implemented RPCs.</strong> The scripted services override only the
* RPCs the CLI tests currently exercise:
*
* <ul>
* <li>{@code MxAccessGateway}: {@code streamEvents}, {@code closeSession}.</li>
* <li>{@code GalaxyRepository}: {@code discoverHierarchy},
* {@code watchDeployEvents}.</li>
* </ul>
*
* Every other RPC (e.g. {@code openSession}, {@code invoke}, {@code register},
* {@code streamAlarms}, {@code queryActiveAlarms}, {@code browseChildren}) is
* left at the generated {@code *ImplBase} default and therefore returns gRPC
* {@code UNIMPLEMENTED} by design. A future test that needs one of those paths
* must add the corresponding scripted override here first; otherwise the call
* fails with {@code UNIMPLEMENTED} rather than the behaviour under test.
*/
final class InProcessGatewayHarness implements AutoCloseable {
private final String serverName;
@@ -56,17 +56,37 @@ final class MxGatewayCliTests {
assertEquals(0, run.exitCode());
assertEquals("", run.errors());
assertTrue(run.output().contains("mxgateway-java 0.1.0"));
assertTrue(run.output().contains("mxgateway-java 0.1.1"));
assertTrue(run.output().contains("gatewayProtocolVersion=3"));
assertTrue(run.output().contains("workerProtocolVersion=1"));
}
@Test
void jsonStringEscapesControlCharacters() {
// Client.Java-041: the hand-rolled jsonString escaped only backslash,
// quote, CR, and LF, so a tab/backspace/form-feed or any other control
// char produced malformed JSON (RFC 8259). After the fix the named control
// chars use their two-character escapes and the rest use six-char uXXXX.
assertEquals("\"a\\tb\"", MxGatewayCli.jsonString("a\tb"));
assertEquals("\"a\\bb\"", MxGatewayCli.jsonString("a\bb"));
assertEquals("\"a\\fb\"", MxGatewayCli.jsonString("a\fb"));
assertEquals("\"a\\rb\"", MxGatewayCli.jsonString("a\rb"));
assertEquals("\"a\\nb\"", MxGatewayCli.jsonString("a\nb"));
// A non-named control character (U+0001) must become its six-character
// uXXXX escape.
assertEquals("\"a\\u0001b\"", MxGatewayCli.jsonString("a\u0001b"));
// DEL (U+007F) is also escaped.
assertEquals("\"a\\u007fb\"", MxGatewayCli.jsonString("a\u007fb"));
// Quote and backslash still escape; ordinary printable text is verbatim.
assertEquals("\"a\\\"\\\\b\"", MxGatewayCli.jsonString("a\"\\b"));
assertEquals("\"plain\"", MxGatewayCli.jsonString("plain"));
}
@Test
void versionCommandPrintsJson() {
CliRun run = execute(new FakeClientFactory(), "version", "--json");
assertEquals(0, run.exitCode());
assertTrue(run.output().contains("\"clientVersion\":\"0.1.0\""));
assertTrue(run.output().contains("\"clientVersion\":\"0.1.1\""));
assertTrue(run.output().contains("\"gatewayProtocolVersion\":3"));
}
@@ -241,27 +261,20 @@ final class MxGatewayCliTests {
void galaxyBrowseParentZeroEmitsWarningToStderr() {
// --parent 0 is the server sentinel for roots; passing it explicitly is
// almost certainly a mistake. The CLI must print a warning to stderr
// (matching Go/Rust client behaviour) but must still attempt the call
// (exit behaviour depends on gateway reachability, not tested here;
// we only assert the warning path is triggered by checking the error
// writer before any gRPC connection is attempted).
// (matching Go/Rust client behaviour) but must still attempt the call.
//
// GalaxyBrowseCommand connects to a real GalaxyRepositoryClient, so the
// call() body will throw after printing the warning when no gateway is
// reachable. We only assert the warning appears on stderr.
StringWriter output = new StringWriter();
StringWriter errors = new StringWriter();
// Non-zero exit is expected (no live gateway), but the warning must
// appear on stderr regardless of what happens next.
MxGatewayCli.execute(
new FakeClientFactory(),
new PrintWriter(output, true),
new PrintWriter(errors, true),
// GalaxyBrowseCommand prints the warning, then calls connect() on the
// GalaxyClientFactory. We inject a stub factory whose connect() throws,
// so only the warning path runs; no live Netty channel to localhost is
// constructed (Client.Java-043). The warning is emitted before
// connect() is reached, so it appears on stderr regardless.
CliRun run = executeGalaxy(
new ThrowingGalaxyClientFactory(),
"galaxy-browse", "--parent", "0", "--depth", "1");
assertTrue(
errors.toString().contains("--parent 0"),
"expected '--parent 0' warning on stderr; got: " + errors);
run.errors().contains("--parent 0"),
"expected '--parent 0' warning on stderr; got: " + run.errors());
}
// ---- galaxy command-name aliases (D9-java) ----
@@ -678,21 +691,28 @@ final class MxGatewayCliTests {
@Test
void streamAlarmsCommandFailsFastOnQueueOverflow() {
// Client.Java-033 regression: the CLI's stream-alarms bounded queue
// used queue.offer(value) which silently dropped messages past
// capacity (1024). After the fix the CLI must surface the overflow
// as a non-zero exit (mirroring MxEventStream's fail-fast contract).
// Client.Java-033/040/046 regression: the CLI's stream-alarms bounded
// queue used queue.offer(value) which silently dropped messages past
// capacity (1024). After the fix the CLI must surface the overflow as a
// non-zero exit (mirroring MxEventStream's fail-fast contract).
//
// The OverflowingFakeClient floods the gRPC observer with 2000
// messages synchronously, which exceeds the bounded 1024-element
// queue. The fix detects the failed offer, cancels the subscription,
// queues an overflow exception, and the drain loop surfaces it.
// The OverflowingFakeClient floods the gRPC observer on a BACKGROUND
// thread so the subscription is already published when the overflow
// fires, exercising the terminate() cancel path with a non-null
// subscription (Client.Java-046), not just the synchronous-flood path
// where subscriptionRef is still null. The fix records the overflow in
// a dedicated terminal slot (no queue.clear, Client.Java-040) and the
// drain loop surfaces it with the overflow message text.
OverflowingFakeClientFactory factory = new OverflowingFakeClientFactory();
CliRun run = execute(factory, "stream-alarms", "--filter-prefix", "Flood");
assertFalse(run.exitCode() == 0,
"expected non-zero exit when the alarm queue overflows; got exit=" + run.exitCode()
+ " out=\n" + run.output() + "\nerr=\n" + run.errors());
assertTrue(
run.errors().contains("queue overflowed") || run.output().contains("queue overflowed"),
"expected the overflow message text to surface; out=\n" + run.output()
+ "\nerr=\n" + run.errors());
}
@Test
@@ -1050,6 +1070,18 @@ final class MxGatewayCliTests {
}
}
/**
* Galaxy client factory whose {@code connect} throws, so a test can exercise
* a command's pre-connect path (e.g. the {@code --parent 0} warning) without
* constructing a live Netty channel to localhost (Client.Java-043).
*/
private static final class ThrowingGalaxyClientFactory implements MxGatewayCli.GalaxyClientFactory {
@Override
public com.zb.mom.ww.mxgateway.client.GalaxyRepositoryClient connect(MxGatewayClientOptions options) {
throw new IllegalStateException("galaxy connect not available in this test");
}
}
private static final class OverflowingFakeClient implements MxGatewayCli.MxGatewayCliClient {
private final PrintWriter out;
@@ -1089,19 +1121,31 @@ final class MxGatewayCliTests {
@Override
public MxGatewayAlarmFeedSubscription streamAlarms(
StreamAlarmsRequest request, StreamObserver<AlarmFeedMessage> observer) {
// Synchronously push 2000 messages to overflow the CLI's bounded
// 1024-element queue. The CLI must surface the overflow rather
// than silently dropping the trailing ~976 messages.
for (int i = 0; i < 2000; i++) {
observer.onNext(AlarmFeedMessage.newBuilder()
.setActiveAlarm(ActiveAlarmSnapshot.newBuilder()
.setAlarmFullReference("Flood." + i)
.setCurrentState(AlarmConditionState.ALARM_CONDITION_STATE_ACTIVE)
.setSeverity(700))
.build());
}
observer.onCompleted();
return new MxGatewayAlarmFeedSubscription();
// Push messages on a BACKGROUND thread (mirroring real gRPC, which
// delivers onNext on a netty I/O thread) so the CLI's
// subscriptionRef is already published when the overflow fires;
// this exercises the terminate() cancel path with a non-null
// subscription (Client.Java-046), unlike a synchronous flood that
// overflows before streamAlarms even returns. Keeps pushing until
// it observes the CLI cancelling the subscription on overflow, so
// no fixed message count is needed and the thread always exits.
MxGatewayAlarmFeedSubscription subscription = new MxGatewayAlarmFeedSubscription();
Thread flood = new Thread(() -> {
int i = 0;
while (!Thread.currentThread().isInterrupted() && i < 100_000) {
observer.onNext(AlarmFeedMessage.newBuilder()
.setActiveAlarm(ActiveAlarmSnapshot.newBuilder()
.setAlarmFullReference("Flood." + i)
.setCurrentState(AlarmConditionState.ALARM_CONDITION_STATE_ACTIVE)
.setSeverity(700))
.build());
i++;
}
observer.onCompleted();
}, "overflowing-fake-alarm-feed");
flood.setDaemon(true);
flood.start();
return subscription;
}
@Override
@@ -1258,6 +1302,15 @@ final class MxGatewayCliTests {
.build();
}
@Override
public MxCommandReply adviseSupervisoryRaw(int serverHandle, int itemHandle) {
adviseCalled = true;
return MxCommandReply.newBuilder()
.setKind(MxCommandKind.MX_COMMAND_KIND_ADVISE_SUPERVISORY)
.setProtocolStatus(ok())
.build();
}
@Override
public MxCommandReply writeRaw(int serverHandle, int itemHandle, MxValue value, int userId) {
lastWriteValue = value;
@@ -9,7 +9,7 @@ package com.zb.mom.ww.mxgateway.client;
public final class MxGatewayClientVersion {
private static final int GATEWAY_PROTOCOL_VERSION = 3;
private static final int WORKER_PROTOCOL_VERSION = 1;
private static final String CLIENT_VERSION = "0.1.0";
private static final String CLIENT_VERSION = "0.1.1";
private MxGatewayClientVersion() {
}
@@ -4,7 +4,9 @@ import java.security.SecureRandom;
import java.time.Duration;
import java.util.HexFormat;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;
import mxaccess_gateway.v1.MxaccessGateway.AddItem2Command;
import mxaccess_gateway.v1.MxaccessGateway.AddItemBulkCommand;
import mxaccess_gateway.v1.MxaccessGateway.AddItemCommand;
@@ -18,6 +20,9 @@ import mxaccess_gateway.v1.MxaccessGateway.MxCommand;
import mxaccess_gateway.v1.MxaccessGateway.MxCommandKind;
import mxaccess_gateway.v1.MxaccessGateway.MxCommandReply;
import mxaccess_gateway.v1.MxaccessGateway.MxCommandRequest;
import mxaccess_gateway.v1.MxaccessGateway.MxDataType;
import mxaccess_gateway.v1.MxaccessGateway.MxSparseArray;
import mxaccess_gateway.v1.MxaccessGateway.MxSparseElement;
import mxaccess_gateway.v1.MxaccessGateway.MxValue;
import mxaccess_gateway.v1.MxaccessGateway.OpenSessionReply;
import mxaccess_gateway.v1.MxaccessGateway.ReadBulkCommand;
@@ -603,6 +608,54 @@ public final class MxGatewaySession implements AutoCloseable {
.build());
}
/**
* Writes a subset of an array's elements using MXAccess {@code Write}, building a
* write-only {@link MxSparseArray} value that the gateway expands into a full,
* default-filled array before forwarding to the worker.
*
* <p><strong>Default-fill semantics:</strong> only the indices supplied in
* {@code elements} are written; every unmentioned index is <em>reset</em> to the
* element type's default (for example {@code 0}, {@code false}, or an empty string),
* <em>not</em> preserved from the array's current contents. Use a full
* {@link MxValue} array write when you need to keep existing element values.
*
* <p>{@code totalLength} is required and defines the length of the expanded array;
* supplied indices must be within {@code [0, totalLength)}. Elements are iterated in
* ascending index order so the produced command is deterministic.
*
* @param serverHandle the {@code ServerHandle} owning the item
* @param itemHandle the {@code ItemHandle} to write
* @param elementDataType the {@link MxDataType} of the array's elements
* @param totalLength the total length of the expanded array
* @param elements the indices to write mapped to their scalar values; unmentioned
* indices are reset to the element type default
* @param userId the MXAccess user id used for security checks
* @throws MxGatewayException on transport or protocol failure
*/
public void writeArrayElements(
int serverHandle,
int itemHandle,
MxDataType elementDataType,
int totalLength,
Map<Integer, MxValue> elements,
int userId) {
Objects.requireNonNull(elementDataType, "elementDataType");
Objects.requireNonNull(elements, "elements");
MxSparseArray.Builder sparse = MxSparseArray.newBuilder()
.setElementDataType(elementDataType)
.setTotalLength(totalLength);
// Iterate in ascending index order so the built command is deterministic.
for (Map.Entry<Integer, MxValue> entry : new TreeMap<>(elements).entrySet()) {
sparse.addElements(MxSparseElement.newBuilder()
.setIndex(entry.getKey())
.setValue(Objects.requireNonNull(entry.getValue(), "elements value")));
}
MxValue value = MxValue.newBuilder()
.setSparseArrayValue(sparse)
.build();
writeRaw(serverHandle, itemHandle, value, userId);
}
/**
* Invokes MXAccess {@code Write2}, which carries an explicit timestamp.
*
@@ -153,6 +153,9 @@ public final class MxValues {
case TIMESTAMP_VALUE -> instant(value.getTimestampValue());
case ARRAY_VALUE -> nativeArray(value.getArrayValue());
case RAW_VALUE -> value.getRawValue().toByteArray();
// Write-only sparse descriptor: never produced by a read/decoded
// value, so it has no native representation.
case SPARSE_ARRAY_VALUE -> null;
case KIND_NOT_SET -> null;
};
}
@@ -19,6 +19,7 @@ import io.grpc.stub.ServerCallStreamObserver;
import io.grpc.stub.StreamObserver;
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
@@ -36,7 +37,10 @@ import mxaccess_gateway.v1.MxaccessGateway.CloseSessionRequest;
import mxaccess_gateway.v1.MxaccessGateway.MxCommandKind;
import mxaccess_gateway.v1.MxaccessGateway.MxCommandReply;
import mxaccess_gateway.v1.MxaccessGateway.MxCommandRequest;
import mxaccess_gateway.v1.MxaccessGateway.MxDataType;
import mxaccess_gateway.v1.MxaccessGateway.MxEvent;
import mxaccess_gateway.v1.MxaccessGateway.MxSparseElement;
import mxaccess_gateway.v1.MxaccessGateway.MxValue;
import mxaccess_gateway.v1.MxaccessGateway.OpenSessionReply;
import mxaccess_gateway.v1.MxaccessGateway.OpenSessionRequest;
import mxaccess_gateway.v1.MxaccessGateway.ProtocolStatus;
@@ -396,6 +400,57 @@ final class MxGatewayClientSessionTests {
}
}
@Test
void writeArrayElementsBuildsSparseArrayWriteCommand() throws Exception {
AtomicReference<MxCommandRequest> commandRequest = new AtomicReference<>();
TestGatewayService service = new TestGatewayService() {
@Override
public void invoke(MxCommandRequest request, StreamObserver<MxCommandReply> responseObserver) {
commandRequest.set(request);
responseObserver.onNext(MxCommandReply.newBuilder()
.setSessionId(request.getSessionId())
.setKind(request.getCommand().getKind())
.setProtocolStatus(ok())
.build());
responseObserver.onCompleted();
}
};
try (InProcessGateway gateway = InProcessGateway.start(service, new AtomicReference<>());
MxGatewayClient client = gateway.client("", Duration.ofSeconds(5))) {
MxGatewaySession session = MxGatewaySession.forSessionId(client, "sparse-session");
// Supply indices out of order to prove deterministic ascending iteration.
Map<Integer, MxValue> elements = Map.of(
3, MxValues.int32Value(99),
1, MxValues.int32Value(7));
session.writeArrayElements(12, 34, MxDataType.MX_DATA_TYPE_INTEGER, 5, elements, 56);
MxCommandRequest request = commandRequest.get();
assertNotNull(request);
assertEquals(MxCommandKind.MX_COMMAND_KIND_WRITE, request.getCommand().getKind());
assertEquals(12, request.getCommand().getWrite().getServerHandle());
assertEquals(34, request.getCommand().getWrite().getItemHandle());
assertEquals(56, request.getCommand().getWrite().getUserId());
MxValue written = request.getCommand().getWrite().getValue();
assertEquals(MxValue.KindCase.SPARSE_ARRAY_VALUE, written.getKindCase());
assertEquals(5, written.getSparseArrayValue().getTotalLength());
assertEquals(
MxDataType.MX_DATA_TYPE_INTEGER,
written.getSparseArrayValue().getElementDataType());
List<MxSparseElement> sparse = written.getSparseArrayValue().getElementsList();
assertEquals(2, sparse.size());
// Ascending index order is guaranteed by the helper.
assertEquals(1, sparse.get(0).getIndex());
assertEquals(7, sparse.get(0).getValue().getInt32Value());
assertEquals(3, sparse.get(1).getIndex());
assertEquals(99, sparse.get(1).getValue().getInt32Value());
}
}
private static ProtocolStatus ok() {
return ProtocolStatus.newBuilder()
.setCode(ProtocolStatusCode.PROTOCOL_STATUS_CODE_OK)
+75 -3
@@ -105,6 +105,76 @@ terminate the stream.
Canceling a Python task cancels the client-side gRPC call or stream wait. It
does not abort an in-flight MXAccess COM call inside the worker process.
## Write Semantics And Common Pitfalls
These are MXAccess parity behaviors that surprise new callers. The gateway
forwards them unchanged — it does not paper over them.
### Attributing a write to a user without `authenticate_user`
MXAccess only stamps a plain `write`/`write2` with a Galaxy user id when the
item carries an active *supervisory* advise. If you are **not** using the
verified/secured path (`authenticate_user` → `write_secured`/`write_secured2`)
but still need the write attributed to a user id, you must first advise the
item supervisory and then pass that user id on the write. Without the
supervisory advise the `user_id` on a plain write is ignored.
The session exposes `advise`/`unadvise` but not supervisory advise, so send it
through the generic command channel:
```python
await session.invoke(
pb.MxCommand(
kind=pb.MX_COMMAND_KIND_ADVISE_SUPERVISORY,
advise_supervisory=pb.AdviseSupervisoryCommand(
server_handle=server_handle,
item_handle=item_handle,
),
)
)
await session.write(server_handle, item_handle, value, user_id=user_id)
```
The CLI exposes the same command as `advise-supervisory`, and `write` /
`write2` take `--user-id`.
### Array writes replace the whole array
A write to an array attribute **replaces the entire array**; it is not an
element-wise patch. To change a subset of elements, send the full array with
the unchanged elements included. For example, to change 2 elements of a
20-element array, build the `MxValue` from all 20 values (the 18 unchanged plus
the 2 new ones). Sending only the 2 changed values overwrites the attribute
with a 2-element array.
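As a rough sketch of the "send the full array" rule, the snippet below patches two positions of a 20-element list while keeping the other 18. The `patched_copy` helper and the sample values are hypothetical and not part of the client. How the current values are read, and how the resulting list is converted to an `MxValue`, is left out on purpose.

```python
def patched_copy(current: list[int], changes: dict[int, int]) -> list[int]:
    """Return a full-length copy of `current` with `changes` applied.

    Hypothetical helper for illustration only; it is not a client API.
    """
    updated = list(current)  # copy, so the 18 unchanged elements are kept
    for index, value in changes.items():
        if not 0 <= index < len(updated):
            raise IndexError(f"index {index} outside array of length {len(updated)}")
        updated[index] = value
    return updated


# Stand-in for the 20 values currently held by the array attribute.
current = list(range(20))

# Change only elements 3 and 17; the result still has all 20 elements.
full = patched_copy(current, {3: 300, 17: 1700})
```

The write should then carry all of `full`, so the attribute keeps its 20-element length.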
### Default-fill partial array writes
`Session.write_array_elements` lets you write only the indices you care about.
The gateway fills every unmentioned position with the type default for the
declared `element_data_type` (0, `False`, `""`, Unix epoch for timestamps).
The previous value at those positions is **not** preserved — the gateway expands
the sparse map to a full array before forwarding the write to MXAccess, so this
is still a full replacement:
```python
# Write indices 0 and 5 of a 10-element integer array.
# Positions 1-4 and 6-9 become 0, not their previous values.
await session.write_array_elements(
server_handle=server_handle,
item_handle=item_handle,
element_data_type=pb.MX_DATA_TYPE_INTEGER,
total_length=10,
elements={0: 100, 5: 500},
)
```
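To make the default-fill behaviour concrete, here is a small local model of the expansion described above. It is only a sketch: `expand_sparse`, the `DEFAULTS` table, and its string keys are made up for illustration and are not the gateway's implementation or the real `MxDataType` members.

```python
from datetime import datetime, timezone
from typing import Any

# Illustrative defaults, keyed by made-up labels (assumption, not the real enum).
DEFAULTS: dict[str, Any] = {
    "integer": 0,
    "boolean": False,
    "string": "",
    "time": datetime(1970, 1, 1, tzinfo=timezone.utc),  # Unix epoch
}


def expand_sparse(kind: str, total_length: int, elements: dict[int, Any]) -> list[Any]:
    """Model the expansion: defaults everywhere, supplied values at their indices."""
    if total_length <= 0:
        raise ValueError("total_length must be > 0")
    expanded = [DEFAULTS[kind]] * total_length
    for index, value in elements.items():
        if not 0 <= index < total_length:
            raise ValueError(f"index {index} outside [0, {total_length})")
        expanded[index] = value
    return expanded


# Same shape as the call above: indices 0 and 5 of a 10-element integer array.
expanded = expand_sparse("integer", 10, {0: 100, 5: 500})
```

In this model `expanded` is `[100, 0, 0, 0, 0, 500, 0, 0, 0, 0]`: positions 1-4 and 6-9 hold the default, not whatever the attribute held before.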
Bare-name array items (e.g. `Object.ArrayAttr` without an index suffix) added
via `add_item` auto-normalize to `[]` — they refer to the whole array, not a
single element. Writes through such handles must cover the full array or use
`write_array_elements` to supply `total_length` and let the gateway fill
defaults for the rest.
## Galaxy Repository Browse
The `GalaxyRepositoryClient` wraps the read-only `GalaxyRepository` gRPC
@@ -140,19 +210,21 @@ service requires the `metadata:read` scope on the API key.
### Browsing lazily
For UI trees or OPC UA bridges, use `browse_children` to walk one level at a
For UI trees or OPC UA bridges, use `browse_children_raw` to walk one level at a
time instead of loading the full hierarchy with `discover_hierarchy`. Pass an
empty request for root objects; subsequent calls set `parent_gobject_id`,
`parent_tag_name`, or `parent_contained_path`. Filter fields match
`DiscoverHierarchy`. Each response pairs `children` with `child_has_children` so
you know which nodes to expand. See
you know which nodes to expand. Most callers should prefer the higher-level
`browse()` / `LazyBrowseNode` walker below; `browse_children_raw` is the
low-level escape hatch for direct page-token control. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics.
```python
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb2
reply = await galaxy.browse_children(galaxy_pb2.BrowseChildrenRequest())
reply = await galaxy.browse_children_raw(galaxy_pb2.BrowseChildrenRequest())
for child, has_children in zip(reply.children, reply.child_has_children):
print(child.tag_name, "expand=" + str(has_children))
```
+1 -1
@@ -6,7 +6,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "zb-mom-ww-mxaccess-gateway-client"
version = "0.1.1"
version = "0.1.2"
description = "Async Python client scaffold for MXAccess Gateway."
readme = "README.md"
requires-python = ">=3.12"
@@ -2,7 +2,7 @@
from .auth import ApiKey, auth_metadata
from .client import GatewayClient
from .galaxy import GalaxyRepositoryClient
from .galaxy import GalaxyRepositoryClient, LazyBrowseNode
from .generated.galaxy_repository_pb2 import (
DeployEvent,
GalaxyAttribute,
@@ -19,19 +19,21 @@ from .errors import (
MxGatewayTransportError,
MxGatewayWorkerError,
)
from .options import ClientOptions
from .options import BrowseChildrenOptions, ClientOptions
from .session import Session
from .values import MxValueView, from_mx_value, to_mx_value
from .version import __version__
__all__ = [
"ApiKey",
"BrowseChildrenOptions",
"ClientOptions",
"DeployEvent",
"GalaxyAttribute",
"GalaxyObject",
"GalaxyRepositoryClient",
"GatewayClient",
"LazyBrowseNode",
"MxAccessError",
"MxGatewayAuthenticationError",
"MxGatewayAuthorizationError",
File diff suppressed because one or more lines are too long
@@ -489,6 +489,60 @@ class Session:
correlation_id=correlation_id,
)
async def write_array_elements(
self,
server_handle: int,
item_handle: int,
element_data_type: "pb.MxDataType.ValueType",
total_length: int,
elements: dict[int, MxValueInput],
*,
user_id: int = 0,
correlation_id: str = "",
) -> None:
"""Write a partial array by specifying only the indices you want to set.
The gateway expands the sparse representation into a full ``total_length``
array before forwarding the write to MXAccess. Indices not listed in
*elements* are filled with the type default for *element_data_type* (0,
False, empty string, Unix epoch for timestamps, etc.). The previous
value at those positions is **not** preserved; this is a full array
replacement, not a patch.
Args:
server_handle: Handle returned by :meth:`register`.
item_handle: Handle returned by :meth:`add_item`.
element_data_type: ``pb.MX_DATA_TYPE_*`` enum value for the scalar
element type of the target array attribute.
total_length: Total number of elements in the written array. Must
be > 0 and large enough to contain every index in *elements*.
Both *total_length* and all keys in *elements* must be
non-negative; the gateway rejects negative or out-of-range
values with ``InvalidArgument`` (the proto fields are
``uint32``).
elements: Mapping of zero-based element index to scalar value.
Values are converted with :func:`~zb_mom_ww_mxgateway.values.to_mx_value`.
user_id: Galaxy user id to stamp on the write (requires a prior
supervisory advise to take effect; see README).
correlation_id: Optional client-supplied correlation token echoed
in the command reply.
"""
sparse = pb.MxSparseArray(
element_data_type=element_data_type,
total_length=total_length,
elements=[
pb.MxSparseElement(index=idx, value=to_mx_value(val))
for idx, val in elements.items()
],
)
await self.write(
server_handle,
item_handle,
pb.MxValue(sparse_array_value=sparse),
user_id=user_id,
correlation_id=correlation_id,
)
async def write2(
self,
server_handle: int,
@@ -277,6 +277,23 @@ def advise(**kwargs: Any) -> None:
_run(_advise(**kwargs), output_json=kwargs["output_json"], secrets=_secrets(kwargs))
@main.command("advise-supervisory")
@gateway_options
@click.option("--session-id", required=True, help="Gateway session id.")
@click.option("--server-handle", required=True, type=int, help="MXAccess server handle.")
@click.option("--item-handle", required=True, type=int, help="MXAccess item handle.")
@click.option("--correlation-id", default="", help="Client correlation id.")
@click.option("--json", "output_json", is_flag=True, help="Emit JSON output.")
def advise_supervisory(**kwargs: Any) -> None:
"""Invoke MXAccess AdviseSupervisory."""
_run(
_advise_supervisory(**kwargs),
output_json=kwargs["output_json"],
secrets=_secrets(kwargs),
)
@main.command("subscribe-bulk")
@gateway_options
@click.option("--session-id", required=True, help="Gateway session id.")
@@ -725,6 +742,22 @@ async def _advise(**kwargs: Any) -> dict[str, Any]:
return {"ok": True}
async def _advise_supervisory(**kwargs: Any) -> dict[str, Any]:
async with await _connect(kwargs) as client:
session = _session(client, kwargs["session_id"])
await session.invoke(
pb.MxCommand(
kind=pb.MX_COMMAND_KIND_ADVISE_SUPERVISORY,
advise_supervisory=pb.AdviseSupervisoryCommand(
server_handle=kwargs["server_handle"],
item_handle=kwargs["item_handle"],
),
),
correlation_id=kwargs["correlation_id"],
)
return {"ok": True}
async def _subscribe_bulk(**kwargs: Any) -> dict[str, Any]:
async with await _connect(kwargs) as client:
session = _session(client, kwargs["session_id"])
@@ -769,7 +802,7 @@ def _build_write_bulk_entries(kwargs: dict[str, Any]):
"""
handles = _parse_int_list(kwargs["item_handles"])
value_texts = _parse_string_list(kwargs["values"])
value_texts = _parse_string_list(kwargs["values"], param_hint="--values")
if len(handles) != len(value_texts):
raise click.UsageError(
f"item-handles count ({len(handles)}) does not match values count ({len(value_texts)})",
@@ -1045,8 +1078,7 @@ async def _write2(**kwargs: Any) -> dict[str, Any]:
async def _smoke(**kwargs: Any) -> dict[str, Any]:
async with await _connect(kwargs) as client:
session = await client.open_session(client_session_name=kwargs["client_name"])
closed = False
try:
async with session:
server_handle = await session.register(kwargs["client_name"])
item_handle = await session.add_item(server_handle, kwargs["item"])
await session.advise(server_handle, item_handle)
@@ -1061,9 +1093,6 @@ async def _smoke(**kwargs: Any) -> dict[str, Any]:
"itemHandle": item_handle,
"events": [_message_dict(event) for event in events],
}
finally:
if not closed:
await session.close()
async def _galaxy_test_connection(**kwargs: Any) -> dict[str, Any]:
@@ -1487,10 +1516,10 @@ def _parse_datetime(raw_value: str) -> datetime:
return parsed
def _parse_string_list(raw_value: str) -> list[str]:
def _parse_string_list(raw_value: str, param_hint: str = "--items") -> list[str]:
values = [item.strip() for item in raw_value.split(",") if item.strip()]
if not values:
raise click.BadParameter("at least one item is required", param_hint="--items")
raise click.BadParameter("at least one item is required", param_hint=param_hint)
return values
@@ -1498,7 +1527,12 @@ def _parse_int_list(raw_value: str) -> list[int]:
values = [item.strip() for item in raw_value.split(",") if item.strip()]
if not values:
raise click.BadParameter("at least one item handle is required", param_hint="--item-handles")
return [int(item) for item in values]
try:
return [int(item) for item in values]
except ValueError as exc:
raise click.BadParameter(
f"item handles must be integers: {exc}", param_hint="--item-handles"
) from exc
def _message_dict(message: Any) -> dict[str, Any]:
@@ -0,0 +1,131 @@
"""Regression tests for Client.Python-032..036.
Each test corresponds to a finding from the 2026-06-16 re-review. Tests are
TDD-first: they fail against the pre-fix source and pass against the fixed
source.
"""
from __future__ import annotations
import inspect
import re
from pathlib import Path
import click
import pytest
from zb_mom_ww_mxgateway_cli import commands as cli_commands
from zb_mom_ww_mxgateway_cli.commands import _parse_int_list, _parse_string_list
# ---------------------------------------------------------------------------
# Client.Python-032 — `_smoke` must not carry the dead `closed` guard variable.
# ---------------------------------------------------------------------------
def test_smoke_does_not_carry_dead_closed_guard() -> None:
"""`_smoke` must not reintroduce the dead `closed = False` / `if not closed`
guard removed by Client.Python-004. The variable is never reassigned, so the
guard misleads readers into expecting an early-close path that never exists.
"""
source = inspect.getsource(cli_commands._smoke)
assert "closed = False" not in source, (
"_smoke must not reintroduce the dead `closed = False` variable"
)
assert "if not closed:" not in source, (
"_smoke must not reintroduce the dead `if not closed:` guard"
)
# ---------------------------------------------------------------------------
# Client.Python-033 — `_parse_string_list` param_hint must reflect the caller.
# ---------------------------------------------------------------------------
def test_parse_string_list_default_param_hint_is_items() -> None:
with pytest.raises(click.BadParameter) as exc:
_parse_string_list("")
assert exc.value.param_hint == "--items"
def test_parse_string_list_accepts_caller_supplied_param_hint() -> None:
"""The write-bulk family passes `--values`, so an empty value must surface a
`--values` hint, not the irrelevant `--items` default.
"""
with pytest.raises(click.BadParameter) as exc:
_parse_string_list("", param_hint="--values")
assert exc.value.param_hint == "--values"
# ---------------------------------------------------------------------------
# Client.Python-034 — `_parse_int_list` must re-raise non-numeric tokens as
# click.BadParameter, not a raw ValueError traceback.
# ---------------------------------------------------------------------------
def test_parse_int_list_non_numeric_raises_bad_parameter() -> None:
with pytest.raises(click.BadParameter) as exc:
_parse_int_list("10,abc")
assert exc.value.param_hint == "--item-handles"
def test_parse_int_list_happy_path() -> None:
assert _parse_int_list("10, 20 ,30") == [10, 20, 30]
# ---------------------------------------------------------------------------
# Client.Python-035 — public browse types must be re-exported from the package
# root.
# ---------------------------------------------------------------------------
def test_browse_children_options_is_exported_from_package_root() -> None:
import zb_mom_ww_mxgateway as pkg
assert hasattr(pkg, "BrowseChildrenOptions")
assert "BrowseChildrenOptions" in pkg.__all__
def test_lazy_browse_node_is_exported_from_package_root() -> None:
import zb_mom_ww_mxgateway as pkg
assert hasattr(pkg, "LazyBrowseNode")
assert "LazyBrowseNode" in pkg.__all__
# ---------------------------------------------------------------------------
# Client.Python-036 — README "Browsing lazily" example must reference a method
# that actually exists on GalaxyRepositoryClient.
# ---------------------------------------------------------------------------
def _readme_path() -> Path:
return Path(__file__).resolve().parent.parent / "README.md"
def test_galaxy_client_exposes_browse_children_raw() -> None:
"""Guard the method name the README example depends on so future renames
break this test rather than only failing at runtime in user code.
"""
from zb_mom_ww_mxgateway import GalaxyRepositoryClient
assert hasattr(GalaxyRepositoryClient, "browse_children_raw")
def test_readme_browse_example_uses_existing_method() -> None:
"""The README's `galaxy.<method>(...BrowseChildrenRequest...)` call must name
a method that exists on GalaxyRepositoryClient.
"""
from zb_mom_ww_mxgateway import GalaxyRepositoryClient
text = _readme_path().read_text(encoding="utf-8")
called = set(re.findall(r"galaxy\.([A-Za-z_][A-Za-z0-9_]*)\s*\(", text))
assert called, "README must contain at least one galaxy.<method>(...) example"
for method in called:
assert hasattr(GalaxyRepositoryClient, method), (
f"README references galaxy.{method}() but no such method exists"
)
@@ -0,0 +1,209 @@
"""Tests for Session.write_array_elements default-fill sparse-array helper."""
from __future__ import annotations
from typing import Any
import pytest
from zb_mom_ww_mxgateway import ClientOptions, GatewayClient
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2 as pb
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_sparse_mx_value(
element_data_type: "pb.MxDataType.ValueType",
total_length: int,
elements: dict[int, Any],
) -> pb.MxValue:
"""Build an MxValue wrapping an MxSparseArray from Python primitives.
Mirrors the logic inside Session.write_array_elements so tests can assert
the exact wire shape the helper produces without going through the full
gRPC stack.
"""
from zb_mom_ww_mxgateway.values import to_mx_value
return pb.MxValue(
sparse_array_value=pb.MxSparseArray(
element_data_type=element_data_type,
total_length=total_length,
elements=[
pb.MxSparseElement(index=idx, value=to_mx_value(val))
for idx, val in elements.items()
],
)
)
# ---------------------------------------------------------------------------
# Fake stub (minimal — only needs Invoke / OpenSession)
# ---------------------------------------------------------------------------
class _FakeUnary:
    def __init__(self, replies: list[Any]) -> None:
        self.replies = list(replies)
        self.requests: list[Any] = []
        self.metadata: tuple[tuple[str, str], ...] | None = None

    async def __call__(
        self,
        request: Any,
        *,
        metadata: tuple[tuple[str, str], ...],
    ) -> Any:
        self.requests.append(request)
        self.metadata = metadata
        return self.replies.pop(0)


class _FakeStub:
    """Minimal stub that satisfies GatewayClient for a single invoke round-trip."""

    def __init__(self) -> None:
        ok = pb.ProtocolStatus(code=pb.PROTOCOL_STATUS_CODE_OK)
        self.open_session = _FakeUnary([pb.OpenSessionReply(session_id="s1", protocol_status=ok)])
        self.invoke = _FakeUnary(
            [
                pb.MxCommandReply(
                    session_id="s1",
                    kind=pb.MX_COMMAND_KIND_WRITE,
                    protocol_status=ok,
                ),
            ]
        )
        self.OpenSession = self.open_session
        self.Invoke = self.invoke
# ---------------------------------------------------------------------------
# Unit tests
# ---------------------------------------------------------------------------
def test_sparse_mx_value_builder_sets_correct_oneof() -> None:
    """Builder helper must produce an MxValue with kind == 'sparse_array_value'."""
    mv = _make_sparse_mx_value(pb.MX_DATA_TYPE_INTEGER, 5, {0: 10, 3: 30})
    assert mv.WhichOneof("kind") == "sparse_array_value"


def test_sparse_mx_value_builder_total_length() -> None:
    """total_length must equal the value passed to the builder."""
    mv = _make_sparse_mx_value(pb.MX_DATA_TYPE_INTEGER, 20, {1: 7})
    assert mv.sparse_array_value.total_length == 20


def test_sparse_mx_value_builder_element_count_and_values() -> None:
    """Elements list length and scalar values must match the input dict."""
    mv = _make_sparse_mx_value(pb.MX_DATA_TYPE_INTEGER, 10, {0: 11, 4: 55, 9: 99})
    sa = mv.sparse_array_value
    assert len(sa.elements) == 3
    by_index = {e.index: e.value for e in sa.elements}
    assert by_index[0].int32_value == 11
    assert by_index[4].int32_value == 55
    assert by_index[9].int32_value == 99


def test_sparse_mx_value_builder_element_data_type() -> None:
    """element_data_type must be forwarded verbatim."""
    mv = _make_sparse_mx_value(pb.MX_DATA_TYPE_FLOAT, 3, {})
    assert mv.sparse_array_value.element_data_type == pb.MX_DATA_TYPE_FLOAT


def test_sparse_mx_value_builder_empty_elements() -> None:
    """An empty elements dict must still produce a valid MxSparseArray."""
    mv = _make_sparse_mx_value(pb.MX_DATA_TYPE_BOOLEAN, 8, {})
    sa = mv.sparse_array_value
    assert len(sa.elements) == 0
    assert sa.total_length == 8
# ---------------------------------------------------------------------------
# Integration-level: write_array_elements routes through Session.write
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_write_array_elements_sends_sparse_array_write_command() -> None:
    """write_array_elements must send a WRITE command whose value is sparse_array_value."""
    stub = _FakeStub()
    client = await GatewayClient.connect(
        ClientOptions(endpoint="fake", api_key="mxgw_test_secret", plaintext=True),
        stub=stub,
    )
    session = await client.open_session()
    await session.write_array_elements(
        server_handle=1,
        item_handle=2,
        element_data_type=pb.MX_DATA_TYPE_INTEGER,
        total_length=10,
        elements={0: 100, 5: 500},
    )
    assert len(stub.invoke.requests) == 1
    cmd_req: pb.MxCommandRequest = stub.invoke.requests[0]
    cmd = cmd_req.command
    assert cmd.kind == pb.MX_COMMAND_KIND_WRITE
    mv = cmd.write.value
    assert mv.WhichOneof("kind") == "sparse_array_value"
    sa = mv.sparse_array_value
    assert sa.element_data_type == pb.MX_DATA_TYPE_INTEGER
    assert sa.total_length == 10
    assert len(sa.elements) == 2
    by_index = {e.index: e.value for e in sa.elements}
    assert by_index[0].int32_value == 100
    assert by_index[5].int32_value == 500


@pytest.mark.asyncio
async def test_write_array_elements_forwards_user_id() -> None:
    """user_id must reach the WriteCommand."""
    stub = _FakeStub()
    client = await GatewayClient.connect(
        ClientOptions(endpoint="fake", api_key="mxgw_test_secret", plaintext=True),
        stub=stub,
    )
    session = await client.open_session()
    await session.write_array_elements(
        server_handle=1,
        item_handle=2,
        element_data_type=pb.MX_DATA_TYPE_BOOLEAN,
        total_length=4,
        elements={},
        user_id=42,
    )
    cmd = stub.invoke.requests[0].command
    assert cmd.write.user_id == 42


@pytest.mark.asyncio
async def test_write_array_elements_string_elements() -> None:
    """String element values must be encoded as string_value scalars."""
    stub = _FakeStub()
    client = await GatewayClient.connect(
        ClientOptions(endpoint="fake", api_key="mxgw_test_secret", plaintext=True),
        stub=stub,
    )
    session = await client.open_session()
    await session.write_array_elements(
        server_handle=1,
        item_handle=2,
        element_data_type=pb.MX_DATA_TYPE_STRING,
        total_length=3,
        elements={1: "hello", 2: "world"},
    )
    sa = stub.invoke.requests[0].command.write.value.sparse_array_value
    by_index = {e.index: e.value for e in sa.elements}
    assert by_index[1].string_value == "hello"
    assert by_index[2].string_value == "world"
+2 -2
@@ -590,7 +590,7 @@ checksum = "1d87ecb2933e8aeadb3e3a02b828fed80a7528047e68b4f424523a0981a3a084"
[[package]]
name = "mxgw-cli"
version = "0.1.1"
version = "0.1.2"
dependencies = [
"clap",
"futures-util",
@@ -1490,7 +1490,7 @@ dependencies = [
[[package]]
name = "zb-mom-ww-mxgateway-client"
version = "0.1.1"
version = "0.1.2"
dependencies = [
"futures-core",
"futures-util",
+2 -2
@@ -1,6 +1,6 @@
[package]
name = "zb-mom-ww-mxgateway-client"
version = "0.1.1"
version = "0.1.2"
edition = "2021"
authors = ["Joseph Doherty"]
description = "Async Rust client for the MxAccessGateway gRPC service, including a lazy-browse walker over the Galaxy Repository hierarchy."
@@ -20,7 +20,7 @@ resolver = "2"
[workspace.package]
edition = "2021"
version = "0.1.1"
version = "0.1.2"
authors = ["Joseph Doherty"]
license = "Proprietary"
repository = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
+78 -2
@@ -125,6 +125,82 @@ preserving the raw message for parity diagnostics. Command replies whose
protocol status is not `PROTOCOL_STATUS_CODE_OK` become `Error::Command` and
retain the raw `MxCommandReply`.
## Write Semantics And Common Pitfalls
These are MXAccess parity behaviors that surprise new callers. The gateway
forwards them unchanged — it does not paper over them.
### Attributing a write to a user without `authenticate_user`
MXAccess only stamps a plain `write`/`write2` with a Galaxy user id when the
item carries an active *supervisory* advise. If you are **not** using the
verified/secured path (`authenticate_user` followed by `write_secured`/`write_secured2`)
but still need the write attributed to a user id, you must first advise the
item supervisory and then pass that user id on the write. Without the
supervisory advise the `user_id` on a plain write is ignored.
The session exposes `advise`/`un_advise` but not supervisory advise, so send it
through the generic command channel:
```rust
session
    .invoke(
        MxCommandKind::AdviseSupervisory,
        Payload::AdviseSupervisory(AdviseSupervisoryCommand {
            server_handle,
            item_handle,
        }),
    )
    .await?;
session.write(server_handle, item_handle, value, user_id).await?;
```
The CLI exposes the same command as `advise-supervisory`, and `write` /
`write2` take `--user-id`.
### Array writes replace the whole array
A write to an array attribute **replaces the entire array**; it is not an
element-wise patch. To change a subset of elements, send the full array with
the unchanged elements included. For example, to change 2 elements of a
20-element array, build the `MxValue` from all 20 values (the 18 unchanged plus
the 2 new ones). Sending only the 2 changed values overwrites the attribute
with a 2-element array.
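
The merge described above can be sketched in plain Rust, independent of the client types. `merge_into_full_array` is a hypothetical helper, not part of this crate; it assumes you already hold the current values (for example from a prior read) and only illustrates that the vector handed to the write must contain every element.

```rust
/// Hypothetical helper (not part of the client crate): overlay `changes`
/// onto a copy of the current array so the write carries every element.
fn merge_into_full_array<T: Clone>(
    current: &[T],
    changes: &[(usize, T)],
) -> Result<Vec<T>, String> {
    // Start from the unchanged values so they survive the whole-array write.
    let mut full = current.to_vec();
    for (index, value) in changes {
        match full.get_mut(*index) {
            Some(slot) => *slot = value.clone(),
            None => {
                return Err(format!(
                    "index {index} is out of range for a {}-element array",
                    current.len()
                ))
            }
        }
    }
    Ok(full)
}
```

The resulting vector has the same length as the original, so building the `MxValue` from it keeps the attribute at 20 elements rather than shrinking it to 2.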
#### Default-fill partial array writes
When you only need to set a handful of indices and want every other position to
take the element type's default (zero / `false` / empty string / Unix epoch for
timestamps), use `Session::write_array_elements` instead:
```rust
// Write a 10-element integer array; index 0 = 42, index 7 = 99,
// all other indices default to 0 (not preserved from the previous value).
session
    .write_array_elements(
        server_handle,
        item_handle,
        MxDataType::Integer,
        10,
        [(0, MxValue::int32(42)), (7, MxValue::int32(99))],
        user_id,
    )
    .await?;
```
The gateway expands the sparse representation into a full `MxArray` before
forwarding to the worker — the worker and MXAccess COM never see the sparse
form. Unmentioned indices are reset to the type default, **not** preserved from
the existing attribute value.
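
The default-fill semantics can be modelled in a few lines of plain Rust. This is a sketch of the behaviour described above, not the gateway's implementation: `expand_sparse` is a hypothetical name, `T::default()` stands in for the per-type default, and letting a later duplicate index overwrite an earlier one is an assumption.

```rust
/// Hypothetical model of default-fill expansion: every position starts at
/// the type default and only the listed indices are overwritten.
fn expand_sparse<T: Clone + Default>(
    total_length: usize,
    elements: &[(usize, T)],
) -> Result<Vec<T>, String> {
    if total_length == 0 {
        return Err("total_length must be greater than zero".to_owned());
    }
    // Unmentioned indices are reset to the default, never read back
    // from the existing attribute value.
    let mut full = vec![T::default(); total_length];
    for (index, value) in elements {
        if *index >= total_length {
            return Err(format!("index {index} is outside 0..{total_length}"));
        }
        // Assumption: a repeated index simply takes the last value given.
        full[*index] = value.clone();
    }
    Ok(full)
}
```

For the README example, `expand_sparse::<i32>(10, &[(0, 42), (7, 99)])` yields 42 at index 0, 99 at index 7, and zero everywhere else.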
#### Bare-name array AddItem normalisation
`AddItem` for a bare array attribute name (e.g. `Tank01.Temperature`) is
automatically normalised to `Tank01.Temperature[]` by the gateway so the
worker can resolve the full array. You do not need to append `[]` in client
code; the gateway handles it.
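
As a rough illustration of that normalisation, the sketch below appends `[]` only when the attribute is an array and the reference carries no bracket suffix. `normalise_array_reference` and the `is_array` flag are hypothetical; how the gateway actually decides that an attribute is an array is not shown here.

```rust
/// Hypothetical model of the bare-name fallback. `is_array` stands in for
/// whatever attribute metadata the gateway consults.
fn normalise_array_reference(reference: &str, is_array: bool) -> String {
    if is_array && !reference.ends_with(']') {
        // Bare array name: address the whole array.
        format!("{reference}[]")
    } else {
        // Scalars and references that already carry brackets pass through.
        reference.to_owned()
    }
}
```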
## Galaxy Repository browse
The Galaxy Repository service exposes a read-only browse over the AVEVA System
@@ -161,7 +237,7 @@ cargo run -p mxgw-cli -- galaxy discover-hierarchy --endpoint http://localhost:5
### Browsing lazily
For UI trees or OPC UA bridges, use `browse_children` to walk one level at a
For UI trees or OPC UA bridges, use `browse_children_raw` to walk one level at a
time instead of paging the full hierarchy. Pass a default request for root
objects; subsequent calls set `parent_gobject_id`, `parent_tag_name`, or
`parent_contained_path`. Filter fields match `discover_hierarchy`. Each response
@@ -172,7 +248,7 @@ request and filter semantics.
```rust
use zb_mom_ww_mxgateway_client::generated::galaxy_repository::v1::BrowseChildrenRequest;
let reply = galaxy.browse_children(BrowseChildrenRequest::default()).await?.into_inner();
let reply = galaxy.browse_children_raw(BrowseChildrenRequest::default()).await?;
for (child, has_children) in reply.children.iter().zip(reply.child_has_children.iter()) {
println!("{} expand={}", child.tag_name, has_children);
}
+8
@@ -349,8 +349,16 @@ mxgw bench-read-bulk [--duration-seconds <n>] [--warmup-seconds <n>] [--bulk-siz
mxgw smoke --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt
mxgw batch
mxgw galaxy {test-connection,last-deploy-time,discover-hierarchy,watch}
mxgw galaxy browse [--parent-gobject-id <id>] [--category-id <id>...] [--template-contains <s>...] [--tag-name-glob <glob>] [--include-attributes] [--alarm-bearing-only] [--historized-only] [--depth <n>] [--json]
```
`galaxy browse` walks the hierarchy one level at a time over the raw
`BrowseChildren` paging path. `--depth 0` (the default) prints only the
requested level; `--depth N` eagerly expands N additional levels beneath each
returned node. `--parent-gobject-id` makes `--depth` a no-op (the parent's
children are returned as a single level). Omit `--parent-gobject-id` to browse
root objects.
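
The `--depth` behaviour can be pictured as a depth-limited recursive walk. The sketch below runs against a hypothetical in-memory tree rather than the real paged `BrowseChildren` calls; `Tree` and `walk` are illustrative names, and the `--parent-gobject-id` special case is not modelled.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the Galaxy hierarchy (parent -> children).
/// The real CLI fetches each level through paged `BrowseChildren` calls instead.
type Tree = HashMap<&'static str, Vec<&'static str>>;

/// Append the children of `parent` to `out`, then expand `depth` further levels.
fn walk(tree: &Tree, parent: &str, depth: u32, indent: usize, out: &mut Vec<String>) {
    let Some(children) = tree.get(parent) else {
        return; // leaf node: nothing to list
    };
    for child in children {
        out.push(format!("{}{}", "  ".repeat(indent), child));
        // depth 0 stops at the requested level; depth N recurses N more levels.
        if depth > 0 {
            walk(tree, child, depth - 1, indent + 1, out);
        }
    }
}
```

With `depth` set to 0 only the direct children are listed; each increment adds one more level beneath every returned node.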
`batch` reads commands from stdin one per line and dispatches each through
the normal subcommand path; the loop terminates only on stdin EOF (blank
lines log an empty-EOR-bracketed result and continue) so accidental empty
+152 -17
@@ -21,10 +21,11 @@ use serde_json::Value;
use zb_mom_ww_mxgateway_client::galaxy::{BrowseChildrenOptions, LazyBrowseNode};
use zb_mom_ww_mxgateway_client::generated::galaxy_repository::v1::DeployEvent;
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::{
alarm_feed_message, AcknowledgeAlarmRequest, AlarmFeedMessage, CloseSessionRequest, MxCommand,
MxCommandKind, MxCommandRequest, MxEvent, MxEventFamily, MxValue as ProtoMxValue,
OpenSessionRequest, PingCommand, StreamAlarmsRequest, StreamEventsRequest, Write2BulkEntry,
WriteBulkEntry, WriteSecured2BulkEntry, WriteSecuredBulkEntry,
alarm_feed_message, AcknowledgeAlarmRequest, AdviseSupervisoryCommand, AlarmFeedMessage,
CloseSessionRequest, MxCommand, MxCommandKind, MxCommandRequest, MxEvent, MxEventFamily,
MxValue as ProtoMxValue, OpenSessionRequest, PingCommand, StreamAlarmsRequest,
StreamEventsRequest, Write2BulkEntry, WriteBulkEntry, WriteSecured2BulkEntry,
WriteSecuredBulkEntry,
};
use zb_mom_ww_mxgateway_client::{
next_correlation_id, ApiKey, ClientOptions, Error, GalaxyClient, GatewayClient, MxValue,
@@ -46,8 +47,6 @@ enum Command {
Version {
#[arg(long)]
json: bool,
#[arg(long)]
jsonl: bool,
},
Ping {
#[command(flatten)]
@@ -107,6 +106,18 @@ enum Command {
#[arg(long)]
json: bool,
},
AdviseSupervisory {
#[command(flatten)]
connection: ConnectionArgs,
#[arg(long)]
session_id: String,
#[arg(long)]
server_handle: i32,
#[arg(long)]
item_handle: i32,
#[arg(long)]
json: bool,
},
SubscribeBulk {
#[command(flatten)]
connection: ConnectionArgs,
@@ -458,9 +469,16 @@ struct ConnectionArgs {
endpoint: String,
#[arg(long)]
api_key: Option<String>,
/// Name of the environment variable holding the gateway API key. The
/// variable's value must be a full gateway key of the form
/// `mxgw_<key-id>_<secret>`; it is forwarded verbatim as the Bearer
/// token, so do not point this at an unrelated credential.
#[arg(long, default_value = "MXGATEWAY_API_KEY")]
api_key_env: String,
#[arg(long)]
/// Use an unencrypted (plaintext h2c) channel. Mutually exclusive with
/// `--tls`; supplying both is rejected so an explicit `--tls` cannot be
/// silently downgraded.
#[arg(long, conflicts_with = "tls")]
plaintext: bool,
#[arg(long)]
tls: bool,
@@ -545,7 +563,7 @@ async fn dispatch(command: Command) -> Result<(), Error> {
detail: "batch cannot be nested inside another batch session".to_owned(),
});
}
Command::Version { json, .. } => print_version(json),
Command::Version { json } => print_version(json),
Command::Ping {
connection,
message,
@@ -642,6 +660,27 @@ async fn dispatch(command: Command) -> Result<(), Error> {
session.advise(server_handle, item_handle).await?;
print_ok("advise", json);
}
Command::AdviseSupervisory {
connection,
session_id,
server_handle,
item_handle,
json,
} => {
let session = session_for(connection, session_id).await?;
session
.invoke(
MxCommandKind::AdviseSupervisory,
zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::mx_command::Payload::AdviseSupervisory(
AdviseSupervisoryCommand {
server_handle,
item_handle,
},
),
)
.await?;
print_ok("advise-supervisory", json);
}
Command::SubscribeBulk {
connection,
session_id,
@@ -1214,6 +1253,24 @@ const BROWSE_PAGE_SIZE: i32 = 500;
/// Record a non-empty `next_page_token` in `seen` and reject a repeat. A
/// server that returns the same continuation token twice would loop forever,
/// so the second sighting is converted to an `InvalidArgument` error. Extracted
/// from [`browse_children_one_level`] so the guard can be unit-tested without a
/// network client.
fn register_page_token(
    seen: &mut std::collections::HashSet<String>,
    token: &str,
) -> Result<(), Error> {
    if !seen.insert(token.to_owned()) {
        return Err(Error::InvalidArgument {
            name: "page_token".to_owned(),
            detail: format!("galaxy browse children returned repeated page token `{token}`"),
        });
    }
    Ok(())
}

/// Drive `BrowseChildren` paging by hand for a single parent and return the
/// flattened child list. Used by the `browse --parent-gobject-id` path, which
/// surfaces one level of children rather than the lazy root-tree walk.
async fn browse_children_one_level(
client: &mut GalaxyClient,
parent_gobject_id: i32,
@@ -1254,14 +1311,7 @@ async fn browse_children_one_level(
if page_token.is_empty() {
return Ok(children);
}
if !seen.insert(page_token.clone()) {
return Err(Error::InvalidArgument {
name: "page_token".to_owned(),
detail: format!(
"galaxy browse children returned repeated page token `{page_token}`"
),
});
}
register_page_token(&mut seen, &page_token)?;
}
}
@@ -2337,7 +2387,18 @@ where
mod tests {
use clap::Parser;
use super::Cli;
use super::{Cli, Command};
/// Pull the flattened `ConnectionArgs` out of a parsed `ping` command so
/// `ConnectionArgs::options()` can be exercised directly.
fn connection_from_ping(args: &[&str]) -> super::ConnectionArgs {
let mut full = vec!["mxgw", "ping"];
full.extend_from_slice(args);
match Cli::try_parse_from(full).expect("ping parse").command {
Command::Ping { connection, .. } => connection,
other => panic!("expected ping command, got {other:?}"),
}
}
#[test]
fn parses_version_json_command() {
@@ -2345,6 +2406,36 @@ mod tests {
assert!(parsed.is_ok());
}
#[test]
fn connection_defaults_to_plaintext() {
let options = connection_from_ping(&[]).options();
assert!(options.plaintext(), "default channel should be plaintext");
}
#[test]
fn connection_tls_flag_disables_plaintext() {
let options = connection_from_ping(&["--tls"]).options();
assert!(
!options.plaintext(),
"--tls must select an encrypted channel"
);
}
#[test]
fn connection_plaintext_flag_selects_plaintext() {
let options = connection_from_ping(&["--plaintext"]).options();
assert!(options.plaintext());
}
#[test]
fn connection_rejects_tls_and_plaintext_together() {
let parsed = Cli::try_parse_from(["mxgw", "ping", "--tls", "--plaintext"]);
assert!(
parsed.is_err(),
"--tls and --plaintext must conflict so TLS cannot be silently downgraded"
);
}
#[test]
fn parses_write_command() {
let parsed = Cli::try_parse_from([
@@ -2513,6 +2604,50 @@ mod tests {
assert_eq!(summary.mean, 42.0);
}
#[test]
fn register_page_token_accepts_distinct_tokens_and_rejects_repeats() {
let mut seen = std::collections::HashSet::new();
assert!(super::register_page_token(&mut seen, "tok-1").is_ok());
assert!(super::register_page_token(&mut seen, "tok-2").is_ok());
let repeated = super::register_page_token(&mut seen, "tok-1");
match repeated {
Err(super::Error::InvalidArgument { name, detail }) => {
assert_eq!(name, "page_token");
assert!(detail.contains("tok-1"), "detail: {detail}");
}
other => panic!("expected InvalidArgument on repeated token, got {other:?}"),
}
}
#[test]
fn rfc3339_parser_rejects_trailing_characters() {
let err = super::parse_rfc3339_timestamp("2026-04-28T15:30:00Zextra");
assert!(err.is_err(), "trailing chars after Z must be rejected");
}
#[test]
fn rfc3339_parser_rejects_day_zero() {
let err = super::parse_rfc3339_timestamp("2026-04-00T15:30:00Z");
assert!(err.is_err(), "day 0 must be rejected");
}
#[test]
fn rfc3339_parser_rejects_month_thirteen() {
let err = super::parse_rfc3339_timestamp("2026-13-01T15:30:00Z");
assert!(err.is_err(), "month 13 must be rejected");
}
#[test]
fn rfc3339_parser_rejects_day_out_of_range_for_month() {
// April has 30 days.
let err = super::parse_rfc3339_timestamp("2026-04-31T15:30:00Z");
assert!(err.is_err(), "April 31 must be rejected");
// February 29 in a non-leap year.
let feb = super::parse_rfc3339_timestamp("2025-02-29T00:00:00Z");
assert!(feb.is_err(), "Feb 29 in a non-leap year must be rejected");
}
#[test]
fn rfc3339_parser_round_trips_z_and_offset_inputs() {
// 2026-04-28T15:30:00Z = 1_777_390_200 (sanity-checked once below)
+60 -6
@@ -17,12 +17,12 @@ use crate::generated::mxaccess_gateway::v1::mx_command_reply;
use crate::generated::mxaccess_gateway::v1::{
AddItem2Command, AddItemBulkCommand, AddItemCommand, AdviseCommand, AdviseItemBulkCommand,
BulkReadResult, BulkWriteResult, CloseSessionRequest, MxCommand, MxCommandKind, MxCommandReply,
MxCommandRequest, MxValue as ProtoMxValue, OpenSessionRequest, ReadBulkCommand,
RegisterCommand, RemoveItemBulkCommand, RemoveItemCommand, StreamEventsRequest,
SubscribeBulkCommand, SubscribeResult, UnAdviseCommand, UnAdviseItemBulkCommand,
UnsubscribeBulkCommand, Write2BulkCommand, Write2BulkEntry, Write2Command, WriteBulkCommand,
WriteBulkEntry, WriteCommand, WriteSecured2BulkCommand, WriteSecured2BulkEntry,
WriteSecuredBulkCommand, WriteSecuredBulkEntry,
MxCommandRequest, MxDataType, MxSparseArray, MxSparseElement, MxValue as ProtoMxValue,
OpenSessionRequest, ReadBulkCommand, RegisterCommand, RemoveItemBulkCommand, RemoveItemCommand,
StreamEventsRequest, SubscribeBulkCommand, SubscribeResult, UnAdviseCommand,
UnAdviseItemBulkCommand, UnsubscribeBulkCommand, Write2BulkCommand, Write2BulkEntry,
Write2Command, WriteBulkCommand, WriteBulkEntry, WriteCommand, WriteSecured2BulkCommand,
WriteSecured2BulkEntry, WriteSecuredBulkCommand, WriteSecuredBulkEntry,
};
use crate::value::MxValue;
@@ -547,6 +547,60 @@ impl Session {
Ok(())
}
/// Write a sparse, default-filled array: only the given elements
/// (index → scalar value) are set; every unmentioned index up to
/// `total_length` is written as the element type's default (a reset,
/// **not** a preserve). The gateway expands the sparse representation into
/// a whole-array write before forwarding to the worker.
///
/// This is a convenience wrapper around [`Session::write`] that builds the
/// `MxSparseArray` wire value for you. Call [`Session::write`] directly
/// if you need to pass a pre-built [`MxValue`] carrying a full
/// `MxArray`.
///
/// # Errors
///
/// Returns [`Error::InvalidArgument`] (propagated from the gateway) if
/// `total_length` is zero, exceeds the gateway's maximum array length, or
/// any element index is out of range. Returns [`Error::Command`] for
/// non-OK worker statuses, plus the usual transport/status errors.
pub async fn write_array_elements(
&self,
server_handle: i32,
item_handle: i32,
element_data_type: MxDataType,
total_length: u32,
elements: impl IntoIterator<Item = (u32, MxValue)>,
user_id: i32,
) -> Result<(), Error> {
use crate::generated::mxaccess_gateway::v1::mx_value::Kind;
let sparse_elements: Vec<MxSparseElement> = elements
.into_iter()
.map(|(index, value)| MxSparseElement {
index,
value: Some(value.into_proto()),
})
.collect();
let sparse_value = ProtoMxValue {
kind: Some(Kind::SparseArrayValue(MxSparseArray {
element_data_type: element_data_type as i32,
total_length,
elements: sparse_elements,
})),
..ProtoMxValue::default()
};
self.write(
server_handle,
item_handle,
MxValue::from_proto(sparse_value),
user_id,
)
.await
}
/// Run MXAccess `Write2` (single-value with caller-supplied timestamp).
///
/// # Errors
+5 -1
@@ -173,7 +173,11 @@ impl MxValueProjection {
Some(Kind::TimestampValue(value)) => Self::Timestamp(*value),
Some(Kind::ArrayValue(value)) => Self::Array(MxArrayValue::from_proto(value.clone())),
Some(Kind::RawValue(value)) => Self::Raw(value.clone()),
None => Self::Unset,
// SparseArrayValue is write-only: the gateway expands it before forwarding
// to the worker and never emits it in events or read replies. Map it to
// Unset so any read-side code that encounters a stale or mis-routed
// sparse value degrades gracefully rather than panicking.
Some(Kind::SparseArrayValue(_)) | None => Self::Unset,
}
}
}
+189 -4
@@ -17,6 +17,7 @@ use tonic::{Request, Response, Status};
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::mx_access_gateway_server::{
MxAccessGateway, MxAccessGatewayServer,
};
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::mx_command;
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::mx_command_reply;
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::mx_value::Kind;
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::{
@@ -24,11 +25,11 @@ use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::{
AddItem2Reply, AddItemReply, AlarmConditionState, AlarmFeedMessage, AlarmTransitionKind,
BulkReadReply, BulkReadResult, BulkSubscribeReply, BulkWriteReply, BulkWriteResult,
CloseSessionReply, CloseSessionRequest, MxCommandKind, MxCommandReply, MxDataType, MxEvent,
MxEventFamily, MxStatusCategory, MxStatusProxy, MxStatusSource, MxValue,
OnAlarmTransitionEvent, OpenSessionReply, OpenSessionRequest, ProtocolStatus,
MxEventFamily, MxSparseArray, MxSparseElement, MxStatusCategory, MxStatusProxy, MxStatusSource,
MxValue, OnAlarmTransitionEvent, OpenSessionReply, OpenSessionRequest, ProtocolStatus,
ProtocolStatusCode, QueryActiveAlarmsRequest, RegisterReply, SessionState, StreamAlarmsRequest,
StreamEventsRequest, SubscribeResult, Write2BulkEntry, WriteBulkEntry, WriteSecured2BulkEntry,
WriteSecuredBulkEntry,
StreamEventsRequest, SubscribeResult, Write2BulkEntry, WriteBulkEntry, WriteCommand,
WriteSecured2BulkEntry, WriteSecuredBulkEntry,
};
use zb_mom_ww_mxgateway_client::{
next_correlation_id, ApiKey, ClientOptions, CommandError, Error, GatewayClient, MxStatus,
@@ -659,6 +660,9 @@ struct FakeState {
authorization: Mutex<Option<String>>,
last_command_kind: Mutex<Option<i32>>,
last_correlation_id: Mutex<Option<String>>,
/// Captures the last `WriteCommand` payload received, populated when the
/// `WriteOk` override is active. Used by `write_array_elements` e2e test.
last_write_command: Mutex<Option<WriteCommand>>,
stream_dropped: Arc<AtomicBool>,
/// Optional per-test override that pins the fake's `Invoke` handler to
/// a specific reply shape (or `Err(Status)`). The default of `None`
@@ -683,6 +687,10 @@ enum InvokeOverride {
/// Fail the unary call with `Status::unavailable(...)` so the client's
/// `Code::Unavailable` -> `Error::Unavailable` mapping is exercised.
Unavailable(String),
/// Accept a `Write` command (return `protocol_status = Ok`, no payload)
/// and capture the decoded `WriteCommand` in
/// `FakeState::last_write_command` for inspection.
WriteOk,
}
#[derive(Clone)]
@@ -764,6 +772,23 @@ impl MxAccessGateway for FakeGateway {
..MxCommandReply::default()
})),
InvokeOverride::Unavailable(message) => Err(Status::unavailable(message)),
InvokeOverride::WriteOk => {
// Extract and capture the WriteCommand payload so the test
// can assert on server_handle, item_handle, user_id, and value.
if let Some(mx_command::Payload::Write(write_cmd)) =
request.command.and_then(|c| c.payload)
{
*self.state.last_write_command.lock().await = Some(write_cmd);
}
Ok(Response::new(MxCommandReply {
session_id: request.session_id,
correlation_id: "fake-correlation".to_owned(),
kind,
protocol_status: Some(ok_status("write ok")),
payload: None,
..MxCommandReply::default()
}))
}
};
}
@@ -1091,3 +1116,163 @@ fn case_by_id<'a>(cases: &'a [Value], id: &str) -> &'a Value {
.find(|case| case["id"].as_str() == Some(id))
.unwrap_or_else(|| panic!("missing fixture case {id}"))
}
// ---------------------------------------------------------------------------
// write_array_elements — end-to-end fake-server test
// ---------------------------------------------------------------------------
#[tokio::test]
async fn write_array_elements_routes_sparse_array_write_through_fake_gateway() {
// Arrange: stand up the fake gateway with WriteOk so the Write command
// succeeds and the sent WriteCommand is captured for inspection.
let state = Arc::new(FakeState::default());
*state.invoke_override.lock().await = Some(InvokeOverride::WriteOk);
let endpoint = spawn_fake_gateway(state.clone()).await;
let client = GatewayClient::connect(ClientOptions::new(endpoint))
.await
.unwrap();
let session = client.session("session-fixture");
// Act: call the public write_array_elements helper.
session
.write_array_elements(
12,
34,
MxDataType::Integer,
10,
[(2u32, ClientMxValue::int32(42))],
7,
)
.await
.unwrap();
// Assert: the fake captured a Write command with the expected handles and
// a SparseArrayValue whose total_length and element index/value are correct.
let captured = state
.last_write_command
.lock()
.await
.take()
.expect("fake should have captured a WriteCommand");
assert_eq!(captured.server_handle, 12, "server_handle must round-trip");
assert_eq!(captured.item_handle, 34, "item_handle must round-trip");
assert_eq!(captured.user_id, 7, "user_id must round-trip");
let value = captured.value.expect("WriteCommand must carry a value");
assert_eq!(
value.data_type, 0,
"outer MxValue.data_type must be Unspecified (0), not the element type"
);
let Kind::SparseArrayValue(ref sparse) = value.kind.as_ref().unwrap() else {
panic!(
"expected SparseArrayValue kind on the outer MxValue, got {:?}",
value.kind
);
};
assert_eq!(
sparse.element_data_type,
MxDataType::Integer as i32,
"element_data_type must carry the element type"
);
assert_eq!(sparse.total_length, 10, "total_length must round-trip");
assert_eq!(sparse.elements.len(), 1, "one element supplied");
let elem = &sparse.elements[0];
assert_eq!(elem.index, 2, "element index must round-trip");
assert_eq!(
elem.value.as_ref().unwrap().kind,
Some(Kind::Int32Value(42)),
"element value must round-trip"
);
}
// ---------------------------------------------------------------------------
// write_array_elements — proto shape unit tests
// ---------------------------------------------------------------------------
/// Build a proto `MxValue` wrapping an int32 `MxSparseArray`, the same oneof
/// shape `write_array_elements` sends. The tests below assert on its
/// `total_length` and elements; this helper itself asserts nothing.
fn sparse_int32_value(
total_length: u32,
elements: impl IntoIterator<Item = (u32, i32)>,
) -> MxValue {
let sparse_elements: Vec<MxSparseElement> = elements
.into_iter()
.map(|(index, v)| MxSparseElement {
index,
value: Some(MxValue {
data_type: MxDataType::Integer as i32,
variant_type: "VT_I4".to_owned(),
kind: Some(Kind::Int32Value(v)),
..MxValue::default()
}),
})
.collect();
MxValue {
data_type: 0, // Unspecified on the outer value, matching what write_array_elements sends
variant_type: String::new(),
kind: Some(Kind::SparseArrayValue(MxSparseArray {
element_data_type: MxDataType::Integer as i32,
total_length,
elements: sparse_elements,
})),
..MxValue::default()
}
}
#[test]
fn write_array_elements_proto_shape_has_sparse_oneof_kind() {
let proto = sparse_int32_value(5, [(0, 10), (3, 30)]);
let Kind::SparseArrayValue(ref sparse) = proto.kind.as_ref().unwrap() else {
panic!("expected SparseArrayValue kind, got {:?}", proto.kind);
};
assert_eq!(sparse.total_length, 5, "total_length must round-trip");
assert_eq!(sparse.elements.len(), 2, "two elements supplied");
assert_eq!(sparse.element_data_type, MxDataType::Integer as i32);
let elem0 = &sparse.elements[0];
assert_eq!(elem0.index, 0);
assert_eq!(
elem0.value.as_ref().unwrap().kind,
Some(Kind::Int32Value(10))
);
let elem3 = &sparse.elements[1];
assert_eq!(elem3.index, 3);
assert_eq!(
elem3.value.as_ref().unwrap().kind,
Some(Kind::Int32Value(30))
);
}
#[test]
fn write_array_elements_empty_elements_is_valid_all_defaults() {
let proto = sparse_int32_value(8, []);
let Kind::SparseArrayValue(ref sparse) = proto.kind.as_ref().unwrap() else {
panic!("expected SparseArrayValue kind");
};
assert_eq!(sparse.total_length, 8);
assert!(
sparse.elements.is_empty(),
"no elements means every index defaults"
);
}
#[test]
fn sparse_array_value_round_trips_through_client_mx_value_projection_as_unset() {
// SparseArrayValue is write-only. If it ever arrives on the read path
// (e.g. a future version bug), the projection should degrade to Unset
// rather than panic, because the enum variant is not readable.
let proto = sparse_int32_value(4, [(1, 99)]);
let client_value = ClientMxValue::from_proto(proto);
assert_eq!(
client_value.projection(),
&MxValueProjection::Unset,
"write-only SparseArrayValue must project to Unset, not panic"
);
}
+79 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `clients/dotnet` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -603,3 +603,80 @@ Net effect at HEAD: `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx
**Recommendation:** Either (a) tighten the documented contract to "ExpandAsync is safe to call concurrently, but Children/IsExpanded must only be read after the awaited ExpandAsync completes (no concurrent reader/expander)", or (b) make the publication safe: write `_isExpanded` via `Volatile.Write` and read via `Volatile.Read`, and return an immutable snapshot from `Children` (e.g. assign a completed `IReadOnlyList` under the lock and expose that field) so lock-free readers never observe a partially-populated list. Option (a) is the smallest change and matches the realistic usage (UI thread expands then renders).
**Resolution:** 2026-06-15 — Confirmed against source: `Children => _children` returned the live mutable backing `List<LazyBrowseNode>` and `IsExpanded => _isExpanded` read a plain `bool`, while `ExpandAsync` appended to that same list under `_expandLock` with no release/acquire barrier to lock-free readers — so a concurrent reader could enumerate a mid-append list and throw `InvalidOperationException` ("collection was modified"). Applied option (b) (safe publication): `ExpandAsync` now accumulates children into a method-local `List<LazyBrowseNode>` and, only when fully drained across all pages, publishes it via `Volatile.Write(ref _children, children)` (release) immediately before setting the now-`volatile bool _isExpanded = true`. The `_children` field is an `IReadOnlyList<LazyBrowseNode>` read via `Volatile.Read` from the `Children` getter (acquire), so a reader that observes `IsExpanded == true` always sees the fully-populated snapshot and never enumerates a partially-built list. Updated the `ExpandAsync` `<remarks>` to document the strengthened concurrent-read guarantee. Regression test `LazyBrowseNodeTests.Expand_ConcurrentReadOfChildren_NeverTearsAndPublishesAtomically` gates the child-page RPCs (via a new `FakeGalaxyRepositoryTransport.BrowseChildrenGate` hook) to hold the expand mid-flight while a background reader spins enumerating `Children` and reading `IsExpanded`, asserting no exception escapes and that once `IsExpanded` is true the published snapshot has all five children. Verified red against the pre-fix code (the reader threw `InvalidOperationException: Collection was modified` deterministically across three runs) and green after the fix.
#### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the .NET client delta: `LazyBrowseNode` lazy paging + tests, the new `MxGatewayClientCli` galaxy-browse surface + tests, `GalaxyClientFactory`/adapter seam. Client.Dotnet-025 (LazyBrowseNode publish ordering) confirmed resolved. One Medium security regression.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Client.Dotnet-026 |
| 2 | mxaccessgw conventions | No issues found |
| 3 | Concurrency & thread safety | No issues found |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | Client.Dotnet-028 |
| 6 | Performance & resource management | Client.Dotnet-027 |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | Client.Dotnet-029 |
| 9 | Testing coverage | No issues found |
| 10 | Documentation & comments | No issues found |
### Client.Dotnet-026
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/dotnet/.../MxGatewayClientCli.cs:306` (isLongRunning) |
| Status | Resolved |
**Description:** Client.Dotnet-015 extended `isLongRunning` to include the bench commands so they aren't silently cancelled by the default 30s `CancellationTokenSource` (CTS). The new `galaxy-browse` command is NOT in `isLongRunning`. A `galaxy-browse --depth N` tree walk on a large Galaxy can exceed 30s (sequential paginated RPCs per node); the CTS fires and the `OperationCanceledException` (OCE) escapes as a non-zero exit with no output — the same silent failure the bench commands were exempted from.
**Recommendation:** Add `"galaxy-browse"` to the `isLongRunning` set alongside `galaxy-watch`/bench, so it defaults to unlimited wall-clock and only applies `CancelAfter` with an explicit `--timeout`.
**Resolution:** 2026-06-16 — Confirmed against source: `CreateCancellation`'s `isLongRunning` expression at line 306 read `command is "galaxy-watch"` only — `galaxy-browse` was absent, so the default 30 s `CancelAfter` budget applied and a deep paginated tree walk that overran it would have the OCE escape as a non-zero exit with no output. (Note: at HEAD the bench commands the finding cites are also not in this set despite Client.Dotnet-015's recorded resolution, but per the task scope only `galaxy-browse` is added here.) Changed the expression to `command is "galaxy-watch" or "galaxy-browse"`, so `galaxy-browse` now runs to completion by default and only applies `CancelAfter` when the caller supplies an explicit `--timeout`. Pure correctness fix matching the existing `galaxy-watch` precedent.
### Client.Dotnet-027
| Field | Value |
|---|---|
| Severity | Low |
| Category | Performance & resource management |
| Location | `clients/dotnet/ZB.MOM.WW.MxGateway.Client/LazyBrowseNode.cs:15` |
| Status | Won't Fix |
**Description:** `LazyBrowseNode` allocates one `SemaphoreSlim _expandLock = new(1,1)` per node and never disposes it (the type is not IDisposable). For a large Galaxy browse tree (thousands of nodes), live SemaphoreSlim instances accumulate; OS handles are released only on finalization. Negligible for small trees, meaningful for long-lived large trees.
**Recommendation:** Replace the once-only async gate with a non-disposable primitive (e.g. `Lazy<Task>`-based dedup) or make `LazyBrowseNode` IDisposable and dispose the semaphore. Document the chosen lifetime contract.
**Resolution:** 2026-06-16 — **Won't Fix.** The finding's premise — that the undisposed semaphore leaks an OS handle until finalization — does not hold for this usage. `SemaphoreSlim` only allocates a kernel wait handle (`ManualResetEvent`) lazily, the first time its `AvailableWaitHandle` property is accessed; `LazyBrowseNode` uses the gate exclusively via `WaitAsync`/`Release` and never touches `AvailableWaitHandle` (verified by grep), so no unmanaged/OS handle is ever created. The semaphore is therefore pure managed memory whose lifetime is the node's and which is reclaimed by the GC with the node — `SemaphoreSlim.Dispose()` would have nothing to release. Making the type `IDisposable` (or restructuring to a `Lazy<Task>` gate) would change the public surface and push per-node disposal onto every tree consumer (thousands of nodes) for zero resource benefit, so it is not worth the over-engineering. Added an inline code comment at `LazyBrowseNode.cs:15` documenting this lifetime contract and the no-handle rationale so the design intent is explicit. No test added (no behavior change).
### Client.Dotnet-028
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Security |
| Location | `clients/dotnet/.../MxGatewayClientCli.cs:156` |
| Status | Resolved |
**Description:** Client.Dotnet-008 was recorded resolved by adding a `TryResolveApiKey` helper resolving both `--api-key` and the `--api-key-env` env-var path, wired into the error-redaction catch block. At HEAD the catch block reads `arguments.GetOptional("api-key")` only — the pre-008 code. When the key is sourced from the env var, `GetOptional("api-key")` returns null, `Redact(message, null)` is a no-op, and an exception message echoing the bearer key would print it raw to stderr. The existing regression test only covers the `--api-key` direct path, so it passes against the broken code. (Claimed regression — verify root cause before fixing.)
**Recommendation:** Restore the `TryResolveApiKey` pattern (resolve `--api-key` then the `--api-key-env`-named env var, default `MXGATEWAY_API_KEY`) in the catch block, and add a regression test that sources the key from the env var and asserts it is redacted in stderr.
**Resolution:** 2026-06-16 — **Confirmed: real regression.** The `RunCoreAsync` catch block at line 156 resolved the redaction key via `arguments.GetOptional("api-key")` only, and no `TryResolveApiKey` helper existed anywhere in the CLI project (verified by grep) — the Client.Dotnet-008 helper had been lost from the history reaching HEAD, same as the 012/013/022/023 props/doc regressions. On the `--api-key-env` path `GetOptional("api-key")` is null, so `Redact(message, null)` was a no-op and a transport error echoing the bearer token would have reached stderr unredacted. Restored a non-throwing `TryResolveApiKey(CliArguments)` helper that resolves `--api-key` then the `--api-key-env`-named env var (default `MXGATEWAY_API_KEY`) and returns null when neither is set; refactored `ResolveApiKey` to call it (so the resolution order stays single-sourced) and changed the catch block to redact `TryResolveApiKey(arguments)` instead of `GetOptional("api-key")`. Regression test `MxGatewayClientCliTests.RunAsync_ErrorOutput_RedactsApiKey_WhenSourcedFromEnvironmentVariable` sets a dedicated env var (`MXGATEWAY_TEST_API_KEY_028`), runs `open-session --api-key-env <name>` (no `--api-key` flag) against a client factory that throws an `InvalidOperationException` whose message embeds the secret, and asserts exit 1, that the secret is absent from stderr, and that `[redacted]` is present. The pre-existing `--api-key`-path test (`RunAsync_ErrorOutput_RedactsApiKey`) is retained; the new test fails against the `GetOptional("api-key")`-only catch block (key printed raw) and passes after the fix.
### Client.Dotnet-029
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/dotnet/.../IMxGatewayCliClient.cs:6` |
| Status | Resolved |
**Description:** `IMxGatewayCliClient` is a public interface with no type-level `<summary>` XML doc. The Client.Dotnet-013 resolution recorded adding one; at HEAD it is absent. No CS1591 fires (GenerateDocumentationFile now scoped to the packable library only), but the public extension point should follow the public-surface doc convention.
**Recommendation:** Add a one-line `<summary>` describing the interface and noting `MxGatewayCliClientAdapter` is the production binding.
**Resolution:** 2026-06-16 — Confirmed against source: the interface declaration at `IMxGatewayCliClient.cs:6` had no type-level `<summary>` (only the members were documented). Added a type-level `<summary>` describing the interface as the CLI's transport seam over the gateway and Galaxy Repository RPCs, naming `MxGatewayCliClientAdapter` (over a real `MxGatewayClient`) as the production binding and the in-memory fake as the test substitute. Pure documentation change — no test needed.
+94 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `clients/go` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -116,6 +116,23 @@ justified — not a finding. The `LazyBrowseNode` concurrency model
| 9 | Testing coverage | No issues found — new walker, pagination, dup-token, filter-forwarding, and TLS-posture paths are all covered. |
| 10 | Documentation & comments | New issue: README "Installing the Go client" recommends the `GONOSUMCHECK` env var, which was removed from the Go toolchain in 1.13 and is a no-op on Go 1.26 (Client.Go-029). |
#### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the Go client delta: new `ping`/`galaxy-browse` CLI commands, `Write2`/bulk additions, session.go. gofmt/vet/build clean. Two claimed regressions of prior resolutions (Go-013 drain, Go-020 signal handler) — verify root cause before fixing.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Client.Go-031 |
| 2 | mxaccessgw conventions | No issues found |
| 3 | Concurrency & thread safety | Client.Go-030 |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | Client.Go-032 |
| 9 | Testing coverage | Client.Go-033 |
| 10 | Documentation & comments | Client.Go-034 |
## Findings
### Client.Go-001
@@ -706,3 +723,78 @@ if ($dirty) {
**Recommendation:** Drop `GONOSUMCHECK` and document the current knobs: set `GOPRIVATE=gitea.dohertylan.com/*` (covers both sum-db bypass and direct VCS fetch), and add `GOINSECURE` only for plaintext hosts. Concretely: "set `GOPRIVATE=gitea.dohertylan.com/*` (this disables both the checksum database and the public module proxy for that path); add `GOINSECURE=gitea.dohertylan.com/*` if the host serves the module over plain HTTP."
**Resolution:** 2026-06-15 — Dropped the dead `GONOSUMCHECK` advice from the "Installing the Go client" section of `clients/go/README.md`; it now documents `GOPRIVATE=gitea.dohertylan.com/*` (which bypasses both the public module proxy and checksum-database verification for that path) plus `GOINSECURE=gitea.dohertylan.com/*` for plain-HTTP hosts.
### Client.Go-030
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | `clients/go/cmd/mxgw-go/main.go:1491-1494` |
| Status | Resolved |
**Description:** `runGalaxyWatch`'s limit-reached branch calls `cancelStream()` and returns WITHOUT draining the buffered `events` channel, unlike the signal-cancel branch which drains. This is the shape Client.Go-013's resolution claimed to have fixed ("now drains via for range events"). The WatchDeployEvents goroutine may still be blocked sending into the 16-deep channel; it exits via ctx cancellation (not a permanent leak) but remains alive until that propagates, racing `defer client.Close()`. (Claimed regression — verify root cause.)
**Recommendation:** After `cancelStream()` in the limit-reached branch, drain: `for range events {}`, mirroring the signal-cancel branch.
**Resolution:** 2026-06-16 — Confirmed real: the limit-reached branch returned right after `cancelStream()` while the signal-cancel branch drained `events`, so the buffered (16-deep) `WatchDeployEvents` producer could remain blocked on a send while `defer client.Close()` tore the stream down. Added the `for range events {}` drain to the limit-reached branch, mirroring the signal-cancel branch. Behaviour exercised by the existing `runGalaxyWatch` flow; verified via `go vet`/`go build`/`go test ./...`.
### Client.Go-031
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/go/cmd/mxgw-go/main.go:1037-1046` |
| Status | Resolved |
**Description:** `closeSmokeSession` registers `defer cancel()` twice on the same `cancel` variable across two `context.WithTimeout` calls when the deadline-shortening branch fires. Because `cancel` is reassigned, both defers end up calling the second context's cancel (idempotent, harmless today), while the first context is released by an explicit `cancel()`. The double-defer-on-reassigned-variable is fragile: removing the explicit `cancel()` in a future refactor would leak the first context's timer goroutine.
**Recommendation:** Use a distinct variable for the second cancel, or compute the close timeout once before allocating a single context.
**Resolution:** 2026-06-16 — Confirmed real. Rewrote `closeSmokeSession` to compute the close timeout once (default 5s, shortened to the caller's remaining deadline when sooner) and then allocate a single `context.WithTimeout` with a single `defer cancel()`, removing the reassigned-variable double-defer entirely.
### Client.Go-032
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/go/cmd/mxgw-go/main.go:839-841` |
| Status | Resolved |
**Description:** `runStreamEvents` does not install a `signal.NotifyContext` handler, while `runStreamAlarms` and `runGalaxyWatch` do. Client.Go-020's resolution claimed this was added. Without a signal-aware parent context, Ctrl+C kills the process without running `defer subscription.Close()`/`client.Close()`, so the gateway sees a torn connection rather than a clean `codes.Canceled`. (Claimed regression — verify root cause.)
**Recommendation:** Wrap `ctx` with `signal.NotifyContext(ctx, os.Interrupt, syscall.SIGTERM)` (defer the stop) before deriving `streamCtx`, matching the other two stream commands.
**Resolution:** 2026-06-16 — Confirmed real: `runStreamEvents` derived `streamCtx` directly from `ctx` with no signal handler (and `runStreamAlarms` even carried a "Mirror runStreamEvents" comment that no longer matched). Added `signal.NotifyContext(ctx, os.Interrupt, syscall.SIGTERM)` (with `defer stopSignals()`) before deriving `streamCtx`, so Ctrl+C/SIGTERM cancels the stream cleanly (gateway sees `codes.Canceled`) and the deferred `subscription.Close()`/`client.Close()` run. Imports already present. CLI guard covered by `TestRunStreamEventsRequiresSessionID`.
### Client.Go-033
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/go/cmd/mxgw-go/main_test.go` |
| Status | Resolved |
**Description:** Gaps vs prior coverage: (1) `TestRunBenchReadBulkRejectsNonPositiveDuration` (named in Client.Go-021's resolution) is absent — the `-duration-seconds`-positive guard at main.go:619 is untested; (2) `runStreamEvents` has no CLI-level test (session-id-required and limit paths untested); (3) `TestRunWriteBulkVariantRejectsMismatchedHandlesAndValues` (Client.Go-021 deliverable) is absent — the len-mismatch guard at main.go:508-510 is untested.
**Recommendation:** Add the three missing tests; all run through `runWithIO` without a fake server (except the stream-events one which can reuse the ping test's fake-server pattern).
**Resolution:** 2026-06-16 — Confirmed all three tests absent. Added them to `cmd/mxgw-go/main_test.go`, each driving `runWithIO` and asserting the guard error before any dial: `TestRunBenchReadBulkRejectsNonPositiveDuration` (`-duration-seconds 0` → "duration-seconds must be positive"), `TestRunStreamEventsRequiresSessionID` (no `-session-id` → "session-id is required"), and `TestRunWriteBulkVariantRejectsMismatchedHandlesAndValues` (2 handles / 1 value → "does not match values count"). All three pass under `go test ./...`.
### Client.Go-034
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `clients/go/README.md:245-263` |
| Status | Resolved |
**Description:** The README CLI example table lists ~12 commands but the binary now exposes ~27 subcommands (per `writeUsage`). Absent: `ping`, `galaxy-browse`, `batch`, `read-bulk`, `write-bulk`, `write2-bulk`, `write-secured-bulk`, `write-secured2-bulk`, `bench-read-bulk`, `stream-alarms`, `acknowledge-alarm`, and more. `batch` (the cross-language harness interface with an EOR sentinel + 16 MiB line cap) is undocumented entirely.
**Recommendation:** Add a complete subcommand reference, and document the `batch` EOR-sentinel protocol and line cap.
**Resolution:** 2026-06-16 — Expanded the README CLI section with a "Subcommand reference" table covering all 27 subcommands wired into `run` (incl. `ping`, `galaxy-browse`, `read-bulk`, the four bulk-write variants, `bench-read-bulk`, `stream-alarms`, `acknowledge-alarm`, `batch`), refreshed the example block, and added a "`batch` mode" subsection documenting the `__MXGW_BATCH_EOR__` end-of-result sentinel, the JSON error framing, blank-line skipping, and the 16 MiB scanner line cap.
+154 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `clients/java` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -106,6 +106,23 @@ Client.Java-001..036 are unchanged.
| 9 | Testing coverage | No issues found. The browse surface has thorough library tests in `GalaxyRepositoryClientTests` (roots, expand-populates, idempotent-single-RPC, unknown-parent not-found, multi-page gather, concurrent-callers-one-RPC, filter forwarding, repeated-page-token rejection); TLS lenient/strict paths are covered by `MxGatewayClientTlsTests` against a real in-process TLS server. |
| 10 | Documentation & comments | Issue found: the README "Browsing lazily" first code snippet calls `galaxy.browseChildren(BrowseChildrenRequest…)`, but no such method exists on `GalaxyRepositoryClient` — the raw single-RPC method is `browseChildrenRaw(BrowseChildrenRequest)`; the documented snippet does not compile (Client.Java-037). |
#### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the Java client delta: the §8 `GalaxyClientFactory` seam, `InProcessGatewayHarness`, and the §8 CLI test coverage. Seam is behavior-preserving; harness channel lifecycle correct. One Medium concurrency item in the pre-existing stream-alarms overflow handler.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Client.Java-040, Client.Java-041 |
| 2 | mxaccessgw conventions | No issues found |
| 3 | Concurrency & thread safety | Client.Java-040 |
| 4 | Error handling & resilience | Client.Java-042 |
| 5 | Security | No issues found |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | Client.Java-043, Client.Java-044 |
| 9 | Testing coverage | Client.Java-045, Client.Java-046 |
| 10 | Documentation & comments | Client.Java-047, Client.Java-048 |
## Findings
### Client.Java-001
@@ -728,6 +745,141 @@ BrowseChildrenReply reply = galaxy.browseChildren(
**Resolution:** 2026-06-15 — Confirmed against source: `MxGatewayClientOptions` (`zb-mom-ww-mxgateway-client/.../MxGatewayClientOptions.java:108,260`) exposes `requireCertificateValidation()` and a `Builder.requireCertificateValidation(boolean)`, but the CLI `CommonOptions` in `MxGatewayCli.java` declared no flag and `toClientOptions()` never set it, forcing the lenient default on every non-pinned TLS CLI connection. Added a bare-boolean `@Option(names = "--require-certificate-validation")` field to `CommonOptions` (defaults to `false`, preserving the lenient default; mirrors the existing `--plaintext` flag-style option), propagated it through `toClientOptions()` via `.requireCertificateValidation(requireCertificateValidation)`, and added it to `redactedJsonMap()` so `--json` output reflects the effective trust posture. Documented the new flag and the lenient-by-default trust posture in `clients/java/README.md`. Note: the Client.Java-025 precedent (`shutdownTimeout`) was applied to the pre-rename `mxgateway-cli` module and is not present in this renamed `zb-mom-ww-mxgateway-cli` `toClientOptions()`; I mirrored the live `--ca-file`/`--server-name-override` TLS-option plumbing pattern instead, which is the correct precedent here. Regression tests in `MxGatewayCliTests`: `requireCertificateValidationFlagPropagatesThroughToClientOptions` (drives `acknowledge-alarm --require-certificate-validation` through a new `CapturingClientFactory` that records `options.toClientOptions()` and asserts `MxGatewayClientOptions.requireCertificateValidation()` is `true`) and `requireCertificateValidationDefaultsToLenientWhenFlagAbsent` (asserts the flag defaults to `false`). The capturing factory exercises the real `toClientOptions()` propagation, stronger than a parse-only check.
### Client.Java-040
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1552-1561` |
| Status | Resolved |
**Description:** The `stream-alarms` overflow handler does `queue.clear()` then `offer(exception)` + `offer(ALARM_FEED_END)` non-atomically on an `ArrayBlockingQueue` shared with the gRPC delivery thread. In production gRPC (netty I/O thread), a concurrent `onNext` between the clear and the offers can re-enqueue a normal message, displacing the overflow exception so the drain loop hits the normal message and may exit before reaching the exception — exiting 0 on a truncated feed. Same race class as Client.Java-002/033.
**Recommendation:** Guard the overflow transition with an `AtomicBoolean` (mirror `MxGatewayStreamSubscription.terminate()`'s terminated-flag + lock) instead of re-clearing the queue.
**Resolution:** 2026-06-16 — Confirmed root cause in `StreamAlarmsCommand.call()`: the overflow branch did `queue.clear()` then `offer(exception)` + `offer(ALARM_FEED_END)`, so a concurrent `onNext` between the clear and the offers could re-enqueue a normal message and displace the overflow signal. (Note: `MxGatewayStreamSubscription` has no `terminate()` method; the terminal-guard model lives in `MxEventStream`, which itself still uses the clear+offer shape — I implemented the atomic guard the finding asks for rather than copying the older pattern.) Replaced the clear+offer with a single `AtomicBoolean terminated` guard (`compareAndSet(false,true)` — first terminal wins) plus a dedicated `AtomicReference<Object> terminal` slot that holds the terminal item (overflow exception / transport error / `ALARM_FEED_END`) independently of the bounded queue. `onNext` no longer re-clears the queue; it just stops enqueueing once terminated. The drain loop now `poll(50ms)`s and, when the queue is empty, reads the terminal slot. No re-clear, and a concurrent `onNext` can no longer displace the terminal. Fix applied 2026-06-16, verified on windev 2026-06-17 (gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests: BUILD SUCCESSFUL). Regression test: `MxGatewayCliTests.streamAlarmsCommandFailsFastOnQueueOverflow` (strengthened under Client.Java-046 to drive async delivery and assert the overflow text).
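The terminal-slot guard described in this resolution can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual `StreamAlarmsCommand` code: the class name `TerminalGuardSketch`, the `FEED_END` sentinel, the queue capacity, and the method shapes are hypothetical. It also assumes callbacks arrive from a single delivery thread, and it lets queued messages drain before the terminal item is returned, which may differ from the real fail-fast behavior.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch: names and capacity are illustrative, not taken from MxGatewayCli.
final class TerminalGuardSketch {
    static final Object FEED_END = new Object(); // normal end-of-feed marker

    private final BlockingQueue<Object> queue;
    private final AtomicBoolean terminated = new AtomicBoolean(false);
    private final AtomicReference<Object> terminal = new AtomicReference<>();

    TerminalGuardSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // First terminal wins. The item is stored outside the bounded queue,
    // so a full queue or a later onNext cannot displace it.
    boolean terminate(Object item) {
        if (!terminated.compareAndSet(false, true)) {
            return false;
        }
        terminal.set(item);
        return true;
    }

    void onNext(Object message) {
        if (terminated.get()) {
            return; // stop enqueueing once a terminal has been recorded
        }
        if (!queue.offer(message)) {
            terminate(new IllegalStateException("queue overflowed"));
        }
    }

    void onError(Throwable error) {
        terminate(error);
    }

    void onCompleted() {
        terminate(FEED_END);
    }

    // Passes queued messages to the sink, then returns the terminal item.
    // The slot is read before polling so that anything enqueued before the
    // terminal was recorded is still delivered (single-producer assumption).
    Object drain(Consumer<Object> sink) {
        while (true) {
            Object end = terminal.get();
            Object next;
            try {
                next = queue.poll(end == null ? 50 : 0, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return e;
            }
            if (next != null) {
                sink.accept(next);
            } else if (end != null) {
                return end;
            }
        }
    }
}
```

Because the terminal item never competes for a queue slot, the same path serves overflow, transport errors, and normal completion.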
### Client.Java-041
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:2187-2194` |
| Status | Resolved |
**Description:** `jsonString` escapes only `\`, `"`, `\r`, `\n` — not `\t`, `\b`, `\f`, or U+0000–U+001F / U+007F. A tag address/message/reference containing a tab produces malformed JSON (RFC 8259). Affects the hand-rolled `jsonObject`/`jsonString`/`jsonValue` output paths (the protobuf `JsonFormat` path is spec-correct).
**Recommendation:** Add `\t`/`\b`/`\f` escapes and `\u00XX` for control chars, or route all JSON through a real JSON library.
**Resolution:** 2026-06-16 — Confirmed: `jsonString` escaped only `\\ \" \r \n`, so a tab/backspace/form-feed or any other U+0000–U+001F (or U+007F) char produced malformed JSON. Rewrote `jsonString` as a per-character builder that emits the two-character escapes for `\t \b \f \r \n \" \\` and `\u00XX` for the remaining `< 0x20` range plus DEL (`0x7f`), keeping ordinary printable characters verbatim. Widened `jsonString` from `private` to package-private (matching the Client.Java-032 `commandLine(...)` precedent) so the escaping can be unit-tested directly. Fix applied 2026-06-16, verified on windev 2026-06-17 (gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests: BUILD SUCCESSFUL). Regression test: `MxGatewayCliTests.jsonStringEscapesControlCharacters`.
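A per-character escaper of the kind described here might look like the following. It is a sketch only: the class name `JsonEscapeSketch` is hypothetical, and both the surrounding quotes in the return value and the non-null input are assumptions about the real `jsonString` contract.

```java
// Hypothetical sketch of JSON string escaping; not the actual MxGatewayCli.jsonString.
final class JsonEscapeSketch {
    // Returns the value as a quoted JSON string literal (input assumed non-null).
    static String jsonString(String value) {
        StringBuilder out = new StringBuilder(value.length() + 2).append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\t': out.append("\\t"); break;
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                case '\r': out.append("\\r"); break;
                case '\n': out.append("\\n"); break;
                default:
                    if (c < 0x20 || c == 0x7f) {
                        // Remaining control characters and DEL become
                        // four-hex-digit escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c); // ordinary characters pass through verbatim
                    }
            }
        }
        return out.append('"').toString();
    }
}
```

For example, a tag name containing a tab comes out with a two-character `\t` escape instead of a raw tab, so the emitted line stays parseable.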
### Client.Java-042
| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1565-1567` |
| Status | Resolved |
**Description:** `StreamAlarmsCommand.onError` calls `queue.offer(error)` without checking the return value. If the queue is full when a transport error arrives, the error is dropped and the drain loop blocks forever on `queue.take()`. Same class as Client.Java-033 on the error path.
**Recommendation:** Reserve a sentinel slot or use the `terminate(Throwable)` guard from `MxEventStream`; ensure the drain always sees a terminal item.
**Resolution:** 2026-06-16 — Confirmed: `onError` did a bare `queue.offer(error)` that, on a full queue, dropped the error and stranded the drain on `queue.take()` forever. Fixed together with Client.Java-040: `onError` now routes through the shared `terminate(error)` consumer, which records the throwable in the dedicated `terminal` slot (guarded by the `AtomicBoolean`, never enqueued into the bounded `queue`). The drain loop reads that slot via the `poll(50ms)` + terminal-check path, so a transport error is always observed even when the queue is full, and the `take()`-forever deadlock is gone. Fix applied 2026-06-16, verified on windev 2026-06-17 (gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests: BUILD SUCCESSFUL). Covered by the same `streamAlarmsCommandFailsFastOnQueueOverflow` terminal-slot plumbing; the error path shares the slot with the overflow path.
### Client.Java-043
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/test/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCliTests.java:241-264` |
| Status | Resolved |
**Description:** `galaxyBrowseParentZeroEmitsWarningToStderr` calls `MxGatewayCli.execute(new FakeClientFactory(), ...)` for a galaxy-browse command, which wires the real `GrpcGalaxyClientFactory` and constructs a live Netty channel to localhost:5000 as a side effect (asserting only the warning). Wasteful and non-deterministic if port 5000 is reachable.
**Recommendation:** Use `executeGalaxy(...)` with a `GalaxyClientFactory` stub that throws, so only the warning path runs.
**Resolution:** 2026-06-16 — Confirmed: the test called `MxGatewayCli.execute(new FakeClientFactory(), ...)`, which routes galaxy commands through the production `GrpcGalaxyClientFactory`; `GalaxyBrowseCommand.call()` prints the `--parent 0` warning then `connect()`s a live `GalaxyRepositoryClient` (Netty channel to localhost:5000) before failing — wasteful and non-deterministic. Rewrote the test to use the existing `executeGalaxy(...)` seam with a new `ThrowingGalaxyClientFactory` stub whose `connect()` throws; the warning is emitted before `connect()` is reached, so only the warning path runs and no live channel is constructed. Fix applied 2026-06-16, verified on windev 2026-06-17 (gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests: BUILD SUCCESSFUL). Test: `MxGatewayCliTests.galaxyBrowseParentZeroEmitsWarningToStderr` (updated).
### Client.Java-044
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayClientVersion.java:12` |
| Status | Resolved |
**Description:** `CLIENT_VERSION = "0.1.0"` is out of sync with Gradle `version = '0.1.1'` (cross-ref `clients/java/build.gradle:6`). The `version` command advertises 0.1.0 while the published artifact is 0.1.1; consumers can't use the version string as a reliable artifact check.
**Recommendation:** Bump `CLIENT_VERSION` to `0.1.1` (and the two test assertions), or source it from a Gradle-generated properties file.
**Resolution:** 2026-06-16 — Confirmed: `MxGatewayClientVersion.CLIENT_VERSION = "0.1.0"` while `clients/java/build.gradle:16` sets `version = '0.1.1'` and the README Maven coordinate is `:0.1.1`. Bumped `CLIENT_VERSION` to `"0.1.1"` and updated the two test assertions (`MxGatewayCliTests.versionCommandPrintsProtocolVersions` line asserting `"mxgateway-java 0.1.0"` and `versionCommandPrintsJson` asserting `"clientVersion":"0.1.0"`) to `0.1.1`. Left as a hardcoded constant (sourcing from a Gradle-generated properties file was the optional alternative, not required). Fix applied 2026-06-16, verified on windev 2026-06-17 (gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests: BUILD SUCCESSFUL). Tests: `MxGatewayCliTests.versionCommandPrintsProtocolVersions`, `versionCommandPrintsJson`.
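The optional alternative from the recommendation, reading the version from a build-generated properties file instead of a hardcoded constant, could be sketched as follows. Nothing here exists in the repository: the class name `ClientVersionSketch`, the property key `client.version`, and the `"unknown"` fallback are all assumptions made for illustration, and the Gradle task that would generate the file is not shown.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: the key name and fallback value are illustrative assumptions.
final class ClientVersionSketch {
    static final String FALLBACK = "unknown";

    // Reads "client.version" from a properties source, falling back when the
    // source is unreadable or the key is missing or blank.
    static String readVersion(Reader source) {
        Properties props = new Properties();
        try (Reader in = source) {
            props.load(in);
        } catch (IOException e) {
            return FALLBACK;
        }
        String version = props.getProperty("client.version");
        return (version == null || version.isBlank()) ? FALLBACK : version.trim();
    }

    public static void main(String[] args) {
        // In a real build the Reader would wrap a classpath resource written by Gradle.
        System.out.println(readVersion(new StringReader("client.version=0.1.1\n")));
    }
}
```

The trade-off is one more build step in exchange for a version string that cannot drift from the Gradle `version`; the hardcoded constant chosen in the resolution avoids that step but relies on manual bumps.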
### Client.Java-045
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/InProcessGatewayHarness.java` |
| Status | Resolved |
**Description:** The harness implements only `streamEvents`/`closeSession` (gateway) and `discoverHierarchy`/`watchDeployEvents` (galaxy); all other RPCs return gRPC UNIMPLEMENTED. This is undocumented, so a future test exercising invoke/register through the harness would silently get UNIMPLEMENTED.
**Recommendation:** Add a Javadoc note enumerating implemented RPCs and warning that others return UNIMPLEMENTED by design.
**Resolution:** 2026-06-16 — Confirmed against source (the file lives under `src/test/...`, not `src/main/...` as the finding location states): the scripted fakes override only `streamEvents`/`closeSession` (gateway) and `discoverHierarchy`/`watchDeployEvents` (galaxy); every other RPC inherits the generated `*ImplBase` default and returns gRPC `UNIMPLEMENTED`. Added an "Implemented RPCs" section to the `InProcessGatewayHarness` class Javadoc enumerating the four overridden RPCs and warning that all others (openSession, invoke, register, streamAlarms, queryActiveAlarms, browseChildren, …) return `UNIMPLEMENTED` by design, so a future test must add a scripted override first. Doc-only change. Fix applied 2026-06-16, verified on windev 2026-06-17 (gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests: BUILD SUCCESSFUL). No test needed.
### Client.Java-046
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/test/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCliTests.java:680-696` |
| Status | Resolved |
**Description:** `streamAlarmsCommandFailsFastOnQueueOverflow` delivers all 2000 onNext synchronously from within `streamAlarms`, so `subscriptionRef` is still null when the overflow fires — the `sub.cancel()` branch is never exercised. The test also doesn't assert the overflow message text. It passes for a reason that doesn't generalize to async gRPC delivery.
**Recommendation:** Deliver messages asynchronously so the cancel path runs, and assert the overflow error text appears in output.
**Resolution:** 2026-06-16 — Confirmed: `OverflowingFakeClient.streamAlarms` pushed all 2000 `onNext` synchronously and returned the subscription only afterward, so `subscriptionRef` was still null when the overflow fired and the `sub.cancel()` branch never ran; the test also asserted only the exit code, not the overflow text. Reworked `OverflowingFakeClient.streamAlarms` to flood on a background daemon thread (mirroring a real netty I/O thread) and return the subscription first, so the overflow fires with a non-null published subscription and exercises the `terminate()` cancel path. Strengthened `streamAlarmsCommandFailsFastOnQueueOverflow` to additionally assert the overflow message text ("queue overflowed") surfaces in stderr/stdout. Fix applied 2026-06-16, verified on windev 2026-06-17 (gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests: BUILD SUCCESSFUL). Test: `MxGatewayCliTests.streamAlarmsCommandFailsFastOnQueueOverflow` (updated; also validates the Client.Java-040 terminal-slot fix).
### Client.Java-047
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `clients/java/README.md` |
| Status | Resolved |
**Description:** README advertises the `0.1.1` artifact coordinate (Gitea Maven section) while the `version` command reports `0.1.0` — the user-visible symptom of Client.Java-044. Cross-ref `MxGatewayClientVersion.java:12`.
**Recommendation:** Resolved by fixing Client.Java-044 (sync the compiled-in version).
**Resolution:** 2026-06-16 — Symptom of Client.Java-044, resolved together. The README's `0.1.1` Maven coordinate (`clients/java/README.md:336`) was already correct; the divergence was the compiled-in `CLIENT_VERSION = "0.1.0"`. Bumping `CLIENT_VERSION` to `0.1.1` (Client.Java-044) makes the `version` command report `0.1.1`, matching the README. No README edit needed. Fix applied 2026-06-16, verified on windev 2026-06-17 (gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests: BUILD SUCCESSFUL).
### Client.Java-048
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:88-105` |
| Status | Resolved |
**Description:** The public `execute(PrintWriter, PrintWriter, String...)` Javadoc calls it "Test-friendly entry point", but it wires `GrpcMxGatewayCliClientFactory` with no injection — the actual test seam is the package-private `execute(MxGatewayCliClientFactory, ...)` / `commandLine(...)` overload. Misleading.
**Recommendation:** Clarify the Javadoc to direct readers to the injectable overload for testing.
**Resolution:** 2026-06-16 — Confirmed: the public `execute(PrintWriter, PrintWriter, String...)` Javadoc called it the "Test-friendly entry point", but it wires the production `GrpcMxGatewayCliClientFactory` with no injection seam — unit tests actually use the package-private `execute(MxGatewayCliClientFactory, ...)` / `commandLine(...)` overloads. Rewrote the Javadoc to drop "test-friendly", explain it wires a real gRPC channel, and direct test authors to the injectable package-private overloads. Doc-only change. Fix applied 2026-06-16, verified on windev 2026-06-17 (gradle :zb-mom-ww-mxgateway-cli:test --tests *MxGatewayCliTests: BUILD SUCCESSFUL). No test needed.
### Client.Java-039
+94 -2
@@ -4,13 +4,30 @@
|---|---|
| Module | `clients/python` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
## Checklist coverage
### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the Python client delta: new galaxy CLI commands, options.py TLS/auth, large test additions. Prior Client.Python-027..031 confirmed resolved. One claimed regression (Python-004 dead variable) and one Medium README/API mismatch.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Client.Python-032, Client.Python-033, Client.Python-034 |
| 2 | mxaccessgw conventions | No issues found |
| 3 | Concurrency & thread safety | No issues found |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | Client.Python-035 |
| 9 | Testing coverage | Client.Python-036 |
| 10 | Documentation & comments | Client.Python-036 |
### 2026-06-15 re-review (commit 410acc9)
Re-review pass at `410acc9`. The diff against the previous review base
@@ -1438,3 +1455,78 @@ under `[tool.pytest.ini_options]` in `clients/python/pyproject.toml`.
`python -m pytest` now reports no `PytestUnknownMarkWarning` (full run: 91
passed, 1 skipped, 0 warnings; previously 1 warning). The `tls`-marked
`tests/test_tls.py` module is the guard — its run is now warning-free.
### Client.Python-032
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:1048,1065-1066` |
| Status | Resolved |
**Description:** `_smoke` reintroduces the dead `closed = False` / `if not closed:` guard that Client.Python-004's resolution claimed to have removed via `async with session:`. `closed` is never reassigned, so the guard is always true. Behavior is correct (session always closed) but the dead variable misleads readers into expecting an early-close path. (Claimed regression — verify root cause.)
**Recommendation:** Use `async with session:` or drop the `closed` variable and close unconditionally.
**Resolution:** 2026-06-16 — Confirmed regression: the dead `closed = False` / `if not closed:` guard had returned. Replaced the `try/finally` with `async with session:` (Session implements the async context-manager protocol). Test: `test_smoke_does_not_carry_dead_closed_guard` in `tests/test_review_findings_032_to_036.py`.
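The shape of this fix can be sketched as follows. This is a minimal illustration, not the project's code: `FakeSession`, `smoke`, and `smoke_failing` are hypothetical stand-ins, and the only assumption carried over from the finding is that the session type implements the async context-manager protocol.

```python
import asyncio
from typing import Tuple


class FakeSession:
    """Hypothetical stand-in for the client's Session type."""

    def __init__(self) -> None:
        self.closed = False

    async def __aenter__(self) -> "FakeSession":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        # Runs on normal exit and when the body raises.
        self.closed = True
        return False  # do not swallow exceptions

    async def ping(self) -> str:
        return "pong"


async def smoke(session: FakeSession) -> str:
    # No `closed` flag and no try/finally: the context manager owns cleanup.
    async with session:
        return await session.ping()


async def smoke_failing(session: FakeSession) -> None:
    async with session:
        raise RuntimeError("boom")


def run_demo() -> Tuple[bool, bool]:
    """Return whether the session was closed on the success and failure paths."""
    ok = FakeSession()
    asyncio.run(smoke(ok))
    failed = FakeSession()
    try:
        asyncio.run(smoke_failing(failed))
    except RuntimeError:
        pass
    return ok.closed, failed.closed
```

Because cleanup lives in `__aexit__`, there is no flag left for a reader to misinterpret as an early-close path.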
### Client.Python-033
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:772,1490-1494` |
| Status | Resolved |
**Description:** `_parse_string_list` always emits `param_hint="--items"`, but it is also called from `_build_write_bulk_entries` with `kwargs["values"]`. An empty `--values ""` on the write-bulk commands yields `Error: Invalid value for '--items': ...`, pointing at a flag that doesn't exist on those commands.
**Recommendation:** Add an optional `param_hint` parameter (default `--items`) and pass `--values` from the write-bulk caller.
**Resolution:** 2026-06-16 — Added `param_hint="--items"` default param to `_parse_string_list`; `_build_write_bulk_entries` now passes `param_hint="--values"`. Tests: `test_parse_string_list_default_param_hint_is_items`, `test_parse_string_list_accepts_caller_supplied_param_hint` in `tests/test_review_findings_032_to_036.py`.
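A minimal sketch of the caller-supplied hint follows. The parsing rules (comma-split, strip, reject empty) are an assumption for illustration, and `BadParameter` here is a local stand-in for `click.BadParameter` so the example runs without third-party packages.

```python
from typing import List, Optional


class BadParameter(Exception):
    """Local stand-in for click.BadParameter (assumption: only message and param_hint matter)."""

    def __init__(self, message: str, param_hint: Optional[str] = None) -> None:
        super().__init__(message)
        self.param_hint = param_hint


def parse_string_list(raw: str, param_hint: str = "--items") -> List[str]:
    # Hypothetical parsing rules: split on commas and drop blank entries.
    items = [part.strip() for part in raw.split(",") if part.strip()]
    if not items:
        raise BadParameter("expected at least one value", param_hint=param_hint)
    return items


def build_write_bulk_entries(values: str) -> List[str]:
    # The write-bulk caller names its own flag so the error points at --values.
    return parse_string_list(values, param_hint="--values")
```

Keeping `--items` as the default leaves existing callers unchanged while letting the write-bulk path report the flag the user actually typed.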
### Client.Python-034
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:1497-1501` |
| Status | Resolved |
**Description:** `_parse_int_list` does `int(item)` with no error handling. A non-numeric token (e.g. `--item-handles "10,abc"`) raises a raw `ValueError`, surfacing as an unformatted traceback interactively (other input errors raise `click.BadParameter`).
**Recommendation:** Wrap the conversion and re-raise as `click.BadParameter(..., param_hint="--item-handles")`.
**Resolution:** 2026-06-16 — Wrapped the `int()` comprehension in `try/except ValueError` and re-raise as `click.BadParameter(..., param_hint="--item-handles")`. Tests: `test_parse_int_list_non_numeric_raises_bad_parameter`, `test_parse_int_list_happy_path` in `tests/test_review_findings_032_to_036.py`.
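The wrap-and-re-raise pattern can be sketched like this. It is an illustration under assumptions: `BadParameter` is a local stand-in for `click.BadParameter`, and the tokenizing rules are hypothetical.

```python
from typing import List, Optional


class BadParameter(Exception):
    """Local stand-in for click.BadParameter (assumption: only message and param_hint matter)."""

    def __init__(self, message: str, param_hint: Optional[str] = None) -> None:
        super().__init__(message)
        self.param_hint = param_hint


def parse_int_list(raw: str, param_hint: str = "--item-handles") -> List[int]:
    # Hypothetical tokenizing: split on commas and ignore blank tokens.
    tokens = [token.strip() for token in raw.split(",") if token.strip()]
    try:
        return [int(token) for token in tokens]
    except ValueError as exc:
        # Convert the raw ValueError into the CLI's usual input-error type.
        raise BadParameter(
            f"expected comma-separated integers, got {raw!r}",
            param_hint=param_hint,
        ) from exc
```

Chaining with `from exc` keeps the original `ValueError` available for debugging while the user sees a formatted usage error.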
### Client.Python-035
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/python/src/zb_mom_ww_mxgateway/__init__.py`, `.../options.py:63-77`, `.../galaxy.py:293` |
| Status | Resolved |
**Description:** Two new public types — `BrowseChildrenOptions` (options.py) and `LazyBrowseNode` (galaxy.py) — are absent from `__init__.py`/`__all__`, so callers can't `from zb_mom_ww_mxgateway import BrowseChildrenOptions`, breaking the package-root import contract that `ClientOptions`/`GatewayClient`/etc. follow.
**Recommendation:** Re-export both from `__init__.py` and add them to `__all__`.
**Resolution:** 2026-06-16 — Re-exported `BrowseChildrenOptions` (from `.options`) and `LazyBrowseNode` (from `.galaxy`) in `__init__.py` and added both to `__all__`. Tests: `test_browse_children_options_is_exported_from_package_root`, `test_lazy_browse_node_is_exported_from_package_root` in `tests/test_review_findings_032_to_036.py`.
### Client.Python-036
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Documentation & comments |
| Location | `clients/python/README.md:143-158` |
| Status | Resolved |
**Description:** The README "Browsing lazily" section's code example calls `galaxy.browse_children(...)`, a method that does not exist — the actual public low-level method is `browse_children_raw`. The example raises `AttributeError` at runtime. The README-parse test only covers shell CLI invocations, not Python code fragments, so it doesn't catch this.
**Recommendation:** Update the example/prose to `browse_children_raw(...)` (and promote the high-level `browse()`/`LazyBrowseNode` path), or add a `browse_children` alias. Add a `hasattr` test to catch future renames.
**Resolution:** 2026-06-16 — Updated the README "Browsing lazily" prose and example to `browse_children_raw(...)` and added a pointer to the higher-level `browse()`/`LazyBrowseNode` walker. Tests: `test_galaxy_client_exposes_browse_children_raw` (hasattr guard) and `test_readme_browse_example_uses_existing_method` (parses every `galaxy.<method>()` call in README against the client class) in `tests/test_review_findings_032_to_036.py`.
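One way such a README guard could work is sketched below. The regex, the `missing_methods` helper, and the `GalaxyClient` stub are all hypothetical; the real test in `tests/test_review_findings_032_to_036.py` may be structured differently.

```python
import re
from typing import List


class GalaxyClient:
    """Hypothetical stub exposing only the method names mentioned in the finding."""

    def browse_children_raw(self, request=None):
        return None

    def browse(self):
        return None


# Matches `galaxy.<identifier>(` and captures the identifier.
CALL_PATTERN = re.compile(r"\bgalaxy\.([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def missing_methods(readme_text: str, client_cls: type) -> List[str]:
    """Return README-referenced method names that the client class lacks."""
    called = set(CALL_PATTERN.findall(readme_text))
    return sorted(name for name in called if not hasattr(client_cls, name))
```

A test would then assert that `missing_methods(readme, GalaxyClient)` is empty, so a future rename fails the suite instead of surfacing as an `AttributeError` for README readers.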
+109 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `clients/rust` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -115,6 +115,23 @@ Re-review pass at `410acc9`. The diff against `42b0037` (`git diff 42b0037..HEAD
| 9 | Testing coverage | No issues found in the new surface — the walker has six unit tests (roots, expand, idempotency, NotFound, multi-page, filter-forwarding) and TLS has four. Gap noted: `tls_with_require_certificate_validation_does_not_short_circuit` connects to a dead address, so it only asserts the guard does not fire and never exercises a real handshake — which is why the no-trust-roots defect in Client.Rust-031 is not caught by a test. |
| 10 | Documentation & comments | Issue found: the `alarm_feed_message_summary` / `alarm_feed_message_to_json` doc comments still say "three `payload` oneof cases" (`main.rs:1729,1755`) although the proto now has four; folded into Client.Rust-030's fix. The TLS doc inaccuracy is Client.Rust-031. |
#### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the Rust client delta: options.rs TLS trust decision, mxgw-cli galaxy browse, Cargo metadata. Prior Client.Rust-030/031/032 confirmed resolved. fmt/clippy/test clean. One Medium TLS-downgrade correctness item.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Client.Rust-033, Client.Rust-034 |
| 2 | mxaccessgw conventions | No issues found |
| 3 | Concurrency & thread safety | No issues found |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | Client.Rust-035 |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | Client.Rust-036, Client.Rust-037 |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Client.Rust-038 |
| 10 | Documentation & comments | No issues found |
## Findings
### Client.Rust-001
@@ -762,3 +779,93 @@ This is masked by the tests: `tls_with_require_certificate_validation_does_not_s
**Recommendation:** Add a "Lazy browse" subsection to the Galaxy section of `RustClientDesign.md` enumerating `browse`, `browse_children_raw`, `BrowseChildrenOptions` (its filter fields and AND semantics), and `LazyBrowseNode` (the `Arc`-shared clone semantics, the idempotent single-RPC `expand`, the `has_children_hint`, and the internal paged `BrowseChildren` loop with its repeated-page-token guard). Cross-reference `docs/GalaxyRepository.md#browsechildren` for the wire-level request/filter semantics the README already links.
**Resolution:** 2026-06-15 — Confirmed by inspection that `RustClientDesign.md` had no Galaxy library-API coverage at all. Added a new "Galaxy Repository" section documenting `browse`, `browse_children_raw`, the `BrowseChildrenOptions` filter struct (all six fields, AND combination semantics, `include_attributes` tri-state), and `LazyBrowseNode` (`Arc`-shared clone semantics, `has_children_hint`, the idempotent single-RPC `expand` under an async mutex with page size 500, and the repeated-page-token `Error::InvalidArgument` guard), cross-referencing `docs/GalaxyRepository.md#browsechildren`. Also noted the fourth alarm `provider_status` oneof case in the Alarms section while resolving Client.Rust-030. Doc-only change verified by inspection; design-doc anchor target confirmed present.
### Client.Rust-033
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `clients/rust/crates/mxgw-cli/src/main.rs:485` |
| Status | Resolved |
**Description:** `ConnectionArgs::options()` computes plaintext as `!self.tls || self.plaintext`. With both `--tls` and `--plaintext` supplied, this is `true`, silently degrading to an unencrypted channel despite the explicit `--tls`. A security-sensitive footgun (e.g. a script auto-appending `--plaintext`).
**Recommendation:** Add clap `conflicts_with = "tls"` on `--plaintext` (reject the combo), or prefer `--tls` and warn.
**Resolution:** 2026-06-16 — Added `conflicts_with = "tls"` to the `--plaintext` arg so supplying both is rejected at parse time, removing the silent downgrade. Tests: `connection_rejects_tls_and_plaintext_together`, `connection_tls_flag_disables_plaintext`, `connection_defaults_to_plaintext`, `connection_plaintext_flag_selects_plaintext`.
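The downgrade and its fix can be illustrated with a Python analogue. This is not the Rust code: `argparse`'s mutually exclusive group stands in for clap's `conflicts_with`, and the parser name and helper functions are hypothetical.

```python
import argparse
from typing import List


def resolve_plaintext_old(tls: bool, plaintext: bool) -> bool:
    # Expression quoted in the finding: with both flags set this is still True.
    return (not tls) or plaintext


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mxgw-sketch", add_help=False)
    # Supplying both flags is rejected at parse time.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--tls", action="store_true")
    group.add_argument("--plaintext", action="store_true")
    return parser


def resolve_plaintext(argv: List[str]) -> bool:
    args = build_parser().parse_args(argv)
    # Once the flags cannot co-occur, plaintext is simply "TLS not requested".
    return not args.tls
```

With the conflict enforced by the parser, `resolve_plaintext(["--tls", "--plaintext"])` exits with a usage error rather than silently returning an unencrypted configuration.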
### Client.Rust-034
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/rust/crates/mxgw-cli/src/main.rs:48-51,548` |
| Status | Resolved |
**Description:** `Command::Version` carries a `jsonl: bool` field that is never read; the dispatch arm matches `{ json, .. }` and discards `jsonl`. `mxgw version --jsonl` silently behaves as plain text.
**Recommendation:** Handle `jsonl` in the Version arm (treat like `--json`) or remove the unused field.
**Resolution:** 2026-06-16 — Removed the unused `jsonl` field from `Command::Version` (version output is a single record, not a stream); the dispatch arm now matches `{ json }` exhaustively, so `mxgw version --jsonl` errors as an unknown flag instead of silently being ignored. No test (CLI surface change verified by build).
### Client.Rust-035
| Field | Value |
|---|---|
| Severity | Low |
| Category | Security |
| Location | `clients/rust/crates/mxgw-cli/src/main.rs:489-495` |
| Status | Resolved |
**Description:** `--api-key-env` (default `MXGATEWAY_API_KEY`) names an env var read into an `ApiKey` Bearer token, but its clap help has no description of the expected value format. A user pointing it at another credential's env var would silently forward that credential to the gateway as a Bearer token. Low risk (redacted Debug; bounded to user's own shell) but an implicit-trust gap.
**Recommendation:** Add help text stating the variable must hold a value of the form `mxgw_<key-id>_<secret>`.
**Resolution:** 2026-06-16 — Added clap doc-comment help to `--api-key-env` stating the variable's value must be a full gateway key of the form `mxgw_<key-id>_<secret>` and is forwarded verbatim as the Bearer token. Doc/help-only change, no test.
### Client.Rust-036
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `clients/rust/RustClientDesign.md:351` |
| Status | Resolved |
**Description:** The new `galaxy browse` subcommand (with its filter/depth/json flags) is not listed in the "Test CLI" command table in RustClientDesign.md, which still reads `galaxy {test-connection,last-deploy-time,discover-hierarchy,watch}`.
**Recommendation:** Add `mxgw galaxy browse [...flags]` and note `--depth 0` = requested level only, `--depth N` eagerly expands, and `--parent-gobject-id` makes `--depth` a no-op.
**Resolution:** 2026-06-16 — Added the `mxgw galaxy browse` line (with all flags) to the CLI table and a paragraph documenting that `--depth 0` prints only the requested level, `--depth N` eagerly expands N further levels, and `--parent-gobject-id` makes `--depth` a no-op. Doc-only change, no test.
### Client.Rust-037
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `clients/rust/README.md:164-179` |
| Status | Resolved |
**Description:** The README "Browsing lazily" example calls `galaxy.browse_children(...).await?.into_inner()`, but the public API is `GalaxyClient::browse_children_raw` (the bare `browse_children` is the generated proto-client method, not public; and `browse_children_raw` returns the reply struct directly, no `.into_inner()`). The example would not compile.
**Recommendation:** Replace with `galaxy.browse_children_raw(BrowseChildrenRequest::default()).await?` (drop `.into_inner()`).
**Resolution:** 2026-06-16 — Verified `browse_children_raw` is the public method (galaxy.rs:302) and returns `BrowseChildrenReply` directly. Updated the README prose and example to call `browse_children_raw(...).await?` without `.into_inner()`. Doc-only change, no test.
### Client.Rust-038
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/rust/crates/mxgw-cli/src/main.rs:2336-2564` |
| Status | Resolved |
**Description:** Three CLI test gaps: (1) `ConnectionArgs::options()` `--tls`/`--plaintext` resolution (incl. the both-set path of Client.Rust-033) is untested; (2) `browse_children_one_level`'s repeated-page-token guard is untested; (3) `parse_rfc3339_timestamp` has no error-path tests (trailing chars, day=0, month 13, out-of-range day).
**Recommendation:** Add unit tests for all three (none need a network connection).
**Resolution:** 2026-06-16 — Added all three test groups. (1) `--tls`/`--plaintext` resolution: `connection_defaults_to_plaintext`, `connection_tls_flag_disables_plaintext`, `connection_plaintext_flag_selects_plaintext`, `connection_rejects_tls_and_plaintext_together`. (2) Extracted the page-token dedup guard into pure `register_page_token` and covered it with `register_page_token_accepts_distinct_tokens_and_rejects_repeats`. (3) RFC3339 error paths: `rfc3339_parser_rejects_trailing_characters`, `rfc3339_parser_rejects_day_zero`, `rfc3339_parser_rejects_month_thirteen`, `rfc3339_parser_rejects_day_out_of_range_for_month`.
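The value of extracting the page-token guard into a pure function is that it becomes testable without a network. A Python sketch of the idea follows; the Rust `register_page_token` signature is not shown in the finding, so the names, error type, and paging loop here are assumptions.

```python
from typing import Callable, List, Set, Tuple


class RepeatedPageTokenError(ValueError):
    """Raised when a server hands back a page token that was already seen."""


def register_page_token(seen: Set[str], token: str) -> None:
    # Pure guard: accept distinct tokens, reject repeats.
    if token in seen:
        raise RepeatedPageTokenError(f"page token repeated: {token!r}")
    seen.add(token)


def collect_pages(fetch_page: Callable[[str], Tuple[List[str], str]]) -> List[str]:
    """Walk pages until an empty next-token; fetch_page(token) -> (items, next_token)."""
    seen: Set[str] = set()
    items: List[str] = []
    token = ""
    while True:
        page_items, next_token = fetch_page(token)
        items.extend(page_items)
        if not next_token:
            return items
        register_page_token(seen, next_token)
        token = next_token
```

Without the guard, a server that keeps returning the same token would make the loop spin forever; with it, the walk fails on the first repeat.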
+64 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/ZB.MOM.WW.MxGateway.Contracts` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -379,6 +379,23 @@ Re-review: no new findings. Open finding count remains 0. All seventeen
recorded Contracts findings (Contracts-001..017) remain closed
(Resolved / Won't Fix).
#### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the proto delta (`git diff 410acc9..8df5ab3 -- .../Protos/`): the new `optional ReplayGap replay_gap = 14` on `MxEvent` plus the `ReplayGap` message for reconnect replay. Additive-only confirmed (field 14 is new; oneof body arms 20-25 and fields 1-13 unchanged); `Generated/MxaccessGateway.cs` is consistent (contains `ReplayGapFieldNumber = 14`).
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found |
| 2 | mxaccessgw conventions | No issues found (additive-only honoured) |
| 3 | Concurrency & thread safety | N/A — pure contract |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | Contracts-020 |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Contracts-022 |
| 10 | Documentation & comments | Contracts-021 |
### Contracts-018
| Field | Value |
@@ -408,3 +425,48 @@ recorded Contracts findings (Contracts-001..017) remain closed
**Recommendation:** (1) Add comments to `ActiveAlarmSnapshot.degraded` / `source_provider` mirroring the wording already on `OnAlarmTransitionEvent` (or a one-line cross-reference). (2) Extend the `AlarmProviderMode` enum comment to note that as a `source_provider` / `mode` provenance value the field is always `ALARMMGR` or `SUBTAG` on the wire and `UNSPECIFIED` should be treated as "unknown / not yet determined", so the zero value is unambiguous at every use site. Comment-only changes; no wire-format impact.
**Resolution:** _(2026-06-15)_ Confirmed both gaps in `mxaccess_gateway.proto`: `ActiveAlarmSnapshot.degraded`/`source_provider` (14/15) were bare while the byte-identical `OnAlarmTransitionEvent` fields were documented, and the `AlarmProviderMode` enum comment only explained `UNSPECIFIED` for the `forced_mode` use. (1) Added comments to `ActiveAlarmSnapshot.degraded`/`source_provider` mirroring the `OnAlarmTransitionEvent` wording (subtag-fallback / reduced-fidelity, always ALARMMGR or SUBTAG, never UNSPECIFIED). (2) Extended the `AlarmProviderMode` enum comment to distinguish its two use sites: as `forced_mode`, `UNSPECIFIED` = auto; as a provenance value (`OnAlarmTransitionEvent.source_provider`, `ActiveAlarmSnapshot.source_provider`, `OnAlarmProviderModeChangedEvent.mode`, `AlarmProviderStatus.mode`) the worker always emits ALARMMGR/SUBTAG and `UNSPECIFIED` should be read as "unknown / not yet determined". Comment-only changes; no wire-format impact. NOTE: on this dev box the `csharp` protoc generator DOES emit proto leading comments into `Generated/MxaccessGateway.cs` `<summary>` XML doc (contrary to the brief's assumption), so the build regenerated `Generated/MxaccessGateway.cs` with the new doc comments only — diff is `///`-comment lines exclusively, zero code/wire/type changes. `dotnet build -f net10.0` succeeds with 0 warnings / 0 errors.
### Contracts-020
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `gateway.md:1087,1101-1102` |
| Status | Resolved |
**Description:** gateway.md still lists "no reconnectable sessions" under "Resolved for v1" and lists "reconnectable sessions" / "multi-subscriber event fan-out" as post-v1 revisit items. The shipped `ReplayGap` reconnect-replay contract and multi-subscriber fan-out (documented in docs/Sessions.md) contradict this. docs/Sessions.md was updated; gateway.md's scope summary was left stale.
**Recommendation:** Update the gateway.md Resolved/Post-v1 lists to reflect that reconnectable sessions (via `after_worker_sequence` + `ReplayGap`) and multi-subscriber fan-out have shipped, cross-referencing docs/Sessions.md.
**Resolution:** _(2026-06-16)_ Updated `gateway.md` "Resolved for v1" list: replaced "no reconnectable sessions" / "one active event subscriber" with bullet points describing the shipped reconnect-replay (`after_worker_sequence` + `ReplayGap` sentinel, cross-referencing `docs/Sessions.md`) and multi-subscriber fan-out (single-subscriber fail-fast vs. multi-subscriber per-consumer disconnect, cross-referencing `docs/Sessions.md`). Removed "reconnectable sessions" and "multi-subscriber event fan-out" from the Post-v1 revisit list. Updated the backpressure bullet to mention both modes.
### Contracts-021
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto:731-733` |
| Status | Resolved |
**Description:** The `replay_gap` field comment ends with "(Reconnect/replay logic is Task 12; this is the contract surface only.)". That parenthetical is now stale — the reconnect/replay logic has shipped and is exercised by EventStreamServiceTests/SessionEventDistributorTests. A reader is misled into thinking only the contract exists.
**Recommendation:** Drop the "Task 12 / contract surface only" parenthetical; the rest of the comment is accurate.
**Resolution:** _(2026-06-16)_ Removed the stale "(Reconnect/replay logic is Task 12; this is the contract surface only.)" parenthetical from the `replay_gap` field comment in `mxaccess_gateway.proto`. The "Additive (proto3):" sentence before it is retained. Comment-only change; no wire-format or generated-type impact.
### Contracts-022
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs` |
| Status | Resolved |
**Description:** No round-trip / descriptor pin exists for the new `ReplayGap` message or `MxEvent.replay_gap` (field 14). The field is exercised functionally end-to-end, but there is no contract-level pin to catch a future renumber/type-narrowing of `replay_gap = 14` or the two `ReplayGap` sequence-field numbers — the same gap class as Contracts-007/010/018.
**Recommendation:** Add a round-trip test setting `MxEvent.ReplayGap` with both sequence fields, asserting `BodyCase == None`, plus a descriptor assertion pinning `ReplayGapFieldNumber == 14` and the `ReplayGap` field numbers (1, 2).
**Resolution:** _(2026-06-16)_ Added `ProtobufContractRoundTripTests.MxEvent_RoundTripsReplayGapSentinelAndPinsFieldNumbers` to `ProtobufContractRoundTripTests.cs`. The test pins `MxEvent.ReplayGapFieldNumber == 14` via the generated constant, pins `ReplayGap.RequestedAfterSequenceFieldNumber == 1` and `ReplayGap.OldestAvailableSequenceFieldNumber == 2` via `ReplayGap.Descriptor.Fields` (asserting both the number and the field name), builds a sentinel `MxEvent` with both sequence fields populated and no body oneof set, serializes and parses it, then asserts both sequence values survive and `BodyCase == None` (confirming `replay_gap` is orthogonal to the body oneof).
+79 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/ZB.MOM.WW.MxGateway.IntegrationTests` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -135,6 +135,23 @@ parameter (`d692232`).
| 9 | Testing coverage | Issues found: IntegrationTests-023 (`DashboardLdapLiveTests.AuthenticateAsync_AdminInGwAdminGroup_Succeeds` asserts the `ldap_group` claim but does not assert the emitted `Role: Admin` claim, leaving the role-mapping path untested). |
| 10 | Documentation & comments | No issues found. |
#### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the live-test delta: two new `[LiveMxAccessFact]` smoke tests (B8 new COM commands; buffered-item path) + `EmptyAlarmWatchListResolver`. Tests correctly gated and serialized; credential-redaction coverage present. Only Low docs/coverage items.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found |
| 2 | mxaccessgw conventions | No issues found |
| 3 | Concurrency & thread safety | No issues found |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | IntegrationTests-030 |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | IntegrationTests-032, IntegrationTests-033 |
| 10 | Documentation & comments | IntegrationTests-030, IntegrationTests-031 |
## Findings
### IntegrationTests-001
@@ -608,3 +625,63 @@ The prior `DashboardAuthenticator` ctor took `IOptions<GatewayOptions>`, so the
**Recommendation:** Reword the `docs/GatewayTesting.md` "Live LDAP" failure-branch sentences to describe observable behavior without referencing the now-internal "candidate bind" mechanics (e.g. "a wrong password is rejected without leaking the password", "an unknown username fails authentication"), and note that bind/search is delegated to the shared `ZB.MOM.WW.Auth.Ldap` provider so the prose stays accurate after the cutover.
**Resolution:** Resolved 2026-06-15: Reworded the "Live LDAP" failure-branch prose to describe observable behavior ("fails authentication without leaking the password", "an unknown username fails authentication") instead of the now-internal "candidate bind" / "no candidate" mechanics, and added a sentence noting `DashboardAuthenticator` delegates the bind/search to the shared `ZB.MOM.WW.Auth.Ldap` provider (`LdapAuthService`) and only maps groups to roles — matching the in-source test-comment cutover. Verified by inspection.
### IntegrationTests-030
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `docs/GatewayTesting.md:76`, `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:576,728` |
| Status | Resolved |
**Description:** `docs/GatewayTesting.md` says "All six tests are gated by MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1" and enumerates five parity paths. This diff adds two new `[LiveMxAccessFact]` tests (B8 new COM commands: AuthenticateUser/ArchestrAUserToId/Suspend/Activate; and the buffered-data path: AddBufferedItem/SetBufferedUpdateInterval), bringing the total to eight. The doc still says "six" and omits the two new parity surfaces.
**Recommendation:** Update GatewayTesting.md to "eight" and add bullets for the B8 new-COM-commands and buffered-data parity surfaces.
**Resolution:** 2026-06-16: Updated `docs/GatewayTesting.md` — changed "five parity paths" to "seven", "All six tests" to "All eight tests", and added bullets for the B8 new-COM-commands surface (AuthenticateUser/ArchestrAUserToId/Suspend/Activate against an added-but-not-advised item) and the buffered-data surface (AddBufferedItem/SetBufferedUpdateInterval/Advise round-trip with at least one OnBufferedDataChange event, residual noted for multi-sample conversion).
### IntegrationTests-031
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:672` |
| Status | Resolved |
**Description:** The inline comment at line 672 says "Suspend / Activate against the advised item", but no `Advise` call is made between `AddItem` (line 616) and `CreateSuspendRequest` (line 677) — the item is added but not advised. The comment mislabels the COM subscription state under test (the parity assertion only requires a real reply, not a successful one).
**Recommendation:** Change "against the advised item" to "against the added-but-not-advised item" (or remove "advised"), and note that Suspend/Activate is exercised without a prior Advise.
**Resolution:** 2026-06-16: Rewrote the comment to "Suspend / Activate against the added-but-not-advised item (no Advise was issued between AddItem and this call)," making the COM subscription state explicit and noting that parity requires only a real reply, not a successful one.
### IntegrationTests-032
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:823-865` |
| Status | Resolved |
**Description:** In the buffered-item test, when no sample-bearing `OnBufferedDataChange` batch arrives, the sample-predicate `TimeoutException` is caught and discarded (line 831) before asserting `bootstrapBufferedEvents > 0`. The final failure message ("No OnBufferedDataChange event arrived at all") conflates two failure modes (NoData bootstrap not delivered vs. delivered-but-no-sample), reducing residual diagnostic quality.
**Recommendation:** Before nulling the batch, log the caught timeout message (e.g. `output.WriteLine($"B8: sample-bearing batch predicate timed out: {ex.Message}")`) so the residual log distinguishes the two cases.
**Resolution:** 2026-06-16: Added `output.WriteLine($"B8: sample-bearing batch predicate timed out: {ex.Message}")` inside the `catch (TimeoutException ex)` block before nulling `bufferedBatch`, so the residual log clearly records the timeout detail and distinguishes "predicate timed out" from "no OnBufferedDataChange arrived at all".
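The log-before-discard pattern, transposed to Python for illustration, looks roughly like this. The actual change is in the C# test; `wait_for_sample_batch` and its parameters are hypothetical names.

```python
from typing import Callable, List, Optional


def wait_for_sample_batch(
    wait: Callable[[], List[str]],
    log: Callable[[str], None],
) -> Optional[List[str]]:
    """Return the sample-bearing batch, or None after logging why it timed out."""
    try:
        return wait()
    except TimeoutError as exc:
        # Record the timeout detail before discarding it, so a later
        # "no event arrived" failure can be told apart from this case.
        log(f"B8: sample-bearing batch predicate timed out: {exc}")
        return None
```

The caller still falls through to its own assertion on the bootstrap event count; the only difference is that the log now shows whether the sample predicate timed out first.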
### IntegrationTests-033
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:577-709` |
| Status | Deferred |
**Description:** The new-COM-commands live test covers AuthenticateUser/ArchestrAUserToId/Suspend/Activate but not `AddItem2`/`Write2` — the B8 extended commands with a second context parameter introduced in the same bundle. Only live COM tests can verify the COM call succeeds with the correct argument split; a parity regression short-circuiting AddItem2/Write2 to InvalidRequest would not be caught.
**Recommendation:** Add AddItem2/Write2 to the parity test (or a dedicated test) asserting each produces a real reply (not InvalidRequest) against a valid handle and item-definition split.
**Resolution:** 2026-06-16: Deferred. This requires a live MXAccess rig and provider state that are not available on this dev box; add the AddItem2/Write2 parity assertions when running on the MXAccess host.
+57 -11
@@ -10,17 +10,17 @@ Each module's `findings.md` is the source of truth; this file is generated from
| Module | Reviewer | Date | Commit | Status | Open | Total |
|---|---|---|---|---|---|---|
| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 25 |
| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 29 |
| [Client.Java](Client.Java/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 39 |
| [Client.Python](Client.Python/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 31 |
| [Client.Rust](Client.Rust/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 32 |
| [Contracts](Contracts/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 19 |
| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 29 |
| [Server](Server/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 53 |
| [Tests](Tests/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 35 |
| [Worker](Worker/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 28 |
| [Worker.Tests](Worker.Tests/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 33 |
| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 29 |
| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 34 |
| [Client.Java](Client.Java/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 48 |
| [Client.Python](Client.Python/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 36 |
| [Client.Rust](Client.Rust/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 38 |
| [Contracts](Contracts/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 22 |
| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 33 |
| [Server](Server/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 56 |
| [Tests](Tests/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 39 |
| [Worker](Worker/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 28 |
| [Worker.Tests](Worker.Tests/findings.md) | Claude Code | 2026-06-16 | `8df5ab3` | Re-reviewed | 0 | 36 |
## Pending findings
@@ -66,11 +66,13 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Dotnet-003 | Medium | Resolved | Concurrency & thread safety | `clients/dotnet/MxGateway.Client/MxGatewaySession.cs:659-663`, `clients/dotnet/MxGateway.Client/MxGatewayClient.cs:230-240` |
| Client.Dotnet-018 | Medium | Resolved | Documentation & comments | `clients/dotnet/README.md:137-138` |
| Client.Dotnet-022 | Medium | Resolved | mxaccessgw conventions | `clients/dotnet/Directory.Build.props:1-21` |
| Client.Dotnet-028 | Medium | Resolved | Security | `clients/dotnet/.../MxGatewayClientCli.cs:156` |
| Client.Go-002 | Medium | Resolved | Error handling & resilience | `clients/go/mxgateway/session.go:440-516` |
| Client.Go-003 | Medium | Resolved | Correctness & logic bugs | `clients/go/cmd/mxgw-go/main.go:517-532` |
| Client.Go-022 | Medium | Resolved | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:398-412,417-519` |
| Client.Go-023 | Medium | Resolved | Concurrency & thread safety | `clients/go/cmd/mxgw-go/main.go:604-606,616-632` |
| Client.Go-028 | Medium | Resolved | Correctness & logic bugs | `scripts/tag-go-module.ps1:42-46` |
| Client.Go-030 | Medium | Resolved | Concurrency & thread safety | `clients/go/cmd/mxgw-go/main.go:1491-1494` |
| Client.Java-001 | Medium | Resolved | Security | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewaySecrets.java:30-32` |
| Client.Java-002 | Medium | Resolved | Concurrency & thread safety | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxEventStream.java:31,66-92` |
| Client.Java-003 | Medium | Resolved | mxaccessgw conventions | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewayClient.java:119-140` |
@@ -84,6 +86,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Java-033 | Medium | Resolved | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1078-1098` |
| Client.Java-034 | Medium | Resolved | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:182-198` |
| Client.Java-037 | Medium | Resolved | Documentation & comments | `clients/java/README.md:138-149` |
| Client.Java-040 | Medium | Resolved | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1552-1561` |
| Client.Python-003 | Medium | Resolved | Error handling & resilience | `clients/python/src/mxgateway/client.py:125-137,155-173` |
| Client.Python-005 | Medium | Resolved | Performance & resource management | `clients/python/src/mxgateway/galaxy.py:117-140` |
| Client.Python-009 | Medium | Resolved | Testing coverage | `clients/python/tests/` |
@@ -92,6 +95,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Python-024 | Medium | Resolved | Code organization & conventions | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:13,48-119` |
| Client.Python-027 | Medium | Resolved | Security | `clients/python/src/zb_mom_ww_mxgateway/client.py:36-54`, `clients/python/src/zb_mom_ww_mxgateway/galaxy.py:47-66`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:165-172,918-930` |
| Client.Python-028 | Medium | Resolved | Error handling & resilience | `clients/python/src/zb_mom_ww_mxgateway/options.py:120-130`, `clients/python/src/zb_mom_ww_mxgateway/client.py:59`, `clients/python/src/zb_mom_ww_mxgateway/galaxy.py:71` |
| Client.Python-036 | Medium | Resolved | Documentation & comments | `clients/python/README.md:143-158` |
| Client.Rust-005 | Medium | Resolved | Correctness & logic bugs | `clients/rust/src/session.rs:489-520` |
| Client.Rust-006 | Medium | Resolved | Error handling & resilience | `clients/rust/src/session.rs:531-555` |
| Client.Rust-015 | Medium | Resolved | Error handling & resilience | `clients/rust/crates/mxgw-cli/src/main.rs:1053-1070` |
@@ -100,6 +104,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Rust-022 | Medium | Resolved | Correctness & logic bugs | `clients/rust/src/session.rs:369-391,403-420,427-444,452-469,476-493,631-696,706-724` |
| Client.Rust-024 | Medium | Resolved | Testing coverage | `clients/rust/tests/client_behavior.rs:405-415`; `clients/rust/src/session.rs:369-493`; `clients/rust/src/client.rs:265-291`; `clients/rust/crates/mxgw-cli/src/main.rs:1310-1505` |
| Client.Rust-031 | Medium | Resolved | Error handling & resilience | `clients/rust/src/options.rs:196-240` (`build_tls_config`); `clients/rust/Cargo.toml:40` (tonic features); docs: `clients/rust/src/options.rs:76-101`, `clients/rust/README.md` (TLS trust section), `clients/rust/crates/mxgw-cli/src/main.rs:429-431`, `clients/rust/RustClientDesign.md:202` |
| Client.Rust-033 | Medium | Resolved | Correctness & logic bugs | `clients/rust/crates/mxgw-cli/src/main.rs:485` |
| Contracts-002 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:384-385`, `:95` |
| Contracts-009 | Medium | Resolved | Design-document adherence | `docs/Contracts.md:13-24` |
| IntegrationTests-003 | Medium | Resolved | Correctness & logic bugs | `src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:89-97` |
@@ -124,6 +129,8 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Server-038 | Medium | Resolved | Security | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/EventsHub.cs:23-44` |
| Server-044 | Medium | Resolved | Correctness & logic bugs | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:216-254` |
| Server-051 | Medium | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Server/Alarms/AlarmWatchListResolver.cs:64-78` |
| Server-054 | Medium | Resolved | Design-document adherence | `docs/DesignDecisions.md` (Session Reconnect / Event Subscribers / Later Revisit Items §470-471), `CLAUDE.md` (Repository-Specific Conventions) |
| Server-056 | Medium | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionEventDistributor.cs:296-310,449-453,629-635` |
| Tests-003 | Medium | Resolved | Performance & resource management | `src/MxGateway.Tests/Security/Authentication/SqliteAuthStoreTests.cs:170-176`, `src/MxGateway.Tests/Security/Authentication/ApiKeyAdminCliRunnerTests.cs:252-258` |
| Tests-004 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs` |
| Tests-005 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs:239-261`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs` |
@@ -172,6 +179,9 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Dotnet-023 | Low | Resolved | Code organization & conventions | `clients/dotnet/Directory.Build.props:17`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/IMxGatewayCliClient.cs:6`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Tests/*.cs` |
| Client.Dotnet-024 | Low | Resolved | Code organization & conventions | `clients/dotnet/Directory.Build.props:12`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client/ZB.MOM.WW.MxGateway.Client.csproj:19-24` |
| Client.Dotnet-025 | Low | Resolved | Concurrency & thread safety | `clients/dotnet/ZB.MOM.WW.MxGateway.Client/LazyBrowseNode.cs:38,41,54,82,94` |
| Client.Dotnet-026 | Low | Resolved | Correctness & logic bugs | `clients/dotnet/.../MxGatewayClientCli.cs:306` (isLongRunning) |
| Client.Dotnet-027 | Low | Won't Fix | Performance & resource management | `clients/dotnet/ZB.MOM.WW.MxGateway.Client/LazyBrowseNode.cs:15` |
| Client.Dotnet-029 | Low | Resolved | Code organization & conventions | `clients/dotnet/.../IMxGatewayCliClient.cs:6` |
| Client.Go-004 | Low | Resolved | mxaccessgw conventions | `clients/go/mxgateway/alarms_test.go:153-154`, `clients/go/mxgateway/galaxy_test.go:58-59` |
| Client.Go-005 | Low | Resolved | Design-document adherence | `clients/go/mxgateway/client.go:64,68`, `clients/go/mxgateway/galaxy.go:83,87` |
| Client.Go-006 | Low | Resolved | Error handling & resilience | `clients/go/mxgateway/errors.go:9-130` |
@@ -195,6 +205,10 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Go-026 | Low | Resolved | Error handling & resilience | `clients/go/cmd/mxgw-go/main.go:1196-1222` |
| Client.Go-027 | Low | Resolved | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:1195-1206` |
| Client.Go-029 | Low | Resolved | Documentation & comments | `clients/go/README.md:300-303` |
| Client.Go-031 | Low | Resolved | Correctness & logic bugs | `clients/go/cmd/mxgw-go/main.go:1037-1046` |
| Client.Go-032 | Low | Resolved | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:839-841` |
| Client.Go-033 | Low | Resolved | Testing coverage | `clients/go/cmd/mxgw-go/main_test.go` |
| Client.Go-034 | Low | Resolved | Documentation & comments | `clients/go/README.md:245-263` |
| Client.Java-006 | Low | Resolved | Performance & resource management | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewayClient.java:323-328`, `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/GalaxyRepositoryClient.java:279-284` |
| Client.Java-007 | Low | Resolved | Testing coverage | `clients/java/mxgateway-client/src/test/java/com/dohertylan/mxgateway/client/` |
| Client.Java-008 | Low | Resolved | Error handling & resilience | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewayClient.java:298-304` |
@@ -218,6 +232,14 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Java-035 | Low | Resolved | Testing coverage | `clients/java/zb-mom-ww-mxgateway-client/src/test/java/com/zb/mom/ww/mxgateway/client/MxGatewayClientSessionTests.java` |
| Client.Java-036 | Low | Resolved | Code organization & conventions | `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayAlarmFeedSubscription.java`, `MxGatewayEventSubscription.java`, `MxGatewayActiveAlarmsSubscription.java`, `DeployEventSubscription.java` |
| Client.Java-038 | Low | Resolved | Code organization & conventions | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1347-1393` |
| Client.Java-041 | Low | Resolved | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:2187-2194` |
| Client.Java-042 | Low | Resolved | Error handling & resilience | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1565-1567` |
| Client.Java-043 | Low | Resolved | Code organization & conventions | `clients/java/zb-mom-ww-mxgateway-cli/src/test/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCliTests.java:241-264` |
| Client.Java-044 | Low | Resolved | Code organization & conventions | `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayClientVersion.java:12` |
| Client.Java-045 | Low | Resolved | Testing coverage | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/InProcessGatewayHarness.java` |
| Client.Java-046 | Low | Resolved | Testing coverage | `clients/java/zb-mom-ww-mxgateway-cli/src/test/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCliTests.java:680-696` |
| Client.Java-047 | Low | Resolved | Documentation & comments | `clients/java/README.md` |
| Client.Java-048 | Low | Resolved | Documentation & comments | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:88-105` |
| Client.Python-001 | Low | Resolved | Documentation & comments | `clients/python/pyproject.toml:8,25`, `clients/python/src/mxgateway_cli/commands.py:25` |
| Client.Python-002 | Low | Resolved | Code organization & conventions | `clients/python/src/mxgateway/__init__.py:27` |
| Client.Python-004 | Low | Resolved | Correctness & logic bugs | `clients/python/src/mxgateway_cli/commands.py:386,402-404` |
@@ -239,6 +261,10 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Python-029 | Low | Resolved | Correctness & logic bugs | `clients/python/src/zb_mom_ww_mxgateway/options.py:78-90` |
| Client.Python-030 | Low | Resolved | Code organization & conventions | `clients/python/pyproject.toml:17` |
| Client.Python-031 | Low | Resolved | Testing coverage | `clients/python/tests/test_tls.py:34`, `clients/python/pyproject.toml:53-56` |
| Client.Python-032 | Low | Resolved | Correctness & logic bugs | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:1048,1065-1066` |
| Client.Python-033 | Low | Resolved | Correctness & logic bugs | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:772,1490-1494` |
| Client.Python-034 | Low | Resolved | Correctness & logic bugs | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:1497-1501` |
| Client.Python-035 | Low | Resolved | Code organization & conventions | `clients/python/src/zb_mom_ww_mxgateway/__init__.py`, `.../options.py:63-77`, `.../galaxy.py:293` |
| Client.Rust-004 | Low | Resolved | Documentation & comments | `clients/rust/src/version.rs:7` |
| Client.Rust-007 | Low | Resolved | Design-document adherence | `clients/rust/RustClientDesign.md:14-55` |
| Client.Rust-008 | Low | Resolved | Performance & resource management | `clients/rust/src/value.rs:161-261` |
@@ -256,6 +282,11 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Rust-027 | Low | Resolved | Documentation & comments | `clients/rust/.cargo/config.toml:1-9` |
| Client.Rust-028 | Low | Resolved | mxaccessgw conventions | `clients/rust/crates/mxgw-cli/src/main.rs:1126-1166` |
| Client.Rust-032 | Low | Resolved | Design-document adherence | `clients/rust/RustClientDesign.md`; surface in `clients/rust/src/galaxy.rs:281-379` |
| Client.Rust-034 | Low | Resolved | Correctness & logic bugs | `clients/rust/crates/mxgw-cli/src/main.rs:48-51,548` |
| Client.Rust-035 | Low | Resolved | Security | `clients/rust/crates/mxgw-cli/src/main.rs:489-495` |
| Client.Rust-036 | Low | Resolved | Design-document adherence | `clients/rust/RustClientDesign.md:351` |
| Client.Rust-037 | Low | Resolved | Design-document adherence | `clients/rust/README.md:164-179` |
| Client.Rust-038 | Low | Resolved | Testing coverage | `clients/rust/crates/mxgw-cli/src/main.rs:2336-2564` |
| Contracts-001 | Low | Resolved | Design-document adherence | `docs/Grpc.md:13` (and `:3`, `:32`, `:39`) |
| Contracts-003 | Low | Won't Fix | Code organization & conventions | `src/MxGateway.Contracts/MxGateway.Contracts.csproj:10` |
| Contracts-004 | Low | Resolved | Documentation & comments | `src/MxGateway.Contracts/GatewayContractInfo.cs:3-6` |
@@ -273,6 +304,9 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Contracts-017 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto:23-29` (the `rpc QueryActiveAlarms` block) |
| Contracts-018 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs:396` (`ActiveAlarmSnapshot_RoundTripsAllFields`) |
| Contracts-019 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto:850-851` (`ActiveAlarmSnapshot`), `:318-324` (`AlarmProviderMode`) |
| Contracts-020 | Low | Resolved | Design-document adherence | `gateway.md:1087,1101-1102` |
| Contracts-021 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto:731-733` |
| Contracts-022 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs` |
| IntegrationTests-007 | Low | Resolved | Concurrency & thread safety | `src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:20`, `src/MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs:5`, `src/MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:9` |
| IntegrationTests-008 | Low | Resolved | Code organization & conventions | `src/MxGateway.IntegrationTests/LiveLdapFactAttribute.cs`, `src/MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs`, `src/MxGateway.IntegrationTests/LiveMxAccessFactAttribute.cs` |
| IntegrationTests-009 | Low | Resolved | Documentation & comments | `src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:372-375` |
@@ -291,6 +325,10 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| IntegrationTests-027 | Low | Resolved | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj`, `src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:4-5,134` |
| IntegrationTests-028 | Low | Resolved | Design-document adherence | `src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:120-161`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:35` |
| IntegrationTests-029 | Low | Resolved | Documentation & comments | `docs/GatewayTesting.md:218-224` |
| IntegrationTests-030 | Low | Resolved | Documentation & comments | `docs/GatewayTesting.md:76`, `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:576,728` |
| IntegrationTests-031 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:672` |
| IntegrationTests-032 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:823-865` |
| IntegrationTests-033 | Low | Deferred | Testing coverage | `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:577-709` |
| Server-007 | Low | Resolved | Performance & resource management | `src/MxGateway.Server/Galaxy/GalaxyHierarchyProjector.cs:55-70` |
| Server-008 | Low | Resolved | Performance & resource management | `src/MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs:111-134,160-189` |
| Server-009 | Low | Resolved | Error handling & resilience | `src/MxGateway.Server/Security/Authentication/AuthSqliteConnectionFactory.cs:15-32` |
@@ -327,6 +365,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Server-050 | Low | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:42-75,92-125` |
| Server-052 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Server/Alarms/IAlarmWatchListResolver.cs:24-30`, `src/ZB.MOM.WW.MxGateway.Server/Alarms/AlarmWatchListResolver.cs:101-114`, `docs/GatewayConfiguration.md:247` |
| Server-053 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Alarms/AlarmWatchListResolverTests.cs`, `src/ZB.MOM.WW.MxGateway.Tests/Alarms/GatewayAlarmMonitorProviderModeTests.cs` |
| Server-055 | Low | Resolved | Correctness & logic bugs | `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:842-851,1841-1871` |
| Tests-007 | Low | Resolved | Code organization & conventions | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:682`, `src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:324`, `src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:460`, `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs:233` |
| Tests-008 | Low | Resolved | mxaccessgw conventions | `src/MxGateway.Tests/Gateway/Sessions/WorkerAlarmRpcDispatcherTests.cs:1-9`, `src/MxGateway.Tests/Gateway/Sessions/NotWiredAlarmRpcDispatcherTests.cs:1-3`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerAlarmAutoSubscribeTests.cs:1` |
| Tests-009 | Low | Resolved | Documentation & comments | `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:36-37,99,365` |
@@ -350,6 +389,10 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Tests-033 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAlarmProviderStatus.cs`, `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardBrowseAndAlarmModelTests.cs:140-195` |
| Tests-034 | Low | Resolved | mxaccessgw conventions | `src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorSeamTests.cs:1-15` |
| Tests-035 | Low | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Tests/Alarms/AlarmFailoverEndToEndTests.cs:315-329` |
| Tests-036 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Configuration/GatewayOptionsValidatorTests.cs` |
| Tests-037 | Low | Won't Fix | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs` |
| Tests-038 | Low | Resolved | Performance & resource management | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionEventDistributorTests.cs:702-713` |
| Tests-039 | Low | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/GatewaySessionDashboardMirrorTests.cs` (`DashboardMirror_AndGrpcSubscriber_BothReceiveEvents`) |
| Worker-009 | Low | Resolved | Performance & resource management | `src/MxGateway.Worker/Ipc/WorkerFrameReader.cs:31,49`, `src/MxGateway.Worker/Ipc/WorkerFrameWriter.cs:57-58` |
| Worker-010 | Low | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Conversion/VariantConverter.cs:204-226` |
| Worker-011 | Low | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeClient.cs:169-171` |
@@ -387,3 +430,6 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Worker.Tests-030 | Low | Resolved | Documentation & comments | `src/MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs:862-890` |
| Worker.Tests-032 | Low | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/FailoverAlarmConsumerTests.cs` |
| Worker.Tests-033 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/SubtagAlarmStateMachineTests.cs` |
| Worker.Tests-034 | Low | Resolved | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/MxAccessCommandExecutorTests.cs:2233`, `src/ZB.MOM.WW.MxGateway.Worker.Tests/TestSupport/NoopMxAccessServer.cs:97` |
| Worker.Tests-035 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/MxAccessCommandExecutorTests.cs`, `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs:99-136` |
| Worker.Tests-036 | Low | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs:983-996` |
+64 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/ZB.MOM.WW.MxGateway.Server` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -69,6 +69,23 @@ findings (Server-001 through Server-032) are unchanged by this pass.
| 9 | Testing coverage | Issues found: Server-037 (no test for the corrupt-snapshot restore path or for `PersistSnapshot = false` at the cache level). |
| 10 | Documentation & comments | No issues found — XML docs match behavior; the `GalaxyRepository.md` "On-disk snapshot" section documents the Stale-on-restore lifecycle. |
### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the session-resilience epic + §8 delta (`git diff 410acc9..8df5ab3`): `SessionEventDistributor` multi-subscriber fan-out, replay-on-reconnect, detach-grace retention, bounded worker-ready wait, dashboard auto-login.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Server-055 |
| 2 | mxaccessgw conventions | No issues found |
| 3 | Concurrency & thread safety | No issues found (replay/handoff atomicity, reconnect-vs-sweep, single-clock ready-wait all verified sound) |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found (DisableLogin auto-login is intentional/config-gated/documented) |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | Server-054 |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | No issues found |
| 10 | Documentation & comments | No issues found |
### 2026-05-24 re-review (commit 42b0037)
Re-review pass at `42b0037` scoped to the dashboard destructive-action wave on
@@ -1022,3 +1039,48 @@ Additionally, `GatewayAlarmMonitor.ApplyProviderModeChangeAsync` increments the
**Recommendation:** Add resolver tests for (a) cancellation propagation and (b) an include that is also excluded; and a `GatewayAlarmMonitorProviderMode` test pinning the provider-switch counter behaviour for a same-mode repeat event (whichever semantics the team intends). These lock down the contracts the Server-051/052 findings expose.
**Resolution:** Resolved 2026-06-15. Added all three missing tests: (a) `AlarmWatchListResolverTests.ResolveAsync_RepositoryCancelled_PropagatesOperationCanceled` (cancellation propagation, also covers Server-051); (b) `AlarmWatchListResolverTests.ResolveAsync_ExcludeAlsoSuppressesMatchingExplicitInclude` (exclude-vs-include precedence, also Server-052 item 2); and (c) `GatewayAlarmMonitorProviderModeTests.ProviderModeChange_RepeatedSameMode_RecordsASwitchForEachEvent`, which pins the existing semantics — each worker-reported `OnAlarmProviderModeChanged` event records a `provider_switches` increment (and resets `_providerSince`) even when `toMode` equals the current mode, since the worker is the authority on when a mode change occurred and the gateway does not synthesize or suppress it.
### Server-054
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Design-document adherence |
| Location | `docs/DesignDecisions.md` (Session Reconnect / Event Subscribers / Later Revisit Items §470-471), `CLAUDE.md` (Repository-Specific Conventions) |
| Status | Resolved |
**Description:** The session-resilience epic shipped multi-subscriber fan-out (`SessionEventDistributor`), reconnectable sessions with replay (`AttachEventSubscriberWithReplay`/`ReplayGap`), and detach-grace retention — but `docs/DesignDecisions.md` still states "no reconnectable sessions for v1" and "one active StreamEvents subscriber per session for v1", and still files both as post-v1 "Later Revisit Items". `CLAUDE.md` likewise still says these are "explicitly out of scope". This is the stale-prose-vs-shipped-behavior drift the "update docs in the same change as the source" rule prohibits.
**Recommendation:** Update both `DesignDecisions.md` sections and the revisit list to describe the shipped behavior (gated by `AllowMultipleEventSubscribers`, `DetachGraceSeconds`, replay options), and amend the CLAUDE.md convention bullet.
**Resolution:** 2026-06-16: updated `docs/DesignDecisions.md` (Session Reconnect section rewritten to describe the shipped detach-grace + replay-on-reconnect behavior with config references; Event Subscribers section rewritten to describe the config-gated multi-subscriber fan-out, mode-dependent `FailFast` semantics, and internal vs external subscriber distinction; Later Revisit Items list removes the two shipped items and records them as shipped with config cross-references) and the `CLAUDE.md` conventions bullet to describe the shipped config-gated multi-subscriber + reconnect-replay behavior while preserving the one-worker-per-session invariant.
### Server-055
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:842-851,1841-1871` |
| Status | Resolved |
**Description:** When `AttachEventSubscriber`/`AttachEventSubscriberWithReplay` fails inside `StartDistributorAndRegister`, the catch calls `DetachEventSubscriber()`, which decrements the active count back to 0 and — because the session is still `Ready` and detach-grace is enabled — stamps `_detachedAtUtc = now`. A freshly-`Ready` session that never had a successful subscriber thus enters the detach-grace window on a failed first attach, making it sweep-eligible after `DetachGraceSeconds` even though no client ever streamed. Impact is minor (the lease still protects it; a later successful attach clears the stamp) but the "last subscriber dropped" semantics are violated.
**Recommendation:** Only stamp `_detachedAtUtc` on a detach that mirrors a prior successful attach — roll the failure path back without entering grace, or guard the stamp on "a subscriber had previously been registered."
**Resolution:** 2026-06-16: `GatewaySession` now tracks `_everHadEventSubscriber` (a `bool` field, set to `true` inside `MarkEventSubscriberAttached()` which is called only after `StartDistributorAndRegister` succeeds). `DetachEventSubscriber` gates the `_detachedAtUtc` stamp on `_everHadEventSubscriber`, so the catch-path rollback decrements the reserved slot but does not enter detach-grace. A regression `[Fact]` (`DetachGrace_FailedFirstAttach_DoesNotEnterGrace`) in `GatewaySessionTests.cs` verifies that after a failed first attach the session has `DetachedAtUtc == null`, `ActiveEventSubscriberCount == 0`, and `IsDetachGraceExpired` returns `false` regardless of clock advance.
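The guard described in this resolution can be modeled roughly as follows. This is a Python sketch of the state transitions only, not the C# `GatewaySession` code; the class `SessionModel` and its method names are hypothetical, and locking is omitted for brevity.

```python
from datetime import datetime, timezone
from typing import Callable, Optional


class SessionModel:
    """Hypothetical model of the subscriber count and detach-grace stamp."""

    def __init__(self) -> None:
        self.active_subscribers = 0
        self.ever_had_subscriber = False
        self.detached_at: Optional[datetime] = None

    def mark_attached(self) -> None:
        # Reached only after registration succeeded.
        self.ever_had_subscriber = True
        self.detached_at = None  # a successful attach clears the stamp

    def detach(self, now: datetime) -> None:
        self.active_subscribers -= 1
        # Enter grace only if a subscriber was ever successfully attached.
        if self.active_subscribers == 0 and self.ever_had_subscriber:
            self.detached_at = now

    def attach(self, register: Callable[[], None], now: datetime) -> None:
        self.active_subscribers += 1  # reserve the slot
        try:
            register()
        except Exception:
            self.detach(now)  # roll back the reservation without entering grace
            raise
        self.mark_attached()


def failing_register() -> None:
    raise RuntimeError("registration failed")


now = datetime(2026, 6, 16, tzinfo=timezone.utc)
session = SessionModel()
try:
    session.attach(failing_register, now)
except RuntimeError:
    pass
```

After the failed first attach in this model, the count is back at zero and no grace stamp is set; a later successful attach followed by a detach does set it.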
### Server-056
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionEventDistributor.cs:296-310,449-453,629-635` |
| Status | Resolved |
**Description:** `SessionEventDistributor` orphaned any subscriber that registered in the window AFTER the pump ran its final `CompleteAllSubscribers` sweep (the event source completed or faulted and the pump exited) but BEFORE `DisposeAsync`. `RegisterSubscriber`/`RegisterWithReplay` guarded only against `_disposed`, not against the pump having already completed, so a subscriber added in that window got a channel the now-exited pump would never complete — its reader (`ReadAllAsync`) waited forever. In production this is the edge case of a client calling `StreamEvents` after the worker's event stream has ended but before the session is torn down. Discovered while diagnosing an order-dependent hang in `GatewaySessionDashboardMirrorTests`, where a gRPC subscriber attached after a fast-completing worker stream had already drained (its `await foreach` has no timeout, so the orphaned channel surfaced as an infinite hang rather than a clean failure).
**Recommendation:** Record terminal completion (a `_completed` flag plus the terminal error) under `_lifecycleLock` and have both register paths complete a late registrant's channel immediately with the same terminal state.
**Resolution:** 2026-06-17: added `_completed` + `_completionError`, set inside `CompleteAllSubscribers` under `_lifecycleLock` — the same lock the register paths take, so completion and registration serialize (a subscriber added before the sweep is completed by the loop; one racing in after sees `_completed` and self-completes). `Register` and `RegisterWithReplay` now `TryComplete` a late registrant's channel with `_completionError` when `_completed`; a late resume still receives its retained replay batch, then a cleanly-completed empty live channel. No lock-ordering risk — `CompleteAllSubscribers` takes only `_lifecycleLock`, and the subscriber channels use `AllowSynchronousContinuations=false` so `TryComplete` under the lock runs no continuation inline. New regression `[Fact]` `Register_AfterSourceCompletes_CompletesLateSubscriberInsteadOfHanging` (`SessionEventDistributorTests.cs`) registers a subscriber after the pump completes and asserts its channel completes (bounded read); verified it fails without the fix (5 s timeout) and passes with it (12 ms). The racy `GatewaySessionDashboardMirrorTests.DashboardMirror_AndGrpcSubscriber_BothReceiveEvents` that exposed it was also made deterministic — see Tests-039.
+79 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/ZB.MOM.WW.MxGateway.Tests` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -111,6 +111,23 @@ fakes in two test files.
| 9 | Testing coverage | Issues found: Tests-026 (no test proves `EventStreamService` actually calls `IDashboardEventBroadcaster.Publish` for each event — the only consumers in tests are `Null` fakes). |
| 10 | Documentation & comments | No issues found in this diff. |
#### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the gateway-test delta (session-resilience epic + §8). New tests are high quality (bounded async waits, FakeTimeProvider, deterministic gating, meaningful assertions). Verified the §8 FakeWorkerProcess consolidation did NOT drop the `entireProcessTree` kill assertion. The only findings are Low severity: two coverage gaps (Tests-036, Tests-037) and one latent helper footgun (Tests-038).
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found |
| 2 | mxaccessgw conventions | No issues found |
| 3 | Concurrency & thread safety | No issues found |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found |
| 6 | Performance & resource management | Tests-038 |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Tests-036, Tests-037 |
| 10 | Documentation & comments | No issues found |
## Findings
### Tests-001
@@ -647,3 +664,63 @@ The cancellation tests for `WorkerClient` in `WorkerClientTests` *do* exercise t
**Recommendation:** Bound the second-subscriber drain with the same `WaitTimeout` used elsewhere — e.g. link `newStreamCts` to a `CancellationTokenSource.CreateLinkedTokenSource` plus `CancelAfter(WaitTimeout)`, or wrap the drain in a `Task` awaited via `WaitAsync(WaitTimeout)` — so a missing `SnapshotComplete` surfaces as a deterministic failure rather than a hang.
**Resolution:** 2026-06-15 — Confirmed the unbounded `await foreach` in `DegradedTransition_CachedThenReplayed_CarriesDegradedAndSourceProviderToNewSubscriber`. Bounded the second-subscriber drain with a `CancellationTokenSource.CreateLinkedTokenSource(newStreamCts.Token, drainTimeoutCts.Token)` where `drainTimeoutCts.CancelAfter(WaitTimeout)`, and wrapped the loop in a `try/catch (OperationCanceledException) when (drainTimeoutCts.IsCancellationRequested)` that rethrows a `TimeoutException`. A regression that never emits `SnapshotComplete` now fails cleanly instead of hanging. Test still passes.
### Tests-036
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Configuration/GatewayOptionsValidatorTests.cs` |
| Status | Resolved |
**Description:** Three new validator rules — `DetachGraceSeconds >= 0` (GatewayOptionsValidator.cs:185-186), `ReplayBufferCapacity >= 0` (:215-216), `ReplayRetentionSeconds >= 0` (:219-220) — have no tests, while the sibling new options (`MaxEventSubscribersPerSession`, `WorkerReadyWaitTimeoutMs`) do. A regression dropping/inverting any of the three guards would pass with no failing test.
**Recommendation:** Add boundary theories mirroring the `MaxEventSubscribersPerSession` pattern: a failing case (`-1`) asserting the message contains each config path, and a succeeding boundary case (`0`).
**Resolution:** 2026-06-16 — Added six tests to `GatewayOptionsValidatorTests.cs` covering all three guards: `Validate_Fails_WhenDetachGraceSecondsIsNegative` / `Validate_Succeeds_WhenDetachGraceSecondsIsZero` (via `CloneWithSessions`); `Validate_Fails_WhenReplayBufferCapacityIsNegative` / `Validate_Succeeds_WhenReplayBufferCapacityIsZero` and `Validate_Fails_WhenReplayRetentionSecondsIsNegative` / `Validate_Succeeds_WhenReplayRetentionSecondsIsZero` (via a new `CloneWithEvents` helper). Each failing case asserts the failure message contains the config path; each boundary case asserts `Succeeded`. Mirrors the `MaxEventSubscribersPerSession` / `WorkerReadyWaitTimeoutMs` pattern.
### Tests-037
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs` |
| Status | Won't Fix |
**Description:** The reconnect/replay contract surface (`ReplayGap` message, `MxEvent.replay_gap = 14`, `StreamEventsRequest.after_worker_sequence`) has no protobuf serialize/parse round-trip test pinning the wire shape and the documented sentinel invariant (family UNSPECIFIED, body oneof and per-item fields unset). Behavior is exercised in EventStreamServiceTests; this is a wire-contract gap.
**Recommendation:** Add a round-trip test building an `MxEvent` with `ReplayGap` populated, asserting the two sequence fields survive and the sentinel invariants hold (field 14, `Family == Unspecified`, `BodyCase` unset).
**Resolution:** 2026-06-16: covered by the ReplayGap round-trip + descriptor-pin test added under Contracts-022 in ProtobufContractRoundTripTests.cs; a duplicate here would be redundant.
### Tests-038
| Field | Value |
|---|---|
| Severity | Low |
| Category | Performance & resource management |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionEventDistributorTests.cs:702-713` |
| Status | Resolved |
**Description:** `DrainUntilFaultAsync` relies on the channel completing WITH a fault so `WaitToReadAsync` re-throws. Correct for current callers, but if reused on a channel that completes gracefully, `WaitToReadAsync` returns false without throwing and the helper spins in a tight CPU loop with no escape (ReadTimeout bounds only the individual wait). A maintenance hazard, not a current bug.
**Recommendation:** When `WaitToReadAsync` returns false, await `reader.Completion` (surfaces the fault or completes cleanly) and `Assert.Fail` on graceful completion, so the helper fails fast instead of spinning.
**Resolution:** 2026-06-16 — When `WaitToReadAsync` returns `false` (graceful completion), the helper now awaits `reader.Completion` (propagating any stored fault) and then calls `Assert.Fail` so the helper fails fast rather than spinning; the fault-path behavior (re-throw from `WaitToReadAsync`) is preserved unchanged.
### Tests-039
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/GatewaySessionDashboardMirrorTests.cs` (`DashboardMirror_AndGrpcSubscriber_BothReceiveEvents`) |
| Status | Resolved |
**Description:** `DashboardMirror_AndGrpcSubscriber_BothReceiveEvents` attached its gRPC subscriber via `StreamEventsAsync` AFTER `MarkReady()` had already started the pump draining a fast-completing 3-event fake worker — a register-vs-pump race. It passed alone but hung the whole `GatewaySessionDashboardMirrorTests` class when another test ran first (warm JIT let the pump drain and complete before the gRPC subscriber registered). Its `await foreach` over the gRPC stream uses `CancellationToken.None` with no timeout, so the race surfaced as an indefinite hang rather than a clean failure (unlike the sibling tests' `WaitUntilAsync`, which self-times-out at 5s). This exposed the production race fixed under Server-056.
**Recommendation:** Make the test deterministic — hold the worker stream until both the dashboard mirror and the gRPC subscriber have attached, then release, so neither subscriber can miss an event regardless of scheduling.
**Resolution:** 2026-06-17 — added a release-gate to the test's `FakeWorkerClient` (`HoldEventsUntilReleased()` / `ReleaseEvents()`; `ReadEventsAsync` awaits the gate before yielding, ungated by default so other tests are unaffected). The test now holds the stream, starts the gRPC reader on a background task, waits for `session.ActiveEventSubscriberCount == 1` (the internal dashboard mirror is excluded from the count, so this confirms the gRPC subscriber attached), then releases — both subscribers deterministically receive all three events. With the Server-056 production fix in place, the full `GatewaySessionDashboardMirrorTests` class now passes (5/5) instead of hanging.
+64 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/ZB.MOM.WW.MxGateway.Worker.Tests` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -119,6 +119,23 @@ findings (Worker.Tests-001 through -030) are unaffected.
| 9 | Testing coverage | No issues found in this diff. |
| 10 | Documentation & comments | No issues found in this diff. |
#### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the worker-test delta covering the new COM seam (`MxAccessCommandExecutorTests`, `MxAccessComServerTests`) and alarm work. Tests genuinely exercise STA dispatch and parity; only Low organization/coverage/flakiness items.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found |
| 2 | mxaccessgw conventions | No issues found |
| 3 | Concurrency & thread safety | Worker.Tests-036 |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found (password-no-leak test present) |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | Worker.Tests-034 |
| 9 | Testing coverage | Worker.Tests-035 |
| 10 | Documentation & comments | No issues found |
## Findings
### Worker.Tests-001
@@ -615,3 +632,48 @@ findings (Worker.Tests-001 through -030) are unaffected.
**Recommendation:** Add (a) `AckedTrueWhileInactive_EmitsNothingAndDoesNotLatch` — apply `.acked=true` with no prior active raise, assert `Apply` returns empty, then raise active and clear and assert the clear emits `UnackRtn` (proving the stale ack did not latch); and (b) `PriorityChange_FlowsIntoEmittedRecord` — apply a priority value then an active raise and assert the emitted record's `Priority` equals the supplied value (and a `CoerceInt` string/garbage case falls back).
**Resolution:** 2026-06-15 — Added both tests to `SubtagAlarmStateMachineTests`. `AckedTrueWhileInactive_EmitsNothingAndDoesNotLatch` applies `.acked=true` with no preceding active raise (asserts `Apply` returns empty), then drives a fresh raise→clear episode and asserts the clear emits `UnackRtn` — proving the stale inactive ack did not latch `AckedDuringEpisode`. `PriorityChange_FlowsIntoEmittedRecord` (the target now includes a `PrioritySubtag`) applies an `int` priority `750` (asserts the priority change emits nothing), raises active and asserts the emitted record's `Priority == 750` (exercising `CoerceInt`'s `int` path and the priority assignment), then applies a non-numeric `"not-a-number"` priority and asserts the snapshot `Priority` is still `750` (the `CoerceInt` string fallback keeps the prior value, not zero).
### Worker.Tests-034
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/MxAccessCommandExecutorTests.cs:2233`, `src/ZB.MOM.WW.MxGateway.Worker.Tests/TestSupport/NoopMxAccessServer.cs:97` |
| Status | Resolved |
**Description:** `FakeMxStatus` is defined twice — file-scope in `TestSupport/NoopMxAccessServer.cs:97` and nested in `MxAccessCommandExecutorTests.FakeMxAccessComObject:2233` — both exposing the same four public fields that `MxStatusProxyConverter` reflects over. The two copies must stay structurally identical; a future field change to the real COM struct requires updating two places, and the duplication is invisible to a reader consulting only one file.
**Recommendation:** Extract `FakeMxStatus` into its own `TestSupport/FakeMxStatus.cs` (or colocate both doubles) and have `MxAccessCommandExecutorTests` use the shared type instead of its nested copy.
**Resolution:** 2026-06-16 — Removed the nested `FakeMxStatus` class from `MxAccessCommandExecutorTests.FakeMxAccessComObject`; the two `new FakeMxStatus { ... }` usages in `Suspend`/`Activate` now resolve to the shared `TestSupport.FakeMxStatus` via the pre-existing `using ZB.MOM.WW.MxGateway.Worker.Tests.TestSupport;` import. Updated the XML doc on `TestSupport/NoopMxAccessServer.cs:FakeMxStatus` to note the consolidation. Fix applied 2026-06-16, verified on windev 2026-06-17 (dotnet test -p:Platform=x86: 344 passed, 0 failed).
### Worker.Tests-035
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/MxAccessCommandExecutorTests.cs`, `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs:99-136` |
| Status | Resolved |
**Description:** `MxAccessCommandExecutor.Execute` has a `_` discard arm returning `CreateInvalidRequestReply(... "Unsupported MXAccess command kind ...")` — the safety net for an unknown `MxCommandKind` (e.g. a future gateway enum value before the worker is updated). No test passes an unknown kind and asserts `InvalidRequest`. A regression changing the arm to `throw` would propagate an unhandled exception through `WorkerPipeSession` and no test would catch it.
**Recommendation:** Add a `[Fact]` constructing a `StaCommand` with an undefined `MxCommandKind` value and asserting the reply is `ProtocolStatusCode.InvalidRequest` with "Unsupported" in the diagnostic.
**Resolution:** 2026-06-16 — Added `DispatchAsync_WithUnknownCommandKind_ReturnsInvalidRequestWithUnsupportedDiagnostic` to `MxAccessCommandExecutorTests`. Casts `int.MaxValue` to `MxCommandKind` (an undefined value not present in the proto-generated enum), dispatches it through `MxAccessStaSession.DispatchAsync`, asserts `ProtocolStatusCode.InvalidRequest`, and asserts `reply.DiagnosticMessage` contains "Unsupported" (case-insensitive — matching `CreateInvalidRequestReply`'s `"Unsupported MXAccess command kind ..."` message). Fix applied 2026-06-16, verified on windev 2026-06-17 (dotnet test -p:Platform=x86: 344 passed, 0 failed).
### Worker.Tests-036
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/ZB.MOM.WW.MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs:983-996` |
| Status | Resolved |
**Description:** `RunAsync_SendsFirstHeartbeatImmediatelyOnEnteringLoop` carries a redundant wall-clock assertion `Assert.True(elapsed < TimeSpan.FromSeconds(5), ...)`. The existing `heartbeatWait` CTS (cancel-after 5s) already enforces the same bound — the extra wall-clock check can only fire if the heartbeat arrived but took >5s to be received, which the CTS already prevents. It is the same coarse wall-clock pattern prior findings (Worker.Tests-003/004/013/020) corrected.
**Recommendation:** Remove the `start`/`elapsed`/`Assert.True(elapsed < ...)` check; the CTS timeout already pins the timing contract.
**Resolution:** 2026-06-16 — Removed the `DateTimeOffset start`, `TimeSpan elapsed`, and `Assert.True(elapsed < TimeSpan.FromSeconds(5), ...)` wall-clock assertions from `RunAsync_SendsFirstHeartbeatImmediatelyOnEnteringLoop`. The `heartbeatWait` CTS (cancel-after 5s) already enforces the same timing bound. Added an inline comment explaining why the wall-clock floor is omitted, consistent with the Worker.Tests-003/004/013/020 pattern. Fix applied 2026-06-16, verified on windev 2026-06-17 (dotnet test -p:Platform=x86: 344 passed, 0 failed).
+19 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/ZB.MOM.WW.MxGateway.Worker` |
| Reviewer | Claude Code |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Review date | 2026-06-16 |
| Commit reviewed | `8df5ab3` |
| Status | Re-reviewed |
| Open findings | 0 |
@@ -87,6 +87,23 @@ contention with the gateway-side watchdog (Server-031) is unchanged.
| 9 | Testing coverage | No issues found in this diff. |
| 10 | Documentation & comments | No issues found in this diff. |
#### 2026-06-16 re-review (commit 8df5ab3)
Re-review of the worker delta (`git diff 410acc9..8df5ab3`): the `IMxAccessServer`/`MxAccessComServer`/`MxAccessSession`/`MxAccessCommandExecutor` seam-extraction refactor plus alarm failover/subtag work. **No new findings.** Prior findings Worker-026/027/028 confirmed resolved at this commit. Every MXAccess COM call in the new seam is reachable only via `StaCommandDispatcher` → `staRuntime.InvokeAsync` (STA affinity preserved); MXAccess parity preserved (no synthesized events, HRESULTs surfaced); the single COM RCW is released exactly once; net48 idioms respected.
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found |
| 2 | mxaccessgw conventions | No issues found |
| 3 | Concurrency & thread safety | No issues found (STA affinity preserved across the new seam) |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found (no secret/WriteSecured-payload logging) |
| 6 | Performance & resource management | No issues found (single FinalReleaseComObject) |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | No issues found |
| 10 | Documentation & comments | No issues found |
## Findings
### Worker-001
+60 -21
@@ -62,37 +62,67 @@ Implementation guidance:
## Session Reconnect
Decision: no reconnectable sessions for v1.
Reconnectable sessions with event replay are shipped and config-gated. The
original "no reconnectable sessions" constraint is superseded.
One `OpenSession` creates one gateway session and one worker process. The
session ends on `CloseSession`, client disconnect policy, lease expiry, worker
fault, or gateway shutdown.
fault, gateway shutdown, or — when `DetachGraceSeconds > 0` — detach-grace
expiry after the last external event subscriber drops.
Rationale: reconnectable sessions require event replay, orphan ownership,
security checks, and more complicated worker lifetime rules. They are not needed
for the first parity slice.
`MxGateway:Sessions:DetachGraceSeconds` (default `30`) controls the retention
window. When positive, a session whose last external gRPC event-stream
subscriber drops stays `Ready` for that many seconds so a client can reconnect
to the same session instead of triggering a new `OpenSession` → worker spawn.
Setting it to `0` reverts to closing only on normal lease expiry.
A reconnecting client issues `StreamEvents` with `after_worker_sequence` set to
the last sequence it observed; the gateway replays retained events newer than
that watermark (capped by `MxGateway:Events:ReplayBufferCapacity` and
`MxGateway:Events:ReplayRetentionSeconds`) then transitions seamlessly to live
delivery. If the requested position precedes the oldest retained event, a
`ReplayGap` sentinel signals the client to re-snapshot. The replay→live handoff
is atomic (no gap, no duplicate). See [Sessions](./Sessions.md) for the full
reconnect and replay protocol.
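The watermark rule above can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `ReplayBufferSketch` is a hypothetical class, only the capacity cap is modeled (not `ReplayRetentionSeconds`), a gap is reported as a plain `(requested, oldest_retained)` tuple rather than the real `ReplayGap` message, and the sketch returns no events alongside a gap because this section does not say whether the retained tail is still replayed. The atomic replay-to-live handoff is out of scope.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Event = Tuple[int, str]  # (worker_sequence, payload)


class ReplayBufferSketch:
    def __init__(self, capacity: int) -> None:
        # A bounded deque evicts the oldest event once capacity is reached.
        self._events: Deque[Event] = deque(maxlen=capacity)

    def append(self, sequence: int, payload: str) -> None:
        self._events.append((sequence, payload))

    def replay_after(
        self, after_sequence: int
    ) -> Tuple[Optional[Tuple[int, int]], List[Event]]:
        """Return (gap, events) for a client resuming after `after_sequence`."""
        if not self._events:
            return None, []
        oldest = self._events[0][0]
        if after_sequence + 1 < oldest:
            # At least one event between the watermark and the oldest retained
            # event was evicted, so the client must re-snapshot.
            return (after_sequence, oldest), []
        return None, [e for e in self._events if e[0] > after_sequence]
```

With a capacity of 3 and sequences 1 through 5 appended, the buffer retains 3, 4, and 5: resuming after 3 yields events 4 and 5, while resuming after 0 yields a gap.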
## Event Subscribers
Decision: one active `StreamEvents` subscriber per session for v1.
Multi-subscriber fan-out for data-side `StreamEvents` is shipped and
config-gated. The original "one active subscriber per session" constraint is
superseded for deployments that opt in.
A second subscriber should be rejected with a clear session error. Multi-client
fan-out may be added later with explicit backpressure semantics.
`MxGateway:Sessions:AllowMultipleEventSubscribers` (default `false`) controls
the mode. When `false` the session still rejects a second `StreamEvents`
subscriber with `EventSubscriberAlreadyActive`, preserving the original
single-subscriber behavior. When `true`, up to
`MxGateway:Sessions:MaxEventSubscribersPerSession` (default `8`) concurrent
external subscribers may attach; a new attach that would exceed the cap is
rejected with `EventSubscriberLimitReached`. The count-check-and-increment is
atomic under the session lock.
Rationale: one subscriber preserves simple event ordering and failure behavior
while parity is being proven.
Failure semantics differ by mode: in single-subscriber mode a slow consumer's
channel overflow faults the whole session (`FailFast` backpressure); in
multi-subscriber mode the same condition disconnects only that subscriber so one
slow consumer never faults a session shared by others. The mode is fixed at
session construction and is not changed by a live subscriber-count snapshot.
### Alarms — superseded for the alarm subsystem
The gateway-owned internal dashboard mirror subscribes directly on the
distributor with `isInternal: true` and is not counted toward the cap or the
detach-grace subscriber-count in either mode.
The single-subscriber rule above no longer applies to alarms. The gateway runs
an always-on central alarm monitor (`GatewayAlarmMonitor`) that owns one
See [Sessions](./Sessions.md) for the full event-distributor and backpressure
design.
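The attach rules in this section can be sketched as a single guarded counter. This is a hedged Python illustration, not the gateway's implementation: `SubscriberGateSketch` is hypothetical, and the two exception classes merely borrow the names of the session errors quoted above.

```python
import threading


class EventSubscriberAlreadyActive(Exception):
    """Stand-in for the single-subscriber rejection."""


class EventSubscriberLimitReached(Exception):
    """Stand-in for the multi-subscriber cap rejection."""


class SubscriberGateSketch:
    def __init__(self, allow_multiple: bool = False, max_subscribers: int = 8) -> None:
        self._lock = threading.Lock()
        self._allow_multiple = allow_multiple  # fixed at construction
        self._max = max_subscribers
        self.active_external = 0

    def attach(self, is_internal: bool = False) -> None:
        if is_internal:
            # The internal dashboard mirror is never counted toward the cap.
            return
        with self._lock:
            # Check and increment under one lock so two racing attaches
            # cannot both pass the limit check.
            limit = self._max if self._allow_multiple else 1
            if self.active_external >= limit:
                if self._allow_multiple:
                    raise EventSubscriberLimitReached()
                raise EventSubscriberAlreadyActive()
            self.active_external += 1
```

Keeping the mode as a constructor argument mirrors the statement that it is fixed at session construction rather than derived from the live subscriber count.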
### Alarms — separate fan-out architecture
The single-subscriber rule never applied to alarms. The gateway runs an
always-on central alarm monitor (`GatewayAlarmMonitor`) that owns one
gateway-managed worker session, caches the active-alarm set, and fans it out to
any number of clients through the session-less `StreamAlarms` RPC. Per-session
alarm auto-subscribe is removed; `AcknowledgeAlarm` is session-less and routes
through the monitor. Data-side `StreamEvents` remains one subscriber per
session. Rationale: alarm state is gateway-wide, not session-scoped — every
client wants the same current set plus updates, and forcing each to own a
worker would multiply AVEVA polling load for no benefit.
any number of clients through the session-less `StreamAlarms` RPC.
`AcknowledgeAlarm` is session-less and routes through the monitor. Rationale:
alarm state is gateway-wide, not session-scoped — every client wants the same
current set plus updates, and forcing each to own a worker would multiply AVEVA
polling load for no benefit.
## Authentication
@@ -467,12 +497,21 @@ against the live MXAccess attribute set.
These are explicit post-v1 revisit items, not open blockers:
- reconnectable sessions,
- multiple event subscribers per session,
- restricted worker service account,
- production coalescing by item handle,
- command batching for high-volume tag setup.
The following items were previously listed here and have since shipped:
- **Reconnectable sessions with replay** — shipped, config-gated via
`MxGateway:Sessions:DetachGraceSeconds` and
`MxGateway:Events:ReplayBufferCapacity` / `ReplayRetentionSeconds`.
See [Session Reconnect](#session-reconnect) above and [Sessions](./Sessions.md).
- **Multiple event subscribers per session** — shipped, config-gated via
`MxGateway:Sessions:AllowMultipleEventSubscribers` and
`MxGateway:Sessions:MaxEventSubscribersPerSession`.
See [Event Subscribers](#event-subscribers) above and [Sessions](./Sessions.md).
## Related Documentation
- [Gateway Process Detailed Design](./GatewayProcessDesign.md)
+12 -4
@@ -51,7 +51,7 @@ shutdown request even when a command or event assertion fails. Cleanup failures
in that `finally` block are logged rather than thrown, so a real assertion
failure is never masked by a shutdown timeout.
`WorkerLiveMxAccessSmokeTests` additionally covers five MXAccess parity paths the
`WorkerLiveMxAccessSmokeTests` additionally covers seven MXAccess parity paths the
fake-worker tests cannot validate:
- a `Write` round-trip against an advised item, asserting both that the reply is
@@ -67,13 +67,21 @@ fake-worker tests cannot validate:
- a `WriteSecured` round-trip after `AuthenticateUser`, asserting the reply
carries `MxCommandKind.WriteSecured` and the credential password never
appears in the diagnostic message (parity for both the secured-write
ordering rule and the "do not log secrets" contract), and
ordering rule and the "do not log secrets" contract),
- an abnormal worker exit (the worker process is killed mid-session) where the
gateway must transition the session to `SessionState.Faulted` with a
non-empty fault description carrying a known worker-client classification
(pipe disconnected / worker faulted / end-of-stream / heartbeat expired).
(pipe disconnected / worker faulted / end-of-stream / heartbeat expired),
- the B8 new COM commands — `AuthenticateUser`, `ArchestrAUserToId`, `Suspend`,
and `Activate` — each asserting a real MXAccess reply (not `InvalidRequest`)
is returned against an added-but-not-advised item, and
- the buffered-data path — `AddBufferedItem` and `SetBufferedUpdateInterval`
asserting the commands round-trip and that the worker delivers at least one
`OnBufferedDataChange` event (the empty NoData bootstrap) without crashing
or dropping frames; live §3.2 multi-sample conversion is noted as a residual
when the rig does not drive sample-bearing buffered batches on demand.
All six tests are gated by the same `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1`
All eight tests are gated by the same `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1`
opt-in variable.
Build the worker before running the smoke:
+14
@@ -75,6 +75,20 @@ private static MxValue CreateNullValue(
}
```
### Sparse array expansion (write path, gateway only)
`MxSparseArray` — the `sparse_array_value` arm on `MxValue` — is a write-only
shorthand. The worker never produces or receives it; the gateway expands it into
a full `MxArray` before the command reaches the named pipe. Expansion allocates
a complete array of `total_length` slots, initializes every slot to the element
type's default (bool → `false`; numeric → `0`; string → `""`; time/timestamp →
Unix epoch), then writes each `MxSparseElement` at its declared index. The
resulting `array_value` is an ordinary `MxArray` that passes through the
conversion layer unchanged. The worker therefore still performs a single
whole-array COM write, preserving MXAccess parity. Unmentioned indices are
**reset** to their type default, not preserved from prior state — there is no
read-modify-write merge.
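As a rough illustration of the default-fill semantics, here is a Python sketch. It is not the gateway's C# expansion code: `expand_sparse` is a hypothetical helper, the string type keys are simplified stand-ins for `MxDataType` members, and both the out-of-range rejection and last-write-wins handling of duplicate indices are assumptions.

```python
from datetime import datetime, timezone
from typing import Any, Iterable, List, Tuple

# Simplified type keys; the real contract uses MxDataType enum members.
TYPE_DEFAULTS = {
    "bool": False,
    "int": 0,
    "float": 0.0,
    "string": "",
    "time": datetime(1970, 1, 1, tzinfo=timezone.utc),  # Unix epoch
}


def expand_sparse(
    element_type: str, total_length: int, elements: Iterable[Tuple[int, Any]]
) -> List[Any]:
    """Build the full array a whole-array write would carry."""
    default = TYPE_DEFAULTS[element_type]
    # Every slot starts at the type default; nothing is read from prior state.
    full = [default] * total_length
    for index, value in elements:
        if not 0 <= index < total_length:
            raise ValueError(f"index {index} outside 0..{total_length - 1}")
        full[index] = value
    return full
```

For example, `expand_sparse("int", 5, [(1, 7), (3, 9)])` produces `[0, 7, 0, 9, 0]`: indices 0, 2, and 4 are reset to the default rather than kept at whatever the attribute held before.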
### Array projection
`ConvertArray` records the rank and per-dimension lengths so multi-dimensional `SAFEARRAY` shapes survive the round trip. The element type is resolved from the caller-supplied hint or the CLR element type via `ResolveArrayElementDataType`, then dispatched to the matching typed builder (`ConvertBoolArray`, `ConvertInt64Array`, `ConvertTimestampArray`, and so on).
@@ -0,0 +1,193 @@
# Array Write Ergonomics & Default-Fill Partial Writes — Design
Date: 2026-06-18
## Problem
Writing array-typed MXAccess attributes through the gateway has two ergonomic
shortfalls:
1. **Asymmetric addressing.** An array attribute reads fine by its bare name
(`Obj.Arr`), but writes require the `[]` body suffix (`Obj.Arr[]`). The
handle registered from the bare name is read-capable but not cleanly
write-capable.
2. **Whole-array writes only.** Every write replaces the entire array; to change
a few elements the client must marshal and send the full array. This is a
native MXAccess COM constraint (there is no element-wise write API), but it
pushes avoidable cost onto clients for large arrays.
This design removes both frictions without breaking MXAccess parity. The worker
is not modified — it continues to perform an honest whole-array COM write. All
new behavior lives in the gateway and the contract.
## Why MXAccess forces whole-array writes
The native MXAccess COM `Write` takes a complete VARIANT (`SAFEARRAY` for
arrays). There is no `WriteArrayElement(index, value)`. Confirmed in the worker:
`VariantConverter.ConvertToComArray` marshals the entire CLR array in one shot,
and `MxAccessSession.Write` forwards it verbatim to the COM proxy. Any "partial
write" feature must therefore reconstruct a full array before the COM call.
We deliberately do **not** reconstruct it from current state (no
read-modify-write merge): that would add latency, cache-staleness, and a race
window against other writers, and would paper over MXAccess semantics. Instead
partial writes are **stateless default-fill** (see below).
## Goals
- Writing an array attribute by its bare name works like reading does — the
gateway appends `[]` automatically when it knows the attribute is an array.
- A client can send only the indices it wants plus a total length, instead of
the full array.
## Non-goals
- **No preserve-unchanged merge.** Unmentioned indices are written as the
element type's default, **not** kept at their current value.
- No element-wise COM write — MXAccess has no such API; every write is
whole-array and we keep it that way.
- No change to `ReadBulk` string addressing.
- The gateway does **not** infer total length; the client supplies it.
## Decisions (resolved during brainstorming)
| Question | Decision |
|---|---|
| Scope | Both: suffix ergonomics **and** partial writes |
| Partial-write semantics | Stateless **default-fill**: unmentioned indices = type default (reset, not preserved) |
| Total length | **Client specifies** `total_length` explicitly |
| Time/timestamp default | **Unix epoch** |
| Suffix fix location/actor | **Gateway**, using in-memory Galaxy `is_array` metadata, at **AddItem** time |
| Suffix fallback when metadata unavailable | **Pass through unchanged** (no regression) |
| Partial-write contract shape | New `MxSparseArray` as a `oneof` arm on `MxValue` |
| Per-client helpers | **Included** in this change |
## Contract changes (`mxaccess_gateway.proto`)
A write-only sparse representation, added as a `oneof kind` arm on `MxValue` so
every write command (`Write`, `Write2`, `WriteSecured`, `WriteSecured2`,
`WriteBulkEntry`) accepts it without new RPCs:
```proto
message MxSparseArray {
  MxDataType element_data_type = 1;
  uint32 total_length = 2;
  repeated MxSparseElement elements = 3;
}

message MxSparseElement {
  uint32 index = 1;
  MxValue value = 2; // scalar
}
// added to MxValue oneof kind:
// MxSparseArray sparse_array_value = 19;
```
`sparse_array_value` is **write-only**: the worker never produces it, and the
gateway rejects it on any read/event path. Regenerate `Generated/` and commit
the generated `.cs` (the net48 worker build needs the checked-in types — see the
proto-codegen-regen rule).
## Suffix normalization — at `AddItem`, in the gateway
The item handle binds to the literal address string at `AddItem` and is reused
for both reads and writes; at write time only the integer handle is available,
which is too late to change the address. So normalization happens at
registration.
In the gateway's `AddItemCommand` / `AddItem2Command` handling
(`GatewaySession`), before forwarding to the worker:
1. If `item_definition` already ends with `[]` → leave unchanged.
2. Else look up `item_definition + "[]"` in the in-memory Galaxy hierarchy cache
(`IGalaxyHierarchyCache` → `GalaxyTagLookup.Attribute.IsArray`). The index is
keyed by `FullTagReference`, which already carries the `[]` suffix for
arrays, so the lookup key must include `[]`. If found and `is_array` →
rewrite `item_definition` to the `[]` form.
3. **Fallback:** metadata unavailable or address not found as an array →
forward verbatim (current behavior).
Store the **normalized** address in `SessionItemRegistration.TagAddress` so
write-time constraint checks (`ConstraintEnforcer`) and readback resolve
consistently against the `[]`-keyed index.
This is safe for reads: both the bare and `[]` forms return the array on read,
so promoting a registration to the `[]` form does not change read behavior — it
only makes the handle write-capable.
`AddItem2Command` (with `item_context`) normalizes `item_definition` the same
way. `ReadBulk` is unaffected — it uses raw address strings with its own
ephemeral registration, so bare-name reads continue to work unchanged.
## Partial-write expansion — at the gateway, worker untouched
In the gateway write path, before forwarding any write command to the worker, if
`MxValue.KindCase == SparseArrayValue`:
1. Allocate a full array of `total_length`, element type `element_data_type`.
2. Initialize every slot to the type default:
- `bool` → `false`
- `int32` / `int64` → `0`
- `float` / `double` → `0`
- `string` → `""`
- `time` / `timestamp` → Unix epoch
3. For each `MxSparseElement`, set `array[index]` from the scalar `value`.
4. Replace the `MxValue` with a normal `array_value` (full `MxArray`).
The worker then receives an ordinary whole-array `MxValue`;
`VariantConverter.ConvertToComArray` and the COM `Write` are unchanged. Parity
preserved — it really is a whole-array write.
Expansion is applied uniformly to every write variant by normalizing the
`MxValue` of each command (`Write`, `Write2`, `WriteSecured`, `WriteSecured2`,
and each `WriteBulkEntry`) before it leaves the gateway.
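The four expansion steps can be sketched without the generated protobuf types. This is a minimal illustration under stated assumptions, not the gateway implementation: `SparseFill.Expand` and its tuple-based element list are hypothetical stand-ins for `MxSparseArray` / `MxSparseElement`, and validation is assumed to have run already.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the gateway's sparse -> full expansion.
// The real code would operate on the generated MxSparseArray / MxArray types.
public static class SparseFill
{
    // Allocates totalLength slots, fills each with defaultValue, then
    // overwrites only the slots named by the sparse elements.
    public static T[] Expand<T>(
        uint totalLength,
        T defaultValue,
        IEnumerable<(uint Index, T Value)> elements)
    {
        T[] full = new T[totalLength];
        Array.Fill(full, defaultValue);

        foreach ((uint index, T value) in elements)
        {
            // Assumes index < totalLength; see the validation rules.
            full[index] = value;
        }

        return full;
    }
}
```

For example, `SparseFill.Expand(4, 0, new[] { (1u, 7) })` yields `[0, 7, 0, 0]`. Whatever the attribute previously held at indices 0, 2, and 3 is reset to the default, not preserved.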
## Validation & errors (gateway, `InvalidArgument`)
- `total_length == 0`, or any `index >= total_length` → reject.
- Duplicate indices → reject (no silent last-wins).
- `element_data_type` must be a supported scalar element type (not `Raw` /
`Unspecified`); each element `value` must match it.
- Empty `elements` with `total_length = N` → valid: writes an all-defaults array
of length N (explicit reset).
- A sparse value arriving on a read/event path → reject (guard; the worker never
produces one).
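The shape rules above (non-zero length, in-range indices, no duplicates) can be sketched as a single pass over the indices. This is an illustrative sketch only: `SparseShapeValidator` is a hypothetical name, `ArgumentException` stands in for the gRPC `InvalidArgument` status so the example has no gRPC dependency, and the element-type checks are not modeled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical validator mirroring the shape rules listed above.
public static class SparseShapeValidator
{
    public static void Validate(uint totalLength, IReadOnlyList<uint> indices)
    {
        if (totalLength == 0)
        {
            throw new ArgumentException("total_length must be greater than zero.");
        }

        HashSet<uint> seen = new();
        foreach (uint index in indices)
        {
            if (index >= totalLength)
            {
                throw new ArgumentException(
                    $"index {index} is out of range for total_length {totalLength}.");
            }

            // HashSet.Add returns false when the value is already present,
            // so duplicates are rejected instead of silently last-wins.
            if (!seen.Add(index))
            {
                throw new ArgumentException($"duplicate index {index}.");
            }
        }

        // An empty index list is valid: it requests an all-defaults array.
    }
}
```

Rejecting duplicates up front keeps the expansion order-independent: the result never depends on which of two conflicting elements happened to be applied last.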
## Clients (all five) & docs — same change
Per the repo rule that docs change with the source:
- Regenerate proto types for dotnet, go, python, rust, java. Watch the Java
generated-file churn — revert spurious protobuf-version diffs when no `.proto`
semantics changed beyond the new messages; commit the net48-relevant regen.
- Add a thin per-client helper to build a sparse write, e.g.
`WriteArrayElements(handle, totalLength, {index → value})`.
- Update the **"Array writes replace the whole array"** section in all five
client READMEs: document default-fill semantics (unmentioned = reset to
default, not preserved), the `total_length` requirement, and that bare-name
array writes now auto-normalize to the `[]` form.
- Update `gateway.md` (command/value surface) and the value-conversion doc.
## Testing
- **Gateway (FakeWorkerHarness):**
- Sparse → full expansion per element type; default-fill sizing; correct
placement of specified indices.
- `total_length == 0`, index-out-of-range, and duplicate-index rejection.
- Empty-elements all-defaults case.
- Suffix normalization: bare array → `[]`; bare scalar → unchanged;
already-`[]` → unchanged; metadata-cold → pass-through.
- **Clients:** helper + round-trip serialization per language.
- **Live MXAccess (opt-in, windev):** one default-fill write and one bare-name
array write against real COM.
## Affected components
- Contracts: `mxaccess_gateway.proto` + regenerated `Generated/`.
- Gateway: `GatewaySession` (AddItem normalization, write expansion),
`SessionItemRegistration` (store normalized address), interaction with
`IGalaxyHierarchyCache` / `GalaxyTagLookup`.
- Worker: **unchanged.**
- Clients: dotnet, go, python, rust, java (regenerated types + helper + README).
- Docs: `gateway.md`, value-conversion doc, five client READMEs.
@@ -0,0 +1,329 @@
# Array Write Ergonomics & Default-Fill Partial Writes — Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.
**Goal:** Let clients write array attributes by their bare name (gateway auto-appends `[]` at AddItem), and write a sparse, default-filled array (only the indices they care about + a total length) instead of marshalling the whole array.
**Architecture:** All new behavior lives in the **contract** and the **gateway**; the worker is untouched and keeps doing an honest whole-array COM write. The gateway intercepts outbound commands at the single choke point `GatewaySession.InvokeAsync(WorkerCommand)`: it (a) normalizes `AddItem`/`AddItem2` `item_definition` to the `[]` form when Galaxy metadata says the attribute is an array, and (b) expands an `MxSparseArray` write value into a full default-filled `MxArray` before it leaves the gateway. Partial writes are **stateless default-fill** — unmentioned indices are the type default (reset), never preserved.
**Tech Stack:** .NET 10 (gateway) / .NET Framework 4.8 x86 (worker, unchanged), protobuf + Grpc.Tools, xUnit + FakeWorkerHarness; clients in C#, Go, Python, Rust, Java.
**Design doc:** `docs/plans/2026-06-18-array-write-ergonomics-design.md`
**Key references for the implementer:**
- Choke point for all outbound commands: `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:947-955` (`InvokeAsync(WorkerCommand command, ...)`). `command.Command` is the `MxCommand`.
- Handle→address tracking: `GatewaySession.TrackCommandReply` (lines 975-1014, AddItem at 989, AddItem2 at 992) → `TrackItem` (1826-1837) → `SessionItemRegistration` record (`Sessions/SessionItemRegistration.cs`). Tracking reads the **same** `MxCommand` instance that passed through `InvokeAsync`, so mutating `item_definition` there flows through automatically.
- Galaxy metadata lookup: `IGalaxyHierarchyCache.Current.Index.TagsByAddress.TryGetValue(addr, out GalaxyTagLookup)`, then `lookup.Attribute?.IsArray`. The index is keyed by `FullTagReference`, which **already contains** `[]` for arrays — look up `addr + "[]"`. See `Security/Authorization/ConstraintEnforcer.cs:15-17,197-204` for the injection + lookup pattern.
- Proto: `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto` → `MxValue` (1026-1044), `MxArray` (1046-1063), `WriteCommand` (244-249), `Write2Command` (251-257), `WriteSecuredCommand`/`WriteSecured2Command`, `WriteBulk*`, `AddItemCommand` (192-195), `AddItem2Command` (197-201). Generated into `Contracts/Generated/MxaccessGateway.cs` (Compile-Removed + regenerated by Grpc.Tools).
- Gateway proto regen + commit rule (memory `project_proto_codegen_regen`): after a `.proto` edit, delete `Generated/*.cs`, rebuild contracts to regenerate, and **commit** `Generated/` or the net48 worker build fails CS0246.
- Java client gotcha (memory `project_java_generated_churn`): gradle regenerates a tracked 64k-line file with spurious protobuf-version churn — revert that churn; build/test Java on **windev** (memory `project_java_build_host`), Mac has no JRE.
---
## Task 0: Contract — add `MxSparseArray` and regenerate
**Classification:** high-risk
**Estimated implement time:** ~4 min
**Parallelizable with:** none (blocks all other tasks)
**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto` (MxValue oneof ~line 1043; new messages after MxArray ~line 1063)
- Regenerate + commit: `src/ZB.MOM.WW.MxGateway.Contracts/Generated/MxaccessGateway.cs`
**Step 1: Add the messages to the proto.** After the `MxArray` message (line 1063), add:
```proto
// Write-only sparse array value. The gateway expands this into a full,
// default-filled MxArray before forwarding to the worker; the worker never
// receives or produces it. Unmentioned indices take the element type's
// default (reset, NOT preserved).
message MxSparseArray {
  MxDataType element_data_type = 1;
  uint32 total_length = 2;
  repeated MxSparseElement elements = 3;
}

message MxSparseElement {
  uint32 index = 1;
  MxValue value = 2; // scalar
}
```
**Step 2: Add the oneof arm to `MxValue`.** Inside the `oneof kind { ... }` block, after `bytes raw_value = 18;`:
```proto
MxSparseArray sparse_array_value = 19;
```
**Step 3: Regenerate generated code.**
Run (PowerShell on windev, or locally on Mac, where .NET builds fine; in a POSIX shell use `rm` in place of `del`):
```
del src/ZB.MOM.WW.MxGateway.Contracts/Generated/*.cs
dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj
```
Expected: build succeeds; `Generated/MxaccessGateway.cs` now contains `MxSparseArray`, `MxSparseElement`, and `MxValue.SparseArrayValue`.
**Step 4: Verify net10 + net48 both compile** (the worker consumes these types via net48):
```
dotnet build src/ZB.MOM.WW.MxGateway.slnx
```
Expected: PASS (no CS0246 on the new types).
**Step 5: Commit** (include regenerated `Generated/`):
```bash
git add src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto \
src/ZB.MOM.WW.MxGateway.Contracts/Generated/MxaccessGateway.cs
git commit -m "feat(contracts): add MxSparseArray write-only value for default-fill partial writes"
```
---
## Task 1: Gateway — `SparseArrayExpander` (pure expansion + validation)
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 2, Tasks 4-9
**Files:**
- Create: `src/ZB.MOM.WW.MxGateway.Server/Sessions/SparseArrayExpander.cs`
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SparseArrayExpanderTests.cs`
**Step 1: Write failing tests.** Cover: default-fill sizing + placement (two element types are enough here; the rest come in step 4); `total_length == 0` → `InvalidArgument`; index `>= total_length` → `InvalidArgument`; duplicate index → `InvalidArgument`; `Raw`/`Unspecified` element type → `InvalidArgument`; empty `elements` → all-defaults array of length N; timestamp default == Unix epoch.
```csharp
using Grpc.Core;
using Mxaccess.Gateway.V1; // adjust to the generated namespace
using ZB.MOM.WW.MxGateway.Server.Sessions;
using Xunit;

public sealed class SparseArrayExpanderTests
{
    // Builds an MxValue carrying a sparse array with the given elements.
    private static MxValue Sparse(MxDataType type, uint length, params (uint Index, MxValue Value)[] els)
    {
        MxSparseArray sparse = new() { ElementDataType = type, TotalLength = length };
        foreach ((uint index, MxValue value) in els)
            sparse.Elements.Add(new MxSparseElement { Index = index, Value = value });
        return new MxValue { SparseArrayValue = sparse };
    }

    [Fact]
    public void Expand_Int32_FillsDefaultsAndPlacesValues()
    {
        MxValue v = Sparse(MxDataType.Integer, 4, (1, new MxValue { Int32Value = 7 }));
        SparseArrayExpander.Expand(v);
        Assert.Equal(MxValue.KindOneofCase.ArrayValue, v.KindCase);
        Assert.Equal(new[] { 0, 7, 0, 0 }, v.ArrayValue.Int32Values.Values);
        Assert.Equal((uint)4, v.ArrayValue.Dimensions[0]);
    }

    [Fact]
    public void Expand_EmptyElements_ProducesAllDefaults()
    {
        MxValue v = Sparse(MxDataType.Boolean, 3);
        SparseArrayExpander.Expand(v);
        Assert.Equal(new[] { false, false, false }, v.ArrayValue.BoolValues.Values);
    }

    [Theory]
    [InlineData(0u, 0u)] // total_length == 0
    [InlineData(2u, 5u)] // index >= total_length
    public void Expand_InvalidShape_Throws(uint length, uint badIndex)
    {
        MxValue v = Sparse(MxDataType.Integer, length, (badIndex, new MxValue { Int32Value = 1 }));
        RpcException ex = Assert.Throws<RpcException>(() => SparseArrayExpander.Expand(v));
        Assert.Equal(StatusCode.InvalidArgument, ex.StatusCode);
    }

    [Fact]
    public void Expand_DuplicateIndex_Throws()
    {
        MxValue v = Sparse(MxDataType.Integer, 4,
            (1, new MxValue { Int32Value = 1 }), (1, new MxValue { Int32Value = 2 }));
        Assert.Throws<RpcException>(() => SparseArrayExpander.Expand(v));
    }
}
```
**Step 2: Run, confirm fail.** `dotnet test src/ZB.MOM.WW.MxGateway.Tests/... --filter FullyQualifiedName~SparseArrayExpanderTests` → FAIL (type not defined).
**Step 3: Implement `SparseArrayExpander`.** Mutates the passed `MxValue` in place, replacing `SparseArrayValue` with `ArrayValue`. Throw `RpcException(new Status(StatusCode.InvalidArgument, msg))` on any validation failure. The element-type switch must cover the supported scalar element types (`Boolean`, `Integer` → int32 or int64, `Float`, `Double`, `String`, `Time`); the `Time` default is the Unix epoch (`new Timestamp { Seconds = 0, Nanos = 0 }`); reject `Raw`/`Unknown`/`Unspecified`. Set `MxArray.Dimensions = { total_length }` and `ElementDataType`. Validate: `total_length > 0`, every index `< total_length`, no duplicate indices, each element `value` scalar kind matches `element_data_type`.
(Mirror the typed sub-array shapes from `VariantConverter.ConvertToComArray` in the worker so the worker's existing read path is satisfied: `Int32Values`/`Int64Values`/`BoolValues`/`FloatValues`/`DoubleValues`/`StringValues`/`TimestampValues` with their `Values` repeated fields.)
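The element-type switch might look roughly like the sketch below. It is an assumption-laden illustration: `ElementType` is a hypothetical mirror of the element types named in this step (the real enum is the generated `MxDataType`), `DateTimeOffset.UnixEpoch` stands in for the protobuf `Timestamp` at zero seconds, and the int64 variant of `Integer` is left out.

```csharp
using System;

// Hypothetical mirror of the element types named above; the generated
// MxDataType enum is the real source of truth and is not reproduced here.
public enum ElementType { Unspecified, Boolean, Integer, Float, Double, String, Time, Raw }

public static class ElementDefaults
{
    // Returns the default-fill value for a supported scalar element type and
    // rejects everything else (Raw, Unspecified, or any unmapped member).
    public static object For(ElementType type) => type switch
    {
        ElementType.Boolean => (object)false,
        ElementType.Integer => 0,              // 32-bit zero; int64 omitted here
        ElementType.Float => 0f,
        ElementType.Double => 0d,
        ElementType.String => string.Empty,
        ElementType.Time => DateTimeOffset.UnixEpoch, // 1970-01-01T00:00:00Z
        _ => throw new ArgumentException($"Unsupported element type: {type}."),
    };
}
```

Keeping the default lookup separate from the fill loop makes the per-type tests in Step 4 a simple table: one expected default per supported type, plus one rejection case per unsupported type.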
**Step 4: Add remaining element-type tests** (int64, float, double, string, time/epoch, type-mismatch element → throws). Run filter → PASS.
**Step 5: Commit.**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/Sessions/SparseArrayExpander.cs \
src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SparseArrayExpanderTests.cs
git commit -m "feat(gateway): add SparseArrayExpander for default-fill partial array writes"
```
---
## Task 2: Gateway — `ArrayAddressNormalizer` (suffix normalization)
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 1, Tasks 4-9
**Files:**
- Create: `src/ZB.MOM.WW.MxGateway.Server/Sessions/ArrayAddressNormalizer.cs`
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/ArrayAddressNormalizerTests.cs`
**Step 1: Write failing tests** using a fake/in-memory `IGalaxyHierarchyCache` whose `Current.Index.TagsByAddress` contains `"Obj.Arr[]"` (array) and `"Obj.Scalar"` (non-array). Cases:
- `"Obj.Arr"` (bare, is array) → `"Obj.Arr[]"`.
- `"Obj.Arr[]"` (already suffixed) → unchanged.
- `"Obj.Scalar"` (non-array) → unchanged.
- `"Obj.Unknown"` (not in cache / metadata cold) → unchanged (pass-through fallback).
```csharp
[Fact]
public void Normalize_BareArrayName_AppendsSuffix()
{
    ArrayAddressNormalizer normalizer = new(FakeCacheWith("Obj.Arr[]", isArray: true));
    Assert.Equal("Obj.Arr[]", normalizer.Normalize("Obj.Arr"));
}

[Theory]
[InlineData("Obj.Arr[]")]   // already suffixed
[InlineData("Obj.Scalar")]  // non-array
[InlineData("Obj.Unknown")] // not in cache → fallback pass-through
public void Normalize_LeavesUnchanged(string address) =>
    Assert.Equal(address, new ArrayAddressNormalizer(FakeCacheWith("Obj.Arr[]", true)).Normalize(address));
```
**Step 2: Run, confirm fail.**
**Step 3: Implement.** Constructor injects `IGalaxyHierarchyCache cache`. `Normalize(string)`:
1. If `string.IsNullOrWhiteSpace(address)` or `address.EndsWith("[]", StringComparison.Ordinal)` → return unchanged.
2. Look up `address + "[]"` in `cache.Current.Index.TagsByAddress`. If found and `lookup.Attribute?.IsArray == true` → return `address + "[]"`.
3. Otherwise return `address` unchanged. Never throw (best-effort convenience).
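The three rules above can be sketched against a plain dictionary. This is not the real class: `ArrayAddressNormalizerSketch` is a hypothetical name, and a map from `FullTagReference` to an is-array flag stands in for `IGalaxyHierarchyCache` / `GalaxyTagLookup`, whose exact shapes are not shown in this plan.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in: the dictionary replaces the Galaxy hierarchy cache.
public sealed class ArrayAddressNormalizerSketch
{
    private const string ArraySuffix = "[]";
    private readonly IReadOnlyDictionary<string, bool> _isArrayByAddress;

    public ArrayAddressNormalizerSketch(IReadOnlyDictionary<string, bool> isArrayByAddress) =>
        _isArrayByAddress = isArrayByAddress;

    public string Normalize(string address)
    {
        // Rule 1: blank or already-suffixed addresses pass through.
        if (string.IsNullOrWhiteSpace(address) ||
            address.EndsWith(ArraySuffix, StringComparison.Ordinal))
        {
            return address;
        }

        // Rule 2: the index is keyed by the suffixed form, so probe with "[]".
        string candidate = address + ArraySuffix;
        if (_isArrayByAddress.TryGetValue(candidate, out bool isArray) && isArray)
        {
            return candidate;
        }

        // Rule 3: unknown or non-array addresses are forwarded verbatim.
        return address;
    }
}
```

Because every miss falls through to the original string, a cold or empty cache degrades to today's behavior instead of failing the `AddItem`.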
**Step 4: Run filter → PASS.**
**Step 5: Commit.**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/Sessions/ArrayAddressNormalizer.cs \
src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/ArrayAddressNormalizerTests.cs
git commit -m "feat(gateway): add ArrayAddressNormalizer for bare-name array AddItem"
```
---
## Task 3: Gateway — wire normalization + expansion into the outbound path
**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Tasks 1, 2)
**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs` (constructor — inject `ArrayAddressNormalizer`; `InvokeAsync` at 947-955)
- Modify: DI registration wherever `ArrayAddressNormalizer`/`GatewaySession` deps are registered (search `Security/Authorization/ConstraintEnforcer` registration for the pattern; register `ArrayAddressNormalizer` as scoped/singleton consistent with `IGalaxyHierarchyCache`)
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/GatewayArrayWriteWiringTests.cs`
**Step 1: Write a failing integration test** with `FakeWorkerHarness` (pattern: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/FakeWorkerHarnessTests.cs:51-69` → `CreateConnectedPairAsync`, `ReadCommandAsync`, `ReplyToCommandAsync`). Two assertions:
1. Client sends `AddItemCommand{ item_definition = "Obj.Arr" }` (array per the test's Galaxy cache) → the `WorkerEnvelope` the fake worker reads has `item_definition == "Obj.Arr[]"`.
2. Client sends `WriteCommand{ value = MxSparseArray(Integer, 4, {1:7}) }` → the worker receives `value.ArrayValue.Int32Values.Values == [0,7,0,0]` (no `SparseArrayValue` reaches the worker).
**Step 2: Run, confirm fail.**
**Step 3: Implement.** At the top of `InvokeAsync(WorkerCommand command, ...)`, before forwarding, transform `command.Command` (the `MxCommand`) by `PayloadCase`:
- `AddItem` → `command.Command.AddItem.ItemDefinition = _addressNormalizer.Normalize(command.Command.AddItem.ItemDefinition);`
- `AddItem2` → same on `AddItem2.ItemDefinition`.
- `Write`/`WriteSecured` → if `cmd.Value?.KindCase == SparseArrayValue` call `SparseArrayExpander.Expand(cmd.Value)`.
- `Write2`/`WriteSecured2` → expand `Value` only (not `TimestampValue`).
- `WriteBulk`/`Write2Bulk`/`WriteSecuredBulk`/`WriteSecured2Bulk` → expand each `entry.Value`.
Keep it a single private helper `NormalizeOutbound(MxCommand)` called once at the choke point. Because `TrackCommandReply` later reads the **same** `MxCommand` instance, the normalized `item_definition` flows into `SessionItemRegistration` with no extra change — add an assertion in the test that `TryGetItemRegistration(...).TagAddress == "Obj.Arr[]"` to lock that in.
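The single-helper dispatch might be shaped like the sketch below. All types here are hypothetical stand-ins for the generated command messages (only the fields that matter for normalization are modeled), and the two delegates represent `ArrayAddressNormalizer.Normalize` and `SparseArrayExpander.Expand`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the generated command and value messages.
public abstract class OutboundCommand { }
public sealed class AddItemCmd : OutboundCommand { public string ItemDefinition { get; set; } = ""; }
public sealed class WriteCmd : OutboundCommand { public ValueSketch Value { get; set; } = new(); }
public sealed class WriteBulkCmd : OutboundCommand { public List<ValueSketch> Entries { get; } = new(); }
public sealed class ValueSketch { public bool IsSparse { get; set; } }

public sealed class OutboundNormalizer
{
    private readonly Func<string, string> _normalizeAddress;
    private readonly Action<ValueSketch> _expandSparse;

    public OutboundNormalizer(Func<string, string> normalizeAddress, Action<ValueSketch> expandSparse)
    {
        _normalizeAddress = normalizeAddress;
        _expandSparse = expandSparse;
    }

    // Mutates the command in place so later tracking sees the same instance.
    public void NormalizeOutbound(OutboundCommand command)
    {
        switch (command)
        {
            case AddItemCmd add:
                add.ItemDefinition = _normalizeAddress(add.ItemDefinition);
                break;
            case WriteCmd write when write.Value.IsSparse:
                _expandSparse(write.Value);
                break;
            case WriteBulkCmd bulk:
                foreach (ValueSketch value in bulk.Entries)
                {
                    if (value.IsSparse) _expandSparse(value);
                }
                break;
            // Every other command is forwarded untouched.
        }
    }
}
```

Mutating in place, instead of cloning, is what lets the normalized address reach the item registration without a second code path.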
**Step 4: Run the wiring test + Tasks 1/2 filters → PASS.** Then run the AddItem/Write fake-worker regression group once:
```
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter "FullyQualifiedName~ArrayWrite|FullyQualifiedName~SparseArray|FullyQualifiedName~ArrayAddressNormalizer"
```
**Step 5: Commit.**
```bash
git add src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs \
src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/GatewayArrayWriteWiringTests.cs
git commit -m "feat(gateway): normalize array AddItem suffix and expand sparse writes at the worker boundary"
```
---
## Tasks 4-8: Client helpers + READMEs (one task per client, parallelizable)
Each client task does the same four things; only paths/idioms differ. **Depends on Task 0** (needs regenerated proto types). All five are parallelizable with each other and with Tasks 1-3, 9.
Common helper contract: `WriteArrayElements(serverHandle, itemHandle, elementDataType, totalLength, elements /* index→scalar MxValue */, userId)` builds an `MxValue { SparseArrayValue = MxSparseArray{...} }` and calls the existing raw write. Add a unit test that the built command carries `sparse_array_value` with the right `total_length`/indices (serialization round-trip; no live gateway). Update the **"Array writes replace the whole array"** README section to document: default-fill semantics (unmentioned = reset to default, not preserved), the required `total_length`, and that bare-name array writes now auto-normalize.
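As a rough picture of what each client helper assembles before calling the raw write, the sketch below builds a sparse payload from an index-to-value map. The names are hypothetical: `SparseArraySketch` is a plain-C# mirror of `MxSparseArray`, and a real helper would populate the generated protobuf message instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical plain-C# mirror of MxSparseArray, restricted to int32 elements.
public sealed record SparseArraySketch(
    string ElementDataType,
    uint TotalLength,
    IReadOnlyList<(uint Index, int Value)> Elements);

public static class SparseWriteBuilder
{
    public static SparseArraySketch BuildInt32(
        uint totalLength,
        IReadOnlyDictionary<uint, int> elements)
    {
        // Dictionary keys are unique, so duplicate indices cannot occur here.
        // Sorting by index only makes the payload deterministic for tests.
        List<(uint Index, int Value)> ordered = elements
            .OrderBy(pair => pair.Key)
            .Select(pair => (Index: pair.Key, Value: pair.Value))
            .ToList();

        return new SparseArraySketch("Integer", totalLength, ordered);
    }
}
```

Taking a map keyed by index means the duplicate-index rejection can only be triggered by callers that build the raw message themselves; range and length checks still happen in the gateway.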
### Task 4: .NET client
**Classification:** standard · **~4 min** · **Parallelizable with:** Tasks 5-9, 1-3
- Regenerate types: `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx`.
- Add helper next to `WriteAsync` in `clients/dotnet/ZB.MOM.WW.MxGateway.Client/MxGatewaySession.cs:678-688`.
- Test alongside existing client tests; README `clients/dotnet/README.md:162-170`.
- Verify: build slnx + `dotnet test` the client test project.
### Task 5: Go client
**Classification:** standard · **~4 min** · **Parallelizable with:** Tasks 4,6-9, 1-3
- Regenerate per `clients/go` README; helper next to `Write`/`WriteRaw` in `clients/go/mxgateway/session.go:559-581`.
- README `clients/go/README.md:139-147`.
- Verify: `gofmt`, `go build ./...`, `go test ./...` from `clients/go`.
### Task 6: Python client
**Classification:** standard · **~4 min** · **Parallelizable with:** Tasks 4-5,7-9, 1-3
- Regenerate per `clients/python` README; helper next to `write` in `clients/python/src/zb_mom_ww_mxgateway/session.py:469-490`.
- README `clients/python/README.md:142-150`.
- Verify: `python -m pytest` from `clients/python`.
### Task 7: Rust client
**Classification:** standard · **~4 min** · **Parallelizable with:** Tasks 4-6,8-9, 1-3
- Helper next to `write` in `clients/rust/src/session.rs:530-548`; conversion helpers in `clients/rust/src/value.rs`.
- README `clients/rust/README.md:162-170`.
- Verify: `cargo fmt`, `cargo check --workspace`, `cargo test --workspace`, `cargo clippy --all-targets -- -D warnings` from `clients/rust`.
### Task 8: Java client
**Classification:** standard · **~5 min** · **Parallelizable with:** Tasks 4-7,9, 1-3
- Helper next to `write`/`writeRaw` in `clients/java/.../client/MxGatewaySession.java:581-604`.
- README `clients/java/README.md:118-126`.
- **Build/test on windev (JDK 21) via an isolated `origin/<branch>` worktree — Mac has no JRE** (memory `project_java_build_host`). After gradle regen, **revert the spurious protobuf-version churn** in `clients/java/src/main/generated/.../MxaccessGateway.java` if no proto semantics beyond the new messages changed (memory `project_java_generated_churn`); keep only the real `MxSparseArray` additions.
- Verify: `gradle test` on windev.
(Each client task ends with its own commit, e.g. `feat(client-<lang>): add WriteArrayElements default-fill helper and document semantics`.)
---
## Task 9: Docs — gateway.md + value conversion
**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Tasks 1-8 (depends on Task 0 only)
**Files:**
- Modify: `gateway.md` (command/value surface — document `MxSparseArray` as a write-only value and bare-name AddItem normalization)
- Modify: the value-conversion doc under `docs/` (search for where `MxArray`/value conversion is described) — add the default-fill + epoch-default note and the parity statement (worker still does a whole-array COM write)
**Step 1:** Add a subsection describing: `sparse_array_value` is write-only and gateway-expanded; default-fill semantics (epoch for time); `total_length` required; bare-name array writes auto-normalize to `[]` at AddItem with metadata pass-through fallback; non-goal: no preserve-unchanged merge, no element-wise COM write.
**Step 2: Commit.**
```bash
git add gateway.md docs/
git commit -m "docs: document MxSparseArray default-fill writes and bare-name array AddItem"
```
---
## Dependency summary
- **Task 0** blocks everything.
- **Tasks 1, 2** depend on 0; parallel with each other.
- **Task 3** depends on 1 and 2.
- **Tasks 4-8** depend on 0; parallel with each other and with 1-3, 9.
- **Task 9** depends on 0; parallel with all.
## Verification gates (per CLAUDE.md targeted-tests rule)
- Run only the `--filter` for the task you touched; run the array-write fake-worker group once after Task 3.
- Java verified on windev only. .NET/Go/Rust/Python verified locally.
- Live MXAccess (opt-in, windev): after merge, one default-fill write and one bare-name array write against real COM (`MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1`).
@@ -0,0 +1,16 @@
{
  "planPath": "docs/plans/2026-06-18-array-write-ergonomics.md",
  "tasks": [
    {"id": 0, "subject": "Task 0: Contract — add MxSparseArray + regenerate", "status": "pending"},
    {"id": 1, "subject": "Task 1: Gateway SparseArrayExpander", "status": "pending", "blockedBy": [0]},
    {"id": 2, "subject": "Task 2: Gateway ArrayAddressNormalizer", "status": "pending", "blockedBy": [0]},
    {"id": 3, "subject": "Task 3: Wire normalization + expansion into GatewaySession", "status": "pending", "blockedBy": [1, 2]},
    {"id": 4, "subject": "Task 4: .NET client WriteArrayElements + README", "status": "pending", "blockedBy": [0]},
    {"id": 5, "subject": "Task 5: Go client WriteArrayElements + README", "status": "pending", "blockedBy": [0]},
    {"id": 6, "subject": "Task 6: Python client write_array_elements + README", "status": "pending", "blockedBy": [0]},
    {"id": 7, "subject": "Task 7: Rust client write_array_elements + README", "status": "pending", "blockedBy": [0]},
    {"id": 8, "subject": "Task 8: Java client writeArrayElements + README (windev)", "status": "pending", "blockedBy": [0]},
    {"id": 9, "subject": "Task 9: Docs — gateway.md + value conversion", "status": "pending", "blockedBy": [0]}
  ],
  "lastUpdated": "2026-06-18"
}
+93 -5
@@ -481,6 +481,68 @@ metadata rather than dropped. If a value cannot be losslessly converted, the
worker should return both the best typed projection and enough diagnostic
metadata to reproduce the case.
### MxSparseArray — default-fill partial array writes (write-only)
`MxSparseArray` is a write-only `oneof kind` arm on `MxValue` that lets clients
send only the indices they want to change plus a total length, rather than
marshalling the entire array every write. The worker never produces or receives
it; expansion happens entirely in the gateway before the command reaches the pipe.
```protobuf
message MxSparseArray {
  MxDataType element_data_type = 1;
  uint32 total_length = 2;
  repeated MxSparseElement elements = 3;
}

message MxSparseElement {
  uint32 index = 1;
  MxValue value = 2; // scalar
}
```
**Expansion.** Before forwarding any write command to the worker the gateway
allocates a full array of `total_length` slots, initializes every slot to the
element type's default, places each `MxSparseElement` at its index, then
replaces the `MxValue` with a normal `array_value` (`MxArray`). The worker
receives an ordinary whole-array write — parity is preserved.
Default values by element type:
| Element type | Default |
|---|---|
| `Boolean` | `false` |
| `Integer` | `0` (int32, or int64 when an element value is 64-bit) |
| `Float` / `Double` | `0` |
| `String` | `""` |
| `Time` | Unix epoch (1970-01-01T00:00:00Z) |
Unmentioned indices take the element type's default: this is a **reset**, not a
preserve. There is no read-modify-write merge. Adding one would introduce cache
staleness, a race window against other writers, and the latency of a round-trip
read, and it would paper over MXAccess's whole-array write semantics.
**Validation.** The gateway rejects the following with `InvalidArgument`:
- `total_length == 0`
- any `index >= total_length`
- duplicate indices
- `element_data_type` that is `Raw` or `Unspecified`
- an element `value` whose kind does not match `element_data_type`
- `total_length` exceeds the gateway-configured maximum array length
An empty `elements` list with a non-zero `total_length` is valid — it writes an
all-defaults array of length `total_length` (explicit reset). A `sparse_array_value`
arriving on any read or event path is rejected as a guard; the worker never
produces one.
**Non-goals.** There is no preserve-unchanged read-modify-write merge, no
element-wise COM write (MXAccess has no such API), and no change to `ReadBulk`
string addressing.
`sparse_array_value` is accepted by every write variant: `Write`, `Write2`,
`WriteSecured`, `WriteSecured2`, and each `*BulkEntry` entry.
## Status Model
Represent `MXSTATUS_PROXY` explicitly:
@@ -1049,6 +1111,27 @@ Known important parity areas from existing captures:
- Invalid handles and cross-server handles have specific exception/status
behavior.
- STA message pumping is required for event delivery.
- A plain `Write`/`Write2` only honors its `user_id` when the item has an active
supervisory advise. Callers that do not go through the
`AuthenticateUser` → `WriteSecured`/`WriteSecured2` path must send
`AdviseSupervisory` for the item before a user id on a plain write is
recorded; otherwise the user id is ignored.
- Writing an array attribute replaces the whole array — it is not an
element-wise patch. To change a subset of elements the caller must send the
full array (unchanged elements included); sending only the changed elements
resizes the attribute. `MxSparseArray` provides a default-fill shorthand for
this: the gateway reconstructs the full array from the supplied sparse
representation (unmentioned indices → type default) before sending the
whole-array write to the worker.
- Array attribute addresses require the `[]` body suffix to be write-capable.
The gateway normalizes bare-name addresses at `AddItem` time: when Galaxy
metadata confirms `is_array`, the gateway appends `[]` before registering the
handle with the worker. When metadata is unavailable or the address is not
recognized as an array, the address is forwarded unchanged so existing
behavior is not regressed. The normalized address is stored in
`SessionItemRegistration.TagAddress` and applies consistently to all
subsequent writes on that handle. `ReadBulk` is unaffected — it uses raw
address strings with its own ephemeral registration.
The gateway should not "fix" these behaviors unless the client explicitly opts
into a non-parity mode.
@@ -1084,12 +1167,19 @@ Resolved for v1:
- MXAccess COM target is `ArchestrA.MxAccess.LMXProxyServerClass` /
`LMXProxy.LMXProxyServer.1` from the installed 32-bit `LmxProxy.dll`.
-- One `OpenSession` maps to one worker process; no reconnectable sessions.
-- One active event subscriber per session.
+- One `OpenSession` maps to one worker process.
+- Reconnectable sessions: clients reconnect by re-issuing `StreamEvents` with
+  `after_worker_sequence`; the gateway replays the retained ring tail and emits
+  a `ReplayGap` sentinel when events were evicted. See `docs/Sessions.md`.
+- Multi-subscriber event fan-out: multiple concurrent `StreamEvents` callers on
+  the same session are supported; single-subscriber mode uses fail-fast
+  backpressure, multi-subscriber mode disconnects only the slow consumer. See
+  `docs/Sessions.md`.
- API key authentication with hashed keys in gateway-owned SQLite.
- Basic Blazor Server dashboard with Bootstrap CSS/JS and real-time updates.
- Workers run as the gateway service identity.
-- Event backpressure is fail-fast with bounded queues.
+- Event backpressure is fail-fast with bounded queues (single-subscriber) or
+  per-subscriber disconnect (multi-subscriber).
- No public command batching.
- `OperationComplete` is forwarded only when native MXAccess raises it.
- `OnBufferedDataChange` is modeled now; multi-sample payload conversion remains
@@ -1098,8 +1188,6 @@ Resolved for v1:
Post-v1 revisit items:
- production event-rate target and optional coalescing,
-- reconnectable sessions,
-- multi-subscriber event fan-out,
- restricted worker process identity,
- command batching for high-volume setup.
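The reconnect behavior listed under "Resolved for v1" (replay of the retained ring tail, plus a `ReplayGap` sentinel when events were evicted) can be sketched roughly as below. The `ReplayRing` class is hypothetical, and the gap condition (oldest retained sequence greater than `after_sequence + 1`) is an assumption made for illustration; only the two `ReplayGap` field names come from the contract.

```python
from collections import deque


class ReplayRing:
    """Hypothetical bounded ring of (sequence, payload) pairs, oldest first."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently evicts the oldest entry when full.
        self._events = deque(maxlen=capacity)

    def append(self, sequence, payload):
        self._events.append((sequence, payload))

    def replay_after(self, after_sequence):
        """Return (gap, events) for a client resuming after `after_sequence`.

        `gap` is None when nothing the client needs was evicted; otherwise it is
        a dict using the ReplayGap field names. The gap rule is an assumption.
        """
        events = [e for e in self._events if e[0] > after_sequence]
        gap = None
        if self._events:
            oldest = self._events[0][0]
            if oldest > after_sequence + 1:
                gap = {
                    "requested_after_sequence": after_sequence,
                    "oldest_available_sequence": oldest,
                }
        return gap, events


ring = ReplayRing(capacity=3)
for seq in range(1, 6):
    ring.append(seq, f"event-{seq}")
```

With a capacity of 3 and five appended events, only sequences 3 to 5 remain, so a client resuming after sequence 1 would receive a gap marker followed by the retained tail, while a client resuming after sequence 3 would receive only the newer events.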
File diff suppressed because it is too large
@@ -729,8 +729,7 @@ message MxEvent {
// stream; it is ALWAYS unset on events in DrainEventsReply (the diagnostic
// drain path never emits the sentinel).
// Additive (proto3): existing clients that ignore this field continue to
// deserialize the stream unchanged. (Reconnect/replay logic is Task 12; this
// is the contract surface only.)
// deserialize the stream unchanged.
optional ReplayGap replay_gap = 14;
oneof body {
@@ -1041,6 +1040,7 @@ message MxValue {
google.protobuf.Timestamp timestamp_value = 16;
MxArray array_value = 17;
bytes raw_value = 18;
MxSparseArray sparse_array_value = 19;
}
}
@@ -1063,6 +1063,21 @@ message MxArray {
}
}
// Write-only sparse array value. The gateway expands this into a full,
// default-filled MxArray before forwarding to the worker; the worker never
// receives or produces it. Unmentioned indices take the element type's
// default (reset, NOT preserved).
message MxSparseArray {
MxDataType element_data_type = 1;
uint32 total_length = 2;
repeated MxSparseElement elements = 3;
}
message MxSparseElement {
uint32 index = 1;
MxValue value = 2; // scalar
}
message BoolArray {
repeated bool values = 1;
}
@@ -7,7 +7,7 @@
<PropertyGroup>
<IsPackable>true</IsPackable>
<PackageId>ZB.MOM.WW.MxGateway.Contracts</PackageId>
<Version>0.1.1</Version>
<Version>0.1.2</Version>
<Authors>Joseph Doherty</Authors>
<Company>ZB MOM WW</Company>
<Copyright>Copyright (c) ZB MOM WW. All rights reserved.</Copyright>
@@ -669,11 +669,12 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
Assert.NotEqual(0, userToIdReply.ArchestraUserToId.UserId);
}
// Suspend / Activate against the advised item. The dev-rig TestInt item class
// may not be suspendable (MXAccess returns 0x80070057 / E_INVALIDARG for a
// wrong item class — see B8 notes). That is MXAccess parity: assert the reply
// kind and a non-INVALID_REQUEST status, surface the HResult and MxStatusProxy
// for the record, and do NOT treat a provider-side rejection as a test failure.
// Suspend / Activate against the added-but-not-advised item (no Advise was issued
// between AddItem and this call). The dev-rig TestInt item class may not be
// suspendable (MXAccess returns 0x80070057 / E_INVALIDARG for a wrong item class
// — see B8 notes). That is MXAccess parity: assert the reply kind and a
// non-INVALID_REQUEST status, surface the HResult and MxStatusProxy for the
// record, and do NOT treat a provider-side rejection as a test failure.
MxCommandReply suspendReply = await fixture.Service.Invoke(
CreateSuspendRequest(sessionId, serverHandle, itemHandle),
new TestServerCallContext()).ConfigureAwait(false);
@@ -827,8 +828,9 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
streamCancellation.Token)
.ConfigureAwait(false);
}
catch (TimeoutException)
catch (TimeoutException ex)
{
output.WriteLine($"B8: sample-bearing batch predicate timed out: {ex.Message}");
bufferedBatch = null;
}
@@ -196,11 +196,30 @@ public sealed class ConstraintEnforcer(
private GalaxyTagLookup? ResolveTarget(string tagAddress)
{
GalaxyHierarchyCacheEntry entry = cache.Current;
return !string.IsNullOrWhiteSpace(tagAddress)
&& entry.Index.TagsByAddress.TryGetValue(tagAddress, out GalaxyTagLookup? lookup)
? lookup
: null;
if (string.IsNullOrWhiteSpace(tagAddress))
{
return null;
}
IReadOnlyDictionary<string, GalaxyTagLookup> tagsByAddress = cache.Current.Index.TagsByAddress;
if (tagsByAddress.TryGetValue(tagAddress, out GalaxyTagLookup? lookup))
{
return lookup;
}
// Galaxy SQL keys array attributes by their suffixed FullTagReference (e.g. "Obj.Arr[]"),
// but callers pass the bare address ("Obj.Arr") before the worker-boundary normalization
// runs. Probe the suffixed form so a bare array name resolves to its array attribute,
// consistent with ArrayAddressNormalizer. Only build the suffixed string on a direct miss
// when the address is not already suffixed, and only accept it when it is truly an array.
if (!tagAddress.EndsWith("[]", StringComparison.Ordinal)
&& tagsByAddress.TryGetValue(tagAddress + "[]", out GalaxyTagLookup? arrayLookup)
&& arrayLookup.Attribute?.IsArray == true)
{
return arrayLookup;
}
return null;
}
private static bool MatchesPathOrTag(
@@ -0,0 +1,43 @@
using ZB.MOM.WW.MxGateway.Server.Galaxy;
namespace ZB.MOM.WW.MxGateway.Server.Sessions;
/// <summary>
/// Rewrites a bare MXAccess attribute address to its writable array form by appending the
/// trailing <c>[]</c> suffix when Galaxy Repository metadata reports the attribute as an array.
/// MXAccess requires the <c>[]</c> suffix on the AddItem address for an array attribute to be
/// writable; the bare name registers a handle that is effectively read-only. This is best-effort: when metadata
/// is cold, the address is unknown, or the attribute is not an array, the address is returned
/// unchanged and no exception is thrown.
/// </summary>
public sealed class ArrayAddressNormalizer(IGalaxyHierarchyCache cache)
{
private const string ArraySuffix = "[]";
/// <summary>
/// Returns <paramref name="address"/> with a trailing <c>[]</c> appended when Galaxy metadata
/// reports it as an array attribute; otherwise returns it unchanged. Never throws.
/// </summary>
/// <param name="address">The MXAccess attribute address to normalize.</param>
/// <returns>The normalized address, or the original address when no rewrite applies.</returns>
public string Normalize(string address)
{
if (string.IsNullOrWhiteSpace(address))
{
return address;
}
if (address.EndsWith(ArraySuffix, StringComparison.Ordinal))
{
return address;
}
// Galaxy SQL keys array attributes by their suffixed FullTagReference (e.g. "Obj.Arr[]"),
// so probe for the suffixed form to decide whether the bare name is an array.
string suffixed = address + ArraySuffix;
return cache.Current.Index.TagsByAddress.TryGetValue(suffixed, out GalaxyTagLookup? lookup)
&& lookup.Attribute?.IsArray == true
? suffixed
: address;
}
}
@@ -25,6 +25,12 @@ public sealed class GatewaySession
private readonly TimeSpan _detachGrace;
private readonly TimeSpan _workerReadyWaitTimeout;
private DateTimeOffset? _detachedAtUtc;
// True once at least one external subscriber attached SUCCESSFULLY. Detach-grace's
// "last subscriber dropped" stamp (see DetachEventSubscriber) is gated on this so a
// FAILED first attach — which still runs the rollback DetachEventSubscriber from the
// attach catch path — does not push a never-subscribed session into the grace window
// (Server-055).
private bool _everHadEventSubscriber;
private SessionEventDistributor? _eventDistributor;
private bool _eventDistributorStarted;
private bool _dashboardMirrorStarted;
@@ -32,6 +38,7 @@ public sealed class GatewaySession
private Task? _dashboardMirrorTask;
private CancellationTokenSource? _dashboardMirrorCts;
private readonly Dictionary<(int ServerHandle, int ItemHandle), SessionItemRegistration> _items = [];
private readonly ArrayAddressNormalizer? _addressNormalizer;
/// <summary>
/// Initializes a gateway session with session metadata and timeout configuration.
@@ -127,6 +134,12 @@ public sealed class GatewaySession
/// fast immediately. <see cref="TimeSpan.Zero"/> (the default) disables the wait and
/// preserves the original fail-fast behavior byte-for-byte.
/// </param>
/// <param name="addressNormalizer">
/// Rewrites bare array <c>AddItem</c>/<c>AddItem2</c> addresses to their writable <c>[]</c>
/// form using Galaxy metadata at the outbound choke point (and on registration tracking).
/// When <see langword="null"/> (legacy unit-construction paths that do not exercise Galaxy
/// metadata), addresses pass through unchanged.
/// </param>
public GatewaySession(
string sessionId,
string backendName,
@@ -143,7 +156,8 @@ public sealed class GatewaySession
DateTimeOffset openedAt,
SessionEventStreaming? eventStreaming = null,
TimeSpan detachGrace = default,
TimeSpan workerReadyWaitTimeout = default)
TimeSpan workerReadyWaitTimeout = default,
ArrayAddressNormalizer? addressNormalizer = null)
{
if (string.IsNullOrWhiteSpace(sessionId))
{
@@ -183,6 +197,7 @@ public sealed class GatewaySession
_eventStreaming = eventStreaming ?? SessionEventStreaming.Default;
_detachGrace = detachGrace > TimeSpan.Zero ? detachGrace : TimeSpan.Zero;
_workerReadyWaitTimeout = workerReadyWaitTimeout > TimeSpan.Zero ? workerReadyWaitTimeout : TimeSpan.Zero;
_addressNormalizer = addressNormalizer;
}
/// <summary>
@@ -842,6 +857,7 @@ public sealed class GatewaySession
try
{
IEventSubscriberLease distributorLease = StartDistributorAndRegister();
MarkEventSubscriberAttached();
return new EventSubscriberLease(this, distributorLease);
}
catch
@@ -906,6 +922,7 @@ public sealed class GatewaySession
out ulong oldestAvailableSequence,
out ulong liveResumeSequence);
MarkEventSubscriberAttached();
return new EventSubscriberReplayAttachment(
new EventSubscriberLease(this, distributorLease),
replayedEvents,
@@ -920,6 +937,17 @@ public sealed class GatewaySession
}
}
// Records that an external subscriber attached successfully. Gates the detach-grace
// "last subscriber dropped" stamp so a FAILED first attach (which still rolls back via
// DetachEventSubscriber) never pushes a never-subscribed session into grace (Server-055).
private void MarkEventSubscriberAttached()
{
lock (_syncRoot)
{
_everHadEventSubscriber = true;
}
}
/// <summary>
/// Invokes a worker command synchronously and returns the reply.
/// </summary>
@@ -929,12 +957,95 @@ public sealed class GatewaySession
WorkerCommand command,
CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(command);
if (command.Command is not null)
{
NormalizeOutboundCommand(command.Command);
}
IWorkerClient workerClient = await GetReadyWorkerClientAsync(cancellationToken).ConfigureAwait(false);
TouchClientActivity(_eventStreaming.TimeProvider.GetUtcNow());
return await workerClient.InvokeAsync(command, CommandTimeout, cancellationToken).ConfigureAwait(false);
}
// Single outbound choke point for the two array-write ergonomics shims (Task 3):
// 1. AddItem/AddItem2 array addresses gain the writable "[]" suffix when Galaxy metadata
// reports them as arrays, so the worker registers a write-capable handle. The mutation
// lands on the same MxCommand instance forwarded to the worker.
// 2. Sparse array write values are expanded to whole-array values, because MXAccess has no
// partial-array write primitive — the worker only ever sees a full MxArray.
// SparseArrayExpander.Expand throws RpcException(InvalidArgument) for an invalid sparse payload;
// that propagates out of InvokeAsync as the desired client-facing error and is deliberately not
// caught here.
private void NormalizeOutboundCommand(MxCommand command)
{
switch (command.PayloadCase)
{
case MxCommand.PayloadOneofCase.AddItem:
command.AddItem.ItemDefinition = NormalizeAddress(command.AddItem.ItemDefinition);
break;
case MxCommand.PayloadOneofCase.AddItem2:
command.AddItem2.ItemDefinition = NormalizeAddress(command.AddItem2.ItemDefinition);
break;
case MxCommand.PayloadOneofCase.Write:
ExpandValue(command.Write.Value);
break;
case MxCommand.PayloadOneofCase.WriteSecured:
ExpandValue(command.WriteSecured.Value);
break;
case MxCommand.PayloadOneofCase.Write2:
ExpandValue(command.Write2.Value);
break;
case MxCommand.PayloadOneofCase.WriteSecured2:
ExpandValue(command.WriteSecured2.Value);
break;
case MxCommand.PayloadOneofCase.WriteBulk:
foreach (WriteBulkEntry entry in command.WriteBulk.Entries)
{
ExpandValue(entry.Value);
}
break;
case MxCommand.PayloadOneofCase.Write2Bulk:
foreach (Write2BulkEntry entry in command.Write2Bulk.Entries)
{
ExpandValue(entry.Value);
}
break;
case MxCommand.PayloadOneofCase.WriteSecuredBulk:
foreach (WriteSecuredBulkEntry entry in command.WriteSecuredBulk.Entries)
{
ExpandValue(entry.Value);
}
break;
case MxCommand.PayloadOneofCase.WriteSecured2Bulk:
foreach (WriteSecured2BulkEntry entry in command.WriteSecured2Bulk.Entries)
{
ExpandValue(entry.Value);
}
break;
}
}
// Best-effort array-suffix rewrite; the normalizer is null in legacy unit-construction paths
// that do not exercise Galaxy metadata, in which case the address passes through unchanged.
private string NormalizeAddress(string address) =>
_addressNormalizer?.Normalize(address) ?? address;
// MXAccess writes replace the whole array; expand a sparse value in place so the worker only
// ever receives a whole-array MxValue. No-op for null or non-sparse values.
private static void ExpandValue(MxValue? value)
{
if (value is not null)
{
SparseArrayExpander.Expand(value);
}
}
/// <summary>Gets the item registration for a server and item handle pair.</summary>
/// <param name="serverHandle">The MXAccess server handle.</param>
/// <param name="itemHandle">The MXAccess item handle.</param>
@@ -966,11 +1077,16 @@ public sealed class GatewaySession
{
switch (command.Kind)
{
// The public reply is tracked from the pre-mapping MxCommand instance, which is a
// separate copy from the one mutated at the InvokeAsync choke point (the gRPC mapper
// deep-clones before forwarding). Re-apply the array-suffix normalization here so the
// registration's TagAddress matches the address the worker actually registered.
// Normalize is idempotent for an already-suffixed address.
case MxCommandKind.AddItem when reply.AddItem is not null:
TrackItem(command.AddItem.ServerHandle, reply.AddItem.ItemHandle, command.AddItem.ItemDefinition);
TrackItem(command.AddItem.ServerHandle, reply.AddItem.ItemHandle, NormalizeAddress(command.AddItem.ItemDefinition));
break;
case MxCommandKind.AddItem2 when reply.AddItem2 is not null:
TrackItem(command.AddItem2.ServerHandle, reply.AddItem2.ItemHandle, command.AddItem2.ItemDefinition);
TrackItem(command.AddItem2.ServerHandle, reply.AddItem2.ItemHandle, NormalizeAddress(command.AddItem2.ItemDefinition));
break;
case MxCommandKind.AddBufferedItem when reply.AddBufferedItem is not null:
TrackItem(command.AddBufferedItem.ServerHandle, reply.AddBufferedItem.ItemHandle, command.AddBufferedItem.ItemDefinition);
@@ -1862,7 +1978,12 @@ public sealed class GatewaySession
// Closing/Closed/Faulted there is nothing to retain. This is the detach→grace-start
// transition; it shares _syncRoot with the reattach→grace-cancel write above and the
// sweeper's IsDetachGraceExpired read, so the three serialize.
if (_detachGrace > TimeSpan.Zero
// Only stamp a detach that mirrors a prior SUCCESSFUL attach. The attach catch path
// calls this same method to roll back a reserved slot when the FIRST attach failed
// before any subscriber registered; that never-subscribed session must not enter the
// grace window (Server-055).
if (_everHadEventSubscriber
&& _detachGrace > TimeSpan.Zero
&& _activeEventSubscriberCount == 0
&& _state is not (SessionState.Closing or SessionState.Closed or SessionState.Faulted))
{
@@ -116,6 +116,17 @@ public sealed class SessionEventDistributor : IAsyncDisposable
private bool _started;
private bool _disposed;
// Set once the pump has run its final CompleteAllSubscribers sweep — the event source
// completed or faulted and the pump exited. Guarded by _lifecycleLock together with the
// subscriber add. A subscriber that registers AFTER this point but BEFORE DisposeAsync
// (the source ended but the session is not yet torn down) would otherwise be added with a
// channel the now-exited pump never completes, hanging its reader forever. The register
// paths complete such a late registrant's channel immediately with the same terminal
// state. _completionError carries the terminal exception (source fault) or null (graceful
// source completion), mirroring what the final CompleteAllSubscribers passed.
private bool _completed;
private Exception? _completionError;
/// <summary>
/// Initializes a per-session event distributor.
/// </summary>
@@ -304,6 +315,16 @@ public sealed class SessionEventDistributor : IAsyncDisposable
{
ObjectDisposedException.ThrowIf(_disposed, this);
_subscribers[subscriber.Id] = subscriber;
// Close the register-after-pump-completion window: if the pump already ran its
// final CompleteAllSubscribers (source completed/faulted) but the distributor is
// not yet disposed, no further completion sweep will run, so complete this late
// registrant's channel now with the same terminal state instead of leaving its
// reader hanging.
if (_completed)
{
subscriber.Channel.Writer.TryComplete(_completionError);
}
}
return new SubscriberLease(this, subscriber);
@@ -450,6 +471,14 @@ public sealed class SessionEventDistributor : IAsyncDisposable
{
ObjectDisposedException.ThrowIf(_disposed, this);
_subscribers[id] = subscriber;
// Same register-after-pump-completion guard as Register: a resume that races in
// after the source already ended still gets its retained replay batch (snapshot
// above), but its live channel must be completed now since the pump is gone.
if (_completed)
{
subscriber.Channel.Writer.TryComplete(_completionError);
}
}
}
@@ -628,9 +657,21 @@ public sealed class SessionEventDistributor : IAsyncDisposable
private void CompleteAllSubscribers(Exception? error)
{
foreach (Subscriber subscriber in _subscribers.Values)
// Record the terminal state AND complete the current subscribers under _lifecycleLock
// so this serializes with the subscriber-add in Register/RegisterWithReplay: a
// subscriber added before this runs is in the map and completed by the loop; one that
// races in afterward sees _completed and completes its own channel in the register
// path. Exactly one of the two completes each subscriber. TryComplete is non-blocking
// and (channels use AllowSynchronousContinuations=false) runs no continuation inline,
// so holding the lock across the loop cannot stall or re-enter.
lock (_lifecycleLock)
{
subscriber.Channel.Writer.TryComplete(error);
_completed = true;
_completionError = error;
foreach (Subscriber subscriber in _subscribers.Values)
{
subscriber.Channel.Writer.TryComplete(error);
}
}
}
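The register-after-completion guard in the hunks above follows a common pattern: record the terminal state and sweep the current subscribers under one lock, and have late registrants check that state under the same lock. A minimal Python sketch, assuming hypothetical `Distributor` and `Subscriber` classes (the real code uses channels and `TryComplete`):

```python
import threading


class Subscriber:
    """Hypothetical subscriber; complete() mimics a first-call-wins TryComplete."""

    def __init__(self):
        self.completed = False
        self.error = None
        self.complete_calls = 0

    def complete(self, error):
        self.complete_calls += 1
        if not self.completed:
            self.completed = True
            self.error = error


class Distributor:
    def __init__(self):
        self._lock = threading.Lock()
        self._subscribers = {}
        self._completed = False
        self._completion_error = None

    def register(self, sub_id, subscriber):
        with self._lock:
            self._subscribers[sub_id] = subscriber
            # The final sweep already ran: no later sweep will complete this
            # subscriber, so complete it now with the same terminal state.
            if self._completed:
                subscriber.complete(self._completion_error)

    def complete_all(self, error=None):
        with self._lock:
            # Record the terminal state and sweep under the same lock so a
            # concurrent register() lands either before or after, never between.
            self._completed = True
            self._completion_error = error
            for subscriber in self._subscribers.values():
                subscriber.complete(error)


distributor = Distributor()
early, late = Subscriber(), Subscriber()
fault = RuntimeError("source faulted")
distributor.register(1, early)
distributor.complete_all(fault)
distributor.register(2, late)
```

In this sketch the early subscriber is completed by the sweep and the late one by `register`, each exactly once and with the same error.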
@@ -29,6 +29,7 @@ public sealed class SessionManager : ISessionManager
private readonly Grpc.MxAccessGrpcMapper _eventMapper;
private readonly ILogger<SessionEventDistributor> _distributorLogger;
private readonly Dashboard.Hubs.IDashboardEventBroadcaster? _dashboardEventBroadcaster;
private readonly ArrayAddressNormalizer? _addressNormalizer;
/// <summary>
/// Initializes a new instance of <see cref="SessionManager"/>.
@@ -47,6 +48,11 @@ public sealed class SessionManager : ISessionManager
/// dashboard receives events regardless of whether a gRPC client is streaming. Null in
/// unit tests that do not exercise the dashboard mirror.
/// </param>
/// <param name="addressNormalizer">
/// Rewrites bare array AddItem addresses to their writable <c>[]</c> form using Galaxy
/// metadata; handed to each session so the normalization runs at the outbound choke point.
/// Null in unit tests that do not exercise array-write ergonomics.
/// </param>
public SessionManager(
ISessionRegistry registry,
ISessionWorkerClientFactory workerClientFactory,
@@ -56,7 +62,8 @@ public sealed class SessionManager : ISessionManager
ILogger<SessionManager>? logger = null,
Grpc.MxAccessGrpcMapper? eventMapper = null,
ILogger<SessionEventDistributor>? distributorLogger = null,
Dashboard.Hubs.IDashboardEventBroadcaster? dashboardEventBroadcaster = null)
Dashboard.Hubs.IDashboardEventBroadcaster? dashboardEventBroadcaster = null,
ArrayAddressNormalizer? addressNormalizer = null)
{
_registry = registry ?? throw new ArgumentNullException(nameof(registry));
_workerClientFactory = workerClientFactory ?? throw new ArgumentNullException(nameof(workerClientFactory));
@@ -67,6 +74,7 @@ public sealed class SessionManager : ISessionManager
_eventMapper = eventMapper ?? new Grpc.MxAccessGrpcMapper();
_distributorLogger = distributorLogger ?? NullLogger<SessionEventDistributor>.Instance;
_dashboardEventBroadcaster = dashboardEventBroadcaster;
_addressNormalizer = addressNormalizer;
_options = options.Value;
_sessionSlots = new SemaphoreSlim(_options.Sessions.MaxSessions, _options.Sessions.MaxSessions);
}
@@ -506,7 +514,8 @@ public sealed class SessionManager : ISessionManager
openedAt,
eventStreaming,
TimeSpan.FromSeconds(Math.Max(0, _options.Sessions.DetachGraceSeconds)),
TimeSpan.FromMilliseconds(Math.Max(0, _options.Sessions.WorkerReadyWaitTimeoutMs)));
TimeSpan.FromMilliseconds(Math.Max(0, _options.Sessions.WorkerReadyWaitTimeoutMs)),
_addressNormalizer);
}
private static string CreateClientCorrelationId(
@@ -8,6 +8,9 @@ public static class SessionServiceCollectionExtensions
/// <returns>The service collection for chaining.</returns>
public static IServiceCollection AddGatewaySessions(this IServiceCollection services)
{
// Lifetime consistent with IGalaxyHierarchyCache (singleton); the normalizer reads the
// cache's current snapshot per call, so it holds no per-session or per-request state.
services.AddSingleton<ArrayAddressNormalizer>();
services.AddSingleton<ISessionRegistry, SessionRegistry>();
services.AddSingleton<ISessionWorkerClientFactory, SessionWorkerClientFactory>();
services.AddSingleton<ISessionManager, SessionManager>();
@@ -0,0 +1,285 @@
using Google.Protobuf.WellKnownTypes;
using Grpc.Core;
using ZB.MOM.WW.MxGateway.Contracts.Proto;
namespace ZB.MOM.WW.MxGateway.Server.Sessions;
/// <summary>
/// Expands a client-supplied sparse array write (<see cref="MxSparseArray"/>) into a
/// full, default-filled <see cref="MxArray"/> in place.
/// </summary>
/// <remarks>
/// MXAccess has no partial-array write primitive: a write replaces the whole array.
/// Clients that only care about a few indices send an <see cref="MxSparseArray"/> with
/// the total length plus the indices they want to set; this expander materializes the
/// full array so the worker can do an ordinary whole-array COM write. Indices the client
/// did not mention are reset to the element type's default (they are NOT preserved from
/// the live value); this is intentional, because the gateway cannot read-modify-write
/// without racing the provider.
///
/// For the MXAccess <c>Integer</c> element type the worker's COM-array converter (see
/// <c>VariantConverter.ConvertArray</c>) chooses between a 32-bit and 64-bit sub-array
/// based on the CLR element type. The sparse value carries no CLR array, so this expander
/// mirrors that choice by inspecting the supplied element value kinds: if any element is an
/// <see cref="MxValue.KindOneofCase.Int64Value"/> the whole array is emitted as
/// <see cref="MxArray.Int64Values"/>; otherwise it is emitted as
/// <see cref="MxArray.Int32Values"/> (matching a default Integer array).
/// </remarks>
internal static class SparseArrayExpander
{
/// <summary>
/// Replaces <paramref name="value"/>'s <see cref="MxValue.SparseArrayValue"/> with an
/// equivalent full <see cref="MxValue.ArrayValue"/>. If <paramref name="value"/> is not
/// a sparse array this is a no-op, so callers may invoke it unconditionally.
/// </summary>
/// <param name="value">The value to expand in place.</param>
/// <exception cref="RpcException">
/// <see cref="StatusCode.InvalidArgument"/> when the sparse payload is invalid: zero
/// total length, a total length above <see cref="Array.MaxLength"/>, an index at or
/// beyond the total length, a duplicate index, an unsupported element type, or an
/// element value whose kind does not match the declared element type.
/// </exception>
public static void Expand(MxValue value)
{
ArgumentNullException.ThrowIfNull(value);
if (value.KindCase != MxValue.KindOneofCase.SparseArrayValue)
{
return;
}
MxSparseArray sparse = value.SparseArrayValue;
MxDataType elementType = sparse.ElementDataType;
uint totalLength = sparse.TotalLength;
if (totalLength == 0)
{
throw Invalid("Sparse array total_length must be greater than zero.");
}
if (!IsSupportedElementType(elementType))
{
throw Invalid($"Sparse array element_data_type '{elementType}' is not a supported scalar element type.");
}
if (totalLength > (uint)Array.MaxLength)
{
throw Invalid(
$"Sparse array total_length {totalLength} exceeds the maximum supported array length {Array.MaxLength}.");
}
int length = (int)totalLength;
HashSet<uint> seenIndices = new();
foreach (MxSparseElement element in sparse.Elements)
{
if (element.Index >= totalLength)
{
throw Invalid(
$"Sparse array index {element.Index} is out of range for total_length {totalLength}.");
}
if (!seenIndices.Add(element.Index))
{
throw Invalid($"Sparse array has a duplicate index {element.Index}.");
}
ValidateElementKind(elementType, element);
}
MxArray array = BuildArray(elementType, length, sparse.Elements);
array.ElementDataType = elementType;
array.Dimensions.Add(totalLength);
// Assigning ArrayValue switches the oneof and clears SparseArrayValue.
value.ArrayValue = array;
}
private static MxArray BuildArray(
MxDataType elementType,
int length,
IReadOnlyList<MxSparseElement> elements)
{
MxArray array = new();
switch (elementType)
{
case MxDataType.Boolean:
{
BoolArray values = new();
for (int i = 0; i < length; i++)
{
values.Values.Add(false);
}
foreach (MxSparseElement element in elements)
{
values.Values[(int)element.Index] = element.Value.BoolValue;
}
array.BoolValues = values;
break;
}
case MxDataType.Integer when UsesInt64(elements):
{
Int64Array values = new();
for (int i = 0; i < length; i++)
{
values.Values.Add(0L);
}
foreach (MxSparseElement element in elements)
{
values.Values[(int)element.Index] = ReadInt64(element.Value);
}
array.Int64Values = values;
break;
}
case MxDataType.Integer:
{
Int32Array values = new();
for (int i = 0; i < length; i++)
{
values.Values.Add(0);
}
foreach (MxSparseElement element in elements)
{
values.Values[(int)element.Index] = element.Value.Int32Value;
}
array.Int32Values = values;
break;
}
case MxDataType.Float:
{
FloatArray values = new();
for (int i = 0; i < length; i++)
{
values.Values.Add(0f);
}
foreach (MxSparseElement element in elements)
{
values.Values[(int)element.Index] = element.Value.FloatValue;
}
array.FloatValues = values;
break;
}
case MxDataType.Double:
{
DoubleArray values = new();
for (int i = 0; i < length; i++)
{
values.Values.Add(0d);
}
foreach (MxSparseElement element in elements)
{
values.Values[(int)element.Index] = element.Value.DoubleValue;
}
array.DoubleValues = values;
break;
}
case MxDataType.String:
{
StringArray values = new();
for (int i = 0; i < length; i++)
{
values.Values.Add(string.Empty);
}
foreach (MxSparseElement element in elements)
{
values.Values[(int)element.Index] = element.Value.StringValue;
}
array.StringValues = values;
break;
}
case MxDataType.Time:
{
TimestampArray values = new();
for (int i = 0; i < length; i++)
{
values.Values.Add(new Timestamp { Seconds = 0, Nanos = 0 });
}
foreach (MxSparseElement element in elements)
{
values.Values[(int)element.Index] = element.Value.TimestampValue;
}
array.TimestampValues = values;
break;
}
default:
// Unreachable: IsSupportedElementType gates the element type before BuildArray.
throw Invalid($"Sparse array element_data_type '{elementType}' is not supported.");
}
return array;
}
private static bool IsSupportedElementType(MxDataType elementType) => elementType switch
{
MxDataType.Boolean => true,
MxDataType.Integer => true,
MxDataType.Float => true,
MxDataType.Double => true,
MxDataType.String => true,
MxDataType.Time => true,
_ => false,
};
private static void ValidateElementKind(MxDataType elementType, MxSparseElement element)
{
MxValue.KindOneofCase kind = element.Value?.KindCase ?? MxValue.KindOneofCase.None;
bool matches = elementType switch
{
MxDataType.Boolean => kind == MxValue.KindOneofCase.BoolValue,
MxDataType.Integer => kind is MxValue.KindOneofCase.Int32Value or MxValue.KindOneofCase.Int64Value,
MxDataType.Float => kind == MxValue.KindOneofCase.FloatValue,
MxDataType.Double => kind == MxValue.KindOneofCase.DoubleValue,
MxDataType.String => kind == MxValue.KindOneofCase.StringValue,
MxDataType.Time => kind == MxValue.KindOneofCase.TimestampValue,
_ => false,
};
if (!matches)
{
throw Invalid(
$"Sparse array element at index {element.Index} has value kind '{kind}' which does not match element_data_type '{elementType}'.");
}
}
private static bool UsesInt64(IReadOnlyList<MxSparseElement> elements)
{
foreach (MxSparseElement element in elements)
{
if (element.Value.KindCase == MxValue.KindOneofCase.Int64Value)
{
return true;
}
}
return false;
}
private static long ReadInt64(MxValue value) =>
value.KindCase == MxValue.KindOneofCase.Int64Value ? value.Int64Value : value.Int32Value;
private static RpcException Invalid(string message) =>
new(new Status(StatusCode.InvalidArgument, message));
}
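The expansion performed by `SparseArrayExpander` can be summarized in a few lines. The sketch below is a simplified Python illustration, not a port: `expand_sparse` and `DEFAULTS` are hypothetical names, the `Time` element type and the Int32/Int64 selection are omitted, and per-element value-kind validation is left out.

```python
# Per-type defaults used to fill unmentioned indices (Time omitted here).
DEFAULTS = {
    "Boolean": False,
    "Integer": 0,
    "Float": 0.0,
    "Double": 0.0,
    "String": "",
}


def expand_sparse(element_type, total_length, elements):
    """Expand (index, value) pairs into a full, default-filled list.

    Unmentioned indices are reset to the type default, not preserved.
    Raises ValueError for the invalid payloads the gateway rejects.
    """
    if total_length <= 0:
        raise ValueError("total_length must be greater than zero")
    if element_type not in DEFAULTS:
        raise ValueError(f"unsupported element type {element_type!r}")

    # Validate every index before building anything.
    seen = set()
    for index, _ in elements:
        if index < 0 or index >= total_length:
            raise ValueError(f"index {index} out of range for {total_length}")
        if index in seen:
            raise ValueError(f"duplicate index {index}")
        seen.add(index)

    # Default-fill, then overlay the supplied elements.
    full = [DEFAULTS[element_type]] * total_length
    for index, value in elements:
        full[index] = value
    return full


expanded = expand_sparse("Integer", 5, [(1, 7), (3, 9)])
```

Validating all indices before allocating mirrors the order in the C# code, so an invalid payload is rejected without producing a partially filled array.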
@@ -393,4 +393,91 @@ public sealed class GatewayOptionsValidatorTests
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
Assert.True(result.Succeeded);
}
[Fact]
public void Validate_Fails_WhenDetachGraceSecondsIsNegative()
{
GatewayOptions options = CloneWithSessions(
ValidOptions(),
new SessionOptions { DetachGraceSeconds = -1 });
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
Assert.True(result.Failed);
Assert.Contains(
result.Failures!,
f => f.Contains("MxGateway:Sessions:DetachGraceSeconds"));
}
[Fact]
public void Validate_Succeeds_WhenDetachGraceSecondsIsZero()
{
GatewayOptions options = CloneWithSessions(
ValidOptions(),
new SessionOptions { DetachGraceSeconds = 0 });
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
Assert.True(result.Succeeded);
}
// -------------------------------------------------------------------------
// ReplayBufferCapacity / ReplayRetentionSeconds validation
// -------------------------------------------------------------------------
private static GatewayOptions CloneWithEvents(GatewayOptions source, EventOptions events)
=> new()
{
Authentication = source.Authentication,
Ldap = source.Ldap,
Worker = source.Worker,
Sessions = source.Sessions,
Events = events,
Dashboard = source.Dashboard,
Protocol = source.Protocol,
Alarms = source.Alarms,
Tls = source.Tls,
};
[Fact]
public void Validate_Fails_WhenReplayBufferCapacityIsNegative()
{
GatewayOptions options = CloneWithEvents(
ValidOptions(),
new EventOptions { ReplayBufferCapacity = -1 });
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
Assert.True(result.Failed);
Assert.Contains(
result.Failures!,
f => f.Contains("MxGateway:Events:ReplayBufferCapacity"));
}
[Fact]
public void Validate_Succeeds_WhenReplayBufferCapacityIsZero()
{
GatewayOptions options = CloneWithEvents(
ValidOptions(),
new EventOptions { ReplayBufferCapacity = 0 });
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
Assert.True(result.Succeeded);
}
[Fact]
public void Validate_Fails_WhenReplayRetentionSecondsIsNegative()
{
GatewayOptions options = CloneWithEvents(
ValidOptions(),
new EventOptions { ReplayRetentionSeconds = -1 });
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
Assert.True(result.Failed);
Assert.Contains(
result.Failures!,
f => f.Contains("MxGateway:Events:ReplayRetentionSeconds"));
}
[Fact]
public void Validate_Succeeds_WhenReplayRetentionSecondsIsZero()
{
GatewayOptions options = CloneWithEvents(
ValidOptions(),
new EventOptions { ReplayRetentionSeconds = 0 });
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
Assert.True(result.Succeeded);
}
}
@@ -1543,4 +1543,49 @@ public sealed class ProtobufContractRoundTripTests
Assert.Equal(AlarmProviderMode.Subtag, parsed.OnAlarmProviderModeChanged.Mode);
Assert.Equal(unchecked((int)0x80004005), parsed.OnAlarmProviderModeChanged.Hresult);
}
/// <summary>
/// Verifies that an <see cref="MxEvent"/> carrying a
/// <see cref="ReplayGap"/> (the <c>optional replay_gap = 14</c> field)
/// round-trips with both sequence fields populated, that
/// <see cref="MxEvent.BodyCase"/> remains <see cref="MxEvent.BodyOneofCase.None"/>
/// (replay_gap is not part of the body oneof), and pins the wire field
/// numbers for <c>MxEvent.replay_gap</c> (14),
/// <c>ReplayGap.requested_after_sequence</c> (1), and
/// <c>ReplayGap.oldest_available_sequence</c> (2) via the descriptor.
/// </summary>
[Fact]
public void MxEvent_RoundTripsReplayGapSentinelAndPinsFieldNumbers()
{
// ReplayGap field on MxEvent must be wire number 14.
Assert.Equal(14, MxEvent.ReplayGapFieldNumber);
// ReplayGap sub-field numbers must be pinned.
var replayGapFields = ReplayGap.Descriptor.Fields;
Assert.Equal(1, replayGapFields[ReplayGap.RequestedAfterSequenceFieldNumber].FieldNumber);
Assert.Equal("requested_after_sequence", replayGapFields[ReplayGap.RequestedAfterSequenceFieldNumber].Name);
Assert.Equal(2, replayGapFields[ReplayGap.OldestAvailableSequenceFieldNumber].FieldNumber);
Assert.Equal("oldest_available_sequence", replayGapFields[ReplayGap.OldestAvailableSequenceFieldNumber].Name);
// Build a sentinel MxEvent: replay_gap set, body oneof unset, family UNSPECIFIED.
var original = new MxEvent
{
SessionId = "session-1",
WorkerSequence = 0,
ReplayGap = new ReplayGap
{
RequestedAfterSequence = 150,
OldestAvailableSequence = 200,
},
};
var parsed = MxEvent.Parser.ParseFrom(original.ToByteArray());
Assert.Equal(original, parsed);
// replay_gap is NOT part of the body oneof — BodyCase must remain None.
Assert.Equal(MxEvent.BodyOneofCase.None, parsed.BodyCase);
Assert.NotNull(parsed.ReplayGap);
Assert.Equal(150UL, parsed.ReplayGap.RequestedAfterSequence);
Assert.Equal(200UL, parsed.ReplayGap.OldestAvailableSequence);
}
}
@@ -0,0 +1,105 @@
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
using ZB.MOM.WW.MxGateway.Server.Galaxy;
using ZB.MOM.WW.MxGateway.Server.Sessions;
namespace ZB.MOM.WW.MxGateway.Tests.Gateway.Sessions;
public sealed class ArrayAddressNormalizerTests
{
/// <summary>Verifies a bare array attribute name gains the trailing array suffix.</summary>
[Fact]
public void Normalize_BareArrayName_AppendsArraySuffix()
{
ArrayAddressNormalizer normalizer = CreateNormalizer();
Assert.Equal("Obj.Arr[]", normalizer.Normalize("Obj.Arr"));
}
/// <summary>Verifies an already-suffixed address is returned unchanged.</summary>
[Fact]
public void Normalize_AlreadySuffixed_ReturnsUnchanged()
{
ArrayAddressNormalizer normalizer = CreateNormalizer();
Assert.Equal("Obj.Arr[]", normalizer.Normalize("Obj.Arr[]"));
}
/// <summary>Verifies a scalar attribute is returned unchanged.</summary>
[Fact]
public void Normalize_ScalarAttribute_ReturnsUnchanged()
{
ArrayAddressNormalizer normalizer = CreateNormalizer();
Assert.Equal("Obj.Scalar", normalizer.Normalize("Obj.Scalar"));
}
/// <summary>Verifies an address absent from the cache is returned unchanged.</summary>
[Fact]
public void Normalize_UnknownAddress_ReturnsUnchanged()
{
ArrayAddressNormalizer normalizer = CreateNormalizer();
Assert.Equal("Obj.Unknown", normalizer.Normalize("Obj.Unknown"));
}
/// <summary>Verifies empty and whitespace-only addresses are returned unchanged.</summary>
[Theory]
[InlineData("")]
[InlineData(" ")]
public void Normalize_BlankAddress_ReturnsUnchanged(string address)
{
ArrayAddressNormalizer normalizer = CreateNormalizer();
Assert.Equal(address, normalizer.Normalize(address));
}
private static ArrayAddressNormalizer CreateNormalizer()
{
IReadOnlyList<GalaxyObject> objects =
[
new GalaxyObject
{
GobjectId = 1,
TagName = "Obj",
ContainedName = "Obj",
Attributes =
{
new GalaxyAttribute
{
AttributeName = "Arr",
// Galaxy SQL already appends "[]" to array attribute references.
FullTagReference = "Obj.Arr[]",
IsArray = true,
},
new GalaxyAttribute
{
AttributeName = "Scalar",
FullTagReference = "Obj.Scalar",
IsArray = false,
},
},
},
];
GalaxyHierarchyCacheEntry entry = GalaxyHierarchyCacheEntry.Empty with
{
Status = GalaxyCacheStatus.Healthy,
Objects = objects,
Index = GalaxyHierarchyIndex.Build(objects),
};
return new ArrayAddressNormalizer(new StubGalaxyHierarchyCache(entry));
}
private sealed class StubGalaxyHierarchyCache(GalaxyHierarchyCacheEntry current) : IGalaxyHierarchyCache
{
/// <summary>Gets the current cache entry.</summary>
public GalaxyHierarchyCacheEntry Current { get; } = current;
/// <inheritdoc />
public Task RefreshAsync(CancellationToken cancellationToken) => Task.CompletedTask;
/// <inheritdoc />
public Task WaitForFirstLoadAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}
}
@@ -0,0 +1,373 @@
using System.Runtime.CompilerServices;
using ZB.MOM.WW.MxGateway.Contracts.Proto;
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
using ZB.MOM.WW.MxGateway.Server.Galaxy;
using ZB.MOM.WW.MxGateway.Server.Sessions;
using ZB.MOM.WW.MxGateway.Server.Workers;
namespace ZB.MOM.WW.MxGateway.Tests.Gateway.Sessions;
/// <summary>
/// Integration coverage for the single outbound choke point
/// (<see cref="GatewaySession.InvokeAsync(WorkerCommand, System.Threading.CancellationToken)"/>):
/// array <c>AddItem</c> addresses gain the writable <c>[]</c> suffix and sparse array writes are
/// expanded to whole-array values before any command reaches the worker.
/// </summary>
public sealed class GatewayArrayWriteWiringTests
{
/// <summary>
/// A bare array <c>AddItem</c> address is normalized to its writable array form on the wire,
/// and the normalized address lands in the tracked <see cref="SessionItemRegistration"/>.
/// </summary>
[Fact]
public async Task AddItem_BareArrayAddress_NormalizedOnWireAndInRegistration()
{
CapturingWorkerClient worker = new();
GatewaySession session = CreateReadySession(worker);
WorkerCommand command = new()
{
Command = new MxCommand
{
Kind = MxCommandKind.AddItem,
AddItem = new AddItemCommand
{
ServerHandle = 1,
ItemDefinition = "Obj.Arr",
},
},
};
worker.NextReply = new WorkerCommandReply
{
Reply = new MxCommandReply
{
Kind = MxCommandKind.AddItem,
ProtocolStatus = new ProtocolStatus { Code = ProtocolStatusCode.Ok },
AddItem = new AddItemReply { ItemHandle = 42 },
},
};
await session.InvokeAsync(command, CancellationToken.None);
Assert.NotNull(worker.LastCommand);
Assert.Equal("Obj.Arr[]", worker.LastCommand!.Command.AddItem.ItemDefinition);
// Track the reply through the same path the gRPC service uses; the registration must carry the
// normalized address even though the reply is tracked against a separate copy holding the bare address.
MxCommand trackingCopy = new()
{
Kind = MxCommandKind.AddItem,
AddItem = new AddItemCommand
{
ServerHandle = 1,
ItemDefinition = "Obj.Arr",
},
};
session.TrackCommandReply(trackingCopy, worker.NextReply.Reply);
Assert.True(session.TryGetItemRegistration(1, 42, out SessionItemRegistration registration));
Assert.Equal("Obj.Arr[]", registration.TagAddress);
}
/// <summary>A bare scalar <c>AddItem</c> address is forwarded unchanged.</summary>
[Fact]
public async Task AddItem_ScalarAddress_ForwardedUnchanged()
{
CapturingWorkerClient worker = new();
GatewaySession session = CreateReadySession(worker);
WorkerCommand command = new()
{
Command = new MxCommand
{
Kind = MxCommandKind.AddItem,
AddItem = new AddItemCommand
{
ServerHandle = 1,
ItemDefinition = "Obj.Scalar",
},
},
};
await session.InvokeAsync(command, CancellationToken.None);
Assert.Equal("Obj.Scalar", worker.LastCommand!.Command.AddItem.ItemDefinition);
}
/// <summary>
/// A sparse-array <see cref="WriteCommand"/> value is expanded to a full, default-filled
/// <see cref="MxArray"/> before reaching the worker; no sparse value is ever forwarded.
/// </summary>
[Fact]
public async Task Write_SparseArrayValue_ExpandedBeforeReachingWorker()
{
CapturingWorkerClient worker = new();
GatewaySession session = CreateReadySession(worker);
WorkerCommand command = new()
{
Command = new MxCommand
{
Kind = MxCommandKind.Write,
Write = new WriteCommand
{
ServerHandle = 1,
ItemHandle = 42,
Value = new MxValue
{
SparseArrayValue = new MxSparseArray
{
ElementDataType = MxDataType.Integer,
TotalLength = 4,
Elements =
{
new MxSparseElement
{
Index = 1,
Value = new MxValue { Int32Value = 7 },
},
},
},
},
},
},
};
await session.InvokeAsync(command, CancellationToken.None);
MxValue forwarded = worker.LastCommand!.Command.Write.Value;
Assert.Equal(MxValue.KindOneofCase.ArrayValue, forwarded.KindCase);
Assert.Equal(new[] { 0, 7, 0, 0 }, forwarded.ArrayValue.Int32Values.Values);
}
/// <summary>
/// A bare array <c>AddItem2</c> address is normalized to its writable array form on the wire,
/// and the normalized address lands in the tracked <see cref="SessionItemRegistration"/>.
/// </summary>
[Fact]
public async Task AddItem2_BareArrayAddress_NormalizedOnWireAndInRegistration()
{
CapturingWorkerClient worker = new();
GatewaySession session = CreateReadySession(worker);
WorkerCommand command = new()
{
Command = new MxCommand
{
Kind = MxCommandKind.AddItem2,
AddItem2 = new AddItem2Command
{
ServerHandle = 1,
ItemDefinition = "Obj.Arr",
},
},
};
worker.NextReply = new WorkerCommandReply
{
Reply = new MxCommandReply
{
Kind = MxCommandKind.AddItem2,
ProtocolStatus = new ProtocolStatus { Code = ProtocolStatusCode.Ok },
AddItem2 = new AddItem2Reply { ItemHandle = 43 },
},
};
await session.InvokeAsync(command, CancellationToken.None);
Assert.NotNull(worker.LastCommand);
Assert.Equal("Obj.Arr[]", worker.LastCommand!.Command.AddItem2.ItemDefinition);
MxCommand trackingCopy = new()
{
Kind = MxCommandKind.AddItem2,
AddItem2 = new AddItem2Command
{
ServerHandle = 1,
ItemDefinition = "Obj.Arr",
},
};
session.TrackCommandReply(trackingCopy, worker.NextReply.Reply);
Assert.True(session.TryGetItemRegistration(1, 43, out SessionItemRegistration registration));
Assert.Equal("Obj.Arr[]", registration.TagAddress);
}
/// <summary>
/// A sparse-array entry in a <see cref="WriteBulkCommand"/> is expanded to a full,
/// default-filled <see cref="MxArray"/> before reaching the worker; no sparse value is ever
/// forwarded inside a bulk write.
/// </summary>
[Fact]
public async Task WriteBulk_SparseArrayEntryValue_ExpandedBeforeReachingWorker()
{
CapturingWorkerClient worker = new();
GatewaySession session = CreateReadySession(worker);
WorkerCommand command = new()
{
Command = new MxCommand
{
Kind = MxCommandKind.WriteBulk,
WriteBulk = new WriteBulkCommand
{
ServerHandle = 1,
Entries =
{
new WriteBulkEntry
{
ItemHandle = 42,
Value = new MxValue
{
SparseArrayValue = new MxSparseArray
{
ElementDataType = MxDataType.Integer,
TotalLength = 4,
Elements =
{
new MxSparseElement
{
Index = 1,
Value = new MxValue { Int32Value = 7 },
},
},
},
},
},
},
},
},
};
await session.InvokeAsync(command, CancellationToken.None);
MxValue forwarded = worker.LastCommand!.Command.WriteBulk.Entries[0].Value;
Assert.Equal(MxValue.KindOneofCase.ArrayValue, forwarded.KindCase);
Assert.Equal(new[] { 0, 7, 0, 0 }, forwarded.ArrayValue.Int32Values.Values);
}
private static GatewaySession CreateReadySession(IWorkerClient workerClient)
{
GatewaySession session = new(
sessionId: "session-array-write-wiring",
backendName: "mxaccess",
pipeName: "mxaccess-gateway-1-session-array-write-wiring",
nonce: "nonce",
clientIdentity: "client-1",
ownerKeyId: null,
clientSessionName: "test-session",
clientCorrelationId: "client-correlation-1",
commandTimeout: TimeSpan.FromSeconds(5),
startupTimeout: TimeSpan.FromSeconds(5),
shutdownTimeout: TimeSpan.FromSeconds(5),
leaseDuration: TimeSpan.FromMinutes(30),
openedAt: DateTimeOffset.UtcNow,
addressNormalizer: CreateNormalizer());
session.AttachWorkerClient(workerClient);
session.MarkReady();
return session;
}
private static ArrayAddressNormalizer CreateNormalizer()
{
IReadOnlyList<GalaxyObject> objects =
[
new GalaxyObject
{
GobjectId = 1,
TagName = "Obj",
ContainedName = "Obj",
Attributes =
{
new GalaxyAttribute
{
AttributeName = "Arr",
FullTagReference = "Obj.Arr[]",
IsArray = true,
},
new GalaxyAttribute
{
AttributeName = "Scalar",
FullTagReference = "Obj.Scalar",
IsArray = false,
},
},
},
];
GalaxyHierarchyCacheEntry entry = GalaxyHierarchyCacheEntry.Empty with
{
Status = GalaxyCacheStatus.Healthy,
Objects = objects,
Index = GalaxyHierarchyIndex.Build(objects),
};
return new ArrayAddressNormalizer(new StubGalaxyHierarchyCache(entry));
}
private sealed class StubGalaxyHierarchyCache(GalaxyHierarchyCacheEntry current) : IGalaxyHierarchyCache
{
/// <summary>Gets the current cache entry.</summary>
public GalaxyHierarchyCacheEntry Current { get; } = current;
/// <inheritdoc />
public Task RefreshAsync(CancellationToken cancellationToken) => Task.CompletedTask;
/// <inheritdoc />
public Task WaitForFirstLoadAsync(CancellationToken cancellationToken) => Task.CompletedTask;
}
private sealed class CapturingWorkerClient : IWorkerClient
{
/// <summary>Gets the most recent command forwarded to the worker.</summary>
public WorkerCommand? LastCommand { get; private set; }
/// <summary>Gets or sets the reply returned by the next invocation.</summary>
public WorkerCommandReply NextReply { get; set; } = new();
/// <summary>Gets the session identifier.</summary>
public string SessionId { get; } = "session-array-write-wiring";
/// <summary>Gets the worker process identifier.</summary>
public int? ProcessId { get; } = 1234;
/// <summary>Gets the worker client state.</summary>
public WorkerClientState State { get; } = WorkerClientState.Ready;
/// <summary>Gets the last recorded heartbeat timestamp.</summary>
public DateTimeOffset LastHeartbeatAt { get; } = DateTimeOffset.UtcNow;
/// <inheritdoc />
public Task StartAsync(CancellationToken cancellationToken) => Task.CompletedTask;
/// <inheritdoc />
public Task<WorkerCommandReply> InvokeAsync(
WorkerCommand command,
TimeSpan timeout,
CancellationToken cancellationToken)
{
LastCommand = command;
return Task.FromResult(NextReply);
}
/// <inheritdoc />
public async IAsyncEnumerable<WorkerEvent> ReadEventsAsync(
[EnumeratorCancellation] CancellationToken cancellationToken)
{
await Task.CompletedTask.ConfigureAwait(false);
yield break;
}
/// <inheritdoc />
public Task ShutdownAsync(TimeSpan timeout, CancellationToken cancellationToken) => Task.CompletedTask;
/// <inheritdoc />
public void Kill(string reason)
{
}
/// <inheritdoc />
public ValueTask DisposeAsync() => ValueTask.CompletedTask;
}
}
@@ -67,6 +67,12 @@ public sealed class GatewaySessionDashboardMirrorTests
workerClient.Events.Add(CreateWorkerEvent(2, MxEventFamily.OnDataChange));
workerClient.Events.Add(CreateWorkerEvent(3, MxEventFamily.OnWriteComplete));
workerClient.CompleteAfterConfiguredEvents = true;
// Hold the worker stream until BOTH subscribers are attached so neither misses an event.
// MarkReady registers the internal dashboard subscriber and starts the pump, which then
// blocks on the gate; the gRPC subscriber attaches below; only then is the finite stream
// released. Without this gate the pump can drain all three events before the gRPC
// subscriber registers — a register-vs-pump race that otherwise makes this test flaky.
workerClient.HoldEventsUntilReleased();
RecordingDashboardEventBroadcaster broadcaster = new();
await using GatewaySession session = CreateSession(workerClient, broadcaster);
@@ -79,13 +85,22 @@ public sealed class GatewaySessionDashboardMirrorTests
new GatewayMetrics());
List<MxEvent> grpcEvents = [];
await foreach (MxEvent mxEvent in service
.StreamEventsAsync(new StreamEventsRequest { SessionId = session.SessionId }, CancellationToken.None)
.WithCancellation(CancellationToken.None))
Task grpcReader = Task.Run(async () =>
{
grpcEvents.Add(mxEvent);
}
await foreach (MxEvent mxEvent in service
.StreamEventsAsync(new StreamEventsRequest { SessionId = session.SessionId }, CancellationToken.None)
.WithCancellation(CancellationToken.None))
{
grpcEvents.Add(mxEvent);
}
});
// The gRPC subscriber counts against ActiveEventSubscriberCount (the internal dashboard
// mirror does not), so count == 1 confirms it has attached. Only then release the stream.
await WaitUntilAsync(() => session.ActiveEventSubscriberCount == 1);
workerClient.ReleaseEvents();
await grpcReader.WaitAsync(TestTimeout);
await WaitUntilAsync(() => broadcaster.Captures.Count == 3);
Assert.Equal([1UL, 2UL, 3UL], grpcEvents.Select(mxEvent => mxEvent.WorkerSequence).ToArray());
@@ -280,6 +295,24 @@ public sealed class GatewaySessionDashboardMirrorTests
public bool CompleteAfterConfiguredEvents { get; set; }
// Gate that holds the event stream before it yields anything. Released by default, so
// ungated tests are unaffected. HoldEventsUntilReleased() makes ReadEventsAsync block
// until ReleaseEvents(), letting a test attach every subscriber before a finite,
// fast-completing stream drains (avoids a register-vs-pump race).
private TaskCompletionSource _releaseGate = CreateReleasedGate();
private static TaskCompletionSource CreateReleasedGate()
{
TaskCompletionSource gate = new(TaskCreationOptions.RunContinuationsAsynchronously);
gate.SetResult();
return gate;
}
public void HoldEventsUntilReleased() =>
_releaseGate = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
public void ReleaseEvents() => _releaseGate.TrySetResult();
public string SessionId { get; } = "session-dashboard-mirror";
public int? ProcessId { get; } = 1234;
@@ -298,6 +331,9 @@ public sealed class GatewaySessionDashboardMirrorTests
public async IAsyncEnumerable<WorkerEvent> ReadEventsAsync(
[EnumeratorCancellation] CancellationToken cancellationToken)
{
// Block before yielding any event until released (ungated by default).
await _releaseGate.Task.WaitAsync(cancellationToken).ConfigureAwait(false);
foreach (WorkerEvent workerEvent in Events)
{
cancellationToken.ThrowIfCancellationRequested();
@@ -545,6 +545,43 @@ public sealed class GatewaySessionTests
Assert.False(session.IsDetachGraceExpired(clock.GetUtcNow()));
}
/// <summary>
/// Server-055 regression. A FAILED first attach (the distributor never registered a
/// subscriber) must NOT enter the detach-grace window: the catch path's
/// <c>DetachEventSubscriber</c> rolls the reserved slot back to 0 but must not stamp
/// <c>DetachedAtUtc</c>, because the "last subscriber dropped" semantics only apply once
/// a subscriber was successfully registered. A freshly-Ready session whose first attach
/// failed must therefore stay out of grace and never become sweep-eligible on that basis.
/// </summary>
[Fact]
public async Task DetachGrace_FailedFirstAttach_DoesNotEnterGrace()
{
FakeTimeProvider clock = new(DateTimeOffset.UtcNow);
FakeWorkerClient workerClient = new();
// QueueCapacity = 0 makes the distributor constructor throw ArgumentOutOfRangeException
// inside StartDistributorAndRegister, so the very first AttachEventSubscriber fails after
// it reserved a slot — exercising the catch → DetachEventSubscriber rollback path.
await using GatewaySession session = CreateReadySessionWithDetachGrace(
workerClient,
clock,
detachGrace: TimeSpan.FromSeconds(30),
queueCapacity: 0);
Assert.ThrowsAny<ArgumentException>(
() => session.AttachEventSubscriber(maxSubscribers: 1));
// The reserved slot was rolled back, but no successful subscriber ever existed, so the
// session must NOT have entered detach-grace.
Assert.Equal(SessionState.Ready, session.State);
Assert.Equal(0, session.ActiveEventSubscriberCount);
Assert.Null(session.DetachedAtUtc);
// And it must never become detach-grace-eligible no matter how far the clock advances.
clock.Advance(TimeSpan.FromHours(1));
Assert.False(session.IsDetachGraceExpired(clock.GetUtcNow()));
}
/// <summary>
/// Task 11. The gateway-owned internal dashboard subscriber must NOT keep a session out
/// of detach-grace: with only the dashboard mirror attached (and no external gRPC
@@ -618,7 +655,8 @@ public sealed class GatewaySessionTests
IWorkerClient workerClient,
TimeProvider timeProvider,
TimeSpan detachGrace,
IDashboardEventBroadcaster? dashboardBroadcaster = null)
IDashboardEventBroadcaster? dashboardBroadcaster = null,
int queueCapacity = 8)
{
GatewaySession session = new(
sessionId: "session-test-detach-grace",
@@ -636,7 +674,7 @@ public sealed class GatewaySessionTests
openedAt: timeProvider.GetUtcNow(),
eventStreaming: new SessionEventStreaming(
new MxAccessGrpcMapper(),
new EventOptions { QueueCapacity = 8 },
new EventOptions { QueueCapacity = queueCapacity },
NullLogger<SessionEventDistributor>.Instance,
timeProvider,
new GatewayMetrics(),
@@ -702,16 +702,71 @@ public sealed class SessionEventDistributorTests
private static async Task DrainUntilFaultAsync(ChannelReader<MxEvent> reader)
{
// Drains any buffered events, then surfaces the channel's completion fault (if any)
// by awaiting the final read past the buffered tail.
// by awaiting the final WaitToReadAsync past the buffered tail.
// If WaitToReadAsync returns false (graceful completion rather than a fault),
// await Completion to surface any fault stored there, then Assert.Fail so the
// helper does not spin forever on a channel that completes without an exception.
while (true)
{
await reader.WaitToReadAsync().AsTask().WaitAsync(ReadTimeout);
bool hasMore = await reader.WaitToReadAsync().AsTask().WaitAsync(ReadTimeout);
if (!hasMore)
{
// Graceful completion — propagate any stored exception, then fail.
await reader.Completion;
Assert.Fail("DrainUntilFaultAsync: channel completed gracefully (no fault).");
return;
}
while (reader.TryRead(out _))
{
}
}
}
/// <summary>
/// Regression: a subscriber that registers in the window AFTER the pump has completed
/// (its event source finished) but BEFORE the distributor is disposed must have its
/// channel completed immediately, not left open forever. The pump has already run its
/// final <c>CompleteAllSubscribers</c> sweep and exited, so without the
/// register-after-completion guard the late subscriber's reader hangs indefinitely.
/// This was observed as an order-dependent hang in
/// <c>GatewaySessionDashboardMirrorTests</c>, where a gRPC subscriber attached after a
/// fast-completing worker stream had already drained.
/// </summary>
[Fact]
public async Task Register_AfterSourceCompletes_CompletesLateSubscriberInsteadOfHanging()
{
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
await using SessionEventDistributor distributor = CreateDistributor(source.Reader);
await distributor.StartAsync(CancellationToken.None);
// An early subscriber lets us observe when the pump's final completion sweep has run.
using IEventSubscriberLease early = distributor.Register();
// Complete the source: the pump drains it, runs CompleteAllSubscribers, and exits.
source.Writer.Complete();
// Draining the early subscriber to completion proves the pump finished its sweep — so
// a subscriber registering now is unambiguously in the register-after-completion window.
using (CancellationTokenSource earlyCts = new(ReadTimeout))
{
await foreach (MxEvent _ in early.Reader.ReadAllAsync(earlyCts.Token))
{
}
}
// Register AFTER the pump has completed. The channel must be completed immediately; the
// bounded read below must end rather than hang (the ReadTimeout converts a regression
// into a fast OperationCanceledException failure instead of an indefinite hang).
using IEventSubscriberLease late = distributor.Register();
using CancellationTokenSource lateCts = new(ReadTimeout);
await foreach (MxEvent _ in late.Reader.ReadAllAsync(lateCts.Token))
{
}
Assert.False(lateCts.IsCancellationRequested);
}
private static SessionEventDistributor CreateDistributor(ChannelReader<MxEvent> source)
=> CreateDistributor(source, replayBufferCapacity: 1024, replayRetentionSeconds: 300);
@@ -0,0 +1,210 @@
using Google.Protobuf.WellKnownTypes;
using Grpc.Core;
using ZB.MOM.WW.MxGateway.Contracts.Proto;
using ZB.MOM.WW.MxGateway.Server.Sessions;
namespace ZB.MOM.WW.MxGateway.Tests.Gateway.Sessions;
public sealed class SparseArrayExpanderTests
{
private static MxValue SparseValue(
MxDataType elementType,
uint totalLength,
params (uint Index, MxValue Value)[] elements)
{
MxSparseArray sparse = new()
{
ElementDataType = elementType,
TotalLength = totalLength,
};
foreach ((uint index, MxValue value) in elements)
{
sparse.Elements.Add(new MxSparseElement { Index = index, Value = value });
}
return new MxValue { SparseArrayValue = sparse };
}
[Fact]
public void Expand_Int32_FillsDefaultsAndSetsElement()
{
MxValue value = SparseValue(
MxDataType.Integer,
4,
(1, new MxValue { Int32Value = 7 }));
SparseArrayExpander.Expand(value);
Assert.Equal(MxValue.KindOneofCase.ArrayValue, value.KindCase);
Assert.Equal(MxDataType.Integer, value.ArrayValue.ElementDataType);
Assert.Equal(new uint[] { 4 }, value.ArrayValue.Dimensions);
Assert.Equal(MxArray.ValuesOneofCase.Int32Values, value.ArrayValue.ValuesCase);
Assert.Equal(new[] { 0, 7, 0, 0 }, value.ArrayValue.Int32Values.Values);
}
[Fact]
public void Expand_Boolean_EmptyElements_AllDefaultFalse()
{
MxValue value = SparseValue(MxDataType.Boolean, 3);
SparseArrayExpander.Expand(value);
Assert.Equal(MxArray.ValuesOneofCase.BoolValues, value.ArrayValue.ValuesCase);
Assert.Equal(new[] { false, false, false }, value.ArrayValue.BoolValues.Values);
}
[Fact]
public void Expand_ZeroTotalLength_Throws()
{
MxValue value = SparseValue(MxDataType.Integer, 0);
RpcException ex = Assert.Throws<RpcException>(() => SparseArrayExpander.Expand(value));
Assert.Equal(StatusCode.InvalidArgument, ex.StatusCode);
}
[Fact]
public void Expand_IndexOutOfRange_Throws()
{
MxValue value = SparseValue(
MxDataType.Integer,
2,
(5, new MxValue { Int32Value = 1 }));
RpcException ex = Assert.Throws<RpcException>(() => SparseArrayExpander.Expand(value));
Assert.Equal(StatusCode.InvalidArgument, ex.StatusCode);
}
[Fact]
public void Expand_DuplicateIndex_Throws()
{
MxValue value = SparseValue(
MxDataType.Integer,
4,
(1, new MxValue { Int32Value = 1 }),
(1, new MxValue { Int32Value = 2 }));
RpcException ex = Assert.Throws<RpcException>(() => SparseArrayExpander.Expand(value));
Assert.Equal(StatusCode.InvalidArgument, ex.StatusCode);
}
[Fact]
public void Expand_UnsupportedElementType_Throws()
{
MxValue value = SparseValue(MxDataType.Unspecified, 2);
RpcException ex = Assert.Throws<RpcException>(() => SparseArrayExpander.Expand(value));
Assert.Equal(StatusCode.InvalidArgument, ex.StatusCode);
}
[Fact]
public void Expand_ElementValueKindMismatch_Throws()
{
MxValue value = SparseValue(
MxDataType.Integer,
2,
(0, new MxValue { StringValue = "nope" }));
RpcException ex = Assert.Throws<RpcException>(() => SparseArrayExpander.Expand(value));
Assert.Equal(StatusCode.InvalidArgument, ex.StatusCode);
}
[Fact]
public void Expand_String_FillsEmptyStringDefault()
{
MxValue value = SparseValue(
MxDataType.String,
2,
(0, new MxValue { StringValue = "a" }));
SparseArrayExpander.Expand(value);
Assert.Equal(MxArray.ValuesOneofCase.StringValues, value.ArrayValue.ValuesCase);
Assert.Equal(new[] { "a", string.Empty }, value.ArrayValue.StringValues.Values);
}
[Fact]
public void Expand_Time_FillsEpochDefault()
{
MxValue value = SparseValue(
MxDataType.Time,
2,
(1, new MxValue { TimestampValue = new Timestamp { Seconds = 5 } }));
SparseArrayExpander.Expand(value);
Assert.Equal(MxArray.ValuesOneofCase.TimestampValues, value.ArrayValue.ValuesCase);
Assert.Equal(2, value.ArrayValue.TimestampValues.Values.Count);
Assert.Equal(0, value.ArrayValue.TimestampValues.Values[0].Seconds);
Assert.Equal(0, value.ArrayValue.TimestampValues.Values[0].Nanos);
Assert.Equal(5, value.ArrayValue.TimestampValues.Values[1].Seconds);
}
[Fact]
public void Expand_Double_HappyPath()
{
MxValue value = SparseValue(
MxDataType.Double,
3,
(2, new MxValue { DoubleValue = 1.5 }));
SparseArrayExpander.Expand(value);
Assert.Equal(MxArray.ValuesOneofCase.DoubleValues, value.ArrayValue.ValuesCase);
Assert.Equal(new[] { 0d, 0d, 1.5 }, value.ArrayValue.DoubleValues.Values);
}
[Fact]
public void Expand_Float_HappyPath()
{
MxValue value = SparseValue(
MxDataType.Float,
2,
(0, new MxValue { FloatValue = 2.5f }));
SparseArrayExpander.Expand(value);
Assert.Equal(MxArray.ValuesOneofCase.FloatValues, value.ArrayValue.ValuesCase);
Assert.Equal(new[] { 2.5f, 0f }, value.ArrayValue.FloatValues.Values);
}
[Fact]
public void Expand_Int64_WhenElementIsInt64()
{
MxValue value = SparseValue(
MxDataType.Integer,
3,
(2, new MxValue { Int64Value = 9_000_000_000L }));
SparseArrayExpander.Expand(value);
Assert.Equal(MxArray.ValuesOneofCase.Int64Values, value.ArrayValue.ValuesCase);
Assert.Equal(new[] { 0L, 0L, 9_000_000_000L }, value.ArrayValue.Int64Values.Values);
}
[Fact]
public void Expand_NonSparseValue_NoOps()
{
MxValue value = new() { Int32Value = 42 };
SparseArrayExpander.Expand(value);
Assert.Equal(MxValue.KindOneofCase.Int32Value, value.KindCase);
Assert.Equal(42, value.Int32Value);
}
[Fact]
public void Expand_NullValue_ThrowsArgumentNull()
{
Assert.Throws<ArgumentNullException>(() => SparseArrayExpander.Expand(null!));
}
[Fact]
public void Expand_TotalLengthExceedsMaxArrayLength_Throws()
{
MxValue value = SparseValue(MxDataType.Integer, 2_147_483_648u);
RpcException ex = Assert.Throws<RpcException>(() => SparseArrayExpander.Expand(value));
Assert.Equal(StatusCode.InvalidArgument, ex.StatusCode);
}
}
@@ -207,6 +207,62 @@ public sealed class ConstraintEnforcerTests
Assert.Equal("read_historized_only", failure.ConstraintName);
}
/// <summary>
/// A bare array attribute address (no trailing <c>[]</c>) resolves through the Galaxy index
/// even though arrays are keyed by their suffixed FullTagReference (e.g. "Pump_001.Levels[]").
/// Without the <c>[]</c> fallback in ResolveTarget the bare name misses the index and a
/// read-constrained key gets a spurious tag_metadata denial for an AddItem it should allow.
/// </summary>
[Fact]
public async Task CheckReadTagAsync_WithBareArrayName_ResolvesViaArraySuffixFallback()
{
ConstraintEnforcer enforcer = CreateEnforcer(out _);
ApiKeyIdentity identity = CreateIdentity(ApiKeyConstraints.Empty with
{
// A read constraint that covers the Pump_001 subtree; the array attribute is inside it.
ReadTagGlobs = ["Pump_001.*"],
});
ConstraintFailure? failure = await enforcer.CheckReadTagAsync(
identity,
"Pump_001.Levels",
CancellationToken.None);
// Before the fix: bare "Pump_001.Levels" misses the index (keyed "Pump_001.Levels[]") and
// returns a tag_metadata failure. After the fix: it resolves and is within scope -> null.
Assert.Null(failure);
}
/// <summary>
/// A bare non-array name that is genuinely absent from the index still fails to resolve:
/// the <c>[]</c> probe must not manufacture a false positive for a scalar/missing tag.
/// </summary>
[Fact]
public async Task CheckReadTagAsync_WithMissingNonArrayName_StillFailsToResolve()
{
ConstraintEnforcer enforcer = CreateEnforcer(out _);
ApiKeyIdentity identity = CreateIdentity(ApiKeyConstraints.Empty with
{
ReadTagGlobs = ["Pump_001.*"],
});
// "Pump_001.Scalar" is not in the index, and "Pump_001.Scalar[]" is not either, so the
// suffix probe must not resolve it. A genuinely-unknown name behaves the same.
ConstraintFailure? missingScalar = await enforcer.CheckReadTagAsync(
identity,
"Pump_001.Scalar",
CancellationToken.None);
Assert.NotNull(missingScalar);
Assert.Equal("tag_metadata", missingScalar.ConstraintName);
ConstraintFailure? unknown = await enforcer.CheckReadTagAsync(
identity,
"DoesNotExist_999.Whatever",
CancellationToken.None);
Assert.NotNull(unknown);
Assert.Equal("tag_metadata", unknown.ConstraintName);
}
private static ConstraintEnforcer CreateEnforcer(out FakeAuditWriter auditWriter)
{
auditWriter = new FakeAuditWriter();
@@ -276,6 +332,14 @@ public sealed class ConstraintEnforcerTests
AttributeName = "NonHistorized",
FullTagReference = "Pump_001.NonHistorized",
},
new GalaxyAttribute
{
// Galaxy SQL keys array attributes by their suffixed FullTagReference,
// so the index entry is "Pump_001.Levels[]", not the bare name.
AttributeName = "Levels",
FullTagReference = "Pump_001.Levels[]",
IsArray = true,
},
},
},
new GalaxyObject
@@ -980,7 +980,11 @@ public sealed class WorkerPipeSessionTests
Task runTask = session.RunAsync(cancellation.Token);
await CompleteGatewayHandshakeAsync(pipePair, cancellation.Token);
DateTimeOffset start = DateTimeOffset.UtcNow;
// The heartbeatWait CTS (5s cancel-after) already enforces the timing bound:
// if the first heartbeat is not received within 5s, ReadUntilAsync throws
// OperationCanceledException and the test fails. A redundant wall-clock
// elapsed < 5s assertion would reintroduce the same class of flakiness that
// Workers.Tests-003/004/013/020 corrected elsewhere, so it is omitted here.
using CancellationTokenSource heartbeatWait = CancellationTokenSource
.CreateLinkedTokenSource(cancellation.Token);
heartbeatWait.CancelAfter(TimeSpan.FromSeconds(5));
@@ -988,12 +992,8 @@ public sealed class WorkerPipeSessionTests
pipePair.GatewayReader,
WorkerEnvelope.BodyOneofCase.WorkerHeartbeat,
heartbeatWait.Token);
TimeSpan elapsed = DateTimeOffset.UtcNow - start;
Assert.Equal(WorkerEnvelope.BodyOneofCase.WorkerHeartbeat, heartbeat.BodyCase);
Assert.True(
elapsed < TimeSpan.FromSeconds(5),
$"First heartbeat took {elapsed}, expected well under the 30s interval.");
await SendShutdownAndWaitAsync(pipePair, runTask, cancellation.Token);
}
@@ -1123,6 +1123,36 @@ public sealed class MxAccessCommandExecutorTests
Assert.Equal(500, fakeComObject.SetBufferedUpdateIntervalValue);
}
/// <summary>
/// Verifies that a command with an unknown <see cref="MxCommandKind"/> value returns an
/// <see cref="ProtocolStatusCode.InvalidRequest"/> reply whose diagnostic contains "Unsupported".
/// This pins the <c>_ =&gt; CreateInvalidRequestReply(...)</c> discard arm in
/// <c>MxAccessCommandExecutor.Execute</c>: a regression that changed the arm to
/// <c>throw</c> would propagate an unhandled exception through <c>WorkerPipeSession</c>
/// and no other test would catch it.
/// </summary>
[Fact]
public async Task DispatchAsync_WithUnknownCommandKind_ReturnsInvalidRequestWithUnsupportedDiagnostic()
{
FakeMxAccessComObjectFactory factory = new(new FakeMxAccessComObject(registerHandle: 999));
using StaRuntime runtime = CreateRuntime();
using MxAccessStaSession session = new(runtime, factory, new NoopEventSink());
await session.StartAsync(workerProcessId: 1234);
// Cast an integer outside the defined MxCommandKind range to produce an unknown kind value.
MxCommandKind unknownKind = (MxCommandKind)int.MaxValue;
MxCommandReply reply = await session.DispatchAsync(new StaCommand(
"session-1",
"unknown-kind-correlation",
new MxCommand
{
Kind = unknownKind,
}));
Assert.Equal(ProtocolStatusCode.InvalidRequest, reply.ProtocolStatus.Code);
Assert.Contains("Unsupported", reply.DiagnosticMessage, StringComparison.OrdinalIgnoreCase);
}
private static StaCommand CreateSuspendCommand(
string correlationId,
int serverHandle,
@@ -2229,21 +2259,6 @@ public sealed class MxAccessCommandExecutorTests
SetBufferedUpdateIntervalValue = updateIntervalMilliseconds;
}
/// <summary>Status stand-in reflected over by the worker's MxStatusProxy converter.</summary>
internal sealed class FakeMxStatus
{
/// <summary>Success indicator read by the status converter.</summary>
public int success;
/// <summary>Status category read by the status converter.</summary>
public int category;
/// <summary>Status detected-by read by the status converter.</summary>
public int detectedBy;
/// <summary>Status detail read by the status converter.</summary>
public int detail;
}
}
/// <summary>Factory for creating fake MXAccess COM objects in tests.</summary>
@@ -94,6 +94,11 @@ internal sealed class NoopMxAccessServer : IMxAccessServer
/// <c>success</c>, <c>category</c>, <c>detectedBy</c>, and <c>detail</c>
/// fields, so this fake exposes the same field shape with all-OK values.
/// </summary>
/// <remarks>
/// Previously duplicated as a nested class in <c>MxAccessCommandExecutorTests.FakeMxAccessComObject</c>.
/// Consolidated here per Worker.Tests-034 so a future field change to the real COM struct only
/// requires updating one place.
/// </remarks>
internal sealed class FakeMxStatus
{
// These public fields exist solely so MxStatusProxyConverter can reflect