Compare commits
47 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 7b12eebbd1 | |
| | bd190ab012 | |
| | 2ead9bc200 | |
| | 1ea08c3b10 | |
| | 4f43733b96 | |
| | 039111ca05 | |
| | 61627fc5b0 | |
| | 7f1018bac1 | |
| | c2c518862f | |
| | e962737d2c | |
| | 7773bdebbd | |
| | c79b292968 | |
| | a43b2ee6af | |
| | f5479f3ca3 | |
| | 00c849e63b | |
| | 3fc6ccad30 | |
| | 0e4843612b | |
| | a56ce0ddbd | |
| | f7ada90359 | |
| | efd99718d7 | |
| | b298ca74be | |
| | 0d5b488c11 | |
| | bb5139fec2 | |
| | dde9934e60 | |
| | 29399325d5 | |
| | f94c206489 | |
| | 72e1aca716 | |
| | bf72cd8961 | |
| | 5a7f8ace77 | |
| | c10faa2ee5 | |
| | 7975b09325 | |
| | d7e2a8b3cf | |
| | 39ec2a3275 | |
| | 8cb416ba30 | |
| | 55526d5e56 | |
| | a59fc998e3 | |
| | 539e6ef2de | |
| | 742ced7970 | |
| | bd46ba1270 | |
| | 0032d2dc44 | |
| | 8415f35abd | |
| | 639e36b1bc | |
| | 90529dce6e | |
| | a211faefed | |
| | 849f1d2f6d | |
| | 883557fc8a | |
| | 4a00b1bdc1 | |
@@ -8,10 +8,10 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 The architecture is a two-process design — read `gateway.md` before making structural changes:

-- **Gateway** (`src/MxGateway.Server`, .NET 10, x64): ASP.NET Core gRPC server. Owns the public API, sessions, auth, the Blazor dashboard, and the Galaxy Repository SQL browse RPCs. **Never instantiates MXAccess COM directly.**
-- **Worker** (`src/MxGateway.Worker`, .NET Framework 4.8, **x86**): one process per session. Owns one MXAccess COM instance on a dedicated STA, pumps Windows messages, and converts COM events to protobuf.
+- **Gateway** (`src/ZB.MOM.WW.MxGateway.Server`, .NET 10, x64): ASP.NET Core gRPC server. Owns the public API, sessions, auth, the Blazor dashboard, and the Galaxy Repository SQL browse RPCs. **Never instantiates MXAccess COM directly.**
+- **Worker** (`src/ZB.MOM.WW.MxGateway.Worker`, .NET Framework 4.8, **x86**): one process per session. Owns one MXAccess COM instance on a dedicated STA, pumps Windows messages, and converts COM events to protobuf.
 - **IPC**: gateway↔worker uses one bidirectional named pipe per worker (`mxaccess-gateway-{gatewayPid}-{sessionId}`) with length-prefixed `WorkerEnvelope` protobuf frames. Gateway hosts the pipe server and launches the worker. **gRPC is not used inside the worker** — .NET Framework 4.8 doesn't have a first-class gRPC stack.
-- **Contracts** (`src/MxGateway.Contracts`): multi-targets `net10.0;net48` and owns the `.proto` files (`mxaccess_gateway.proto`, `mxaccess_worker.proto`, `galaxy_repository.proto`). All other projects consume the generated types from here. Do not hand-edit anything under `Generated/`.
+- **Contracts** (`src/ZB.MOM.WW.MxGateway.Contracts`): multi-targets `net10.0;net48` and owns the `.proto` files (`mxaccess_gateway.proto`, `mxaccess_worker.proto`, `galaxy_repository.proto`). All other projects consume the generated types from here. Do not hand-edit anything under `Generated/`.

 The worker must do all MXAccess COM calls on its dedicated STA thread, and the STA loop must pump Windows messages (`MsgWaitForMultipleObjectsEx` + `PeekMessage`/`DispatchMessage`) so MXAccess events deliver. A plain blocking queue on an STA is not enough.

@@ -19,42 +19,42 @@ The worker must do all MXAccess COM calls on its dedicated STA thread, and the S

 ```powershell
 # Full solution build (gateway, worker, contracts, tests)
-dotnet build src/MxGateway.sln
+dotnet build src/ZB.MOM.WW.MxGateway.slnx

-# Worker must be built x86 — the gateway looks for MxGateway.Worker.exe under bin\x86
-dotnet build src/MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86
+# Worker must be built x86 — the gateway looks for ZB.MOM.WW.MxGateway.Worker.exe under bin\x86
+dotnet build src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj -p:Platform=x86

 # Gateway tests (no MXAccess required — uses FakeWorkerHarness)
-dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj
-dotnet test src/MxGateway.Worker.Tests/MxGateway.Worker.Tests.csproj -p:Platform=x86
+dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj
+dotnet test src/ZB.MOM.WW.MxGateway.Worker.Tests/ZB.MOM.WW.MxGateway.Worker.Tests.csproj -p:Platform=x86

-# Run gateway locally (defaults bound under MxGateway:* in src/MxGateway.Server/appsettings.json)
-dotnet run --project src/MxGateway.Server/MxGateway.Server.csproj
+# Run gateway locally (defaults bound under MxGateway:* in src/ZB.MOM.WW.MxGateway.Server/appsettings.json)
+dotnet run --project src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj

 # API-key admin CLI (same exe, "apikey" subcommand)
-dotnet run --project src/MxGateway.Server/MxGateway.Server.csproj -- apikey create --display-name "dev" --scopes session,invoke,event,metadata,admin
+dotnet run --project src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -- apikey create --display-name "dev" --scopes session,invoke,event,metadata,admin
 ```

 Single test by name (xUnit `--filter`):

 ```powershell
-dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~GatewayEndToEndFakeWorkerSmokeTests
+dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~GatewayEndToEndFakeWorkerSmokeTests
 ```

 Live MXAccess integration tests are **opt-in** because they need installed MXAccess COM and live provider state:

 ```powershell
 $env:MXGATEWAY_RUN_LIVE_MXACCESS_TESTS = "1"
-dotnet test src/MxGateway.IntegrationTests/MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~WorkerLiveMxAccessSmokeTests
+dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~WorkerLiveMxAccessSmokeTests
 ```

 Live LDAP tests use `MXGATEWAY_RUN_LIVE_LDAP_TESTS=1`. See `docs/GatewayTesting.md` for the full opt-in matrix and `LiveMxAccessFactAttribute` / `LiveLdapFactAttribute` for the gating logic.

 ## Clients

-Each language client is in `clients/<lang>/` with its own README. They all consume the shared `.proto` files in `src/MxGateway.Contracts/Protos`:
+Each language client is in `clients/<lang>/` with its own README. They all consume the shared `.proto` files in `src/ZB.MOM.WW.MxGateway.Contracts/Protos`:

-- `clients/dotnet`: `dotnet build clients/dotnet/MxGateway.Client.sln`
+- `clients/dotnet`: `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx`
 - `clients/python`: `python -m pip install -e ".[dev]"; python -m pytest`
 - `clients/rust`: `cargo test --workspace; cargo clippy --workspace --all-targets -- -D warnings`
 - `clients/java`: `gradle test` (Java 21)
@@ -77,7 +77,7 @@ powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1
 - **Gateway restart does not reattach orphan workers.** The first version terminates orphaned workers on startup; do not design code paths that assume reattachment.
 - **No Blazor UI component libraries.** Dashboard uses local Bootstrap CSS/JS only — do not introduce MudBlazor, Radzen, FluentUI, etc.
 - **Don't log secrets or full tag values by default.** API keys, passwords, `WriteSecured` payloads, and `AuthenticateUser` credentials must never reach logs. Value logging is opt-in and redacted.
-- **Generated code** under `src/MxGateway.Contracts/Generated/`, `clients/*/generated*/`, `clients/python/src/mxgateway/generated/`, etc., is build output. Don't hand-edit. To regenerate, build the contracts project (`dotnet build src/MxGateway.Contracts/MxGateway.Contracts.csproj`) or run the per-client generation step in that client's README.
+- **Generated code** under `src/ZB.MOM.WW.MxGateway.Contracts/Generated/`, `clients/*/generated*/`, `clients/python/src/mxgateway/generated/`, etc., is build output. Don't hand-edit. To regenerate, build the contracts project (`dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`) or run the per-client generation step in that client's README.
 - **Documentation style** (`StyleGuide.md`): PascalCase filenames, no marketing language, present tense, explain *why* not *what*.
 - **Update docs in the same change as the source.** When public APIs, contracts, configuration, build steps, security behavior, event shapes, value conversion, status mapping, or lifecycle rules change, the affected docs (`gateway.md`, `docs/`, client READMEs, design docs) must change in the same commit. Don't leave stale prose describing old behavior.

@@ -88,9 +88,9 @@ When source code changes, build and test the affected component before reporting
 | Changed area | Required verification |
 |---|---|
 | Contracts or `.proto` files | regenerate generated code, then build gateway, worker, and every generated client touched by the contract |
-| Gateway server, sessions, workers, gRPC, dashboard, metrics | `dotnet build src/MxGateway.Server` and run affected gateway / fake-worker tests |
-| Worker IPC, STA, MXAccess, conversion | `dotnet build src/MxGateway.Worker -p:Platform=x86` and run worker tests |
-| .NET client | `dotnet build clients/dotnet/MxGateway.Client.sln` and run its tests |
+| Gateway server, sessions, workers, gRPC, dashboard, metrics | `dotnet build src/ZB.MOM.WW.MxGateway.Server` and run affected gateway / fake-worker tests |
+| Worker IPC, STA, MXAccess, conversion | `dotnet build src/ZB.MOM.WW.MxGateway.Worker -p:Platform=x86` and run worker tests |
+| .NET client | `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx` and run its tests |
 | Go client | `gofmt`, `go build ./...`, `go test ./...` from `clients/go` |
 | Rust client | `cargo fmt`, `cargo check --workspace`, `cargo test --workspace`, `cargo clippy --all-targets -- -D warnings` from `clients/rust` |
 | Python client | `python -m pytest` from `clients/python` |
@@ -114,7 +114,7 @@ External analysis sources referenced by design docs:

 ## Authentication

-Gateway gRPC clients authenticate with an API key in metadata: `authorization: Bearer mxgw_<key-id>_<secret>`. Keys are stored hashed (with a peppered SHA) in a gateway-owned SQLite DB (default `C:\ProgramData\MxGateway\gateway-auth.db`). Scopes (`session`, `invoke`, `event`, `metadata`, `admin`) gate specific RPCs; missing → `Unauthenticated`, insufficient → `PermissionDenied`. The `apikey` subcommand on the server exe manages keys; see `src/MxGateway.Server/Security/Authentication/`.
+Gateway gRPC clients authenticate with an API key in metadata: `authorization: Bearer mxgw_<key-id>_<secret>`. Keys are stored hashed (with a peppered SHA) in a gateway-owned SQLite DB (default `C:\ProgramData\MxGateway\gateway-auth.db`). Scopes (`session`, `invoke`, `event`, `metadata`, `admin`) gate specific RPCs; missing → `Unauthenticated`, insufficient → `PermissionDenied`. The `apikey` subcommand on the server exe manages keys; see `src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/`.

 Dashboard auth is LDAP-backed (separate from the gRPC API-key model). `/login` binds against `MxGateway:Ldap` and maps the user's LDAP groups to `Admin` or `Viewer` via `MxGateway:Dashboard:GroupToRole`, then issues an HTTP-only secure `__Host-MxGatewayDashboard` cookie. SignalR hubs at `/hubs/{snapshot,alarms,events}` accept either the cookie or a 30-minute bearer minted at `/hubs/token`. `Dashboard:AllowAnonymousLocalhost` bypasses auth on loopback when enabled.

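Reviewer note: the Authentication paragraph in the hunk above describes a `Bearer mxgw_<key-id>_<secret>` token, peppered hashing of stored keys, and scope checks that map to `Unauthenticated` or `PermissionDenied`. The Go sketch below walks through those three steps under stated assumptions: the key id is assumed to contain no underscore, HMAC-SHA-256 is an assumed stand-in for the unspecified "peppered SHA", and `parseBearer`, `hashSecret`, and `authorize` are hypothetical names, not the gateway's API.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// apiKey holds the two parts of an mxgw_<key-id>_<secret> token.
type apiKey struct {
	KeyID  string
	Secret string
}

// parseBearer extracts key id and secret from an authorization header value.
// Assumption: the first underscore after "mxgw_" separates id from secret.
func parseBearer(header string) (apiKey, error) {
	const scheme, prefix = "Bearer ", "mxgw_"
	if !strings.HasPrefix(header, scheme+prefix) {
		return apiKey{}, errors.New("missing or malformed API key")
	}
	rest := strings.TrimPrefix(header, scheme+prefix)
	id, secret, found := strings.Cut(rest, "_")
	if !found || id == "" || secret == "" {
		return apiKey{}, errors.New("missing or malformed API key")
	}
	return apiKey{KeyID: id, Secret: secret}, nil
}

// hashSecret peppers the secret with HMAC-SHA-256 (an assumed construction).
func hashSecret(pepper []byte, secret string) string {
	mac := hmac.New(sha256.New, pepper)
	mac.Write([]byte(secret))
	return hex.EncodeToString(mac.Sum(nil))
}

// authorize returns the status names used in the text: a missing key is
// Unauthenticated, a key without the required scope is PermissionDenied.
func authorize(key *apiKey, granted []string, required string) string {
	if key == nil {
		return "Unauthenticated"
	}
	for _, scope := range granted {
		if scope == required {
			return "OK"
		}
	}
	return "PermissionDenied"
}

func main() {
	pepper := []byte("demo-pepper")         // hypothetical server-side pepper
	stored := hashSecret(pepper, "s3cr3t") // value that would sit in the key store

	key, err := parseBearer("Bearer mxgw_dev01_s3cr3t")
	if err != nil {
		fmt.Println(authorize(nil, nil, "invoke"))
		return
	}
	// Constant-time comparison of the presented secret against the stored hash.
	match := hmac.Equal([]byte(stored), []byte(hashSecret(pepper, key.Secret)))
	fmt.Println(key.KeyID, match, authorize(&key, []string{"session", "metadata"}, "invoke"))
}
```

Storing only the peppered hash means a leaked database does not reveal usable secrets unless the pepper leaks too.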
@@ -244,6 +244,19 @@ foreach (LazyBrowseNode root in roots)
 and is safe under concurrent callers. To refresh after a Galaxy redeploy, call
 `BrowseAsync` again from the root.

+The CLI counterpart is `galaxy-browse`. Without `--parent` it walks the root
+objects and eagerly expands `--depth` further levels into an indented tree; with
+`--parent <gobject-id>` it fetches exactly one level of children for that object
+(`--depth` is ignored there). Filter flags map onto `BrowseChildrenOptions`:
+`--category-ids` and `--template-contains` are comma-separated lists,
+`--tag-name-glob` / `--alarm-bearing-only` / `--historized-only` are scalar, and
+`--include-attributes` overrides the server default for attribute population.
+
+```powershell
+dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- galaxy-browse --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --depth 1
+dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- galaxy-browse --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --parent 42 --json
+```
+
 ### Watching deploy events

 `WatchDeployEventsAsync` opens the `WatchDeployEvents` server-streaming RPC. The
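Reviewer note: the `galaxy-browse` paragraph added in the hunk above describes a root walk that eagerly expands `--depth` further levels, with each level paged to completion. The Go sketch below shows that control flow in isolation, including a guard against a repeated page token. The `page`, `node`, and `fetchFunc` types and the in-memory `pages` map are hypothetical stand-ins for the BrowseChildren RPC, not the gateway's Go client API.

```go
package main

import (
	"fmt"
	"strings"
)

// page is a hypothetical stand-in for one BrowseChildren reply.
type page struct {
	children  []int  // child gobject ids on this page
	nextToken string // empty when there are no more pages
}

// fetchFunc is a hypothetical stand-in for the BrowseChildren call.
type fetchFunc func(parent int, token string) (page, error)

// node is one entry in the eagerly expanded tree.
type node struct {
	id       int
	children []node
}

// browseTree fetches every page of children for parent and expands
// remainingDepth further levels below each child.
func browseTree(fetch fetchFunc, parent, remainingDepth int) ([]node, error) {
	var nodes []node
	seen := map[string]bool{}
	token := ""
	for {
		p, err := fetch(parent, token)
		if err != nil {
			return nil, err
		}
		for _, id := range p.children {
			n := node{id: id}
			if remainingDepth > 0 {
				if n.children, err = browseTree(fetch, id, remainingDepth-1); err != nil {
					return nil, err
				}
			}
			nodes = append(nodes, n)
		}
		token = p.nextToken
		if token == "" {
			return nodes, nil
		}
		// A token seen before would loop forever, so fail instead.
		if seen[token] {
			return nil, fmt.Errorf("repeated page token %q", token)
		}
		seen[token] = true
	}
}

// printTree writes the tree with a two-space indent per level.
func printTree(nodes []node, level int) {
	for _, n := range nodes {
		fmt.Printf("%s%d\n", strings.Repeat("  ", level), n.id)
		printTree(n.children, level+1)
	}
}

func main() {
	// In-memory fake: parent id -> page token -> page.
	pages := map[int]map[string]page{
		0:  {"": {children: []int{10}, nextToken: "p2"}, "p2": {children: []int{11}}},
		10: {"": {children: []int{20}}},
	}
	fetch := func(parent int, token string) (page, error) {
		return pages[parent][token], nil // unknown parents yield an empty page
	}
	roots, err := browseTree(fetch, 0, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	printTree(roots, 0)
}
```

Because each level issues its own paged requests, the number of calls grows with the number of objects expanded, which is one reason to keep `--depth` small on large Galaxies.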
@@ -105,4 +105,16 @@ public interface IMxGatewayCliClient : IAsyncDisposable
     IAsyncEnumerable<DeployEvent> GalaxyWatchDeployEventsAsync(
         WatchDeployEventsRequest request,
         CancellationToken cancellationToken);
+
+    /// <summary>
+    /// Fetches one page of direct children of a Galaxy parent (or the root
+    /// objects when the parent selector is unset), the primitive that backs the
+    /// lazy-browse helper.
+    /// </summary>
+    /// <param name="request">The browse-children request.</param>
+    /// <param name="cancellationToken">Cancellation token for the operation.</param>
+    /// <returns>The browse-children reply.</returns>
+    Task<BrowseChildrenReply> GalaxyBrowseChildrenAsync(
+        BrowseChildrenRequest request,
+        CancellationToken cancellationToken);
 }
@@ -100,6 +100,14 @@ internal sealed class MxGatewayCliClientAdapter : IMxGatewayCliClient
         return _galaxyClient.Value.WatchDeployEventsRawAsync(request, cancellationToken);
     }

+    /// <inheritdoc />
+    public Task<BrowseChildrenReply> GalaxyBrowseChildrenAsync(
+        BrowseChildrenRequest request,
+        CancellationToken cancellationToken)
+    {
+        return _galaxyClient.Value.BrowseChildrenRawAsync(request, cancellationToken);
+    }
+
     /// <inheritdoc />
     public async ValueTask DisposeAsync()
     {
@@ -144,6 +144,8 @@ public static class MxGatewayClientCli
                     .ConfigureAwait(false),
                 "galaxy-discover" => await GalaxyDiscoverAsync(arguments, client, standardOutput, cancellation.Token)
                     .ConfigureAwait(false),
+                "galaxy-browse" => await GalaxyBrowseAsync(arguments, client, standardOutput, standardError, cancellation.Token)
+                    .ConfigureAwait(false),
                 "galaxy-watch" => await GalaxyWatchAsync(arguments, client, standardOutput, cancellation.Token)
                     .ConfigureAwait(false),
                 _ => WriteUnknownCommand(command, standardError),
@@ -1607,6 +1609,270 @@ public static class MxGatewayClientCli
         return aggregate;
     }

+    /// <summary>
+    /// Per-request page size for the galaxy-browse single-level walks. Mirrors
+    /// the library's <c>BrowseChildrenPageSize</c> so the CLI and the
+    /// lazy-browse helper page identically.
+    /// </summary>
+    private const int BrowseChildrenCliPageSize = 500;
+
+    /// <summary>
+    /// Drives the lazy-browse Galaxy surface from the CLI. Without
+    /// <c>--parent</c> it walks the root objects and eagerly expands
+    /// <c>--depth</c> further levels (each level reuses the same
+    /// <see cref="BrowseChildrenOptions"/>, like the library helper). With
+    /// <c>--parent</c> it fetches exactly one level of children for that
+    /// gobject id via a parent-scoped BrowseChildren request; <c>--depth</c>
+    /// is not meaningful there and a warning is emitted if combined, mirroring
+    /// the Go/Rust CLIs.
+    /// </summary>
+    private static async Task<int> GalaxyBrowseAsync(
+        CliArguments arguments,
+        IMxGatewayCliClient client,
+        TextWriter output,
+        TextWriter standardError,
+        CancellationToken cancellationToken)
+    {
+        BrowseChildrenOptions options = ParseBrowseChildrenOptions(arguments);
+        bool json = arguments.HasFlag("json");
+        int parent = arguments.GetInt32("parent", -1);
+        int depth = arguments.GetInt32("depth", 0);
+
+        // A specific parent → one level of children via the parent-scoped RPC.
+        if (parent >= 0)
+        {
+            if (depth > 0)
+            {
+                standardError.WriteLine("warning: --depth is ignored when --parent is specified.");
+            }
+
+            IReadOnlyList<GalaxyObject> children = await BrowseOneLevelAsync(
+                client,
+                options,
+                parent,
+                cancellationToken)
+                .ConfigureAwait(false);
+
+            if (json)
+            {
+                output.WriteLine(JsonSerializer.Serialize(
+                    new
+                    {
+                        command = "galaxy-browse",
+                        parentId = parent,
+                        children = children.Select(GalaxyObjectToJsonElement).ToArray(),
+                    },
+                    JsonOptions));
+                return 0;
+            }
+
+            output.WriteLine(children.Count.ToString(CultureInfo.InvariantCulture));
+            foreach (GalaxyObject child in children)
+            {
+                output.WriteLine(FormatGalaxyObject(child, level: 0, hasChildrenHint: null));
+            }
+
+            return 0;
+        }
+
+        // No parent → walk the root objects, eagerly expanding --depth levels.
+        IReadOnlyList<BrowseTreeNode> roots = await BrowseTreeAsync(
+            client,
+            options,
+            parentGobjectId: 0,
+            remainingDepth: depth,
+            cancellationToken)
+            .ConfigureAwait(false);
+
+        if (json)
+        {
+            output.WriteLine(JsonSerializer.Serialize(
+                new
+                {
+                    command = "galaxy-browse",
+                    nodes = roots.Select(BrowseTreeNodeToJson).ToArray(),
+                },
+                JsonOptions));
+            return 0;
+        }
+
+        output.WriteLine(roots.Count.ToString(CultureInfo.InvariantCulture));
+        foreach (BrowseTreeNode node in roots)
+        {
+            WriteBrowseTreeNode(output, node, level: 0);
+        }
+
+        return 0;
+    }
+
+    /// <summary>
+    /// One node in the eagerly-expanded galaxy-browse tree: the Galaxy object,
+    /// the server's has-children hint, and any children fetched up to the
+    /// requested depth.
+    /// </summary>
+    private sealed record BrowseTreeNode(
+        GalaxyObject Object,
+        bool HasChildrenHint,
+        IReadOnlyList<BrowseTreeNode> Children);
+
+    /// <summary>
+    /// Fetches the direct children of <paramref name="parentGobjectId"/>
+    /// (0 = root) and recursively expands <paramref name="remainingDepth"/>
+    /// further levels. Paging is followed to completion at each level.
+    /// </summary>
+    private static async Task<IReadOnlyList<BrowseTreeNode>> BrowseTreeAsync(
+        IMxGatewayCliClient client,
+        BrowseChildrenOptions options,
+        int parentGobjectId,
+        int remainingDepth,
+        CancellationToken cancellationToken)
+    {
+        List<BrowseTreeNode> nodes = [];
+        string pageToken = string.Empty;
+        HashSet<string> seenPageTokens = new(StringComparer.Ordinal);
+        do
+        {
+            BrowseChildrenRequest request = BuildBrowseChildrenRequest(options, parentGobjectId, pageToken);
+            BrowseChildrenReply reply = await client.GalaxyBrowseChildrenAsync(request, cancellationToken)
+                .ConfigureAwait(false);
+
+            for (int i = 0; i < reply.Children.Count; i++)
+            {
+                GalaxyObject child = reply.Children[i];
+                bool hint = i < reply.ChildHasChildren.Count && reply.ChildHasChildren[i];
+                IReadOnlyList<BrowseTreeNode> grandChildren = remainingDepth > 0
+                    ? await BrowseTreeAsync(client, options, child.GobjectId, remainingDepth - 1, cancellationToken)
+                        .ConfigureAwait(false)
+                    : [];
+                nodes.Add(new BrowseTreeNode(child, hint, grandChildren));
+            }
+
+            pageToken = reply.NextPageToken;
+            if (!string.IsNullOrWhiteSpace(pageToken) && !seenPageTokens.Add(pageToken))
+            {
+                throw new MxGatewayException(
+                    $"Galaxy BrowseChildren returned a repeated page token '{pageToken}'.");
+            }
+        }
+        while (!string.IsNullOrWhiteSpace(pageToken));
+
+        return nodes;
+    }
+
+    /// <summary>Fetches exactly one level of children for a parent gobject id, paging to completion.</summary>
+    private static async Task<IReadOnlyList<GalaxyObject>> BrowseOneLevelAsync(
+        IMxGatewayCliClient client,
+        BrowseChildrenOptions options,
+        int parentGobjectId,
+        CancellationToken cancellationToken)
+    {
+        List<GalaxyObject> children = [];
+        string pageToken = string.Empty;
+        HashSet<string> seenPageTokens = new(StringComparer.Ordinal);
+        do
+        {
+            BrowseChildrenRequest request = BuildBrowseChildrenRequest(options, parentGobjectId, pageToken);
+            BrowseChildrenReply reply = await client.GalaxyBrowseChildrenAsync(request, cancellationToken)
+                .ConfigureAwait(false);
+
+            children.AddRange(reply.Children);
+            pageToken = reply.NextPageToken;
+            if (!string.IsNullOrWhiteSpace(pageToken) && !seenPageTokens.Add(pageToken))
+            {
+                throw new MxGatewayException(
+                    $"Galaxy BrowseChildren returned a repeated page token '{pageToken}'.");
+            }
+        }
+        while (!string.IsNullOrWhiteSpace(pageToken));
+
+        return children;
+    }
+
+    private static BrowseChildrenOptions ParseBrowseChildrenOptions(CliArguments arguments)
+    {
+        return new BrowseChildrenOptions
+        {
+            CategoryIds = ParseOptionalInt32List(arguments.GetOptional("category-ids")),
+            TemplateChainContains = ParseOptionalStringList(arguments.GetOptional("template-contains")),
+            TagNameGlob = arguments.GetOptional("tag-name-glob"),
+            AlarmBearingOnly = arguments.HasFlag("alarm-bearing-only"),
+            HistorizedOnly = arguments.HasFlag("historized-only"),
+            // Tri-state: only override the server default when the flag is present.
+            IncludeAttributes = arguments.HasFlag("include-attributes") ? true : null,
+        };
+    }
+
+    private static BrowseChildrenRequest BuildBrowseChildrenRequest(
+        BrowseChildrenOptions options,
+        int parentGobjectId,
+        string pageToken)
+    {
+        BrowseChildrenRequest request = new()
+        {
+            PageSize = BrowseChildrenCliPageSize,
+            PageToken = pageToken,
+            ParentGobjectId = parentGobjectId,
+            AlarmBearingOnly = options.AlarmBearingOnly,
+            HistorizedOnly = options.HistorizedOnly,
+        };
+        request.CategoryIds.Add(options.CategoryIds);
+        request.TemplateChainContains.Add(options.TemplateChainContains);
+        if (!string.IsNullOrWhiteSpace(options.TagNameGlob))
+        {
+            request.TagNameGlob = options.TagNameGlob;
+        }
+
+        if (options.IncludeAttributes.HasValue)
+        {
+            request.IncludeAttributes = options.IncludeAttributes.Value;
+        }
+
+        return request;
+    }
+
+    private static void WriteBrowseTreeNode(TextWriter output, BrowseTreeNode node, int level)
+    {
+        output.WriteLine(FormatGalaxyObject(node.Object, level, node.HasChildrenHint));
+        foreach (BrowseTreeNode child in node.Children)
+        {
+            WriteBrowseTreeNode(output, child, level + 1);
+        }
+    }
+
+    private static string FormatGalaxyObject(GalaxyObject galaxyObject, int level, bool? hasChildrenHint)
+    {
+        string indent = new(' ', level * 2);
+        string suffix = hasChildrenHint is null
+            ? $"(attrs={galaxyObject.Attributes.Count})"
+            : $"(attrs={galaxyObject.Attributes.Count}, hasChildrenHint={hasChildrenHint.Value})";
+        return $"{indent}{galaxyObject.GobjectId}\t{galaxyObject.TagName}\t{galaxyObject.BrowseName}\t{suffix}";
+    }
+
+    private static object BrowseTreeNodeToJson(BrowseTreeNode node)
+    {
+        return new
+        {
+            @object = GalaxyObjectToJsonElement(node.Object),
+            hasChildrenHint = node.HasChildrenHint,
+            children = node.Children.Select(BrowseTreeNodeToJson).ToArray(),
+        };
+    }
+
+    private static JsonElement GalaxyObjectToJsonElement(GalaxyObject galaxyObject)
+    {
+        return JsonDocument.Parse(ProtobufJsonFormatter.Format(galaxyObject)).RootElement.Clone();
+    }
+
+    private static IReadOnlyList<int> ParseOptionalInt32List(string? value)
+    {
+        return string.IsNullOrWhiteSpace(value) ? [] : ParseInt32List(value);
+    }
+
+    private static IReadOnlyList<string> ParseOptionalStringList(string? value)
+    {
+        return string.IsNullOrWhiteSpace(value) ? [] : ParseStringList(value);
+    }
+
     private static async Task<int> GalaxyWatchAsync(
         CliArguments arguments,
         IMxGatewayCliClient client,
@@ -1736,6 +2002,7 @@ public static class MxGatewayClientCli
             or "galaxy-test-connection"
             or "galaxy-last-deploy"
             or "galaxy-discover"
+            or "galaxy-browse"
             or "galaxy-watch";
     }

@@ -1797,6 +2064,7 @@ public static class MxGatewayClientCli
         writer.WriteLine("mxgw-dotnet galaxy-test-connection [--json]");
         writer.WriteLine("mxgw-dotnet galaxy-last-deploy [--json]");
         writer.WriteLine("mxgw-dotnet galaxy-discover [--json]");
+        writer.WriteLine("mxgw-dotnet galaxy-browse [--parent <gobject-id>] [--depth <n>] [--category-ids <n,n>] [--template-contains <s,s>] [--tag-name-glob <glob>] [--alarm-bearing-only] [--historized-only] [--include-attributes] [--json]");
         writer.WriteLine("mxgw-dotnet galaxy-watch [--last-seen-deploy-time <iso8601>] [--max-events <n>] [--json]");
     }
 }
@@ -360,6 +360,146 @@ public sealed class MxGatewayClientCliTests
         Assert.Equal(string.Empty, error.ToString());
     }

+    /// <summary>
+    /// Verifies galaxy-browse walks root objects and eagerly expands one further
+    /// level when --depth 1 is passed, printing an indented tree.
+    /// </summary>
+    [Fact]
+    public async Task RunAsync_GalaxyBrowse_TextTreeExpandsToDepth()
+    {
+        using var output = new StringWriter();
+        using var error = new StringWriter();
+        FakeCliClient fakeClient = new();
+        // Root level (parent 0): one area with a child hint.
+        fakeClient.GalaxyBrowseChildrenReplies[0] = new Queue<BrowseChildrenReply>(
+            [
+                new BrowseChildrenReply
+                {
+                    Children = { new GalaxyObject { GobjectId = 10, TagName = "Area_001", BrowseName = "Area" } },
+                    ChildHasChildren = { true },
+                },
+            ]);
+        // Children of gobject 10.
+        fakeClient.GalaxyBrowseChildrenReplies[10] = new Queue<BrowseChildrenReply>(
+            [
+                new BrowseChildrenReply
+                {
+                    Children = { new GalaxyObject { GobjectId = 20, TagName = "Tank_001", BrowseName = "Tank" } },
+                },
+            ]);
+
+        int exitCode = await MxGatewayClientCli.RunAsync(
+            [
+                "galaxy-browse",
+                "--endpoint",
+                "http://localhost:5000",
+                "--api-key",
+                "test-api-key",
+                "--depth",
+                "1",
+            ],
+            output,
+            error,
+            _ => fakeClient);
+
+        Assert.Equal(0, exitCode);
+        string text = output.ToString();
+        Assert.Contains("Area_001", text);
+        Assert.Contains("Tank_001", text);
+        // Children are indented beneath their parent (two-space indent per level).
+        Assert.Matches(@"\n  \d+\tTank_001", text);
+        // Root fetched with the parent oneof unset; child fetch used parent 10.
+        Assert.Contains(
+            fakeClient.GalaxyBrowseChildrenRequests,
+            request => request.ParentCase == BrowseChildrenRequest.ParentOneofCase.ParentGobjectId
+                && request.ParentGobjectId == 10);
+        Assert.Equal(string.Empty, error.ToString());
+    }
+
+    /// <summary>
+    /// Verifies galaxy-browse --json emits a nested JSON document and forwards
+    /// the filter flags onto the BrowseChildren request.
+    /// </summary>
+    [Fact]
+    public async Task RunAsync_GalaxyBrowse_JsonForwardsFilters()
+    {
+        using var output = new StringWriter();
+        using var error = new StringWriter();
+        FakeCliClient fakeClient = new();
+        fakeClient.GalaxyBrowseChildrenReplies[0] = new Queue<BrowseChildrenReply>(
+            [
+                new BrowseChildrenReply
+                {
+                    Children = { new GalaxyObject { GobjectId = 10, TagName = "Area_001", BrowseName = "Area" } },
+                },
+            ]);
+
+        int exitCode = await MxGatewayClientCli.RunAsync(
+            [
+                "galaxy-browse",
+                "--endpoint",
+                "http://localhost:5000",
+                "--api-key",
+                "test-api-key",
+                "--tag-name-glob",
+                "Area*",
+                "--alarm-bearing-only",
+                "--json",
+            ],
+            output,
+            error,
+            _ => fakeClient);
+
+        Assert.Equal(0, exitCode);
+        using System.Text.Json.JsonDocument document = System.Text.Json.JsonDocument.Parse(output.ToString());
+        Assert.Equal("galaxy-browse", document.RootElement.GetProperty("command").GetString());
+        Assert.True(document.RootElement.GetProperty("nodes").GetArrayLength() >= 1);
+        BrowseChildrenRequest request = Assert.Single(fakeClient.GalaxyBrowseChildrenRequests);
+        Assert.Equal("Area*", request.TagNameGlob);
+        Assert.True(request.AlarmBearingOnly);
+        Assert.Equal(string.Empty, error.ToString());
+    }
+
+    /// <summary>
+    /// Verifies galaxy-browse --parent fetches exactly one level of children for
+    /// the supplied gobject id via a parent-scoped BrowseChildren request.
+    /// </summary>
+    [Fact]
+    public async Task RunAsync_GalaxyBrowse_ParentFetchesSingleLevel()
+    {
+        using var output = new StringWriter();
+        using var error = new StringWriter();
+        FakeCliClient fakeClient = new();
+        fakeClient.GalaxyBrowseChildrenReplies[10] = new Queue<BrowseChildrenReply>(
+            [
+                new BrowseChildrenReply
+                {
+                    Children = { new GalaxyObject { GobjectId = 20, TagName = "Tank_001", BrowseName = "Tank" } },
+                },
+            ]);
+
+        int exitCode = await MxGatewayClientCli.RunAsync(
+            [
+                "galaxy-browse",
+                "--endpoint",
+                "http://localhost:5000",
+                "--api-key",
+                "test-api-key",
+                "--parent",
+                "10",
+            ],
+            output,
+            error,
+            _ => fakeClient);
+
+        Assert.Equal(0, exitCode);
+        Assert.Contains("Tank_001", output.ToString());
+        BrowseChildrenRequest request = Assert.Single(fakeClient.GalaxyBrowseChildrenRequests);
+        Assert.Equal(BrowseChildrenRequest.ParentOneofCase.ParentGobjectId, request.ParentCase);
+        Assert.Equal(10, request.ParentGobjectId);
+        Assert.Equal(string.Empty, error.ToString());
+    }
+
     /// <summary>Verifies that galaxy-watch command prints text output for deploy events.</summary>
     [Fact]
     public async Task RunAsync_GalaxyWatch_PrintsTextOutputForEvents()
@@ -1051,5 +1191,33 @@ public sealed class MxGatewayClientCliTests
                 yield return deployEvent;
             }
         }
+
+        /// <summary>List of received galaxy browse-children requests, in call order.</summary>
+        public List<BrowseChildrenRequest> GalaxyBrowseChildrenRequests { get; } = [];
+
+        /// <summary>
+        /// Per-parent browse-children replies keyed by <c>parent_gobject_id</c>
+        /// (0 = root). Each parent's queue is dequeued in page order; an absent
+        /// or exhausted queue yields an empty reply.
+        /// </summary>
+        public Dictionary<int, Queue<BrowseChildrenReply>> GalaxyBrowseChildrenReplies { get; } = [];
+
+        /// <inheritdoc />
+        public Task<BrowseChildrenReply> GalaxyBrowseChildrenAsync(
+            BrowseChildrenRequest request,
+            CancellationToken cancellationToken)
+        {
+            GalaxyBrowseChildrenRequests.Add(request);
+            int parentId = request.ParentCase == BrowseChildrenRequest.ParentOneofCase.ParentGobjectId
+                ? request.ParentGobjectId
+                : 0;
+            if (GalaxyBrowseChildrenReplies.TryGetValue(parentId, out Queue<BrowseChildrenReply>? queue)
+                && queue.TryDequeue(out BrowseChildrenReply? reply))
+            {
+                return Task.FromResult(reply);
+            }
+
+            return Task.FromResult(new BrowseChildrenReply());
+        }
     }
 }
@@ -121,6 +121,10 @@ func runWithIO(ctx context.Context, args []string, stdout, stderr io.Writer) err
|
||||
return runGalaxyDiscover(ctx, args[1:], stdout, stderr)
|
||||
case "galaxy-watch":
|
||||
return runGalaxyWatch(ctx, args[1:], stdout, stderr)
|
||||
case "galaxy-browse":
|
||||
return runGalaxyBrowse(ctx, args[1:], stdout, stderr)
|
||||
case "ping":
|
||||
return runPing(ctx, args[1:], stdout, stderr)
|
||||
case "batch":
|
||||
return runBatch(ctx, os.Stdin, stdout, stderr)
|
||||
default:
|
||||
@@ -228,6 +232,52 @@ func runCloseSession(ctx context.Context, args []string, stdout, stderr io.Write
|
||||
return nil
|
||||
}
|
||||
|
||||
func runPing(ctx context.Context, args []string, stdout, stderr io.Writer) error {
|
||||
flags := flag.NewFlagSet("ping", flag.ContinueOnError)
|
||||
flags.SetOutput(stderr)
|
||||
common := bindCommonFlags(flags)
|
||||
jsonOutput := flags.Bool("json", false, "write JSON output")
|
||||
sessionID := flags.String("session-id", "", "gateway session id")
|
||||
message := flags.String("message", "ping", "ping payload message")
|
||||
|
||||
if err := flags.Parse(args); err != nil {
|
||||
return err
|
||||
}
|
||||
if *sessionID == "" {
|
||||
return errors.New("session-id is required")
|
||||
}
|
||||
|
||||
client, options, err := dialForCommand(ctx, common)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer client.Close()
|
||||
|
||||
session := mxgateway.NewSessionForID(client, *sessionID)
|
||||
reply, err := session.PingRaw(ctx, *message)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if *jsonOutput {
|
||||
return writeJSON(stdout, commandReplyOutput{
|
||||
Command: "ping",
|
||||
Options: options,
|
||||
Reply: mustMarshalProto(reply),
|
||||
})
|
||||
}
|
||||
// DiagnosticMessage carries the echoed ping text set by the gateway.
|
||||
// Fall back to the kind string when the gateway returns an empty message
|
||||
// (forward-compat guard for future gateway versions). writeCommandOutput
|
||||
// is not reused here because it would print the opaque Kind enum rather
|
||||
// than the human-readable echo.
|
||||
echo := reply.GetDiagnosticMessage()
|
||||
if echo == "" {
|
||||
echo = reply.GetKind().String()
|
||||
}
|
||||
fmt.Fprintln(stdout, echo)
|
||||
return nil
|
||||
}
|
||||
|
||||
func runRegister(ctx context.Context, args []string, stdout, stderr io.Writer) error {
|
||||
flags := flag.NewFlagSet("register", flag.ContinueOnError)
|
||||
flags.SetOutput(stderr)
|
||||
@@ -1196,7 +1246,7 @@ type protojsonMessage interface {
|
||||
}
|
||||
|
||||
func writeUsage(writer io.Writer) {
|
||||
fmt.Fprintln(writer, "usage: mxgw-go <version|open-session|close-session|register|add-item|advise|subscribe-bulk|unsubscribe-bulk|read-bulk|write-bulk|write2-bulk|write-secured-bulk|write-secured2-bulk|bench-read-bulk|write|stream-events|stream-alarms|acknowledge-alarm|smoke|galaxy-test-connection|galaxy-last-deploy|galaxy-discover|galaxy-watch|batch>")
|
||||
fmt.Fprintln(writer, "usage: mxgw-go <version|open-session|close-session|ping|register|add-item|advise|subscribe-bulk|unsubscribe-bulk|read-bulk|write-bulk|write2-bulk|write-secured-bulk|write-secured2-bulk|bench-read-bulk|write|stream-events|stream-alarms|acknowledge-alarm|smoke|galaxy-test-connection|galaxy-last-deploy|galaxy-discover|galaxy-watch|galaxy-browse|batch>")
|
||||
}
|
||||
|
||||
// batchEOR is the end-of-result sentinel emitted to stdout after every command
|
||||
@@ -1459,6 +1509,234 @@ func runGalaxyWatch(ctx context.Context, args []string, stdout, stderr io.Writer
|
||||
}
|
||||
}
|
||||
|
||||
// runGalaxyBrowse drives the lazy-browse Galaxy helper from the CLI. Without
|
||||
// -parent it walks the root objects via GalaxyClient.Browse and eagerly expands
|
||||
// -depth further levels (each level reuses the same BrowseChildrenOptions, like
|
||||
// the library helper). With -parent it fetches exactly one level of children for
|
||||
// that gobject id via a parent-scoped BrowseChildren request; -depth is not
|
||||
// meaningful there and a warning is emitted if combined, mirroring the Rust CLI.
|
||||
//
|
||||
// Filter flags map onto BrowseChildrenOptions: -category-ids and
|
||||
// -template-contains are comma-separated lists (matching this CLI's other
|
||||
// list-valued flags), -tag-name-glob / -alarm-bearing-only / -historized-only
|
||||
// are scalar, and -include-attributes is a tri-state pointer (left nil unless
|
||||
// the flag is provided so the server default applies).
|
||||
func runGalaxyBrowse(ctx context.Context, args []string, stdout, stderr io.Writer) error {
|
||||
flags := flag.NewFlagSet("galaxy-browse", flag.ContinueOnError)
|
||||
flags.SetOutput(stderr)
|
||||
common := bindCommonFlags(flags)
|
||||
jsonOutput := flags.Bool("json", false, "write JSON output")
|
||||
parent := flags.Int("parent", -1, "parent gobject id whose children to browse; omit (or <0) for root objects")
|
||||
depth := flags.Int("depth", 0, "additional levels to eagerly expand beneath each root node; ignored with -parent")
|
||||
categoryIDs := flags.String("category-ids", "", "comma-separated Galaxy category ids to restrict results")
|
||||
templateContains := flags.String("template-contains", "", "comma-separated template tag names the chain must contain")
|
||||
tagNameGlob := flags.String("tag-name-glob", "", "restrict to objects whose tag name matches this glob")
|
||||
alarmBearingOnly := flags.Bool("alarm-bearing-only", false, "restrict to alarm-bearing objects")
|
||||
historizedOnly := flags.Bool("historized-only", false, "restrict to historized objects")
|
||||
includeAttributes := flags.Bool("include-attributes", false, "populate attributes on returned objects (overrides server default)")
|
||||
|
||||
if err := flags.Parse(args); err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
categoryList, err := parseInt32List(*categoryIDs)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
opts := &mxgateway.BrowseChildrenOptions{
|
||||
CategoryIds: categoryList,
|
||||
TemplateChainContains: parseStringList(*templateContains),
|
||||
TagNameGlob: *tagNameGlob,
|
||||
AlarmBearingOnly: *alarmBearingOnly,
|
||||
HistorizedOnly: *historizedOnly,
|
||||
}
|
||||
// Only override the server default when the flag was actually set; the
|
||||
// pointer form mirrors the proto's optional field.
|
||||
flags.Visit(func(f *flag.Flag) {
|
||||
if f.Name == "include-attributes" {
|
||||
value := *includeAttributes
|
||||
opts.IncludeAttributes = &value
|
||||
}
|
||||
})
|
||||
|
||||
client, options, err := dialGalaxyForCommand(ctx, common)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer client.Close()
|
||||
|
||||
// A specific parent → one level of children via the raw parent-scoped RPC.
|
||||
if *parent >= 0 {
|
||||
if *parent == 0 {
|
||||
fmt.Fprintln(stderr, "warning: -parent 0 is the server root sentinel; omit -parent for the root walk, or use -parent <id> >= 1")
|
||||
}
|
||||
if *depth > 0 {
|
||||
fmt.Fprintln(stderr, "warning: -depth is ignored when -parent is specified")
|
||||
}
|
||||
return runGalaxyBrowseParent(ctx, client, int32(*parent), opts, stdout, *jsonOutput, options)
|
||||
}
|
||||
|
||||
// No parent → walk the lazy-browse tree from the root objects, eagerly
|
||||
// expanding -depth further levels so the print walks cached children
|
||||
// without re-issuing RPCs.
|
||||
nodes, err := client.Browse(ctx, opts)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
for _, node := range nodes {
|
||||
if err := expandToDepth(ctx, node, *depth); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
|
||||
if *jsonOutput {
|
||||
jsonNodes := make([]map[string]any, 0, len(nodes))
|
||||
for _, node := range nodes {
|
||||
jsonNodes = append(jsonNodes, lazyNodeToJSON(node))
|
||||
}
|
||||
return writeJSON(stdout, map[string]any{
|
||||
"command": "galaxy-browse",
|
||||
"options": options,
|
||||
"nodes": jsonNodes,
|
||||
})
|
||||
}
|
||||
|
||||
fmt.Fprintln(stdout, len(nodes))
|
||||
for _, node := range nodes {
|
||||
printLazyNode(stdout, node, 0)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// runGalaxyBrowseParent fetches exactly one level of children for parentID via a
|
||||
// parent-scoped BrowseChildren request, paging until the server stops. It does
|
||||
// not lazily wrap the children in nodes; the single level is rendered directly.
|
||||
func runGalaxyBrowseParent(
|
||||
ctx context.Context,
|
||||
client *mxgateway.GalaxyClient,
|
||||
parentID int32,
|
||||
opts *mxgateway.BrowseChildrenOptions,
|
||||
stdout io.Writer,
|
||||
jsonOutput bool,
|
||||
options commonOptions,
|
||||
) error {
|
||||
var children []*mxgateway.GalaxyObject
|
||||
pageToken := ""
|
||||
seen := map[string]struct{}{}
|
||||
for {
|
||||
req := &mxgateway.BrowseChildrenRequest{
|
||||
PageSize: browseChildrenCLIPageSize,
|
||||
PageToken: pageToken,
|
||||
CategoryIds: opts.CategoryIds,
|
||||
TemplateChainContains: opts.TemplateChainContains,
|
||||
TagNameGlob: opts.TagNameGlob,
|
||||
AlarmBearingOnly: opts.AlarmBearingOnly,
|
||||
HistorizedOnly: opts.HistorizedOnly,
|
||||
IncludeAttributes: opts.IncludeAttributes,
|
||||
Parent: &mxgateway.BrowseChildrenRequest_ParentGobjectId{ParentGobjectId: parentID},
|
||||
}
|
||||
reply, err := client.BrowseChildrenRaw(ctx, req)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
children = append(children, reply.GetChildren()...)
|
||||
pageToken = reply.GetNextPageToken()
|
||||
if pageToken == "" {
|
||||
break
|
||||
}
|
||||
if _, dup := seen[pageToken]; dup {
|
||||
return fmt.Errorf("galaxy browse children returned repeated page token %q", pageToken)
|
||||
}
|
||||
seen[pageToken] = struct{}{}
|
||||
}
|
||||
|
||||
if jsonOutput {
|
||||
jsonChildren := make([]map[string]any, 0, len(children))
|
||||
for _, child := range children {
|
||||
jsonChildren = append(jsonChildren, galaxyObjectToJSON(child))
|
||||
}
|
||||
return writeJSON(stdout, map[string]any{
|
||||
"command": "galaxy-browse",
|
||||
"options": options,
|
||||
"parentId": parentID,
|
||||
"children": jsonChildren,
|
||||
})
|
||||
}
|
||||
|
||||
fmt.Fprintln(stdout, len(children))
|
||||
for _, child := range children {
|
||||
fmt.Fprintf(stdout, "%d\t%s\t%s\t(attrs=%d)\n", child.GetGobjectId(), child.GetTagName(), child.GetBrowseName(), len(child.GetAttributes()))
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// browseChildrenCLIPageSize is the per-request page size for the -parent
|
||||
// single-level walk. It mirrors the library's browseChildrenPageSize so the
|
||||
// CLI and the lazy-browse helper page identically.
|
||||
const browseChildrenCLIPageSize = 500
|
||||
|
||||
// expandToDepth eagerly expands node and up to remaining further levels beneath
// it, so a subsequent print walk reads cached children without re-issuing RPCs.
// When remaining is 0 the node is left unexpanded (only the requested level prints).
|
||||
func expandToDepth(ctx context.Context, node *mxgateway.LazyBrowseNode, remaining int) error {
|
||||
if remaining <= 0 {
|
||||
return nil
|
||||
}
|
||||
if err := node.Expand(ctx); err != nil {
|
||||
return err
|
||||
}
|
||||
for _, child := range node.Children() {
|
||||
if err := expandToDepth(ctx, child, remaining-1); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// printLazyNode renders one node and its already-expanded children as an
|
||||
// indent-per-level tree. Only children loaded by a prior Expand are walked.
|
||||
func printLazyNode(stdout io.Writer, node *mxgateway.LazyBrowseNode, level int) {
|
||||
indent := strings.Repeat(" ", level)
|
||||
obj := node.Object()
|
||||
fmt.Fprintf(stdout, "%s%d\t%s\t%s\t(attrs=%d, hasChildrenHint=%t)\n",
|
||||
indent, obj.GetGobjectId(), obj.GetTagName(), obj.GetBrowseName(), len(obj.GetAttributes()), node.HasChildrenHint())
|
||||
for _, child := range node.Children() {
|
||||
printLazyNode(stdout, child, level+1)
|
||||
}
|
||||
}
|
||||
|
||||
// lazyNodeToJSON renders one lazy node and its already-expanded children as a
|
||||
// nested JSON object.
|
||||
func lazyNodeToJSON(node *mxgateway.LazyBrowseNode) map[string]any {
|
||||
out := galaxyObjectToJSON(node.Object())
|
||||
out["hasChildrenHint"] = node.HasChildrenHint()
|
||||
children := node.Children()
|
||||
jsonChildren := make([]map[string]any, 0, len(children))
|
||||
for _, child := range children {
|
||||
jsonChildren = append(jsonChildren, lazyNodeToJSON(child))
|
||||
}
|
||||
out["children"] = jsonChildren
|
||||
return out
|
||||
}
|
||||
|
||||
// galaxyObjectToJSON renders the scalar fields of a GalaxyObject for the
|
||||
// browse JSON output. Attributes are summarised by count to keep the tree
|
||||
// compact; -include-attributes still drives whether the server populates them.
|
||||
func galaxyObjectToJSON(obj *mxgateway.GalaxyObject) map[string]any {
|
||||
return map[string]any{
|
||||
"gobjectId": obj.GetGobjectId(),
|
||||
"tagName": obj.GetTagName(),
|
||||
"containedName": obj.GetContainedName(),
|
||||
"browseName": obj.GetBrowseName(),
|
||||
"parentGobjectId": obj.GetParentGobjectId(),
|
||||
"isArea": obj.GetIsArea(),
|
||||
"categoryId": obj.GetCategoryId(),
|
||||
"templateChain": obj.GetTemplateChain(),
|
||||
"attributeCount": len(obj.GetAttributes()),
|
||||
}
|
||||
}
|
||||
|
||||
func formatDeployEvent(event *mxgateway.DeployEvent) string {
|
||||
observed := ""
|
||||
if ts := event.GetObservedAt(); ts != nil {
|
||||
|
||||
@@ -190,6 +190,109 @@ func TestRunBenchReadBulkRespectsContextCancellation(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunPingPlainText verifies the ping subcommand round-trips through the
|
||||
// fake gateway and prints the echo (diagnostic_message) in plain-text mode.
|
||||
func TestRunPingPlainText(t *testing.T) {
|
||||
listener, err := net.Listen("tcp", "127.0.0.1:0")
|
||||
if err != nil {
|
||||
t.Fatalf("listen: %v", err)
|
||||
}
|
||||
server := grpc.NewServer()
|
||||
fake := &pingFakeGateway{}
|
||||
pb.RegisterMxAccessGatewayServer(server, fake)
|
||||
go func() { _ = server.Serve(listener) }()
|
||||
defer server.Stop()
|
||||
defer listener.Close()
|
||||
|
||||
var stdout, stderr bytes.Buffer
|
||||
args := []string{
|
||||
"ping",
|
||||
"-endpoint", listener.Addr().String(),
|
||||
"-plaintext",
|
||||
"-api-key", "test",
|
||||
"-session-id", "test-session",
|
||||
"-message", "hello",
|
||||
}
|
||||
if err := runWithIO(t.Context(), args, &stdout, &stderr); err != nil {
|
||||
t.Fatalf("runWithIO() error = %v; stderr = %s", err, stderr.String())
|
||||
}
|
||||
got := strings.TrimSpace(stdout.String())
|
||||
if got != "pong:hello" {
|
||||
t.Fatalf("ping plain-text output = %q, want %q", got, "pong:hello")
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunPingJSON verifies the ping subcommand emits valid JSON in --json mode.
|
||||
func TestRunPingJSON(t *testing.T) {
|
||||
listener, err := net.Listen("tcp", "127.0.0.1:0")
|
||||
if err != nil {
|
||||
t.Fatalf("listen: %v", err)
|
||||
}
|
||||
server := grpc.NewServer()
|
||||
fake := &pingFakeGateway{}
|
||||
pb.RegisterMxAccessGatewayServer(server, fake)
|
||||
go func() { _ = server.Serve(listener) }()
|
||||
defer server.Stop()
|
||||
defer listener.Close()
|
||||
|
||||
var stdout, stderr bytes.Buffer
|
||||
args := []string{
|
||||
"ping",
|
||||
"-endpoint", listener.Addr().String(),
|
||||
"-plaintext",
|
||||
"-api-key", "test",
|
||||
"-session-id", "test-session",
|
||||
"-message", "hello",
|
||||
"-json",
|
||||
}
|
||||
if err := runWithIO(t.Context(), args, &stdout, &stderr); err != nil {
|
||||
t.Fatalf("runWithIO() error = %v; stderr = %s", err, stderr.String())
|
||||
}
|
||||
var out commandReplyOutput
|
||||
if err := json.Unmarshal(stdout.Bytes(), &out); err != nil {
|
||||
t.Fatalf("parse JSON: %v\noutput: %s", err, stdout.String())
|
||||
}
|
||||
if out.Command != "ping" {
|
||||
t.Fatalf("command = %q, want %q", out.Command, "ping")
|
||||
}
|
||||
// The fake gateway echoes "pong:<message>" in diagnostic_message; verify the
|
||||
// echo appears in the serialised reply so a future regression that wired
|
||||
// PingRaw to the wrong proto field would be caught here.
|
||||
replyStr := string(out.Reply)
|
||||
if !strings.Contains(replyStr, "pong:hello") {
|
||||
t.Fatalf("ping JSON reply missing echoed message %q; reply = %s", "pong:hello", replyStr)
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunPingRequiresSessionID verifies the ping subcommand rejects missing session-id.
|
||||
func TestRunPingRequiresSessionID(t *testing.T) {
|
||||
var stdout, stderr bytes.Buffer
|
||||
err := runWithIO(t.Context(), []string{"ping", "-plaintext", "-api-key", "test"}, &stdout, &stderr)
|
||||
if err == nil {
|
||||
t.Fatalf("runWithIO(ping without --session-id) returned no error")
|
||||
}
|
||||
if !strings.Contains(err.Error(), "session-id is required") {
|
||||
t.Fatalf("error = %v; want 'session-id is required'", err)
|
||||
}
|
||||
}
|
||||
|
||||
// pingFakeGateway handles Invoke for MX_COMMAND_KIND_PING by echoing the
|
||||
// message back in the diagnostic_message field so the CLI plain-text path
|
||||
// has a deterministic, non-empty string to assert on.
|
||||
type pingFakeGateway struct {
|
||||
pb.UnimplementedMxAccessGatewayServer
|
||||
}
|
||||
|
||||
func (g *pingFakeGateway) Invoke(_ context.Context, req *pb.MxCommandRequest) (*pb.MxCommandReply, error) {
|
||||
echo := "pong:" + req.GetCommand().GetPing().GetMessage()
|
||||
return &pb.MxCommandReply{
|
||||
SessionId: req.GetSessionId(),
|
||||
Kind: pb.MxCommandKind_MX_COMMAND_KIND_PING,
|
||||
DiagnosticMessage: echo,
|
||||
ProtocolStatus: &pb.ProtocolStatus{Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK},
|
||||
}, nil
|
||||
}
|
||||
|
||||
// benchFakeGateway is a minimal MxAccessGatewayServer that satisfies the
|
||||
// bench-read-bulk session-setup sequence (OpenSession + Invoke for Register
|
||||
// / SubscribeBulk / ReadBulk / UnsubscribeBulk / CloseSession).
|
||||
@@ -245,6 +348,146 @@ func TestRunBenchReadBulkRejectsNonPositiveBulkSize(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// browseFakeGalaxy implements BrowseChildren for the galaxy-browse subcommand
|
||||
// tests. It returns two root objects when no parent is supplied (the first
|
||||
// flagged as having children), and one child when the first root's gobject id
|
||||
// is supplied as the parent. The recorded last request lets a test assert the
|
||||
// CLI forwarded the parent and filter fields onto the wire.
|
||||
type browseFakeGalaxy struct {
|
||||
pb.UnimplementedGalaxyRepositoryServer
|
||||
lastRequest *pb.BrowseChildrenRequest
|
||||
}
|
||||
|
||||
func (g *browseFakeGalaxy) BrowseChildren(_ context.Context, req *pb.BrowseChildrenRequest) (*pb.BrowseChildrenReply, error) {
|
||||
g.lastRequest = req
|
||||
if req.GetParentGobjectId() == 10 {
|
||||
return &pb.BrowseChildrenReply{
|
||||
Children: []*pb.GalaxyObject{
|
||||
{GobjectId: 11, TagName: "Area1.Tank", BrowseName: "Tank"},
|
||||
},
|
||||
ChildHasChildren: []bool{false},
|
||||
}, nil
|
||||
}
|
||||
return &pb.BrowseChildrenReply{
|
||||
Children: []*pb.GalaxyObject{
|
||||
{GobjectId: 10, TagName: "Area1", BrowseName: "Area1"},
|
||||
{GobjectId: 20, TagName: "Area2", BrowseName: "Area2"},
|
||||
},
|
||||
ChildHasChildren: []bool{true, false},
|
||||
}, nil
|
||||
}
|
||||
|
||||
func startBrowseFakeGalaxy(t *testing.T) (addr string, fake *browseFakeGalaxy) {
|
||||
t.Helper()
|
||||
listener, err := net.Listen("tcp", "127.0.0.1:0")
|
||||
if err != nil {
|
||||
t.Fatalf("listen: %v", err)
|
||||
}
|
||||
server := grpc.NewServer()
|
||||
fake = &browseFakeGalaxy{}
|
||||
pb.RegisterGalaxyRepositoryServer(server, fake)
|
||||
go func() { _ = server.Serve(listener) }()
|
||||
t.Cleanup(func() {
|
||||
server.Stop()
|
||||
_ = listener.Close()
|
||||
})
|
||||
return listener.Addr().String(), fake
|
||||
}
|
||||
|
||||
// TestRunGalaxyBrowseTextTree verifies the galaxy-browse subcommand issues
|
||||
// BrowseChildren for the root walk, eagerly expands one level when --depth is
|
||||
// set, and renders an indented tree.
|
||||
func TestRunGalaxyBrowseTextTree(t *testing.T) {
|
||||
addr, _ := startBrowseFakeGalaxy(t)
|
||||
|
||||
var stdout, stderr bytes.Buffer
|
||||
args := []string{
|
||||
"galaxy-browse",
|
||||
"-endpoint", addr,
|
||||
"-plaintext",
|
||||
"-api-key", "test",
|
||||
"-depth", "1",
|
||||
}
|
||||
if err := runWithIO(t.Context(), args, &stdout, &stderr); err != nil {
|
||||
t.Fatalf("runWithIO() error = %v; stderr = %s", err, stderr.String())
|
||||
}
|
||||
out := stdout.String()
|
||||
// Both roots present; the first root's eagerly-expanded child appears
|
||||
// indented beneath it.
|
||||
for _, want := range []string{"Area1", "Area2", "Tank"} {
|
||||
if !strings.Contains(out, want) {
|
||||
t.Fatalf("galaxy-browse text output missing %q; got:\n%s", want, out)
|
||||
}
|
||||
}
|
||||
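// A lone space also appears in the "(attrs=..., hasChildrenHint=...)" suffix of
// every line, so assert that the child's own line starts with indentation.
indented := false
for _, line := range strings.Split(out, "\n") {
if strings.Contains(line, "Area1.Tank") && strings.HasPrefix(line, " ") {
indented = true
}
}
if !indented {
t.Fatalf("galaxy-browse text output not indented for children; got:\n%s", out)
}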
|
||||
}
|
||||
|
||||
// TestRunGalaxyBrowseJSON verifies the galaxy-browse subcommand emits valid
|
||||
// nested JSON and forwards filter options onto the BrowseChildren request.
|
||||
func TestRunGalaxyBrowseJSON(t *testing.T) {
|
||||
addr, fake := startBrowseFakeGalaxy(t)
|
||||
|
||||
var stdout, stderr bytes.Buffer
|
||||
args := []string{
|
||||
"galaxy-browse",
|
||||
"-endpoint", addr,
|
||||
"-plaintext",
|
||||
"-api-key", "test",
|
||||
"-depth", "1",
|
||||
"-tag-name-glob", "Area%",
|
||||
"-alarm-bearing-only",
|
||||
"-json",
|
||||
}
|
||||
if err := runWithIO(t.Context(), args, &stdout, &stderr); err != nil {
|
||||
t.Fatalf("runWithIO() error = %v; stderr = %s", err, stderr.String())
|
||||
}
|
||||
|
||||
var payload map[string]any
|
||||
if err := json.Unmarshal(stdout.Bytes(), &payload); err != nil {
|
||||
t.Fatalf("parse JSON: %v\noutput: %s", err, stdout.String())
|
||||
}
|
||||
if payload["command"] != "galaxy-browse" {
|
||||
t.Fatalf("command = %v, want galaxy-browse", payload["command"])
|
||||
}
|
||||
nodes, ok := payload["nodes"].([]any)
|
||||
if !ok || len(nodes) != 2 {
|
||||
t.Fatalf("nodes = %v, want 2 root nodes", payload["nodes"])
|
||||
}
|
||||
// Filter fields must have reached the wire.
|
||||
if got := fake.lastRequest.GetTagNameGlob(); got != "Area%" {
|
||||
t.Fatalf("BrowseChildren TagNameGlob = %q, want %q", got, "Area%")
|
||||
}
|
||||
if !fake.lastRequest.GetAlarmBearingOnly() {
|
||||
t.Fatalf("BrowseChildren AlarmBearingOnly = false, want true")
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunGalaxyBrowseParentSingleLevel verifies that passing --parent fetches a
|
||||
// single level of children for that parent via the parent-scoped request.
|
||||
func TestRunGalaxyBrowseParentSingleLevel(t *testing.T) {
|
||||
addr, fake := startBrowseFakeGalaxy(t)
|
||||
|
||||
var stdout, stderr bytes.Buffer
|
||||
args := []string{
|
||||
"galaxy-browse",
|
||||
"-endpoint", addr,
|
||||
"-plaintext",
|
||||
"-api-key", "test",
|
||||
"-parent", "10",
|
||||
}
|
||||
if err := runWithIO(t.Context(), args, &stdout, &stderr); err != nil {
|
||||
t.Fatalf("runWithIO() error = %v; stderr = %s", err, stderr.String())
|
||||
}
|
||||
if !strings.Contains(stdout.String(), "Tank") {
|
||||
t.Fatalf("galaxy-browse -parent output missing child %q; got:\n%s", "Tank", stdout.String())
|
||||
}
|
||||
if got := fake.lastRequest.GetParentGobjectId(); got != 10 {
|
||||
t.Fatalf("BrowseChildren ParentGobjectId = %d, want 10", got)
|
||||
}
|
||||
}
|
||||
|
||||
// TestRunBatchSkipsBlankLinesAndContinuesUntilEOF pins the Client.Go-027 fix:
|
||||
// a blank line in the middle of a batch session must NOT terminate the loop —
|
||||
// only stdin EOF ends the session.
|
||||
|
||||
@@ -363,6 +363,89 @@ func TestBulkMethodsShortCircuitOnEmptySliceWithoutRoundTrip(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestWrite2BuildsCommandWithTimestampAndReturnsNoError(t *testing.T) {
|
||||
fake := &fakeGatewayServer{
|
||||
invokeReply: &pb.MxCommandReply{
|
||||
SessionId: "session-1",
|
||||
Kind: pb.MxCommandKind_MX_COMMAND_KIND_WRITE2,
|
||||
ProtocolStatus: &pb.ProtocolStatus{
|
||||
Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK,
|
||||
},
|
||||
},
|
||||
}
|
||||
client, cleanup := newBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
session := NewSessionForID(client, "session-1")
|
||||
|
||||
val := Int32Value(99)
|
||||
ts := Int32Value(77)
|
||||
err := session.Write2(context.Background(), 12, 34, val, ts, 100)
|
||||
if err != nil {
|
||||
t.Fatalf("Write2() error = %v", err)
|
||||
}
|
||||
|
||||
req := fake.invokeRequest
|
||||
if req.GetCommand().GetKind() != pb.MxCommandKind_MX_COMMAND_KIND_WRITE2 {
|
||||
t.Fatalf("command kind = %s, want WRITE2", req.GetCommand().GetKind())
|
||||
}
|
||||
w2 := req.GetCommand().GetWrite2()
|
||||
if w2.GetServerHandle() != 12 {
|
||||
t.Fatalf("server handle = %d, want 12", w2.GetServerHandle())
|
||||
}
|
||||
if w2.GetItemHandle() != 34 {
|
||||
t.Fatalf("item handle = %d, want 34", w2.GetItemHandle())
|
||||
}
|
||||
if w2.GetValue().GetInt32Value() != 99 {
|
||||
t.Fatalf("value int32 = %d, want 99", w2.GetValue().GetInt32Value())
|
||||
}
|
||||
if w2.GetTimestampValue().GetInt32Value() != 77 {
|
||||
t.Fatalf("timestamp value int32 = %d, want 77", w2.GetTimestampValue().GetInt32Value())
|
||||
}
|
||||
if w2.GetUserId() != 100 {
|
||||
t.Fatalf("user id = %d, want 100", w2.GetUserId())
|
||||
}
|
||||
}
|
||||
|
||||
func TestWrite2RawReturnsRawReply(t *testing.T) {
|
||||
fake := &fakeGatewayServer{
|
||||
invokeReply: &pb.MxCommandReply{
|
||||
SessionId: "session-1",
|
||||
Kind: pb.MxCommandKind_MX_COMMAND_KIND_WRITE2,
|
||||
ProtocolStatus: &pb.ProtocolStatus{
|
||||
Code: pb.ProtocolStatusCode_PROTOCOL_STATUS_CODE_OK,
|
||||
},
|
||||
},
|
||||
}
|
||||
client, cleanup := newBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
session := NewSessionForID(client, "session-1")
|
||||
|
||||
reply, err := session.Write2Raw(context.Background(), 12, 34, Int32Value(1), Int32Value(0), 0)
|
||||
if err != nil {
|
||||
t.Fatalf("Write2Raw() error = %v", err)
|
||||
}
|
||||
if reply == nil {
|
||||
t.Fatal("Write2Raw() returned nil reply")
|
||||
}
|
||||
if reply.GetKind() != pb.MxCommandKind_MX_COMMAND_KIND_WRITE2 {
|
||||
t.Fatalf("reply kind = %s, want WRITE2", reply.GetKind())
|
||||
}
|
||||
}
|
||||
|
||||
func TestWrite2RejectsNilValue(t *testing.T) {
|
||||
fake := &fakeGatewayServer{}
|
||||
client, cleanup := newBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
session := NewSessionForID(client, "session-1")
|
||||
|
||||
if err := session.Write2(context.Background(), 12, 34, nil, Int32Value(0), 0); err == nil {
|
||||
t.Fatal("Write2(nil value) returned no error")
|
||||
}
|
||||
if err := session.Write2(context.Background(), 12, 34, Int32Value(1), nil, 0); err == nil {
|
||||
t.Fatal("Write2(nil timestampValue) returned no error")
|
||||
}
|
||||
}
|
||||
|
||||
func TestReadBulkForwardsTimeoutAndUnpacksCachedFlag(t *testing.T) {
|
||||
fake := &fakeGatewayServer{
|
||||
invokeReply: &pb.MxCommandReply{
|
||||
|
||||
@@ -54,6 +54,11 @@ type (
|
||||
BrowseChildrenRequest = pb.BrowseChildrenRequest
|
||||
// BrowseChildrenReply is the reply for BrowseChildren.
|
||||
BrowseChildrenReply = pb.BrowseChildrenReply
|
||||
// BrowseChildrenRequest_ParentGobjectId selects the parent-by-gobject-id
|
||||
// variant of the BrowseChildrenRequest parent oneof. Exposed so callers
|
||||
// (e.g. the mxgw-go CLI) can issue a parent-scoped single-level browse
|
||||
// without reaching into the generated package.
|
||||
BrowseChildrenRequest_ParentGobjectId = pb.BrowseChildrenRequest_ParentGobjectId //nolint:revive,staticcheck // mirrors generated proto oneof name
|
||||
)
|
||||
|
||||
// RawDeployEventStream is the generated WatchDeployEvents client stream.
|
||||
|
||||
@@ -580,6 +580,46 @@ func (s *Session) WriteRaw(ctx context.Context, serverHandle, itemHandle int32,
|
||||
})
|
||||
}
|
||||
|
||||
// PingRaw sends a diagnostic PING command and returns the raw reply.
|
||||
// The message is echoed back by the gateway in the reply's DiagnosticMessage field.
|
||||
func (s *Session) PingRaw(ctx context.Context, message string) (*MxCommandReply, error) {
|
||||
return s.invokeCommand(ctx, &pb.MxCommand{
|
||||
Kind: pb.MxCommandKind_MX_COMMAND_KIND_PING,
|
||||
Payload: &pb.MxCommand_Ping{
|
||||
Ping: &pb.PingCommand{Message: message},
|
||||
},
|
||||
})
|
||||
}
|
||||
|
||||
// Write2 invokes MXAccess Write2 (timestamped single-item write).
|
||||
func (s *Session) Write2(ctx context.Context, serverHandle, itemHandle int32, value, timestampValue *MxValue, userID int32) error {
|
||||
_, err := s.Write2Raw(ctx, serverHandle, itemHandle, value, timestampValue, userID)
|
||||
return err
|
||||
}
|
||||
|
||||
// Write2Raw invokes MXAccess Write2 (timestamped single-item write) and returns the raw reply.
|
||||
func (s *Session) Write2Raw(ctx context.Context, serverHandle, itemHandle int32, value, timestampValue *MxValue, userID int32) (*MxCommandReply, error) {
|
||||
if value == nil {
|
||||
return nil, errors.New("mxgateway: write2 value is required")
|
||||
}
|
||||
if timestampValue == nil {
|
||||
return nil, errors.New("mxgateway: write2 timestamp value is required")
|
||||
}
|
||||
|
||||
return s.invokeCommand(ctx, &pb.MxCommand{
|
||||
Kind: pb.MxCommandKind_MX_COMMAND_KIND_WRITE2,
|
||||
Payload: &pb.MxCommand_Write2{
|
||||
Write2: &pb.Write2Command{
|
||||
ServerHandle: serverHandle,
|
||||
ItemHandle: itemHandle,
|
||||
Value: value,
|
||||
TimestampValue: timestampValue,
|
||||
UserId: userID,
|
||||
},
|
||||
},
|
||||
})
|
||||
}
|
||||
|
||||
// Events streams ordered session events until the server ends the stream,
|
||||
// context cancellation stops Recv, or a terminal error is sent.
|
||||
func (s *Session) Events(ctx context.Context) (<-chan EventResult, error) {
|
||||
|
||||
@@ -115,17 +115,33 @@ try (GalaxyRepositoryClient galaxy = GalaxyRepositoryClient.connect(options)) {
|
||||
messages directly so callers can read all fields (including the nested
|
||||
`GalaxyAttribute` list) without an extra DTO layer.
|
||||
|
||||
The CLI exposes matching subcommands: `galaxy-test`, `galaxy-deploy-time`,
|
||||
`galaxy-discover`, and `galaxy-watch`. They take the same `--endpoint`,
|
||||
`--api-key-env`, `--plaintext`, `--ca-file`, `--server-name-override`,
|
||||
`--timeout`, and `--json` options as the gateway commands.
|
||||
The CLI exposes matching subcommands: `galaxy-test-connection`,
`galaxy-last-deploy`, `galaxy-discover`, `galaxy-browse`, and `galaxy-watch`.
The short names `galaxy-test` and `galaxy-deploy-time` remain as deprecated
aliases for `galaxy-test-connection` and `galaxy-last-deploy` so existing
scripts keep working. All of these subcommands take the same `--endpoint`,
`--api-key-env`, `--plaintext`, `--ca-file`, `--server-name-override`,
`--timeout`, and `--json` options as the gateway commands.
|
||||
|
||||
```powershell
|
||||
gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-test --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --json"
|
||||
gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-deploy-time --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --json"
|
||||
gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-test-connection --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --json"
|
||||
gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-last-deploy --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --json"
|
||||
gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-discover --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --json"
|
||||
```
|
||||
|
||||
`galaxy-browse` walks the hierarchy via `BrowseChildren`. Without `--parent` it
|
||||
returns the root nodes and eagerly expands `--depth` further levels; with
|
||||
`--parent <gobject-id>` it returns exactly one level of children for that
|
||||
parent. The filter flags (`--category-ids`, `--template-contains`,
|
||||
`--tag-name-glob`, `--alarm-bearing-only`, `--historized-only`,
|
||||
`--include-attributes`) match `galaxy-discover`. The `--json` node shape is the
|
||||
cross-client browse surface: the flattened object fields plus a
|
||||
`hasChildrenHint` flag and a nested `children` array.
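
As a rough illustration of that node shape, the sketch below builds it from plain maps. It is not the CLI's implementation: the class and helper names are hypothetical, and the only field names used (`gobjectId`, `tagName`, `hasChildrenHint`, `children`) are the ones that appear elsewhere in this change.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: assembles the browse node shape from plain maps. */
final class BrowseNodeShapeSketch {

    /**
     * Copies the already-flattened object fields into a new ordered map, then
     * appends the hasChildrenHint flag and the nested children array.
     */
    static Map<String, Object> node(
            Map<String, Object> objectFields,
            boolean hasChildrenHint,
            List<Map<String, Object>> children) {
        Map<String, Object> values = new LinkedHashMap<>(objectFields);
        values.put("hasChildrenHint", hasChildrenHint);
        values.put("children", children);
        return values;
    }

    /** Hypothetical helper: only two of the object fields, to keep the example short. */
    static Map<String, Object> fields(int gobjectId, String tagName) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("gobjectId", gobjectId);
        fields.put("tagName", tagName);
        return fields;
    }

    public static void main(String[] args) {
        // A leaf carries an empty children array rather than omitting the key.
        Map<String, Object> leaf = node(fields(202, "Pump001"), false, List.of());
        Map<String, Object> root = node(fields(101, "Area001"), true, List.of(leaf));
        System.out.println(root);
    }
}
```

Object fields sit directly on the node rather than under a nested `object` key, so a consumer can treat a browse node like a `galaxy-discover` object with two extra keys.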

```powershell
gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-browse --depth 1 --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --json"
```
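
The `--depth` behaviour (roots first, then that many further levels) can be sketched with a depth-limited recursive walk. The `Node` class here is a hypothetical in-memory stand-in for the client's lazy browse node, with no gRPC involved; it only mimics a has-children hint and an `expand()` step.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: depth-limited eager expansion over a lazily populated tree. */
final class DepthExpansionSketch {

    /** Hypothetical stand-in for a lazy browse node. */
    static final class Node {
        final String name;
        final List<Node> pending;                      // what a server call would return
        final List<Node> children = new ArrayList<>(); // populated only by expand()
        boolean expanded;

        Node(String name, Node... pending) {
            this.name = name;
            this.pending = List.of(pending);
        }

        boolean hasChildrenHint() {
            return !pending.isEmpty();
        }

        void expand() {
            if (!expanded) {
                children.addAll(pending);
                expanded = true;
            }
        }
    }

    /** Expands up to {@code depth} further levels; depth 0 leaves the node untouched. */
    static void expandToDepth(Node node, int depth) {
        if (depth <= 0) {
            return;
        }
        if (node.hasChildrenHint()) {
            node.expand(); // childless nodes are never expanded
        }
        for (Node child : node.children) {
            expandToDepth(child, depth - 1);
        }
    }

    /** Counts the node plus every descendant reachable through expanded nodes. */
    static int countVisible(Node node) {
        int count = 1;
        if (node.expanded) {
            for (Node child : node.children) {
                count += countVisible(child);
            }
        }
        return count;
    }

    /** Area001 -> Line001 -> Pump001, all initially unexpanded. */
    static Node sampleTree() {
        return new Node("Area001", new Node("Line001", new Node("Pump001")));
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        expandToDepth(root, 1);
        System.out.println(countVisible(root));
    }
}
```

Skipping `expand()` when the hint is false avoids a round trip per leaf, which is presumably why the hint is carried on every node.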

### Browsing lazily

For UI trees or OPC UA bridges, use `browseChildrenRaw` to walk one level at a
@@ -239,6 +255,7 @@ Run the CLI through Gradle:
```powershell
gradle :zb-mom-ww-mxgateway-cli:run --args="version --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="open-session --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --client-session-name java-cli --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="ping --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --message hello --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="register --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --client-name java-cli --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="add-item --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --server-handle 1 --item TestObject.TestInt --json"
gradle :zb-mom-ww-mxgateway-cli:run --args="advise --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <id> --server-handle 1 --item-handle 1 --json"
+326
-6
@@ -1,7 +1,9 @@
package com.zb.mom.ww.mxgateway.cli;

import com.zb.mom.ww.mxgateway.client.BrowseChildrenOptions;
import com.zb.mom.ww.mxgateway.client.DeployEventStream;
import com.zb.mom.ww.mxgateway.client.GalaxyRepositoryClient;
import com.zb.mom.ww.mxgateway.client.LazyBrowseNode;
import com.zb.mom.ww.mxgateway.client.MxEventStream;
import com.zb.mom.ww.mxgateway.client.MxGatewayAlarmFeedSubscription;
import com.zb.mom.ww.mxgateway.client.MxGatewayClient;
@@ -10,6 +12,8 @@ import com.zb.mom.ww.mxgateway.client.MxGatewayClientVersion;
import com.zb.mom.ww.mxgateway.client.MxGatewaySecrets;
import com.zb.mom.ww.mxgateway.client.MxGatewaySession;
import com.zb.mom.ww.mxgateway.client.MxValues;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.GalaxyAttribute;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.GalaxyObject;
@@ -26,8 +30,10 @@ import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Set;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
@@ -42,11 +48,14 @@ import mxaccess_gateway.v1.MxaccessGateway.AlarmFeedMessage;
import mxaccess_gateway.v1.MxaccessGateway.BulkReadResult;
import mxaccess_gateway.v1.MxaccessGateway.BulkWriteResult;
import mxaccess_gateway.v1.MxaccessGateway.CloseSessionRequest;
import mxaccess_gateway.v1.MxaccessGateway.MxCommand;
import mxaccess_gateway.v1.MxaccessGateway.MxCommandKind;
import mxaccess_gateway.v1.MxaccessGateway.MxCommandReply;
import mxaccess_gateway.v1.MxaccessGateway.MxEvent;
import mxaccess_gateway.v1.MxaccessGateway.MxValue;
import mxaccess_gateway.v1.MxaccessGateway.OnAlarmTransitionEvent;
import mxaccess_gateway.v1.MxaccessGateway.OpenSessionRequest;
import mxaccess_gateway.v1.MxaccessGateway.PingCommand;
import mxaccess_gateway.v1.MxaccessGateway.StreamAlarmsRequest;
import mxaccess_gateway.v1.MxaccessGateway.SubscribeResult;
import mxaccess_gateway.v1.MxaccessGateway.Write2BulkEntry;
@@ -126,6 +135,7 @@ public final class MxGatewayCli implements Callable<Integer> {
        commandLine.addSubcommand("version", new VersionCommand());
        commandLine.addSubcommand("open-session", new OpenSessionCommand(clientFactory));
        commandLine.addSubcommand("close-session", new CloseSessionCommand(clientFactory));
        commandLine.addSubcommand("ping", new PingCommandLine(clientFactory));
        commandLine.addSubcommand("register", new RegisterCommand(clientFactory));
        commandLine.addSubcommand("add-item", new AddItemCommand(clientFactory));
        commandLine.addSubcommand("advise", new AdviseCommand(clientFactory));
@@ -142,9 +152,10 @@ public final class MxGatewayCli implements Callable<Integer> {
        commandLine.addSubcommand("stream-alarms", new StreamAlarmsCommand(clientFactory));
        commandLine.addSubcommand("acknowledge-alarm", new AcknowledgeAlarmCommand(clientFactory));
        commandLine.addSubcommand("smoke", new SmokeCommand(clientFactory));
        commandLine.addSubcommand("galaxy-test", new GalaxyTestConnectionCommand());
        commandLine.addSubcommand("galaxy-deploy-time", new GalaxyDeployTimeCommand());
        commandLine.addSubcommand("galaxy-test-connection", new GalaxyTestConnectionCommand());
        commandLine.addSubcommand("galaxy-last-deploy", new GalaxyDeployTimeCommand());
        commandLine.addSubcommand("galaxy-discover", new GalaxyDiscoverCommand());
        commandLine.addSubcommand("galaxy-browse", new GalaxyBrowseCommand());
        commandLine.addSubcommand("galaxy-watch", new GalaxyWatchCommand());
        commandLine.addSubcommand("batch", new BatchCommand(clientFactory));
        return commandLine;
@@ -359,7 +370,10 @@ public final class MxGatewayCli implements Callable<Integer> {
        }
    }

    @Command(name = "galaxy-test", description = "Calls GalaxyRepository.TestConnection.")
    @Command(
            name = "galaxy-test-connection",
            aliases = {"galaxy-test"},
            description = "Calls GalaxyRepository.TestConnection.")
    static final class GalaxyTestConnectionCommand extends GalaxyCommand {
        @Override
        public Integer call() {
@@ -368,7 +382,7 @@ public final class MxGatewayCli implements Callable<Integer> {
                PrintWriter out = common.spec.commandLine().getOut();
                if (json) {
                    Map<String, Object> output = new LinkedHashMap<>();
                    output.put("command", "galaxy-test");
                    output.put("command", "galaxy-test-connection");
                    output.put("options", common.redactedJsonMap());
                    output.put("ok", ok);
                    out.println(jsonObject(output));
@@ -380,7 +394,10 @@ public final class MxGatewayCli implements Callable<Integer> {
        }
    }

    @Command(name = "galaxy-deploy-time", description = "Calls GalaxyRepository.GetLastDeployTime.")
    @Command(
            name = "galaxy-last-deploy",
            aliases = {"galaxy-deploy-time"},
            description = "Calls GalaxyRepository.GetLastDeployTime.")
    static final class GalaxyDeployTimeCommand extends GalaxyCommand {
        @Override
        public Integer call() {
@@ -389,7 +406,7 @@ public final class MxGatewayCli implements Callable<Integer> {
                PrintWriter out = common.spec.commandLine().getOut();
                if (json) {
                    Map<String, Object> output = new LinkedHashMap<>();
                    output.put("command", "galaxy-deploy-time");
                    output.put("command", "galaxy-last-deploy");
                    output.put("options", common.redactedJsonMap());
                    output.put("present", result.isPresent());
                    output.put("timeOfLastDeploy", result.map(Instant::toString).orElse(""));
@@ -429,6 +446,274 @@ public final class MxGatewayCli implements Callable<Integer> {
        }
    }

    /**
     * Page size used for the raw {@code BrowseChildren} paging loop driven by
     * the {@code --parent} one-level path. Mirrors {@code BROWSE_CHILDREN_PAGE_SIZE}
     * in the client library's lazy-browse helper and the other clients' CLI page
     * size so paging behaviour is consistent across languages.
     */
    private static final int BROWSE_CHILDREN_CLI_PAGE_SIZE = 500;

    @Command(
            name = "galaxy-browse",
            description = "Browses the Galaxy hierarchy via GalaxyRepository.BrowseChildren.")
    static final class GalaxyBrowseCommand extends GalaxyCommand {
        @Spec
        private CommandSpec spec;

        @Option(
                names = "--parent",
                defaultValue = "-1",
                description =
                        "Parent gobject id to browse one level of children for."
                                + " Use the default (omit) to walk root nodes;"
                                + " gobject id 0 is reserved by the server to mean roots.")
        int parent;

        @Option(
                names = "--depth",
                defaultValue = "0",
                description =
                        "When walking roots, eagerly expand this many further levels before printing."
                                + " Must be between 0 and 50 inclusive.")
        int depth;

        @Option(names = "--category-ids", description = "Comma-separated category ids to include.")
        String categoryIds;

        @Option(names = "--template-contains", description = "Comma-separated template names each child's chain must contain.")
        String templateContains;

        @Option(names = "--tag-name-glob", description = "SQL-LIKE-style glob applied to tag_name.")
        String tagNameGlob;

        @Option(names = "--alarm-bearing-only", description = "Restrict to alarm-bearing objects.")
        boolean alarmBearingOnly;

        @Option(names = "--historized-only", description = "Restrict to objects with at least one historized attribute.")
        boolean historizedOnly;

        @Option(names = "--include-attributes", description = "Request attribute population on each returned object.")
        boolean includeAttributes;

        @Override
        public Integer call() {
            if (depth < 0) {
                throw new CommandLine.ParameterException(spec.commandLine(), "--depth must be non-negative");
            }
            if (depth > 50) {
                throw new CommandLine.ParameterException(spec.commandLine(), "--depth must be at most 50");
            }
            BrowseChildrenOptions options = buildOptions();
            PrintWriter out = common.spec.commandLine().getOut();
            PrintWriter err = common.spec.commandLine().getErr();
            if (parent == 0) {
                err.println("warning: --parent 0 is the server sentinel for root nodes; omit --parent to walk roots instead.");
            }
            try (GalaxyRepositoryClient client = connect()) {
                if (parent >= 0) {
                    if (depth > 0) {
                        err.println("warning: --depth is ignored when --parent is specified.");
                    }
                    List<BrowseChild> children = browseOneLevel(client, parent, options);
                    if (json) {
                        List<Map<String, Object>> nodes = new ArrayList<>(children.size());
                        for (BrowseChild child : children) {
                            nodes.add(browseNodeMap(child.object(), child.hasChildrenHint(), List.of()));
                        }
                        Map<String, Object> output = new LinkedHashMap<>();
                        output.put("command", "galaxy-browse");
                        output.put("options", common.redactedJsonMap());
                        output.put("parentId", parent);
                        output.put("nodes", nodes);
                        out.println(jsonObject(output));
                    } else {
                        out.println(children.size());
                        for (BrowseChild child : children) {
                            printBrowseChild(out, child);
                        }
                    }
                    return 0;
                }

                List<LazyBrowseNode> roots = client.browse(options);
                for (LazyBrowseNode root : roots) {
                    expandToDepth(root, depth);
                }
                if (json) {
                    List<Map<String, Object>> nodes = new ArrayList<>(roots.size());
                    for (LazyBrowseNode root : roots) {
                        nodes.add(lazyNodeMap(root));
                    }
                    Map<String, Object> output = new LinkedHashMap<>();
                    output.put("command", "galaxy-browse");
                    output.put("options", common.redactedJsonMap());
                    output.put("nodes", nodes);
                    out.println(jsonObject(output));
                } else {
                    out.println(roots.size());
                    for (LazyBrowseNode root : roots) {
                        printLazyNode(out, root, 0);
                    }
                }
            }
            return 0;
        }

        private BrowseChildrenOptions buildOptions() {
            return BrowseChildrenOptions.builder()
                    .categoryIds(parseOptionalIntList(categoryIds))
                    .templateChainContains(parseOptionalStringList(templateContains))
                    .tagNameGlob(tagNameGlob == null ? "" : tagNameGlob)
                    // Tri-state: only override the server default when the flag is present.
                    .includeAttributes(includeAttributes ? Boolean.TRUE : null)
                    .alarmBearingOnly(alarmBearingOnly)
                    .historizedOnly(historizedOnly)
                    .build();
        }
    }

    /** One raw {@code BrowseChildren} child paired with its server-supplied has-children hint. */
    private record BrowseChild(GalaxyObject object, boolean hasChildrenHint) {
    }

    /**
     * Drives the raw {@code BrowseChildren} paging loop for a single parent and
     * returns the flattened one-level child list. Used by the {@code --parent}
     * path, which surfaces a single level rather than the lazy root-tree walk.
     */
    private static List<BrowseChild> browseOneLevel(
            GalaxyRepositoryClient client, int parentGobjectId, BrowseChildrenOptions options) {
        List<BrowseChild> children = new ArrayList<>();
        Set<String> seenPageTokens = new HashSet<>();
        String pageToken = "";
        while (true) {
            BrowseChildrenRequest.Builder builder = BrowseChildrenRequest.newBuilder()
                    .setPageSize(BROWSE_CHILDREN_CLI_PAGE_SIZE)
                    .setPageToken(pageToken)
                    .setParentGobjectId(parentGobjectId)
                    .setAlarmBearingOnly(options.isAlarmBearingOnly())
                    .setHistorizedOnly(options.isHistorizedOnly());
            if (!options.getCategoryIds().isEmpty()) {
                builder.addAllCategoryIds(options.getCategoryIds());
            }
            if (!options.getTemplateChainContains().isEmpty()) {
                builder.addAllTemplateChainContains(options.getTemplateChainContains());
            }
            if (!options.getTagNameGlob().isEmpty()) {
                builder.setTagNameGlob(options.getTagNameGlob());
            }
            if (options.getIncludeAttributes() != null) {
                builder.setIncludeAttributes(options.getIncludeAttributes());
            }

            BrowseChildrenReply reply = client.browseChildrenRaw(builder.build());
            for (int i = 0; i < reply.getChildrenCount(); i++) {
                boolean hint = i < reply.getChildHasChildrenCount() && reply.getChildHasChildren(i);
                children.add(new BrowseChild(reply.getChildren(i), hint));
            }

            pageToken = reply.getNextPageToken();
            if (pageToken == null || pageToken.isEmpty()) {
                return children;
            }
            if (!seenPageTokens.add(pageToken)) {
                throw new IllegalStateException(
                        "galaxy browse children returned repeated page token: " + pageToken);
            }
        }
    }

    /**
     * Recursively expands a {@link LazyBrowseNode} up to {@code depth} further
     * levels. A {@code depth} of 0 leaves the node unexpanded so callers print
     * only the requested level. Nodes the server reports as childless are not
     * expanded.
     */
    private static void expandToDepth(LazyBrowseNode node, int depth) {
        if (depth <= 0) {
            return;
        }
        if (node.hasChildrenHint()) {
            node.expand();
        }
        for (LazyBrowseNode child : node.getChildren()) {
            expandToDepth(child, depth - 1);
        }
    }

    /**
     * Renders one {@link LazyBrowseNode} (and any already-expanded descendants)
     * as a JSON map. Mirrors the {@code galaxy-discover} object shape with an
     * added {@code hasChildrenHint} flag and a nested {@code children} array,
     * matching the cross-client browse JSON surface.
     */
    private static Map<String, Object> lazyNodeMap(LazyBrowseNode node) {
        List<Map<String, Object>> children = new ArrayList<>();
        if (node.isExpanded()) {
            for (LazyBrowseNode child : node.getChildren()) {
                children.add(lazyNodeMap(child));
            }
        }
        return browseNodeMap(node.getObject(), node.hasChildrenHint(), children);
    }

    /**
     * Builds the per-node browse JSON map: the flattened Galaxy object fields,
     * the {@code hasChildrenHint} flag, and a nested {@code children} array.
     * The {@code hasChildrenHint} key is the cross-client standard (Rust /
     * Python / .NET / Go all use the same key and node shape).
     */
    static Map<String, Object> browseNodeMap(
            GalaxyObject object, boolean hasChildrenHint, List<Map<String, Object>> children) {
        Map<String, Object> values = galaxyObjectMap(object);
        values.put("hasChildrenHint", hasChildrenHint);
        values.put("children", children);
        return values;
    }

    private static void printLazyNode(PrintWriter out, LazyBrowseNode node, int level) {
        GalaxyObject obj = node.getObject();
        out.printf(
                "%s%d\t%s\t%s\t(attrs=%d, hasChildrenHint=%b)%n",
                " ".repeat(level),
                obj.getGobjectId(),
                obj.getTagName(),
                obj.getBrowseName(),
                obj.getAttributesCount(),
                node.hasChildrenHint());
        if (node.isExpanded()) {
            for (LazyBrowseNode child : node.getChildren()) {
                printLazyNode(out, child, level + 1);
            }
        }
    }

    private static void printBrowseChild(PrintWriter out, BrowseChild child) {
        GalaxyObject obj = child.object();
        out.printf(
                "%d\t%s\t%s\t(attrs=%d, hasChildrenHint=%b)%n",
                obj.getGobjectId(),
                obj.getTagName(),
                obj.getBrowseName(),
                obj.getAttributesCount(),
                child.hasChildrenHint());
    }

    private static List<Integer> parseOptionalIntList(String value) {
        if (value == null || value.isBlank()) {
            return List.of();
        }
        return parseIntList(value);
    }

    private static List<String> parseOptionalStringList(String value) {
        if (value == null || value.isBlank()) {
            return List.of();
        }
        return parseStringList(value);
    }

    @Command(
            name = "galaxy-watch",
            description = "Streams GalaxyRepository.WatchDeployEvents until cancelled.")
@@ -622,6 +907,31 @@ public final class MxGatewayCli implements Callable<Integer> {
        }
    }

    @Command(name = "ping", description = "Sends a diagnostic ping command to the session worker.")
    static final class PingCommandLine extends GatewayCommand {
        @Option(names = "--session-id", required = true, description = "Gateway session id.")
        String sessionId;

        @Option(names = "--message", defaultValue = "ping", description = "Message echoed back in the reply.")
        String message;

        PingCommandLine(MxGatewayCliClientFactory clientFactory) {
            super(clientFactory);
        }

        @Override
        public Integer call() {
            try (MxGatewayCliClient client = clientFactory.connect(common.resolved())) {
                MxCommandReply reply = client.session(sessionId).pingRaw(message);
                // The worker echoes the message in the diagnostic message field;
                // there is no dedicated ping reply payload, so the plain-text path
                // surfaces that field.
                writeOutput("ping", common, json, reply, reply::getDiagnosticMessage);
            }
            return 0;
        }
    }

    @Command(name = "register", description = "Invokes MXAccess Register.")
    static final class RegisterCommand extends GatewayCommand {
        @Option(names = "--session-id", required = true, description = "Gateway session id.")
@@ -1438,6 +1748,8 @@ public final class MxGatewayCli implements Callable<Integer> {
    }

    interface MxGatewayCliSession {
        MxCommandReply pingRaw(String message);

        int register(String clientName);

        MxCommandReply registerRaw(String clientName);
@@ -1523,6 +1835,14 @@ public final class MxGatewayCli implements Callable<Integer> {
    }

    record GrpcMxGatewayCliSession(MxGatewaySession session) implements MxGatewayCliSession {
        @Override
        public MxCommandReply pingRaw(String message) {
            return session.invokeCommand(MxCommand.newBuilder()
                    .setKind(MxCommandKind.MX_COMMAND_KIND_PING)
                    .setPing(PingCommand.newBuilder().setMessage(message))
                    .build());
        }

        @Override
        public int register(String clientName) {
            return session.register(clientName);
+175
@@ -6,6 +6,7 @@ import static org.junit.jupiter.api.Assertions.assertTrue;

import com.zb.mom.ww.mxgateway.client.MxGatewayAlarmFeedSubscription;
import com.zb.mom.ww.mxgateway.client.MxGatewayClientOptions;
import galaxy_repository.v1.GalaxyRepositoryOuterClass.GalaxyObject;
import io.grpc.stub.StreamObserver;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
@@ -15,6 +16,7 @@ import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import mxaccess_gateway.v1.MxaccessGateway.AcknowledgeAlarmReply;
import mxaccess_gateway.v1.MxaccessGateway.AcknowledgeAlarmRequest;
import mxaccess_gateway.v1.MxaccessGateway.ActiveAlarmSnapshot;
@@ -124,6 +126,166 @@ final class MxGatewayCliTests {
        assertTrue(run.output().contains("\"itemHandle\":7"));
    }

    // ---- ping subcommand (D4) ----

    @Test
    void pingCommandForwardsMessageAndPrintsEcho() {
        FakeClientFactory factory = new FakeClientFactory();
        CliRun run = execute(
                factory, "ping", "--session-id", "session-cli", "--message", "hello-mxgw");

        assertEquals(0, run.exitCode());
        assertEquals("hello-mxgw", factory.client.session.lastPingMessage);
        // The worker echoes the message in the diagnostic message field; the
        // plain-text path surfaces exactly that echoed value.
        assertEquals("hello-mxgw", run.output().trim());
    }

    @Test
    void pingCommandDefaultsMessageToPing() {
        FakeClientFactory factory = new FakeClientFactory();
        CliRun run = execute(factory, "ping", "--session-id", "session-cli");

        assertEquals(0, run.exitCode());
        assertEquals("ping", factory.client.session.lastPingMessage);
    }

    @Test
    void pingCommandJsonIncludesPingKindAndDiagnosticMessage() {
        FakeClientFactory factory = new FakeClientFactory();
        CliRun run = execute(
                factory, "ping", "--session-id", "session-cli", "--message", "diag-1", "--json");

        assertEquals(0, run.exitCode());
        String out = run.output();
        assertTrue(out.contains("\"command\":\"ping\""), out);
        assertTrue(out.contains("\"kind\":\"MX_COMMAND_KIND_PING\""), out);
        assertTrue(out.contains("diag-1"), out);
    }

    // ---- galaxy-browse subcommand (D8-java) ----

    @Test
    void galaxyBrowseNodeJsonUsesHasChildrenHintKeyAndFlattensObjectFields() {
        GalaxyObject object = GalaxyObject.newBuilder()
                .setGobjectId(101)
                .setTagName("Area001")
                .setBrowseName("Area001")
                .build();
        Map<String, Object> leaf = MxGatewayCli.browseNodeMap(
                GalaxyObject.newBuilder().setGobjectId(202).setTagName("Pump001").build(),
                false,
                List.of());
        Map<String, Object> node = MxGatewayCli.browseNodeMap(object, true, List.of(leaf));

        // Cross-client JSON parity: the per-node "has children" flag MUST use the
        // key hasChildrenHint (Rust / Python / .NET / Go all standardized on it).
        assertTrue(node.containsKey("hasChildrenHint"), node.toString());
        assertEquals(Boolean.TRUE, node.get("hasChildrenHint"));
        // Object fields are flattened directly into the node (matching the
        // galaxy-discover object shape), not nested under an "object" key.
        assertFalse(node.containsKey("object"), node.toString());
        assertEquals(101L, ((Number) node.get("gobjectId")).longValue());
        assertEquals("Area001", node.get("tagName"));
        // Nested children array carries the same node shape recursively.
        @SuppressWarnings("unchecked")
        List<Map<String, Object>> children = (List<Map<String, Object>>) node.get("children");
        assertEquals(1, children.size());
        assertTrue(children.get(0).containsKey("hasChildrenHint"));
        assertEquals(Boolean.FALSE, children.get(0).get("hasChildrenHint"));
    }

    @Test
    void galaxyBrowseInvocationsParseCleanly() {
        // galaxy-browse connects via GalaxyRepositoryClient.connect (a static),
        // so the full surface is exercised only by the cross-language matrix
        // against a live gateway. Here we assert the option surface parses.
        assertReadmeExampleParses(new String[] {"galaxy-browse", "--json"});
        assertReadmeExampleParses(new String[] {"galaxy-browse", "--parent", "42", "--json"});
        assertReadmeExampleParses(new String[] {
            "galaxy-browse",
            "--depth", "2",
            "--category-ids", "1,2",
            "--template-contains", "$Pump",
            "--tag-name-glob", "Area%",
            "--alarm-bearing-only",
            "--historized-only",
            "--include-attributes",
            "--json"
        });
    }

    @Test
    void galaxyBrowseNegativeDepthYieldsNonZeroExitViaParameterException() {
        // Fix: --depth validation must surface as a picocli ParameterException
        // (clean error line on stderr) rather than an unhandled IllegalArgumentException
        // stack trace. Picocli maps ParameterException to exit code 2.
        CliRun run = execute(new FakeClientFactory(), "galaxy-browse", "--depth", "-1");

        assertFalse(run.exitCode() == 0, "expected non-zero exit for --depth -1");
        // Picocli writes ParameterException messages to the error writer.
        assertTrue(run.errors().contains("--depth"), "expected --depth in error output: " + run.errors());
    }

    @Test
    void galaxyBrowseDepthAbove50YieldsNonZeroExit() {
        CliRun run = execute(new FakeClientFactory(), "galaxy-browse", "--depth", "51");

        assertFalse(run.exitCode() == 0, "expected non-zero exit for --depth 51");
        assertTrue(run.errors().contains("--depth"), "expected --depth in error output: " + run.errors());
    }

    @Test
    void galaxyBrowseParentZeroEmitsWarningToStderr() {
        // --parent 0 is the server sentinel for roots; passing it explicitly is
        // almost certainly a mistake. The CLI must print a warning to stderr
        // (matching Go/Rust client behaviour) but must still attempt the call
        // (exit behaviour depends on gateway reachability, not tested here;
        // we only assert the warning path is triggered by checking the error
        // writer before any gRPC connection is attempted).
        //
        // GalaxyBrowseCommand connects to a real GalaxyRepositoryClient, so the
        // call() body will throw after printing the warning when no gateway is
        // reachable. We only assert the warning appears on stderr.
        StringWriter output = new StringWriter();
        StringWriter errors = new StringWriter();
        // Non-zero exit is expected (no live gateway), but the warning must
        // appear on stderr regardless of what happens next.
        MxGatewayCli.execute(
                new FakeClientFactory(),
                new PrintWriter(output, true),
                new PrintWriter(errors, true),
                "galaxy-browse", "--parent", "0", "--depth", "1");

        assertTrue(
                errors.toString().contains("--parent 0"),
                "expected '--parent 0' warning on stderr; got: " + errors);
    }

    // ---- galaxy command-name aliases (D9-java) ----

    @Test
    void galaxyTestConnectionCanonicalAndDeprecatedAliasResolve() {
        picocli.CommandLine commandLine = MxGatewayCli.commandLine(new FakeClientFactory());
        // Both the canonical dash-separated name and the deprecated short alias
        // must resolve to the same subcommand so existing scripts keep working.
        assertTrue(commandLine.getSubcommands().containsKey("galaxy-test-connection"));
        assertTrue(commandLine.getSubcommands().containsKey("galaxy-test"));
        assertEquals(
                commandLine.getSubcommands().get("galaxy-test-connection"),
                commandLine.getSubcommands().get("galaxy-test"));
    }

    @Test
    void galaxyLastDeployCanonicalAndDeprecatedAliasResolve() {
        picocli.CommandLine commandLine = MxGatewayCli.commandLine(new FakeClientFactory());
        assertTrue(commandLine.getSubcommands().containsKey("galaxy-last-deploy"));
        assertTrue(commandLine.getSubcommands().containsKey("galaxy-deploy-time"));
        assertEquals(
                commandLine.getSubcommands().get("galaxy-last-deploy"),
                commandLine.getSubcommands().get("galaxy-deploy-time"));
    }

    @Test
    void subscribeBulkCommandPrintsResults() {
        CliRun run = execute(
@@ -652,6 +814,19 @@ final class MxGatewayCliTests {
        private boolean addItemCalled;
        private boolean adviseCalled;
        private MxValue lastWriteValue;
        private String lastPingMessage;

        @Override
        public MxCommandReply pingRaw(String message) {
            lastPingMessage = message;
            // The worker echoes the request message in the diagnostic message
            // field; there is no dedicated ping reply payload.
            return MxCommandReply.newBuilder()
                    .setKind(MxCommandKind.MX_COMMAND_KIND_PING)
                    .setProtocolStatus(ok())
                    .setDiagnosticMessage(message)
                    .build();
        }

        @Override
        public int register(String clientName) {
@@ -214,9 +214,33 @@ The method returns an async iterator yielding the generated `DeployEvent`
proto. Breaking out of the loop, calling `aclose()` on the iterator, or
cancelling the surrounding task closes the underlying gRPC stream
cleanly. The streaming RPC requires the same `metadata:read` scope as
the other Galaxy methods. The CLI does not currently expose a
streaming `watch-deploy-events` subcommand — use the library API
|
||||
directly when subscribing to deploy events from Python.
|
||||
the other Galaxy methods.
|
||||
|
||||
The CLI exposes the Galaxy Repository RPCs through five subcommands that
mirror the other clients:

```bash
mxgw-py galaxy-test-connection --plaintext --json
mxgw-py galaxy-last-deploy --plaintext --json
mxgw-py galaxy-discover --plaintext --json
mxgw-py galaxy-browse --plaintext --json
mxgw-py galaxy-watch --plaintext --json
```
|
||||
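As a rough illustration of consuming the `--json` output, the sketch below parses a `galaxy-last-deploy` payload. The key names (`command`, `present`, `timeOfLastDeploy`) and the UTC stamping follow the CLI code in this change; the helper names `stamp_utc` and `parse_last_deploy` and the sample payload are hypothetical and exist only for this example.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def stamp_utc(naive_utc: datetime) -> str:
    """Attach an explicit UTC offset to a naive UTC datetime and format it as ISO-8601."""
    return naive_utc.replace(tzinfo=timezone.utc).isoformat()


def parse_last_deploy(output: str) -> Optional[datetime]:
    """Return the deploy time from a galaxy-last-deploy JSON line, or None if absent."""
    payload = json.loads(output)
    if not payload.get("present"):
        return None
    return datetime.fromisoformat(payload["timeOfLastDeploy"])


# Hypothetical sample output, built the same way the CLI builds its payload.
SAMPLE_OUTPUT = json.dumps(
    {
        "command": "galaxy-last-deploy",
        "present": True,
        "timeOfLastDeploy": stamp_utc(datetime(2025, 6, 15, 12, 0, 0)),
    },
    sort_keys=True,
)

if __name__ == "__main__":
    print(parse_last_deploy(SAMPLE_OUTPUT))
```

Stamping the offset before formatting means a consumer never has to guess whether the timestamp is local time or UTC.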
`galaxy-watch` is bounded by `--max-events` (default `1`) and `--timeout`
(seconds to wait for each event, default `5.0`) so it always terminates; pass
`--last-seen-deploy-time` (an ISO-8601 timestamp) to suppress the bootstrap
event when it matches the current cached deploy time.
|
||||
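The bounded behaviour described for `galaxy-watch` can be sketched as a small helper that reads from any async iterator until either an event cap or a per-event timeout is hit. This is a minimal sketch, not the library API: `collect_bounded` and `fake_deploy_events` are hypothetical names, and the fake stream stands in for the real deploy-event iterator.

```python
import asyncio
from typing import Any, AsyncIterator


async def collect_bounded(
    events: AsyncIterator[Any], *, max_events: int, timeout: float
) -> list[Any]:
    """Collect at most max_events items, waiting up to timeout seconds for each one."""
    collected: list[Any] = []
    iterator = events.__aiter__()
    try:
        while len(collected) < max_events:
            collected.append(await asyncio.wait_for(iterator.__anext__(), timeout=timeout))
    except (StopAsyncIteration, asyncio.TimeoutError):
        # The stream ended or went quiet: return whatever was gathered so far.
        pass
    finally:
        # Close the iterator so the underlying stream is released.
        close = getattr(iterator, "aclose", None)
        if close is not None:
            await close()
    return collected


async def fake_deploy_events() -> AsyncIterator[str]:
    """Hypothetical stand-in for a deploy-event stream: two events, then silence."""
    yield "deploy-1"
    yield "deploy-2"
    await asyncio.sleep(3600)


def run_demo() -> tuple[list[str], list[str]]:
    # Stops after one event because of the cap.
    first = asyncio.run(collect_bounded(fake_deploy_events(), max_events=1, timeout=0.05))
    # Stops after two events because the stream goes quiet and the timeout fires.
    quiet = asyncio.run(collect_bounded(fake_deploy_events(), max_events=5, timeout=0.05))
    return first, quiet


if __name__ == "__main__":
    print(run_demo())
```

Because the timeout applies per event rather than to the whole call, the worst-case run time in this sketch is roughly `max_events * timeout`.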
`galaxy-browse` wraps the lazy `LazyBrowseNode` walker. Without `--depth`
it lists only the root objects; `--depth N` (`0` to `50`) eagerly expands
`N` further levels before printing. Text output is a count of the top-level
nodes followed by an indented tree (`+`/`-` marks the server's has-children
hint); `--json` emits nested
`{..., "hasChildrenHint": bool, "children": [...]}` nodes that match the
`galaxy-discover` object shape. The `BrowseChildrenRequest` filters are
exposed as `--category-id` (repeatable), `--template-chain-contains`
(repeatable), `--tag-name-glob`, `--include-attributes`,
`--alarm-bearing-only`, and `--historized-only`, all AND-combined.
|
||||
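To make the `--depth` semantics concrete, here is a small sketch of depth-limited expansion and the indented text rendering. `FakeNode` is a hypothetical stand-in for the lazy browse node, not the library class, and the rendered line is simplified to the tag name and gobject id.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Hypothetical lazy node: `pending` holds what a server call would return."""

    tag_name: str
    gobject_id: int
    pending: list["FakeNode"] = field(default_factory=list)
    children: list["FakeNode"] = field(default_factory=list)
    is_expanded: bool = False

    @property
    def has_children_hint(self) -> bool:
        return bool(self.pending)

    async def expand(self) -> None:
        # A real node would fetch its children from the gateway here.
        self.children = list(self.pending)
        self.is_expanded = True


async def expand_to_depth(node: FakeNode, depth: int) -> None:
    """Expand `depth` further levels below `node`; depth 0 leaves it untouched."""
    if depth <= 0:
        return
    if node.has_children_hint:
        await node.expand()
    for child in node.children:
        await expand_to_depth(child, depth - 1)


def render(roots: list[FakeNode]) -> str:
    """Render a count line for the top-level nodes, then a tree indented two spaces per level."""
    lines = [str(len(roots))]

    def walk(node: FakeNode, indent: int) -> None:
        marker = "+" if node.has_children_hint else "-"
        lines.append(f"{' ' * indent}{marker} {node.tag_name} (gobject {node.gobject_id})")
        if node.is_expanded:
            for child in node.children:
                walk(child, indent + 2)

    for root in roots:
        walk(root, 0)
    return "\n".join(lines)


def browse_text(depth: int) -> str:
    pump = FakeNode("Pump001", 8, pending=[FakeNode("Motor001", 9)])
    roots = [FakeNode("Area001", 7, pending=[pump])]

    async def expand_all() -> None:
        for root in roots:
            await expand_to_depth(root, depth)

    asyncio.run(expand_all())
    return render(roots)


if __name__ == "__main__":
    print(browse_text(2))
```

With depth `0` only the root line is printed, still marked `+` because the hint says children exist; each extra level of depth reveals one more indentation step.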
## Authentication And TLS
|
||||
|
||||
|
||||
@@ -21,8 +21,10 @@ from zb_mom_ww_mxgateway import __version__
|
||||
from zb_mom_ww_mxgateway.auth import redact_secret
|
||||
from zb_mom_ww_mxgateway.client import GatewayClient
|
||||
from zb_mom_ww_mxgateway.errors import MxGatewayError
|
||||
from zb_mom_ww_mxgateway.galaxy import GalaxyRepositoryClient
|
||||
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb
|
||||
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2 as pb
|
||||
from zb_mom_ww_mxgateway.options import ClientOptions
|
||||
from zb_mom_ww_mxgateway.options import BrowseChildrenOptions, ClientOptions
|
||||
from zb_mom_ww_mxgateway.values import MxValueInput, to_mx_value
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
@@ -512,6 +514,148 @@ def smoke(**kwargs: Any) -> None:
|
||||
_run(_smoke(**kwargs), output_json=kwargs["output_json"], secrets=_secrets(kwargs))
|
||||
|
||||
|
||||
@main.command("galaxy-test-connection")
|
||||
@gateway_options
|
||||
@click.option("--json", "output_json", is_flag=True, help="Emit JSON output.")
|
||||
def galaxy_test_connection(**kwargs: Any) -> None:
|
||||
"""Test whether the gateway can reach the Galaxy Repository DB."""
|
||||
|
||||
_run(
|
||||
_galaxy_test_connection(**kwargs),
|
||||
output_json=kwargs["output_json"],
|
||||
secrets=_secrets(kwargs),
|
||||
)
|
||||
|
||||
|
||||
@main.command("galaxy-last-deploy")
|
||||
@gateway_options
|
||||
@click.option("--json", "output_json", is_flag=True, help="Emit JSON output.")
|
||||
def galaxy_last_deploy(**kwargs: Any) -> None:
|
||||
"""Read the last Galaxy deploy timestamp."""
|
||||
|
||||
_run(
|
||||
_galaxy_last_deploy(**kwargs),
|
||||
output_json=kwargs["output_json"],
|
||||
secrets=_secrets(kwargs),
|
||||
)
|
||||
|
||||
|
||||
@main.command("galaxy-discover")
|
||||
@gateway_options
|
||||
@click.option("--json", "output_json", is_flag=True, help="Emit JSON output.")
|
||||
def galaxy_discover(**kwargs: Any) -> None:
|
||||
"""Enumerate the deployed Galaxy object hierarchy."""
|
||||
|
||||
_run(
|
||||
_galaxy_discover(**kwargs),
|
||||
output_json=kwargs["output_json"],
|
||||
secrets=_secrets(kwargs),
|
||||
)
|
||||
|
||||
|
||||
@main.command("galaxy-browse")
|
||||
@gateway_options
|
||||
@click.option(
|
||||
"--parent-gobject-id",
|
||||
"parent_gobject_id",
|
||||
default=None,
|
||||
type=int,
|
||||
help=(
|
||||
"Fetch one level of this parent's direct children via BrowseChildren "
|
||||
"instead of the lazy root walk. Pass a gobject id >= 1. "
|
||||
"(gobject-id 0 is the server root sentinel — omit the flag to list root objects.) "
|
||||
"--depth is ignored when this option is set."
|
||||
),
|
||||
)
|
||||
@click.option(
|
||||
"--depth",
|
||||
default=0,
|
||||
type=int,
|
||||
show_default=True,
|
||||
help="Eagerly expand the root nodes this many further levels before printing.",
|
||||
)
|
||||
@click.option(
|
||||
"--category-id",
|
||||
"category_ids",
|
||||
multiple=True,
|
||||
type=int,
|
||||
help="Restrict to objects whose category_id matches one of these ids (repeatable).",
|
||||
)
|
||||
@click.option(
|
||||
"--template-chain-contains",
|
||||
"template_chain_contains",
|
||||
multiple=True,
|
||||
help="Restrict to objects whose template chain contains this entry (repeatable).",
|
||||
)
|
||||
@click.option(
|
||||
"--tag-name-glob",
|
||||
"tag_name_glob",
|
||||
default=None,
|
||||
help="Restrict to objects whose tag name matches this glob.",
|
||||
)
|
||||
@click.option(
|
||||
"--include-attributes",
|
||||
"include_attributes",
|
||||
is_flag=True,
|
||||
help="Include each object's attribute metadata in the browse results.",
|
||||
)
|
||||
@click.option(
|
||||
"--alarm-bearing-only",
|
||||
"alarm_bearing_only",
|
||||
is_flag=True,
|
||||
help="Only return objects that own at least one alarm-bearing attribute.",
|
||||
)
|
||||
@click.option(
|
||||
"--historized-only",
|
||||
"historized_only",
|
||||
is_flag=True,
|
||||
help="Only return objects that own at least one historized attribute.",
|
||||
)
|
||||
@click.option("--json", "output_json", is_flag=True, help="Emit JSON output.")
|
||||
def galaxy_browse(**kwargs: Any) -> None:
|
||||
"""Browse the deployed Galaxy object hierarchy as a lazy-expanded tree."""
|
||||
|
||||
_run(
|
||||
_galaxy_browse(**kwargs),
|
||||
output_json=kwargs["output_json"],
|
||||
secrets=_secrets(kwargs),
|
||||
)
|
||||
|
||||
|
||||
@main.command("galaxy-watch")
|
||||
@gateway_options
|
||||
@click.option(
|
||||
"--last-seen-deploy-time",
|
||||
"last_seen_deploy_time",
|
||||
default=None,
|
||||
help="ISO-8601 timestamp; when it matches the current cached deploy time the "
|
||||
"bootstrap event is suppressed.",
|
||||
)
|
||||
@click.option(
|
||||
"--max-events",
|
||||
default=1,
|
||||
type=int,
|
||||
show_default=True,
|
||||
help="Stop after collecting this many deploy events.",
|
||||
)
|
||||
@click.option(
|
||||
"--timeout",
|
||||
default=5.0,
|
||||
type=float,
|
||||
show_default=True,
|
||||
help="Seconds to wait for each event before stopping.",
|
||||
)
|
||||
@click.option("--json", "output_json", is_flag=True, help="Emit JSON output.")
|
||||
def galaxy_watch(**kwargs: Any) -> None:
|
||||
"""Stream a bounded number of Galaxy deploy events."""
|
||||
|
||||
_run(
|
||||
_galaxy_watch(**kwargs),
|
||||
output_json=kwargs["output_json"],
|
||||
secrets=_secrets(kwargs),
|
||||
)
|
||||
|
||||
|
||||
async def _open_session(**kwargs: Any) -> dict[str, Any]:
|
||||
async with await _connect(kwargs) as client:
|
||||
reply = await client.open_session_raw(
|
||||
@@ -922,6 +1066,215 @@ async def _smoke(**kwargs: Any) -> dict[str, Any]:
|
||||
await session.close()
|
||||
|
||||
|
||||
async def _galaxy_test_connection(**kwargs: Any) -> dict[str, Any]:
|
||||
async with await _connect_galaxy(kwargs) as galaxy:
|
||||
ok = await galaxy.test_connection()
|
||||
return {"command": "galaxy-test-connection", "ok": ok}
|
||||
|
||||
|
||||
async def _galaxy_last_deploy(**kwargs: Any) -> dict[str, Any]:
|
||||
async with await _connect_galaxy(kwargs) as galaxy:
|
||||
last_deploy = await galaxy.get_last_deploy_time()
|
||||
payload: dict[str, Any] = {
|
||||
"command": "galaxy-last-deploy",
|
||||
"present": last_deploy is not None,
|
||||
}
|
||||
if last_deploy is not None:
|
||||
# galaxy.py returns a timezone-NAIVE UTC datetime (protobuf ToDatetime()).
|
||||
# Stamp it as UTC so the emitted ISO-8601 carries an unambiguous offset,
|
||||
# matching the Go client's "...Z" output.
|
||||
payload["timeOfLastDeploy"] = last_deploy.replace(tzinfo=timezone.utc).isoformat()
|
||||
return payload
|
||||
|
||||
|
||||
async def _galaxy_discover(**kwargs: Any) -> dict[str, Any]:
|
||||
async with await _connect_galaxy(kwargs) as galaxy:
|
||||
objects = await galaxy.discover_hierarchy()
|
||||
return {
|
||||
"command": "galaxy-discover",
|
||||
"objects": [_message_dict(obj) for obj in objects],
|
||||
}
|
||||
|
||||
|
||||
async def _galaxy_browse(**kwargs: Any) -> dict[str, Any]:
|
||||
depth = int(kwargs["depth"])
|
||||
if depth < 0 or depth > 50:
|
||||
raise click.BadParameter("--depth must be between 0 and 50", param_hint="--depth")
|
||||
parent_gobject_id: int | None = kwargs.get("parent_gobject_id")
|
||||
options = BrowseChildrenOptions(
|
||||
category_ids=tuple(kwargs.get("category_ids") or ()),
|
||||
template_chain_contains=tuple(kwargs.get("template_chain_contains") or ()),
|
||||
tag_name_glob=kwargs.get("tag_name_glob"),
|
||||
include_attributes=True if kwargs.get("include_attributes") else None,
|
||||
alarm_bearing_only=bool(kwargs.get("alarm_bearing_only")),
|
||||
historized_only=bool(kwargs.get("historized_only")),
|
||||
)
|
||||
async with await _connect_galaxy(kwargs) as galaxy:
|
||||
if parent_gobject_id is not None:
|
||||
# Single-level parent drill-down: drive BrowseChildren paging by hand
|
||||
# and return the children as a flat list. --depth is not meaningful
|
||||
# here; warn if the caller set it so they know it was ignored.
|
||||
if depth > 0:
|
||||
click.echo(
|
||||
"warning: --depth is ignored when --parent-gobject-id is specified",
|
||||
err=True,
|
||||
)
|
||||
children = await _browse_children_one_level(galaxy, parent_gobject_id, options)
|
||||
return {
|
||||
"command": "galaxy-browse",
|
||||
"nodes": [_browse_child_dict(obj, hint) for obj, hint in children],
|
||||
"_text": _render_browse_children(children),
|
||||
}
|
||||
|
||||
roots = await galaxy.browse(options)
|
||||
for root in roots:
|
||||
await _expand_to_depth(root, depth)
|
||||
return {
|
||||
"command": "galaxy-browse",
|
||||
"nodes": [_browse_node_dict(node) for node in roots],
|
||||
"_text": _render_browse_tree(roots),
|
||||
}
|
||||
|
||||
|
||||
async def _expand_to_depth(node: Any, depth: int) -> None:
|
||||
"""Recursively expand a LazyBrowseNode up to ``depth`` further levels.
|
||||
|
||||
``depth == 0`` leaves the node unexpanded so only the requested level is
|
||||
printed; each level beyond fetches and recurses into the loaded children.
|
||||
"""
|
||||
|
||||
if depth <= 0:
|
||||
return
|
||||
if node.has_children_hint:
|
||||
await node.expand()
|
||||
for child in node.children:
|
||||
await _expand_to_depth(child, depth - 1)
|
||||
|
||||
|
||||
def _browse_node_dict(node: Any) -> dict[str, Any]:
|
||||
"""Render one LazyBrowseNode (and any already-expanded descendants).
|
||||
|
||||
Mirrors the ``galaxy-discover`` object shape with an added
|
||||
``hasChildrenHint`` flag and a nested ``children`` array, matching the
|
||||
cross-client browse JSON surface.
|
||||
"""
|
||||
|
||||
payload = _message_dict(node.object)
|
||||
payload["hasChildrenHint"] = bool(node.has_children_hint)
|
||||
payload["children"] = (
|
||||
[_browse_node_dict(child) for child in node.children] if node.is_expanded else []
|
||||
)
|
||||
return payload
|
||||
|
||||
|
||||
def _render_browse_tree(roots: list[Any]) -> str:
|
||||
"""Render the lazy-browse roots as a node count plus an indented tree."""
|
||||
|
||||
lines: list[str] = [str(len(roots))]
|
||||
for root in roots:
|
||||
_append_browse_node_lines(root, 0, lines)
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
def _append_browse_node_lines(node: Any, indent: int, lines: list[str]) -> None:
|
||||
obj = node.object
|
||||
marker = "+" if node.has_children_hint else "-"
|
||||
pad = " " * indent
|
||||
lines.append(f"{pad}{marker} {obj.tag_name} {obj.browse_name} (gobject {obj.gobject_id})")
|
||||
if node.is_expanded:
|
||||
for child in node.children:
|
||||
_append_browse_node_lines(child, indent + 2, lines)
|
||||
|
||||
|
||||
_BROWSE_CHILDREN_PAGE_SIZE = 500
|
||||
|
||||
|
||||
async def _browse_children_one_level(
    galaxy: Any,
    parent_gobject_id: int,
    options: BrowseChildrenOptions,
) -> list[tuple[Any, bool]]:
    """Page through BrowseChildren for ``parent_gobject_id`` and return (object, hint) pairs.

    Uses page size 500 (matching the library constant) and guards against a
    repeated page token to prevent an infinite loop if the server misbehaves.
    """

    results: list[tuple[Any, bool]] = []
    seen_page_tokens: set[str] = set()
    page_token = ""

    while True:
        request = galaxy_pb.BrowseChildrenRequest(
            parent_gobject_id=parent_gobject_id,
            page_size=_BROWSE_CHILDREN_PAGE_SIZE,
            page_token=page_token,
            alarm_bearing_only=options.alarm_bearing_only,
            historized_only=options.historized_only,
        )
        if options.category_ids:
            request.category_ids.extend(options.category_ids)
        if options.template_chain_contains:
            request.template_chain_contains.extend(options.template_chain_contains)
        if options.tag_name_glob:
            request.tag_name_glob = options.tag_name_glob
        if options.include_attributes is not None:
            request.include_attributes = options.include_attributes

        reply = await galaxy.browse_children_raw(request)

        for index, obj in enumerate(reply.children):
            hint = index < len(reply.child_has_children) and bool(reply.child_has_children[index])
            results.append((obj, hint))

        page_token = reply.next_page_token
        if not page_token:
            return results
        if page_token in seen_page_tokens:
            raise MxGatewayError(
                f"galaxy browse children returned repeated page token {page_token!r}"
            )
        seen_page_tokens.add(page_token)
|
||||
|
||||
def _browse_child_dict(obj: Any, has_children_hint: bool) -> dict[str, Any]:
|
||||
"""Render one raw browse child as a node dict matching the lazy-browse shape.
|
||||
|
||||
The ``children`` array is always empty — the parent drill-down path returns
|
||||
a flat single-level listing without recursive expansion.
|
||||
"""
|
||||
|
||||
payload = _message_dict(obj)
|
||||
payload["hasChildrenHint"] = has_children_hint
|
||||
payload["children"] = []
|
||||
return payload
|
||||
|
||||
|
||||
def _render_browse_children(children: list[tuple[Any, bool]]) -> str:
|
||||
"""Render a flat one-level child list as a count line plus marker lines."""
|
||||
|
||||
lines: list[str] = [str(len(children))]
|
||||
for obj, has_children_hint in children:
|
||||
marker = "+" if has_children_hint else "-"
|
||||
lines.append(f"{marker} {obj.tag_name} {obj.browse_name} (gobject {obj.gobject_id})")
|
||||
return "\n".join(lines)
|
||||
|
||||
|
||||
async def _galaxy_watch(**kwargs: Any) -> dict[str, Any]:
|
||||
last_seen = kwargs.get("last_seen_deploy_time")
|
||||
last_seen_dt = _parse_datetime(last_seen) if last_seen else None
|
||||
async with await _connect_galaxy(kwargs) as galaxy:
|
||||
events = await _collect_deploy_events(
|
||||
galaxy.watch_deploy_events(last_seen_dt),
|
||||
max_events=kwargs["max_events"],
|
||||
timeout=kwargs["timeout"],
|
||||
)
|
||||
return {
|
||||
"command": "galaxy-watch",
|
||||
"events": [_message_dict(event) for event in events],
|
||||
}
|
||||
|
||||
|
||||
async def _connect(kwargs: dict[str, Any]) -> GatewayClient:
|
||||
api_key = kwargs.get("api_key") or _api_key_from_env(kwargs.get("api_key_env"))
|
||||
return await GatewayClient.connect(
|
||||
@@ -938,6 +1291,22 @@ async def _connect(kwargs: dict[str, Any]) -> GatewayClient:
|
||||
)
|
||||
|
||||
|
||||
async def _connect_galaxy(kwargs: dict[str, Any]) -> GalaxyRepositoryClient:
|
||||
api_key = kwargs.get("api_key") or _api_key_from_env(kwargs.get("api_key_env"))
|
||||
return await GalaxyRepositoryClient.connect(
|
||||
ClientOptions(
|
||||
endpoint=kwargs["endpoint"],
|
||||
api_key=api_key,
|
||||
plaintext=_use_plaintext(kwargs),
|
||||
ca_file=kwargs.get("ca_file"),
|
||||
require_certificate_validation=bool(kwargs.get("require_certificate_validation")),
|
||||
server_name_override=kwargs.get("server_name_override"),
|
||||
call_timeout=kwargs.get("call_timeout"),
|
||||
stream_timeout=kwargs.get("stream_timeout"),
|
||||
),
|
||||
)
|
||||
|
||||
|
||||
def _session(client: GatewayClient, session_id: str):
|
||||
from zb_mom_ww_mxgateway.session import Session
|
||||
|
||||
@@ -995,11 +1364,17 @@ def _emit(
|
||||
output_json: bool,
|
||||
text: str | None = None,
|
||||
) -> None:
|
||||
# A payload may carry a pre-rendered text representation under the private
|
||||
# "_text" key (used by commands like galaxy-browse whose text output is a
|
||||
# custom indented tree rather than the default JSON dump). Strip it so it
|
||||
# never leaks into the JSON branch.
|
||||
rendered_text = payload.pop("_text", None) if isinstance(payload, dict) else None
|
||||
|
||||
if output_json:
|
||||
click.echo(json.dumps(payload, sort_keys=True))
|
||||
return
|
||||
|
||||
click.echo(text or json.dumps(payload, sort_keys=True))
|
||||
click.echo(text or rendered_text or json.dumps(payload, sort_keys=True))
|
||||
|
||||
|
||||
async def _collect_events(
|
||||
@@ -1058,6 +1433,34 @@ async def _collect_alarm_messages(
|
||||
return collected
|
||||
|
||||
|
||||
async def _collect_deploy_events(
    events: Any,
    *,
    max_events: int,
    timeout: float,
) -> list[galaxy_pb.DeployEvent]:
    if max_events > MAX_AGGREGATE_EVENTS:
        raise click.BadParameter(
            f"must be less than or equal to {MAX_AGGREGATE_EVENTS}",
            param_hint="--max-events",
        )

    collected: list[galaxy_pb.DeployEvent] = []
    iterator = events.__aiter__()
    try:
        # Each event gets its own timeout window; stop once max_events are collected.
        while len(collected) < max_events:
            collected.append(await asyncio.wait_for(iterator.__anext__(), timeout=timeout))
    except StopAsyncIteration:
        # The stream ended before max_events arrived.
        pass
    except asyncio.TimeoutError:
        # No event arrived within the timeout; return what was collected so far.
        pass
    finally:
        # Close the iterator so the underlying stream is released.
        close = getattr(iterator, "aclose", None)
        if close is not None:
            await close()
    return collected
|
||||
|
||||
def _parse_value(raw_value: str, value_type: str) -> MxValueInput:
|
||||
normalized = value_type.lower()
|
||||
if normalized == "bool":
|
||||
|
||||
@@ -211,3 +211,437 @@ def test_batch_continues_after_error_line() -> None:
|
||||
# Second block: successful version JSON.
|
||||
version_payload = json.loads(blocks[1].strip())
|
||||
assert version_payload["version"] == __version__
|
||||
|
||||
|
||||
class _FakeGalaxyClient:
|
||||
"""Minimal async-context-manager fake satisfying the galaxy command bodies."""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
*,
|
||||
ok: bool = True,
|
||||
objects=None,
|
||||
last_deploy=None,
|
||||
events=None,
|
||||
browse_roots=None,
|
||||
browse_children_pages=None,
|
||||
) -> None:
|
||||
self._ok = ok
|
||||
self._objects = objects or []
|
||||
self._last_deploy = last_deploy
|
||||
self._events = events or []
|
||||
self._browse_roots = browse_roots or []
|
||||
# List of BrowseChildrenReply-like objects to serve in order (paged).
|
||||
self._browse_children_pages = browse_children_pages or []
|
||||
self._browse_children_calls: list = []
|
||||
self.browse_options = None
|
||||
|
||||
async def __aenter__(self) -> "_FakeGalaxyClient":
|
||||
return self
|
||||
|
||||
async def __aexit__(self, *_exc: object) -> None:
|
||||
return None
|
||||
|
||||
async def test_connection(self) -> bool:
|
||||
return self._ok
|
||||
|
||||
async def discover_hierarchy(self):
|
||||
return self._objects
|
||||
|
||||
async def browse(self, options=None):
|
||||
self.browse_options = options
|
||||
return self._browse_roots
|
||||
|
||||
async def browse_children_raw(self, request):
|
||||
"""Return the next queued BrowseChildrenReply page; raises if queue empty."""
|
||||
self._browse_children_calls.append(request)
|
||||
if not self._browse_children_pages:
|
||||
raise AssertionError("browse_children_raw called but no pages queued")
|
||||
return self._browse_children_pages.pop(0)
|
||||
|
||||
async def get_last_deploy_time(self):
|
||||
# Mirrors galaxy.py: protobuf ToDatetime() yields a timezone-NAIVE UTC datetime.
|
||||
return self._last_deploy
|
||||
|
||||
def watch_deploy_events(self, _last_seen_deploy_time=None):
|
||||
events = self._events
|
||||
|
||||
async def _iter():
|
||||
for event in events:
|
||||
yield event
|
||||
|
||||
return _iter()
|
||||
|
||||
|
||||
def _patch_galaxy_connect(monkeypatch: pytest.MonkeyPatch, fake: _FakeGalaxyClient) -> None:
|
||||
async def fake_connect(options, **_kwargs):
|
||||
return fake
|
||||
|
||||
monkeypatch.setattr(commands_module.GalaxyRepositoryClient, "connect", fake_connect)
|
||||
|
||||
|
||||
def test_galaxy_test_connection_emits_ok(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
_patch_galaxy_connect(monkeypatch, _FakeGalaxyClient(ok=True))
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
["galaxy-test-connection", "--plaintext", "--json"],
|
||||
)
|
||||
|
||||
assert result.exit_code == 0, result.output
|
||||
payload = json.loads(result.output)
|
||||
assert payload == {"command": "galaxy-test-connection", "ok": True}
|
||||
|
||||
|
||||
def test_galaxy_discover_serializes_objects(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb
|
||||
|
||||
objects = [
|
||||
galaxy_pb.GalaxyObject(gobject_id=7, tag_name="Area001", contained_name="Area001"),
|
||||
galaxy_pb.GalaxyObject(gobject_id=8, tag_name="Pump001", contained_name="Pump001"),
|
||||
]
|
||||
_patch_galaxy_connect(monkeypatch, _FakeGalaxyClient(objects=objects))
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
["galaxy-discover", "--plaintext", "--json"],
|
||||
)
|
||||
|
||||
assert result.exit_code == 0, result.output
|
||||
payload = json.loads(result.output)
|
||||
assert payload["command"] == "galaxy-discover"
|
||||
assert len(payload["objects"]) == 2
|
||||
assert payload["objects"][0]["tagName"] == "Area001"
|
||||
assert payload["objects"][1]["gobjectId"] == 8
|
||||
|
||||
|
||||
def test_galaxy_last_deploy_emits_utc_iso(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
"""The naive-UTC deploy time from the library must be emitted as unambiguous UTC ISO-8601."""
|
||||
from datetime import datetime
|
||||
|
||||
naive_utc = datetime(2025, 6, 15, 12, 0, 0) # noqa: DTZ001 -- mirrors protobuf ToDatetime()
|
||||
_patch_galaxy_connect(monkeypatch, _FakeGalaxyClient(last_deploy=naive_utc))
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
["galaxy-last-deploy", "--plaintext", "--json"],
|
||||
)
|
||||
|
||||
assert result.exit_code == 0, result.output
|
||||
payload = json.loads(result.output)
|
||||
assert payload["command"] == "galaxy-last-deploy"
|
||||
assert payload["present"] is True
|
||||
assert payload["timeOfLastDeploy"].endswith(("+00:00", "Z"))
|
||||
|
||||
|
||||
def test_galaxy_watch_serializes_deploy_events(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb
|
||||
|
||||
events = [galaxy_pb.DeployEvent(sequence=1)]
|
||||
_patch_galaxy_connect(monkeypatch, _FakeGalaxyClient(events=events))
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
["galaxy-watch", "--plaintext", "--max-events", "1", "--json"],
|
||||
)
|
||||
|
||||
assert result.exit_code == 0, result.output
|
||||
payload = json.loads(result.output)
|
||||
assert payload["command"] == "galaxy-watch"
|
||||
assert len(payload["events"]) == 1
|
||||
|
||||
|
||||
class _FakeBrowseNode:
|
||||
"""Minimal stand-in for LazyBrowseNode covering the CLI render path."""
|
||||
|
||||
def __init__(self, obj, *, has_children_hint=False, children=None) -> None:
|
||||
self._object = obj
|
||||
self._has_children_hint = has_children_hint
|
||||
self._children = list(children or [])
|
||||
self._is_expanded = bool(children)
|
||||
self.expand_calls = 0
|
||||
|
||||
@property
|
||||
def object(self):
|
||||
return self._object
|
||||
|
||||
@property
|
||||
def has_children_hint(self) -> bool:
|
||||
return self._has_children_hint
|
||||
|
||||
@property
|
||||
def children(self):
|
||||
return list(self._children)
|
||||
|
||||
@property
|
||||
def is_expanded(self) -> bool:
|
||||
return self._is_expanded
|
||||
|
||||
async def expand(self) -> None:
|
||||
self.expand_calls += 1
|
||||
self._is_expanded = True
|
||||
|
||||
|
||||
def test_galaxy_browse_serializes_nested_nodes(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb
|
||||
|
||||
child = _FakeBrowseNode(
|
||||
galaxy_pb.GalaxyObject(gobject_id=8, tag_name="Pump001", contained_name="Pump001"),
|
||||
has_children_hint=False,
|
||||
)
|
||||
root = _FakeBrowseNode(
|
||||
galaxy_pb.GalaxyObject(gobject_id=7, tag_name="Area001", contained_name="Area001"),
|
||||
has_children_hint=True,
|
||||
children=[child],
|
||||
)
|
||||
_patch_galaxy_connect(monkeypatch, _FakeGalaxyClient(browse_roots=[root]))
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
["galaxy-browse", "--plaintext", "--json"],
|
||||
)
|
||||
|
||||
assert result.exit_code == 0, result.output
|
||||
payload = json.loads(result.output)
|
||||
assert "_text" not in payload
|
||||
assert payload["command"] == "galaxy-browse"
|
||||
assert len(payload["nodes"]) == 1
|
||||
node = payload["nodes"][0]
|
||||
assert node["tagName"] == "Area001"
|
||||
assert node["hasChildrenHint"] is True
|
||||
assert len(node["children"]) == 1
|
||||
assert node["children"][0]["gobjectId"] == 8
|
||||
assert node["children"][0]["children"] == []
|
||||
|
||||
|
||||
def test_galaxy_browse_renders_indented_text_tree(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb
|
||||
|
||||
child = _FakeBrowseNode(
|
||||
galaxy_pb.GalaxyObject(gobject_id=8, tag_name="Pump001", browse_name="Pump001"),
|
||||
)
|
||||
root = _FakeBrowseNode(
|
||||
galaxy_pb.GalaxyObject(gobject_id=7, tag_name="Area001", browse_name="Area001"),
|
||||
has_children_hint=True,
|
||||
children=[child],
|
||||
)
|
||||
_patch_galaxy_connect(monkeypatch, _FakeGalaxyClient(browse_roots=[root]))
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
["galaxy-browse", "--plaintext"],
|
||||
)
|
||||
|
||||
assert result.exit_code == 0, result.output
|
||||
lines = result.output.splitlines()
|
||||
assert lines[0] == "1"
|
||||
assert lines[1] == "+ Area001 Area001 (gobject 7)"
|
||||
assert lines[2] == " - Pump001 Pump001 (gobject 8)"
|
||||
|
||||
|
||||
def test_galaxy_browse_forwards_filter_options(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
fake = _FakeGalaxyClient(browse_roots=[])
|
||||
_patch_galaxy_connect(monkeypatch, fake)
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
[
|
||||
"galaxy-browse",
|
||||
"--plaintext",
|
||||
"--category-id",
|
||||
"10",
|
||||
"--category-id",
|
||||
"12",
|
||||
"--template-chain-contains",
|
||||
"$Pump",
|
||||
"--tag-name-glob",
|
||||
"Area*",
|
||||
"--include-attributes",
|
||||
"--alarm-bearing-only",
|
||||
"--historized-only",
|
||||
"--json",
|
||||
],
|
||||
)
|
||||
|
||||
assert result.exit_code == 0, result.output
|
||||
options = fake.browse_options
|
||||
assert tuple(options.category_ids) == (10, 12)
|
||||
assert tuple(options.template_chain_contains) == ("$Pump",)
|
||||
assert options.tag_name_glob == "Area*"
|
||||
assert options.include_attributes is True
|
||||
assert options.alarm_bearing_only is True
|
||||
assert options.historized_only is True
|
||||
|
||||
|
||||
def test_galaxy_browse_expands_to_depth(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb
|
||||
|
||||
root = _FakeBrowseNode(
|
||||
galaxy_pb.GalaxyObject(gobject_id=7, tag_name="Area001"),
|
||||
has_children_hint=True,
|
||||
)
|
||||
_patch_galaxy_connect(monkeypatch, _FakeGalaxyClient(browse_roots=[root]))
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
["galaxy-browse", "--plaintext", "--depth", "2", "--json"],
|
||||
)
|
||||
|
||||
assert result.exit_code == 0, result.output
|
||||
assert root.expand_calls == 1
|
||||
|
||||
|
||||
def test_galaxy_commands_are_registered() -> None:
|
||||
runner = CliRunner()
|
||||
for command in (
|
||||
"galaxy-test-connection",
|
||||
"galaxy-last-deploy",
|
||||
"galaxy-discover",
|
||||
"galaxy-watch",
|
||||
"galaxy-browse",
|
||||
):
|
||||
result = runner.invoke(main, [command, "--help"])
|
||||
assert result.exit_code == 0, result.output
|
||||
assert "--endpoint" in result.output
|
||||
|
||||
|
||||
@pytest.mark.parametrize("depth_arg", ["99", "-1"])
|
||||
def test_galaxy_browse_rejects_out_of_range_depth(
|
||||
monkeypatch: pytest.MonkeyPatch,
|
||||
depth_arg: str,
|
||||
) -> None:
|
||||
"""--depth values outside [0, 50] must be rejected with a non-zero exit."""
|
||||
_patch_galaxy_connect(monkeypatch, _FakeGalaxyClient(browse_roots=[]))
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
["galaxy-browse", "--plaintext", "--depth", depth_arg, "--json"],
|
||||
)
|
||||
|
||||
assert result.exit_code != 0
|
||||
assert "--depth must be between 0 and 50" in result.output
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# --parent-gobject-id drill-down tests
# ---------------------------------------------------------------------------


def _fake_browse_children_reply(children_and_hints, *, next_page_token=""):
    """Build a minimal fake BrowseChildrenReply-like object."""
    from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb

    reply = galaxy_pb.BrowseChildrenReply()
    for obj, hint in children_and_hints:
        reply.children.append(obj)
        reply.child_has_children.append(hint)
    reply.next_page_token = next_page_token
    return reply


def test_galaxy_browse_parent_fetches_one_level_json(monkeypatch: pytest.MonkeyPatch) -> None:
    """--parent-gobject-id N calls browse_children_raw and renders one-level JSON."""
    from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb

    child_a = galaxy_pb.GalaxyObject(gobject_id=10, tag_name="PumpA", browse_name="PumpA")
    child_b = galaxy_pb.GalaxyObject(gobject_id=11, tag_name="PumpB", browse_name="PumpB")
    page = _fake_browse_children_reply([(child_a, True), (child_b, False)])
    fake = _FakeGalaxyClient(browse_children_pages=[page])
    _patch_galaxy_connect(monkeypatch, fake)

    result = CliRunner().invoke(
        main,
        ["galaxy-browse", "--plaintext", "--parent-gobject-id", "7", "--json"],
    )

    assert result.exit_code == 0, result.output
    payload = json.loads(result.output)

    # One BrowseChildren RPC was issued with the correct parent id.
    assert len(fake._browse_children_calls) == 1
    call_req = fake._browse_children_calls[0]
    assert call_req.parent_gobject_id == 7

    # JSON shape mirrors the lazy-browse node shape.
    assert payload["command"] == "galaxy-browse"
    nodes = payload["nodes"]
    assert len(nodes) == 2
    assert nodes[0]["tagName"] == "PumpA"
    assert nodes[0]["hasChildrenHint"] is True
    assert nodes[0]["children"] == []
    assert nodes[1]["gobjectId"] == 11
    assert nodes[1]["hasChildrenHint"] is False
    assert nodes[1]["children"] == []


def test_galaxy_browse_parent_renders_text_tree(monkeypatch: pytest.MonkeyPatch) -> None:
    """--parent-gobject-id N text output: count line then marker lines (no indent)."""
    from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb

    child = galaxy_pb.GalaxyObject(gobject_id=10, tag_name="PumpA", browse_name="PumpA")
    page = _fake_browse_children_reply([(child, False)])
    fake = _FakeGalaxyClient(browse_children_pages=[page])
    _patch_galaxy_connect(monkeypatch, fake)

    result = CliRunner().invoke(
        main,
        ["galaxy-browse", "--plaintext", "--parent-gobject-id", "7"],
    )

    assert result.exit_code == 0, result.output
    lines = result.output.splitlines()
    assert lines[0] == "1"
    assert lines[1] == "- PumpA PumpA (gobject 10)"


def test_galaxy_browse_parent_pages_correctly(monkeypatch: pytest.MonkeyPatch) -> None:
    """--parent-gobject-id loops on next_page_token until exhausted."""
    from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb

    child_a = galaxy_pb.GalaxyObject(gobject_id=10, tag_name="PumpA", browse_name="PumpA")
    child_b = galaxy_pb.GalaxyObject(gobject_id=11, tag_name="PumpB", browse_name="PumpB")
    page1 = _fake_browse_children_reply([(child_a, False)], next_page_token="tok1")
    page2 = _fake_browse_children_reply([(child_b, True)])
    fake = _FakeGalaxyClient(browse_children_pages=[page1, page2])
    _patch_galaxy_connect(monkeypatch, fake)

    result = CliRunner().invoke(
        main,
        ["galaxy-browse", "--plaintext", "--parent-gobject-id", "7", "--json"],
    )

    assert result.exit_code == 0, result.output
    assert len(fake._browse_children_calls) == 2
    # Second call must carry the page token from the first reply.
    assert fake._browse_children_calls[1].page_token == "tok1"
    payload = json.loads(result.output)
    assert len(payload["nodes"]) == 2


def test_galaxy_browse_parent_warns_when_depth_also_set(
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    """When both --parent-gobject-id and --depth>0 are supplied a warning is emitted."""
    from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb

    child = galaxy_pb.GalaxyObject(gobject_id=10, tag_name="PumpA", browse_name="PumpA")
    page = _fake_browse_children_reply([(child, False)])
    fake = _FakeGalaxyClient(browse_children_pages=[page])
    _patch_galaxy_connect(monkeypatch, fake)

    # CliRunner mixes stderr into output in this Click version.
    result = CliRunner().invoke(
        main,
        ["galaxy-browse", "--plaintext", "--parent-gobject-id", "7", "--depth", "2", "--json"],
    )

    assert result.exit_code == 0, result.output
    assert "--depth is ignored" in result.output


def test_galaxy_browse_help_shows_parent_gobject_id() -> None:
    """--parent-gobject-id appears in the galaxy-browse --help output."""
    result = CliRunner().invoke(main, ["galaxy-browse", "--help"])

    assert result.exit_code == 0
    assert "--parent-gobject-id" in result.output

@@ -18,6 +18,7 @@ use clap::{Args, Parser, Subcommand, ValueEnum};
use futures_util::StreamExt;
use serde_json::json;
use serde_json::Value;
use zb_mom_ww_mxgateway_client::galaxy::{BrowseChildrenOptions, LazyBrowseNode};
use zb_mom_ww_mxgateway_client::generated::galaxy_repository::v1::DeployEvent;
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::{
    alarm_feed_message, AcknowledgeAlarmRequest, AlarmFeedMessage, CloseSessionRequest, MxCommand,
@@ -387,6 +388,47 @@ enum GalaxyCommand {
        #[arg(long)]
        json: bool,
    },
    /// Lazily browse the Galaxy hierarchy through `BrowseChildren`.
    ///
    /// With no `--parent-gobject-id` the root objects are listed; pass a parent
    /// id to list that object's direct children. `--depth` sets how many further
    /// levels the root walk eagerly expands (0 = the requested level only) and is
    /// ignored with a parent id. The filter flags map onto `BrowseChildrenOptions`
    /// and are reused at every expanded level, mirroring the lazy-browse helper.
    Browse {
        #[command(flatten)]
        connection: ConnectionArgs,
        /// Parent gobject id whose children to browse. Omit for root objects.
        #[arg(long)]
        parent_gobject_id: Option<i32>,
        /// Restrict to objects whose `category_id` matches one of these ids.
        /// Repeatable.
        #[arg(long = "category-id")]
        category_ids: Vec<i32>,
        /// Restrict to objects whose template chain contains this entry.
        /// Repeatable (combined with AND).
        #[arg(long = "template-contains")]
        template_chain_contains: Vec<String>,
        /// Restrict to objects whose tag name matches this SQL `LIKE`-style glob.
        #[arg(long)]
        tag_name_glob: Option<String>,
        /// Populate `attributes` on the returned objects.
        #[arg(long)]
        include_attributes: bool,
        /// Only return objects that own at least one alarm-bearing attribute.
        #[arg(long)]
        alarm_bearing_only: bool,
        /// Only return objects that own at least one historized attribute.
        #[arg(long)]
        historized_only: bool,
        /// Number of additional levels to eagerly expand beneath each returned
        /// node. 0 (the default) prints only the requested level.
        /// Ignored when --parent-gobject-id is specified.
        #[arg(long, default_value_t = 0)]
        depth: usize,
        #[arg(long)]
        json: bool,
    },
    /// Subscribe to the WatchDeployEvents server stream.
    ///
    /// Prints one line per received event (or one JSON object with `--json`).
@@ -1103,10 +1145,271 @@ async fn run_galaxy(command: GalaxyCommand) -> Result<(), Error> {
|
||||
}
|
||||
}
|
||||
}
|
||||
        GalaxyCommand::Browse {
            connection,
            parent_gobject_id,
            category_ids,
            template_chain_contains,
            tag_name_glob,
            include_attributes,
            alarm_bearing_only,
            historized_only,
            depth,
            json,
        } => {
            if parent_gobject_id.is_some() && depth > 0 {
                eprintln!("warning: --depth is ignored when --parent-gobject-id is specified");
            }

            let mut client = connect_galaxy(connection).await?;
            let options = BrowseChildrenOptions {
                category_ids,
                template_chain_contains,
                tag_name_glob,
                include_attributes: include_attributes.then_some(true),
                alarm_bearing_only,
                historized_only,
            };

            match parent_gobject_id {
                // No parent → walk the lazy-browse tree from the root objects,
                // eagerly expanding `depth` further levels so the print walks
                // cached children without re-issuing RPCs.
                None => {
                    let nodes = client.browse(Some(options)).await?;
                    for node in &nodes {
                        expand_to_depth(node, depth).await?;
                    }
                    if json {
                        let mut payload = Vec::with_capacity(nodes.len());
                        for node in &nodes {
                            payload.push(lazy_node_to_json(node).await);
                        }
                        println!("{}", json!({ "nodes": payload }));
                    } else {
                        println!("{}", nodes.len());
                        for node in &nodes {
                            print_lazy_node(node, 0).await;
                        }
                    }
                }
                // A specific parent → fetch exactly one level of children via
                // the raw paged RPC. `--depth` is not meaningful here; the
                // single-level children are returned as-is.
                Some(parent) => {
                    let children = browse_children_one_level(&mut client, parent, &options).await?;
                    print_browse_children(&children, json);
                }
            }
        }
    }
    Ok(())
}

/// Page size used for the raw `BrowseChildren` RPC when fetching a single
/// level via `--parent-gobject-id`. Mirrors `BROWSE_CHILDREN_PAGE_SIZE` in
/// `galaxy.rs` (the library's lazy-browse helper uses the same value).
const BROWSE_PAGE_SIZE: i32 = 500;

/// Drive `BrowseChildren` paging by hand for a single parent and return the
/// flattened child list. Used by the `browse --parent-gobject-id` path, which
/// surfaces one level of children rather than the lazy root-tree walk.
async fn browse_children_one_level(
    client: &mut GalaxyClient,
    parent_gobject_id: i32,
    options: &BrowseChildrenOptions,
) -> Result<Vec<GalaxyBrowseChild>, Error> {
    use std::collections::HashSet;
    use zb_mom_ww_mxgateway_client::generated::galaxy_repository::v1::{
        browse_children_request, BrowseChildrenRequest,
    };

    let mut children = Vec::new();
    let mut page_token = String::new();
    let mut seen: HashSet<String> = HashSet::new();
    loop {
        let request = BrowseChildrenRequest {
            page_size: BROWSE_PAGE_SIZE,
            page_token: page_token.clone(),
            category_ids: options.category_ids.clone(),
            template_chain_contains: options.template_chain_contains.clone(),
            tag_name_glob: options.tag_name_glob.clone().unwrap_or_default(),
            include_attributes: options.include_attributes,
            alarm_bearing_only: options.alarm_bearing_only,
            historized_only: options.historized_only,
            parent: Some(browse_children_request::Parent::ParentGobjectId(
                parent_gobject_id,
            )),
        };
        let reply = client.browse_children_raw(request).await?;
        let hints = reply.child_has_children;
        for (index, object) in reply.children.into_iter().enumerate() {
            let has_children_hint = hints.get(index).copied().unwrap_or(false);
            children.push(GalaxyBrowseChild {
                object,
                has_children_hint,
            });
        }
        page_token = reply.next_page_token;
        if page_token.is_empty() {
            return Ok(children);
        }
        if !seen.insert(page_token.clone()) {
            return Err(Error::InvalidArgument {
                name: "page_token".to_owned(),
                detail: format!(
                    "galaxy browse children returned repeated page token `{page_token}`"
                ),
            });
        }
    }
}

/// A single child returned by the raw `BrowseChildren` paging path, paired
/// with its server-supplied `child_has_children` hint.
struct GalaxyBrowseChild {
    object: zb_mom_ww_mxgateway_client::generated::galaxy_repository::v1::GalaxyObject,
    has_children_hint: bool,
}

/// Print the one-level children of a browsed parent, mirroring the JSON node
/// shape used by the root-tree walk (with an always-empty `children` array).
fn print_browse_children(children: &[GalaxyBrowseChild], use_json: bool) {
    if use_json {
        let payload: Vec<_> = children.iter().map(browse_child_to_json).collect();
        println!("{}", json!({ "nodes": payload }));
    } else {
        println!("{}", children.len());
        for child in children {
            let object = &child.object;
            let marker = if child.has_children_hint { "+" } else { "-" };
            println!(
                "{marker} {} {} (gobject {})",
                object.tag_name, object.browse_name, object.gobject_id,
            );
        }
    }
}

/// Render one raw browse child as a JSON object whose key set matches the
/// lazy-node renderer (with an empty `children` array).
fn browse_child_to_json(child: &GalaxyBrowseChild) -> Value {
    let object = &child.object;
    json!({
        "gobjectId": object.gobject_id,
        "tagName": object.tag_name,
        "containedName": object.contained_name,
        "browseName": object.browse_name,
        "parentGobjectId": object.parent_gobject_id,
        "isArea": object.is_area,
        "categoryId": object.category_id,
        "hostedByGobjectId": object.hosted_by_gobject_id,
        "templateChain": object.template_chain,
        "hasChildrenHint": child.has_children_hint,
        "attributes": object.attributes.iter().map(|attribute| json!({
            "attributeName": attribute.attribute_name,
            "fullTagReference": attribute.full_tag_reference,
            "mxDataType": attribute.mx_data_type,
            "dataTypeName": attribute.data_type_name,
            "isArray": attribute.is_array,
            "arrayDimension": attribute.array_dimension,
            "arrayDimensionPresent": attribute.array_dimension_present,
            "mxAttributeCategory": attribute.mx_attribute_category,
            "securityClassification": attribute.security_classification,
            "isHistorized": attribute.is_historized,
            "isAlarm": attribute.is_alarm,
        })).collect::<Vec<_>>(),
        "children": Vec::<Value>::new(),
    })
}

/// Recursively expand a [`LazyBrowseNode`] up to `depth` further levels. A
/// `depth` of 0 leaves the node unexpanded so the caller prints only the
/// requested level.
fn expand_to_depth(
    node: &LazyBrowseNode,
    depth: usize,
) -> std::pin::Pin<Box<dyn std::future::Future<Output = Result<(), Error>> + Send + '_>> {
    Box::pin(async move {
        if depth == 0 {
            return Ok(());
        }
        node.expand().await?;
        for child in node.children().await {
            expand_to_depth(&child, depth - 1).await?;
        }
        Ok(())
    })
}

/// Print a [`LazyBrowseNode`] and any already-expanded descendants as an
/// indented tree. Indentation is two spaces per level.
fn print_lazy_node(
    node: &LazyBrowseNode,
    indent: usize,
) -> std::pin::Pin<Box<dyn std::future::Future<Output = ()> + Send + '_>> {
    Box::pin(async move {
        let object = node.object();
        let marker = if node.has_children_hint() { "+" } else { "-" };
        println!(
            "{:indent$}{marker} {} {} (gobject {})",
            "",
            object.tag_name,
            object.browse_name,
            object.gobject_id,
            indent = indent,
        );
        if node.is_expanded().await {
            for child in node.children().await {
                print_lazy_node(&child, indent + 2).await;
            }
        }
    })
}

/// Render a [`LazyBrowseNode`] (and its already-expanded children) as a JSON
/// object. Mirrors the `discover-hierarchy` object shape with an added
/// `hasChildrenHint` flag and a nested `children` array.
fn lazy_node_to_json(
    node: &LazyBrowseNode,
) -> std::pin::Pin<Box<dyn std::future::Future<Output = Value> + Send + '_>> {
    Box::pin(async move {
        let object = node.object();
        let mut children = Vec::new();
        if node.is_expanded().await {
            for child in node.children().await {
                children.push(lazy_node_to_json(&child).await);
            }
        }
        json!({
            "gobjectId": object.gobject_id,
            "tagName": object.tag_name,
            "containedName": object.contained_name,
            "browseName": object.browse_name,
            "parentGobjectId": object.parent_gobject_id,
            "isArea": object.is_area,
            "categoryId": object.category_id,
            "hostedByGobjectId": object.hosted_by_gobject_id,
            "templateChain": object.template_chain,
            "hasChildrenHint": node.has_children_hint(),
            "attributes": object.attributes.iter().map(|attribute| json!({
                "attributeName": attribute.attribute_name,
                "fullTagReference": attribute.full_tag_reference,
                "mxDataType": attribute.mx_data_type,
                "dataTypeName": attribute.data_type_name,
                "isArray": attribute.is_array,
                "arrayDimension": attribute.array_dimension,
                "arrayDimensionPresent": attribute.array_dimension_present,
                "mxAttributeCategory": attribute.mx_attribute_category,
                "securityClassification": attribute.security_classification,
                "isHistorized": attribute.is_historized,
                "isAlarm": attribute.is_alarm,
            })).collect::<Vec<_>>(),
            "children": children,
        })
    })
}

async fn session_for(
|
||||
connection: ConnectionArgs,
|
||||
session_id: String,
|
||||
@@ -2131,6 +2434,37 @@ mod tests {
        assert!(parsed.is_ok(), "parse failed: {parsed:?}");
    }

    #[test]
    fn parses_galaxy_browse_command_with_filters_and_depth() {
        let parsed = Cli::try_parse_from([
            "mxgw",
            "galaxy",
            "browse",
            "--parent-gobject-id",
            "42",
            "--category-id",
            "3",
            "--category-id",
            "5",
            "--template-contains",
            "$DelmiaReceiver",
            "--tag-name-glob",
            "Recv_*",
            "--include-attributes",
            "--alarm-bearing-only",
            "--depth",
            "2",
            "--json",
        ]);
        assert!(parsed.is_ok(), "parse failed: {parsed:?}");
    }

    #[test]
    fn parses_galaxy_browse_command_with_defaults() {
        let parsed = Cli::try_parse_from(["mxgw", "galaxy", "browse"]);
        assert!(parsed.is_ok(), "parse failed: {parsed:?}");
    }

    #[test]
    fn parses_batch_command() {
        let parsed = Cli::try_parse_from(["mxgw", "batch"]);
|
||||
@@ -762,16 +762,17 @@ in the codebase for the forward-compat shape, but the gateway-side
|
||||
`AcknowledgeAlarmByName` when the public RPC supplies a recognizable
|
||||
`Provider!Group.Tag` reference.
|
||||
|
||||
### 5. STA / threading — production fix needed
|
||||
### 5. STA / threading — resolved
|
||||
|
||||
The wnwrap COM is `ThreadingModel=Apartment`. The consumer's
|
||||
internal `Timer` fires on threadpool threads and would block forever
|
||||
on cross-apartment marshaling unless the host STA pumps Win32
|
||||
messages. The smoke test sidesteps this by setting
|
||||
`pollIntervalMilliseconds=0` (Timer disabled) and driving `PollOnce`
|
||||
manually from the test's STA. Production hosting will route polls
|
||||
through the worker's `StaRuntime` in a follow-up — the consumer's
|
||||
`PollOnce` is `public` and idempotent so the wire-up is mechanical.
|
||||
manually from the test's STA. Production alarm polling was wired up
|
||||
through `GatewayAlarmMonitor`, which routes polling through the
|
||||
worker's `StaRuntime` (the STA pump owner) via the worker IPC path. This item is resolved; the wnwrap consumer's `PollOnce`
|
||||
is no longer invoked directly in production.
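
As a language-neutral illustration of why polls are routed through the thread that owns the apartment, here is a minimal Python sketch. It is not the worker's `StaRuntime`: `SingleThreadRuntime`, `invoke`, and `owner_ident` are hypothetical names, and the Win32 message pump itself is not modeled. It only shows the marshaling idea, where callers on timer or pool threads post work to one owner thread instead of touching the apartment-bound object directly.

```python
import queue
import threading
from concurrent.futures import Future


class SingleThreadRuntime:
    """Hypothetical stand-in for an STA owner: every call runs on one thread."""

    def __init__(self):
        self._work = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # The owner thread executes posted work items one at a time.
        while True:
            item = self._work.get()
            if item is None:
                return
            func, future = item
            try:
                future.set_result(func())
            except Exception as exc:  # surface failures to the caller
                future.set_exception(exc)

    def invoke(self, func):
        """Post `func` to the owner thread and block until it completes."""
        future = Future()
        self._work.put((func, future))
        return future.result()

    def owner_ident(self):
        return self._thread.ident

    def shutdown(self):
        self._work.put(None)
        self._thread.join()
```

A poll driven from any other thread would then be written as `runtime.invoke(poll_once)`, where `poll_once` stands for whatever callable performs the poll.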
|
||||
|
||||
### Capture summary
|
||||
|
||||
|
||||
@@ -37,11 +37,14 @@ paths, timeouts, queue sizes, enum values, or protocol values are invalid.
|
||||
"MaxPendingCommandsPerSession": 128,
|
||||
"DefaultLeaseSeconds": 1800,
|
||||
"LeaseSweepIntervalSeconds": 30,
|
||||
"AllowMultipleEventSubscribers": false
|
||||
"AllowMultipleEventSubscribers": false,
|
||||
"MaxEventSubscribersPerSession": 8
|
||||
},
|
||||
"Events": {
|
||||
"QueueCapacity": 10000,
|
||||
"BackpressurePolicy": "FailFast"
|
||||
"BackpressurePolicy": "FailFast",
|
||||
"ReplayBufferCapacity": 1024,
|
||||
"ReplayRetentionSeconds": 300
|
||||
},
|
||||
"Dashboard": {
|
||||
"Enabled": true,
|
||||
@@ -123,23 +126,35 @@ to avoid accidental large allocations from malformed or oversized frames.
|
||||
| `MxGateway:Sessions:MaxPendingCommandsPerSession` | `128` | Maximum number of pending worker commands for one session. Excess commands fail fast instead of queueing indefinitely. |
|
||||
| `MxGateway:Sessions:DefaultLeaseSeconds` | `1800` | Initial session lease and refresh duration. Unary client activity extends the lease by this duration. |
|
||||
| `MxGateway:Sessions:LeaseSweepIntervalSeconds` | `30` | Hosted monitor interval for closing expired leases. Active event-stream subscribers keep a session from expiring while the stream remains attached. |
|
||||
| `MxGateway:Sessions:AllowMultipleEventSubscribers` | `false` | Controls whether multiple `StreamEvents` subscribers may attach to one session. `true` is rejected until event fan-out is implemented. |
|
||||
| `MxGateway:Sessions:AllowMultipleEventSubscribers` | `false` | Controls whether multiple `StreamEvents` subscribers may attach to one session. When `false` the session refuses a second subscriber with `AlreadyExists`. Set to `true` to enable fan-out via the `SessionEventDistributor`. |
|
||||
| `MxGateway:Sessions:MaxEventSubscribersPerSession` | `8` | Maximum number of concurrent `StreamEvents` subscribers per session when `AllowMultipleEventSubscribers` is `true`. Effectively 1 when `AllowMultipleEventSubscribers` is `false`. Must be greater than zero. |
|
||||
|
||||
All numeric session options must be greater than zero. The current event stream
|
||||
implementation supports one active subscriber per session; this preserves event
|
||||
ordering and avoids competing consumers.
|
||||
All numeric session options must be greater than zero.
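
The rule above can be sketched as follows. This is a Python illustration under stated assumptions, not the gateway's validator: `SessionOptions`, `validate_session_options`, and `effective_subscriber_cap` are hypothetical names, and only the session keys and defaults shown in this section are modeled.

```python
from dataclasses import dataclass, fields


@dataclass
class SessionOptions:
    """Hypothetical mirror of the `MxGateway:Sessions` keys documented here."""

    max_pending_commands_per_session: int = 128
    default_lease_seconds: int = 1800
    lease_sweep_interval_seconds: int = 30
    allow_multiple_event_subscribers: bool = False
    max_event_subscribers_per_session: int = 8


def validate_session_options(options):
    """Return one error message per numeric option that is not > 0."""
    errors = []
    for field in fields(options):
        value = getattr(options, field.name)
        if isinstance(value, bool):
            continue  # boolean switches are not range-checked
        if value <= 0:
            errors.append(f"{field.name} must be greater than zero (got {value})")
    return errors


def effective_subscriber_cap(options):
    """The cap is effectively 1 unless multiple subscribers are allowed."""
    if options.allow_multiple_event_subscribers:
        return options.max_event_subscribers_per_session
    return 1
```

With the defaults, `validate_session_options(SessionOptions())` returns an empty list and `effective_subscriber_cap` returns 1.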
|
||||
|
||||
## Event Options
|
||||
|
||||
| Option | Default | Description |
|
||||
|--------|---------|-------------|
|
||||
| `MxGateway:Events:QueueCapacity` | `10000` | Capacity for bounded per-session event queues used by the gateway worker event channel and the public gRPC event stream queue. |
|
||||
| `MxGateway:Events:BackpressurePolicy` | `FailFast` | Event backpressure behavior. `FailFast` faults the session on public stream queue overflow. `DisconnectSubscriber` disconnects only the slow stream. |
|
||||
| `MxGateway:Events:BackpressurePolicy` | `FailFast` | Per-subscriber event backpressure behavior when a subscriber's bounded event channel overflows. Overflow is isolated to the offending subscriber: it is always disconnected with an `EventQueueOverflow` fault while the session pump and other subscribers keep running. `FailFast` additionally faults the whole session only in the legacy single-subscriber case (the current default mode); with multiple subscribers it degrades to a per-subscriber disconnect so one slow consumer never faults a shared session. `DisconnectSubscriber` disconnects only the slow subscriber in all cases. |
|
||||
| `MxGateway:Events:ReplayBufferCapacity` | `1024` | Maximum number of events retained per session in the replay ring buffer, used to re-deliver events a returning subscriber missed (reconnect/reattach). The oldest retained event is evicted once this count is exceeded. `0` disables replay retention. |
|
||||
| `MxGateway:Events:ReplayRetentionSeconds` | `300` | Maximum age, in seconds, of an event retained in the replay ring buffer. Entries older than this are evicted regardless of capacity. `0` disables age-based eviction. |
|
||||
|
||||
`QueueCapacity` must be greater than zero. With `FailFast`, queue overflow
|
||||
faults the affected worker or session instead of silently dropping MXAccess
|
||||
events. With `DisconnectSubscriber`, public gRPC stream overflow terminates only
|
||||
the affected stream while the MXAccess session remains active.
|
||||
`QueueCapacity` must be greater than zero; it bounds each per-subscriber event
channel fed by the session's single event pump. A slow subscriber overflows only
its own channel and is always disconnected with an `EventQueueOverflow` fault;
MXAccess events are never silently dropped, and the pump and other subscribers
keep running. With `FailFast` in the single-subscriber case (the default mode),
that overflow additionally faults the whole session. With multiple subscribers,
`FailFast` degrades to a per-subscriber disconnect, matching
`DisconnectSubscriber`, so one slow consumer cannot fault a session shared by
healthy subscribers. With `DisconnectSubscriber`, overflow terminates only the
affected stream while the MXAccess session remains active.
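
The overflow rules above can be modeled in a few lines. The sketch below is Python and purely illustrative: `SessionFanOut`, `Subscriber`, and their members are hypothetical names rather than the gateway's `SessionEventDistributor` API, and it assumes "single-subscriber case" means exactly one subscriber is attached when the overflow happens.

```python
from collections import deque
from enum import Enum


class BackpressurePolicy(Enum):
    FAIL_FAST = "FailFast"
    DISCONNECT_SUBSCRIBER = "DisconnectSubscriber"


class Subscriber:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.queue = deque()
        self.fault = None  # becomes "EventQueueOverflow" when disconnected


class SessionFanOut:
    """Hypothetical model of one session's event pump and its subscribers."""

    def __init__(self, policy, queue_capacity):
        self.policy = policy
        self.queue_capacity = queue_capacity
        self.subscribers = []
        self.session_faulted = False

    def attach(self, name):
        subscriber = Subscriber(name, self.queue_capacity)
        self.subscribers.append(subscriber)
        return subscriber

    def publish(self, event):
        # Assumption: the single-subscriber case is decided by attached count.
        sole_subscriber = len(self.subscribers) == 1
        for subscriber in list(self.subscribers):
            if len(subscriber.queue) < subscriber.capacity:
                subscriber.queue.append(event)
                continue
            # Overflow is isolated: only the offending subscriber is dropped.
            subscriber.fault = "EventQueueOverflow"
            self.subscribers.remove(subscriber)
            if self.policy is BackpressurePolicy.FAIL_FAST and sole_subscriber:
                self.session_faulted = True
```

In this model a slow subscriber sharing a session with a healthy one is disconnected without faulting the session under either policy; only a lone subscriber under `FailFast` takes the session down with it.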
|
||||
|
||||
`ReplayBufferCapacity` and `ReplayRetentionSeconds` must each be greater than or
equal to zero (either dimension can be disabled with `0`). A returning subscriber
that asks for events older than the oldest still-retained event is told it missed
events (a "gap") and must re-snapshot; whatever is still retained is replayed.
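
To make the two eviction dimensions and the gap signal concrete, here is a small Python sketch. It is an assumption-laden model, not the gateway's implementation: `ReplayBuffer`, `append`, and `replay_after` are hypothetical names, events are assumed to carry a monotonically increasing sequence number, and time is passed in explicitly so the behavior is deterministic.

```python
from collections import deque


class ReplayBuffer:
    """Hypothetical per-session replay ring buffer."""

    def __init__(self, capacity=1024, retention_seconds=300):
        self.capacity = capacity
        self.retention_seconds = retention_seconds
        self._entries = deque()  # (sequence, timestamp, event), oldest first
        self._evicted_through = 0  # highest sequence no longer retained

    def append(self, sequence, event, now):
        if self.capacity == 0:
            # Replay retention disabled: nothing is kept.
            self._evicted_through = sequence
            return
        self._entries.append((sequence, now, event))
        while len(self._entries) > self.capacity:
            self._evict_oldest()
        self._evict_expired(now)

    def _evict_oldest(self):
        sequence, _, _ = self._entries.popleft()
        self._evicted_through = sequence

    def _evict_expired(self, now):
        if self.retention_seconds == 0:
            return  # age-based eviction disabled
        while self._entries and now - self._entries[0][1] > self.retention_seconds:
            self._evict_oldest()

    def replay_after(self, last_seen_sequence, now):
        """Return (gap, events) for a subscriber that saw `last_seen_sequence`."""
        self._evict_expired(now)
        gap = last_seen_sequence < self._evicted_through
        events = [e for seq, _, e in self._entries if seq > last_seen_sequence]
        return gap, events
```

A `gap` of `True` corresponds to the "missed events, re-snapshot" case, and the returned list is whatever is still retained beyond the subscriber's last seen sequence.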
|
||||
|
||||
## Dashboard Options
|
||||
|
||||
|
||||
@@ -167,7 +167,7 @@ bearer). Each hub class is `[Authorize(Policy = HubClientsPolicy)]`.
|
||||
|---|---|---|---|---|
|
||||
| `DashboardSnapshotHub` | `/hubs/snapshot` | `DashboardSnapshotPublisher` (BackgroundService consuming `IDashboardSnapshotService.WatchSnapshotsAsync`) | `DashboardSnapshot` | Sent to all connected clients on every snapshot tick; new connections receive the current snapshot synchronously in `OnConnectedAsync`. |
|
||||
| `AlarmsHub` | `/hubs/alarms` | `AlarmsHubPublisher` (BackgroundService consuming `IGatewayAlarmService.StreamAsync(filter: null)`) | `AlarmFeedMessage` (`active_alarm` / `snapshot_complete` / `transition`) | Connected clients auto-join `__alarms__`; all clients receive every message. Publisher auto-reconnects every 5s on stream faults. |
|
||||
| `EventsHub` | `/hubs/events` | `DashboardEventBroadcaster` invoked by `EventStreamService` for each event it forwards to a gRPC client | `MxEvent` | Clients call `SubscribeSession(sessionId)` to join `session:{id}`. Events appear only while a gRPC client is also consuming that session's events — the dashboard is a passive mirror, not a separate worker subscriber. |
|
||||
| `EventsHub` | `/hubs/events` | `DashboardEventBroadcaster` invoked by each session's internal dashboard-mirror subscriber on its `SessionEventDistributor` (registered when the session becomes Ready) | `MxEvent` | Clients call `SubscribeSession(sessionId)` to join `session:{id}`. The dashboard is a first-class distributor subscriber, so it receives the session's events whether or not a gRPC client is streaming. It sees RAW session events — not the per-gRPC-subscriber `AfterWorkerSequence` filtering that `EventStreamService` applies at its own boundary — because the dashboard is a separate LDAP-authenticated monitoring view meant to show the session's full event activity (per-session dashboard ACL is tracked separately). |
|
||||
|
||||
`DashboardPageBase` opens a `DashboardSnapshotHub` connection via the connection
|
||||
factory in `OnInitializedAsync`, seeds `Snapshot` synchronously from
|
||||
@@ -184,7 +184,8 @@ Default cadences:
|
||||
- snapshot service produces one snapshot per
|
||||
`MxGateway:Dashboard:SnapshotIntervalMilliseconds` (default 1s);
|
||||
- alarm publisher emits on each transition observed by the central monitor;
|
||||
- event publisher emits per event forwarded by `StreamEvents`.
|
||||
- event publisher emits per event fanned by the session's `SessionEventDistributor`
|
||||
to its internal dashboard-mirror subscriber (independent of any gRPC `StreamEvents`).
|
||||
|
||||
Avoid pushing every MXAccess data-change event into a wider broadcast group.
|
||||
The current design routes events strictly through `session:{id}` groups; the
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
# Deferred Follow-ups Implementation Plan
|
||||
|
||||
**Date:** 2026-06-14
|
||||
**Status:** Plan only — NOT yet executed. Saved for review.
|
||||
**Status:** D1 executed (commit 4af24b9 — `mxgateway.alarms.provider_switches` emitted in `DashboardSnapshotService.cs:198`). D2 resolved as no-op (see resolution section below). D3–D5 remain pending (ops/validation, no code).
|
||||
**Context:** After the alarm-subtag-fallback cleanup (merged `5976770`) and its redeploy to
|
||||
windev (10.100.0.48), five items remain deferred. This plan handles all five. They are
|
||||
independent — execute in any order, or cherry-pick. Items D1–D2 are code (branch off `main`);
|
||||
@@ -268,3 +268,34 @@ the deployed instance. **No source change made** (no-op).
|
||||
deployed instance (the only path that exercises routing past the unauthenticated 302-to-`/login`).
|
||||
Recommend a spot-check of authenticated `GET /` after the next Server redeploy; if it returns 200
|
||||
(not 500), this item can be fully closed.
|
||||
|
||||
---

## Recorded residuals after `feat/stillpending-completion` (2026-06-15)

The stillpending.md actionable items were implemented on branch `feat/stillpending-completion`
(see `docs/plans/2026-06-15-stillpending-completion.md`). These environment/vendor-gated residuals
remain explicitly open; none are code defects:

- **`provider_switches{from,to,reason}` counter: live exercise still pending.** The metric is
  emitted on the alarm failover/failback path and unit-tested, but the dev rig's object-driven
  alarms can't be forced through a real alarmmgr→subtag switch from outside, so the `reason` tagging
  is unproven against a live failover. Re-verify when a rig (or capture) can drive an actual
  alarmmgr fault. (stillpending §1.3.)

- **`DrainEvents` is a diagnostic snapshot, not an exhaustive drain.** The worker now answers
  `DrainEvents` (handled in `WorkerPipeSession`, off-STA), but it pulls from the same event queue
  that the ~25 ms background stream-drain loop continuously empties. With an active event stream a
  `DrainEvents` caller therefore receives only events not yet pushed by the stream loop. There is
  no loss or double-delivery (the queue drain is lock-protected and destructive), but the result is
  a non-deterministic snapshot. Documented here so the semantics aren't mistaken for a bug.

- **Buffered multi-sample conversion (`OnBufferedDataChange`): unverified live.** `AddBufferedItem`
  / `SetBufferedUpdateInterval` are implemented and live-confirmed; the empty `NoData` bootstrap
  event converts cleanly live (`f7ada90`). A real parallel quality/timestamp sample batch
  (length > 1) is undrivable on the current rig, so the multi-sample path is exercised only by unit
  tests against a fake `IMxAccessServer`. (stillpending §3.2.)

- **8-arg alarm ack operator `domain`/`full_name`: vendor-blocked.** The AVEVA `IwwAlarmConsumer2`
  8-arg `AlarmAckByName` returns −55 (stub) and `AlarmAckByGUID` is `E_NOTIMPL` on this build, so the
  two fields stay forward-compat-only on the wire. (stillpending §1.4 / §3.4 / §3.5.)
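
The `DrainEvents` semantics recorded above (destructive, lock-protected drain shared with the stream loop) can be illustrated with a short Python sketch. `WorkerEventQueue` and its methods are hypothetical names, and the ~25 ms stream loop is simulated by an explicit call rather than a background thread so the outcome is reproducible.

```python
import threading
from collections import deque


class WorkerEventQueue:
    """Hypothetical model of the worker's shared event queue."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = deque()

    def enqueue(self, event):
        with self._lock:
            self._events.append(event)

    def drain(self):
        """Destructively take everything currently queued, under the lock."""
        with self._lock:
            drained = list(self._events)
            self._events.clear()
            return drained


events = WorkerEventQueue()
for n in range(5):
    events.enqueue(n)
stream_batch = events.drain()      # one tick of the simulated stream loop
events.enqueue(5)
events.enqueue(6)
diagnostic_batch = events.drain()  # what a DrainEvents caller would see
```

Here `diagnostic_batch` holds only the two events the stream loop had not yet taken, while the two batches together contain each event exactly once. That is the "snapshot, not exhaustive drain" behavior: nothing is lost or duplicated, but what a diagnostic caller sees depends on timing.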
|
||||
|
||||
@@ -0,0 +1,233 @@
|
||||
# Session Resilience Epic: Design

**Date:** 2026-06-15
**Branch:** `feat/session-resilience`
**Source:** `stillpending.md` §2 (intentional v1-deferred items), scoped into a real feature design.
**Status:** Design approved; implementation plan to follow.

## Goal

Lift four deliberately deferred v1 limitations into supported features, built on
one shared foundation:

1. **Multi-event-subscriber fan-out** (§2: plumbed but validator-blocked).
2. **Reconnectable sessions** (§2: 1:1 session↔connection today).
3. **Per-session dashboard ACL** (§2 / EventsHub `TODO(per-session-acl)`).
4. **Orphan-worker reattach on gateway restart** (§2; **overturns a hard
   CLAUDE.md rule**, see "Documented-rule changes").

These are not peers: fan-out is the keystone, reconnect and reattach reuse its
machinery, and three of the four need a new session-ownership concept.

## Documented-rule changes (explicit, owner-approved)

This epic deliberately reverses three documented v1 decisions. Each reversal is a
required deliverable in the same change as the code:

- **CLAUDE.md:77** "Gateway restart does not reattach orphan workers… do not
  design code paths that assume reattachment." → reattach becomes supported,
  bounded, and opt-in.
- **`docs/DesignDecisions.md:63-73`** "no reconnectable sessions for v1." →
  reconnect becomes supported within a bounded detach-grace window.
- **`docs/DesignDecisions.md:75-80`** single event subscriber per session. →
  multi-subscriber fan-out becomes supported, capped.

The owner explicitly accepted overturning the reattach rule during design.

## Current-state seams (verified by recon, with citations)

- `GatewaySession.AttachEventSubscriber(bool allowMultipleSubscribers)`
  (`Sessions/GatewaySession.cs:386-408`) guards on a single int
  `_activeEventSubscriberCount` (`:16`) under `_syncRoot`; a second subscriber
  throws `EventSubscriberAlreadyActive` (`:398`).
- `GatewayOptionsValidator.cs:181-185` hard-rejects
  `AllowMultipleEventSubscribers` ("not supported until event fan-out is
  implemented"); option bound at `SessionOptions.cs:26-29`.
- `EventStreamService.StreamEventsAsync` (`Grpc/EventStreamService.cs:27-101`)
  creates **a new bounded `Channel<MxEvent>` per RPC call** (`:43-50`) and
  `ProduceEventsAsync` drains `session.ReadEventsAsync()` directly, which is a
  **destructive, single-consumer drain**. Two RPCs would fight over one queue.
- Backpressure: `ProduceEventsAsync` uses non-blocking `TryWrite`; on overflow
  with `EventBackpressurePolicy.FailFast` (default, `EventOptions`) it calls
  `session.MarkFaulted` (`EventStreamService.cs:143-162`), faulting the **whole
  session**, not just the slow consumer.
- `DashboardEventBroadcaster.Publish` (`Dashboard/Hubs/DashboardEventBroadcaster.cs:13-44`)
  is called **inside** the per-RPC producer loop (`EventStreamService.cs:131-141`),
  so the dashboard only mirrors events while a gRPC client is streaming. Latent
  bug: no gRPC subscriber ⇒ dashboard feed is dark.
- Pipe name `mxaccess-gateway-{Environment.ProcessId}-{sessionId}`
  (`SessionManager.cs:433`); session id `session-{Guid:N}` (`:479`), in-memory
  `SessionRegistry` only (`SessionRegistry.cs:12`), **not persisted**.
|
||||
- `OrphanWorkerTerminator` (`Workers/OrphanWorkerTerminator.cs:49-112`) discovers
|
||||
orphans by executable name/path (x64 gateway cannot introspect the x86 worker
|
||||
module → image-name fallback) and **terminates** them; rationale comment at
|
||||
`:9-16`.
|
||||
- Pipe fault → `WorkerClient` read loop detects `EndOfStream`, session →
|
||||
`Faulted` (`WorkerClient.cs:376-381`); no reattach. Worker launch passes the
|
||||
per-session nonce via `MXGATEWAY_WORKER_NONCE` env var
|
||||
(`WorkerProcessLauncher.cs:180-182`).
|
||||
- Sessions store `ClientIdentity` (informational only, `GatewaySession.cs:114`);
|
||||
**no `OwnerKeyId`, no per-session ACL.** gRPC `StreamEvents` enforces per-item
|
||||
read constraints but **no session-level access gate** — any caller who knows a
|
||||
session id can stream it.
|
||||
- `EventsHub.SubscribeSession(string)` (`Dashboard/Hubs/EventsHub.cs:46-54`) joins
|
||||
group `session:{id}`; only hub-level `[Authorize(HubClientsPolicy)]` gates it,
|
||||
so **any** Admin/Viewer can subscribe to **any** session. `TODO(per-session-acl)`
|
||||
at `:39-43`. `SnapshotHub`/`AlarmsHub` broadcast to all. Hub bearer
|
||||
(`HubTokenService`, 30-min) carries name + roles only, **no session scope**.
|
||||
- `StreamEventsRequest.AfterWorkerSequence` already exists on the wire (the
|
||||
reconnect replay contract is half-built).
|
||||
|
||||
## Shared foundation
|
||||
|
||||
### A. `SessionEventDistributor` (one pump, N per-subscriber channels)
|
||||
|
||||
Per `GatewaySession`, replace the per-RPC direct drain with a single owned
|
||||
distributor:
|
||||
|
||||
- One background **pump task** drains `ReadEventsAsync()` exactly once.
|
||||
- Each event is (1) stamped with its worker sequence, (2) appended to a **bounded
|
||||
replay ring buffer** (retain the last `ReplayBufferCapacity` events or those newer than
|
||||
`ReplayRetentionSeconds`, whichever limit is reached first), and (3) `TryWrite`-fanned to every
|
||||
registered subscriber's own bounded `Channel<MxEvent>`.
|
||||
- **Per-subscriber backpressure isolation:** overflow completes only that
|
||||
subscriber's channel (policy `DisconnectSubscriber`); the session and peers are
|
||||
untouched. `FailFast`→`MarkFaulted` is retained only for the legacy
|
||||
single-subscriber config path, for backward compatibility.
|
||||
- **Constraint filtering stays per-subscriber:** the pump fans *raw* events; each
|
||||
subscriber's read loop applies its own API-key read subtree/glob filter exactly
|
||||
as today. No change to constraint semantics.
|
||||
- `AttachEventSubscriber` returns a lease carrying that subscriber's channel
|
||||
reader + its start sequence (for replay). `EventStreamService` reads the lease
|
||||
channel instead of creating its own channel and draining the session.
|
||||
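The fan-out and per-subscriber isolation described above can be sketched as follows. This is a minimal, single-threaded illustration in Python rather than the gateway's C#; `SubscriberLease`, `register`, and `publish` are hypothetical names, and a real pump would need synchronization around the subscriber list.

```python
import queue


class SubscriberLease:
    """Hypothetical lease: one bounded queue per subscriber."""

    def __init__(self, capacity):
        self.events = queue.Queue(maxsize=capacity)
        self.closed_reason = None  # set when the distributor disconnects us


class SessionEventDistributor:
    """Illustrative only: one publisher fanning out to N bounded queues."""

    def __init__(self, per_subscriber_capacity=4):
        self._capacity = per_subscriber_capacity
        self._leases = []

    def register(self):
        lease = SubscriberLease(self._capacity)
        self._leases.append(lease)
        return lease

    def publish(self, event):
        # Called by the single pump for each event drained from the session.
        for lease in list(self._leases):
            try:
                lease.events.put_nowait(event)  # non-blocking, like TryWrite
            except queue.Full:
                # Overflow disconnects only this subscriber; peers are untouched.
                lease.closed_reason = "EventQueueOverflow"
                self._leases.remove(lease)


distributor = SessionEventDistributor(per_subscriber_capacity=2)
fast = distributor.register()
slow = distributor.register()
for seq in range(1, 4):
    distributor.publish({"worker_sequence": seq})
    fast.events.get_nowait()  # the fast consumer keeps up; the slow one never reads
```

Here the slow subscriber is dropped on the third event while the fast one keeps receiving, which is the behavior the `DisconnectSubscriber` policy is meant to provide.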
|
||||
### B. Session ownership
|
||||
|
||||
Record an authoritative **`OwnerKeyId`** (the creating API key id) on the session
|
||||
at `OpenSession`, alongside the existing informational `ClientIdentity`. This one
|
||||
field underpins ACL, reconnect re-validation, and reattach adoption.
|
||||
|
||||
## Feature designs
|
||||
|
||||
### 1. Multi-subscriber fan-out
|
||||
|
||||
- Remove the `GatewayOptionsValidator.cs:181-185` rejection; keep the option but
|
||||
allow `true`.
|
||||
- `_activeEventSubscriberCount` → a subscriber-lease collection on the
|
||||
distributor. New cap `MaxEventSubscribersPerSession` (default 8) → reject any
|
||||
attach beyond the cap with `EventSubscriberLimitReached`.
|
||||
- Dashboard broadcaster registers as a distributor subscriber (removing the inline
|
||||
tap), fixing the dashboard-dark-without-gRPC bug.
|
||||
- **No proto change.**
|
||||
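A minimal sketch of cap enforcement on the lease collection, again in Python for illustration only. The class and method names are assumptions; only `EventSubscriberLimitReached` and the default of 8 come from the design.

```python
import threading


class EventSubscriberLimitReached(Exception):
    """Stand-in for the proposed error code of the same name."""


class SubscriberLeases:
    """Hypothetical lease collection replacing a single subscriber counter."""

    def __init__(self, max_subscribers=8):
        self._max = max_subscribers
        self._lock = threading.Lock()
        self._active = set()
        self._next_id = 0

    def attach(self):
        with self._lock:
            # Check and add under one lock so concurrent attaches cannot
            # both slip past the cap.
            if len(self._active) >= self._max:
                raise EventSubscriberLimitReached(
                    f"session already has {self._max} event subscribers")
            self._next_id += 1
            self._active.add(self._next_id)
            return self._next_id

    def detach(self, lease_id):
        with self._lock:
            self._active.discard(lease_id)


leases = SubscriberLeases(max_subscribers=2)
a = leases.attach()
b = leases.attach()
try:
    leases.attach()
    third_rejected = False
except EventSubscriberLimitReached:
    third_rejected = True
leases.detach(a)
c = leases.attach()  # a slot was freed, so this attach succeeds
```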
|
||||
### 2. Reconnectable sessions
|
||||
|
||||
- On stream drop, a session in **detach-grace** mode is retained (not closed) for
|
||||
`DetachGraceSeconds` (separate from the session lease). New session
|
||||
disconnect-policy value `DetachGrace`.
|
||||
- On reconnect: client calls `StreamEvents` with the same session id +
|
||||
`AfterWorkerSequence = lastSeen`. The distributor replays ring-buffer events
|
||||
with `sequence > AfterWorkerSequence`, then resumes live.
|
||||
- If the requested sequence is older than the ring's oldest retained event (gone
|
||||
too long / ring overflowed), the server signals **`ReplayGap`** so the client
|
||||
re-snapshots. **Contract addition** (a `ReplayGap` status / response marker) →
|
||||
codegen ripple across all 5 clients.
|
||||
- Reconnect re-validates caller `OwnerKeyId` == session owner → else
|
||||
`PermissionDenied`.
|
||||
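The replay and `ReplayGap` decision above can be sketched as below. This Python version is an assumption-laden illustration: it evicts by count only (age-based eviction is omitted), assumes worker sequences start at 1 and increase, and `ReplayRing` is a hypothetical name. `try_get_replay_from` mirrors the `TryGetReplayFrom` shape proposed in the plan.

```python
from collections import deque


class ReplayRing:
    """Illustrative count-bounded replay buffer keyed by worker sequence."""

    def __init__(self, capacity):
        self._events = deque(maxlen=capacity)  # holds (sequence, event) pairs
        self._highest_evicted = 0  # 0 means nothing has been evicted yet

    def append(self, sequence, event):
        if len(self._events) == self._events.maxlen:
            # The deque drops its oldest entry on append; remember which one.
            self._highest_evicted = self._events[0][0]
        self._events.append((sequence, event))

    def try_get_replay_from(self, after_sequence):
        # A gap exists if any event newer than after_sequence was evicted.
        gap = after_sequence < self._highest_evicted
        events = [e for seq, e in self._events if seq > after_sequence]
        return events, gap


ring = ReplayRing(capacity=3)
for seq in range(1, 6):
    ring.append(seq, f"event-{seq}")

replayed, gap = ring.try_get_replay_from(3)       # client last saw sequence 3
_, stale_gap = ring.try_get_replay_from(1)        # sequence 2 was evicted
```

A client whose last-seen sequence is 3 gets events 4 and 5 with no gap; a client that last saw sequence 1 gets a gap signal, because event 2 is no longer retained.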
|
||||
### 3. Per-session ACL
|
||||
|
||||
- **gRPC (real security win, no proto change):** `Invoke` / `StreamEvents` /
|
||||
`CloseSession` gated to the owning API key, OR a key holding a new
|
||||
all-sessions admin scope → else `PermissionDenied`. Enforced in
|
||||
`MxAccessGatewayService` against session `OwnerKeyId`.
|
||||
- **Dashboard:** identity-domain mismatch (LDAP Admin/Viewer users vs API-key
|
||||
sessions) means no natural owner link.
|
||||
- **Decision required (flagged, not hard-coded):** default proposal —
|
||||
**Admin** sees all sessions; **Viewer** scoped via config
|
||||
`Dashboard:GroupToSessionTag` matched against an optional session `Tag`.
|
||||
Enforced at `EventsHub.SubscribeSession` and in the `/hubs/token` mint
|
||||
(token gains an allowed-session-tag claim). The owner may instead choose a
|
||||
strict default (Viewers see nothing unless granted).
|
||||
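Both gates can be sketched as small pure functions. This is a hedged Python illustration, not the gateway's code: the scope string `sessions:all`, the function names, and the choice to refuse Viewers on untagged sessions are all assumptions, and the dashboard rule shown is the default proposal, which the owner may still invert.

```python
class PermissionDenied(Exception):
    """Stand-in for the gRPC PermissionDenied status."""


ALL_SESSIONS_SCOPE = "sessions:all"  # hypothetical name for the admin scope


def authorize_grpc_session_access(caller_key_id, caller_scopes, owner_key_id):
    """Allow the owning API key, or any key holding the all-sessions scope."""
    if caller_key_id == owner_key_id or ALL_SESSIONS_SCOPE in caller_scopes:
        return
    raise PermissionDenied("caller does not own this session")


def dashboard_may_subscribe(roles, groups, group_to_session_tag, session_tag):
    """Default proposal: Admin sees all; Viewer is scoped by group-to-tag map."""
    if "Admin" in roles:
        return True
    allowed_tags = {
        group_to_session_tag[g] for g in groups if g in group_to_session_tag
    }
    # Assumption: an untagged session is not visible to Viewers.
    return session_tag is not None and session_tag in allowed_tags


tag_map = {"line-a-operators": "line-a"}
viewer_ok = dashboard_may_subscribe(["Viewer"], ["line-a-operators"], tag_map, "line-a")
viewer_denied = dashboard_may_subscribe(["Viewer"], ["line-a-operators"], tag_map, "line-b")
admin_ok = dashboard_may_subscribe(["Admin"], [], tag_map, None)
```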
|
||||
### 4. Orphan-worker reattach
|
||||
|
||||
- **Stable pipe naming:** drop `{gatewayPid}`; use a persisted stable
|
||||
gateway-instance id. Replaces the pid's collision-avoidance role.
|
||||
- **Adoption manifest:** persist a minimal record per live session
|
||||
(`sessionId → workerPid, nonce, ownerKeyId, pipeName`) in the existing SQLite
|
||||
store. This is the *only* persisted session state; COM/advise state stays in
|
||||
the worker.
|
||||
- **Worker phones home:** the worker runs a reconnect loop with bounded backoff;
|
||||
the restarted gateway re-opens pipe servers for manifest entries and the
|
||||
surviving worker re-attaches, presenting its **nonce**. Gateway validates the
|
||||
nonce against the manifest and **rejects impostors / foreign workers**.
|
||||
- **Resync, not replay:** the in-memory ring buffer is lost on restart, so a
|
||||
reattached session's subscribers get `ReplayGap` and re-snapshot. Gateway
|
||||
resyncs worker view via the now-implemented `GetSessionState` / `GetWorkerInfo`
|
||||
commands.
|
||||
- **Safety net retained:** workers self-terminate after `MaxOrphanLifetime` with
|
||||
no re-adoption; `OrphanWorkerTerminator` stays as the fallback for un-adoptable
|
||||
or foreign workers. Reattach is opt-in (`Workers:EnableOrphanReattach`,
|
||||
default off) so the documented-safe behavior remains the default.
|
||||
- **Pipe protocol:** add an **adopt/reconnect frame** to `mxaccess_worker.proto`
|
||||
→ worker codegen regen + commit `Generated/` (net48 regen rule applies).
|
||||
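The manifest and nonce check above might look roughly like this. The sketch uses Python's `sqlite3` and `hmac` modules purely for illustration; the table name, column names, and function names are assumptions, and how the nonce is protected at rest is not specified in the design.

```python
import hmac
import sqlite3


def open_manifest(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS worker_adoption ("
        " session_id TEXT PRIMARY KEY,"
        " worker_pid INTEGER NOT NULL,"
        " nonce TEXT NOT NULL,"
        " owner_key_id TEXT NOT NULL,"
        " pipe_name TEXT NOT NULL)"
    )
    return db


def record_launch(db, session_id, worker_pid, nonce, owner_key_id, pipe_name):
    # Upsert on worker launch so a relaunch replaces any stale entry.
    db.execute(
        "INSERT OR REPLACE INTO worker_adoption VALUES (?, ?, ?, ?, ?)",
        (session_id, worker_pid, nonce, owner_key_id, pipe_name),
    )
    db.commit()


def remove_on_clean_close(db, session_id):
    db.execute("DELETE FROM worker_adoption WHERE session_id = ?", (session_id,))
    db.commit()


def try_adopt(db, session_id, presented_nonce):
    """Return True only if the worker presents the nonce recorded at launch."""
    row = db.execute(
        "SELECT nonce FROM worker_adoption WHERE session_id = ?", (session_id,)
    ).fetchone()
    if row is None:
        return False  # unknown session: treat as a foreign worker
    # Constant-time comparison avoids leaking the nonce through timing.
    return hmac.compare_digest(row[0].encode(), presented_nonce.encode())


db = open_manifest()
record_launch(db, "session-abc", 4242, "nonce-123", "key-1", "pipe-session-abc")
```

A rejected adoption would then fall through to the existing termination path rather than being retried.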
|
||||
## Contract / codegen impact
|
||||
|
||||
Unlike the prior epic, this is **not zero-proto**:
|
||||
|
||||
- `mxaccess_gateway.proto` — `ReplayGap` signal for reconnect (Feature 2).
|
||||
- `mxaccess_worker.proto` — adopt/reconnect frame (Feature 4).
|
||||
|
||||
Per the repo rule: regenerate `Generated/`, commit it, rebuild gateway + worker +
|
||||
every generated client touched, and update affected docs in the same change.
|
||||
|
||||
## Error handling
|
||||
|
||||
- Per-subscriber overflow → disconnect that subscriber only; session survives.
|
||||
- Reconnect past the ring horizon → `ReplayGap`, client re-snapshots (no silent
|
||||
loss).
|
||||
- Reattach nonce mismatch → reject + fall back to termination.
|
||||
- ACL denial → `PermissionDenied` (gRPC) / hub subscribe refused (dashboard).
|
||||
- All worker COM/STA interactions keep MXAccess parity — no synthesized events,
|
||||
no "fixing" surprising returns.
|
||||
|
||||
## Testing & cross-platform verification
|
||||
|
||||
| Area | Test | Host |
|
||||
|---|---|---|
|
||||
| Distributor fan-out, per-sub backpressure, replay ring | unit, `FakeWorkerHarness` | local (macOS) |
|
||||
| Reconnect replay + `ReplayGap` | unit + fake-worker integration | local |
|
||||
| Session ownership / gRPC ACL | unit + gateway integration | local |
|
||||
| Dashboard per-session ACL | LDAP test users (`multi-role`/`gw-viewer`) | local + live LDAP |
|
||||
| Worker adopt frame, reattach handshake | worker unit (net48/x86) | **windev** |
|
||||
| Gateway-restart reattach round-trip | integration | **windev** + live worker |
|
||||
| Client `ReplayGap` handling | per-client tests; Java on macOS JDK 21 | local |
|
||||
|
||||
TDD throughout; per-task commits; `Generated/` regenerated+committed on proto
|
||||
changes; docs (incl. the three documented-rule reversals) updated in the same
|
||||
change as source.
|
||||
|
||||
## Delivery order (dependency stack)
|
||||
|
||||
Each phase is independently shippable:
|
||||
|
||||
1. **Foundation** — `SessionEventDistributor` + replay ring + session `OwnerKeyId`
|
||||
(refactor with no external behavior change; dashboard-dark bug fixed).
|
||||
2. **Fan-out** — remove validator block, subscriber-lease list, cap, dashboard as
|
||||
subscriber.
|
||||
3. **Reconnect** — detach-grace, replay-on-reconnect, `ReplayGap` contract +
|
||||
client handling.
|
||||
4. **Per-session ACL** — gRPC owner gate + dashboard scoping.
|
||||
5. **Reattach** — stable pipe naming, adoption manifest, worker phone-home +
|
||||
adopt frame, resync, safety net; documented-rule reversals.
|
||||
|
||||
## Out of scope
|
||||
|
||||
- Cross-gateway / clustered session sharing (single gateway instance only).
|
||||
- Event persistence beyond the in-memory ring (no durable event log).
|
||||
- Reconnect across a gateway *restart* with zero event gap (restart always yields
|
||||
`ReplayGap` by design — the ring is in-memory).
|
||||
- Per-session ACL on `SnapshotHub` / `AlarmsHub` (they broadcast aggregate state;
|
||||
only `EventsHub` is session-scoped).
|
||||
@@ -0,0 +1,417 @@
|
||||
# Session Resilience Epic — Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:subagent-driven-development (same session) or executing-plans (parallel session) to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Lift four deferred v1 limitations — multi-subscriber fan-out, reconnectable sessions, per-session ACL, orphan-worker reattach — onto one shared event-distribution foundation.
|
||||
|
||||
**Architecture:** A per-session `SessionEventDistributor` (one pump → N per-subscriber bounded channels + a bounded replay ring) replaces today's per-RPC destructive drain. Session ownership (`OwnerKeyId`) underpins ACL, reconnect re-validation, and reattach adoption. See `docs/plans/2026-06-15-session-resilience-design.md`.
|
||||
|
||||
**Tech Stack:** .NET 10 gateway (x64), .NET Framework 4.8 worker (x86, windev), SQLite auth/manifest store, gRPC + protobuf contracts (net10.0;net48), 5 language clients, Blazor/SignalR dashboard, LDAP dashboard auth.
|
||||
|
||||
**Cross-platform:** Gateway, dotnet/Go/Rust/Python clients, and the Java client build/test locally on macOS (JDK 21 at `~/.local/jdks/jdk-21.0.11+10/Contents/Home`). The net48/x86 worker and worker tests build/test on **windev** (ssh alias, PowerShell). Proto changes: regenerate `Generated/`, commit it, rebuild every touched component.
|
||||
|
||||
**Standing rules (from CLAUDE.md):** never log secrets/credentials/values; MXAccess parity (no synthesized events, no "fixing" surprising returns); no init-only props/positional records in net48 worker; update docs in the same change as source; branch already created (`feat/session-resilience`); per-task commits; build+test affected components before marking done.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 — Foundation (refactor; no external behavior change except the dashboard-dark fix)
|
||||
|
||||
### Task 1: Add `OwnerKeyId` to the session
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none (other phase-1 tasks build on the session type)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs` (add `OwnerKeyId` readonly prop near `ClientIdentity:114`)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs` (set `OwnerKeyId` from the request identity at `OpenSession`, near `CreateSessionId:479`)
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs`
|
||||
|
||||
**Steps:** TDD — failing test asserting an opened session records the creating API key id → add the property + assignment from `IGatewayRequestIdentityAccessor.Current` → green → `dotnet build src/ZB.MOM.WW.MxGateway.Server` + run session tests → commit.
|
||||
|
||||
### Task 2: `SessionEventDistributor` skeleton (single pump, subscriber registry)
|
||||
|
||||
**Classification:** high-risk (concurrency / actor model)
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Create: `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionEventDistributor.cs`
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionEventDistributorTests.cs`
|
||||
|
||||
**Design:** One background pump `Task` draining `session.ReadEventsAsync()` exactly once; a thread-safe subscriber collection where each subscriber owns a bounded `Channel<MxEvent>` (`SingleReader=true`, `FullMode=Wait` for the per-sub channel, but writes use non-blocking `TryWrite`). `Register(startSequence)` returns a lease (channel reader + dispose). Pump fans each drained event to all subscriber channels via `TryWrite`.
|
||||
|
||||
**Steps:** Failing test: two registered subscribers both receive the same fanned event; disposing one stops its delivery without affecting the other. Implement pump + registry. Green. Build + test. Commit.
|
||||
|
||||
### Task 3: Bounded replay ring buffer
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none (extends Task 2)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionEventDistributor.cs`
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Configuration/EventOptions.cs` (add `ReplayBufferCapacity`, `ReplayRetentionSeconds`)
|
||||
- Test: `SessionEventDistributorTests.cs`
|
||||
|
||||
**Design:** Append each fanned event to a ring keyed by worker sequence, evicting by count (`ReplayBufferCapacity`) or age (`ReplayRetentionSeconds`), whichever limit is reached first. Expose `TryGetReplayFrom(afterSequence, out events, out gap)`.
|
||||
|
||||
**Steps:** Failing test: events are evicted past capacity; `TryGetReplayFrom` returns `gap=true` when the requested sequence is older than the oldest retained event. Implement. Green. Build+test. Commit.
|
||||
|
||||
### Task 4: Rewire `AttachEventSubscriber` + `EventStreamService` onto the distributor
|
||||
|
||||
**Classification:** high-risk (changes the live event path)
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:386-408` (own a `SessionEventDistributor`; `AttachEventSubscriber` returns a lease wrapping `distributor.Register(...)`)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Grpc/EventStreamService.cs:27-101` (read the lease's channel instead of creating a per-RPC channel and draining the session directly; remove the per-RPC `Channel.CreateBounded` at `:43-50`)
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs`
|
||||
|
||||
**Steps:** Failing test: a single subscriber still streams events end-to-end through the distributor (regression parity with today). Rewire. Keep per-item constraint filtering in the subscriber read loop. Green. Build + run gateway event-stream tests. Commit.
|
||||
|
||||
### Task 5: Per-subscriber backpressure isolation
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none (extends Tasks 2/4)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionEventDistributor.cs`
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Grpc/EventStreamService.cs` (overflow path `:143-162`)
|
||||
- Test: `SessionEventDistributorTests.cs`
|
||||
|
||||
**Design:** On a subscriber channel `TryWrite` failure, complete only that subscriber's channel with `EventQueueOverflow` (policy `DisconnectSubscriber`). Retain `FailFast`→`MarkFaulted` only when the session is in legacy single-subscriber mode (back-compat).
|
||||
|
||||
**Steps:** Failing test: a slow subscriber overflows and is disconnected while a second subscriber keeps receiving and the session stays `Ready`. Implement. Green. Build+test. Commit.
|
||||
|
||||
### Task 6: Dashboard broadcaster becomes a distributor subscriber
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Grpc/EventStreamService.cs:131-141` (remove the inline `dashboardEventBroadcaster.Publish` tap)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs` (register the dashboard broadcaster as a distributor subscriber on session start)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/DashboardEventBroadcaster.cs` (consume from a distributor lease)
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardEventBroadcasterTests.cs`
|
||||
|
||||
**Steps:** Failing test: dashboard receives session events even with **no** active gRPC subscriber (fixes the latent dark-feed bug). Implement. Green. Build + dashboard tests. Commit.
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 — Multi-subscriber fan-out
|
||||
|
||||
### Task 7: Remove the validator block + add the subscriber cap option
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~3 min
|
||||
**Parallelizable with:** none (Task 8 is sequential; same files)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Configuration/GatewayOptionsValidator.cs:181-185` (delete the rejection)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs` (add `MaxEventSubscribersPerSession`, default 8)
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Configuration/GatewayOptionsValidatorTests.cs`
|
||||
|
||||
**Steps:** Failing test: `AllowMultipleEventSubscribers=true` now validates clean. Remove rule, add option. Green. Build+test. Commit.
|
||||
|
||||
### Task 8: Subscriber-lease collection + cap enforcement
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs` (replace `_activeEventSubscriberCount:16` with a lease collection; honor `allowMultipleSubscribers`; reject N+1 with new `SessionManagerErrorCode.EventSubscriberLimitReached`)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManagerErrorCode.cs`
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/GatewaySessionTests.cs`
|
||||
|
||||
**Steps:** Failing tests: N subscribers attach concurrently up to the cap; the next attach throws `EventSubscriberLimitReached`; single-subscriber mode still rejects the second subscriber. Implement. Green. Build+test. Commit.
|
||||
|
||||
### Task 9: Multi-subscriber end-to-end test via FakeWorkerHarness
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs`
|
||||
|
||||
**Steps:** Two concurrent `StreamEvents` RPCs on one session both receive every worker event; one cancels, the other continues. Build + full fake-worker suite. Commit.
|
||||
|
||||
---
|
||||
|
||||
## Phase 3 — Reconnectable sessions
|
||||
|
||||
### Task 10: Proto — `ReplayGap` signal (contract change)
|
||||
|
||||
**Classification:** high-risk (contracts → all clients)
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto` (add a `ReplayGap` marker — a `replay_gap` bool + `oldest_available_sequence` on the stream response, or a dedicated leading status frame)
|
||||
- Regenerate: `dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`; **commit** `src/ZB.MOM.WW.MxGateway.Contracts/Generated/*` (net48 regen rule — see `project_proto_codegen_regen`)
|
||||
- Test: contracts build both TFMs (net10.0;net48)
|
||||
|
||||
**Steps:** Add field(s), regen, `del Generated/*.cs` if needed to force regen, commit generated. Build contracts both TFMs. Commit. **This unblocks Task 12 and Task 14.**
|
||||
|
||||
### Task 11: Detach-grace session retention
|
||||
|
||||
**Classification:** high-risk (session lifecycle)
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs` (add `DetachGrace` retention: on last-subscriber-drop, keep session alive for `DetachGraceSeconds` instead of closing)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs` (`DetachGraceSeconds`; new disconnect-policy value `DetachGrace`)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionLeaseMonitorHostedService.cs` (sweep expired detach-grace windows)
|
||||
- Test: `GatewaySessionTests.cs`
|
||||
|
||||
**Steps:** Failing test: subscriber drop under `DetachGrace` keeps the session `Ready` until the window expires, then closes. Implement. Green. Build + session/lease tests. Commit.
|
||||
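The retention rule can be sketched with an injected clock so it is testable without sleeping. This Python sketch is illustrative only; `DetachGraceTracker` and its method names are hypothetical, and the sweep that would call `is_expired` is assumed to live in the lease monitor.

```python
class DetachGraceTracker:
    """Illustrative tracker for one session's detach-grace window."""

    def __init__(self, grace_seconds):
        self._grace = grace_seconds
        self._detached_at = None  # None while at least one subscriber is attached

    def on_last_subscriber_dropped(self, now):
        self._detached_at = now

    def on_subscriber_attached(self):
        # A reconnect inside the window cancels the pending close.
        self._detached_at = None

    def is_expired(self, now):
        return (
            self._detached_at is not None
            and now - self._detached_at >= self._grace
        )


tracker = DetachGraceTracker(grace_seconds=30)
tracker.on_last_subscriber_dropped(now=100)
inside_window = tracker.is_expired(now=120)    # 20 s after the drop
after_window = tracker.is_expired(now=130)     # exactly 30 s after the drop
tracker.on_subscriber_attached()
after_reconnect = tracker.is_expired(now=500)  # reconnect cleared the timer
```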
|
||||
### Task 12: Replay-on-reconnect + emit `ReplayGap`
|
||||
|
||||
**Classification:** high-risk
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Grpc/EventStreamService.cs` (on attach with `AfterWorkerSequence`, call `distributor.TryGetReplayFrom`; replay buffered events then resume live; if `gap`, send the `ReplayGap` marker first)
|
||||
- Test: `EventStreamServiceTests.cs`
|
||||
|
||||
**Steps:** Failing tests: reconnect with a known sequence replays only newer events; reconnect past the ring horizon yields `ReplayGap`. Implement. Green. Build + test. Commit.
|
||||
|
||||
### Task 13: Owner re-validation on reconnect
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~3 min
|
||||
**Parallelizable with:** none (a different assertion in the same service as Task 12; sequence after Task 12)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Grpc/EventStreamService.cs` (reconnect requires caller `OwnerKeyId` == session owner → `PermissionDenied`)
|
||||
- Test: `EventStreamServiceTests.cs`
|
||||
|
||||
**Steps:** Failing test: a different API key cannot resume someone else's session. Implement. Green. Build+test. Commit.
|
||||
|
||||
### Task 14: Client `ReplayGap` handling — all 5 clients
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~5 min each (dispatch as 5 parallel sub-tasks; disjoint files)
|
||||
**Parallelizable with:** each other (14a–14e)
|
||||
|
||||
**Files (one client each):**
|
||||
- 14a dotnet: `clients/dotnet/.../` stream consumer + test
|
||||
- 14b Go: `clients/go/mxgateway/` + `go test`
|
||||
- 14c Python: `clients/python/src/.../` + `pytest`
|
||||
- 14d Rust: `clients/rust/crates/.../` + `cargo test`/clippy
|
||||
- 14e Java: `clients/java/.../` + `gradle test` (macOS JDK 21; **revert generated `MxaccessGateway.java` churn** per `project_java_generated_churn`)
|
||||
|
||||
**Steps (each):** Regenerate client stubs from the updated proto; surface `ReplayGap` to the caller (callback/return marker) so apps know to re-snapshot; test the gap path. Build+test that client. Commit per client.
|
||||
|
||||
### Task 15: Reconnect integration test (fake worker)
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs`
|
||||
|
||||
**Steps:** Stream, drop, reconnect within grace with last sequence → no gap; reconnect after ring overflow → `ReplayGap`. Build + suite. Commit.
|
||||
|
||||
---
|
||||
|
||||
## Phase 4 — Per-session ACL
|
||||
|
||||
### Task 16: gRPC session-owner gate + all-sessions admin scope
|
||||
|
||||
**Classification:** high-risk (security)
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Grpc/MxAccessGatewayService.cs` (`Invoke`/`StreamEvents`/`CloseSession` require caller key == session `OwnerKeyId`, or a key bearing a new all-sessions scope)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/` (define the all-sessions scope; map it)
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs`
|
||||
|
||||
**Steps:** Failing tests: foreign key gets `PermissionDenied` on another key's session; owner and all-sessions-scoped key succeed. Implement. Green. Build + gateway tests. Commit.
|
||||
|
||||
### Task 17: Session `Tag` + dashboard group→tag config
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~3 min
|
||||
**Parallelizable with:** Task 16 (disjoint files)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs` (+`SessionManager` to set an optional `Tag` from the open request)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Configuration/` Dashboard options (`GroupToSessionTag` map)
|
||||
- Test: config-binding test
|
||||
|
||||
**Steps:** Failing test: a session carries its tag; config map binds. Implement. Green. Build+test. Commit.
|
||||
|
||||
### Task 18: EventsHub per-session ACL + hub-token session-tag claim
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none (depends on 17)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/EventsHub.cs:39-54` (replace `TODO(per-session-acl)`: Admin sees all; Viewer allowed only if the session's tag is in the user's `GroupToSessionTag`-derived allowed set)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Dashboard/HubTokenService.cs` (mint an allowed-session-tag claim)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Dashboard/HubTokenAuthenticationHandler.cs` (carry the claim back)
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/EventsHubTests.cs`
|
||||
|
||||
> **Decision flagged in the design doc:** default is "Admin all / Viewer by tag map." If the owner chose the strict variant (Viewers see nothing unless granted), invert the default here — the executor must confirm which before implementing.
|
||||
|
||||
**Steps:** Failing tests: Viewer without the tag is refused `SubscribeSession`; Admin allowed; Viewer with the mapped tag allowed. Implement. Green. Build + dashboard tests. Commit.
|
||||
|
||||
### Task 19: ACL tests incl. live LDAP users
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs` (extend; gated `MXGATEWAY_RUN_LIVE_LDAP_TESTS=1`)
|
||||
|
||||
**Steps:** With `multi-role` (Admin) vs `gw-viewer` (Viewer), assert subscribe authorization differs by session tag. Document if skipped (no live LDAP). Commit.
|
||||
|
||||
---
|
||||
|
||||
## Phase 5 — Orphan-worker reattach (overturns the CLAUDE.md rule)
|
||||
|
||||
### Task 20: Stable gateway-instance id + stable pipe naming
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:433` (pipe name uses a persisted stable gateway-instance id instead of `Environment.ProcessId`)
|
||||
- Create: gateway-instance-id persistence (small file/SQLite row under `C:\ProgramData\MxGateway\`)
|
||||
- Test: `SessionManagerTests.cs` / a new instance-id test
|
||||
|
||||
**Steps:** Failing test: pipe name is stable across simulated restarts (same instance id). Implement. Green. Build + tests. Commit.
|
||||
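One way to get a restart-stable pipe name is to persist a generated id on first start and reuse it afterwards. The Python sketch below illustrates that idea only; the file name, the helper names, and the choice of a plain file over a SQLite row are assumptions.

```python
import tempfile
import uuid
from pathlib import Path


def load_or_create_instance_id(path):
    """Return the persisted gateway-instance id, creating it on first use."""
    path = Path(path)
    if path.exists():
        return path.read_text(encoding="utf-8").strip()
    instance_id = uuid.uuid4().hex
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(instance_id, encoding="utf-8")
    return instance_id


def pipe_name(instance_id, session_id):
    # Same shape as today's name, with the instance id in place of the pid.
    return f"mxaccess-gateway-{instance_id}-{session_id}"


with tempfile.TemporaryDirectory() as tmp:
    id_file = Path(tmp) / "gateway-instance-id"
    first = load_or_create_instance_id(id_file)
    second = load_or_create_instance_id(id_file)  # simulated gateway restart

name_before = pipe_name(first, "session-abc")
name_after = pipe_name(second, "session-abc")
```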
|
||||
### Task 21: Adoption manifest store (SQLite)
|
||||
|
||||
**Classification:** high-risk (persistence, security material)
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Create: `src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerAdoptionManifest.cs` (persist `sessionId → workerPid, nonce, ownerKeyId, pipeName`; upsert on launch, delete on clean close)
|
||||
- Modify: gateway-auth SQLite schema/migration
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/WorkerAdoptionManifestTests.cs`
|
||||
|
||||
> Nonce is security material — store it like other secrets (no plaintext logging; standing rule).
|
||||
|
||||
**Steps:** Failing test: manifest round-trips an entry; clean close removes it. Implement. Green. Build + tests. Commit.
|
||||
|
||||
### Task 22: Proto — worker adopt/reconnect frame (contract change)
|
||||
|
||||
**Classification:** high-risk (contracts → worker)
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto` (add an adopt/reconnect `WorkerEnvelope` frame: worker presents `sessionId` + `nonce`; gateway ACK/NACK)
|
||||
- Regenerate + **commit** `Generated/*` (net48 rule)
|
||||
- Test: contracts build both TFMs
|
||||
|
||||
**Steps:** Add frame, regen, commit generated, build both TFMs. Commit. **Unblocks Tasks 23–24** (the worker presents the adopt frame in Task 23; the gateway accepts it in Task 24).
|
||||
|
||||
### Task 23: Worker phone-home reconnect loop + self-terminate
|
||||
|
||||
**Classification:** high-risk (worker, net48/x86 — **windev**)
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/Ipc/WorkerPipeClient.cs` (on pipe drop: reconnect loop with bounded backoff to the stable pipe name; present the adopt frame)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/` runtime (self-terminate after `MaxOrphanLifetime` with no adoption)
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Worker.Tests/` (net48/x86 on windev)
|
||||
|
||||
**Steps:** Failing test (fake pipe server): worker retries and adopts; gives up + self-terminates past the lifetime. Build x86 + worker tests on **windev**. Commit. *(net48: no init-only/positional records.)*
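
The retry-then-give-up behavior under test can be sketched in Python as follows. This is an illustration only: the worker itself is net48 C#, and the parameter names, default delays, and injectable `clock`/`sleep` hooks are assumptions chosen so the loop is testable without real waiting.

```python
import time
from typing import Callable


def reconnect_until_adopted(
    try_adopt: Callable[[], bool],
    max_orphan_lifetime: float,
    initial_delay: float = 0.25,
    max_delay: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry try_adopt with capped exponential backoff.

    Returns True once adoption succeeds, or False when max_orphan_lifetime
    seconds elapse without adoption (the caller would then self-terminate).
    """
    deadline = clock() + max_orphan_lifetime
    delay = initial_delay
    while True:
        if try_adopt():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline, so the lifetime bound stays tight.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

Passing a fake clock and a fake sleep lets a unit test drive both outcomes deterministically, which mirrors the "fake pipe server" approach in the steps.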
|
||||
|
||||
### Task 24: Gateway adoption — re-open pipes, nonce-validate, reject impostors
|
||||
|
||||
**Classification:** high-risk (security, lifecycle — **windev** for live)
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Create: `src/ZB.MOM.WW.MxGateway.Server/Workers/OrphanWorkerAdopter.cs` (startup: read manifest, re-open pipe servers, accept adopt frames, validate nonce → adopt or reject)
|
||||
- Modify: gateway startup hosted-service order (adopter runs **before** `OrphanWorkerTerminator`; terminator handles only un-adoptable/foreign workers)
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/OrphanWorkerAdopterTests.cs`
|
||||
|
||||
**Steps:** Failing tests: matching nonce adopts and rebuilds the session; mismatched nonce is rejected and the worker terminated. Implement. Green. Build + tests. Commit.
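
The adopt-or-reject decision in those tests reduces to a small check, sketched here in Python. The function name, the dict-based manifest, and the string results are hypothetical; the constant-time comparison via `hmac.compare_digest` is a suggestion rather than something the plan specifies.

```python
import hmac
from typing import Dict


def decide_adoption(manifest: Dict[str, bytes], session_id: str,
                    presented_nonce: bytes) -> str:
    """Return "adopt" when the presented nonce matches the manifest, else "reject"."""
    expected = manifest.get(session_id)
    if expected is None:
        # Unknown session: treat the caller as a foreign worker.
        return "reject"
    # compare_digest avoids leaking how many leading bytes matched.
    if hmac.compare_digest(expected, presented_nonce):
        return "adopt"
    return "reject"
```

In the real adopter a "reject" result would also lead to terminating the worker, as the steps require; the sketch stops at the decision.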
|
||||
|
||||
### Task 25: Resync adopted worker + `ReplayGap` to subscribers
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Workers/OrphanWorkerAdopter.cs` (after adoption, `GetSessionState`/`GetWorkerInfo` to resync; reattached subscribers get `ReplayGap` since the ring is gone)
|
||||
- Test: `OrphanWorkerAdopterTests.cs`
|
||||
|
||||
**Steps:** Failing test: adopted session reports resynced state; a resuming subscriber receives `ReplayGap`. Implement. Green. Build + tests. Commit.
|
||||
|
||||
### Task 26: `EnableOrphanReattach` flag (default off) + terminator fallback
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~3 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Configuration/WorkerOptions.cs` (`EnableOrphanReattach`, default `false`)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Workers/OrphanWorkerTerminator.cs` (unchanged default behavior when reattach disabled)
|
||||
- Test: `OrphanWorkerTerminatorTests.cs` / adopter test
|
||||
|
||||
**Steps:** Failing test: with the flag off, startup terminates (today's behavior); on, it adopts. Implement. Green. Build + tests. Commit.
|
||||
|
||||
### Task 27: Gateway-restart reattach round-trip (integration, **windev** + live worker)
|
||||
|
||||
**Classification:** high-risk
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs` (gated `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1`)
|
||||
|
||||
**Steps:** Open session → simulate gateway restart → adopter re-adopts the surviving worker → session usable → subscriber gets `ReplayGap` then live events. Run on **windev** with live MXAccess. Document if skipped.
|
||||
|
||||
### Task 28: Documented-rule reversals + stillpending refresh
|
||||
|
||||
**Classification:** trivial (doc-only)
|
||||
**Estimated implement time:** ~3 min
|
||||
**Parallelizable with:** none (final)
|
||||
|
||||
**Files:**
|
||||
- Modify: `CLAUDE.md` (line ~77 — reattach now supported, opt-in/bounded)
|
||||
- Modify: `docs/DesignDecisions.md` (`:63-73` reconnect, `:75-80` multi-subscriber, reattach rationale → mark superseded with this design's date/commit)
|
||||
- Modify: `gateway.md` (post-v1 revisit items — reflect what shipped)
|
||||
- Modify: `stillpending.md` (§2 items: mark fan-out/reconnect/ACL/reattach Resolved with commit refs)
|
||||
- Modify: `docs/GatewayConfiguration.md` (new options: `MaxEventSubscribersPerSession`, `ReplayBufferCapacity`, `ReplayRetentionSeconds`, `DetachGraceSeconds`, `GroupToSessionTag`, `EnableOrphanReattach`, `MaxOrphanLifetime`)
|
||||
|
||||
**Steps:** Edit docs to match shipped behavior. Commit.
|
||||
|
||||
---
|
||||
|
||||
## Verification matrix
|
||||
|
||||
| Phase | Build/test | Host |
|---|---|---|
| 1–4 (gateway, clients) | `dotnet build` + gateway/fake-worker tests; per-client `go test`, `pytest`, `cargo test`, `gradle test`, `dotnet test` | local (macOS) |
| 3/5 proto changes | regen + commit `Generated/`; build contracts both TFMs; rebuild touched clients | local |
| 5 worker (net48/x86) | `dotnet build -p:Platform=x86` + `Worker.Tests` | **windev** |
| 5 live reattach + Phase-4 LDAP | opt-in gated integration tests | **windev** / live LDAP |
|
||||
|
||||
## Final integration review
|
||||
|
||||
After all tasks: dispatch a final integration reviewer over `git diff main..HEAD` focusing on the live event path, concurrency in `SessionEventDistributor`, security gates (ACL + nonce adoption), and the three documented-rule reversals. Then use superpowers-extended-cc:finishing-a-development-branch.
|
||||
@@ -0,0 +1,34 @@
|
||||
{
  "planPath": "docs/plans/2026-06-15-session-resilience.md",
  "tasks": [
    {"id": 108, "subject": "Task 1: Add OwnerKeyId to the session", "status": "pending"},
    {"id": 109, "subject": "Task 2: SessionEventDistributor skeleton", "status": "pending", "blockedBy": [108]},
    {"id": 110, "subject": "Task 3: Bounded replay ring buffer", "status": "pending", "blockedBy": [109]},
    {"id": 111, "subject": "Task 4: Rewire AttachEventSubscriber + EventStreamService onto distributor", "status": "pending", "blockedBy": [110]},
    {"id": 112, "subject": "Task 5: Per-subscriber backpressure isolation", "status": "pending", "blockedBy": [111]},
    {"id": 113, "subject": "Task 6: Dashboard broadcaster becomes a distributor subscriber", "status": "pending", "blockedBy": [111]},
    {"id": 114, "subject": "Task 7: Remove validator block + add subscriber cap option", "status": "pending", "blockedBy": [112]},
    {"id": 115, "subject": "Task 8: Subscriber-lease collection + cap enforcement", "status": "pending", "blockedBy": [114]},
    {"id": 116, "subject": "Task 9: Multi-subscriber end-to-end test (FakeWorkerHarness)", "status": "pending", "blockedBy": [115]},
    {"id": 117, "subject": "Task 10: Proto - ReplayGap signal", "status": "pending", "blockedBy": [116]},
    {"id": 118, "subject": "Task 11: Detach-grace session retention", "status": "pending", "blockedBy": [117]},
    {"id": 119, "subject": "Task 12: Replay-on-reconnect + emit ReplayGap", "status": "pending", "blockedBy": [118, 110]},
    {"id": 120, "subject": "Task 13: Owner re-validation on reconnect", "status": "pending", "blockedBy": [119, 108]},
    {"id": 121, "subject": "Task 14: Client ReplayGap handling - all 5 clients", "status": "pending", "blockedBy": [117]},
    {"id": 122, "subject": "Task 15: Reconnect integration test (fake worker)", "status": "pending", "blockedBy": [119]},
    {"id": 123, "subject": "Task 16: gRPC session-owner gate + all-sessions admin scope", "status": "pending", "blockedBy": [116, 108]},
    {"id": 124, "subject": "Task 17: Session Tag + dashboard group-to-tag config", "status": "pending", "blockedBy": [116]},
    {"id": 125, "subject": "Task 18: EventsHub per-session ACL + hub-token tag claim", "status": "pending", "blockedBy": [124]},
    {"id": 126, "subject": "Task 19: ACL tests incl. live LDAP users", "status": "pending", "blockedBy": [125]},
    {"id": 127, "subject": "Task 20: Stable gateway-instance id + stable pipe naming", "status": "pending", "blockedBy": [126]},
    {"id": 128, "subject": "Task 21: Adoption manifest store (SQLite)", "status": "pending", "blockedBy": [127]},
    {"id": 129, "subject": "Task 22: Proto - worker adopt/reconnect frame", "status": "pending", "blockedBy": [128]},
    {"id": 130, "subject": "Task 23: Worker phone-home reconnect loop + self-terminate", "status": "pending", "blockedBy": [129]},
    {"id": 131, "subject": "Task 24: Gateway adoption - re-open pipes, nonce-validate, reject impostors", "status": "pending", "blockedBy": [130]},
    {"id": 132, "subject": "Task 25: Resync adopted worker + ReplayGap to subscribers", "status": "pending", "blockedBy": [131, 119]},
    {"id": 133, "subject": "Task 26: EnableOrphanReattach flag (default off) + terminator fallback", "status": "pending", "blockedBy": [131]},
    {"id": 134, "subject": "Task 27: Gateway-restart reattach round-trip (WINDEV + live worker)", "status": "pending", "blockedBy": [132, 133]},
    {"id": 135, "subject": "Task 28: Documented-rule reversals + stillpending refresh", "status": "pending", "blockedBy": [134]}
  ],
  "lastUpdated": "2026-06-15"
}
|
||||
@@ -0,0 +1,152 @@
|
||||
# Still-Pending Completion — Design
|
||||
|
||||
**Date:** 2026-06-15
|
||||
**Source:** `stillpending.md` (audit at commit `c7f754c`)
|
||||
**Branch:** `feat/stillpending-completion`
|
||||
**Status:** Design approved; implementation plan to follow.
|
||||
|
||||
## Goal
|
||||
|
||||
Close the genuinely actionable items in `stillpending.md`: the 11 unimplemented
worker command kinds (§1.1), audit-record CorrelationId threading (§1.2), the
client CLI/helper parity gaps (§4), and the documentation/residual-recording
hygiene (§7). Items that are deliberate v1 scope (§2), vendor- or rig-gated
(§1.3, §1.4, §3), or opt-in verification gates (§5) are documented, not built.
|
||||
|
||||
## Key discovery that shapes the work
|
||||
|
||||
All 11 worker commands **already** have proto request *and* reply messages,
gateway-side validation (`MxAccessGrpcRequestValidator.cs:86-111`), scope
mapping (`GatewayGrpcScopeResolver.cs:45-54`), and generic pass-through routing
(`MxAccessGatewayService.Invoke`). Therefore:
|
||||
|
||||
- **No `.proto` changes** — no codegen, no net48 `CS0246` regen risk.
|
||||
- **No gateway routing changes** for the 11 commands.
|
||||
- The real code surface is: worker executor arms, six new COM-wrapper methods,
|
||||
the constraint-path CorrelationId thread, and client CLI/helper additions.
|
||||
|
||||
8 of the 11 have dedicated reply messages; 3 (`SetBufferedUpdateInterval`,
|
||||
`Ping`, `ShutdownWorker`) intentionally return the base OK reply.
|
||||
|
||||
## Architecture
|
||||
|
||||
Two-process design is unchanged. Work lands in five independent workstreams.
|
||||
|
||||
### Workstream A — Worker control/lifecycle commands (5)
|
||||
|
||||
`Ping`, `GetSessionState`, `GetWorkerInfo`, `DrainEvents`, `ShutdownWorker`.
|
||||
No COM involved. New arms in `MxAccessCommandExecutor.Execute`
|
||||
(`src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs:97-128`).
|
||||
|
||||
The executor currently holds only COM collaborators, so it gains the runtime
|
||||
state these commands report on:
|
||||
|
||||
- `DrainEvents` → `MxAccessEventQueue.Drain(maxEvents)` (already exists;
|
||||
`maxEvents == 0` drains all) → `DrainEventsReply { repeated MxEvent events }`.
|
||||
- `Ping` → echoes `PingCommand.message`; base OK reply.
|
||||
- `GetSessionState` → `SessionStateReply { SessionState state }` from the
|
||||
runtime/heartbeat snapshot.
|
||||
- `GetWorkerInfo` → `WorkerInfoReply { worker_process_id, worker_version,
|
||||
mxaccess_progid, mxaccess_clsid }` from `MxAccessInteropInfo` + process info.
|
||||
- `ShutdownWorker` → honor `grace_period` then signal `StaRuntime` shutdown;
|
||||
base OK reply.
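
The `DrainEvents` semantics in the list above (`maxEvents == 0` drains everything, otherwise at most `maxEvents`) can be modeled in a few lines of Python. This is a sketch of the contract only, with a plain `deque` standing in for `MxAccessEventQueue` and strings standing in for events; it makes no claim about how the existing C# `Drain` is written.

```python
from collections import deque
from typing import Deque, List


def drain(queue: Deque[str], max_events: int) -> List[str]:
    """Remove and return up to max_events items in FIFO order; 0 means drain all."""
    limit = len(queue) if max_events == 0 else min(max_events, len(queue))
    return [queue.popleft() for _ in range(limit)]
```

Treating `0` as "no limit" keeps the request message free of an extra flag, at the cost of making a literal zero-event drain inexpressible.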
|
||||
|
||||
Where the control arms physically live (inside `MxAccessCommandExecutor` with
|
||||
injected collaborators, vs. intercepted one layer up where the runtime context
|
||||
already exists) is an implementation decision for the plan; the contract surface
|
||||
is identical either way.
|
||||
|
||||
### Workstream B — Worker MXAccess COM commands (6)
|
||||
|
||||
`Suspend`, `Activate`, `AuthenticateUser`, `ArchestrAUserToId`,
|
||||
`AddBufferedItem`, `SetBufferedUpdateInterval`. Each needs:
|
||||
|
||||
1. A wrapper method on `IMxAccessServer`
|
||||
(`src/ZB.MOM.WW.MxGateway.Worker/MxAccess/IMxAccessServer.cs`) and dispatch in
|
||||
`MxAccessComServer` (`MxAccessComServer.cs`). **Open question to resolve on
|
||||
windev:** which native interface (`ILMXProxyServer` / `…3` / `…4`) exposes
|
||||
each method — confirm against the interop and `docs/MXAccess-Public-API.md`
|
||||
before writing the dispatch.
|
||||
2. An executor arm mapping request → wrapper call → reply
|
||||
(`SuspendReply`/`ActivateReply` carry `MxStatusProxy`; `AuthenticateUserReply`/
|
||||
`ArchestrAUserToIdReply` carry `user_id`; `AddBufferedItemReply` carries
|
||||
`item_handle`; `SetBufferedUpdateInterval` returns base OK).
|
||||
|
||||
`AuthenticateUser` credentials must never reach logs (standing rule).
`AddBufferedItem`/`SetBufferedUpdateInterval` enable the already-wired
`OnBufferedDataChange` event path (`MxAccessEventMapper.cs:231-254`); per the
approved decision we **verify the buffered round-trip live on windev**,
capturing a real multi-sample batch to validate the §3.2 conversion path
(`Conversion/VariantConverter.cs`).
|
||||
|
||||
### Workstream C — §1.2 audit CorrelationId
|
||||
|
||||
Thread `request.ClientCorrelationId` from `MxAccessGatewayService.Invoke` →
`ApplyConstraintsAsync` → the six filter helpers (`EnforceReadTagAsync`,
`EnforceWriteHandleAsync`, `FilterTagBulkAsync`, `FilterReadBulkAsync`,
`FilterWriteBulkAsync`, `FilterHandleBulkAsync`) →
`IConstraintEnforcer.RecordDenialAsync`, which gains a `correlationId`
parameter.
|
||||
|
||||
**Type detail:** `AuditEvent.CorrelationId` is `Guid?`, but `ClientCorrelationId`
|
||||
is a free-form string. The enforcer does `Guid.TryParse` and stores the value
|
||||
when parseable, else null. No audit-schema or contract change. Remove the
|
||||
`TODO(Task 2.3)` at `ConstraintEnforcer.cs:134-136`.
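
The parse-or-null rule from the type detail above, modeled in Python with `uuid.UUID` standing in for `Guid.TryParse`. The function name is hypothetical, and the two parsers do not accept exactly the same set of input formats, so this shows the shape of the rule rather than its exact edge cases.

```python
import uuid
from typing import Optional


def parse_correlation_id(client_correlation_id: Optional[str]) -> Optional[uuid.UUID]:
    """Return a UUID when the free-form string parses as one, else None."""
    if not client_correlation_id:
        return None
    try:
        return uuid.UUID(client_correlation_id.strip())
    except ValueError:
        # Free-form ids that are not GUIDs are dropped, not rejected.
        return None
```

Returning `None` instead of raising means a client that sends a non-GUID correlation id still gets its denial audited, just without the correlation link.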
|
||||
|
||||
### Workstream D — Client CLI/helper parity (5 clients)
|
||||
|
||||
- **Go** `Write2` single-session helper, modeled on `Write` (`session.go:559`).
|
||||
- **Python CLI** — add the 4 `galaxy-*` commands wrapping existing `galaxy.py`
|
||||
library methods (`test_connection`, `get_last_deploy_time`,
|
||||
`discover_hierarchy`, `watch_deploy_events`).
|
||||
- **`ping` CLI** — add to Go (`cmd/mxgw-go/main.go`) and Java (`MxGatewayCli.java`).
|
||||
- **`browse` CLI** — add to **all 5**, wrapping each client's existing
|
||||
`LazyBrowseNode`/`Browse` helper, taking `browse` parity from 0/5 to 5/5.
|
||||
- **Galaxy name unification** — add canonical `galaxy-test-connection` /
|
||||
`galaxy-last-deploy` to Java, keeping `galaxy-test` / `galaxy-deploy-time` as
|
||||
**deprecated aliases** (no script breakage).
|
||||
- **`version` in dotnet** — the explorer found a `version` path at
|
||||
`MxGatewayClientCli.cs:85` that conflicts with the audit's §4.4. Treat as
|
||||
**verify-then-fix-only-if-genuinely-missing**.
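
The deprecated-alias approach for the Galaxy command names can be sketched as a lookup table. The sketch is Python for brevity, while the change itself lands in the Java CLI; the function name is hypothetical, and emitting a deprecation warning is an assumption (the design only requires that the old names keep working).

```python
import warnings

# Old command name -> canonical name, taken from the bullet above.
DEPRECATED_ALIASES = {
    "galaxy-test": "galaxy-test-connection",
    "galaxy-deploy-time": "galaxy-last-deploy",
}


def resolve_command(name: str) -> str:
    """Map a deprecated alias to its canonical command; pass other names through."""
    canonical = DEPRECATED_ALIASES.get(name)
    if canonical is None:
        return name
    warnings.warn(
        f"'{name}' is deprecated; use '{canonical}'",
        DeprecationWarning,
        stacklevel=2,
    )
    return canonical
```

Resolving aliases before dispatch means each command has a single handler, so the old and new names cannot drift apart.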
|
||||
|
||||
### Workstream E — Docs/hygiene (§7) + residual recording
|
||||
|
||||
- D1 plan header (`docs/plans/2026-06-14-deferred-followups.md:4`).
|
||||
- Stale STA "production fix needed" prose (`docs/AlarmClientDiscovery.md:765-774`).
|
||||
- Stale EventsHub follow-up comment (`Dashboard/Hubs/EventsHub.cs:9-17`).
|
||||
- CLAUDE.md project-name drift (`MxGateway.*` → `ZB.MOM.WW.MxGateway.*`).
|
||||
- Remove dead `MapSqlException` (`GalaxyRepositoryGrpcService.cs:350-360`).
|
||||
- Record §1.3 (live failover counter unproven on rig) and §1.4 (8-arg ack drops
|
||||
operator domain/full-name; vendor stub) as explicitly-documented residuals.
|
||||
- Update `stillpending.md` to reflect what closed.
|
||||
|
||||
## Data flow / error handling specifics
|
||||
|
||||
- Control commands return base/typed replies with `ProtocolStatusCode.Ok`;
|
||||
failures map to the existing `CreateInvalidRequestReply` /
|
||||
`CreateAlarmFailureReply` helpers, never a thrown exception across the STA.
|
||||
- COM commands surface the native `HResult` on the reply exactly as the other
|
||||
COM arms do — MXAccess parity means we do not "fix" surprising returns
|
||||
(e.g. `AuthenticateUser` is allowed to fail; the live test tolerates it).
|
||||
- `ShutdownWorker` must not deadlock the STA: signal shutdown, return the reply,
|
||||
then let the pump drain — sequencing is a plan-level concern.
|
||||
|
||||
## Testing & cross-platform verification
|
||||
|
||||
| Workstream | Build/test | Host |
|---|---|---|
| A, B (worker) | `Worker.Tests` xUnit with a fake `IMxAccessServer` asserting each arm calls the right wrapper and maps the reply | **windev** (net48/x86) |
| B (live) | `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1` smoke incl. buffered capture | **windev** + live MXAccess |
| C (gateway) | unit test: denied op persists parsed `CorrelationId` | local |
| D (clients) | `go test`, `pytest`, `cargo test`+clippy, `gradle test`, `dotnet test` | local; Java on windev |
| E (docs) | doc-only; `--check` regen where applicable | local |
|
||||
|
||||
TDD throughout, per-task commits, docs updated in the same change as source.
|
||||
|
||||
## Out of scope (documented, not built)
|
||||
|
||||
- §1.3 live-drive (the rig can't drive a real failover).
- §1.4 actual delivery, plus §3.4/§3.5 (the AVEVA `AlarmAckByName` stub returns -55; `AlarmAckByGUID` is `E_NOTIMPL`).
- §3.1, the §3.2 conversion fix, §3.3, §3.6, and §3.7 (await live captures).
- All of §2 (deliberate v1 scope, incl. validator-blocked multi-subscriber).
- §5 (opt-in verification gates).
- §7.6 (`Won't Fix` review findings).
|
||||
@@ -0,0 +1,334 @@
|
||||
# Still-Pending Completion Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans (or subagent-driven-development) to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Close the actionable items in `stillpending.md` — 11 unimplemented worker command kinds (§1.1), audit CorrelationId threading (§1.2), client CLI/helper parity (§4), and doc hygiene (§7).
|
||||
|
||||
**Architecture:** Two-process gateway/worker design is unchanged. All 11 worker commands already have proto request+reply messages, gateway validation, scope mapping, and generic pass-through routing — so the work is **worker executor arms + 6 new COM-wrapper methods + a gateway constraint-path CorrelationId thread + client CLI additions**. **Zero `.proto` changes**, therefore no codegen and no net48 regen risk.
|
||||
|
||||
**Tech Stack:** .NET 10 (gateway, x64), .NET Framework 4.8 (worker, x86, MXAccess COM on STA), Go/Python/Rust/Java/.NET clients. Worker net48/x86 + Java client build/test on Windows host `windev` (10.100.0.48, passwordless ssh, PowerShell); everything else builds locally on macOS.
|
||||
|
||||
**Design source:** `docs/plans/2026-06-15-stillpending-completion-design.md`.
|
||||
|
||||
**Branch:** `feat/stillpending-completion` (already created).
|
||||
|
||||
---
|
||||
|
||||
## Cross-platform build reference (read before any worker/Java task)
|
||||
|
||||
- **Worker (net48/x86) + Worker.Tests + Java client** do NOT build on macOS. Build/test them on `windev`:
|
||||
- Copy the working tree (or use the existing build worktree pattern) to `windev`, `git fetch && git reset --hard origin/<branch>` in the build worktree (NEVER trust a stale local `main` — see memory `project_deploy_mechanics`).
|
||||
- Build: `dotnet build src/ZB.MOM.WW.MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86`
|
||||
- Test: `dotnet test src/ZB.MOM.WW.MxGateway.Worker.Tests/MxGateway.Worker.Tests.csproj -p:Platform=x86`
|
||||
- Live MXAccess: set `$env:MXGATEWAY_RUN_LIVE_MXACCESS_TESTS = "1"` then run the IntegrationTests filter.
|
||||
- Nested ssh→PowerShell mangles quotes; scp a `.ps1` and run `powershell -NoProfile -ExecutionPolicy Bypass -File`. Wrap git in `cmd /c "git ... 2>&1"`.
|
||||
- **net48 worker C#:** no init-only props / positional records (no `IsExternalInit`); use `{ get; set; }` or ctors (memory `project_net48_worker_csharp`).
|
||||
- **Gateway, .NET client, Go, Rust, Python** build+test locally on macOS.
|
||||
|
||||
---
|
||||
|
||||
## Workstream A — Worker control/lifecycle commands (5)
|
||||
|
||||
These add arms to `MxAccessCommandExecutor.Execute` (`src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs:90-129`). They need runtime state the executor does not currently hold. **Task A0 establishes how the executor reaches that state; do it first.**
|
||||
|
||||
### Task A0: Decide & wire control-command collaborators into the executor
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none (A1–A5 depend on it)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs` (constructor + fields)
|
||||
- Read first: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessEventQueue.cs` (`Drain(uint)`:168, `Count`:58), `WorkerRuntimeHeartbeatSnapshot.cs`, `src/ZB.MOM.WW.MxGateway.Worker/Sta/StaRuntime.cs` (`IsRunning`:80, `Shutdown()`), `MxAccessInteropInfo.cs` (progid/clsid), the executor's existing construction site (grep `new MxAccessCommandExecutor(`)
|
||||
|
||||
**What to do:** The 5 control commands need: the event queue (DrainEvents), a session-state source (GetSessionState), worker identity — pid/version/progid/clsid (GetWorkerInfo), and a shutdown signal (ShutdownWorker). Determine the cleanest seam:
|
||||
- **Preferred:** inject the collaborators the executor lacks (event queue reference, a `Func<SessionState>` or the session object, `MxAccessInteropInfo`, and a shutdown delegate/`Action`) via the constructor, matching how its existing COM collaborator is passed.
|
||||
- If the executor's construction site shows control commands are better intercepted one layer up (where `StaRuntime`/session context already lives), surface that to the controller before proceeding — do NOT silently relocate the dispatch.
|
||||
|
||||
**Acceptance:** executor compiles on windev with new collaborators available to A1–A5; no behavior change yet (arms still fall through). Commit.
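
The preferred seam (constructor injection with no behavior change yet) can be pictured with a small Python model. Every name here is hypothetical and the dictionary of arms merely stands in for the C# `Execute` switch; it is meant to show the acceptance condition, not the real class.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Reply:
    status: str
    payload: object = None


class CommandExecutor:
    """Sketch: control-command collaborators arrive through the constructor."""

    def __init__(
        self,
        event_queue: List[str],
        get_session_state: Callable[[], str],
        worker_info: Dict[str, str],
        request_shutdown: Callable[[], None],
    ) -> None:
        self._event_queue = event_queue
        self._get_session_state = get_session_state
        self._worker_info = worker_info
        self._request_shutdown = request_shutdown
        # No arms registered yet: every control kind still falls through.
        self._arms: Dict[str, Callable[[], Reply]] = {}

    def execute(self, kind: str) -> Reply:
        """Dispatch to a registered arm, or report an invalid request."""
        arm = self._arms.get(kind)
        if arm is None:
            return Reply("INVALID_REQUEST")
        return arm()
```

With the collaborators already held as fields, each later task only has to add one arm that reads from them, which keeps the A1–A5 diffs small.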
|
||||
|
||||
> Note: A1–A5 are sequential edits to the same `Execute` switch + helper region of one file, so they are NOT parallelizable with each other. Bundle their review.
|
||||
|
||||
### Task A1: Ping
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~3 min
|
||||
**Parallelizable with:** none (same file as A0/A2-A5)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs` (add `MxCommandKind.Ping` arm + `ExecutePing`)
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/MxAccessCommandExecutorTests.cs` (or the existing executor test file — grep to confirm name)
|
||||
|
||||
**Step 1 — failing test:** assert that `Execute` with a `Ping` command (`PingCommand { Message = "hi" }`) returns `ProtocolStatusCode.Ok` with `Hresult == 0`. `Ping` has no dedicated reply message, so the assertion targets the base reply's OK status; additionally assert the echoed message only if the base reply carries it (for example in a diagnostic field). Build/test on windev.
|
||||
|
||||
**Step 2 — run, expect FAIL** (currently INVALID_REQUEST).
|
||||
|
||||
**Step 3 — implement:** add `MxCommandKind.Ping => ExecutePing(command),` to the switch (`:99-126` region). `ExecutePing` returns `CreateOkReply(command)` (helper at `:784`).
|
||||
|
||||
**Step 4 — run, expect PASS** on windev.
|
||||
|
||||
**Step 5 — commit:** `feat(worker): implement Ping command`
|
||||
|
||||
### Task A2: DrainEvents
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~4 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:**
|
||||
- Modify: `MxAccessCommandExecutor.cs` (`DrainEvents` arm + `ExecuteDrainEvents`)
|
||||
- Test: executor test file
|
||||
|
||||
**Steps (TDD):** test that `DrainEvents { MaxEvents = N }` drains up to N from the injected `MxAccessEventQueue` and returns `DrainEventsReply { events = [...] }` (reply field 102). `MaxEvents == 0` drains all. Map each `WorkerEvent` → `MxEvent` using the existing event-mapping path (grep how the live event loop converts `WorkerEvent`→`MxEvent`; reuse, do not duplicate). Build/test windev. Commit `feat(worker): implement DrainEvents command`.
|
||||
|
||||
### Task A3: GetSessionState
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~3 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:** `MxAccessCommandExecutor.cs` + executor test.
|
||||
|
||||
**Steps:** test that `GetSessionState` returns `SessionStateReply { State = <current> }` (reply field 100) mapping the worker's lifecycle to the proto `SessionState` enum (READY when the STA is running). Build/test windev. Commit `feat(worker): implement GetSessionState command`.
|
||||
|
||||
### Task A4: GetWorkerInfo
|
||||
|
||||
**Classification:** small
|
||||
**Estimated implement time:** ~3 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:** `MxAccessCommandExecutor.cs` + executor test.
|
||||
|
||||
**Steps:** test that `GetWorkerInfo` returns `WorkerInfoReply { WorkerProcessId, WorkerVersion, MxaccessProgid, MxaccessClsid }` (reply field 101) sourced from `Process.GetCurrentProcess().Id`, the worker assembly version, and `MxAccessInteropInfo` (progid `LMXProxy.LMXProxyServer.1`, clsid `{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}`). Build/test windev. Commit `feat(worker): implement GetWorkerInfo command`.
|
||||
|
||||
### Task A5: ShutdownWorker
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none
|
||||
|
||||
**Files:** `MxAccessCommandExecutor.cs` + executor test.
|
||||
|
||||
**Steps:** test that `ShutdownWorker { GracePeriod }` returns a base OK reply and triggers the injected shutdown signal **after** the reply is produced (must not deadlock the STA — signal shutdown, return reply, let the pump drain). Verify the grace period is honored (or documented as best-effort). Build/test windev. Commit `feat(worker): implement ShutdownWorker command`.
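
The required ordering (signal shutdown, return the reply, then let the pump wind down) is sketched below in Python, with a single-threaded loop standing in for the STA message pump. The function and callback names are hypothetical, and the grace period is deliberately not modeled.

```python
import queue
from typing import Callable


def run_pump(
    commands: "queue.Queue[str]",
    send_reply: Callable[[str], None],
    on_shutdown: Callable[[], None],
) -> None:
    """Single-threaded command pump standing in for the STA loop."""
    shutdown_requested = False
    while not shutdown_requested:
        command = commands.get()
        if command == "ShutdownWorker":
            # Only record the request here; tearing down inline would
            # prevent the reply below from ever being sent.
            shutdown_requested = True
        send_reply(f"OK:{command}")
    # Teardown runs only after the reply went out and the loop exited.
    on_shutdown()
```

Because the handler only sets a flag, the reply to `ShutdownWorker` is always the last reply sent, and commands queued behind it are left unprocessed.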
|
||||
|
||||
### Task A6: Make FakeWorkerHarness respond to control commands
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none (depends on A1–A5 reply shapes)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/Fakes/FakeWorkerHarness.cs`
|
||||
- Test: a gateway-side test that invokes `Ping`/`GetWorkerInfo`/`DrainEvents` through the harness and asserts the reply (builds locally on macOS).
|
||||
|
||||
**Why:** the audit (§1.1) flagged that control kinds were "exercised only through `FakeWorkerHarness`" but the harness is a passive relay that does not auto-respond — so gateway tests could not actually cover them. Add canned responses so the gateway↔worker round-trip for these commands is verified in the default (no-MXAccess) suite. Commit `test(gateway): fake worker responds to control commands`.
|
||||
|
||||
---
|
||||
|
||||
## Workstream B — Worker MXAccess COM commands (6)
|
||||
|
||||
`Suspend`, `Activate`, `AuthenticateUser`, `ArchestrAUserToId`, `AddBufferedItem`, `SetBufferedUpdateInterval`. **Task B0 (windev interop inspection) MUST run first** — the native interface exposing each method is unknown until inspected.
|
||||
|
||||
### Task B0: Resolve native COM signatures on windev
|
||||
|
||||
**Classification:** standard
|
||||
**Estimated implement time:** ~5 min (investigation)
|
||||
**Parallelizable with:** A-workstream tasks (different files/host activity)
|
||||
|
||||
**Files:**
|
||||
- Read on windev: the generated interop for `ArchestrA.MXAccess.dll` (the `ILMXProxyServer` / `ILMXProxyServer3` / `ILMXProxyServer4` RCW definitions), `C:\Users\dohertj2\Desktop\mxaccess\docs\MXAccess-Public-API.md` (method list/signatures).
|
||||
- Output: a short note appended to this plan (or a comment block) recording, for each of the 6 methods, which interface version exposes it and its exact signature.
|
||||
|
||||
**What to do:** Confirm the exact native signatures for `Suspend(int serverHandle, int itemHandle)`, `Activate(int serverHandle, int itemHandle)`, `AuthenticateUser(int serverHandle, string verifyUser, string verifyUserPassword)` → user id, `ArchestrAUserToId(int serverHandle, string userIdGuid)` → user id, `AddBufferedItem(int serverHandle, string itemDefinition, string itemContext)` → item handle, `SetBufferedUpdateInterval(int serverHandle, int intervalMs)`. If any method is **not** present on the installed interop (mirroring the §3.4/§3.5 vendor-stub pattern for alarms), STOP and surface it — implement only the available ones and record the rest as vendor-gated residuals. Commit the note.
|
||||
|
||||
### Task B1: Add 6 wrapper methods to IMxAccessServer + MxAccessComServer
|
||||
|
||||
**Classification:** high-risk
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** none (blocked by B0)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/IMxAccessServer.cs` (add 6 method declarations)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessComServer.cs` (dispatch to the interface version resolved in B0, mirroring existing methods like `Write2`:173, `AddItem2`:84)
|
||||
- Modify: any fake/test `IMxAccessServer` implementation (grep `: IMxAccessServer`) to add the 6 methods (return canned values).
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/...` for `MxAccessComServer` if one exists.
|
||||
|
||||
**Steps (TDD):** add the 6 declarations; implement dispatch following the existing version-selection pattern; update fakes so the solution compiles. Build on windev `-p:Platform=x86`. Commit `feat(worker): add MXAccess COM wrappers for suspend/activate/auth/buffered`.
|
||||
|
||||
> B2–B7 are sequential edits to the same `Execute` switch; not parallelizable with each other. Bundle review.
|
||||
|
||||
### Task B2: Suspend arm
|
||||
|
||||
**Classification:** small · **~3 min** · **Parallelizable with:** none
|
||||
**Files:** `MxAccessCommandExecutor.cs` + executor test.
|
||||
TDD: `Suspend { ServerHandle, ItemHandle }` calls the wrapper and returns `SuspendReply { Status = MxStatusProxy }` (reply field 24). Use a fake `IMxAccessServer` asserting the call. Build/test windev. Commit `feat(worker): implement Suspend command`.
|
||||
|
||||
### Task B3: Activate arm
|
||||
|
||||
**Classification:** small · **~3 min** · **Parallelizable with:** none
|
||||
**Files:** `MxAccessCommandExecutor.cs` + executor test.
|
||||
TDD: `Activate { ServerHandle, ItemHandle }` → `ActivateReply { Status }` (field 25). Build/test windev. Commit `feat(worker): implement Activate command`.
|
||||
|
||||
### Task B4: AuthenticateUser arm
|
||||
|
||||
**Classification:** standard · **~4 min** · **Parallelizable with:** none
|
||||
**Files:** `MxAccessCommandExecutor.cs` + executor test.
|
||||
TDD: `AuthenticateUser { ServerHandle, VerifyUser, VerifyUserPassword }` → `AuthenticateUserReply { UserId }` (field 26). **Credentials must never be logged** (standing rule) — assert no log statement includes the password. AuthenticateUser is allowed to fail (surface the native HResult, do not throw). Build/test windev. Commit `feat(worker): implement AuthenticateUser command`.
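The two rules in this task (a failure is surfaced as an HResult rather than thrown, and the password never reaches a log) can be sketched together. The delegate-based `authenticate` and `log` parameters, the `AuthenticateUserArm` class, and the flattened `AuthenticateUserReply` record are assumptions made to keep the example self-contained; the real executor and reply message will look different.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Simplified reply: the real protobuf message is assumed to carry more than this.
public sealed record AuthenticateUserReply(int HResult, int UserId);

public static class AuthenticateUserArm
{
    // 'authenticate' stands in for the COM wrapper call, 'log' for the worker logger.
    public static AuthenticateUserReply Execute(
        Func<string, string, int> authenticate,
        Action<string> log,
        string verifyUser,
        string verifyUserPassword)
    {
        // Only the user name is logged; the password is never formatted into a message.
        log($"AuthenticateUser requested for user '{verifyUser}'");
        try
        {
            int userId = authenticate(verifyUser, verifyUserPassword);
            return new AuthenticateUserReply(HResult: 0, UserId: userId);
        }
        catch (COMException ex)
        {
            // Rejection is an expected outcome: report the native HResult instead of throwing.
            log($"AuthenticateUser failed with HRESULT 0x{(uint)ex.HResult:X8}");
            return new AuthenticateUserReply(ex.HResult, UserId: 0);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        const string password = "s3cret-not-for-logs";
        var captured = new StringBuilder();

        AuthenticateUserReply reply = AuthenticateUserArm.Execute(
            authenticate: (_, _) => throw new COMException("rejected", unchecked((int)0x80070057)),
            log: message => captured.AppendLine(message),
            verifyUser: "Administrator",
            verifyUserPassword: password);

        if (reply.HResult != unchecked((int)0x80070057))
            throw new InvalidOperationException("HResult was not surfaced.");
        if (captured.ToString().IndexOf(password, StringComparison.Ordinal) >= 0)
            throw new InvalidOperationException("Password leaked into the log.");

        Console.WriteLine($"hresult=0x{(uint)reply.HResult:X8} user_id={reply.UserId}");
    }
}
```

Capturing log output into a buffer and asserting that the password is absent mirrors the credential check the live smoke test performs against its recorded output.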
|
||||
|
||||
### Task B5: ArchestrAUserToId arm
|
||||
|
||||
**Classification:** small · **~3 min** · **Parallelizable with:** none
|
||||
**Files:** `MxAccessCommandExecutor.cs` + executor test.
|
||||
TDD: `ArchestrAUserToId { ServerHandle, UserIdGuid }` → `ArchestrAUserToIdReply { UserId }` (field 27). Build/test windev. Commit `feat(worker): implement ArchestrAUserToId command`.
|
||||
|
||||
### Task B6: AddBufferedItem arm
|
||||
|
||||
**Classification:** standard · **~4 min** · **Parallelizable with:** none
|
||||
**Files:** `MxAccessCommandExecutor.cs` + executor test.
|
||||
TDD: `AddBufferedItem { ServerHandle, ItemDefinition, ItemContext }` → `AddBufferedItemReply { ItemHandle }` (field 23). Build/test windev. Commit `feat(worker): implement AddBufferedItem command`.
|
||||
|
||||
### Task B7: SetBufferedUpdateInterval arm
|
||||
|
||||
**Classification:** small · **~3 min** · **Parallelizable with:** none
|
||||
**Files:** `MxAccessCommandExecutor.cs` + executor test.
|
||||
TDD: `SetBufferedUpdateInterval { ServerHandle, UpdateIntervalMilliseconds }` → base OK reply (no dedicated reply message). Build/test windev. Commit `feat(worker): implement SetBufferedUpdateInterval command`.
|
||||
|
||||
### Task B8: Live COM smoke + buffered capture on windev
|
||||
|
||||
**Classification:** high-risk
|
||||
**Estimated implement time:** ~5 min (authoring; live run is manual)
|
||||
**Parallelizable with:** none (blocked by B1–B7)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs` (the existing AuthenticateUser send at ~line 919/931 should now get an OK/typed reply instead of INVALID_REQUEST; add Suspend/Activate/AddBufferedItem+SetBufferedUpdateInterval sends).
|
||||
- Possibly: `src/ZB.MOM.WW.MxGateway.Worker.Tests/Probes/` for a buffered-capture probe.
|
||||
|
||||
**Steps:** Under `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1` on windev: verify that the 4 unambiguous COM commands (`AuthenticateUser`, `ArchestrAUserToId`, `Suspend`, `Activate`) round-trip; then run `AddBufferedItem` + `SetBufferedUpdateInterval` on a real tag and **capture a multi-sample `OnBufferedDataChange` batch** to validate the §3.2 `VariantConverter` path. If the buffered conversion proves correct, record it; if it surfaces a conversion bug, STOP and report (do not silently ship). If a live buffered sample cannot be elicited on the rig, record the buffered round-trip as the documented residual (close the command gap, leave §3.2 open). Commit `test(integration): live COM command + buffered capture smoke`.
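The "record a residual instead of failing" branch amounts to a best-effort wait with a timeout. The sketch below shows one way to express that, assuming a hypothetical `TryCaptureAsync` helper built on `Task.WhenAny`; the actual smoke test may instead catch a `TimeoutException` from its own event-writer helper.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;

public static class BestEffortCapture
{
    // Returns the captured batch, or null when nothing arrived before the timeout.
    public static async Task<T?> TryCaptureAsync<T>(Task<T> pending, TimeSpan timeout)
        where T : class
    {
        Task finished = await Task.WhenAny(pending, Task.Delay(timeout)).ConfigureAwait(false);
        return finished == pending ? await pending.ConfigureAwait(false) : null;
    }
}

public static class Program
{
    public static async Task Main()
    {
        // Simulates a rig that never delivers a sample-bearing buffered batch.
        var neverArrives = new TaskCompletionSource<string>();
        string? batch = await BestEffortCapture.TryCaptureAsync(
            neverArrives.Task, TimeSpan.FromMilliseconds(200));

        Console.WriteLine(batch is null
            ? "RESIDUAL: commands round-tripped, multi-sample conversion not verified live."
            : $"Captured batch: {batch}");
    }
}
```

Returning `null` rather than throwing keeps the command round-trip assertions unconditional while leaving the event capture as an optional, clearly logged outcome.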
|
||||
|
||||
---
|
||||
|
||||
## Workstream C — §1.2 gateway audit CorrelationId
|
||||
|
||||
### Task C1: Thread ClientCorrelationId into constraint-denial audit records
|
||||
|
||||
**Classification:** high-risk
|
||||
**Estimated implement time:** ~5 min
|
||||
**Parallelizable with:** all A/B/D tasks (gateway-only files, builds locally)
|
||||
|
||||
**Files:**
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/IConstraintEnforcer.cs` (add `string? correlationId` param to `RecordDenialAsync`, signature at `:49-54`)
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/ConstraintEnforcer.cs` (`:124-158`): accept the param, `Guid.TryParse` it into the `Guid? CorrelationId` audit field (was hardcoded `null` at `:147`); remove `TODO(Task 2.3)` at `:134-136`.
|
||||
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Grpc/MxAccessGatewayService.cs`: thread `request.ClientCorrelationId` from `Invoke` (`:96`) → `ApplyConstraintsAsync` (`:279`) → the 6 filter helpers (`EnforceReadTagAsync`:427, `EnforceWriteHandleAsync`:448, `FilterTagBulkAsync`:474, `FilterReadBulkAsync`:529, `FilterWriteBulkAsync`:584, `FilterHandleBulkAsync`:656) → `RecordDenialAsync`.
|
||||
- Test: `src/ZB.MOM.WW.MxGateway.Tests/...` constraint-enforcer / gateway-service test.
|
||||
|
||||
**Step 1 — failing test:** a denied operation with `ClientCorrelationId = "<a real GUID>"` persists an audit record whose `CorrelationId` equals that GUID; a non-GUID correlation id persists `null` (documented behavior). Run locally: `dotnet test src/ZB.MOM.WW.MxGateway.Tests/MxGateway.Tests.csproj --filter <name>`.
|
||||
|
||||
**Step 2 — FAIL** (currently always null).
|
||||
|
||||
**Step 3 — implement** the threading + `Guid.TryParse`.
|
||||
|
||||
**Step 4 — PASS** locally + full gateway suite green.
|
||||
|
||||
**Step 5 — commit:** `feat(gateway): thread ClientCorrelationId into constraint-denial audit (§1.2)`
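The documented behavior (a GUID-shaped correlation id is persisted, anything else becomes `null`) reduces to one `Guid.TryParse` call. A minimal sketch follows; the helper name `ToAuditCorrelationId` and the `CorrelationIds` class are hypothetical and not taken from `ConstraintEnforcer.cs`.

```csharp
#nullable enable
using System;

public static class CorrelationIds
{
    // Returns the parsed GUID, or null when the client sent a free-form or missing id.
    public static Guid? ToAuditCorrelationId(string? clientCorrelationId) =>
        Guid.TryParse(clientCorrelationId, out Guid parsed) ? parsed : (Guid?)null;
}

public static class Program
{
    public static void Main()
    {
        Guid expected = Guid.NewGuid();

        // A GUID-shaped id survives the round trip into the audit field.
        Guid? fromGuid = CorrelationIds.ToAuditCorrelationId(expected.ToString());
        // A free-form id, or no id at all, is persisted as null.
        Guid? fromText = CorrelationIds.ToAuditCorrelationId("live-open-new-com");
        Guid? fromNull = CorrelationIds.ToAuditCorrelationId(null);

        if (fromGuid != expected || fromText is not null || fromNull is not null)
            throw new InvalidOperationException("Unexpected correlation id mapping.");

        Console.WriteLine($"guid -> {fromGuid}; free-form and missing ids -> null");
    }
}
```

`Guid.TryParse` returns `false` for a `null` input, so callers that pass no correlation id need no separate null check.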
|
||||
|
||||
---
|
||||
|
||||
## Workstream D — Client CLI/helper parity (5 clients)
|
||||
|
||||
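D tasks for different languages touch disjoint client trees and are parallelizable. Tasks that edit the same file (D3 and D5 in `clients/go/cmd/mxgw-go/main.go`; D2 and D6 in `commands.py`; D4, D8, and D9 in `MxGatewayCli.java`) will conflict if run concurrently. Each builds/tests on its own toolchain (Java on windev; the rest local).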
|
||||
|
||||
### Task D1: Go single-shot Write2 helper
|
||||
|
||||
**Classification:** small · **~3 min** · **Parallelizable with:** D2–D9
|
||||
**Files:**
|
||||
- Modify: `clients/go/mxgateway/session.go` (add `Write2`/`Write2Raw` after `Write`:559, modeled on `Write` + the `Write2Bulk`:427 payload shape)
|
||||
- Test: `clients/go/mxgateway/session_test.go` (or nearest)
|
||||
TDD: `Write2(ctx, serverHandle, itemHandle, value, timestampValue *MxValue, userID int32) error` issues `MX_COMMAND_KIND_WRITE2` with `Write2Command{ServerHandle,ItemHandle,Value,TimestampValue,UserId}`. Verify: `gofmt`, `go build ./...`, `go test ./...` from `clients/go`. Commit `feat(go): add single-shot Write2 session helper (§4.1)`.
|
||||
|
||||
### Task D2: Python galaxy-* CLI commands (4)
|
||||
|
||||
**Classification:** standard · **~5 min** · **Parallelizable with:** D1,D3–D9
|
||||
**Files:**
|
||||
- Modify: `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py` (add `galaxy-test-connection`, `galaxy-last-deploy`, `galaxy-discover`, `galaxy-watch` Click commands wrapping `galaxy.py` `test_connection`/`get_last_deploy_time`/`discover_hierarchy`/`watch_deploy_events`; mirror the existing `ping` command structure at `:221`)
|
||||
- Modify: `clients/python/README.md:217` (correct the understated galaxy CLI claim)
|
||||
- Test: `clients/python/tests/` CLI test
|
||||
TDD then `python -m pytest` from `clients/python`. Commit `feat(python): add galaxy-* CLI commands (§4.2)`.
|
||||
|
||||
### Task D3: ping CLI in Go
|
||||
|
||||
**Classification:** small · **~3 min** · **Parallelizable with:** others
|
||||
**Files:** `clients/go/cmd/mxgw-go/main.go` (add `ping` case to the switch ~`:77-130`/`:1199`, modeled on an existing simple command) + test.
|
||||
TDD; `gofmt`, `go build ./...`, `go test ./...`. Commit `feat(go): add ping CLI subcommand (§4.3)`.
|
||||
|
||||
### Task D4: ping CLI in Java
|
||||
|
||||
**Classification:** small · **~3 min** · **Parallelizable with:** others — **build on windev**
|
||||
**Files:** `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java` (register a `ping` subcommand ~`:126-149`) + test.
|
||||
TDD; `gradle test` on windev. Commit `feat(java): add ping CLI subcommand (§4.3)`.
|
||||
|
||||
### Task D5: browse CLI — Go
|
||||
|
||||
**Classification:** standard · **~4 min** · **Parallelizable with:** others
|
||||
**Files:** `clients/go/cmd/mxgw-go/main.go` (new `browse` command wrapping `GalaxyClient.Browse`:398 / `LazyBrowseNode.Expand`:337) + test. `go build/test`. Commit `feat(go): add browse CLI (§4.6)`.
|
||||
|
||||
### Task D6: browse CLI — Python
|
||||
|
||||
**Classification:** standard · **~4 min** · **Parallelizable with:** others
|
||||
**Files:** `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py` (new `browse` command wrapping `galaxy.py` `browse`:163) + test. `pytest`. Commit `feat(python): add browse CLI (§4.6)`.
|
||||
|
||||
### Task D7: browse CLI — Rust
|
||||
|
||||
**Classification:** standard · **~4 min** · **Parallelizable with:** others
|
||||
**Files:** `clients/rust/crates/mxgw-cli/src/main.rs` (new `Browse` command variant wrapping the galaxy browse helper in `galaxy.rs`) + test. `cargo fmt`, `cargo test --workspace`, `cargo clippy --all-targets -- -D warnings`. Commit `feat(rust): add browse CLI (§4.6)`.
|
||||
|
||||
### Task D8: browse CLI — Java + dotnet
|
||||
|
||||
**Classification:** standard · **~5 min** · **Parallelizable with:** others — **Java builds on windev**
|
||||
**Files:** `clients/java/.../MxGatewayCli.java` (browse subcommand wrapping `GalaxyRepositoryClient.browse`) + `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs` (`browse` command wrapping `LazyBrowseNode.ExpandAsync`:63) + tests. `gradle test` (windev), `dotnet test` (local). Commit `feat(dotnet,java): add browse CLI (§4.6) — 5/5 parity`.
|
||||
|
||||
### Task D9: Java galaxy-name aliases + verify dotnet version
|
||||
|
||||
**Classification:** small · **~4 min** · **Parallelizable with:** others — **Java builds on windev**
|
||||
**Files:**
|
||||
- Modify: `clients/java/.../MxGatewayCli.java:145-146` — add canonical `galaxy-test-connection`/`galaxy-last-deploy` as the primary names; keep `galaxy-test`/`galaxy-deploy-time` as **deprecated aliases** (picocli `@Command(name=..., aliases={...})` or equivalent).
|
||||
- Verify: `clients/dotnet/.../MxGatewayClientCli.cs` — the explorer found a `version` path at `:85` that conflicts with audit §4.4. **Read it**: if a `version` subcommand genuinely works, no change (note it in the §7 update); if it's only a `--version` flag and `IsKnownGatewayCommand` lacks `version`, add the subcommand. Do not add what already exists.
|
||||
- Test: Java CLI test asserting both names resolve.
|
||||
`gradle test` (windev), `dotnet build/test` (local). Commit `feat(java): galaxy command aliases; chore(dotnet): verify version subcommand (§4.4,§4.5)`.
|
||||
|
||||
---
|
||||
|
||||
## Workstream E — Docs/hygiene + residual recording
|
||||
|
||||
### Task E1: Doc hygiene + dead-code removal
|
||||
|
||||
**Classification:** small · **~5 min** · **Parallelizable with:** all (mostly doc-only; one code deletion)
|
||||
**Files:**
|
||||
- `docs/plans/2026-06-14-deferred-followups.md:4` — change "Plan only — NOT yet executed" to reflect D1 done (`4af24b9`).
|
||||
- `docs/AlarmClientDiscovery.md:765-774` — rewrite stale STA "production fix needed" prose (alarms now run through worker STA / `GatewayAlarmMonitor`).
|
||||
- `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/EventsHub.cs:9-17` — remove/update stale "publisher side is a follow-up" comment (broadcaster shipped).
|
||||
- `CLAUDE.md` — fix project-name drift `src/MxGateway.*` → `src/ZB.MOM.WW.MxGateway.*` throughout.
|
||||
- `src/ZB.MOM.WW.MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs:350-360` — remove dead IDE0051-suppressed `MapSqlException`.
|
||||
**Verify:** `dotnet build src/ZB.MOM.WW.MxGateway.Server` locally (the only code change is the deletion). Commit `docs+chore: fix stale prose, project names, remove dead MapSqlException (§7)`.
|
||||
|
||||
### Task E2: Record §1.3 and §1.4 residuals + refresh stillpending.md
|
||||
|
||||
**Classification:** trivial · **~3 min** · **Parallelizable with:** all (doc-only)
|
||||
**Files:**
|
||||
- `docs/plans/2026-06-14-deferred-followups.md` — record §1.3 (provider_switches counter live-exercise unproven; rig can't drive a real failover) as an explicit documented residual.
|
||||
- Add a short note (in the worker alarm code's existing comment near `WnWrapAlarmConsumer.cs:261` or the design doc) that §1.4's 8-arg ack drops domain/full-name because the AVEVA `AlarmAckByName` v2 is a vendor stub (-55) — already partly noted; make it explicit and cross-referenced.
|
||||
- `stillpending.md` — mark §1.1, §1.2, §4.1/§4.2/§4.3/§4.6 (and §4.4/§4.5 per outcome) as Resolved with commit refs; keep the documented residuals.
|
||||
Commit `docs: record §1.3/§1.4 residuals and refresh stillpending.md (§7)`.
|
||||
|
||||
---
|
||||
|
||||
## Final integration review
|
||||
|
||||
After all workstreams: run the full local suite (`dotnet test` gateway + `.NET` client, `go test`, `pytest`, `cargo test`+clippy) and the windev suite (worker net48/x86 + Java + live MXAccess smoke). Then use **superpowers-extended-cc:finishing-a-development-branch**.
|
||||
|
||||
## Dependency summary
|
||||
|
||||
- A0 → A1..A5 → A6
|
||||
- B0 → B1 → B2..B7 → B8
|
||||
- C1 independent (gateway-only, local)
|
||||
- D1..D9 independent of A/B/C and of each other (disjoint client trees)
|
||||
- E1, E2 last (they record what closed); E1 is mostly independent of the other workstreams
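A dependency chain like the ones above can be turned into execution "waves" with a small topological pass. The sketch below uses the Workstream B task ids and `blockedBy` lists from the plan's task file (87 through 95); the loop itself is a generic Kahn-style pass written for illustration, not part of the plan's tooling.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        // Task id -> prerequisite ids, transcribed from the Workstream B entries (B0..B8).
        var blockedBy = new Dictionary<int, int[]>
        {
            [87] = Array.Empty<int>(),
            [88] = new[] { 87 },
            [89] = new[] { 88 },
            [90] = new[] { 88 },
            [91] = new[] { 88 },
            [92] = new[] { 88 },
            [93] = new[] { 88 },
            [94] = new[] { 88 },
            [95] = new[] { 89, 90, 91, 92, 93, 94 },
        };

        var remaining = new HashSet<int>(blockedBy.Keys);
        var done = new HashSet<int>();
        int wave = 1;

        while (remaining.Count > 0)
        {
            // A task is ready once every prerequisite has completed.
            List<int> ready = remaining
                .Where(id => blockedBy[id].All(dep => done.Contains(dep)))
                .OrderBy(id => id)
                .ToList();

            // No ready task with work left means a cycle or an unknown prerequisite id.
            if (ready.Count == 0)
                throw new InvalidOperationException("Cycle or missing task in blockedBy.");

            Console.WriteLine($"wave {wave++}: {string.Join(", ", ready)}");
            foreach (int id in ready)
            {
                remaining.Remove(id);
                done.Add(id);
            }
        }
    }
}
```

With these ids the pass yields four waves: 87, then 88, then 89 through 94, then 95. The waves are only an upper bound on parallelism: tasks that land in the same wave but edit the same file still have to be applied one after another.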
|
||||
@@ -0,0 +1,34 @@
|
||||
{
|
||||
"planPath": "docs/plans/2026-06-15-stillpending-completion.md",
|
||||
"tasks": [
|
||||
{"id": 80, "subject": "Task A0: Wire control-command collaborators into executor", "status": "pending"},
|
||||
{"id": 81, "subject": "Task A1: Implement Ping command", "status": "pending", "blockedBy": [80]},
|
||||
{"id": 82, "subject": "Task A2: Implement DrainEvents command", "status": "pending", "blockedBy": [80]},
|
||||
{"id": 83, "subject": "Task A3: Implement GetSessionState command", "status": "pending", "blockedBy": [80]},
|
||||
{"id": 84, "subject": "Task A4: Implement GetWorkerInfo command", "status": "pending", "blockedBy": [80]},
|
||||
{"id": 85, "subject": "Task A5: Implement ShutdownWorker command", "status": "pending", "blockedBy": [80]},
|
||||
{"id": 86, "subject": "Task A6: FakeWorkerHarness responds to control commands", "status": "pending", "blockedBy": [81, 82, 84]},
|
||||
{"id": 87, "subject": "Task B0: Resolve native COM signatures on windev", "status": "pending"},
|
||||
{"id": 88, "subject": "Task B1: Add 6 COM wrapper methods (IMxAccessServer + MxAccessComServer)", "status": "pending", "blockedBy": [87]},
|
||||
{"id": 89, "subject": "Task B2: Implement Suspend arm", "status": "pending", "blockedBy": [88]},
|
||||
{"id": 90, "subject": "Task B3: Implement Activate arm", "status": "pending", "blockedBy": [88]},
|
||||
{"id": 91, "subject": "Task B4: Implement AuthenticateUser arm", "status": "pending", "blockedBy": [88]},
|
||||
{"id": 92, "subject": "Task B5: Implement ArchestrAUserToId arm", "status": "pending", "blockedBy": [88]},
|
||||
{"id": 93, "subject": "Task B6: Implement AddBufferedItem arm", "status": "pending", "blockedBy": [88]},
|
||||
{"id": 94, "subject": "Task B7: Implement SetBufferedUpdateInterval arm", "status": "pending", "blockedBy": [88]},
|
||||
{"id": 95, "subject": "Task B8: Live COM smoke + buffered capture on windev", "status": "pending", "blockedBy": [89, 90, 91, 92, 93, 94]},
|
||||
{"id": 96, "subject": "Task C1: Thread ClientCorrelationId into denial audit (§1.2)", "status": "pending"},
|
||||
{"id": 97, "subject": "Task D1: Go single-shot Write2 helper (§4.1)", "status": "pending"},
|
||||
{"id": 98, "subject": "Task D2: Python galaxy-* CLI commands (§4.2)", "status": "pending"},
|
||||
{"id": 99, "subject": "Task D3: ping CLI in Go (§4.3)", "status": "pending"},
|
||||
{"id": 100, "subject": "Task D4: ping CLI in Java (§4.3)", "status": "pending"},
|
||||
{"id": 101, "subject": "Task D5: browse CLI — Go (§4.6)", "status": "pending"},
|
||||
{"id": 102, "subject": "Task D6: browse CLI — Python (§4.6)", "status": "pending"},
|
||||
{"id": 103, "subject": "Task D7: browse CLI — Rust (§4.6)", "status": "pending"},
|
||||
{"id": 104, "subject": "Task D8: browse CLI — Java + dotnet (§4.6)", "status": "pending"},
|
||||
{"id": 105, "subject": "Task D9: Java galaxy aliases + verify dotnet version (§4.4,§4.5)", "status": "pending"},
|
||||
{"id": 106, "subject": "Task E1: Doc hygiene + dead-code removal (§7)", "status": "pending"},
|
||||
{"id": 107, "subject": "Task E2: Record §1.3/§1.4 residuals + refresh stillpending.md (§7)", "status": "pending", "blockedBy": [81, 82, 83, 84, 85, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 105]}
|
||||
],
|
||||
"lastUpdated": "2026-06-15"
|
||||
}
|
||||
@@ -562,6 +562,344 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
|
||||
Assert.DoesNotContain(verifyPassword, recordedOutput.Captured, StringComparison.Ordinal);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// B8 live verification of the COM commands that the B-bundle added and unit-tested against a fake
|
||||
/// IMxAccessServer: <c>AuthenticateUser</c>, <c>ArchestrAUserToId</c>, <c>Suspend</c>,
|
||||
/// and <c>Activate</c>. The contract being proven is that each command round-trips
|
||||
/// to the worker and back carrying a real MXAccess outcome (Ok / an MxStatusProxy /
|
||||
/// a non-zero HResult) and is NOT short-circuited to <c>INVALID_REQUEST</c> the way an
|
||||
/// unimplemented command would be. MXAccess-level rejections (a wrong item class for
|
||||
/// Suspend/Activate commonly returns 0x80070057) are parity, not test failures — we
|
||||
/// assert the reply kind plus a non-INVALID_REQUEST protocol status, and log the
|
||||
/// HResult for the record.
|
||||
/// </summary>
|
||||
[LiveMxAccessFact]
|
||||
public async Task GatewaySession_WithLiveWorker_NewComCommands_RoundTripWithRealReplies()
|
||||
{
|
||||
string workerExecutablePath = IntegrationTestEnvironment.ResolveLiveMxAccessWorkerExecutablePath();
|
||||
Assert.True(
|
||||
File.Exists(workerExecutablePath),
|
||||
$"Live MXAccess worker executable was not found at {workerExecutablePath}. Build the worker or set {IntegrationTestEnvironment.LiveMxAccessWorkerExecutableVariableName}.");
|
||||
|
||||
// Credential-redaction: AuthenticateUser carries a password. Route every test
|
||||
// log surface through the buffering helper so the post-run assertion proves the
|
||||
// password never reached the gateway logger, worker stdout/stderr, or any
|
||||
// WriteLine the test body issued (same pattern as the WriteSecured parity test).
|
||||
RecordingTestOutputHelper recordedOutput = new(output);
|
||||
TestWorkerProcessFactory processFactory = new(recordedOutput);
|
||||
await using GatewayServiceFixture fixture = new(workerExecutablePath, processFactory, recordedOutput);
|
||||
|
||||
string? sessionId = null;
|
||||
(string verifyUser, string verifyPassword) = ResolveLiveMxAccessSecuredCredentials();
|
||||
|
||||
try
|
||||
{
|
||||
OpenSessionReply openReply = await fixture.Service.OpenSession(
|
||||
new OpenSessionRequest
|
||||
{
|
||||
ClientSessionName = "live-mxaccess-new-com-commands",
|
||||
ClientCorrelationId = "live-open-new-com",
|
||||
CommandTimeout = Duration.FromTimeSpan(CommandTimeout),
|
||||
},
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
|
||||
sessionId = openReply.SessionId;
|
||||
Assert.Equal(ProtocolStatusCode.Ok, openReply.ProtocolStatus.Code);
|
||||
|
||||
MxCommandReply registerReply = await fixture.Service.Invoke(
|
||||
CreateRegisterRequest(sessionId),
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
LogReplyTo(recordedOutput, "Register", registerReply);
|
||||
Assert.Equal(ProtocolStatusCode.Ok, registerReply.ProtocolStatus.Code);
|
||||
int serverHandle = registerReply.Register.ServerHandle;
|
||||
|
||||
MxCommandReply addItemReply = await fixture.Service.Invoke(
|
||||
CreateAddItemRequest(sessionId, serverHandle),
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
LogReplyTo(recordedOutput, "AddItem", addItemReply);
|
||||
Assert.Equal(ProtocolStatusCode.Ok, addItemReply.ProtocolStatus.Code);
|
||||
int itemHandle = addItemReply.AddItem.ItemHandle;
|
||||
|
||||
// AuthenticateUser — the B-bundle command under live verification. Before the
|
||||
// B-bundle this command was unimplemented and the worker short-circuited it to
|
||||
// INVALID_REQUEST. It must now produce a real reply (Ok with a user id when the
|
||||
// provider accepts the credential, or a real MXAccess HResult when it does not).
|
||||
MxCommandReply authReply = await fixture.Service.Invoke(
|
||||
CreateAuthenticateUserRequest(sessionId, serverHandle, verifyUser, verifyPassword),
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
recordedOutput.WriteLine(
|
||||
$"AuthenticateUser status={authReply.ProtocolStatus.Code} hresult={authReply.Hresult} user_id={authReply.AuthenticateUser?.UserId}");
|
||||
Assert.Equal(MxCommandKind.AuthenticateUser, authReply.Kind);
|
||||
Assert.NotEqual(ProtocolStatusCode.InvalidRequest, authReply.ProtocolStatus.Code);
|
||||
Assert.True(
|
||||
authReply.ProtocolStatus.Code is ProtocolStatusCode.Ok or ProtocolStatusCode.MxaccessFailure,
|
||||
$"AuthenticateUser must surface a real MXAccess outcome, got {authReply.ProtocolStatus.Code}.");
|
||||
int authenticatedUserId =
|
||||
authReply.ProtocolStatus.Code == ProtocolStatusCode.Ok && authReply.AuthenticateUser is not null
|
||||
? authReply.AuthenticateUser.UserId
|
||||
: 0;
|
||||
if (authReply.ProtocolStatus.Code == ProtocolStatusCode.Ok)
|
||||
{
|
||||
// On the dev rig AuthenticateUser("Administrator","") resolves to user id 1.
|
||||
// Don't pin the exact value (provider/user-store dependent) — just prove a
|
||||
// success carried a usable, non-zero ArchestrA user id through the reply.
|
||||
Assert.NotEqual(0, authenticatedUserId);
|
||||
}
|
||||
|
||||
// ArchestrAUserToId — resolves an ArchestrA user GUID to an integer user id.
|
||||
// We feed an empty/placeholder GUID: the value is provider-dependent, so the
|
||||
// assertion is the parity one (real reply, never INVALID_REQUEST). A non-zero
|
||||
// HResult here is the expected MXAccess rejection of an unknown GUID.
|
||||
MxCommandReply userToIdReply = await fixture.Service.Invoke(
|
||||
CreateArchestrAUserToIdRequest(sessionId, serverHandle, userIdGuid: string.Empty),
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
LogReplyTo(recordedOutput, "ArchestrAUserToId", userToIdReply);
|
||||
recordedOutput.WriteLine($"ArchestrAUserToId user_id={userToIdReply.ArchestraUserToId?.UserId}");
|
||||
Assert.Equal(MxCommandKind.ArchestraUserToId, userToIdReply.Kind);
|
||||
Assert.NotEqual(ProtocolStatusCode.InvalidRequest, userToIdReply.ProtocolStatus.Code);
|
||||
Assert.True(
|
||||
userToIdReply.ProtocolStatus.Code is ProtocolStatusCode.Ok or ProtocolStatusCode.MxaccessFailure,
|
||||
$"ArchestrAUserToId must surface a real MXAccess outcome, got {userToIdReply.ProtocolStatus.Code}.");
|
||||
if (userToIdReply.ProtocolStatus.Code == ProtocolStatusCode.Ok)
|
||||
{
|
||||
// On the dev rig ArchestrAUserToId with a valid GUID resolves to user_id=1.
|
||||
// Don't pin the exact id (provider-dependent) — just prove the Ok path carried
|
||||
// a usable non-zero ArchestrA user id through the reply payload.
|
||||
Assert.NotNull(userToIdReply.ArchestraUserToId);
|
||||
Assert.NotEqual(0, userToIdReply.ArchestraUserToId.UserId);
|
||||
}
|
||||
|
||||
// Suspend / Activate against the item added above. The dev-rig TestInt item class
|
||||
// may not be suspendable (MXAccess returns 0x80070057 / E_INVALIDARG for a
|
||||
// wrong item class — see B8 notes). That is MXAccess parity: assert the reply
|
||||
// kind and a non-INVALID_REQUEST status, surface the HResult and MxStatusProxy
|
||||
// for the record, and do NOT treat a provider-side rejection as a test failure.
|
||||
MxCommandReply suspendReply = await fixture.Service.Invoke(
|
||||
CreateSuspendRequest(sessionId, serverHandle, itemHandle),
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
LogReplyTo(recordedOutput, "Suspend", suspendReply);
|
||||
recordedOutput.WriteLine(
|
||||
$"Suspend status_proxy success={suspendReply.Suspend?.Status?.Success} hresult=0x{(uint)suspendReply.Hresult:X8}");
|
||||
Assert.Equal(MxCommandKind.Suspend, suspendReply.Kind);
|
||||
Assert.NotEqual(ProtocolStatusCode.InvalidRequest, suspendReply.ProtocolStatus.Code);
|
||||
Assert.True(
|
||||
suspendReply.ProtocolStatus.Code is ProtocolStatusCode.Ok or ProtocolStatusCode.MxaccessFailure,
|
||||
$"Suspend must surface a real MXAccess outcome, got {suspendReply.ProtocolStatus.Code}.");
|
||||
|
||||
MxCommandReply activateReply = await fixture.Service.Invoke(
|
||||
CreateActivateRequest(sessionId, serverHandle, itemHandle),
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
LogReplyTo(recordedOutput, "Activate", activateReply);
|
||||
recordedOutput.WriteLine(
|
||||
$"Activate status_proxy success={activateReply.Activate?.Status?.Success} hresult=0x{(uint)activateReply.Hresult:X8}");
|
||||
Assert.Equal(MxCommandKind.Activate, activateReply.Kind);
|
||||
Assert.NotEqual(ProtocolStatusCode.InvalidRequest, activateReply.ProtocolStatus.Code);
|
||||
Assert.True(
|
||||
activateReply.ProtocolStatus.Code is ProtocolStatusCode.Ok or ProtocolStatusCode.MxaccessFailure,
|
||||
$"Activate must surface a real MXAccess outcome, got {activateReply.ProtocolStatus.Code}.");
|
||||
}
|
||||
finally
|
||||
{
|
||||
await ShutDownAsync(fixture, processFactory, sessionId, streamTask: null).ConfigureAwait(false);
|
||||
}
|
||||
|
||||
// Credential contract: the AuthenticateUser password must never reach any log
|
||||
// surface (gateway logger, worker stdout/stderr, or test WriteLine).
|
||||
Assert.DoesNotContain(verifyPassword, recordedOutput.Captured, StringComparison.Ordinal);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// B8 §3.2 buffered-data path: adds a BUFFERED item (<c>AddBufferedItem</c>), sets the
|
||||
/// buffered update interval (<c>SetBufferedUpdateInterval</c>), advises it, then attempts
|
||||
/// to observe an <see cref="MxEventFamily.OnBufferedDataChange"/> event carrying multiple
|
||||
/// samples so the worker's multi-sample conversion (VariantConverter →
|
||||
/// OnBufferedDataChangeEvent quality/timestamp arrays) can be validated live.
|
||||
/// <para>
|
||||
/// The AddBufferedItem + SetBufferedUpdateInterval round-trips are asserted unconditionally
|
||||
/// (they are the B-bundle commands under verification). The buffered EVENT capture is
|
||||
/// best-effort: if the rig's object logic does not drive a buffered batch within the live
|
||||
/// event timeout (the same environmental limitation seen with the externally-undrivable
|
||||
/// alarm rig), the test records the buffered conversion as an unverified residual rather
|
||||
/// than failing — the command path is proven, the live multi-sample conversion is not.
|
||||
/// When a batch IS captured, the converted value and quality/timestamp arrays are asserted
|
||||
/// to be non-empty and internally consistent (no crash, no dropped payload).
|
||||
/// </para>
|
||||
/// </summary>
|
||||
[LiveMxAccessFact]
|
||||
public async Task GatewaySession_WithLiveWorker_BufferedItem_AddsSetsIntervalAndAttemptsCapture()
|
||||
{
|
||||
string workerExecutablePath = IntegrationTestEnvironment.ResolveLiveMxAccessWorkerExecutablePath();
|
||||
Assert.True(
|
||||
File.Exists(workerExecutablePath),
|
||||
$"Live MXAccess worker executable was not found at {workerExecutablePath}. Build the worker or set {IntegrationTestEnvironment.LiveMxAccessWorkerExecutableVariableName}.");
|
||||
|
||||
TestWorkerProcessFactory processFactory = new(output);
|
||||
await using GatewayServiceFixture fixture = new(workerExecutablePath, processFactory, output);
|
||||
using RecordingServerStreamWriter<MxEvent> eventWriter = new();
|
||||
|
||||
string? sessionId = null;
|
||||
Task? streamTask = null;
|
||||
using CancellationTokenSource streamCancellation = new();
|
||||
|
||||
// AddBufferedItem takes (item_definition, item_context) like AddItem2. The dev rig
|
||||
// exposes TestChildObject.TestInt; the buffered form is item="TestInt",
|
||||
// context="TestChildObject" (per B8 notes). Split the configured live item so a
|
||||
// custom MXGATEWAY_LIVE_MXACCESS_ITEM override still works.
|
||||
(string bufferedItem, string bufferedContext) = SplitLiveItemForBuffered(IntegrationTestEnvironment.LiveMxAccessItem);
|
||||
|
||||
try
|
||||
{
|
||||
OpenSessionReply openReply = await fixture.Service.OpenSession(
|
||||
new OpenSessionRequest
|
||||
{
|
||||
ClientSessionName = "live-mxaccess-buffered",
|
||||
ClientCorrelationId = "live-open-buffered",
|
||||
CommandTimeout = Duration.FromTimeSpan(CommandTimeout),
|
||||
},
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
|
||||
sessionId = openReply.SessionId;
|
||||
Assert.Equal(ProtocolStatusCode.Ok, openReply.ProtocolStatus.Code);
|
||||
|
||||
streamTask = fixture.Service.StreamEvents(
|
||||
new StreamEventsRequest { SessionId = sessionId },
|
||||
eventWriter,
|
||||
new TestServerCallContext(streamCancellation.Token));
|
||||
|
||||
MxCommandReply registerReply = await fixture.Service.Invoke(
|
||||
CreateRegisterRequest(sessionId),
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
LogReply("Register", registerReply);
|
||||
Assert.Equal(ProtocolStatusCode.Ok, registerReply.ProtocolStatus.Code);
|
||||
int serverHandle = registerReply.Register.ServerHandle;
|
||||
|
||||
// SetBufferedUpdateInterval first so the buffered cadence is established before
|
||||
// the item is added/advised. MXAccess rounds to 100ms units and rejects < 1.
|
||||
MxCommandReply intervalReply = await fixture.Service.Invoke(
|
||||
CreateSetBufferedUpdateIntervalRequest(sessionId, serverHandle, updateIntervalMilliseconds: 1000),
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
LogReply("SetBufferedUpdateInterval", intervalReply);
|
||||
Assert.Equal(MxCommandKind.SetBufferedUpdateInterval, intervalReply.Kind);
|
||||
Assert.NotEqual(ProtocolStatusCode.InvalidRequest, intervalReply.ProtocolStatus.Code);
|
||||
Assert.Equal(ProtocolStatusCode.Ok, intervalReply.ProtocolStatus.Code);
|
||||
|
||||
// AddBufferedItem — must return a real item handle (the dev rig yields handle 1).
|
||||
MxCommandReply addBufferedReply = await fixture.Service.Invoke(
|
||||
CreateAddBufferedItemRequest(sessionId, serverHandle, bufferedItem, bufferedContext),
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
LogReply("AddBufferedItem", addBufferedReply);
|
||||
Assert.Equal(MxCommandKind.AddBufferedItem, addBufferedReply.Kind);
|
||||
Assert.NotEqual(ProtocolStatusCode.InvalidRequest, addBufferedReply.ProtocolStatus.Code);
|
||||
Assert.Equal(ProtocolStatusCode.Ok, addBufferedReply.ProtocolStatus.Code);
|
||||
Assert.NotNull(addBufferedReply.AddBufferedItem);
|
||||
int bufferedItemHandle = addBufferedReply.AddBufferedItem.ItemHandle;
|
||||
Assert.True(bufferedItemHandle > 0, "AddBufferedItem must yield a usable item handle.");
|
||||
|
||||
MxCommandReply adviseReply = await fixture.Service.Invoke(
|
||||
CreateAdviseRequest(sessionId, serverHandle, bufferedItemHandle),
|
||||
new TestServerCallContext()).ConfigureAwait(false);
|
||||
LogReply("Advise(buffered)", adviseReply);
|
||||
Assert.Equal(ProtocolStatusCode.Ok, adviseReply.ProtocolStatus.Code);
|
||||
|
||||
+            // Best-effort capture of a SAMPLE-BEARING buffered batch.
+            //
+            // Live observation (B8): immediately after Advise the provider delivers an
+            // initial OnBufferedDataChange with data_type=NoData / raw_data_type=0 and zero
+            // quality+timestamp samples — the buffered analogue of the bad-quality/
+            // registration-state bootstrap event the OnDataChange tests skip with their
+            // family-match predicate. That empty bootstrap is parity, NOT a dropped payload:
+            // the converter ran without crashing and there were simply no samples to carry.
+            // We therefore match only a batch that actually carries samples, so a real
+            // multi-sample conversion can be validated and the empty bootstrap is skipped
+            // rather than mistaken for a defect.
+            MxEvent? bufferedBatch = null;
+            try
+            {
+                bufferedBatch = await eventWriter
+                    .WaitForMessageAsync(
+                        candidate => candidate.Family == MxEventFamily.OnBufferedDataChange
+                            && candidate.ServerHandle == serverHandle
+                            && candidate.ItemHandle == bufferedItemHandle
+                            && candidate.OnBufferedDataChange is not null
+                            && (CountArrayElements(candidate.OnBufferedDataChange.QualityValues) > 0
+                                || CountArrayElements(candidate.OnBufferedDataChange.TimestampValues) > 0),
+                        IntegrationTestEnvironment.LiveMxAccessEventTimeout,
+                        streamCancellation.Token)
+                    .ConfigureAwait(false);
+            }
+            catch (TimeoutException)
+            {
+                bufferedBatch = null;
+            }
+
+            // Whether or not a sample-bearing batch arrived, record what buffered events the
+            // rig DID deliver (typically just the empty NoData bootstrap) for the record.
+            int bootstrapBufferedEvents = CountMatchingEvents(
+                eventWriter,
+                e => e.Family == MxEventFamily.OnBufferedDataChange
+                    && e.ServerHandle == serverHandle
+                    && e.ItemHandle == bufferedItemHandle);
+
+            if (bufferedBatch is null)
+            {
+                // RESIDUAL (documented): the command path (AddBufferedItem +
+                // SetBufferedUpdateInterval + Advise) is proven and the buffered EVENT plumbing
+                // is live (the empty NoData bootstrap arrives and converts without crashing),
+                // but the rig did not drive a sample-bearing buffered batch within the timeout
+                // — the same environmental limitation as the externally-undrivable alarm rig.
+                // The §3.2 OnBufferedDataChange MULTI-SAMPLE conversion therefore remains
+                // unverified live. This is environmental, not a defect — let the test pass.
+                output.WriteLine(
+                    "B8 RESIDUAL: AddBufferedItem/SetBufferedUpdateInterval/Advise round-tripped and "
+                    + $"{bootstrapBufferedEvents} OnBufferedDataChange event(s) arrived (empty NoData "
+                    + "bootstrap, converted without crash/drop), but no sample-bearing buffered batch "
+                    + $"was observed within {IntegrationTestEnvironment.LiveMxAccessEventTimeout}. Live "
+                    + "§3.2 multi-sample conversion remains unverified (rig object logic may not drive "
+                    + "buffered samples on demand).");
+
+                // The residual claim ("empty NoData bootstrap arrives and converts without crashing")
+                // is only meaningful if at least one OnBufferedDataChange event arrived. Assert that
+                // the buffered subscription registered and the worker's event plumbing fired at all.
+                Assert.True(
+                    bootstrapBufferedEvents > 0,
+                    "No OnBufferedDataChange event arrived at all after Advise; the buffered subscription may not have registered.");
+                return;
+            }
+
+            // A SAMPLE-BEARING buffered batch was captured — validate the §3.2 conversion.
+            LogEvent(bufferedBatch);
+            OnBufferedDataChangeEvent body = bufferedBatch.OnBufferedDataChange;
+            Assert.NotNull(body);
+
+            int qualityCount = CountArrayElements(body.QualityValues);
+            int timestampCount = CountArrayElements(body.TimestampValues);
+            output.WriteLine(
+                $"B8 CAPTURED buffered batch: data_type={body.DataType} raw_data_type={body.RawDataType} "
+                + $"quality_samples={qualityCount} timestamp_samples={timestampCount} "
+                + $"value_kind={bufferedBatch.Value?.KindCase}");
+
+            // The predicate guaranteed at least one sample; the converted aggregate value
+            // must also exist (no crash, no dropped payload).
+            Assert.True(
+                qualityCount > 0 || timestampCount > 0,
+                "Sample-bearing OnBufferedDataChange lost its samples after the predicate matched.");
+            Assert.NotNull(bufferedBatch.Value);
+
+            // When MXAccess delivers parallel quality + timestamp arrays the converted
+            // arrays must agree in length; a mismatch is a real conversion defect (a sample
+            // was dropped on one side).
+            if (qualityCount > 0 && timestampCount > 0)
+            {
+                Assert.Equal(qualityCount, timestampCount);
+            }
+        }
+        finally
+        {
+            streamCancellation.Cancel();
+            await ShutDownAsync(fixture, processFactory, sessionId, streamTask).ConfigureAwait(false);
+        }
+    }
+
     /// <summary>
     /// Verifies that killing the worker process marks the session
     /// <see cref="SessionState.Faulted"/> with a clean fault classification — the gateway
@@ -939,6 +1277,113 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
         };
     }
 
+    private static MxCommandRequest CreateArchestrAUserToIdRequest(
+        string sessionId,
+        int serverHandle,
+        string userIdGuid)
+    {
+        return new MxCommandRequest
+        {
+            SessionId = sessionId,
+            ClientCorrelationId = "live-archestra-user-to-id",
+            Command = new MxCommand
+            {
+                Kind = MxCommandKind.ArchestraUserToId,
+                ArchestraUserToId = new ArchestrAUserToIdCommand
+                {
+                    ServerHandle = serverHandle,
+                    UserIdGuid = userIdGuid,
+                },
+            },
+        };
+    }
+
+    private static MxCommandRequest CreateAddBufferedItemRequest(
+        string sessionId,
+        int serverHandle,
+        string itemDefinition,
+        string itemContext)
+    {
+        return new MxCommandRequest
+        {
+            SessionId = sessionId,
+            ClientCorrelationId = "live-add-buffered-item",
+            Command = new MxCommand
+            {
+                Kind = MxCommandKind.AddBufferedItem,
+                AddBufferedItem = new AddBufferedItemCommand
+                {
+                    ServerHandle = serverHandle,
+                    ItemDefinition = itemDefinition,
+                    ItemContext = itemContext,
+                },
+            },
+        };
+    }
+
+    private static MxCommandRequest CreateSetBufferedUpdateIntervalRequest(
+        string sessionId,
+        int serverHandle,
+        int updateIntervalMilliseconds)
+    {
+        return new MxCommandRequest
+        {
+            SessionId = sessionId,
+            ClientCorrelationId = "live-set-buffered-update-interval",
+            Command = new MxCommand
+            {
+                Kind = MxCommandKind.SetBufferedUpdateInterval,
+                SetBufferedUpdateInterval = new SetBufferedUpdateIntervalCommand
+                {
+                    ServerHandle = serverHandle,
+                    UpdateIntervalMilliseconds = updateIntervalMilliseconds,
+                },
+            },
+        };
+    }
+
+    private static MxCommandRequest CreateSuspendRequest(
+        string sessionId,
+        int serverHandle,
+        int itemHandle)
+    {
+        return new MxCommandRequest
+        {
+            SessionId = sessionId,
+            ClientCorrelationId = "live-suspend",
+            Command = new MxCommand
+            {
+                Kind = MxCommandKind.Suspend,
+                Suspend = new SuspendCommand
+                {
+                    ServerHandle = serverHandle,
+                    ItemHandle = itemHandle,
+                },
+            },
+        };
+    }
+
+    private static MxCommandRequest CreateActivateRequest(
+        string sessionId,
+        int serverHandle,
+        int itemHandle)
+    {
+        return new MxCommandRequest
+        {
+            SessionId = sessionId,
+            ClientCorrelationId = "live-activate",
+            Command = new MxCommand
+            {
+                Kind = MxCommandKind.Activate,
+                Activate = new ActivateCommand
+                {
+                    ServerHandle = serverHandle,
+                    ItemHandle = itemHandle,
+                },
+            },
+        };
+    }
+
     private static MxCommandRequest CreateWriteSecuredRequest(
         string sessionId,
         int serverHandle,
@@ -978,6 +1423,50 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
         return (verifyUser, verifyPassword);
     }
 
+    /// <summary>
+    /// Splits a dotted MXAccess reference (e.g. "TestChildObject.TestInt") into the
+    /// (item_definition, item_context) pair AddBufferedItem expects — attribute name and
+    /// owning object. An undotted reference is passed through with an empty context.
+    /// </summary>
+    private static (string Item, string Context) SplitLiveItemForBuffered(string liveItem)
+    {
+        int lastDot = liveItem.LastIndexOf('.');
+        if (lastDot < 0 || lastDot >= liveItem.Length - 1)
+        {
+            return (liveItem, string.Empty);
+        }
+
+        string context = liveItem[..lastDot];
+        string item = liveItem[(lastDot + 1)..];
+        return (item, context);
+    }
+
+    /// <summary>
+    /// Counts the elements in a converted buffered <see cref="MxArray"/> across whichever
+    /// typed-array oneof case the VariantConverter populated, so the buffered-capture
+    /// assertions are independent of the rig item's element type.
+    /// </summary>
+    private static int CountArrayElements(MxArray? array)
+    {
+        if (array is null)
+        {
+            return 0;
+        }
+
+        return array.ValuesCase switch
+        {
+            MxArray.ValuesOneofCase.BoolValues => array.BoolValues.Values.Count,
+            MxArray.ValuesOneofCase.Int32Values => array.Int32Values.Values.Count,
+            MxArray.ValuesOneofCase.Int64Values => array.Int64Values.Values.Count,
+            MxArray.ValuesOneofCase.FloatValues => array.FloatValues.Values.Count,
+            MxArray.ValuesOneofCase.DoubleValues => array.DoubleValues.Values.Count,
+            MxArray.ValuesOneofCase.StringValues => array.StringValues.Values.Count,
+            MxArray.ValuesOneofCase.TimestampValues => array.TimestampValues.Values.Count,
+            MxArray.ValuesOneofCase.RawValues => array.RawValues.Values.Count,
+            _ => 0,
+        };
+    }
+
     private static int CountMatchingEvents(
         RecordingServerStreamWriter<MxEvent> writer,
         Func<MxEvent, bool> predicate)
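The splitting rule documented on `SplitLiveItemForBuffered` can be exercised in isolation. The following is a minimal sketch, assuming a hypothetical standalone class `BufferedItemReference` (not part of the test file) that mirrors the helper's logic: the text after the last dot becomes the item, the text before it becomes the context, and a reference with no dot or with a trailing dot is passed through unchanged.

```csharp
using System;

// Hypothetical standalone copy of the helper's logic, for illustration only.
public static class BufferedItemReference
{
    public static (string Item, string Context) Split(string liveItem)
    {
        int lastDot = liveItem.LastIndexOf('.');

        // No dot at all, or a trailing dot with nothing after it:
        // pass the reference through with an empty context.
        if (lastDot < 0 || lastDot >= liveItem.Length - 1)
        {
            return (liveItem, string.Empty);
        }

        // Attribute name is everything after the last dot;
        // the owning object is everything before it.
        return (liveItem[(lastDot + 1)..], liveItem[..lastDot]);
    }
}
```

Under these assumptions, `Split("TestChildObject.TestInt")` gives `("TestInt", "TestChildObject")`, while `Split("TestInt")` and `Split("Obj.")` both return the input unchanged with an empty context.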
@@ -1607,6 +2096,7 @@ public sealed class WorkerLiveMxAccessSmokeTests(ITestOutputHelper output)
             string commandKind,
             string target,
             ConstraintFailure failure,
+            string? correlationId,
             CancellationToken cancellationToken) => Task.CompletedTask;
     }
 

@@ -203,6 +203,7 @@ public sealed class GatewayAlarmMonitor : BackgroundService, IGatewayAlarmServic
         GatewaySession session = await _sessionManager.OpenSessionAsync(
                 new SessionOpenRequest(BackendName, MonitorClientName, Guid.NewGuid().ToString("N"), CommandTimeout: null),
                 MonitorClientName,
+                ownerKeyId: null,
                 stoppingToken)
             .ConfigureAwait(false);
         lock (_sync) { _session = session; }

@@ -2,4 +2,6 @@ namespace ZB.MOM.WW.MxGateway.Server.Configuration;
 
 public sealed record EffectiveEventConfiguration(
     int QueueCapacity,
-    string BackpressurePolicy);
+    string BackpressurePolicy,
+    int ReplayBufferCapacity,
+    double ReplayRetentionSeconds);

@@ -6,4 +6,5 @@ public sealed record EffectiveSessionConfiguration(
     int MaxPendingCommandsPerSession,
     int DefaultLeaseSeconds,
     int LeaseSweepIntervalSeconds,
-    bool AllowMultipleEventSubscribers);
+    bool AllowMultipleEventSubscribers,
+    int MaxEventSubscribersPerSession);

@@ -11,4 +11,20 @@ public sealed class EventOptions
     /// Gets the backpressure policy for event queue overflow.
     /// </summary>
     public EventBackpressurePolicy BackpressurePolicy { get; init; } = EventBackpressurePolicy.FailFast;
+
+    /// <summary>
+    /// Gets the maximum number of events retained in the per-session replay ring buffer
+    /// used to re-deliver events a returning subscriber missed (reconnect/reattach).
+    /// When the buffer exceeds this count the oldest retained events are evicted first.
+    /// A value of <c>0</c> disables replay retention entirely.
+    /// </summary>
+    public int ReplayBufferCapacity { get; init; } = 1024;
+
+    /// <summary>
+    /// Gets the maximum age, in seconds, of an event retained in the per-session replay
+    /// ring buffer. Entries older than this are evicted regardless of capacity. A value
+    /// of <c>0</c> disables age-based eviction (only <see cref="ReplayBufferCapacity"/>
+    /// bounds the buffer).
+    /// </summary>
+    public double ReplayRetentionSeconds { get; init; } = 300;
 }

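The two replay options bound the buffer along different dimensions: count and age. The following is a minimal sketch of how such a buffer could behave, assuming a hypothetical `ReplayRing<T>` type that is not the gateway's actual implementation. It follows only what the doc comments state: oldest-first eviction when the capacity is reached, age-based eviction regardless of capacity, a capacity of 0 disabling retention, and a retention of 0 disabling age-based eviction.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a capacity- and age-bounded replay buffer.
// Not the gateway's implementation; names and shape are assumptions.
public sealed class ReplayRing<T>
{
    private readonly Queue<(DateTimeOffset At, T Item)> _items = new();
    private readonly int _capacity;        // 0 disables retention entirely
    private readonly TimeSpan _retention;  // zero disables age-based eviction

    public ReplayRing(int capacity, double retentionSeconds)
    {
        if (capacity < 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        if (retentionSeconds < 0) throw new ArgumentOutOfRangeException(nameof(retentionSeconds));
        _capacity = capacity;
        _retention = TimeSpan.FromSeconds(retentionSeconds);
    }

    public int Count => _items.Count;

    public void Add(T item, DateTimeOffset now)
    {
        if (_capacity == 0)
        {
            return; // replay retention disabled
        }

        EvictExpired(now);
        if (_items.Count == _capacity)
        {
            _items.Dequeue(); // evict the oldest retained entry first
        }

        _items.Enqueue((now, item));
    }

    public IReadOnlyList<T> Snapshot(DateTimeOffset now)
    {
        EvictExpired(now);
        var result = new List<T>(_items.Count);
        foreach ((DateTimeOffset _, T item) in _items)
        {
            result.Add(item);
        }

        return result;
    }

    private void EvictExpired(DateTimeOffset now)
    {
        if (_retention == TimeSpan.Zero)
        {
            return; // only the capacity bounds the buffer
        }

        while (_items.Count > 0 && now - _items.Peek().At > _retention)
        {
            _items.Dequeue();
        }
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real buffer would more likely read a time provider internally and would need synchronization if the pump and subscribers touch it from different threads.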
@@ -46,10 +46,13 @@ public sealed class GatewayConfigurationProvider(IOptions<GatewayOptions> option
                 MaxPendingCommandsPerSession: value.Sessions.MaxPendingCommandsPerSession,
                 DefaultLeaseSeconds: value.Sessions.DefaultLeaseSeconds,
                 LeaseSweepIntervalSeconds: value.Sessions.LeaseSweepIntervalSeconds,
-                AllowMultipleEventSubscribers: value.Sessions.AllowMultipleEventSubscribers),
+                AllowMultipleEventSubscribers: value.Sessions.AllowMultipleEventSubscribers,
+                MaxEventSubscribersPerSession: value.Sessions.MaxEventSubscribersPerSession),
             Events: new EffectiveEventConfiguration(
                 QueueCapacity: value.Events.QueueCapacity,
-                BackpressurePolicy: value.Events.BackpressurePolicy.ToString()),
+                BackpressurePolicy: value.Events.BackpressurePolicy.ToString(),
+                ReplayBufferCapacity: value.Events.ReplayBufferCapacity,
+                ReplayRetentionSeconds: value.Events.ReplayRetentionSeconds),
             Dashboard: new EffectiveDashboardConfiguration(
                 Enabled: value.Dashboard.Enabled,
                 AllowAnonymousLocalhost: value.Dashboard.AllowAnonymousLocalhost,

@@ -177,12 +177,10 @@ public sealed class GatewayOptionsValidator : OptionsValidatorBase<GatewayOption
             options.LeaseSweepIntervalSeconds,
             "MxGateway:Sessions:LeaseSweepIntervalSeconds must be greater than zero.",
             builder);
-
-        if (options.AllowMultipleEventSubscribers)
-        {
-            builder.Add(
-                "MxGateway:Sessions:AllowMultipleEventSubscribers is not supported until event fan-out is implemented.");
-        }
+        AddIfNotPositive(
+            options.MaxEventSubscribersPerSession,
+            "MxGateway:Sessions:MaxEventSubscribersPerSession must be greater than zero.",
+            builder);
     }
 
     private static void ValidateEvents(EventOptions options, ValidationBuilder builder)
@@ -193,6 +191,16 @@ public sealed class GatewayOptionsValidator : OptionsValidatorBase<GatewayOption
         {
             builder.Add("MxGateway:Events:BackpressurePolicy must be a supported backpressure policy.");
         }
+
+        // ReplayBufferCapacity and ReplayRetentionSeconds are bounds on the replay ring
+        // buffer; 0 is a valid value (disables that dimension), so only negatives fail.
+        AddIfNegative(
+            options.ReplayBufferCapacity,
+            "MxGateway:Events:ReplayBufferCapacity must be greater than or equal to zero.",
+            builder);
+        builder.RequireThat(
+            options.ReplayRetentionSeconds >= 0,
+            "MxGateway:Events:ReplayRetentionSeconds must be greater than or equal to zero.");
     }
 
     private static void ValidateDashboard(DashboardOptions options, ValidationBuilder builder)

@@ -27,4 +27,11 @@ public sealed class SessionOptions
     /// Gets a value indicating whether multiple event subscribers are allowed per session.
     /// </summary>
     public bool AllowMultipleEventSubscribers { get; init; }
+
+    /// <summary>
+    /// Gets the maximum number of concurrent event subscribers per session.
+    /// Applies when <see cref="AllowMultipleEventSubscribers"/> is <see langword="true"/>;
+    /// effectively 1 when it is <see langword="false"/>. Must be greater than zero.
+    /// </summary>
+    public int MaxEventSubscribersPerSession { get; init; } = 8;
 }

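The interaction between the two session options reduces to one rule, sketched below. `SubscriberLimitSketch.EffectiveLimit` is a hypothetical helper, not a method from the gateway; it only restates the doc comment (the configured maximum applies when multiple subscribers are allowed, otherwise the limit is 1) together with the validator's requirement that the maximum be positive.

```csharp
using System;

// Hypothetical helper restating the documented rule; not part of the gateway.
public static class SubscriberLimitSketch
{
    public static int EffectiveLimit(bool allowMultipleEventSubscribers, int maxEventSubscribersPerSession)
    {
        // Mirrors the validator: the configured maximum must be greater than zero.
        if (maxEventSubscribersPerSession <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxEventSubscribersPerSession));
        }

        // Single-subscriber mode ignores the configured maximum.
        return allowMultipleEventSubscribers ? maxEventSubscribersPerSession : 1;
    }
}
```

With the defaults shown in the diff (`AllowMultipleEventSubscribers` unset, `MaxEventSubscribersPerSession = 8`), this sketch yields a limit of 1.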
@@ -138,6 +138,7 @@ public sealed class DashboardLiveDataService : IDashboardLiveDataService, IAsync
         GatewaySession session = await _sessionManager.OpenSessionAsync(
                 new SessionOpenRequest(BackendName, ClientName, Guid.NewGuid().ToString("N"), CommandTimeout: null),
                 ClientName,
+                ownerKeyId: null,
                 cancellationToken)
             .ConfigureAwait(false);
 

@@ -24,9 +24,21 @@ public sealed class DashboardEventBroadcaster(
             return;
         }
 
-        Task send = hubContext.Clients
-            .Group(EventsHub.GroupName(sessionId))
-            .SendAsync(EventsHub.EventMessage, mxEvent);
+        // Wrap the Task acquisition in a try/catch so a hypothetical synchronous throw
+        // from SendAsync (e.g. an implementation that throws before returning the Task)
+        // cannot escape Publish. The interface contract is never-throw; fire-and-forget.
+        Task send;
+        try
+        {
+            send = hubContext.Clients
+                .Group(EventsHub.GroupName(sessionId))
+                .SendAsync(EventsHub.EventMessage, mxEvent);
+        }
+        catch (Exception ex)
+        {
+            logger.LogDebug(ex, "Dashboard event mirror to session {SessionId} threw synchronously.", sessionId);
+            return;
+        }
 
         if (!send.IsCompletedSuccessfully)
         {

@@ -6,15 +6,9 @@ namespace ZB.MOM.WW.MxGateway.Server.Dashboard.Hubs;
 /// <summary>
 /// SignalR hub for per-session MxEvent push. Clients call
 /// <see cref="SubscribeSession"/> to join the group for a specific
-/// session; the dashboard's MxEvent broadcaster (a future hook on
-/// <c>EventStreamService</c>) sends messages to <c>session:{id}</c>.
+/// session; <see cref="DashboardEventBroadcaster"/> sends messages to
+/// <c>session:{id}</c> as events arrive from the live gRPC stream.
 /// </summary>
-/// <remarks>
-/// The publisher side is intentionally a follow-up. Today the dashboard's
-/// per-session event view is fed by the snapshot hub, which carries the
-/// rolling recent-events list. Once a dedicated MxEvent broadcaster
-/// lands, this hub's group convention is what it will publish to.
-/// </remarks>
 [Authorize(Policy = DashboardAuthenticationDefaults.HubClientsPolicy)]
 public sealed class EventsHub : Hub
 {

@@ -1,9 +1,7 @@
 using System.Runtime.CompilerServices;
-using System.Threading.Channels;
 using Microsoft.Extensions.Options;
 using ZB.MOM.WW.MxGateway.Contracts.Proto;
 using ZB.MOM.WW.MxGateway.Server.Configuration;
-using ZB.MOM.WW.MxGateway.Server.Dashboard.Hubs;
 using ZB.MOM.WW.MxGateway.Server.Metrics;
 using ZB.MOM.WW.MxGateway.Server.Sessions;
 using ZB.MOM.WW.MxGateway.Server.Workers;
@@ -13,14 +11,46 @@ namespace ZB.MOM.WW.MxGateway.Server.Grpc;
 public sealed class EventStreamService(
     ISessionManager sessionManager,
     IOptions<GatewayOptions> options,
-    MxAccessGrpcMapper mapper,
-    GatewayMetrics metrics,
-    IDashboardEventBroadcaster dashboardEventBroadcaster,
-    ILogger<EventStreamService> logger) : IEventStreamService
+    GatewayMetrics metrics) : IEventStreamService
 {
     /// <summary>
-    /// Streams events from a session to the client asynchronously.
+    /// Streams events from a session to the client asynchronously.
     /// </summary>
+    /// <remarks>
+    /// <para>
+    /// Task 4 rewired this from a per-RPC channel that drained the session directly
+    /// to reading the subscriber's lease channel fed by the session's single
+    /// <see cref="SessionEventDistributor"/> pump. The pump owns the single drain of
+    /// the worker event stream and the worker→public mapping (mirroring the former
+    /// <c>ProduceEventsAsync</c>); this loop is the per-subscriber boundary that
+    /// applies the per-RPC filter (<c>AfterWorkerSequence</c>), queue-depth metrics,
+    /// and the backpressure/overflow policy.
+    /// </para>
+    /// <para>
+    /// Task 6 moved the dashboard mirror OFF this per-RPC loop. The dashboard is now a
+    /// first-class internal subscriber on the session's
+    /// <see cref="SessionEventDistributor"/> (see <c>GatewaySession.StartDashboardMirror</c>),
+    /// so it receives session events even when no gRPC client is streaming. This loop no
+    /// longer mirrors to the dashboard. One deliberate consequence: the dashboard now sees
+    /// RAW session events, not the per-gRPC-subscriber <c>AfterWorkerSequence</c>-filtered
+    /// view this loop applies — the dashboard is a separate LDAP-authenticated monitoring
+    /// view that should see the session's full event activity (per-session dashboard ACL is
+    /// the separate Task 18).
+    /// </para>
+    /// <para>
+    /// Overflow handling (Task 5): the distributor's per-subscriber channel is bounded
+    /// and the pump writes non-blocking. When this subscriber's channel is full the pump
+    /// applies the per-subscriber backpressure policy and completes this subscriber's
+    /// channel with a <see cref="SessionManagerException"/>
+    /// (<see cref="SessionManagerErrorCode.EventQueueOverflow"/>). That terminal fault
+    /// surfaces here when the reader's <c>MoveNextAsync</c> throws, and — like the
+    /// pre-epic per-RPC overflow — it propagates to the gRPC client unchanged. The
+    /// overflow metric, and (in the legacy single-subscriber FailFast case) the session
+    /// fault + fault metric, are recorded by the distributor's overflow handler so the
+    /// session, the pump, and other subscribers are isolated from this subscriber's
+    /// slowness.
+    /// </para>
+    /// </remarks>
     /// <param name="request">Stream events request.</param>
     /// <param name="cancellationToken">Cancellation token.</param>
     /// <returns>Async enumerable of MX events.</returns>
@@ -35,151 +65,80 @@ public sealed class EventStreamService(
                 $"Session {request.SessionId} was not found.");
         }
 
-        using IDisposable subscriber = session.AttachEventSubscriber(
+        // No `using` here — subscriber.Dispose() is called exactly once in the finally
+        // block below, which also disposes the reader. A `using` declaration would add a
+        // second Dispose on the same path and double-decrement the session subscriber count.
+        IEventSubscriberLease subscriber = session.AttachEventSubscriber(
             options.Value.Sessions.AllowMultipleEventSubscribers);
-        using CancellationTokenSource streamCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
 
         int streamQueueDepth = 0;
-        Channel<MxEvent> eventQueue = Channel.CreateBounded<MxEvent>(
-            new BoundedChannelOptions(options.Value.Events.QueueCapacity)
-            {
-                SingleReader = true,
-                SingleWriter = true,
-                FullMode = BoundedChannelFullMode.Wait,
-                AllowSynchronousContinuations = false,
-            });
-        Task producerTask = ProduceEventsAsync(
-            session,
-            request.AfterWorkerSequence,
-            eventQueue.Writer,
-            () =>
-            {
-                Interlocked.Increment(ref streamQueueDepth);
-                metrics.AdjustGrpcEventStreamQueueDepth(1);
-            },
-            streamCts.Token);
+        ulong afterWorkerSequence = request.AfterWorkerSequence;
+        IAsyncEnumerator<MxEvent> reader = subscriber.Reader
+            .ReadAllAsync(cancellationToken)
+            .GetAsyncEnumerator(cancellationToken);
 
         try
         {
-            await foreach (MxEvent mxEvent in eventQueue.Reader.ReadAllAsync(cancellationToken).ConfigureAwait(false))
+            while (true)
             {
-                Interlocked.Decrement(ref streamQueueDepth);
-                metrics.AdjustGrpcEventStreamQueueDepth(-1);
+                MxEvent mxEvent;
+                try
+                {
+                    if (!await reader.MoveNextAsync().ConfigureAwait(false))
+                    {
+                        break;
+                    }
+
+                    mxEvent = reader.Current;
+                }
+                catch (WorkerClientException workerException)
+                {
+                    // The distributor pump completes every subscriber channel with the source
+                    // fault when the worker event stream terminates abnormally; that surfaces
+                    // here. Mirror the pre-Task-4 ProduceEventsAsync behavior: fault the
+                    // session and record the metric, then propagate the terminal fault to the
+                    // gRPC client.
+                    session.MarkFaulted(workerException.Message);
+                    metrics.Fault(WorkerClientErrorCode.WorkerFaulted.ToString());
+                    throw;
+                }
+
+                // Per-RPC filter stays at the subscriber boundary: each request may resume
+                // from a different AfterWorkerSequence, so the shared pump fans raw events and
+                // this loop drops the ones at or below the caller's watermark.
+                if (mxEvent.WorkerSequence <= afterWorkerSequence)
+                {
+                    continue;
+                }
+
+                // Queue-depth gauge tracks events the pump has fanned into this subscriber's
+                // channel but the client has not yet consumed — the same "buffered, not yet
+                // delivered" quantity the pre-Task-4 per-RPC channel reported. The bounded
+                // subscriber channel supports counting, so reconcile the gauge to the current
+                // backlog; falling back to a no-op delta if a channel ever cannot count.
+                int backlog = subscriber.Reader.CanCount ? subscriber.Reader.Count : streamQueueDepth;
+                int delta = backlog - streamQueueDepth;
+                if (delta != 0)
+                {
+                    streamQueueDepth = backlog;
+                    metrics.AdjustGrpcEventStreamQueueDepth(delta);
+                }
 
                 yield return mxEvent;
             }
-
-            await producerTask.ConfigureAwait(false);
         }
         finally
         {
-            await streamCts.CancelAsync().ConfigureAwait(false);
+            await reader.DisposeAsync().ConfigureAwait(false);
+            subscriber.Dispose();
 
-            try
+            if (streamQueueDepth != 0)
             {
-                await producerTask.ConfigureAwait(false);
-            }
-            catch (OperationCanceledException) when (streamCts.IsCancellationRequested)
-            {
-            }
-            catch (Exception exception)
-            {
-                logger.LogDebug(
-                    exception,
-                    "Event stream producer stopped for session {SessionId}.",
-                    request.SessionId);
-            }
-
-            int remainingDepth = Interlocked.Exchange(ref streamQueueDepth, 0);
-            if (remainingDepth > 0)
-            {
-                metrics.AdjustGrpcEventStreamQueueDepth(-remainingDepth);
+                metrics.AdjustGrpcEventStreamQueueDepth(-streamQueueDepth);
+                streamQueueDepth = 0;
             }
 
             metrics.StreamDisconnected("Detached");
         }
     }
-
-    private async Task ProduceEventsAsync(
-        GatewaySession session,
-        ulong afterWorkerSequence,
-        ChannelWriter<MxEvent> writer,
-        Action eventQueued,
-        CancellationToken cancellationToken)
-    {
-        try
-        {
-            await foreach (WorkerEvent workerEvent in session
-                .ReadEventsAsync(cancellationToken)
-                .WithCancellation(cancellationToken)
-                .ConfigureAwait(false))
-            {
-                MxEvent publicEvent = mapper.MapEvent(workerEvent);
-                if (publicEvent.WorkerSequence <= afterWorkerSequence)
-                {
-                    continue;
-                }
-
-                // Mirror the event to the dashboard EventsHub group for this
-                // session. Fire-and-forget — broadcast errors must not affect
-                // the source gRPC stream. Server-041: the
-                // IDashboardEventBroadcaster contract documents Publish as
-                // never-throw, but we enforce that at the seam too, so a
-                // future implementation that adds synchronous validation or
-                // a serializer hop cannot fault the producer loop and end
-                // this client's gRPC stream.
-                try
-                {
-                    dashboardEventBroadcaster.Publish(session.SessionId, publicEvent);
-                }
-                catch (Exception ex)
-                {
-                    logger.LogDebug(
-                        ex,
-                        "Dashboard event mirror threw for session {SessionId}; continuing.",
-                        session.SessionId);
-                }
-
-                if (!writer.TryWrite(publicEvent))
-                {
-                    string message = $"Session {session.SessionId} event stream queue overflowed.";
-                    metrics.QueueOverflow("grpc-event-stream");
-                    if (options.Value.Events.BackpressurePolicy == EventBackpressurePolicy.FailFast)
-                    {
-                        session.MarkFaulted(message);
-                        metrics.Fault(SessionManagerErrorCode.EventQueueOverflow.ToString());
-                    }
-                    else
-                    {
-                        logger.LogDebug(
-                            "Disconnecting event stream for session {SessionId} after queue overflow.",
-                            session.SessionId);
-                    }
-
-                    writer.TryComplete(new SessionManagerException(
-                        SessionManagerErrorCode.EventQueueOverflow,
-                        message));
-                    return;
-                }
-
-                eventQueued();
-            }
-
-            writer.TryComplete();
-        }
-        catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
-        {
-            writer.TryComplete();
-        }
-        catch (Exception exception)
-        {
-            if (exception is WorkerClientException)
-            {
-                session.MarkFaulted(exception.Message);
-                metrics.Fault(WorkerClientErrorCode.WorkerFaulted.ToString());
-            }
-
-            writer.TryComplete(exception);
-        }
-    }
 }

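The remarks in this file describe two mechanics: the pump writes to a bounded per-subscriber channel without blocking and completes it with a fault on overflow, and the subscriber loop drops events at or below the caller's `AfterWorkerSequence` watermark. The following is a minimal sketch of both using `System.Threading.Channels`. `SketchEvent`, `SubscriberSketch`, `TryFanOut`, and `DrainAsync` are hypothetical names, and `InvalidOperationException` stands in for the gateway's `SessionManagerException`; none of this is the gateway's own code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical stand-in for MxEvent: only the sequence number matters here.
public sealed record SketchEvent(ulong WorkerSequence);

public static class SubscriberSketch
{
    // Pump side: a non-blocking write. If the bounded channel is full, the
    // subscriber's channel is completed with a fault instead of waiting, so a
    // slow subscriber cannot stall the pump or other subscribers.
    public static bool TryFanOut(ChannelWriter<SketchEvent> writer, SketchEvent evt)
    {
        if (writer.TryWrite(evt))
        {
            return true;
        }

        writer.TryComplete(new InvalidOperationException("event queue overflowed"));
        return false;
    }

    // Subscriber side: read everything fanned into the channel and keep only
    // events strictly above the caller's watermark.
    public static async Task<List<ulong>> DrainAsync(
        ChannelReader<SketchEvent> reader,
        ulong afterWorkerSequence)
    {
        var delivered = new List<ulong>();
        await foreach (SketchEvent evt in reader.ReadAllAsync())
        {
            if (evt.WorkerSequence <= afterWorkerSequence)
            {
                continue; // at or below the watermark: already seen by this caller
            }

            delivered.Add(evt.WorkerSequence);
        }

        return delivered;
    }
}
```

Keeping the watermark filter on the subscriber side, as the diff does, lets one shared pump fan out raw events while each request resumes from its own sequence number.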
@@ -1,6 +1,5 @@
 using Google.Protobuf.WellKnownTypes;
 using Grpc.Core;
-using Microsoft.Data.SqlClient;
 using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
 using GalaxyDb = ZB.MOM.WW.MxGateway.Server.Galaxy;
 using ZB.MOM.WW.MxGateway.Server.Security.Authentication;
@@ -20,8 +19,7 @@ public sealed class GalaxyRepositoryGrpcService(
     GalaxyDb.IGalaxyRepository repository,
     GalaxyDb.IGalaxyHierarchyCache cache,
     GalaxyDb.IGalaxyDeployNotifier notifier,
-    IGatewayRequestIdentityAccessor identityAccessor,
-    ILogger<GalaxyRepositoryGrpcService> logger) : ProtoGalaxyRepository.GalaxyRepositoryBase
+    IGatewayRequestIdentityAccessor identityAccessor) : ProtoGalaxyRepository.GalaxyRepositoryBase
 {
     private static readonly TimeSpan FirstLoadWaitBudget = TimeSpan.FromSeconds(5);
     private const int DefaultDiscoverPageSize = 1000;
@@ -347,15 +345,4 @@ public sealed class GalaxyRepositoryGrpcService(
 
     private sealed record PageToken(long Sequence, string FilterSignature, int Offset);
 
-    [System.Diagnostics.CodeAnalysis.SuppressMessage(
-        "Style",
-        "IDE0051:Remove unused private members",
-        Justification = "Kept for parity with prior SQL exception mapping; future direct-SQL paths reuse it.")]
-    private RpcException MapSqlException(SqlException exception)
-    {
-        logger.LogWarning(exception, "Galaxy repository query failed.");
-        return new RpcException(new Status(
-            StatusCode.Unavailable,
-            "Galaxy repository is unavailable."));
-    }
 }

@@ -36,6 +36,7 @@ public sealed class MxAccessGatewayService(
             .OpenSessionAsync(
                 SessionOpenRequest.FromContract(request),
                 ResolveClientIdentity(),
+                identityAccessor.Current?.KeyId,
                 context.CancellationToken)
             .ConfigureAwait(false);
 
@@ -105,6 +106,7 @@ public sealed class MxAccessGatewayService(
             BulkConstraintPlan? bulkConstraintPlan = await ApplyConstraintsAsync(
                     session,
                     command,
+                    request.ClientCorrelationId,
                     context.CancellationToken)
                 .ConfigureAwait(false);
 
@@ -279,17 +281,18 @@ public sealed class MxAccessGatewayService(
     private async Task<BulkConstraintPlan?> ApplyConstraintsAsync(
         GatewaySession session,
         MxCommand command,
+        string? correlationId,
         CancellationToken cancellationToken)
     {
         ApiKeyIdentity? identity = identityAccessor.Current;
         switch (command.Kind)
         {
             case MxCommandKind.AddItem:
-                await EnforceReadTagAsync(identity, command.Kind, command.AddItem.ItemDefinition, cancellationToken)
+                await EnforceReadTagAsync(identity, command.Kind, command.AddItem.ItemDefinition, correlationId, cancellationToken)
                     .ConfigureAwait(false);
                 return null;
             case MxCommandKind.AddItem2:
-                await EnforceReadTagAsync(identity, command.Kind, command.AddItem2.ItemDefinition, cancellationToken)
+                await EnforceReadTagAsync(identity, command.Kind, command.AddItem2.ItemDefinition, correlationId, cancellationToken)
                     .ConfigureAwait(false);
                 return null;
             case MxCommandKind.AddItemBulk:
@@ -298,6 +301,7 @@ public sealed class MxAccessGatewayService(
                         command,
                         command.AddItemBulk.ServerHandle,
                         command.AddItemBulk.TagAddresses,
+                        correlationId,
                         cancellationToken)
                     .ConfigureAwait(false);
             case MxCommandKind.SubscribeBulk:
@@ -306,6 +310,7 @@ public sealed class MxAccessGatewayService(
                         command,
                         command.SubscribeBulk.ServerHandle,
                         command.SubscribeBulk.TagAddresses,
+                        correlationId,
                         cancellationToken)
                     .ConfigureAwait(false);
             case MxCommandKind.AdviseItemBulk:
@@ -315,6 +320,7 @@ public sealed class MxAccessGatewayService(
                         command,
                         command.AdviseItemBulk.ServerHandle,
                         command.AdviseItemBulk.ItemHandles,
+                        correlationId,
                         cancellationToken)
                     .ConfigureAwait(false);
             case MxCommandKind.ReadBulk:
@@ -323,6 +329,7 @@ public sealed class MxAccessGatewayService(
                         command,
                         command.ReadBulk.ServerHandle,
                         command.ReadBulk.TagAddresses,
+                        correlationId,
                         cancellationToken)
                     .ConfigureAwait(false);
             case MxCommandKind.WriteBulk:
@@ -333,6 +340,7 @@ public sealed class MxAccessGatewayService(
                         command.WriteBulk.ServerHandle,
                         command.WriteBulk.Entries,
                         entry => entry.ItemHandle,
+                        correlationId,
                         cancellationToken)
                     .ConfigureAwait(false);
             case MxCommandKind.Write2Bulk:
@@ -343,6 +351,7 @@ public sealed class MxAccessGatewayService(
                         command.Write2Bulk.ServerHandle,
                         command.Write2Bulk.Entries,
                         entry => entry.ItemHandle,
+                        correlationId,
                         cancellationToken)
                     .ConfigureAwait(false);
             case MxCommandKind.WriteSecuredBulk:
@@ -353,6 +362,7 @@ public sealed class MxAccessGatewayService(
|
||||
command.WriteSecuredBulk.ServerHandle,
|
||||
command.WriteSecuredBulk.Entries,
|
||||
entry => entry.ItemHandle,
|
||||
correlationId,
|
||||
cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
case MxCommandKind.WriteSecured2Bulk:
|
||||
@@ -363,6 +373,7 @@ public sealed class MxAccessGatewayService(
|
||||
command.WriteSecured2Bulk.ServerHandle,
|
||||
command.WriteSecured2Bulk.Entries,
|
||||
entry => entry.ItemHandle,
|
||||
correlationId,
|
||||
cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
case MxCommandKind.Write:
|
||||
@@ -372,6 +383,7 @@ public sealed class MxAccessGatewayService(
|
||||
command.Kind,
|
||||
command.Write.ServerHandle,
|
||||
command.Write.ItemHandle,
|
||||
correlationId,
|
||||
cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
return null;
|
||||
@@ -382,6 +394,7 @@ public sealed class MxAccessGatewayService(
|
||||
command.Kind,
|
||||
command.Write2.ServerHandle,
|
||||
command.Write2.ItemHandle,
|
||||
correlationId,
|
||||
cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
return null;
|
||||
@@ -392,6 +405,7 @@ public sealed class MxAccessGatewayService(
|
||||
command.Kind,
|
||||
command.WriteSecured.ServerHandle,
|
||||
command.WriteSecured.ItemHandle,
|
||||
correlationId,
|
||||
cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
return null;
|
||||
@@ -402,6 +416,7 @@ public sealed class MxAccessGatewayService(
|
||||
command.Kind,
|
||||
command.WriteSecured2.ServerHandle,
|
||||
command.WriteSecured2.ItemHandle,
|
||||
correlationId,
|
||||
cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
return null;
|
||||
@@ -414,6 +429,7 @@ public sealed class MxAccessGatewayService(
|
||||
ApiKeyIdentity? identity,
|
||||
MxCommandKind commandKind,
|
||||
string tagAddress,
|
||||
string? correlationId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
ConstraintFailure? failure = await constraintEnforcer
|
||||
@@ -424,7 +440,7 @@ public sealed class MxAccessGatewayService(
|
||||
return;
|
||||
}
|
||||
|
||||
-await constraintEnforcer.RecordDenialAsync(identity, commandKind.ToString(), tagAddress, failure, cancellationToken)
+await constraintEnforcer.RecordDenialAsync(identity, commandKind.ToString(), tagAddress, failure, correlationId, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
throw new RpcException(new Status(StatusCode.PermissionDenied, failure.Message));
|
||||
}
|
||||
@@ -435,6 +451,7 @@ public sealed class MxAccessGatewayService(
|
||||
MxCommandKind commandKind,
|
||||
int serverHandle,
|
||||
int itemHandle,
|
||||
string? correlationId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
ConstraintFailure? failure = await constraintEnforcer
|
||||
@@ -445,7 +462,7 @@ public sealed class MxAccessGatewayService(
|
||||
return;
|
||||
}
|
||||
|
||||
-await constraintEnforcer.RecordDenialAsync(identity, commandKind.ToString(), itemHandle.ToString(System.Globalization.CultureInfo.InvariantCulture), failure, cancellationToken)
+await constraintEnforcer.RecordDenialAsync(identity, commandKind.ToString(), itemHandle.ToString(System.Globalization.CultureInfo.InvariantCulture), failure, correlationId, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
throw new RpcException(new Status(StatusCode.PermissionDenied, failure.Message));
|
||||
}
|
||||
@@ -455,6 +472,7 @@ public sealed class MxAccessGatewayService(
|
||||
MxCommand command,
|
||||
int serverHandle,
|
||||
IReadOnlyList<string> tagAddresses,
|
||||
string? correlationId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
Dictionary<int, SubscribeResult> denied = [];
|
||||
@@ -471,7 +489,7 @@ public sealed class MxAccessGatewayService(
|
||||
continue;
|
||||
}
|
||||
|
||||
-await constraintEnforcer.RecordDenialAsync(identity, command.Kind.ToString(), tagAddress, failure, cancellationToken)
+await constraintEnforcer.RecordDenialAsync(identity, command.Kind.ToString(), tagAddress, failure, correlationId, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
denied[index] = new SubscribeResult
|
||||
{
|
||||
@@ -507,6 +525,7 @@ public sealed class MxAccessGatewayService(
|
||||
MxCommand command,
|
||||
int serverHandle,
|
||||
IReadOnlyList<string> tagAddresses,
|
||||
string? correlationId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
// Mirrors FilterTagBulkAsync but produces BulkReadResult denial entries
|
||||
@@ -526,7 +545,7 @@ public sealed class MxAccessGatewayService(
|
||||
continue;
|
||||
}
|
||||
|
||||
-await constraintEnforcer.RecordDenialAsync(identity, command.Kind.ToString(), tagAddress, failure, cancellationToken)
+await constraintEnforcer.RecordDenialAsync(identity, command.Kind.ToString(), tagAddress, failure, correlationId, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
denied[index] = new BulkReadResult
|
||||
{
|
||||
@@ -557,6 +576,7 @@ public sealed class MxAccessGatewayService(
|
||||
int serverHandle,
|
||||
Google.Protobuf.Collections.RepeatedField<TEntry> entries,
|
||||
Func<TEntry, int> getItemHandle,
|
||||
string? correlationId,
|
||||
CancellationToken cancellationToken) where TEntry : class
|
||||
{
|
||||
// The four bulk-write families each carry a different per-entry message
|
||||
@@ -586,6 +606,7 @@ public sealed class MxAccessGatewayService(
|
||||
command.Kind.ToString(),
|
||||
itemHandle.ToString(System.Globalization.CultureInfo.InvariantCulture),
|
||||
failure,
|
||||
correlationId,
|
||||
cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
denied[index] = new BulkWriteResult
|
||||
@@ -637,6 +658,7 @@ public sealed class MxAccessGatewayService(
|
||||
MxCommand command,
|
||||
int serverHandle,
|
||||
IReadOnlyList<int> itemHandles,
|
||||
string? correlationId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
Dictionary<int, SubscribeResult> denied = [];
|
||||
@@ -653,7 +675,7 @@ public sealed class MxAccessGatewayService(
|
||||
continue;
|
||||
}
|
||||
|
||||
-await constraintEnforcer.RecordDenialAsync(identity, command.Kind.ToString(), itemHandle.ToString(System.Globalization.CultureInfo.InvariantCulture), failure, cancellationToken)
+await constraintEnforcer.RecordDenialAsync(identity, command.Kind.ToString(), itemHandle.ToString(System.Globalization.CultureInfo.InvariantCulture), failure, correlationId, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
denied[index] = new SubscribeResult
|
||||
{
|
||||
|
||||
@@ -120,20 +120,24 @@ public sealed class ConstraintEnforcer(
|
||||
/// <param name="commandKind">The command type (e.g., read, write).</param>
|
||||
/// <param name="target">The target being accessed (tag address or handle).</param>
|
||||
/// <param name="failure">The constraint failure details.</param>
|
||||
/// <param name="correlationId">
|
||||
/// The per-request client correlation id, if any. Persisted as the audit record's typed
|
||||
/// <c>CorrelationId</c> when it parses as a GUID; a non-GUID value leaves that column null.
|
||||
/// The raw string is always preserved in <c>DetailsJson["clientCorrelationId"]</c> so a
|
||||
/// non-GUID id (e.g. from Rust/Python/Java clients) is never silently lost.
|
||||
/// </param>
|
||||
/// <param name="cancellationToken">Token to observe for cancellation.</param>
|
||||
public async Task RecordDenialAsync(
|
||||
ApiKeyIdentity? identity,
|
||||
string commandKind,
|
||||
string target,
|
||||
ConstraintFailure failure,
|
||||
string? correlationId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
// Emit a canonical Denied AuditEvent directly through the best-effort IAuditWriter
|
||||
// (Task 2.3 #6): structured Target ("<commandKind>:<target>") and a richer DetailsJson
|
||||
// envelope carrying constraint/message/commandKind/target.
|
||||
-// TODO(Task 2.3): CorrelationId is left null here. Threading the per-request
-// ClientCorrelationId down to RecordDenialAsync would require an invasive IConstraintEnforcer
-// signature change across the gRPC call path; that is deferred to a follow-up.
|
||||
AuditEvent auditEvent = new()
|
||||
{
|
||||
EventId = Guid.NewGuid(),
|
||||
@@ -144,13 +148,18 @@ public sealed class ConstraintEnforcer(
|
||||
Category = CanonicalForwardingApiKeyAuditStore.ApiKeyCategory,
|
||||
Target = $"{commandKind}:{target}",
|
||||
SourceNode = null,
|
||||
-CorrelationId = null,
+CorrelationId = Guid.TryParse(correlationId, out var cid) ? cid : (Guid?)null,
|
||||
DetailsJson = JsonSerializer.Serialize(new Dictionary<string, string>
|
||||
{
|
||||
["constraint"] = failure.ConstraintName,
|
||||
["message"] = failure.Message,
|
||||
["commandKind"] = commandKind,
|
||||
["target"] = target,
|
||||
// Always preserve the raw client correlation id here so it is never silently
|
||||
// lost: the typed CorrelationId column only retains GUID-parseable ids, but
|
||||
// clients (Rust/Python/Java) commonly send non-GUID or empty trace ids. The
|
||||
// raw id is a client trace id, not a secret, so storing it is fine.
|
||||
["clientCorrelationId"] = correlationId ?? "",
|
||||
}),
|
||||
};
|
||||
|
||||
|
||||
@@ -45,11 +45,16 @@ public interface IConstraintEnforcer
|
||||
/// <param name="commandKind">The kind of command denied.</param>
|
||||
/// <param name="target">The target of the denied command.</param>
|
||||
/// <param name="failure">The constraint failure details.</param>
|
||||
/// <param name="correlationId">
|
||||
/// The per-request client correlation id, if any. Stored on the audit record's
|
||||
/// <c>CorrelationId</c> when it parses as a GUID; otherwise left null.
|
||||
/// </param>
|
||||
/// <param name="cancellationToken">Token to observe for cancellation.</param>
|
||||
Task RecordDenialAsync(
|
||||
ApiKeyIdentity? identity,
|
||||
string commandKind,
|
||||
string target,
|
||||
ConstraintFailure failure,
|
||||
string? correlationId,
|
||||
CancellationToken cancellationToken);
|
||||
}
|
||||
|
||||
@@ -1,4 +1,10 @@
|
||||
using System.Runtime.CompilerServices;
|
||||
using Microsoft.Extensions.Logging;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Server.Configuration;
|
||||
using ZB.MOM.WW.MxGateway.Server.Dashboard.Hubs;
|
||||
using ZB.MOM.WW.MxGateway.Server.Grpc;
|
||||
using ZB.MOM.WW.MxGateway.Server.Metrics;
|
||||
using ZB.MOM.WW.MxGateway.Server.Workers;
|
||||
|
||||
namespace ZB.MOM.WW.MxGateway.Server.Sessions;
|
||||
@@ -7,6 +13,7 @@ public sealed class GatewaySession
|
||||
{
|
||||
private readonly object _syncRoot = new();
|
||||
private readonly SemaphoreSlim _closeLock = new(1, 1);
|
||||
private readonly SessionEventStreaming _eventStreaming;
|
||||
private IWorkerClient? _workerClient;
|
||||
private SessionState _state = SessionState.Creating;
|
||||
private string? _finalFault;
|
||||
@@ -14,6 +21,12 @@ public sealed class GatewaySession
|
||||
private DateTimeOffset? _leaseExpiresAt;
|
||||
private bool _closeStarted;
|
||||
private int _activeEventSubscriberCount;
|
||||
private SessionEventDistributor? _eventDistributor;
|
||||
private bool _eventDistributorStarted;
|
||||
private bool _dashboardMirrorStarted;
|
||||
private IEventSubscriberLease? _dashboardMirrorLease;
|
||||
private Task? _dashboardMirrorTask;
|
||||
private CancellationTokenSource? _dashboardMirrorCts;
|
||||
private readonly Dictionary<(int ServerHandle, int ItemHandle), SessionItemRegistration> _items = [];
|
||||
|
||||
/// <summary>
|
||||
@@ -30,6 +43,11 @@ public sealed class GatewaySession
|
||||
/// <param name="startupTimeout">Timeout for worker process startup.</param>
|
||||
/// <param name="shutdownTimeout">Timeout for worker process shutdown.</param>
|
||||
/// <param name="openedAt">Timestamp when the session opened.</param>
|
||||
/// <remarks>
/// Constructs a session with no owner key (<see cref="OwnerKeyId"/> will be null).
/// Authenticated call sites that have a resolved API key identity must use the
/// overload that accepts <c>ownerKeyId</c> and pass the caller's key id explicitly.
/// </remarks>
|
||||
public GatewaySession(
|
||||
string sessionId,
|
||||
string backendName,
|
||||
@@ -48,6 +66,7 @@ public sealed class GatewaySession
|
||||
pipeName,
|
||||
nonce,
|
||||
clientIdentity,
|
||||
ownerKeyId: null,
|
||||
clientSessionName,
|
||||
clientCorrelationId,
|
||||
commandTimeout,
|
||||
@@ -66,6 +85,7 @@ public sealed class GatewaySession
|
||||
/// <param name="pipeName">Name of the named pipe for gateway-worker IPC.</param>
|
||||
/// <param name="nonce">Security nonce for worker validation.</param>
|
||||
/// <param name="clientIdentity">Client identity from the authentication context.</param>
|
||||
/// <param name="ownerKeyId">API key identifier of the caller that created this session.</param>
|
||||
/// <param name="clientSessionName">Client-supplied session name.</param>
|
||||
/// <param name="clientCorrelationId">Client-supplied correlation identifier.</param>
|
||||
/// <param name="commandTimeout">Timeout for command invocation.</param>
|
||||
@@ -73,19 +93,30 @@ public sealed class GatewaySession
|
||||
/// <param name="shutdownTimeout">Timeout for worker process shutdown.</param>
|
||||
/// <param name="leaseDuration">Duration of the session lease.</param>
|
||||
/// <param name="openedAt">Timestamp when the session opened.</param>
|
||||
/// <param name="eventStreaming">
|
||||
/// Dependencies the session uses to construct and own its
|
||||
/// <see cref="SessionEventDistributor"/> (the single per-session worker-event pump
|
||||
/// that fans raw mapped <see cref="MxEvent"/>s to every subscriber lease). When
|
||||
/// <see langword="null"/>, defaults are used (no replay logger, system clock, a
|
||||
/// fresh mapper, and default <see cref="EventOptions"/>) so unit tests that build a
|
||||
/// session directly still get a working distributor. Production passes the
|
||||
/// DI-resolved dependencies.
|
||||
/// </param>
|
||||
public GatewaySession(
|
||||
string sessionId,
|
||||
string backendName,
|
||||
string pipeName,
|
||||
string nonce,
|
||||
string? clientIdentity,
|
||||
string? ownerKeyId,
|
||||
string? clientSessionName,
|
||||
string? clientCorrelationId,
|
||||
TimeSpan commandTimeout,
|
||||
TimeSpan startupTimeout,
|
||||
TimeSpan shutdownTimeout,
|
||||
TimeSpan leaseDuration,
|
||||
-DateTimeOffset openedAt)
+DateTimeOffset openedAt,
+SessionEventStreaming? eventStreaming = null)
|
||||
{
|
||||
if (string.IsNullOrWhiteSpace(sessionId))
|
||||
{
|
||||
@@ -112,6 +143,7 @@ public sealed class GatewaySession
|
||||
PipeName = pipeName;
|
||||
Nonce = nonce;
|
||||
ClientIdentity = clientIdentity;
|
||||
OwnerKeyId = ownerKeyId;
|
||||
ClientSessionName = clientSessionName;
|
||||
ClientCorrelationId = clientCorrelationId;
|
||||
CommandTimeout = commandTimeout;
|
||||
@@ -121,6 +153,7 @@ public sealed class GatewaySession
|
||||
OpenedAt = openedAt;
|
||||
_lastClientActivityAt = openedAt;
|
||||
_leaseExpiresAt = openedAt + leaseDuration;
|
||||
_eventStreaming = eventStreaming ?? SessionEventStreaming.Default;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
@@ -148,6 +181,11 @@ public sealed class GatewaySession
|
||||
/// </summary>
|
||||
public string? ClientIdentity { get; }
|
||||
|
||||
/// <summary>
|
||||
/// Gets the API key identifier of the caller that created this session.
|
||||
/// </summary>
|
||||
public string? OwnerKeyId { get; }
|
||||
|
||||
/// <summary>
|
||||
/// Gets the client-supplied session name.
|
||||
/// </summary>
|
||||
@@ -318,9 +356,268 @@ public sealed class GatewaySession
|
||||
/// <summary>
|
||||
/// Transitions the session to the Ready state.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// On becoming Ready the session starts its internal dashboard mirror (Task 6) when a
|
||||
/// dashboard broadcaster was supplied. The mirror registers an internal subscriber on
|
||||
/// the distributor and starts the pump <em>before</em> any gRPC client attaches, so the
|
||||
/// dashboard EventsHub receives session events even with no gRPC subscriber streaming —
|
||||
/// fixing the "dark feed" where the dashboard only saw events while a gRPC client was
|
||||
/// actively streaming. Registering the internal subscriber BEFORE
|
||||
/// <see cref="SessionEventDistributor.StartAsync"/> also avoids the Task 4 hazard where
|
||||
/// starting the pump at Ready with zero subscribers drained a fast-completing worker
|
||||
/// stream into nothing and left a later subscriber hanging: there is now always a
|
||||
/// subscriber (the dashboard one) registered before the pump starts.
|
||||
/// </remarks>
|
||||
public void MarkReady()
|
||||
{
|
||||
TransitionTo(SessionState.Ready);
|
||||
StartDashboardMirror();
|
||||
}
|
||||
|
||||
// Ensures the distributor exists and is started exactly once, registering this
// subscriber before the pump starts so no fanned event is missed between register and
// start. When no dashboard broadcaster is configured, the pump starts lazily on the FIRST
// AttachEventSubscriber rather than at MarkReady, which preserves the "events start
// flowing on subscribe" behavior and avoids draining a fast-completing source into the
// void before any subscriber exists. When a broadcaster is configured, MarkReady's
// dashboard mirror may already have claimed the start, and this call only registers.
// The source factory mirrors the mapping/ordering/start that
// EventStreamService.ProduceEventsAsync used before Task 4: it drains the worker event
// stream in source order and maps each WorkerEvent to the public MxEvent with the same
// mapper, with no skip/filter; per-RPC filtering (e.g. AfterWorkerSequence) stays at the
// subscriber boundary in EventStreamService. Whichever caller claims the start has its
// lease registered before the pump runs, so it sees the stream from its beginning.
|
||||
private IEventSubscriberLease StartDistributorAndRegister()
|
||||
{
|
||||
SessionEventDistributor distributor = EnsureDistributorCreated(out bool startNow);
|
||||
|
||||
// Register BEFORE starting the pump so a subscriber is present when the pump begins
|
||||
// draining — no event is fanned to an empty subscriber set and then missed by this
|
||||
// first subscriber. StartAsync only schedules the pump task; it never blocks.
|
||||
IEventSubscriberLease lease = distributor.Register();
|
||||
StartPumpIfRequested(distributor, startNow);
|
||||
|
||||
return lease;
|
||||
}
|
||||
|
||||
// Constructs the distributor exactly once and reports whether THIS caller is the one
|
||||
// that should start the pump (i.e. it observed the unstarted state and claimed the
|
||||
// start). Both the construction and the started-flag flip happen under _syncRoot so two
|
||||
// concurrent callers (e.g. MarkReady's dashboard mirror and a racing first
|
||||
// AttachEventSubscriber) agree on a single distributor and a single start.
|
||||
private SessionEventDistributor EnsureDistributorCreated(out bool startNow)
|
||||
{
|
||||
lock (_syncRoot)
|
||||
{
|
||||
if (_eventDistributor is null)
|
||||
{
|
||||
EventOptions eventOptions = _eventStreaming.EventOptions;
|
||||
_eventDistributor = new SessionEventDistributor(
|
||||
SessionId,
|
||||
MapWorkerEventsAsync,
|
||||
eventOptions.QueueCapacity,
|
||||
eventOptions.ReplayBufferCapacity,
|
||||
eventOptions.ReplayRetentionSeconds,
|
||||
_eventStreaming.DistributorLogger,
|
||||
_eventStreaming.TimeProvider,
|
||||
CreateOverflowHandler(eventOptions.BackpressurePolicy));
|
||||
}
|
||||
|
||||
startNow = false;
|
||||
if (!_eventDistributorStarted)
|
||||
{
|
||||
_eventDistributorStarted = true;
|
||||
startNow = true;
|
||||
}
|
||||
|
||||
return _eventDistributor;
|
||||
}
|
||||
}
|
||||
|
||||
private static void StartPumpIfRequested(SessionEventDistributor distributor, bool startNow)
|
||||
{
|
||||
if (!startNow)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
// StartAsync only schedules the pump via Task.Run and returns a completed task;
|
||||
// it does not perform any async I/O itself. The sync-over-async call here is
|
||||
// therefore safe and will not deadlock. Do not make StartAsync truly async
|
||||
// (i.e., await real I/O before returning) without also changing this call site.
|
||||
distributor.StartAsync(CancellationToken.None).GetAwaiter().GetResult();
|
||||
}
|
||||
|
||||
// Registers the gateway-owned internal dashboard subscriber on the distributor and starts
|
||||
// a background loop that mirrors every fanned event to the dashboard broadcaster. Called
|
||||
// once when the session becomes Ready (idempotent). The internal subscriber is registered
|
||||
// BEFORE the pump starts (see StartDistributorAndRegister / EnsureDistributorCreated), so
|
||||
// a subscriber is always present at pump start — the dashboard receives events with no
|
||||
// gRPC subscriber attached, and the Task 4 "zero-subscriber drain into the void" hang
|
||||
// cannot occur. No-op when no dashboard broadcaster was supplied (unit tests).
|
||||
//
|
||||
// Race-safety (Issue 1): _dashboardMirrorLease and _dashboardMirrorTask are published
|
||||
// atomically under a SINGLE second lock section, and DisposeAsync reads/nulls them under
|
||||
// that same lock. After EnsureDistributorCreated/Register/StartPump (all outside _syncRoot
|
||||
// to avoid lock inversion with the distributor's own lifecycle lock), we re-enter
|
||||
// _syncRoot and check for concurrent disposal. If the session is already Closing/Closed/
|
||||
// Faulted at that point, we dispose the just-created lease immediately and do NOT start
|
||||
// the mirror task, so nothing is orphaned.
|
||||
private void StartDashboardMirror()
|
||||
{
|
||||
IDashboardEventBroadcaster? broadcaster = _eventStreaming.DashboardBroadcaster;
|
||||
if (broadcaster is null)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
CancellationToken loopToken;
|
||||
lock (_syncRoot)
|
||||
{
|
||||
if (_dashboardMirrorStarted || _state is SessionState.Closing or SessionState.Closed or SessionState.Faulted)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
_dashboardMirrorStarted = true;
|
||||
_dashboardMirrorCts = new CancellationTokenSource();
|
||||
loopToken = _dashboardMirrorCts.Token;
|
||||
}
|
||||
|
||||
// Create the distributor (claiming the start if we are first) and register the
|
||||
// internal subscriber BEFORE starting the pump. isInternal: true keeps the dashboard
|
||||
// subscriber out of the single-subscriber overflow accounting, so a slow/broken
|
||||
// dashboard mirror only disconnects itself and never faults the session.
|
||||
// These three calls are OUTSIDE _syncRoot to avoid holding it across
|
||||
// EnsureDistributorCreated's own lock and StartAsync's Task.Run.
|
||||
SessionEventDistributor distributor = EnsureDistributorCreated(out bool startNow);
|
||||
IEventSubscriberLease lease = distributor.Register(isInternal: true);
|
||||
StartPumpIfRequested(distributor, startNow);
|
||||
|
||||
// Publish BOTH the lease and the task atomically under one lock section so
|
||||
// DisposeAsync always sees them in a consistent state: either both are set or
|
||||
// both are null. If the session already started disposal before we got here,
|
||||
// dispose the lease immediately instead of orphaning it.
|
||||
lock (_syncRoot)
|
||||
{
|
||||
if (_state is SessionState.Closing or SessionState.Closed or SessionState.Faulted)
|
||||
{
|
||||
// Disposal already ran (or is in progress) — discard the just-created
|
||||
// lease now so it is not orphaned. Do NOT launch the mirror task.
|
||||
lease.Dispose();
|
||||
return;
|
||||
}
|
||||
|
||||
_dashboardMirrorLease = lease;
|
||||
_dashboardMirrorTask = Task.Run(
|
||||
() => RunDashboardMirrorAsync(broadcaster, lease, loopToken),
|
||||
CancellationToken.None);
|
||||
}
|
||||
}
|
||||
|
||||
// Reads the internal dashboard subscriber's channel and publishes each RAW fanned event
|
||||
// to the dashboard broadcaster. The dashboard is a first-class distributor subscriber
|
||||
// (Task 6), so it sees the session's full raw event activity — NOT the per-gRPC-subscriber
|
||||
// AfterWorkerSequence filtering that EventStreamService applies at its own boundary. This
|
||||
// is intentional: the dashboard is a separate LDAP-authenticated monitoring view (per-
|
||||
// session dashboard ACL is the separate Task 18). Publish is best-effort / never-throw, so
|
||||
// a slow or broken dashboard cannot fault the session or stall the pump; the bounded
|
||||
// internal subscriber channel (Task 5 per-subscriber isolation) only disconnects THIS
|
||||
// mirror on overflow, leaving the session and other subscribers untouched.
|
||||
private async Task RunDashboardMirrorAsync(
|
||||
IDashboardEventBroadcaster broadcaster,
|
||||
IEventSubscriberLease lease,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
try
|
||||
{
|
||||
await foreach (MxEvent mxEvent in lease.Reader
|
||||
.ReadAllAsync(cancellationToken)
|
||||
.ConfigureAwait(false))
|
||||
{
|
||||
try
|
||||
{
|
||||
broadcaster.Publish(SessionId, mxEvent);
|
||||
}
|
||||
catch (Exception exception)
|
||||
{
|
||||
// Publish is documented never-throw, but enforce it here too so a future
|
||||
// implementation cannot fault the mirror loop. Logs identifiers only.
|
||||
_eventStreaming.DistributorLogger.LogDebug(
|
||||
exception,
|
||||
"Dashboard event mirror threw for session {SessionId}; continuing.",
|
||||
SessionId);
|
||||
}
|
||||
}
|
||||
}
|
||||
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
|
||||
{
|
||||
// Teardown path: the session is shutting down the mirror.
|
||||
}
|
||||
catch (SessionManagerException)
|
||||
{
|
||||
// The internal subscriber's channel overflowed and the distributor disconnected
|
||||
// it with a terminal overflow fault. That disconnects only the dashboard mirror;
|
||||
// the session, pump, and any gRPC subscriber are unaffected. Stop mirroring.
|
||||
}
|
||||
catch (Exception exception)
|
||||
{
|
||||
// Source-fault completion (worker event stream terminated abnormally) surfaces
|
||||
// here. The session's own fault handling runs via the gRPC path / lifecycle; the
|
||||
// mirror just stops. Logs identifiers only.
|
||||
_eventStreaming.DistributorLogger.LogDebug(
|
||||
exception,
|
||||
"Dashboard event mirror loop ended for session {SessionId}.",
|
||||
SessionId);
|
||||
}
|
||||
}
|
||||
|
||||
// Builds the per-subscriber backpressure handler the distributor invokes when a
|
||||
// subscriber's bounded channel overflows. The distributor always disconnects the
|
||||
// offending subscriber with an EventQueueOverflow fault; this handler adds the
|
||||
// observable side effects, preserving exactly what the pre-epic per-RPC overflow path
|
||||
// emitted:
|
||||
// - always record the queue-overflow metric, labeled by subscriber kind;
|
||||
// - FailFast in the legacy single-subscriber case (isOnlySubscriber): fault the whole
|
||||
// session and record the fault metric, matching back-compat behavior;
|
||||
// - FailFast with multiple subscribers, or DisconnectSubscriber in any case: do NOT
|
||||
// fault the session — the distributor's disconnect of the one slow subscriber is the
|
||||
// whole remedy, so other subscribers and the pump are unaffected. Multi-subscriber
|
||||
// FailFast deliberately degrades to a disconnect because faulting a shared session on
|
||||
// one slow consumer would punish healthy subscribers.
|
||||
// The delegate now carries isInternal directly (Issue 4), so the metric label is chosen
|
||||
// without any heuristic: "dashboard-mirror" for internal, "grpc-event-stream" for external.
|
||||
private SubscriberOverflowHandler CreateOverflowHandler(EventBackpressurePolicy policy)
|
||||
{
|
||||
GatewayMetrics metrics = _eventStreaming.Metrics;
|
||||
string sessionId = SessionId;
|
||||
return (isOnlySubscriber, isInternal) =>
|
||||
{
|
||||
// Label the overflow metric by subscriber kind. The distributor passes isInternal
|
||||
// directly, so no heuristic is needed to distinguish an internal overflow (the
|
||||
// gateway-owned dashboard mirror) from an external one (a gRPC streaming client).
|
||||
string label = isInternal ? "dashboard-mirror" : "grpc-event-stream";
|
||||
metrics.QueueOverflow(label);
|
||||
|
||||
if (policy == EventBackpressurePolicy.FailFast && isOnlySubscriber)
|
||||
{
|
||||
MarkFaulted($"Session {sessionId} event stream queue overflowed.");
|
||||
metrics.Fault(SessionManagerErrorCode.EventQueueOverflow.ToString());
|
||||
}
|
||||
};
|
||||
}
|
||||
|
||||
// The distributor's single event source. Drains the worker event stream once (the
|
||||
// distributor guarantees a single consumer) and maps each frame to the public MxEvent,
|
||||
// preserving worker order. Mirrors the former ProduceEventsAsync mapping exactly.
|
||||
private async IAsyncEnumerable<MxEvent> MapWorkerEventsAsync(
|
||||
[EnumeratorCancellation] CancellationToken cancellationToken)
|
||||
{
|
||||
MxAccessGrpcMapper mapper = _eventStreaming.Mapper;
|
||||
await foreach (WorkerEvent workerEvent in ReadEventsAsync(cancellationToken)
|
||||
.ConfigureAwait(false))
|
||||
{
|
||||
yield return mapper.MapEvent(workerEvent);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
@@ -381,10 +678,15 @@ public sealed class GatewaySession
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
-/// Attaches an event subscriber and returns a disposable lease.
+/// Attaches an event subscriber and returns a lease whose
+/// <see cref="IEventSubscriberLease.Reader"/> reads the fanned public
+/// <see cref="MxEvent"/>s for this subscriber. The single-subscriber guard
+/// (Tasks 7/8 relax it) is unchanged: with multi-subscriber disabled a second
+/// attach is rejected. The returned lease, when disposed, unregisters the
+/// distributor subscriber AND decrements the active-subscriber count.
|
||||
/// </summary>
|
||||
/// <param name="allowMultipleSubscribers">If true, allows multiple concurrent event subscribers.</param>
|
||||
public IDisposable AttachEventSubscriber(bool allowMultipleSubscribers)
|
||||
public IEventSubscriberLease AttachEventSubscriber(bool allowMultipleSubscribers)
|
||||
{
|
||||
lock (_syncRoot)
|
||||
{
|
||||
@@ -403,7 +705,20 @@ public sealed class GatewaySession
|
||||
}
|
||||
|
||||
_activeEventSubscriberCount++;
|
||||
return new EventSubscriberLease(this);
|
||||
}
|
||||
|
||||
// Construct/start the distributor and register this subscriber. Done outside the
|
||||
// guard lock (StartDistributorAndRegister takes _syncRoot itself for construction).
|
||||
// On any failure roll back the count we just took so the guard stays consistent.
|
||||
try
|
||||
{
|
||||
IEventSubscriberLease distributorLease = StartDistributorAndRegister();
|
||||
return new EventSubscriberLease(this, distributorLease);
|
||||
}
|
||||
catch
|
||||
{
|
||||
DetachEventSubscriber();
|
||||
throw;
|
||||
}
|
||||
}
|
||||
|
||||
@@ -960,6 +1275,63 @@ public sealed class GatewaySession
|
||||
{
|
||||
}
|
||||
|
||||
// Stop the internal dashboard mirror first: cancel its loop, dispose its lease (which
|
||||
// unregisters its internal distributor subscriber and completes its channel), and
|
||||
// await the loop task. Done BEFORE disposing the distributor and worker client — like
|
||||
// the distributor itself — so the mirror is no longer reading the pump when the pump
|
||||
// and its source (the worker client) tear down.
|
||||
IEventSubscriberLease? dashboardLease;
|
||||
Task? dashboardTask;
|
||||
CancellationTokenSource? dashboardCts;
|
||||
lock (_syncRoot)
|
||||
{
|
||||
dashboardLease = _dashboardMirrorLease;
|
||||
dashboardTask = _dashboardMirrorTask;
|
||||
dashboardCts = _dashboardMirrorCts;
|
||||
_dashboardMirrorLease = null;
|
||||
_dashboardMirrorTask = null;
|
||||
_dashboardMirrorCts = null;
|
||||
}
|
||||
|
||||
if (dashboardCts is not null)
|
||||
{
|
||||
await dashboardCts.CancelAsync().ConfigureAwait(false);
|
||||
}
|
||||
|
||||
dashboardLease?.Dispose();
|
||||
|
||||
if (dashboardTask is not null)
|
||||
{
|
||||
try
|
||||
{
|
||||
await dashboardTask.ConfigureAwait(false);
|
||||
}
|
||||
catch (Exception)
|
||||
{
|
||||
// The mirror loop swallows its own faults; any escape here must not block
|
||||
// disposal. The loop has stopped, which is all teardown requires.
|
||||
}
|
||||
}
|
||||
|
||||
dashboardCts?.Dispose();
|
||||
|
||||
// Stop the event pump and complete every subscriber channel before tearing down the
|
||||
// worker client (the pump's source). DisposeAsync is the single session teardown
|
||||
// point (SessionManager.RemoveSessionAsync awaits it after close), so awaiting it
|
||||
// here guarantees the distributor's pump task is observed and subscribers are
|
||||
// completed rather than left dangling.
|
||||
SessionEventDistributor? distributor;
|
||||
lock (_syncRoot)
|
||||
{
|
||||
distributor = _eventDistributor;
|
||||
_eventDistributor = null;
|
||||
}
|
||||
|
||||
if (distributor is not null)
|
||||
{
|
||||
await distributor.DisposeAsync().ConfigureAwait(false);
|
||||
}
|
||||
|
||||
if (_workerClient is not null)
|
||||
{
|
||||
await _workerClient.DisposeAsync().ConfigureAwait(false);
|
||||
@@ -1101,22 +1473,30 @@ public sealed class GatewaySession
|
||||
}
|
||||
}
|
||||
|
||||
private sealed class EventSubscriberLease(GatewaySession session) : IDisposable
|
||||
private sealed class EventSubscriberLease(GatewaySession session, IEventSubscriberLease distributorLease)
|
||||
: IEventSubscriberLease
|
||||
{
|
||||
private bool _disposed;
|
||||
// 0 = live, 1 = disposed. Interlocked so concurrent stream-completion +
|
||||
// client-cancellation paths cannot both call DetachEventSubscriber and
|
||||
// double-decrement _activeEventSubscriberCount to -1.
|
||||
private int _leaseDisposed;
|
||||
|
||||
/// <inheritdoc />
|
||||
public System.Threading.Channels.ChannelReader<MxEvent> Reader => distributorLease.Reader;
|
||||
|
||||
/// <summary>
|
||||
/// Disposes the lease and detaches the event subscriber.
|
||||
/// Disposes the lease: unregisters this subscriber from the distributor (completing
|
||||
/// its channel) and decrements the session's active-subscriber count. Ordering is
|
||||
/// not significant — the count guard and the distributor registration are
|
||||
/// independent — but both must run exactly once.
|
||||
/// </summary>
|
||||
public void Dispose()
|
||||
{
|
||||
if (_disposed)
|
||||
if (Interlocked.Exchange(ref _leaseDisposed, 1) == 0)
|
||||
{
|
||||
return;
|
||||
distributorLease.Dispose();
|
||||
session.DetachEventSubscriber();
|
||||
}
|
||||
|
||||
session.DetachEventSubscriber();
|
||||
_disposed = true;
|
||||
}
|
||||
}
|
||||
}
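The exactly-once guard used by the lease's `Dispose` can be illustrated in isolation. This is a minimal sketch, not the gateway's code: `OnceLease` and `OnceLeaseDemo` are hypothetical names, and the release callback stands in for the `distributorLease.Dispose()` and `session.DetachEventSubscriber()` pair.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a lease whose release action must run exactly once.
public sealed class OnceLease : IDisposable
{
    private readonly Action _onRelease;

    // 0 = live, 1 = disposed. Interlocked.Exchange returns the previous value,
    // so only the first caller observes 0 and runs the release action.
    private int _disposed;

    public OnceLease(Action onRelease)
    {
        _onRelease = onRelease ?? throw new ArgumentNullException(nameof(onRelease));
    }

    public void Dispose()
    {
        if (Interlocked.Exchange(ref _disposed, 1) == 0)
        {
            _onRelease();
        }
    }
}

public static class OnceLeaseDemo
{
    // Disposes one lease from many threads and reports how often the
    // release action actually ran.
    public static int ReleaseCountAfterConcurrentDispose(int callers)
    {
        int releases = 0;
        var lease = new OnceLease(() => Interlocked.Increment(ref releases));
        Parallel.For(0, callers, _ => lease.Dispose());
        return releases;
    }
}
```

Under these assumptions `OnceLeaseDemo.ReleaseCountAfterConcurrentDispose(64)` returns 1. A plain `bool` flag checked and then set gives no such guarantee when two paths race.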
|
||||
|
||||
@@ -0,0 +1,18 @@
|
||||
using System.Threading.Channels;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
|
||||
namespace ZB.MOM.WW.MxGateway.Server.Sessions;
|
||||
|
||||
/// <summary>
|
||||
/// A registration lease into a <see cref="SessionEventDistributor"/>. Exposes the
|
||||
/// subscriber's own <see cref="ChannelReader{T}"/> of fanned events. Disposing the
|
||||
/// lease unregisters the subscriber and completes its channel without disturbing the
|
||||
/// pump or other subscribers.
|
||||
/// </summary>
|
||||
public interface IEventSubscriberLease : IDisposable
|
||||
{
|
||||
/// <summary>
|
||||
/// Gets the reader for this subscriber's fanned event channel.
|
||||
/// </summary>
|
||||
ChannelReader<MxEvent> Reader { get; }
|
||||
}
|
||||
@@ -8,11 +8,13 @@ public interface ISessionManager
|
||||
/// <summary>Opens a new gateway session and launches a worker process.</summary>
|
||||
/// <param name="request">Request payload.</param>
|
||||
/// <param name="clientIdentity">Client identity string.</param>
|
||||
/// <param name="ownerKeyId">API key identifier of the caller creating the session.</param>
|
||||
/// <param name="cancellationToken">Token to cancel the asynchronous operation.</param>
|
||||
/// <returns>The newly opened session.</returns>
|
||||
Task<GatewaySession> OpenSessionAsync(
|
||||
SessionOpenRequest request,
|
||||
string? clientIdentity,
|
||||
string? ownerKeyId,
|
||||
CancellationToken cancellationToken);
|
||||
|
||||
/// <summary>Attempts to retrieve a session by ID.</summary>
|
||||
|
||||
@@ -0,0 +1,666 @@
|
||||
using System.Collections.Concurrent;
|
||||
using System.Threading.Channels;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
|
||||
namespace ZB.MOM.WW.MxGateway.Server.Sessions;
|
||||
|
||||
/// <summary>
|
||||
/// Invoked by the pump (on the pump thread) when a subscriber's bounded channel is full
|
||||
/// and the event cannot be written. The handler applies policy side-effects only:
|
||||
/// it records the overflow metric and, in the legacy single-subscriber FailFast case,
|
||||
/// faults the owning session. The handler MUST NOT complete the subscriber's channel —
|
||||
/// the distributor performs the disconnect and channel-completion unconditionally,
|
||||
/// regardless of what the handler does.
|
||||
/// </summary>
|
||||
/// <param name="isOnlySubscriber">
|
||||
/// <see langword="true"/> when the overflowing subscriber is the sole registered
|
||||
/// subscriber at the moment of overflow (legacy single-subscriber mode). FailFast faults
|
||||
/// the session only in this case; with multiple subscribers FailFast degrades to a
|
||||
/// per-subscriber disconnect so one slow consumer never faults a session shared by others.
|
||||
/// Always <see langword="false"/> for internal subscribers (the dashboard mirror) because
|
||||
/// <see cref="SessionEventDistributor"/> excludes them from the external-subscriber count.
|
||||
/// </param>
|
||||
/// <param name="isInternal">
|
||||
/// <see langword="true"/> when the overflowing subscriber is the gateway-owned internal
|
||||
/// dashboard mirror subscriber. The handler uses this to choose the correct metric label
|
||||
/// (<c>"dashboard-mirror"</c> vs <c>"grpc-event-stream"</c>).
|
||||
/// </param>
|
||||
public delegate void SubscriberOverflowHandler(bool isOnlySubscriber, bool isInternal);
|
||||
|
||||
/// <summary>
|
||||
/// Per-session event pump and fan-out. A single background task drains the
|
||||
/// session's event source <em>exactly once</em> and fans each event out to
|
||||
/// every currently-registered subscriber's own bounded channel.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// <para>
|
||||
/// Introduced by Task 2 of the Session Resilience epic; the bounded replay ring
|
||||
/// buffer was added by Task 3, it was wired into <c>GatewaySession</c> and
|
||||
/// <c>EventStreamService</c> by Task 4, and the per-subscriber backpressure-isolation
|
||||
/// policy (Task 5) is implemented here: a slow subscriber overflows only its own
|
||||
/// bounded channel and the pump applies the policy to that subscriber alone (see
|
||||
/// <see cref="SubscriberOverflowHandler"/> and <c>OnSubscriberOverflow</c>), leaving
|
||||
/// the pump, the session, and other subscribers running. The class does not yet
|
||||
/// remove the single-subscriber guard (Tasks 7/8). The ring buffer supports capacity
|
||||
/// eviction (oldest entry dropped when the count exceeds
|
||||
/// <c>replayBufferCapacity</c>) and age eviction (entries older than
|
||||
/// <c>replayRetentionSeconds</c> dropped on the next append or query), and is
|
||||
/// queried via <see cref="TryGetReplayFrom"/> by reconnecting subscribers.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Source seam.</b> The event source is injected as a
|
||||
/// <see cref="Func{T, TResult}"/> producing an
|
||||
/// <see cref="IAsyncEnumerable{T}"/> of already-mapped public
|
||||
/// <see cref="MxEvent"/>s, given a <see cref="CancellationToken"/>. This is the
|
||||
/// cleanest seam for Task 4: it can pass
|
||||
/// <c>ct => session.ReadEventsAsync(ct).Select(mapper.MapEvent)</c> (or a
|
||||
/// channel reader's <c>ReadAllAsync</c>), while unit tests pass a plain
|
||||
/// channel reader's <c>ReadAllAsync</c> with no real session. The pump owns the
|
||||
/// single consumption of this enumerable; fan-out happens on the public
|
||||
/// <see cref="MxEvent"/> after mapping, mirroring today's
|
||||
/// <c>EventStreamService.ProduceEventsAsync</c> ordering.
|
||||
/// </para>
|
||||
/// <para>
|
||||
/// <b>Concurrency.</b> The subscriber set is a
|
||||
/// <see cref="ConcurrentDictionary{TKey, TValue}"/> keyed by a monotonic id.
|
||||
/// The pump iterates a snapshot of its values (<c>Values</c> copies the collection, so iteration never throws on
|
||||
/// concurrent add/remove), and <see cref="Register"/> / lease disposal mutate it
|
||||
/// without any lock held across an <c>await</c>. Each subscriber channel has a
|
||||
/// single writer — the pump — so per-channel writes never race. MXAccess parity:
|
||||
/// events are fanned in the order received; the pump never reorders or
|
||||
/// synthesizes events.
|
||||
/// </para>
|
||||
/// </remarks>
|
||||
public sealed class SessionEventDistributor : IAsyncDisposable
|
||||
{
|
||||
/// <summary>
|
||||
/// Bounded wait for the pump to stop during disposal. A source factory that
|
||||
/// ignores cancellation must not hang dispose forever; after this window the
|
||||
/// pump is abandoned and subscribers are completed anyway.
|
||||
/// </summary>
|
||||
private static readonly TimeSpan DefaultShutdownTimeout = TimeSpan.FromSeconds(5);
|
||||
|
||||
private readonly string _sessionId;
|
||||
private readonly Func<CancellationToken, IAsyncEnumerable<MxEvent>> _eventSourceFactory;
|
||||
private readonly int _subscriberQueueCapacity;
|
||||
private readonly SubscriberOverflowHandler? _overflowHandler;
|
||||
private readonly TimeSpan _shutdownTimeout;
|
||||
private readonly ILogger<SessionEventDistributor> _logger;
|
||||
private readonly TimeProvider _timeProvider;
|
||||
private readonly ConcurrentDictionary<long, Subscriber> _subscribers = new();
|
||||
private readonly CancellationTokenSource _shutdownCts = new();
|
||||
private readonly object _lifecycleLock = new();
|
||||
|
||||
// Replay ring buffer. Appended on the pump thread and queried from arbitrary
|
||||
// threads via TryGetReplayFrom, so every access is under _replayLock. The deque
|
||||
// keeps events in ascending WorkerSequence order (the pump fans in source order),
|
||||
// so the oldest retained event is always at the front. Capacity == 0 disables
|
||||
// retention; RetentionSeconds <= 0 disables age-based eviction.
|
||||
private readonly int _replayBufferCapacity;
|
||||
private readonly TimeSpan _replayRetention;
|
||||
private readonly bool _ageEvictionEnabled;
|
||||
private readonly LinkedList<ReplayEntry> _replayBuffer = new();
|
||||
private readonly object _replayLock = new();
|
||||
private bool _anyEventSeen;
|
||||
private ulong _highestSequenceSeen;
|
||||
|
||||
private long _nextSubscriberId;
|
||||
private Task? _pumpTask;
|
||||
private bool _started;
|
||||
private bool _disposed;
|
||||
|
||||
/// <summary>
|
||||
/// Initializes a per-session event distributor.
|
||||
/// </summary>
|
||||
/// <param name="sessionId">Owning session id, used only for logging context.</param>
|
||||
/// <param name="eventSourceFactory">
|
||||
/// Factory producing the session's event stream given a cancellation token.
|
||||
/// The pump consumes this exactly once. See the type remarks for the seam Task 4
|
||||
/// plugs into.
|
||||
/// </param>
|
||||
/// <param name="subscriberQueueCapacity">
|
||||
/// Bounded capacity of each per-subscriber channel. Mirrors the gRPC event-stream
|
||||
/// queue capacity shape used today.
|
||||
/// </param>
|
||||
/// <param name="logger">Logger for pump lifecycle diagnostics.</param>
|
||||
/// <remarks>
|
||||
/// This overload disables the replay ring buffer (capacity 0). Use the overload
|
||||
/// taking replay parameters to retain events for reconnect/reattach replay.
|
||||
/// Kept <c>internal</c> so production wiring (Task 4) cannot accidentally use
|
||||
/// the no-replay path; tests reach it via <c>InternalsVisibleTo</c>.
|
||||
/// </remarks>
|
||||
internal SessionEventDistributor(
|
||||
string sessionId,
|
||||
Func<CancellationToken, IAsyncEnumerable<MxEvent>> eventSourceFactory,
|
||||
int subscriberQueueCapacity,
|
||||
ILogger<SessionEventDistributor> logger,
|
||||
SubscriberOverflowHandler? overflowHandler = null)
|
||||
: this(
|
||||
sessionId,
|
||||
eventSourceFactory,
|
||||
subscriberQueueCapacity,
|
||||
replayBufferCapacity: 0,
|
||||
replayRetentionSeconds: 0,
|
||||
logger,
|
||||
TimeProvider.System,
|
||||
overflowHandler)
|
||||
{
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Initializes a per-session event distributor with a bounded replay ring buffer.
|
||||
/// </summary>
|
||||
/// <param name="sessionId">Owning session id, used only for logging context.</param>
|
||||
/// <param name="eventSourceFactory">
|
||||
/// Factory producing the session's event stream given a cancellation token.
|
||||
/// The pump consumes this exactly once. See the type remarks for the seam Task 4
|
||||
/// plugs into.
|
||||
/// </param>
|
||||
/// <param name="subscriberQueueCapacity">
|
||||
/// Bounded capacity of each per-subscriber channel. Mirrors the gRPC event-stream
|
||||
/// queue capacity shape used today.
|
||||
/// </param>
|
||||
/// <param name="replayBufferCapacity">
|
||||
/// Maximum number of events retained for replay. The oldest retained event is
|
||||
/// evicted once this count is exceeded. <c>0</c> disables retention entirely.
|
||||
/// </param>
|
||||
/// <param name="replayRetentionSeconds">
|
||||
/// Maximum age, in seconds, of a retained event. Entries older than this are
|
||||
/// evicted regardless of capacity. <c>0</c> (or less) disables age-based eviction.
|
||||
/// </param>
|
||||
/// <param name="logger">Logger for pump lifecycle diagnostics.</param>
|
||||
/// <param name="timeProvider">
|
||||
/// Clock used to timestamp and age-evict replay entries. Inject a fake to make
|
||||
/// age-eviction deterministic in tests.
|
||||
/// </param>
|
||||
/// <param name="overflowHandler">
|
||||
/// Optional per-subscriber backpressure handler invoked when a subscriber's bounded
|
||||
/// channel is full. It records the overflow metric and, for the legacy
|
||||
/// single-subscriber FailFast case, faults the owning session. The distributor always
|
||||
/// disconnects the offending subscriber with an overflow fault regardless of the
|
||||
/// handler. When <see langword="null"/> (unit/skeleton use) the offending subscriber is
|
||||
/// still disconnected but no metric/fault side effect runs.
|
||||
/// </param>
|
||||
public SessionEventDistributor(
|
||||
string sessionId,
|
||||
Func<CancellationToken, IAsyncEnumerable<MxEvent>> eventSourceFactory,
|
||||
int subscriberQueueCapacity,
|
||||
int replayBufferCapacity,
|
||||
double replayRetentionSeconds,
|
||||
ILogger<SessionEventDistributor> logger,
|
||||
TimeProvider timeProvider,
|
||||
SubscriberOverflowHandler? overflowHandler = null)
|
||||
{
|
||||
ArgumentException.ThrowIfNullOrWhiteSpace(sessionId);
|
||||
ArgumentNullException.ThrowIfNull(eventSourceFactory);
|
||||
ArgumentOutOfRangeException.ThrowIfLessThan(subscriberQueueCapacity, 1);
|
||||
ArgumentOutOfRangeException.ThrowIfNegative(replayBufferCapacity);
|
||||
ArgumentOutOfRangeException.ThrowIfNegative(replayRetentionSeconds);
|
||||
ArgumentNullException.ThrowIfNull(logger);
|
||||
ArgumentNullException.ThrowIfNull(timeProvider);
|
||||
|
||||
_sessionId = sessionId;
|
||||
_eventSourceFactory = eventSourceFactory;
|
||||
_subscriberQueueCapacity = subscriberQueueCapacity;
|
||||
_overflowHandler = overflowHandler;
|
||||
_shutdownTimeout = DefaultShutdownTimeout;
|
||||
_replayBufferCapacity = replayBufferCapacity;
|
||||
_ageEvictionEnabled = replayRetentionSeconds > 0;
|
||||
_replayRetention = _ageEvictionEnabled
|
||||
? TimeSpan.FromSeconds(replayRetentionSeconds)
|
||||
: TimeSpan.Zero;
|
||||
_logger = logger;
|
||||
_timeProvider = timeProvider;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Gets the count of currently-registered subscribers.
|
||||
/// </summary>
|
||||
public int SubscriberCount => _subscribers.Count;
|
||||
|
||||
/// <summary>
|
||||
/// Starts the background pump. Idempotent — a second call is a no-op.
|
||||
/// </summary>
|
||||
/// <param name="cancellationToken">Token observed only while starting.</param>
|
||||
public Task StartAsync(CancellationToken cancellationToken)
|
||||
{
|
||||
cancellationToken.ThrowIfCancellationRequested();
|
||||
|
||||
lock (_lifecycleLock)
|
||||
{
|
||||
ObjectDisposedException.ThrowIf(_disposed, this);
|
||||
if (_started)
|
||||
{
|
||||
return Task.CompletedTask;
|
||||
}
|
||||
|
||||
_started = true;
|
||||
_pumpTask = Task.Run(() => PumpAsync(_shutdownCts.Token), CancellationToken.None);
|
||||
}
|
||||
|
||||
return Task.CompletedTask;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Registers a new subscriber and returns its lease. The lease exposes the
|
||||
/// subscriber's <see cref="ChannelReader{T}"/> and, when disposed, unregisters the
|
||||
/// subscriber and completes its channel without disturbing the pump or other
|
||||
/// subscribers.
|
||||
/// </summary>
|
||||
/// <param name="isInternal">
|
||||
/// <see langword="true"/> for a gateway-owned internal subscriber (Task 6: the
|
||||
/// session's dashboard mirror) that must NOT participate in the single-subscriber
|
||||
/// overflow accounting. An internal subscriber is excluded from the
|
||||
/// <c>isOnlySubscriber</c> count, so a lone external gRPC subscriber still reports
|
||||
/// <c>isOnlySubscriber == true</c> (preserving legacy FailFast session-fault
|
||||
/// behavior) even while the dashboard subscriber is attached; and an internal
|
||||
/// subscriber that itself overflows always reports <c>isOnlySubscriber == false</c>,
|
||||
/// so a slow/broken dashboard can never fault the session — it is merely
|
||||
/// disconnected from the mirror. Defaults to <see langword="false"/> (external
|
||||
/// subscriber) so every existing call site is unchanged.
|
||||
/// </param>
|
||||
public IEventSubscriberLease Register(bool isInternal = false)
|
||||
{
|
||||
// The pump is the single writer for this channel; readers are single-consumer
|
||||
// (one gRPC stream / dashboard subscriber). Synchronous continuations are
|
||||
// disabled so a slow reader can never stall the pump on its completion.
|
||||
//
|
||||
// The pump MUST stay non-blocking: it writes with the non-blocking TryWrite so one
|
||||
// slow reader can never stall the single pump that feeds every subscriber. FullMode
|
||||
// is deliberately Wait — NOT because the pump ever blocks (it never calls the blocking
|
||||
// WriteAsync overload), but because Wait is the only BoundedChannelFullMode under
|
||||
// which TryWrite returns false when the channel is full. That false return IS the
|
||||
// overflow signal the pump needs to apply the per-subscriber backpressure policy. The
|
||||
// Drop* modes would make TryWrite silently succeed-and-drop, hiding overflow and
|
||||
// re-introducing the silent data loss this task removes. So: Wait mode + TryWrite =
|
||||
// a non-blocking pump that still detects a full subscriber channel.
|
||||
Channel<MxEvent> channel = Channel.CreateBounded<MxEvent>(
|
||||
new BoundedChannelOptions(_subscriberQueueCapacity)
|
||||
{
|
||||
SingleReader = true,
|
||||
SingleWriter = true,
|
||||
FullMode = BoundedChannelFullMode.Wait,
|
||||
AllowSynchronousContinuations = false,
|
||||
});
|
||||
|
||||
long id = Interlocked.Increment(ref _nextSubscriberId);
|
||||
Subscriber subscriber = new(id, channel, isInternal);
|
||||
|
||||
// The disposed check AND the map add happen under the same lock with no await
|
||||
// in between. DisposeAsync sets _disposed=true under this same lock before it
|
||||
// calls CompleteAllSubscribers, so once disposal has begun no further subscriber
|
||||
// can be added — closing the Register-after-DisposeAsync window that would
|
||||
// otherwise leave a subscriber's channel never completed.
|
||||
lock (_lifecycleLock)
|
||||
{
|
||||
ObjectDisposedException.ThrowIf(_disposed, this);
|
||||
_subscribers[id] = subscriber;
|
||||
}
|
||||
|
||||
return new SubscriberLease(this, subscriber);
|
||||
}
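The comment in `Register` relies on how `TryWrite` behaves under each `BoundedChannelFullMode`. The sketch below is illustrative only and is not part of this change: `OverflowProbe` is a hypothetical helper that fills a bounded channel with no reader attached and counts how many `TryWrite` calls report success.

```csharp
using System;
using System.Threading.Channels;

// Hypothetical helper, not part of the gateway.
public static class OverflowProbe
{
    // Attempts `attempts` non-blocking writes into a bounded channel of the
    // given capacity that nobody reads, and returns how many returned true.
    public static int CountAcceptedWrites(BoundedChannelFullMode fullMode, int capacity, int attempts)
    {
        Channel<int> channel = Channel.CreateBounded<int>(
            new BoundedChannelOptions(capacity)
            {
                SingleReader = true,
                SingleWriter = true,
                FullMode = fullMode,
                AllowSynchronousContinuations = false,
            });

        int accepted = 0;
        for (int i = 0; i < attempts; i++)
        {
            // In Wait mode TryWrite returns false once the channel is full.
            // In the Drop* modes it returns true and discards an item instead.
            if (channel.Writer.TryWrite(i))
            {
                accepted++;
            }
        }

        return accepted;
    }
}
```

With capacity 2 and 5 attempts, `Wait` reports 2 accepted writes while `DropOldest` reports 5. Only the `Wait` result tells the writer that the channel overflowed, which is the signal the pump needs.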
|
||||
|
||||
/// <summary>
|
||||
/// Stops the pump and completes all subscriber channels. Idempotent.
|
||||
/// </summary>
|
||||
public async ValueTask DisposeAsync()
|
||||
{
|
||||
Task? pumpTask;
|
||||
lock (_lifecycleLock)
|
||||
{
|
||||
if (_disposed)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
_disposed = true;
|
||||
pumpTask = _pumpTask;
|
||||
}
|
||||
|
||||
// Signal the pump to stop. It must not block on a non-reading subscriber:
|
||||
// it writes with non-blocking TryWrite, so cancellation tears it down promptly.
|
||||
await _shutdownCts.CancelAsync().ConfigureAwait(false);
|
||||
|
||||
if (pumpTask is not null)
|
||||
{
|
||||
// Bound the wait: a source factory that ignores cancellation would otherwise
|
||||
// hang dispose forever. If the pump does not stop in time we log and proceed
|
||||
// to complete subscribers anyway; DisposeAsync must not throw on this path.
|
||||
Task completed = await Task.WhenAny(pumpTask, Task.Delay(_shutdownTimeout)).ConfigureAwait(false);
|
||||
if (!ReferenceEquals(completed, pumpTask))
|
||||
{
|
||||
_logger.LogWarning(
|
||||
"Event distributor pump did not stop within {ShutdownTimeoutSeconds}s for session {SessionId}; completing subscribers and abandoning the pump.",
|
||||
_shutdownTimeout.TotalSeconds,
|
||||
_sessionId);
|
||||
}
|
||||
else
|
||||
{
|
||||
try
|
||||
{
|
||||
await pumpTask.ConfigureAwait(false);
|
||||
}
|
||||
catch (OperationCanceledException)
|
||||
{
|
||||
}
|
||||
catch (Exception exception)
|
||||
{
|
||||
_logger.LogDebug(
|
||||
exception,
|
||||
"Event distributor pump faulted during shutdown for session {SessionId}.",
|
||||
_sessionId);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
CompleteAllSubscribers(error: null);
|
||||
_shutdownCts.Dispose();
|
||||
}
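The bounded wait in `DisposeAsync` follows a common `Task.WhenAny` pattern. The sketch below shows it under stated assumptions: `ShutdownWait` and `CompletedWithinAsync` are hypothetical names, and the real method additionally logs and observes the pump's exception.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the gateway.
public static class ShutdownWait
{
    // Returns true when `work` finished within `timeout`, false when the
    // timeout elapsed first. The work task is never cancelled here; on a
    // false result the caller simply stops waiting for it.
    public static async Task<bool> CompletedWithinAsync(Task work, TimeSpan timeout)
    {
        Task first = await Task.WhenAny(work, Task.Delay(timeout)).ConfigureAwait(false);
        return ReferenceEquals(first, work);
    }
}
```

A `false` result corresponds to the branch that logs the warning and abandons the pump. A `true` result is the branch where the pump task is awaited so its outcome is observed.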
|
||||
|
||||
private async Task PumpAsync(CancellationToken cancellationToken)
|
||||
{
|
||||
try
|
||||
{
|
||||
await foreach (MxEvent mxEvent in _eventSourceFactory(cancellationToken)
|
||||
.WithCancellation(cancellationToken)
|
||||
.ConfigureAwait(false))
|
||||
{
|
||||
// Retain for replay BEFORE fan-out so a reconnecting subscriber that
|
||||
// queries between fan-out and its own read still sees this event. Order
|
||||
// is preserved: the pump is the single appender and events arrive in
|
||||
// source order.
|
||||
AppendToReplayBuffer(mxEvent);
|
||||
|
||||
// Enumerating a ConcurrentDictionary's Values never throws on concurrent
|
||||
// add/remove; a subscriber registered mid-iteration may miss this event,
|
||||
// which matches "late subscribers see events after they register".
|
||||
foreach (Subscriber subscriber in _subscribers.Values)
|
||||
{
|
||||
// Non-blocking write: TryWrite never blocks the pump on a slow reader.
|
||||
// A false return means this subscriber's bounded channel is full — the
|
||||
// per-subscriber overflow signal. We apply the backpressure policy to
|
||||
// THIS subscriber only; the pump, the session, and every other subscriber
|
||||
// keep running. Logs identifiers (worker sequence, subscriber id, session)
|
||||
// only, never the event payload or tag values.
|
||||
if (!subscriber.Channel.Writer.TryWrite(mxEvent))
|
||||
{
|
||||
OnSubscriberOverflow(subscriber, mxEvent.WorkerSequence);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
CompleteAllSubscribers(error: null);
|
||||
}
|
||||
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
|
||||
{
|
||||
// Shutdown path: DisposeAsync completes subscribers.
|
||||
}
|
||||
catch (Exception exception)
|
||||
{
|
||||
// Unexpected source fault (not the shutdown-cancellation path above) — visible
|
||||
// by default so an event stream silently dying is not lost in Debug noise.
|
||||
_logger.LogError(
|
||||
exception,
|
||||
"Event distributor source faulted for session {SessionId}.",
|
||||
_sessionId);
|
||||
CompleteAllSubscribers(exception);
|
||||
}
|
||||
}
|
||||
|
||||
// Applies the per-subscriber backpressure policy when a subscriber's bounded channel is
|
||||
// full. Runs on the pump thread. The offending subscriber is ALWAYS disconnected with an
|
||||
// overflow fault and unregistered, so it can never wedge the pump again; the overflow
|
||||
// handler decides the observable side effects (overflow metric, and — for legacy
|
||||
// single-subscriber FailFast — faulting the owning session). Multi-subscriber FailFast
|
||||
// intentionally degrades to a plain disconnect (see SubscriberOverflowHandler docs): one
|
||||
// slow consumer must not fault a session shared by other healthy subscribers.
|
||||
private void OnSubscriberOverflow(Subscriber subscriber, ulong workerSequence)
|
||||
{
|
||||
// Snapshot whether this is the sole subscriber BEFORE we unregister it. This drives
|
||||
// the FailFast-fault-session-vs-disconnect decision: FailFast only faults the session
|
||||
// when the overflowing subscriber is the sole subscriber.
|
||||
//
|
||||
// This snapshot is safe in v1 because AllowMultipleEventSubscribers=false is enforced
|
||||
// by the validator and the single-subscriber guard in AttachEventSubscriber — a
|
||||
// concurrent second registration is impossible, so the false-FailFast race (two
|
||||
// subscribers, one overflows, Count reads as 1 after the other concurrently unregisters,
|
||||
// FailFast wrongly faults the session) cannot occur today.
|
||||
//
|
||||
// REVISIT (Task 7/8): when multi-subscriber is enabled the guard is removed and the
|
||||
// race window opens — a concurrent second registration could cause Count to read as 1
|
||||
// here even with two subscribers, producing a false FailFast that faults a shared
|
||||
// session. Resolve before enabling multi-subscriber.
|
||||
//
|
||||
// Task 6: the gateway-owned internal dashboard subscriber is excluded from this
|
||||
// accounting. (a) An internal subscriber that overflows is NEVER the "only subscriber"
|
||||
// — a slow/broken dashboard must never fault the session, only disconnect its own
|
||||
// mirror. (b) Internal subscribers are excluded from the count, so a lone external
|
||||
// gRPC subscriber still reports isOnlySubscriber==true and preserves the legacy
|
||||
// FailFast session-fault behavior even while the dashboard mirror is attached.
|
||||
bool isOnlySubscriber = !subscriber.IsInternal && CountExternalSubscribers() == 1;
|
||||
|
||||
_logger.LogDebug(
|
||||
"Event distributor disconnecting subscriber {SubscriberId} in session {SessionId} after queue overflow (worker sequence {WorkerSequence}).",
|
||||
subscriber.Id,
|
||||
_sessionId,
|
||||
workerSequence);
|
||||
|
||||
// Observability + session-fault decision. Errors here must not stall the pump or
|
||||
// leave the subscriber attached, so the disconnect below runs regardless.
|
||||
// Pass subscriber.IsInternal so the handler can choose the correct metric label.
|
||||
try
|
||||
{
|
||||
_overflowHandler?.Invoke(isOnlySubscriber, subscriber.IsInternal);
|
||||
}
|
||||
catch (Exception exception)
|
||||
{
|
||||
_logger.LogError(
|
||||
exception,
|
||||
"Event distributor overflow handler threw for session {SessionId}; disconnecting subscriber {SubscriberId} anyway.",
|
||||
_sessionId,
|
||||
subscriber.Id);
|
||||
}
|
||||
|
||||
// Disconnect ONLY this subscriber: complete its channel with the overflow fault and
|
||||
// remove it from the fan-out set. Its gRPC reader's MoveNextAsync then throws the
|
||||
// SessionManagerException, which EventStreamService surfaces to the client exactly as
|
||||
// the pre-epic per-RPC overflow did. The pump and every other subscriber are untouched.
|
||||
if (_subscribers.TryRemove(subscriber.Id, out _))
|
||||
{
|
||||
subscriber.Channel.Writer.TryComplete(new SessionManagerException(
|
||||
SessionManagerErrorCode.EventQueueOverflow,
|
||||
$"Session {_sessionId} event stream queue overflowed."));
|
||||
}
|
||||
}
|
||||
|
||||
// Counts external (non-internal) subscribers. Drives the isOnlySubscriber FailFast
|
||||
// decision so the gateway-owned internal dashboard subscriber never inflates the count.
|
||||
private int CountExternalSubscribers()
|
||||
{
|
||||
int count = 0;
|
||||
foreach (Subscriber subscriber in _subscribers.Values)
|
||||
{
|
||||
if (!subscriber.IsInternal)
|
||||
{
|
||||
count++;
|
||||
}
|
||||
}
|
||||
|
||||
return count;
|
||||
}
|
||||
|
||||
private void CompleteAllSubscribers(Exception? error)
|
||||
{
|
||||
foreach (Subscriber subscriber in _subscribers.Values)
|
||||
{
|
||||
subscriber.Channel.Writer.TryComplete(error);
|
||||
}
|
||||
}
|
||||
|
||||
private void Unregister(Subscriber subscriber)
|
||||
{
|
||||
if (_subscribers.TryRemove(subscriber.Id, out _))
|
||||
{
|
||||
subscriber.Channel.Writer.TryComplete();
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Returns the retained events with <see cref="MxEvent.WorkerSequence"/> strictly
|
||||
/// greater than <paramref name="afterSequence"/>, in ascending sequence order, so a
|
||||
/// reconnecting or reattaching subscriber can replay what it missed.
|
||||
/// </summary>
|
||||
/// <param name="afterSequence">
|
||||
/// The last worker sequence the caller already observed. Only events newer than this
|
||||
/// are returned.
|
||||
/// </param>
|
||||
/// <param name="events">
|
||||
/// The retained events newer than <paramref name="afterSequence"/>, in order. Never
|
||||
/// null; empty when nothing newer is retained.
|
||||
/// </param>
|
||||
/// <param name="gap">
|
||||
/// <see langword="true"/> when events between <paramref name="afterSequence"/> and the
|
||||
/// oldest retained event were already evicted (by capacity or age), meaning the caller
|
||||
/// missed events that can no longer be replayed and must re-snapshot. When
|
||||
/// <see langword="true"/>, whatever IS still retained is still returned via
|
||||
/// <paramref name="events"/>.
|
||||
/// </param>
|
||||
/// <returns>
|
||||
/// Always <see langword="true"/> — the out parameters fully describe the result. The
|
||||
/// return value exists for a fluent call shape and future extension.
|
||||
/// </returns>
|
||||
/// <remarks>
|
||||
/// <para>Gap semantics, by buffer state:</para>
|
||||
/// <list type="bullet">
|
||||
/// <item>
|
||||
/// Buffer non-empty: <paramref name="gap"/> is <see langword="true"/> iff
|
||||
/// <paramref name="afterSequence"/> is below the oldest retained sequence minus
|
||||
/// one (i.e. at least one event newer than <paramref name="afterSequence"/> but
|
||||
/// older than the oldest retained was evicted). When
|
||||
/// <paramref name="afterSequence"/> equals or exceeds the newest retained
|
||||
/// sequence the caller is fully caught up: empty list, no gap.
|
||||
/// </item>
|
||||
/// <item>
|
||||
/// Buffer empty (retention disabled, nothing seen yet, or everything evicted):
|
||||
/// empty list, and <paramref name="gap"/> is <see langword="true"/> iff
|
||||
/// <paramref name="afterSequence"/> is below the highest sequence ever seen —
|
||||
/// i.e. the caller is behind but nothing is retained to replay. If no event has
|
||||
/// ever been seen, or the caller is already at/ahead of the highest seen, there
|
||||
/// is nothing to miss: no gap.
|
||||
/// </item>
|
||||
/// </list>
|
||||
/// </remarks>
|
||||
public bool TryGetReplayFrom(ulong afterSequence, out IReadOnlyList<MxEvent> events, out bool gap)
|
||||
{
|
||||
lock (_replayLock)
|
||||
{
|
||||
EvictAged();
|
||||
|
||||
if (_replayBuffer.Count == 0)
|
||||
{
|
||||
events = [];
|
||||
// Nothing retained. The caller missed events only if it is behind the
|
||||
// highest sequence ever seen (and we have seen at least one event).
|
||||
gap = _anyEventSeen && afterSequence < _highestSequenceSeen;
|
||||
return true;
|
||||
}
|
||||
|
||||
ulong oldestRetained = _replayBuffer.First!.Value.Event.WorkerSequence;
|
||||
|
||||
// A gap exists when at least one event newer than afterSequence was evicted,
// i.e. afterSequence sits below the oldest-retained-minus-one boundary.
// Written as (oldestRetained > 0 && afterSequence < oldestRetained - 1) rather
// than (afterSequence + 1 < oldestRetained): the latter wraps to 0 when
// afterSequence == ulong.MaxValue and falsely reports a gap, while the
// oldestRetained > 0 guard keeps oldestRetained - 1 from wrapping the other way.
|
||||
gap = oldestRetained > 0 && afterSequence < oldestRetained - 1;
|
||||
|
||||
// O(n) scan over the retained buffer — acceptable because TryGetReplayFrom
|
||||
// is only called on subscriber reconnect, never on the hot fan-out path.
|
||||
List<MxEvent> newer = [];
|
||||
foreach (ReplayEntry entry in _replayBuffer)
|
||||
{
|
||||
if (entry.Event.WorkerSequence > afterSequence)
|
||||
{
|
||||
newer.Add(entry.Event);
|
||||
}
|
||||
}
|
||||
|
||||
events = newer;
|
||||
return true;
|
||||
}
|
||||
}
|
||||
|
||||
private void AppendToReplayBuffer(MxEvent mxEvent)
|
||||
{
|
||||
lock (_replayLock)
|
||||
{
|
||||
_anyEventSeen = true;
|
||||
if (mxEvent.WorkerSequence > _highestSequenceSeen)
|
||||
{
|
||||
_highestSequenceSeen = mxEvent.WorkerSequence;
|
||||
}
|
||||
|
||||
// Capacity 0 disables retention: track the highest-seen sequence (so replay
|
||||
// can still report a gap) but keep no events.
|
||||
if (_replayBufferCapacity == 0)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
_replayBuffer.AddLast(new ReplayEntry(mxEvent, _timeProvider.GetUtcNow()));
|
||||
|
||||
// Capacity eviction: drop oldest until within bound.
|
||||
while (_replayBuffer.Count > _replayBufferCapacity)
|
||||
{
|
||||
_replayBuffer.RemoveFirst();
|
||||
}
|
||||
|
||||
EvictAged();
|
||||
}
|
||||
}
|
||||
|
||||
// Must be called under _replayLock. Drops entries older than the retention window.
|
||||
private void EvictAged()
|
||||
{
|
||||
if (!_ageEvictionEnabled || _replayBuffer.Count == 0)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
DateTimeOffset cutoff = _timeProvider.GetUtcNow() - _replayRetention;
|
||||
while (_replayBuffer.First is { } first && first.Value.RetainedAt < cutoff)
|
||||
{
|
||||
_replayBuffer.RemoveFirst();
|
||||
}
|
||||
}
|
||||
|
||||
private readonly record struct ReplayEntry(MxEvent Event, DateTimeOffset RetainedAt);
|
||||
|
||||
private sealed class Subscriber(long id, Channel<MxEvent> channel, bool isInternal)
|
||||
{
|
||||
public long Id { get; } = id;
|
||||
|
||||
public Channel<MxEvent> Channel { get; } = channel;
|
||||
|
||||
// True for the gateway-owned internal dashboard subscriber. Excluded from the
|
||||
// single-subscriber overflow accounting so it cannot fault the session.
|
||||
public bool IsInternal { get; } = isInternal;
|
||||
}
|
||||
|
||||
private sealed class SubscriberLease(SessionEventDistributor distributor, Subscriber subscriber)
|
||||
: IEventSubscriberLease
|
||||
{
|
||||
private int _leaseDisposed;
|
||||
|
||||
public ChannelReader<MxEvent> Reader => subscriber.Channel.Reader;
|
||||
|
||||
public void Dispose()
|
||||
{
|
||||
// Atomic check-and-set so concurrent Dispose calls unregister at most once.
|
||||
if (Interlocked.Exchange(ref _leaseDisposed, 1) == 0)
|
||||
{
|
||||
distributor.Unregister(subscriber);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
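The gap rules documented on `TryGetReplayFrom` above can be checked with a small standalone sketch. This is not the distributor's code: `ReplayFrom` is a hypothetical helper that models the buffer as a plain list of worker sequences and uses a nullable `highestSeen` in place of the `_anyEventSeen` / `_highestSequenceSeen` pair.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for TryGetReplayFrom: `retained` lists the worker sequences
// still buffered (ascending); `highestSeen` is null when no event was ever observed.
static (List<ulong> Events, bool Gap) ReplayFrom(
    IReadOnlyList<ulong> retained, ulong? highestSeen, ulong afterSequence)
{
    if (retained.Count == 0)
    {
        // Nothing retained: a gap only if the caller is behind the highest sequence seen.
        bool behind = highestSeen is ulong highest && afterSequence < highest;
        return (new List<ulong>(), behind);
    }

    ulong oldest = retained[0];
    // Same overflow-safe comparison as the diff: no afterSequence + 1, no 0 - 1.
    bool gap = oldest > 0 && afterSequence < oldest - 1;

    List<ulong> newer = new();
    foreach (ulong sequence in retained)
    {
        if (sequence > afterSequence)
        {
            newer.Add(sequence);
        }
    }

    return (newer, gap);
}

ulong[] retained = { 5, 6, 7 };   // sequences 1..4 were already evicted

var caughtUp = ReplayFrom(retained, 7UL, 7UL);                   // nothing newer, no gap
var contiguous = ReplayFrom(retained, 7UL, 4UL);                 // 5, 6, 7 replayed, no gap
var behind = ReplayFrom(retained, 7UL, 2UL);                     // 5, 6, 7 replayed, gap (3 and 4 lost)
var retentionOff = ReplayFrom(Array.Empty<ulong>(), 7UL, 3UL);   // nothing to replay, gap
var neverSeen = ReplayFrom(Array.Empty<ulong>(), null, 0UL);     // nothing seen, no gap
```

A caller that observes `Gap == true` still receives whatever is retained, but would need to re-snapshot because the evicted events cannot be replayed.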
@@ -0,0 +1,60 @@
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using ZB.MOM.WW.MxGateway.Server.Configuration;
|
||||
using ZB.MOM.WW.MxGateway.Server.Dashboard.Hubs;
|
||||
using ZB.MOM.WW.MxGateway.Server.Grpc;
|
||||
using ZB.MOM.WW.MxGateway.Server.Metrics;
|
||||
|
||||
namespace ZB.MOM.WW.MxGateway.Server.Sessions;
|
||||
|
||||
/// <summary>
|
||||
/// Dependencies a <see cref="GatewaySession"/> needs to construct and own its
|
||||
/// <see cref="SessionEventDistributor"/>. Bundled so the session constructor takes a
/// single optional parameter rather than one per dependency, and so unit tests that
/// build a session directly get a working distributor from <see cref="Default"/>
/// without wiring DI.
|
||||
/// </summary>
|
||||
/// <param name="Mapper">
|
||||
/// Maps worker IPC <c>WorkerEvent</c> frames to public <c>MxEvent</c>s. The distributor
|
||||
/// pump applies this once per event in worker order, mirroring the mapping
|
||||
/// <c>EventStreamService.ProduceEventsAsync</c> used before Task 4.
|
||||
/// </param>
|
||||
/// <param name="EventOptions">
|
||||
/// Supplies the distributor's per-subscriber queue capacity and replay ring-buffer
|
||||
/// bounds (<see cref="EventOptions.QueueCapacity"/>,
|
||||
/// <see cref="EventOptions.ReplayBufferCapacity"/>,
|
||||
/// <see cref="EventOptions.ReplayRetentionSeconds"/>).
|
||||
/// </param>
|
||||
/// <param name="DistributorLogger">Logger for the distributor pump lifecycle.</param>
|
||||
/// <param name="TimeProvider">Clock used to timestamp and age-evict replay entries.</param>
|
||||
/// <param name="Metrics">
|
||||
/// Gateway metrics sink used by the session's per-subscriber overflow handler to record
|
||||
/// the queue-overflow counter and, for legacy single-subscriber FailFast, the session
|
||||
/// fault. Carrying it here keeps the distributor decoupled from the metrics type while
|
||||
/// preserving the observability the pre-epic per-RPC overflow path emitted.
|
||||
/// </param>
|
||||
/// <param name="DashboardBroadcaster">
|
||||
/// Sink the session's internal dashboard mirror loop (Task 6) publishes raw session
|
||||
/// <c>MxEvent</c>s to. When non-null the session registers an internal distributor
|
||||
/// subscriber on becoming Ready and mirrors every fanned event to the dashboard
|
||||
/// EventsHub group regardless of whether a gRPC client is streaming. When null
|
||||
/// (unit tests that don't exercise the dashboard mirror) no mirror is started.
|
||||
/// </param>
|
||||
public sealed record SessionEventStreaming(
|
||||
MxAccessGrpcMapper Mapper,
|
||||
EventOptions EventOptions,
|
||||
ILogger<SessionEventDistributor> DistributorLogger,
|
||||
TimeProvider TimeProvider,
|
||||
GatewayMetrics Metrics,
|
||||
IDashboardEventBroadcaster? DashboardBroadcaster = null)
|
||||
{
|
||||
/// <summary>
|
||||
/// Defaults used when a session is constructed without explicit streaming
/// dependencies (unit tests). Holds one mapper, default event options, a no-op
/// logger, the system clock, one metrics sink, and no dashboard mirror; the same
/// instances are shared by every session that falls back to this value.
|
||||
/// </summary>
|
||||
public static SessionEventStreaming Default { get; } = new(
|
||||
new MxAccessGrpcMapper(),
|
||||
new EventOptions(),
|
||||
NullLogger<SessionEventDistributor>.Instance,
|
||||
TimeProvider.System,
|
||||
new GatewayMetrics());
|
||||
}
|
||||
@@ -25,6 +25,9 @@ public sealed class SessionManager : ISessionManager
|
||||
private readonly ILogger<SessionManager> _logger;
|
||||
private readonly GatewayOptions _options;
|
||||
private readonly SemaphoreSlim _sessionSlots;
|
||||
private readonly Grpc.MxAccessGrpcMapper _eventMapper;
|
||||
private readonly ILogger<SessionEventDistributor> _distributorLogger;
|
||||
private readonly Dashboard.Hubs.IDashboardEventBroadcaster? _dashboardEventBroadcaster;
|
||||
|
||||
/// <summary>
|
||||
/// Initializes a new instance of <see cref="SessionManager"/>.
|
||||
@@ -35,13 +38,24 @@ public sealed class SessionManager : ISessionManager
|
||||
/// <param name="metrics">Gateway metrics.</param>
|
||||
/// <param name="timeProvider">Time provider for timestamps.</param>
|
||||
/// <param name="logger">Logger.</param>
|
||||
/// <param name="eventMapper">Mapper used by each session's event distributor to map worker events to public events.</param>
|
||||
/// <param name="distributorLogger">Logger passed to each session's event distributor pump.</param>
|
||||
/// <param name="dashboardEventBroadcaster">
|
||||
/// Dashboard SignalR fan-out sink. Each session registers an internal distributor
|
||||
/// subscriber (Task 6) that mirrors raw session events to this broadcaster, so the
|
||||
/// dashboard receives events regardless of whether a gRPC client is streaming. Null in
|
||||
/// unit tests that do not exercise the dashboard mirror.
|
||||
/// </param>
|
||||
public SessionManager(
|
||||
ISessionRegistry registry,
|
||||
ISessionWorkerClientFactory workerClientFactory,
|
||||
IOptions<GatewayOptions> options,
|
||||
GatewayMetrics metrics,
|
||||
TimeProvider? timeProvider = null,
|
||||
- ILogger<SessionManager>? logger = null)
+ ILogger<SessionManager>? logger = null,
+ Grpc.MxAccessGrpcMapper? eventMapper = null,
+ ILogger<SessionEventDistributor>? distributorLogger = null,
+ Dashboard.Hubs.IDashboardEventBroadcaster? dashboardEventBroadcaster = null)
|
||||
{
|
||||
_registry = registry ?? throw new ArgumentNullException(nameof(registry));
|
||||
_workerClientFactory = workerClientFactory ?? throw new ArgumentNullException(nameof(workerClientFactory));
|
||||
@@ -49,6 +63,9 @@ public sealed class SessionManager : ISessionManager
|
||||
_metrics = metrics ?? throw new ArgumentNullException(nameof(metrics));
|
||||
_timeProvider = timeProvider ?? TimeProvider.System;
|
||||
_logger = logger ?? NullLogger<SessionManager>.Instance;
|
||||
_eventMapper = eventMapper ?? new Grpc.MxAccessGrpcMapper();
|
||||
_distributorLogger = distributorLogger ?? NullLogger<SessionEventDistributor>.Instance;
|
||||
_dashboardEventBroadcaster = dashboardEventBroadcaster;
|
||||
_options = options.Value;
|
||||
_sessionSlots = new SemaphoreSlim(_options.Sessions.MaxSessions, _options.Sessions.MaxSessions);
|
||||
}
|
||||
@@ -58,11 +75,13 @@ public sealed class SessionManager : ISessionManager
|
||||
/// </summary>
|
||||
/// <param name="request">Session open request.</param>
|
||||
/// <param name="clientIdentity">Client authentication identity.</param>
|
||||
/// <param name="ownerKeyId">API key identifier of the caller creating the session.</param>
|
||||
/// <param name="cancellationToken">Cancellation token.</param>
|
||||
/// <returns>Opened gateway session.</returns>
|
||||
public async Task<GatewaySession> OpenSessionAsync(
|
||||
SessionOpenRequest request,
|
||||
string? clientIdentity,
|
||||
string? ownerKeyId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(request);
|
||||
@@ -72,7 +91,7 @@ public sealed class SessionManager : ISessionManager
|
||||
bool sessionOpenedRecorded = false;
|
||||
try
|
||||
{
|
||||
- session = CreateSession(request, clientIdentity);
+ session = CreateSession(request, clientIdentity, ownerKeyId);
|
||||
if (!_registry.TryAdd(session))
|
||||
{
|
||||
throw new SessionManagerException(
|
||||
@@ -420,7 +439,8 @@ public sealed class SessionManager : ISessionManager
|
||||
|
||||
private GatewaySession CreateSession(
|
||||
SessionOpenRequest request,
|
||||
- string? clientIdentity)
+ string? clientIdentity,
+ string? ownerKeyId)
|
||||
{
|
||||
string sessionId = CreateSessionId();
|
||||
string backendName = string.IsNullOrWhiteSpace(request.RequestedBackend)
|
||||
@@ -435,19 +455,29 @@ public sealed class SessionManager : ISessionManager
|
||||
DateTimeOffset openedAt = _timeProvider.GetUtcNow();
|
||||
string clientCorrelationId = CreateClientCorrelationId(request.ClientSessionName, sessionId);
|
||||
|
||||
SessionEventStreaming eventStreaming = new(
|
||||
_eventMapper,
|
||||
_options.Events,
|
||||
_distributorLogger,
|
||||
_timeProvider,
|
||||
_metrics,
|
||||
_dashboardEventBroadcaster);
|
||||
|
||||
return new GatewaySession(
|
||||
sessionId,
|
||||
backendName,
|
||||
pipeName,
|
||||
nonce,
|
||||
clientIdentity,
|
||||
ownerKeyId,
|
||||
request.ClientSessionName,
|
||||
clientCorrelationId,
|
||||
commandTimeout,
|
||||
startupTimeout,
|
||||
shutdownTimeout,
|
||||
leaseDuration,
|
||||
- openedAt);
+ openedAt,
+ eventStreaming);
|
||||
}
|
||||
|
||||
private static string CreateClientCorrelationId(
|
||||
|
||||
@@ -46,11 +46,14 @@
|
||||
"MaxPendingCommandsPerSession": 128,
|
||||
"DefaultLeaseSeconds": 1800,
|
||||
"LeaseSweepIntervalSeconds": 30,
|
||||
- "AllowMultipleEventSubscribers": false
+ "AllowMultipleEventSubscribers": false,
+ "MaxEventSubscribersPerSession": 8
|
||||
},
|
||||
"Events": {
|
||||
"QueueCapacity": 10000,
|
||||
- "BackpressurePolicy": "FailFast"
+ "BackpressurePolicy": "FailFast",
+ "ReplayBufferCapacity": 1024,
+ "ReplayRetentionSeconds": 300
|
||||
},
|
||||
"Dashboard": {
|
||||
"Enabled": true,
|
||||
|
||||
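The two new `Events` settings bound the replay buffer in different ways: `ReplayBufferCapacity` by count, `ReplayRetentionSeconds` by age. The sketch below illustrates how the two limits interact, following the append-then-evict order shown in `AppendToReplayBuffer`. It is an illustration only: the tuple-based buffer, the `Append` and `Oldest` helpers, and the capacity of 3 are assumptions chosen to keep the example short.

```csharp
using System;
using System.Collections.Generic;

int capacity = 3;                                 // stands in for ReplayBufferCapacity
TimeSpan retention = TimeSpan.FromSeconds(300);   // stands in for ReplayRetentionSeconds
LinkedList<(ulong Sequence, DateTimeOffset RetainedAt)> buffer = new();

// Hypothetical helper: append one event, then apply capacity and age eviction.
void Append(ulong sequence, DateTimeOffset now)
{
    buffer.AddLast((sequence, now));

    // Capacity eviction: drop oldest until within bound.
    while (buffer.Count > capacity)
    {
        buffer.RemoveFirst();
    }

    // Age eviction: drop entries retained before the cutoff.
    DateTimeOffset cutoff = now - retention;
    while (buffer.First is { } first && first.Value.RetainedAt < cutoff)
    {
        buffer.RemoveFirst();
    }
}

// Hypothetical helper: oldest retained sequence, or 0 when the buffer is empty.
ulong Oldest() => buffer.First is { } node ? node.Value.Sequence : 0UL;

DateTimeOffset start = new(2024, 1, 1, 0, 0, 0, TimeSpan.Zero);
Append(1, start);
Append(2, start.AddSeconds(10));
Append(3, start.AddSeconds(20));
Append(4, start.AddSeconds(30));    // over capacity: sequence 1 is dropped
ulong oldestAfterCapacity = Oldest();

Append(5, start.AddSeconds(325));   // capacity drops 2; age drops 3 (retained at 20s, cutoff 25s)
ulong oldestAfterAge = Oldest();
int retainedCount = buffer.Count;
```

Whichever limit is hit first decides the oldest replayable event, so a quiet session is bounded mainly by age and a busy one mainly by capacity.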
@@ -410,6 +410,7 @@ public sealed class AlarmFailoverEndToEndTests
|
||||
public Task<GatewaySession> OpenSessionAsync(
|
||||
SessionOpenRequest request,
|
||||
string? clientIdentity,
|
||||
string? ownerKeyId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
GatewaySession session = new(
|
||||
|
||||
@@ -711,6 +711,7 @@ public sealed class GatewayAlarmMonitorProviderModeTests
|
||||
public Task<GatewaySession> OpenSessionAsync(
|
||||
SessionOpenRequest request,
|
||||
string? clientIdentity,
|
||||
string? ownerKeyId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
GatewaySession session = new(
|
||||
|
||||
@@ -31,9 +31,11 @@ public sealed class GatewayOptionsTests
|
||||
|
||||
Assert.Equal(30, options.Sessions.DefaultCommandTimeoutSeconds);
|
||||
Assert.Equal(64, options.Sessions.MaxSessions);
|
||||
Assert.Equal(128, options.Sessions.MaxPendingCommandsPerSession);
|
||||
Assert.Equal(1800, options.Sessions.DefaultLeaseSeconds);
|
||||
Assert.Equal(30, options.Sessions.LeaseSweepIntervalSeconds);
|
||||
Assert.False(options.Sessions.AllowMultipleEventSubscribers);
|
||||
Assert.Equal(8, options.Sessions.MaxEventSubscribersPerSession);
|
||||
|
||||
Assert.Equal(10_000, options.Events.QueueCapacity);
|
||||
Assert.Equal(EventBackpressurePolicy.FailFast, options.Events.BackpressurePolicy);
|
||||
|
||||
@@ -289,4 +289,71 @@ public sealed class GatewayOptionsValidatorTests
|
||||
Assert.True(result.Failed);
|
||||
Assert.Contains(result.Failures!, f => f.Contains(keyPart));
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// AllowMultipleEventSubscribers / MaxEventSubscribersPerSession validation
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
private static GatewayOptions CloneWithSessions(GatewayOptions source, SessionOptions sessions)
|
||||
=> new()
|
||||
{
|
||||
Authentication = source.Authentication,
|
||||
Ldap = source.Ldap,
|
||||
Worker = source.Worker,
|
||||
Sessions = sessions,
|
||||
Events = source.Events,
|
||||
Dashboard = source.Dashboard,
|
||||
Protocol = source.Protocol,
|
||||
Alarms = source.Alarms,
|
||||
Tls = source.Tls,
|
||||
};
|
||||
|
||||
[Fact]
|
||||
public void Validate_Succeeds_WhenAllowMultipleEventSubscribersIsTrue()
|
||||
{
|
||||
// AllowMultipleEventSubscribers=true must now validate cleanly (no longer rejected).
|
||||
GatewayOptions options = CloneWithSessions(
|
||||
ValidOptions(),
|
||||
new SessionOptions { AllowMultipleEventSubscribers = true });
|
||||
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
|
||||
Assert.True(result.Succeeded);
|
||||
}
|
||||
|
||||
[Theory]
|
||||
[InlineData(0)]
|
||||
[InlineData(-1)]
|
||||
public void Validate_Fails_WhenMaxEventSubscribersPerSessionBelowOne(int value)
|
||||
{
|
||||
GatewayOptions options = CloneWithSessions(
|
||||
ValidOptions(),
|
||||
new SessionOptions { MaxEventSubscribersPerSession = value });
|
||||
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
|
||||
Assert.True(result.Failed);
|
||||
Assert.Contains(
|
||||
result.Failures!,
|
||||
f => f.Contains("MxGateway:Sessions:MaxEventSubscribersPerSession"));
|
||||
}
|
||||
|
||||
[Theory]
|
||||
[InlineData(1)]
|
||||
[InlineData(8)]
|
||||
[InlineData(32)]
|
||||
public void Validate_Succeeds_WhenMaxEventSubscribersPerSessionIsPositive(int value)
|
||||
{
|
||||
GatewayOptions options = CloneWithSessions(
|
||||
ValidOptions(),
|
||||
new SessionOptions { MaxEventSubscribersPerSession = value });
|
||||
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
|
||||
Assert.True(result.Succeeded);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void Validate_Succeeds_WithDefaultSessionOptions()
|
||||
{
|
||||
// Default SessionOptions (AllowMultipleEventSubscribers=false, MaxEventSubscribersPerSession=8)
|
||||
// must validate cleanly.
|
||||
GatewayOptions options = CloneWithSessions(ValidOptions(), new SessionOptions());
|
||||
ValidateOptionsResult result = new GatewayOptionsValidator().Validate(null, options);
|
||||
Assert.True(result.Succeeded);
|
||||
}
|
||||
}
|
||||
|
||||
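The validator change these tests exercise is not part of this excerpt. As a rough sketch of a rule that would satisfy them, assuming only what the tests assert (values below 1 fail, and the failure text contains the `MxGateway:Sessions:MaxEventSubscribersPerSession` key), a check could look like the following. `ValidateMaxEventSubscribers` and the message wording are hypothetical, not the real `GatewayOptionsValidator`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rule: collects failure messages the way an options validator might.
static List<string> ValidateMaxEventSubscribers(int maxEventSubscribersPerSession)
{
    List<string> failures = new();
    if (maxEventSubscribersPerSession < 1)
    {
        // The configuration key is taken from the test assertion; the wording is assumed.
        failures.Add("MxGateway:Sessions:MaxEventSubscribersPerSession must be at least 1.");
    }

    return failures;
}

List<string> zero = ValidateMaxEventSubscribers(0);       // one failure
List<string> negative = ValidateMaxEventSubscribers(-1);  // one failure
List<string> eight = ValidateMaxEventSubscribers(8);      // no failures (the default)
```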
@@ -1,6 +1,5 @@
|
||||
using System.Diagnostics;
|
||||
using Grpc.Core;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Server.Dashboard;
|
||||
using ZB.MOM.WW.MxGateway.Server.Galaxy;
|
||||
@@ -266,8 +265,7 @@ public sealed class GalaxyFilterInputSafetyTests
|
||||
new ZB.MOM.WW.MxGateway.Server.Galaxy.GalaxyRepository(options),
|
||||
new StubGalaxyHierarchyCache(entry),
|
||||
new GalaxyDeployNotifier(),
|
||||
- new GatewayRequestIdentityAccessor(),
- NullLogger<GalaxyRepositoryGrpcService>.Instance);
+ new GatewayRequestIdentityAccessor());
|
||||
}
|
||||
|
||||
private static GalaxyHierarchyCacheEntry CreateEntry(IReadOnlyList<GalaxyObject> objects)
|
||||
|
||||
@@ -244,6 +244,7 @@ public sealed class DashboardSessionAdminServiceTests
|
||||
public Task<GatewaySession> OpenSessionAsync(
|
||||
SessionOpenRequest request,
|
||||
string? clientIdentity,
|
||||
string? ownerKeyId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
throw new NotSupportedException();
|
||||
|
||||
@@ -94,6 +94,103 @@ public sealed class GatewayEndToEndFakeWorkerSmokeTests
|
||||
launcher.CommandKinds);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Verifies that the gateway forwards control commands (Ping, GetWorkerInfo, DrainEvents)
|
||||
/// through the full gRPC→WorkerClient→pipe roundtrip when the fake worker responds
|
||||
/// with canned replies via RespondToControlCommandAsync.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task GatewayService_WithFakeWorker_ControlCommandsRoundtripThroughGateway()
|
||||
{
|
||||
ControlCommandFakeWorkerProcessLauncher launcher = new();
|
||||
await using GatewayServiceFixture fixture = new(launcher);
|
||||
|
||||
OpenSessionReply openReply = await fixture.Service.OpenSession(
|
||||
new OpenSessionRequest
|
||||
{
|
||||
ClientSessionName = "control-cmd-e2e",
|
||||
ClientCorrelationId = "control-open-correlation",
|
||||
CommandTimeout = Duration.FromTimeSpan(TestTimeout),
|
||||
},
|
||||
new TestServerCallContext());
|
||||
|
||||
Assert.Equal(ProtocolStatusCode.Ok, openReply.ProtocolStatus.Code);
|
||||
string sessionId = openReply.SessionId;
|
||||
|
||||
// Ping — the scripted worker echoes back the message.
|
||||
Task<MxCommandReply> pingTask = fixture.Service.Invoke(
|
||||
new MxCommandRequest
|
||||
{
|
||||
SessionId = sessionId,
|
||||
ClientCorrelationId = "ping-correlation",
|
||||
Command = new MxCommand
|
||||
{
|
||||
Kind = MxCommandKind.Ping,
|
||||
Ping = new PingCommand { Message = "e2e-ping" },
|
||||
},
|
||||
},
|
||||
new TestServerCallContext());
|
||||
await launcher.WaitForNextControlCommandAsync(TestTimeout);
|
||||
MxCommandReply pingReply = await pingTask.WaitAsync(TestTimeout);
|
||||
|
||||
Assert.Equal(ProtocolStatusCode.Ok, pingReply.ProtocolStatus.Code);
|
||||
Assert.Equal(MxCommandKind.Ping, pingReply.Kind);
|
||||
Assert.Equal("e2e-ping", pingReply.DiagnosticMessage);
|
||||
|
||||
// GetWorkerInfo — the scripted worker returns canned info.
|
||||
Task<MxCommandReply> infoTask = fixture.Service.Invoke(
|
||||
new MxCommandRequest
|
||||
{
|
||||
SessionId = sessionId,
|
||||
ClientCorrelationId = "info-correlation",
|
||||
Command = new MxCommand
|
||||
{
|
||||
Kind = MxCommandKind.GetWorkerInfo,
|
||||
GetWorkerInfo = new GetWorkerInfoCommand(),
|
||||
},
|
||||
},
|
||||
new TestServerCallContext());
|
||||
await launcher.WaitForNextControlCommandAsync(TestTimeout);
|
||||
MxCommandReply infoReply = await infoTask.WaitAsync(TestTimeout);
|
||||
|
||||
Assert.Equal(ProtocolStatusCode.Ok, infoReply.ProtocolStatus.Code);
|
||||
Assert.Equal(MxCommandKind.GetWorkerInfo, infoReply.Kind);
|
||||
Assert.NotNull(infoReply.WorkerInfo);
|
||||
Assert.Equal(FakeWorkerHarness.DefaultWorkerProcessId, infoReply.WorkerInfo.WorkerProcessId);
|
||||
Assert.False(string.IsNullOrEmpty(infoReply.WorkerInfo.MxaccessProgid));
|
||||
|
||||
// DrainEvents — the scripted worker returns an empty drain reply.
|
||||
Task<MxCommandReply> drainTask = fixture.Service.Invoke(
|
||||
new MxCommandRequest
|
||||
{
|
||||
SessionId = sessionId,
|
||||
ClientCorrelationId = "drain-correlation",
|
||||
Command = new MxCommand
|
||||
{
|
||||
Kind = MxCommandKind.DrainEvents,
|
||||
DrainEvents = new DrainEventsCommand { MaxEvents = 16 },
|
||||
},
|
||||
},
|
||||
new TestServerCallContext());
|
||||
await launcher.WaitForNextControlCommandAsync(TestTimeout);
|
||||
MxCommandReply drainReply = await drainTask.WaitAsync(TestTimeout);
|
||||
|
||||
Assert.Equal(ProtocolStatusCode.Ok, drainReply.ProtocolStatus.Code);
|
||||
Assert.Equal(MxCommandKind.DrainEvents, drainReply.Kind);
|
||||
Assert.NotNull(drainReply.DrainEvents);
|
||||
Assert.Empty(drainReply.DrainEvents.Events);
|
||||
|
||||
// Tear down cleanly.
|
||||
await fixture.Service.CloseSession(
|
||||
new CloseSessionRequest
|
||||
{
|
||||
SessionId = sessionId,
|
||||
ClientCorrelationId = "control-close-correlation",
|
||||
},
|
||||
new TestServerCallContext());
|
||||
await launcher.WorkerTask.WaitAsync(TestTimeout);
|
||||
}
|
||||
|
||||
private static MxCommandRequest CreateRegisterRequest(string sessionId)
|
||||
{
|
||||
return new MxCommandRequest
|
||||
@@ -171,15 +268,13 @@ public sealed class GatewayEndToEndFakeWorkerSmokeTests
|
||||
workerClientFactory,
|
||||
options,
|
||||
_metrics,
|
||||
- logger: NullLogger<SessionManager>.Instance);
+ logger: NullLogger<SessionManager>.Instance,
+ dashboardEventBroadcaster: NullDashboardEventBroadcaster.Instance);
|
||||
MxAccessGrpcMapper mapper = new();
|
||||
EventStreamService eventStreamService = new(
|
||||
sessionManager,
|
||||
options,
|
||||
- mapper,
- _metrics,
- NullDashboardEventBroadcaster.Instance,
- NullLogger<EventStreamService>.Instance);
+ _metrics);
|
||||
|
||||
Service = new MxAccessGatewayService(
|
||||
sessionManager,
|
||||
@@ -355,6 +450,89 @@ public sealed class GatewayEndToEndFakeWorkerSmokeTests
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// A fake worker launcher whose scripted worker automatically responds to control
|
||||
/// commands (Ping, GetWorkerInfo, DrainEvents) using <see cref="FakeWorkerHarness.RespondToControlCommandAsync"/>
|
||||
/// and sends a shutdown ack when the gateway closes the session. Exposes
|
||||
/// <see cref="WaitForNextControlCommandAsync"/> so the test can drive the interaction
|
||||
/// one command at a time without races.
|
||||
/// </summary>
|
||||
private sealed class ControlCommandFakeWorkerProcessLauncher : IWorkerProcessLauncher
|
||||
{
|
||||
public const int ProcessId = 5590;
|
||||
|
||||
private readonly FakeWorkerProcess _process = new(ProcessId);
|
||||
private readonly SemaphoreSlim _commandHandled = new(0);
|
||||
|
||||
/// <summary>Gets the task backing the scripted worker loop.</summary>
|
||||
public Task WorkerTask { get; private set; } = Task.CompletedTask;
|
||||
|
||||
/// <inheritdoc />
|
||||
public Task<WorkerProcessHandle> LaunchAsync(
|
||||
WorkerProcessLaunchRequest request,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
WorkerTask = RunWorkerAsync(request, cancellationToken);
|
||||
|
||||
return Task.FromResult(new WorkerProcessHandle(
|
||||
_process,
|
||||
new WorkerProcessCommandLine("fake-control-worker.exe", []),
|
||||
DateTimeOffset.UtcNow));
|
||||
}
|
||||
|
||||
/// <summary>Waits until the scripted worker has responded to one control command.</summary>
|
||||
/// <param name="timeout">Maximum time to wait.</param>
|
||||
public async Task WaitForNextControlCommandAsync(TimeSpan timeout)
|
||||
{
|
||||
using CancellationTokenSource cts = new(timeout);
|
||||
await _commandHandled.WaitAsync(cts.Token).ConfigureAwait(false);
|
||||
}
|
||||
|
||||
private async Task RunWorkerAsync(
|
||||
WorkerProcessLaunchRequest request,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
await using FakeWorkerHarness harness = await FakeWorkerHarness.ConnectToGatewayPipeAsync(
|
||||
request.SessionId,
|
||||
request.Nonce,
|
||||
request.PipeName,
|
||||
request.ProtocolVersion,
|
||||
cancellationToken: cancellationToken).ConfigureAwait(false);
|
||||
await harness.CompleteStartupAsync(ProcessId, cancellationToken: cancellationToken).ConfigureAwait(false);
|
||||
|
||||
while (!cancellationToken.IsCancellationRequested)
|
||||
{
|
||||
WorkerEnvelope envelope = await harness
|
||||
.ReadGatewayEnvelopeAsync(cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
|
||||
if (envelope.BodyCase == WorkerEnvelope.BodyOneofCase.WorkerShutdown)
|
||||
{
|
||||
await harness.SendShutdownAckAsync(cancellationToken: cancellationToken).ConfigureAwait(false);
|
||||
_process.MarkExited(0);
|
||||
return;
|
||||
}
|
||||
|
||||
if (envelope.BodyCase == WorkerEnvelope.BodyOneofCase.WorkerCommand)
|
||||
{
|
||||
MxCommandKind kind = envelope.WorkerCommand?.Command?.Kind ?? MxCommandKind.Unspecified;
|
||||
if (kind is MxCommandKind.Ping or MxCommandKind.GetSessionState
|
||||
or MxCommandKind.GetWorkerInfo or MxCommandKind.DrainEvents
|
||||
or MxCommandKind.ShutdownWorker)
|
||||
{
|
||||
await harness.RespondToControlCommandAsync(envelope, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
_commandHandled.Release();
|
||||
continue;
|
||||
}
|
||||
}
|
||||
|
||||
throw new InvalidOperationException(
|
||||
$"ControlCommandFakeWorkerProcessLauncher received unexpected envelope {envelope.BodyCase}.");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
private sealed class FakeWorkerProcess(int processId) : IWorkerProcess
|
||||
{
|
||||
private readonly TaskCompletionSource _exited = new(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||
|
||||
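`ControlCommandFakeWorkerProcessLauncher` lets the test step through commands one at a time by pairing a `SemaphoreSlim` that starts at zero with a timeout-bound wait. A minimal standalone sketch of that handshake follows; the `commandHandled` and `WaitForNextAsync` names and the `Task.Run` worker are stand-ins for illustration, not the test fixture's code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Starts at 0 so every wait blocks until the worker side signals one handled command.
SemaphoreSlim commandHandled = new(0);

// Hypothetical counterpart of WaitForNextControlCommandAsync: wait for one signal,
// cancelling (OperationCanceledException) if the timeout elapses first.
async Task WaitForNextAsync(TimeSpan timeout)
{
    using CancellationTokenSource cts = new(timeout);
    await commandHandled.WaitAsync(cts.Token);
}

// Stand-in for the scripted worker loop: signals once per command it has answered.
Task worker = Task.Run(() =>
{
    commandHandled.Release();
    commandHandled.Release();
});

await WaitForNextAsync(TimeSpan.FromSeconds(5));
await WaitForNextAsync(TimeSpan.FromSeconds(5));
await worker;

int unconsumedSignals = commandHandled.CurrentCount;
```

Because the semaphore counts releases, a signal raised before the test starts waiting is not lost, which is what lets the test issue a command and then wait without racing the worker.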
@@ -9,7 +9,6 @@ using ZB.MOM.WW.MxGateway.Server.Grpc;
|
||||
using ZB.MOM.WW.MxGateway.Server.Metrics;
|
||||
using ZB.MOM.WW.MxGateway.Server.Sessions;
|
||||
using ZB.MOM.WW.MxGateway.Server.Workers;
|
||||
using ZB.MOM.WW.MxGateway.Tests.TestSupport;
|
||||
|
||||
namespace ZB.MOM.WW.MxGateway.Tests.Gateway.Grpc;
|
||||
|
||||
@@ -157,59 +156,99 @@ public sealed class EventStreamServiceTests
|
||||
await WaitUntilAsync(() => metrics.GetSnapshot().GrpcEventStreamQueueDepth == 0);
|
||||
}
|
||||
|
||||
/// <summary>Verifies that event queue overflow faults the session and reports the overflow metric.</summary>
|
||||
/// <summary>
|
||||
/// Re-targeted in Task 5: a per-subscriber channel overflow in the session's
|
||||
/// <see cref="SessionEventDistributor"/> faults the whole session under the legacy
|
||||
/// single-subscriber FailFast policy (the default, single-subscriber mode) and records
|
||||
/// the overflow + fault metrics. The distributor completes this subscriber's channel
|
||||
/// with the overflow fault, which surfaces here as the same
|
||||
/// <see cref="SessionManagerErrorCode.EventQueueOverflow"/> the pre-epic per-RPC
|
||||
/// overflow produced.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task StreamEventsAsync_WhenStreamQueueOverflows_FaultsSessionAndReportsOverflow()
|
||||
{
|
||||
FakeWorkerClient workerClient = new();
|
||||
GatewaySession session = CreateReadySession(workerClient);
|
||||
using GatewayMetrics metrics = new();
|
||||
GatewaySession session = CreateReadySession(
|
||||
workerClient,
|
||||
queueCapacity: 1,
|
||||
metrics: metrics,
|
||||
backpressurePolicy: EventBackpressurePolicy.FailFast);
|
||||
EventStreamService service = CreateService(
|
||||
new FakeSessionManager(session),
|
||||
metrics,
|
||||
queueCapacity: 1);
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence: 1, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence: 2, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence: 3, MxEventFamily.OnDataChange));
|
||||
for (ulong sequence = 1; sequence <= 50; sequence++)
|
||||
{
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence, MxEventFamily.OnDataChange));
|
||||
}
|
||||
|
||||
workerClient.CompleteAfterConfiguredEvents = true;
|
||||
await using IAsyncEnumerator<MxEvent> subscriber = service
|
||||
.StreamEventsAsync(CreateRequest(session.SessionId), CancellationToken.None)
|
||||
.GetAsyncEnumerator();
|
||||
|
||||
Assert.True(await subscriber.MoveNextAsync().AsTask().WaitAsync(TestTimeout));
|
||||
await WaitUntilAsync(() => session.State == SessionState.Faulted);
|
||||
// The pump fans 50 events into a capacity-1 subscriber channel faster than this
|
||||
// single reader drains, so one of the reads observes the terminal overflow fault.
|
||||
SessionManagerException exception = await Assert.ThrowsAsync<SessionManagerException>(
|
||||
async () => await subscriber.MoveNextAsync().AsTask().WaitAsync(TestTimeout));
|
||||
async () =>
|
||||
{
|
||||
while (await subscriber.MoveNextAsync().AsTask().WaitAsync(TestTimeout))
|
||||
{
|
||||
}
|
||||
});
|
||||
|
||||
Assert.Equal(SessionManagerErrorCode.EventQueueOverflow, exception.ErrorCode);
|
||||
await WaitUntilAsync(() => session.State == SessionState.Faulted);
|
||||
Assert.Equal(SessionState.Faulted, session.State);
|
||||
Assert.Equal(1, metrics.GetSnapshot().QueueOverflows);
|
||||
Assert.Equal(1, metrics.GetSnapshot().Faults);
|
||||
GatewayMetricsSnapshot snapshot = metrics.GetSnapshot();
|
||||
Assert.Equal(1, snapshot.QueueOverflows);
|
||||
Assert.Equal(1, snapshot.Faults);
|
||||
// The finally block in StreamEventsAsync calls StreamDisconnected("Detached") on the
|
||||
// overflow+fault path too; pin it here so a regression removing that call is caught.
|
||||
Assert.Equal(1, snapshot.StreamDisconnects);
|
||||
}
|
||||
|
||||
/// <summary>Verifies that the disconnect backpressure policy disconnects the subscriber without faulting the session.</summary>
|
||||
/// <summary>
|
||||
/// Re-targeted in Task 5: under the DisconnectSubscriber policy a per-subscriber
|
||||
/// channel overflow disconnects only that subscriber's stream (terminal
|
||||
/// <see cref="SessionManagerErrorCode.EventQueueOverflow"/>) and records the overflow
|
||||
/// metric, but leaves the session <see cref="SessionState.Ready"/> and records no
|
||||
/// fault. The session, pump, and any other subscribers are unaffected.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task StreamEventsAsync_WhenStreamQueueOverflowsWithDisconnectPolicy_LeavesSessionReady()
|
||||
{
|
||||
FakeWorkerClient workerClient = new();
|
||||
GatewaySession session = CreateReadySession(workerClient);
|
||||
using GatewayMetrics metrics = new();
|
||||
GatewaySession session = CreateReadySession(
|
||||
workerClient,
|
||||
queueCapacity: 1,
|
||||
metrics: metrics,
|
||||
backpressurePolicy: EventBackpressurePolicy.DisconnectSubscriber);
|
||||
EventStreamService service = CreateService(
|
||||
new FakeSessionManager(session),
|
||||
metrics,
|
||||
queueCapacity: 1,
|
||||
backpressurePolicy: EventBackpressurePolicy.DisconnectSubscriber);
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence: 1, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence: 2, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence: 3, MxEventFamily.OnDataChange));
|
||||
for (ulong sequence = 1; sequence <= 50; sequence++)
|
||||
{
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence, MxEventFamily.OnDataChange));
|
||||
}
|
||||
|
||||
workerClient.CompleteAfterConfiguredEvents = true;
|
||||
await using IAsyncEnumerator<MxEvent> subscriber = service
|
||||
.StreamEventsAsync(CreateRequest(session.SessionId), CancellationToken.None)
|
||||
.GetAsyncEnumerator();
|
||||
|
||||
Assert.True(await subscriber.MoveNextAsync().AsTask().WaitAsync(TestTimeout));
|
||||
SessionManagerException exception = await Assert.ThrowsAsync<SessionManagerException>(
|
||||
async () => await subscriber.MoveNextAsync().AsTask().WaitAsync(TestTimeout));
|
||||
async () =>
|
||||
{
|
||||
while (await subscriber.MoveNextAsync().AsTask().WaitAsync(TestTimeout))
|
||||
{
|
||||
}
|
||||
});
|
||||
|
||||
Assert.Equal(SessionManagerErrorCode.EventQueueOverflow, exception.ErrorCode);
|
||||
Assert.Equal(SessionState.Ready, session.State);
|
||||
@@ -261,81 +300,11 @@ public sealed class EventStreamServiceTests
|
||||
Assert.Equal(1, metrics.GetSnapshot().Faults);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Tests-026 regression: <see cref="EventStreamService.StreamEventsAsync"/>
|
||||
/// must mirror every yielded event to the
|
||||
/// <see cref="ZB.MOM.WW.MxGateway.Server.Dashboard.Hubs.IDashboardEventBroadcaster"/>
|
||||
/// seam (the only path that fans events out to dashboard SignalR clients).
|
||||
/// A regression that silently dropped the <c>Publish</c> call — e.g. an
|
||||
/// <c>if</c> accidentally added around it, or the broadcaster ctor
|
||||
/// parameter being removed — would have produced no failing test before
|
||||
/// this fixture existed. The recording fake captures every call and we
|
||||
/// assert one publish per yielded event, with the correct session id and
|
||||
/// preserved <c>WorkerSequence</c>.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task StreamEventsAsync_PublishesEachEventToDashboardBroadcaster()
|
||||
{
|
||||
FakeWorkerClient workerClient = new();
|
||||
GatewaySession session = CreateReadySession(workerClient);
|
||||
RecordingDashboardEventBroadcaster recordingBroadcaster = new();
|
||||
EventStreamService service = CreateService(
|
||||
new FakeSessionManager(session),
|
||||
dashboardEventBroadcaster: recordingBroadcaster);
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence: 7, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence: 8, MxEventFamily.OnWriteComplete));
|
||||
workerClient.CompleteAfterConfiguredEvents = true;
|
||||
|
||||
List<MxEvent> events = await CollectEventsAsync(service, session.SessionId);
|
||||
|
||||
Assert.Equal([7UL, 8UL], events.Select(mxEvent => mxEvent.WorkerSequence).ToArray());
|
||||
IReadOnlyList<DashboardEventCapture> captures = recordingBroadcaster.Captures;
|
||||
Assert.Equal(2, captures.Count);
|
||||
Assert.All(captures, capture => Assert.Equal(session.SessionId, capture.SessionId));
|
||||
Assert.Equal([7UL, 8UL], captures.Select(capture => capture.MxEvent.WorkerSequence).ToArray());
|
||||
Assert.Equal(MxEventFamily.OnDataChange, captures[0].MxEvent.Family);
|
||||
Assert.Equal(MxEventFamily.OnWriteComplete, captures[1].MxEvent.Family);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Server-041 regression: <see cref="EventStreamService"/> must not
|
||||
/// abort the gRPC stream when the dashboard broadcaster throws.
|
||||
/// <c>IDashboardEventBroadcaster.Publish</c> is documented as
|
||||
/// best-effort and never-throw, but the gRPC consumer cannot rely on
|
||||
/// implementation discipline alone — the seam itself swallows the
|
||||
/// fault and logs at debug, mirroring the broadcaster's own
|
||||
/// continuation handler. Without the wrap, the producer loop would
|
||||
/// surface the exception and the client would see a faulted stream
|
||||
/// for a dashboard-mirror failure.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task StreamEventsAsync_WhenDashboardBroadcasterThrows_StillYieldsEventsAndDoesNotFaultSession()
|
||||
{
|
||||
FakeWorkerClient workerClient = new();
|
||||
GatewaySession session = CreateReadySession(workerClient);
|
||||
using GatewayMetrics metrics = new();
|
||||
ThrowingDashboardEventBroadcaster throwingBroadcaster = new();
|
||||
EventStreamService service = CreateService(
|
||||
new FakeSessionManager(session),
|
||||
metrics,
|
||||
dashboardEventBroadcaster: throwingBroadcaster);
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence: 1, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(sequence: 2, MxEventFamily.OnDataChange));
|
||||
workerClient.CompleteAfterConfiguredEvents = true;
|
||||
|
||||
List<MxEvent> events = await CollectEventsAsync(service, session.SessionId);
|
||||
|
||||
Assert.Equal([1UL, 2UL], events.Select(mxEvent => mxEvent.WorkerSequence).ToArray());
|
||||
Assert.Equal(2, throwingBroadcaster.PublishAttempts);
|
||||
Assert.NotEqual(SessionState.Faulted, session.State);
|
||||
}
|
||||
|
||||
private static EventStreamService CreateService(
|
||||
FakeSessionManager sessionManager,
|
||||
GatewayMetrics? metrics = null,
|
||||
int queueCapacity = 8,
|
||||
EventBackpressurePolicy backpressurePolicy = EventBackpressurePolicy.FailFast,
|
||||
ZB.MOM.WW.MxGateway.Server.Dashboard.Hubs.IDashboardEventBroadcaster? dashboardEventBroadcaster = null)
|
||||
EventBackpressurePolicy backpressurePolicy = EventBackpressurePolicy.FailFast)
|
||||
{
|
||||
return new EventStreamService(
|
||||
sessionManager,
|
||||
@@ -347,25 +316,7 @@ public sealed class EventStreamServiceTests
|
||||
BackpressurePolicy = backpressurePolicy,
|
||||
},
|
||||
}),
|
||||
new MxAccessGrpcMapper(),
|
||||
metrics ?? new GatewayMetrics(),
|
||||
dashboardEventBroadcaster ?? NullDashboardEventBroadcaster.Instance,
|
||||
NullLogger<EventStreamService>.Instance);
|
||||
}
|
||||
|
||||
private sealed class ThrowingDashboardEventBroadcaster : ZB.MOM.WW.MxGateway.Server.Dashboard.Hubs.IDashboardEventBroadcaster
|
||||
{
|
||||
/// <summary>Gets the count of publish attempts.</summary>
|
||||
public int PublishAttempts { get; private set; }
|
||||
|
||||
/// <summary>Increments the attempt count and throws a simulated failure.</summary>
|
||||
/// <param name="sessionId">The session identifier.</param>
|
||||
/// <param name="mxEvent">The event to publish.</param>
|
||||
public void Publish(string sessionId, MxEvent mxEvent)
|
||||
{
|
||||
PublishAttempts++;
|
||||
throw new InvalidOperationException("simulated dashboard broadcaster failure");
|
||||
}
|
||||
metrics ?? new GatewayMetrics());
|
||||
}
|
||||
|
||||
private static async Task<List<MxEvent>> CollectEventsAsync(
|
||||
@@ -393,20 +344,39 @@ public sealed class EventStreamServiceTests
|
||||
|
||||
private static GatewaySession CreateReadySession(
|
||||
FakeWorkerClient workerClient,
|
||||
string sessionId = "session-events")
|
||||
string sessionId = "session-events",
|
||||
int queueCapacity = 8,
|
||||
GatewayMetrics? metrics = null,
|
||||
EventBackpressurePolicy backpressurePolicy = EventBackpressurePolicy.FailFast)
|
||||
{
|
||||
// The per-subscriber overflow policy now lives in the session's
|
||||
// SessionEventDistributor, so the session must share the same metrics sink and
|
||||
// backpressure policy the overflow assertions observe. queueCapacity flows into the
|
||||
// distributor's per-subscriber channel bound, which is what overflows.
|
||||
GatewaySession session = new(
|
||||
sessionId,
|
||||
GatewayContractInfo.DefaultBackendName,
|
||||
"pipe",
|
||||
"nonce",
|
||||
"client",
|
||||
ownerKeyId: null,
|
||||
"client-session",
|
||||
"client-correlation",
|
||||
TimeSpan.FromSeconds(30),
|
||||
TimeSpan.FromSeconds(30),
|
||||
TimeSpan.FromSeconds(10),
|
||||
DateTimeOffset.UtcNow);
|
||||
TimeSpan.FromMinutes(30),
|
||||
DateTimeOffset.UtcNow,
|
||||
new SessionEventStreaming(
|
||||
new MxAccessGrpcMapper(),
|
||||
new EventOptions
|
||||
{
|
||||
QueueCapacity = queueCapacity,
|
||||
BackpressurePolicy = backpressurePolicy,
|
||||
},
|
||||
NullLogger<SessionEventDistributor>.Instance,
|
||||
TimeProvider.System,
|
||||
metrics ?? new GatewayMetrics()));
|
||||
session.AttachWorkerClient(workerClient);
|
||||
session.MarkReady();
|
||||
|
||||
@@ -471,6 +441,7 @@ public sealed class EventStreamServiceTests
|
||||
public Task<GatewaySession> OpenSessionAsync(
|
||||
SessionOpenRequest request,
|
||||
string? clientIdentity,
|
||||
string? ownerKeyId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
return Task.FromResult(_sessions.Values.First());
|
||||
|
||||
@@ -1,5 +1,4 @@
|
||||
using Grpc.Core;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
using ZB.MOM.WW.MxGateway.Server.Dashboard;
|
||||
using ZB.MOM.WW.MxGateway.Server.Galaxy;
|
||||
@@ -217,8 +216,7 @@ public sealed class GalaxyRepositoryGrpcServiceTests
|
||||
new global::ZB.MOM.WW.MxGateway.Server.Galaxy.GalaxyRepository(options),
|
||||
new StubGalaxyHierarchyCache(entry),
|
||||
new GalaxyDeployNotifier(),
|
||||
new GatewayRequestIdentityAccessor(),
|
||||
NullLogger<GalaxyRepositoryGrpcService>.Instance);
|
||||
new GatewayRequestIdentityAccessor());
|
||||
}
|
||||
|
||||
private static GalaxyHierarchyCacheEntry CreateEntry(IReadOnlyList<GalaxyObject> objects)
|
||||
@@ -366,8 +364,7 @@ public sealed class GalaxyRepositoryGrpcServiceTests
|
||||
new global::ZB.MOM.WW.MxGateway.Server.Galaxy.GalaxyRepository(options),
|
||||
new NeverLoadsHierarchyCache(),
|
||||
new GalaxyDeployNotifier(),
|
||||
new GatewayRequestIdentityAccessor(),
|
||||
NullLogger<GalaxyRepositoryGrpcService>.Instance);
|
||||
new GatewayRequestIdentityAccessor());
|
||||
|
||||
// No caller-supplied CT so WaitForCacheBootstrap exits via its 5s internal budget
|
||||
// (instead of re-throwing OperationCanceledException from the caller's CT). The
|
||||
@@ -448,8 +445,7 @@ public sealed class GalaxyRepositoryGrpcServiceTests
|
||||
new global::ZB.MOM.WW.MxGateway.Server.Galaxy.GalaxyRepository(options),
|
||||
new StubGalaxyHierarchyCache(CreateEntry(CreateFilterObjects())),
|
||||
new GalaxyDeployNotifier(),
|
||||
identityAccessor,
|
||||
NullLogger<GalaxyRepositoryGrpcService>.Instance);
|
||||
identityAccessor);
|
||||
|
||||
// Sanity: with no identity pushed, both Pump and Valve come back under Line3 (id=2).
|
||||
BrowseChildrenReply unconstrained = await service.BrowseChildren(
|
||||
|
||||
@@ -548,6 +548,33 @@ public sealed class MxAccessGatewayServiceConstraintTests
|
||||
Assert.Equal("42", enforcer.RecordedDenials[0].Target);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// End-to-end wiring (M-2): the per-request <c>ClientCorrelationId</c> must propagate
|
||||
/// all the way through <c>Invoke</c> -> <c>ApplyConstraintsAsync</c> -> the unary write
|
||||
/// enforce helper -> <c>RecordDenialAsync</c>, so the recorded denial carries the exact
|
||||
/// id the client sent (including non-GUID trace ids used by Rust/Python/Java clients).
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task Invoke_Write_WithDeniedHandle_ThreadsClientCorrelationIdIntoRecordedDenial()
|
||||
{
|
||||
const string CorrelationId = "rust-client-Write-7";
|
||||
PredicateConstraintEnforcer enforcer = new()
|
||||
{
|
||||
DenyWriteHandle = (serverHandle, itemHandle) => serverHandle == 7 && itemHandle == 42,
|
||||
};
|
||||
FakeSessionManager sessionManager = CreateSessionManagerWithSeed();
|
||||
MxAccessGatewayService service = CreateService(sessionManager, enforcer);
|
||||
|
||||
MxCommandRequest request = CreateWriteRequest(serverHandle: 7, itemHandle: 42);
|
||||
request.ClientCorrelationId = CorrelationId;
|
||||
|
||||
await Assert.ThrowsAsync<RpcException>(
|
||||
async () => await service.Invoke(request, new TestServerCallContext()));
|
||||
|
||||
Assert.Single(enforcer.RecordedDenials);
|
||||
Assert.Equal(CorrelationId, enforcer.RecordedDenials[0].CorrelationId);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Unary <c>WriteSecured</c> against a denied handle takes the same enforce path
|
||||
/// and rejects identically — proving the four-arm switch in
|
||||
@@ -857,10 +884,12 @@ public sealed class MxAccessGatewayServiceConstraintTests
|
||||
/// <summary>Opens a test session asynchronously.</summary>
|
||||
/// <param name="request">The session open request.</param>
|
||||
/// <param name="clientIdentity">The client identity, if any.</param>
|
||||
/// <param name="ownerKeyId">The API key identifier of the caller, if any.</param>
|
||||
/// <param name="cancellationToken">Token to observe for cancellation.</param>
|
||||
public Task<GatewaySession> OpenSessionAsync(
|
||||
SessionOpenRequest request,
|
||||
string? clientIdentity,
|
||||
string? ownerKeyId,
|
||||
CancellationToken cancellationToken) =>
|
||||
Task.FromResult(seededSessions.Values.First());
|
||||
|
||||
|
||||
@@ -45,6 +45,7 @@ public sealed class MxAccessGatewayServiceTests
|
||||
Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
|
||||
Assert.Contains("unary-invoke", reply.Capabilities);
|
||||
Assert.Equal("Operator Key", sessionManager.LastClientIdentity);
|
||||
Assert.Equal("operator01", sessionManager.LastOwnerKeyId);
|
||||
Assert.Equal("operator-session", sessionManager.LastOpenRequest?.ClientSessionName);
|
||||
}
|
||||
|
||||
@@ -508,6 +509,9 @@ public sealed class MxAccessGatewayServiceTests
|
||||
/// <summary>The last client identity passed to OpenSessionAsync.</summary>
|
||||
public string? LastClientIdentity { get; private set; }
|
||||
|
||||
/// <summary>The last owner key id passed to OpenSessionAsync.</summary>
|
||||
public string? LastOwnerKeyId { get; private set; }
|
||||
|
||||
/// <summary>The last session ID passed to ReadEventsAsync.</summary>
|
||||
public string? LastReadEventsSessionId { get; private set; }
|
||||
|
||||
@@ -545,10 +549,12 @@ public sealed class MxAccessGatewayServiceTests
|
||||
public Task<GatewaySession> OpenSessionAsync(
|
||||
SessionOpenRequest request,
|
||||
string? clientIdentity,
|
||||
string? ownerKeyId,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
LastOpenRequest = request;
|
||||
LastClientIdentity = clientIdentity;
|
||||
LastOwnerKeyId = ownerKeyId;
|
||||
|
||||
return Task.FromResult(OpenSessionResult ?? CreateSession("session-1", processId: 1234));
|
||||
}
|
||||
|
||||
@@ -0,0 +1,323 @@
|
||||
using System.Runtime.CompilerServices;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using Microsoft.Extensions.Options;
|
||||
using ZB.MOM.WW.MxGateway.Contracts;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Server.Configuration;
|
||||
using ZB.MOM.WW.MxGateway.Server.Dashboard.Hubs;
|
||||
using ZB.MOM.WW.MxGateway.Server.Grpc;
|
||||
using ZB.MOM.WW.MxGateway.Server.Metrics;
|
||||
using ZB.MOM.WW.MxGateway.Server.Sessions;
|
||||
using ZB.MOM.WW.MxGateway.Server.Workers;
|
||||
using ZB.MOM.WW.MxGateway.Tests.TestSupport;
|
||||
|
||||
namespace ZB.MOM.WW.MxGateway.Tests.Gateway.Sessions;
|
||||
|
||||
/// <summary>
|
||||
/// Task 6 regression tests for the internal dashboard mirror. The dashboard is a
|
||||
/// first-class subscriber on the session's <see cref="SessionEventDistributor"/>, so it
|
||||
/// receives session events whether or not a gRPC client is streaming — fixing the
|
||||
/// "dark feed" where the dashboard only saw events while a gRPC client was actively
|
||||
/// streaming (the inline per-RPC tap removed by this task).
|
||||
/// </summary>
|
||||
public sealed class GatewaySessionDashboardMirrorTests
|
||||
{
|
||||
private static readonly TimeSpan TestTimeout = TimeSpan.FromSeconds(5);
|
||||
|
||||
/// <summary>
|
||||
/// The KEY bug-fix test: the dashboard broadcaster receives session events even when
|
||||
/// NO gRPC <c>StreamEvents</c> subscriber is attached. The session is driven to Ready
|
||||
/// with a fake worker emitting events; only the internal dashboard subscriber exists.
|
||||
/// Before Task 6 the mirror lived inside the per-RPC gRPC loop, so with no gRPC
|
||||
/// subscriber the dashboard saw nothing.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task DashboardMirror_ReceivesEvents_WithNoGrpcSubscriber()
|
||||
{
|
||||
FakeWorkerClient workerClient = new();
|
||||
workerClient.Events.Add(CreateWorkerEvent(10, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(11, MxEventFamily.OnWriteComplete));
|
||||
workerClient.CompleteAfterConfiguredEvents = true;
|
||||
RecordingDashboardEventBroadcaster broadcaster = new();
|
||||
|
||||
await using GatewaySession session = CreateSession(workerClient, broadcaster);
|
||||
session.AttachWorkerClient(workerClient);
|
||||
|
||||
// MarkReady starts the internal dashboard mirror; no gRPC subscriber is ever attached.
|
||||
session.MarkReady();
|
||||
|
||||
await WaitUntilAsync(() => broadcaster.Captures.Count == 2);
|
||||
|
||||
IReadOnlyList<DashboardEventCapture> captures = broadcaster.Captures;
|
||||
Assert.Equal(0, session.ActiveEventSubscriberCount);
|
||||
Assert.Equal([10UL, 11UL], captures.Select(capture => capture.MxEvent.WorkerSequence).ToArray());
|
||||
Assert.All(captures, capture => Assert.Equal(session.SessionId, capture.SessionId));
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// A gRPC subscriber and the dashboard both receive every event concurrently. The
|
||||
/// gRPC path is no longer the dashboard's source — both read independent leases fed by
|
||||
/// the single distributor pump.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task DashboardMirror_AndGrpcSubscriber_BothReceiveEvents()
|
||||
{
|
||||
FakeWorkerClient workerClient = new();
|
||||
workerClient.Events.Add(CreateWorkerEvent(1, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(2, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(3, MxEventFamily.OnWriteComplete));
|
||||
workerClient.CompleteAfterConfiguredEvents = true;
|
||||
RecordingDashboardEventBroadcaster broadcaster = new();
|
||||
|
||||
await using GatewaySession session = CreateSession(workerClient, broadcaster);
|
||||
session.AttachWorkerClient(workerClient);
|
||||
session.MarkReady();
|
||||
|
||||
EventStreamService service = new(
|
||||
new SingleSessionManager(session),
|
||||
Options.Create(new GatewayOptions { Events = new EventOptions { QueueCapacity = 8 } }),
|
||||
new GatewayMetrics());
|
||||
|
||||
List<MxEvent> grpcEvents = [];
|
||||
await foreach (MxEvent mxEvent in service
|
||||
.StreamEventsAsync(new StreamEventsRequest { SessionId = session.SessionId }, CancellationToken.None)
|
||||
.WithCancellation(CancellationToken.None))
|
||||
{
|
||||
grpcEvents.Add(mxEvent);
|
||||
}
|
||||
|
||||
await WaitUntilAsync(() => broadcaster.Captures.Count == 3);
|
||||
|
||||
Assert.Equal([1UL, 2UL, 3UL], grpcEvents.Select(mxEvent => mxEvent.WorkerSequence).ToArray());
|
||||
Assert.Equal([1UL, 2UL, 3UL], broadcaster.Captures.Select(capture => capture.MxEvent.WorkerSequence).ToArray());
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Task 4 hazard guard: starting the pump at Ready with a fast-completing worker stream
|
||||
/// and zero subscribers used to drain into nothing and leave a later subscriber hanging.
|
||||
/// Now the dashboard subscriber is registered BEFORE the pump starts, so even a worker
|
||||
/// stream that completes immediately delivers every event to the dashboard with no hang.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task DashboardMirror_FastCompletingWorkerStream_DeliversAllEventsWithoutHang()
|
||||
{
|
||||
FakeWorkerClient workerClient = new();
|
||||
workerClient.Events.Add(CreateWorkerEvent(1, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(2, MxEventFamily.OnDataChange));
|
||||
workerClient.CompleteAfterConfiguredEvents = true;
|
||||
RecordingDashboardEventBroadcaster broadcaster = new();
|
||||
|
||||
await using GatewaySession session = CreateSession(workerClient, broadcaster);
|
||||
session.AttachWorkerClient(workerClient);
|
||||
session.MarkReady();
|
||||
|
||||
await WaitUntilAsync(() => broadcaster.Captures.Count == 2);
|
||||
Assert.Equal([1UL, 2UL], broadcaster.Captures.Select(capture => capture.MxEvent.WorkerSequence).ToArray());
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// The seam must also treat the dashboard Publish as never-throw: a throwing broadcaster
|
||||
/// must not fault the session or stop the mirror from continuing past the failure.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task DashboardMirror_WhenBroadcasterThrows_DoesNotFaultSessionAndKeepsMirroring()
|
||||
{
|
||||
FakeWorkerClient workerClient = new();
|
||||
workerClient.Events.Add(CreateWorkerEvent(1, MxEventFamily.OnDataChange));
|
||||
workerClient.Events.Add(CreateWorkerEvent(2, MxEventFamily.OnDataChange));
|
||||
workerClient.CompleteAfterConfiguredEvents = true;
|
||||
ThrowingDashboardEventBroadcaster broadcaster = new();
|
||||
|
||||
await using GatewaySession session = CreateSession(workerClient, broadcaster);
|
||||
session.AttachWorkerClient(workerClient);
|
||||
session.MarkReady();
|
||||
|
||||
await WaitUntilAsync(() => broadcaster.PublishAttempts == 2);
|
||||
Assert.NotEqual(SessionState.Faulted, session.State);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// The internal dashboard subscriber must NOT count against the single-subscriber
|
||||
/// guard: a gRPC subscriber can still attach while the dashboard mirror is running.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task DashboardMirror_DoesNotCountAgainstSingleSubscriberGuard()
|
||||
{
|
||||
FakeWorkerClient workerClient = new();
|
||||
RecordingDashboardEventBroadcaster broadcaster = new();
|
||||
|
||||
await using GatewaySession session = CreateSession(workerClient, broadcaster);
|
||||
session.AttachWorkerClient(workerClient);
|
||||
session.MarkReady();
|
||||
|
||||
Assert.Equal(0, session.ActiveEventSubscriberCount);
|
||||
using IEventSubscriberLease lease = session.AttachEventSubscriber(allowMultipleSubscribers: false);
|
||||
Assert.Equal(1, session.ActiveEventSubscriberCount);
|
||||
}
|
||||
|
||||
private static GatewaySession CreateSession(
|
||||
IWorkerClient workerClient,
|
||||
IDashboardEventBroadcaster broadcaster)
|
||||
{
|
||||
return new GatewaySession(
|
||||
sessionId: "session-dashboard-mirror",
|
||||
backendName: GatewayContractInfo.DefaultBackendName,
|
||||
pipeName: "mxaccess-gateway-1-session-dashboard-mirror",
|
||||
nonce: "nonce",
|
||||
clientIdentity: "client-1",
|
||||
ownerKeyId: null,
|
||||
clientSessionName: "test-session",
|
||||
clientCorrelationId: "client-correlation-1",
|
||||
commandTimeout: TimeSpan.FromSeconds(5),
|
||||
startupTimeout: TimeSpan.FromSeconds(5),
|
||||
shutdownTimeout: TimeSpan.FromSeconds(5),
|
||||
leaseDuration: TimeSpan.FromMinutes(30),
|
||||
openedAt: DateTimeOffset.UtcNow,
|
||||
eventStreaming: new SessionEventStreaming(
|
||||
new MxAccessGrpcMapper(),
|
||||
new EventOptions { QueueCapacity = 8 },
|
||||
NullLogger<SessionEventDistributor>.Instance,
|
||||
TimeProvider.System,
|
||||
new GatewayMetrics(),
|
||||
broadcaster));
|
||||
}
|
||||
|
||||
private static WorkerEvent CreateWorkerEvent(ulong sequence, MxEventFamily family)
|
||||
{
|
||||
MxEvent mxEvent = new()
|
||||
{
|
||||
SessionId = "session-dashboard-mirror",
|
||||
Family = family,
|
||||
WorkerSequence = sequence,
|
||||
};
|
||||
|
||||
switch (family)
|
||||
{
|
||||
case MxEventFamily.OnDataChange:
|
||||
mxEvent.OnDataChange = new OnDataChangeEvent();
|
||||
break;
|
||||
case MxEventFamily.OnWriteComplete:
|
||||
mxEvent.OnWriteComplete = new OnWriteCompleteEvent();
|
||||
break;
|
||||
}
|
||||
|
||||
return new WorkerEvent { Event = mxEvent };
|
||||
}
|
||||
|
||||
private static async Task WaitUntilAsync(Func<bool> predicate, [CallerArgumentExpression(nameof(predicate))] string? condition = null)
|
||||
{
|
||||
using CancellationTokenSource cancellationTokenSource = new(TestTimeout);
|
||||
try
|
||||
{
|
||||
while (!predicate())
|
||||
{
|
||||
await Task.Delay(TimeSpan.FromMilliseconds(10), cancellationTokenSource.Token);
|
||||
}
|
||||
}
|
||||
catch (OperationCanceledException)
|
||||
{
|
||||
Assert.Fail($"Timed out after {TestTimeout.TotalSeconds}s waiting for: {condition}");
|
||||
}
|
||||
}
|
||||
|
||||
private sealed class ThrowingDashboardEventBroadcaster : IDashboardEventBroadcaster
|
||||
{
|
||||
private int _publishAttempts;
|
||||
|
||||
public int PublishAttempts => Volatile.Read(ref _publishAttempts);
|
||||
|
||||
public void Publish(string sessionId, MxEvent mxEvent)
|
||||
{
|
||||
Interlocked.Increment(ref _publishAttempts);
|
||||
throw new InvalidOperationException("simulated dashboard broadcaster failure");
|
||||
}
|
||||
}
|
||||
|
||||
private sealed class SingleSessionManager(GatewaySession session) : ISessionManager
|
||||
{
|
||||
public Task<GatewaySession> OpenSessionAsync(
|
||||
SessionOpenRequest request,
|
||||
string? clientIdentity,
|
||||
string? ownerKeyId,
|
||||
CancellationToken cancellationToken) => Task.FromResult(session);
|
||||
|
||||
public bool TryGetSession(string sessionId, out GatewaySession gatewaySession)
|
||||
{
|
||||
gatewaySession = session;
|
||||
return string.Equals(sessionId, session.SessionId, StringComparison.Ordinal);
|
||||
}
|
||||
|
||||
public Task<WorkerCommandReply> InvokeAsync(
|
||||
string sessionId,
|
||||
WorkerCommand command,
|
||||
CancellationToken cancellationToken) => Task.FromResult(new WorkerCommandReply());
|
||||
|
||||
public IAsyncEnumerable<WorkerEvent> ReadEventsAsync(
|
||||
string sessionId,
|
||||
CancellationToken cancellationToken) => session.ReadEventsAsync(cancellationToken);
|
||||
|
||||
public Task<SessionCloseResult> CloseSessionAsync(
|
||||
string sessionId,
|
||||
CancellationToken cancellationToken) =>
|
||||
Task.FromResult(new SessionCloseResult(sessionId, SessionState.Closed, AlreadyClosed: false));
|
||||
|
||||
public Task<SessionCloseResult> KillWorkerAsync(
|
||||
string sessionId,
|
||||
string reason,
|
||||
CancellationToken cancellationToken) =>
|
||||
Task.FromResult(new SessionCloseResult(sessionId, SessionState.Closed, AlreadyClosed: false));
|
||||
|
||||
public Task<int> CloseExpiredLeasesAsync(
|
||||
DateTimeOffset now,
|
||||
CancellationToken cancellationToken) => Task.FromResult(0);
|
||||
|
||||
public Task ShutdownAsync(CancellationToken cancellationToken) => Task.CompletedTask;
|
||||
}
|
||||
|
||||
private sealed class FakeWorkerClient : IWorkerClient
|
||||
{
|
||||
public List<WorkerEvent> Events { get; } = [];
|
||||
|
||||
public bool CompleteAfterConfiguredEvents { get; set; }
|
||||
|
||||
public string SessionId { get; } = "session-dashboard-mirror";
|
||||
|
||||
public int? ProcessId { get; } = 1234;
|
||||
|
||||
public WorkerClientState State { get; } = WorkerClientState.Ready;
|
||||
|
||||
public DateTimeOffset LastHeartbeatAt { get; } = DateTimeOffset.UtcNow;
|
||||
|
||||
public Task StartAsync(CancellationToken cancellationToken) => Task.CompletedTask;
|
||||
|
||||
public Task<WorkerCommandReply> InvokeAsync(
|
||||
WorkerCommand command,
|
||||
TimeSpan timeout,
|
||||
CancellationToken cancellationToken) => Task.FromResult(new WorkerCommandReply());
|
||||
|
||||
public async IAsyncEnumerable<WorkerEvent> ReadEventsAsync(
|
||||
[EnumeratorCancellation] CancellationToken cancellationToken)
|
||||
{
|
||||
foreach (WorkerEvent workerEvent in Events)
|
||||
{
|
||||
cancellationToken.ThrowIfCancellationRequested();
|
||||
yield return workerEvent;
|
||||
}
|
||||
|
||||
if (CompleteAfterConfiguredEvents)
|
||||
{
|
||||
yield break;
|
||||
}
|
||||
|
||||
await Task.Delay(Timeout.InfiniteTimeSpan, cancellationToken);
|
||||
}
|
||||
|
||||
public Task ShutdownAsync(TimeSpan timeout, CancellationToken cancellationToken) => Task.CompletedTask;
|
||||
|
||||
public void Kill(string reason)
|
||||
{
|
||||
}
|
||||
|
||||
public ValueTask DisposeAsync() => ValueTask.CompletedTask;
|
||||
}
|
||||
}
|
||||
@@ -1,5 +1,9 @@
|
||||
using System.Runtime.CompilerServices;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Server.Configuration;
|
||||
using ZB.MOM.WW.MxGateway.Server.Grpc;
|
||||
using ZB.MOM.WW.MxGateway.Server.Metrics;
|
||||
using ZB.MOM.WW.MxGateway.Server.Sessions;
|
||||
using ZB.MOM.WW.MxGateway.Server.Workers;
|
||||
|
||||
@@ -156,6 +160,66 @@ public sealed class GatewaySessionTests
|
||||
await session.DisposeAsync();
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Issue-1 regression. Concurrent <c>Dispose()</c> calls on the same
|
||||
/// <see cref="IEventSubscriberLease"/> — as can happen when a gRPC stream
|
||||
/// completion and a client cancellation both fire at the same time — must
|
||||
/// decrement <c>_activeEventSubscriberCount</c> exactly once, never to −1.
|
||||
/// A negative count permanently blocks future subscribers because
|
||||
/// <c>AttachEventSubscriber(allowMultipleSubscribers:false)</c> gates on
|
||||
/// <c>_activeEventSubscriberCount > 0</c>. After both racing disposes finish,
|
||||
/// the count must be exactly 0 and a subsequent single-subscriber attach must
|
||||
/// succeed.
|
||||
/// </summary>
|
||||
[Fact]
|
||||
public async Task EventSubscriberLease_ConcurrentDispose_DecrementsCountExactlyOnce()
|
||||
{
|
||||
const int Concurrency = 16;
|
||||
const int Iterations = 200;
|
||||
TimeSpan testTimeout = TimeSpan.FromSeconds(10);
|
||||
|
||||
FakeWorkerClient workerClient = new();
|
||||
GatewaySession session = CreateReadySessionWithEventStreaming(workerClient);
|
||||
|
||||
for (int i = 0; i < Iterations; i++)
|
||||
{
|
||||
// Attach one subscriber; this increments _activeEventSubscriberCount to 1.
|
||||
IEventSubscriberLease lease = session.AttachEventSubscriber(
|
||||
allowMultipleSubscribers: false);
|
||||
|
||||
// Race Concurrency threads all calling Dispose() on the same lease.
|
||||
// Only one must actually run DetachEventSubscriber.
|
||||
using SemaphoreSlim gate = new(0);
|
||||
Task[] tasks = new Task[Concurrency];
|
||||
for (int t = 0; t < Concurrency; t++)
|
||||
{
|
||||
tasks[t] = Task.Run(async () =>
|
||||
{
|
||||
// All threads wait at the gate so they start as simultaneously
|
||||
// as the scheduler allows, maximising the race window.
|
||||
await gate.WaitAsync(testTimeout);
|
||||
lease.Dispose();
|
||||
});
|
||||
}
|
||||
|
||||
gate.Release(Concurrency);
|
||||
await Task.WhenAll(tasks).WaitAsync(testTimeout);
|
||||
|
||||
// Count must be exactly 0 — not negative — after all disposes.
|
||||
Assert.Equal(0, session.ActiveEventSubscriberCount);
|
||||
|
||||
// Observable contract: a fresh single subscriber must now be attachable
|
||||
// (i.e., the guard _activeEventSubscriberCount > 0 is false).
|
||||
IEventSubscriberLease next = session.AttachEventSubscriber(
|
||||
allowMultipleSubscribers: false);
|
||||
next.Dispose();
|
||||
Assert.Equal(0, session.ActiveEventSubscriberCount);
|
||||
}
|
||||
|
||||
await session.CloseAsync("test-done", CancellationToken.None);
|
||||
await session.DisposeAsync();
|
||||
}
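
    // Illustrative sketch only, not part of this change set. One common way to get the
    // exactly-once behaviour pinned by the test above is to guard Dispose() with
    // Interlocked.Exchange. SketchLease and its onDetach callback are hypothetical
    // names; the real IEventSubscriberLease implementation may be structured differently.
    internal sealed class SketchLease : IDisposable
    {
        private readonly Action _onDetach;
        private int _disposed; // 0 = live, 1 = disposed

        public SketchLease(Action onDetach) => _onDetach = onDetach;

        public void Dispose()
        {
            // Only the first caller observes 0 and runs the detach callback;
            // every racing or repeated caller observes 1 and returns immediately.
            if (Interlocked.Exchange(ref _disposed, 1) == 0)
            {
                _onDetach();
            }
        }
    }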
|
||||
|
||||
private static GatewaySession CreateReadySession(IWorkerClient workerClient)
|
||||
{
|
||||
GatewaySession session = new(
|
||||
@@ -164,6 +228,7 @@ public sealed class GatewaySessionTests
|
||||
pipeName: "mxaccess-gateway-1-session-test",
|
||||
nonce: "nonce",
|
||||
clientIdentity: "client-1",
|
||||
ownerKeyId: null,
|
||||
clientSessionName: "test-session",
|
||||
clientCorrelationId: "client-correlation-1",
|
||||
commandTimeout: TimeSpan.FromSeconds(5),
|
||||
@@ -176,6 +241,33 @@ public sealed class GatewaySessionTests
|
||||
return session;
|
||||
}
|
||||
|
||||
private static GatewaySession CreateReadySessionWithEventStreaming(IWorkerClient workerClient)
|
||||
{
|
||||
GatewaySession session = new(
|
||||
sessionId: "session-test-concurrent",
|
||||
backendName: "mxaccess",
|
||||
pipeName: "mxaccess-gateway-1-session-test-concurrent",
|
||||
nonce: "nonce",
|
||||
clientIdentity: "client-1",
|
||||
ownerKeyId: null,
|
||||
clientSessionName: "test-session",
|
||||
clientCorrelationId: "client-correlation-1",
|
||||
commandTimeout: TimeSpan.FromSeconds(5),
|
||||
startupTimeout: TimeSpan.FromSeconds(5),
|
||||
shutdownTimeout: TimeSpan.FromSeconds(5),
|
||||
leaseDuration: TimeSpan.FromMinutes(30),
|
||||
openedAt: DateTimeOffset.UtcNow,
|
||||
eventStreaming: new SessionEventStreaming(
|
||||
new MxAccessGrpcMapper(),
|
||||
new EventOptions { QueueCapacity = 8 },
|
||||
NullLogger<SessionEventDistributor>.Instance,
|
||||
TimeProvider.System,
|
||||
new GatewayMetrics()));
|
||||
session.AttachWorkerClient(workerClient);
|
||||
session.MarkReady();
|
||||
return session;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Minimal worker client that parks <see cref="ShutdownAsync"/> until the test
|
||||
/// explicitly releases it. Used to keep <see cref="GatewaySession.CloseAsync"/>
|
||||
|
||||
@@ -0,0 +1,569 @@
|
||||
using System.Threading.Channels;
|
||||
using Microsoft.Extensions.Logging.Abstractions;
|
||||
using Microsoft.Extensions.Time.Testing;
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto;
|
||||
using ZB.MOM.WW.MxGateway.Server.Sessions;
|
||||
|
||||
namespace ZB.MOM.WW.MxGateway.Tests.Gateway.Sessions;
|
||||
|
||||
/// <summary>
|
||||
/// Concurrency and fan-out tests for <see cref="SessionEventDistributor"/>, the
|
||||
/// Session Resilience epic's per-session event pump. One pump drains the source
|
||||
/// exactly once and fans every event to N independent per-subscriber channels.
|
||||
/// Every async wait is bounded so a fan-out or shutdown deadlock fails fast.
|
||||
/// </summary>
|
||||
public sealed class SessionEventDistributorTests
|
||||
{
|
||||
private static readonly TimeSpan ReadTimeout = TimeSpan.FromSeconds(5);
|
||||
|
||||
[Fact]
|
||||
public async Task TwoSubscribers_BothReceiveFannedEventsInOrder()
|
||||
{
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(source.Reader);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
using IEventSubscriberLease leaseA = distributor.Register();
|
||||
using IEventSubscriberLease leaseB = distributor.Register();
|
||||
|
||||
source.Writer.TryWrite(Event(1));
|
||||
source.Writer.TryWrite(Event(2));
|
||||
|
||||
MxEvent a1 = await ReadOneAsync(leaseA.Reader);
|
||||
MxEvent a2 = await ReadOneAsync(leaseA.Reader);
|
||||
MxEvent b1 = await ReadOneAsync(leaseB.Reader);
|
||||
MxEvent b2 = await ReadOneAsync(leaseB.Reader);
|
||||
|
||||
Assert.Equal(1ul, a1.WorkerSequence);
|
||||
Assert.Equal(2ul, a2.WorkerSequence);
|
||||
Assert.Equal(1ul, b1.WorkerSequence);
|
||||
Assert.Equal(2ul, b2.WorkerSequence);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task DisposingOneLease_StopsItsDelivery_OtherKeepsReceiving()
|
||||
{
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(source.Reader);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
IEventSubscriberLease leaseA = distributor.Register();
|
||||
using IEventSubscriberLease leaseB = distributor.Register();
|
||||
|
||||
source.Writer.TryWrite(Event(1));
|
||||
_ = await ReadOneAsync(leaseA.Reader);
|
||||
_ = await ReadOneAsync(leaseB.Reader);
|
||||
|
||||
leaseA.Dispose();
|
||||
|
||||
// A's reader must complete (no more delivery) after dispose.
|
||||
await AssertCompletedAsync(leaseA.Reader);
|
||||
|
||||
// B still receives subsequent events.
|
||||
source.Writer.TryWrite(Event(2));
|
||||
MxEvent b2 = await ReadOneAsync(leaseB.Reader);
|
||||
Assert.Equal(2ul, b2.WorkerSequence);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task SubscriberRegisteredAfterStart_ReceivesEventsEmittedAfterRegistration()
|
||||
{
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(source.Reader);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
using IEventSubscriberLease leaseA = distributor.Register();
|
||||
source.Writer.TryWrite(Event(1));
|
||||
_ = await ReadOneAsync(leaseA.Reader);
|
||||
|
||||
// Late subscriber: only sees events emitted after it registered.
|
||||
using IEventSubscriberLease leaseB = distributor.Register();
|
||||
source.Writer.TryWrite(Event(2));
|
||||
|
||||
MxEvent b = await ReadOneAsync(leaseB.Reader);
|
||||
Assert.Equal(2ul, b.WorkerSequence);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task DisposingDistributor_CompletesAllSubscriberChannels_AndStopsPump()
|
||||
{
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
SessionEventDistributor distributor = CreateDistributor(source.Reader);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
using IEventSubscriberLease leaseA = distributor.Register();
|
||||
using IEventSubscriberLease leaseB = distributor.Register();
|
||||
|
||||
// Bounded so a shutdown hang fails fast.
|
||||
await distributor.DisposeAsync().AsTask().WaitAsync(ReadTimeout);
|
||||
|
||||
await AssertCompletedAsync(leaseA.Reader);
|
||||
await AssertCompletedAsync(leaseB.Reader);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task Register_AfterDispose_ThrowsObjectDisposedException()
|
||||
{
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
SessionEventDistributor distributor = CreateDistributor(source.Reader);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
await distributor.DisposeAsync().AsTask().WaitAsync(ReadTimeout);
|
||||
|
||||
Assert.Throws<ObjectDisposedException>(() => distributor.Register());
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReplayBuffer_OverCapacity_EvictsOldestFirst_AndReportsGap()
|
||||
{
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(
|
||||
source.Reader,
|
||||
replayBufferCapacity: 3,
|
||||
replayRetentionSeconds: 0);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
// A live subscriber forces the pump to fan (and thereby retain) each event,
|
||||
// and gives us a deterministic point to know the pump has processed event 5.
|
||||
using IEventSubscriberLease lease = distributor.Register();
|
||||
for (ulong sequence = 1; sequence <= 5; sequence++)
|
||||
{
|
||||
source.Writer.TryWrite(Event(sequence));
|
||||
}
|
||||
|
||||
for (ulong sequence = 1; sequence <= 5; sequence++)
|
||||
{
|
||||
MxEvent e = await ReadOneAsync(lease.Reader);
|
||||
Assert.Equal(sequence, e.WorkerSequence);
|
||||
}
|
||||
|
||||
// Capacity 3 retains only the newest three: sequences 3, 4, 5. Events 1 and 2
|
||||
// were evicted, so a caller asking from 0 missed events => gap=true, and it
|
||||
// gets only the retained tail.
|
||||
bool found = distributor.TryGetReplayFrom(0, out IReadOnlyList<MxEvent> replay, out bool gap);
|
||||
|
||||
Assert.True(found);
|
||||
Assert.True(gap);
|
||||
Assert.Equal(new ulong[] { 3, 4, 5 }, replay.Select(e => e.WorkerSequence));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReplayBuffer_WithinRetainedWindow_ReturnsNewerEvents_NoGap()
|
||||
{
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(
|
||||
source.Reader,
|
||||
replayBufferCapacity: 10,
|
||||
replayRetentionSeconds: 0);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
using IEventSubscriberLease lease = distributor.Register();
|
||||
for (ulong sequence = 1; sequence <= 5; sequence++)
|
||||
{
|
||||
source.Writer.TryWrite(Event(sequence));
|
||||
_ = await ReadOneAsync(lease.Reader);
|
||||
}
|
||||
|
||||
// afterSequence 2 is still inside the retained window [1..5], so no gap and
|
||||
// exactly the newer events 3, 4, 5 come back.
|
||||
bool found = distributor.TryGetReplayFrom(2, out IReadOnlyList<MxEvent> replay, out bool gap);
|
||||
|
||||
Assert.True(found);
|
||||
Assert.False(gap);
|
||||
Assert.Equal(new ulong[] { 3, 4, 5 }, replay.Select(e => e.WorkerSequence));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReplayBuffer_AgedEntries_AreEvictedAfterRetentionElapses()
|
||||
{
|
||||
FakeTimeProvider time = new(new DateTimeOffset(2026, 1, 1, 0, 0, 0, TimeSpan.Zero));
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(
|
||||
source.Reader,
|
||||
replayBufferCapacity: 100,
|
||||
replayRetentionSeconds: 30,
|
||||
timeProvider: time);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
using IEventSubscriberLease lease = distributor.Register();
|
||||
|
||||
// Two old events, then advance the clock well past the retention window.
|
||||
source.Writer.TryWrite(Event(1));
|
||||
source.Writer.TryWrite(Event(2));
|
||||
_ = await ReadOneAsync(lease.Reader);
|
||||
_ = await ReadOneAsync(lease.Reader);
|
||||
|
||||
time.Advance(TimeSpan.FromSeconds(60));
|
||||
|
||||
// A fresh event triggers age-eviction of the now-stale entries 1 and 2.
|
||||
source.Writer.TryWrite(Event(3));
|
||||
_ = await ReadOneAsync(lease.Reader);
|
||||
|
||||
bool found = distributor.TryGetReplayFrom(0, out IReadOnlyList<MxEvent> replay, out bool gap);
|
||||
|
||||
Assert.True(found);
|
||||
// Events 1 and 2 aged out; only 3 remains, and 0 predates the oldest retained.
|
||||
Assert.Equal(new ulong[] { 3 }, replay.Select(e => e.WorkerSequence));
|
||||
Assert.True(gap);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReplayBuffer_AfterSequenceNewerThanAllRetained_ReturnsEmpty_NoGap()
|
||||
{
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(
|
||||
source.Reader,
|
||||
replayBufferCapacity: 10,
|
||||
replayRetentionSeconds: 0);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
using IEventSubscriberLease lease = distributor.Register();
|
||||
for (ulong sequence = 1; sequence <= 3; sequence++)
|
||||
{
|
||||
source.Writer.TryWrite(Event(sequence));
|
||||
_ = await ReadOneAsync(lease.Reader);
|
||||
}
|
||||
|
||||
// afterSequence 3 is at/after the newest retained; nothing newer, and the
|
||||
// caller is fully caught up => empty list, gap=false.
|
||||
bool found = distributor.TryGetReplayFrom(3, out IReadOnlyList<MxEvent> replay, out bool gap);
|
||||
|
||||
Assert.True(found);
|
||||
Assert.False(gap);
|
||||
Assert.Empty(replay);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReplayBuffer_Capacity0_AfterSequenceBelowHighestSeen_ReportsGap_NoEvents()
|
||||
{
|
||||
// Disabled buffer: events are tracked for the highest-seen counter but not
|
||||
// retained. A caller behind the highest-seen sequence must be told to re-snapshot.
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(
|
||||
source.Reader,
|
||||
replayBufferCapacity: 0,
|
||||
replayRetentionSeconds: 0);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
using IEventSubscriberLease lease = distributor.Register();
|
||||
for (ulong sequence = 1; sequence <= 3; sequence++)
|
||||
{
|
||||
source.Writer.TryWrite(Event(sequence));
|
||||
_ = await ReadOneAsync(lease.Reader);
|
||||
}
|
||||
|
||||
// afterSequence=1 is below highestSeen=3 — gap, nothing to replay.
|
||||
bool found = distributor.TryGetReplayFrom(1, out IReadOnlyList<MxEvent> replay, out bool gap);
|
||||
|
||||
Assert.True(found);
|
||||
Assert.True(gap);
|
||||
Assert.Empty(replay);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReplayBuffer_Capacity0_AfterSequenceAtOrAboveHighestSeen_NoGap_NoEvents()
|
||||
{
|
||||
// Disabled buffer: caller is already caught up — no gap, nothing to replay.
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(
|
||||
source.Reader,
|
||||
replayBufferCapacity: 0,
|
||||
replayRetentionSeconds: 0);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
using IEventSubscriberLease lease = distributor.Register();
|
||||
for (ulong sequence = 1; sequence <= 3; sequence++)
|
||||
{
|
||||
source.Writer.TryWrite(Event(sequence));
|
||||
_ = await ReadOneAsync(lease.Reader);
|
||||
}
|
||||
|
||||
// afterSequence=3 equals highestSeen — caller is fully caught up.
|
||||
bool found = distributor.TryGetReplayFrom(3, out IReadOnlyList<MxEvent> replay, out bool gap);
|
||||
|
||||
Assert.True(found);
|
||||
Assert.False(gap);
|
||||
Assert.Empty(replay);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReplayBuffer_NoEventsSeen_AnyAfterSequence_NoGap_NoEvents()
|
||||
{
|
||||
// No events ever seen: nothing can have been missed, so gap must be false.
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(
|
||||
source.Reader,
|
||||
replayBufferCapacity: 0,
|
||||
replayRetentionSeconds: 0);
|
||||
// Pump not started — no events arrive.
|
||||
|
||||
bool found = distributor.TryGetReplayFrom(0, out IReadOnlyList<MxEvent> replay, out bool gap);
|
||||
|
||||
Assert.True(found);
|
||||
Assert.False(gap);
|
||||
Assert.Empty(replay);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ReplayBuffer_AfterSequenceMaxValue_WithRetainedEvents_NoGap_NoNewEvents()
|
||||
{
|
||||
// ulong.MaxValue as afterSequence: afterSequence + 1 would wrap to 0, which the
|
||||
// old code used to compare against oldestRetained, falsely reporting gap=true.
|
||||
// The corrected formula must yield gap=false and an empty replay list.
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
await using SessionEventDistributor distributor = CreateDistributor(
|
||||
source.Reader,
|
||||
replayBufferCapacity: 10,
|
||||
replayRetentionSeconds: 0);
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
using IEventSubscriberLease lease = distributor.Register();
|
||||
source.Writer.TryWrite(Event(1));
|
||||
_ = await ReadOneAsync(lease.Reader);
|
||||
|
||||
bool found = distributor.TryGetReplayFrom(ulong.MaxValue, out IReadOnlyList<MxEvent> replay, out bool gap);
|
||||
|
||||
Assert.True(found);
|
||||
Assert.False(gap);
|
||||
Assert.Empty(replay);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task SlowSubscriberOverflow_DisconnectsOnlyThatSubscriber_PumpAndOtherKeepRunning()
|
||||
{
|
||||
// Per-subscriber backpressure isolation (Task 5): one subscriber stops reading and
|
||||
// overflows its own tiny channel; it is disconnected with an EventQueueOverflow fault
|
||||
// while a second, healthy subscriber keeps receiving and the pump keeps pumping.
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
int overflowCalls = 0;
|
||||
        // Separate locals for the bool value and the "set" flag so both can use
        // Volatile.Read/Write: there is no Volatile overload for bool?, and the
        // volatile keyword cannot be applied to locals.
        // Interlocked.Increment on the pump thread is the store for overflowCalls;
        // Volatile.Read/Write provide ordering for the observedIsOnlySubscriber* locals.
|
||||
int observedIsOnlySubscriberSet = 0;
|
||||
bool observedIsOnlySubscriberValue = false;
|
||||
await using SessionEventDistributor distributor = new(
|
||||
"session-test",
|
||||
ct => source.Reader.ReadAllAsync(ct),
|
||||
subscriberQueueCapacity: 2,
|
||||
replayBufferCapacity: 1024,
|
||||
replayRetentionSeconds: 0,
|
||||
NullLogger<SessionEventDistributor>.Instance,
|
||||
TimeProvider.System,
|
||||
(isOnlySubscriber, _) =>
|
||||
{
|
||||
Interlocked.Increment(ref overflowCalls);
|
||||
Volatile.Write(ref observedIsOnlySubscriberValue, isOnlySubscriber);
|
||||
Volatile.Write(ref observedIsOnlySubscriberSet, 1);
|
||||
});
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
// Slow subscriber: registered but never read, so its capacity-2 channel fills.
|
||||
using IEventSubscriberLease slow = distributor.Register();
|
||||
// Healthy subscriber: drains promptly throughout.
|
||||
using IEventSubscriberLease healthy = distributor.Register();
|
||||
|
||||
// Push more events than the slow subscriber's channel can hold while the healthy one
|
||||
// keeps up. The slow channel overflows; the healthy channel does not.
|
||||
for (ulong sequence = 1; sequence <= 10; sequence++)
|
||||
{
|
||||
source.Writer.TryWrite(Event(sequence));
|
||||
MxEvent received = await ReadOneAsync(healthy.Reader);
|
||||
Assert.Equal(sequence, received.WorkerSequence);
|
||||
}
|
||||
|
||||
// The slow subscriber is disconnected with the overflow fault.
|
||||
SessionManagerException fault = await Assert.ThrowsAsync<SessionManagerException>(
|
||||
async () => await DrainUntilFaultAsync(slow.Reader));
|
||||
Assert.Equal(SessionManagerErrorCode.EventQueueOverflow, fault.ErrorCode);
|
||||
|
||||
// Two subscribers were registered at overflow time, so isOnlySubscriber is false.
|
||||
        // Use Volatile.Read so the test-thread reads are ordered after the
        // pump-thread writes, avoiding a data race under the .NET memory model.
|
||||
Assert.Equal(1, Volatile.Read(ref overflowCalls));
|
||||
Assert.Equal(1, Volatile.Read(ref observedIsOnlySubscriberSet));
|
||||
Assert.False(Volatile.Read(ref observedIsOnlySubscriberValue));
|
||||
Assert.Equal(1, distributor.SubscriberCount);
|
||||
|
||||
// The pump is still running and the healthy subscriber still receives new events.
|
||||
source.Writer.TryWrite(Event(11));
|
||||
MxEvent afterOverflow = await ReadOneAsync(healthy.Reader);
|
||||
Assert.Equal(11ul, afterOverflow.WorkerSequence);
|
||||
}
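
    // Illustrative sketch only, not part of this change set. It shows one way the
    // per-subscriber isolation exercised above could work: each subscriber owns a
    // bounded channel, the pump uses non-blocking TryWrite, and a full channel is
    // completed with a fault so only that subscriber is disconnected. FanOutSketch
    // and TryDeliver are hypothetical names, and InvalidOperationException stands in
    // for whatever fault type the real distributor raises.
    internal static class FanOutSketch
    {
        public static bool TryDeliver<T>(ChannelWriter<T> subscriber, T item)
        {
            // With the default BoundedChannelFullMode.Wait, TryWrite returns false
            // when the channel is full instead of blocking the pump.
            if (subscriber.TryWrite(item))
            {
                return true;
            }

            // The reader still drains what was buffered, then observes this fault.
            subscriber.TryComplete(new InvalidOperationException("Subscriber queue overflow (sketch)."));
            return false;
        }
    }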
|
||||
|
||||
[Fact]
|
||||
public async Task SlowSubscriberOverflow_WithMultipleSubscribers_HandlerSeesIsOnlySubscriberFalse_OtherKeepsReceiving()
|
||||
{
|
||||
// Distributor-level pin for "FailFast with multiple subscribers degrades to
|
||||
// disconnect-only (no session fault)": when the overflowing subscriber is NOT the
|
||||
// sole subscriber, isOnlySubscriber is false, so a FailFast-wired handler must NOT
|
||||
// fault the session. This test drives the distributor directly (without GatewaySession)
|
||||
// with two subscribers and a FailFast-style overflow handler seam, overflows the slow
|
||||
// one, and asserts (a) isOnlySubscriber==false, (b) the other subscriber keeps
|
||||
// receiving, and (c) the pump keeps running — all without a GatewaySession.
|
||||
//
|
||||
// TODO(Task 8): add a GatewaySession-level "session stays Ready" assertion once
|
||||
// multi-subscriber config is enabled by the Tasks 7/8 validator/guard change.
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
bool handlerFiredWithFalse = false;
|
||||
bool sessionFaultWouldBeCalled = false; // tracks if a FailFast path would fault
|
||||
await using SessionEventDistributor distributor = new(
|
||||
"session-multi-sub",
|
||||
ct => source.Reader.ReadAllAsync(ct),
|
||||
subscriberQueueCapacity: 2,
|
||||
replayBufferCapacity: 0,
|
||||
replayRetentionSeconds: 0,
|
||||
NullLogger<SessionEventDistributor>.Instance,
|
||||
TimeProvider.System,
|
||||
(isOnlySubscriber, _) =>
|
||||
{
|
||||
if (!isOnlySubscriber)
|
||||
{
|
||||
// Multi-subscriber: FailFast degrades to disconnect-only.
|
||||
Volatile.Write(ref handlerFiredWithFalse, true);
|
||||
}
|
||||
else
|
||||
{
|
||||
// Single-subscriber: FailFast would fault the session — must not happen here.
|
||||
Volatile.Write(ref sessionFaultWouldBeCalled, true);
|
||||
}
|
||||
});
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
// Slow subscriber: never reads, so capacity-2 channel overflows quickly.
|
||||
using IEventSubscriberLease slow = distributor.Register();
|
||||
// Healthy subscriber: drains every event promptly.
|
||||
using IEventSubscriberLease healthy = distributor.Register();
|
||||
|
||||
// Drive enough events to overflow the slow subscriber's channel.
|
||||
for (ulong sequence = 1; sequence <= 10; sequence++)
|
||||
{
|
||||
source.Writer.TryWrite(Event(sequence));
|
||||
_ = await ReadOneAsync(healthy.Reader);
|
||||
}
|
||||
|
||||
// Slow subscriber is disconnected with the overflow fault.
|
||||
SessionManagerException fault = await Assert.ThrowsAsync<SessionManagerException>(
|
||||
async () => await DrainUntilFaultAsync(slow.Reader));
|
||||
Assert.Equal(SessionManagerErrorCode.EventQueueOverflow, fault.ErrorCode);
|
||||
|
||||
// The handler saw isOnlySubscriber==false (multi-subscriber degradation path).
|
||||
Assert.True(Volatile.Read(ref handlerFiredWithFalse));
|
||||
// The FailFast session-fault branch was NOT taken (session stays Ready equivalent).
|
||||
Assert.False(Volatile.Read(ref sessionFaultWouldBeCalled));
|
||||
|
||||
// The pump and healthy subscriber are unaffected.
|
||||
source.Writer.TryWrite(Event(11));
|
||||
MxEvent afterOverflow = await ReadOneAsync(healthy.Reader);
|
||||
Assert.Equal(11ul, afterOverflow.WorkerSequence);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task InternalSubscriberOverflow_HandlerSeesIsOnlySubscriberFalse_ProvingCountExcludesInternal()
|
||||
{
|
||||
// Issue 3: verifies that CountExternalSubscribers() excludes the internal dashboard
|
||||
// subscriber, so a FailFast policy would NOT fault the session even when the internal
|
||||
// subscriber is the ONLY registered subscriber. The overflow handler receives
|
||||
// isOnlySubscriber==false (not true) because the overflowing subscriber is internal
|
||||
// and is therefore excluded from the external-subscriber count.
|
||||
Channel<MxEvent> source = Channel.CreateUnbounded<MxEvent>();
|
||||
int observedIsOnlySubscriberSet = 0;
|
||||
bool observedIsOnlySubscriberValue = false;
|
||||
bool observedIsInternalValue = false;
|
||||
await using SessionEventDistributor distributor = new(
|
||||
"session-internal-overflow",
|
||||
ct => source.Reader.ReadAllAsync(ct),
|
||||
subscriberQueueCapacity: 2,
|
||||
replayBufferCapacity: 0,
|
||||
replayRetentionSeconds: 0,
|
||||
NullLogger<SessionEventDistributor>.Instance,
|
||||
TimeProvider.System,
|
||||
(isOnlySubscriber, isInternal) =>
|
||||
{
|
||||
Volatile.Write(ref observedIsOnlySubscriberValue, isOnlySubscriber);
|
||||
Volatile.Write(ref observedIsInternalValue, isInternal);
|
||||
Volatile.Write(ref observedIsOnlySubscriberSet, 1);
|
||||
});
|
||||
await distributor.StartAsync(CancellationToken.None);
|
||||
|
||||
// Register ONLY an internal subscriber — no external subscriber is attached.
|
||||
using IEventSubscriberLease internalLease = distributor.Register(isInternal: true);
|
||||
|
||||
// Push enough events to overflow the capacity-2 internal subscriber channel.
|
||||
for (ulong sequence = 1; sequence <= 10; sequence++)
|
||||
{
|
||||
source.Writer.TryWrite(Event(sequence));
|
||||
}
|
||||
|
||||
// The internal subscriber is disconnected with the overflow fault.
|
||||
SessionManagerException fault = await Assert.ThrowsAsync<SessionManagerException>(
|
||||
async () => await DrainUntilFaultAsync(internalLease.Reader));
|
||||
Assert.Equal(SessionManagerErrorCode.EventQueueOverflow, fault.ErrorCode);
|
||||
|
||||
// Wait for the handler to fire (it runs on the pump thread).
|
||||
await Task.Run(async () =>
|
||||
{
|
||||
using CancellationTokenSource cts = new(ReadTimeout);
|
||||
while (Volatile.Read(ref observedIsOnlySubscriberSet) == 0)
|
||||
{
|
||||
await Task.Delay(10, cts.Token);
|
||||
}
|
||||
});
|
||||
|
||||
// isOnlySubscriber must be FALSE even though the internal subscriber was the ONLY
|
||||
// subscriber — CountExternalSubscribers excludes it, so a FailFast policy on the
|
||||
// external count would NOT fault the session.
|
||||
Assert.True(Volatile.Read(ref observedIsOnlySubscriberSet) == 1, "Overflow handler should have fired.");
|
||||
Assert.False(Volatile.Read(ref observedIsOnlySubscriberValue),
|
||||
"isOnlySubscriber must be false for an internal subscriber (CountExternalSubscribers excludes it).");
|
||||
Assert.True(Volatile.Read(ref observedIsInternalValue),
|
||||
"isInternal must be true for a subscriber registered with isInternal: true.");
|
||||
}
|
||||
|
||||
private static async Task DrainUntilFaultAsync(ChannelReader<MxEvent> reader)
|
||||
{
|
||||
// Drains any buffered events, then surfaces the channel's completion fault (if any)
|
||||
// by awaiting the final read past the buffered tail.
|
||||
while (true)
|
||||
{
|
||||
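            if (!await reader.WaitToReadAsync().AsTask().WaitAsync(ReadTimeout))
            {
                // Completed without a fault: return so Assert.ThrowsAsync fails instead of spinning.
                return;
            }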
|
||||
while (reader.TryRead(out _))
|
||||
{
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
private static SessionEventDistributor CreateDistributor(ChannelReader<MxEvent> source)
|
||||
=> CreateDistributor(source, replayBufferCapacity: 1024, replayRetentionSeconds: 300);
|
||||
|
||||
private static SessionEventDistributor CreateDistributor(
|
||||
ChannelReader<MxEvent> source,
|
||||
int replayBufferCapacity,
|
||||
double replayRetentionSeconds,
|
||||
TimeProvider? timeProvider = null)
|
||||
=> new(
|
||||
"session-test",
|
||||
ct => source.ReadAllAsync(ct),
|
||||
subscriberQueueCapacity: 64,
|
||||
replayBufferCapacity: replayBufferCapacity,
|
||||
replayRetentionSeconds: replayRetentionSeconds,
|
||||
NullLogger<SessionEventDistributor>.Instance,
|
||||
timeProvider ?? TimeProvider.System);
|
||||
|
||||
private static MxEvent Event(ulong sequence)
|
||||
=> new() { SessionId = "session-test", WorkerSequence = sequence };
|
||||
|
||||
private static async Task<MxEvent> ReadOneAsync(ChannelReader<MxEvent> reader)
|
||||
{
|
||||
await reader.WaitToReadAsync().AsTask().WaitAsync(ReadTimeout);
|
||||
Assert.True(reader.TryRead(out MxEvent? value));
|
||||
return value!;
|
||||
}
|
||||
|
||||
private static async Task AssertCompletedAsync(ChannelReader<MxEvent> reader)
|
||||
{
|
||||
        // Drain anything still buffered, then assert the channel is completed
        // (no further events). Bounded so a never-completing channel fails fast.
        while (reader.TryRead(out _))
        {
        }

        await reader.Completion.WaitAsync(ReadTimeout);
|
||||
}
|
||||
}
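
// Illustrative sketch only, not part of this change set. It restates the gap rule
// the ReplayBuffer_* tests above pin, written so that afterSequence == ulong.MaxValue
// cannot wrap. ReplayGapSketch, HasGap, and the nullable parameters are hypothetical;
// the real TryGetReplayFrom may compute the flag differently.
internal static class ReplayGapSketch
{
    // oldestRetained is null when nothing is retained (buffer disabled or fully evicted);
    // highestSeen is null when the pump has not observed any event yet.
    public static bool HasGap(ulong afterSequence, ulong? oldestRetained, ulong? highestSeen)
    {
        if (highestSeen is null)
        {
            return false; // Nothing was ever emitted, so nothing can have been missed.
        }

        if (oldestRetained is null)
        {
            return afterSequence < highestSeen.Value; // Behind, with nothing left to replay.
        }

        // Same meaning as "afterSequence + 1 < oldestRetained", but overflow-safe.
        return afterSequence < oldestRetained.Value
            && oldestRetained.Value - afterSequence > 1;
    }
}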
|
||||
@@ -663,7 +663,7 @@ public sealed class SessionManagerBulkTests
|
||||
private static async Task<GatewaySession> OpenSessionAsync(IWorkerClient workerClient)
|
||||
{
|
||||
SessionManager manager = CreateManager(workerClient);
|
||||
return await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
|
||||
return await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);
|
||||
}
|
||||
|
||||
private static SessionManager CreateManager(IWorkerClient workerClient)
|
||||
|
||||
@@ -23,7 +23,7 @@ public sealed class SessionManagerTests
|
||||
using GatewayMetrics metrics = new();
|
||||
SessionManager manager = CreateManager(factory, metrics: metrics);
|
||||
|
||||
GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
|
||||
GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);
|
||||
|
||||
Assert.True(manager.TryGetSession(session.SessionId, out GatewaySession? registered));
|
||||
Assert.Same(session, registered);
|
||||
@@ -34,6 +34,36 @@ public sealed class SessionManagerTests
|
||||
Assert.Equal(1, metrics.GetSnapshot().SessionsOpened);
|
||||
}
|
||||
|
||||
/// <summary>Verifies that a session opened by an authenticated caller records that caller's API key id in OwnerKeyId.</summary>
|
||||
[Fact]
|
||||
public async Task OpenSessionAsync_WithOwnerKeyId_RecordsOwnerKeyIdOnSession()
|
||||
{
|
||||
SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(new FakeWorkerClient()));
|
||||
|
||||
GatewaySession session = await manager.OpenSessionAsync(
|
||||
CreateOpenRequest(),
|
||||
clientIdentity: "MyKey Display",
|
||||
ownerKeyId: "key-abc123",
|
||||
CancellationToken.None);
|
||||
|
||||
Assert.Equal("key-abc123", session.OwnerKeyId);
|
||||
}
|
||||
|
||||
/// <summary>Verifies that a session opened without an owner key id records null in OwnerKeyId.</summary>
|
||||
[Fact]
|
||||
public async Task OpenSessionAsync_WithNullOwnerKeyId_RecordsNullOwnerKeyIdOnSession()
|
||||
{
|
||||
SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(new FakeWorkerClient()));
|
||||
|
||||
GatewaySession session = await manager.OpenSessionAsync(
|
||||
CreateOpenRequest(),
|
||||
clientIdentity: null,
|
||||
ownerKeyId: null,
|
||||
CancellationToken.None);
|
||||
|
||||
Assert.Null(session.OwnerKeyId);
|
||||
}
|
||||
|
||||
/// <summary>Verifies that opening a session sets the initial lease expiry from the configured default lease.</summary>
|
||||
[Fact]
|
||||
public async Task OpenSessionAsync_SetsInitialDefaultLease()
|
||||
@@ -45,7 +75,7 @@ public sealed class SessionManagerTests
             options: options,
             timeProvider: clock);

-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         Assert.Equal(clock.GetUtcNow() + TimeSpan.FromMinutes(30), session.LeaseExpiresAt);
     }
@@ -61,7 +91,7 @@ public sealed class SessionManagerTests
         };
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(new FakeWorkerClient()));

-        GatewaySession session = await manager.OpenSessionAsync(request, "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(request, "client-1", ownerKeyId: null, CancellationToken.None);

         Assert.Equal($"rust-load-client-{session.SessionId}", session.ClientCorrelationId);
     }
@@ -76,7 +106,7 @@ public sealed class SessionManagerTests
         };
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(new FakeWorkerClient()));

-        GatewaySession session = await manager.OpenSessionAsync(request, "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(request, "client-1", ownerKeyId: null, CancellationToken.None);

         Assert.Equal($"client-{session.SessionId}", session.ClientCorrelationId);
     }
@@ -87,7 +117,7 @@ public sealed class SessionManagerTests
     {
         FakeWorkerClient workerClient = new();
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient));
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         WorkerCommandReply reply = await manager.InvokeAsync(
             session.SessionId,
@@ -108,6 +138,7 @@ public sealed class SessionManagerTests
             "mxaccess-gateway-1-session-lease-refresh",
             "nonce",
             "client-1",
+            null,
             "test-session",
             "client-correlation-1",
             TimeSpan.FromSeconds(30),
@@ -156,7 +187,7 @@ public sealed class SessionManagerTests
             },
         };
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient));
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         IReadOnlyList<SubscribeResult> results = await session.SubscribeBulkAsync(
             12,
@@ -207,7 +238,7 @@ public sealed class SessionManagerTests
             },
         };
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient));
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         IReadOnlyList<BulkWriteResult> results = await session.WriteBulkAsync(
             12,
@@ -268,7 +299,7 @@ public sealed class SessionManagerTests
             },
         };
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient));
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         IReadOnlyList<BulkReadResult> results = await session.ReadBulkAsync(
             12,
@@ -291,7 +322,7 @@ public sealed class SessionManagerTests
     {
         FakeWorkerClient workerClient = new();
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient));
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);
         session.MarkFaulted("test fault");

         SessionManagerException exception = await Assert.ThrowsAsync<SessionManagerException>(
@@ -316,7 +347,7 @@ public sealed class SessionManagerTests
     {
         FakeWorkerClient workerClient = new();
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient));
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         // Force a state mismatch: session stays Ready, worker transitions out.
         workerClient.State = WorkerClientState.Handshaking;
@@ -341,7 +372,7 @@ public sealed class SessionManagerTests
         FakeWorkerClient workerClient = new();
         using GatewayMetrics metrics = new();
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient), metrics: metrics);
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         SessionCloseResult firstClose = await manager.CloseSessionAsync(session.SessionId, CancellationToken.None);
         SessionManagerException secondClose = await Assert.ThrowsAsync<SessionManagerException>(
@@ -366,7 +397,7 @@ public sealed class SessionManagerTests
                 "Worker shutdown timed out."),
         };
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient));
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         SessionManagerException exception = await Assert.ThrowsAsync<SessionManagerException>(
             async () => await manager.CloseSessionAsync(session.SessionId, CancellationToken.None));
@@ -397,6 +428,7 @@ public sealed class SessionManagerTests
         GatewaySession firstSession = await manager.OpenSessionAsync(
             CreateOpenRequest(),
             "client-1",
+            ownerKeyId: null,
             CancellationToken.None);
         metrics.EventReceived(firstSession.SessionId, MxEventFamily.OnDataChange.ToString());

@@ -405,6 +437,7 @@ public sealed class SessionManagerTests
         GatewaySession secondSession = await manager.OpenSessionAsync(
             CreateOpenRequest(),
             "client-2",
+            ownerKeyId: null,
             CancellationToken.None);

         Assert.Equal(SessionManagerErrorCode.CloseFailed, exception.ErrorCode);
@@ -440,6 +473,7 @@ public sealed class SessionManagerTests
         GatewaySession session = await manager.OpenSessionAsync(
             CreateOpenRequest(),
             "client-1",
+            ownerKeyId: null,
             CancellationToken.None);

         Task<SessionCloseResult> firstClose = manager.CloseSessionAsync(session.SessionId, CancellationToken.None);
@@ -482,7 +516,7 @@ public sealed class SessionManagerTests
         FakeWorkerClient workerClient = new();
         using GatewayMetrics metrics = new();
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient), metrics: metrics);
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         SessionCloseResult result = await manager.KillWorkerAsync(session.SessionId, "test-kill", CancellationToken.None);

@@ -510,7 +544,7 @@ public sealed class SessionManagerTests
     {
         FakeWorkerClient workerClient = new();
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient));
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         await Assert.ThrowsAsync<ArgumentException>(
             async () => await manager.KillWorkerAsync(session.SessionId, blankReason, CancellationToken.None));
@@ -529,7 +563,7 @@ public sealed class SessionManagerTests
     {
         FakeWorkerClient workerClient = new();
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient));
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);

         await Assert.ThrowsAsync<ArgumentNullException>(
             async () => await manager.KillWorkerAsync(session.SessionId, null!, CancellationToken.None));
@@ -569,6 +603,7 @@ public sealed class SessionManagerTests
         GatewaySession session = await manager.OpenSessionAsync(
             CreateOpenRequest(),
             "client-1",
+            ownerKeyId: null,
             CancellationToken.None);

         Assert.Equal(1, metrics.GetSnapshot().OpenSessions);
@@ -598,6 +633,7 @@ public sealed class SessionManagerTests
         GatewaySession session = await manager.OpenSessionAsync(
             CreateOpenRequest(),
             "client-1",
+            ownerKeyId: null,
             CancellationToken.None);

         Task<SessionCloseResult> first = manager.KillWorkerAsync(session.SessionId, "kill-a", CancellationToken.None);
@@ -641,6 +677,7 @@ public sealed class SessionManagerTests
         GatewaySession session = await manager.OpenSessionAsync(
             CreateOpenRequest(),
             "client-1",
+            ownerKeyId: null,
             CancellationToken.None);

         Assert.Equal(1, metrics.GetSnapshot().OpenSessions);
@@ -666,7 +703,7 @@ public sealed class SessionManagerTests
             metrics);

         SessionManagerException exception = await Assert.ThrowsAsync<SessionManagerException>(
-            async () => await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None));
+            async () => await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None));

         Assert.Equal(SessionManagerErrorCode.OpenFailed, exception.ErrorCode);
         Assert.Equal(0, registry.Count);
@@ -682,8 +719,8 @@ public sealed class SessionManagerTests
         FakeWorkerClient activeClient = new();
         QueueingSessionWorkerClientFactory factory = new(expiredClient, activeClient);
         SessionManager manager = CreateManager(factory);
-        GatewaySession expiredSession = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
-        GatewaySession activeSession = await manager.OpenSessionAsync(CreateOpenRequest(), "client-2", CancellationToken.None);
+        GatewaySession expiredSession = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);
+        GatewaySession activeSession = await manager.OpenSessionAsync(CreateOpenRequest(), "client-2", ownerKeyId: null, CancellationToken.None);
         DateTimeOffset now = DateTimeOffset.UtcNow;
         expiredSession.ExtendLease(now.AddSeconds(-1));
         activeSession.ExtendLease(now.AddMinutes(5));
@@ -703,7 +740,7 @@ public sealed class SessionManagerTests
     {
         FakeWorkerClient workerClient = new();
         SessionManager manager = CreateManager(new FakeSessionWorkerClientFactory(workerClient));
-        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
+        GatewaySession session = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);
         DateTimeOffset now = DateTimeOffset.UtcNow;
         session.ExtendLease(now.AddSeconds(-1));
         using IDisposable eventSubscriber = session.AttachEventSubscriber(allowMultipleSubscribers: false);
@@ -724,8 +761,8 @@ public sealed class SessionManagerTests
         QueueingSessionWorkerClientFactory factory = new(firstClient, secondClient);
         using GatewayMetrics metrics = new();
         SessionManager manager = CreateManager(factory, metrics: metrics);
-        GatewaySession firstSession = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", CancellationToken.None);
-        GatewaySession secondSession = await manager.OpenSessionAsync(CreateOpenRequest(), "client-2", CancellationToken.None);
+        GatewaySession firstSession = await manager.OpenSessionAsync(CreateOpenRequest(), "client-1", ownerKeyId: null, CancellationToken.None);
+        GatewaySession secondSession = await manager.OpenSessionAsync(CreateOpenRequest(), "client-2", ownerKeyId: null, CancellationToken.None);

         await manager.ShutdownAsync(CancellationToken.None);

@@ -192,6 +192,140 @@ public sealed class FakeWorkerHarnessTests
         Assert.Equal(WorkerClientState.Closed, client.State);
     }

+    /// <summary>
+    /// Verifies that RespondToControlCommandAsync echoes the Ping message back
+    /// in the DiagnosticMessage field, matching the real worker's ping reply shape.
+    /// </summary>
+    [Fact]
+    public async Task RespondToControlCommandAsync_Ping_EchoesMessageInDiagnostic()
+    {
+        await using FakeWorkerHarness fakeWorker = await FakeWorkerHarness.CreateConnectedPairAsync();
+        await using WorkerClient client = fakeWorker.CreateClient();
+        await StartClientAsync(fakeWorker, client);
+
+        Task<WorkerCommandReply> invokeTask = client.InvokeAsync(
+            CreateCommand(MxCommandKind.Ping, cmd => cmd.Ping = new PingCommand { Message = "hello-ping" }),
+            TestTimeout,
+            CancellationToken.None);
+        await fakeWorker.RespondToControlCommandAsync().WaitAsync(TestTimeout);
+
+        WorkerCommandReply reply = await invokeTask.WaitAsync(TestTimeout);
+
+        Assert.Equal(MxCommandKind.Ping, reply.Reply.Kind);
+        Assert.Equal(ProtocolStatusCode.Ok, reply.Reply.ProtocolStatus.Code);
+        Assert.Equal("hello-ping", reply.Reply.DiagnosticMessage);
+    }
+
+    /// <summary>
+    /// Verifies that RespondToControlCommandAsync returns a SessionStateReply
+    /// with state Ready for a GetSessionState command.
+    /// </summary>
+    [Fact]
+    public async Task RespondToControlCommandAsync_GetSessionState_ReturnsReadyState()
+    {
+        await using FakeWorkerHarness fakeWorker = await FakeWorkerHarness.CreateConnectedPairAsync();
+        await using WorkerClient client = fakeWorker.CreateClient();
+        await StartClientAsync(fakeWorker, client);
+
+        Task<WorkerCommandReply> invokeTask = client.InvokeAsync(
+            CreateCommand(MxCommandKind.GetSessionState),
+            TestTimeout,
+            CancellationToken.None);
+        await fakeWorker.RespondToControlCommandAsync().WaitAsync(TestTimeout);
+
+        WorkerCommandReply reply = await invokeTask.WaitAsync(TestTimeout);
+
+        Assert.Equal(MxCommandKind.GetSessionState, reply.Reply.Kind);
+        Assert.Equal(ProtocolStatusCode.Ok, reply.Reply.ProtocolStatus.Code);
+        Assert.NotNull(reply.Reply.SessionState);
+        Assert.Equal(SessionState.Ready, reply.Reply.SessionState.State);
+    }
+
+    /// <summary>
+    /// Verifies that RespondToControlCommandAsync returns a WorkerInfoReply
+    /// with the fake worker's process ID, version, and MXAccess identifiers.
+    /// </summary>
+    [Fact]
+    public async Task RespondToControlCommandAsync_GetWorkerInfo_ReturnsFakeWorkerInfo()
+    {
+        await using FakeWorkerHarness fakeWorker = await FakeWorkerHarness.CreateConnectedPairAsync();
+        await using WorkerClient client = fakeWorker.CreateClient();
+        await StartClientAsync(fakeWorker, client);
+
+        Task<WorkerCommandReply> invokeTask = client.InvokeAsync(
+            CreateCommand(MxCommandKind.GetWorkerInfo),
+            TestTimeout,
+            CancellationToken.None);
+        await fakeWorker.RespondToControlCommandAsync().WaitAsync(TestTimeout);
+
+        WorkerCommandReply reply = await invokeTask.WaitAsync(TestTimeout);
+
+        Assert.Equal(MxCommandKind.GetWorkerInfo, reply.Reply.Kind);
+        Assert.Equal(ProtocolStatusCode.Ok, reply.Reply.ProtocolStatus.Code);
+        Assert.NotNull(reply.Reply.WorkerInfo);
+        Assert.Equal(FakeWorkerHarness.DefaultWorkerProcessId, reply.Reply.WorkerInfo.WorkerProcessId);
+        Assert.Equal("LMXProxy.LMXProxyServer.1", reply.Reply.WorkerInfo.MxaccessProgid);
+        Assert.Equal("{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}", reply.Reply.WorkerInfo.MxaccessClsid);
+        Assert.Equal("fake-worker", reply.Reply.WorkerInfo.WorkerVersion);
+    }
+
+    /// <summary>
+    /// Verifies that RespondToControlCommandAsync returns an empty DrainEventsReply
+    /// for a DrainEvents command (the fake harness has no queued events).
+    /// </summary>
+    [Fact]
+    public async Task RespondToControlCommandAsync_DrainEvents_ReturnsEmptyReply()
+    {
+        await using FakeWorkerHarness fakeWorker = await FakeWorkerHarness.CreateConnectedPairAsync();
+        await using WorkerClient client = fakeWorker.CreateClient();
+        await StartClientAsync(fakeWorker, client);
+
+        Task<WorkerCommandReply> invokeTask = client.InvokeAsync(
+            CreateCommand(MxCommandKind.DrainEvents, cmd => cmd.DrainEvents = new DrainEventsCommand { MaxEvents = 32 }),
+            TestTimeout,
+            CancellationToken.None);
+        await fakeWorker.RespondToControlCommandAsync().WaitAsync(TestTimeout);
+
+        WorkerCommandReply reply = await invokeTask.WaitAsync(TestTimeout);
+
+        Assert.Equal(MxCommandKind.DrainEvents, reply.Reply.Kind);
+        Assert.Equal(ProtocolStatusCode.Ok, reply.Reply.ProtocolStatus.Code);
+        Assert.NotNull(reply.Reply.DrainEvents);
+        Assert.Empty(reply.Reply.DrainEvents.Events);
+    }
+
+    /// <summary>
+    /// Verifies that RespondToControlCommandAsync for ShutdownWorker sends an OK
+    /// reply followed by a WorkerShutdownAck, which closes the client.
+    /// </summary>
+    [Fact]
+    public async Task RespondToControlCommandAsync_ShutdownWorker_SendsReplyThenAck()
+    {
+        await using FakeWorkerHarness fakeWorker = await FakeWorkerHarness.CreateConnectedPairAsync();
+        await using WorkerClient client = fakeWorker.CreateClient();
+        await StartClientAsync(fakeWorker, client);
+
+        // ShutdownAsync triggers a WorkerShutdown envelope (not WorkerCommand),
+        // so we directly invoke ShutdownWorker as a control command via InvokeAsync.
+        Task<WorkerCommandReply> invokeTask = client.InvokeAsync(
+            CreateCommand(MxCommandKind.ShutdownWorker, cmd => cmd.ShutdownWorker = new ShutdownWorkerCommand()),
+            TestTimeout,
+            CancellationToken.None);
+
+        // The harness reads the ShutdownWorker WorkerCommand and replies with
+        // OK + ShutdownAck — the WorkerClient's read loop processes the ack and
+        // transitions to Closed.
+        await fakeWorker.RespondToControlCommandAsync().WaitAsync(TestTimeout);
+
+        WorkerCommandReply reply = await invokeTask.WaitAsync(TestTimeout);
+
+        Assert.Equal(MxCommandKind.ShutdownWorker, reply.Reply.Kind);
+        Assert.Equal(ProtocolStatusCode.Ok, reply.Reply.ProtocolStatus.Code);
+
+        await WaitUntilAsync(() => client.State == WorkerClientState.Closed, TestTimeout);
+        Assert.Equal(WorkerClientState.Closed, client.State);
+    }
+
     private static async Task StartClientAsync(
         FakeWorkerHarness fakeWorker,
         WorkerClient client)
@@ -201,15 +335,13 @@ public sealed class FakeWorkerHarnessTests
         await startTask.WaitAsync(TestTimeout).ConfigureAwait(false);
     }

-    private static WorkerCommand CreateCommand(MxCommandKind kind)
+    private static WorkerCommand CreateCommand(
+        MxCommandKind kind,
+        Action<MxCommand>? configure = null)
     {
-        return new WorkerCommand
-        {
-            Command = new MxCommand
-            {
-                Kind = kind,
-            },
-        };
+        MxCommand command = new() { Kind = kind };
+        configure?.Invoke(command);
+        return new WorkerCommand { Command = command };
     }

     private static async Task WaitUntilAsync(

@@ -391,6 +391,118 @@ public sealed class FakeWorkerHarness : IAsyncDisposable
             cancellationToken).ConfigureAwait(false);
     }

+    /// <summary>
+    /// Reads one incoming command envelope and, if it is one of the five
+    /// control command kinds (Ping, GetSessionState, GetWorkerInfo, DrainEvents,
+    /// ShutdownWorker), writes a canned reply that mirrors the real worker's
+    /// reply shape. For ShutdownWorker the method additionally sends a
+    /// <see cref="WorkerShutdownAck"/> after the OK reply, matching the real
+    /// worker's shutdown flow.
+    /// </summary>
+    /// <param name="cancellationToken">Token to cancel the asynchronous operation.</param>
+    /// <returns>The command envelope that was handled.</returns>
+    /// <exception cref="InvalidOperationException">
+    /// Thrown when the next envelope is not a <c>WorkerCommand</c> or contains a
+    /// non-control command kind.
+    /// </exception>
+    public async Task<WorkerEnvelope> RespondToControlCommandAsync(
+        CancellationToken cancellationToken = default)
+    {
+        WorkerEnvelope commandEnvelope = await ReadCommandAsync(cancellationToken).ConfigureAwait(false);
+        return await RespondToControlCommandAsync(commandEnvelope, cancellationToken).ConfigureAwait(false);
+    }
+
+    /// <summary>
+    /// Accepts an already-read command envelope and, if it is one of the five control
+    /// command kinds (Ping, GetSessionState, GetWorkerInfo, DrainEvents, ShutdownWorker),
+    /// writes a canned reply that mirrors the real worker's reply shape. For ShutdownWorker
+    /// the method additionally sends a <see cref="WorkerShutdownAck"/> after the OK reply.
+    /// Use this overload when the caller has already consumed the envelope from the pipe
+    /// (e.g., to inspect the kind before routing) to avoid re-reading.
+    /// </summary>
+    /// <param name="commandEnvelope">The already-read command envelope to respond to.</param>
+    /// <param name="cancellationToken">Token to cancel the asynchronous operation.</param>
+    /// <returns>The command envelope that was handled.</returns>
+    /// <exception cref="ArgumentException">
+    /// Thrown when <paramref name="commandEnvelope"/> does not contain a <c>WorkerCommand</c>.
+    /// </exception>
+    /// <exception cref="InvalidOperationException">
+    /// Thrown when the command kind is not one of the five control command kinds.
+    /// </exception>
+    public async Task<WorkerEnvelope> RespondToControlCommandAsync(
+        WorkerEnvelope commandEnvelope,
+        CancellationToken cancellationToken = default)
+    {
+        if (commandEnvelope.BodyCase != WorkerEnvelope.BodyOneofCase.WorkerCommand)
+        {
+            throw new ArgumentException(
+                $"Expected WorkerCommand envelope but received {commandEnvelope.BodyCase}.",
+                nameof(commandEnvelope));
+        }
+
+        MxCommand command = commandEnvelope.WorkerCommand.Command;
+
+        switch (command.Kind)
+        {
+            case MxCommandKind.Ping:
+                await ReplyToCommandAsync(
+                    commandEnvelope,
+                    configureReply: reply =>
+                    {
+                        string? message = command.Ping?.Message;
+                        if (!string.IsNullOrEmpty(message))
+                        {
+                            reply.DiagnosticMessage = message;
+                        }
+                    },
+                    cancellationToken: cancellationToken).ConfigureAwait(false);
+                break;
+
+            case MxCommandKind.GetSessionState:
+                await ReplyToCommandAsync(
+                    commandEnvelope,
+                    configureReply: reply => reply.SessionState = new SessionStateReply
+                    {
+                        State = SessionState.Ready,
+                    },
+                    cancellationToken: cancellationToken).ConfigureAwait(false);
+                break;
+
+            case MxCommandKind.GetWorkerInfo:
+                await ReplyToCommandAsync(
+                    commandEnvelope,
+                    configureReply: reply => reply.WorkerInfo = new WorkerInfoReply
+                    {
+                        WorkerProcessId = DefaultWorkerProcessId,
+                        WorkerVersion = "fake-worker",
+                        MxaccessProgid = "LMXProxy.LMXProxyServer.1",
+                        MxaccessClsid = "{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}",
+                    },
+                    cancellationToken: cancellationToken).ConfigureAwait(false);
+                break;
+
+            case MxCommandKind.DrainEvents:
+                await ReplyToCommandAsync(
+                    commandEnvelope,
+                    configureReply: reply => reply.DrainEvents = new DrainEventsReply(),
+                    cancellationToken: cancellationToken).ConfigureAwait(false);
+                break;
+
+            case MxCommandKind.ShutdownWorker:
+                await ReplyToCommandAsync(
+                    commandEnvelope,
+                    cancellationToken: cancellationToken).ConfigureAwait(false);
+                await SendShutdownAckAsync(cancellationToken: cancellationToken).ConfigureAwait(false);
+                break;
+
+            default:
+                throw new InvalidOperationException(
+                    $"RespondToControlCommandAsync only handles control command kinds; received {command.Kind}.");
+        }
+
+        return commandEnvelope;
+    }
+
     /// <summary>Writes a malformed payload directly to the worker stream.</summary>
     /// <param name="payload">Malformed payload bytes to write.</param>
     /// <param name="cancellationToken">Token to cancel the asynchronous operation.</param>

@@ -1,3 +1,4 @@
+using System.Text.Json;
 using ZB.MOM.WW.Audit;
 using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
 using ZB.MOM.WW.MxGateway.Contracts.Proto;
@@ -69,7 +70,7 @@ public sealed class ConstraintEnforcerTests
             CancellationToken.None);
         Assert.NotNull(failure);

-        await enforcer.RecordDenialAsync(identity, "Write", "42", failure, CancellationToken.None);
+        await enforcer.RecordDenialAsync(identity, "Write", "42", failure, correlationId: null, CancellationToken.None);

         AuditEvent auditEvent = Assert.Single(auditWriter.Events);
         Assert.Equal("operator01", auditEvent.Actor);
@@ -83,6 +84,52 @@ public sealed class ConstraintEnforcerTests
         Assert.Null(auditEvent.CorrelationId);
     }

+    /// <summary>A denial carrying a parseable correlation id stores it on the audit record.</summary>
+    [Fact]
+    public async Task RecordDenialAsync_WithGuidCorrelationId_StoresCorrelationId()
+    {
+        ConstraintEnforcer enforcer = CreateEnforcer(out FakeAuditWriter auditWriter);
+        Guid correlationId = Guid.NewGuid();
+
+        await enforcer.RecordDenialAsync(
+            identity: null,
+            "Read",
+            "Secret.Tag",
+            new ConstraintFailure("read_scope", "Tag is outside the API key read scope."),
+            correlationId.ToString(),
+            CancellationToken.None);
+
+        AuditEvent auditEvent = Assert.Single(auditWriter.Events);
+        Assert.Equal(correlationId, auditEvent.CorrelationId);
+    }
+
+    /// <summary>
+    /// A denial with a non-GUID correlation id leaves the typed audit correlation id null but
+    /// still preserves the raw client correlation id in DetailsJson so it is not lost.
+    /// </summary>
+    [Fact]
+    public async Task RecordDenialAsync_WithNonGuidCorrelationId_LeavesCorrelationIdNullButPreservesRawInDetails()
+    {
+        ConstraintEnforcer enforcer = CreateEnforcer(out FakeAuditWriter auditWriter);
+
+        await enforcer.RecordDenialAsync(
+            identity: null,
+            "Read",
+            "Secret.Tag",
+            new ConstraintFailure("read_scope", "Tag is outside the API key read scope."),
+            "rust-client-Write-7",
+            CancellationToken.None);
+
+        AuditEvent auditEvent = Assert.Single(auditWriter.Events);
+        Assert.Null(auditEvent.CorrelationId);
+        Assert.NotNull(auditEvent.DetailsJson);
+
+        Dictionary<string, string>? details =
+            JsonSerializer.Deserialize<Dictionary<string, string>>(auditEvent.DetailsJson);
+        Assert.NotNull(details);
+        Assert.Equal("rust-client-Write-7", details["clientCorrelationId"]);
+    }
+
     /// <summary>A denial with no identity records the canonical "anonymous" actor.</summary>
     [Fact]
     public async Task RecordDenialAsync_WithoutIdentity_UsesAnonymousActor()
@@ -94,6 +141,7 @@ public sealed class ConstraintEnforcerTests
             "Read",
             "Secret.Tag",
             new ConstraintFailure("read_scope", "Tag is outside the API key read scope."),
+            correlationId: null,
             CancellationToken.None);

         AuditEvent auditEvent = Assert.Single(auditWriter.Events);

@@ -416,6 +416,7 @@ public sealed class GatewayGrpcAuthorizationInterceptorTests
         public Task<GatewaySession> OpenSessionAsync(
             SessionOpenRequest request,
             string? clientIdentity,
+            string? ownerKeyId,
             CancellationToken cancellationToken)
         {
             OpenSessionCount++;

@@ -38,5 +38,6 @@ public sealed class AllowAllConstraintEnforcer : IConstraintEnforcer
         string commandKind,
         string target,
         ConstraintFailure failure,
+        string? correlationId,
         CancellationToken cancellationToken) => Task.CompletedTask;
 }

@@ -23,8 +23,8 @@ public sealed class PredicateConstraintEnforcer : IConstraintEnforcer
     /// <summary>Deny predicate keyed on (serverHandle, itemHandle) (returns true to deny).</summary>
     public Func<int, int, bool> DenyWriteHandle { get; init; } = (_, _) => false;

-    /// <summary>Recorded denial messages — (commandKind, target) tuples.</summary>
-    public List<(string CommandKind, string Target)> RecordedDenials { get; } = [];
+    /// <summary>Recorded denial messages — (commandKind, target, correlationId) tuples.</summary>
+    public List<(string CommandKind, string Target, string? CorrelationId)> RecordedDenials { get; } = [];

     /// <inheritdoc />
     public Task<ConstraintFailure?> CheckReadTagAsync(
@@ -81,9 +81,10 @@ public sealed class PredicateConstraintEnforcer : IConstraintEnforcer
         string commandKind,
         string target,
         ConstraintFailure failure,
+        string? correlationId,
         CancellationToken cancellationToken)
     {
-        RecordedDenials.Add((commandKind, target));
+        RecordedDenials.Add((commandKind, target, correlationId));
         return Task.CompletedTask;
     }
 }

@@ -288,6 +288,206 @@ public sealed class WorkerPipeSessionTests
         await SendShutdownAndWaitAsync(pipePair, runTask, cancellation.Token);
     }

+    /// <summary>
+    /// Verifies that a Ping control command is answered on the worker side
+    /// (not dispatched to the STA) with an OK reply that echoes the ping
+    /// message into the reply's diagnostic field.
+    /// </summary>
+    [Fact]
+    public async Task RunAsync_PingControlCommand_RepliesOkAndEchoesMessage()
+    {
+        using CancellationTokenSource cancellation = new(TimeSpan.FromSeconds(5));
+        using PipePair pipePair = await PipePair.CreateAsync(cancellation.Token);
+        FakeRuntimeSession runtime = new();
+        WorkerPipeSession session = CreatePipeSession(pipePair.WorkerStream, runtime);
+        Task runTask = session.RunAsync(cancellation.Token);
+        await CompleteGatewayHandshakeAsync(pipePair, cancellation.Token);
+
+        await pipePair.GatewayWriter
+            .WriteAsync(CreatePingCommandEnvelope("ping-1", "hello-worker"), cancellation.Token);
+
+        WorkerEnvelope replyEnvelope = await ReadUntilAsync(
+            pipePair.GatewayReader,
+            WorkerEnvelope.BodyOneofCase.WorkerCommandReply,
+            cancellation.Token);
+
+        MxCommandReply reply = replyEnvelope.WorkerCommandReply.Reply;
+        Assert.Equal("ping-1", reply.CorrelationId);
+        Assert.Equal(MxCommandKind.Ping, reply.Kind);
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+        Assert.Equal("hello-worker", reply.DiagnosticMessage);
+
+        await SendShutdownAndWaitAsync(pipePair, runTask, cancellation.Token);
+    }
+
+    /// <summary>
+    /// Verifies that GetSessionState reports the worker's lifecycle as the
+    /// proto SessionState — READY while the message loop is serving.
+    /// </summary>
+    [Fact]
+    public async Task RunAsync_GetSessionStateControlCommand_RepliesReady()
+    {
+        using CancellationTokenSource cancellation = new(TimeSpan.FromSeconds(5));
+        using PipePair pipePair = await PipePair.CreateAsync(cancellation.Token);
+        FakeRuntimeSession runtime = new();
+        WorkerPipeSession session = CreatePipeSession(pipePair.WorkerStream, runtime);
+        Task runTask = session.RunAsync(cancellation.Token);
+        await CompleteGatewayHandshakeAsync(pipePair, cancellation.Token);
+
+        await pipePair.GatewayWriter
+            .WriteAsync(
+                CreateControlCommandEnvelope(
+                    "state-1",
+                    MxCommandKind.GetSessionState,
+                    command => command.GetSessionState = new GetSessionStateCommand()),
+                cancellation.Token);
+
+        WorkerEnvelope replyEnvelope = await ReadUntilAsync(
+            pipePair.GatewayReader,
+            WorkerEnvelope.BodyOneofCase.WorkerCommandReply,
+            cancellation.Token);
+
+        MxCommandReply reply = replyEnvelope.WorkerCommandReply.Reply;
+        Assert.Equal(MxCommandKind.GetSessionState, reply.Kind);
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+        Assert.Equal(SessionState.Ready, reply.SessionState.State);
+
+        await SendShutdownAndWaitAsync(pipePair, runTask, cancellation.Token);
+    }
+
+    /// <summary>
+    /// Verifies that GetWorkerInfo populates the worker process id, version,
+    /// and MXAccess ProgID/CLSID from the worker's own metadata.
+    /// </summary>
+    [Fact]
+    public async Task RunAsync_GetWorkerInfoControlCommand_PopulatesWorkerInfoFields()
+    {
+        using CancellationTokenSource cancellation = new(TimeSpan.FromSeconds(5));
+        using PipePair pipePair = await PipePair.CreateAsync(cancellation.Token);
+        FakeRuntimeSession runtime = new();
+        WorkerPipeSession session = CreatePipeSession(pipePair.WorkerStream, runtime);
+        Task runTask = session.RunAsync(cancellation.Token);
+        await CompleteGatewayHandshakeAsync(pipePair, cancellation.Token);
+
+        await pipePair.GatewayWriter
+            .WriteAsync(
+                CreateControlCommandEnvelope(
+                    "info-1",
+                    MxCommandKind.GetWorkerInfo,
+                    command => command.GetWorkerInfo = new GetWorkerInfoCommand()),
+                cancellation.Token);
+
+        WorkerEnvelope replyEnvelope = await ReadUntilAsync(
+            pipePair.GatewayReader,
+            WorkerEnvelope.BodyOneofCase.WorkerCommandReply,
+            cancellation.Token);
+
+        MxCommandReply reply = replyEnvelope.WorkerCommandReply.Reply;
+        Assert.Equal(MxCommandKind.GetWorkerInfo, reply.Kind);
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+        WorkerInfoReply info = reply.WorkerInfo;
+        Assert.Equal(1234, info.WorkerProcessId);
+        Assert.False(string.IsNullOrEmpty(info.WorkerVersion));
+        Assert.Equal(MxAccessInteropInfo.ProgId, info.MxaccessProgid);
+        Assert.Equal(MxAccessInteropInfo.Clsid, info.MxaccessClsid);
+
+        await SendShutdownAndWaitAsync(pipePair, runTask, cancellation.Token);
+    }
+
+    /// <summary>
+    /// Verifies that DrainEvents drains the runtime session's queued events
+    /// into the reply rather than streaming them as WorkerEvent envelopes.
+    /// </summary>
+    [Fact]
+    public async Task RunAsync_DrainEventsControlCommand_ReturnsQueuedEvents()
+    {
+        using CancellationTokenSource cancellation = new(TimeSpan.FromSeconds(5));
+        using PipePair pipePair = await PipePair.CreateAsync(cancellation.Token);
+        // Suppress the background drain loop's fixed-batch drains so the
+        // queued events survive for the explicit DrainEvents command (which
+        // drains all via max_events == 0). 128 mirrors
+        // WorkerPipeSession.EventDrainBatchSize.
+        FakeRuntimeSession runtime = new() { SuppressDrainForBatchSize = 128 };
+        WorkerPipeSession session = CreatePipeSession(pipePair.WorkerStream, runtime);
+        runtime.EnqueueEvent(CreateWorkerEvent(sequence: 11));
+        runtime.EnqueueEvent(CreateWorkerEvent(sequence: 12));
+        Task runTask = session.RunAsync(cancellation.Token);
+        await CompleteGatewayHandshakeAsync(pipePair, cancellation.Token);
+
+        await pipePair.GatewayWriter
+            .WriteAsync(
+                CreateControlCommandEnvelope(
+                    "drain-1",
+                    MxCommandKind.DrainEvents,
+                    command => command.DrainEvents = new DrainEventsCommand { MaxEvents = 0 }),
+                cancellation.Token);
+
+        WorkerEnvelope replyEnvelope = await ReadUntilAsync(
+            pipePair.GatewayReader,
+            WorkerEnvelope.BodyOneofCase.WorkerCommandReply,
+            cancellation.Token);
+
+        MxCommandReply reply = replyEnvelope.WorkerCommandReply.Reply;
+        Assert.Equal(MxCommandKind.DrainEvents, reply.Kind);
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+        Assert.Equal(2, reply.DrainEvents.Events.Count);
+        Assert.Contains(reply.DrainEvents.Events, e => e.WorkerSequence == 11UL);
+        Assert.Contains(reply.DrainEvents.Events, e => e.WorkerSequence == 12UL);
+
+        await SendShutdownAndWaitAsync(pipePair, runTask, cancellation.Token);
+    }
+
+    /// <summary>
+    /// Verifies that ShutdownWorker returns its OK reply BEFORE the graceful
+    /// shutdown runs and disposes the runtime session, and that the message
+    /// loop then stops.
+    /// </summary>
+    [Fact]
+    public async Task RunAsync_ShutdownWorkerControlCommand_RepliesOkThenShutsDown()
+    {
+        using CancellationTokenSource cancellation = new(TimeSpan.FromSeconds(5));
+        using PipePair pipePair = await PipePair.CreateAsync(cancellation.Token);
+        FakeRuntimeSession runtime = new();
+        WorkerPipeSession session = CreatePipeSession(pipePair.WorkerStream, runtime);
+        Task runTask = session.RunAsync(cancellation.Token);
+        await CompleteGatewayHandshakeAsync(pipePair, cancellation.Token);
+
+        await pipePair.GatewayWriter
+            .WriteAsync(
+                CreateControlCommandEnvelope(
+                    "shutdown-1",
+                    MxCommandKind.ShutdownWorker,
+                    command => command.ShutdownWorker = new ShutdownWorkerCommand
+                    {
+                        GracePeriod = Duration.FromTimeSpan(TimeSpan.FromSeconds(1)),
+                    }),
+                cancellation.Token);
+
+        WorkerEnvelope replyEnvelope = await ReadUntilAsync(
+            pipePair.GatewayReader,
+            WorkerEnvelope.BodyOneofCase.WorkerCommandReply,
+            cancellation.Token);
+
+        MxCommandReply reply = replyEnvelope.WorkerCommandReply.Reply;
+        Assert.Equal("shutdown-1", reply.CorrelationId);
+        Assert.Equal(MxCommandKind.ShutdownWorker, reply.Kind);
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+
+        // The OK reply is followed by a shutdown ack, then the loop stops and
+        // the runtime session is disposed.
+        WorkerEnvelope ack = await ReadUntilAsync(
+            pipePair.GatewayReader,
+            WorkerEnvelope.BodyOneofCase.WorkerShutdownAck,
+            cancellation.Token);
+        Assert.Equal(ProtocolStatusCode.Ok, ack.WorkerShutdownAck.Status.Code);
+
+        Task completedTask = await Task
+            .WhenAny(runTask, Task.Delay(TimeSpan.FromSeconds(5), cancellation.Token));
+        Assert.Same(runTask, completedTask);
+        await runTask;
+        Assert.True(runtime.Disposed, "ShutdownWorker must dispose the runtime session.");
+    }
+

     /// <summary>
     /// Verifies that stale STA activity with no command in flight triggers
@@ -867,6 +1067,20 @@ public sealed class WorkerPipeSessionTests
             () => 1234);
     }

+    private static WorkerPipeSession CreatePipeSession(
+        Stream stream,
+        FakeRuntimeSession runtime)
+    {
+        return CreatePipeSession(
+            stream,
+            runtime,
+            new WorkerPipeSessionOptions
+            {
+                HeartbeatInterval = TimeSpan.FromMilliseconds(100),
+                HeartbeatGrace = TimeSpan.FromSeconds(5),
+            });
+    }
+
     private static WorkerPipeSession CreatePipeSession(
         Stream stream,
         FakeRuntimeSession runtime,
@@ -916,6 +1130,11 @@ public sealed class WorkerPipeSessionTests
         };
     }

+    // A generic STA-dispatched command used by the dispatch/heartbeat/
+    // shutdown-race tests. Register is a real MXAccess command kind (not a
+    // worker control command), so it flows through IWorkerRuntimeSession
+    // .DispatchAsync — unlike Ping/GetSessionState/etc., which are answered on
+    // the message-loop thread without touching the STA.
     private static WorkerEnvelope CreateCommandEnvelope(string correlationId, ulong sequence = 2)
     {
         return new WorkerEnvelope
@@ -928,10 +1147,10 @@ public sealed class WorkerPipeSessionTests
             {
                 Command = new MxCommand
                 {
-                    Kind = MxCommandKind.Ping,
-                    Ping = new PingCommand
+                    Kind = MxCommandKind.Register,
+                    Register = new RegisterCommand
                     {
-                        Message = "ping",
+                        ClientName = "test-client",
                     },
                 },
                 EnqueueTimestamp = Timestamp.FromDateTimeOffset(DateTimeOffset.UtcNow),
@@ -939,6 +1158,40 @@ public sealed class WorkerPipeSessionTests
         };
     }

+    private static WorkerEnvelope CreatePingCommandEnvelope(
+        string correlationId,
+        string message,
+        ulong sequence = 2)
+    {
+        return CreateControlCommandEnvelope(
+            correlationId,
+            MxCommandKind.Ping,
+            command => command.Ping = new PingCommand { Message = message },
+            sequence);
+    }
+
+    private static WorkerEnvelope CreateControlCommandEnvelope(
+        string correlationId,
+        MxCommandKind kind,
+        Action<MxCommand> configurePayload,
+        ulong sequence = 2)
+    {
+        MxCommand command = new() { Kind = kind };
+        configurePayload(command);
+        return new WorkerEnvelope
+        {
+            ProtocolVersion = GatewayContractInfo.WorkerProtocolVersion,
+            SessionId = SessionId,
+            Sequence = sequence,
+            CorrelationId = correlationId,
+            WorkerCommand = new WorkerCommand
+            {
+                Command = command,
+                EnqueueTimestamp = Timestamp.FromDateTimeOffset(DateTimeOffset.UtcNow),
+            },
+        };
+    }
+
     private static WorkerEnvelope CreateCancelEnvelope(string correlationId, ulong sequence = 2)
     {
         return new WorkerEnvelope
@@ -259,6 +259,21 @@ public sealed class LmxSubtagAlarmSourceTests
         {
         }

+        public object Suspend(int serverHandle, int itemHandle) => new object();
+
+        public object Activate(int serverHandle, int itemHandle) => new object();
+
+        public int AuthenticateUser(int serverHandle, string verifyUser, string verifyUserPassword) => 0;
+
+        public int ArchestrAUserToId(int serverHandle, string userIdGuid) => 0;
+
+        public int AddBufferedItem(int serverHandle, string itemDefinition, string itemContext)
+            => AddItem(serverHandle, itemDefinition);
+
+        public void SetBufferedUpdateInterval(int serverHandle, int updateIntervalMilliseconds)
+        {
+        }
+
         internal sealed class WriteRecord
         {
             public WriteRecord(int serverHandle, int itemHandle, object? value, int userId)
@@ -33,6 +33,39 @@ public sealed class MxAccessComServerTests
         Assert.Equal(new[] { "Register:client-a", "Advise:77:9", "Unregister:77" }, typed.Calls);
     }

+    /// <summary>
+    /// The MXAccess command methods added in the worker COM commands bundle
+    /// (Suspend/Activate/AuthenticateUser/ArchestrAUserToId/AddBufferedItem/
+    /// SetBufferedUpdateInterval) route through the typed interface with their
+    /// arguments preserved, and the credential is never echoed back.
+    /// </summary>
+    [Fact]
+    public void CommandMethods_WithTypedServer_RouteThroughTypedInterface()
+    {
+        RecordingMxAccessServer typed = new(registerHandle: 5);
+        MxAccessComServer adapter = new(typed);
+
+        adapter.Suspend(serverHandle: 5, itemHandle: 11);
+        adapter.Activate(serverHandle: 5, itemHandle: 12);
+        adapter.AuthenticateUser(serverHandle: 5, verifyUser: "Administrator", verifyUserPassword: "s3cret");
+        adapter.ArchestrAUserToId(serverHandle: 5, userIdGuid: "guid-1");
+        adapter.AddBufferedItem(serverHandle: 5, itemDefinition: "TestInt", itemContext: "TestChildObject");
+        adapter.SetBufferedUpdateInterval(serverHandle: 5, updateIntervalMilliseconds: 250);
+
+        Assert.Equal(
+            new[]
+            {
+                "Suspend:5:11",
+                "Activate:5:12",
+                "AuthenticateUser:5:Administrator",
+                "ArchestrAUserToId:5:guid-1",
+                "AddBufferedItem:5:TestInt:TestChildObject",
+                "SetBufferedUpdateInterval:5:250",
+            },
+            typed.Calls);
+        Assert.DoesNotContain(typed.Calls, call => call.Contains("s3cret", StringComparison.Ordinal));
+    }
+
     /// <summary>
     /// A COM object that implements neither the typed COM interface family
     /// nor <see cref="IMxAccessServer"/> fails fast with a clear
@@ -207,5 +240,60 @@ public sealed class MxAccessComServerTests
         {
             calls.Add($"WriteSecured2:{serverHandle}:{itemHandle}:{currentUserId}:{verifierUserId}:{value}:{timestamp}");
         }
+
+        /// <summary>Records a Suspend call and returns a canned status.</summary>
+        /// <param name="serverHandle">The MXAccess server handle.</param>
+        /// <param name="itemHandle">The MXAccess item handle.</param>
+        public object Suspend(int serverHandle, int itemHandle)
+        {
+            calls.Add($"Suspend:{serverHandle}:{itemHandle}");
+            return new object();
+        }
+
+        /// <summary>Records an Activate call and returns a canned status.</summary>
+        /// <param name="serverHandle">The MXAccess server handle.</param>
+        /// <param name="itemHandle">The MXAccess item handle.</param>
+        public object Activate(int serverHandle, int itemHandle)
+        {
+            calls.Add($"Activate:{serverHandle}:{itemHandle}");
+            return new object();
+        }
+
+        /// <summary>Records an AuthenticateUser call and returns zero.</summary>
+        /// <param name="serverHandle">The MXAccess server handle.</param>
+        /// <param name="verifyUser">The user name to authenticate.</param>
+        /// <param name="verifyUserPassword">The credential; recorded only as a fixed marker, never echoed.</param>
+        public int AuthenticateUser(int serverHandle, string verifyUser, string verifyUserPassword)
+        {
+            calls.Add($"AuthenticateUser:{serverHandle}:{verifyUser}");
+            return 0;
+        }
+
+        /// <summary>Records an ArchestrAUserToId call and returns zero.</summary>
+        /// <param name="serverHandle">The MXAccess server handle.</param>
+        /// <param name="userIdGuid">The ArchestrA user GUID to resolve.</param>
+        public int ArchestrAUserToId(int serverHandle, string userIdGuid)
+        {
+            calls.Add($"ArchestrAUserToId:{serverHandle}:{userIdGuid}");
+            return 0;
+        }
+
+        /// <summary>Records an AddBufferedItem call and returns zero.</summary>
+        /// <param name="serverHandle">The MXAccess server handle.</param>
+        /// <param name="itemDefinition">The item definition string to record.</param>
+        /// <param name="itemContext">The item context string to record.</param>
+        public int AddBufferedItem(int serverHandle, string itemDefinition, string itemContext)
+        {
+            calls.Add($"AddBufferedItem:{serverHandle}:{itemDefinition}:{itemContext}");
+            return 0;
+        }
+
+        /// <summary>Records a SetBufferedUpdateInterval call.</summary>
+        /// <param name="serverHandle">The MXAccess server handle.</param>
+        /// <param name="updateIntervalMilliseconds">The buffered update interval in milliseconds.</param>
+        public void SetBufferedUpdateInterval(int serverHandle, int updateIntervalMilliseconds)
+        {
+            calls.Add($"SetBufferedUpdateInterval:{serverHandle}:{updateIntervalMilliseconds}");
+        }
     }
 }
@@ -952,6 +952,295 @@ public sealed class MxAccessCommandExecutorTests
         Assert.Null(fakeComObject.WriteServerHandle);
     }

+    /// <summary>Verifies Suspend calls MXAccess on the STA and maps the native status to MxStatusProxy.</summary>
+    [Fact]
+    public async Task DispatchAsync_Suspend_CallsMxAccessOnStaAndMapsStatus()
+    {
+        FakeMxAccessComObject fakeComObject = new(registerHandle: 200);
+        FakeMxAccessComObjectFactory factory = new(fakeComObject);
+        using StaRuntime runtime = CreateRuntime();
+        using MxAccessStaSession session = new(runtime, factory, new NoopEventSink());
+        await session.StartAsync(workerProcessId: 1234);
+        await session.DispatchAsync(CreateRegisterCommand("register-before-suspend", "client-a"));
+
+        MxCommandReply reply = await session.DispatchAsync(CreateSuspendCommand("suspend-1", 200, 21));
+
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+        Assert.Equal(0, reply.Hresult);
+        Assert.NotNull(reply.Suspend);
+        Assert.NotNull(reply.Suspend.Status);
+        Assert.Equal(1, reply.Suspend.Status.Success);
+        Assert.Equal(MxStatusCategory.Ok, reply.Suspend.Status.Category);
+        Assert.Equal(200, fakeComObject.SuspendServerHandle);
+        Assert.Equal(21, fakeComObject.SuspendItemHandle);
+        Assert.Equal(runtime.StaThreadId, fakeComObject.SuspendThreadId);
+    }
+
+    /// <summary>Verifies Activate calls MXAccess on the STA and maps the native status to MxStatusProxy.</summary>
+    [Fact]
+    public async Task DispatchAsync_Activate_CallsMxAccessOnStaAndMapsStatus()
+    {
+        FakeMxAccessComObject fakeComObject = new(registerHandle: 201);
+        FakeMxAccessComObjectFactory factory = new(fakeComObject);
+        using StaRuntime runtime = CreateRuntime();
+        using MxAccessStaSession session = new(runtime, factory, new NoopEventSink());
+        await session.StartAsync(workerProcessId: 1234);
+        await session.DispatchAsync(CreateRegisterCommand("register-before-activate", "client-a"));
+
+        MxCommandReply reply = await session.DispatchAsync(CreateActivateCommand("activate-1", 201, 22));
+
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+        Assert.NotNull(reply.Activate);
+        Assert.NotNull(reply.Activate.Status);
+        Assert.Equal(1, reply.Activate.Status.Success);
+        Assert.Equal(MxStatusCategory.Ok, reply.Activate.Status.Category);
+        Assert.Equal(201, fakeComObject.ActivateServerHandle);
+        Assert.Equal(22, fakeComObject.ActivateItemHandle);
+        Assert.Equal(runtime.StaThreadId, fakeComObject.ActivateThreadId);
+    }
+
+    /// <summary>Verifies AuthenticateUser passes credentials to MXAccess on the STA and returns the user id.</summary>
+    [Fact]
+    public async Task DispatchAsync_AuthenticateUser_CallsMxAccessOnStaAndReturnsUserId()
+    {
+        FakeMxAccessComObject fakeComObject = new(registerHandle: 202);
+        FakeMxAccessComObjectFactory factory = new(fakeComObject);
+        using StaRuntime runtime = CreateRuntime();
+        using MxAccessStaSession session = new(runtime, factory, new NoopEventSink());
+        await session.StartAsync(workerProcessId: 1234);
+        await session.DispatchAsync(CreateRegisterCommand("register-before-auth", "client-a"));
+
+        MxCommandReply reply = await session.DispatchAsync(
+            CreateAuthenticateUserCommand("auth-1", 202, "Administrator", string.Empty));
+
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+        Assert.NotNull(reply.AuthenticateUser);
+        Assert.Equal(1, reply.AuthenticateUser.UserId);
+        Assert.Equal(202, fakeComObject.AuthenticateServerHandle);
+        Assert.Equal("Administrator", fakeComObject.AuthenticateUserName);
+        Assert.Equal(runtime.StaThreadId, fakeComObject.AuthenticateThreadId);
+    }
+
+    /// <summary>
+    /// Verifies the AuthenticateUser path never surfaces the credential into the
+    /// command reply or any recorded diagnostic — the password is only ever
+    /// handed straight to the MXAccess wrapper.
+    /// </summary>
+    [Fact]
+    public async Task DispatchAsync_AuthenticateUser_DoesNotLeakPassword()
+    {
+        const string secret = "sup3r-secret-pw";
+        FakeMxAccessComObject fakeComObject = new(registerHandle: 203);
+        FakeMxAccessComObjectFactory factory = new(fakeComObject);
+        using StaRuntime runtime = CreateRuntime();
+        using MxAccessStaSession session = new(runtime, factory, new NoopEventSink());
+        await session.StartAsync(workerProcessId: 1234);
+        await session.DispatchAsync(CreateRegisterCommand("register-before-auth-leak", "client-a"));
+
+        MxCommandReply reply = await session.DispatchAsync(
+            CreateAuthenticateUserCommand("auth-leak", 203, "Administrator", secret));
+
+        // The wrapper still receives the credential verbatim...
+        Assert.Equal(secret, fakeComObject.AuthenticatePassword);
+
+        // ...but the reply (diagnostics, status text) and the fake's operation
+        // log must never contain it.
+        Assert.DoesNotContain(secret, reply.DiagnosticMessage ?? string.Empty, StringComparison.Ordinal);
+        Assert.DoesNotContain(secret, reply.ProtocolStatus.Message ?? string.Empty, StringComparison.Ordinal);
+        Assert.DoesNotContain(fakeComObject.OperationNames, name => name.Contains(secret, StringComparison.Ordinal));
+    }
+
+    /// <summary>Verifies ArchestrAUserToId calls MXAccess on the STA and returns the resolved user id.</summary>
+    [Fact]
+    public async Task DispatchAsync_ArchestrAUserToId_CallsMxAccessOnStaAndReturnsUserId()
+    {
+        FakeMxAccessComObject fakeComObject = new(registerHandle: 204);
+        FakeMxAccessComObjectFactory factory = new(fakeComObject);
+        using StaRuntime runtime = CreateRuntime();
+        using MxAccessStaSession session = new(runtime, factory, new NoopEventSink());
+        await session.StartAsync(workerProcessId: 1234);
+        await session.DispatchAsync(CreateRegisterCommand("register-before-user-to-id", "client-a"));
+
+        MxCommandReply reply = await session.DispatchAsync(
+            CreateArchestrAUserToIdCommand("user-to-id-1", 204, "11112222-3333-4444-5555-666677778888"));
+
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+        Assert.NotNull(reply.ArchestraUserToId);
+        Assert.Equal(7, reply.ArchestraUserToId.UserId);
+        Assert.Equal(204, fakeComObject.ArchestrAUserToIdServerHandle);
+        Assert.Equal("11112222-3333-4444-5555-666677778888", fakeComObject.ArchestrAUserToIdGuid);
+    }
+
+    /// <summary>Verifies AddBufferedItem calls MXAccess on the STA and tracks the buffered item handle.</summary>
+    [Fact]
+    public async Task DispatchAsync_AddBufferedItem_CallsMxAccessOnStaAndTracksItemHandle()
+    {
+        FakeMxAccessComObject fakeComObject = new(registerHandle: 205);
+        FakeMxAccessComObjectFactory factory = new(fakeComObject);
+        using StaRuntime runtime = CreateRuntime();
+        using MxAccessStaSession session = new(runtime, factory, new NoopEventSink());
+        await session.StartAsync(workerProcessId: 1234);
+        await session.DispatchAsync(CreateRegisterCommand("register-before-buffered", "client-a"));
+
+        MxCommandReply reply = await session.DispatchAsync(
+            CreateAddBufferedItemCommand("buffered-1", 205, "TestInt", "TestChildObject"));
+
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+        Assert.NotNull(reply.AddBufferedItem);
+        Assert.Equal(1, reply.AddBufferedItem.ItemHandle);
+        Assert.Equal(MxDataType.Integer, reply.ReturnValue.DataType);
+        Assert.Equal(1, reply.ReturnValue.Int32Value);
+        Assert.Equal(205, fakeComObject.AddBufferedItemServerHandle);
+        Assert.Equal("TestInt", fakeComObject.AddBufferedItemDefinition);
+        Assert.Equal("TestChildObject", fakeComObject.AddBufferedItemContext);
+
+        RegisteredItemHandle registeredItemHandle = Assert.Single(
+            await session.GetRegisteredItemHandlesAsync());
+        Assert.Equal(205, registeredItemHandle.ServerHandle);
+        Assert.Equal(1, registeredItemHandle.ItemHandle);
+        Assert.Equal("TestInt", registeredItemHandle.ItemDefinition);
+        Assert.Equal("TestChildObject", registeredItemHandle.ItemContext);
+        Assert.True(registeredItemHandle.HasItemContext);
+    }
+
+    /// <summary>Verifies SetBufferedUpdateInterval calls MXAccess on the STA and returns a base OK reply.</summary>
+    [Fact]
+    public async Task DispatchAsync_SetBufferedUpdateInterval_CallsMxAccessOnStaAndReturnsOk()
+    {
+        FakeMxAccessComObject fakeComObject = new(registerHandle: 206);
+        FakeMxAccessComObjectFactory factory = new(fakeComObject);
+        using StaRuntime runtime = CreateRuntime();
+        using MxAccessStaSession session = new(runtime, factory, new NoopEventSink());
+        await session.StartAsync(workerProcessId: 1234);
+        await session.DispatchAsync(CreateRegisterCommand("register-before-interval", "client-a"));
+
+        MxCommandReply reply = await session.DispatchAsync(
+            CreateSetBufferedUpdateIntervalCommand("interval-1", 206, 500));
+
+        Assert.Equal(ProtocolStatusCode.Ok, reply.ProtocolStatus.Code);
+        Assert.Equal(0, reply.Hresult);
+        Assert.Equal(206, fakeComObject.SetBufferedUpdateIntervalServerHandle);
+        Assert.Equal(500, fakeComObject.SetBufferedUpdateIntervalValue);
+    }
+
+    private static StaCommand CreateSuspendCommand(
+        string correlationId,
+        int serverHandle,
+        int itemHandle)
+    {
+        return new StaCommand(
+            "session-1",
+            correlationId,
+            new MxCommand
+            {
+                Kind = MxCommandKind.Suspend,
+                Suspend = new SuspendCommand
+                {
+                    ServerHandle = serverHandle,
+                    ItemHandle = itemHandle,
+                },
+            });
+    }
+
+    private static StaCommand CreateActivateCommand(
+        string correlationId,
+        int serverHandle,
+        int itemHandle)
+    {
+        return new StaCommand(
+            "session-1",
+            correlationId,
+            new MxCommand
+            {
+                Kind = MxCommandKind.Activate,
+                Activate = new ActivateCommand
+                {
+                    ServerHandle = serverHandle,
+                    ItemHandle = itemHandle,
+                },
+            });
+    }
+
+    private static StaCommand CreateAuthenticateUserCommand(
+        string correlationId,
+        int serverHandle,
+        string verifyUser,
+        string verifyUserPassword)
+    {
+        return new StaCommand(
+            "session-1",
+            correlationId,
+            new MxCommand
+            {
+                Kind = MxCommandKind.AuthenticateUser,
+                AuthenticateUser = new AuthenticateUserCommand
+                {
+                    ServerHandle = serverHandle,
+                    VerifyUser = verifyUser,
+                    VerifyUserPassword = verifyUserPassword,
+                },
+            });
+    }
+
+    private static StaCommand CreateArchestrAUserToIdCommand(
+        string correlationId,
+        int serverHandle,
+        string userIdGuid)
+    {
+        return new StaCommand(
+            "session-1",
+            correlationId,
+            new MxCommand
+            {
+                Kind = MxCommandKind.ArchestraUserToId,
+                ArchestraUserToId = new ArchestrAUserToIdCommand
+                {
+                    ServerHandle = serverHandle,
+                    UserIdGuid = userIdGuid,
+                },
+            });
+    }
+
+    private static StaCommand CreateAddBufferedItemCommand(
+        string correlationId,
+        int serverHandle,
+        string itemDefinition,
+        string itemContext)
+    {
+        return new StaCommand(
+            "session-1",
+            correlationId,
+            new MxCommand
+            {
+                Kind = MxCommandKind.AddBufferedItem,
+                AddBufferedItem = new AddBufferedItemCommand
+                {
+                    ServerHandle = serverHandle,
+                    ItemDefinition = itemDefinition,
+                    ItemContext = itemContext,
+                },
+            });
+    }
+
+    private static StaCommand CreateSetBufferedUpdateIntervalCommand(
+        string correlationId,
+        int serverHandle,
+        int updateIntervalMilliseconds)
+    {
+        return new StaCommand(
+            "session-1",
+            correlationId,
+            new MxCommand
+            {
+                Kind = MxCommandKind.SetBufferedUpdateInterval,
+                SetBufferedUpdateInterval = new SetBufferedUpdateIntervalCommand
+                {
+                    ServerHandle = serverHandle,
+                    UpdateIntervalMilliseconds = updateIntervalMilliseconds,
+                },
+            });
+    }
+
     private static StaCommand CreateRegisterCommand(
         string correlationId,
         string clientName)
@@ -1810,6 +2099,151 @@ public sealed class MxAccessCommandExecutorTests
                 throw exception;
             }
         }
+
+        /// <summary>Gets the server handle passed to Suspend, if called.</summary>
+        public int? SuspendServerHandle { get; private set; }
+
+        /// <summary>Gets the item handle passed to Suspend, if called.</summary>
+        public int? SuspendItemHandle { get; private set; }
+
+        /// <summary>Gets the thread ID on which Suspend was called.</summary>
+        public int? SuspendThreadId { get; private set; }
+
+        /// <summary>Gets the server handle passed to Activate, if called.</summary>
+        public int? ActivateServerHandle { get; private set; }
+
+        /// <summary>Gets the item handle passed to Activate, if called.</summary>
+        public int? ActivateItemHandle { get; private set; }
+
+        /// <summary>Gets the thread ID on which Activate was called.</summary>
+        public int? ActivateThreadId { get; private set; }
+
+        /// <summary>Gets the server handle passed to AuthenticateUser, if called.</summary>
+        public int? AuthenticateServerHandle { get; private set; }
+
+        /// <summary>Gets the user name passed to AuthenticateUser, if called.</summary>
+        public string? AuthenticateUserName { get; private set; }
+
+        /// <summary>Gets the credential passed to AuthenticateUser, if called. Used only to prove non-logging.</summary>
+        public string? AuthenticatePassword { get; private set; }
+
+        /// <summary>Gets the thread ID on which AuthenticateUser was called.</summary>
+        public int? AuthenticateThreadId { get; private set; }
+
+        /// <summary>Gets the server handle passed to ArchestrAUserToId, if called.</summary>
+        public int? ArchestrAUserToIdServerHandle { get; private set; }
+
+        /// <summary>Gets the GUID passed to ArchestrAUserToId, if called.</summary>
+        public string? ArchestrAUserToIdGuid { get; private set; }
+
+        /// <summary>Gets the server handle passed to AddBufferedItem, if called.</summary>
+        public int? AddBufferedItemServerHandle { get; private set; }
+
+        /// <summary>Gets the item definition passed to AddBufferedItem, if called.</summary>
+        public string? AddBufferedItemDefinition { get; private set; }
+
+        /// <summary>Gets the item context passed to AddBufferedItem, if called.</summary>
+        public string? AddBufferedItemContext { get; private set; }
+
+        /// <summary>Gets the server handle passed to SetBufferedUpdateInterval, if called.</summary>
+        public int? SetBufferedUpdateIntervalServerHandle { get; private set; }
+
+        /// <summary>Gets the interval passed to SetBufferedUpdateInterval, if called.</summary>
+        public int? SetBufferedUpdateIntervalValue { get; private set; }
+
+        /// <summary>Suspends an item and returns a canned status whose fields drive MxStatusProxy conversion.</summary>
+        /// <param name="serverHandle">Server handle for the suspend.</param>
+        /// <param name="itemHandle">Item handle to suspend.</param>
+        /// <returns>A status stand-in with all-OK fields.</returns>
+        public object Suspend(int serverHandle, int itemHandle)
+        {
+            operationNames.Add($"Suspend:{serverHandle}:{itemHandle}");
+            SuspendServerHandle = serverHandle;
+            SuspendItemHandle = itemHandle;
+            SuspendThreadId = Environment.CurrentManagedThreadId;
+            return new FakeMxStatus { success = 1, category = 0, detectedBy = 0, detail = 0 };
+        }
+
+        /// <summary>Activates an item and returns a canned status whose fields drive MxStatusProxy conversion.</summary>
+        /// <param name="serverHandle">Server handle for the activate.</param>
+        /// <param name="itemHandle">Item handle to activate.</param>
+        /// <returns>A status stand-in with all-OK fields.</returns>
+        public object Activate(int serverHandle, int itemHandle)
+        {
|
||||
operationNames.Add($"Activate:{serverHandle}:{itemHandle}");
|
||||
ActivateServerHandle = serverHandle;
|
||||
ActivateItemHandle = itemHandle;
|
||||
ActivateThreadId = Environment.CurrentManagedThreadId;
|
||||
return new FakeMxStatus { success = 1, category = 0, detectedBy = 0, detail = 0 };
|
||||
}
|
||||
|
||||
/// <summary>Authenticates a user and returns a canned user id.</summary>
|
||||
/// <param name="serverHandle">Server handle for the authentication.</param>
|
||||
/// <param name="verifyUser">User name to authenticate.</param>
|
||||
/// <param name="verifyUserPassword">Credential; recorded only to assert it is never logged.</param>
|
||||
/// <returns>The canned MXAccess user id (1).</returns>
|
||||
public int AuthenticateUser(int serverHandle, string verifyUser, string verifyUserPassword)
|
||||
{
|
||||
// Deliberately does NOT include the password in the operation log.
|
||||
operationNames.Add($"AuthenticateUser:{serverHandle}:{verifyUser}");
|
||||
AuthenticateServerHandle = serverHandle;
|
||||
AuthenticateUserName = verifyUser;
|
||||
AuthenticatePassword = verifyUserPassword;
|
||||
AuthenticateThreadId = Environment.CurrentManagedThreadId;
|
||||
return 1;
|
||||
}
|
||||
|
||||
/// <summary>Resolves an ArchestrA user GUID and returns a canned user id.</summary>
|
||||
/// <param name="serverHandle">Server handle for the resolution.</param>
|
||||
/// <param name="userIdGuid">ArchestrA user GUID to resolve.</param>
|
||||
/// <returns>The canned MXAccess user id (7).</returns>
|
||||
public int ArchestrAUserToId(int serverHandle, string userIdGuid)
|
||||
{
|
||||
operationNames.Add($"ArchestrAUserToId:{serverHandle}:{userIdGuid}");
|
||||
ArchestrAUserToIdServerHandle = serverHandle;
|
||||
ArchestrAUserToIdGuid = userIdGuid;
|
||||
return 7;
|
||||
}
|
||||
|
||||
/// <summary>Adds a buffered item and returns a canned item handle.</summary>
|
||||
/// <param name="serverHandle">Server handle to add the item to.</param>
|
||||
/// <param name="itemDefinition">Item definition string.</param>
|
||||
/// <param name="itemContext">Item context string.</param>
|
||||
/// <returns>The canned buffered item handle (1).</returns>
|
||||
public int AddBufferedItem(int serverHandle, string itemDefinition, string itemContext)
|
||||
{
|
||||
operationNames.Add($"AddBufferedItem:{serverHandle}:{itemDefinition}:{itemContext}");
|
||||
AddBufferedItemServerHandle = serverHandle;
|
||||
AddBufferedItemDefinition = itemDefinition;
|
||||
AddBufferedItemContext = itemContext;
|
||||
return 1;
|
||||
}
|
||||
|
||||
/// <summary>Sets the buffered update interval and tracks the operation.</summary>
|
||||
/// <param name="serverHandle">Server handle for the interval change.</param>
|
||||
/// <param name="updateIntervalMilliseconds">Buffered update interval in milliseconds.</param>
|
||||
public void SetBufferedUpdateInterval(int serverHandle, int updateIntervalMilliseconds)
|
||||
{
|
||||
operationNames.Add($"SetBufferedUpdateInterval:{serverHandle}:{updateIntervalMilliseconds}");
|
||||
SetBufferedUpdateIntervalServerHandle = serverHandle;
|
||||
SetBufferedUpdateIntervalValue = updateIntervalMilliseconds;
|
||||
}
|
||||
|
||||
/// <summary>Status stand-in reflected over by the worker's MxStatusProxy converter.</summary>
|
||||
internal sealed class FakeMxStatus
|
||||
{
|
||||
/// <summary>Success indicator read by the status converter.</summary>
|
||||
public int success;
|
||||
|
||||
/// <summary>Status category read by the status converter.</summary>
|
||||
public int category;
|
||||
|
||||
/// <summary>Status detected-by read by the status converter.</summary>
|
||||
public int detectedBy;
|
||||
|
||||
/// <summary>Status detail read by the status converter.</summary>
|
||||
public int detail;
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>Factory for creating fake MXAccess COM objects in tests.</summary>
|
||||
|
||||
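The fake above records the credential on a property but leaves it out of the operation log, so a test can assert that the log never contains it. A minimal sketch of that idea follows. `SketchAuthFake` and `LogContains` are hypothetical names for illustration and are not part of this change.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified recording fake (not the fake from this diff).
public sealed class SketchAuthFake
{
    private readonly List<string> operationNames = new List<string>();

    // Credential captured separately so a test can compare it against the log.
    public string RecordedPassword { get; private set; } = string.Empty;

    public int AuthenticateUser(int serverHandle, string user, string password)
    {
        // Only the handle and user name are logged; the password is not.
        operationNames.Add($"AuthenticateUser:{serverHandle}:{user}");
        RecordedPassword = password;
        return 1; // canned user id
    }

    // True when any logged operation contains the given text.
    public bool LogContains(string text) =>
        operationNames.Any(name => name.Contains(text));
}
```

A test would call `AuthenticateUser`, then check that `LogContains(RecordedPassword)` is false while the user name is still present in the log.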
@@ -122,11 +122,26 @@ internal sealed class FakeRuntimeSession : IWorkerRuntimeSession
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// When set, <see cref="DrainEvents"/> returns no events whenever it is
|
||||
/// called with this batch size, which is the fixed size used by the
|
||||
/// WorkerPipeSession background drain loop. An explicit DrainEvents control
|
||||
/// command (which drains everything via <c>maxEvents == 0</c>) can then
|
||||
/// claim the queued events deterministically without racing the 25 ms
|
||||
/// background loop. Mirrors <c>WorkerPipeSession.EventDrainBatchSize</c>.
|
||||
/// </summary>
|
||||
public uint? SuppressDrainForBatchSize { get; set; }
|
||||
|
||||
/// <summary>Drains queued events up to the specified limit.</summary>
|
||||
/// <param name="maxEvents">Maximum events to drain; 0 drains all.</param>
|
||||
/// <returns>The drained events.</returns>
|
||||
public IReadOnlyList<WorkerEvent> DrainEvents(uint maxEvents)
|
||||
{
|
||||
if (SuppressDrainForBatchSize is uint suppressed && maxEvents == suppressed)
|
||||
{
|
||||
return Array.Empty<WorkerEvent>();
|
||||
}
|
||||
|
||||
lock (gate)
|
||||
{
|
||||
int drainCount = maxEvents == 0
|
||||
|
||||
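The suppression switch in this hunk can be illustrated with a small stand-alone queue. This is a sketch under stated assumptions: `SketchEventQueue` is a hypothetical type, `string` stands in for `WorkerEvent`, and the drain-count logic is one plausible completion, since the hunk cuts off before the rest of the real method. The batch size of 128 used in the usage note is also an assumption; the real value of `WorkerPipeSession.EventDrainBatchSize` is not shown here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical simplified queue; string stands in for WorkerEvent.
public sealed class SketchEventQueue
{
    private readonly object gate = new object();
    private readonly Queue<string> events = new Queue<string>();

    // When set, drains requested with exactly this batch size return nothing.
    public uint? SuppressDrainForBatchSize { get; set; }

    public void Enqueue(string item)
    {
        lock (gate)
        {
            events.Enqueue(item);
        }
    }

    // Drains up to maxEvents items; 0 drains everything that is queued.
    public IReadOnlyList<string> Drain(uint maxEvents)
    {
        if (SuppressDrainForBatchSize is uint suppressed && maxEvents == suppressed)
        {
            return Array.Empty<string>();
        }

        lock (gate)
        {
            int count = maxEvents == 0
                ? events.Count
                : (int)Math.Min(maxEvents, (uint)events.Count);

            var drained = new List<string>(count);
            for (int i = 0; i < count; i++)
            {
                drained.Add(events.Dequeue());
            }

            return drained;
        }
    }
}
```

With `SuppressDrainForBatchSize = 128u`, a background caller that always passes 128 gets nothing, while an explicit `Drain(0)` still receives every queued item.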
@@ -1,3 +1,4 @@
|
||||
using ZB.MOM.WW.MxGateway.Worker.Conversion;
|
||||
using ZB.MOM.WW.MxGateway.Worker.MxAccess;
|
||||
|
||||
namespace ZB.MOM.WW.MxGateway.Worker.Tests.TestSupport;
|
||||
@@ -55,14 +56,10 @@ internal sealed class NoopMxAccessServer : IMxAccessServer
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
-public void Suspend(int serverHandle, int itemHandle)
|
||||
-{
|
||||
-}
|
||||
+public object Suspend(int serverHandle, int itemHandle) => new FakeMxStatus();
|
||||
|
||||
/// <inheritdoc />
|
||||
-public void Activate(int serverHandle, int itemHandle)
|
||||
-{
|
||||
-}
|
||||
+public object Activate(int serverHandle, int itemHandle) => new FakeMxStatus();
|
||||
|
||||
/// <inheritdoc />
|
||||
public void Write(int serverHandle, int itemHandle, object? value, int userId)
|
||||
@@ -85,8 +82,34 @@ internal sealed class NoopMxAccessServer : IMxAccessServer
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
-public int AuthenticateUser(string userName, string password) => 0;
|
||||
+public int AuthenticateUser(int serverHandle, string verifyUser, string verifyUserPassword) => 0;
|
||||
|
||||
/// <inheritdoc />
|
||||
-public int ArchestrAUserToId(string userName) => 0;
|
||||
+public int ArchestrAUserToId(int serverHandle, string userIdGuid) => 0;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Minimal stand-in for the native <c>ArchestrA.MxAccess.MxStatus</c> struct.
|
||||
/// <see cref="MxStatusProxyConverter"/> reflects over the public
|
||||
/// <c>success</c>, <c>category</c>, <c>detectedBy</c>, and <c>detail</c>
|
||||
/// fields, so this fake exposes the same field shape with all-OK values.
|
||||
/// </summary>
|
||||
internal sealed class FakeMxStatus
|
||||
{
|
||||
// These public fields exist solely so MxStatusProxyConverter can reflect
|
||||
// over them by name. They are read only through reflection and never assigned
|
||||
// in code, so the compiler's "never assigned" warning (CS0649) is suppressed.
|
||||
#pragma warning disable CS0649
|
||||
/// <summary>Success indicator field read by the status converter.</summary>
|
||||
public int success;
|
||||
|
||||
/// <summary>Status category field read by the status converter.</summary>
|
||||
public int category;
|
||||
|
||||
/// <summary>Status detected-by field read by the status converter.</summary>
|
||||
public int detectedBy;
|
||||
|
||||
/// <summary>Status detail field read by the status converter.</summary>
|
||||
public int detail;
|
||||
#pragma warning restore CS0649
|
||||
}
|
||||
|
||||
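The `FakeMxStatus` stand-in works because the converter looks fields up by name rather than casting to a concrete type. A minimal sketch of that lookup is below. `StatusFieldReader` and `SampleStatus` are hypothetical names; the real `MxStatusProxyConverter` is not shown in this diff and may behave differently (for example, in how it treats missing fields).

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in with the same field shape as FakeMxStatus.
public sealed class SampleStatus
{
    public int success;
    public int category;
    public int detectedBy;
    public int detail;
}

// Hypothetical helper; not the converter used by the worker.
public static class StatusFieldReader
{
    // Reads a public instance field by name and converts it to int.
    // Returns 0 when the field does not exist (an assumption of this sketch).
    public static int ReadInt(object status, string fieldName)
    {
        if (status is null)
        {
            throw new ArgumentNullException(nameof(status));
        }

        var field = status.GetType().GetField(
            fieldName,
            BindingFlags.Public | BindingFlags.Instance);

        if (field is null)
        {
            return 0;
        }

        return Convert.ToInt32(field.GetValue(status));
    }
}
```

Because the lookup is by name, any object exposing `success`, `category`, `detectedBy`, and `detail` can be passed through the same path, which is what lets a plain test class replace the native struct.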
@@ -378,6 +378,22 @@ public sealed class WorkerPipeSession
|
||||
switch (envelope.BodyCase)
|
||||
{
|
||||
case WorkerEnvelope.BodyOneofCase.WorkerCommand:
|
||||
// Worker control/lifecycle commands (Ping, GetSessionState,
|
||||
// GetWorkerInfo, DrainEvents, ShutdownWorker) are answered here
|
||||
// on the message-loop thread instead of being dispatched onto
|
||||
// the STA. Their replies are built from process-level state
|
||||
// (worker process id, assembly version, _state, the runtime
|
||||
// session's event queue) that the STA-bound
|
||||
// MxAccessCommandExecutor cannot see, and ShutdownWorker must
|
||||
// return its OK reply BEFORE the graceful shutdown joins the
|
||||
// STA thread — running it on the STA would deadlock. Returning
|
||||
// false from the ShutdownWorker arm stops the read loop exactly
|
||||
// as a WorkerShutdown envelope would.
|
||||
if (IsControlCommand(envelope.WorkerCommand?.Command?.Kind ?? MxCommandKind.Unspecified))
|
||||
{
|
||||
return await HandleControlCommandAsync(envelope, cancellationToken).ConfigureAwait(false);
|
||||
}
|
||||
|
||||
TryStartCommandTask(envelope, cancellationToken);
|
||||
return true;
|
||||
case WorkerEnvelope.BodyOneofCase.WorkerShutdown:
|
||||
@@ -393,6 +409,175 @@ public sealed class WorkerPipeSession
|
||||
}
|
||||
}
|
||||
|
||||
private static bool IsControlCommand(MxCommandKind kind)
|
||||
{
|
||||
return kind switch
|
||||
{
|
||||
MxCommandKind.Ping => true,
|
||||
MxCommandKind.GetSessionState => true,
|
||||
MxCommandKind.GetWorkerInfo => true,
|
||||
MxCommandKind.DrainEvents => true,
|
||||
MxCommandKind.ShutdownWorker => true,
|
||||
_ => false,
|
||||
};
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Answers a worker control/lifecycle command on the message-loop
|
||||
/// thread (never on the STA). Returns <c>false</c> only for
|
||||
/// <see cref="MxCommandKind.ShutdownWorker"/> — after writing its OK
|
||||
/// reply this drives the same graceful-shutdown path a
|
||||
/// <c>WorkerShutdown</c> envelope would, then signals the read loop to
|
||||
/// stop. All other control commands return <c>true</c> to keep reading.
|
||||
/// </summary>
|
||||
private async Task<bool> HandleControlCommandAsync(
|
||||
WorkerEnvelope envelope,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
WorkerCommand workerCommand = envelope.WorkerCommand;
|
||||
MxCommand command = workerCommand.Command;
|
||||
string correlationId = envelope.CorrelationId;
|
||||
|
||||
if (command.Kind == MxCommandKind.ShutdownWorker)
|
||||
{
|
||||
// Build and emit the OK reply BEFORE triggering shutdown so the
|
||||
// gateway's correlation-id wait is satisfied even though the
|
||||
// graceful shutdown below tears the session (and pipe) down.
|
||||
MxCommandReply shutdownReply = CreateControlOkReply(correlationId, command.Kind);
|
||||
await WriteControlReplyAsync(shutdownReply, cancellationToken).ConfigureAwait(false);
|
||||
|
||||
WorkerShutdown shutdown = new();
|
||||
if (command.ShutdownWorker?.GracePeriod is not null)
|
||||
{
|
||||
shutdown.GracePeriod = command.ShutdownWorker.GracePeriod;
|
||||
}
|
||||
|
||||
shutdown.Reason = "ShutdownWorker command";
|
||||
await ShutdownAsync(shutdown, cancellationToken).ConfigureAwait(false);
|
||||
return false;
|
||||
}
|
||||
|
||||
MxCommandReply reply = command.Kind switch
|
||||
{
|
||||
MxCommandKind.Ping => CreatePingReply(correlationId, command),
|
||||
MxCommandKind.GetSessionState => CreateSessionStateReply(correlationId, command.Kind),
|
||||
MxCommandKind.GetWorkerInfo => CreateWorkerInfoReply(correlationId, command.Kind),
|
||||
MxCommandKind.DrainEvents => CreateDrainEventsReply(correlationId, command),
|
||||
_ => CreateControlOkReply(correlationId, command.Kind),
|
||||
};
|
||||
|
||||
await WriteControlReplyAsync(reply, cancellationToken).ConfigureAwait(false);
|
||||
return true;
|
||||
}
|
||||
|
||||
private Task WriteControlReplyAsync(
|
||||
MxCommandReply reply,
|
||||
CancellationToken cancellationToken)
|
||||
{
|
||||
return _writer.WriteAsync(
|
||||
CreateEnvelope(new WorkerCommandReply
|
||||
{
|
||||
Reply = reply,
|
||||
CompletedTimestamp = Timestamp.FromDateTime(DateTime.UtcNow),
|
||||
}),
|
||||
cancellationToken);
|
||||
}
|
||||
|
||||
private MxCommandReply CreatePingReply(string correlationId, MxCommand command)
|
||||
{
|
||||
MxCommandReply reply = CreateControlOkReply(correlationId, command.Kind);
|
||||
|
||||
// Echo the ping message back through the base reply's diagnostic
|
||||
// message field (there is no dedicated PingReply payload). An empty
|
||||
// message leaves the diagnostic field at its proto3 default.
|
||||
string? message = command.Ping?.Message;
|
||||
if (!string.IsNullOrEmpty(message))
|
||||
{
|
||||
reply.DiagnosticMessage = message;
|
||||
}
|
||||
|
||||
return reply;
|
||||
}
|
||||
|
||||
private MxCommandReply CreateSessionStateReply(string correlationId, MxCommandKind kind)
|
||||
{
|
||||
MxCommandReply reply = CreateControlOkReply(correlationId, kind);
|
||||
reply.SessionState = new SessionStateReply
|
||||
{
|
||||
State = MapWorkerStateToSessionState(_state),
|
||||
};
|
||||
return reply;
|
||||
}
|
||||
|
||||
private MxCommandReply CreateWorkerInfoReply(string correlationId, MxCommandKind kind)
|
||||
{
|
||||
MxCommandReply reply = CreateControlOkReply(correlationId, kind);
|
||||
reply.WorkerInfo = new WorkerInfoReply
|
||||
{
|
||||
WorkerProcessId = _processIdProvider(),
|
||||
WorkerVersion = typeof(WorkerPipeSession).Assembly.GetName().Version?.ToString() ?? string.Empty,
|
||||
MxaccessProgid = MxAccessInteropInfo.ProgId,
|
||||
MxaccessClsid = MxAccessInteropInfo.Clsid,
|
||||
};
|
||||
return reply;
|
||||
}
|
||||
|
||||
private MxCommandReply CreateDrainEventsReply(string correlationId, MxCommand command)
|
||||
{
|
||||
MxCommandReply reply = CreateControlOkReply(correlationId, command.Kind);
|
||||
DrainEventsReply drainReply = new();
|
||||
|
||||
IWorkerRuntimeSession? runtimeSession = _runtimeSession;
|
||||
if (runtimeSession is not null)
|
||||
{
|
||||
uint maxEvents = command.DrainEvents?.MaxEvents ?? 0;
|
||||
foreach (WorkerEvent workerEvent in runtimeSession.DrainEvents(maxEvents))
|
||||
{
|
||||
if (workerEvent.Event is not null)
|
||||
{
|
||||
drainReply.Events.Add(workerEvent.Event);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
reply.DrainEvents = drainReply;
|
||||
return reply;
|
||||
}
|
||||
|
||||
private MxCommandReply CreateControlOkReply(string correlationId, MxCommandKind kind)
|
||||
{
|
||||
return new MxCommandReply
|
||||
{
|
||||
SessionId = _options.SessionId,
|
||||
CorrelationId = correlationId,
|
||||
Kind = kind,
|
||||
Hresult = 0,
|
||||
ProtocolStatus = new ProtocolStatus
|
||||
{
|
||||
Code = ProtocolStatusCode.Ok,
|
||||
Message = "OK",
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
private static SessionState MapWorkerStateToSessionState(WorkerState state)
|
||||
{
|
||||
return state switch
|
||||
{
|
||||
WorkerState.Starting => SessionState.StartingWorker,
|
||||
WorkerState.Handshaking => SessionState.Handshaking,
|
||||
WorkerState.InitializingSta => SessionState.InitializingWorker,
|
||||
WorkerState.Ready => SessionState.Ready,
|
||||
// A control command is being served, so the STA is alive and
|
||||
// ready; the busy state is incidental, not a distinct lifecycle state.
|
||||
WorkerState.ExecutingCommand => SessionState.Ready,
|
||||
WorkerState.ShuttingDown => SessionState.Closing,
|
||||
WorkerState.Stopped => SessionState.Closed,
|
||||
WorkerState.Faulted => SessionState.Faulted,
|
||||
_ => SessionState.Unspecified,
|
||||
};
|
||||
}
|
||||
|
||||
private async Task ProcessCommandAsync(
|
||||
WorkerEnvelope envelope,
|
||||
CancellationToken cancellationToken)
|
||||
|
||||
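The ordering rule in this hunk (write the OK reply first, then shut down, then stop the read loop) can be shown in isolation. The sketch below is an illustration only: `SketchControlHandler`, `SketchCommandKind`, and the in-memory `Log` are hypothetical stand-ins for the pipe writer and session, not the types from `WorkerPipeSession`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public enum SketchCommandKind { Unspecified, Ping, ShutdownWorker, Write }

// Hypothetical read-loop fragment; a list of strings stands in for the pipe writer.
public sealed class SketchControlHandler
{
    public List<string> Log { get; } = new List<string>();

    public static bool IsControlCommand(SketchCommandKind kind)
    {
        return kind switch
        {
            SketchCommandKind.Ping => true,
            SketchCommandKind.ShutdownWorker => true,
            _ => false,
        };
    }

    // Returns false only when the read loop should stop.
    public async Task<bool> HandleAsync(SketchCommandKind kind)
    {
        if (!IsControlCommand(kind))
        {
            // Non-control commands would be queued for the STA instead.
            Log.Add("dispatch-to-sta:" + kind);
            return true;
        }

        // The reply is written first so the caller's wait completes
        // even though shutdown tears the session down afterwards.
        await WriteReplyAsync(kind).ConfigureAwait(false);

        if (kind == SketchCommandKind.ShutdownWorker)
        {
            await ShutdownAsync().ConfigureAwait(false);
            return false;
        }

        return true;
    }

    private Task WriteReplyAsync(SketchCommandKind kind)
    {
        Log.Add("reply:" + kind);
        return Task.CompletedTask;
    }

    private Task ShutdownAsync()
    {
        Log.Add("shutdown");
        return Task.CompletedTask;
    }
}
```

Handling `ShutdownWorker` through this sketch logs `reply:ShutdownWorker` before `shutdown` and returns `false`, which mirrors the reply-then-teardown order described in the comments above.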
@@ -57,6 +57,68 @@ public interface IMxAccessServer
|
||||
int serverHandle,
|
||||
int itemHandle);
|
||||
|
||||
/// <summary>Suspends data acquisition for an advised item (ILMXProxyServer4).</summary>
|
||||
/// <param name="serverHandle">Server handle identifying the registration.</param>
|
||||
/// <param name="itemHandle">Item handle to suspend.</param>
|
||||
/// <returns>
|
||||
/// The native MXAccess <c>MxStatus</c> value (boxed) produced by the call.
|
||||
/// Callers convert it to a protobuf <c>MxStatusProxy</c> via the worker's
|
||||
/// status converter; the underlying type is reflected over, not cast.
|
||||
/// </returns>
|
||||
object Suspend(
|
||||
int serverHandle,
|
||||
int itemHandle);
|
||||
|
||||
/// <summary>Reactivates data acquisition for a suspended item (ILMXProxyServer4).</summary>
|
||||
/// <param name="serverHandle">Server handle identifying the registration.</param>
|
||||
/// <param name="itemHandle">Item handle to activate.</param>
|
||||
/// <returns>
|
||||
/// The native MXAccess <c>MxStatus</c> value (boxed) produced by the call.
|
||||
/// Callers convert it to a protobuf <c>MxStatusProxy</c> via the worker's
|
||||
/// status converter; the underlying type is reflected over, not cast.
|
||||
/// </returns>
|
||||
object Activate(
|
||||
int serverHandle,
|
||||
int itemHandle);
|
||||
|
||||
/// <summary>Authenticates an MXAccess user and returns its user id (base ILMXProxyServer).</summary>
|
||||
/// <param name="serverHandle">Server handle identifying the registration.</param>
|
||||
/// <param name="verifyUser">MXAccess user name to authenticate.</param>
|
||||
/// <param name="verifyUserPassword">
|
||||
/// Raw MXAccess credential. Implementations must keep this value out of
|
||||
/// logs, metrics, command lines, and diagnostics.
|
||||
/// </param>
|
||||
/// <returns>The MXAccess user id for the authenticated user.</returns>
|
||||
int AuthenticateUser(
|
||||
int serverHandle,
|
||||
string verifyUser,
|
||||
string verifyUserPassword);
|
||||
|
||||
/// <summary>Resolves an ArchestrA user GUID to an MXAccess user id (ILMXProxyServer2).</summary>
|
||||
/// <param name="serverHandle">Server handle identifying the registration.</param>
|
||||
/// <param name="userIdGuid">ArchestrA user GUID to resolve.</param>
|
||||
/// <returns>The MXAccess user id for the resolved user.</returns>
|
||||
int ArchestrAUserToId(
|
||||
int serverHandle,
|
||||
string userIdGuid);
|
||||
|
||||
/// <summary>Adds a buffered item to a server and returns an item handle (ILMXProxyServer5).</summary>
|
||||
/// <param name="serverHandle">Server handle identifying the registration.</param>
|
||||
/// <param name="itemDefinition">Item definition string.</param>
|
||||
/// <param name="itemContext">Item context string.</param>
|
||||
/// <returns>Item handle for the added buffered item.</returns>
|
||||
int AddBufferedItem(
|
||||
int serverHandle,
|
||||
string itemDefinition,
|
||||
string itemContext);
|
||||
|
||||
/// <summary>Sets the buffered-update interval for a server (ILMXProxyServer5).</summary>
|
||||
/// <param name="serverHandle">Server handle identifying the registration.</param>
|
||||
/// <param name="updateIntervalMilliseconds">Buffered update interval in milliseconds.</param>
|
||||
void SetBufferedUpdateInterval(
|
||||
int serverHandle,
|
||||
int updateIntervalMilliseconds);
|
||||
|
||||
/// <summary>Writes a value to an item.</summary>
|
||||
/// <param name="serverHandle">Server handle identifying the registration.</param>
|
||||
/// <param name="itemHandle">Item handle to write to.</param>
|
||||
|
||||
@@ -140,6 +140,89 @@ public sealed class MxAccessComServer : IMxAccessServer
|
||||
AsProxyServer4().AdviseSupervisory(serverHandle, itemHandle);
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public object Suspend(
|
||||
int serverHandle,
|
||||
int itemHandle)
|
||||
{
|
||||
if (mxAccessComObject is IMxAccessServer typedFake)
|
||||
{
|
||||
return typedFake.Suspend(serverHandle, itemHandle);
|
||||
}
|
||||
|
||||
AsProxyServer4().Suspend(serverHandle, itemHandle, out MxStatus status);
|
||||
return status;
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public object Activate(
|
||||
int serverHandle,
|
||||
int itemHandle)
|
||||
{
|
||||
if (mxAccessComObject is IMxAccessServer typedFake)
|
||||
{
|
||||
return typedFake.Activate(serverHandle, itemHandle);
|
||||
}
|
||||
|
||||
AsProxyServer4().Activate(serverHandle, itemHandle, out MxStatus status);
|
||||
return status;
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public int AuthenticateUser(
|
||||
int serverHandle,
|
||||
string verifyUser,
|
||||
string verifyUserPassword)
|
||||
{
|
||||
if (mxAccessComObject is IMxAccessServer typedFake)
|
||||
{
|
||||
return typedFake.AuthenticateUser(serverHandle, verifyUser, verifyUserPassword);
|
||||
}
|
||||
|
||||
return AsProxyServer().AuthenticateUser(serverHandle, verifyUser, verifyUserPassword);
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public int ArchestrAUserToId(
|
||||
int serverHandle,
|
||||
string userIdGuid)
|
||||
{
|
||||
if (mxAccessComObject is IMxAccessServer typedFake)
|
||||
{
|
||||
return typedFake.ArchestrAUserToId(serverHandle, userIdGuid);
|
||||
}
|
||||
|
||||
return AsProxyServer2().ArchestrAUserToId(serverHandle, userIdGuid);
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public int AddBufferedItem(
|
||||
int serverHandle,
|
||||
string itemDefinition,
|
||||
string itemContext)
|
||||
{
|
||||
if (mxAccessComObject is IMxAccessServer typedFake)
|
||||
{
|
||||
return typedFake.AddBufferedItem(serverHandle, itemDefinition, itemContext);
|
||||
}
|
||||
|
||||
return AsProxyServer5().AddBufferedItem(serverHandle, itemDefinition, itemContext);
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public void SetBufferedUpdateInterval(
|
||||
int serverHandle,
|
||||
int updateIntervalMilliseconds)
|
||||
{
|
||||
if (mxAccessComObject is IMxAccessServer typedFake)
|
||||
{
|
||||
typedFake.SetBufferedUpdateInterval(serverHandle, updateIntervalMilliseconds);
|
||||
return;
|
||||
}
|
||||
|
||||
AsProxyServer5().SetBufferedUpdateInterval(serverHandle, updateIntervalMilliseconds);
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public void Write(
|
||||
int serverHandle,
|
||||
@@ -216,6 +299,14 @@ public sealed class MxAccessComServer : IMxAccessServer
|
||||
+ $"{nameof(ILMXProxyServer)} or {nameof(IMxAccessServer)}.");
|
||||
}
|
||||
|
||||
private ILMXProxyServer2 AsProxyServer2()
|
||||
{
|
||||
return mxAccessComObject as ILMXProxyServer2
|
||||
?? throw new InvalidOperationException(
|
||||
$"MXAccess COM object of type '{mxAccessComObject.GetType().FullName}' does not implement "
|
||||
+ $"{nameof(ILMXProxyServer2)} or {nameof(IMxAccessServer)}.");
|
||||
}
|
||||
|
||||
private ILMXProxyServer3 AsProxyServer3()
|
||||
{
|
||||
return mxAccessComObject as ILMXProxyServer3
|
||||
@@ -231,4 +322,12 @@ public sealed class MxAccessComServer : IMxAccessServer
|
||||
$"MXAccess COM object of type '{mxAccessComObject.GetType().FullName}' does not implement "
|
||||
+ $"{nameof(ILMXProxyServer4)} or {nameof(IMxAccessServer)}.");
|
||||
}
|
||||
|
||||
private ILMXProxyServer5 AsProxyServer5()
|
||||
{
|
||||
return mxAccessComObject as ILMXProxyServer5
|
||||
?? throw new InvalidOperationException(
|
||||
$"MXAccess COM object of type '{mxAccessComObject.GetType().FullName}' does not implement "
|
||||
+ $"{nameof(ILMXProxyServer5)} or {nameof(IMxAccessServer)}.");
|
||||
}
|
||||
}
|
||||
|
||||
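Each new method in `MxAccessComServer` follows the same shape: if the wrapped object already implements the managed interface (a test double), call it directly; otherwise cast to the COM interface or throw a descriptive error. A reduced sketch of that shape is below. All type names here (`ISketchServer`, `ISketchComApi`, `SketchComServer`, `SketchFake`) are hypothetical and chosen for illustration.

```csharp
using System;

// Plays the role of IMxAccessServer in this sketch.
public interface ISketchServer
{
    int AddItem(string definition);
}

// Plays the role of a COM interface such as ILMXProxyServer5.
public interface ISketchComApi
{
    int AddItem(string definition);
}

public sealed class SketchComServer : ISketchServer
{
    private readonly object inner;

    public SketchComServer(object inner)
    {
        this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    public int AddItem(string definition)
    {
        // Test doubles implement the managed interface and are called as-is.
        if (inner is ISketchServer typedFake)
        {
            return typedFake.AddItem(definition);
        }

        return AsComApi().AddItem(definition);
    }

    // Casts to the COM-style interface or fails with a descriptive message.
    private ISketchComApi AsComApi()
    {
        return inner as ISketchComApi
            ?? throw new InvalidOperationException(
                $"Object of type '{inner.GetType().FullName}' does not implement "
                + $"{nameof(ISketchComApi)} or {nameof(ISketchServer)}.");
    }
}

// A test double that satisfies the managed interface only.
public sealed class SketchFake : ISketchServer
{
    public int AddItem(string definition) => 42;
}
```

The design choice this illustrates is that the wrapper never needs a real COM object in unit tests, while an object that implements neither interface fails fast with its type name in the message.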
@@ -16,6 +16,7 @@ public sealed class MxAccessCommandExecutor : IStaCommandExecutor
|
||||
|
||||
private readonly MxAccessSession session;
|
||||
private readonly VariantConverter variantConverter;
|
||||
private readonly MxStatusProxyConverter statusProxyConverter;
|
||||
private readonly IAlarmCommandHandler? alarmCommandHandler;
|
||||
private readonly Action pumpStep;
|
||||
|
||||
@@ -78,6 +79,7 @@ public sealed class MxAccessCommandExecutor : IStaCommandExecutor
|
||||
{
|
||||
this.session = session ?? throw new ArgumentNullException(nameof(session));
|
||||
this.variantConverter = variantConverter ?? throw new ArgumentNullException(nameof(variantConverter));
|
||||
this.statusProxyConverter = new MxStatusProxyConverter();
|
||||
this.alarmCommandHandler = alarmCommandHandler;
|
||||
this.pumpStep = pumpStep ?? (static () => { });
|
||||
}
|
||||
@@ -104,6 +106,12 @@ public sealed class MxAccessCommandExecutor : IStaCommandExecutor
|
||||
MxCommandKind.Advise => ExecuteAdvise(command),
|
||||
MxCommandKind.UnAdvise => ExecuteUnAdvise(command),
|
||||
MxCommandKind.AdviseSupervisory => ExecuteAdviseSupervisory(command),
|
||||
MxCommandKind.Suspend => ExecuteSuspend(command),
|
||||
MxCommandKind.Activate => ExecuteActivate(command),
|
||||
MxCommandKind.AuthenticateUser => ExecuteAuthenticateUser(command),
|
||||
MxCommandKind.ArchestraUserToId => ExecuteArchestrAUserToId(command),
|
||||
MxCommandKind.AddBufferedItem => ExecuteAddBufferedItem(command),
|
||||
MxCommandKind.SetBufferedUpdateInterval => ExecuteSetBufferedUpdateInterval(command),
|
||||
MxCommandKind.Write => ExecuteWrite(command),
|
||||
MxCommandKind.Write2 => ExecuteWrite2(command),
|
||||
MxCommandKind.WriteSecured => ExecuteWriteSecured(command),
|
||||
@@ -262,6 +270,134 @@ public sealed class MxAccessCommandExecutor : IStaCommandExecutor
|
||||
return CreateOkReply(command);
|
||||
}
|
||||
|
||||
private MxCommandReply ExecuteSuspend(StaCommand command)
|
||||
{
|
||||
if (command.Command.PayloadCase != MxCommand.PayloadOneofCase.Suspend)
|
||||
{
|
||||
return CreateInvalidRequestReply(command, "Suspend command payload is required.");
|
||||
}
|
||||
|
||||
SuspendCommand suspendCommand = command.Command.Suspend;
|
||||
object nativeStatus = session.Suspend(
|
||||
suspendCommand.ServerHandle,
|
||||
suspendCommand.ItemHandle);
|
||||
|
||||
MxCommandReply reply = CreateOkReply(command);
|
||||
reply.Suspend = new SuspendReply
|
||||
{
|
||||
Status = statusProxyConverter.Convert(nativeStatus),
|
||||
};
|
||||
|
||||
return reply;
|
||||
}
|
||||
|
||||
private MxCommandReply ExecuteActivate(StaCommand command)
|
||||
{
|
||||
if (command.Command.PayloadCase != MxCommand.PayloadOneofCase.Activate)
|
||||
{
|
||||
return CreateInvalidRequestReply(command, "Activate command payload is required.");
|
||||
}
|
||||
|
||||
ActivateCommand activateCommand = command.Command.Activate;
|
||||
object nativeStatus = session.Activate(
|
||||
activateCommand.ServerHandle,
|
||||
activateCommand.ItemHandle);
|
||||
|
||||
MxCommandReply reply = CreateOkReply(command);
|
||||
reply.Activate = new ActivateReply
|
||||
{
|
||||
Status = statusProxyConverter.Convert(nativeStatus),
|
||||
};
|
||||
|
||||
return reply;
|
||||
}
|
||||
|
||||
private MxCommandReply ExecuteAuthenticateUser(StaCommand command)
|
||||
{
|
||||
if (command.Command.PayloadCase != MxCommand.PayloadOneofCase.AuthenticateUser)
|
||||
{
|
||||
return CreateInvalidRequestReply(command, "AuthenticateUser command payload is required.");
|
||||
}
|
||||
|
||||
AuthenticateUserCommand authenticateUserCommand = command.Command.AuthenticateUser;
|
||||
|
||||
// The credential (verify_user_password) is passed straight to MXAccess
|
||||
// and is never written to logs, diagnostics, or the reply. MXAccess is
|
||||
// allowed to fail authentication; the native HResult is surfaced by the
|
||||
// dispatcher's exception path.
|
||||
int userId = session.AuthenticateUser(
|
||||
authenticateUserCommand.ServerHandle,
|
||||
authenticateUserCommand.VerifyUser,
|
||||
authenticateUserCommand.VerifyUserPassword);
|
||||
|
||||
MxCommandReply reply = CreateOkReply(command);
|
||||
reply.AuthenticateUser = new AuthenticateUserReply
|
||||
{
|
||||
UserId = userId,
|
||||
};
|
||||
|
||||
return reply;
|
||||
}
|
||||
|
||||
private MxCommandReply ExecuteArchestrAUserToId(StaCommand command)
|
||||
{
|
||||
if (command.Command.PayloadCase != MxCommand.PayloadOneofCase.ArchestraUserToId)
|
||||
{
|
||||
return CreateInvalidRequestReply(command, "ArchestrAUserToId command payload is required.");
|
||||
}
|
||||
|
||||
ArchestrAUserToIdCommand archestrAUserToIdCommand = command.Command.ArchestraUserToId;
|
||||
int userId = session.ArchestrAUserToId(
|
||||
archestrAUserToIdCommand.ServerHandle,
|
||||
archestrAUserToIdCommand.UserIdGuid);
|
||||
|
||||
MxCommandReply reply = CreateOkReply(command);
|
||||
reply.ArchestraUserToId = new ArchestrAUserToIdReply
|
||||
{
|
||||
UserId = userId,
|
||||
};
|
||||
|
||||
return reply;
|
||||
}
|
||||
|
||||
private MxCommandReply ExecuteAddBufferedItem(StaCommand command)
|
||||
{
|
||||
if (command.Command.PayloadCase != MxCommand.PayloadOneofCase.AddBufferedItem)
|
||||
{
|
||||
return CreateInvalidRequestReply(command, "AddBufferedItem command payload is required.");
|
||||
}
|
||||
|
||||
AddBufferedItemCommand addBufferedItemCommand = command.Command.AddBufferedItem;
|
||||
int itemHandle = session.AddBufferedItem(
|
||||
addBufferedItemCommand.ServerHandle,
|
||||
addBufferedItemCommand.ItemDefinition,
|
||||
addBufferedItemCommand.ItemContext);
|
||||
|
||||
MxCommandReply reply = CreateOkReply(command);
|
||||
reply.ReturnValue = variantConverter.Convert(itemHandle);
|
||||
reply.AddBufferedItem = new AddBufferedItemReply
|
||||
{
|
||||
ItemHandle = itemHandle,
|
||||
};
|
||||
|
||||
return reply;
|
||||
}
|
||||
|
||||
private MxCommandReply ExecuteSetBufferedUpdateInterval(StaCommand command)
|
||||
{
|
||||
if (command.Command.PayloadCase != MxCommand.PayloadOneofCase.SetBufferedUpdateInterval)
|
||||
{
|
||||
return CreateInvalidRequestReply(command, "SetBufferedUpdateInterval command payload is required.");
|
||||
}
|
||||
|
||||
SetBufferedUpdateIntervalCommand setBufferedUpdateIntervalCommand = command.Command.SetBufferedUpdateInterval;
|
||||
session.SetBufferedUpdateInterval(
|
||||
setBufferedUpdateIntervalCommand.ServerHandle,
|
||||
setBufferedUpdateIntervalCommand.UpdateIntervalMilliseconds);
|
||||
|
||||
return CreateOkReply(command);
|
||||
}
|
||||
|
||||
private MxCommandReply ExecuteWrite(StaCommand command)
|
||||
{
|
||||
if (command.Command.PayloadCase != MxCommand.PayloadOneofCase.Write)
|
||||
|
||||
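Every `Execute*` method added in this hunk starts with the same guard: if the oneof payload case does not match the command kind, return an invalid-request reply without touching the session. A compact sketch of that guard is below, assuming hypothetical `SketchCommand`, `SketchReply`, and `SketchExecutor` types in place of the protobuf-generated ones, and a delegate in place of `session.Suspend`.

```csharp
using System;

public enum SketchPayloadCase { None, Suspend, Activate }

// Hypothetical command shape; the real type is protobuf-generated.
public sealed class SketchCommand
{
    public SketchPayloadCase PayloadCase { get; set; }
    public int ServerHandle { get; set; }
    public int ItemHandle { get; set; }
}

public sealed class SketchReply
{
    public bool Ok { get; set; }
    public string Message { get; set; } = string.Empty;
}

public static class SketchExecutor
{
    // suspend is a hypothetical delegate standing in for session.Suspend.
    public static SketchReply ExecuteSuspend(
        SketchCommand command,
        Func<int, int, object> suspend)
    {
        // Guard: reject a mismatched payload before calling into the session.
        if (command.PayloadCase != SketchPayloadCase.Suspend)
        {
            return new SketchReply
            {
                Ok = false,
                Message = "Suspend command payload is required.",
            };
        }

        object nativeStatus = suspend(command.ServerHandle, command.ItemHandle);
        return new SketchReply { Ok = true, Message = "status:" + nativeStatus };
    }
}
```

Checking the payload case first means a malformed request is answered with a protocol-level error and never reaches the native call.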
@@ -300,6 +300,94 @@ public sealed class MxAccessSession : IDisposable
|
||||
MxAccessAdviceKind.Supervisory);
|
||||
}
|
||||
|
||||
/// <summary>Suspends data acquisition for an advised item and returns the native MXAccess status.</summary>
|
||||
/// <param name="serverHandle">Server handle identifying the registration.</param>
|
||||
/// <param name="itemHandle">Item handle to suspend.</param>
|
||||
/// <returns>The boxed native MXAccess status produced by the call.</returns>
|
||||
public object Suspend(
|
||||
int serverHandle,
|
||||
int itemHandle)
|
||||
{
|
||||
ThrowIfDisposed();
|
||||
|
||||
return mxAccessServer.Suspend(serverHandle, itemHandle);
|
||||
}
|
||||
|
||||
/// <summary>Reactivates data acquisition for a suspended item and returns the native MXAccess status.</summary>
|
||||
/// <param name="serverHandle">Handle returned by the worker.</param>
|
||||
/// <param name="itemHandle">Handle returned by the worker.</param>
|
||||
/// <returns>The boxed native MXAccess status produced by the call.</returns>
|
||||
public object Activate(
|
||||
int serverHandle,
|
||||
int itemHandle)
|
||||
{
|
||||
ThrowIfDisposed();
|
||||
|
||||
return mxAccessServer.Activate(serverHandle, itemHandle);
|
||||
}
|
||||
|
||||
/// <summary>Authenticates an MXAccess user and returns its user id.</summary>
|
||||
/// <param name="serverHandle">Handle returned by the worker.</param>
|
||||
/// <param name="verifyUser">MXAccess user name to authenticate.</param>
|
||||
/// <param name="verifyUserPassword">Raw MXAccess credential; never logged.</param>
|
||||
/// <returns>The MXAccess user id for the authenticated user.</returns>
|
||||
public int AuthenticateUser(
|
||||
int serverHandle,
|
||||
string verifyUser,
|
||||
string verifyUserPassword)
|
||||
{
|
||||
ThrowIfDisposed();
|
||||
|
||||
return mxAccessServer.AuthenticateUser(serverHandle, verifyUser, verifyUserPassword);
|
||||
}
|
||||
|
||||
/// <summary>Resolves an ArchestrA user GUID to an MXAccess user id.</summary>
|
||||
/// <param name="serverHandle">Handle returned by the worker.</param>
|
||||
/// <param name="userIdGuid">ArchestrA user GUID to resolve.</param>
|
||||
/// <returns>The MXAccess user id for the resolved user.</returns>
|
||||
public int ArchestrAUserToId(
|
||||
int serverHandle,
|
||||
string userIdGuid)
|
||||
{
|
||||
ThrowIfDisposed();
|
||||
|
||||
return mxAccessServer.ArchestrAUserToId(serverHandle, userIdGuid);
|
||||
}
|
||||
|
||||
/// <summary>Adds a buffered item to an MXAccess server and returns the item handle.</summary>
|
||||
/// <param name="serverHandle">Handle returned by the worker.</param>
|
||||
/// <param name="itemDefinition">Definition or address of the item to add.</param>
|
||||
/// <param name="itemContext">Context string for the item.</param>
|
||||
public int AddBufferedItem(
|
||||
int serverHandle,
|
||||
string itemDefinition,
|
||||
string itemContext)
|
||||
{
|
||||
ThrowIfDisposed();
|
||||
|
||||
int itemHandle = mxAccessServer.AddBufferedItem(serverHandle, itemDefinition, itemContext);
|
||||
handleRegistry.RegisterItemHandle(
|
||||
serverHandle,
|
||||
itemHandle,
|
||||
itemDefinition,
|
||||
itemContext,
|
||||
hasItemContext: true);
|
||||
|
||||
return itemHandle;
|
||||
}
|
||||
|
||||
/// <summary>Sets the buffered-update interval for an MXAccess server.</summary>
|
||||
/// <param name="serverHandle">Handle returned by the worker.</param>
|
||||
/// <param name="updateIntervalMilliseconds">Buffered update interval in milliseconds.</param>
|
||||
public void SetBufferedUpdateInterval(
|
||||
int serverHandle,
|
||||
int updateIntervalMilliseconds)
|
||||
{
|
||||
ThrowIfDisposed();
|
||||
|
||||
mxAccessServer.SetBufferedUpdateInterval(serverHandle, updateIntervalMilliseconds);
|
||||
}
|
||||
|
||||
/// <summary>Writes a value to an item.</summary>
|
||||
/// <param name="serverHandle">Handle returned by the worker.</param>
|
||||
/// <param name="itemHandle">Handle returned by the worker.</param>
|
||||
|
||||
+162
@@ -0,0 +1,162 @@
# Still Pending — Deferred / Partial / Unfinished / Missing Functionality

**Generated:** 2026-06-15 · **Commit:** `c7f754c` (main) · **Method:** six parallel read-only audits (Server, Worker, Contracts/proto, all five clients, docs/design/plans, tests + review backlog). Every item cites a verified `file:line`.

> **Resolution update (2026-06-15, branch `feat/stillpending-completion`):** The actionable items were implemented and verified per `docs/plans/2026-06-15-stillpending-completion.md`. **§1.1** (all 11 worker command kinds), **§1.2** (audit CorrelationId), and the **§4** client CLI/helper parity gaps are **Resolved** — see per-item annotations below. Worker COM commands are live-verified on the dev rig (`efd9971`, `f7ada90`). Remaining open items are the documented residuals (**§1.3**, **§1.4**, the **§3** vendor/capture-gated questions incl. the new **§3.2** multi-sample buffered residual) and the deliberate v1 scope of **§2**. Zero `.proto` changes were needed (all reply messages already existed).
## How to read this

Items are graded by what they actually are, because most "pending" surface in this codebase is **deliberate v1 scope**, not accidental:

- 🔴 **Genuine gap** — real unfinished/missing functionality with user-visible impact; a candidate to actually build.
- 🟠 **Parity hole** — declared in-scope (proto/design) but not wired through; breaks "MXAccess parity" or cross-client parity.
- 🟡 **Open question / vendor-gated** — intentionally incomplete, awaiting a live MXAccess capture or an AVEVA fix; raw data preserved meanwhile.
- 🔵 **Intentional v1 scope** — deliberately deferred and documented; listed so it's catalogued, not because it's broken.
- ⚪ **Verification gap** — code exists but is unverified by default (opt-in/live tests).
- 📄 **Stale doc / dead code** — prose or code that lags reality.

---
## 1. Genuine gaps (real unfinished functionality)

### ✅ 1.1 — Worker does not implement 11 declared command kinds *(biggest real gap)* — **RESOLVED**
- **Resolved:** all 11 now implemented. The **5 control commands** (`Ping`, `GetSessionState`, `GetWorkerInfo`, `DrainEvents`, `ShutdownWorker`) are handled in `WorkerPipeSession` (off-STA — `ShutdownWorker` on the STA would deadlock, and these read pipe-session state) — `bf72cd8`. The **6 COM commands** (`Suspend`, `Activate`, `AuthenticateUser`, `ArchestrAUserToId`, `AddBufferedItem`, `SetBufferedUpdateInterval`) are implemented in `MxAccessCommandExecutor` (STA-dispatched) via new `IMxAccessServer`/`MxAccessComServer` wrappers selecting `ILMXProxyServer2/4/5` — `2939932`. Live-verified on the dev rig (`efd9971`, `f7ada90`): `ArchestrAUserToId`→Ok(user_id=1), `AddBufferedItem`/`SetBufferedUpdateInterval`→Ok; `AuthenticateUser`/`Suspend`/`Activate`→real `MxaccessFailure`/HResult (parity, not `INVALID_REQUEST`). `FakeWorkerHarness` now answers the control commands so the default gateway suite covers them (`bb5139f`). Note: `DrainEvents` is a *diagnostic snapshot* — it competes with the 25 ms background stream-drain loop, so with an active event stream it returns only events not yet pushed (no loss/double-drain; the queue drain is lock-protected and destructive).
- **Original finding below (for history):**
- **Location:** `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs:97-128` (the `Execute` switch; everything else falls to `_ => CreateInvalidRequestReply(...)`).
- **What's missing:** the proto `MxCommandKind` enum defines these, the gateway *validates* (`Grpc/MxAccessGrpcRequestValidator.cs:86-95`), *scopes* (`Security/Authorization/GatewayGrpcScopeResolver.cs:45-47`), and *routes* them, and reply messages exist — but the worker answers `INVALID_REQUEST`:
  - **MXAccess COM commands (parity-critical):** `AddBufferedItem`, `SetBufferedUpdateInterval`, `Suspend`, `Activate`, `AuthenticateUser`, `ArchestrAUserToId` (`mxaccess_gateway.proto:107-116,151-160`).
  - **Worker control/lifecycle:** `Ping`, `GetSessionState`, `GetWorkerInfo`, `DrainEvents`, `ShutdownWorker` (`mxaccess_gateway.proto:133-137,177-181`).
- **Why it matters:** `gateway.md:890-899`, `docs/MxAccessWorkerInstanceDesign.md:424-433`, and `docs/DesignDecisions.md:169-173` list all six COM commands under **"Phase 4: Full Command Surface"** with the exit criterion *"gRPC command surface covers the installed MXAccess public method set."* So this is a declared-in-scope phase that isn't finished, not a v1 cut.
- **Masked by tests:** the control kinds (`Ping`/`GetWorkerInfo`/`DrainEvents`/`ShutdownWorker`) are exercised only through `FakeWorkerHarness`, so the unit suite is green while the *real* worker can't answer them. The live integration test `WorkerLiveMxAccessSmokeTests.cs:931` even sends `AuthenticateUser`, which would get an invalid-request reply today.
- **Note:** `OnBufferedDataChange` *event* mapping IS fully wired (`Conversion`/`MxAccessEventMapper.cs:231-254`) — but with `AddBufferedItem`/`SetBufferedUpdateInterval` unimplemented there is no way to set buffered eventing up.
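
The "lock-protected and destructive" drain mentioned in the resolution note can be pictured with a minimal Python sketch. `EventQueue`, `push`, and `drain` are hypothetical names chosen for illustration, not the worker's actual types; the sketch only shows why two competing drainers (a background stream loop and a `DrainEvents`-style call) each see a disjoint set of events, assuming both go through the same lock.

```python
import threading
from collections import deque


class EventQueue:
    """Hypothetical stand-in for a pending-event queue (not the worker's real type)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = deque()

    def push(self, event):
        with self._lock:
            self._events.append(event)

    def drain(self):
        """Atomically remove and return everything queued so far."""
        with self._lock:
            drained = list(self._events)
            self._events.clear()
        return drained


queue = EventQueue()
for n in range(3):
    queue.push(f"event-{n}")

stream_batch = queue.drain()  # e.g. taken by the background stream-drain loop
queue.push("event-3")
snapshot = queue.drain()      # e.g. taken by a later diagnostic drain call
```

Because each drain empties the queue under the lock, the later snapshot contains only what the earlier drain had not already taken, which matches the "only events not yet pushed" behavior described above.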

### ✅ 1.2 — Constraint-denial audit events drop CorrelationId — **RESOLVED**
- **Resolved (`8415f35`, `55526d5`):** `request.ClientCorrelationId` is threaded from `Invoke` → `ApplyConstraintsAsync` → the six filter helpers → `IConstraintEnforcer.RecordDenialAsync` (new `string? correlationId` param); the `TODO(Task 2.3)` is gone. The audit `CorrelationId` column is `Guid?`, so a GUID-parseable id is stored typed; **and** the raw string is always preserved in the audit record's `DetailsJson["clientCorrelationId"]` — this matters because the Rust client sends non-GUID ids (`rust-client-<op>-<n>`) on all traffic and Python/Java default to empty, which would otherwise have left the typed column null. An end-to-end test asserts the value propagates through `Invoke`.
- **Original finding:** denied-operation audit records always wrote `CorrelationId = null` (`ConstraintEnforcer.cs:134-136,147`); threading needed an `IConstraintEnforcer` signature change.
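
The typed-column-plus-raw-fallback behavior can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the gateway's C# code: `build_denial_audit_fields` is a hypothetical helper, and Python's `uuid.UUID` parser stands in for the .NET GUID parse (the two accept slightly different input formats).

```python
import uuid


def build_denial_audit_fields(client_correlation_id):
    """Return (typed_correlation_id, details) for a denial audit record.

    Hypothetical helper: the typed value is None unless the id parses as a
    UUID, while the raw string is always kept in the details mapping.
    """
    raw = client_correlation_id or ""
    try:
        typed = uuid.UUID(raw)
    except ValueError:
        typed = None  # non-GUID or empty id: the typed column stays null
    details = {"clientCorrelationId": raw}
    return typed, details


guid_typed, guid_details = build_denial_audit_fields(
    "3f2504e0-4f89-11d3-9a0c-0305e82c3301"
)
rust_typed, rust_details = build_denial_audit_fields("rust-client-read-7")
empty_typed, empty_details = build_denial_audit_fields("")
```

The non-GUID and empty ids leave the typed value null but still carry the raw string, which is the case the resolution calls out for the Rust, Python, and Java clients.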

### 🔴 1.3 — `provider_switches{from,to,reason}` counter never exercised live
- **Location:** metric emitted in `Alarms/GatewayAlarmMonitor.cs` (failover/failback path); residual recorded in `docs/plans/2026-06-14-deferred-followups.md:124-125`.
- **Evidence:** *"that counter's live exercise remains the one gap; record it explicitly rather than claiming coverage."* Unit-tested (Tests-032) but the dev rig can't drive a real alarmmgr→subtag failover (`project_rig_alarms_object_driven`), so the counter's `reason` tagging is unproven in production.

### 🟠 1.4 — Worker 8-arg alarm ack silently discards operator domain / full name
- **Location:** `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs:261-278` (`_ = ackOperatorDomain; _ = ackOperatorFullName;`).
- **Evidence:** *"the IwwAlarmConsumer2 8-arg AlarmAckByName returns -55 on this AVEVA build (looks like a stub) … fields are accepted by the proto for forward-compat but are not propagated to AVEVA today."*
- **Impact:** two contract fields are accepted on the wire and silently not delivered. Root cause is the vendor stub (see 3.5), but the drop is currently invisible to callers.

---
## 2. Intentional v1 scope decisions (deliberately deferred — catalog)

These are documented, deliberate, and mostly enforced. Listed so the deferred surface is in one place — **none are bugs.** Canonical register: `docs/DesignDecisions.md:466-474` ("Later Revisit Items") + `gateway.md` "Post-v1 revisit items".

- 🔵 **Reconnectable sessions** — not in v1. `docs/DesignDecisions.md:63-73`, `gateway.md:1087,1101`.
- 🔵 **Multi-event-subscriber fan-out** — *plumbed but blocked.* The option flows all the way to `Sessions/GatewaySession.cs:387-408 AttachEventSubscriber(allowMultipleSubscribers)`, but `Configuration/GatewayOptionsValidator.cs:181-185` hard-rejects the only enabling value: *"AllowMultipleEventSubscribers is not supported until event fan-out is implemented."* So the fan-out code path never runs. `docs/DesignDecisions.md:75-80`.
- 🔵 **Gateway restart does not reattach orphan workers** — terminates them on startup. `docs/DesignDecisions.md:65-69`, `CLAUDE.md`.
- 🔵 **Workers run as the gateway service identity** — restricted service account is a reserved extension point. `docs/DesignDecisions.md:179-184`.
- 🔵 **Fail-fast event backpressure, no coalescing** — opt-in coalescing is post-v1. `docs/DesignDecisions.md:187-203`.
- 🔵 **No public command batching** — `docs/DesignDecisions.md:206-212`.
- 🔵 **API-key admin is a local CLI only** — no public admin RPC. `docs/DesignDecisions.md:308-323`.
- 🔵 **No Blazor UI component libraries** — hard constraint. `docs/DesignDecisions.md:342-358`.
- 🔵 **Lazy browse is wire-only** — no lazy SQL / cache loading. `docs/DesignDecisions.md:365-376`, `docs/plans/2026-05-28-lazy-browse-design.md:30`.
- 🔵 **No server-side / streaming browse search** — `docs/plans/2026-05-28-lazy-browse-design.md:208`.
- 🔵 **Alarm command surface is ack + query only** — no Clear/Disable/Enable/Silence/Shelve/Inhibit; matches the MXAccess alarm-client set. `Worker/MxAccess/AlarmCommandHandler.cs`, shelve/suppress out of scope per `docs/AlarmClientDiscovery.md:60-66`.
- 🔵 **Dashboard EventsHub has no per-session ACL** — any authenticated dashboard user may subscribe to any session group. `Dashboard/Hubs/EventsHub.cs:36-50` (`TODO(per-session-acl)`); only relevant once a per-session role model exists.
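
The "plumbed but blocked" state of multi-subscriber fan-out is easiest to see as a validator that rejects the one value that would enable the feature. The Python sketch below is only an analogy for that pattern: `GatewayOptions` and `validate_options` are hypothetical names, and only the error text is taken from the entry above. As long as a check like this runs at startup, the downstream fan-out path can exist in code without ever being reachable.

```python
from dataclasses import dataclass


@dataclass
class GatewayOptions:
    """Hypothetical options object; only the one flag from the text is modeled."""

    allow_multiple_event_subscribers: bool = False


def validate_options(options):
    """Return a list of validation errors (empty when the options are usable)."""
    errors = []
    if options.allow_multiple_event_subscribers:
        # Reject the only enabling value until fan-out actually exists.
        errors.append(
            "AllowMultipleEventSubscribers is not supported until "
            "event fan-out is implemented."
        )
    return errors


default_errors = validate_options(GatewayOptions())
enabled_errors = validate_options(
    GatewayOptions(allow_multiple_event_subscribers=True)
)
```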

---

## 3. MXAccess parity — open questions & vendor-gated items

Intentionally incomplete, awaiting a live capture or an AVEVA fix; raw payload/metadata is preserved in the meantime (no synthesis).

- 🟡 **3.1 `OperationComplete` native trigger condition unknown** — modeled and emitted only from the real event (no synthesis), but the runtime condition that fires it isn't captured. `docs/DesignDecisions.md:280-289`, `gateway.md:1094`, `docs/MxAccessWorkerInstanceDesign.md:341,366`.
- 🟡 **3.2 `OnBufferedDataChange` multi-sample conversion unvalidated — STILL OPEN (residual after B8)** — `AddBufferedItem`/`SetBufferedUpdateInterval` are now implemented and live-confirmed (§1.1), and the live B8 test (`f7ada90`) confirms the worker receives and cleanly converts the empty `NoData` bootstrap `OnBufferedDataChange` (no crash, no dropped payload). But the rig's object logic does not drive a multi-sample buffered batch on demand (same limitation as the alarm rig), so a real parallel quality/timestamp sample array (length > 1) has never been observed live — it is exercised only by the B-bundle unit tests against a fake `IMxAccessServer`. Re-run `GatewaySession_WithLiveWorker_BufferedItem_*` against a fast-changing simulated tag to close this. `docs/DesignDecisions.md:291-297`.
- 🟡 **3.3 Completion-only status → `MXSTATUS_PROXY[]` mapping unproven** — completion-only operation-status bytes are kept as raw diagnostic metadata until the analysis proves an exact mapping. `docs/DesignDecisions.md:299-306`.
- 🟡 **3.4 `AlarmAckByGUID` is `E_NOTIMPL` on this AVEVA build** — throws `NotImplementedException`; all acks route through `AlarmAckByName`. Proto/worker keep the path for forward-compat but it is dead today. `docs/AlarmClientDiscovery.md:750-763`.
- 🟡 **3.5 8-arg `AlarmAckByName` v2 is a vendor stub (returns -55)** — worker uses the 6-arg method; the 8-arg `domain`/`full_name` fields are carried for forward-compat only (see 1.4). `docs/AlarmClientDiscovery.md:743-748`.
- 🟡 **3.6 Subtag degraded-mode fidelity limits** — `category`, `description`, `alarm_type_name`, operator fields, and `retrigger` are not populated/synthesized in subtag fallback (no subtag exists for them). Documented, by design. `docs/AlarmClientDiscovery.md:913-931`, `docs/plans/2026-06-13-alarm-subtag-fallback-design.md:292-298`.
- 🟡 **3.7 Subtag `Clear` transition unvalidated live** — Raise/Ack/AckMsg are live-confirmed; Clear is externally undrivable on the rig (object logic owns alarm state). Environmental, not code. (`project_alarm_subtag_fallback`, `project_rig_alarms_object_driven`.)

---
## 4. Clients — gaps & cross-client parity

Library RPC surface is at **full parity**: all gateway + GalaxyRepository RPCs and the `LazyBrowseNode` helper exist in all five clients, with **no** TODO/stub/not-implemented markers in production code. The CLI/helper gaps below are **RESOLVED**.

| Capability | Dotnet | Go | Python | Rust | Java |
|---|---|---|---|---|---|
| `Write2` single session helper | ✅ | ✅ `849f1d2` | ✅ | ✅ | ✅ |
| `ping` CLI subcommand | ✅ | ✅ `90529dc` | ✅ | ✅ | ✅ `0d5b488` |
| `version` CLI subcommand | ✅ *(already worked)* | ✅ | ✅ | ✅ | ✅ |
| `galaxy-*` CLI commands (4) | ✅ | ✅ | ✅ `a211fae` | ✅ | ✅ |
| `galaxy-browse` / BrowseChildren CLI | ✅ | ✅ | ✅ | ✅ | ✅ **(5/5)** |

- ✅ **4.1 Go single `Write2` helper — RESOLVED** (`849f1d2`): added `Write2`/`Write2Raw` to `clients/go/mxgateway/session.go`, matching the other four clients' signature.
- ✅ **4.2 Python `galaxy-*` CLI commands — RESOLVED** (`a211fae`, `a59fc99`): added `galaxy-test-connection`/`galaxy-last-deploy`/`galaxy-discover`/`galaxy-watch` Click commands wrapping `galaxy.py`; README corrected. (Fixed a UTC-offset bug in last-deploy output during review.)
- ✅ **4.3 `ping` CLI added to Go + Java — RESOLVED** (Go `90529dc`/`742ced7`, Java `0d5b488`).
- ✅ **4.4 `version` CLI in Dotnet — NOT MISSING (audit correction)**: the dotnet `version` subcommand already worked (`MxGatewayClientCli.cs:85` → `WriteVersion`, prints gateway/worker protocol versions). The original audit was wrong. Minor: unlike Go, dotnet's `version` omits a client-*package*-version line (`MxGatewayClientContractInfo` exposes only the two protocol versions) — cosmetic, not tracked.
- ✅ **4.5 Galaxy CLI command-name divergence — RESOLVED** (Java `0d5b488`): `galaxy-test-connection`/`galaxy-last-deploy` are now the canonical Java names, with `galaxy-test`/`galaxy-deploy-time` kept as **deprecated picocli aliases** so existing scripts don't break. (Rust keeps its `galaxy <subcommand>` group style — a clap structural choice, not a name divergence.)
- ✅ **4.6 `browse`/`BrowseChildren` CLI — RESOLVED, 0/5 → 5/5** (Rust `639e36b`, Go `8cb416b`, Python `39ec2a3`, dotnet `d7e2a8b`, Java `0d5b488`). All five emit the per-node JSON key `hasChildrenHint` (unified during review). Minor residual divergence: dotnet *nests* the Galaxy object fields under an `object` key while Go/Rust/Python/Java *flatten* them — both carry `hasChildrenHint` + a nested `children` array; harmonizing the object nesting is a cosmetic follow-up, not tracked.
- ⚪ **4.7 No typed wrappers for the rarer commands** — `AuthenticateUser`, `ArchestrAUserToId`, `AddBufferedItem`, `Suspend`/`Activate`, `GetSessionState`/`GetWorkerInfo`/`DrainEvents`/`ShutdownWorker` remain reachable via the generic `Invoke`/`invoke_raw` escape hatch in all five clients (consistent and deliberate; the worker-side commands are now implemented per §1.1, but no client adds dedicated typed wrappers — out of scope, the CLIs that needed them got `ping`/`browse` subcommands).
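
The alias approach from 4.5 (canonical command names, with the old spellings still accepted) can be illustrated with Python's standard `argparse`, which supports subcommand aliases directly. This is an analogy only: the Java client uses picocli, the program name `mxgateway-cli` and the `canonical` attribute are assumptions made for the sketch, and only the four command names come from the entry above.

```python
import argparse


def build_parser():
    # "mxgateway-cli" is a hypothetical program name used for illustration.
    parser = argparse.ArgumentParser(prog="mxgateway-cli")
    subcommands = parser.add_subparsers(dest="invoked_as", required=True)

    # Canonical name first; the old spelling stays accepted as an alias.
    test_conn = subcommands.add_parser(
        "galaxy-test-connection", aliases=["galaxy-test"]
    )
    test_conn.set_defaults(canonical="galaxy-test-connection")

    last_deploy = subcommands.add_parser(
        "galaxy-last-deploy", aliases=["galaxy-deploy-time"]
    )
    last_deploy.set_defaults(canonical="galaxy-last-deploy")
    return parser


# An existing script that still uses the deprecated name keeps working,
# and the handler can dispatch on the canonical name.
args = build_parser().parse_args(["galaxy-deploy-time"])
```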

---

## 5. Verification gaps (code exists, unverified by default)

All live/integration paths are opt-in; the default unit suites do not exercise them.
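
Opt-in gating of this kind usually comes down to an environment-variable check wrapped around the test. The sketch below shows one way to do it with Python's `unittest`; `live_tests_enabled` and `LoopbackTlsTests` are hypothetical names, the variable name is taken from this section's TLS entry, and treating only the exact value `1` as "enabled" is an assumption rather than the repository's documented rule.

```python
import os
import unittest


def live_tests_enabled(variable, environ=None):
    """True only when the opt-in variable is set to exactly "1" (assumed rule)."""
    environ = os.environ if environ is None else environ
    return environ.get(variable) == "1"


class LoopbackTlsTests(unittest.TestCase):
    """Hypothetical gated suite; skipped unless the opt-in variable is set."""

    @unittest.skipUnless(
        live_tests_enabled("MXGATEWAY_RUN_TLS_TESTS"),
        "set MXGATEWAY_RUN_TLS_TESTS=1 to run",
    )
    def test_gate_is_open_when_running(self):
        # Runs only when the gate is open, so the check must agree.
        self.assertTrue(live_tests_enabled("MXGATEWAY_RUN_TLS_TESTS"))
```

With the variable unset, a default run reports the test as skipped rather than passed, which keeps the unverified surface visible in the test output.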

- ⚪ **Live MXAccess COM + STA + message pump** — `Worker.Tests/MxAccess/MxAccessLiveComCreationTests.cs` (5 `[LiveMxAccessFact]`), gated `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1`.
- ⚪ **Live gateway↔worker↔MXAccess round-trip** — `IntegrationTests/WorkerLiveMxAccessSmokeTests.cs` (6 `[LiveMxAccessFact]`).
- ⚪ **Live Galaxy Repository SQL** — `IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs` (4 `[LiveGalaxyRepositoryFact]`), gated `MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1`.
- ⚪ **Live LDAP dashboard auth** — `IntegrationTests/DashboardLdapLiveTests.cs` (5 `[LiveLdapFact]`), gated `MXGATEWAY_RUN_LIVE_LDAP_TESTS=1`.
- ⚪ **Alarm runtime/discovery probes (dev-rig)** — `Worker.Tests/Probes/{WnWrapConsumerProbeTests,AlarmClientWmProbeTests}.cs`, `AlarmClientDiscoveryTests.cs` — hard `[Fact(Skip=...)]`.
- ⚪ **Live alarm + subtag-fallback smoke** — `Worker.Tests/Probes/{AlarmSubtagLiveSmokeTests,AlarmsLiveSmokeTests}.cs` — `Skip` + one `[LiveMxAccessFact]`; Clear path remains undrivable even when enabled.
- ⚪ **Python loopback TLS** — `clients/python/tests/test_tls.py:111-112` — gated `MXGATEWAY_RUN_TLS_TESTS=1` + openssl; only cert-config parsing runs by default.
- ⚪ **.NET client live browse smoke** — `clients/dotnet/.../BrowseChildrenSmokeTests.cs:17-18` — hard `[Fact(Skip=...)]`.
- ⚪ **Cross-language client↔gateway wire behavior** — no per-client integration unit tests; only `scripts/run-client-e2e-tests.ps1` against a live gateway (`MXGATEWAY_INTEGRATION=1`). All client wire behavior is unverified in default unit runs.

No placeholder/empty/`Assert.True(true)` tests were found anywhere.

---
## 6. Config-gated functional gaps (work only after configuration)

- 🟠 **6.1 Alarm ack in subtag mode requires `AckComment` subtag configured** — empty by default; ack fails in subtag mode until set. Names must be validated against live MXAccess, not guessed. `docs/DesignDecisions.md:454-458`. (`AckCommentSubtag` is write-only; `Worker/MxAccess/SubtagAlarmStateMachine.cs:21`.)
- 🔵 **6.2 Multi-subscriber** — see 2 (option exists, validator-blocked).

---

## 7. Stale docs, dead code, accepted gaps

- 📄 **7.1 D1 plan header stale** — `docs/plans/2026-06-14-deferred-followups.md:4` still says *"Plan only — NOT yet executed,"* but D1 is **done** (`Dashboard/DashboardSnapshotService.cs:198`, commit `4af24b9`). Update the plan status.
- 📄 **7.2 `AlarmClientDiscovery.md` STA "production fix needed" prose is stale** — `docs/AlarmClientDiscovery.md:765-774` reads as a pending follow-up, but alarms now run through the worker STA / `GatewayAlarmMonitor` (merged). Re-check against current code.
- 📄 **7.3 EventsHub "publisher side is a follow-up" comment is stale** — `Dashboard/Hubs/EventsHub.cs:9-17`; the `DashboardEventBroadcaster` exists, is DI-registered (`Dashboard/DashboardServiceCollectionExtensions.cs:47`), runs in the live loop (`Grpc/EventStreamService.cs:133`), and `SessionDetailsPage.razor` renders the feed.
- 📄 **7.4 CLAUDE.md project-name drift** — CLAUDE.md uses `src/MxGateway.Server`/`MxGateway.Tests`; the actual tree is `src/ZB.MOM.WW.MxGateway.*`. Misleads path-based work.
- ⚪ **7.5 Dead `MapSqlException` helper** — `Grpc/GalaxyRepositoryGrpcService.cs:350-360`, IDE0051-suppressed, kept for a hypothetical direct-SQL path that doesn't exist.
- **7.6 Accepted code-review gaps (`Won't Fix`, by design):**
  - `Client.Python-012` — `Session.invoke_raw` deliberately skips `ensure_mxaccess_success`, so an embedded MXAccess HRESULT failure surfaces silently (raw-parity inspection). `code-reviews/Client.Python/findings.md:290`.
  - `Contracts-003` — closed as not-a-defect. `code-reviews/Contracts/findings.md`.
  - *(All 351 review findings are otherwise Resolved; none Open or Deferred.)*

---
## 8. Deferred test-coverage follow-ups (noted in resolutions, never filed as findings)

- **Java CLI bulk-subcommand coverage** — 6 of 13 non-trivial subcommands untested: `read-bulk`, `write-bulk`, `write2-bulk`, `write-secured-bulk`, `write-secured2-bulk`, `bench-read-bulk` (plus `stream-events`, the four `galaxy-*`, `close-session`). `code-reviews/Client.Java/findings.md:495` (Client.Java-026).
- **Per-session-ACL TODO** at `Server/Dashboard/Hubs/EventsHub.cs` (`code-reviews/Server/findings.md:765`).
- **Worker-Ready retry race** noted at `code-reviews/Server/findings.md:611`.
- **Duplicated `FakeWorkerProcess` harness** flagged as a latent regression vector — `code-reviews/Tests/findings.md:463`.

---

## Bottom line

**Status after `feat/stillpending-completion` (2026-06-15):** the net-new functionality is **done** — **§1.1** (all 11 worker command kinds, COM half live-verified on the rig), **§1.2** (audit CorrelationId, with a raw-string fallback for non-GUID clients), and the entire **§4** client CLI/helper parity surface (`Write2`, `ping`, `galaxy-*`, `galaxy-browse` 5/5, name aliases). Doc hygiene **§7** is done (`0032d2d`, `bd46ba1`). Zero `.proto` changes were required.

**Still open (all deliberate or environment/vendor-gated):**
- **§1.3** — `provider_switches` counter still only unit-tested; the dev rig can't drive a real alarmmgr→subtag failover, so live `reason`-tag coverage remains a recorded residual.
- **§1.4 / §3.4 / §3.5** — the AVEVA 8-arg `AlarmAckByName` is a vendor stub (-55) and `AlarmAckByGUID` is `E_NOTIMPL`; the `domain`/`full_name` fields stay forward-compat-only until AVEVA implements them.
- **§3.2** — buffered commands work and the empty bootstrap converts cleanly live, but a multi-sample buffered batch is undrivable on the rig (unit-tested only).
- **§3.1 / §3.3 / §3.6 / §3.7** — await live MXAccess captures.
- **§2** — deliberate v1 scope. **§5** — opt-in verification gates. **§7.6** — accepted `Won't Fix` review findings.

MXAccess **event/data/value/write** mapping, the **Galaxy** RPC surface, and now the **full command surface** are complete; no `NotImplementedException`s, stubbed RPC bodies, or empty tests remain in the production paths.