Compare commits
170 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| c7f754c77b | |||
| 144c293f05 | |||
| 7c957908f8 | |||
| 6659653673 | |||
| 75a39f5a8c | |||
| cebe67e9bd | |||
| ddf2d84fbc | |||
| 56dd56954b | |||
| b57d02cc4d | |||
| 47062c1a6e | |||
| d0d1dcef15 | |||
| fb2b1a4a52 | |||
| d2c776901b | |||
| 258e09e0de | |||
| 410acc92eb | |||
| b40aaeef05 | |||
| 9208225f9c | |||
| c6f17557f6 | |||
| bbbef4d098 | |||
| 4af24b9518 | |||
| 371ce53409 | |||
| 597677025f | |||
| 393e326275 | |||
| 986dcee14a | |||
| a3752799de | |||
| 37aadf72b3 | |||
| 5573f2a229 | |||
| 56abd64c6c | |||
| 5b31e99ab6 | |||
| 64db828d71 | |||
| 1a9367b5de | |||
| 98e997b573 | |||
| 0e8d911fd8 | |||
| e72763d703 | |||
| 3c9becc8d6 | |||
| ec88532fe4 | |||
| 2f30f0c7c0 | |||
| 27f6c9e6b7 | |||
| 29bd504a99 | |||
| e10b252e3a | |||
| bcc54ca56b | |||
| ee459f43e1 | |||
| ebf1d95f72 | |||
| 3ccf0b5f9e | |||
| f7ccfd678e | |||
| 3f5e5fc0b3 | |||
| 7241a4fb9c | |||
| d6c0bb41ca | |||
| 0a54c0bc4b | |||
| fd64b9260c | |||
| 4bd757a136 | |||
| 1e2ed6d1ea | |||
| 5f6655de27 | |||
| fbc9cf56df | |||
| 4c0e14fc5d | |||
| c75920c620 | |||
| a46ce90e6f | |||
| f113ca53a1 | |||
| f3616cc7fa | |||
| 57d5a8725f | |||
| 60d35a914f | |||
| b10e103bcf | |||
| 348ab16456 | |||
| c16f016f0a | |||
| 1d85db7b4e | |||
| 5ea5618315 | |||
| 38a0ad8ab4 | |||
| 5df2ef0d1e | |||
| e5785fd769 | |||
| 22370ca4da | |||
| e0a3fbf35b | |||
| 161ed6f80d | |||
| e57d864ab2 | |||
| 5539ec8542 | |||
| 73e54e252d | |||
| 70d959bd9b | |||
| 0c5b796e2e | |||
| 47dc9d865f | |||
| 4f757e3c0c | |||
| 2f0ee4c961 | |||
| 0859d47f75 | |||
| 7ea8358c06 | |||
| a5944bbe5d | |||
| 04bce3ff9f | |||
| 9572045787 | |||
| 7e1af37eb1 | |||
| 05009d7370 | |||
| f4dc11bae4 | |||
| c3b466e13d | |||
| 792e3f9445 | |||
| ae281d06bb | |||
| 3ca2799c90 | |||
| 459a88b3e7 | |||
| 437ab65fc1 | |||
| 679562e5ed | |||
| dbf550da8b | |||
| 3965a7741e | |||
| abb2cfb84b | |||
| 4e0d8ccfed | |||
| a935aa8b7c | |||
| 9912389fa1 | |||
| f1129b969d | |||
| c51b6f9ce4 | |||
| e39972357b | |||
| 9ad17e2964 | |||
| ef0a883a81 | |||
| 62ba5e9487 | |||
| 136614be94 | |||
| a912bffad5 | |||
| 9bdb899774 | |||
| e5c704de69 | |||
| 4e520f9c0c | |||
| 2eb81379e4 | |||
| ddd5721082 | |||
| 3775f6bf3b | |||
| cdfad420bb | |||
| 330e665f6b | |||
| 5e01ad9c22 | |||
| 77a9108673 | |||
| 192607ab8c | |||
| ba82afe669 | |||
| fe7d1ce1ec | |||
| b8a6695612 | |||
| 6f9188bc8d | |||
| a276f46f81 | |||
| 572b268d81 | |||
| 4c093a64fa | |||
| f47bbaea95 | |||
| c463b49f46 | |||
| 87f86503ef | |||
| e912ef960c | |||
| c4e7ddea70 | |||
| 6bfa4fe884 | |||
| b4a7bac4c0 | |||
| 6df373ae4c | |||
| fe44e3c18a | |||
| 523f944f3e | |||
| c33f1e6047 | |||
| 92cc4688e6 | |||
| a155554038 | |||
| 68f905a344 | |||
| 5abc222c72 | |||
| da3aa7b0b2 | |||
| f0ec068430 | |||
| 1a1d14a9fd | |||
| b2448510ac | |||
| 75610e3f55 | |||
| 5032166106 | |||
| 76a042d663 | |||
| 4a19854eb9 | |||
| a4467e23ef | |||
| eacfeff9fb | |||
| b4bc2df015 | |||
| fd2a0ac4c7 | |||
| 555e4be51f | |||
| 1d8c0d83c4 | |||
| 6600f2a7bd | |||
| 803a207ad2 | |||
| 97e583e96b | |||
| eaf479349d | |||
| 83a4d41fce | |||
| 0d6193cdc4 | |||
| 8cd3e1c20e | |||
| 5c28458624 | |||
| 0b389f5a97 | |||
| 108c4bb118 | |||
| cf54a278e1 | |||
| 81b2aacfe2 | |||
| 5932fe2fd3 | |||
| 310dfab8b4 |
@@ -147,3 +147,8 @@ generated-scratch/

# Keep empty directories with .gitkeep files when needed
!.gitkeep

# Documentation review artifacts (CommentChecker output)
*-docs-issues.md
*-docs-fixed.md
*-docs-final.md
@@ -100,7 +100,7 @@ When source code changes, build and test the affected component before reporting
## Design Sources To Consult Before Non-Trivial Changes

- `gateway.md` — top-level architecture, command/event surface, IPC envelope, STA thread model, fault handling.
-- `glauth.md` — local LDAP server (GLAuth on `localhost:3893`, base DN `dc=lmxopcua,dc=local`) used for dev authn. Pre-provisioned users (`admin/admin123`, `readonly/readonly123`, etc.) and the role→capability mapping live there.
+- `glauth.md` — shared GLAuth LDAP server (`10.100.0.35:3893`, base DN `dc=zb,dc=local`, source of truth `scadaproj/infra/glauth/`) used for dev authn. Dashboard test users (`multi-role`/`password` = Administrator, `gw-viewer`/`password` = Viewer) and the role→capability mapping live there.
- `docs/DesignDecisions.md` — v1 choices (MXAccess COM target `LMXProxyServerClass` from `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll`, API-key-in-SQLite auth, fail-fast event backpressure, etc.).
- `docs/GatewayProcessDesign.md`, `docs/MxAccessWorkerInstanceDesign.md`, `docs/WorkerFrameProtocol.md`, `docs/WorkerProcessLauncher.md` — detailed component designs.
- `docs/GatewayConfiguration.md` — full `MxGateway:*` options bound by `GatewayOptions` and validated at startup by `GatewayOptionsValidator`.
@@ -0,0 +1,38 @@
<Project>
  <PropertyGroup>
    <!-- Build-quality enforcement floor, mirroring src/Directory.Build.props so the
         .NET client tree is held to the same baseline CLAUDE.md mandates (warnings as
         errors, code-style enforced at build, latest analyzers, deterministic builds). -->
    <LangVersion>latest</LangVersion>
    <Nullable>enable</Nullable>
    <ImplicitUsings>enable</ImplicitUsings>
    <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
    <AnalysisLevel>latest</AnalysisLevel>
    <EnforceCodeStyleInBuild>true</EnforceCodeStyleInBuild>
    <Deterministic>true</Deterministic>
  </PropertyGroup>

  <PropertyGroup>
    <!-- Shared package metadata for clients/dotnet/. Individual projects opt in via <IsPackable>true</IsPackable>. -->
    <Authors>Joseph Doherty</Authors>
    <Company>ZB MOM WW</Company>
    <Copyright>Copyright (c) ZB MOM WW. All rights reserved.</Copyright>
    <Product>MxAccessGateway Client</Product>
    <RepositoryUrl>https://gitea.dohertylan.com/dohertj2/mxaccessgw</RepositoryUrl>
    <RepositoryType>git</RepositoryType>
    <PackageProjectUrl>https://gitea.dohertylan.com/dohertj2/mxaccessgw</PackageProjectUrl>
    <PackageTags>mxaccess;mxgateway;grpc;client;archestra</PackageTags>
    <PackageRequireLicenseAcceptance>false</PackageRequireLicenseAcceptance>
    <!-- Proprietary/internal package, consistent with the Rust ("Proprietary") and
         Python ("Proprietary") client license declarations. A LicenseRef SPDX expression
         is rejected by the current NuGet toolset (NU5124), so the proprietary terms ship
         as a packaged license file instead. -->
    <PackageLicenseFile>LICENSE.txt</PackageLicenseFile>
    <!-- Versioning: bump per release. Symbols ship as snupkg. -->
    <Version>0.1.1</Version>
    <IncludeSymbols>true</IncludeSymbols>
    <SymbolPackageFormat>snupkg</SymbolPackageFormat>
    <!-- Default: do NOT pack. Each project opts in. -->
    <IsPackable>false</IsPackable>
  </PropertyGroup>
</Project>
@@ -107,6 +107,7 @@ public sealed class MxGatewayClientOptions
    public required string ApiKey { get; init; }
    public bool UseTls { get; init; }
    public string? CaCertificatePath { get; init; }
    public bool RequireCertificateValidation { get; init; }
    public string? ServerNameOverride { get; init; }
    public TimeSpan ConnectTimeout { get; init; } = TimeSpan.FromSeconds(10);
    public TimeSpan DefaultCallTimeout { get; init; } = TimeSpan.FromSeconds(30);
@@ -124,6 +125,24 @@ or subscription changes because those calls can partially succeed in MXAccess.
API key may be loaded from `MXGATEWAY_API_KEY` by the CLI, not implicitly by the
library constructor unless a helper explicitly says it does that.
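
The split described above (the CLI reads the environment, the library takes the key explicitly) could be sketched as follows. `ApiKeyResolver` and `ResolveFromEnvironment` are hypothetical names chosen for illustration and are not part of the client library; only the `MXGATEWAY_API_KEY` variable name comes from this document.

```csharp
#nullable enable
using System;

// Hypothetical CLI-side helper (assumption): not part of the client library.
public static class ApiKeyResolver
{
    // Reads the API key from the named environment variable and fails loudly
    // when it is missing, so nothing falls back to an implicit source.
    public static string ResolveFromEnvironment(string variableName = "MXGATEWAY_API_KEY")
    {
        string? value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set or is empty.");
        }

        // Trim stray whitespace that shells sometimes leave around secrets.
        return value.Trim();
    }
}
```

A CLI would call the helper once and pass the result into `MxGatewayClientOptions.ApiKey`, keeping the library constructor free of hidden environment lookups.
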

### TLS trust posture

The gateway can serve a self-signed certificate it generates itself (it has no
PKI). To make that usable, TLS is **lenient by default**: when `UseTls` is set
and `CaCertificatePath` is empty, `CreateHttpHandler` installs a
`RemoteCertificateValidationCallback` that returns `true`, so the gateway's
self-signed certificate is accepted without verification.

To verify the gateway instead:

- set `CaCertificatePath` to pin a CA — validated via a `CustomRootTrust`
  `X509Chain` against that root, and the callback additionally rejects a
  hostname/SAN mismatch (`RemoteCertificateNameMismatch`); or
- set `RequireCertificateValidation` to `true` to keep the default OS/system-trust
  verification on a connection with no pinned CA.

Pinning a CA always wins over the lenient default.
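
The three trust modes above can be sketched as one handler factory. This is a minimal illustration, not the client's actual `CreateHttpHandler`: the `TlsHandlerSketch` type, its parameter list, and the choice to skip revocation checks are all assumptions made for the example.

```csharp
#nullable enable
using System.Net.Http;
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

// Hypothetical stand-in for the client's handler factory (names are illustrative).
public static class TlsHandlerSketch
{
    public static SocketsHttpHandler Create(
        bool useTls, string? caCertificatePath, bool requireCertificateValidation)
    {
        SocketsHttpHandler handler = new();
        if (!useTls)
        {
            return handler; // plaintext: nothing to configure
        }

        if (!string.IsNullOrEmpty(caCertificatePath))
        {
            // Pinned CA wins over everything else: trust only the supplied root(s).
            X509Certificate2Collection roots = new();
            roots.ImportFromPemFile(caCertificatePath);
            handler.SslOptions.RemoteCertificateValidationCallback = (_, certificate, _, errors) =>
            {
                if (certificate is null || errors.HasFlag(SslPolicyErrors.RemoteCertificateNameMismatch))
                {
                    return false; // missing certificate or hostname/SAN mismatch
                }

                using X509Chain chain = new();
                chain.ChainPolicy.TrustMode = X509ChainTrustMode.CustomRootTrust;
                chain.ChainPolicy.CustomTrustStore.AddRange(roots);
                // Assumption: a private CA publishes no revocation endpoints.
                chain.ChainPolicy.RevocationMode = X509RevocationMode.NoCheck;
                using X509Certificate2 leaf = new(certificate);
                return chain.Build(leaf);
            };
        }
        else if (!requireCertificateValidation)
        {
            // Lenient default: accept whatever certificate the gateway presents.
            handler.SslOptions.RemoteCertificateValidationCallback = (_, _, _, _) => true;
        }

        // Otherwise the callback stays null and the OS/system trust store decides.
        return handler;
    }
}
```

Leaving the callback `null` for the strict case, rather than installing a callback that re-implements system validation, keeps the platform's default behaviour intact.
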

## Auth Interceptor

Use a gRPC call credentials/interceptor layer to attach:
@@ -0,0 +1,12 @@
Proprietary License

Copyright (c) ZB MOM WW. All rights reserved.

This software and its source code are proprietary and confidential. They are
licensed, not sold, for internal use within ZB MOM WW and its authorized
partners only. No part of this package may be reproduced, distributed, or
transmitted to third parties without the prior written permission of ZB MOM WW.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT.
@@ -196,6 +196,54 @@ dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- galaxy-las
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- galaxy-discover --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY
```

### Browsing lazily

For UI trees or OPC UA bridges, use `BrowseChildrenAsync` to walk one level at a
time instead of paging the full hierarchy. Pass an empty request for root objects;
subsequent calls supply `ParentGobjectId`, `ParentTagName`, or
`ParentContainedPath`. Each child's `ChildHasChildren[i]` tells you whether to
draw an expand triangle. Filter fields match `DiscoverHierarchy`. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics.

```csharp
BrowseChildrenReply roots = await repository.BrowseChildrenAsync(
    new BrowseChildrenRequest());

for (int i = 0; i < roots.Children.Count; i++)
{
    GalaxyObject child = roots.Children[i];
    bool hasChildren = roots.ChildHasChildren[i];
    Console.WriteLine($"{child.TagName} expand={hasChildren}");
}
```

#### High-level walker

For UI trees, the client provides a `LazyBrowseNode` walker that handles
sibling pagination and the `child_has_children` hint for you:

```csharp
await using GalaxyRepositoryClient repository = GalaxyRepositoryClient.Create(
    new MxGatewayClientOptions { Endpoint = new Uri("http://localhost:5000"), ApiKey = apiKey });
IReadOnlyList<LazyBrowseNode> roots = await repository.BrowseAsync();
foreach (LazyBrowseNode root in roots)
{
    if (root.HasChildrenHint)
    {
        await root.ExpandAsync();
    }
    foreach (LazyBrowseNode child in root.Children)
    {
        Console.WriteLine($"{child.Object.TagName} ({(child.HasChildrenHint ? "has children" : "leaf")})");
    }
}
```

`ExpandAsync` is idempotent — calling it twice fires only one RPC,
and is safe under concurrent callers. To refresh after a Galaxy redeploy, call
`BrowseAsync` again from the root.
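
One way to get the "one RPC no matter how many callers" behaviour is to cache the in-flight task and publish the finished child list before flipping the expanded flag. The sketch below is an illustration under assumptions, not the library's `LazyBrowseNode`: `SingleFlightNode`, the string children, and the retry-after-failure policy are hypothetical.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical node type illustrating single-flight expansion (not LazyBrowseNode).
public sealed class SingleFlightNode
{
    private readonly Func<Task<IReadOnlyList<string>>> _fetchChildren;
    private readonly object _gate = new();
    private Task? _expandTask;
    private volatile IReadOnlyList<string> _children = Array.Empty<string>();
    private volatile bool _isExpanded;

    public SingleFlightNode(Func<Task<IReadOnlyList<string>>> fetchChildren)
        => _fetchChildren = fetchChildren;

    // Lock-free readers: both properties return an already-published value.
    public IReadOnlyList<string> Children => _children;
    public bool IsExpanded => _isExpanded;

    public Task ExpandAsync()
    {
        lock (_gate)
        {
            // The first caller starts the fetch; later callers share the same task.
            return _expandTask ??= ExpandCoreAsync();
        }
    }

    private async Task ExpandCoreAsync()
    {
        // Yield so the task is stored in _expandTask before any work can complete.
        await Task.Yield();
        try
        {
            IReadOnlyList<string> fetched = await _fetchChildren().ConfigureAwait(false);
            _children = fetched;  // publish the complete snapshot first
            _isExpanded = true;   // then flip the flag readers check
        }
        catch
        {
            // Assumption: a failed expand may be retried by a later call.
            lock (_gate) { _expandTask = null; }
            throw;
        }
    }
}
```

Because the snapshot is written before the flag, a reader that observes `IsExpanded == true` and then reads `Children` sees the full list rather than a partially filled one.
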

### Watching deploy events

`WatchDeployEventsAsync` opens the `WatchDeployEvents` server-streaming RPC. The
@@ -239,6 +287,17 @@ Use TLS options for a secured gateway:
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- smoke --endpoint https://ZB.MOM.WW.MxGateway.example.local:5001 --tls --ca-file C:\certs\mxgateway-ca.pem --server-name ZB.MOM.WW.MxGateway.example.local --api-key-env MXGATEWAY_API_KEY --item Area001.Pump001.Speed --json
```

### TLS trust

The gateway can auto-generate its own self-signed certificate (it has no PKI), so
the client is **lenient by default**: a TLS connection (`UseTls` / `--tls`) with
no pinned CA accepts whatever certificate the gateway presents. To verify
instead, pin a CA with `CaCertificatePath` / `--ca-file` (this path also enforces
the certificate hostname/SAN match), or set `RequireCertificateValidation` to
force OS/system-trust verification without pinning. Use `ServerNameOverride` /
`--server-name` when the dialed host differs from the certificate SAN. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).

## Integration Checks

Run live checks only when a gateway and MXAccess-backed worker are available:
@@ -251,6 +310,29 @@ $env:MXGATEWAY_TEST_ITEM = 'Area001.Pump001.Speed'
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- smoke --endpoint $env:MXGATEWAY_ENDPOINT --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json
```

## Installing as a NuGet Package

The client publishes to the internal Gitea NuGet feed at
`https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json`.

Add the feed once:

````bash
dotnet nuget add source https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json \
  --name dohertj2-gitea \
  --username <gitea-username> \
  --password <gitea-token-or-password> \
  --store-password-in-clear-text
````

Then add the package to your project:

````bash
dotnet add package ZB.MOM.WW.MxGateway.Client --version 0.1.1
````

The `ZB.MOM.WW.MxGateway.Contracts` package is pulled in transitively.

## Related Documentation

- [Client Packaging](../../docs/ClientPackaging.md)
@@ -0,0 +1,34 @@
using Grpc.Core;
using Grpc.Net.Client;
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;

namespace ZB.MOM.WW.MxGateway.Client.Tests;

/// <summary>
/// Live smoke tests for the BrowseChildren RPC. Skipped by default; set
/// MXGATEWAY_API_KEY and MXGATEWAY_ENDPOINT to run against a real gateway.
/// </summary>
public sealed class BrowseChildrenSmokeTests
{
    /// <summary>
    /// Verifies that BrowseChildren returns a non-zero cache sequence and
    /// a consistent children/child-has-children count from a live gateway.
    /// </summary>
    [Fact(Skip = "Set MXGATEWAY_API_KEY and MXGATEWAY_ENDPOINT to enable.")]
    public async Task BrowseChildren_LiveGateway_ReturnsRootsWithCacheSequence()
    {
        string? apiKey = Environment.GetEnvironmentVariable("MXGATEWAY_API_KEY");
        string endpoint = Environment.GetEnvironmentVariable("MXGATEWAY_ENDPOINT") ?? "http://localhost:5120";

        Assert.False(string.IsNullOrEmpty(apiKey), "MXGATEWAY_API_KEY must be set.");

        using GrpcChannel channel = GrpcChannel.ForAddress(endpoint);
        GalaxyRepository.GalaxyRepositoryClient client = new(channel);

        Metadata headers = new() { { "authorization", $"Bearer {apiKey}" } };
        BrowseChildrenReply reply = await client.BrowseChildrenAsync(new BrowseChildrenRequest(), headers);

        Assert.True(reply.CacheSequence > 0UL);
        Assert.Equal(reply.Children.Count, reply.ChildHasChildren.Count);
    }
}
@@ -123,6 +123,49 @@ internal sealed class FakeGalaxyRepositoryTransport(MxGatewayClientOptions optio
            : DiscoverHierarchyReply);
    }

    /// <summary>Records BrowseChildren RPC calls made by the client.</summary>
    public List<(BrowseChildrenRequest Request, CallOptions CallOptions)> BrowseChildrenCalls { get; } = [];

    /// <summary>Default reply returned from BrowseChildren when the queue is empty.</summary>
    public BrowseChildrenReply BrowseChildrenReply { get; set; } = new();

    /// <summary>Queue of replies returned from BrowseChildren; dequeued in FIFO order.</summary>
    public Queue<BrowseChildrenReply> BrowseChildrenReplies { get; } = new();

    /// <summary>Queue of exceptions to throw from BrowseChildren; dequeued in FIFO order.</summary>
    public Queue<Exception> BrowseChildrenExceptions { get; } = new();

    /// <summary>
    /// Optional hook awaited inside BrowseChildren before the reply is produced. Lets a
    /// test hold an RPC mid-flight to exercise concurrent reads of the in-progress node.
    /// </summary>
    public Func<Task>? BrowseChildrenGate { get; set; }

    /// <summary>
    /// Records the request and either throws a queued exception or returns the configured reply.
    /// </summary>
    /// <param name="request">The BrowseChildrenRequest to process.</param>
    /// <param name="callOptions">Call options specifying RPC behavior.</param>
    public async Task<BrowseChildrenReply> BrowseChildrenAsync(
        BrowseChildrenRequest request,
        CallOptions callOptions)
    {
        BrowseChildrenCalls.Add((request, callOptions));
        if (BrowseChildrenExceptions.TryDequeue(out Exception? exception))
        {
            throw exception;
        }

        if (BrowseChildrenGate is { } gate)
        {
            await gate().ConfigureAwait(false);
        }

        return BrowseChildrenReplies.TryDequeue(out BrowseChildrenReply? reply)
            ? reply
            : BrowseChildrenReply;
    }

    /// <summary>
    /// Gets the list of WatchDeployEvents RPC calls made by the client.
    /// </summary>
@@ -0,0 +1,311 @@
using Grpc.Core;
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;

namespace ZB.MOM.WW.MxGateway.Client.Tests;

/// <summary>
/// Tests for the <see cref="LazyBrowseNode"/> walker over the BrowseChildren RPC.
/// </summary>
public sealed class LazyBrowseNodeTests
{
    /// <summary>
    /// Verifies that calling BrowseAsync with no parent returns the root nodes
    /// from the first BrowseChildren reply and surfaces the per-child has-children hint.
    /// </summary>
    [Fact]
    public async Task Browse_NoParent_ReturnsRoots()
    {
        FakeGalaxyRepositoryTransport transport = CreateTransport();
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(1, "Plant", isArea: true), BuildObject(2, "Other")],
            childHasChildren: [true, false],
            cacheSequence: 1));
        await using GalaxyRepositoryClient client = CreateClient(transport);

        IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();

        Assert.Equal(2, roots.Count);
        Assert.Equal("Plant", roots[0].Object.TagName);
        Assert.True(roots[0].HasChildrenHint);
        Assert.False(roots[0].IsExpanded);
        Assert.Equal("Other", roots[1].Object.TagName);
        Assert.False(roots[1].HasChildrenHint);
        Assert.False(roots[1].IsExpanded);
    }

    /// <summary>
    /// Verifies that ExpandAsync populates Children and marks the node expanded after one RPC.
    /// </summary>
    [Fact]
    public async Task Expand_PopulatesChildrenAndMarksExpanded()
    {
        FakeGalaxyRepositoryTransport transport = CreateTransport();
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(1, "Plant", isArea: true)],
            childHasChildren: [true],
            cacheSequence: 1));
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(10, "Line1")],
            childHasChildren: [false],
            cacheSequence: 1));
        await using GalaxyRepositoryClient client = CreateClient(transport);

        IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();
        await roots[0].ExpandAsync();

        Assert.True(roots[0].IsExpanded);
        Assert.Single(roots[0].Children);
        Assert.Equal("Line1", roots[0].Children[0].Object.TagName);
        Assert.Equal(2, transport.BrowseChildrenCalls.Count);
    }

    /// <summary>
    /// Verifies that a second ExpandAsync call is a no-op and issues no additional RPC.
    /// </summary>
    [Fact]
    public async Task Expand_CalledTwice_NoSecondRpc()
    {
        FakeGalaxyRepositoryTransport transport = CreateTransport();
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(1, "Plant", isArea: true)],
            childHasChildren: [true],
            cacheSequence: 1));
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(10, "Line1")],
            childHasChildren: [false],
            cacheSequence: 1));
        await using GalaxyRepositoryClient client = CreateClient(transport);

        IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();
        await roots[0].ExpandAsync();
        await roots[0].ExpandAsync();

        Assert.Equal(2, transport.BrowseChildrenCalls.Count);
    }

    /// <summary>
    /// Verifies that an RPC failure (NotFound) during expand is wrapped in MxGatewayException.
    /// </summary>
    [Fact]
    public async Task Expand_UnknownParent_ThrowsMxGatewayException()
    {
        FakeGalaxyRepositoryTransport transport = CreateTransport();
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(1, "Plant", isArea: true)],
            childHasChildren: [true],
            cacheSequence: 1));
        await using GalaxyRepositoryClient client = CreateClient(transport);

        IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();

        // Queue the failure for the upcoming ExpandAsync call so it consumes
        // the exception on its first RPC rather than the BrowseAsync above.
        transport.BrowseChildrenExceptions.Enqueue(
            new MxGatewayException(
                "Parent not found",
                new RpcException(new Status(StatusCode.NotFound, "Parent not found"))));

        await Assert.ThrowsAsync<MxGatewayException>(async () => await roots[0].ExpandAsync());
        Assert.False(roots[0].IsExpanded);
        Assert.Empty(roots[0].Children);
    }

    /// <summary>
    /// Verifies that ExpandAsync drains multi-page sibling replies and forwards the page token.
    /// </summary>
    [Fact]
    public async Task Expand_MultiPageSiblings_GathersAllPages()
    {
        FakeGalaxyRepositoryTransport transport = CreateTransport();
        // Roots
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(7, "Plant", isArea: true)],
            childHasChildren: [true],
            cacheSequence: 1));
        // First child page (2 children) with a next token
        BrowseChildrenReply childPage1 = BuildReply(
            children: [BuildObject(70, "ChildA"), BuildObject(71, "ChildB")],
            childHasChildren: [false, false],
            cacheSequence: 1);
        childPage1.NextPageToken = "7:abc:2";
        transport.BrowseChildrenReplies.Enqueue(childPage1);
        // Second child page (1 child) with no next token
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(72, "ChildC")],
            childHasChildren: [false],
            cacheSequence: 1));
        await using GalaxyRepositoryClient client = CreateClient(transport);

        IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();
        await roots[0].ExpandAsync();

        Assert.Equal(3, roots[0].Children.Count);
        Assert.Equal(3, transport.BrowseChildrenCalls.Count);
        Assert.Equal("7:abc:2", transport.BrowseChildrenCalls[2].Request.PageToken);
    }

    /// <summary>
    /// Verifies that ten concurrent ExpandAsync calls issue exactly one RPC, not ten.
    /// </summary>
    [Fact]
    public async Task Expand_CalledConcurrently_OnlyFiresOneRpc()
    {
        FakeGalaxyRepositoryTransport transport = CreateTransport();
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(1, "Plant", isArea: true)],
            childHasChildren: [true],
            cacheSequence: 7));
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(2, "Mixer_001")],
            childHasChildren: [false],
            cacheSequence: 7));

        await using GalaxyRepositoryClient client = CreateClient(transport);
        IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();

        // Fire ten concurrent expands of the same node.
        Task[] tasks = Enumerable.Range(0, 10)
            .Select(_ => roots[0].ExpandAsync())
            .ToArray();
        await Task.WhenAll(tasks);

        Assert.True(roots[0].IsExpanded);
        Assert.Single(roots[0].Children);
        // 1 roots fetch + exactly 1 expand fetch = 2 total
        Assert.Equal(2, transport.BrowseChildrenCalls.Count);
    }

    /// <summary>
    /// Verifies that reading Children/IsExpanded concurrently with an in-flight ExpandAsync
    /// never throws (no torn enumeration of a mid-append list) and, once IsExpanded flips to
    /// true, the published Children snapshot is fully populated. Pins the safe-publication
    /// contract on the lock-free readers (Client.Dotnet-025).
    /// </summary>
    [Fact]
    public async Task Expand_ConcurrentReadOfChildren_NeverTearsAndPublishesAtomically()
    {
        FakeGalaxyRepositoryTransport transport = CreateTransport();
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(1, "Plant", isArea: true)],
            childHasChildren: [true],
            cacheSequence: 1));

        // Multi-page child set so the expand loop spends meaningful time appending,
        // widening the window for a concurrent reader to observe a torn list.
        BrowseChildrenReply childPage1 = BuildReply(
            children: [BuildObject(10, "A"), BuildObject(11, "B"), BuildObject(12, "C")],
            childHasChildren: [false, false, false],
            cacheSequence: 1);
        childPage1.NextPageToken = "1:p:3";
        transport.BrowseChildrenReplies.Enqueue(childPage1);
        transport.BrowseChildrenReplies.Enqueue(BuildReply(
            children: [BuildObject(13, "D"), BuildObject(14, "E")],
            childHasChildren: [false, false],
            cacheSequence: 1));

        await using GalaxyRepositoryClient client = CreateClient(transport);
        IReadOnlyList<LazyBrowseNode> roots = await client.BrowseAsync();
        LazyBrowseNode node = roots[0];

        // Gate the child-page RPCs so the expand stays mid-flight while the reader spins.
        using SemaphoreSlim release = new(0, 1);
        bool firstChildCall = true;
        transport.BrowseChildrenGate = async () =>
        {
            if (firstChildCall)
            {
                firstChildCall = false;
                await release.WaitAsync().ConfigureAwait(false);
            }
        };

        using CancellationTokenSource readerStop = new();
        Exception? readerFailure = null;
        Task reader = Task.Run(() =>
        {
            try
            {
                while (!readerStop.IsCancellationRequested)
                {
                    bool expanded = node.IsExpanded;

                    // Enumerate the snapshot; a torn/mid-append list would throw here.
                    int count = 0;
                    foreach (LazyBrowseNode _ in node.Children)
                    {
                        count++;
                    }

                    // If the node reports expanded, the published snapshot must be complete.
                    if (expanded)
                    {
                        Assert.Equal(5, count);
                    }
                }
            }
            catch (Exception ex)
            {
                readerFailure = ex;
            }
        });

        Task expand = node.ExpandAsync();
        // Let the reader spin against the empty pre-publication snapshot for a moment.
        await Task.Delay(50);
        release.Release();
        await expand;

        // Let the reader observe the post-publication state, then stop it.
        await Task.Delay(50);
        readerStop.Cancel();
        await reader;

        Assert.Null(readerFailure);
        Assert.True(node.IsExpanded);
        Assert.Equal(5, node.Children.Count);
    }

    /// <summary>
    /// Verifies that BrowseChildrenOptions filter fields are forwarded to the BrowseChildren request.
    /// </summary>
    [Fact]
    public async Task Browse_WithFilter_ForwardsToRequest()
    {
        FakeGalaxyRepositoryTransport transport = CreateTransport();
        await using GalaxyRepositoryClient client = CreateClient(transport);

        await client.BrowseAsync(new BrowseChildrenOptions
        {
            TagNameGlob = "Mixer*",
            AlarmBearingOnly = true,
        });

        BrowseChildrenRequest request = Assert.Single(transport.BrowseChildrenCalls).Request;
        Assert.Equal("Mixer*", request.TagNameGlob);
        Assert.True(request.AlarmBearingOnly);
    }

    private static GalaxyObject BuildObject(int id, string tag, bool isArea = false)
        => new() { GobjectId = id, TagName = tag, BrowseName = tag, IsArea = isArea };

    private static BrowseChildrenReply BuildReply(
        IReadOnlyList<GalaxyObject> children,
        IReadOnlyList<bool> childHasChildren,
        ulong cacheSequence)
    {
        BrowseChildrenReply reply = new() { TotalChildCount = children.Count, CacheSequence = cacheSequence };
        reply.Children.AddRange(children);
        reply.ChildHasChildren.AddRange(childHasChildren);
        return reply;
    }

    private static GalaxyRepositoryClient CreateClient(FakeGalaxyRepositoryTransport transport)
        => new(transport.Options, transport);

    private static FakeGalaxyRepositoryTransport CreateTransport()
        => new(new MxGatewayClientOptions
        {
            Endpoint = new Uri("http://localhost:5000"),
            ApiKey = "test-api-key",
        });
}
@@ -0,0 +1,85 @@
using System.Net.Http;
using System.Net.Security;
using ZB.MOM.WW.MxGateway.Client;

namespace ZB.MOM.WW.MxGateway.Client.Tests;

public sealed class MxGatewayClientTlsHandlerTests
{
    /// <summary>
    /// Verifies that when TLS is used with no pinned CA and RequireCertificateValidation is false (default),
    /// the handler installs an accept-all callback so the gateway's self-signed cert is trusted.
    /// The callback must return true regardless of chain errors.
    /// </summary>
    [Fact]
    public void Handler_SkipsVerification_WhenTlsAndNoCaPinned()
    {
        MxGatewayClientOptions options = new()
        {
            Endpoint = new Uri("https://localhost:5120"),
            ApiKey = "k",
            UseTls = true,
        };
        using SocketsHttpHandler handler = MxGatewayClient.CreateHttpHandlerForTests(options);
        Assert.NotNull(handler.SslOptions.RemoteCertificateValidationCallback);
        Assert.True(handler.SslOptions.RemoteCertificateValidationCallback!(null!, null!, null, SslPolicyErrors.RemoteCertificateChainErrors));
    }

    /// <summary>
    /// Verifies that when RequireCertificateValidation is true, the callback is left null
    /// so the OS trust store performs validation.
    /// </summary>
    [Fact]
    public void Handler_KeepsDefaultVerification_WhenRequireCertificateValidation()
    {
        MxGatewayClientOptions options = new()
        {
            Endpoint = new Uri("https://localhost:5120"),
            ApiKey = "k",
            UseTls = true,
            RequireCertificateValidation = true,
        };
        using SocketsHttpHandler handler = MxGatewayClient.CreateHttpHandlerForTests(options);
        Assert.Null(handler.SslOptions.RemoteCertificateValidationCallback);
    }
}

public sealed class GalaxyRepositoryClientTlsHandlerTests
{
    /// <summary>
    /// Verifies that when TLS is used with no pinned CA and RequireCertificateValidation is false (default),
    /// the Galaxy client handler installs an accept-all callback so the gateway's self-signed cert is trusted.
    /// The callback must return true regardless of chain errors.
    /// </summary>
    [Fact]
    public void Handler_SkipsVerification_WhenTlsAndNoCaPinned()
    {
        MxGatewayClientOptions options = new()
        {
            Endpoint = new Uri("https://localhost:5120"),
            ApiKey = "k",
            UseTls = true,
        };
        using SocketsHttpHandler handler = GalaxyRepositoryClient.CreateHttpHandlerForTests(options);
        Assert.NotNull(handler.SslOptions.RemoteCertificateValidationCallback);
        Assert.True(handler.SslOptions.RemoteCertificateValidationCallback!(null!, null!, null, SslPolicyErrors.RemoteCertificateChainErrors));
    }

    /// <summary>
    /// Verifies that when RequireCertificateValidation is true, the Galaxy client callback is left null
    /// so the OS trust store performs validation.
    /// </summary>
    [Fact]
    public void Handler_KeepsDefaultVerification_WhenRequireCertificateValidation()
    {
        MxGatewayClientOptions options = new()
        {
            Endpoint = new Uri("https://localhost:5120"),
            ApiKey = "k",
            UseTls = true,
            RequireCertificateValidation = true,
        };
        using SocketsHttpHandler handler = GalaxyRepositoryClient.CreateHttpHandlerForTests(options);
        Assert.Null(handler.SslOptions.RemoteCertificateValidationCallback);
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,26 @@
|
||||
namespace ZB.MOM.WW.MxGateway.Client;
|
||||
|
||||
/// <summary>
|
||||
/// Filters and shape options for <see cref="GalaxyRepositoryClient.BrowseAsync(BrowseChildrenOptions, System.Threading.CancellationToken)"/>.
|
||||
/// Mirror of <see cref="DiscoverHierarchyOptions"/> for the lazy-browse path.
|
||||
/// </summary>
|
||||
public sealed class BrowseChildrenOptions
|
||||
{
|
||||
/// <summary>Restrict to children whose Galaxy category is in this set.</summary>
|
||||
public IReadOnlyList<int> CategoryIds { get; init; } = [];
|
||||
|
||||
/// <summary>Restrict to children whose template chain contains any of these tokens.</summary>
|
||||
public IReadOnlyList<string> TemplateChainContains { get; init; } = [];
|
||||
|
||||
/// <summary>Optional glob-style filter on <c>tag_name</c>.</summary>
|
||||
public string? TagNameGlob { get; init; }
|
||||
|
||||
/// <summary>Whether to populate each <c>GalaxyObject.Attributes</c>. Null leaves the server default.</summary>
|
||||
public bool? IncludeAttributes { get; init; }
|
||||
|
||||
/// <summary>Restrict to children that bear at least one alarm attribute.</summary>
|
||||
public bool AlarmBearingOnly { get; init; }
|
||||
|
||||
/// <summary>Restrict to children that have at least one historized attribute.</summary>
|
||||
public bool HistorizedOnly { get; init; }
|
||||
}
|
||||
@@ -19,6 +19,7 @@ namespace ZB.MOM.WW.MxGateway.Client;
|
||||
public sealed class GalaxyRepositoryClient : IAsyncDisposable
|
||||
{
|
||||
private const int DiscoverHierarchyPageSize = 5000;
|
||||
private const int BrowseChildrenPageSize = 500;
|
||||
|
||||
private readonly GrpcChannel? _channel;
|
||||
private readonly IGalaxyRepositoryClientTransport _transport;
|
||||
@@ -278,6 +279,89 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
|
||||
cancellationToken);
|
||||
}
|
||||
|
||||
/// <summary>Returns root-level browse nodes (objects with no parent).</summary>
|
||||
/// <param name="cancellationToken">Cancellation token.</param>
|
||||
/// <returns>The list of root <see cref="LazyBrowseNode"/> instances.</returns>
|
||||
public Task<IReadOnlyList<LazyBrowseNode>> BrowseAsync(CancellationToken cancellationToken = default)
|
||||
=> BrowseAsync(null, cancellationToken);
|
||||
|
||||
/// <summary>Returns root-level browse nodes filtered by the given options.</summary>
|
||||
/// <param name="options">Browse filter options. Null applies no filter.</param>
|
||||
/// <param name="cancellationToken">Cancellation token.</param>
|
||||
/// <returns>The list of root <see cref="LazyBrowseNode"/> instances.</returns>
|
||||
public async Task<IReadOnlyList<LazyBrowseNode>> BrowseAsync(
|
||||
BrowseChildrenOptions? options,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
BrowseChildrenOptions effective = options ?? new BrowseChildrenOptions();
|
||||
List<LazyBrowseNode> roots = [];
|
||||
string pageToken = string.Empty;
|
||||
HashSet<string> seenPageTokens = new(StringComparer.Ordinal);
|
||||
do
|
||||
{
|
||||
BrowseChildrenRequest request = BuildBrowseChildrenRequest(effective);
|
||||
request.PageToken = pageToken;
|
||||
BrowseChildrenReply reply = await BrowseChildrenRawAsync(request, cancellationToken).ConfigureAwait(false);
|
||||
|
||||
for (int i = 0; i < reply.Children.Count; i++)
|
||||
{
|
||||
bool hint = i < reply.ChildHasChildren.Count && reply.ChildHasChildren[i];
|
||||
roots.Add(new LazyBrowseNode(this, reply.Children[i], hint, effective));
|
||||
}
|
||||
|
||||
pageToken = reply.NextPageToken;
|
||||
if (!string.IsNullOrWhiteSpace(pageToken) && !seenPageTokens.Add(pageToken))
|
||||
{
|
||||
throw new MxGatewayException(
|
||||
$"Galaxy BrowseChildren returned a repeated page token '{pageToken}'.");
|
||||
}
|
||||
}
|
||||
while (!string.IsNullOrWhiteSpace(pageToken));
|
||||
|
||||
return roots;
|
||||
}
|
||||
|
||||
/// <summary>Issues a raw BrowseChildren RPC without result wrapping.</summary>
|
||||
/// <param name="request">The browse-children request.</param>
|
||||
/// <param name="cancellationToken">Cancellation token.</param>
|
||||
/// <returns>The raw server reply.</returns>
|
||||
public Task<BrowseChildrenReply> BrowseChildrenRawAsync(
|
||||
BrowseChildrenRequest request,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(request);
|
||||
ThrowIfDisposed();
|
||||
|
||||
return ExecuteSafeUnaryAsync(
|
||||
token => _transport.BrowseChildrenAsync(request, CreateCallOptions(token)),
|
||||
cancellationToken);
|
||||
}
|
||||
|
||||
internal static BrowseChildrenRequest BuildBrowseChildrenRequest(BrowseChildrenOptions options)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(options);
|
||||
|
||||
BrowseChildrenRequest request = new()
|
||||
{
|
||||
PageSize = BrowseChildrenPageSize,
|
||||
AlarmBearingOnly = options.AlarmBearingOnly,
|
||||
HistorizedOnly = options.HistorizedOnly,
|
||||
};
|
||||
request.CategoryIds.Add(options.CategoryIds);
|
||||
request.TemplateChainContains.Add(options.TemplateChainContains);
|
||||
if (!string.IsNullOrWhiteSpace(options.TagNameGlob))
|
||||
{
|
||||
request.TagNameGlob = options.TagNameGlob;
|
||||
}
|
||||
|
||||
if (options.IncludeAttributes.HasValue)
|
||||
{
|
||||
request.IncludeAttributes = options.IncludeAttributes.Value;
|
||||
}
|
||||
|
||||
return request;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Subscribes to Galaxy deploy events. The server emits a bootstrap event with the
|
||||
/// current state on subscribe so callers can prime their cache, then emits one event
|
||||
@@ -406,7 +490,10 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
|
||||
.ConfigureAwait(false);
|
||||
}
|
||||
|
||||
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options)
|
||||
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options) =>
|
||||
CreateHttpHandlerForTests(options);
|
||||
|
||||
internal static SocketsHttpHandler CreateHttpHandlerForTests(MxGatewayClientOptions options)
|
||||
{
|
||||
SocketsHttpHandler handler = new()
|
||||
{
|
||||
@@ -426,6 +513,11 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
|
||||
X509Certificate2 trustedRoot = X509CertificateLoader.LoadCertificateFromFile(options.CaCertificatePath);
|
||||
handler.SslOptions.RemoteCertificateValidationCallback = (_, certificate, chain, errors) =>
|
||||
{
|
||||
if ((errors & System.Net.Security.SslPolicyErrors.RemoteCertificateNameMismatch) != 0)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
if (certificate is null)
|
||||
{
|
||||
return false;
|
||||
@@ -441,6 +533,10 @@ public sealed class GalaxyRepositoryClient : IAsyncDisposable
|
||||
return customChain.Build(certificateToValidate);
|
||||
};
|
||||
}
|
||||
else if (!options.RequireCertificateValidation)
|
||||
{
|
||||
handler.SslOptions.RemoteCertificateValidationCallback = (_, _, _, _) => true;
|
||||
}
|
||||
}
|
||||
|
||||
return handler;
|
||||
|
||||
@@ -74,6 +74,23 @@ internal sealed class GrpcGalaxyRepositoryClientTransport(
|
||||
}
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public async Task<BrowseChildrenReply> BrowseChildrenAsync(
|
||||
BrowseChildrenRequest request,
|
||||
CallOptions callOptions)
|
||||
{
|
||||
try
|
||||
{
|
||||
return await RawClient.BrowseChildrenAsync(request, callOptions)
|
||||
.ResponseAsync
|
||||
.ConfigureAwait(false);
|
||||
}
|
||||
catch (RpcException exception)
|
||||
{
|
||||
throw MapRpcException(exception, callOptions.CancellationToken);
|
||||
}
|
||||
}
|
||||
|
||||
/// <inheritdoc />
|
||||
public async IAsyncEnumerable<DeployEvent> WatchDeployEventsAsync(
|
||||
WatchDeployEventsRequest request,
|
||||
|
||||
@@ -33,6 +33,13 @@ internal interface IGalaxyRepositoryClientTransport
|
||||
DiscoverHierarchyRequest request,
|
||||
CallOptions callOptions);
|
||||
|
||||
/// <summary>Returns direct children of a parent in the Galaxy hierarchy.</summary>
|
||||
/// <param name="request">The browse children request.</param>
|
||||
/// <param name="callOptions">gRPC call options (timeout, cancellation, etc.).</param>
|
||||
Task<BrowseChildrenReply> BrowseChildrenAsync(
|
||||
BrowseChildrenRequest request,
|
||||
CallOptions callOptions);
|
||||
|
||||
/// <summary>Watches for deployment events from the Galaxy Repository server.</summary>
|
||||
/// <param name="request">The watch deploy events request.</param>
|
||||
/// <param name="callOptions">gRPC call options (timeout, cancellation, etc.).</param>
|
||||
|
||||
@@ -0,0 +1,120 @@
|
||||
using ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxy;
|
||||
|
||||
namespace ZB.MOM.WW.MxGateway.Client;
|
||||
|
||||
/// <summary>
|
||||
/// One node in a lazy-loaded Galaxy browse tree. Holds the underlying
|
||||
/// <see cref="GalaxyObject"/> and exposes <see cref="ExpandAsync"/> to fetch
|
||||
/// its direct children on demand. Expansion is one-shot: a second call is a
|
||||
/// no-op. Pagination of large sibling sets is handled internally.
|
||||
/// </summary>
|
||||
public sealed class LazyBrowseNode
|
||||
{
|
||||
private readonly GalaxyRepositoryClient _client;
|
||||
private readonly BrowseChildrenOptions _options;
|
||||
private readonly SemaphoreSlim _expandLock = new(1, 1);
|
||||
|
||||
// Published once, under _expandLock, when expansion completes. Lock-free readers
|
||||
// see either the empty pre-expansion snapshot or the fully-populated post-expansion
|
||||
// snapshot — never a partially-filled list — because the snapshot is built in a local
|
||||
// and handed off via Volatile.Write (release) paired with Volatile.Read (acquire).
|
||||
private IReadOnlyList<LazyBrowseNode> _children = [];
|
||||
private volatile bool _isExpanded;
|
||||
|
||||
internal LazyBrowseNode(
|
||||
GalaxyRepositoryClient client,
|
||||
GalaxyObject @object,
|
||||
bool hasChildrenHint,
|
||||
BrowseChildrenOptions options)
|
||||
{
|
||||
_client = client;
|
||||
Object = @object;
|
||||
HasChildrenHint = hasChildrenHint;
|
||||
_options = options;
|
||||
}
|
||||
|
||||
/// <summary>The underlying Galaxy object for this node.</summary>
|
||||
public GalaxyObject Object { get; }
|
||||
|
||||
/// <summary>True when the server reports this node has at least one matching descendant.</summary>
|
||||
public bool HasChildrenHint { get; }
|
||||
|
||||
/// <summary>Direct children loaded by <see cref="ExpandAsync"/>; empty until then.</summary>
|
||||
public IReadOnlyList<LazyBrowseNode> Children => Volatile.Read(ref _children);
|
||||
|
||||
/// <summary>True after the first <see cref="ExpandAsync"/> call completes.</summary>
|
||||
public bool IsExpanded => _isExpanded;
|
||||
|
||||
/// <summary>
|
||||
/// Fetches direct children from the gateway and populates <see cref="Children"/>.
|
||||
/// Idempotent: subsequent calls are no-ops.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Thread-safe: concurrent callers see exactly one fetch; subsequent callers
|
||||
/// (after the first completes) return immediately. <see cref="Children"/> and
|
||||
/// <see cref="IsExpanded"/> may be read concurrently with an in-flight
|
||||
/// <see cref="ExpandAsync"/> on another thread; the populated children are
|
||||
/// published as an immutable snapshot under a release barrier, so a reader that
|
||||
/// observes <see cref="IsExpanded"/> as <see langword="true"/> always sees the
|
||||
/// fully-populated <see cref="Children"/>, and a reader never enumerates a
|
||||
/// partially-built list.
|
||||
/// </remarks>
|
||||
/// <param name="cancellationToken">Token to observe for cancellation.</param>
|
||||
public async Task ExpandAsync(CancellationToken cancellationToken = default)
|
||||
{
|
||||
if (_isExpanded)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
await _expandLock.WaitAsync(cancellationToken).ConfigureAwait(false);
|
||||
try
|
||||
{
|
||||
if (_isExpanded)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
// Accumulate into a local list, never the published field, so a lock-free
|
||||
// reader can never observe a half-populated collection or enumerate a list
|
||||
// that is being mutated mid-append.
|
||||
List<LazyBrowseNode> children = [];
|
||||
string pageToken = string.Empty;
|
||||
HashSet<string> seenPageTokens = new(StringComparer.Ordinal);
|
||||
do
|
||||
{
|
||||
BrowseChildrenRequest request = GalaxyRepositoryClient.BuildBrowseChildrenRequest(_options);
|
||||
request.ParentGobjectId = Object.GobjectId;
|
||||
request.PageToken = pageToken;
|
||||
|
||||
BrowseChildrenReply reply = await _client
|
||||
.BrowseChildrenRawAsync(request, cancellationToken)
|
||||
.ConfigureAwait(false);
|
||||
|
||||
for (int i = 0; i < reply.Children.Count; i++)
|
||||
{
|
||||
bool hint = i < reply.ChildHasChildren.Count && reply.ChildHasChildren[i];
|
||||
children.Add(new LazyBrowseNode(_client, reply.Children[i], hint, _options));
|
||||
}
|
||||
|
||||
pageToken = reply.NextPageToken;
|
||||
if (!string.IsNullOrWhiteSpace(pageToken) && !seenPageTokens.Add(pageToken))
|
||||
{
|
||||
throw new MxGatewayException(
|
||||
$"Galaxy BrowseChildren returned a repeated page token '{pageToken}'.");
|
||||
}
|
||||
}
|
||||
while (!string.IsNullOrWhiteSpace(pageToken));
|
||||
|
||||
// Publish the completed, immutable snapshot (release) before marking the node
|
||||
// expanded (the volatile write below). A reader that observes IsExpanded == true
|
||||
// is guaranteed to also observe the fully-populated Children.
|
||||
Volatile.Write(ref _children, children);
|
||||
_isExpanded = true;
|
||||
}
|
||||
finally
|
||||
{
|
||||
_expandLock.Release();
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -315,7 +315,10 @@ public sealed class MxGatewayClient : IAsyncDisposable
|
||||
.ConfigureAwait(false);
|
||||
}
|
||||
|
||||
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options)
|
||||
private static HttpMessageHandler CreateHttpHandler(MxGatewayClientOptions options) =>
|
||||
CreateHttpHandlerForTests(options);
|
||||
|
||||
internal static SocketsHttpHandler CreateHttpHandlerForTests(MxGatewayClientOptions options)
|
||||
{
|
||||
SocketsHttpHandler handler = new()
|
||||
{
|
||||
@@ -335,6 +338,11 @@ public sealed class MxGatewayClient : IAsyncDisposable
|
||||
X509Certificate2 trustedRoot = X509CertificateLoader.LoadCertificateFromFile(options.CaCertificatePath);
|
||||
handler.SslOptions.RemoteCertificateValidationCallback = (_, certificate, chain, errors) =>
|
||||
{
|
||||
if ((errors & System.Net.Security.SslPolicyErrors.RemoteCertificateNameMismatch) != 0)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
if (certificate is null)
|
||||
{
|
||||
return false;
|
||||
@@ -350,6 +358,10 @@ public sealed class MxGatewayClient : IAsyncDisposable
|
||||
return customChain.Build(certificateToValidate);
|
||||
};
|
||||
}
|
||||
else if (!options.RequireCertificateValidation)
|
||||
{
|
||||
handler.SslOptions.RemoteCertificateValidationCallback = (_, _, _, _) => true;
|
||||
}
|
||||
}
|
||||
|
||||
return handler;
|
||||
|
||||
@@ -7,9 +7,11 @@ namespace ZB.MOM.WW.MxGateway.Client;
|
||||
/// </summary>
|
||||
public static class MxGatewayClientContractInfo
|
||||
{
|
||||
/// <inheritdoc cref="GatewayContractInfo.GatewayProtocolVersion"/>
|
||||
public const uint GatewayProtocolVersion =
|
||||
GatewayContractInfo.GatewayProtocolVersion;
|
||||
|
||||
/// <inheritdoc cref="GatewayContractInfo.WorkerProtocolVersion"/>
|
||||
public const uint WorkerProtocolVersion =
|
||||
GatewayContractInfo.WorkerProtocolVersion;
|
||||
}
|
||||
|
||||
@@ -27,6 +27,14 @@ public sealed class MxGatewayClientOptions
|
||||
/// </summary>
|
||||
public string? CaCertificatePath { get; init; }
|
||||
|
||||
/// <summary>
|
||||
/// When true, TLS connections without a pinned <see cref="CaCertificatePath"/>
|
||||
/// use the OS trust store. When false (default), the gateway certificate is
|
||||
/// accepted without verification — appropriate for this internal tool's
|
||||
/// auto-generated self-signed certificate. Pinning a CA always verifies.
|
||||
/// </summary>
|
||||
public bool RequireCertificateValidation { get; init; }
|
||||
|
||||
/// <summary>
|
||||
/// Gets the server name override for SNI during TLS handshake.
|
||||
/// </summary>
|
||||
|
||||
@@ -16,4 +16,26 @@
|
||||
<Nullable>enable</Nullable>
|
||||
</PropertyGroup>
|
||||
|
||||
<PropertyGroup>
|
||||
<IsPackable>true</IsPackable>
|
||||
<PackageId>ZB.MOM.WW.MxGateway.Client</PackageId>
|
||||
<Description>.NET 10 gRPC client for the MxAccessGateway service. Provides typed wrappers, retry, and a lazy-browse walker over the Galaxy Repository hierarchy.</Description>
|
||||
<PackageReadmeFile>README.md</PackageReadmeFile>
|
||||
<!-- Only the shipped library generates XML docs (matching src/Contracts). The Cli and
|
||||
Tests projects are not packable and do not document their public surface, so this
|
||||
stays out of the shared Directory.Build.props to avoid CS1591 on test classes. -->
|
||||
<GenerateDocumentationFile>true</GenerateDocumentationFile>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<None Include="..\README.md" Pack="true" PackagePath="\" />
|
||||
<None Include="..\LICENSE.txt" Pack="true" PackagePath="\" />
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<AssemblyAttribute Include="System.Runtime.CompilerServices.InternalsVisibleTo">
|
||||
<_Parameter1>ZB.MOM.WW.MxGateway.Client.Tests</_Parameter1>
|
||||
</AssemblyAttribute>
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
|
||||
@@ -104,6 +104,23 @@ Support:
|
||||
- `credentials.NewClientTLSFromFile`,
|
||||
- custom `tls.Config` for advanced callers.
|
||||
|
||||
### Trust posture

The gateway can serve a self-signed certificate it generates itself (it has no
PKI). To make that usable, TLS is **lenient by default**: when `Plaintext` is
`false` and no `CACertFile`/`TLSConfig`/`TransportCredentials` is supplied,
`buildCredentials` dials with `tls.Config{InsecureSkipVerify: true}` (carrying
`ServerNameOverride` as the SNI when set), so the gateway's self-signed
certificate is accepted without verification.

To verify the gateway instead:

- set `CACertFile` to pin a CA (full verification against that root), or
- set `RequireCertificateValidation: true` to verify against the OS/system trust
  roots without pinning.

Pinning a CA always wins over the lenient default.
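
The precedence described above (pinned CA first, then system roots, then the lenient default) can be sketched with the standard library alone. This is an illustrative sketch, not the client's actual `buildCredentials`: the `tlsOptions` struct and `buildTLSConfig` function are hypothetical names, and only the option names `CACertFile`, `ServerNameOverride`, and `RequireCertificateValidation` come from the text.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// tlsOptions is a hypothetical subset of the client options; the field names
// follow the section above, the struct itself is an assumption.
type tlsOptions struct {
	CACertFile                   string
	ServerNameOverride           string
	RequireCertificateValidation bool
}

// buildTLSConfig picks a trust posture: pinned CA first, then system roots,
// otherwise the lenient default that skips verification.
func buildTLSConfig(opts tlsOptions) (*tls.Config, error) {
	cfg := &tls.Config{
		// Sent as SNI; also used for hostname checks when verification is on.
		ServerName: opts.ServerNameOverride,
	}
	switch {
	case opts.CACertFile != "":
		pem, err := os.ReadFile(opts.CACertFile)
		if err != nil {
			return nil, fmt.Errorf("read CA file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pem) {
			return nil, errors.New("CA file contains no PEM certificates")
		}
		cfg.RootCAs = pool // verify only against the pinned root
	case opts.RequireCertificateValidation:
		// RootCAs stays nil, so crypto/tls falls back to the host's root CA set.
	default:
		cfg.InsecureSkipVerify = true // lenient default for the self-signed cert
	}
	return cfg, nil
}

func main() {
	cfg, err := buildTLSConfig(tlsOptions{ServerNameOverride: "gateway.internal"})
	if err != nil {
		panic(err)
	}
	fmt.Println("skip verify:", cfg.InsecureSkipVerify)
}
```

Putting the pinned-CA case first in the `switch` is what makes pinning win regardless of how `RequireCertificateValidation` is set.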
|
||||
|
||||
## Streaming
|
||||
|
||||
`Events(ctx)` should return a receive channel of:
|
||||
|
||||
@@ -75,6 +75,14 @@ client, err := mxgateway.Dial(ctx, mxgateway.Options{
|
||||
})
|
||||
```
|
||||
|
||||
The gateway can auto-generate its own self-signed certificate (it has no PKI), so
the client is **lenient by default**: a TLS connection (`Plaintext: false`) with
no `CACertFile`/`TLSConfig` accepts whatever certificate the gateway presents
(`InsecureSkipVerify`, with `ServerNameOverride` as the SNI when set). To verify
instead, set `CACertFile` to pin a CA, or set `RequireCertificateValidation:
true` to verify against the OS/system trust roots without pinning. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
|
||||
|
||||
`Client.OpenSession` returns a `Session` with helpers for `Register`,
|
||||
`AddItem`, `AddItem2`, `Advise`, `Write`, `Events`, and `Close`. Prefer
|
||||
`SubscribeEvents` or `SubscribeEventsAfter` for long-running streams because the
|
||||
@@ -121,6 +129,68 @@ reports `present=false` (no deploy recorded). `DiscoverHierarchy` returns
|
||||
the generated `*GalaxyObject` slice with each object's dynamic attributes
|
||||
populated for direct contract access.
|
||||
|
||||
### Browsing lazily

For UI trees or OPC UA bridges, use `BrowseChildren` to walk one level at a
time instead of loading the full hierarchy. Pass an empty request for root
objects; subsequent calls set `ParentGobjectId`, `ParentTagName`, or
`ParentContainedPath`. Filter fields match `DiscoverHierarchy`. Each response
pairs `Children` with `ChildHasChildren` so you know which nodes to expand. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics.

```go
import pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated/galaxy_repository/v1"

reply, err := galaxy.BrowseChildren(ctx, &pb.BrowseChildrenRequest{})
if err != nil {
	return err
}
for i, child := range reply.GetChildren() {
	fmt.Printf("%s expand=%v\n", child.GetTagName(), reply.GetChildHasChildren()[i])
}
```
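
A single `BrowseChildren` call returns at most one page of siblings, so a caller using the raw RPC has to follow `NextPageToken` until it comes back empty. The sketch below shows that loop with a guard against a repeated token, mirroring the check the C# client in this change performs. The `page` and `fetchFunc` types are hypothetical stand-ins for the generated reply and the RPC call, so the example runs without a gateway.

```go
package main

import (
	"errors"
	"fmt"
)

// page is a hypothetical stand-in for the BrowseChildrenReply fields used here.
type page struct {
	Children      []string
	NextPageToken string
}

// fetchFunc stands in for one BrowseChildren call made with the given page token.
type fetchFunc func(pageToken string) (page, error)

// collectChildren follows NextPageToken until it is empty and fails if the
// server hands back a token it has already returned (which would loop forever).
func collectChildren(fetch fetchFunc) ([]string, error) {
	var all []string
	seen := map[string]bool{}
	token := ""
	for {
		p, err := fetch(token)
		if err != nil {
			return nil, err
		}
		all = append(all, p.Children...)
		token = p.NextPageToken
		if token == "" {
			return all, nil
		}
		if seen[token] {
			return nil, errors.New("repeated page token " + token)
		}
		seen[token] = true
	}
}

func main() {
	// Two fake pages keyed by the token that requests them.
	pages := map[string]page{
		"":   {Children: []string{"Area001", "Area002"}, NextPageToken: "p2"},
		"p2": {Children: []string{"Pump001"}},
	}
	fetch := func(token string) (page, error) { return pages[token], nil }
	children, err := collectChildren(fetch)
	fmt.Println(children, err)
}
```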
|
||||
|
||||
#### High-level walker

For UI trees, the client provides a `LazyBrowseNode` walker that handles
sibling pagination and the `child_has_children` hint for you:

```go
galaxy, err := mxgateway.DialGalaxy(ctx, mxgateway.Options{
	Endpoint:  "localhost:5000",
	APIKey:    os.Getenv("MXGATEWAY_API_KEY"),
	Plaintext: true,
})
if err != nil {
	log.Fatal(err)
}
defer galaxy.Close()

roots, err := galaxy.Browse(ctx, nil)
if err != nil {
	log.Fatal(err)
}
for _, root := range roots {
	if root.HasChildrenHint() {
		if err := root.Expand(ctx); err != nil {
			log.Fatal(err)
		}
	}
	for _, child := range root.Children() {
		kind := "leaf"
		if child.HasChildrenHint() {
			kind = "has children"
		}
		fmt.Printf("%s (%s)\n", child.Object().GetTagName(), kind)
	}
}
```
|
||||
|
||||
`Expand` is idempotent: calling it twice issues only one RPC, and it is safe
to call from concurrent callers. To refresh after a Galaxy redeploy, call
`Browse` again from the root.
|
||||
|
||||
### Watching deploy events
|
||||
|
||||
`WatchDeployEvents` opens a server-streaming subscription. The server emits a
|
||||
@@ -213,6 +283,41 @@ $env:MXGATEWAY_TEST_ITEM = 'Area001.Tag.Value'
|
||||
go run ./cmd/mxgw-go smoke -endpoint $env:MXGATEWAY_ENDPOINT -plaintext -api-key-env MXGATEWAY_API_KEY -item $env:MXGATEWAY_TEST_ITEM -json
|
||||
```
|
||||
|
||||
## Installing the Go client

The module is resolved directly from the Git repository; there is no package
registry:

````bash
go get gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go@v0.1.1
````

Then import:

````go
import "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/mxgateway"
````

If your build environment cannot reach `gitea.dohertylan.com` directly,
configure `GOPROXY` to point at an internal proxy that fronts the Gitea
repo, or set `GOPRIVATE=gitea.dohertylan.com/*` to fetch the module
straight from the VCS. Setting `GOPRIVATE` both bypasses the public module
proxy and disables checksum-database (`sum.golang.org`) verification for
that path. Add `GOINSECURE=gitea.dohertylan.com/*` if the host serves the
module over plain HTTP rather than HTTPS.
|
||||
|
||||
## Releasing a new version

Go modules in monorepo subdirectories use prefixed tags. To tag a release
from this repo:

````bash
pwsh scripts/tag-go-module.ps1 -Version v0.1.1 -Push
````

The script validates semver, refuses to tag with uncommitted tracked
changes, creates an annotated tag `clients/go/v0.1.1`, and (with `-Push`)
pushes it to origin.
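
The prefixed-tag rule can be illustrated in a few lines of Go. This is only a sketch of the naming convention, not a port of `tag-go-module.ps1`: the `moduleTag` function is hypothetical, and the pattern accepts plain `vMAJOR.MINOR.PATCH` versions only, which may be stricter than what the script allows.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// semverTag matches plain release versions such as v0.1.1 (no pre-release or
// build metadata); this restriction is an assumption made for the sketch.
var semverTag = regexp.MustCompile(`^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$`)

// moduleTag returns the Git tag for a module that lives in subdir of the repo.
// A module at the repository root is tagged with the bare version.
func moduleTag(subdir, version string) (string, error) {
	if !semverTag.MatchString(version) {
		return "", fmt.Errorf("version %q is not of the form vMAJOR.MINOR.PATCH", version)
	}
	subdir = strings.Trim(subdir, "/")
	if subdir == "" {
		return version, nil
	}
	return subdir + "/" + version, nil
}

func main() {
	tag, err := moduleTag("clients/go", "v0.1.1")
	if err != nil {
		panic(err)
	}
	fmt.Println(tag) // clients/go/v0.1.1
}
```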
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [Client Packaging](../../docs/ClientPackaging.md)
|
||||
|
||||
@@ -824,6 +824,260 @@ func (x *GalaxyAttribute) GetIsAlarm() bool {
|
||||
return false
|
||||
}
|
||||
|
||||
type BrowseChildrenRequest struct {
|
||||
state protoimpl.MessageState `protogen:"open.v1"`
|
||||
// Parent selector. Empty oneof returns root objects (parent_gobject_id == 0).
|
||||
//
|
||||
// Types that are valid to be assigned to Parent:
|
||||
//
|
||||
// *BrowseChildrenRequest_ParentGobjectId
|
||||
// *BrowseChildrenRequest_ParentTagName
|
||||
// *BrowseChildrenRequest_ParentContainedPath
|
||||
Parent isBrowseChildrenRequest_Parent `protobuf_oneof:"parent"`
|
||||
// Maximum number of direct children to return. Server default 500; cap 5000.
|
||||
PageSize int32 `protobuf:"varint,4,opt,name=page_size,json=pageSize,proto3" json:"page_size,omitempty"`
|
||||
// Opaque token returned by a previous BrowseChildren response. Bound to the
|
||||
// cache sequence, parent selector, and the filter set; a mismatch returns
|
||||
// InvalidArgument.
|
||||
PageToken string `protobuf:"bytes,5,opt,name=page_token,json=pageToken,proto3" json:"page_token,omitempty"`
|
||||
// --- Filter parity with DiscoverHierarchy. AND-combined. ---
|
||||
CategoryIds []int32 `protobuf:"varint,6,rep,packed,name=category_ids,json=categoryIds,proto3" json:"category_ids,omitempty"`
|
||||
TemplateChainContains []string `protobuf:"bytes,7,rep,name=template_chain_contains,json=templateChainContains,proto3" json:"template_chain_contains,omitempty"`
|
||||
TagNameGlob string `protobuf:"bytes,8,opt,name=tag_name_glob,json=tagNameGlob,proto3" json:"tag_name_glob,omitempty"`
|
||||
IncludeAttributes *bool `protobuf:"varint,9,opt,name=include_attributes,json=includeAttributes,proto3,oneof" json:"include_attributes,omitempty"`
|
||||
AlarmBearingOnly bool `protobuf:"varint,10,opt,name=alarm_bearing_only,json=alarmBearingOnly,proto3" json:"alarm_bearing_only,omitempty"`
|
||||
HistorizedOnly bool `protobuf:"varint,11,opt,name=historized_only,json=historizedOnly,proto3" json:"historized_only,omitempty"`
|
||||
unknownFields protoimpl.UnknownFields
|
||||
sizeCache protoimpl.SizeCache
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) Reset() {
|
||||
*x = BrowseChildrenRequest{}
|
||||
mi := &file_galaxy_repository_proto_msgTypes[10]
|
||||
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
|
||||
ms.StoreMessageInfo(mi)
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) String() string {
|
||||
return protoimpl.X.MessageStringOf(x)
|
||||
}
|
||||
|
||||
func (*BrowseChildrenRequest) ProtoMessage() {}
|
||||
|
||||
func (x *BrowseChildrenRequest) ProtoReflect() protoreflect.Message {
|
||||
mi := &file_galaxy_repository_proto_msgTypes[10]
|
||||
if x != nil {
|
||||
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
|
||||
if ms.LoadMessageInfo() == nil {
|
||||
ms.StoreMessageInfo(mi)
|
||||
}
|
||||
return ms
|
||||
}
|
||||
return mi.MessageOf(x)
|
||||
}
|
||||
|
||||
// Deprecated: Use BrowseChildrenRequest.ProtoReflect.Descriptor instead.
|
||||
func (*BrowseChildrenRequest) Descriptor() ([]byte, []int) {
|
||||
return file_galaxy_repository_proto_rawDescGZIP(), []int{10}
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetParent() isBrowseChildrenRequest_Parent {
|
||||
if x != nil {
|
||||
return x.Parent
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetParentGobjectId() int32 {
|
||||
if x != nil {
|
||||
if x, ok := x.Parent.(*BrowseChildrenRequest_ParentGobjectId); ok {
|
||||
return x.ParentGobjectId
|
||||
}
|
||||
}
|
||||
return 0
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetParentTagName() string {
|
||||
if x != nil {
|
||||
if x, ok := x.Parent.(*BrowseChildrenRequest_ParentTagName); ok {
|
||||
return x.ParentTagName
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetParentContainedPath() string {
|
||||
if x != nil {
|
||||
if x, ok := x.Parent.(*BrowseChildrenRequest_ParentContainedPath); ok {
|
||||
return x.ParentContainedPath
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetPageSize() int32 {
|
||||
if x != nil {
|
||||
return x.PageSize
|
||||
}
|
||||
return 0
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetPageToken() string {
|
||||
if x != nil {
|
||||
return x.PageToken
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetCategoryIds() []int32 {
|
||||
if x != nil {
|
||||
return x.CategoryIds
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetTemplateChainContains() []string {
|
||||
if x != nil {
|
||||
return x.TemplateChainContains
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetTagNameGlob() string {
|
||||
if x != nil {
|
||||
return x.TagNameGlob
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetIncludeAttributes() bool {
|
||||
if x != nil && x.IncludeAttributes != nil {
|
||||
return *x.IncludeAttributes
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetAlarmBearingOnly() bool {
|
||||
if x != nil {
|
||||
return x.AlarmBearingOnly
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenRequest) GetHistorizedOnly() bool {
|
||||
if x != nil {
|
||||
return x.HistorizedOnly
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
type isBrowseChildrenRequest_Parent interface {
|
||||
isBrowseChildrenRequest_Parent()
|
||||
}
|
||||
|
||||
type BrowseChildrenRequest_ParentGobjectId struct {
|
||||
ParentGobjectId int32 `protobuf:"varint,1,opt,name=parent_gobject_id,json=parentGobjectId,proto3,oneof"`
|
||||
}
|
||||
|
||||
type BrowseChildrenRequest_ParentTagName struct {
|
||||
ParentTagName string `protobuf:"bytes,2,opt,name=parent_tag_name,json=parentTagName,proto3,oneof"`
|
||||
}
|
||||
|
||||
type BrowseChildrenRequest_ParentContainedPath struct {
|
||||
ParentContainedPath string `protobuf:"bytes,3,opt,name=parent_contained_path,json=parentContainedPath,proto3,oneof"`
|
||||
}
|
||||
|
||||
func (*BrowseChildrenRequest_ParentGobjectId) isBrowseChildrenRequest_Parent() {}
|
||||
|
||||
func (*BrowseChildrenRequest_ParentTagName) isBrowseChildrenRequest_Parent() {}
|
||||
|
||||
func (*BrowseChildrenRequest_ParentContainedPath) isBrowseChildrenRequest_Parent() {}
|
||||
|
||||
type BrowseChildrenReply struct {
|
||||
state protoimpl.MessageState `protogen:"open.v1"`
|
||||
// Direct children matching the filter, sorted areas-first then by
|
||||
// case-insensitive display name (same order as the dashboard tree).
|
||||
Children []*GalaxyObject `protobuf:"bytes,1,rep,name=children,proto3" json:"children,omitempty"`
|
||||
// Non-empty when another page of siblings is available.
|
||||
NextPageToken string `protobuf:"bytes,2,opt,name=next_page_token,json=nextPageToken,proto3" json:"next_page_token,omitempty"`
|
||||
// Total matching direct children of the parent (post-filter).
|
||||
TotalChildCount int32 `protobuf:"varint,3,opt,name=total_child_count,json=totalChildCount,proto3" json:"total_child_count,omitempty"`
|
||||
// Parallel array, indexed with `children`. True when the child has at least
|
||||
// one matching descendant under the same filter set. Lets a UI choose
|
||||
// whether to draw an expand triangle without an extra round trip.
|
||||
ChildHasChildren []bool `protobuf:"varint,4,rep,packed,name=child_has_children,json=childHasChildren,proto3" json:"child_has_children,omitempty"`
|
||||
// Cache sequence this reply was projected from. Clients may pass it back as
|
||||
// part of the page_token contract. Mismatch on the next page -> InvalidArgument.
|
||||
CacheSequence uint64 `protobuf:"varint,5,opt,name=cache_sequence,json=cacheSequence,proto3" json:"cache_sequence,omitempty"`
|
||||
unknownFields protoimpl.UnknownFields
|
||||
sizeCache protoimpl.SizeCache
|
||||
}
|
||||
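The parallel-array contract between `children` and `child_has_children` can be illustrated with a small sketch. It is not part of the generated code: `treeRow` and `zipRows` are hypothetical names, and treating a missing hint as "not expandable" is an assumption made here for safety.

```go
package main

import "fmt"

// treeRow is a hypothetical UI row: a display name plus whether to draw an
// expand triangle next to it.
type treeRow struct {
	Name       string
	Expandable bool
}

// zipRows pairs each child name with its has-children hint by index.
// A missing hint (hints slice shorter than names) is treated as "not
// expandable"; that fallback is an assumption of this sketch.
func zipRows(names []string, hasChildren []bool) []treeRow {
	rows := make([]treeRow, len(names))
	for i, name := range names {
		rows[i] = treeRow{Name: name, Expandable: i < len(hasChildren) && hasChildren[i]}
	}
	return rows
}

func main() {
	rows := zipRows([]string{"Area1", "Pump1"}, []bool{true, false})
	fmt.Println(rows) // prints [{Area1 true} {Pump1 false}]
}
```

In a real client the names would come from `GetChildren()` and the hints from `GetChildHasChildren()`.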
|
||||
func (x *BrowseChildrenReply) Reset() {
|
||||
*x = BrowseChildrenReply{}
|
||||
mi := &file_galaxy_repository_proto_msgTypes[11]
|
||||
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
|
||||
ms.StoreMessageInfo(mi)
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenReply) String() string {
|
||||
return protoimpl.X.MessageStringOf(x)
|
||||
}
|
||||
|
||||
func (*BrowseChildrenReply) ProtoMessage() {}
|
||||
|
||||
func (x *BrowseChildrenReply) ProtoReflect() protoreflect.Message {
|
||||
mi := &file_galaxy_repository_proto_msgTypes[11]
|
||||
if x != nil {
|
||||
ms := protoimpl.X.MessageStateOf(protoimpl.Pointer(x))
|
||||
if ms.LoadMessageInfo() == nil {
|
||||
ms.StoreMessageInfo(mi)
|
||||
}
|
||||
return ms
|
||||
}
|
||||
return mi.MessageOf(x)
|
||||
}
|
||||
|
||||
// Deprecated: Use BrowseChildrenReply.ProtoReflect.Descriptor instead.
|
||||
func (*BrowseChildrenReply) Descriptor() ([]byte, []int) {
|
||||
return file_galaxy_repository_proto_rawDescGZIP(), []int{11}
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenReply) GetChildren() []*GalaxyObject {
|
||||
if x != nil {
|
||||
return x.Children
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenReply) GetNextPageToken() string {
|
||||
if x != nil {
|
||||
return x.NextPageToken
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenReply) GetTotalChildCount() int32 {
|
||||
if x != nil {
|
||||
return x.TotalChildCount
|
||||
}
|
||||
return 0
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenReply) GetChildHasChildren() []bool {
|
||||
if x != nil {
|
||||
return x.ChildHasChildren
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (x *BrowseChildrenReply) GetCacheSequence() uint64 {
|
||||
if x != nil {
|
||||
return x.CacheSequence
|
||||
}
|
||||
return 0
|
||||
}
|
||||
|
||||
var File_galaxy_repository_proto protoreflect.FileDescriptor
|
||||
|
||||
const file_galaxy_repository_proto_rawDesc = "" +
|
||||
@@ -897,12 +1151,35 @@ const file_galaxy_repository_proto_rawDesc = "" +
|
||||
"\x17security_classification\x18\t \x01(\x05R\x16securityClassification\x12#\n" +
|
||||
"\ris_historized\x18\n" +
|
||||
" \x01(\bR\fisHistorized\x12\x19\n" +
|
||||
"\bis_alarm\x18\v \x01(\bR\aisAlarm2\xcc\x03\n" +
|
||||
"\bis_alarm\x18\v \x01(\bR\aisAlarm\"\x8c\x04\n" +
|
||||
"\x15BrowseChildrenRequest\x12,\n" +
|
||||
"\x11parent_gobject_id\x18\x01 \x01(\x05H\x00R\x0fparentGobjectId\x12(\n" +
|
||||
"\x0fparent_tag_name\x18\x02 \x01(\tH\x00R\rparentTagName\x124\n" +
|
||||
"\x15parent_contained_path\x18\x03 \x01(\tH\x00R\x13parentContainedPath\x12\x1b\n" +
|
||||
"\tpage_size\x18\x04 \x01(\x05R\bpageSize\x12\x1d\n" +
|
||||
"\n" +
|
||||
"page_token\x18\x05 \x01(\tR\tpageToken\x12!\n" +
|
||||
"\fcategory_ids\x18\x06 \x03(\x05R\vcategoryIds\x126\n" +
|
||||
"\x17template_chain_contains\x18\a \x03(\tR\x15templateChainContains\x12\"\n" +
|
||||
"\rtag_name_glob\x18\b \x01(\tR\vtagNameGlob\x122\n" +
|
||||
"\x12include_attributes\x18\t \x01(\bH\x01R\x11includeAttributes\x88\x01\x01\x12,\n" +
|
||||
"\x12alarm_bearing_only\x18\n" +
|
||||
" \x01(\bR\x10alarmBearingOnly\x12'\n" +
|
||||
"\x0fhistorized_only\x18\v \x01(\bR\x0ehistorizedOnlyB\b\n" +
|
||||
"\x06parentB\x15\n" +
|
||||
"\x13_include_attributes\"\xfe\x01\n" +
|
||||
"\x13BrowseChildrenReply\x12>\n" +
|
||||
"\bchildren\x18\x01 \x03(\v2\".galaxy_repository.v1.GalaxyObjectR\bchildren\x12&\n" +
|
||||
"\x0fnext_page_token\x18\x02 \x01(\tR\rnextPageToken\x12*\n" +
|
||||
"\x11total_child_count\x18\x03 \x01(\x05R\x0ftotalChildCount\x12,\n" +
|
||||
"\x12child_has_children\x18\x04 \x03(\bR\x10childHasChildren\x12%\n" +
|
||||
"\x0ecache_sequence\x18\x05 \x01(\x04R\rcacheSequence2\xb6\x04\n" +
|
||||
"\x10GalaxyRepository\x12h\n" +
|
||||
"\x0eTestConnection\x12+.galaxy_repository.v1.TestConnectionRequest\x1a).galaxy_repository.v1.TestConnectionReply\x12q\n" +
|
||||
"\x11GetLastDeployTime\x12..galaxy_repository.v1.GetLastDeployTimeRequest\x1a,.galaxy_repository.v1.GetLastDeployTimeReply\x12q\n" +
|
||||
"\x11DiscoverHierarchy\x12..galaxy_repository.v1.DiscoverHierarchyRequest\x1a,.galaxy_repository.v1.DiscoverHierarchyReply\x12h\n" +
|
||||
"\x11WatchDeployEvents\x12..galaxy_repository.v1.WatchDeployEventsRequest\x1a!.galaxy_repository.v1.DeployEvent0\x01B-\xaa\x02*ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxyb\x06proto3"
|
||||
"\x11WatchDeployEvents\x12..galaxy_repository.v1.WatchDeployEventsRequest\x1a!.galaxy_repository.v1.DeployEvent0\x01\x12h\n" +
|
||||
"\x0eBrowseChildren\x12+.galaxy_repository.v1.BrowseChildrenRequest\x1a).galaxy_repository.v1.BrowseChildrenReplyB-\xaa\x02*ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxyb\x06proto3"
|
||||
|
||||
var (
|
||||
file_galaxy_repository_proto_rawDescOnce sync.Once
|
||||
@@ -916,7 +1193,7 @@ func file_galaxy_repository_proto_rawDescGZIP() []byte {
|
||||
return file_galaxy_repository_proto_rawDescData
|
||||
}
|
||||
|
||||
var file_galaxy_repository_proto_msgTypes = make([]protoimpl.MessageInfo, 10)
|
||||
var file_galaxy_repository_proto_msgTypes = make([]protoimpl.MessageInfo, 12)
|
||||
var file_galaxy_repository_proto_goTypes = []any{
|
||||
(*TestConnectionRequest)(nil), // 0: galaxy_repository.v1.TestConnectionRequest
|
||||
(*TestConnectionReply)(nil), // 1: galaxy_repository.v1.TestConnectionReply
|
||||
@@ -928,30 +1205,35 @@ var file_galaxy_repository_proto_goTypes = []any{
|
||||
(*DeployEvent)(nil), // 7: galaxy_repository.v1.DeployEvent
|
||||
(*GalaxyObject)(nil), // 8: galaxy_repository.v1.GalaxyObject
|
||||
(*GalaxyAttribute)(nil), // 9: galaxy_repository.v1.GalaxyAttribute
|
||||
(*timestamppb.Timestamp)(nil), // 10: google.protobuf.Timestamp
|
||||
(*wrapperspb.Int32Value)(nil), // 11: google.protobuf.Int32Value
|
||||
(*BrowseChildrenRequest)(nil), // 10: galaxy_repository.v1.BrowseChildrenRequest
|
||||
(*BrowseChildrenReply)(nil), // 11: galaxy_repository.v1.BrowseChildrenReply
|
||||
(*timestamppb.Timestamp)(nil), // 12: google.protobuf.Timestamp
|
||||
(*wrapperspb.Int32Value)(nil), // 13: google.protobuf.Int32Value
|
||||
}
|
||||
var file_galaxy_repository_proto_depIdxs = []int32{
|
||||
10, // 0: galaxy_repository.v1.GetLastDeployTimeReply.time_of_last_deploy:type_name -> google.protobuf.Timestamp
|
||||
11, // 1: galaxy_repository.v1.DiscoverHierarchyRequest.max_depth:type_name -> google.protobuf.Int32Value
|
||||
12, // 0: galaxy_repository.v1.GetLastDeployTimeReply.time_of_last_deploy:type_name -> google.protobuf.Timestamp
|
||||
13, // 1: galaxy_repository.v1.DiscoverHierarchyRequest.max_depth:type_name -> google.protobuf.Int32Value
|
||||
8, // 2: galaxy_repository.v1.DiscoverHierarchyReply.objects:type_name -> galaxy_repository.v1.GalaxyObject
|
||||
10, // 3: galaxy_repository.v1.WatchDeployEventsRequest.last_seen_deploy_time:type_name -> google.protobuf.Timestamp
|
||||
10, // 4: galaxy_repository.v1.DeployEvent.observed_at:type_name -> google.protobuf.Timestamp
|
||||
10, // 5: galaxy_repository.v1.DeployEvent.time_of_last_deploy:type_name -> google.protobuf.Timestamp
|
||||
12, // 3: galaxy_repository.v1.WatchDeployEventsRequest.last_seen_deploy_time:type_name -> google.protobuf.Timestamp
|
||||
12, // 4: galaxy_repository.v1.DeployEvent.observed_at:type_name -> google.protobuf.Timestamp
|
||||
12, // 5: galaxy_repository.v1.DeployEvent.time_of_last_deploy:type_name -> google.protobuf.Timestamp
|
||||
9, // 6: galaxy_repository.v1.GalaxyObject.attributes:type_name -> galaxy_repository.v1.GalaxyAttribute
|
||||
0, // 7: galaxy_repository.v1.GalaxyRepository.TestConnection:input_type -> galaxy_repository.v1.TestConnectionRequest
|
||||
2, // 8: galaxy_repository.v1.GalaxyRepository.GetLastDeployTime:input_type -> galaxy_repository.v1.GetLastDeployTimeRequest
|
||||
4, // 9: galaxy_repository.v1.GalaxyRepository.DiscoverHierarchy:input_type -> galaxy_repository.v1.DiscoverHierarchyRequest
|
||||
6, // 10: galaxy_repository.v1.GalaxyRepository.WatchDeployEvents:input_type -> galaxy_repository.v1.WatchDeployEventsRequest
|
||||
1, // 11: galaxy_repository.v1.GalaxyRepository.TestConnection:output_type -> galaxy_repository.v1.TestConnectionReply
|
||||
3, // 12: galaxy_repository.v1.GalaxyRepository.GetLastDeployTime:output_type -> galaxy_repository.v1.GetLastDeployTimeReply
|
||||
5, // 13: galaxy_repository.v1.GalaxyRepository.DiscoverHierarchy:output_type -> galaxy_repository.v1.DiscoverHierarchyReply
|
||||
7, // 14: galaxy_repository.v1.GalaxyRepository.WatchDeployEvents:output_type -> galaxy_repository.v1.DeployEvent
|
||||
11, // [11:15] is the sub-list for method output_type
|
||||
7, // [7:11] is the sub-list for method input_type
|
||||
7, // [7:7] is the sub-list for extension type_name
|
||||
7, // [7:7] is the sub-list for extension extendee
|
||||
0, // [0:7] is the sub-list for field type_name
|
||||
8, // 7: galaxy_repository.v1.BrowseChildrenReply.children:type_name -> galaxy_repository.v1.GalaxyObject
|
||||
0, // 8: galaxy_repository.v1.GalaxyRepository.TestConnection:input_type -> galaxy_repository.v1.TestConnectionRequest
|
||||
2, // 9: galaxy_repository.v1.GalaxyRepository.GetLastDeployTime:input_type -> galaxy_repository.v1.GetLastDeployTimeRequest
|
||||
4, // 10: galaxy_repository.v1.GalaxyRepository.DiscoverHierarchy:input_type -> galaxy_repository.v1.DiscoverHierarchyRequest
|
||||
6, // 11: galaxy_repository.v1.GalaxyRepository.WatchDeployEvents:input_type -> galaxy_repository.v1.WatchDeployEventsRequest
|
||||
10, // 12: galaxy_repository.v1.GalaxyRepository.BrowseChildren:input_type -> galaxy_repository.v1.BrowseChildrenRequest
|
||||
1, // 13: galaxy_repository.v1.GalaxyRepository.TestConnection:output_type -> galaxy_repository.v1.TestConnectionReply
|
||||
3, // 14: galaxy_repository.v1.GalaxyRepository.GetLastDeployTime:output_type -> galaxy_repository.v1.GetLastDeployTimeReply
|
||||
5, // 15: galaxy_repository.v1.GalaxyRepository.DiscoverHierarchy:output_type -> galaxy_repository.v1.DiscoverHierarchyReply
|
||||
7, // 16: galaxy_repository.v1.GalaxyRepository.WatchDeployEvents:output_type -> galaxy_repository.v1.DeployEvent
|
||||
11, // 17: galaxy_repository.v1.GalaxyRepository.BrowseChildren:output_type -> galaxy_repository.v1.BrowseChildrenReply
|
||||
13, // [13:18] is the sub-list for method output_type
|
||||
8, // [8:13] is the sub-list for method input_type
|
||||
8, // [8:8] is the sub-list for extension type_name
|
||||
8, // [8:8] is the sub-list for extension extendee
|
||||
0, // [0:8] is the sub-list for field type_name
|
||||
}
|
||||
|
||||
func init() { file_galaxy_repository_proto_init() }
|
||||
@@ -964,13 +1246,18 @@ func file_galaxy_repository_proto_init() {
|
||||
(*DiscoverHierarchyRequest_RootTagName)(nil),
|
||||
(*DiscoverHierarchyRequest_RootContainedPath)(nil),
|
||||
}
|
||||
file_galaxy_repository_proto_msgTypes[10].OneofWrappers = []any{
|
||||
(*BrowseChildrenRequest_ParentGobjectId)(nil),
|
||||
(*BrowseChildrenRequest_ParentTagName)(nil),
|
||||
(*BrowseChildrenRequest_ParentContainedPath)(nil),
|
||||
}
|
||||
type x struct{}
|
||||
out := protoimpl.TypeBuilder{
|
||||
File: protoimpl.DescBuilder{
|
||||
GoPackagePath: reflect.TypeOf(x{}).PkgPath(),
|
||||
RawDescriptor: unsafe.Slice(unsafe.StringData(file_galaxy_repository_proto_rawDesc), len(file_galaxy_repository_proto_rawDesc)),
|
||||
NumEnums: 0,
|
||||
NumMessages: 10,
|
||||
NumMessages: 12,
|
||||
NumExtensions: 0,
|
||||
NumServices: 1,
|
||||
},
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
// Code generated by protoc-gen-go-grpc. DO NOT EDIT.
|
||||
// versions:
|
||||
// - protoc-gen-go-grpc v1.6.1
|
||||
// - protoc-gen-go-grpc v1.6.2
|
||||
// - protoc v7.34.1
|
||||
// source: galaxy_repository.proto
|
||||
|
||||
@@ -23,6 +23,7 @@ const (
|
||||
GalaxyRepository_GetLastDeployTime_FullMethodName = "/galaxy_repository.v1.GalaxyRepository/GetLastDeployTime"
|
||||
GalaxyRepository_DiscoverHierarchy_FullMethodName = "/galaxy_repository.v1.GalaxyRepository/DiscoverHierarchy"
|
||||
GalaxyRepository_WatchDeployEvents_FullMethodName = "/galaxy_repository.v1.GalaxyRepository/WatchDeployEvents"
|
||||
GalaxyRepository_BrowseChildren_FullMethodName = "/galaxy_repository.v1.GalaxyRepository/BrowseChildren"
|
||||
)
|
||||
|
||||
// GalaxyRepositoryClient is the client API for GalaxyRepository service.
|
||||
@@ -44,6 +45,11 @@ type GalaxyRepositoryClient interface {
|
||||
// increasing per server start; gaps indicate the per-subscriber buffer dropped
|
||||
// older events because the client was too slow.
|
||||
WatchDeployEvents(ctx context.Context, in *WatchDeployEventsRequest, opts ...grpc.CallOption) (grpc.ServerStreamingClient[DeployEvent], error)
|
||||
// Returns the direct children of a parent object (or the root objects when
|
||||
// `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
|
||||
// one level at a time instead of paging the full hierarchy. Filters mirror
|
||||
// DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
|
||||
BrowseChildren(ctx context.Context, in *BrowseChildrenRequest, opts ...grpc.CallOption) (*BrowseChildrenReply, error)
|
||||
}
|
||||
|
||||
type galaxyRepositoryClient struct {
|
||||
@@ -103,6 +109,16 @@ func (c *galaxyRepositoryClient) WatchDeployEvents(ctx context.Context, in *Watc
|
||||
// This type alias is provided for backwards compatibility with existing code that references the prior non-generic stream type by name.
|
||||
type GalaxyRepository_WatchDeployEventsClient = grpc.ServerStreamingClient[DeployEvent]
|
||||
|
||||
func (c *galaxyRepositoryClient) BrowseChildren(ctx context.Context, in *BrowseChildrenRequest, opts ...grpc.CallOption) (*BrowseChildrenReply, error) {
|
||||
cOpts := append([]grpc.CallOption{grpc.StaticMethod()}, opts...)
|
||||
out := new(BrowseChildrenReply)
|
||||
err := c.cc.Invoke(ctx, GalaxyRepository_BrowseChildren_FullMethodName, in, out, cOpts...)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return out, nil
|
||||
}
|
||||
|
||||
// GalaxyRepositoryServer is the server API for GalaxyRepository service.
|
||||
// All implementations must embed UnimplementedGalaxyRepositoryServer
|
||||
// for forward compatibility.
|
||||
@@ -122,6 +138,11 @@ type GalaxyRepositoryServer interface {
|
||||
// increasing per server start; gaps indicate the per-subscriber buffer dropped
|
||||
// older events because the client was too slow.
|
||||
WatchDeployEvents(*WatchDeployEventsRequest, grpc.ServerStreamingServer[DeployEvent]) error
|
||||
// Returns the direct children of a parent object (or the root objects when
|
||||
// `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
|
||||
// one level at a time instead of paging the full hierarchy. Filters mirror
|
||||
// DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
|
||||
BrowseChildren(context.Context, *BrowseChildrenRequest) (*BrowseChildrenReply, error)
|
||||
mustEmbedUnimplementedGalaxyRepositoryServer()
|
||||
}
|
||||
|
||||
@@ -144,6 +165,9 @@ func (UnimplementedGalaxyRepositoryServer) DiscoverHierarchy(context.Context, *D
|
||||
func (UnimplementedGalaxyRepositoryServer) WatchDeployEvents(*WatchDeployEventsRequest, grpc.ServerStreamingServer[DeployEvent]) error {
|
||||
return status.Error(codes.Unimplemented, "method WatchDeployEvents not implemented")
|
||||
}
|
||||
func (UnimplementedGalaxyRepositoryServer) BrowseChildren(context.Context, *BrowseChildrenRequest) (*BrowseChildrenReply, error) {
|
||||
return nil, status.Error(codes.Unimplemented, "method BrowseChildren not implemented")
|
||||
}
|
||||
func (UnimplementedGalaxyRepositoryServer) mustEmbedUnimplementedGalaxyRepositoryServer() {}
|
||||
func (UnimplementedGalaxyRepositoryServer) testEmbeddedByValue() {}
|
||||
|
||||
@@ -230,6 +254,24 @@ func _GalaxyRepository_WatchDeployEvents_Handler(srv interface{}, stream grpc.Se
|
||||
// This type alias is provided for backwards compatibility with existing code that references the prior non-generic stream type by name.
|
||||
type GalaxyRepository_WatchDeployEventsServer = grpc.ServerStreamingServer[DeployEvent]
|
||||
|
||||
func _GalaxyRepository_BrowseChildren_Handler(srv interface{}, ctx context.Context, dec func(interface{}) error, interceptor grpc.UnaryServerInterceptor) (interface{}, error) {
|
||||
in := new(BrowseChildrenRequest)
|
||||
if err := dec(in); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
if interceptor == nil {
|
||||
return srv.(GalaxyRepositoryServer).BrowseChildren(ctx, in)
|
||||
}
|
||||
info := &grpc.UnaryServerInfo{
|
||||
Server: srv,
|
||||
FullMethod: GalaxyRepository_BrowseChildren_FullMethodName,
|
||||
}
|
||||
handler := func(ctx context.Context, req interface{}) (interface{}, error) {
|
||||
return srv.(GalaxyRepositoryServer).BrowseChildren(ctx, req.(*BrowseChildrenRequest))
|
||||
}
|
||||
return interceptor(ctx, in, info, handler)
|
||||
}
|
||||
|
||||
// GalaxyRepository_ServiceDesc is the grpc.ServiceDesc for GalaxyRepository service.
|
||||
// It's only intended for direct use with grpc.RegisterService,
|
||||
// and not to be introspected or modified (even as a copy)
|
||||
@@ -249,6 +291,10 @@ var GalaxyRepository_ServiceDesc = grpc.ServiceDesc{
|
||||
MethodName: "DiscoverHierarchy",
|
||||
Handler: _GalaxyRepository_DiscoverHierarchy_Handler,
|
||||
},
|
||||
{
|
||||
MethodName: "BrowseChildren",
|
||||
Handler: _GalaxyRepository_BrowseChildren_Handler,
|
||||
},
|
||||
},
|
||||
Streams: []grpc.StreamDesc{
|
||||
{
|
||||
|
||||
@@ -725,9 +725,10 @@ func (SessionState) EnumDescriptor() ([]byte, []int) {
|
||||
return file_mxaccess_gateway_proto_rawDescGZIP(), []int{8}
|
||||
}
|
||||
|
||||
// Public request shape for QueryActiveAlarms. session_id is currently unused
|
||||
// (the snapshot is session-less) but reserved so a future per-session view
|
||||
// can be added without a wire break.
|
||||
// Public request shape for QueryActiveAlarms.
|
||||
// Clients may leave `session_id` empty; the gateway currently ignores it and
|
||||
// serves the session-less central-monitor cache. A future version may use it
|
||||
// to scope the snapshot to one session.
|
||||
type QueryActiveAlarmsRequest struct {
|
||||
state protoimpl.MessageState `protogen:"open.v1"`
|
||||
SessionId string `protobuf:"bytes,1,opt,name=session_id,json=sessionId,proto3" json:"session_id,omitempty"`
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
// Code generated by protoc-gen-go-grpc. DO NOT EDIT.
|
||||
// versions:
|
||||
// - protoc-gen-go-grpc v1.6.1
|
||||
// - protoc-gen-go-grpc v1.6.2
|
||||
// - protoc v7.34.1
|
||||
// source: mxaccess_gateway.proto
|
||||
|
||||
@@ -50,6 +50,9 @@ type MxAccessGatewayClient interface {
|
||||
// reconnect to seed Part 9 client state, or to reconcile alarms that may
|
||||
// have been missed during a transport blip. Streamed so callers can
|
||||
// begin processing without buffering the full set.
|
||||
// `QueryActiveAlarmsRequest.alarm_filter_prefix` optionally narrows the
|
||||
// snapshot to alarms whose `alarm_full_reference` starts with the given
|
||||
// prefix; an empty prefix returns the full set.
|
||||
QueryActiveAlarms(ctx context.Context, in *QueryActiveAlarmsRequest, opts ...grpc.CallOption) (grpc.ServerStreamingClient[ActiveAlarmSnapshot], error)
|
||||
}
|
||||
|
||||
@@ -180,6 +183,9 @@ type MxAccessGatewayServer interface {
|
||||
// reconnect to seed Part 9 client state, or to reconcile alarms that may
|
||||
// have been missed during a transport blip. Streamed so callers can
|
||||
// begin processing without buffering the full set.
|
||||
// `QueryActiveAlarmsRequest.alarm_filter_prefix` optionally narrows the
|
||||
// snapshot to alarms whose `alarm_full_reference` starts with the given
|
||||
// prefix; an empty prefix returns the full set.
|
||||
QueryActiveAlarms(*QueryActiveAlarmsRequest, grpc.ServerStreamingServer[ActiveAlarmSnapshot]) error
|
||||
mustEmbedUnimplementedMxAccessGatewayServer()
|
||||
}
|
||||
|
||||
@@ -222,10 +222,22 @@ func resolveTransportCredentials(opts Options) (credentials.TransportCredentials
|
||||
return credentials.NewTLS(cfg), nil
|
||||
}
|
||||
|
||||
return credentials.NewTLS(&tls.Config{
|
||||
MinVersion: tls.VersionTLS12,
|
||||
ServerName: opts.ServerNameOverride,
|
||||
}), nil
|
||||
return credentials.NewTLS(tlsConfigForOptions(opts)), nil
|
||||
}
|
||||
|
||||
// tlsConfigForOptions returns the *tls.Config for the no-CA, no-custom-config TLS path.
|
||||
// It returns nil when the caller should use a different credentials path (CA file or custom TLSConfig).
|
||||
// Exposed as an internal helper so unit tests can assert the InsecureSkipVerify posture.
|
||||
func tlsConfigForOptions(opts Options) *tls.Config {
|
||||
// CA file and custom TLSConfig take their own paths in resolveTransportCredentials.
|
||||
if opts.CACertFile != "" || opts.TLSConfig != nil {
|
||||
return nil
|
||||
}
|
||||
return &tls.Config{
|
||||
MinVersion: tls.VersionTLS12,
|
||||
ServerName: opts.ServerNameOverride,
|
||||
InsecureSkipVerify: !opts.RequireCertificateValidation, //nolint:gosec // internal tool; self-signed gateway cert expected; opt-in strict via RequireCertificateValidation
|
||||
}
|
||||
}
|
||||
|
||||
// OpenSessionOptions describes fields used to create an OpenSessionRequest.
|
||||
|
||||
@@ -0,0 +1,59 @@
package mxgateway

import (
	"crypto/tls"
	"testing"
)

// tlsConfigForOptions is the internal helper under test.
// It extracts the *tls.Config from the no-CA TLS path of resolveTransportCredentials.
// We exercise it directly to avoid needing a real dial target.

func TestTLSInsecureSkipVerify_DefaultTrue(t *testing.T) {
	cfg := tlsConfigForOptions(Options{
		Endpoint: "localhost:5120",
	})
	if cfg == nil {
		t.Fatal("expected non-nil tls.Config")
	}
	if !cfg.InsecureSkipVerify {
		t.Error("InsecureSkipVerify should be true by default when no CA is pinned")
	}
}

func TestTLSInsecureSkipVerify_FalseWhenRequireCertificateValidation(t *testing.T) {
	cfg := tlsConfigForOptions(Options{
		Endpoint:                     "localhost:5120",
		RequireCertificateValidation: true,
	})
	if cfg == nil {
		t.Fatal("expected non-nil tls.Config")
	}
	if cfg.InsecureSkipVerify {
		t.Error("InsecureSkipVerify should be false when RequireCertificateValidation is true")
	}
}

func TestTLSInsecureSkipVerify_FalseWhenCACertFileSet(t *testing.T) {
	// When a CA file is pinned, the CA-verification path is taken instead.
	// tlsConfigForOptions should return nil (the CA path does not use our helper).
	cfg := tlsConfigForOptions(Options{
		Endpoint:   "localhost:5120",
		CACertFile: "/some/ca.pem",
	})
	if cfg != nil {
		t.Error("expected nil tls.Config when CACertFile is set (CA path taken)")
	}
}

func TestTLSInsecureSkipVerify_FalseWhenCustomTLSConfig(t *testing.T) {
	// When TLSConfig is supplied explicitly, our default skip-verify must not overwrite it.
	custom := &tls.Config{MinVersion: tls.VersionTLS13}
	cfg := tlsConfigForOptions(Options{
		Endpoint:  "localhost:5120",
		TLSConfig: custom,
	})
	if cfg != nil {
		t.Error("expected nil tls.Config when TLSConfig is already set (custom config path taken)")
	}
}
@@ -3,7 +3,9 @@ package mxgateway
|
||||
import (
|
||||
"context"
|
||||
"errors"
|
||||
"fmt"
|
||||
"io"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated"
|
||||
@@ -13,6 +15,14 @@ import (
|
||||
"google.golang.org/protobuf/types/known/timestamppb"
|
||||
)
|
||||
|
||||
// browseChildrenPageSize is the per-request page size used by the lazy walker.
|
||||
const browseChildrenPageSize = 500
|
||||
|
||||
// discoverHierarchyPageSize is the per-request page size used by DiscoverHierarchy.
|
||||
// Mirrors the .NET client constant so large galaxies are not silently truncated
|
||||
// by the server's default page cap.
|
||||
const discoverHierarchyPageSize = 5000
|
||||
|
||||
// RawGalaxyRepositoryClient is the generated gRPC client interface for the
|
||||
// Galaxy Repository service exposed for callers that need direct contract
|
||||
// access.
|
||||
@@ -40,6 +50,10 @@ type (
|
||||
WatchDeployEventsRequest = pb.WatchDeployEventsRequest
|
||||
// DeployEvent is one Galaxy Repository deploy event.
|
||||
DeployEvent = pb.DeployEvent
|
||||
// BrowseChildrenRequest is the request for BrowseChildren.
|
||||
BrowseChildrenRequest = pb.BrowseChildrenRequest
|
||||
// BrowseChildrenReply is the reply for BrowseChildren.
|
||||
BrowseChildrenReply = pb.BrowseChildrenReply
|
||||
)
|
||||
|
||||
// RawDeployEventStream is the generated WatchDeployEvents client stream.
|
||||
@@ -146,16 +160,35 @@ func (c *GalaxyClient) GetLastDeployTime(ctx context.Context) (time.Time, bool,
|
||||
|
||||
// DiscoverHierarchy returns the deployed Galaxy object hierarchy with each
|
||||
// object's dynamic attributes. The objects are returned in the order supplied
|
||||
// by the server.
|
||||
// by the server. The call pages over the server's NextPageToken until the
|
||||
// server signals it has no more results, matching the .NET client.
|
||||
func (c *GalaxyClient) DiscoverHierarchy(ctx context.Context) ([]*GalaxyObject, error) {
|
||||
callCtx, cancel := c.callContext(ctx)
|
||||
defer cancel()
|
||||
|
||||
reply, err := c.raw.DiscoverHierarchy(callCtx, &pb.DiscoverHierarchyRequest{})
|
||||
if err != nil {
|
||||
return nil, &GatewayError{Op: "galaxy discover hierarchy", Err: err}
|
||||
var objects []*GalaxyObject
|
||||
pageToken := ""
|
||||
seen := map[string]struct{}{}
|
||||
for {
|
||||
callCtx, cancel := c.callContext(ctx)
|
||||
reply, err := c.raw.DiscoverHierarchy(callCtx, &pb.DiscoverHierarchyRequest{
|
||||
PageSize: discoverHierarchyPageSize,
|
||||
PageToken: pageToken,
|
||||
})
|
||||
cancel()
|
||||
if err != nil {
|
||||
return nil, &GatewayError{Op: "galaxy discover hierarchy", Err: err}
|
||||
}
|
||||
objects = append(objects, reply.GetObjects()...)
|
||||
pageToken = reply.GetNextPageToken()
|
||||
if pageToken == "" {
|
||||
return objects, nil
|
||||
}
|
||||
if _, dup := seen[pageToken]; dup {
|
||||
return nil, &GatewayError{
|
||||
Op: "galaxy discover hierarchy",
|
||||
Err: fmt.Errorf("repeated page token %q", pageToken),
|
||||
}
|
||||
}
|
||||
seen[pageToken] = struct{}{}
|
||||
}
|
||||
return reply.GetObjects(), nil
|
||||
}
|
||||
|
||||
// WatchDeployEventsRaw starts the generated WatchDeployEvents stream for callers
|
||||
@@ -238,6 +271,206 @@ func (c *GalaxyClient) Close() error {
	return c.conn.Close()
}

// LazyBrowseNode is one node in a lazy Galaxy hierarchy walk produced by
// (*GalaxyClient).Browse. Children are not fetched until Expand is called.
// The node is safe for concurrent use; concurrent Expand calls coalesce onto
// a single in-flight RPC and do not block snapshot accessors.
type LazyBrowseNode struct {
	client          *GalaxyClient
	object          *pb.GalaxyObject
	hasChildrenHint bool
	options         BrowseChildrenOptions

	// expandLock gates inspection and mutation of expand-coordination state
	// (expanding, expandDone, expandErr). It is held only briefly; the BrowseChildren
	// RPC itself runs outside this lock so concurrent readers and waiters are not blocked.
	expandLock sync.Mutex
	expanding  bool
	expandDone chan struct{}
	expandErr  error

	// mu protects the children snapshot and isExpanded flag for concurrent
	// Children() / IsExpanded() readers.
	mu         sync.RWMutex
	children   []*LazyBrowseNode
	isExpanded bool
}

// Object returns the underlying GalaxyObject describing this node.
func (n *LazyBrowseNode) Object() *pb.GalaxyObject { return n.object }

// HasChildrenHint reports the server-supplied hint on whether this node has
// matching descendants under the current filter set.
func (n *LazyBrowseNode) HasChildrenHint() bool { return n.hasChildrenHint }

// Children returns a snapshot copy of the currently-loaded child nodes. Returns
// an empty slice when Expand has not yet been called.
func (n *LazyBrowseNode) Children() []*LazyBrowseNode {
	n.mu.RLock()
	defer n.mu.RUnlock()
	out := make([]*LazyBrowseNode, len(n.children))
	copy(out, n.children)
	return out
}

// IsExpanded reports whether Expand has completed successfully on this node.
func (n *LazyBrowseNode) IsExpanded() bool {
	n.mu.RLock()
	defer n.mu.RUnlock()
	return n.isExpanded
}

// Expand fetches this node's direct children via BrowseChildren when they have
// not yet been loaded. Subsequent calls after a successful Expand are a no-op
// and do not issue another RPC.
//
// Expand is safe to call concurrently from multiple goroutines: callers that
// arrive while an expansion is in flight wait on the active RPC and share its
// result instead of issuing a second RPC. The RPC itself runs without holding
// the snapshot mutex, so concurrent Children() and IsExpanded() callers are
// not blocked for the duration of the network round trip.
//
// Failure semantics: a failed expansion surfaces the same error to every
// in-flight waiter, but the node is left in its pre-call state (isExpanded =
// false, no in-flight expansion). The next Expand call therefore retries with
// a fresh RPC; failures are not sticky.
func (n *LazyBrowseNode) Expand(ctx context.Context) error {
	// Fast path: already expanded.
	n.mu.RLock()
	if n.isExpanded {
		n.mu.RUnlock()
		return nil
	}
	n.mu.RUnlock()

	// Either start a new expansion or wait on an existing one.
	n.expandLock.Lock()
	n.mu.RLock()
	alreadyExpanded := n.isExpanded
	n.mu.RUnlock()
	if alreadyExpanded {
		n.expandLock.Unlock()
		return nil
	}
	if n.expanding {
		done := n.expandDone
		n.expandLock.Unlock()
		select {
		case <-done:
			n.expandLock.Lock()
			err := n.expandErr
			n.expandLock.Unlock()
			return err
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	n.expanding = true
	n.expandDone = make(chan struct{})
	done := n.expandDone
	n.expandLock.Unlock()

	// Issue the RPC outside any lock so concurrent readers/waiters are not blocked.
	parentID := n.object.GetGobjectId()
	children, err := n.client.browseChildrenInner(ctx, &parentID, n.options)

	if err == nil {
		n.mu.Lock()
		n.children = children
		n.isExpanded = true
		n.mu.Unlock()
	}

	// Publish result to waiters and clear the in-flight marker so a failed
	// expansion can be retried by the next Expand call.
	n.expandLock.Lock()
	n.expandErr = err
	n.expanding = false
	close(done)
	n.expandLock.Unlock()

	return err
}

// Browse returns the root nodes of the Galaxy hierarchy. The returned nodes
|
||||
// have only their server-supplied hints populated; call Expand on each node to
|
||||
// fetch its direct children. When opts is nil the server defaults apply.
|
||||
func (c *GalaxyClient) Browse(ctx context.Context, opts *BrowseChildrenOptions) ([]*LazyBrowseNode, error) {
|
||||
effective := BrowseChildrenOptions{}
|
||||
if opts != nil {
|
||||
effective = *opts
|
||||
}
|
||||
return c.browseChildrenInner(ctx, nil, effective)
|
||||
}
|
||||
|
||||
// BrowseChildrenRaw issues a single BrowseChildren RPC and returns the raw
|
||||
// reply for callers that need direct page-token control. Transport-level
|
||||
// failures are wrapped in *GatewayError to match the rest of the client.
|
||||
func (c *GalaxyClient) BrowseChildrenRaw(ctx context.Context, req *pb.BrowseChildrenRequest) (*pb.BrowseChildrenReply, error) {
|
||||
callCtx, cancel := c.callContext(ctx)
|
||||
defer cancel()
|
||||
reply, err := c.raw.BrowseChildren(callCtx, req)
|
||||
if err != nil {
|
||||
return nil, &GatewayError{Op: "galaxy browse children", Err: err}
|
||||
}
|
||||
return reply, nil
|
||||
}
|
||||
|
||||
func (c *GalaxyClient) browseChildrenInner(
|
||||
ctx context.Context,
|
||||
parentGobjectID *int32,
|
||||
opts BrowseChildrenOptions,
|
||||
) ([]*LazyBrowseNode, error) {
|
||||
var nodes []*LazyBrowseNode
|
||||
pageToken := ""
|
||||
seen := map[string]struct{}{}
|
||||
for {
|
||||
req := &pb.BrowseChildrenRequest{
|
||||
PageSize: browseChildrenPageSize,
|
||||
PageToken: pageToken,
|
||||
CategoryIds: opts.CategoryIds,
|
||||
TemplateChainContains: opts.TemplateChainContains,
|
||||
TagNameGlob: opts.TagNameGlob,
|
||||
AlarmBearingOnly: opts.AlarmBearingOnly,
|
||||
HistorizedOnly: opts.HistorizedOnly,
|
||||
}
|
||||
if parentGobjectID != nil {
|
||||
req.Parent = &pb.BrowseChildrenRequest_ParentGobjectId{ParentGobjectId: *parentGobjectID}
|
||||
}
|
||||
if opts.IncludeAttributes != nil {
|
||||
req.IncludeAttributes = opts.IncludeAttributes
|
||||
}
|
||||
|
||||
reply, err := c.BrowseChildrenRaw(ctx, req)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
for i, child := range reply.GetChildren() {
|
||||
hasChildren := reply.GetChildHasChildren()
|
||||
hint := i < len(hasChildren) && hasChildren[i]
|
||||
nodes = append(nodes, &LazyBrowseNode{
|
||||
client: c,
|
||||
object: child,
|
||||
hasChildrenHint: hint,
|
||||
options: opts,
|
||||
})
|
||||
}
|
||||
|
||||
pageToken = reply.GetNextPageToken()
|
||||
if pageToken == "" {
|
||||
return nodes, nil
|
||||
}
|
||||
if _, dup := seen[pageToken]; dup {
|
||||
return nil, &GatewayError{
|
||||
Op: "galaxy browse children",
|
||||
Err: fmt.Errorf("repeated page token %q", pageToken),
|
||||
}
|
||||
}
|
||||
seen[pageToken] = struct{}{}
|
||||
}
|
||||
}
|
||||
|
||||
func (c *GalaxyClient) callContext(ctx context.Context) (context.Context, context.CancelFunc) {
|
||||
timeout := c.opts.CallTimeout
|
||||
if timeout == 0 {
|
||||
|
||||
@@ -4,11 +4,14 @@ import (
|
||||
"context"
|
||||
"errors"
|
||||
"net"
|
||||
"sync"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated"
|
||||
"google.golang.org/grpc"
|
||||
"google.golang.org/grpc/codes"
|
||||
"google.golang.org/grpc/status"
|
||||
"google.golang.org/grpc/test/bufconn"
|
||||
"google.golang.org/protobuf/types/known/timestamppb"
|
||||
)
|
||||
@@ -144,6 +147,47 @@ func TestGalaxyDiscoverHierarchyReturnsObjects(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestGalaxyDiscoverHierarchyPaginatesAcrossMultiplePages(t *testing.T) {
|
||||
page1 := &pb.DiscoverHierarchyReply{
|
||||
Objects: []*pb.GalaxyObject{
|
||||
{GobjectId: 1, TagName: "A"},
|
||||
{GobjectId: 2, TagName: "B"},
|
||||
},
|
||||
NextPageToken: "page-2",
|
||||
TotalObjectCount: 3,
|
||||
}
|
||||
page2 := &pb.DiscoverHierarchyReply{
|
||||
Objects: []*pb.GalaxyObject{
|
||||
{GobjectId: 3, TagName: "C"},
|
||||
},
|
||||
TotalObjectCount: 3,
|
||||
}
|
||||
fake := &fakeGalaxyServer{
|
||||
discoverHierarchyReplies: []*pb.DiscoverHierarchyReply{page1, page2},
|
||||
}
|
||||
client, cleanup := newGalaxyBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
|
||||
objs, err := client.DiscoverHierarchy(context.Background())
|
||||
if err != nil {
|
||||
t.Fatalf("DiscoverHierarchy: %v", err)
|
||||
}
|
||||
if got, want := len(objs), 3; got != want {
|
||||
t.Fatalf("len(objs) = %d, want %d", got, want)
|
||||
}
|
||||
if len(fake.discoverHierarchyCalls) != 2 {
|
||||
t.Fatalf("expected 2 RPC calls, got %d", len(fake.discoverHierarchyCalls))
|
||||
}
|
||||
if fake.discoverHierarchyCalls[0].GetPageSize() != discoverHierarchyPageSize {
|
||||
t.Fatalf("first call PageSize = %d, want %d",
|
||||
fake.discoverHierarchyCalls[0].GetPageSize(), discoverHierarchyPageSize)
|
||||
}
|
||||
if fake.discoverHierarchyCalls[1].GetPageToken() != "page-2" {
|
||||
t.Fatalf("second call page token = %q, want %q",
|
||||
fake.discoverHierarchyCalls[1].GetPageToken(), "page-2")
|
||||
}
|
||||
}
|
||||
|
||||
func TestGalaxyDialReturnsGatewayErrorOnRpcFailure(t *testing.T) {
|
||||
fake := &fakeGalaxyServer{failTest: true}
|
||||
client, cleanup := newGalaxyBufconnClient(t, fake)
|
||||
@@ -370,15 +414,20 @@ func newGalaxyBufconnClient(t *testing.T, fake *fakeGalaxyServer) (*GalaxyClient
|
||||
type fakeGalaxyServer struct {
|
||||
pb.UnimplementedGalaxyRepositoryServer
|
||||
|
||||
testReply *pb.TestConnectionReply
|
||||
testAuth string
|
||||
failTest bool
|
||||
deployReply *pb.GetLastDeployTimeReply
|
||||
discoverReply *pb.DiscoverHierarchyReply
|
||||
watchEvents []*pb.DeployEvent
|
||||
watchRequest *pb.WatchDeployEventsRequest
|
||||
watchSendInterval time.Duration
|
||||
watchHoldOpen bool
|
||||
testReply *pb.TestConnectionReply
|
||||
testAuth string
|
||||
failTest bool
|
||||
deployReply *pb.GetLastDeployTimeReply
|
||||
discoverReply *pb.DiscoverHierarchyReply
|
||||
discoverHierarchyCalls []*pb.DiscoverHierarchyRequest
|
||||
discoverHierarchyReplies []*pb.DiscoverHierarchyReply
|
||||
watchEvents []*pb.DeployEvent
|
||||
watchRequest *pb.WatchDeployEventsRequest
|
||||
watchSendInterval time.Duration
|
||||
watchHoldOpen bool
|
||||
browseChildrenCalls []*pb.BrowseChildrenRequest
|
||||
browseChildrenReplies []*pb.BrowseChildrenReply
|
||||
browseChildrenError error
|
||||
}
|
||||
|
||||
func (s *fakeGalaxyServer) TestConnection(ctx context.Context, req *pb.TestConnectionRequest) (*pb.TestConnectionReply, error) {
|
||||
@@ -400,6 +449,12 @@ func (s *fakeGalaxyServer) GetLastDeployTime(ctx context.Context, req *pb.GetLas
|
||||
}
|
||||
|
||||
func (s *fakeGalaxyServer) DiscoverHierarchy(ctx context.Context, req *pb.DiscoverHierarchyRequest) (*pb.DiscoverHierarchyReply, error) {
|
||||
s.discoverHierarchyCalls = append(s.discoverHierarchyCalls, req)
|
||||
if len(s.discoverHierarchyReplies) > 0 {
|
||||
reply := s.discoverHierarchyReplies[0]
|
||||
s.discoverHierarchyReplies = s.discoverHierarchyReplies[1:]
|
||||
return reply, nil
|
||||
}
|
||||
if s.discoverReply != nil {
|
||||
return s.discoverReply, nil
|
||||
}
|
||||
@@ -425,3 +480,385 @@ func (s *fakeGalaxyServer) WatchDeployEvents(req *pb.WatchDeployEventsRequest, s
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (s *fakeGalaxyServer) BrowseChildren(ctx context.Context, req *pb.BrowseChildrenRequest) (*pb.BrowseChildrenReply, error) {
|
||||
s.browseChildrenCalls = append(s.browseChildrenCalls, req)
|
||||
if s.browseChildrenError != nil {
|
||||
err := s.browseChildrenError
|
||||
s.browseChildrenError = nil
|
||||
return nil, err
|
||||
}
|
||||
if len(s.browseChildrenReplies) == 0 {
|
||||
return &pb.BrowseChildrenReply{}, nil
|
||||
}
|
||||
reply := s.browseChildrenReplies[0]
|
||||
s.browseChildrenReplies = s.browseChildrenReplies[1:]
|
||||
return reply, nil
|
||||
}
|
||||
|
||||
func obj(id int32, tag string, isArea bool) *pb.GalaxyObject {
|
||||
return &pb.GalaxyObject{
|
||||
GobjectId: id,
|
||||
TagName: tag,
|
||||
BrowseName: tag,
|
||||
IsArea: isArea,
|
||||
}
|
||||
}
|
||||
|
||||
func buildBrowseReply(children []*pb.GalaxyObject, hasChildren []bool, seq uint64) *pb.BrowseChildrenReply {
|
||||
return &pb.BrowseChildrenReply{
|
||||
TotalChildCount: int32(len(children)),
|
||||
CacheSequence: seq,
|
||||
Children: children,
|
||||
ChildHasChildren: hasChildren,
|
||||
}
|
||||
}
|
||||
|
||||
func TestGalaxyBrowseNoParentReturnsRoots(t *testing.T) {
|
||||
fake := &fakeGalaxyServer{
|
||||
browseChildrenReplies: []*pb.BrowseChildrenReply{
|
||||
buildBrowseReply(
|
||||
[]*pb.GalaxyObject{obj(1, "Plant", true), obj(99, "Other", false)},
|
||||
[]bool{true, false},
|
||||
7,
|
||||
),
|
||||
},
|
||||
}
|
||||
client, cleanup := newGalaxyBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
|
||||
roots, err := client.Browse(context.Background(), nil)
|
||||
if err != nil {
|
||||
t.Fatalf("Browse: %v", err)
|
||||
}
|
||||
if got, want := len(roots), 2; got != want {
|
||||
t.Fatalf("len(roots) = %d, want %d", got, want)
|
||||
}
|
||||
if roots[0].Object().GetTagName() != "Plant" {
|
||||
t.Fatalf("roots[0].TagName = %q", roots[0].Object().GetTagName())
|
||||
}
|
||||
if !roots[0].HasChildrenHint() {
|
||||
t.Fatal("roots[0].HasChildrenHint = false, want true")
|
||||
}
|
||||
if roots[0].IsExpanded() {
|
||||
t.Fatal("roots[0].IsExpanded = true, want false")
|
||||
}
|
||||
if roots[1].HasChildrenHint() {
|
||||
t.Fatal("roots[1].HasChildrenHint = true, want false")
|
||||
}
|
||||
if len(fake.browseChildrenCalls) != 1 {
|
||||
t.Fatalf("BrowseChildren calls = %d, want 1", len(fake.browseChildrenCalls))
|
||||
}
|
||||
if fake.browseChildrenCalls[0].GetParent() != nil {
|
||||
t.Fatalf("root browse should not set Parent oneof, got %T", fake.browseChildrenCalls[0].GetParent())
|
||||
}
|
||||
}
|
||||
|
||||
func TestGalaxyBrowseExpandPopulatesChildrenAndMarksExpanded(t *testing.T) {
|
||||
fake := &fakeGalaxyServer{
|
||||
browseChildrenReplies: []*pb.BrowseChildrenReply{
|
||||
buildBrowseReply(
|
||||
[]*pb.GalaxyObject{obj(1, "Plant", true)},
|
||||
[]bool{true},
|
||||
1,
|
||||
),
|
||||
buildBrowseReply(
|
||||
[]*pb.GalaxyObject{obj(10, "Area1", true), obj(11, "Tank1", false)},
|
||||
[]bool{true, false},
|
||||
1,
|
||||
),
|
||||
},
|
||||
}
|
||||
client, cleanup := newGalaxyBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
|
||||
roots, err := client.Browse(context.Background(), nil)
|
||||
if err != nil {
|
||||
t.Fatalf("Browse: %v", err)
|
||||
}
|
||||
if len(roots) != 1 {
|
||||
t.Fatalf("len(roots) = %d, want 1", len(roots))
|
||||
}
|
||||
plant := roots[0]
|
||||
if plant.IsExpanded() {
|
||||
t.Fatal("plant.IsExpanded = true before Expand, want false")
|
||||
}
|
||||
if err := plant.Expand(context.Background()); err != nil {
|
||||
t.Fatalf("Expand: %v", err)
|
||||
}
|
||||
if !plant.IsExpanded() {
|
||||
t.Fatal("plant.IsExpanded = false after Expand, want true")
|
||||
}
|
||||
children := plant.Children()
|
||||
if len(children) != 2 {
|
||||
t.Fatalf("len(children) = %d, want 2", len(children))
|
||||
}
|
||||
if children[0].Object().GetTagName() != "Area1" {
|
||||
t.Fatalf("children[0].TagName = %q, want Area1", children[0].Object().GetTagName())
|
||||
}
|
||||
if !children[0].HasChildrenHint() {
|
||||
t.Fatal("children[0].HasChildrenHint = false, want true")
|
||||
}
|
||||
if children[1].HasChildrenHint() {
|
||||
t.Fatal("children[1].HasChildrenHint = true, want false")
|
||||
}
|
||||
if len(fake.browseChildrenCalls) != 2 {
|
||||
t.Fatalf("BrowseChildren calls = %d, want 2", len(fake.browseChildrenCalls))
|
||||
}
|
||||
parent := fake.browseChildrenCalls[1].GetParent()
|
||||
parentGobj, ok := parent.(*pb.BrowseChildrenRequest_ParentGobjectId)
|
||||
if !ok {
|
||||
t.Fatalf("Parent oneof = %T, want *BrowseChildrenRequest_ParentGobjectId", parent)
|
||||
}
|
||||
if parentGobj.ParentGobjectId != 1 {
|
||||
t.Fatalf("ParentGobjectId = %d, want 1", parentGobj.ParentGobjectId)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGalaxyBrowseExpandIdempotentNoSecondRpc(t *testing.T) {
|
||||
fake := &fakeGalaxyServer{
|
||||
browseChildrenReplies: []*pb.BrowseChildrenReply{
|
||||
buildBrowseReply(
|
||||
[]*pb.GalaxyObject{obj(1, "Plant", true)},
|
||||
[]bool{true},
|
||||
1,
|
||||
),
|
||||
buildBrowseReply(
|
||||
[]*pb.GalaxyObject{obj(10, "Area1", true)},
|
||||
[]bool{false},
|
||||
1,
|
||||
),
|
||||
},
|
||||
}
|
||||
client, cleanup := newGalaxyBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
|
||||
roots, err := client.Browse(context.Background(), nil)
|
||||
if err != nil {
|
||||
t.Fatalf("Browse: %v", err)
|
||||
}
|
||||
plant := roots[0]
|
||||
if err := plant.Expand(context.Background()); err != nil {
|
||||
t.Fatalf("Expand #1: %v", err)
|
||||
}
|
||||
callsAfterFirst := len(fake.browseChildrenCalls)
|
||||
if callsAfterFirst != 2 {
|
||||
t.Fatalf("BrowseChildren calls after first Expand = %d, want 2", callsAfterFirst)
|
||||
}
|
||||
if err := plant.Expand(context.Background()); err != nil {
|
||||
t.Fatalf("Expand #2: %v", err)
|
||||
}
|
||||
if got := len(fake.browseChildrenCalls); got != callsAfterFirst {
|
||||
t.Fatalf("BrowseChildren calls after second Expand = %d, want %d (no extra RPC)", got, callsAfterFirst)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGalaxyBrowseExpandUnknownParentReturnsNotFoundError(t *testing.T) {
|
||||
fake := &fakeGalaxyServer{
|
||||
browseChildrenReplies: []*pb.BrowseChildrenReply{
|
||||
buildBrowseReply(
|
||||
[]*pb.GalaxyObject{obj(1, "Plant", true)},
|
||||
[]bool{true},
|
||||
1,
|
||||
),
|
||||
},
|
||||
}
client, cleanup := newGalaxyBufconnClient(t, fake)
defer cleanup()

// The fake returns browseChildrenError ahead of any queued reply, so the
// error is armed only after the root Browse call has consumed its reply.
|
||||
roots, err := client.Browse(context.Background(), nil)
|
||||
if err != nil {
|
||||
t.Fatalf("Browse: %v", err)
|
||||
}
|
||||
if len(roots) != 1 {
|
||||
t.Fatalf("len(roots) = %d, want 1", len(roots))
|
||||
}
|
||||
fake.browseChildrenError = status.Error(codes.NotFound, "parent not found")
|
||||
|
||||
err = roots[0].Expand(context.Background())
|
||||
if err == nil {
|
||||
t.Fatal("Expand: error = nil, want NotFound")
|
||||
}
|
||||
if status.Code(err) != codes.NotFound {
|
||||
t.Fatalf("status.Code = %s, want NotFound", status.Code(err))
|
||||
}
|
||||
if roots[0].IsExpanded() {
|
||||
t.Fatal("roots[0].IsExpanded = true after failed Expand, want false")
|
||||
}
|
||||
}
|
||||
|
||||
func TestGalaxyBrowseExpandMultiPageGathersAllPages(t *testing.T) {
|
||||
firstPage := buildBrowseReply(
|
||||
[]*pb.GalaxyObject{obj(1, "Plant", true)},
|
||||
[]bool{true},
|
||||
7,
|
||||
)
|
||||
|
||||
pageA := buildBrowseReply(
|
||||
[]*pb.GalaxyObject{obj(10, "Child1", false), obj(11, "Child2", false)},
|
||||
[]bool{false, false},
|
||||
7,
|
||||
)
|
||||
pageA.NextPageToken = "7:abc:2"
|
||||
pageB := buildBrowseReply(
|
||||
[]*pb.GalaxyObject{obj(12, "Child3", false)},
|
||||
[]bool{false},
|
||||
7,
|
||||
)
|
||||
|
||||
fake := &fakeGalaxyServer{
|
||||
browseChildrenReplies: []*pb.BrowseChildrenReply{firstPage, pageA, pageB},
|
||||
}
|
||||
client, cleanup := newGalaxyBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
|
||||
roots, err := client.Browse(context.Background(), nil)
|
||||
if err != nil {
|
||||
t.Fatalf("Browse: %v", err)
|
||||
}
|
||||
if err := roots[0].Expand(context.Background()); err != nil {
|
||||
t.Fatalf("Expand: %v", err)
|
||||
}
|
||||
children := roots[0].Children()
|
||||
if len(children) != 3 {
|
||||
t.Fatalf("len(children) = %d, want 3", len(children))
|
||||
}
|
||||
if len(fake.browseChildrenCalls) != 3 {
|
||||
t.Fatalf("BrowseChildren calls = %d, want 3", len(fake.browseChildrenCalls))
|
||||
}
|
||||
if got := fake.browseChildrenCalls[2].GetPageToken(); got != "7:abc:2" {
|
||||
t.Fatalf("third call PageToken = %q, want %q", got, "7:abc:2")
|
||||
}
|
||||
}
|
||||
|
||||
func TestGalaxyBrowseWithFilterForwardsToRequest(t *testing.T) {
|
||||
fake := &fakeGalaxyServer{
|
||||
browseChildrenReplies: []*pb.BrowseChildrenReply{
|
||||
buildBrowseReply(nil, nil, 1),
|
||||
},
|
||||
}
|
||||
client, cleanup := newGalaxyBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
|
||||
include := true
|
||||
opts := &BrowseChildrenOptions{
|
||||
CategoryIds: []int32{7, 9},
|
||||
TemplateChainContains: []string{"$AppObject"},
|
||||
TagNameGlob: "Tank*",
|
||||
IncludeAttributes: &include,
|
||||
AlarmBearingOnly: true,
|
||||
HistorizedOnly: true,
|
||||
}
|
||||
if _, err := client.Browse(context.Background(), opts); err != nil {
|
||||
t.Fatalf("Browse: %v", err)
|
||||
}
|
||||
if len(fake.browseChildrenCalls) != 1 {
|
||||
t.Fatalf("BrowseChildren calls = %d, want 1", len(fake.browseChildrenCalls))
|
||||
}
|
||||
got := fake.browseChildrenCalls[0]
|
||||
if want := []int32{7, 9}; len(got.GetCategoryIds()) != 2 || got.GetCategoryIds()[0] != want[0] || got.GetCategoryIds()[1] != want[1] {
|
||||
t.Fatalf("CategoryIds = %v, want %v", got.GetCategoryIds(), want)
|
||||
}
|
||||
if want := []string{"$AppObject"}; len(got.GetTemplateChainContains()) != 1 || got.GetTemplateChainContains()[0] != want[0] {
|
||||
t.Fatalf("TemplateChainContains = %v, want %v", got.GetTemplateChainContains(), want)
|
||||
}
|
||||
if got.GetTagNameGlob() != "Tank*" {
|
||||
t.Fatalf("TagNameGlob = %q, want %q", got.GetTagNameGlob(), "Tank*")
|
||||
}
|
||||
if !got.GetIncludeAttributes() {
|
||||
t.Fatal("IncludeAttributes = false, want true")
|
||||
}
|
||||
if !got.GetAlarmBearingOnly() {
|
||||
t.Fatal("AlarmBearingOnly = false, want true")
|
||||
}
|
||||
if !got.GetHistorizedOnly() {
|
||||
t.Fatal("HistorizedOnly = false, want true")
|
||||
}
|
||||
}
|
||||
|
||||
func TestGalaxyBrowseExpandConcurrentCallersOnlyFireOneRpc(t *testing.T) {
|
||||
fake := &fakeGalaxyServer{
|
||||
browseChildrenReplies: []*pb.BrowseChildrenReply{
|
||||
// roots
|
||||
buildBrowseReply([]*pb.GalaxyObject{obj(1, "Plant", true)}, []bool{true}, 7),
|
||||
// one expand: one child
|
||||
buildBrowseReply([]*pb.GalaxyObject{obj(2, "Mixer", false)}, []bool{false}, 7),
|
||||
},
|
||||
}
|
||||
client, cleanup := newGalaxyBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
|
||||
ctx := context.Background()
|
||||
roots, err := client.Browse(ctx, nil)
|
||||
if err != nil {
|
||||
t.Fatalf("Browse: %v", err)
|
||||
}
|
||||
|
||||
var wg sync.WaitGroup
|
||||
errs := make(chan error, 10)
|
||||
for i := 0; i < 10; i++ {
|
||||
wg.Add(1)
|
||||
go func() {
|
||||
defer wg.Done()
|
||||
errs <- roots[0].Expand(ctx)
|
||||
}()
|
||||
}
|
||||
wg.Wait()
|
||||
close(errs)
|
||||
for err := range errs {
|
||||
if err != nil {
|
||||
t.Fatalf("concurrent Expand: %v", err)
|
||||
}
|
||||
}
|
||||
|
||||
if !roots[0].IsExpanded() {
|
||||
t.Fatal("IsExpanded() = false after 10 concurrent expands")
|
||||
}
|
||||
if got, want := len(roots[0].Children()), 1; got != want {
|
||||
t.Fatalf("len(children) = %d, want %d", got, want)
|
||||
}
|
||||
// 1 roots fetch + exactly 1 expand fetch.
|
||||
if got, want := len(fake.browseChildrenCalls), 2; got != want {
|
||||
t.Fatalf("RPC count = %d, want %d", got, want)
|
||||
}
|
||||
}
|
||||
|
||||
func TestGalaxyBrowseChildrenRejectsRepeatedPageToken(t *testing.T) {
|
||||
// Build a reply that carries a non-empty NextPageToken so browseChildrenInner
|
||||
// will request a second page. Queue the same reply twice so the second response
|
||||
// returns the same page token, triggering the duplicate-token guard.
|
||||
page := buildBrowseReply(
|
||||
[]*pb.GalaxyObject{obj(1, "Plant", true)},
|
||||
[]bool{true},
|
||||
1,
|
||||
)
|
||||
page.NextPageToken = "1:abc:1"
|
||||
|
||||
fake := &fakeGalaxyServer{
|
||||
browseChildrenReplies: []*pb.BrowseChildrenReply{page, page},
|
||||
}
|
||||
client, cleanup := newGalaxyBufconnClient(t, fake)
|
||||
defer cleanup()
|
||||
|
||||
_, err := client.Browse(context.Background(), nil)
|
||||
if err == nil {
|
||||
t.Fatal("Browse: error = nil, want repeated-page-token error")
|
||||
}
|
||||
var gwErr *GatewayError
|
||||
if !errors.As(err, &gwErr) {
|
||||
t.Fatalf("error type = %T, want *GatewayError; err = %v", err, err)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -34,6 +34,32 @@ type Options struct {
|
||||
TransportCredentials credentials.TransportCredentials
|
||||
// DialOptions are appended to the gRPC dial options after the defaults.
|
||||
DialOptions []grpc.DialOption
|
||||
// RequireCertificateValidation forces TLS certificate verification even when
|
||||
// no CACertFile is pinned. Default false: the gateway's self-signed cert is
|
||||
// accepted without verification (internal-tool posture).
|
||||
RequireCertificateValidation bool
|
||||
}
|
||||
|
||||
// BrowseChildrenOptions configures lazy Galaxy hierarchy walks performed by
|
||||
// (*GalaxyClient).Browse and (*LazyBrowseNode).Expand. All fields are optional;
|
||||
// the zero value matches the dashboard default (no filters, all attributes per
|
||||
// the server default).
|
||||
type BrowseChildrenOptions struct {
|
||||
// CategoryIds restricts results to the listed Galaxy category ids when set.
|
||||
CategoryIds []int32
|
||||
// TemplateChainContains restricts results to objects whose template chain
|
||||
// contains any of the listed template tag names.
|
||||
TemplateChainContains []string
|
||||
// TagNameGlob restricts results to objects whose tag name matches the glob
|
||||
// pattern when non-empty.
|
||||
TagNameGlob string
|
||||
// IncludeAttributes overrides the server default for attribute inclusion when
|
||||
// non-nil. The pointer form mirrors the proto's optional field.
|
||||
IncludeAttributes *bool
|
||||
// AlarmBearingOnly limits results to alarm-bearing objects when true.
|
||||
AlarmBearingOnly bool
|
||||
// HistorizedOnly limits results to historized objects when true.
|
||||
HistorizedOnly bool
|
||||
}
|
||||
|
||||
// RedactedAPIKey returns a display-safe representation of the configured API
|
||||
|
||||
@@ -112,6 +112,23 @@ Support:
|
||||
- custom CA certificate file,
|
||||
- server name override for test environments.
|
||||
|
||||
### Trust posture

The gateway can serve a self-signed certificate it generates itself (it has no
PKI). To make that usable, TLS is **lenient by default**: when the channel is not
plaintext and no `caCertificatePath` is set, the client builds
`GrpcSslContexts.forClient().trustManager(InsecureTrustManagerFactory.INSTANCE)`
(grpc-netty-shaded), so the gateway's self-signed certificate is accepted without
verification.

To verify the gateway instead:

- set `caCertificatePath` to pin a CA (full verification against that root), or
- set `requireCertificateValidation` to `true` to verify against the JVM trust
  store without pinning.

Pinning a CA always wins over the lenient default.
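
The precedence described above can be sketched as a small decision function. This is an illustrative sketch only, not the client's actual code: `TrustPostureSketch`, `TrustMode`, and `resolve` are hypothetical names, and it assumes that a pinned CA also takes precedence when `requireCertificateValidation` is set at the same time.

```java
import java.util.Optional;

/** Hypothetical sketch of the trust decision; names are illustrative only. */
final class TrustPostureSketch {
    enum TrustMode { PLAINTEXT, PINNED_CA, JVM_TRUST_STORE, LENIENT }

    /**
     * Resolves which trust mode applies for a given option combination.
     * Assumption: a pinned CA wins over every other TLS setting.
     */
    static TrustMode resolve(boolean plaintext,
                             Optional<String> caCertificatePath,
                             boolean requireCertificateValidation) {
        if (plaintext) {
            return TrustMode.PLAINTEXT;        // no TLS, so no certificate to check
        }
        if (caCertificatePath.isPresent()) {
            return TrustMode.PINNED_CA;        // full verification against the pinned root
        }
        if (requireCertificateValidation) {
            return TrustMode.JVM_TRUST_STORE;  // verify against the JVM trust store
        }
        return TrustMode.LENIENT;              // accept the self-signed certificate
    }

    public static void main(String[] args) {
        // TLS with no other options falls through to the lenient default.
        System.out.println(resolve(false, Optional.empty(), false)); // prints LENIENT
        // A pinned CA is honored regardless of the validation flag.
        System.out.println(resolve(false, Optional.of("ca.pem"), false)); // prints PINNED_CA
    }
}
```

Keeping the decision in one function makes the "pinning wins" rule easy to unit test independently of any channel construction.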
|
||||
|
||||
## Streaming
|
||||
|
||||
Support both:
|
||||
|
||||
+101
-2
@@ -57,6 +57,16 @@ try (MxGatewayClient client = MxGatewayClient.connect(options);
|
||||
}
|
||||
```
|
||||
|
||||
The gateway can auto-generate its own self-signed certificate (it has no PKI), so
the client is **lenient by default**: a TLS connection (`plaintext(false)`) with
no `caCertificatePath` accepts whatever certificate the gateway presents (via
grpc-netty-shaded's `InsecureTrustManagerFactory`). To verify instead, set
`caCertificatePath` to pin a CA, or set `requireCertificateValidation(true)` to
verify against the JVM trust store without pinning. Use `serverNameOverride` /
`--server-name-override` when the dialed host differs from the certificate SAN.
See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
|
||||
|
||||
Use `rawBlockingStub`, `rawFutureStub`, `rawAsyncStub`, `openSessionRaw`,
|
||||
`closeSessionRaw`, `invoke`, and raw session helper methods when tests need the
|
||||
underlying protobuf messages. `MxGatewayCommandException` and
|
||||
@@ -116,6 +126,61 @@ gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-deploy-time --endpoint localh
|
||||
gradle :zb-mom-ww-mxgateway-cli:run --args="galaxy-discover --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --json"
|
||||
```
|
||||
|
||||
### Browsing lazily

For UI trees or OPC UA bridges, use `browseChildrenRaw` to walk one level at a
time instead of loading the full hierarchy with `discoverHierarchy`. Pass a
default request for root objects; subsequent calls set `parentGobjectId`,
`parentTagName`, or `parentContainedPath`. Filter fields match
`DiscoverHierarchy`. Each response pairs `getChildrenList()` with
`getChildHasChildrenList()` so you know which nodes to expand. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics. For most callers the high-level
`browse()`/`LazyBrowseNode` walker below is the preferred surface;
`browseChildrenRaw` exposes the single underlying RPC when you need direct
control of paging.
|
||||
|
||||
```java
BrowseChildrenReply reply = galaxy.browseChildrenRaw(
    BrowseChildrenRequest.newBuilder().build());

List<GalaxyObject> children = reply.getChildrenList();
List<Boolean> hasChildren = reply.getChildHasChildrenList();
for (int i = 0; i < children.size(); i++) {
    // Treat a missing hint as "no children", mirroring the Go client's bounds check.
    boolean expandable = i < hasChildren.size() && hasChildren.get(i);
    System.out.printf("%s expand=%b%n", children.get(i).getTagName(), expandable);
}
```
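
When you drive paging yourself, the loop usually follows the reply's next-page token until it comes back empty. The sketch below shows that loop with a guard against a server that repeats a token, similar to the guard in the Go client's `browseChildrenInner`. It is a minimal sketch under stated assumptions: `PagingSketch`, `Page`, and `collectAll` are hypothetical stand-ins, and the page fetch is abstracted as a function so the example does not depend on the generated protobuf types.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical paging helper; Page stands in for the paging fields of a reply. */
final class PagingSketch {
    static final class Page {
        final List<String> children;
        final String nextPageToken; // empty string means "no more pages"

        Page(List<String> children, String nextPageToken) {
            this.children = children;
            this.nextPageToken = nextPageToken;
        }
    }

    /** Collects every page, failing fast if the server hands back a token twice. */
    static List<String> collectAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        Set<String> seenTokens = new HashSet<>();
        String token = ""; // the first request carries no page token
        while (true) {
            Page page = fetchPage.apply(token);
            all.addAll(page.children);
            token = page.nextPageToken;
            if (token.isEmpty()) {
                return all;
            }
            // Set.add returns false when the token was already seen.
            if (!seenTokens.add(token)) {
                throw new IllegalStateException("repeated page token \"" + token + "\"");
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Page> pages = Map.of(
            "", new Page(List.of("Child1", "Child2"), "page-2"),
            "page-2", new Page(List.of("Child3"), ""));
        System.out.println(collectAll(pages::get)); // prints [Child1, Child2, Child3]
    }
}
```

Without the repeated-token guard, a misbehaving server could keep the loop running forever; failing fast turns that into a visible error.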
|
||||
|
||||
#### High-level walker

For UI trees, the client provides a `LazyBrowseNode` walker that handles
sibling pagination and the `child_has_children` hint for you:

```java
MxGatewayClientOptions options = MxGatewayClientOptions.builder()
    .endpoint("localhost:5000")
    .apiKey(System.getenv("MXGATEWAY_API_KEY"))
    .plaintext(true)
    .build();

try (GalaxyRepositoryClient galaxy = GalaxyRepositoryClient.connect(options)) {
    List<LazyBrowseNode> roots = galaxy.browse();
    for (LazyBrowseNode root : roots) {
        if (root.hasChildrenHint()) {
            root.expand();
        }
        for (LazyBrowseNode child : root.getChildren()) {
            String kind = child.hasChildrenHint() ? "has children" : "leaf";
            System.out.println(child.getObject().getTagName() + " (" + kind + ")");
        }
    }
}
```
|
||||
|
||||
`expand` is idempotent: calling it twice fires only one RPC, and it is safe to
call from concurrent callers. To refresh after a Galaxy redeploy, call `browse`
again from the root.
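
One way to get this "one RPC no matter how many callers" behavior is a single-flight pattern: the first caller performs the load while later callers wait on the same future. The sketch below is an assumption about how such a node could be built, not the client's actual implementation; `SingleFlightExpandSketch` and its `loader` are hypothetical, and the retry-after-failure behavior is borrowed from the Go client's documented semantics.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical single-flight expansion; the loader stands in for the RPC. */
final class SingleFlightExpandSketch {
    private final Supplier<List<String>> loader;
    private final Object lock = new Object();
    private CompletableFuture<List<String>> inFlight; // guarded by lock
    private volatile List<String> children;           // non-null once expanded

    SingleFlightExpandSketch(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    boolean isExpanded() {
        return children != null;
    }

    /** Loads children at most once; a failed load is rethrown wrapped in CompletionException. */
    List<String> expand() {
        List<String> loaded = children;
        if (loaded != null) {
            return loaded; // fast path: already expanded, no load issued
        }
        CompletableFuture<List<String>> future;
        boolean owner = false;
        synchronized (lock) {
            if (children != null) {
                return children;
            }
            if (inFlight == null) {
                inFlight = new CompletableFuture<>();
                owner = true; // this caller performs the load
            }
            future = inFlight;
        }
        if (owner) {
            try {
                // The load runs outside the lock so readers are not blocked.
                List<String> result = List.copyOf(loader.get());
                synchronized (lock) {
                    children = result;
                    inFlight = null;
                }
                future.complete(result);
            } catch (RuntimeException e) {
                synchronized (lock) {
                    inFlight = null; // failures are not sticky; the next call retries
                }
                future.completeExceptionally(e);
            }
        }
        return future.join();
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        SingleFlightExpandSketch node = new SingleFlightExpandSketch(() -> {
            calls.incrementAndGet();
            return List.of("Mixer");
        });
        node.expand();
        node.expand();
        System.out.println(calls.get()); // prints 1
    }
}
```

The loader runs outside the lock on purpose, so `isExpanded()` and other readers are never held up for the duration of a network round trip.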
|
||||
|
||||
### Watching deploy events
|
||||
|
||||
`GalaxyRepository.WatchDeployEvents` is a server-streaming RPC: the gateway
|
||||
@@ -185,8 +250,11 @@ gradle :zb-mom-ww-mxgateway-cli:run --args="smoke --endpoint localhost:5000 --ap
|
||||
```
|
||||
|
||||
The CLI accepts `--api-key`, `--api-key-env`, `--plaintext`, `--ca-file`,
|
||||
`--server-name-override`, `--timeout`, and `--json` on gateway commands. JSON
|
||||
output redacts API keys.
|
||||
`--server-name-override`, `--require-certificate-validation`, `--timeout`, and
|
||||
`--json` on gateway commands. JSON output redacts API keys. TLS is lenient by
|
||||
default (the certificate is not verified unless you pin a CA with `--ca-file`);
|
||||
pass `--require-certificate-validation` to verify the server certificate against
|
||||
the JVM trust store without pinning.
|
||||
|
||||
Use TLS options for a secured gateway:
|
||||
|
||||
@@ -229,6 +297,37 @@ $env:MXGATEWAY_TEST_ITEM = 'TestObject.TestInt'
|
||||
gradle :zb-mom-ww-mxgateway-cli:run --args="smoke --endpoint $env:MXGATEWAY_ENDPOINT --plaintext --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json"
|
||||
```
|
||||
|
||||
## Installing from the Gitea Maven repository

The client publishes to the internal Gitea Maven repository at
`https://gitea.dohertylan.com/api/packages/dohertj2/maven`.

In your consumer project's `build.gradle`:

```groovy
repositories {
    maven {
        url 'https://gitea.dohertylan.com/api/packages/dohertj2/maven'
        credentials {
            username = System.getenv('GITEA_USERNAME')
            password = System.getenv('GITEA_TOKEN')
        }
    }
}

dependencies {
    implementation 'com.zb.mom.ww.mxgateway:zb-mom-ww-mxgateway-client:0.1.1'
}
```

To publish a new version from this repo:

```bash
export GITEA_USERNAME=dohertj2
export GITEA_TOKEN=<your-gitea-token>
gradle :zb-mom-ww-mxgateway-client:publish
```
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [Client Packaging](../../docs/ClientPackaging.md)
|
||||
|
||||
@@ -13,7 +13,7 @@ ext {
|
||||
|
||||
subprojects {
|
||||
group = 'com.zb.mom.ww.mxgateway'
|
||||
version = '0.1.0'
|
||||
version = '0.1.1'
|
||||
|
||||
pluginManager.withPlugin('java') {
|
||||
java {
|
||||
@@ -37,4 +37,44 @@ subprojects {
            testRuntimeOnly 'org.junit.platform:junit-platform-launcher'
        }
    }

    pluginManager.withPlugin('maven-publish') {
        publishing {
            publications {
                maven(MavenPublication) {
                    from components.java
                    pom {
                        url = 'https://gitea.dohertylan.com/dohertj2/mxaccessgw'
                        description = 'MxAccessGateway Java client'
                        scm {
                            url = 'https://gitea.dohertylan.com/dohertj2/mxaccessgw'
                            connection = 'scm:git:https://gitea.dohertylan.com/dohertj2/mxaccessgw.git'
                        }
                        developers {
                            developer {
                                id = 'dohertj2'
                                name = 'Joseph Doherty'
                            }
                        }
                        licenses {
                            license {
                                name = 'Proprietary'
                                distribution = 'repo'
                            }
                        }
                    }
                }
            }
            repositories {
                maven {
                    name = 'GiteaPackages'
                    url = 'https://gitea.dohertylan.com/api/packages/dohertj2/maven'
                    credentials {
                        username = System.getenv('GITEA_USERNAME') ?: ''
                        password = System.getenv('GITEA_TOKEN') ?: ''
                    }
                }
            }
        }
    }
}
|
||||
|
||||
@@ -9,6 +9,10 @@ pluginManagement {
|
||||
}
|
||||
}
|
||||
|
||||
plugins {
|
||||
id 'org.gradle.toolchains.foojay-resolver-convention' version '1.0.0'
|
||||
}
|
||||
|
||||
dependencyResolutionManagement {
|
||||
repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
|
||||
repositories {
|
||||
|
||||
+111
@@ -142,6 +142,37 @@ public final class GalaxyRepositoryGrpc {
|
||||
return getWatchDeployEventsMethod;
|
||||
}
|
||||
|
||||
private static volatile io.grpc.MethodDescriptor<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest,
|
||||
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> getBrowseChildrenMethod;
|
||||
|
||||
@io.grpc.stub.annotations.RpcMethod(
|
||||
fullMethodName = SERVICE_NAME + '/' + "BrowseChildren",
|
||||
requestType = galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest.class,
|
||||
responseType = galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply.class,
|
||||
methodType = io.grpc.MethodDescriptor.MethodType.UNARY)
|
||||
public static io.grpc.MethodDescriptor<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest,
|
||||
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> getBrowseChildrenMethod() {
|
||||
io.grpc.MethodDescriptor<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest, galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> getBrowseChildrenMethod;
|
||||
if ((getBrowseChildrenMethod = GalaxyRepositoryGrpc.getBrowseChildrenMethod) == null) {
|
||||
synchronized (GalaxyRepositoryGrpc.class) {
|
||||
if ((getBrowseChildrenMethod = GalaxyRepositoryGrpc.getBrowseChildrenMethod) == null) {
|
||||
GalaxyRepositoryGrpc.getBrowseChildrenMethod = getBrowseChildrenMethod =
|
||||
io.grpc.MethodDescriptor.<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest, galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply>newBuilder()
|
||||
.setType(io.grpc.MethodDescriptor.MethodType.UNARY)
|
||||
.setFullMethodName(generateFullMethodName(SERVICE_NAME, "BrowseChildren"))
|
||||
.setSampledToLocalTracing(true)
|
||||
.setRequestMarshaller(io.grpc.protobuf.ProtoUtils.marshaller(
|
||||
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest.getDefaultInstance()))
|
||||
.setResponseMarshaller(io.grpc.protobuf.ProtoUtils.marshaller(
|
||||
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply.getDefaultInstance()))
|
||||
.setSchemaDescriptor(new GalaxyRepositoryMethodDescriptorSupplier("BrowseChildren"))
|
||||
.build();
|
||||
}
|
||||
}
|
||||
}
|
||||
return getBrowseChildrenMethod;
|
||||
}
|
||||
|
||||
/**
|
||||
* Creates a new async stub that supports all call types for the service
|
||||
*/
|
||||
@@ -246,6 +277,19 @@ public final class GalaxyRepositoryGrpc {
|
||||
io.grpc.stub.StreamObserver<galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent> responseObserver) {
|
||||
io.grpc.stub.ServerCalls.asyncUnimplementedUnaryCall(getWatchDeployEventsMethod(), responseObserver);
|
||||
}
|
||||
|
||||
/**
|
||||
* <pre>
|
||||
* Returns the direct children of a parent object (or the root objects when
|
||||
* `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
|
||||
* one level at a time instead of paging the full hierarchy. Filters mirror
|
||||
* DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
|
||||
* </pre>
|
||||
*/
|
||||
default void browseChildren(galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest request,
|
||||
io.grpc.stub.StreamObserver<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> responseObserver) {
|
||||
io.grpc.stub.ServerCalls.asyncUnimplementedUnaryCall(getBrowseChildrenMethod(), responseObserver);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -326,6 +370,20 @@ public final class GalaxyRepositoryGrpc {
|
||||
io.grpc.stub.ClientCalls.asyncServerStreamingCall(
|
||||
getChannel().newCall(getWatchDeployEventsMethod(), getCallOptions()), request, responseObserver);
|
||||
}
|
||||
|
||||
/**
|
||||
* <pre>
|
||||
* Returns the direct children of a parent object (or the root objects when
|
||||
* `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
|
||||
* one level at a time instead of paging the full hierarchy. Filters mirror
|
||||
* DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
|
||||
* </pre>
|
||||
*/
|
||||
public void browseChildren(galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest request,
|
||||
io.grpc.stub.StreamObserver<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> responseObserver) {
|
||||
io.grpc.stub.ClientCalls.asyncUnaryCall(
|
||||
getChannel().newCall(getBrowseChildrenMethod(), getCallOptions()), request, responseObserver);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -387,6 +445,19 @@ public final class GalaxyRepositoryGrpc {
|
||||
return io.grpc.stub.ClientCalls.blockingV2ServerStreamingCall(
|
||||
getChannel(), getWatchDeployEventsMethod(), getCallOptions(), request);
|
||||
}
|
||||
|
||||
/**
|
||||
* <pre>
|
||||
* Returns the direct children of a parent object (or the root objects when
|
||||
* `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
|
||||
* one level at a time instead of paging the full hierarchy. Filters mirror
|
||||
* DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
|
||||
* </pre>
|
||||
*/
|
||||
public galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply browseChildren(galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest request) throws io.grpc.StatusException {
|
||||
return io.grpc.stub.ClientCalls.blockingV2UnaryCall(
|
||||
getChannel(), getBrowseChildrenMethod(), getCallOptions(), request);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -447,6 +518,19 @@ public final class GalaxyRepositoryGrpc {
|
||||
return io.grpc.stub.ClientCalls.blockingServerStreamingCall(
|
||||
getChannel(), getWatchDeployEventsMethod(), getCallOptions(), request);
|
||||
}
|
||||
|
||||
/**
|
||||
* <pre>
|
||||
* Returns the direct children of a parent object (or the root objects when
|
||||
* `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
|
||||
* one level at a time instead of paging the full hierarchy. Filters mirror
|
||||
* DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
|
||||
* </pre>
|
||||
*/
|
||||
public galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply browseChildren(galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest request) {
|
||||
return io.grpc.stub.ClientCalls.blockingUnaryCall(
|
||||
getChannel(), getBrowseChildrenMethod(), getCallOptions(), request);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -494,12 +578,27 @@ public final class GalaxyRepositoryGrpc {
|
||||
return io.grpc.stub.ClientCalls.futureUnaryCall(
|
||||
getChannel().newCall(getDiscoverHierarchyMethod(), getCallOptions()), request);
|
||||
}
|
||||
|
||||
/**
|
||||
* <pre>
|
||||
* Returns the direct children of a parent object (or the root objects when
|
||||
* `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
|
||||
* one level at a time instead of paging the full hierarchy. Filters mirror
|
||||
* DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
|
||||
* </pre>
|
||||
*/
|
||||
public com.google.common.util.concurrent.ListenableFuture<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply> browseChildren(
|
||||
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest request) {
|
||||
return io.grpc.stub.ClientCalls.futureUnaryCall(
|
||||
getChannel().newCall(getBrowseChildrenMethod(), getCallOptions()), request);
|
||||
}
|
||||
}
|
||||
|
||||
private static final int METHODID_TEST_CONNECTION = 0;
|
||||
private static final int METHODID_GET_LAST_DEPLOY_TIME = 1;
|
||||
private static final int METHODID_DISCOVER_HIERARCHY = 2;
|
||||
private static final int METHODID_WATCH_DEPLOY_EVENTS = 3;
|
||||
private static final int METHODID_BROWSE_CHILDREN = 4;
|
||||
|
||||
private static final class MethodHandlers<Req, Resp> implements
|
||||
io.grpc.stub.ServerCalls.UnaryMethod<Req, Resp>,
|
||||
@@ -534,6 +633,10 @@ public final class GalaxyRepositoryGrpc {
|
||||
serviceImpl.watchDeployEvents((galaxy_repository.v1.GalaxyRepositoryOuterClass.WatchDeployEventsRequest) request,
|
||||
(io.grpc.stub.StreamObserver<galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent>) responseObserver);
|
||||
break;
|
||||
case METHODID_BROWSE_CHILDREN:
|
||||
serviceImpl.browseChildren((galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest) request,
|
||||
(io.grpc.stub.StreamObserver<galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply>) responseObserver);
|
||||
break;
|
||||
default:
|
||||
throw new AssertionError();
|
||||
}
|
||||
@@ -580,6 +683,13 @@ public final class GalaxyRepositoryGrpc {
|
||||
galaxy_repository.v1.GalaxyRepositoryOuterClass.WatchDeployEventsRequest,
|
||||
galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent>(
|
||||
service, METHODID_WATCH_DEPLOY_EVENTS)))
|
||||
.addMethod(
|
||||
getBrowseChildrenMethod(),
|
||||
io.grpc.stub.ServerCalls.asyncUnaryCall(
|
||||
new MethodHandlers<
|
||||
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest,
|
||||
galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply>(
|
||||
service, METHODID_BROWSE_CHILDREN)))
|
||||
.build();
|
||||
}
|
||||
|
||||
@@ -632,6 +742,7 @@ public final class GalaxyRepositoryGrpc {
|
||||
.addMethod(getGetLastDeployTimeMethod())
|
||||
.addMethod(getDiscoverHierarchyMethod())
|
||||
.addMethod(getWatchDeployEventsMethod())
|
||||
.addMethod(getBrowseChildrenMethod())
|
||||
.build();
|
||||
}
|
||||
}
|
||||
|
||||
+3650
-14
File diff suppressed because it is too large
+16
@@ -37,6 +37,7 @@ import java.util.concurrent.atomic.AtomicReference;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.AcknowledgeAlarmReply;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.AcknowledgeAlarmRequest;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.ActiveAlarmSnapshot;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.AlarmProviderStatus;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.AlarmFeedMessage;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.BulkReadResult;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.BulkWriteResult;
|
||||
@@ -1366,6 +1367,13 @@ public final class MxGatewayCli implements Callable<Integer> {
|
||||
@Option(names = "--server-name-override", description = "TLS server name override.")
|
||||
String serverNameOverride = "";
|
||||
|
||||
@Option(
|
||||
names = "--require-certificate-validation",
|
||||
description =
|
||||
"Verify the server certificate against the JVM trust store "
|
||||
+ "(disables the lenient default; ignored with --plaintext or --ca-file pinning).")
|
||||
boolean requireCertificateValidation;
|
||||
|
||||
@Option(names = "--timeout", defaultValue = "30s", description = "Per-call timeout.")
|
||||
String timeout;
|
||||
|
||||
@@ -1388,6 +1396,7 @@ public final class MxGatewayCli implements Callable<Integer> {
|
||||
.plaintext(plaintext)
|
||||
.caCertificatePath(caFile)
|
||||
.serverNameOverride(serverNameOverride)
|
||||
.requireCertificateValidation(requireCertificateValidation)
|
||||
.callTimeout(resolvedTimeout)
|
||||
.build();
|
||||
}
|
||||
@@ -1400,6 +1409,7 @@ public final class MxGatewayCli implements Callable<Integer> {
|
||||
values.put("plaintext", plaintext);
|
||||
values.put("caFile", caFile == null ? "" : caFile.toString());
|
||||
values.put("serverNameOverride", serverNameOverride);
|
||||
values.put("requireCertificateValidation", requireCertificateValidation);
|
||||
values.put("timeout", timeout);
|
||||
return values;
|
||||
}
|
||||
@@ -1703,6 +1713,12 @@ public final class MxGatewayCli implements Callable<Integer> {
|
||||
transition.getTransitionKind().name(),
|
||||
transition.getSeverity());
|
||||
}
|
||||
case PROVIDER_STATUS -> {
|
||||
AlarmProviderStatus status = message.getProviderStatus();
|
||||
yield String.format(
|
||||
"provider-status mode=%s degraded=%b reason=%s",
|
||||
status.getMode().name(), status.getDegraded(), status.getReason());
|
||||
}
|
||||
case PAYLOAD_NOT_SET -> "unknown";
|
||||
};
|
||||
}
|
||||
|
||||
+63
@@ -5,6 +5,7 @@ import static org.junit.jupiter.api.Assertions.assertFalse;
|
||||
import static org.junit.jupiter.api.Assertions.assertTrue;
|
||||
|
||||
import com.zb.mom.ww.mxgateway.client.MxGatewayAlarmFeedSubscription;
|
||||
import com.zb.mom.ww.mxgateway.client.MxGatewayClientOptions;
|
||||
import io.grpc.stub.StreamObserver;
|
||||
import java.io.ByteArrayInputStream;
|
||||
import java.io.InputStream;
|
||||
@@ -289,6 +290,51 @@ final class MxGatewayCliTests {
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void requireCertificateValidationFlagPropagatesThroughToClientOptions() {
|
||||
// Client.Java-038 regression — the --require-certificate-validation
|
||||
// CLI flag must reach MxGatewayClientOptions.requireCertificateValidation
|
||||
// via CommonOptions.toClientOptions(), so CLI users can opt into strict
|
||||
// JVM-trust verification without pinning a CA.
|
||||
CapturingClientFactory factory = new CapturingClientFactory();
|
||||
CliRun run = execute(
|
||||
factory,
|
||||
"acknowledge-alarm",
|
||||
"--endpoint",
|
||||
"localhost:5000",
|
||||
"--api-key-env",
|
||||
"MXGATEWAY_API_KEY",
|
||||
"--require-certificate-validation",
|
||||
"--reference",
|
||||
"Tank01.Level.HiHi");
|
||||
|
||||
assertEquals(0, run.exitCode(), "errors:\n" + run.errors());
|
||||
assertTrue(
|
||||
factory.capturedClientOptions.requireCertificateValidation(),
|
||||
"--require-certificate-validation did not propagate into MxGatewayClientOptions");
|
||||
}
|
||||
|
||||
@Test
|
||||
void requireCertificateValidationDefaultsToLenientWhenFlagAbsent() {
|
||||
// Without the flag, the lenient-by-default trust posture must be
|
||||
// preserved (requireCertificateValidation == false).
|
||||
CapturingClientFactory factory = new CapturingClientFactory();
|
||||
CliRun run = execute(
|
||||
factory,
|
||||
"acknowledge-alarm",
|
||||
"--endpoint",
|
||||
"localhost:5000",
|
||||
"--api-key-env",
|
||||
"MXGATEWAY_API_KEY",
|
||||
"--reference",
|
||||
"Tank01.Level.HiHi");
|
||||
|
||||
assertEquals(0, run.exitCode(), "errors:\n" + run.errors());
|
||||
assertFalse(
|
||||
factory.capturedClientOptions.requireCertificateValidation(),
|
||||
"requireCertificateValidation should default to false (lenient)");
|
||||
}
|
||||
|
||||
@Test
|
||||
void streamAlarmsCommandFailsFastOnQueueOverflow() {
|
||||
// Client.Java-033 regression — the CLI's stream-alarms bounded queue
|
||||
@@ -435,6 +481,23 @@ final class MxGatewayCliTests {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Factory that records the {@link MxGatewayClientOptions} produced by
|
||||
* {@link MxGatewayCli.CommonOptions#toClientOptions()} so a test can assert
|
||||
* how CLI flags map onto the library option surface. Wraps the standard
|
||||
* {@link FakeClient} so the command body still completes. Used by the
|
||||
* Client.Java-038 option-flow regression.
|
||||
*/
|
||||
private static final class CapturingClientFactory implements MxGatewayCli.MxGatewayCliClientFactory {
|
||||
private MxGatewayClientOptions capturedClientOptions;
|
||||
|
||||
@Override
|
||||
public MxGatewayCli.MxGatewayCliClient connect(MxGatewayCli.CommonOptions options) {
|
||||
capturedClientOptions = options.toClientOptions();
|
||||
return new FakeClient(options.spec.commandLine().getOut());
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Factory whose fake client floods the {@code streamAlarms} observer with
|
||||
* 2000 messages synchronously, exceeding the CLI's bounded 1024-element
|
||||
|
||||
@@ -1,6 +1,7 @@
|
||||
plugins {
|
||||
id 'java-library'
|
||||
id 'com.google.protobuf'
|
||||
id 'maven-publish'
|
||||
}
|
||||
|
||||
dependencies {
|
||||
@@ -30,6 +31,11 @@ sourceSets {
|
||||
}
|
||||
}
|
||||
|
||||
java {
|
||||
withSourcesJar()
|
||||
withJavadocJar()
|
||||
}
|
||||
|
||||
protobuf {
|
||||
protoc {
|
||||
artifact = "com.google.protobuf:protoc:${protobufVersion}"
|
||||
|
||||
+105
@@ -0,0 +1,105 @@
|
||||
package com.zb.mom.ww.mxgateway.client;

import java.util.Collections;
import java.util.List;

/**
 * Filters and shape options for {@link GalaxyRepositoryClient#browse(BrowseChildrenOptions)}.
 * Mirror of the existing DiscoverHierarchy options for the lazy-browse path.
 *
 * <p>All filter fields are AND-combined server-side. Empty / unset fields disable
 * that filter. The {@code includeAttributes} tri-state uses {@code null} to mean
 * "let the server use its default"; non-{@code null} forwards the explicit flag.
 */
public final class BrowseChildrenOptions {
    private final List<Integer> categoryIds;
    private final List<String> templateChainContains;
    private final String tagNameGlob;
    private final Boolean includeAttributes;
    private final boolean alarmBearingOnly;
    private final boolean historizedOnly;

    private BrowseChildrenOptions(Builder b) {
        this.categoryIds = List.copyOf(b.categoryIds);
        this.templateChainContains = List.copyOf(b.templateChainContains);
        this.tagNameGlob = b.tagNameGlob;
        this.includeAttributes = b.includeAttributes;
        this.alarmBearingOnly = b.alarmBearingOnly;
        this.historizedOnly = b.historizedOnly;
    }

    /** @return immutable list of category IDs to include; empty disables this filter. */
    public List<Integer> getCategoryIds() { return categoryIds; }

    /** @return immutable list of template names that must appear in each child's template chain. */
    public List<String> getTemplateChainContains() { return templateChainContains; }

    /** @return SQL-LIKE-style glob applied to {@code tag_name}; empty disables. */
    public String getTagNameGlob() { return tagNameGlob; }

    /** @return tri-state override for {@code include_attributes}; {@code null} keeps the server default. */
    public Boolean getIncludeAttributes() { return includeAttributes; }

    /** @return restrict to alarm-bearing objects. */
    public boolean isAlarmBearingOnly() { return alarmBearingOnly; }

    /** @return restrict to objects with at least one historized attribute. */
    public boolean isHistorizedOnly() { return historizedOnly; }

    /** @return a fresh builder. */
    public static Builder builder() { return new Builder(); }

    /** @return options with every filter disabled and {@code includeAttributes} unset. */
    public static BrowseChildrenOptions empty() { return builder().build(); }

    /** Fluent builder for {@link BrowseChildrenOptions}. */
    public static final class Builder {
        private List<Integer> categoryIds = Collections.emptyList();
        private List<String> templateChainContains = Collections.emptyList();
        private String tagNameGlob = "";
        private Boolean includeAttributes = null;
        private boolean alarmBearingOnly = false;
        private boolean historizedOnly = false;

        /** Sets the category-id filter. */
        public Builder categoryIds(List<Integer> v) {
            this.categoryIds = v == null ? Collections.emptyList() : v;
            return this;
        }

        /** Sets the template-chain-contains filter. */
        public Builder templateChainContains(List<String> v) {
            this.templateChainContains = v == null ? Collections.emptyList() : v;
            return this;
        }

        /** Sets the tag-name glob. */
        public Builder tagNameGlob(String v) {
            this.tagNameGlob = v == null ? "" : v;
            return this;
        }

        /** Sets the tri-state {@code includeAttributes} override; {@code null} keeps the server default. */
        public Builder includeAttributes(Boolean v) {
            this.includeAttributes = v;
            return this;
        }

        /** Toggles the alarm-bearing-only filter. */
        public Builder alarmBearingOnly(boolean v) {
            this.alarmBearingOnly = v;
            return this;
        }

        /** Toggles the historized-only filter. */
        public Builder historizedOnly(boolean v) {
            this.historizedOnly = v;
            return this;
        }

        /** Builds the immutable options. */
        public BrowseChildrenOptions build() {
            return new BrowseChildrenOptions(this);
        }
    }
}
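
Review note: the options class above leans on two small patterns, a nullable `Boolean` used as a tri-state and `List.copyOf` used as a snapshot at `build()` time. The sketch below isolates both so they can be tried without the gRPC classes. `TriStateOptionsSketch`, its nested `Options` and `Builder`, and `describe` are hypothetical stand-ins written for illustration; they are not part of the client library.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in illustrating the tri-state and defensive-copy patterns. */
final class TriStateOptionsSketch {

    /** Immutable options: a null includeAttributes means "defer to the server default". */
    static final class Options {
        final List<Integer> categoryIds;
        final Boolean includeAttributes;

        private Options(Builder b) {
            // List.copyOf snapshots the builder's list, so mutations made by the
            // caller after build() are not visible through the options object.
            this.categoryIds = List.copyOf(b.categoryIds);
            this.includeAttributes = b.includeAttributes;
        }
    }

    /** Builder that keeps the caller's list by reference until build() is called. */
    static final class Builder {
        private List<Integer> categoryIds = List.of();
        private Boolean includeAttributes; // stays null until explicitly set

        Builder categoryIds(List<Integer> v) {
            this.categoryIds = v == null ? List.of() : v;
            return this;
        }

        Builder includeAttributes(Boolean v) {
            this.includeAttributes = v;
            return this;
        }

        Options build() {
            return new Options(this);
        }
    }

    /** Renders how a request builder would treat the tri-state flag. */
    static String describe(Options o) {
        if (o.includeAttributes == null) {
            return "include_attributes: not sent (server default)";
        }
        return "include_attributes: " + o.includeAttributes;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>(List.of(1, 2));
        Options options = new Builder().categoryIds(ids).build();
        ids.add(3); // happens after build(), so the options still hold [1, 2]
        System.out.println(options.categoryIds);
        System.out.println(describe(options));
        System.out.println(describe(new Builder().includeAttributes(false).build()));
    }
}
```

Two things worth keeping in mind when using this shape: a list mutated between the setter call and `build()` is still picked up, because the builder only stores the reference, and `List.copyOf` throws `NullPointerException` if the list contains a `null` element.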
|
||||
+95
@@ -4,6 +4,8 @@ import com.google.common.util.concurrent.FutureCallback;
|
||||
import com.google.common.util.concurrent.Futures;
|
||||
import com.google.common.util.concurrent.MoreExecutors;
|
||||
import galaxy_repository.v1.GalaxyRepositoryGrpc;
|
||||
import galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply;
|
||||
import galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest;
|
||||
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent;
|
||||
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DiscoverHierarchyReply;
|
||||
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DiscoverHierarchyRequest;
|
||||
@@ -37,6 +39,7 @@ import javax.net.ssl.SSLException;
|
||||
*/
|
||||
public final class GalaxyRepositoryClient implements AutoCloseable {
|
||||
private static final int DISCOVER_HIERARCHY_PAGE_SIZE = 5000;
|
||||
private static final int BROWSE_CHILDREN_PAGE_SIZE = 500;
|
||||
|
||||
private final ManagedChannel ownedChannel;
|
||||
private final MxGatewayClientOptions options;
|
||||
@@ -213,6 +216,98 @@ public final class GalaxyRepositoryClient implements AutoCloseable {
        return discoverHierarchyPageAsync("", new java.util.ArrayList<>(), new java.util.HashSet<>());
    }

    /**
     * Lazy-browse entry point: fetches the root layer of the Galaxy hierarchy.
     * Each returned {@link LazyBrowseNode} can be expanded on demand via
     * {@link LazyBrowseNode#expand()} to load its direct children.
     *
     * @return the root nodes (no parent selector) with default options
     * @throws MxGatewayException on transport or protocol failure
     */
    public List<LazyBrowseNode> browse() {
        return browse(null);
    }

    /**
     * Lazy-browse entry point with caller-supplied filters / shape.
     *
     * @param options filter and shape options; {@code null} means {@link BrowseChildrenOptions#empty()}
     * @return the root nodes matching the options
     * @throws MxGatewayException on transport or protocol failure
     */
    public List<LazyBrowseNode> browse(BrowseChildrenOptions options) {
        BrowseChildrenOptions effective = options == null ? BrowseChildrenOptions.empty() : options;
        return browseChildrenInner(null, effective);
    }

    /**
     * Issues a single {@code BrowseChildren} RPC and returns the raw reply.
     * Callers wanting full control over pagination can drive the loop themselves.
     *
     * @param request the request to send
     * @return the reply
     * @throws MxGatewayException on transport or protocol failure
     */
    public BrowseChildrenReply browseChildrenRaw(BrowseChildrenRequest request) {
        try {
            return rawBlockingStub().browseChildren(request);
        } catch (RuntimeException error) {
            if (error instanceof MxGatewayException) {
                throw error;
            }
            throw MxGatewayErrors.fromGrpc("galaxy browse children", error);
        }
    }

    /**
     * Drives the BrowseChildren paging loop for a single parent (or roots when
     * {@code parentGobjectId} is {@code null}). Detects repeated page tokens to
     * avoid infinite loops on a buggy server.
     */
    List<LazyBrowseNode> browseChildrenInner(Integer parentGobjectId, BrowseChildrenOptions options) {
        java.util.ArrayList<LazyBrowseNode> nodes = new java.util.ArrayList<>();
        java.util.HashSet<String> seenPageTokens = new java.util.HashSet<>();
        String pageToken = "";
        while (true) {
            BrowseChildrenRequest.Builder builder = BrowseChildrenRequest.newBuilder()
                    .setPageSize(BROWSE_CHILDREN_PAGE_SIZE)
                    .setPageToken(pageToken)
                    .setAlarmBearingOnly(options.isAlarmBearingOnly())
                    .setHistorizedOnly(options.isHistorizedOnly());
            if (parentGobjectId != null) {
                builder.setParentGobjectId(parentGobjectId.intValue());
            }
            if (!options.getCategoryIds().isEmpty()) {
                builder.addAllCategoryIds(options.getCategoryIds());
            }
            if (!options.getTemplateChainContains().isEmpty()) {
                builder.addAllTemplateChainContains(options.getTemplateChainContains());
            }
            if (!options.getTagNameGlob().isEmpty()) {
                builder.setTagNameGlob(options.getTagNameGlob());
            }
            if (options.getIncludeAttributes() != null) {
                builder.setIncludeAttributes(options.getIncludeAttributes());
            }

            BrowseChildrenReply reply = browseChildrenRaw(builder.build());

            for (int i = 0; i < reply.getChildrenCount(); i++) {
                boolean hint = i < reply.getChildHasChildrenCount() && reply.getChildHasChildren(i);
                nodes.add(new LazyBrowseNode(this, reply.getChildren(i), hint, options));
            }

            pageToken = reply.getNextPageToken();
            if (pageToken == null || pageToken.isEmpty()) {
                return nodes;
            }
            if (!seenPageTokens.add(pageToken)) {
                throw new MxGatewayException(
                        "galaxy browse children returned repeated page token: " + pageToken);
            }
        }
    }
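
Review note: the paging loop in `browseChildrenInner` can be exercised in isolation. The sketch below reproduces the same control flow (an empty token requests the first page, an empty or `null` next token ends the loop, a token seen before aborts) against a plain function instead of the gRPC stub. `PagingLoopSketch`, `Page`, and `fetchAll` are hypothetical names, and `IllegalStateException` stands in for `MxGatewayException`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical stand-in for a token-paged fetch loop with repeated-token detection. */
final class PagingLoopSketch {

    /** One page of results plus the token for the next page ("" when exhausted). */
    record Page(List<String> items, String nextPageToken) {}

    /** Collects every page, failing fast if the server hands back a token it already issued. */
    static List<String> fetchAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        Set<String> seenTokens = new HashSet<>();
        String token = ""; // the empty token requests the first page
        while (true) {
            Page page = fetchPage.apply(token);
            all.addAll(page.items());
            token = page.nextPageToken();
            if (token == null || token.isEmpty()) {
                return all; // no further pages
            }
            if (!seenTokens.add(token)) {
                // Set.add returns false for a duplicate, which would otherwise loop forever.
                throw new IllegalStateException("repeated page token: " + token);
            }
        }
    }

    public static void main(String[] args) {
        // Well-behaved server: two pages, then an empty token.
        Function<String, Page> good = t -> t.isEmpty()
                ? new Page(List.of("a", "b"), "p2")
                : new Page(List.of("c"), "");
        System.out.println(fetchAll(good));

        // Misbehaving server: always returns the same next-page token.
        Function<String, Page> stuck = t -> new Page(List.of("x"), "p1");
        try {
            fetchAll(stuck);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the seen-token set grows with every page, this check also catches longer cycles (for example `p1` then `p2` then `p1` again), not only an immediately repeated token.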

    /**
     * Subscribes to {@code WatchDeployEvents} via the async stub and consumes
     * results through a blocking iterator. Closing the returned stream cancels
||||
|
||||
+150
@@ -0,0 +1,150 @@
|
||||
package com.zb.mom.ww.mxgateway.client;

import galaxy_repository.v1.GalaxyRepositoryOuterClass.GalaxyObject;

import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * One node in a lazy-loaded Galaxy browse tree. Holds the underlying
 * {@link GalaxyObject} and exposes {@link #expand()} to fetch its direct
 * children on demand. Expansion is one-shot: a second call is a no-op.
 * Pagination of large sibling sets is handled internally by the client.
 */
public final class LazyBrowseNode {
    private final GalaxyRepositoryClient client;
    private final GalaxyObject object;
    private final boolean hasChildrenHint;
    private final BrowseChildrenOptions options;

    // expandLock gates the start of a new expand AND the publish of the in-flight
    // future. Readers (getChildren / isExpanded) use a separate read-write lock so
    // they never block on the gRPC call.
    private final Object expandLock = new Object();
    private CompletableFuture<Void> inFlight;

    private final ReentrantReadWriteLock readWriteLock = new ReentrantReadWriteLock();
    private List<LazyBrowseNode> children = Collections.emptyList();
    private boolean isExpanded;

    LazyBrowseNode(
            GalaxyRepositoryClient client,
            GalaxyObject object,
            boolean hasChildrenHint,
            BrowseChildrenOptions options) {
        this.client = client;
        this.object = object;
        this.hasChildrenHint = hasChildrenHint;
        this.options = options;
    }

    /** @return the underlying Galaxy object proto for this node. */
    public GalaxyObject getObject() {
        return object;
    }

    /** @return {@code true} when the server reports this node has at least one matching descendant. */
    public boolean hasChildrenHint() {
        return hasChildrenHint;
    }

    /** @return a snapshot of direct children loaded by {@link #expand()}; empty until then. */
    public List<LazyBrowseNode> getChildren() {
        readWriteLock.readLock().lock();
        try {
            return List.copyOf(children);
        } finally {
            readWriteLock.readLock().unlock();
        }
    }

    /** @return {@code true} after the first {@link #expand()} call completes. */
    public boolean isExpanded() {
        readWriteLock.readLock().lock();
        try {
            return isExpanded;
        } finally {
            readWriteLock.readLock().unlock();
        }
    }

    /**
     * Fetches direct children from the gateway and populates {@link #getChildren()}.
     * Idempotent: subsequent calls are no-ops and do not issue a second RPC.
     *
     * <p>Concurrent callers coalesce onto a single in-flight RPC: the first caller
     * (the "leader") issues the gRPC call, while any other thread that calls
     * {@code expand()} during that window blocks on the leader's future and sees
     * the same result (or the same exception). On failure the in-flight slot is
     * cleared so a subsequent call can retry.
     *
     * <p>Readers ({@link #getChildren()} / {@link #isExpanded()}) take a separate
     * read lock and are never blocked for the duration of the RPC.
     *
     * @throws MxGatewayException on transport or protocol failure
     */
    public void expand() {
        if (isExpanded()) {
            return;
        }

        CompletableFuture<Void> future;
        boolean iAmTheLeader;
        synchronized (expandLock) {
            if (isExpanded()) {
                return;
            }
            if (inFlight != null) {
                future = inFlight;
                iAmTheLeader = false;
            } else {
                future = new CompletableFuture<>();
                inFlight = future;
                iAmTheLeader = true;
            }
        }

        if (iAmTheLeader) {
            try {
                List<LazyBrowseNode> loaded =
                        client.browseChildrenInner(object.getGobjectId(), options);
                readWriteLock.writeLock().lock();
                try {
                    this.children = loaded;
                    this.isExpanded = true;
                } finally {
                    readWriteLock.writeLock().unlock();
                }
                synchronized (expandLock) {
                    inFlight = null;
                }
                future.complete(null);
            } catch (RuntimeException ex) {
                synchronized (expandLock) {
                    inFlight = null;
                }
                future.completeExceptionally(ex);
                throw ex;
            }
        } else {
            try {
                future.get();
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new MxGatewayException("Interrupted waiting for browse-children expand.", ie);
            } catch (ExecutionException ee) {
                Throwable cause = ee.getCause();
                if (cause instanceof MxGatewayException me) {
                    throw me;
                }
                if (cause instanceof RuntimeException re) {
                    throw re;
                }
                throw new MxGatewayException("BrowseChildren expand failed.", cause);
            }
        }
    }
}
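
Review note: the leader/follower coalescing in `expand()` is easier to reason about when stripped of the gRPC and read-write-lock details. The sketch below is a simplified, generic version under stated assumptions: `SingleFlightSketch` and its `get()` method are hypothetical names, a `volatile` field replaces the read-write lock, the loader must return a non-`null` value, and `IllegalStateException` stands in for `MxGatewayException`. It is not the class from this diff.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical single-flight loader: concurrent callers share one in-flight load. */
final class SingleFlightSketch<T> {
    private final Object lock = new Object();
    private final Supplier<T> loader;
    private CompletableFuture<T> inFlight; // guarded by lock
    private volatile T value;               // non-null once a load has succeeded

    SingleFlightSketch(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        T cached = value;
        if (cached != null) {
            return cached; // fast path: already loaded
        }
        CompletableFuture<T> future;
        boolean leader;
        synchronized (lock) {
            if (value != null) {
                return value; // loaded while we waited for the lock
            }
            leader = inFlight == null;
            if (leader) {
                inFlight = new CompletableFuture<>();
            }
            future = inFlight;
        }
        if (leader) {
            try {
                T loaded = loader.get(); // runs outside the lock
                value = loaded;
                synchronized (lock) {
                    inFlight = null;
                }
                future.complete(loaded);
                return loaded;
            } catch (RuntimeException ex) {
                synchronized (lock) {
                    inFlight = null; // clear the slot so a later call can retry
                }
                future.completeExceptionally(ex);
                throw ex;
            }
        }
        try {
            return future.get(); // follower: wait for the leader's result
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted waiting for load", ie);
        } catch (ExecutionException ee) {
            if (ee.getCause() instanceof RuntimeException re) {
                throw re;
            }
            throw new IllegalStateException(ee.getCause());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger calls = new AtomicInteger();
        SingleFlightSketch<String> node = new SingleFlightSketch<>(() -> {
            calls.incrementAndGet();
            try {
                Thread.sleep(100); // simulate a slow RPC
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "children";
        });
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(node::get);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("loader calls: " + calls.get()); // prints "loader calls: 1"
    }
}
```

As in the diff, the leader only handles `RuntimeException`. If the loader threw an `Error`, the in-flight future would never complete and followers would wait indefinitely, which may be worth considering for the real `expand()` as well.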
|
||||
+22
@@ -384,6 +384,15 @@ public final class MxGatewayClient implements AutoCloseable {
|
||||
} catch (SSLException error) {
|
||||
throw new MxGatewayException("failed to configure gateway TLS", error);
|
||||
}
|
||||
} else if (!options.requireCertificateValidation()) {
|
||||
try {
|
||||
builder.sslContext(GrpcSslContexts.forClient()
|
||||
.trustManager(io.grpc.netty.shaded.io.netty.handler.ssl.util
|
||||
.InsecureTrustManagerFactory.INSTANCE)
|
||||
.build());
|
||||
} catch (SSLException error) {
|
||||
throw new MxGatewayException("failed to configure lenient gateway TLS", error);
|
||||
}
|
||||
} else {
|
||||
builder.useTransportSecurity();
|
||||
}
|
||||
@@ -393,6 +402,19 @@ public final class MxGatewayClient implements AutoCloseable {
|
||||
return builder.build();
|
||||
}
|
||||
|
||||
/**
|
||||
* Package-visible test seam — creates a raw {@link ManagedChannel} from the
|
||||
* given options without attaching auth interceptors. Used by TLS fixture
|
||||
* tests to verify channel construction behaviour without a full
|
||||
* {@link MxGatewayClient} wrapper.
|
||||
*
|
||||
* @param options the client options
|
||||
* @return a new {@link ManagedChannel}
|
||||
*/
|
||||
static ManagedChannel createChannelForTests(MxGatewayClientOptions options) {
|
||||
return createChannel(options);
|
||||
}
|
||||
|
||||
private <T extends io.grpc.stub.AbstractStub<T>> T withDeadline(T stub) {
|
||||
if (options.callTimeout().isNegative()) {
|
||||
return stub;
|
||||
|
||||
+32
@@ -20,6 +20,7 @@ public final class MxGatewayClientOptions {
|
||||
private final String apiKey;
|
||||
private final boolean plaintext;
|
||||
private final Path caCertificatePath;
|
||||
private final boolean requireCertificateValidation;
|
||||
private final String serverNameOverride;
|
||||
private final Duration connectTimeout;
|
||||
private final Duration callTimeout;
|
||||
@@ -31,6 +32,7 @@ public final class MxGatewayClientOptions {
|
||||
apiKey = builder.apiKey == null ? "" : builder.apiKey;
|
||||
plaintext = builder.plaintext;
|
||||
caCertificatePath = builder.caCertificatePath;
|
||||
requireCertificateValidation = builder.requireCertificateValidation;
|
||||
serverNameOverride = builder.serverNameOverride == null ? "" : builder.serverNameOverride;
|
||||
connectTimeout = builder.connectTimeout == null ? DEFAULT_CONNECT_TIMEOUT : builder.connectTimeout;
|
||||
callTimeout = builder.callTimeout == null ? DEFAULT_CALL_TIMEOUT : builder.callTimeout;
|
||||
@@ -95,6 +97,18 @@ public final class MxGatewayClientOptions {
|
||||
return caCertificatePath;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns whether TLS certificate verification is required even when no CA is pinned.
|
||||
* When {@code false} (default), the gateway's self-signed certificate is accepted
|
||||
* without verification. When {@code true}, the OS trust store is used.
|
||||
* Pinning a CA via {@link #caCertificatePath()} always verifies regardless of this flag.
|
||||
*
|
||||
* @return {@code true} if strict certificate verification is required
|
||||
*/
|
||||
public boolean requireCertificateValidation() {
|
||||
return requireCertificateValidation;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns the TLS server-name override, or an empty string when none was supplied.
|
||||
*
|
||||
@@ -148,6 +162,8 @@ public final class MxGatewayClientOptions {
|
||||
+ plaintext
|
||||
+ ", caCertificatePath="
|
||||
+ caCertificatePath
|
||||
+ ", requireCertificateValidation="
|
||||
+ requireCertificateValidation
|
||||
+ ", serverNameOverride='"
|
||||
+ serverNameOverride
|
||||
+ '\''
|
||||
@@ -177,6 +193,7 @@ public final class MxGatewayClientOptions {
|
||||
private String apiKey;
|
||||
private boolean plaintext;
|
||||
private Path caCertificatePath;
|
||||
private boolean requireCertificateValidation;
|
||||
private String serverNameOverride;
|
||||
private Duration connectTimeout;
|
||||
private Duration callTimeout;
|
||||
@@ -230,6 +247,21 @@ public final class MxGatewayClientOptions {
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* When {@code true}, TLS connections without a pinned CA use the OS trust store
|
||||
* and will reject the gateway's self-signed certificate. When {@code false}
|
||||
* (default), the gateway certificate is accepted without verification —
|
||||
* appropriate for this internal tool's auto-generated self-signed certificate.
|
||||
* Pinning a CA via {@link #caCertificatePath(Path)} always verifies.
|
||||
*
|
||||
* @param value {@code true} to require certificate validation, {@code false} to accept any cert
|
||||
* @return this builder
|
||||
*/
|
||||
public Builder requireCertificateValidation(boolean value) {
|
||||
requireCertificateValidation = value;
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* Overrides the TLS server name used during the handshake.
|
||||
*
|
||||
|
||||
+321
@@ -8,6 +8,8 @@ import static org.junit.jupiter.api.Assertions.assertTrue;
|
||||
|
||||
import com.google.protobuf.Timestamp;
|
||||
import galaxy_repository.v1.GalaxyRepositoryGrpc;
|
||||
import galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenReply;
|
||||
import galaxy_repository.v1.GalaxyRepositoryOuterClass.BrowseChildrenRequest;
|
||||
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DeployEvent;
|
||||
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DiscoverHierarchyReply;
|
||||
import galaxy_repository.v1.GalaxyRepositoryOuterClass.DiscoverHierarchyRequest;
|
||||
@@ -24,6 +26,7 @@ import io.grpc.Server;
|
||||
import io.grpc.ServerCall;
|
||||
import io.grpc.ServerCallHandler;
|
||||
import io.grpc.ServerInterceptor;
|
||||
import io.grpc.Status;
|
||||
import io.grpc.inprocess.InProcessChannelBuilder;
|
||||
import io.grpc.inprocess.InProcessServerBuilder;
|
||||
import io.grpc.stub.ClientCallStreamObserver;
|
||||
@@ -31,11 +34,20 @@ import io.grpc.stub.ClientResponseObserver;
|
||||
import io.grpc.stub.StreamObserver;
|
||||
import java.time.Duration;
|
||||
import java.time.Instant;
|
||||
import java.util.ArrayDeque;
|
||||
import java.util.Collections;
|
||||
import java.util.List;
|
||||
import java.util.Optional;
|
||||
import java.util.Queue;
|
||||
import java.util.UUID;
|
||||
import java.util.ArrayList;
|
||||
import java.util.concurrent.CopyOnWriteArrayList;
|
||||
import java.util.concurrent.CountDownLatch;
|
||||
import java.util.concurrent.ExecutorService;
|
||||
import java.util.concurrent.Executors;
|
||||
import java.util.concurrent.Future;
|
||||
import java.util.concurrent.TimeUnit;
|
||||
import java.util.concurrent.atomic.AtomicInteger;
|
||||
import java.util.concurrent.atomic.AtomicReference;
|
||||
import org.junit.jupiter.api.Test;
|
||||
|
||||
@@ -196,6 +208,27 @@ final class GalaxyRepositoryClientTests {
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void browseChildrenRejectsRepeatedPageToken() throws Exception {
|
||||
// Queue the same BrowseChildrenReply twice with a non-empty NextPageToken.
|
||||
// The client will request a second page and detect that the token repeats.
|
||||
BrowseChildrenService service = new BrowseChildrenService();
|
||||
BrowseChildrenReply repeatedReply = browseReply(
|
||||
List.of(obj(1, "Plant", true)),
|
||||
List.of(true),
|
||||
1L,
|
||||
"1:abc:1");
|
||||
service.replies.add(repeatedReply);
|
||||
service.replies.add(repeatedReply);
|
||||
|
||||
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
|
||||
GalaxyRepositoryClient client = g.client("")) {
|
||||
MxGatewayException error = assertThrows(MxGatewayException.class, client::browse);
|
||||
|
||||
assertTrue(error.getMessage().contains("repeated page token"));
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void watchDeployEventsReceivesEventsInOrder() throws Exception {
|
||||
DeployEvent first = DeployEvent.newBuilder()
|
||||
@@ -306,6 +339,294 @@ final class GalaxyRepositoryClientTests {
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void browseNoParentReturnsRoots() throws Exception {
|
||||
BrowseChildrenService service = new BrowseChildrenService();
|
||||
service.replies.add(browseReply(
|
||||
List.of(obj(1, "Plant", true), obj(2, "Other", false)),
|
||||
List.of(true, false),
|
||||
1L,
|
||||
""));
|
||||
|
||||
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
|
||||
GalaxyRepositoryClient client = g.client("")) {
|
||||
List<LazyBrowseNode> roots = client.browse();
|
||||
|
||||
assertEquals(2, roots.size());
|
||||
assertEquals("Plant", roots.get(0).getObject().getTagName());
|
||||
assertTrue(roots.get(0).hasChildrenHint());
|
||||
assertFalse(roots.get(0).isExpanded());
|
||||
assertEquals("Other", roots.get(1).getObject().getTagName());
|
||||
assertFalse(roots.get(1).hasChildrenHint());
|
||||
assertFalse(roots.get(1).isExpanded());
|
||||
assertEquals(1, service.calls.size());
|
||||
assertFalse(service.calls.get(0).hasParentGobjectId());
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void browseExpandPopulatesChildrenAndMarksExpanded() throws Exception {
|
||||
BrowseChildrenService service = new BrowseChildrenService();
|
||||
service.replies.add(browseReply(
|
||||
List.of(obj(1, "Plant", true)),
|
||||
List.of(true),
|
||||
1L,
|
||||
""));
|
||||
service.replies.add(browseReply(
|
||||
List.of(obj(10, "Line1", false)),
|
||||
List.of(false),
|
||||
1L,
|
||||
""));
|
||||
|
||||
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
|
||||
GalaxyRepositoryClient client = g.client("")) {
|
||||
List<LazyBrowseNode> roots = client.browse();
|
||||
roots.get(0).expand();
|
||||
|
||||
assertTrue(roots.get(0).isExpanded());
|
||||
assertEquals(1, roots.get(0).getChildren().size());
|
||||
assertEquals("Line1", roots.get(0).getChildren().get(0).getObject().getTagName());
|
||||
assertEquals(2, service.calls.size());
|
||||
assertTrue(service.calls.get(1).hasParentGobjectId());
|
||||
assertEquals(1, service.calls.get(1).getParentGobjectId());
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void browseExpandIdempotentNoSecondRpc() throws Exception {
|
||||
BrowseChildrenService service = new BrowseChildrenService();
|
||||
service.replies.add(browseReply(
|
||||
List.of(obj(1, "Plant", true)),
|
||||
List.of(true),
|
||||
1L,
|
||||
""));
|
||||
service.replies.add(browseReply(
|
||||
List.of(obj(10, "Line1", false)),
|
||||
List.of(false),
|
||||
1L,
|
||||
""));
|
||||
|
||||
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
|
||||
GalaxyRepositoryClient client = g.client("")) {
|
||||
List<LazyBrowseNode> roots = client.browse();
|
||||
roots.get(0).expand();
|
||||
roots.get(0).expand();
|
||||
|
||||
assertEquals(2, service.calls.size());
|
||||
assertEquals(1, roots.get(0).getChildren().size());
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void browseExpandUnknownParentThrowsGalaxyNotFound() throws Exception {
|
||||
BrowseChildrenService service = new BrowseChildrenService();
|
||||
service.replies.add(browseReply(
|
||||
List.of(obj(1, "Plant", true)),
|
||||
List.of(true),
|
||||
1L,
|
||||
""));
|
||||
service.errors.add(Status.NOT_FOUND.withDescription("Parent not found").asRuntimeException());
|
||||
|
||||
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
|
||||
GalaxyRepositoryClient client = g.client("")) {
|
||||
List<LazyBrowseNode> roots = client.browse();
|
||||
|
||||
MxGatewayException error = assertThrows(MxGatewayException.class, () -> roots.get(0).expand());
|
||||
assertTrue(
|
||||
error.getMessage().toLowerCase().contains("not found"),
|
||||
"expected message to mention 'not found', got: " + error.getMessage());
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void browseExpandMultiPageGathersAllPages() throws Exception {
|
||||
BrowseChildrenService service = new BrowseChildrenService();
|
||||
// Roots
|
||||
service.replies.add(browseReply(
|
||||
List.of(obj(7, "Plant", true)),
|
||||
List.of(true),
|
||||
1L,
|
||||
""));
|
||||
// First child page with a next token
|
||||
service.replies.add(browseReply(
|
||||
List.of(obj(70, "ChildA", false), obj(71, "ChildB", false)),
|
||||
List.of(false, false),
|
||||
1L,
|
||||
"7:abc:2"));
|
||||
// Second child page closes the loop
|
||||
service.replies.add(browseReply(
|
||||
List.of(obj(72, "ChildC", false)),
|
||||
List.of(false),
|
||||
1L,
|
||||
""));
|
||||
|
||||
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
|
||||
GalaxyRepositoryClient client = g.client("")) {
|
||||
List<LazyBrowseNode> roots = client.browse();
|
||||
roots.get(0).expand();
|
||||
|
||||
assertEquals(3, roots.get(0).getChildren().size());
|
||||
assertEquals(3, service.calls.size());
|
||||
assertEquals("7:abc:2", service.calls.get(2).getPageToken());
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void browseExpandConcurrentCallersOnlyFireOneRpc() throws Exception {
|
||||
// Verifies that concurrent expand() calls coalesce onto a single in-flight
|
||||
// BrowseChildren RPC and that readers (isExpanded/getChildren) are not
|
||||
// blocked for the full RPC duration.
|
||||
BrowseChildrenReply rootsReply = browseReply(
|
||||
List.of(obj(1, "Plant", true)),
|
||||
List.of(true),
|
||||
7L,
|
||||
"");
|
||||
BrowseChildrenReply childrenReply = browseReply(
|
||||
List.of(obj(2, "Mixer_001", false)),
|
||||
List.of(false),
|
||||
7L,
|
||||
"");
|
||||
|
||||
// Gate the child fetch behind a latch so multiple expanders can pile up.
|
||||
CountDownLatch release = new CountDownLatch(1);
|
||||
AtomicInteger childCalls = new AtomicInteger();
|
||||
BrowseChildrenService service = new BrowseChildrenService() {
|
||||
@Override
|
||||
public void browseChildren(
|
||||
BrowseChildrenRequest request, StreamObserver<BrowseChildrenReply> responseObserver) {
|
||||
calls.add(request);
|
||||
BrowseChildrenReply reply;
|
||||
if (!request.hasParentGobjectId()) {
|
||||
reply = rootsReply;
|
||||
} else {
|
||||
// Block the leader until the followers have arrived.
|
||||
try {
|
||||
assertTrue(release.await(5, TimeUnit.SECONDS), "release latch never tripped");
|
||||
} catch (InterruptedException ie) {
|
||||
Thread.currentThread().interrupt();
|
||||
responseObserver.onError(Status.CANCELLED.asRuntimeException());
|
||||
return;
|
||||
}
|
||||
childCalls.incrementAndGet();
|
||||
reply = childrenReply;
|
||||
}
|
||||
responseObserver.onNext(reply);
|
||||
responseObserver.onCompleted();
|
||||
}
|
||||
};
|
||||
|
||||
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
|
||||
GalaxyRepositoryClient client = g.client("")) {
|
||||
List<LazyBrowseNode> roots = client.browse();
|
||||
LazyBrowseNode root = roots.get(0);
|
||||
|
||||
int parallelism = 10;
|
||||
ExecutorService pool = Executors.newFixedThreadPool(parallelism);
|
||||
try {
|
||||
CountDownLatch ready = new CountDownLatch(parallelism);
|
||||
List<Future<Void>> futures = new ArrayList<>();
|
||||
for (int i = 0; i < parallelism; i++) {
|
||||
futures.add(pool.submit(() -> {
|
||||
ready.countDown();
|
||||
root.expand();
|
||||
return null;
|
||||
}));
|
||||
}
|
||||
// Wait for all callers to be in flight, then release the leader.
|
||||
assertTrue(ready.await(5, TimeUnit.SECONDS), "expander threads did not start");
|
||||
// Readers must not be blocked by an in-flight expand; this should not deadlock
|
||||
// and should return the pre-expand state.
|
||||
assertFalse(root.isExpanded());
|
||||
assertEquals(0, root.getChildren().size());
|
||||
release.countDown();
|
||||
|
||||
for (Future<Void> f : futures) {
|
||||
f.get(10, TimeUnit.SECONDS);
|
||||
}
|
||||
} finally {
|
||||
pool.shutdownNow();
|
||||
}
|
||||
|
||||
assertTrue(root.isExpanded());
|
||||
assertEquals(1, root.getChildren().size());
|
||||
// Exactly one expand RPC was issued even though many callers raced.
|
||||
assertEquals(1, childCalls.get());
|
||||
// 1 roots fetch + exactly 1 expand fetch.
|
||||
assertEquals(2, service.calls.size());
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void browseWithFilterForwardsToRequest() throws Exception {
|
||||
BrowseChildrenService service = new BrowseChildrenService();
|
||||
// Default reply is empty; only the request shape matters here.
|
||||
try (InProcessGalaxy g = InProcessGalaxy.start(service, new AtomicReference<>());
|
||||
GalaxyRepositoryClient client = g.client("")) {
|
||||
client.browse(BrowseChildrenOptions.builder()
|
||||
.tagNameGlob("Mixer*")
|
||||
.alarmBearingOnly(true)
|
||||
.build());
|
||||
}
|
||||
|
||||
assertEquals(1, service.calls.size());
|
||||
BrowseChildrenRequest request = service.calls.get(0);
|
||||
assertEquals("Mixer*", request.getTagNameGlob());
|
||||
assertTrue(request.getAlarmBearingOnly());
|
||||
}
|
||||
|
||||
private static GalaxyObject obj(int id, String tag, boolean isArea) {
|
||||
return GalaxyObject.newBuilder()
|
||||
.setGobjectId(id)
|
||||
.setTagName(tag)
|
||||
.setBrowseName(tag)
|
||||
.setIsArea(isArea)
|
||||
.build();
|
||||
}
|
||||
|
||||
private static BrowseChildrenReply browseReply(
|
||||
List<GalaxyObject> children,
|
||||
List<Boolean> childHasChildren,
|
||||
long cacheSequence,
|
||||
String nextPageToken) {
|
||||
BrowseChildrenReply.Builder b = BrowseChildrenReply.newBuilder()
|
||||
.setTotalChildCount(children.size())
|
||||
.setCacheSequence(cacheSequence)
|
||||
.setNextPageToken(nextPageToken);
|
||||
b.addAllChildren(children);
|
||||
b.addAllChildHasChildren(childHasChildren);
|
||||
return b.build();
|
||||
}
|
||||
|
||||
private static class BrowseChildrenService extends TestService {
|
||||
final List<BrowseChildrenRequest> calls =
|
||||
Collections.synchronizedList(new CopyOnWriteArrayList<>());
|
||||
final Queue<BrowseChildrenReply> replies = new ArrayDeque<>();
|
||||
final Queue<Throwable> errors = new ArrayDeque<>();
|
||||
|
||||
@Override
|
||||
public void browseChildren(
|
||||
BrowseChildrenRequest request, StreamObserver<BrowseChildrenReply> responseObserver) {
|
||||
calls.add(request);
|
||||
BrowseChildrenReply reply;
|
||||
Throwable err;
|
||||
synchronized (this) {
|
||||
// Prefer queued replies first; once they're exhausted, fall through to any
|
||||
// queued error. This matches the .NET fake's ordering used by parity tests.
|
||||
reply = replies.poll();
|
||||
err = reply == null ? errors.poll() : null;
|
||||
}
|
||||
if (err != null) {
|
||||
responseObserver.onError(err);
|
||||
return;
|
||||
}
|
||||
if (reply == null) {
|
||||
reply = BrowseChildrenReply.getDefaultInstance();
|
||||
}
|
||||
responseObserver.onNext(reply);
|
||||
responseObserver.onCompleted();
|
||||
}
|
||||
}
|
||||
|
||||
private abstract static class TestService extends GalaxyRepositoryGrpc.GalaxyRepositoryImplBase {
|
||||
@Override
|
||||
public void testConnection(
|
||||
|
||||
+198
@@ -0,0 +1,198 @@
|
||||
package com.zb.mom.ww.mxgateway.client;
|
||||
|
||||
import static org.junit.jupiter.api.Assertions.assertThrows;
|
||||
import static org.junit.jupiter.api.Assertions.assertTrue;
|
||||
|
||||
import io.grpc.ManagedChannel;
|
||||
import io.grpc.Server;
|
||||
import io.grpc.StatusRuntimeException;
|
||||
import io.grpc.netty.shaded.io.grpc.netty.GrpcSslContexts;
|
||||
import io.grpc.netty.shaded.io.grpc.netty.NettyServerBuilder;
|
||||
import io.grpc.stub.StreamObserver;
|
||||
import java.io.File;
|
||||
import java.io.FileOutputStream;
|
||||
import java.io.IOException;
|
||||
import java.net.InetSocketAddress;
|
||||
import java.nio.file.Files;
|
||||
import java.security.KeyStore;
|
||||
import java.security.PrivateKey;
|
||||
import java.security.cert.Certificate;
|
||||
import java.security.cert.X509Certificate;
|
||||
import java.time.Duration;
|
||||
import java.util.Base64;
|
||||
import java.util.concurrent.TimeUnit;
|
||||
import javax.net.ssl.SSLException;
|
||||
import mxaccess_gateway.v1.MxAccessGatewayGrpc;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.OpenSessionReply;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.OpenSessionRequest;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.ProtocolStatus;
|
||||
import mxaccess_gateway.v1.MxaccessGateway.ProtocolStatusCode;
|
||||
import org.junit.jupiter.api.AfterEach;
|
||||
import org.junit.jupiter.api.BeforeEach;
|
||||
import org.junit.jupiter.api.Test;
|
||||
|
||||
/**
|
||||
* Verifies that the Java client connects to a Netty TLS server with a
|
||||
* self-signed certificate when no CA is pinned (lenient default), and that
|
||||
* setting {@code requireCertificateValidation(true)} causes a TLS failure.
|
||||
*
|
||||
* <p>A self-signed certificate is generated using {@code keytool} (always
|
||||
* available in the JDK) to avoid dependencies on internal JDK APIs or
|
||||
* BouncyCastle, and so the test works on all JDK versions used by the project.
|
||||
*/
|
||||
final class MxGatewayClientTlsTests {
|
||||
|
||||
private Server server;
|
||||
private int port;
|
||||
private File certPemFile;
|
||||
private File keyPemFile;
|
||||
private File keystoreFile;
|
||||
|
||||
@BeforeEach
|
||||
void startTlsServer() throws Exception {
|
||||
keystoreFile = File.createTempFile("gw-test-ks", ".p12");
|
||||
certPemFile = File.createTempFile("gw-test-cert", ".pem");
|
||||
keyPemFile = File.createTempFile("gw-test-key", ".pem");
|
||||
|
||||
// keytool refuses to write to a pre-existing (even empty) file; delete it first.
|
||||
keystoreFile.delete();
|
||||
|
||||
// Use keytool to generate a self-signed PKCS12 keystore.
|
||||
String keytool = ProcessHandle.current().info().command()
|
||||
.map(cmd -> cmd.replace("java", "keytool"))
|
||||
.orElse("keytool");
|
||||
// Fall back to just "keytool" on PATH if the resolved path doesn't exist.
|
||||
if (!new File(keytool).exists()) {
|
||||
keytool = "keytool";
|
||||
}
|
||||
Process p = new ProcessBuilder(
|
||||
keytool,
|
||||
"-genkeypair",
|
||||
"-alias", "server",
|
||||
"-keyalg", "RSA",
|
||||
"-keysize", "2048",
|
||||
"-sigalg", "SHA256withRSA",
|
||||
"-validity", "1",
|
||||
"-dname", "CN=localhost",
|
||||
"-storetype", "PKCS12",
|
||||
"-storepass", "changeit",
|
||||
"-keypass", "changeit",
|
||||
"-keystore", keystoreFile.getAbsolutePath())
|
||||
.redirectErrorStream(true)
|
||||
.start();
|
||||
int exit = p.waitFor();
|
||||
if (exit != 0) {
|
||||
String out = new String(p.getInputStream().readAllBytes());
|
||||
throw new IllegalStateException("keytool failed (exit " + exit + "): " + out);
|
||||
}
|
||||
|
||||
// Export cert and private key from the PKCS12 keystore to PEM files.
|
||||
KeyStore ks = KeyStore.getInstance("PKCS12");
|
||||
try (var is = Files.newInputStream(keystoreFile.toPath())) {
|
||||
ks.load(is, "changeit".toCharArray());
|
||||
}
|
||||
X509Certificate cert = (X509Certificate) ks.getCertificate("server");
|
||||
PrivateKey privateKey = (PrivateKey) ks.getKey("server", "changeit".toCharArray());
|
||||
|
||||
try (FileOutputStream out = new FileOutputStream(certPemFile)) {
|
||||
out.write("-----BEGIN CERTIFICATE-----\n".getBytes());
|
||||
out.write(Base64.getMimeEncoder(64, new byte[]{'\n'}).encode(cert.getEncoded()));
|
||||
out.write("\n-----END CERTIFICATE-----\n".getBytes());
|
||||
}
|
||||
try (FileOutputStream out = new FileOutputStream(keyPemFile)) {
|
||||
out.write("-----BEGIN PRIVATE KEY-----\n".getBytes());
|
||||
out.write(Base64.getMimeEncoder(64, new byte[]{'\n'}).encode(privateKey.getEncoded()));
|
||||
out.write("\n-----END PRIVATE KEY-----\n".getBytes());
|
||||
}
|
||||
|
||||
server = NettyServerBuilder
|
||||
.forAddress(new InetSocketAddress("127.0.0.1", 0))
|
||||
.sslContext(GrpcSslContexts.forServer(certPemFile, keyPemFile).build())
|
||||
.addService(new MinimalGatewayService())
|
||||
.build()
|
||||
.start();
|
||||
port = server.getPort();
|
||||
}
|
||||
|
||||
@AfterEach
|
||||
void stopTlsServer() throws InterruptedException {
|
||||
if (server != null) {
|
||||
server.shutdown();
|
||||
server.awaitTermination(5, TimeUnit.SECONDS);
|
||||
}
|
||||
if (certPemFile != null) {
|
||||
certPemFile.delete();
|
||||
}
|
||||
if (keyPemFile != null) {
|
||||
keyPemFile.delete();
|
||||
}
|
||||
if (keystoreFile != null) {
|
||||
keystoreFile.delete();
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void connectsToSelfSignedServer_WhenRequireCertificateValidationIsFalse() throws SSLException {
|
||||
// Default options — requireCertificateValidation defaults to false.
|
||||
MxGatewayClientOptions options = MxGatewayClientOptions.builder()
|
||||
.endpoint("127.0.0.1:" + port)
|
||||
.apiKey("test-key")
|
||||
.connectTimeout(Duration.ofSeconds(5))
|
||||
.callTimeout(Duration.ofSeconds(5))
|
||||
.build();
|
||||
|
||||
ManagedChannel channel = MxGatewayClient.createChannelForTests(options);
|
||||
try {
|
||||
MxAccessGatewayGrpc.MxAccessGatewayBlockingStub stub =
|
||||
MxAccessGatewayGrpc.newBlockingStub(channel);
|
||||
OpenSessionReply reply = stub.openSession(
|
||||
OpenSessionRequest.newBuilder()
|
||||
.setClientSessionName("tls-test")
|
||||
.build());
|
||||
assertTrue(reply.getProtocolStatus().getCode()
|
||||
== ProtocolStatusCode.PROTOCOL_STATUS_CODE_OK);
|
||||
} finally {
|
||||
channel.shutdownNow();
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void failsToConnect_WhenRequireCertificateValidationIsTrue() throws SSLException {
|
||||
MxGatewayClientOptions options = MxGatewayClientOptions.builder()
|
||||
.endpoint("127.0.0.1:" + port)
|
||||
.apiKey("test-key")
|
||||
.requireCertificateValidation(true)
|
||||
.connectTimeout(Duration.ofSeconds(5))
|
||||
.callTimeout(Duration.ofSeconds(5))
|
||||
.build();
|
||||
|
||||
ManagedChannel channel = MxGatewayClient.createChannelForTests(options);
|
||||
try {
|
||||
MxAccessGatewayGrpc.MxAccessGatewayBlockingStub stub =
|
||||
MxAccessGatewayGrpc.newBlockingStub(channel);
|
||||
assertThrows(StatusRuntimeException.class, () ->
|
||||
stub.openSession(OpenSessionRequest.newBuilder()
|
||||
.setClientSessionName("tls-strict-test")
|
||||
.build()));
|
||||
} finally {
|
||||
channel.shutdownNow();
|
||||
}
|
||||
}
|
||||
|
||||
/** Minimal gateway stub that succeeds any OpenSession call. */
|
||||
private static final class MinimalGatewayService
|
||||
extends MxAccessGatewayGrpc.MxAccessGatewayImplBase {
|
||||
@Override
|
||||
public void openSession(
|
||||
OpenSessionRequest request,
|
||||
StreamObserver<OpenSessionReply> responseObserver) {
|
||||
responseObserver.onNext(OpenSessionReply.newBuilder()
|
||||
.setSessionId("tls-test-session")
|
||||
.setProtocolStatus(ProtocolStatus.newBuilder()
|
||||
.setCode(ProtocolStatusCode.PROTOCOL_STATUS_CODE_OK)
|
||||
.build())
|
||||
.build());
|
||||
responseObserver.onCompleted();
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -112,6 +112,28 @@ Support:
|
||||
- TLS channel with default roots,
|
||||
- custom root certificate file.
|
||||
|
||||
### Trust posture (trust-on-first-use)

The gateway can serve a self-signed certificate it generates itself (it has no
PKI). grpc-python exposes no per-channel skip-verify hook, so the client cannot
"accept any certificate" the way the other clients do. Instead, when the channel
is not plaintext and neither `ca_file` nor `require_certificate_validation` is
set, the TLS default is **trust-on-first-use**: the client fetches the server's
presented certificate once via `ssl.get_server_certificate` (an unverified
probe) and pins it as the channel's only trust root. Because the generated
certificate always carries a `localhost` SAN, the client also defaults
`grpc.ssl_target_name_override` to `localhost` when no `server_name_override` was
supplied, which tolerates dial-by-IP or a hostname mismatch. A failed probe is
surfaced as a transport error naming the endpoint.

To verify the gateway instead:

- set `ca_file` to verify against a specific CA, or
- set `require_certificate_validation=True` to verify against the system trust
  roots.

Both bypass the TOFU path.
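
The posture selection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the client's actual implementation: `resolve_tls_roots` and `open_secure_channel` are hypothetical helper names, and the `fetch_cert` parameter exists only so the unverified probe can be stubbed out. The sketch assumes the pinned PEM is handed to `grpc.ssl_channel_credentials` as the sole root.

```python
import ssl
from typing import Callable, List, Optional, Tuple

ChannelOptions = List[Tuple[str, str]]


def resolve_tls_roots(
    host: str,
    port: int,
    *,
    ca_file: Optional[str] = None,
    require_certificate_validation: bool = False,
    server_name_override: Optional[str] = None,
    fetch_cert: Callable[[Tuple[str, int]], str] = ssl.get_server_certificate,
) -> Tuple[Optional[bytes], ChannelOptions]:
    """Return (PEM roots or None for system roots, gRPC channel options)."""
    options: ChannelOptions = []
    if server_name_override:
        options.append(("grpc.ssl_target_name_override", server_name_override))

    # Strict posture 1: verify against a specific CA file.
    if ca_file is not None:
        with open(ca_file, "rb") as handle:
            return handle.read(), options

    # Strict posture 2: verify against the system trust roots.
    if require_certificate_validation:
        return None, options

    # Lenient default (TOFU): fetch the presented certificate without
    # verification and pin it as the only trust root.
    try:
        pem = fetch_cert((host, port))
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        raise ConnectionError(
            f"TLS certificate probe failed for {host}:{port}: {exc}"
        ) from exc
    if not server_name_override:
        # The generated certificate is stated to carry a `localhost` SAN.
        options.append(("grpc.ssl_target_name_override", "localhost"))
    return pem.encode("ascii"), options


def open_secure_channel(host: str, port: int, **kwargs):
    """Hypothetical wrapper: build an async channel from the resolved roots."""
    import grpc  # imported lazily so resolve_tls_roots stays stdlib-only

    roots, options = resolve_tls_roots(host, port, **kwargs)
    credentials = grpc.ssl_channel_credentials(root_certificates=roots)
    return grpc.aio.secure_channel(f"{host}:{port}", credentials, options=options)
```

Keeping the decision in a pure function makes each posture easy to unit-test without a live gateway; only `open_secure_channel` touches gRPC.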
|
||||
|
||||
## Streaming
|
||||
|
||||
Expose `stream_events` as an async iterator. Canceling the task should cancel
|
||||
|
||||
@@ -138,6 +138,49 @@ The methods return native Python types (`bool`, `datetime | None`, and a
|
||||
into the hierarchy without learning the underlying stub class. The
|
||||
service requires the `metadata:read` scope on the API key.
|
||||
|
||||
### Browsing lazily

For UI trees or OPC UA bridges, use `browse_children` to walk one level at a
time instead of loading the full hierarchy with `discover_hierarchy`. Pass an
empty request for root objects; subsequent calls set `parent_gobject_id`,
`parent_tag_name`, or `parent_contained_path`. Filter fields match
`DiscoverHierarchy`. Each response pairs `children` with `child_has_children` so
you know which nodes to expand. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics.

```python
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb2

reply = await galaxy.browse_children(galaxy_pb2.BrowseChildrenRequest())
for child, has_children in zip(reply.children, reply.child_has_children):
    print(child.tag_name, "expand=" + str(has_children))
```
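
A single reply may not contain every sibling. The loop below is a minimal paging sketch rather than the client's own code. It assumes the request carries a `page_token` field and the reply a `next_page_token` field that is empty on the last page. The names `browse_all_children`, `fetch_page`, and `make_request` are hypothetical stand-ins so the loop can run without a gateway.

```python
from typing import Any, Awaitable, Callable, List, Tuple


async def browse_all_children(
    fetch_page: Callable[[Any], Awaitable[Any]],
    make_request: Callable[[str], Any],
) -> List[Tuple[Any, bool]]:
    """Collect (child, has_children) pairs across all pages of one level."""
    collected: List[Tuple[Any, bool]] = []
    seen_tokens = set()
    token = ""  # empty token requests the first page
    while True:
        reply = await fetch_page(make_request(token))
        collected.extend(zip(reply.children, reply.child_has_children))
        token = reply.next_page_token
        if not token:
            return collected  # an empty token marks the last page
        if token in seen_tokens:
            # Guard against a server that keeps returning the same token.
            raise RuntimeError(f"repeated page token: {token!r}")
        seen_tokens.add(token)
```

With the real client this might be called as `await browse_all_children(galaxy.browse_children, lambda t: galaxy_pb2.BrowseChildrenRequest(page_token=t))`, again assuming the `page_token` field name.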
|
||||
|
||||
#### High-level walker

For UI trees, the client provides a `LazyBrowseNode` walker that handles
sibling pagination and the `child_has_children` hint for you:

```python
async with await GalaxyRepositoryClient.connect(
    endpoint="localhost:5000",
    api_key="<gateway-api-key>",
    plaintext=True,
) as galaxy:
    roots = await galaxy.browse()
    for root in roots:
        if root.has_children_hint:
            await root.expand()
        for child in root.children:
            kind = "has children" if child.has_children_hint else "leaf"
            print(f"{child.object.tag_name} ({kind})")
```

`expand` is idempotent: calling it twice fires only one RPC,
and it is safe under concurrent callers. To refresh after a Galaxy redeploy, call
`browse` again from the root.
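
To go deeper than one level, the same node attributes can drive a recursive walk. The helper below is a hedged sketch: `walk_tree`, `max_depth`, and `visit` are hypothetical names, and it relies only on the `has_children_hint`, `expand()`, `children`, and `object.tag_name` members shown above.

```python
from typing import Any, Callable


async def walk_tree(
    node: Any,
    *,
    max_depth: int = 3,
    visit: Callable[[str], None] = print,
    _depth: int = 0,
) -> None:
    """Depth-first walk that expands nodes lazily, down to max_depth levels."""
    visit("  " * _depth + node.object.tag_name)
    # Stop at the depth limit, and skip the RPC for nodes hinted as leaves.
    if _depth >= max_depth or not node.has_children_hint:
        return
    await node.expand()  # idempotent, so re-walking does not re-fetch
    for child in node.children:
        await walk_tree(child, max_depth=max_depth, visit=visit, _depth=_depth + 1)
```

Capping the depth keeps a large Galaxy from being fetched in full by accident; inside the `async with` block above it could be used as `for root in roots: await walk_tree(root, max_depth=2)`.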
|
||||
|
||||
### Watching deploy events
|
||||
|
||||
`GalaxyRepositoryClient.watch_deploy_events` opens a server-streaming
|
||||
@@ -187,6 +230,21 @@ The client supports plaintext channels for local development, TLS with system
|
||||
roots, TLS with a custom `ca_file`, and an optional test server name override.
|
||||
API keys are redacted from option repr output and CLI error output.
|
||||
|
||||
The gateway can auto-generate its own self-signed certificate (it has no PKI).
grpc-python has no per-channel skip-verify, so the lenient TLS default is
**trust-on-first-use**: with no `ca_file` and `require_certificate_validation`
left `False`, the client fetches the gateway's presented certificate once
(unverified) and pins it for the channel, defaulting the SNI/target-name override
to `localhost` (the generated certificate always carries a `localhost` SAN) when
none was supplied. To verify instead, pass `ca_file` to verify against a specific
CA, or set `require_certificate_validation=True` to verify against the system
trust roots. The strict posture is reachable through every documented entry
point: the `require_certificate_validation=True` keyword on
`GatewayClient.connect(...)` / `GalaxyRepositoryClient.connect(...)`, the
`ClientOptions(require_certificate_validation=True)` struct, and the
`--require-certificate-validation` CLI flag. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
|
||||
|
||||
## CLI
|
||||
|
||||
The CLI emits deterministic JSON for automation:
|
||||
@@ -213,6 +271,13 @@ Use TLS options for a secured gateway:
 mxgw-py smoke --endpoint mxgateway.example.local:5001 --tls --ca-file C:\certs\mxgateway-ca.pem --server-name-override mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item Object.Attribute --json
 ```

+To force certificate validation against the system trust store instead of the
+lenient trust-on-first-use default, add `--require-certificate-validation`:
+
+```powershell
+mxgw-py smoke --endpoint mxgateway.example.local:5001 --tls --require-certificate-validation --api-key-env MXGATEWAY_API_KEY --item Object.Attribute --json
+```
+
 ## Integration Checks

 Run live checks only when a gateway and MXAccess-backed worker are available:
@@ -225,6 +290,19 @@ $env:MXGATEWAY_TEST_ITEM = 'Object.Attribute'
 mxgw-py smoke --endpoint $env:MXGATEWAY_ENDPOINT --plaintext --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json
 ```

+## Installing from the Gitea PyPI Feed
+
+The client publishes to the internal Gitea PyPI feed:
+
+````bash
+pip install \
+--index-url https://gitea.dohertylan.com/api/packages/dohertj2/pypi/simple/ \
+zb-mom-ww-mxaccess-gateway-client
+````
+
+If you need authentication (private feed), use `--extra-index-url` and either
+a `~/.netrc` entry or `PIP_INDEX_URL=https://<user>:<token>@gitea.dohertylan.com/...`.
+
 ## Related Documentation

 - [Client Packaging](../../docs/ClientPackaging.md)
@@ -1,10 +1,12 @@
 [build-system]
-requires = ["setuptools>=69", "wheel"]
+# setuptools >=77 emits core-metadata 2.4 (PEP 639 License-Expression), which the
+# Gitea PyPI feed does not yet accept; cap below that so the dist stays <=2.3.
+requires = ["setuptools>=69,<77", "wheel"]
 build-backend = "setuptools.build_meta"

 [project]
 name = "zb-mom-ww-mxaccess-gateway-client"
-version = "0.1.0"
+version = "0.1.1"
 description = "Async Python client scaffold for MXAccess Gateway."
 readme = "README.md"
 requires-python = ">=3.12"
@@ -13,12 +15,34 @@ dependencies = [
     "grpcio>=1.80,<2",
     "protobuf>=6.33,<7",
 ]
+authors = [
+    { name = "Joseph Doherty" },
+]
+keywords = ["mxaccess", "mxgateway", "grpc", "client", "archestra"]
+classifiers = [
+    "License :: Other/Proprietary License",
+    "Development Status :: 3 - Alpha",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.12",
+    "Programming Language :: Python :: 3.13",
+    "Topic :: System :: Distributed Computing",
+    "Topic :: Software Development :: Libraries :: Python Modules",
+    "Intended Audience :: Developers",
+    "Operating System :: OS Independent",
+]
+
+[project.urls]
+Homepage = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
+Repository = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
+Issues = "https://gitea.dohertylan.com/dohertj2/mxaccessgw/issues"

 [project.optional-dependencies]
 dev = [
     "grpcio-tools>=1.80,<2",
     "pytest>=9,<10",
     "pytest-asyncio>=1.3,<2",
+    "build>=1.2,<2",
+    "twine>=5,<6",
 ]

 [project.scripts]
@@ -31,3 +55,6 @@ where = ["src"]
 addopts = "-ra"
 pythonpath = ["src"]
 testpaths = ["tests"]
+markers = [
+    "tls: loopback TLS tests, opt-in via MXGATEWAY_RUN_TLS_TESTS=1",
+]
@@ -40,6 +40,7 @@ class GatewayClient:
         api_key: str | None = None,
         plaintext: bool = False,
         ca_file: str | None = None,
+        require_certificate_validation: bool = False,
         server_name_override: str | None = None,
         stub: Any | None = None,
     ) -> "GatewayClient":
@@ -50,13 +51,16 @@ class GatewayClient:
             api_key=api_key,
             plaintext=plaintext,
             ca_file=ca_file,
+            require_certificate_validation=require_certificate_validation,
             server_name_override=server_name_override,
         )

         if stub is not None:
             return cls(options=resolved, stub=stub)

-        channel = create_channel(resolved)
+        # create_channel may perform a blocking TLS certificate probe (TOFU
+        # default); run it off the event loop so connect never freezes the loop.
+        channel = await asyncio.to_thread(create_channel, resolved)
         return cls(
             options=resolved,
             stub=pb_grpc.MxAccessGatewayStub(channel),
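Reviewer sketch for the hunk above: why moving a blocking call behind `asyncio.to_thread` keeps the event loop responsive. `blocking_probe` is a hypothetical stand-in for the certificate probe inside `create_channel`; the heartbeat task exists only to show that the loop keeps scheduling other work while the probe runs in a worker thread.

```python
import asyncio
import time


def blocking_probe(delay: float) -> str:
    """Stand-in for a blocking call such as a TLS certificate fetch."""
    time.sleep(delay)
    return "probe-done"


async def connect_without_blocking(delay: float = 0.2) -> tuple:
    """Run the blocking probe in a worker thread while the loop keeps ticking."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    try:
        # The event loop stays free to run heartbeat() during the probe.
        result = await asyncio.to_thread(blocking_probe, delay)
    finally:
        beat.cancel()
        # Collect the cancelled task so nothing is left pending.
        await asyncio.gather(beat, return_exceptions=True)
    return result, ticks
```

Calling `blocking_probe` directly inside the coroutine would leave `ticks` at zero, because nothing else could run until the sleep returned.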
@@ -21,9 +21,10 @@ from .auth import merge_metadata
 from .errors import MxGatewayError, map_rpc_error
 from .generated import galaxy_repository_pb2 as galaxy_pb
 from .generated import galaxy_repository_pb2_grpc as galaxy_pb_grpc
-from .options import ClientOptions, create_channel
+from .options import BrowseChildrenOptions, ClientOptions, create_channel

 _DISCOVER_HIERARCHY_PAGE_SIZE = 5000
+_BROWSE_CHILDREN_PAGE_SIZE = 500


 class GalaxyRepositoryClient:
@@ -51,6 +52,7 @@ class GalaxyRepositoryClient:
         api_key: str | None = None,
         plaintext: bool = False,
         ca_file: str | None = None,
+        require_certificate_validation: bool = False,
         server_name_override: str | None = None,
         stub: Any | None = None,
     ) -> "GalaxyRepositoryClient":
@@ -61,13 +63,16 @@ class GalaxyRepositoryClient:
             api_key=api_key,
             plaintext=plaintext,
             ca_file=ca_file,
+            require_certificate_validation=require_certificate_validation,
             server_name_override=server_name_override,
         )

         if stub is not None:
             return cls(options=resolved, stub=stub)

-        channel = create_channel(resolved)
+        # create_channel may perform a blocking TLS certificate probe (TOFU
+        # default); run it off the event loop so connect never freezes the loop.
+        channel = await asyncio.to_thread(create_channel, resolved)
         return cls(
             options=resolved,
             stub=galaxy_pb_grpc.GalaxyRepositoryStub(channel),
@@ -139,6 +144,89 @@ class GalaxyRepositoryClient:
                 )
             seen_page_tokens.add(page_token)

+    async def browse_children_raw(
+        self, request: galaxy_pb.BrowseChildrenRequest
+    ) -> galaxy_pb.BrowseChildrenReply:
+        """Issue one BrowseChildren RPC and return the raw reply.
+
+        Lower-level escape hatch for callers that need direct page-token control
+        or do not want LazyBrowseNode wrapping. Most callers should use
+        :py:meth:`browse` and :py:meth:`LazyBrowseNode.expand` instead.
+        """
+
+        return await self._unary(
+            "browse children",
+            self.raw_stub.BrowseChildren,
+            request,
+        )
+
+    async def browse(
+        self,
+        options: BrowseChildrenOptions | None = None,
+    ) -> list["LazyBrowseNode"]:
+        """Return the root browse nodes for lazy hierarchy traversal.
+
+        Each returned ``LazyBrowseNode`` wraps a Galaxy object whose direct
+        children can be loaded on demand by ``await node.expand()``.
+        """
+
+        effective = options or BrowseChildrenOptions()
+        return [
+            node
+            async for node in self._iter_browse_children(
+                parent_gobject_id=None,
+                options=effective,
+            )
+        ]
+
+    async def _iter_browse_children(
+        self,
+        *,
+        parent_gobject_id: int | None,
+        options: BrowseChildrenOptions,
+    ) -> AsyncIterator["LazyBrowseNode"]:
+        page_token = ""
+        seen_page_tokens: set[str] = set()
+        while True:
+            request = galaxy_pb.BrowseChildrenRequest(
+                page_size=_BROWSE_CHILDREN_PAGE_SIZE,
+                page_token=page_token,
+                alarm_bearing_only=options.alarm_bearing_only,
+                historized_only=options.historized_only,
+            )
+            if parent_gobject_id is not None:
+                request.parent_gobject_id = parent_gobject_id
+            if options.category_ids:
+                request.category_ids.extend(options.category_ids)
+            if options.template_chain_contains:
+                request.template_chain_contains.extend(options.template_chain_contains)
+            if options.tag_name_glob:
+                request.tag_name_glob = options.tag_name_glob
+            if options.include_attributes is not None:
+                request.include_attributes = options.include_attributes
+
+            reply = await self._unary(
+                "browse children",
+                self.raw_stub.BrowseChildren,
+                request,
+            )
+
+            for index, obj in enumerate(reply.children):
+                hint = (
+                    index < len(reply.child_has_children)
+                    and bool(reply.child_has_children[index])
+                )
+                yield LazyBrowseNode(self, obj, hint, options)
+
+            page_token = reply.next_page_token
+            if not page_token:
+                return
+            if page_token in seen_page_tokens:
+                raise MxGatewayError(
+                    f"galaxy browse children returned repeated page token {page_token!r}"
+                )
+            seen_page_tokens.add(page_token)
+
     def watch_deploy_events(
         self,
         last_seen_deploy_time: datetime | None = None,
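Reviewer sketch for the hunk above: the page-token loop in `_iter_browse_children`, reduced to a synchronous generator so the repeated-token guard is easy to see in isolation. `iter_pages`, `RepeatedPageTokenError`, and the `fetch` callable are hypothetical; the client itself is async and raises `MxGatewayError`.

```python
from typing import Callable, Iterator, List, Set, Tuple

# (items on this page, next_page_token); an empty token means "last page".
Page = Tuple[List[str], str]


class RepeatedPageTokenError(RuntimeError):
    """Hypothetical error type standing in for MxGatewayError."""


def iter_pages(fetch: Callable[[str], Page]) -> Iterator[str]:
    """Yield every item across pages, failing fast if the server loops."""
    page_token = ""
    seen_page_tokens: Set[str] = set()
    while True:
        items, page_token = fetch(page_token)
        yield from items
        if not page_token:
            return
        if page_token in seen_page_tokens:
            # A token seen before would otherwise paginate forever.
            raise RepeatedPageTokenError(f"repeated page token {page_token!r}")
        seen_page_tokens.add(page_token)
```

The guard costs one set lookup per page and turns a misbehaving server into a visible error instead of an endless loop.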
@@ -202,6 +290,67 @@ class GalaxyRepositoryClient:
             raise map_rpc_error(operation, error) from error


+class LazyBrowseNode:
+    """One node in a lazy-loaded Galaxy browse tree.
+
+    Calling ``expand`` once fetches direct children (paginating as needed)
+    and populates ``children``. Subsequent calls are no-ops so callers can
+    drive UI expand toggles without de-duping.
+    """
+
+    def __init__(
+        self,
+        client: "GalaxyRepositoryClient",
+        obj: galaxy_pb.GalaxyObject,
+        has_children_hint: bool,
+        options: BrowseChildrenOptions,
+    ) -> None:
+        """Initialize a node bound to its owning client and filter set."""
+        self._client = client
+        self._object = obj
+        self._has_children_hint = has_children_hint
+        self._options = options
+        self._children: list[LazyBrowseNode] = []
+        self._is_expanded = False
+        self._expand_lock = asyncio.Lock()
+
+    @property
+    def object(self) -> galaxy_pb.GalaxyObject:
+        """Return the underlying ``GalaxyObject`` proto for this node."""
+        return self._object
+
+    @property
+    def has_children_hint(self) -> bool:
+        """Return the server hint about whether this node has children."""
+        return self._has_children_hint
+
+    @property
+    def children(self) -> list["LazyBrowseNode"]:
+        """Return a copy of the loaded child nodes (empty until expanded)."""
+        return list(self._children)
+
+    @property
+    def is_expanded(self) -> bool:
+        """Return whether ``expand`` has already populated ``children``."""
+        return self._is_expanded
+
+    async def expand(self) -> None:
+        """Fetch direct children of this node; no-op on subsequent calls."""
+        if self._is_expanded:
+            return
+        async with self._expand_lock:
+            if self._is_expanded:
+                return
+            new_children: list[LazyBrowseNode] = []
+            async for child in self._client._iter_browse_children(
+                parent_gobject_id=self._object.gobject_id,
+                options=self._options,
+            ):
+                new_children.append(child)
+            self._children.extend(new_children)
+            self._is_expanded = True
+
+
 async def _canceling_iterator(call: Any) -> AsyncIterator[galaxy_pb.DeployEvent]:
     try:
         async for event in call:
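Reviewer sketch for the hunk above: the check, lock, re-check shape of `LazyBrowseNode.expand`, shown with a fake fetch so the "concurrent callers trigger one fetch" property can be observed. `ExpandOnce`, `_fetch`, and `expand_concurrently` are hypothetical names for illustration only.

```python
import asyncio
from typing import List


class ExpandOnce:
    """Hypothetical miniature of the expand-once pattern."""

    def __init__(self) -> None:
        self.children: List[str] = []
        self.fetch_count = 0
        self._is_expanded = False
        self._lock = asyncio.Lock()

    async def _fetch(self) -> List[str]:
        # Stand-in for the paginated BrowseChildren calls.
        self.fetch_count += 1
        await asyncio.sleep(0.01)
        return ["child-a", "child-b"]

    async def expand(self) -> None:
        if self._is_expanded:  # fast path, no lock needed
            return
        async with self._lock:
            if self._is_expanded:  # another caller finished while we waited
                return
            self.children.extend(await self._fetch())
            self._is_expanded = True


async def expand_concurrently(callers: int = 5) -> ExpandOnce:
    """Fire several expand() calls at once against a single node."""
    node = ExpandOnce()
    await asyncio.gather(*(node.expand() for _ in range(callers)))
    return node
```

Without the second check inside the lock, each waiting caller would fetch again once it acquired the lock and the children list would contain duplicates.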
@@ -26,7 +26,7 @@ from google.protobuf import timestamp_pb2 as google_dot_protobuf_dot_timestamp__
|
||||
from google.protobuf import wrappers_pb2 as google_dot_protobuf_dot_wrappers__pb2
|
||||
|
||||
|
||||
DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x17galaxy_repository.proto\x12\x14galaxy_repository.v1\x1a\x1fgoogle/protobuf/timestamp.proto\x1a\x1egoogle/protobuf/wrappers.proto\"\x17\n\x15TestConnectionRequest\"!\n\x13TestConnectionReply\x12\n\n\x02ok\x18\x01 \x01(\x08\"\x1a\n\x18GetLastDeployTimeRequest\"b\n\x16GetLastDeployTimeReply\x12\x0f\n\x07present\x18\x01 \x01(\x08\x12\x37\n\x13time_of_last_deploy\x18\x02 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\"\x87\x03\n\x18\x44iscoverHierarchyRequest\x12\x11\n\tpage_size\x18\x01 \x01(\x05\x12\x12\n\npage_token\x18\x02 \x01(\t\x12\x19\n\x0froot_gobject_id\x18\x03 \x01(\x05H\x00\x12\x17\n\rroot_tag_name\x18\x04 \x01(\tH\x00\x12\x1d\n\x13root_contained_path\x18\x05 \x01(\tH\x00\x12.\n\tmax_depth\x18\x06 \x01(\x0b\x32\x1b.google.protobuf.Int32Value\x12\x14\n\x0c\x63\x61tegory_ids\x18\x07 \x03(\x05\x12\x1f\n\x17template_chain_contains\x18\x08 \x03(\t\x12\x15\n\rtag_name_glob\x18\t \x01(\t\x12\x1f\n\x12include_attributes\x18\n \x01(\x08H\x01\x88\x01\x01\x12\x1a\n\x12\x61larm_bearing_only\x18\x0b \x01(\x08\x12\x17\n\x0fhistorized_only\x18\x0c \x01(\x08\x42\x06\n\x04rootB\x15\n\x13_include_attributes\"\x82\x01\n\x16\x44iscoverHierarchyReply\x12\x33\n\x07objects\x18\x01 \x03(\x0b\x32\".galaxy_repository.v1.GalaxyObject\x12\x17\n\x0fnext_page_token\x18\x02 \x01(\t\x12\x1a\n\x12total_object_count\x18\x03 \x01(\x05\"U\n\x18WatchDeployEventsRequest\x12\x39\n\x15last_seen_deploy_time\x18\x01 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\"\xdd\x01\n\x0b\x44\x65ployEvent\x12\x10\n\x08sequence\x18\x01 \x01(\x04\x12/\n\x0bobserved_at\x18\x02 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\x12\x37\n\x13time_of_last_deploy\x18\x03 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\x12#\n\x1btime_of_last_deploy_present\x18\x04 \x01(\x08\x12\x14\n\x0cobject_count\x18\x05 \x01(\x05\x12\x17\n\x0f\x61ttribute_count\x18\x06 \x01(\x05\"\x93\x02\n\x0cGalaxyObject\x12\x12\n\ngobject_id\x18\x01 \x01(\x05\x12\x10\n\x08tag_name\x18\x02 
\x01(\t\x12\x16\n\x0e\x63ontained_name\x18\x03 \x01(\t\x12\x13\n\x0b\x62rowse_name\x18\x04 \x01(\t\x12\x19\n\x11parent_gobject_id\x18\x05 \x01(\x05\x12\x0f\n\x07is_area\x18\x06 \x01(\x08\x12\x13\n\x0b\x63\x61tegory_id\x18\x07 \x01(\x05\x12\x1c\n\x14hosted_by_gobject_id\x18\x08 \x01(\x05\x12\x16\n\x0etemplate_chain\x18\t \x03(\t\x12\x39\n\nattributes\x18\n \x03(\x0b\x32%.galaxy_repository.v1.GalaxyAttribute\"\xa8\x02\n\x0fGalaxyAttribute\x12\x16\n\x0e\x61ttribute_name\x18\x01 \x01(\t\x12\x1a\n\x12\x66ull_tag_reference\x18\x02 \x01(\t\x12\x14\n\x0cmx_data_type\x18\x03 \x01(\x05\x12\x16\n\x0e\x64\x61ta_type_name\x18\x04 \x01(\t\x12\x10\n\x08is_array\x18\x05 \x01(\x08\x12\x17\n\x0f\x61rray_dimension\x18\x06 \x01(\x05\x12\x1f\n\x17\x61rray_dimension_present\x18\x07 \x01(\x08\x12\x1d\n\x15mx_attribute_category\x18\x08 \x01(\x05\x12\x1f\n\x17security_classification\x18\t \x01(\x05\x12\x15\n\ris_historized\x18\n \x01(\x08\x12\x10\n\x08is_alarm\x18\x0b \x01(\x08\x32\xcc\x03\n\x10GalaxyRepository\x12h\n\x0eTestConnection\x12+.galaxy_repository.v1.TestConnectionRequest\x1a).galaxy_repository.v1.TestConnectionReply\x12q\n\x11GetLastDeployTime\x12..galaxy_repository.v1.GetLastDeployTimeRequest\x1a,.galaxy_repository.v1.GetLastDeployTimeReply\x12q\n\x11\x44iscoverHierarchy\x12..galaxy_repository.v1.DiscoverHierarchyRequest\x1a,.galaxy_repository.v1.DiscoverHierarchyReply\x12h\n\x11WatchDeployEvents\x12..galaxy_repository.v1.WatchDeployEventsRequest\x1a!.galaxy_repository.v1.DeployEvent0\x01\x42-\xaa\x02*ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxyb\x06proto3')
|
||||
DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x17galaxy_repository.proto\x12\x14galaxy_repository.v1\x1a\x1fgoogle/protobuf/timestamp.proto\x1a\x1egoogle/protobuf/wrappers.proto\"\x17\n\x15TestConnectionRequest\"!\n\x13TestConnectionReply\x12\n\n\x02ok\x18\x01 \x01(\x08\"\x1a\n\x18GetLastDeployTimeRequest\"b\n\x16GetLastDeployTimeReply\x12\x0f\n\x07present\x18\x01 \x01(\x08\x12\x37\n\x13time_of_last_deploy\x18\x02 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\"\x87\x03\n\x18\x44iscoverHierarchyRequest\x12\x11\n\tpage_size\x18\x01 \x01(\x05\x12\x12\n\npage_token\x18\x02 \x01(\t\x12\x19\n\x0froot_gobject_id\x18\x03 \x01(\x05H\x00\x12\x17\n\rroot_tag_name\x18\x04 \x01(\tH\x00\x12\x1d\n\x13root_contained_path\x18\x05 \x01(\tH\x00\x12.\n\tmax_depth\x18\x06 \x01(\x0b\x32\x1b.google.protobuf.Int32Value\x12\x14\n\x0c\x63\x61tegory_ids\x18\x07 \x03(\x05\x12\x1f\n\x17template_chain_contains\x18\x08 \x03(\t\x12\x15\n\rtag_name_glob\x18\t \x01(\t\x12\x1f\n\x12include_attributes\x18\n \x01(\x08H\x01\x88\x01\x01\x12\x1a\n\x12\x61larm_bearing_only\x18\x0b \x01(\x08\x12\x17\n\x0fhistorized_only\x18\x0c \x01(\x08\x42\x06\n\x04rootB\x15\n\x13_include_attributes\"\x82\x01\n\x16\x44iscoverHierarchyReply\x12\x33\n\x07objects\x18\x01 \x03(\x0b\x32\".galaxy_repository.v1.GalaxyObject\x12\x17\n\x0fnext_page_token\x18\x02 \x01(\t\x12\x1a\n\x12total_object_count\x18\x03 \x01(\x05\"U\n\x18WatchDeployEventsRequest\x12\x39\n\x15last_seen_deploy_time\x18\x01 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\"\xdd\x01\n\x0b\x44\x65ployEvent\x12\x10\n\x08sequence\x18\x01 \x01(\x04\x12/\n\x0bobserved_at\x18\x02 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\x12\x37\n\x13time_of_last_deploy\x18\x03 \x01(\x0b\x32\x1a.google.protobuf.Timestamp\x12#\n\x1btime_of_last_deploy_present\x18\x04 \x01(\x08\x12\x14\n\x0cobject_count\x18\x05 \x01(\x05\x12\x17\n\x0f\x61ttribute_count\x18\x06 \x01(\x05\"\x93\x02\n\x0cGalaxyObject\x12\x12\n\ngobject_id\x18\x01 \x01(\x05\x12\x10\n\x08tag_name\x18\x02 
\x01(\t\x12\x16\n\x0e\x63ontained_name\x18\x03 \x01(\t\x12\x13\n\x0b\x62rowse_name\x18\x04 \x01(\t\x12\x19\n\x11parent_gobject_id\x18\x05 \x01(\x05\x12\x0f\n\x07is_area\x18\x06 \x01(\x08\x12\x13\n\x0b\x63\x61tegory_id\x18\x07 \x01(\x05\x12\x1c\n\x14hosted_by_gobject_id\x18\x08 \x01(\x05\x12\x16\n\x0etemplate_chain\x18\t \x03(\t\x12\x39\n\nattributes\x18\n \x03(\x0b\x32%.galaxy_repository.v1.GalaxyAttribute\"\xa8\x02\n\x0fGalaxyAttribute\x12\x16\n\x0e\x61ttribute_name\x18\x01 \x01(\t\x12\x1a\n\x12\x66ull_tag_reference\x18\x02 \x01(\t\x12\x14\n\x0cmx_data_type\x18\x03 \x01(\x05\x12\x16\n\x0e\x64\x61ta_type_name\x18\x04 \x01(\t\x12\x10\n\x08is_array\x18\x05 \x01(\x08\x12\x17\n\x0f\x61rray_dimension\x18\x06 \x01(\x05\x12\x1f\n\x17\x61rray_dimension_present\x18\x07 \x01(\x08\x12\x1d\n\x15mx_attribute_category\x18\x08 \x01(\x05\x12\x1f\n\x17security_classification\x18\t \x01(\x05\x12\x15\n\ris_historized\x18\n \x01(\x08\x12\x10\n\x08is_alarm\x18\x0b \x01(\x08\"\xdc\x02\n\x15\x42rowseChildrenRequest\x12\x1b\n\x11parent_gobject_id\x18\x01 \x01(\x05H\x00\x12\x19\n\x0fparent_tag_name\x18\x02 \x01(\tH\x00\x12\x1f\n\x15parent_contained_path\x18\x03 \x01(\tH\x00\x12\x11\n\tpage_size\x18\x04 \x01(\x05\x12\x12\n\npage_token\x18\x05 \x01(\t\x12\x14\n\x0c\x63\x61tegory_ids\x18\x06 \x03(\x05\x12\x1f\n\x17template_chain_contains\x18\x07 \x03(\t\x12\x15\n\rtag_name_glob\x18\x08 \x01(\t\x12\x1f\n\x12include_attributes\x18\t \x01(\x08H\x01\x88\x01\x01\x12\x1a\n\x12\x61larm_bearing_only\x18\n \x01(\x08\x12\x17\n\x0fhistorized_only\x18\x0b \x01(\x08\x42\x08\n\x06parentB\x15\n\x13_include_attributes\"\xb3\x01\n\x13\x42rowseChildrenReply\x12\x34\n\x08\x63hildren\x18\x01 \x03(\x0b\x32\".galaxy_repository.v1.GalaxyObject\x12\x17\n\x0fnext_page_token\x18\x02 \x01(\t\x12\x19\n\x11total_child_count\x18\x03 \x01(\x05\x12\x1a\n\x12\x63hild_has_children\x18\x04 \x03(\x08\x12\x16\n\x0e\x63\x61\x63he_sequence\x18\x05 
\x01(\x04\x32\xb6\x04\n\x10GalaxyRepository\x12h\n\x0eTestConnection\x12+.galaxy_repository.v1.TestConnectionRequest\x1a).galaxy_repository.v1.TestConnectionReply\x12q\n\x11GetLastDeployTime\x12..galaxy_repository.v1.GetLastDeployTimeRequest\x1a,.galaxy_repository.v1.GetLastDeployTimeReply\x12q\n\x11\x44iscoverHierarchy\x12..galaxy_repository.v1.DiscoverHierarchyRequest\x1a,.galaxy_repository.v1.DiscoverHierarchyReply\x12h\n\x11WatchDeployEvents\x12..galaxy_repository.v1.WatchDeployEventsRequest\x1a!.galaxy_repository.v1.DeployEvent0\x01\x12h\n\x0e\x42rowseChildren\x12+.galaxy_repository.v1.BrowseChildrenRequest\x1a).galaxy_repository.v1.BrowseChildrenReplyB-\xaa\x02*ZB.MOM.WW.MxGateway.Contracts.Proto.Galaxyb\x06proto3')
|
||||
|
||||
_globals = globals()
|
||||
_builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, _globals)
|
||||
@@ -54,6 +54,10 @@ if not _descriptor._USE_C_DESCRIPTORS:
   _globals['_GALAXYOBJECT']._serialized_end=1416
   _globals['_GALAXYATTRIBUTE']._serialized_start=1419
   _globals['_GALAXYATTRIBUTE']._serialized_end=1715
-  _globals['_GALAXYREPOSITORY']._serialized_start=1718
-  _globals['_GALAXYREPOSITORY']._serialized_end=2178
+  _globals['_BROWSECHILDRENREQUEST']._serialized_start=1718
+  _globals['_BROWSECHILDRENREQUEST']._serialized_end=2066
+  _globals['_BROWSECHILDRENREPLY']._serialized_start=2069
+  _globals['_BROWSECHILDRENREPLY']._serialized_end=2248
+  _globals['_GALAXYREPOSITORY']._serialized_start=2251
+  _globals['_GALAXYREPOSITORY']._serialized_end=2817
 # @@protoc_insertion_point(module_scope)
@@ -65,6 +65,11 @@ class GalaxyRepositoryStub(object):
                 request_serializer=galaxy__repository__pb2.WatchDeployEventsRequest.SerializeToString,
                 response_deserializer=galaxy__repository__pb2.DeployEvent.FromString,
                 _registered_method=True)
+        self.BrowseChildren = channel.unary_unary(
+                '/galaxy_repository.v1.GalaxyRepository/BrowseChildren',
+                request_serializer=galaxy__repository__pb2.BrowseChildrenRequest.SerializeToString,
+                response_deserializer=galaxy__repository__pb2.BrowseChildrenReply.FromString,
+                _registered_method=True)


 class GalaxyRepositoryServicer(object):
@@ -111,6 +116,16 @@ class GalaxyRepositoryServicer(object):
         context.set_details('Method not implemented!')
         raise NotImplementedError('Method not implemented!')

+    def BrowseChildren(self, request, context):
+        """Returns the direct children of a parent object (or the root objects when
+        `parent` is unset). Designed for OPC UA-style lazy expand: clients walk
+        one level at a time instead of paging the full hierarchy. Filters mirror
+        DiscoverHierarchy exactly. Backed by the same shared hierarchy cache.
+        """
+        context.set_code(grpc.StatusCode.UNIMPLEMENTED)
+        context.set_details('Method not implemented!')
+        raise NotImplementedError('Method not implemented!')
+

 def add_GalaxyRepositoryServicer_to_server(servicer, server):
     rpc_method_handlers = {
@@ -134,6 +149,11 @@ def add_GalaxyRepositoryServicer_to_server(servicer, server):
                     request_deserializer=galaxy__repository__pb2.WatchDeployEventsRequest.FromString,
                     response_serializer=galaxy__repository__pb2.DeployEvent.SerializeToString,
             ),
+            'BrowseChildren': grpc.unary_unary_rpc_method_handler(
+                    servicer.BrowseChildren,
+                    request_deserializer=galaxy__repository__pb2.BrowseChildrenRequest.FromString,
+                    response_serializer=galaxy__repository__pb2.BrowseChildrenReply.SerializeToString,
+            ),
     }
     generic_handler = grpc.method_handlers_generic_handler(
             'galaxy_repository.v1.GalaxyRepository', rpc_method_handlers)
@@ -263,3 +283,30 @@ class GalaxyRepository(object):
             timeout,
             metadata,
             _registered_method=True)
+
+    @staticmethod
+    def BrowseChildren(request,
+            target,
+            options=(),
+            channel_credentials=None,
+            call_credentials=None,
+            insecure=False,
+            compression=None,
+            wait_for_ready=None,
+            timeout=None,
+            metadata=None):
+        return grpc.experimental.unary_unary(
+            request,
+            target,
+            '/galaxy_repository.v1.GalaxyRepository/BrowseChildren',
+            galaxy__repository__pb2.BrowseChildrenRequest.SerializeToString,
+            galaxy__repository__pb2.BrowseChildrenReply.FromString,
+            options,
+            channel_credentials,
+            insecure,
+            call_credentials,
+            compression,
+            wait_for_ready,
+            timeout,
+            metadata,
+            _registered_method=True)
@@ -135,6 +135,9 @@ class MxAccessGatewayServicer(object):
         reconnect to seed Part 9 client state, or to reconcile alarms that may
         have been missed during a transport blip. Streamed so callers can
         begin processing without buffering the full set.
+        `QueryActiveAlarmsRequest.alarm_filter_prefix` optionally narrows the
+        snapshot to alarms whose `alarm_full_reference` starts with the given
+        prefix; an empty prefix returns the full set.
         """
         context.set_code(grpc.StatusCode.UNIMPLEMENTED)
         context.set_details('Method not implemented!')
@@ -2,12 +2,19 @@

 from __future__ import annotations

-from dataclasses import dataclass
+import ssl
+from collections.abc import Sequence
+from dataclasses import dataclass, field
 from pathlib import Path

 import grpc

 from .auth import REDACTED, ApiKey
+from .errors import MxGatewayTransportError
+
+# Fallback bound for the TOFU certificate probe when no call_timeout is set, so a
+# black-holed host fails fast instead of hanging on the OS default connect timeout.
+_TOFU_PROBE_TIMEOUT_SECONDS = 10.0


 @dataclass(frozen=True)
@@ -18,6 +25,7 @@ class ClientOptions:
     api_key: str | ApiKey | None = None
     plaintext: bool = False
     ca_file: str | None = None
+    require_certificate_validation: bool = False
     server_name_override: str | None = None
     call_timeout: float | None = 30.0
     stream_timeout: float | None = None
@@ -44,6 +52,7 @@ class ClientOptions:
             f"{type(self).__name__}(endpoint={self.endpoint!r}, "
             f"api_key={api_key!r}, plaintext={self.plaintext!r}, "
             f"ca_file={self.ca_file!r}, "
+            f"require_certificate_validation={self.require_certificate_validation!r}, "
             f"server_name_override={self.server_name_override!r}, "
             f"call_timeout={self.call_timeout!r}, "
             f"stream_timeout={self.stream_timeout!r}, "
@@ -51,8 +60,60 @@ class ClientOptions:
         )


+@dataclass(frozen=True)
+class BrowseChildrenOptions:
+    """Filters and shape options for ``GalaxyRepositoryClient.browse``.
+
+    Mirrors the AND-combined filter set on ``BrowseChildrenRequest`` so a
+    single instance can be re-used across an entire lazy browse session
+    (the filter set is part of the page-token contract).
+    """
+
+    category_ids: Sequence[int] = field(default_factory=tuple)
+    template_chain_contains: Sequence[str] = field(default_factory=tuple)
+    tag_name_glob: str | None = None
+    include_attributes: bool | None = None
+    alarm_bearing_only: bool = False
+    historized_only: bool = False
+
+
+def _split_authority(endpoint: str) -> tuple[str, int]:
+    """Split a gRPC target (optionally scheme-prefixed) into (host, port).
+
+    Handles bracketed IPv6 literals (e.g. ``[::1]:5120`` or bare ``[::1]``),
+    returning the host without brackets so it is safe to pass to
+    ``ssl.get_server_certificate``.
+    """
+    target = endpoint.split("://", 1)[-1]
+    if target.startswith("["):
+        # Bracketed IPv6: "[::1]:5120" or "[::1]"
+        bracket_end = target.find("]")
+        host = target[1:bracket_end]  # strip surrounding brackets
+        remainder = target[bracket_end + 1 :]  # ":5120" or ""
+        port_str = remainder.lstrip(":")
+        return (host, int(port_str) if port_str else 443)
+    host, sep, port = target.rpartition(":")
+    if not sep:
+        # No colon at all (e.g. a bare hostname "mygateway"): the whole target
+        # is the host; default the port rather than raising on int("mygateway").
+        return (target or "localhost", 443)
+    if not port.isdigit():
+        # A colon with a non-numeric / empty tail (e.g. a trailing ":") is not
+        # an explicit port — keep the left side as the host and default the
+        # port so a typo cannot raise an uncaught ValueError on the TOFU path.
+        return (host or "localhost", 443)
+    return (host or "localhost", int(port))
+
+
 def create_channel(options: ClientOptions) -> grpc.aio.Channel:
-    """Create a plaintext or TLS `grpc.aio` channel from client options."""
+    """Create a plaintext or TLS `grpc.aio` channel from client options.
+
+    The TLS default is lenient: grpc-python has no per-channel skip-verify, so
+    the server's presented certificate is fetched once (unverified) and pinned
+    as the channel's only trust root (trust-on-first-use). Set
+    `require_certificate_validation=True` to force system-trust verification, or
+    pass `ca_file` to verify against a specific CA — both bypass the TOFU path.
+    """

     channel_options: list[tuple[str, str | int]] = [
         ("grpc.max_receive_message_length", options.max_grpc_message_bytes),
@@ -64,11 +125,34 @@ def create_channel(options: ClientOptions) -> grpc.aio.Channel:
     if options.plaintext:
         return grpc.aio.insecure_channel(options.endpoint, options=channel_options)

-    root_certificates = None
     if options.ca_file:
         root_certificates = Path(options.ca_file).read_bytes()
+        credentials = grpc.ssl_channel_credentials(root_certificates=root_certificates)
+    elif options.require_certificate_validation:
+        credentials = grpc.ssl_channel_credentials()
+    else:
+        # Lenient default: grpc-python has no per-channel skip-verify, so fetch the
+        # server's certificate (unverified) and pin it for this channel (TOFU).
+        # The probe opens a real blocking TCP+TLS socket, so it MUST be bounded —
+        # a black-holed / firewall-drop host would otherwise hang on the OS default
+        # connect timeout (minutes). Bound it by call_timeout (or a short fixed
+        # fallback) so the dial fails fast as a transport error. The async
+        # `connect` classmethods run this off the event loop (asyncio.to_thread).
+        host, port = _split_authority(options.endpoint)
+        probe_timeout = options.call_timeout if options.call_timeout else _TOFU_PROBE_TIMEOUT_SECONDS
+        try:
+            presented = ssl.get_server_certificate((host, port), timeout=probe_timeout)
+        except OSError as error:
+            raise MxGatewayTransportError(
+                f"failed to fetch TLS certificate from {options.endpoint}: {error}"
+            ) from error
+        credentials = grpc.ssl_channel_credentials(root_certificates=presented.encode("ascii"))
+        # The gateway self-signed cert always carries a "localhost" SAN, so default
+        # the SNI/target-name override to it when none was supplied, tolerating
+        # dial-by-IP or hostname mismatch.
+        if not options.server_name_override:
+            channel_options.append(("grpc.ssl_target_name_override", "localhost"))

-    credentials = grpc.ssl_channel_credentials(root_certificates=root_certificates)
     return grpc.aio.secure_channel(
         options.endpoint,
         credentials,
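Reviewer sketch for the hunk above: the bounded trust-on-first-use probe on its own, with the fetch function injectable so the timeout choice and error wrapping can be exercised without a network. `fetch_pinned_certificate`, `TransportError`, and the `fetch` parameter are hypothetical; only `ssl.get_server_certificate` (which accepts `timeout` from Python 3.10) is a real library call.

```python
import ssl
from typing import Callable, Optional

FALLBACK_PROBE_TIMEOUT_SECONDS = 10.0


class TransportError(RuntimeError):
    """Hypothetical stand-in for MxGatewayTransportError."""


def fetch_pinned_certificate(
    host: str,
    port: int,
    call_timeout: Optional[float],
    fetch: Callable[..., str] = ssl.get_server_certificate,
) -> bytes:
    """Fetch the presented certificate (unverified) as PEM bytes to pin."""
    # A None or zero call_timeout falls back to the fixed bound, so the probe
    # can never wait on the OS default connect timeout.
    timeout = call_timeout if call_timeout else FALLBACK_PROBE_TIMEOUT_SECONDS
    try:
        pem = fetch((host, port), timeout=timeout)
    except OSError as error:
        # Timeouts and refused connections are OSError subclasses.
        raise TransportError(
            f"failed to fetch TLS certificate from {host}:{port}: {error}"
        ) from error
    return pem.encode("ascii")
```

In the hunk, bytes obtained this way become the channel's only trust root via `grpc.ssl_channel_credentials(root_certificates=...)`, so the pin protects later traffic on that channel but not the initial fetch.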
@@ -170,6 +170,13 @@ def gateway_options(command: Callable[..., Any]) -> Callable[..., Any]:
     command = click.option("--plaintext", is_flag=True, help="Use plaintext gRPC.")(command)
     command = click.option("--tls", "use_tls", is_flag=True, help="Use TLS gRPC.")(command)
     command = click.option("--ca-file", default=None, help="Custom root certificate file.")(command)
+    command = click.option(
+        "--require-certificate-validation",
+        "require_certificate_validation",
+        is_flag=True,
+        help="Verify the TLS certificate against the system trust store "
+        "instead of the lenient trust-on-first-use default.",
+    )(command)
     command = click.option(
         "--server-name-override",
         default=None,
@@ -923,6 +930,7 @@ async def _connect(kwargs: dict[str, Any]) -> GatewayClient:
         api_key=api_key,
         plaintext=_use_plaintext(kwargs),
         ca_file=kwargs.get("ca_file"),
+        require_certificate_validation=bool(kwargs.get("require_certificate_validation")),
         server_name_override=kwargs.get("server_name_override"),
         call_timeout=kwargs.get("call_timeout"),
         stream_timeout=kwargs.get("stream_timeout"),
@@ -1,9 +1,12 @@
|
||||
"""Tests for auth metadata and connection options."""
|
||||
|
||||
import socket
|
||||
|
||||
import pytest
|
||||
|
||||
from zb_mom_ww_mxgateway.auth import REDACTED, ApiKey, auth_metadata, redact_secret
|
||||
from zb_mom_ww_mxgateway import options as options_module
|
||||
from zb_mom_ww_mxgateway.errors import MxGatewayTransportError
|
||||
from zb_mom_ww_mxgateway.options import ClientOptions, create_channel
|
||||
|
||||
|
||||
@@ -72,27 +75,85 @@ def test_create_channel_uses_plaintext_channel(monkeypatch: pytest.MonkeyPatch)
|
||||
]
|
||||
|
||||
|
||||
def test_create_channel_uses_tls_channel(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
calls: list[tuple[str, object, object]] = []
|
||||
def test_create_channel_uses_tls_channel_tofu_default(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
"""Default TLS (no ca_file, no require_certificate_validation) uses TOFU:
|
||||
fetches the server cert unverified, pins it as root_certificates, and adds
|
||||
grpc.ssl_target_name_override = "localhost" automatically.
|
||||
"""
|
||||
_DUMMY_PEM = "-----BEGIN CERTIFICATE-----\nZmFrZQ==\n-----END CERTIFICATE-----\n"
|
||||
get_cert_calls: list[tuple[str, int]] = []
|
||||
|
||||
def fake_credentials(*, root_certificates: object) -> str:
|
||||
assert root_certificates is None
|
||||
def fake_get_server_certificate(
|
||||
addr: tuple[str, int], *, timeout: float | None = None
|
||||
) -> str:
|
||||
get_cert_calls.append(addr)
|
||||
return _DUMMY_PEM
|
||||
|
||||
cred_calls: list[object] = []
|
||||
|
||||
def fake_credentials(*, root_certificates: object = None) -> str:
|
||||
cred_calls.append(root_certificates)
|
||||
return "creds"
|
||||
|
||||
channel_calls: list[tuple[str, object, object]] = []
|
||||
|
||||
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
|
||||
calls.append((endpoint, credentials, options))
|
||||
channel_calls.append((endpoint, credentials, options))
|
||||
return "tls-channel"
|
||||
|
||||
monkeypatch.setattr(
|
||||
options_module.grpc,
|
||||
"ssl_channel_credentials",
|
||||
fake_credentials,
|
||||
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
|
||||
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
|
||||
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
|
||||
|
||||
channel = create_channel(
|
||||
ClientOptions(endpoint="gateway.example:5001"),
|
||||
)
|
||||
|
||||
assert channel == "tls-channel"
|
||||
# TOFU: should have fetched the cert from the server (host, port)
|
||||
assert get_cert_calls == [("gateway.example", 5001)]
|
||||
# Pinned the fetched PEM bytes as root_certificates
|
||||
assert cred_calls == [_DUMMY_PEM.encode("ascii")]
|
||||
# Auto-injected localhost override (no server_name_override supplied)
|
||||
assert channel_calls == [
|
||||
(
|
||||
"gateway.example:5001",
|
||||
"creds",
|
||||
[
|
||||
("grpc.max_receive_message_length", 16 * 1024 * 1024),
|
||||
("grpc.max_send_message_length", 16 * 1024 * 1024),
|
||||
("grpc.ssl_target_name_override", "localhost"),
|
||||
],
|
||||
),
|
||||
]
|
||||
|
||||
|
||||
def test_create_channel_uses_tls_channel_tofu_respects_server_name_override(
|
||||
monkeypatch: pytest.MonkeyPatch,
|
||||
) -> None:
|
||||
"""When server_name_override is set, TOFU still runs but does NOT add the
|
||||
auto-localhost override (the explicit override is already in channel_options).
|
||||
"""
|
||||
_DUMMY_PEM = "-----BEGIN CERTIFICATE-----\nZmFrZQ==\n-----END CERTIFICATE-----\n"
|
||||
monkeypatch.setattr(
|
||||
options_module.grpc.aio,
|
||||
"secure_channel",
|
||||
fake_secure_channel,
|
||||
options_module.ssl,
|
||||
"get_server_certificate",
|
||||
lambda addr, *, timeout=None: _DUMMY_PEM,
|
||||
)
|
||||
cred_calls: list[object] = []
|
||||
|
||||
def fake_credentials(*, root_certificates: object = None) -> str:
|
||||
cred_calls.append(root_certificates)
|
||||
return "creds"
|
||||
|
||||
channel_calls: list[tuple[str, object, object]] = []
|
||||
|
||||
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
|
||||
channel_calls.append((endpoint, credentials, options))
|
||||
return "tls-channel"
|
||||
|
||||
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
|
||||
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
|
||||
|
||||
channel = create_channel(
|
||||
ClientOptions(
|
||||
@@ -102,14 +163,164 @@ def test_create_channel_uses_tls_channel(monkeypatch: pytest.MonkeyPatch) -> Non
|
||||
)
|
||||
|
||||
assert channel == "tls-channel"
|
||||
assert calls == [
|
||||
(
|
||||
"gateway.example:5001",
|
||||
"creds",
|
||||
[
|
||||
("grpc.max_receive_message_length", 16 * 1024 * 1024),
|
||||
("grpc.max_send_message_length", 16 * 1024 * 1024),
|
||||
("grpc.ssl_target_name_override", "gateway.test"),
|
||||
],
|
||||
),
|
||||
]
|
||||
assert cred_calls == [_DUMMY_PEM.encode("ascii")]
|
||||
assert channel_calls == [
|
||||
(
|
||||
"gateway.example:5001",
|
||||
"creds",
|
||||
[
|
||||
("grpc.max_receive_message_length", 16 * 1024 * 1024),
|
||||
("grpc.max_send_message_length", 16 * 1024 * 1024),
|
||||
# Explicit override from ClientOptions — not the auto-localhost one
|
||||
("grpc.ssl_target_name_override", "gateway.test"),
|
||||
],
|
||||
),
|
||||
]
|
||||
|
||||
|
||||
def test_create_channel_uses_tls_channel_require_cert_validation(
|
||||
monkeypatch: pytest.MonkeyPatch,
|
||||
) -> None:
|
||||
"""require_certificate_validation=True uses system trust (no TOFU, no root_certificates)."""
|
||||
get_cert_called = False
|
||||
|
||||
def fake_get_server_certificate(addr: object, *, timeout: float | None = None) -> str:  # pragma: no cover
|
||||
nonlocal get_cert_called
|
||||
get_cert_called = True
|
||||
return "SHOULD_NOT_BE_CALLED"
|
||||
|
||||
cred_calls: list[object] = []
|
||||
|
||||
def fake_credentials(**kwargs: object) -> str:
|
||||
cred_calls.append(kwargs)
|
||||
return "creds"
|
||||
|
||||
channel_calls: list[tuple[str, object, object]] = []
|
||||
|
||||
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
|
||||
channel_calls.append((endpoint, credentials, options))
|
||||
return "tls-channel"
|
||||
|
||||
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
|
||||
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
|
||||
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
|
||||
|
||||
channel = create_channel(
|
||||
ClientOptions(
|
||||
endpoint="gateway.example:5001",
|
||||
require_certificate_validation=True,
|
||||
),
|
||||
)
|
||||
|
||||
assert channel == "tls-channel"
|
||||
# Must NOT call TOFU prefetch
|
||||
assert not get_cert_called
|
||||
# ssl_channel_credentials() called with NO keyword args (system trust)
|
||||
assert cred_calls == [{}]
|
||||
assert channel_calls == [
|
||||
(
|
||||
"gateway.example:5001",
|
||||
"creds",
|
||||
[
|
||||
("grpc.max_receive_message_length", 16 * 1024 * 1024),
|
||||
("grpc.max_send_message_length", 16 * 1024 * 1024),
|
||||
],
|
||||
),
|
||||
]
|
||||
|
||||
|
||||
def test_create_channel_uses_tls_channel_ca_file(
|
||||
monkeypatch: pytest.MonkeyPatch,
|
||||
tmp_path: pytest.TempPathFactory,
|
||||
) -> None:
|
||||
"""ca_file path: reads the PEM file, passes bytes as root_certificates, skips TOFU."""
|
||||
ca_pem = b"-----BEGIN CERTIFICATE-----\nY2FkYXRh\n-----END CERTIFICATE-----\n"
|
||||
ca_file = tmp_path / "ca.pem"
|
||||
ca_file.write_bytes(ca_pem)
|
||||
|
||||
get_cert_called = False
|
||||
|
||||
def fake_get_server_certificate(addr: object, *, timeout: float | None = None) -> str:  # pragma: no cover
|
||||
nonlocal get_cert_called
|
||||
get_cert_called = True
|
||||
return "SHOULD_NOT_BE_CALLED"
|
||||
|
||||
cred_calls: list[object] = []
|
||||
|
||||
def fake_credentials(*, root_certificates: object = None) -> str:
|
||||
cred_calls.append(root_certificates)
|
||||
return "creds"
|
||||
|
||||
channel_calls: list[tuple[str, object, object]] = []
|
||||
|
||||
def fake_secure_channel(endpoint: str, credentials: object, *, options: object) -> str:
|
||||
channel_calls.append((endpoint, credentials, options))
|
||||
return "tls-channel"
|
||||
|
||||
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
|
||||
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", fake_credentials)
|
||||
monkeypatch.setattr(options_module.grpc.aio, "secure_channel", fake_secure_channel)
|
||||
|
||||
channel = create_channel(
|
||||
ClientOptions(
|
||||
endpoint="gateway.example:5001",
|
||||
ca_file=str(ca_file),
|
||||
),
|
||||
)
|
||||
|
||||
assert channel == "tls-channel"
|
||||
assert not get_cert_called
|
||||
assert cred_calls == [ca_pem]
|
||||
assert channel_calls == [
|
||||
(
|
||||
"gateway.example:5001",
|
||||
"creds",
|
||||
[
|
||||
("grpc.max_receive_message_length", 16 * 1024 * 1024),
|
||||
("grpc.max_send_message_length", 16 * 1024 * 1024),
|
||||
],
|
||||
),
|
||||
]
|
||||
|
||||
|
||||
def test_tofu_probe_passes_a_bounded_timeout(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
"""The TOFU cert pre-fetch must be bounded so a black-holed host fails fast."""
|
||||
captured: dict[str, object] = {}
|
||||
|
||||
def fake_get_server_certificate(addr: object, *, timeout: float | None = None) -> str:
|
||||
captured["timeout"] = timeout
|
||||
return "-----BEGIN CERTIFICATE-----\nZmFrZQ==\n-----END CERTIFICATE-----\n"
|
||||
|
||||
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
|
||||
monkeypatch.setattr(options_module.grpc, "ssl_channel_credentials", lambda **_: "creds")
|
||||
monkeypatch.setattr(
|
||||
options_module.grpc.aio,
|
||||
"secure_channel",
|
||||
lambda endpoint, credentials, *, options: "tls-channel",
|
||||
)
|
||||
|
||||
create_channel(ClientOptions(endpoint="gateway.example:5001", call_timeout=7.5))
|
||||
|
||||
# A finite, positive timeout must be supplied (bounded by call_timeout here).
|
||||
assert isinstance(captured["timeout"], (int, float))
|
||||
assert 0 < captured["timeout"] <= 7.5
|
||||
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
"raised",
|
||||
[socket.timeout("timed out"), TimeoutError("timed out"), OSError("connection refused")],
|
||||
)
|
||||
def test_tofu_probe_timeout_raises_transport_error(
|
||||
monkeypatch: pytest.MonkeyPatch, raised: Exception
|
||||
) -> None:
|
||||
"""A timed-out / failed probe surfaces as MxGatewayTransportError, not a raw error."""
|
||||
|
||||
def fake_get_server_certificate(addr: object, *, timeout: float | None = None) -> str:
|
||||
raise raised
|
||||
|
||||
monkeypatch.setattr(options_module.ssl, "get_server_certificate", fake_get_server_certificate)
|
||||
|
||||
options = ClientOptions(endpoint="gateway.example:5001")
|
||||
with pytest.raises(MxGatewayTransportError) as excinfo:
|
||||
create_channel(options)
|
||||
assert options.endpoint in str(excinfo.value)
|
||||
|
||||
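Taken together, the tests above exercise three credential paths: an explicit CA file, strict validation against the system trust store, and the trust-on-first-use default. The sketch below summarizes that decision as a pure function. `TlsMode` and `select_tls_mode` are hypothetical names, and the precedence when both `ca_file` and `require_certificate_validation` are set is an assumption, since none of the tests shown covers that combination.

```python
from enum import Enum
from typing import Optional


class TlsMode(Enum):
    CA_FILE = "ca_file"            # pin roots read from a PEM file
    SYSTEM_TRUST = "system_trust"  # ssl_channel_credentials() with no roots
    TOFU = "tofu"                  # fetch the presented cert and pin it


def select_tls_mode(ca_file: Optional[str], require_certificate_validation: bool) -> TlsMode:
    """Pick a credential path; ca_file winning over strict validation is assumed."""
    if ca_file:
        return TlsMode.CA_FILE
    if require_certificate_validation:
        return TlsMode.SYSTEM_TRUST
    return TlsMode.TOFU
```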
@@ -2,14 +2,79 @@
|
||||
|
||||
import json
|
||||
|
||||
import pytest
|
||||
from click.testing import CliRunner
|
||||
|
||||
from zb_mom_ww_mxgateway import __version__
|
||||
from zb_mom_ww_mxgateway_cli import commands as commands_module
|
||||
from zb_mom_ww_mxgateway_cli.commands import main
|
||||
|
||||
_BATCH_EOR = "__MXGW_BATCH_EOR__"
|
||||
|
||||
|
||||
def test_require_certificate_validation_flag_flows_through_connect(
|
||||
monkeypatch: pytest.MonkeyPatch,
|
||||
) -> None:
|
||||
"""The --require-certificate-validation CLI flag must reach ClientOptions (Client.Python-027)."""
|
||||
captured: dict[str, object] = {}
|
||||
|
||||
async def fake_connect(options, **_kwargs):
|
||||
captured["options"] = options
|
||||
# Return a minimal object that supports the async context-manager protocol
|
||||
# used by every CLI command body (async with await _connect(...) as client).
|
||||
return _FakeAsyncClient()
|
||||
|
||||
monkeypatch.setattr(commands_module.GatewayClient, "connect", fake_connect)
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
[
|
||||
"open-session",
|
||||
"--endpoint",
|
||||
"gateway.example:5001",
|
||||
"--require-certificate-validation",
|
||||
"--json",
|
||||
],
|
||||
)
|
||||
|
||||
assert result.exit_code == 0, result.output
|
||||
assert captured["options"].require_certificate_validation is True
|
||||
|
||||
|
||||
def test_require_certificate_validation_defaults_off(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||
"""Without the flag the strict-validation posture stays off (TOFU default)."""
|
||||
captured: dict[str, object] = {}
|
||||
|
||||
async def fake_connect(options, **_kwargs):
|
||||
captured["options"] = options
|
||||
return _FakeAsyncClient()
|
||||
|
||||
monkeypatch.setattr(commands_module.GatewayClient, "connect", fake_connect)
|
||||
|
||||
result = CliRunner().invoke(
|
||||
main,
|
||||
["open-session", "--endpoint", "gateway.example:5001", "--plaintext", "--json"],
|
||||
)
|
||||
|
||||
assert result.exit_code == 0, result.output
|
||||
assert captured["options"].require_certificate_validation is False
|
||||
|
||||
|
||||
class _FakeAsyncClient:
|
||||
"""Minimal async-context-manager fake satisfying the open-session command body."""
|
||||
|
||||
async def __aenter__(self) -> "_FakeAsyncClient":
|
||||
return self
|
||||
|
||||
async def __aexit__(self, *_exc: object) -> None:
|
||||
return None
|
||||
|
||||
async def open_session_raw(self, *_args, **_kwargs):
|
||||
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2 as pb
|
||||
|
||||
return pb.OpenSessionReply(session_id="cli-test-session")
|
||||
|
||||
|
||||
def test_version_json_is_deterministic() -> None:
|
||||
runner = CliRunner()
|
||||
|
||||
|
||||
@@ -8,9 +8,107 @@ from typing import Any
|
||||
import pytest
|
||||
|
||||
from zb_mom_ww_mxgateway import ClientOptions, GatewayClient, MxAccessError
|
||||
from zb_mom_ww_mxgateway import client as client_module
|
||||
from zb_mom_ww_mxgateway import galaxy as galaxy_module
|
||||
from zb_mom_ww_mxgateway.galaxy import GalaxyRepositoryClient
|
||||
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2 as pb
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_gateway_connect_forwards_require_certificate_validation(
|
||||
monkeypatch: pytest.MonkeyPatch,
|
||||
) -> None:
|
||||
"""The connect convenience kwarg must reach ClientOptions (Client.Python-027)."""
|
||||
captured: dict[str, Any] = {}
|
||||
|
||||
def fake_create_channel(options: ClientOptions) -> object:
|
||||
captured["options"] = options
|
||||
return object()
|
||||
|
||||
monkeypatch.setattr(client_module, "create_channel", fake_create_channel)
|
||||
monkeypatch.setattr(client_module.pb_grpc, "MxAccessGatewayStub", lambda channel: object())
|
||||
|
||||
await GatewayClient.connect(
|
||||
endpoint="gateway.example:5001",
|
||||
require_certificate_validation=True,
|
||||
)
|
||||
|
||||
assert captured["options"].require_certificate_validation is True
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_galaxy_connect_forwards_require_certificate_validation(
|
||||
monkeypatch: pytest.MonkeyPatch,
|
||||
) -> None:
|
||||
"""GalaxyRepositoryClient.connect must thread the flag too (Client.Python-027)."""
|
||||
captured: dict[str, Any] = {}
|
||||
|
||||
def fake_create_channel(options: ClientOptions) -> object:
|
||||
captured["options"] = options
|
||||
return object()
|
||||
|
||||
monkeypatch.setattr(galaxy_module, "create_channel", fake_create_channel)
|
||||
monkeypatch.setattr(
|
||||
galaxy_module.galaxy_pb_grpc, "GalaxyRepositoryStub", lambda channel: object()
|
||||
)
|
||||
|
||||
await GalaxyRepositoryClient.connect(
|
||||
endpoint="gateway.example:5001",
|
||||
require_certificate_validation=True,
|
||||
)
|
||||
|
||||
assert captured["options"].require_certificate_validation is True
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_gateway_connect_runs_create_channel_off_the_event_loop(
|
||||
monkeypatch: pytest.MonkeyPatch,
|
||||
) -> None:
|
||||
"""connect must run the blocking channel factory off the loop (Client.Python-028)."""
|
||||
ran_in_thread: dict[str, bool] = {}
|
||||
|
||||
def fake_create_channel(options: ClientOptions) -> object:
|
||||
# If this runs on the event loop thread, get_running_loop() succeeds.
|
||||
try:
|
||||
asyncio.get_running_loop()
|
||||
ran_in_thread["off_loop"] = False
|
||||
except RuntimeError:
|
||||
ran_in_thread["off_loop"] = True
|
||||
return object()
|
||||
|
||||
monkeypatch.setattr(client_module, "create_channel", fake_create_channel)
|
||||
monkeypatch.setattr(client_module.pb_grpc, "MxAccessGatewayStub", lambda channel: object())
|
||||
|
||||
await GatewayClient.connect(endpoint="gateway.example:5001")
|
||||
|
||||
assert ran_in_thread["off_loop"] is True
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_galaxy_connect_runs_create_channel_off_the_event_loop(
|
||||
monkeypatch: pytest.MonkeyPatch,
|
||||
) -> None:
|
||||
"""GalaxyRepositoryClient.connect must also run the probe off the loop (Client.Python-028)."""
|
||||
ran_in_thread: dict[str, bool] = {}
|
||||
|
||||
def fake_create_channel(options: ClientOptions) -> object:
|
||||
try:
|
||||
asyncio.get_running_loop()
|
||||
ran_in_thread["off_loop"] = False
|
||||
except RuntimeError:
|
||||
ran_in_thread["off_loop"] = True
|
||||
return object()
|
||||
|
||||
monkeypatch.setattr(galaxy_module, "create_channel", fake_create_channel)
|
||||
monkeypatch.setattr(
|
||||
galaxy_module.galaxy_pb_grpc, "GalaxyRepositoryStub", lambda channel: object()
|
||||
)
|
||||
|
||||
await GalaxyRepositoryClient.connect(endpoint="gateway.example:5001")
|
||||
|
||||
assert ran_in_thread["off_loop"] is True
|
||||
|
||||
|
||||
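The two tests above detect off-loop execution by checking that `asyncio.get_running_loop()` raises `RuntimeError` inside the channel factory. One way to satisfy them is `asyncio.to_thread`, sketched below. Whether the client actually uses `to_thread`, `run_in_executor`, or another mechanism is not shown in this diff, and `blocking_factory` and `connect` are hypothetical names.

```python
import asyncio


def blocking_factory() -> bool:
    """Stand-in for a blocking channel factory; reports whether it ran off the loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return True  # no running loop in this thread, so this is a worker thread
    return False


async def connect() -> bool:
    # to_thread runs the callable in the default executor so the loop stays responsive.
    return await asyncio.to_thread(blocking_factory)
```

Calling `blocking_factory()` directly from a coroutine would return `False`, which is the regression the tests guard against.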
@pytest.mark.asyncio
|
||||
async def test_session_helpers_send_auth_metadata_and_preserve_raw_replies() -> None:
|
||||
stub = FakeGatewayStub()
|
||||
|
||||
@@ -6,12 +6,16 @@ import asyncio
|
||||
from datetime import datetime, timezone
|
||||
from typing import Any
|
||||
|
||||
import grpc
|
||||
import pytest
|
||||
from google.protobuf.timestamp_pb2 import Timestamp
|
||||
|
||||
from zb_mom_ww_mxgateway import ClientOptions, DeployEvent, GalaxyRepositoryClient, WatchDeployEventsRequest
|
||||
from zb_mom_ww_mxgateway.errors import MxGatewayError
|
||||
from zb_mom_ww_mxgateway.galaxy import LazyBrowseNode
|
||||
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2 as galaxy_pb
|
||||
from zb_mom_ww_mxgateway.generated import galaxy_repository_pb2_grpc as galaxy_pb_grpc
|
||||
from zb_mom_ww_mxgateway.options import BrowseChildrenOptions
|
||||
|
||||
|
||||
def test_galaxy_messages_import() -> None:
|
||||
@@ -268,15 +272,281 @@ async def test_close_marks_channel_closed_when_no_real_channel() -> None:
|
||||
await client.close()
|
||||
|
||||
|
||||
def _obj(gid: int, tag: str, is_area: bool = False) -> galaxy_pb.GalaxyObject:
|
||||
return galaxy_pb.GalaxyObject(
|
||||
gobject_id=gid, tag_name=tag, browse_name=tag, is_area=is_area,
|
||||
)
|
||||
|
||||
|
||||
def _build_browse_reply(
|
||||
children: list[galaxy_pb.GalaxyObject],
|
||||
child_has_children: list[bool],
|
||||
cache_sequence: int,
|
||||
next_page_token: str = "",
|
||||
) -> galaxy_pb.BrowseChildrenReply:
|
||||
reply = galaxy_pb.BrowseChildrenReply(
|
||||
total_child_count=len(children),
|
||||
cache_sequence=cache_sequence,
|
||||
next_page_token=next_page_token,
|
||||
)
|
||||
reply.children.extend(children)
|
||||
reply.child_has_children.extend(child_has_children)
|
||||
return reply
|
||||
|
||||
|
||||
def _fake_aio_rpc_error(code: grpc.StatusCode, details: str) -> grpc.aio.AioRpcError:
|
||||
return grpc.aio.AioRpcError(
|
||||
code=code,
|
||||
initial_metadata=grpc.aio.Metadata(),
|
||||
trailing_metadata=grpc.aio.Metadata(),
|
||||
details=details,
|
||||
)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_browse_no_parent_returns_roots() -> None:
|
||||
stub = FakeGalaxyStub()
|
||||
stub.browse_children.replies = [
|
||||
_build_browse_reply(
|
||||
children=[_obj(1, "Area_A", is_area=True), _obj(2, "Area_B", is_area=True)],
|
||||
child_has_children=[True, False],
|
||||
cache_sequence=7,
|
||||
),
|
||||
]
|
||||
client = await GalaxyRepositoryClient.connect(
|
||||
ClientOptions(endpoint="fake", plaintext=True),
|
||||
stub=stub,
|
||||
)
|
||||
|
||||
roots = await client.browse()
|
||||
|
||||
assert len(roots) == 2
|
||||
assert all(isinstance(node, LazyBrowseNode) for node in roots)
|
||||
assert roots[0].object.tag_name == "Area_A"
|
||||
assert roots[0].has_children_hint is True
|
||||
assert roots[1].has_children_hint is False
|
||||
assert roots[0].is_expanded is False
|
||||
request = stub.browse_children.requests[0]
|
||||
assert request.WhichOneof("parent") is None
|
||||
assert request.page_size == 500
|
||||
assert request.page_token == ""
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_browse_expand_populates_children_and_marks_expanded() -> None:
|
||||
stub = FakeGalaxyStub()
|
||||
stub.browse_children.replies = [
|
||||
_build_browse_reply(
|
||||
children=[_obj(1, "Area_A", is_area=True)],
|
||||
child_has_children=[True],
|
||||
cache_sequence=1,
|
||||
),
|
||||
_build_browse_reply(
|
||||
children=[_obj(11, "Child_A"), _obj(12, "Child_B")],
|
||||
child_has_children=[False, False],
|
||||
cache_sequence=1,
|
||||
),
|
||||
]
|
||||
client = await GalaxyRepositoryClient.connect(
|
||||
ClientOptions(endpoint="fake", plaintext=True),
|
||||
stub=stub,
|
||||
)
|
||||
|
||||
roots = await client.browse()
|
||||
await roots[0].expand()
|
||||
|
||||
assert roots[0].is_expanded is True
|
||||
assert [n.object.tag_name for n in roots[0].children] == ["Child_A", "Child_B"]
|
||||
assert len(stub.browse_children.requests) == 2
|
||||
expand_request = stub.browse_children.requests[1]
|
||||
assert expand_request.WhichOneof("parent") == "parent_gobject_id"
|
||||
assert expand_request.parent_gobject_id == 1
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_browse_expand_idempotent_no_second_rpc() -> None:
|
||||
stub = FakeGalaxyStub()
|
||||
stub.browse_children.replies = [
|
||||
_build_browse_reply(
|
||||
children=[_obj(1, "Area_A", is_area=True)],
|
||||
child_has_children=[True],
|
||||
cache_sequence=1,
|
||||
),
|
||||
_build_browse_reply(
|
||||
children=[_obj(11, "Child_A")],
|
||||
child_has_children=[False],
|
||||
cache_sequence=1,
|
||||
),
|
||||
]
|
||||
client = await GalaxyRepositoryClient.connect(
|
||||
ClientOptions(endpoint="fake", plaintext=True),
|
||||
stub=stub,
|
||||
)
|
||||
|
||||
roots = await client.browse()
|
||||
await roots[0].expand()
|
||||
await roots[0].expand()
|
||||
|
||||
assert len(stub.browse_children.requests) == 2
|
||||
assert len(roots[0].children) == 1
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_browse_expand_concurrent_callers_only_fire_one_rpc() -> None:
|
||||
stub = FakeGalaxyStub()
|
||||
stub.browse_children.replies = [
|
||||
_build_browse_reply([_obj(1, "Plant", is_area=True)], [True], 7),
|
||||
_build_browse_reply([_obj(2, "Mixer_001")], [False], 7),
|
||||
]
|
||||
client = await GalaxyRepositoryClient.connect(
|
||||
ClientOptions(endpoint="fake", plaintext=True),
|
||||
stub=stub,
|
||||
)
|
||||
|
||||
roots = await client.browse()
|
||||
# Ten concurrent expand calls on the same node should issue exactly one RPC.
|
||||
await asyncio.gather(*(roots[0].expand() for _ in range(10)))
|
||||
|
||||
assert roots[0].is_expanded
|
||||
assert len(roots[0].children) == 1
|
||||
# 1 roots fetch + exactly 1 expand fetch = 2 total
|
||||
assert len(stub.browse_children.requests) == 2
|
||||
|
||||
|
||||
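The concurrent-expand test above requires ten simultaneous `expand()` calls to produce a single RPC. A common way to get that behaviour is to guard the fetch with an `asyncio.Lock` and re-check the expanded flag once the lock is held. The sketch below illustrates the pattern only: `LazyNode` and `fetch_children` are hypothetical, and the real `LazyBrowseNode` implementation is not shown in this diff.

```python
import asyncio
from typing import Awaitable, Callable, List


class LazyNode:
    """Hypothetical lazy node that fetches its children at most once."""

    def __init__(self, fetch_children: Callable[[], Awaitable[List[str]]]) -> None:
        self._fetch_children = fetch_children
        self._lock = asyncio.Lock()
        self.children: List[str] = []
        self.is_expanded = False

    async def expand(self) -> None:
        if self.is_expanded:  # fast path: already populated
            return
        async with self._lock:
            if self.is_expanded:  # another caller finished while we waited
                return
            self.children = await self._fetch_children()
            self.is_expanded = True
```

Because `is_expanded` is only set after a successful fetch, a failed fetch leaves the node unexpanded and a later `expand()` can retry.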
@pytest.mark.asyncio
|
||||
async def test_browse_expand_unknown_parent_raises_mxgateway_error() -> None:
|
||||
stub = FakeGalaxyStub()
|
||||
stub.browse_children.replies = [
|
||||
_build_browse_reply(
|
||||
children=[_obj(99, "Stale_Parent", is_area=True)],
|
||||
child_has_children=[True],
|
||||
cache_sequence=1,
|
||||
),
|
||||
]
|
||||
stub.browse_children.exceptions = [
|
||||
None,
|
||||
_fake_aio_rpc_error(grpc.StatusCode.NOT_FOUND, "parent not found"),
|
||||
]
|
||||
client = await GalaxyRepositoryClient.connect(
|
||||
ClientOptions(endpoint="fake", plaintext=True),
|
||||
stub=stub,
|
||||
)
|
||||
|
||||
roots = await client.browse()
|
||||
with pytest.raises(MxGatewayError):
|
||||
await roots[0].expand()
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_browse_expand_multi_page_gathers_all_pages() -> None:
|
||||
stub = FakeGalaxyStub()
|
||||
stub.browse_children.replies = [
|
||||
_build_browse_reply(
|
||||
children=[_obj(7, "Area_Big", is_area=True)],
|
||||
child_has_children=[True],
|
||||
cache_sequence=2,
|
||||
),
|
||||
_build_browse_reply(
|
||||
children=[_obj(71, "Child_1"), _obj(72, "Child_2")],
|
||||
child_has_children=[False, False],
|
||||
cache_sequence=2,
|
||||
next_page_token="7:abc:2",
|
||||
),
|
||||
_build_browse_reply(
|
||||
children=[_obj(73, "Child_3")],
|
||||
child_has_children=[False],
|
||||
cache_sequence=2,
|
||||
),
|
||||
]
|
||||
client = await GalaxyRepositoryClient.connect(
|
||||
ClientOptions(endpoint="fake", plaintext=True),
|
||||
stub=stub,
|
||||
)
|
||||
|
||||
roots = await client.browse()
|
||||
await roots[0].expand()
|
||||
|
||||
assert [n.object.tag_name for n in roots[0].children] == ["Child_1", "Child_2", "Child_3"]
|
||||
assert len(stub.browse_children.requests) == 3
|
||||
assert stub.browse_children.requests[2].page_token == "7:abc:2"
|
||||
assert stub.browse_children.requests[2].parent_gobject_id == 7
|
||||
|
||||
|
||||
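The multi-page test above expects `expand()` to keep requesting pages until a reply carries an empty `next_page_token`. A minimal synchronous sketch of that loop follows. `Page`, `fetch_page`, and `gather_all_pages` are hypothetical names, and a plain dataclass stands in for the protobuf reply; the real client is asynchronous and also forwards the parent id on each request.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Page:
    """Hypothetical stand-in for a BrowseChildrenReply."""

    children: List[str] = field(default_factory=list)
    next_page_token: str = ""


def gather_all_pages(fetch_page: Callable[[str], Page]) -> List[str]:
    """Follow next_page_token until the server returns an empty token."""
    children: List[str] = []
    token = ""  # an empty token requests the first page
    while True:
        page = fetch_page(token)
        children.extend(page.children)
        if not page.next_page_token:
            return children
        token = page.next_page_token
```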
@pytest.mark.asyncio
|
||||
async def test_browse_with_filter_forwards_to_request() -> None:
|
||||
stub = FakeGalaxyStub()
|
||||
stub.browse_children.replies = [
|
||||
_build_browse_reply(
|
||||
children=[_obj(1, "Area_A", is_area=True)],
|
||||
child_has_children=[False],
|
||||
cache_sequence=3,
|
||||
),
|
||||
]
|
||||
client = await GalaxyRepositoryClient.connect(
|
||||
ClientOptions(endpoint="fake", plaintext=True),
|
||||
stub=stub,
|
||||
)
|
||||
options = BrowseChildrenOptions(
|
||||
category_ids=(4, 5),
|
||||
template_chain_contains=("$DelmiaReceiver",),
|
||||
tag_name_glob="Area_*",
|
||||
include_attributes=True,
|
||||
alarm_bearing_only=True,
|
||||
historized_only=True,
|
||||
)
|
||||
|
||||
await client.browse(options)
|
||||
|
||||
request = stub.browse_children.requests[0]
|
||||
assert list(request.category_ids) == [4, 5]
|
||||
assert list(request.template_chain_contains) == ["$DelmiaReceiver"]
|
||||
assert request.tag_name_glob == "Area_*"
|
||||
assert request.HasField("include_attributes")
|
||||
assert request.include_attributes is True
|
||||
assert request.alarm_bearing_only is True
|
||||
assert request.historized_only is True
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_browse_children_raw_returns_reply_unwrapped() -> None:
|
||||
"""browse_children_raw forwards the request to the stub and returns the raw reply."""
|
||||
stub = FakeGalaxyStub()
|
||||
expected = _build_browse_reply(
|
||||
children=[_obj(1, "Plant", is_area=True)],
|
||||
child_has_children=[True],
|
||||
cache_sequence=42,
|
||||
)
|
||||
stub.browse_children.replies = [expected]
|
||||
|
||||
async with await GalaxyRepositoryClient.connect(
|
||||
endpoint="fake",
|
||||
plaintext=True,
|
||||
stub=stub,
|
||||
) as client:
|
||||
request = galaxy_pb.BrowseChildrenRequest(
|
||||
page_size=10,
|
||||
tag_name_glob="Plant*",
|
||||
)
|
||||
reply = await client.browse_children_raw(request)
|
||||
|
||||
assert reply.cache_sequence == 42
|
||||
assert len(reply.children) == 1
|
||||
assert reply.children[0].tag_name == "Plant"
|
||||
assert len(stub.browse_children.requests) == 1
|
||||
assert stub.browse_children.requests[0].tag_name_glob == "Plant*"
|
||||
|
||||
|
||||
class FakeGalaxyStub:
|
||||
def __init__(self) -> None:
|
||||
self.test_connection = FakeUnary([galaxy_pb.TestConnectionReply(ok=False)])
|
||||
self.get_last_deploy_time = FakeUnary([galaxy_pb.GetLastDeployTimeReply(present=False)])
|
||||
self.discover_hierarchy = FakeUnary([galaxy_pb.DiscoverHierarchyReply()])
|
||||
self.browse_children = FakeUnary([galaxy_pb.BrowseChildrenReply()])
|
||||
self.watch_deploy_events = FakeStream([])
|
||||
self.TestConnection = self.test_connection
|
||||
self.GetLastDeployTime = self.get_last_deploy_time
|
||||
self.DiscoverHierarchy = self.discover_hierarchy
|
||||
self.BrowseChildren = self.browse_children
|
||||
|
||||
@property
|
||||
def WatchDeployEvents(self) -> "FakeStream": # noqa: N802 — gRPC naming
|
||||
@@ -287,6 +557,8 @@ class FakeUnary:
|
||||
def __init__(self, replies: list[Any]) -> None:
|
||||
self.replies = replies
|
||||
self.requests: list[Any] = []
|
||||
# One entry is consumed per call; None means "no exception on this call". A call that raises does not consume a reply.
|
||||
self.exceptions: list[BaseException | None] = []
|
||||
self.metadata: tuple[tuple[str, str], ...] | None = None
|
||||
|
||||
async def __call__(
|
||||
@@ -298,6 +570,10 @@ class FakeUnary:
|
||||
) -> Any:
|
||||
self.requests.append(request)
|
||||
self.metadata = metadata
|
||||
if self.exceptions:
|
||||
exc = self.exceptions.pop(0)
|
||||
if exc is not None:
|
||||
raise exc
|
||||
return self.replies.pop(0)
|
||||
|
||||
|
||||
|
||||
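The `FakeUnary` change above pairs a queue of scripted exceptions with the queue of replies. The self-contained sketch below makes the consumption order concrete: one `exceptions` entry is consumed per call, and a call that raises leaves the replies queue untouched. `ScriptedUnary` is a hypothetical, simplified re-implementation that omits the metadata and timeout handling; it is not the class from the test suite.

```python
import asyncio
from typing import Any, List, Optional


class ScriptedUnary:
    """Hypothetical simplified fake for a unary-unary RPC callable."""

    def __init__(self, replies: List[Any]) -> None:
        self.replies = replies
        self.requests: List[Any] = []
        self.exceptions: List[Optional[BaseException]] = []

    async def __call__(self, request: Any) -> Any:
        self.requests.append(request)
        if self.exceptions:
            exc = self.exceptions.pop(0)  # one entry per call; None means "succeed"
            if exc is not None:
                raise exc  # no reply is consumed on a failing call
        return self.replies.pop(0)
```

This is why `test_browse_expand_unknown_parent_raises_mxgateway_error` can script two calls with only one reply: the second call raises before any reply is popped.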
@@ -0,0 +1,176 @@
|
||||
"""TLS behaviour tests for ``create_channel``.
|
||||
|
||||
These spin up a real loopback ``grpc.aio`` server with a freshly generated
|
||||
self-signed certificate (carrying a ``localhost`` SAN, mirroring the gateway's
|
||||
auto-generated cert) and assert the lenient TOFU default lets a client connect
|
||||
without any CA configured.
|
||||
|
||||
Marked ``tls`` and skipped unless ``MXGATEWAY_RUN_TLS_TESTS=1`` because loopback
|
||||
TLS handshakes can be timing-flaky on shared CI runners. This mirrors how the
|
||||
suite gates anything that depends on real sockets rather than fakes.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import shutil
|
||||
import socket
|
||||
import ssl
|
||||
import subprocess
|
||||
import tempfile
|
||||
from collections.abc import AsyncIterator
|
||||
from pathlib import Path
|
||||
|
||||
import grpc
|
||||
import pytest
|
||||
import pytest_asyncio
|
||||
|
||||
from zb_mom_ww_mxgateway import ClientOptions
|
||||
from zb_mom_ww_mxgateway.errors import MxGatewayTransportError
|
||||
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2 as pb
|
||||
from zb_mom_ww_mxgateway.generated import mxaccess_gateway_pb2_grpc as pb_grpc
|
||||
from zb_mom_ww_mxgateway.options import create_channel
|
||||
|
||||
pytestmark = pytest.mark.tls
|
||||
|
||||
_RUN_TLS_TESTS = os.environ.get("MXGATEWAY_RUN_TLS_TESTS") == "1"
|
||||
_OPENSSL = shutil.which("openssl")
|
||||
|
||||
requires_tls = pytest.mark.skipif(
|
||||
not _RUN_TLS_TESTS,
|
||||
reason="set MXGATEWAY_RUN_TLS_TESTS=1 to run loopback TLS tests",
|
||||
)
|
||||
requires_openssl = pytest.mark.skipif(
|
||||
_OPENSSL is None,
|
||||
reason="openssl CLI is required to generate a self-signed test certificate",
|
||||
)
|
||||
|
||||
|
||||
def _generate_self_signed_cert(directory: Path) -> tuple[Path, Path]:
    """Generate a self-signed cert/key pair with a ``localhost`` SAN."""
    key_path = directory / "server.key"
    cert_path = directory / "server.crt"
    subprocess.run(
        [
            str(_OPENSSL),
            "req",
            "-x509",
            "-newkey",
            "rsa:2048",
            "-nodes",
            "-keyout",
            str(key_path),
            "-out",
            str(cert_path),
            "-days",
            "1",
            "-subj",
            "/CN=mxgateway-test",
            "-addext",
            "subjectAltName=DNS:localhost,IP:127.0.0.1",
        ],
        check=True,
        capture_output=True,
    )
    return cert_path, key_path
|
||||
|
||||
def _free_port() -> int:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return int(sock.getsockname()[1])
|
||||
|
||||
class _StaticGatewayServicer(pb_grpc.MxAccessGatewayServicer):
    """Minimal servicer answering ``OpenSession`` with a fixed session id."""

    async def OpenSession(  # noqa: N802 - generated gRPC method name
        self, request: pb.OpenSessionRequest, context: object
    ) -> pb.OpenSessionReply:
        return pb.OpenSessionReply(session_id="tls-session-1")
|
||||
|
||||
@pytest_asyncio.fixture
async def tls_server() -> AsyncIterator[int]:
    with tempfile.TemporaryDirectory() as tmp:
        cert_path, key_path = _generate_self_signed_cert(Path(tmp))
        credentials = grpc.ssl_server_credentials(
            [(key_path.read_bytes(), cert_path.read_bytes())]
        )
    server = grpc.aio.server()
    pb_grpc.add_MxAccessGatewayServicer_to_server(_StaticGatewayServicer(), server)
    port = _free_port()
    server.add_secure_port(f"127.0.0.1:{port}", credentials)
    await server.start()
    try:
        yield port
    finally:
        await server.stop(grace=None)
|
||||
|
||||
@requires_tls
@requires_openssl
@pytest.mark.asyncio
async def test_default_tls_connects_via_tofu(tls_server: int) -> None:
    """Default TLS options (no CA) connect by pinning the presented cert."""
    options = ClientOptions(
        endpoint=f"127.0.0.1:{tls_server}",
        api_key="mxgw_test_secret",
    )
    channel = create_channel(options)
    try:
        stub = pb_grpc.MxAccessGatewayStub(channel)
        reply = await stub.OpenSession(pb.OpenSessionRequest(), timeout=10)
        assert reply.session_id == "tls-session-1"
    finally:
        await channel.close()
|
||||
|
||||
def test_split_authority_parses_host_and_port() -> None:
    from zb_mom_ww_mxgateway.options import _split_authority

    assert _split_authority("https://10.0.0.5:5120") == ("10.0.0.5", 5120)
    assert _split_authority("localhost:5120") == ("localhost", 5120)
    assert _split_authority(":5120") == ("localhost", 5120)
|
||||
|
||||
def test_split_authority_defaults_port_for_portless_endpoint() -> None:
    from zb_mom_ww_mxgateway.options import _split_authority

    # A bare hostname (no ":port") must default to 443, not crash on int("mygateway").
    assert _split_authority("mygateway") == ("mygateway", 443)
    # Scheme-prefixed bare hostname behaves the same.
    assert _split_authority("https://mygateway") == ("mygateway", 443)
    # A non-numeric tail after a colon is treated as no explicit port.
    assert _split_authority("mygateway:") == ("mygateway", 443)
|
||||
|
||||
def test_split_authority_strips_ipv6_brackets() -> None:
    from zb_mom_ww_mxgateway.options import _split_authority

    # Bracketed IPv6 with port: brackets must be removed for ssl.get_server_certificate
    assert _split_authority("[::1]:5120") == ("::1", 5120)
    # Bare bracketed IPv6 (no port): default port 443
    assert _split_authority("[::1]") == ("::1", 443)
    # Scheme-prefixed bracketed IPv6
    assert _split_authority("grpc://[::1]:5120") == ("::1", 5120)
|
||||
|
||||
def test_tofu_connect_failure_raises_transport_error() -> None:
    """A failed cert pre-fetch surfaces the client's transport error type."""
    options = ClientOptions(endpoint=f"127.0.0.1:{_free_port()}")
    with pytest.raises(MxGatewayTransportError) as excinfo:
        create_channel(options)
    assert options.endpoint in str(excinfo.value)
|
||||
|
||||
def test_require_certificate_validation_uses_system_trust() -> None:
    """``require_certificate_validation`` must not attempt a TOFU pre-fetch."""
    # Pointing at a closed port: with system-trust the channel is created lazily
    # (no eager pre-fetch), so create_channel must succeed without connecting.
    options = ClientOptions(
        endpoint=f"127.0.0.1:{_free_port()}",
        require_certificate_validation=True,
    )
    channel = create_channel(options)
    assert isinstance(channel, grpc.aio.Channel)
@@ -17,3 +17,6 @@
|
||||
# args through the GNU linker and reject `/STACK:`, are unaffected.
|
||||
[target.'cfg(all(windows, target_env = "msvc"))']
|
||||
rustflags = ["-C", "link-arg=/STACK:8388608"]
|
||||
|
||||
[registries.dohertj2-gitea]
|
||||
index = "sparse+https://gitea.dohertylan.com/api/packages/dohertj2/cargo/"
|
||||
|
||||
Generated
+69
-2
@@ -207,6 +207,22 @@ version = "1.0.5"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1d07550c9036bf2ae0c684c4297d503f838287c83c53686d05370d0e139ae570"
|
||||
|
||||
[[package]]
|
||||
name = "core-foundation"
|
||||
version = "0.10.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b2a6cd9ae233e7f62ba4e9353e81a88df7fc8a5987b8d445b4d90c879bd156f6"
|
||||
dependencies = [
|
||||
"core-foundation-sys",
|
||||
"libc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "core-foundation-sys"
|
||||
version = "0.8.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "773648b94d0e5d620f64f280777445740e61fe701025087ec8b57f45c791888b"
|
||||
|
||||
[[package]]
|
||||
name = "either"
|
||||
version = "1.15.0"
|
||||
@@ -574,7 +590,7 @@ checksum = "1d87ecb2933e8aeadb3e3a02b828fed80a7528047e68b4f424523a0981a3a084"
|
||||
|
||||
[[package]]
|
||||
name = "mxgw-cli"
|
||||
version = "0.1.0"
|
||||
version = "0.1.1"
|
||||
dependencies = [
|
||||
"clap",
|
||||
"futures-util",
|
||||
@@ -597,6 +613,12 @@ version = "1.70.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "384b8ab6d37215f3c5301a95a4accb5d64aa607f1fcb26a11b5303878451b4fe"
|
||||
|
||||
[[package]]
|
||||
name = "openssl-probe"
|
||||
version = "0.2.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "7c87def4c32ab89d880effc9e097653c8da5d6ef28e6b539d313baaacfbafcbe"
|
||||
|
||||
[[package]]
|
||||
name = "percent-encoding"
|
||||
version = "2.3.2"
|
||||
@@ -796,6 +818,18 @@ dependencies = [
|
||||
"zeroize",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rustls-native-certs"
|
||||
version = "0.8.3"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "612460d5f7bea540c490b2b6395d8e34a953e52b491accd6c86c8164c5932a63"
|
||||
dependencies = [
|
||||
"openssl-probe",
|
||||
"rustls-pki-types",
|
||||
"schannel",
|
||||
"security-framework",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rustls-pki-types"
|
||||
version = "1.14.1"
|
||||
@@ -816,6 +850,38 @@ dependencies = [
|
||||
"untrusted",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "schannel"
|
||||
version = "0.1.28"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "891d81b926048e76efe18581bf793546b4c0eaf8448d72be8de2bbee5fd166e1"
|
||||
dependencies = [
|
||||
"windows-sys 0.61.2",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "security-framework"
|
||||
version = "3.7.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b7f4bc775c73d9a02cde8bf7b2ec4c9d12743edf609006c7facc23998404cd1d"
|
||||
dependencies = [
|
||||
"bitflags",
|
||||
"core-foundation",
|
||||
"core-foundation-sys",
|
||||
"libc",
|
||||
"security-framework-sys",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "security-framework-sys"
|
||||
version = "2.17.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "6ce2691df843ecc5d231c0b14ece2acc3efb62c0a398c7e1d875f3983ce020e3"
|
||||
dependencies = [
|
||||
"core-foundation-sys",
|
||||
"libc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "semver"
|
||||
version = "1.0.28"
|
||||
@@ -1056,6 +1122,7 @@ dependencies = [
|
||||
"percent-encoding",
|
||||
"pin-project",
|
||||
"prost",
|
||||
"rustls-native-certs",
|
||||
"socket2 0.5.10",
|
||||
"tokio",
|
||||
"tokio-rustls",
|
||||
@@ -1423,7 +1490,7 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "zb-mom-ww-mxgateway-client"
|
||||
version = "0.1.0"
|
||||
version = "0.1.1"
|
||||
dependencies = [
|
||||
"futures-core",
|
||||
"futures-util",
|
||||
|
||||
+17
-5
@@ -1,8 +1,17 @@
|
||||
[package]
|
||||
name = "zb-mom-ww-mxgateway-client"
|
||||
version = "0.1.0"
|
||||
version = "0.1.1"
|
||||
edition = "2021"
|
||||
publish = false
|
||||
authors = ["Joseph Doherty"]
|
||||
description = "Async Rust client for the MxAccessGateway gRPC service, including a lazy-browse walker over the Galaxy Repository hierarchy."
|
||||
license = "Proprietary"
|
||||
repository = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
|
||||
homepage = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
|
||||
documentation = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
|
||||
readme = "README.md"
|
||||
keywords = ["mxaccess", "mxgateway", "grpc", "client", "archestra"]
|
||||
categories = ["api-bindings", "asynchronous"]
|
||||
publish = ["dohertj2-gitea"]
|
||||
build = "build.rs"
|
||||
|
||||
[workspace]
|
||||
@@ -11,8 +20,11 @@ resolver = "2"
|
||||
|
||||
[workspace.package]
|
||||
edition = "2021"
|
||||
version = "0.1.0"
|
||||
publish = false
|
||||
version = "0.1.1"
|
||||
authors = ["Joseph Doherty"]
|
||||
license = "Proprietary"
|
||||
repository = "https://gitea.dohertylan.com/dohertj2/mxaccessgw"
|
||||
publish = ["dohertj2-gitea"]
|
||||
|
||||
[workspace.dependencies]
|
||||
clap = { version = "4.5.53", features = ["derive"] }
|
||||
@@ -25,7 +37,7 @@ serde_json = "1.0.145"
|
||||
thiserror = "2.0.17"
|
||||
tokio = { version = "1.48.0", features = ["macros", "rt-multi-thread", "sync", "time"] }
|
||||
tokio-stream = { version = "0.1.17", features = ["net"] }
|
||||
tonic = { version = "0.13.1", features = ["transport", "tls-ring"] }
|
||||
tonic = { version = "0.13.1", features = ["transport", "tls-ring", "tls-native-roots"] }
|
||||
tonic-build = "0.13.1"
|
||||
|
||||
[dependencies]
|
||||
|
||||
@@ -76,6 +76,27 @@ types.
|
||||
cargo run -p mxgw-cli -- smoke --endpoint https://mxgateway.example.local:5001 --tls --ca-file C:\certs\mxgateway-ca.pem --server-name-override mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item TestChildObject.TestInt --json
|
||||
```
|
||||
|
||||
### TLS trust (pin-only)

The gateway can auto-generate its own self-signed certificate (it has no PKI).
Unlike the other clients, the Rust client is **not** lenient: tonic 0.13.1
exposes no public hook to inject a custom certificate verifier, so the Rust
client cannot accept an *arbitrary* self-signed certificate. A TLS connection
requires one of two trust paths:

- `--ca-file` / `ClientOptions::with_ca_file(...)` to pin a CA (export the
  gateway's self-signed certificate and pin it). This is the path for a
  self-signed gateway.
- `--require-certificate-validation` / `with_require_certificate_validation(true)`
  to verify against the operating system's trust roots (`tls-native-roots`). This
  only succeeds for a certificate that chains to a root the host already trusts,
  i.e. a gateway fronted by a publicly- or enterprise-CA-issued certificate, not a
  bare self-signed one.

TLS with neither set fails `connect` with a clear, actionable error rather
than accepting the certificate. See
[Gateway Configuration](../../docs/GatewayConfiguration.md#automatic-self-signed-certificate).
|
||||
## Library Surface
|
||||
|
||||
`ClientOptions` configures endpoint, API key, plaintext or TLS transport,
|
||||
@@ -138,6 +159,50 @@ cargo run -p mxgw-cli -- galaxy last-deploy-time --endpoint http://localhost:500
|
||||
cargo run -p mxgw-cli -- galaxy discover-hierarchy --endpoint http://localhost:5000 --api-key-env MXGATEWAY_API_KEY --json
|
||||
```
|
||||
|
||||
### Browsing lazily

For UI trees or OPC UA bridges, use `browse_children` to walk one level at a
time instead of paging the full hierarchy. Pass a default request for root
objects; subsequent calls set `parent_gobject_id`, `parent_tag_name`, or
`parent_contained_path`. Filter fields match `discover_hierarchy`. Each response
pairs `children` with `child_has_children` so you know which nodes to expand. See
[Galaxy Repository](../../docs/GalaxyRepository.md#browsechildren) for full
request and filter semantics.

```rust
use zb_mom_ww_mxgateway_client::generated::galaxy_repository::v1::BrowseChildrenRequest;

let reply = galaxy.browse_children(BrowseChildrenRequest::default()).await?.into_inner();
for (child, has_children) in reply.children.iter().zip(reply.child_has_children.iter()) {
    println!("{} expand={}", child.tag_name, has_children);
}
```

#### High-level walker

For UI trees, the client provides a `LazyBrowseNode` walker that handles
sibling pagination and the `child_has_children` hint for you:

```rust
let mut client = GalaxyClient::connect(
    ClientOptions::new("http://localhost:5000").with_api_key(ApiKey::new(api_key)),
).await?;
let roots = client.browse(None).await?;
for root in &roots {
    if root.has_children_hint() {
        root.expand().await?;
    }
    for child in root.children().await {
        let kind = if child.has_children_hint() { "has children" } else { "leaf" };
        println!("{} ({kind})", child.object().tag_name);
    }
}
```

`expand` is idempotent: calling it twice fires only one RPC, and it is safe
under concurrent callers. To refresh after a Galaxy redeploy, call
`browse` again from the root.
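
The idempotency of `expand` can be pictured with a small synchronous sketch. This is not the crate's implementation: `Node`, `NodeState`, and the `fetch` closure are hypothetical stand-ins, a `std::sync::Mutex` takes the place of the async mutex, and the closure stands in for the paged `BrowseChildren` RPC.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the node's cached state.
struct NodeState {
    children: Vec<String>,
    is_expanded: bool,
}

/// Cheap-to-clone handle; every clone shares one state through the `Arc`.
#[derive(Clone)]
pub struct Node {
    state: Arc<Mutex<NodeState>>,
}

impl Node {
    pub fn new() -> Self {
        Node {
            state: Arc::new(Mutex::new(NodeState {
                children: Vec::new(),
                is_expanded: false,
            })),
        }
    }

    /// Runs `fetch` at most once; later calls return without calling it.
    pub fn expand<F>(&self, fetch: F)
    where
        F: FnOnce() -> Vec<String>,
    {
        // The lock is held while fetching, so a concurrent caller waits
        // here and then sees `is_expanded == true`.
        let mut state = self.state.lock().unwrap();
        if state.is_expanded {
            return;
        }
        state.children = fetch();
        state.is_expanded = true;
    }

    /// Snapshot of the cached children (empty before the first `expand`).
    pub fn children(&self) -> Vec<String> {
        self.state.lock().unwrap().children.clone()
    }
}
```

Because the flag and the children live behind the same lock, a second `expand` on any clone returns early and leaves the cached children untouched.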
|
||||
|
||||
### Watching deploy events
|
||||
|
||||
`watch_deploy_events` opens the `WatchDeployEvents` server stream. The
|
||||
@@ -192,3 +257,27 @@ cargo run -p mxgw-cli -- smoke --endpoint $env:MXGATEWAY_ENDPOINT --plaintext --
|
||||
- [Client Proto Generation](../../docs/ClientProtoGeneration.md)
|
||||
- [Rust Client Detailed Design](./RustClientDesign.md)
|
||||
- [Rust Style Guide](../../docs/style-guides/RustStyleGuide.md)
|
||||
|
||||
## Installing from the Gitea Cargo registry

The crate publishes to the internal Gitea Cargo registry. Register the
registry once in your global `~/.cargo/config.toml`:

```toml
[registries.dohertj2-gitea]
index = "sparse+https://gitea.dohertylan.com/api/packages/dohertj2/cargo/"
```

Authentication: cargo reads credentials from `~/.cargo/credentials.toml`:

```toml
[registries.dohertj2-gitea]
token = "Bearer <your-gitea-token>"
```

Then add the dependency:

```toml
[dependencies]
zb-mom-ww-mxgateway-client = { version = "0.1.1", registry = "dohertj2-gitea" }
```
|
||||
@@ -162,12 +162,73 @@ impl GatewayClient {
|
||||
|
||||
`stream_alarms` opens with one `active_alarm` per currently-active alarm
|
||||
(the ConditionRefresh snapshot), then a single `snapshot_complete`, then a
|
||||
`transition` for every subsequent raise / acknowledge / clear. The feed is
|
||||
served by the gateway's always-on alarm monitor — no worker session is
|
||||
opened — so any number of clients may attach. Dropping the stream cancels
|
||||
the gRPC call cooperatively. `acknowledge_alarm` is idempotent at the
|
||||
MxAccess layer; the returned `AcknowledgeAlarmReply` carries the native
|
||||
MxStatus from the worker.
|
||||
`transition` for every subsequent raise / acknowledge / clear. A fourth
|
||||
`provider_status` oneof case (`AlarmProviderStatus`: `mode`, `degraded`,
|
||||
`reason`, `since`) is emitted once on stream open and again on every
|
||||
failover/failback so late joiners learn the current alarm-provider mode.
|
||||
The CLI renders all four cases in both its one-line summary and its
|
||||
protobuf-JSON output (`alarm_feed_message_summary` /
|
||||
`alarm_feed_message_to_json`). The feed is served by the gateway's always-on
|
||||
alarm monitor — no worker session is opened — so any number of clients may
|
||||
attach. Dropping the stream cancels the gRPC call cooperatively.
|
||||
`acknowledge_alarm` is idempotent at the MxAccess layer; the returned
|
||||
`AcknowledgeAlarmReply` carries the native MxStatus from the worker.
|
||||
|
||||
## Galaxy Repository

`GalaxyClient` is a session-less metadata client (requires the
`metadata:read` API-key scope). Alongside `test_connection`,
`get_last_deploy_time`, `discover_hierarchy`, and `watch_deploy_events`, it
exposes a lazy hierarchy walker built on the `BrowseChildren` RPC:

```rust
impl GalaxyClient {
    pub async fn browse(&mut self, options: Option<BrowseChildrenOptions>) -> Result<Vec<LazyBrowseNode>, Error>;
    pub async fn browse_children_raw(&mut self, request: BrowseChildrenRequest) -> Result<BrowseChildrenReply, Error>;
}

pub struct BrowseChildrenOptions {
    pub category_ids: Vec<i32>,
    pub template_chain_contains: Vec<String>,
    pub tag_name_glob: Option<String>,
    pub include_attributes: Option<bool>,
    pub alarm_bearing_only: bool,
    pub historized_only: bool,
}

impl LazyBrowseNode {
    pub fn object(&self) -> &GalaxyObject;
    pub fn has_children_hint(&self) -> bool;
    pub async fn children(&self) -> Vec<LazyBrowseNode>;
    pub async fn is_expanded(&self) -> bool;
    pub async fn expand(&self) -> Result<(), Error>;
}
```

- `browse(options)` returns the root objects as `LazyBrowseNode`s. The
  supplied `BrowseChildrenOptions` filter is captured and reused when any
  returned node is expanded, so a single filter set scopes the entire walk.
- `BrowseChildrenOptions` mirrors the request-level filters on the wire and
  combines them with **AND**: a child appears only when it satisfies every
  populated criterion (`category_ids` membership, every
  `template_chain_contains` substring, the `tag_name_glob`, plus the
  `alarm_bearing_only` / `historized_only` flags). `include_attributes` is a
  tri-state (`None` = server default). Empty/`None` fields impose no
  restriction. See
  [Galaxy Repository: BrowseChildren](../../docs/GalaxyRepository.md#browsechildren)
  for the wire-level semantics.
- `LazyBrowseNode` is cheap to clone: clones share state through an internal
  `Arc`, so expanding one clone makes the children visible to every clone.
  `has_children_hint()` exposes the server's `child_has_children` hint so a UI
  can draw an expand affordance without issuing an RPC. `expand()` is
  idempotent: the first call issues a paged `BrowseChildren` walk (page size
  500) under an async mutex held across the await, sets the `is_expanded`
  flag, and caches the children; subsequent calls are no-ops that issue no
  further RPC. The internal paged loop guards against a server returning a
  repeated `next_page_token` by failing with `Error::InvalidArgument` rather
  than looping forever.
- `browse_children_raw` issues a single `BrowseChildren` RPC and returns the
  raw reply for callers that want to drive paging themselves.
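
The repeated-token guard in the paged loop might look roughly like the following. This is a sketch under assumptions, not the client's code: `Page`, `PagingError`, `collect_all_pages`, and the `fetch_page` closure are hypothetical names, the closure stands in for one `BrowseChildren` RPC, and the real client reports the failure as `Error::InvalidArgument`.

```rust
use std::collections::HashSet;

/// One page of results plus the token for the next page (empty = last page).
pub struct Page {
    pub items: Vec<String>,
    pub next_page_token: String,
}

/// Stand-in error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum PagingError {
    RepeatedPageToken(String),
}

/// Drains every page, failing instead of looping if a token repeats.
pub fn collect_all_pages<F>(mut fetch_page: F) -> Result<Vec<String>, PagingError>
where
    F: FnMut(&str) -> Page,
{
    let mut items = Vec::new();
    let mut seen_tokens = HashSet::new();
    let mut token = String::new(); // an empty token requests the first page

    loop {
        let page = fetch_page(&token);
        items.extend(page.items);

        if page.next_page_token.is_empty() {
            return Ok(items);
        }
        // `insert` returns false when the token was already handed out,
        // which would otherwise send the loop around the same pages forever.
        if !seen_tokens.insert(page.next_page_token.clone()) {
            return Err(PagingError::RepeatedPageToken(page.next_page_token));
        }
        token = page.next_page_token;
    }
}
```

Tracking every token seen so far, rather than only the previous one, also catches longer cycles such as `t1 -> t2 -> t1`.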
|
||||
|
||||
## Authentication
|
||||
|
||||
@@ -189,6 +250,32 @@ Support:
|
||||
- custom CA file,
|
||||
- domain override.
|
||||
|
||||
### Trust posture (pin-only)

The gateway can serve a self-signed certificate it generates itself (it has no
PKI). Rust is the **exception** to the lenient-by-default posture the other
clients use: tonic 0.13.1 exposes no public hook to inject a custom certificate
verifier, so the Rust client cannot accept an arbitrary certificate. TLS over the
Rust client is therefore **pin-only** and requires either:

- `ClientOptions::with_ca_file(...)` to pin a CA (the supported path for the
  gateway's self-signed certificate; export the certificate and pin it), or
- `ClientOptions::with_require_certificate_validation(true)` to verify against the
  operating system's trust roots. This enables the `tonic` `tls-native-roots`
  feature and calls `ClientTlsConfig::with_native_roots()`, so the handshake
  validates a certificate that chains to a root the host already trusts. It does
  **not** accept a bare self-signed gateway certificate; that still needs
  `with_ca_file`.

`build_tls_config` computes the trust posture with the pure `tls_trust_decision`
helper (`None` / `PinnedCa` / `SystemRoots` / `RejectNoCa`) so the posture is
unit-testable without a live handshake. With TLS enabled (`with_plaintext(false)`),
no pinned CA, and certificate validation not required (`RejectNoCa`),
`GatewayClient::connect` rejects the connection with a clear, actionable error
pointing at `with_ca_file` / `require_certificate_validation` rather than building
a config with zero trust anchors. The CLI exposes `--ca-file` and
`--require-certificate-validation`.
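
A pure decision helper of this kind could be sketched as follows. The four variant names come from the text above; the enum name `TlsTrustDecision`, the three boolean parameters, and the precedence of a pinned CA over system roots when both are set are assumptions made for illustration, not the crate's confirmed signature or behaviour.

```rust
/// Trust posture for one connection. Variant names follow the design text;
/// everything else in this sketch is assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TlsTrustDecision {
    /// Plaintext transport: no TLS configuration is built.
    None,
    /// Verify against the CA file supplied by the caller.
    PinnedCa,
    /// Verify against the operating system's trust roots.
    SystemRoots,
    /// TLS requested without any trust anchor: refuse to connect.
    RejectNoCa,
}

/// Pure function of the option flags, so it can be tested without a handshake.
pub fn tls_trust_decision(
    plaintext: bool,
    has_ca_file: bool,
    require_certificate_validation: bool,
) -> TlsTrustDecision {
    if plaintext {
        TlsTrustDecision::None
    } else if has_ca_file {
        // Assumption: a pinned CA takes precedence when both options are set.
        TlsTrustDecision::PinnedCa
    } else if require_certificate_validation {
        TlsTrustDecision::SystemRoots
    } else {
        TlsTrustDecision::RejectNoCa
    }
}
```

Keeping the decision separate from building the TLS configuration means each posture, including the rejection case, can be covered by a plain unit test.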
|
||||
|
||||
## Streaming
|
||||
|
||||
Expose event streams as a `Stream<Item = Result<MxEvent, Error>>`. Dropping the
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
name = "mxgw-cli"
|
||||
version.workspace = true
|
||||
edition.workspace = true
|
||||
publish.workspace = true
|
||||
publish = false
|
||||
|
||||
[[bin]]
|
||||
name = "mxgw"
|
||||
|
||||
@@ -426,6 +426,11 @@ struct ConnectionArgs {
|
||||
ca_file: Option<PathBuf>,
|
||||
#[arg(long)]
|
||||
server_name_override: Option<String>,
|
||||
/// Verify the server certificate against the system trust roots even
|
||||
/// without a pinned CA. The Rust client's default is to require a CA
|
||||
/// file (see `--ca-file`); set this flag to use system roots instead.
|
||||
#[arg(long)]
|
||||
require_certificate_validation: bool,
|
||||
#[arg(long, default_value_t = 10)]
|
||||
connect_timeout_seconds: u64,
|
||||
#[arg(long, default_value_t = 30)]
|
||||
@@ -453,6 +458,9 @@ impl ConnectionArgs {
|
||||
if let Some(server_name_override) = &self.server_name_override {
|
||||
options = options.with_server_name_override(server_name_override);
|
||||
}
|
||||
if self.require_certificate_validation {
|
||||
options = options.with_require_certificate_validation(true);
|
||||
}
|
||||
|
||||
options
|
||||
}
|
||||
@@ -1718,7 +1726,7 @@ fn event_value_to_json(value: &ProtoMxValue) -> Value {
|
||||
}
|
||||
|
||||
/// Render a streamed [`AlarmFeedMessage`] as a terse one-line summary that
|
||||
/// distinguishes the three `payload` oneof cases.
|
||||
/// distinguishes the four `payload` oneof cases.
|
||||
fn alarm_feed_message_summary(message: &AlarmFeedMessage) -> String {
|
||||
match &message.payload {
|
||||
Some(alarm_feed_message::Payload::ActiveAlarm(snapshot)) => {
|
||||
@@ -1738,6 +1746,14 @@ fn alarm_feed_message_summary(message: &AlarmFeedMessage) -> String {
|
||||
AlarmEnumName::transition_kind(transition.transition_kind)
|
||||
)
|
||||
}
|
||||
Some(alarm_feed_message::Payload::ProviderStatus(status)) => {
|
||||
format!(
|
||||
"provider-status mode={} degraded={} reason={:?}",
|
||||
AlarmEnumName::provider_mode(status.mode),
|
||||
status.degraded,
|
||||
status.reason
|
||||
)
|
||||
}
|
||||
None => "(empty)".to_owned(),
|
||||
}
|
||||
}
|
||||
@@ -1776,6 +1792,17 @@ fn alarm_feed_message_to_json(message: &AlarmFeedMessage) -> Value {
|
||||
"description": transition.description,
|
||||
}
|
||||
}),
|
||||
Some(alarm_feed_message::Payload::ProviderStatus(status)) => json!({
|
||||
"providerStatus": {
|
||||
"mode": AlarmEnumName::provider_mode(status.mode),
|
||||
"degraded": status.degraded,
|
||||
"reason": status.reason,
|
||||
"since": status.since.as_ref().map(|ts| json!({
|
||||
"seconds": ts.seconds,
|
||||
"nanos": ts.nanos,
|
||||
})),
|
||||
}
|
||||
}),
|
||||
None => Value::Null,
|
||||
}
|
||||
}
|
||||
@@ -1798,6 +1825,13 @@ impl AlarmEnumName {
|
||||
.map(|kind| kind.as_str_name().to_owned())
|
||||
.unwrap_or_else(|_| value.to_string())
|
||||
}
|
||||
|
||||
fn provider_mode(value: i32) -> String {
|
||||
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::AlarmProviderMode;
|
||||
AlarmProviderMode::try_from(value)
|
||||
.map(|mode| mode.as_str_name().to_owned())
|
||||
.unwrap_or_else(|_| value.to_string())
|
||||
}
|
||||
}
|
||||
|
||||
/// Render an [`AcknowledgeAlarmReply`] as a terse line or a JSON document.
|
||||
@@ -2157,4 +2191,40 @@ mod tests {
|
||||
assert_eq!(frac.seconds, utc.seconds);
|
||||
assert_eq!(frac.nanos, 250_000_000);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn alarm_feed_provider_status_renders_in_summary_and_json() {
|
||||
use zb_mom_ww_mxgateway_client::generated::mxaccess_gateway::v1::{
|
||||
alarm_feed_message, AlarmFeedMessage, AlarmProviderMode, AlarmProviderStatus,
|
||||
};
|
||||
|
||||
let message = AlarmFeedMessage {
|
||||
payload: Some(alarm_feed_message::Payload::ProviderStatus(
|
||||
AlarmProviderStatus {
|
||||
mode: AlarmProviderMode::Subtag as i32,
|
||||
degraded: true,
|
||||
reason: "alarmmgr unavailable".to_owned(),
|
||||
since: Some(prost_types::Timestamp {
|
||||
seconds: 1_777_995_000,
|
||||
nanos: 0,
|
||||
}),
|
||||
},
|
||||
)),
|
||||
};
|
||||
|
||||
let summary = super::alarm_feed_message_summary(&message);
|
||||
assert!(summary.contains("provider-status"), "summary: {summary}");
|
||||
assert!(
|
||||
summary.contains("ALARM_PROVIDER_MODE_SUBTAG"),
|
||||
"summary: {summary}"
|
||||
);
|
||||
assert!(summary.contains("degraded=true"), "summary: {summary}");
|
||||
|
||||
let value = super::alarm_feed_message_to_json(&message);
|
||||
let provider = &value["providerStatus"];
|
||||
assert_eq!(provider["mode"], "ALARM_PROVIDER_MODE_SUBTAG");
|
||||
assert_eq!(provider["degraded"], true);
|
||||
assert_eq!(provider["reason"], "alarmmgr unavailable");
|
||||
assert_eq!(provider["since"]["seconds"], 1_777_995_000_i64);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -6,10 +6,8 @@
|
||||
//! code should prefer [`GatewayClient::open_session`] and the [`Session`]
|
||||
//! handle it returns, rather than the `*_raw` methods.
|
||||
|
||||
use std::fs;
|
||||
|
||||
use tonic::codegen::InterceptedService;
|
||||
use tonic::transport::{Certificate, Channel, ClientTlsConfig};
|
||||
use tonic::transport::Channel;
|
||||
use tonic::Request;
|
||||
|
||||
use crate::auth::AuthInterceptor;
|
||||
@@ -21,7 +19,7 @@ use crate::generated::mxaccess_gateway::v1::{
|
||||
OpenSessionReply, OpenSessionRequest, QueryActiveAlarmsRequest, StreamAlarmsRequest,
|
||||
StreamEventsRequest,
|
||||
};
|
||||
use crate::options::ClientOptions;
|
||||
use crate::options::{build_tls_config, ClientOptions};
|
||||
use crate::session::Session;
|
||||
|
||||
/// Generated gateway client wrapped in the auth interceptor that
|
||||
@@ -78,18 +76,7 @@ impl GatewayClient {
|
||||
})?;
|
||||
endpoint = endpoint.connect_timeout(options.connect_timeout());
|
||||
|
||||
if !options.plaintext() {
|
||||
let mut tls = ClientTlsConfig::new();
|
||||
if let Some(server_name) = options.server_name_override() {
|
||||
tls = tls.domain_name(server_name.to_owned());
|
||||
}
|
||||
if let Some(ca_file) = options.ca_file() {
|
||||
let certificate = fs::read(ca_file).map_err(|source| Error::InvalidEndpoint {
|
||||
endpoint: options.endpoint().to_owned(),
|
||||
detail: format!("failed to read CA file {}: {source}", ca_file.display()),
|
||||
})?;
|
||||
tls = tls.ca_certificate(Certificate::from_pem(certificate));
|
||||
}
|
||||
if let Some(tls) = build_tls_config(&options)? {
|
||||
endpoint = endpoint.tls_config(tls)?;
|
||||
}
|
||||
|
||||
|
||||
+539
-20
@@ -5,23 +5,143 @@
|
||||
//! read-only RPCs as Rust async methods. Generated Galaxy proto types are
|
||||
//! re-exported through [`crate::generated::galaxy_repository::v1`].
|
||||
|
||||
use std::fs;
|
||||
use std::collections::HashSet;
|
||||
use std::sync::Arc;
|
||||
|
||||
use prost_types::Timestamp;
|
||||
use tokio::sync::Mutex as AsyncMutex;
|
||||
use tonic::codegen::InterceptedService;
|
||||
use tonic::transport::{Certificate, Channel, ClientTlsConfig};
|
||||
use tonic::transport::Channel;
|
||||
use tonic::Request;
|
||||
|
||||
use crate::auth::AuthInterceptor;
|
||||
use crate::error::Error;
|
||||
use crate::generated::galaxy_repository::v1::galaxy_repository_client::GalaxyRepositoryClient;
|
||||
use crate::generated::galaxy_repository::v1::{
|
||||
DeployEvent, DiscoverHierarchyRequest, GalaxyObject, GetLastDeployTimeRequest,
|
||||
TestConnectionRequest, WatchDeployEventsRequest,
|
||||
browse_children_request, BrowseChildrenReply, BrowseChildrenRequest, DeployEvent,
|
||||
DiscoverHierarchyRequest, GalaxyObject, GetLastDeployTimeRequest, TestConnectionRequest,
|
||||
WatchDeployEventsRequest,
|
||||
};
|
||||
use crate::options::ClientOptions;
|
||||
use crate::options::{build_tls_config, ClientOptions};
|
||||
|
||||
const DISCOVER_HIERARCHY_PAGE_SIZE: i32 = 5000;
|
||||
const BROWSE_CHILDREN_PAGE_SIZE: i32 = 500;
|
||||
|
||||
/// Optional filter set forwarded to `GalaxyRepository.BrowseChildren`.
|
||||
///
|
||||
/// Mirrors the request-level filters on the wire: combined with AND so a child
|
||||
/// only appears when it satisfies every populated criterion. Construct via
|
||||
/// [`BrowseChildrenOptions::default`] and tweak the fields you care about.
|
||||
#[derive(Debug, Clone, Default)]
|
||||
pub struct BrowseChildrenOptions {
|
||||
/// Restrict to objects whose `category_id` matches one of the supplied
|
||||
/// Galaxy category identifiers. Empty means "no restriction".
|
||||
pub category_ids: Vec<i32>,
|
||||
/// Restrict to objects whose template chain contains every supplied
|
||||
/// template name (case-sensitive substring match on each entry).
|
||||
pub template_chain_contains: Vec<String>,
|
||||
/// Restrict to objects whose tag name matches the supplied glob (SQL
|
||||
/// `LIKE`-style on the server). `None` means "no glob filter".
|
||||
pub tag_name_glob: Option<String>,
|
||||
/// Optional tri-state hint for whether to populate `GalaxyObject.attributes`
|
||||
/// on returned children. `None` falls back to the server default.
|
||||
pub include_attributes: Option<bool>,
|
||||
/// When `true`, only return children that own at least one alarm-bearing
|
||||
/// attribute (matches `DiscoverHierarchy` semantics).
|
||||
pub alarm_bearing_only: bool,
|
||||
/// When `true`, only return children that own at least one historized
|
||||
/// attribute (matches `DiscoverHierarchy` semantics).
|
||||
pub historized_only: bool,
|
||||
}
|
||||
|
||||
/// Lazy hierarchy node used by the walker built on top of `BrowseChildren`.
///
/// A node owns its [`GalaxyObject`], a hint as to whether the server believes
/// it has at least one matching descendant under the active filter set, and an
/// internal `is_expanded` flag protected by an async mutex. Calling [`expand`]
/// the first time issues a paged `BrowseChildren` RPC; subsequent calls are
/// no-ops so callers can poll without re-hitting the server.
///
/// `LazyBrowseNode` is cheap to clone: clones share state through an
/// internal `Arc`, so expanding one clone makes the children visible to every
/// other clone.
///
/// [`expand`]: LazyBrowseNode::expand
pub struct LazyBrowseNode {
    inner: Arc<LazyBrowseNodeInner>,
}

impl Clone for LazyBrowseNode {
    fn clone(&self) -> Self {
        Self {
            inner: Arc::clone(&self.inner),
        }
    }
}

struct LazyBrowseNodeInner {
    client: GalaxyClient,
    object: GalaxyObject,
    has_children_hint: bool,
    options: BrowseChildrenOptions,
    state: AsyncMutex<LazyBrowseNodeState>,
}

struct LazyBrowseNodeState {
    children: Vec<LazyBrowseNode>,
    is_expanded: bool,
}

impl LazyBrowseNode {
    /// Borrow the [`GalaxyObject`] returned by the server for this node.
    pub fn object(&self) -> &GalaxyObject {
        &self.inner.object
    }

    /// Server-supplied hint: `true` when the child likely has at least one
    /// further matching descendant. Useful to decide whether a UI should draw
    /// an expand triangle without issuing the RPC up front.
    pub fn has_children_hint(&self) -> bool {
        self.inner.has_children_hint
    }

    /// Snapshot of the currently-known children. Empty until [`expand`] has
    /// run at least once.
    ///
    /// [`expand`]: LazyBrowseNode::expand
    pub async fn children(&self) -> Vec<LazyBrowseNode> {
        self.inner.state.lock().await.children.clone()
    }

    /// Returns `true` once [`expand`] has populated this node's children.
    ///
    /// [`expand`]: LazyBrowseNode::expand
    pub async fn is_expanded(&self) -> bool {
        self.inner.state.lock().await.is_expanded
    }

    /// Populate this node's children by issuing a paged `BrowseChildren` RPC.
    /// Subsequent calls are no-ops: the cached children stay in place and no
    /// additional RPC is issued.
    pub async fn expand(&self) -> Result<(), Error> {
        let mut state = self.inner.state.lock().await;
        if state.is_expanded {
            return Ok(());
        }

        let mut client = self.inner.client.clone();
        let new_children = client
            .browse_children_inner(
                Some(self.inner.object.gobject_id),
                self.inner.options.clone(),
            )
            .await?;

        state.children = new_children;
        state.is_expanded = true;
        Ok(())
    }
}

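The caching contract of `LazyBrowseNode::expand` (fetch once, return early afterwards, stay unexpanded on failure so the caller can retry) can be sketched without the async runtime. The block below is a simplified, synchronous stand-in, not the crate's implementation: `Node`, `State`, and the `fetch` closure are hypothetical names, and `std::sync::Mutex` replaces the async mutex used in the diff.

```rust
use std::sync::Mutex;

#[derive(Default)]
struct State {
    children: Vec<String>,
    is_expanded: bool,
}

/// Simplified, synchronous stand-in for `LazyBrowseNode`.
#[derive(Default)]
pub struct Node {
    state: Mutex<State>,
}

impl Node {
    /// Run `fetch` until it succeeds once. A failed fetch leaves the node
    /// unexpanded so a later call can retry.
    pub fn expand<F>(&self, fetch: F) -> Result<(), String>
    where
        F: FnOnce() -> Result<Vec<String>, String>,
    {
        // The lock is held across the fetch, so concurrent callers queue up
        // instead of issuing duplicate fetches.
        let mut state = self.state.lock().unwrap();
        if state.is_expanded {
            return Ok(()); // cached: `fetch` is never called
        }
        // `?` returns early on error, before `is_expanded` is set.
        state.children = fetch()?;
        state.is_expanded = true;
        Ok(())
    }

    pub fn children(&self) -> Vec<String> {
        self.state.lock().unwrap().children.clone()
    }

    pub fn is_expanded(&self) -> bool {
        self.state.lock().unwrap().is_expanded
    }
}
```

Setting the flag only after the fetch succeeds is what makes the retry behaviour fall out for free.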
/// Convenience alias for the generated Galaxy client wrapped in the
/// authentication interceptor.
@@ -62,18 +182,7 @@ impl GalaxyClient {
            })?;
        endpoint = endpoint.connect_timeout(options.connect_timeout());

        if !options.plaintext() {
            let mut tls = ClientTlsConfig::new();
            if let Some(server_name) = options.server_name_override() {
                tls = tls.domain_name(server_name.to_owned());
            }
            if let Some(ca_file) = options.ca_file() {
                let certificate = fs::read(ca_file).map_err(|source| Error::InvalidEndpoint {
                    endpoint: options.endpoint().to_owned(),
                    detail: format!("failed to read CA file {}: {source}", ca_file.display()),
                })?;
                tls = tls.ca_certificate(Certificate::from_pem(certificate));
            }
        if let Some(tls) = build_tls_config(&options)? {
            endpoint = endpoint.tls_config(tls)?;
        }

@@ -172,6 +281,99 @@ impl GalaxyClient {
        }
    }

    /// Browse the top-level (root) objects of the hierarchy as
    /// [`LazyBrowseNode`] instances. Pass [`BrowseChildrenOptions`] to
    /// restrict the result set; the same filter is reused when callers expand
    /// any returned node.
    pub async fn browse(
        &mut self,
        options: Option<BrowseChildrenOptions>,
    ) -> Result<Vec<LazyBrowseNode>, Error> {
        let effective = options.unwrap_or_default();
        self.browse_children_inner(None, effective).await
    }

    /// Issue a single `BrowseChildren` RPC and return the raw reply. Callers
    /// that want to drive paging themselves (or inspect the cache sequence)
    /// use this; high-level walking goes through [`browse`] and
    /// [`LazyBrowseNode::expand`].
    ///
    /// [`browse`]: GalaxyClient::browse
    pub async fn browse_children_raw(
        &mut self,
        request: BrowseChildrenRequest,
    ) -> Result<BrowseChildrenReply, Error> {
        let response = self
            .inner
            .browse_children(self.unary_request(request))
            .await?;
        Ok(response.into_inner())
    }

    pub(crate) async fn browse_children_inner(
        &mut self,
        parent_gobject_id: Option<i32>,
        options: BrowseChildrenOptions,
    ) -> Result<Vec<LazyBrowseNode>, Error> {
        let mut nodes = Vec::new();
        let mut page_token = String::new();
        let mut seen_page_tokens: HashSet<String> = HashSet::new();
        loop {
            let parent = parent_gobject_id.map(browse_children_request::Parent::ParentGobjectId);
            let request = BrowseChildrenRequest {
                page_size: BROWSE_CHILDREN_PAGE_SIZE,
                page_token: page_token.clone(),
                category_ids: options.category_ids.clone(),
                template_chain_contains: options.template_chain_contains.clone(),
                tag_name_glob: options.tag_name_glob.clone().unwrap_or_default(),
                include_attributes: options.include_attributes,
                alarm_bearing_only: options.alarm_bearing_only,
                historized_only: options.historized_only,
                parent,
            };

            let reply = self.browse_children_raw(request).await?;
            let hints = reply.child_has_children;
            for (index, object) in reply.children.into_iter().enumerate() {
                let hint = hints.get(index).copied().unwrap_or(false);
                nodes.push(self.make_lazy_node(object, hint, options.clone()));
            }

            page_token = reply.next_page_token;
            if page_token.is_empty() {
                return Ok(nodes);
            }
            if !seen_page_tokens.insert(page_token.clone()) {
                return Err(Error::InvalidArgument {
                    name: "page_token".to_owned(),
                    detail: format!(
                        "galaxy browse children returned repeated page token `{page_token}`"
                    ),
                });
            }
        }
    }

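The paging loop in `browse_children_inner` keeps requesting pages until the server returns an empty `next_page_token`, and it aborts if a token repeats so a misbehaving server cannot trap the client in an endless loop. A minimal, standard-library-only sketch of that loop is shown here. `Page`, `collect_pages`, and the `fetch` closure are hypothetical stand-ins for the reply type and the RPC call, and the error type is simplified to `String`.

```rust
use std::collections::HashSet;

/// One page of results plus the token for the next page ("" means last page).
pub struct Page {
    pub items: Vec<i32>,
    pub next_page_token: String,
}

/// Collect every page from `fetch`, which stands in for the RPC call.
/// Returns an error if the server hands back a token it already issued.
pub fn collect_pages<F>(mut fetch: F) -> Result<Vec<i32>, String>
where
    F: FnMut(&str) -> Page,
{
    let mut items = Vec::new();
    let mut page_token = String::new();
    let mut seen: HashSet<String> = HashSet::new();
    loop {
        let page = fetch(&page_token);
        items.extend(page.items);
        page_token = page.next_page_token;
        if page_token.is_empty() {
            return Ok(items);
        }
        // `insert` returns false when the token was already in the set.
        if !seen.insert(page_token.clone()) {
            return Err(format!("repeated page token `{page_token}`"));
        }
    }
}
```

Tracking seen tokens costs one small set per call and turns a potential infinite loop into an ordinary error.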
    fn make_lazy_node(
        &self,
        object: GalaxyObject,
        has_children_hint: bool,
        options: BrowseChildrenOptions,
    ) -> LazyBrowseNode {
        LazyBrowseNode {
            inner: Arc::new(LazyBrowseNodeInner {
                client: self.clone(),
                object,
                has_children_hint,
                options,
                state: AsyncMutex::new(LazyBrowseNodeState {
                    children: Vec::new(),
                    is_expanded: false,
                }),
            }),
        }
    }

    /// Subscribe to the server-streamed deploy-event feed.
    ///
    /// The server emits a bootstrap event describing the current cache state
@@ -234,9 +436,10 @@ mod tests {
        GalaxyRepository, GalaxyRepositoryServer,
    };
    use crate::generated::galaxy_repository::v1::{
        DeployEvent, DiscoverHierarchyReply, DiscoverHierarchyRequest, GalaxyAttribute,
        GalaxyObject, GetLastDeployTimeReply, GetLastDeployTimeRequest, TestConnectionReply,
        TestConnectionRequest, WatchDeployEventsRequest,
        BrowseChildrenReply, BrowseChildrenRequest, DeployEvent, DiscoverHierarchyReply,
        DiscoverHierarchyRequest, GalaxyAttribute, GalaxyObject, GetLastDeployTimeReply,
        GetLastDeployTimeRequest, TestConnectionReply, TestConnectionRequest,
        WatchDeployEventsRequest,
    };

    type DeployEventTx = mpsc::Sender<Result<DeployEvent, Status>>;
@@ -249,6 +452,9 @@ mod tests {
        objects: Mutex<Vec<GalaxyObject>>,
        discover_requests: Mutex<Vec<DiscoverHierarchyRequest>>,
        discover_replies: Mutex<std::collections::VecDeque<DiscoverHierarchyReply>>,
        browse_children_calls: Mutex<Vec<BrowseChildrenRequest>>,
        browse_children_replies: Mutex<std::collections::VecDeque<BrowseChildrenReply>>,
        browse_children_errors: Mutex<Vec<Status>>,
        watch_requests: Mutex<Vec<WatchDeployEventsRequest>>,
        watch_events: Mutex<Vec<DeployEvent>>,
        watch_senders: Mutex<Vec<DeployEventTx>>,
@@ -306,6 +512,28 @@ mod tests {
            }))
        }

        async fn browse_children(
            &self,
            request: Request<BrowseChildrenRequest>,
        ) -> Result<Response<BrowseChildrenReply>, Status> {
            self.state
                .browse_children_calls
                .lock()
                .unwrap()
                .push(request.into_inner());
            if let Some(error) = self.state.browse_children_errors.lock().unwrap().pop() {
                return Err(error);
            }
            let reply = self
                .state
                .browse_children_replies
                .lock()
                .unwrap()
                .pop_front()
                .unwrap_or_default();
            Ok(Response::new(reply))
        }

        type WatchDeployEventsStream =
            Pin<Box<dyn tokio_stream::Stream<Item = Result<DeployEvent, Status>> + Send + 'static>>;

@@ -695,4 +923,295 @@ mod tests {
            "drop signal channel closed unexpectedly"
        );
    }

    fn browse_obj(gid: i32, tag: &str, is_area: bool) -> GalaxyObject {
        GalaxyObject {
            gobject_id: gid,
            tag_name: tag.to_owned(),
            contained_name: String::new(),
            browse_name: tag.to_owned(),
            parent_gobject_id: 0,
            is_area,
            category_id: 0,
            hosted_by_gobject_id: 0,
            template_chain: Vec::new(),
            attributes: Vec::new(),
        }
    }

    fn build_browse_reply(
        children: Vec<GalaxyObject>,
        child_has_children: Vec<bool>,
        cache_sequence: u64,
    ) -> BrowseChildrenReply {
        BrowseChildrenReply {
            total_child_count: children.len() as i32,
            cache_sequence,
            children,
            child_has_children,
            next_page_token: String::new(),
        }
    }

    #[tokio::test]
    async fn browse_no_parent_returns_roots() {
        let state = Arc::new(FakeState::default());
        state
            .browse_children_replies
            .lock()
            .unwrap()
            .push_back(build_browse_reply(
                vec![browse_obj(1, "Area_A", true), browse_obj(2, "Area_B", true)],
                vec![true, false],
                7,
            ));
        let endpoint = spawn_fake(state.clone()).await;
        let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
            .await
            .unwrap();

        let roots = client.browse(None).await.unwrap();

        assert_eq!(roots.len(), 2);
        assert_eq!(roots[0].object().tag_name, "Area_A");
        assert!(roots[0].has_children_hint());
        assert_eq!(roots[1].object().tag_name, "Area_B");
        assert!(!roots[1].has_children_hint());

        let calls = state.browse_children_calls.lock().unwrap();
        assert_eq!(calls.len(), 1);
        assert!(
            calls[0].parent.is_none(),
            "root browse must send an empty parent oneof, got {:?}",
            calls[0].parent
        );
    }

    #[tokio::test]
    async fn browse_expand_populates_children_and_marks_expanded() {
        let state = Arc::new(FakeState::default());
        // First call: roots.
        state
            .browse_children_replies
            .lock()
            .unwrap()
            .push_back(build_browse_reply(
                vec![browse_obj(10, "Area_A", true)],
                vec![true],
                1,
            ));
        // Second call: children of gobject 10.
        state
            .browse_children_replies
            .lock()
            .unwrap()
            .push_back(build_browse_reply(
                vec![browse_obj(11, "Receiver_1", false)],
                vec![false],
                1,
            ));
        let endpoint = spawn_fake(state.clone()).await;
        let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
            .await
            .unwrap();

        let roots = client.browse(None).await.unwrap();
        let root = roots.into_iter().next().expect("at least one root");
        assert!(!root.is_expanded().await);

        root.expand().await.unwrap();

        assert!(root.is_expanded().await);
        let children = root.children().await;
        assert_eq!(children.len(), 1);
        assert_eq!(children[0].object().tag_name, "Receiver_1");

        let calls = state.browse_children_calls.lock().unwrap();
        assert_eq!(calls.len(), 2);
        let expand_call = &calls[1];
        match expand_call.parent.as_ref().expect("expand sends parent") {
            browse_children_request::Parent::ParentGobjectId(id) => assert_eq!(*id, 10),
            other => panic!("expected ParentGobjectId variant, got {other:?}"),
        }
    }

    #[tokio::test]
    async fn browse_expand_idempotent_no_second_rpc() {
        let state = Arc::new(FakeState::default());
        state
            .browse_children_replies
            .lock()
            .unwrap()
            .push_back(build_browse_reply(
                vec![browse_obj(20, "Area_X", true)],
                vec![true],
                1,
            ));
        state
            .browse_children_replies
            .lock()
            .unwrap()
            .push_back(build_browse_reply(
                vec![browse_obj(21, "Leaf", false)],
                vec![false],
                1,
            ));
        let endpoint = spawn_fake(state.clone()).await;
        let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
            .await
            .unwrap();

        let roots = client.browse(None).await.unwrap();
        let root = roots.into_iter().next().unwrap();
        root.expand().await.unwrap();
        let after_first = state.browse_children_calls.lock().unwrap().len();

        // Calling expand a second time must NOT issue a new RPC.
        root.expand().await.unwrap();

        let after_second = state.browse_children_calls.lock().unwrap().len();
        assert_eq!(
            after_first, after_second,
            "expand should be idempotent: no extra RPC the second time"
        );
        assert_eq!(root.children().await.len(), 1);
    }

    #[tokio::test]
    async fn browse_expand_unknown_parent_returns_not_found_error() {
        let state = Arc::new(FakeState::default());
        // Root browse succeeds.
        state
            .browse_children_replies
            .lock()
            .unwrap()
            .push_back(build_browse_reply(
                vec![browse_obj(99, "GhostArea", true)],
                vec![true],
                1,
            ));
        let endpoint = spawn_fake(state.clone()).await;
        let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
            .await
            .unwrap();

        let roots = client.browse(None).await.unwrap();
        let root = roots.into_iter().next().unwrap();

        // Seed the NotFound only AFTER the root call so the FakeGalaxy's
        // error stack doesn't intercept the initial browse.
        state
            .browse_children_errors
            .lock()
            .unwrap()
            .push(Status::not_found("parent gobject 99 not present in cache"));

        let error = root.expand().await.unwrap_err();

        match &error {
            Error::Status(status) => {
                assert_eq!(status.code(), tonic::Code::NotFound);
            }
            other => panic!("expected Error::Status(NotFound), got {other:?}"),
        }
        // Failed expand must NOT mark the node as expanded — caller can retry.
        assert!(!root.is_expanded().await);
        assert!(root.children().await.is_empty());
    }

    #[tokio::test]
    async fn browse_expand_multi_page_gathers_all_pages() {
        let state = Arc::new(FakeState::default());
        // First reply: roots.
        state
            .browse_children_replies
            .lock()
            .unwrap()
            .push_back(build_browse_reply(
                vec![browse_obj(30, "Plant", true)],
                vec![true],
                5,
            ));
        // Second reply: page 1 of children, with a next_page_token.
        let mut page_one = build_browse_reply(
            vec![
                browse_obj(31, "Child_A", false),
                browse_obj(32, "Child_B", false),
            ],
            vec![false, false],
            5,
        );
        page_one.next_page_token = "cursor-2".to_owned();
        page_one.total_child_count = 3;
        state
            .browse_children_replies
            .lock()
            .unwrap()
            .push_back(page_one);
        // Third reply: page 2 of children, with no next page.
        state
            .browse_children_replies
            .lock()
            .unwrap()
            .push_back(build_browse_reply(
                vec![browse_obj(33, "Child_C", false)],
                vec![false],
                5,
            ));
        let endpoint = spawn_fake(state.clone()).await;
        let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
            .await
            .unwrap();

        let roots = client.browse(None).await.unwrap();
        let root = roots.into_iter().next().unwrap();
        root.expand().await.unwrap();

        let children = root.children().await;
        assert_eq!(children.len(), 3);
        assert_eq!(children[0].object().tag_name, "Child_A");
        assert_eq!(children[1].object().tag_name, "Child_B");
        assert_eq!(children[2].object().tag_name, "Child_C");

        let calls = state.browse_children_calls.lock().unwrap();
        // 1 root call + 2 paged expand calls = 3 total.
        assert_eq!(calls.len(), 3);
        assert_eq!(calls[1].page_token, "");
        assert_eq!(calls[2].page_token, "cursor-2");
    }

    #[tokio::test]
    async fn browse_with_filter_forwards_to_request() {
        let state = Arc::new(FakeState::default());
        state
            .browse_children_replies
            .lock()
            .unwrap()
            .push_back(build_browse_reply(Vec::new(), Vec::new(), 1));
        let endpoint = spawn_fake(state.clone()).await;
        let mut client = GalaxyClient::connect(ClientOptions::new(endpoint))
            .await
            .unwrap();

        let options = BrowseChildrenOptions {
            category_ids: vec![3, 5],
            template_chain_contains: vec!["$DelmiaReceiver".to_owned()],
            tag_name_glob: Some("Recv_*".to_owned()),
            include_attributes: Some(true),
            alarm_bearing_only: true,
            historized_only: false,
        };

        let _ = client.browse(Some(options)).await.unwrap();

        let calls = state.browse_children_calls.lock().unwrap();
        assert_eq!(calls.len(), 1);
        let req = &calls[0];
        assert_eq!(req.category_ids, vec![3, 5]);
        assert_eq!(req.template_chain_contains, vec!["$DelmiaReceiver"]);
        assert_eq!(req.tag_name_glob, "Recv_*");
        assert_eq!(req.include_attributes, Some(true));
        assert!(req.alarm_bearing_only);
        assert!(!req.historized_only);
    }
}

@@ -3,10 +3,14 @@
//! chain of `with_*` setters; the `Debug` impl redacts the API key.

use std::fmt;
use std::fs;
use std::path::PathBuf;
use std::time::Duration;

use tonic::transport::{Certificate, ClientTlsConfig};

use crate::auth::ApiKey;
use crate::error::Error;

const DEFAULT_MAX_GRPC_MESSAGE_BYTES: usize = 16 * 1024 * 1024;

@@ -22,6 +26,7 @@ pub struct ClientOptions {
    api_key: Option<ApiKey>,
    plaintext: bool,
    ca_file: Option<PathBuf>,
    require_certificate_validation: bool,
    server_name_override: Option<String>,
    connect_timeout: Duration,
    call_timeout: Duration,
@@ -38,6 +43,7 @@ impl ClientOptions {
            api_key: None,
            plaintext: true,
            ca_file: None,
            require_certificate_validation: false,
            server_name_override: None,
            connect_timeout: Duration::from_secs(10),
            call_timeout: Duration::from_secs(30),
@@ -67,6 +73,28 @@ impl ClientOptions {
        self
    }

    /// Require TLS certificate verification even without a pinned CA.
    /// Defaults to `false`. Setting a CA file always verifies against that CA.
    ///
    /// Note for Rust: tonic 0.13's `ClientTlsConfig` exposes no hook for a
    /// custom rustls verifier, so the Rust client cannot accept an *arbitrary*
    /// self-signed certificate the way the other clients do. With the default
    /// (`false`) and no pinned CA, [`crate::client::GatewayClient::connect`]
    /// rejects the TLS connection and asks for a CA file. There are two
    /// supported TLS paths:
    ///
    /// - Pin the gateway certificate with [`ClientOptions::with_ca_file`] (the
    ///   lenient pin-only path; works for a self-signed gateway cert).
    /// - Set this `true` to verify against the operating system's trust roots
    ///   (`tls-native-roots`). This only succeeds for a certificate that chains
    ///   to a root the host already trusts, so it is for gateways fronted by a
    ///   publicly- or enterprise-CA-issued certificate, not a bare self-signed
    ///   one.
    pub fn with_require_certificate_validation(mut self, require: bool) -> Self {
        self.require_certificate_validation = require;
        self
    }

    /// Override the SNI/server name used during the TLS handshake. Useful
    /// when the dial-target host name does not match the certificate.
    pub fn with_server_name_override(mut self, server_name_override: impl Into<String>) -> Self {
@@ -121,6 +149,12 @@ impl ClientOptions {
        self.ca_file.as_ref()
    }

    /// Whether TLS certificate verification is required even without a pinned
    /// CA. See [`ClientOptions::with_require_certificate_validation`].
    pub fn require_certificate_validation(&self) -> bool {
        self.require_certificate_validation
    }

    /// Optional SNI / server-name override for TLS handshakes.
    pub fn server_name_override(&self) -> Option<&str> {
        self.server_name_override.as_deref()
@@ -147,6 +181,114 @@ impl ClientOptions {
    }
}

/// Where the TLS handshake gets its trust anchors for a given set of options.
/// Computed by [`tls_trust_decision`] and applied by [`build_tls_config`];
/// split out so the trust posture is unit-testable without a live handshake.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) enum TlsTrustDecision {
    /// Plaintext transport: no TLS, no trust anchors.
    None,
    /// Validate against the CA pinned with [`ClientOptions::with_ca_file`].
    PinnedCa,
    /// Validate against the operating system's trust roots
    /// (`require_certificate_validation == true`, no pinned CA).
    SystemRoots,
    /// Reject up front: TLS requested with neither a pinned CA nor strict
    /// verification (the Rust pin-only lenient default).
    RejectNoCa,
}

/// Decide the TLS trust posture from `options` without touching the filesystem
/// or the network.
pub(crate) fn tls_trust_decision(options: &ClientOptions) -> TlsTrustDecision {
    if options.plaintext() {
        TlsTrustDecision::None
    } else if options.ca_file().is_some() {
        TlsTrustDecision::PinnedCa
    } else if options.require_certificate_validation() {
        TlsTrustDecision::SystemRoots
    } else {
        TlsTrustDecision::RejectNoCa
    }
}

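The order of the checks in `tls_trust_decision` defines a precedence: plaintext wins over everything, a pinned CA wins over the strict opt-in, and only the combination "TLS, no CA, not strict" is rejected. The sketch below restates that logic over three plain booleans and enumerates all eight combinations. `Trust`, `decide`, and `truth_table` are hypothetical names that mirror the diff so the block compiles on its own.

```rust
/// Local mirror of `TlsTrustDecision`, repeated so the sketch compiles alone.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    None,
    PinnedCa,
    SystemRoots,
    RejectNoCa,
}

/// Same check order as `tls_trust_decision`: plaintext first, then a pinned
/// CA, then the strict opt-in; anything else is rejected.
pub fn decide(plaintext: bool, has_ca_file: bool, require_validation: bool) -> Trust {
    if plaintext {
        Trust::None
    } else if has_ca_file {
        Trust::PinnedCa
    } else if require_validation {
        Trust::SystemRoots
    } else {
        Trust::RejectNoCa
    }
}

/// Enumerate all eight flag combinations as (plaintext, ca, strict, outcome).
pub fn truth_table() -> Vec<(bool, bool, bool, Trust)> {
    let mut rows = Vec::new();
    for plaintext in [false, true] {
        for ca in [false, true] {
            for strict in [false, true] {
                rows.push((plaintext, ca, strict, decide(plaintext, ca, strict)));
            }
        }
    }
    rows
}
```

Note that a CA file combined with the strict flag still resolves to the pinned CA, which matches the doc comment "Setting a CA file always verifies against that CA".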
/// Build the [`ClientTlsConfig`] for a non-plaintext connection described by
/// `options`, applying the lenient-default guard that is the **Rust
/// pin-only exception**.
///
/// Returns `Ok(None)` when `options.plaintext()` is `true` (no TLS needed).
/// Returns `Ok(Some(tls))` when a valid TLS config can be assembled: either
/// pinned to the CA from [`ClientOptions::with_ca_file`], or, when
/// `require_certificate_validation` is set with no pinned CA, verifying against
/// the operating system's trust roots (`tls-native-roots`).
/// Returns `Err(Error::InvalidEndpoint)` when TLS is requested but no pinned
/// CA was provided and `require_certificate_validation` is `false`.
///
/// # Why the no-CA guard exists
///
/// `tonic` 0.13's `ClientTlsConfig` builds its rustls verifier inside a
/// crate-private connector and exposes no hook for a custom
/// `ServerCertVerifier`. The Rust client therefore cannot accept an *arbitrary*
/// self-signed certificate the way the other language clients do. Rather than
/// silently falling back to a verifier with no trust anchors (which rejects
/// every certificate with a confusing handshake error), the lenient default
/// rejects the configuration early with an actionable error. The strict opt-in
/// instead loads the system trust roots so a certificate chaining to an
/// already-trusted root validates.
pub(crate) fn build_tls_config(options: &ClientOptions) -> Result<Option<ClientTlsConfig>, Error> {
    let decision = tls_trust_decision(options);
    if decision == TlsTrustDecision::None {
        return Ok(None);
    }

    let mut tls = ClientTlsConfig::new();
    if let Some(server_name) = options.server_name_override() {
        tls = tls.domain_name(server_name.to_owned());
    }
    match decision {
        TlsTrustDecision::PinnedCa => {
            let ca_file = options.ca_file().expect("PinnedCa implies a CA file");
            let certificate = fs::read(ca_file).map_err(|source| Error::InvalidEndpoint {
                endpoint: options.endpoint().to_owned(),
                detail: format!("failed to read CA file {}: {source}", ca_file.display()),
            })?;
            tls = tls.ca_certificate(Certificate::from_pem(certificate));
        }
        TlsTrustDecision::SystemRoots => {
            // Strict opt-in with no pinned CA: verify against the OS trust
            // store. Without this the bare `ClientTlsConfig` carries zero
            // trust anchors and rejects every certificate, so the documented
            // "verify against the system trust roots" behaviour would be
            // unreachable. Only a certificate chaining to an already-trusted
            // root validates; a bare self-signed gateway cert still needs
            // `with_ca_file`.
            tls = tls.with_native_roots();
        }
        TlsTrustDecision::RejectNoCa => {
            // Lenient-default fallback (Rust pin-only exception): the Rust
            // client cannot accept an arbitrary self-signed cert. Pin the
            // gateway's CA, or opt into strict verification against the
            // system trust roots.
            //
            // Note: a server-name override affects SNI (the hostname sent in
            // the TLS ClientHello) but does NOT pin trust.
            return Err(Error::InvalidEndpoint {
                endpoint: options.endpoint().to_owned(),
                detail: "TLS requested without a pinned CA. The Rust client cannot accept an \
                         arbitrary self-signed certificate (tonic 0.13 exposes no custom \
                         rustls verifier). Pin the gateway certificate with \
                         ClientOptions::with_ca_file, or call \
                         ClientOptions::with_require_certificate_validation(true) to verify \
                         against the system trust roots. Note: a server-name override \
                         affects SNI but does not pin trust."
                    .to_owned(),
            });
        }
        TlsTrustDecision::None => unreachable!("handled above"),
    }
    Ok(Some(tls))
}

impl Default for ClientOptions {
    fn default() -> Self {
        Self::new("http://127.0.0.1:5000")
@@ -161,6 +303,10 @@ impl fmt::Debug for ClientOptions {
            .field("api_key", &self.api_key.as_ref().map(|_| "<redacted>"))
            .field("plaintext", &self.plaintext)
            .field("ca_file", &self.ca_file)
            .field(
                "require_certificate_validation",
                &self.require_certificate_validation,
            )
            .field("server_name_override", &self.server_name_override)
            .field("connect_timeout", &self.connect_timeout)
            .field("call_timeout", &self.call_timeout)
@@ -175,6 +321,8 @@ mod tests {
    use super::ClientOptions;
    use crate::auth::ApiKey;

    use super::{build_tls_config, tls_trust_decision, TlsTrustDecision};

    #[test]
    fn debug_redacts_api_key() {
        let options =
@@ -185,4 +333,47 @@ mod tests {
        assert!(debug.contains("<redacted>"));
        assert!(!debug.contains("mxgw_secret"));
    }

    #[test]
    fn plaintext_needs_no_tls() {
        let options = ClientOptions::new("http://127.0.0.1:5000").with_plaintext(true);
        assert_eq!(tls_trust_decision(&options), TlsTrustDecision::None);
        assert!(build_tls_config(&options).unwrap().is_none());
    }

    #[test]
    fn pinned_ca_uses_pinned_trust() {
        let options = ClientOptions::new("https://127.0.0.1:5000")
            .with_plaintext(false)
            .with_ca_file("/some/ca.pem");
        assert_eq!(tls_trust_decision(&options), TlsTrustDecision::PinnedCa);
    }

    #[test]
    fn strict_without_ca_uses_system_roots() {
        // Regression for Client.Rust-031: strict verification with no pinned CA
        // must verify against the system trust roots, not produce a config with
        // zero trust anchors. The trust decision proves roots are consulted; the
        // build then succeeds (no no-CA guard error) and emits a config.
        let options = ClientOptions::new("https://127.0.0.1:5000")
            .with_plaintext(false)
            .with_require_certificate_validation(true);

        assert_eq!(
            tls_trust_decision(&options),
            TlsTrustDecision::SystemRoots,
            "strict-no-CA must request the system trust roots"
        );
        assert!(
            build_tls_config(&options).unwrap().is_some(),
            "strict-no-CA must build a usable TLS config"
        );
    }

    #[test]
    fn lenient_without_ca_is_rejected() {
        let options = ClientOptions::new("https://127.0.0.1:5000").with_plaintext(false);
        assert_eq!(tls_trust_decision(&options), TlsTrustDecision::RejectNoCa);
        assert!(build_tls_config(&options).is_err());
    }
}

@@ -0,0 +1,137 @@
//! TLS posture coverage for the Rust client.
//!
//! tonic 0.13.1's `ClientTlsConfig` exposes no hook for a custom rustls
//! `ServerCertVerifier` (the verifier is built internally inside the
//! crate-private `TlsConnector`), so the Rust client cannot implement the
//! "accept any server certificate" lenient default the other clients use.
//! Rust is therefore the documented **pin-only exception**: TLS without a
//! pinned CA is rejected up front with a clear, actionable error, and
//! supplying a CA file is the supported path. These tests pin that contract.

use std::time::Duration;

use zb_mom_ww_mxgateway_client::{ClientOptions, Error, GalaxyClient, GatewayClient};

/// Drive `connect` to its error without requiring `GatewayClient: Debug`
/// (the success arm is dropped explicitly so `unwrap_err` is unnecessary).
async fn connect_err(options: ClientOptions) -> Error {
    match GatewayClient::connect(options).await {
        Ok(_client) => panic!("connect unexpectedly succeeded against a dead TLS address"),
        Err(error) => error,
    }
}

#[tokio::test]
async fn tls_without_ca_is_rejected_with_actionable_error_by_default() {
    let options = ClientOptions::new("https://127.0.0.1:1")
        .with_plaintext(false)
        .with_connect_timeout(Duration::from_millis(200));

    let error = connect_err(options).await;

    let Error::InvalidEndpoint { detail, .. } = error else {
        panic!("expected InvalidEndpoint, got {error:?}");
    };
    // The message must point the caller at the supported remedy (pin a CA)
    // and name the opt-in escape hatch.
    assert!(
        detail.contains("ca_file") || detail.contains("CA"),
        "error should instruct the user to pass a CA file: {detail}"
    );
    assert!(
        detail.contains("require_certificate_validation"),
        "error should mention the require_certificate_validation opt-in: {detail}"
    );
}

#[tokio::test]
async fn tls_with_require_certificate_validation_does_not_short_circuit() {
    // With strict verification opted in, the no-CA guard must not fire; the
    // connect attempt instead proceeds to the transport (and fails to reach
    // the dead address) rather than returning the "CA required" guard error.
    let options = ClientOptions::new("https://127.0.0.1:1")
        .with_plaintext(false)
        .with_require_certificate_validation(true)
        .with_connect_timeout(Duration::from_millis(200));

    let error = connect_err(options).await;

    assert!(
        !matches!(&error, Error::InvalidEndpoint { detail, .. }
            if detail.contains("require_certificate_validation")),
        "strict verification must bypass the no-CA guard, got {error:?}"
    );
}

#[tokio::test]
async fn tls_with_ca_file_is_permitted_and_proceeds_past_the_guard() {
    // Pinning a CA is the supported TLS path: the no-CA guard must not fire.
    // We hand it a readable PEM file; construction proceeds past the guard
    // and only fails later at the transport (dead address / handshake).
    let ca_path = std::env::temp_dir().join("mxgw-rust-tls-ca-fixture.pem");
    std::fs::write(&ca_path, SELF_SIGNED_CA_PEM).unwrap();

    let options = ClientOptions::new("https://127.0.0.1:1")
        .with_plaintext(false)
        .with_ca_file(&ca_path)
        .with_connect_timeout(Duration::from_millis(200));

    let error = connect_err(options).await;

    let _ = std::fs::remove_file(&ca_path);

    assert!(
        !matches!(&error, Error::InvalidEndpoint { detail, .. }
            if detail.contains("require_certificate_validation")),
        "pinning a CA must bypass the no-CA guard, got {error:?}"
    );
}

/// Drive `GalaxyClient::connect` to its error (mirrors `connect_err` above).
async fn galaxy_connect_err(options: ClientOptions) -> Error {
    match GalaxyClient::connect(options).await {
        Ok(_client) => {
            panic!("GalaxyClient::connect unexpectedly succeeded against a dead TLS address")
        }
        Err(error) => error,
    }
}

#[tokio::test]
async fn galaxy_tls_without_ca_is_rejected_with_actionable_error_by_default() {
    // GalaxyClient::connect must apply the same TLS guard as GatewayClient:
    // TLS without a pinned CA (and without require_certificate_validation)
    // returns a clear, actionable InvalidEndpoint error.
    let options = ClientOptions::new("https://127.0.0.1:1")
        .with_plaintext(false)
        .with_connect_timeout(Duration::from_millis(200));

    let error = galaxy_connect_err(options).await;

    let Error::InvalidEndpoint { detail, .. } = error else {
        panic!("expected InvalidEndpoint, got {error:?}");
    };
    assert!(
        detail.contains("ca_file") || detail.contains("CA"),
        "error should instruct the user to pass a CA file: {detail}"
    );
    assert!(
        detail.contains("require_certificate_validation"),
        "error should mention the require_certificate_validation opt-in: {detail}"
    );
}

/// A throwaway self-signed CA certificate (PEM). Only needs to parse as a
/// PEM trust root so the CA-pinning path is exercised past the guard.
const SELF_SIGNED_CA_PEM: &str = "-----BEGIN CERTIFICATE-----
MIIBhTCCASugAwIBAgIQIRi6zePL6mKjOipn+dNuaTAKBggqhkjOPQQDAjASMRAw
DgYDVQQKEwdBY21lIENvMB4XDTE3MTAyMDE5NDMwNloXDTE4MTAyMDE5NDMwNlow
EjEQMA4GA1UEChMHQWNtZSBDbzBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABD0d
7VNhbWvZLWPuj/RtHFjvtJBEwOkhbN/BnnE8rnZR8+sbwnc/KhCk3FhnpHZnQz7B
5aETbbIgmuvewdjvSBSjYzBhMA4GA1UdDwEB/wQEAwICpDATBgNVHSUEDDAKBggr
BgEFBQcDATAPBgNVHRMBAf8EBTADAQH/MCkGA1UdEQQiMCCCDmxvY2FsaG9zdDo1
NDUzgg4xMjcuMC4wLjE6NTQ1MzAKBggqhkjOPQQDAgNIADBFAiEA2zpJEPQyz6/l
Wf86aX6PepsntZv2GYlA5UpabfT2EZICICpJ5h/iI+i341gBmLiAFQOyTDT+/wQc
6MF9+Yw1Yy0t
-----END CERTIFICATE-----
";
|
||||
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `clients/dotnet` |
|
||||
| Reviewer | Claude Code |
|
||||
-| Review date | 2026-05-24 |
-| Commit reviewed | `42b0037` |
+| Review date | 2026-06-15 |
+| Commit reviewed | `410acc9` |
|
||||
| Status | Re-reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -383,6 +383,40 @@ Re-review pass at `42b0037`. Diff against `d692232` consists of four commits:
|
||||
| 9 | Testing coverage | No new issues — `RunAsync_StreamAlarms_*`, `RunAsync_AcknowledgeAlarm_*`, and `RunAsync_Batch_*` give the new surface unit coverage. `bench-read-bulk` is the same stress-harness-not-SDK shape called out in the prior re-review and is not flagged here. |
|
||||
| 10 | Documentation & comments | Issue found (this review): the README examples for the two new alarm CLI subcommands cite wrong flag names and a non-existent `--session-id` (Client.Dotnet-018). The new XML docs on `StreamAlarmsAsync` / `AcknowledgeAlarmAsync` and on the bulk SDK methods are accurate and complete. |
|
||||
|
||||
#### 2026-06-15 re-review (commit 410acc9)
|
||||
|
||||
Re-review pass at `410acc9`. The diff against `42b0037` is packaging/release metadata
(NuGet/Gitea feed), a TLS trust-posture option (`RequireCertificateValidation` + a
lenient accept-all default for the gateway's auto-generated self-signed cert), the
Galaxy `BrowseChildren` RPC plumbing plus a `LazyBrowseNode` lazy-browse walker, and
in-source resolutions of the prior pass's Client.Dotnet-018..021 (CLI flag-name README
fix, `RequireRegisterServerHandle`, `ParseTimeoutMs` negative guard, steady-state OCE
filter). The alarm-provider-fallback proto surface mentioned in the review brief is
**not** present in this diff: no `AlarmProviderMode` / `AlarmProviderStatus` /
`source_provider` / provider-mode-changed event reaches the .NET client here.

Build is green (`dotnet build … .slnx` succeeds) and all 78 unit tests pass (1 skipped
live smoke). The build now emits **10 CS1591 warnings** that do not break the build,
because the `clients/dotnet/Directory.Build.props` enforcement floor recorded as
resolved under Client.Dotnet-012 (`TreatWarningsAsErrors` / `EnforceCodeStyleInBuild` /
`AnalysisLevel` / `Deterministic`) is **absent** from the history that reaches HEAD;
the props file at HEAD is packaging-metadata-only (Client.Dotnet-022).
`git merge-base --is-ancestor a020350 HEAD` is false: the 2026-05-20 review-sweep
commit that resolved 012 is not in this line of history.
|
||||
|
||||
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No new issues. The Galaxy `BrowseAsync` / `LazyBrowseNode.ExpandAsync` pagination correctly drains `next_page_token`, re-binds the same parent selector + filter set per page (matching the opaque-token contract), and guards against repeated tokens; the per-child `child_has_children` hint is read with an index-bounds check. The Client.Dotnet-019/021 in-source fixes (`RequireRegisterServerHandle`, `ParseTimeoutMs`) are correctly applied. |
| 2 | mxaccessgw conventions | Issue found (this review): the `clients/dotnet/Directory.Build.props` enforcement floor (warnings-as-errors / code-style enforcement) mandated by CLAUDE.md and recorded resolved under Client.Dotnet-012 is missing at HEAD; the new props file carries only packaging metadata (Client.Dotnet-022). Consumes the shared contracts project, no forked proto, `authorization: Bearer` metadata correct. |
| 3 | Concurrency & thread safety | Issue found (this review): `LazyBrowseNode.Children` and `IsExpanded` are read lock-free while `ExpandAsync` mutates `_children` and writes `_isExpanded` under `_expandLock`, with no release/acquire barrier to a concurrent reader (Client.Dotnet-025). `ExpandAsync`'s one-RPC dedup itself is correct (double-checked under the lock). |
| 4 | Error handling & resilience | No new issues: `BrowseChildrenAsync` routes `RpcException` through the shared `MapRpcException`; the bench steady-state OCE filter (Client.Dotnet-020) is correctly applied. |
| 5 | Security | No committed secret: the README Gitea-feed `dotnet nuget add source` example uses `<gitea-username>` / `<gitea-token-or-password>` placeholders. Note: TLS is lenient-by-default (accept-all callback when `UseTls` and no pinned CA), which disables certificate verification / MITM protection; this is an explicit, documented design choice for the gateway's auto-generated self-signed cert and is opt-out via `RequireCertificateValidation` or CA pinning, so not flagged as a finding. |
| 6 | Performance & resource management | No issues found: `LazyBrowseNode` holds one `SemaphoreSlim` per node (never disposed, but it owns no unmanaged handle and the node lifetime is the tree's); browse paging caps at 500/page. |
| 7 | Design-document adherence | No issues found: `BrowseChildren` / lazy-browse match `docs/GalaxyRepository.md#browsechildren`; the TLS posture matches `docs/GatewayConfiguration.md` (`RequireCertificateValidation` default `false`) and `DotnetClientDesign.md`. |
| 8 | Code organization & conventions | Issue found (this review): Client.Dotnet-022 (lost enforcement props); the new `GenerateDocumentationFile=true` in the shared props also applies to the Cli and Tests projects, surfacing CS1591 on `IMxGatewayCliClient` and every test class (Client.Dotnet-023); the client (and Contracts) NuGet package ships with no `<license>` metadata despite setting `PackageRequireLicenseAcceptance=false` (Client.Dotnet-024). The nuspec correctly emits the transitive `ZB.MOM.WW.MxGateway.Contracts 0.1.0` dependency, so the README "pulled in transitively" claim holds. |
| 9 | Testing coverage | No new issues: `LazyBrowseNodeTests` (7 cases incl. multi-page, concurrent-expand-one-RPC, filter forwarding), `MxGatewayClientTlsHandlerTests` / `GalaxyRepositoryClientTlsHandlerTests`, and the README-example parse tests give the new surface good coverage. |
| 10 | Documentation & comments | No new issues: README NuGet-install / lazy-browse / TLS-trust sections are accurate, cross-doc anchors (`#automatic-self-signed-certificate`, `#browsechildren`) resolve, and the new XML docs on `BrowseAsync` / `LazyBrowseNode` / `RequireCertificateValidation` are complete. (The CS1591-surfaced missing docs are tracked under Client.Dotnet-023.) |
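
The trust decision summarized in row 5 (lenient unless `RequireCertificateValidation` or a pinned CA is supplied) can be sketched as a small pure function. This is an illustrative sketch written in Rust for compactness, not the .NET client's code: `TlsOptions`, `TrustPosture`, and `trust_posture` are hypothetical names, and two points are assumptions rather than facts from the review: that a pinned CA takes precedence over the strict flag, and that strict mode validates against the platform trust store.

```rust
/// Hypothetical, simplified view of a client's TLS options (not the real API).
#[derive(Debug, Clone, Default)]
pub struct TlsOptions {
    pub use_tls: bool,
    pub require_certificate_validation: bool,
    pub ca_file: Option<String>,
}

/// How the server certificate ends up being checked.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TrustPosture {
    /// No TLS at all.
    Plaintext,
    /// Validate against the pinned CA file only.
    PinnedCa(String),
    /// Assumption: strict mode validates against the platform trust store.
    PlatformTrust,
    /// Lenient default: accept any server certificate (no MITM protection).
    AcceptAny,
}

/// Decide the posture. The CA-before-strict precedence is an assumption.
pub fn trust_posture(options: &TlsOptions) -> TrustPosture {
    if !options.use_tls {
        return TrustPosture::Plaintext;
    }
    if let Some(ca_file) = &options.ca_file {
        return TrustPosture::PinnedCa(ca_file.clone());
    }
    if options.require_certificate_validation {
        return TrustPosture::PlatformTrust;
    }
    TrustPosture::AcceptAny
}
```

Keeping the decision in a pure function like this is one way to unit-test the posture without opening a socket; the `AcceptAny` arm is the case the review notes as a documented design choice rather than a finding.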
|
||||
|
||||
### Client.Dotnet-018
|
||||
|
||||
| Field | Value |
|
||||
@@ -507,3 +541,65 @@ uint timeoutMs = (uint)timeoutMsRaw;
|
||||
A single shared helper (e.g. `ParseTimeoutMs(CliArguments, string, int)`) on `MxGatewayClientCli` would cover both call sites and remove the duplication.
|
||||
|
||||
**Resolution:** 2026-05-24 — Confirmed against source: both `ReadBulkAsync` (line 490) and `BenchReadBulkAsync` (line 715) cast `arguments.GetInt32("timeout-ms", ...)` straight to `uint`, so `--timeout-ms -1` silently wrapped to `0xFFFFFFFF` (~49.7 days). Added a single shared private helper `ParseTimeoutMs(CliArguments arguments, int defaultValue)` on `MxGatewayClientCli` that reads the int32, rejects negatives with a clear `ArgumentException` ("--timeout-ms must be a non-negative integer (use 0 for the gateway default)."), and returns the safe `(uint)`. Both call sites now route through the helper. Regression test `MxGatewayClientCliTests.RunAsync_TimeoutMs_NegativeValue_RejectsWithClearError` (xUnit `[Theory]` over `read-bulk` and `bench-read-bulk`) drives the CLI with `--timeout-ms -1` and asserts the exit code is non-zero, that stderr contains "timeout-ms", and that the "non-negative" guard text is present. Verified red against the original `(uint)arguments.GetInt32(...)` casts (the bench proceeded past the timeout parse and tripped a downstream "Queue empty" error rather than the descriptive guard message) and green after the helper landed.
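
The wrap-around described in this resolution is easy to reproduce outside C#. The sketch below is a Rust analogue, not the CLI's code: `unchecked_timeout_ms`, `parse_timeout_ms`, and `ms_to_days` are hypothetical names, and only the error text is taken from the resolution above.

```rust
/// What an unguarded cast does: a negative value wraps to a huge timeout.
pub fn unchecked_timeout_ms(raw: i32) -> u32 {
    raw as u32
}

/// Guarded parse: reject negatives, otherwise convert losslessly.
pub fn parse_timeout_ms(raw: i32) -> Result<u32, String> {
    if raw < 0 {
        return Err(
            "--timeout-ms must be a non-negative integer (use 0 for the gateway default)."
                .to_string(),
        );
    }
    Ok(raw as u32)
}

/// Convert milliseconds to days, to show the size of the wrapped value.
pub fn ms_to_days(ms: u32) -> f64 {
    f64::from(ms) / 86_400_000.0
}
```

Here `unchecked_timeout_ms(-1)` yields `0xFFFFFFFF`, which `ms_to_days` puts at roughly 49.7 days, while `parse_timeout_ms(-1)` returns the descriptive error instead.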
|
||||
|
||||
### Client.Dotnet-022
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Medium |
| Category | mxaccessgw conventions |
| Location | `clients/dotnet/Directory.Build.props:1-21` |
| Status | Resolved |
|
||||
|
||||
**Description:** Client.Dotnet-012 was recorded resolved (2026-05-20, commit `a020350`) by adding `clients/dotnet/Directory.Build.props` mirroring `src/Directory.Build.props` — `TreatWarningsAsErrors=true`, `EnforceCodeStyleInBuild=true`, `AnalysisLevel=latest`, `Deterministic=true`, `LangVersion=latest`, `Nullable=enable`, `ImplicitUsings=enable` — to restore the build-quality floor that `CLAUDE.md` calls a baseline for the .NET client. That enforcement props file is **not present in the line of history that reaches HEAD**: `git merge-base --is-ancestor a020350 HEAD` is false (the 2026-05-20 review-sweep commit was dropped during the `ZB.MOM.WW` rename / history rebuild). At `42b0037` the file did not exist at all (`git show 42b0037:clients/dotnet/Directory.Build.props` fails), and at HEAD commit `523f944` introduced a **new** `clients/dotnet/Directory.Build.props` that carries only NuGet packaging metadata (Authors/Company/RepositoryUrl/Version/etc.) — none of the enforcement properties. None of the three client `.csproj` files set `TreatWarningsAsErrors` or `EnforceCodeStyleInBuild` independently (they set only `TargetFramework` and `Nullable`).
|
||||
|
||||
Net effect at HEAD: `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx` **succeeds with 10 CS1591 warnings** instead of failing. The mandated quality gate that would turn new warnings (missing docs, analyzer findings, code-style violations) into build breaks is gone for the entire client tree. This is a regression of the previously-closed Client.Dotnet-012; recorded as a fresh finding at the new commit per the re-review process.
|
||||
|
||||
**Recommendation:** Restore the enforcement properties in `clients/dotnet/Directory.Build.props` alongside the packaging metadata (they can coexist in the same `<Project>`), or move them into a separate props file that `clients/dotnet/Directory.Build.props` imports. Re-run `dotnet build …slnx` and confirm 0 warnings / 0 errors (which will require closing Client.Dotnet-023 too, since the CS1591 warnings would otherwise become errors). Add a guard so the floor is not silently dropped again, e.g. assert the property is set in a small build test or CI check.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed at HEAD: `clients/dotnet/Directory.Build.props` carried only packaging metadata; none of the three client `.csproj` files set the enforcement properties, so `dotnet build …slnx` succeeded with 10 CS1591 warnings instead of failing. Restored the enforcement floor in `clients/dotnet/Directory.Build.props` mirroring `src/Directory.Build.props` (`LangVersion=latest`, `Nullable=enable`, `ImplicitUsings=enable`, `TreatWarningsAsErrors=true`, `AnalysisLevel=latest`, `EnforceCodeStyleInBuild=true`, `Deterministic=true`) in a second `<PropertyGroup>` alongside the existing packaging metadata. Resolved jointly with Client.Dotnet-023 (the CS1591 warnings would otherwise become errors under the restored `TreatWarningsAsErrors`). `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx -t:Rebuild` now reports 0 Warning(s) / 0 Error(s).
|
||||
|
||||
### Client.Dotnet-023
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/dotnet/Directory.Build.props:17`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/IMxGatewayCliClient.cs:6`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Tests/*.cs` |
| Status | Resolved |
|
||||
|
||||
**Description:** The new shared `clients/dotnet/Directory.Build.props` sets `GenerateDocumentationFile=true` at the directory level, so it applies to all three projects — including `ZB.MOM.WW.MxGateway.Client.Cli` and `ZB.MOM.WW.MxGateway.Client.Tests`, which are not packable and were not previously generating an XML doc file. Turning it on surfaces 10 CS1591 "missing XML comment" warnings: `IMxGatewayCliClient` (the public CLI interface, never documented at the type level — note Client.Dotnet-013's resolution claimed a type-level summary was added, but it is absent in the history reaching HEAD for the same reason as Client.Dotnet-022) plus every public xUnit test class (`GalaxyRepositoryClientTests`, `MxGatewayClientTlsHandlerTests`, `GalaxyRepositoryClientTlsHandlerTests`, and seven others). Today these are only warnings because the enforcement floor is missing (Client.Dotnet-022); once that floor is restored they become build-breaking errors.
|
||||
|
||||
**Recommendation:** Scope `GenerateDocumentationFile=true` to the packable library project only (move it from the shared props into `ZB.MOM.WW.MxGateway.Client.csproj`, which is the only project that ships a `.nupkg`), or keep it directory-wide but suppress CS1591 on the non-public test/CLI assemblies (`<NoWarn>$(NoWarn);CS1591</NoWarn>` in those two `.csproj` files) and add the one-line type summary to `IMxGatewayCliClient`. The first option is cleaner and avoids documenting test classes.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed via `-t:Rebuild`: the directory-wide `GenerateDocumentationFile=true` surfaced exactly 10 CS1591 warnings — `IMxGatewayCliClient` plus nine xUnit test classes (`GalaxyRepositoryClientTests`, `MxCommandReplyExtensionsTests`, `MxGatewayClientContractInfoTests`, `MxGatewayClientOptionsTests`, `MxGatewayClientTlsHandlerTests`, `GalaxyRepositoryClientTlsHandlerTests`, `MxGatewayGeneratedContractTests`, `MxStatusProxyExtensionsTests`, `MxValueExtensionsTests`); the shipped Client library itself emitted zero (its public surface was already fully documented). Took the first (cleaner) option, matching how `src/` handles this — only the packable `src/ZB.MOM.WW.MxGateway.Contracts.csproj` sets `GenerateDocumentationFile` directly. Removed `GenerateDocumentationFile=true` from the shared `clients/dotnet/Directory.Build.props` and moved it into the packable `ZB.MOM.WW.MxGateway.Client.csproj` only, so the Cli and Tests projects no longer generate doc files and CS1591 is not raised against them. No doc comments were added to test classes. With the Client.Dotnet-022 floor restored, the rebuild is clean (0 warnings / 0 errors).
|
||||
|
||||
### Client.Dotnet-024
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/dotnet/Directory.Build.props:12`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client/ZB.MOM.WW.MxGateway.Client.csproj:19-24` |
| Status | Resolved |
|
||||
|
||||
**Description:** The client package sets `PackageRequireLicenseAcceptance=false` but declares **no license at all** — there is no `PackageLicenseExpression` and no `PackageLicenseFile` in `clients/dotnet/Directory.Build.props` or in the packable `.csproj`. Confirmed by packing: the emitted `ZB.MOM.WW.MxGateway.Client.0.1.0.nuspec` has no `<license>` element, so the produced package carries no license metadata and a NuGet feed renders it as "License: not specified." The sibling `ZB.MOM.WW.MxGateway.Contracts` package (the transitive dependency) has the same gap. `dotnet pack` does not warn (a missing license is allowed), so the omission is silent. Setting `PackageRequireLicenseAcceptance=false` while shipping no license is internally inconsistent — that flag exists to control acceptance of a license that should be present.
|
||||
|
||||
**Recommendation:** Add the intended license to `clients/dotnet/Directory.Build.props` (and to `ZB.MOM.WW.MxGateway.Contracts.csproj` for parity) — either `<PackageLicenseExpression>` with an SPDX id (e.g. a proprietary marker or the actual license) or `<PackageLicenseFile>` pointing at a committed `LICENSE`. If the package is intentionally unlicensed/internal-only, document that explicitly rather than leaving the field blank.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed via pack: the emitted nuspec had no `<license>` element. Marked the package "Proprietary" consistent with the other clients' decision (Rust `license = "Proprietary"`, Python `license = { text = "Proprietary" }` + `License :: Other/Proprietary License`). A `<PackageLicenseExpression>LicenseRef-Proprietary</PackageLicenseExpression>` was tried first but the current NuGet toolset rejects `LicenseRef-*` (NU5124), which the restored `TreatWarningsAsErrors` escalates to a pack failure — so the proprietary terms ship as a committed license file instead: added `clients/dotnet/LICENSE.txt` (proprietary/internal-use terms), set `<PackageLicenseFile>LICENSE.txt</PackageLicenseFile>` in the shared `clients/dotnet/Directory.Build.props`, and packed it at the package root via a `<None Include="..\LICENSE.txt" Pack="true" PackagePath="\" />` item in the packable `ZB.MOM.WW.MxGateway.Client.csproj`. `dotnet pack` now succeeds and the nuspec carries `<license type="file">LICENSE.txt</license>` with `LICENSE.txt` present in the `.nupkg`. Scope was limited to Client.Dotnet per the constraints — the sibling `ZB.MOM.WW.MxGateway.Contracts` package has the same gap and is NOT touched here (it is a different module; flagging it for that module's review).
|
||||
|
||||
### Client.Dotnet-025
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `clients/dotnet/ZB.MOM.WW.MxGateway.Client/LazyBrowseNode.cs:38,41,54,82,94` |
| Status | Resolved |
|
||||
|
||||
**Description:** `LazyBrowseNode.ExpandAsync` is explicitly documented as thread-safe ("concurrent callers see exactly one fetch"), and its one-RPC dedup is correct: it double-checks `_isExpanded` under `_expandLock`. But the *readers* of the results are lock-free. `Children => _children` returns the live backing `List<LazyBrowseNode>` reference, and `IsExpanded => _isExpanded` reads the plain `bool` field — neither takes `_expandLock` nor uses `Volatile`. A thread that observes `IsExpanded == true` (or simply enumerates `Children`) concurrently with the writer thread inside `ExpandAsync` has no release/acquire barrier guaranteeing it sees the fully-populated `_children` contents that were appended under the lock. On x86/x64 the bool read and the list-reference read are atomic and the practical risk is low, but the published-state visibility is not guaranteed by the memory model, and a reader enumerating `Children` while a concurrent `ExpandAsync` is mid-append can throw `InvalidOperationException` ("collection was modified"). This is inconsistent with the type's own thread-safety claim.
|
||||
|
||||
**Recommendation:** Either (a) tighten the documented contract to "ExpandAsync is safe to call concurrently, but Children/IsExpanded must only be read after the awaited ExpandAsync completes (no concurrent reader/expander)", or (b) make the publication safe: write `_isExpanded` via `Volatile.Write` and read via `Volatile.Read`, and return an immutable snapshot from `Children` (e.g. assign a completed `IReadOnlyList` under the lock and expose that field) so lock-free readers never observe a partially-populated list. Option (a) is the smallest change and matches the realistic usage (UI thread expands then renders).
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed against source: `Children => _children` returned the live mutable backing `List<LazyBrowseNode>` and `IsExpanded => _isExpanded` read a plain `bool`, while `ExpandAsync` appended to that same list under `_expandLock` with no release/acquire barrier to lock-free readers — so a concurrent reader could enumerate a mid-append list and throw `InvalidOperationException` ("collection was modified"). Applied option (b) (safe publication): `ExpandAsync` now accumulates children into a method-local `List<LazyBrowseNode>` and, only when fully drained across all pages, publishes it via `Volatile.Write(ref _children, children)` (release) immediately before setting the now-`volatile bool _isExpanded = true`. The `_children` field is an `IReadOnlyList<LazyBrowseNode>` read via `Volatile.Read` from the `Children` getter (acquire), so a reader that observes `IsExpanded == true` always sees the fully-populated snapshot and never enumerates a partially-built list. Updated the `ExpandAsync` `<remarks>` to document the strengthened concurrent-read guarantee. Regression test `LazyBrowseNodeTests.Expand_ConcurrentReadOfChildren_NeverTearsAndPublishesAtomically` gates the child-page RPCs (via a new `FakeGalaxyRepositoryTransport.BrowseChildrenGate` hook) to hold the expand mid-flight while a background reader spins enumerating `Children` and reading `IsExpanded`, asserting no exception escapes and that once `IsExpanded` is true the published snapshot has all five children. Verified red against the pre-fix code (the reader threw `InvalidOperationException: Collection was modified` deterministically across three runs) and green after the fix.
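
The publish-once pattern in this resolution (build the list locally, then publish the finished snapshot in one step) can be sketched as follows. This is a Rust analogue under stated assumptions, not the C# fix: `LazyNode` and its methods are hypothetical, `std::sync::OnceLock` (Rust 1.70+) stands in for the `Volatile.Write` / `Volatile.Read` pair, and a synchronous closure stands in for the paged RPC.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical lazy node: children are published once, fully built.
pub struct LazyNode {
    children: OnceLock<Vec<String>>,
    expand_lock: Mutex<()>,
}

impl LazyNode {
    pub fn new() -> Self {
        LazyNode {
            children: OnceLock::new(),
            expand_lock: Mutex::new(()),
        }
    }

    /// True only after a complete snapshot has been published.
    pub fn is_expanded(&self) -> bool {
        self.children.get().is_some()
    }

    /// Lock-free read: either the empty slice or the complete snapshot.
    pub fn children(&self) -> &[String] {
        self.children.get().map(|list| list.as_slice()).unwrap_or(&[])
    }

    /// Expand at most once; `fetch_pages` stands in for the paged RPC.
    pub fn expand<F>(&self, fetch_pages: F)
    where
        F: FnOnce() -> Vec<Vec<String>>,
    {
        let _guard = self.expand_lock.lock().unwrap();
        // Double-check under the lock so only one caller fetches.
        if self.children.get().is_some() {
            return;
        }
        // Accumulate into a local list that no reader can see yet.
        let mut local = Vec::new();
        for page in fetch_pages() {
            local.extend(page);
        }
        // Publish the finished snapshot in a single step.
        let _ = self.children.set(local);
    }
}
```

Because readers only ever obtain the published snapshot, there is no window in which they can enumerate a list that is still being appended to, which is the property the regression test above asserts for the C# type.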
|
||||
|
||||
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `clients/go` |
|
||||
| Reviewer | Claude Code |
|
||||
-| Review date | 2026-05-24 |
-| Commit reviewed | `42b0037` |
+| Review date | 2026-06-15 |
+| Commit reviewed | `410acc9` |
|
||||
| Status | Re-reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -83,6 +83,39 @@ that earlier commit.
|
||||
| 9 | Testing coverage | New issue: the five new bulk SDK methods and `Client.StreamAlarms` have no unit tests in `mxgateway/` (Client.Go-024). |
|
||||
| 10 | Documentation & comments | No issues found in this diff. README documents the new `StreamAlarms`/`AcknowledgeAlarm` SDK calls; `Session.ReadBulk` documents the cached-vs-snapshot semantics and `timeout=0` default; `WriteSecuredBulk` flags credential sensitivity. |
|
||||
|
||||
### 2026-06-15 re-review (commit 410acc9)
|
||||
|
||||
Re-review pass at `410acc9`. The diff is larger than the brief suggested:
`82996aa` resolved Client.Go-022..027 (already closed). On top of that,
`fd2a0ac`/`4a19854`/`da3aa7b`/`92cc468`/`75610e3` added a `LazyBrowseNode`
lazy-hierarchy walker (`Browse`/`Expand`/`BrowseChildrenRaw`) over the new
`BrowseChildren` RPC and paginated `DiscoverHierarchy`;
`c463b49`/`2eb8137`/`9bdb899` made the TLS path lenient-by-default (accept the
gateway's self-signed cert unless `RequireCertificateValidation` or `CACertFile`
is set); `6df373a` added the release docs + `scripts/tag-go-module.ps1`.
`gofmt -l .`, `go vet ./...`, `go build ./...`, and `go test ./...` are all
clean at HEAD.
|
||||
|
||||
This pass found two new low/medium issues, one in the release helper
(Client.Go-028) and one in the install docs (Client.Go-029). The
lenient-TLS default is an intentional, documented project posture
(`docs/GatewayConfiguration.md` "clients are lenient" to pair with the
auto-generated self-signed cert) and the `//nolint:gosec` is correctly
justified, so it is not a finding. The `LazyBrowseNode` concurrency model
(coalesced in-flight Expand, non-sticky failures, snapshot copies under
`RWMutex`) is sound and well-tested, including a 10-goroutine race test.
|
||||
|
||||
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | New issue: `tag-go-module.ps1`'s clean-tree guard is order-dependent and silently permits tagging with uncommitted tracked changes when an untracked path sorts first (Client.Go-028). `DiscoverHierarchy`/`browseChildrenInner` pagination, the `child_has_children` hint mapping, and the duplicate-page-token guard are all correct. |
| 2 | mxaccessgw conventions | No issues found. `gofmt -l .` / `go vet ./...` clean; the `//nolint:gosec` on `InsecureSkipVerify` carries a narrow justified reason per the suppression convention. |
| 3 | Concurrency & thread safety | No issues found: `LazyBrowseNode.Expand` runs the RPC outside both locks, coalesces concurrent callers onto one in-flight RPC, publishes the result before `close(done)`, and leaves failures retryable; verified by `TestGalaxyBrowseExpandConcurrentCallersOnlyFireOneRpc` (`-race`-shaped). |
| 4 | Error handling & resilience | No issues found: `BrowseChildrenRaw` wraps transport failures in `*GatewayError`; both paginating loops guard against a repeated page token. |
| 5 | Security | No issues found: no committed secrets (only `"test"` / `"test-api-key"` fixtures); the lenient-TLS default is the documented project posture with an opt-in strict mode (`RequireCertificateValidation`). |
| 6 | Performance & resource management | No issues found: `DiscoverHierarchy` cancels each page's call context promptly inside the loop; `Children()` returns a defensive copy. |
| 7 | Design-document adherence | No issues found: lazy browse matches `docs/GalaxyRepository.md#browsechildren`; lenient TLS matches `docs/GatewayConfiguration.md`. |
| 8 | Code organization & conventions | No issues found: additive API (`Browse`/`BrowseChildrenOptions`/`RequireCertificateValidation`); `tlsConfigForOptions` cleanly extracted for testability. |
| 9 | Testing coverage | No issues found: new walker, pagination, dup-token, filter-forwarding, and TLS-posture paths are all covered. |
| 10 | Documentation & comments | New issue: README "Installing the Go client" recommends the `GONOSUMCHECK` env var, which was removed from the Go toolchain in 1.13 and is a no-op on Go 1.26 (Client.Go-029). |
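
Rows 1 and 4 refer to paginating loops that guard against a repeated page token. A minimal sketch of that loop shape is given below, written in Rust rather than Go and not taken from the client: `Page`, `drain_pages`, and the closure-based transport are hypothetical, and treating an empty `next_page_token` as "last page" is an assumption about the protocol.

```rust
use std::collections::HashSet;

/// Hypothetical page shape: items plus an opaque continuation token.
pub struct Page {
    pub items: Vec<String>,
    pub next_page_token: String,
}

/// Drain every page, failing if the server hands back a token already seen.
/// Assumption: an empty `next_page_token` marks the last page.
pub fn drain_pages<F>(mut fetch: F) -> Result<Vec<String>, String>
where
    F: FnMut(&str) -> Result<Page, String>,
{
    let mut items = Vec::new();
    let mut seen = HashSet::new();
    let mut token = String::new();
    loop {
        let page = fetch(&token)?;
        items.extend(page.items);
        if page.next_page_token.is_empty() {
            return Ok(items);
        }
        // A repeated token would otherwise loop forever on the same page.
        if !seen.insert(page.next_page_token.clone()) {
            return Err(format!(
                "server repeated page token {:?}",
                page.next_page_token
            ));
        }
        token = page.next_page_token;
    }
}
```

Tracking every token seen, rather than only the previous one, also catches longer cycles such as `t1 -> t2 -> t1`; whether the real clients track all tokens or only the last is not stated in this review.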
|
||||
|
||||
## Findings
|
||||
|
||||
### Client.Go-001
|
||||
@@ -625,3 +658,51 @@ The two cases the empty-line check seems to cover — (a) operator pressing Ente
|
||||
**Recommendation:** Change `if line == "" { break }` to `if line == "" { continue }` (alongside the existing `len(args) == 0` continue, which is then redundant — keep one, drop the other for clarity). Update the `runBatch` doc-comment to read "only stdin EOF ends the session" and drop the "or an empty line" clause. If the interactive ergonomic is genuinely wanted, gate it on `isatty(stdin)` so the batch-from-pipe case isn't affected.
|
||||
|
||||
**Resolution:** 2026-05-24 — `runBatch` no longer treats a blank line as end-of-session. The `if line == "" { break }` early-exit was removed; blank or whitespace-only lines now fall through the existing `if len(args) == 0 { continue }` guard (kept as the single blank-line skip rule for clarity), so only stdin EOF ends the session. The doc-comment was updated to read "Blank lines are skipped; only stdin EOF ends the session." Regression test `TestRunBatchSkipsBlankLinesAndContinuesUntilEOF` in `cmd/mxgw-go/main_test.go` feeds `version --json\n\nversion --json\n` (a stray blank line between two commands) and asserts two EOR sentinels are emitted — pre-fix the test failed with "EOR sentinel count = 1, want 2" because the blank line broke the loop and the second command never ran; post-fix both commands run.
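
The loop rule in this resolution (skip blank lines, stop only at EOF) can be sketched in a few lines. The version below is a Rust analogue, not the Go `runBatch`: `run_batch` and its callback are hypothetical, and whitespace splitting stands in for the CLI's real argument parsing.

```rust
use std::io::BufRead;

/// Run one command per non-blank input line; only EOF ends the session.
/// Returns how many commands were executed.
pub fn run_batch<R, F>(input: R, mut run_command: F) -> std::io::Result<usize>
where
    R: BufRead,
    F: FnMut(&[&str]),
{
    let mut executed = 0;
    for line in input.lines() {
        let line = line?;
        let args: Vec<&str> = line.split_whitespace().collect();
        // Blank or whitespace-only lines are skipped, not treated as "done".
        if args.is_empty() {
            continue;
        }
        run_command(&args);
        executed += 1;
    }
    Ok(executed)
}
```

Fed the same input as the regression test above (`version --json`, a blank line, `version --json`), this loop runs both commands; a `break` on the blank line would have stopped after the first.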
|
||||
|
||||
### Client.Go-028
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `scripts/tag-go-module.ps1:42-46` |
| Status | Resolved |
|
||||
|
||||
**Description:** The release helper's clean-working-tree guard is order-dependent and can silently let a release tag be created on top of uncommitted tracked changes — the exact thing it advertises it prevents (the README at `clients/go/README.md` says "The script ... refuses to tag with uncommitted tracked changes"). The check is:
|
||||
|
||||
```powershell
$status = (git status --porcelain) -join "`n"
if ($status -and -not ($status -match '^\?\?')) {
    throw "Working tree has tracked changes. Commit or stash before tagging."
}
```
|
||||
|
||||
`git status --porcelain` emits one line per path (`XY path`), with untracked entries prefixed `??`. The lines are joined into a single string and matched against `'^\?\?'` with PowerShell `-match`, which does not enable the regex multiline option by default (no `(?m)` flag), so `^` anchors only to the start of the *whole* joined string. The guard therefore inspects only the **first** status line: if that first line is an untracked file (`??`), the `-not (... -match '^\?\?')` clause is false and the throw is skipped, even when later lines are tracked modifications (` M file.go`, `A  file.go`, etc.). Because `git status --porcelain` orders entries by pathname, an untracked file whose name sorts ahead of a modified tracked file (e.g. an untracked `AAA-notes.md` alongside a modified `mxgateway/session.go`) puts the `??` line first and the tag is created from a dirty tree. This was confirmed empirically: with `"?? untracked.md\n M tracked.go"` the script allows the tag; with the tracked line first it correctly throws. In this ordering the guard defeats its own purpose, which is to produce reproducible release tags that match a committed state.
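The anchoring behaviour described above is not specific to PowerShell and can be reproduced with any regex engine whose `^` is not in multiline mode. The following is a minimal Python sketch, not the script's code: `joined_blob_is_dirty` and `per_line_is_dirty` are hypothetical names, and the input lists are the synthetic status strings quoted in this finding.

```python
import re


def joined_blob_is_dirty(lines):
    """Mimic the original guard: join all lines, then apply one anchored regex."""
    status = "\n".join(lines)
    # Without re.MULTILINE, ^ matches only at the start of the whole string,
    # so only the first status line is ever inspected.
    return bool(status) and not re.search(r"^\?\?", status)


def per_line_is_dirty(lines):
    """Per-entry check: any non-empty line that is not an untracked entry is dirty."""
    return any(line and not line.startswith("??") for line in lines)


# Synthetic porcelain output taken from the finding text (assumed inputs).
untracked_first = ["?? untracked.md", " M tracked.go"]
tracked_first = [" M tracked.go", "?? untracked.md"]

for name, lines in (("untracked first", untracked_first), ("tracked first", tracked_first)):
    print(name, joined_blob_is_dirty(lines), per_line_is_dirty(lines))
```

The joined-blob variant reports the first input as clean, while the per-line variant flags both inputs regardless of ordering.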
|
||||
|
||||
**Recommendation:** Test each status entry individually rather than the first line of a joined blob. For example, iterate the porcelain lines and throw if any line does **not** start with `??`:
|
||||
|
||||
```powershell
|
||||
$dirty = (git status --porcelain) | Where-Object { $_ -and ($_ -notmatch '^\?\?') }
|
||||
if ($dirty) {
|
||||
throw "Working tree has tracked changes. Commit or stash before tagging.`n$($dirty -join "`n")"
|
||||
}
|
||||
```
|
||||
|
||||
(Equivalently, keep the joined string but split it back into lines and test each one: `($status -split "`n") | ? { $_ -and $_ -notmatch '^\?\?' }`; the `$_ -and` guard keeps the empty string produced by a clean tree from counting as dirty.) Including the offending lines in the thrown message also helps the operator see what is dirty.
|
||||
|
||||
**Resolution:** 2026-06-15 — Replaced the order-dependent joined-blob check in `tag-go-module.ps1` with a per-line filter (`git status --porcelain | Where-Object { $_ -and ($_ -notmatch '^\?\?') }`) that throws on any tracked change regardless of ordering, listing the offending lines. Verified under pwsh 7.5.4 that an untracked path sorting ahead of a modified tracked file is now correctly rejected, while untracked-only and clean trees are still allowed.
|
||||
|
||||
### Client.Go-029
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Documentation & comments |
|
||||
| Location | `clients/go/README.md:300-303` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The "Installing the Go client" section advises, for build environments that cannot reach `gitea.dohertylan.com` directly, to "use `GONOSUMCHECK` + `GOPRIVATE` to bypass the checksum database for the internal module path." `GONOSUMCHECK` is not an environment variable the Go toolchain recognizes, so on the Go 1.26 toolchain this client targets (`go.mod` says `go 1.26`) setting it has no effect. The supported knobs are `GOPRIVATE` and the finer-grained `GONOPROXY` and `GONOSUMDB`, all available since Go 1.13. `GOPRIVATE` supplies the default for the other two, so `GOPRIVATE=gitea.dohertylan.com/*` alone already skips the checksum database and bypasses the public proxy for matching module paths; the `GONOSUMCHECK` half of the recommendation is inert and misleading. A reader who copies the advice and finds checksum-database verification still failing has no working escape hatch from this doc.
|
||||
|
||||
**Recommendation:** Drop `GONOSUMCHECK` and document the supported knobs: set `GOPRIVATE=gitea.dohertylan.com/*` (covers both the checksum-database bypass and direct VCS fetch); `GONOSUMDB` with the same pattern is the narrower option when only the checksum database should be skipped. Concretely: "set `GOPRIVATE=gitea.dohertylan.com/*` (this disables both the checksum database and the public module proxy for that path); add `GOINSECURE=gitea.dohertylan.com/*` if the host serves the module over plain HTTP."
|
||||
|
||||
**Resolution:** 2026-06-15 — Dropped the dead `GONOSUMCHECK` advice from the "Installing the Go client" section of `clients/go/README.md`; it now documents `GOPRIVATE=gitea.dohertylan.com/*` (which bypasses both the public module proxy and checksum-database verification for that path) plus `GOINSECURE=gitea.dohertylan.com/*` for plain-HTTP hosts.
|
||||
|
||||
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `clients/java` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-24 |
|
||||
| Commit reviewed | `42b0037` |
|
||||
| Review date | 2026-06-15 |
|
||||
| Commit reviewed | `410acc9` |
|
||||
| Status | Re-reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -77,6 +77,35 @@ Client.Java-001..031 are unchanged.
|
||||
| 9 | Testing coverage | Issue found: the new `MxGatewayClient.streamAlarms` SDK method has no library-side test in `zb-mom-ww-mxgateway-client/src/test/...` — only the CLI test exercises the path via a `FakeClient.streamAlarms` override that bypasses the production `subscription.wrap(observer)` glue (Client.Java-035). |
|
||||
| 10 | Documentation & comments | Issue found: README (`clients/java/README.md:182-183`) documents the new `stream-alarms` and `acknowledge-alarm` commands with `--session-id <id>` (neither command has that option) and `acknowledge-alarm --alarm-reference …` (actual flag is `--reference`) — every documented invocation fails at picocli parse time (Client.Java-032). |
|
||||
|
||||
### 2026-06-15 re-review (commit 410acc9)
|
||||
|
||||
Re-review pass at `410acc9`. Diff against `42b0037` is eleven commits touching
|
||||
`clients/java`: `d3cb311` (Client.Java-032..036 fixes — shared subscription
|
||||
base + batch tokenizer), `0d6193c`/`803a207`/`b4bc2df`/`4a19854`/`b244851`/
|
||||
`68f905a` (the `BrowseChildren` lazy-browse SDK surface: `GalaxyRepositoryClient.browse()`,
|
||||
`browse(BrowseChildrenOptions)`, `browseChildrenRaw`, `browseChildrenInner`,
|
||||
plus the `LazyBrowseNode` walker and `BrowseChildrenOptions`), `a276f46`/
|
||||
`ba82afe`/`2eb8137` (lenient-by-default TLS: new `requireCertificateValidation`
|
||||
option, `InsecureTrustManagerFactory` fallback, foojay toolchain resolver), and
|
||||
`fe44e3c` (maven-publish wiring for the Gitea Maven feed). Generated
|
||||
protobuf/gRPC Java is excluded. `gradle test` could not be run here — this macOS
|
||||
host has no Java runtime (the module builds on the Windows host per project
|
||||
memory); findings below are from source inspection. Prior findings
|
||||
Client.Java-001..036 are unchanged.
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | No issues found in this diff. `LazyBrowseNode.expand()` leader/coalesce logic is correct (single in-flight future, slot cleared on failure for retry); `browseChildrenInner` pagination handles the empty/null next-page token and guards against repeated page tokens; the `child_has_children` parallel array is bounds-checked (`i < getChildHasChildrenCount()`), defaulting absent hints to false. |
|
||||
| 2 | mxaccessgw conventions | No issues found. No MXAccess COM, no synthesized events, generated code untouched. The lenient-TLS default is a documented repo-wide design decision (`docs/DesignDecisions.md` "TLS Auto-Certificate and Lenient Client Trust"), not a Java-specific deviation. |
|
||||
| 3 | Concurrency & thread safety | No issues found. `LazyBrowseNode` does not hold the `expandLock` monitor across the BrowseChildren RPC (fixed in `68f905a`); readers use a separate `ReentrantReadWriteLock` so `getChildren()`/`isExpanded()` never block on the in-flight RPC; `BrowseChildrenOptions` is immutable. The shared `MxGatewayStreamSubscription` base (Client.Java-036) is covered. |
|
||||
| 4 | Error handling & resilience | No issues found. `browseChildrenRaw` normalises non-`MxGatewayException` gRPC errors via `MxGatewayErrors.fromGrpc`; the non-leader `expand()` path rethrows the leader's `MxGatewayException`/`RuntimeException` and restores the interrupt flag on `InterruptedException`. |
|
||||
| 5 | Security | No issues found. maven-publish credentials come from `GITEA_USERNAME`/`GITEA_TOKEN` env vars with empty-string fallback — no committed secrets. The lenient-TLS `InsecureTrustManagerFactory` default is the documented, intentional design for this PKI-less internal tool; strict verification is reachable via `caCertificatePath` (pin) or `requireCertificateValidation(true)`, both tested in `MxGatewayClientTlsTests`. |
|
||||
| 6 | Performance & resource management | No issues found. |
|
||||
| 7 | Design-document adherence | No issues found. The browse surface matches `docs/GalaxyRepository.md#browsechildren` (cache-served lazy expand, `has_children` hint, repeated-page-token → error); the TLS posture matches `docs/GatewayConfiguration.md` and `JavaClientDesign.md`. |
|
||||
| 8 | Code organization & conventions | Issue found: the new `requireCertificateValidation` library option is not exposed or propagated by the CLI `CommonOptions.toClientOptions()`, so CLI users cannot opt into JVM-trust-store verification — same additive-surface gap pattern as the resolved Client.Java-025 (Client.Java-038). |
|
||||
| 9 | Testing coverage | No issues found. The browse surface has thorough library tests in `GalaxyRepositoryClientTests` (roots, expand-populates, idempotent-single-RPC, unknown-parent not-found, multi-page gather, concurrent-callers-one-RPC, filter forwarding, repeated-page-token rejection); TLS lenient/strict paths are covered by `MxGatewayClientTlsTests` against a real in-process TLS server. |
|
||||
| 10 | Documentation & comments | Issue found: the README "Browsing lazily" first code snippet calls `galaxy.browseChildren(BrowseChildrenRequest…)`, but no such method exists on `GalaxyRepositoryClient` — the raw single-RPC method is `browseChildrenRaw(BrowseChildrenRequest)`; the documented snippet does not compile (Client.Java-037). |
|
||||
|
||||
## Findings
|
||||
|
||||
### Client.Java-001
|
||||
@@ -662,4 +691,56 @@ This is the same maintenance-hazard pattern Client.Java-009 / Client.Java-016 id
|
||||
|
||||
**Resolution:** 2026-05-24 — Extracted a package-private abstract base `MxGatewayStreamSubscription<TRequest, TResponse> implements AutoCloseable` (new file `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayStreamSubscription.java`). It holds the shared `AtomicReference<ClientCallStreamObserver<TRequest>>` and `AtomicBoolean cancelled` pair, the `wrap(StreamObserver<TResponse>)` factory that returns a `ClientResponseObserver` with the Client.Java-014 close-before-beforeStart fix baked in, the `cancel()` / `close()` implementation, and an immutable `cancelMessage` injected by the subclass constructor. The four prior 60-line near-clones (`MxGatewayEventSubscription`, `MxGatewayAlarmFeedSubscription`, `MxGatewayActiveAlarmsSubscription`, `DeployEventSubscription`) collapse to ~10-line subclasses that only declare their `<Request, Response>` type parameters and supply the cancel-message string to `super(...)`. Public API surface is preserved: each subclass remains a `public final class` with a public no-arg constructor (the constructor was implicit on the original classes; I made it explicit `public` on the subclasses so the existing CLI `FakeClient.streamAlarms` in a different package can still `new MxGatewayAlarmFeedSubscription()`). The `wrap(...)` method is `final` and package-private on the base — same accessibility the four subclasses had before — so production callers in `MxGatewayClient`/`GalaxyRepositoryClient` see no change. 
New test file `MxGatewayStreamSubscriptionContractTests` exercises the lifecycle/cancellation contract identically across all four subclasses (16 tests, four per scenario): (a) cancel-before-beforeStart eagerly cancels the stream once it attaches with the subclass-specific message, (b) cancel-after-beforeStart forwards directly to the stream, (c) `close()` delegates to `cancel()`, (d) the wrapped observer forwards `onNext`/`onError`/`onCompleted` verbatim, and a compile-time `typeBoundsCheck` helper that asserts each subclass still binds its `<Req, Resp>` parameters to the right proto types. TDD red phase confirmed: temporarily breaking one subclass's `super(...)` message to `"BROKEN MESSAGE"` made the contract test for that subclass fail with `expected: <client cancelled alarm feed> but was: <BROKEN MESSAGE>`; restoring the correct value turned all 16 contract tests green. Future fixes to the shared lifecycle now live in one place — the next Client.Java-014/021-style race fix cannot drift across the four classes.
|
||||
|
||||
### Client.Java-037
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Documentation & comments |
|
||||
| Location | `clients/java/README.md:138-149` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The "Browsing lazily" section's first (low-level) code snippet documents a `browseChildren` method that does not exist on the public client surface:
|
||||
|
||||
```java
|
||||
BrowseChildrenReply reply = galaxy.browseChildren(
|
||||
BrowseChildrenRequest.newBuilder().build());
|
||||
```
|
||||
|
||||
`GalaxyRepositoryClient` exposes only `browse()`, `browse(BrowseChildrenOptions)`, and the raw single-RPC method `browseChildrenRaw(BrowseChildrenRequest)` (verified at `GalaxyRepositoryClient.java:227,238,251`). There is no `browseChildren(BrowseChildrenRequest)`, so the documented snippet fails to compile — a user copy-pasting the primary low-level example hits a missing-symbol error immediately. The README hedges the snippet with "This snippet documents the API as it appears once the Java client is regenerated on the Windows host," but the discrepancy is not a regeneration artifact: the hand-written wrapper method is named `browseChildrenRaw`, not `browseChildren`. The adjacent "High-level walker" snippet (`galaxy.browse()`, `root.expand()`, `root.getChildren()`, `child.hasChildrenHint()`, `child.getObject().getTagName()`) is correct against the actual API; only the low-level snippet is wrong.
|
||||
|
||||
**Recommendation:** Change `galaxy.browseChildren(` to `galaxy.browseChildrenRaw(` in the low-level snippet so it matches the real method name, or replace the low-level example with the `browse()`/`LazyBrowseNode` walker that the SDK actually intends as the primary surface. Drop the "as it appears once regenerated" caveat once the snippet compiles against the current source. Consider an `installDist`-based or compile-checked doc snippet test to prevent README API drift, mirroring the parse-only assertions added for Client.Java-032.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed against source: `GalaxyRepositoryClient` (`zb-mom-ww-mxgateway-client/.../GalaxyRepositoryClient.java:227,238,251`) exposes only `browse()`, `browse(BrowseChildrenOptions)`, and the raw single-RPC `browseChildrenRaw(BrowseChildrenRequest)` — there is no `browseChildren(BrowseChildrenRequest)`, so the documented snippet did not compile. Fixed the README "Browsing lazily" low-level snippet at `clients/java/README.md` by renaming `galaxy.browseChildren(` to `galaxy.browseChildrenRaw(`; the surrounding accessors (`BrowseChildrenReply`/`BrowseChildrenRequest` types, `getChildrenList()`, `getChildHasChildrenList()`, `getTagName()`) are all valid proto accessors and were left unchanged. Replaced the misleading "as it appears once the Java client is regenerated on the Windows host" caveat (the discrepancy was a hand-written wrapper name, not a codegen artifact) with prose steering callers to the high-level `browse()`/`LazyBrowseNode` walker as the preferred surface and `browseChildrenRaw` as the direct-paging escape hatch. Documentation-only change; no test added (no compile-checked doc-snippet harness exists yet — left as the noted future enhancement).
|
||||
|
||||
### Client.Java-038
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Code organization & conventions |
|
||||
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1347-1393` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** Commit `a276f46` added `requireCertificateValidation` to `MxGatewayClientOptions` as a first-class TLS-trust toggle (lenient-by-default; set `true` to verify against the JVM trust store without pinning a CA). The CLI `CommonOptions` exposes `--plaintext`, `--ca-file`, and `--server-name-override` and propagates them through `toClientOptions()`, but it neither declares a `--require-certificate-validation` option nor sets `builder.requireCertificateValidation(...)`. CLI users therefore have no way to request strict verification short of supplying a pinned CA via `--ca-file`; the lenient `InsecureTrustManagerFactory` default is forced on every non-pinned TLS CLI connection. This is the same additive-surface gap pattern as the resolved Client.Java-025 (`shutdownTimeout` not propagated to the CLI). `docs/CrossLanguageSmokeMatrix.md` documents `--require-certificate-validation` for the Rust CLI's pin-only stack but not Java, so this is not a direct README contradiction; it is a library-vs-CLI surface inconsistency. Severity is Low because the secure-by-pinning path (`--ca-file`) remains available and the lenient default is the documented intended behaviour for this internal tool.
|
||||
|
||||
**Recommendation:** Add a `--require-certificate-validation` boolean option to `CommonOptions` (default unset/false to preserve the lenient default) and propagate it into `toClientOptions()` via `builder.requireCertificateValidation(value)`. Include the resolved value in `redactedJsonMap()` so `--json` output reflects the effective trust posture. Add a CLI parse-only assertion exercising the flag to keep the CLI surface tracking the library surface.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed against source: `MxGatewayClientOptions` (`zb-mom-ww-mxgateway-client/.../MxGatewayClientOptions.java:108,260`) exposes `requireCertificateValidation()` and a `Builder.requireCertificateValidation(boolean)`, but the CLI `CommonOptions` in `MxGatewayCli.java` declared no flag and `toClientOptions()` never set it, forcing the lenient default on every non-pinned TLS CLI connection. Added a bare-boolean `@Option(names = "--require-certificate-validation")` field to `CommonOptions` (defaults to `false`, preserving the lenient default; mirrors the existing `--plaintext` flag-style option), propagated it through `toClientOptions()` via `.requireCertificateValidation(requireCertificateValidation)`, and added it to `redactedJsonMap()` so `--json` output reflects the effective trust posture. Documented the new flag and the lenient-by-default trust posture in `clients/java/README.md`. Note: the Client.Java-025 precedent (`shutdownTimeout`) was applied to the pre-rename `mxgateway-cli` module and is not present in this renamed `zb-mom-ww-mxgateway-cli` `toClientOptions()`; I mirrored the live `--ca-file`/`--server-name-override` TLS-option plumbing pattern instead, which is the correct precedent here. Regression tests in `MxGatewayCliTests`: `requireCertificateValidationFlagPropagatesThroughToClientOptions` (drives `acknowledge-alarm --require-certificate-validation` through a new `CapturingClientFactory` that records `options.toClientOptions()` and asserts `MxGatewayClientOptions.requireCertificateValidation()` is `true`) and `requireCertificateValidationDefaultsToLenientWhenFlagAbsent` (asserts the flag defaults to `false`). The capturing factory exercises the real `toClientOptions()` propagation, stronger than a parse-only check.
|
||||
|
||||
|
||||
|
||||
### Client.Java-039
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | High |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1699` (origin: `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto`, `AlarmFeedMessage.payload` provider-status arm added in commit `1d85db7`) |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The Java CLI does not compile at HEAD `410acc9`. `formatAlarmFeedMessage` switches over `message.getPayloadCase()` as an exhaustive switch *expression* with no `default`, covering only `ACTIVE_ALARM`, `SNAPSHOT_COMPLETE`, `TRANSITION`, and `PAYLOAD_NOT_SET`. The alarm-provider-fallback contract change `1d85db7` added a fourth `AlarmFeedMessage.payload` oneof arm (`AlarmProviderStatus provider_status`), so the generated `PayloadCase` enum now has a `PROVIDER_STATUS` value the switch does not handle — `javac` rejects it with "the switch expression does not cover all possible input values" and `gradle :zb-mom-ww-mxgateway-cli:compileJava` fails. This is the same class of cross-component contract-propagation break as Client.Rust-030 and IntegrationTests-026: a new contract field that left a downstream exhaustive consumer uncompilable. The original re-review (Client.Java-037/038) missed it because there is no JVM on the macOS review host and `gradle` could not be run; the break surfaced when the fixes were verified on the Windows host. Because the CLI is the cross-language e2e driver, the whole Java client artifact set cannot build and no Java e2e smoke can run.
|
||||
|
||||
**Recommendation:** Add a `PROVIDER_STATUS` arm to `formatAlarmFeedMessage` that renders the provider status (mode / degraded / reason) consistently with the other alarm-feed arms — do not add a `default ->` that silently drops it, since the provider status is meaningful and the exhaustive switch is the compiler-enforced guard that catches exactly this kind of future contract drift.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed via `gradle :zb-mom-ww-mxgateway-cli:compileJava` failing with "the switch expression does not cover all possible input values" at `MxGatewayCli.java:1699` on the Windows host. Added a `case PROVIDER_STATUS ->` arm to `formatAlarmFeedMessage` yielding `provider-status mode=%s degraded=%b reason=%s` (from `AlarmProviderStatus.getMode().name()` / `getDegraded()` / `getReason()`), plus the `import mxaccess_gateway.v1.MxaccessGateway.AlarmProviderStatus;`. No `default` arm — the exhaustive switch expression remains the compile-time guard against future `payload` oneof additions. Verified `gradle test` builds and passes on the Windows host (Java 21).
|
||||
|
||||
@@ -4,16 +4,48 @@
|
||||
|---|---|
|
||||
| Module | `clients/python` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-24 |
|
||||
| Commit reviewed | `42b0037` |
|
||||
| Review date | 2026-06-15 |
|
||||
| Commit reviewed | `410acc9` |
|
||||
| Status | Re-reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
### 2026-06-15 re-review (commit 410acc9)
|
||||
|
||||
Re-review pass at `410acc9`. The diff against the previous review base
|
||||
`42b0037` covers: PyPI metadata + Gitea PyPI-feed install instructions in
|
||||
`pyproject.toml` / `README.md`; a new lazy Galaxy browse surface
|
||||
(`GalaxyRepositoryClient.browse_children_raw` / `browse` / `_iter_browse_children`,
|
||||
the `LazyBrowseNode` walker, and `BrowseChildrenOptions`); a TLS
|
||||
trust-on-first-use (TOFU) default in `options.py` gated by a new
|
||||
`ClientOptions.require_certificate_validation` flag; the `_use_plaintext`
|
||||
TLS-default contract carried forward; and the `batch` `CliRunner`-removal
|
||||
follow-through. The new browse / TOFU surface is well tested
|
||||
(`tests/test_galaxy.py`, `tests/test_auth_options.py`, `tests/test_tls.py`).
|
||||
|
||||
`python -m pytest` passes (80 passed, 1 skipped — the loopback-TLS test is
|
||||
opt-in via `MXGATEWAY_RUN_TLS_TESTS=1`). `python -m pip wheel .` builds the
|
||||
wheel cleanly against the installed setuptools 82.0.1.
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | Issue found: `_split_authority` raises an uncaught `ValueError` for a port-less endpoint instead of a transport error (Client.Python-029). |
|
||||
| 2 | mxaccessgw conventions | No new issues found — secrets still redacted, generated code untouched, no committed tokens in the new Gitea feed URLs (placeholders only). |
|
||||
| 3 | Concurrency & thread safety | No new issues found — `LazyBrowseNode.expand` uses a per-node `asyncio.Lock` with a double-checked guard and is verified concurrent-safe by `test_browse_expand_concurrent_callers_only_fire_one_rpc`. |
|
||||
| 4 | Error handling & resilience | Issue found: the TOFU branch calls the blocking `ssl.get_server_certificate` with no timeout from inside the `async def connect` path, blocking the event loop and hanging indefinitely on a black-holed host (Client.Python-028). |
|
||||
| 5 | Security | Issue found: the new `require_certificate_validation` security flag is not reachable through the documented `connect(...)` convenience kwargs or any CLI flag, so callers using those paths are locked into TOFU and cannot force certificate validation (Client.Python-027). TOFU itself is design-sanctioned (`docs/GatewayConfiguration.md` line 470). |
|
||||
| 6 | Performance & resource management | No new issues found beyond the blocking TLS probe captured in Client.Python-028. |
|
||||
| 7 | Design-document adherence | No new issues found — TOFU default, `require_certificate_validation` naming, and the BrowseChildren surface match `docs/GatewayConfiguration.md` / `docs/GalaxyRepository.md`; both README doc anchors resolve. |
|
||||
| 8 | Code organization & conventions | Issue found: `pyproject.toml` uses the PEP 639-deprecated `license = { text = ... }` table form (Client.Python-030). pyproject metadata is otherwise correct and the wheel builds. |
|
||||
| 9 | Testing coverage | Issue found: the `tls` pytest mark used by `tests/test_tls.py` is not registered in `[tool.pytest.ini_options]`, emitting a `PytestUnknownMarkWarning` (Client.Python-031). New browse / TOFU paths are otherwise well covered. |
|
||||
| 10 | Documentation & comments | No new issues found — README TLS/browse/Gitea-feed prose matches the code; the alarm-CLI README examples corrected under Client.Python-022 remain correct. |
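The per-node `asyncio.Lock` with a double-checked guard noted in row 3 can be sketched as follows. This is an illustrative reduction under stated assumptions, not the client's `LazyBrowseNode`: the class name `LazyNode`, the `fetch_children` callable standing in for the BrowseChildren RPC, and the attribute names are all hypothetical.

```python
import asyncio


class LazyNode:
    """Hypothetical lazily expanded node; not the real LazyBrowseNode."""

    def __init__(self, fetch_children):
        self._fetch_children = fetch_children  # async callable standing in for the RPC
        self._lock = asyncio.Lock()
        self._expanded = False
        self.children = []

    async def expand(self):
        # First check: already-expanded nodes return without touching the lock.
        if self._expanded:
            return self.children
        async with self._lock:
            # Second check: a concurrent caller may have finished while we waited.
            if self._expanded:
                return self.children
            self.children = await self._fetch_children()
            self._expanded = True
        return self.children


async def demo():
    calls = 0

    async def fetch():
        nonlocal calls
        calls += 1
        await asyncio.sleep(0)  # yield so the other callers queue on the lock
        return ["child-a", "child-b"]

    node = LazyNode(fetch)
    results = await asyncio.gather(*(node.expand() for _ in range(5)))
    return calls, results
```

If the fetch raises, `_expanded` stays false in this sketch, so a later caller retries rather than caching the failure.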
|
||||
|
||||
### Prior coverage (commit a020350)
|
||||
|
||||
A re-review at commit `a020350` over the same module. Prior findings
|
||||
(Client.Python-001 — Client.Python-017) remain closed and are kept as
|
||||
history. This section reflects categories evaluated in this pass.
|
||||
history. This section reflects categories evaluated in that pass.
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
@@ -1171,3 +1203,238 @@ scope; `test_commands_module_bench_read_bulk_does_not_use_bare_except_pass`
|
||||
greps the function source for the `except Exception:\n pass` pattern
|
||||
and rejects it. Both tests failed against the pre-fix source and pass
|
||||
against the fix.
|
||||
|
||||
### Client.Python-027
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Security |
|
||||
| Location | `clients/python/src/zb_mom_ww_mxgateway/client.py:36-54`, `clients/python/src/zb_mom_ww_mxgateway/galaxy.py:47-66`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:165-172,918-930` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** This commit adds `ClientOptions.require_certificate_validation`
|
||||
(default `False`) so a caller can force system-trust certificate verification
|
||||
instead of the new lenient trust-on-first-use (TOFU) default. The flag is
|
||||
honoured inside `create_channel`, but it is not surfaced through either of the
|
||||
two documented ways a normal caller dials the gateway:
|
||||
|
||||
1. `GatewayClient.connect(...)` and `GalaxyRepositoryClient.connect(...)` accept
|
||||
the convenience kwargs `endpoint` / `api_key` / `plaintext` / `ca_file` /
|
||||
`server_name_override` and build the `ClientOptions` internally, but do **not**
|
||||
accept or forward `require_certificate_validation`. The README's high-level
|
||||
examples (e.g. the lazy-browse walker) use exactly this kwarg form
|
||||
(`GalaxyRepositoryClient.connect(endpoint=..., api_key=..., plaintext=True)`),
|
||||
so the kwarg path is the primary documented entry point.
|
||||
2. The CLI exposes `--plaintext`, `--tls`, and `--ca-file` but no
|
||||
`--require-certificate-validation` flag, and `_connect` constructs
|
||||
`ClientOptions(...)` without setting the field. A CLI user connecting to a
|
||||
TLS gateway is therefore locked into TOFU.
|
||||
|
||||
The net effect is that the *only* way to opt into real certificate validation is
|
||||
to construct a `ClientOptions` instance directly and pass it as the positional
|
||||
`options=` argument — a path neither the README nor the CLI documents. A
|
||||
security-sensitive deployment that wants the strict (verify-against-system-trust)
|
||||
posture cannot select it through the documented surface, so it silently stays on
|
||||
TOFU. TOFU itself is design-sanctioned (`docs/GatewayConfiguration.md` line 470
|
||||
explicitly says "Python uses trust-on-first-use"), so this is an opt-in-to-strict
|
||||
reachability gap rather than an insecure default — hence Medium with a workaround.
|
||||
|
||||
**Recommendation:** Add a `require_certificate_validation: bool = False` kwarg to
|
||||
both `GatewayClient.connect` and `GalaxyRepositoryClient.connect` and forward it
|
||||
into the constructed `ClientOptions`. Add a `--require-certificate-validation`
|
||||
(or `--verify-tls`) flag to the shared CLI option set and wire it through
|
||||
`_connect`. Add a test asserting the flag flows through to
|
||||
`ClientOptions.require_certificate_validation` and a README note documenting how
|
||||
to select the strict posture.
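The forwarding recommended above can be sketched as below. This is a simplified, synchronous illustration under assumptions: the field names follow the kwargs listed in this finding, but the `ClientOptions` dataclass shown here and the `build_options` helper are hypothetical stand-ins, and the real `connect` methods are `async` classmethods that also open the channel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClientOptions:
    """Hypothetical reduction of the client's options type."""

    endpoint: str
    api_key: Optional[str] = None
    plaintext: bool = False
    ca_file: Optional[str] = None
    server_name_override: Optional[str] = None
    require_certificate_validation: bool = False


def build_options(
    endpoint: str,
    *,
    api_key: Optional[str] = None,
    plaintext: bool = False,
    ca_file: Optional[str] = None,
    server_name_override: Optional[str] = None,
    require_certificate_validation: bool = False,
    options: Optional[ClientOptions] = None,
) -> ClientOptions:
    """Build options from convenience kwargs, forwarding the strict-TLS flag."""
    if options is not None:
        # A hand-built options object wins, matching the existing escape hatch.
        return options
    return ClientOptions(
        endpoint=endpoint,
        api_key=api_key,
        plaintext=plaintext,
        ca_file=ca_file,
        server_name_override=server_name_override,
        # The line the finding says was missing from the kwarg path.
        require_certificate_validation=require_certificate_validation,
    )
```

Keeping the default at `False` preserves the design-sanctioned TOFU behaviour for callers that do not pass the kwarg.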
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed: `connect` built `ClientOptions` from a
|
||||
fixed kwarg set that omitted `require_certificate_validation`, and the CLI had no
|
||||
flag, so the strict posture was only reachable via a hand-built `options=`. Added
|
||||
a `require_certificate_validation: bool = False` kwarg to both
|
||||
`GatewayClient.connect` and `GalaxyRepositoryClient.connect` (forwarded into the
|
||||
constructed `ClientOptions`), a `--require-certificate-validation` flag to the
|
||||
shared `gateway_options` CLI option set, and wired it through `_connect`. README
|
||||
TLS section now documents that the strict posture is reachable via the connect kwarg,
|
||||
the options struct, and the CLI flag. Tests:
|
||||
`tests/test_client_session.py::test_gateway_connect_forwards_require_certificate_validation`,
|
||||
`::test_galaxy_connect_forwards_require_certificate_validation`,
|
||||
`tests/test_cli.py::test_require_certificate_validation_flag_flows_through_connect`,
|
||||
`::test_require_certificate_validation_defaults_off` — all failed before the fix
|
||||
and pass after.
|
||||
|
||||
### Client.Python-028
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Error handling & resilience |
|
||||
| Location | `clients/python/src/zb_mom_ww_mxgateway/options.py:120-130`, `clients/python/src/zb_mom_ww_mxgateway/client.py:59`, `clients/python/src/zb_mom_ww_mxgateway/galaxy.py:71` |
|
||||
| Status | Resolved |
|
||||
|
||||
**Description:** The TOFU branch of `create_channel` calls
`ssl.get_server_certificate((host, port))` to pre-fetch the server certificate.
`create_channel` is a synchronous function, but it is invoked exclusively from
inside the `async def connect` classmethods of `GatewayClient` and
`GalaxyRepositoryClient` (`client.py:59`, `galaxy.py:71`). `ssl.get_server_certificate`
opens a real blocking TCP+TLS socket on the calling thread, so:
|
||||
|
||||
1. It **blocks the asyncio event loop** for the full duration of the connect/handshake.
   This is at odds with the rest of the client, which is fully `async`.
2. It passes **no `timeout`** to `ssl.get_server_certificate`. The `test_tofu_connect_failure_raises_transport_error`
   test only proves the *connection-refused* case (a closed port returns fast).
   A black-holed / firewall-drop host (packets silently dropped) makes the
   underlying `socket.create_connection` hang on the OS default connect timeout,
   which can be minutes, with the event loop frozen the whole time. A caller that
   wrapped `connect` in `asyncio.wait_for(...)` cannot cancel it because the block
   is in synchronous C, not at an `await` point.
|
||||
|
||||
The other TLS branches (`ca_file`, `require_certificate_validation`) build the
channel lazily and return immediately, so only the lenient default — the most
common path — has this hazard.
|
||||
|
||||
**Recommendation:** Pass an explicit `timeout=` to `ssl.get_server_certificate`
(it accepts one), bounded by `options.call_timeout` or a short fixed value, so a
black-holed host fails fast as a `MxGatewayTransportError` instead of hanging.
Better, run the synchronous probe off the event loop: make the TOFU pre-fetch
path awaitable (e.g. wrap it in `asyncio.get_running_loop().run_in_executor(...)`
from an `async` channel factory), or document that `connect` must not be called
from a running loop. Add a regression test that asserts the probe honours a
timeout.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed: the TOFU branch called
`ssl.get_server_certificate((host, port))` with no timeout from the synchronous
`create_channel`, which both `connect` classmethods invoked directly on the event
loop. Fix is two-part: (1) `create_channel` now passes
`timeout=options.call_timeout` (falling back to a fixed
`_TOFU_PROBE_TIMEOUT_SECONDS = 10.0` when no call_timeout is set) to
`ssl.get_server_certificate`, and the existing `except OSError` wraps a
timeout/connect failure into `MxGatewayTransportError` (TimeoutError/socket.timeout
are OSError subclasses); (2) both `GatewayClient.connect` and
`GalaxyRepositoryClient.connect` now run the blocking factory off the loop via
`await asyncio.to_thread(create_channel, resolved)`, so the event loop is never
frozen and a caller's `asyncio.wait_for` can cancel the connect. Tests:
`tests/test_auth_options.py::test_tofu_probe_passes_a_bounded_timeout`,
`::test_tofu_probe_timeout_raises_transport_error` (parametrized over
socket.timeout / TimeoutError / OSError), and
`tests/test_client_session.py::test_gateway_connect_runs_create_channel_off_the_event_loop`,
`::test_galaxy_connect_runs_create_channel_off_the_event_loop`. The timeout and
off-loop tests failed before the fix and pass after.
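
The two-part fix described above can be sketched roughly as follows. This is a minimal illustration, not the client's actual code: `fetch_server_certificate`, `TransportError`, and the injectable `probe` parameter are hypothetical stand-ins (for the TOFU branch of `create_channel` and for `MxGatewayTransportError`), and the `timeout=` argument to `ssl.get_server_certificate` assumes Python 3.10 or later.

```python
import asyncio
import ssl

# Fallback used when the caller supplies no call timeout (value taken from the resolution text).
_TOFU_PROBE_TIMEOUT_SECONDS = 10.0


class TransportError(Exception):
    """Hypothetical stand-in for MxGatewayTransportError."""


def fetch_server_certificate(host, port, call_timeout=None, probe=ssl.get_server_certificate):
    """Blocking TOFU pre-fetch with a bounded timeout (hypothetical helper)."""
    timeout = call_timeout if call_timeout is not None else _TOFU_PROBE_TIMEOUT_SECONDS
    try:
        # `probe` is injectable only so the sketch can be exercised without a network.
        return probe((host, port), timeout=timeout)
    except OSError as exc:
        # TimeoutError and socket.timeout are OSError subclasses, so a
        # black-holed host lands here once the timeout expires.
        raise TransportError(f"TOFU probe of {host}:{port} failed: {exc}") from exc


async def connect(host, port, call_timeout=None, probe=ssl.get_server_certificate):
    """Run the blocking probe on a worker thread so the event loop stays responsive."""
    return await asyncio.to_thread(fetch_server_certificate, host, port, call_timeout, probe)
```

One design point worth keeping in mind: `asyncio.to_thread` frees the event loop, but cancelling the awaiting task (for example via `asyncio.wait_for`) does not stop the worker thread, which keeps running until the socket timeout expires. The bounded timeout therefore still matters once the call is off the loop.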
|
||||
|
||||
### Client.Python-029
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `clients/python/src/zb_mom_ww_mxgateway/options.py:78-90` |
| Status | Resolved |
|
||||
|
||||
**Description:** `_split_authority` parses a non-bracketed target with
`host, _, port = target.rpartition(":")` and returns
`(host or "localhost", int(port) if port else 443)`. For a port-less endpoint
such as `"mygateway"`, `rpartition(":")` returns `("", "", "mygateway")`, so
`host` becomes `""` (→ `"localhost"`) and `port` becomes `"mygateway"`, and
`int("mygateway")` raises an uncaught `ValueError: invalid literal for int()`.
Because `_split_authority` is called *before* the `try/except OSError` guard in
`create_channel`, the failure escapes as a raw `ValueError` rather than the
intended `MxGatewayTransportError`, and the message does not name the endpoint.
Verified at runtime:
`_split_authority("mygateway")` → `ValueError: invalid literal for int() with base 10: 'mygateway'`.
gRPC targets normally carry an explicit port (`host:port`), so impact is narrow,
but a typo or a bare-hostname endpoint produces a confusing crash on the TOFU
default path. The bracketed-IPv6 and `host:port` cases are covered by tests; the
port-less case is not.
|
||||
|
||||
**Recommendation:** Treat a non-numeric / missing port as the default (443) and
keep the whole string as the host, e.g. detect a trailing `:<digits>` explicitly
rather than assuming the `rpartition` tail is numeric, or wrap the `int(port)`
conversion so a non-numeric tail falls back to host-only with the default port.
Add a `_split_authority("mygateway")` case to `tests/test_tls.py`.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed: `_split_authority("mygateway")` raised
`ValueError: invalid literal for int() with base 10: 'mygateway'` because
`rpartition(":")` put the whole string in the port slot. Rewrote the
non-bracketed branch to inspect the `rpartition` separator and the tail: no colon
→ whole target is the host with default port 443; a colon with a non-digit/empty
tail → left side is the host with default port 443; a digit tail → parse the
port. The bare-hostname case now returns `("mygateway", 443)` instead of raising,
and the existing `":5120"` / `"localhost:5120"` / IPv6 cases are unchanged. Test:
`tests/test_tls.py::test_split_authority_defaults_port_for_portless_endpoint`
(covers `"mygateway"`, `"https://mygateway"`, and `"mygateway:"`) — failed before
the fix and passes after.
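
The three-way rule in the resolution could look roughly like the sketch below. It covers only the non-bracketed branch; `split_host_port` and `DEFAULT_PORT` are hypothetical names, and scheme stripping and bracketed-IPv6 handling are assumed to happen elsewhere. The `"localhost"` fallback for an empty host mirrors the original `host or "localhost"` expression.

```python
DEFAULT_PORT = 443  # default taken from the finding; the constant name is hypothetical


def split_host_port(target: str) -> tuple[str, int]:
    """Split a non-bracketed `host[:port]` target without ever raising ValueError."""
    host, sep, tail = target.rpartition(":")
    if not sep:
        # No colon at all: the whole target is the host.
        return (target or "localhost", DEFAULT_PORT)
    if tail.isascii() and tail.isdigit():
        # A purely numeric tail is the port.
        return (host or "localhost", int(tail))
    # Colon with an empty or non-numeric tail: keep the left side, default the port.
    return (host or "localhost", DEFAULT_PORT)
```

Checking `isascii()` alongside `isdigit()` is a deliberate choice in this sketch: `str.isdigit()` accepts some non-ASCII digit characters that `int()` rejects, so the combined check keeps the conversion from raising.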
|
||||
|
||||
### Client.Python-030
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `clients/python/pyproject.toml:17` |
| Status | Resolved |
|
||||
|
||||
**Description:** This commit re-adds a `license` key to `pyproject.toml` as the
table form `license = { text = "Proprietary" }`. Under PEP 639 (active in the
installed setuptools 82.0.1), the `[project.license]` **table** forms (`text` and
`file`) are deprecated in favour of the SPDX string expression, and a future
setuptools major may reject them — the same class of regression that
Client.Python-018 (the earlier `license = "Proprietary"` string, rejected because
`Proprietary` is not a valid SPDX identifier) recorded for this exact field. The
build currently succeeds (verified: `python -m pip wheel .` produces
`zb_mom_ww_mxaccess_gateway_client-0.1.0-py3-none-any.whl` and the metadata
carries `License: Proprietary` plus the `License :: Other/Proprietary License`
classifier), so this is a forward-looking maintainability flag, not a present
breakage. Note that pairing a `license` table with a `License ::` trove
classifier is also flagged by PyPI/twine as redundant under the new metadata
rules.
|
||||
|
||||
**Recommendation:** Prefer the PEP 639 SPDX-string form with a `LicenseRef-*`
custom identifier for an unlisted licence (`license = "LicenseRef-Proprietary"`)
— this is the future-proof equivalent of the intent and avoids the deprecated
table form — or drop the `license` key entirely and rely on the existing
`License :: Other/Proprietary License` classifier (the Client.Python-018
resolution chose this). The `tests/test_packaging.py::test_pip_wheel_build_succeeds`
guard (added under Client.Python-020) will catch the day a setuptools upgrade
turns the deprecation into a hard error.
|
||||
|
||||
**Resolution:** 2026-06-15 — Switched the deprecated `license = { text =
"Proprietary" }` table form to the PEP 639 SPDX-string form
`license = "LicenseRef-Proprietary"` (the future-proof custom identifier for an
unlisted/proprietary licence). Also removed the now-redundant
`License :: Other/Proprietary License` trove classifier, which setuptools >= 77
flags as conflicting when a `License-Expression` is present. The built wheel
metadata now carries `License-Expression: LicenseRef-Proprietary` and no
`Classifier: License ::` line. Verified by `python -m pip wheel . --no-deps`,
which builds cleanly; the existing
`tests/test_packaging.py::test_pip_wheel_build_succeeds` guard exercises the same
build and passes.
|
||||
|
||||
### Client.Python-031
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `clients/python/tests/test_tls.py:34`, `clients/python/pyproject.toml:53-56` |
| Status | Resolved |
|
||||
|
||||
**Description:** `tests/test_tls.py` applies a module-level
`pytestmark = pytest.mark.tls`, but the `tls` marker is not registered in
`[tool.pytest.ini_options]` (which declares only `addopts`, `pythonpath`, and
`testpaths`). Every run emits a `PytestUnknownMarkWarning: Unknown
pytest.mark.tls - is this a typo?`. The warning is benign today, but (a) the
warning exists to catch mistyped markers, so a future genuinely mistyped marker
would be lost in the noise, and (b) if the suite ever adopts
`filterwarnings = ["error"]` (a common hardening step), the unregistered marker
would turn into a hard collection failure.
|
||||
|
||||
**Recommendation:** Register the marker, e.g.
`markers = ["tls: loopback TLS tests, opt-in via MXGATEWAY_RUN_TLS_TESTS=1"]`
under `[tool.pytest.ini_options]` in `clients/python/pyproject.toml`.
|
||||
|
||||
**Resolution:** 2026-06-15 — Registered the `tls` marker by adding
`markers = ["tls: loopback TLS tests, opt-in via MXGATEWAY_RUN_TLS_TESTS=1"]`
under `[tool.pytest.ini_options]` in `clients/python/pyproject.toml`.
`python -m pytest` now reports no `PytestUnknownMarkWarning` (full run: 91
passed, 1 skipped, 0 warnings; previously 1 warning). The `tls`-marked
`tests/test_tls.py` module is the guard — its run is now warning-free.
|
||||
|
||||
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `clients/rust` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-24 |
|
||||
| Commit reviewed | `42b0037` |
|
||||
| Review date | 2026-06-15 |
|
||||
| Commit reviewed | `410acc9` |
|
||||
| Status | Re-reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -96,6 +96,25 @@ under review does not address them.
|
||||
| 9 | Testing coverage | Issue found: zero tests cover `stream_alarms` on `GatewayClient`, the new bulk read/write SDK methods, or the `BenchReadBulk` flow; the fake gateway's `stream_alarms` impl drops the sender immediately (Client.Rust-024). |
| 10 | Documentation & comments | Issue found: `.cargo/config.toml`'s comment promises "Release builds are unaffected" but the `link-arg=/STACK:8388608` setting is unconditional under `cfg(windows)` and only applies to the MSVC linker (Client.Rust-027). |
|
||||
|
||||
### 2026-06-15 re-review (commit 410acc9)
|
||||
|
||||
Re-review pass at `410acc9`. The diff against `42b0037` (`git diff 42b0037..HEAD -- clients/rust/`) covers: Cargo metadata + Gitea alternative-registry config (`Cargo.toml`, `.cargo/config.toml`, README install section); a `[registries.dohertj2-gitea]` index entry and `publish = ["dohertj2-gitea"]` with `mxgw-cli` set `publish = false`; the resolution work for Client.Rust-022..029 (malformed-reply `Result` plumbing, `next_correlation_id` re-export, clippy fixes, `read_bulk<S: AsRef<str>>`); a **new** Galaxy lazy-browse walker (`browse`, `browse_children_raw`, `browse_children_inner`, `BrowseChildrenOptions`, `LazyBrowseNode`) with six unit tests; a **new** TLS pin-only guard (`build_tls_config` + `ClientOptions::with_require_certificate_validation` + `--require-certificate-validation` CLI flag) with a new `tests/tls.rs`; and the alarm-provider-fallback proto surface (`AlarmFeedMessage.provider_status`, added contracts-side in `1d85db7`).
|
||||
|
||||
`cargo fmt --check` is clean. `cargo check -p zb-mom-ww-mxgateway-client`, `cargo test -p zb-mom-ww-mxgateway-client` (24 lib + integration, 4 proto-fixture, 4 tls — all pass), and the library half of the workspace are clean. **`cargo clippy --workspace --all-targets -- -D warnings` and `cargo check --workspace` both FAIL at HEAD** — not on a lint but on a hard `E0004` compile error: the `mxgw-cli` binary's two `match &message.payload` blocks (`crates/mxgw-cli/src/main.rs:1731,1757`) are non-exhaustive after the proto added `AlarmFeedMessage.payload::ProviderStatus` (Client.Rust-030). The library crate compiles and all its tests pass; the break is confined to the CLI binary. No committed registry tokens — `.cargo/config.toml` carries only the sparse index URL; the README documents the token living in `~/.cargo/credentials.toml`.
|
||||
|
||||
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issue found: `mxgw-cli` fails to compile at HEAD — non-exhaustive `AlarmFeedMessage.payload` match missing the new `ProviderStatus` arm (Client.Rust-030). The library `read_bulk`/galaxy-walker/TLS-guard logic is correct and tested. |
| 2 | mxaccessgw conventions | Issue found: `cargo clippy --workspace --all-targets -- -D warnings` / `cargo check --workspace` do not pass — CLAUDE.md mandates they do (Client.Rust-030). The prior 029 clippy regressions are resolved; this is a new build break from the alarm-provider proto change. |
| 3 | Concurrency & thread safety | No issues found — `LazyBrowseNode` shares state via `Arc<…AsyncMutex<…>>`; `expand()` holds the mutex across the `browse_children_inner` await so concurrent expanders serialize and the idempotency check is race-free. `CORRELATION_SEQUENCE` is still `AtomicU64`/`Relaxed`. No `unsafe`. |
| 4 | Error handling & resilience | Issue found: the strict TLS path (`require_certificate_validation(true)` with no CA) builds a `ClientTlsConfig` with zero trust roots (no `tls-native-roots`/`tls-webpki-roots` feature, no `.with_*_roots()` call), so it cannot validate any certificate — contradicting the documented "verify against the system trust roots" behaviour (Client.Rust-031). The galaxy page-token loop has a correct repeated-token guard. |
| 5 | Security | No issues found in the registry/secret surface — `.cargo/config.toml` holds only the sparse index URL, no token; README puts the Bearer token in `~/.cargo/credentials.toml` (uncommitted). (See Client.Rust-031 for the strict-TLS validation gap, classified under error handling.) |
| 6 | Performance & resource management | No issues found — `read_bulk` is now borrow-based (`&[S]`), the bench loop reuses `tags_ref` (Client.Rust-026 resolved). The walker clones the `GalaxyClient` channel handle per node, which is the intended cheap `Channel` clone. |
| 7 | Design-document adherence | Issue found: `RustClientDesign.md` is not updated for the new Galaxy lazy-browse SDK surface (`browse` / `browse_children_raw` / `LazyBrowseNode` / `BrowseChildrenOptions`); CLAUDE.md requires docs to change with the source (Client.Rust-032). The TLS pin-only section pre-dates this diff but repeats the inaccurate "system trust roots" claim (cross-referenced from Client.Rust-031). |
| 8 | Code organization & conventions | No issues found — Cargo metadata (name/version/license/repository/keywords/categories) is well-formed; `publish = ["dohertj2-gitea"]` on the library and `publish = false` on `mxgw-cli` is the right split. `license = "Proprietary"` is non-SPDX but cargo accepts it and it is a deliberate closed-source marker. |
| 9 | Testing coverage | No issues found in the new surface — the walker has six unit tests (roots, expand, idempotency, NotFound, multi-page, filter-forwarding) and TLS has four. Gap noted: `tls_with_require_certificate_validation_does_not_short_circuit` connects to a dead address, so it only asserts the guard does not fire and never exercises a real handshake — which is why the no-trust-roots defect in Client.Rust-031 is not caught by a test. |
| 10 | Documentation & comments | Issue found: the `alarm_feed_message_summary` / `alarm_feed_message_to_json` doc comments still say "three `payload` oneof cases" (`main.rs:1729,1755`) although the proto now has four; folded into Client.Rust-030's fix. The TLS doc inaccuracy is Client.Rust-031. |
|
||||
|
||||
## Findings
|
||||
|
||||
### Client.Rust-001
|
||||
@@ -687,3 +706,59 @@ The third error (`BulkReplyKind` enum-variant-names) is also touched by the diff
|
||||
**Recommendation:** Re-apply Client.Rust-001 (add doc comments on `with_max_grpc_message_bytes` / `max_grpc_message_bytes` in `options.rs`), Client.Rust-002 (drop the `Bulk` suffix from `BulkReplyKind`'s variants so they become `AddItem` / `AdviseItem` / …, or add a narrowly-scoped `#[allow(clippy::enum_variant_names)]` with a reason comment), and Client.Rust-012 (replace `last_deploy.lock().unwrap().clone()` with `*last_deploy.lock().unwrap()` in `galaxy.rs:282`). Verify with `cargo clippy --workspace --all-targets -- -D warnings`. Consider adding a pre-commit / CI gate so the next reviewer never has to discover the regression by running clippy.
|
||||
|
||||
**Resolution:** 2026-05-24 — Re-applied all three resolutions. `clients/rust/src/options.rs` now has `///` doc comments on `with_max_grpc_message_bytes` and `max_grpc_message_bytes`. `clients/rust/src/galaxy.rs:282` uses `*self.state.last_deploy.lock().unwrap()` instead of `.clone()`. `clients/rust/src/session.rs`'s `BulkReplyKind` variants are renamed to `AddItem` / `AdviseItem` / `RemoveItem` / `UnAdviseItem` / `Subscribe` / `Unsubscribe` (no shared `Bulk` suffix), with the call sites in `add_item_bulk` / `advise_item_bulk` / `remove_item_bulk` / `un_advise_item_bulk` / `subscribe_bulk` / `unsubscribe_bulk` updated accordingly. The sibling `BulkWriteReplyKind` already had non-suffix-sharing variants (`Write` / `Write2` / `WriteSecured` / `WriteSecured2`) and required no rename. `cargo clippy --workspace --all-targets -- -D warnings` is clean at HEAD.
|
||||
|
||||
### Client.Rust-030
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `clients/rust/crates/mxgw-cli/src/main.rs:1731,1757` (origin: `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto:909-924`, added in commit `1d85db7`) |
| Status | Resolved |
|
||||
|
||||
**Description:** The `mxgw-cli` binary does not compile at HEAD `410acc9`. `cargo check --workspace`, `cargo clippy --workspace --all-targets -- -D warnings`, `cargo build --workspace`, and `cargo test --workspace` all fail with a hard `E0004` (non-exhaustive patterns), so the entire documented Rust build/test/clippy workflow that CLAUDE.md mandates is broken:
|
||||
|
||||
```
error[E0004]: non-exhaustive patterns: `&Some(...alarm_feed_message::Payload::ProviderStatus(_))` not covered
--> crates/mxgw-cli/src/main.rs:1731:11
error[E0004]: non-exhaustive patterns: `&Some(...alarm_feed_message::Payload::ProviderStatus(_))` not covered
--> crates/mxgw-cli/src/main.rs:1757:11
```
|
||||
|
||||
The alarm-provider-fallback contract change (`1d85db7`, within the reviewed range) added a fourth `AlarmFeedMessage.payload` oneof arm — `AlarmProviderStatus provider_status = 4`. tonic-build regenerates the Rust enum with the new `ProviderStatus` variant, but `alarm_feed_message_summary` (`main.rs:1731`) and `alarm_feed_message_to_json` (`main.rs:1757`) each `match &message.payload` exhaustively over only `ActiveAlarm` / `SnapshotComplete` / `Transition` / `None` with no wildcard arm. Because they are exhaustive matches on a now-larger enum, the binary fails to compile rather than silently mishandling the new variant. The library crate (`zb-mom-ww-mxgateway-client`) itself compiles cleanly and all 32 of its tests pass; the break is confined to the CLI — but the CLI is the cross-language e2e matrix driver, so the whole `clients/rust` workspace is unbuildable and no Rust e2e smoke can run against the gateway at this commit. This is the alarm-surface gap the review request asked to check: the `ProviderStatus` payload is unhandled in the only place the Rust client renders the alarm feed.
|
||||
|
||||
**Recommendation:** Add a `Some(alarm_feed_message::Payload::ProviderStatus(status))` arm to both `alarm_feed_message_summary` and `alarm_feed_message_to_json` (render the provider-status fields — mode, degraded/provenance, reference — consistent with how the .NET/Go/Java/Python CLIs serialise it so the cross-language parity matcher recognises the payload). While there, update the two doc comments that still say "three `payload` oneof cases" (`main.rs:1729,1755`) to four. Verify with `cargo clippy --workspace --all-targets -- -D warnings` and `cargo test --workspace`. Consider a CI gate so a contract change that adds a oneof arm cannot leave the Rust CLI unbuildable again.
|
||||
|
||||
**Resolution:** 2026-06-15 — Root cause confirmed: the contract's new fourth `AlarmFeedMessage.payload` oneof arm (`AlarmProviderStatus provider_status`, proto fields `mode`/`degraded`/`reason`/`since`) left both `match &message.payload` blocks non-exhaustive (`E0004`). Added a `Some(alarm_feed_message::Payload::ProviderStatus(status))` arm to both `alarm_feed_message_summary` (`mode`/`degraded`/`reason` one-liner) and `alarm_feed_message_to_json` (a `providerStatus` object with `mode`/`degraded`/`reason`/`since`), added an `AlarmEnumName::provider_mode` enum-name helper consistent with the existing `condition_state`/`transition_kind` renderers, and updated the summary doc comment to "four payload oneof cases". No `_ => {}` wildcard. Test: `alarm_feed_provider_status_renders_in_summary_and_json` (in `crates/mxgw-cli/src/main.rs`). All four cargo commands now pass.
|
||||
|
||||
### Client.Rust-031
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `clients/rust/src/options.rs:196-240` (`build_tls_config`); `clients/rust/Cargo.toml:40` (tonic features); docs: `clients/rust/src/options.rs:76-101`, `clients/rust/README.md` (TLS trust section), `clients/rust/crates/mxgw-cli/src/main.rs:429-431`, `clients/rust/RustClientDesign.md:202` |
| Status | Resolved |
|
||||
|
||||
**Description:** The new strict-verification escape hatch does not do what it documents. `build_tls_config` only configures trust roots when a CA file is pinned: with `require_certificate_validation(true)` and no `ca_file`, it returns a bare `ClientTlsConfig::new()` and never calls `.with_native_roots()`, `.with_webpki_roots()`, or `.with_enabled_roots()`. The crate also enables only `tonic` feature `tls-ring` (`Cargo.toml:40`) — neither `tls-native-roots` nor `tls-webpki-roots` is on, so even if the code wanted to enable system roots the methods are feature-gated out. A `ClientTlsConfig` with zero trust anchors rejects every server certificate during the rustls handshake, so the strict path cannot connect to any TLS gateway — not even one whose certificate is genuinely chained to a system root. Yet `with_require_certificate_validation`'s doc comment (`options.rs:80-89`), the README "TLS trust (pin-only)" section, the `--require-certificate-validation` CLI flag help (`main.rs:429-431`), and `RustClientDesign.md:202` all tell the user this option will "verify against the system trust roots." The documented behaviour is unreachable; the only working TLS path is CA pinning (`with_ca_file`).
|
||||
|
||||
This is masked by the tests: `tls_with_require_certificate_validation_does_not_short_circuit` (`tests/tls.rs`) dials a dead address (`https://127.0.0.1:1`) and only asserts the no-CA guard error does *not* fire — it never reaches a handshake, so the absent-roots defect is invisible to the suite.
|
||||
|
||||
**Recommendation:** Either (a) make the strict path actually load system roots — add the `tls-native-roots` (and/or `tls-webpki-roots`) feature to the `tonic` dependency and call `tls = tls.with_native_roots()` (or `.with_enabled_roots()`) in the `require_certificate_validation == true && ca_file.is_none()` branch of `build_tls_config` — and add a test that pins a self-signed cert as a CA and asserts a system-root-only connection to that same server is *rejected* (proving roots are actually consulted); or (b) if loading system roots is intentionally out of scope for v1, correct every doc site (the `with_require_certificate_validation` doc comment, README, CLI flag help, and `RustClientDesign.md`) to state that the strict flag does not currently enable any trust roots and that CA pinning is the only supported TLS path. Option (a) is the better fix because the flag otherwise has no working effect.
|
||||
|
||||
**Resolution:** 2026-06-15 — Took option (a). Root cause confirmed: strict-on/no-CA returned a bare `ClientTlsConfig::new()` with zero trust anchors and the crate only enabled tonic `tls-ring`, so the documented "verify against the system trust roots" path could never validate any certificate. Added `tls-native-roots` to the `tonic` features in `Cargo.toml` and refactored `build_tls_config` to compute the trust posture via a new pure `tls_trust_decision` helper returning `TlsTrustDecision::{None,PinnedCa,SystemRoots,RejectNoCa}`; the `SystemRoots` branch now calls `ClientTlsConfig::with_native_roots()` so a cert chaining to an OS-trusted root validates. Corrected every doc site to state the strict flag verifies against OS roots (not a bare self-signed cert, which still needs `with_ca_file`): the `with_require_certificate_validation` doc comment and `build_tls_config` docs (`options.rs`), README "TLS trust" section, and `RustClientDesign.md` "Trust posture"; the CLI flag help was already accurate. TDD: added failing-first unit tests then the fix — `strict_without_ca_uses_system_roots`, `lenient_without_ca_is_rejected`, `pinned_ca_uses_pinned_trust`, `plaintext_needs_no_tls` (in `src/options.rs`). All four cargo commands pass.
|
||||
|
||||
### Client.Rust-032
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `clients/rust/RustClientDesign.md`; surface in `clients/rust/src/galaxy.rs:281-379` |
| Status | Resolved |
|
||||
|
||||
**Description:** The diff under review adds substantial new public Galaxy SDK surface — `GalaxyClient::browse`, `GalaxyClient::browse_children_raw`, the `BrowseChildrenOptions` filter struct, and the `LazyBrowseNode` lazy walker (`object`, `has_children_hint`, `children`, `is_expanded`, `expand`) — none of which is described in `RustClientDesign.md`. The README was updated with a "Browsing lazily" / "High-level walker" section, but CLAUDE.md requires the design docs to change in the same change as the public API. A reader consulting the detailed design to understand the Galaxy client surface will not learn that lazy browsing, sibling pagination, the `child_has_children` hint, or the idempotent `expand` contract exist.
|
||||
|
||||
**Recommendation:** Add a "Lazy browse" subsection to the Galaxy section of `RustClientDesign.md` enumerating `browse`, `browse_children_raw`, `BrowseChildrenOptions` (its filter fields and AND semantics), and `LazyBrowseNode` (the `Arc`-shared clone semantics, the idempotent single-RPC `expand`, the `has_children_hint`, and the internal paged `BrowseChildren` loop with its repeated-page-token guard). Cross-reference `docs/GalaxyRepository.md#browsechildren` for the wire-level request/filter semantics the README already links.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed by inspection that `RustClientDesign.md` had no Galaxy library-API coverage at all. Added a new "Galaxy Repository" section documenting `browse`, `browse_children_raw`, the `BrowseChildrenOptions` filter struct (all six fields, AND combination semantics, `include_attributes` tri-state), and `LazyBrowseNode` (`Arc`-shared clone semantics, `has_children_hint`, the idempotent single-RPC `expand` under an async mutex with page size 500, and the repeated-page-token `Error::InvalidArgument` guard), cross-referencing `docs/GalaxyRepository.md#browsechildren`. Also noted the fourth alarm `provider_status` oneof case in the Alarms section while resolving Client.Rust-030. Doc-only change verified by inspection; design-doc anchor target confirmed present.
|
||||
|
||||
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `src/ZB.MOM.WW.MxGateway.Contracts` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-24 |
|
||||
| Commit reviewed | `42b0037` |
|
||||
| Review date | 2026-06-15 |
|
||||
| Commit reviewed | `410acc9` |
|
||||
| Status | Re-reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -50,6 +50,43 @@ Python and Go descriptors. No fields renumbered or repurposed.
|
||||
| 9 | Testing coverage | No issues found — `ProtobufContractRoundTripTests` and `GatewayContractInfoTests` continue to pin the protocol version; new `QueryActiveAlarmsRequest` lacks a round-trip test but the RPC type is generated and exercised end-to-end by the gRPC client tests in each language. |
| 10 | Documentation & comments | Issues found: Contracts-017 (the `rpc QueryActiveAlarms` comment block does not mention the `alarm_filter_prefix` request field). |
|
||||
|
||||
#### 2026-06-15 re-review (commit 410acc9)
|
||||
|
||||
Re-review pass at `410acc9` scoped to the contract changes since `42b0037`
(`git diff 42b0037..HEAD -- src/ZB.MOM.WW.MxGateway.Contracts/`). The window
contains two unrelated additive contract features. The brief targets the
**alarm-provider fallback** surface in `mxaccess_gateway.proto`: the new
`AlarmProviderMode` enum (`UNSPECIFIED=0`/`ALARMMGR=1`/`SUBTAG=2`), the
`AlarmSubtagTarget` watch-list message, `AlarmFailoverConfig`, the three new
`SubscribeAlarmsCommand` fields (`forced_mode=2`, `watch_list=3`, `failover=4`),
the `OnAlarmProviderModeChangedEvent` (`MxEvent.body` oneof tag 25,
`MxEventFamily=6`), the `degraded=14`/`source_provider=15` provenance fields on
`OnAlarmTransitionEvent` **and** `ActiveAlarmSnapshot`, and the
`AlarmFeedMessage.provider_status=4` oneof case carrying `AlarmProviderStatus`.
The same window also adds the Galaxy `BrowseChildren` lazy-browse RPC
(`galaxy_repository.proto`) and three XML doc comments on `GatewayContractInfo`
constants — both outside the brief's alarm focus but checked for additive-only
hygiene (clean). `Generated/*.cs` is build output and was not reviewed as
hand-written. `mxaccess_worker.proto` is unchanged (the alarm additions live in
the gateway proto the worker imports — matches the design doc's Superseded note).
|
||||
|
||||
Verified against `docs/plans/2026-06-13-alarm-subtag-fallback-design.md`,
`docs/plans/2026-06-15-forced-subtag-mode-fix.md`, and the worker/gateway source
(`AlarmDispatcher.cs:213`, `MxAccessEventMapper.cs:151`, `GatewayAlarmMonitor.cs`).
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | No issues found. Field semantics are correct against source: `AlarmProviderStatus.degraded`/`OnAlarmTransitionEvent.degraded` track `mode == SUBTAG` (worker `AlarmDispatcher.cs:213` sets `SourceProvider = Degraded ? Subtag : Alarmmgr`; gateway `GatewayAlarmMonitor._providerDegraded = toMode == Subtag`). `OnAlarmProviderModeChangedEvent.hresult` "0 on failback" matches the Auto-mode failover/failback path that emits it; forced mode is seeded gateway-side and emits no worker event, so the comment is not contradicted. |
| 2 | mxaccessgw conventions | No issues found. The subtag fallback synthesizes events **inside the worker** and marks every synthesized transition `degraded`, satisfying the CLAUDE.md "gateway forwards only worker-emitted events; synthesizing is an explicit opt-in non-parity mode" rule. `snake_case` fields, `PascalCase` messages, the `ALARM_PROVIDER_MODE_`/`MX_EVENT_FAMILY_` enum-prefix discipline, and the top-of-file wire-compatibility policy block (Contracts-005) are all honoured. Generated code regenerated, not hand-edited. |
| 3 | Concurrency & thread safety | N/A — pure contract definitions plus a static constants class. |
| 4 | Error handling & resilience | No issues found. The degraded/provider-status surface lets clients distinguish the lower-fidelity subtag feed from the authoritative alarmmgr feed; `AlarmProviderStatus` is emitted on stream open and every switch so late joiners learn the mode. |
| 5 | Security | No issues found — none of the new fields carry credentials or secrets. `AlarmSubtagTarget` carries only item-address strings. |
| 6 | Performance & resource management | No issues found. `repeated AlarmSubtagTarget watch_list` is sent once at subscribe time, not per-event; provenance fields are scalars. No hot-path bloat. |
| 7 | Design-document adherence | No drift. The shipped contract matches `docs/plans/2026-06-13-alarm-subtag-fallback-design.md` (including its Superseded notes: additions in the gateway proto, not the worker proto). |
| 8 | Code organization & conventions | No issues found. Every addition uses a new, contiguous field number — `SubscribeAlarmsCommand` 2-4, `MxEvent.body` 25, `MxEventFamily` 6, `OnAlarmTransitionEvent`/`ActiveAlarmSnapshot` 14-15, `AlarmFeedMessage.payload` 4 — with no reuse, renumbering, or type narrowing of any existing field. Enum zero-values are `UNSPECIFIED`. Additive-only invariant preserved. |
| 9 | Testing coverage | Issues found: Contracts-018 — `ProtobufContractRoundTripTests` covers the new `AlarmProviderStatus` (via `AlarmFeedMessage`) and the `OnAlarmTransitionEvent` `degraded`/`source_provider` fields, but has no round-trip coverage for the `ActiveAlarmSnapshot` provenance fields, the `SubscribeAlarmsCommand` extensions (`forced_mode`/`watch_list`/`failover`), or `OnAlarmProviderModeChangedEvent`. |
| 10 | Documentation & comments | Issues found: Contracts-019 — the `ActiveAlarmSnapshot.degraded`/`source_provider` fields carry no in-proto comment while the byte-identical fields on `OnAlarmTransitionEvent` are documented; and the `AlarmProviderMode` enum doc explains `UNSPECIFIED` only for the `forced_mode` use, not for the provenance (`source_provider`) reuse. |
|
||||
|
||||
## Findings
|
||||
|
||||
### Contracts-001
|
||||
@@ -341,3 +378,33 @@ additive-only with no reuse, renumbering, or type narrowing.
|
||||
Re-review: no new findings. Open finding count remains 0. All seventeen
recorded Contracts findings (Contracts-001..017) remain closed
(Resolved / Won't Fix).
|
||||
|
||||
### Contracts-018
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs:396` (`ActiveAlarmSnapshot_RoundTripsAllFields`) |
| Status | Resolved |
|
||||
|
||||
**Description:** The alarm-provider fallback feature added several new wire fields to `mxaccess_gateway.proto`. `ProtobufContractRoundTripTests` was extended with `AlarmFeedMessage_RoundTripsProviderStatus` (covers `AlarmProviderStatus` + the `provider_status` oneof case) and `Transition_RoundTripsDegradedProvenance` (covers `OnAlarmTransitionEvent.degraded`/`source_provider`), but three pieces of the new contract surface have no round-trip coverage: (a) `ActiveAlarmSnapshot.degraded` (14) / `source_provider` (15) — `ActiveAlarmSnapshot_RoundTripsAllFields` stops at `OperatorComment` (field 11) and never sets or asserts the two new provenance fields, so a future renumber/type change to them would not be caught; (b) the `SubscribeAlarmsCommand` extensions `forced_mode` (2), `watch_list` (3, `repeated AlarmSubtagTarget`), and `failover` (4, `AlarmFailoverConfig`) — no test exercises these, and the live `forced_mode` enum-drop concern that prompted the `2026-06-15-forced-subtag-mode-fix` investigation is exactly the kind of wire shape prior contract tests have been written to pin; (c) `OnAlarmProviderModeChangedEvent` (the `MxEvent.body` oneof tag 25 / `MxEventFamily=6` worker→gateway event). This is the same class of gap previously flagged for the bulk family (Contracts-007 / Contracts-010): new wire shapes shipped without round-trip pinning.
|
||||
|
||||
**Recommendation:** Extend `ActiveAlarmSnapshot_RoundTripsAllFields` (or add a focused test) to set and assert `degraded = true` + `source_provider = AlarmProviderMode.Subtag`; add a round-trip test for `SubscribeAlarmsCommand` populating `forced_mode`, a `watch_list` entry (all six `AlarmSubtagTarget` string fields), and a `failover` `AlarmFailoverConfig`; and add a round-trip / `MxEvent` oneof-case test for `OnAlarmProviderModeChangedEvent` pinning `MxEvent.BodyCase == OnAlarmProviderModeChanged` for `MxEventFamily.OnAlarmProviderModeChanged`.
|
||||
|
||||
**Resolution:** _(2026-06-15)_ Verified the three coverage gaps against the proto — `ActiveAlarmSnapshot.degraded`/`source_provider` (14/15), `SubscribeAlarmsCommand.forced_mode`/`watch_list`/`failover` (2/3/4), and the `MxEvent.body` oneof tag 25 / `MxEventFamily=6` `OnAlarmProviderModeChangedEvent` were all unpinned. Added three focused round-trip tests to `ProtobufContractRoundTripTests`: `ActiveAlarmSnapshot_RoundTripsDegradedProvenance` (sets/asserts `degraded = true` + `source_provider = AlarmProviderMode.Subtag`), `SubscribeAlarmsCommand_RoundTripsForcedModeWatchListAndFailover` (populates `forced_mode`, a `watch_list` entry with all six `AlarmSubtagTarget` string fields, and a `failover` `AlarmFailoverConfig`), and `MxEvent_RoundTripsOnAlarmProviderModeChangedBody` (pins `MxEvent.BodyCase == OnAlarmProviderModeChanged` + `Family == OnAlarmProviderModeChanged`). All fields round-trip — no contract bug found. The full `ProtobufContractRoundTrip` filter is 49/49 green.
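To make concrete what the round-trip tests pin, the sketch below hand-encodes the protobuf field tags named in this finding. It is an illustration only: `WireTagSketch` and its methods are hypothetical names that do not exist in the repo, and the actual tests use the generated `Google.Protobuf` message types rather than manual encoding. The field numbers (14, 15, 25) and `SUBTAG=2` are taken from the contract as described above; the encoder assumes proto3 behaviour of omitting default values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the repo): hand-encodes protobuf tags so the
// effect of a field renumber on the wire bytes is visible.
public static class WireTagSketch
{
    // Base-128 varint encoding as used by the protobuf wire format.
    public static byte[] EncodeVarint(ulong value)
    {
        var bytes = new List<byte>();
        do
        {
            byte b = (byte)(value & 0x7F);
            value >>= 7;
            if (value != 0)
            {
                b |= 0x80; // continuation bit: more bytes follow
            }
            bytes.Add(b);
        } while (value != 0);
        return bytes.ToArray();
    }

    // A tag is (field_number << 3) | wire_type; wire type 0 = varint, 2 = length-delimited.
    public static byte[] EncodeTag(int fieldNumber, int wireType)
    {
        return EncodeVarint(((ulong)fieldNumber << 3) | (uint)wireType);
    }

    // Encodes only the two provenance fields: degraded = 14 (bool), source_provider = 15 (enum).
    // Default values (false / 0) are skipped, mirroring proto3 serialization.
    public static byte[] EncodeProvenance(bool degraded, int sourceProvider)
    {
        var bytes = new List<byte>();
        if (degraded)
        {
            bytes.AddRange(EncodeTag(14, 0));
            bytes.AddRange(EncodeVarint(1));
        }
        if (sourceProvider != 0)
        {
            bytes.AddRange(EncodeTag(15, 0));
            bytes.AddRange(EncodeVarint((ulong)sourceProvider));
        }
        return bytes.ToArray();
    }
}
```

With `degraded = true` and `source_provider = SUBTAG (2)` the provenance bytes are `70 01 78 02`, and the `MxEvent.body` oneof tag 25 (length-delimited) encodes as `CA 01`. A renumber of any of these fields changes those bytes, which is the regression a round-trip test with explicitly set fields is meant to surface.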
|
||||
|
||||
### Contracts-019
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto:850-851` (`ActiveAlarmSnapshot`), `:318-324` (`AlarmProviderMode`) |
| Status | Resolved |
|
||||
|
||||
**Description:** Two in-proto documentation gaps on the new alarm-provider surface. (1) `OnAlarmTransitionEvent.degraded` (line 805-808) and `source_provider` (809-810) carry clear comments ("True when this transition came from the subtag-monitoring fallback … synthesized from data changes, reduced fidelity"; "Which provider produced this transition."), but the byte-identical `ActiveAlarmSnapshot.degraded` (850) and `source_provider` (851) are declared bare with no comment. The two messages model the same provenance concept and a reader of `ActiveAlarmSnapshot` alone gets no signal that a non-`UNSPECIFIED` `source_provider` plus `degraded = true` means the snapshot came from the lower-fidelity subtag source. (2) The `AlarmProviderMode` enum comment (318-319) documents the zero value only for one use site — "UNSPECIFIED on a SubscribeAlarmsCommand means auto: alarmmgr primary with subtag fallback" — but the same enum is reused as a provenance field on `OnAlarmTransitionEvent.source_provider`, `ActiveAlarmSnapshot.source_provider`, `OnAlarmProviderModeChangedEvent.mode`, and `AlarmProviderStatus.mode`. The worker always sets `source_provider` to `ALARMMGR` or `SUBTAG` (never `UNSPECIFIED`; `MxAccessEventMapper.cs:151` defaults to `Alarmmgr`, `AlarmDispatcher.cs:213` picks `Subtag`/`Alarmmgr`), so `UNSPECIFIED` as a provenance value has no defined meaning and the comment does not say so. The ProtobufStyleGuide rule "comment fields carrying MXAccess parity / non-obvious semantics" applies — this is a non-parity provenance marker.
|
||||
|
||||
**Recommendation:** (1) Add comments to `ActiveAlarmSnapshot.degraded` / `source_provider` mirroring the wording already on `OnAlarmTransitionEvent` (or a one-line cross-reference). (2) Extend the `AlarmProviderMode` enum comment to note that as a `source_provider` / `mode` provenance value the field is always `ALARMMGR` or `SUBTAG` on the wire and `UNSPECIFIED` should be treated as "unknown / not yet determined", so the zero value is unambiguous at every use site. Comment-only changes; no wire-format impact.
|
||||
|
||||
**Resolution:** _(2026-06-15)_ Confirmed both gaps in `mxaccess_gateway.proto`: `ActiveAlarmSnapshot.degraded`/`source_provider` (14/15) were bare while the byte-identical `OnAlarmTransitionEvent` fields were documented, and the `AlarmProviderMode` enum comment only explained `UNSPECIFIED` for the `forced_mode` use. (1) Added comments to `ActiveAlarmSnapshot.degraded`/`source_provider` mirroring the `OnAlarmTransitionEvent` wording (subtag-fallback / reduced-fidelity, always ALARMMGR or SUBTAG, never UNSPECIFIED). (2) Extended the `AlarmProviderMode` enum comment to distinguish its two use sites: as `forced_mode`, `UNSPECIFIED` = auto; as a provenance value (`OnAlarmTransitionEvent.source_provider`, `ActiveAlarmSnapshot.source_provider`, `OnAlarmProviderModeChangedEvent.mode`, `AlarmProviderStatus.mode`) the worker always emits ALARMMGR/SUBTAG and `UNSPECIFIED` should be read as "unknown / not yet determined". Comment-only changes; no wire-format impact. NOTE: on this dev box the `csharp` protoc generator DOES emit proto leading comments into `Generated/MxaccessGateway.cs` `<summary>` XML doc (contrary to the brief's assumption), so the build regenerated `Generated/MxaccessGateway.cs` with the new doc comments only — diff is `///`-comment lines exclusively, zero code/wire/type changes. `dotnet build -f net10.0` succeeds with 0 warnings / 0 errors.
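The two meanings of the zero value can be illustrated from a consumer's point of view. This is a minimal sketch under stated assumptions: the enum below is a local mirror of `AlarmProviderMode` (the real type is generated from the proto), and `ProvenanceSketch` with its two methods is a hypothetical helper, not gateway or client code.

```csharp
using System;

// Local mirror of the proto enum, declared here only to keep the sketch self-contained.
public enum AlarmProviderMode
{
    Unspecified = 0,
    Alarmmgr = 1,
    Subtag = 2
}

// Hypothetical helper showing the two use sites of the same enum.
public static class ProvenanceSketch
{
    // Use site 1: SubscribeAlarmsCommand.forced_mode, where UNSPECIFIED means "auto".
    public static string DescribeForcedMode(AlarmProviderMode forcedMode)
    {
        switch (forcedMode)
        {
            case AlarmProviderMode.Unspecified:
                return "auto: alarmmgr primary with subtag fallback";
            case AlarmProviderMode.Alarmmgr:
                return "forced alarmmgr";
            case AlarmProviderMode.Subtag:
                return "forced subtag";
            default:
                return "unrecognized mode " + (int)forcedMode;
        }
    }

    // Use site 2: source_provider / mode as provenance, where the worker is
    // documented to emit only ALARMMGR or SUBTAG and UNSPECIFIED reads as unknown.
    public static string DescribeProvenance(AlarmProviderMode sourceProvider)
    {
        switch (sourceProvider)
        {
            case AlarmProviderMode.Alarmmgr:
                return "alarmmgr";
            case AlarmProviderMode.Subtag:
                return "subtag fallback (reduced fidelity)";
            case AlarmProviderMode.Unspecified:
                return "unknown / not yet determined";
            default:
                return "unrecognized mode " + (int)sourceProvider;
        }
    }
}
```

Keeping the two interpretations in separate functions mirrors the extended enum comment: the same zero value is a request for automatic selection in one position and an absence of information in the other.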
|
||||
|
||||
@@ -4,8 +4,8 @@
|
||||
|---|---|
|
||||
| Module | `src/ZB.MOM.WW.MxGateway.IntegrationTests` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-24 |
|
||||
| Commit reviewed | `42b0037` |
|
||||
| Review date | 2026-06-15 |
|
||||
| Commit reviewed | `410acc9` |
|
||||
| Status | Re-reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
@@ -14,6 +14,34 @@
|
||||
A comprehensive review completes every category, recording "No issues found" where
a category produced nothing rather than leaving it blank.
|
||||
|
||||
### 2026-06-15 re-review (commit `410acc9`)
|
||||
|
||||
Scope: `git diff 42b0037..HEAD -- src/ZB.MOM.WW.MxGateway.IntegrationTests/`
(5 files). The substantive change is the `DashboardLdapLiveTests` cutover to the
shared `ZB.MOM.WW.Auth.Ldap.LdapAuthService` + `DashboardGroupRoleMapper`
(matching the production `DashboardAuthenticator` ctor split); plus the
`ResolveRepositoryRoot` `stopBoundary` parameter and its new regression test
(IntegrationTests-025 resolution), and XML-doc backfill on
`LiveLdapFactAttribute` / `WorkerLiveMxAccessSmokeTests`. NOTE: the review
brief's "live alarm-subtag smoke test(s)" do not exist in this diff — no new
alarm-subtag tests landed here. Instead the in-window Server alarm-monitor
evolution (`ebf1d95`/`9208225`/`410acc9`) changed `GatewayAlarmMonitor`'s
constructor without updating its IntegrationTests caller, leaving the whole
module non-compiling (IntegrationTests-026).
|
||||
|
||||
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issue found: IntegrationTests-026 (the entire IntegrationTests project fails to compile at HEAD — `WorkerLiveMxAccessSmokeTests` constructs `GatewayAlarmMonitor` with the stale 3-arg form `(sessionManager, options, logger)` while the production ctor now requires 5 args `(ISessionManager, IAlarmWatchListResolver, GatewayMetrics, IOptions<GatewayOptions>, ILogger)`; verified by `dotnet build` → CS7036). |
| 2 | mxaccessgw conventions | No issues found. Live opt-in gating, `[Collection]`/`[Trait]` discipline, "no synthesized events", and the credential-redaction contract for the LDAP failure-path assertions are all preserved; the cutover keeps the existing skip-by-default behaviour. |
| 3 | Concurrency & thread safety | No issues found in this diff. |
| 4 | Error handling & resilience | No issues found. The `ServerUnreachable` test still asserts the connect failure is absorbed into a `Fail` result; the fail-closed contract now lives in the shared `LdapAuthService` and the test exercises it via `Port = 1`. |
| 5 | Security | No issues found. The wrong-password / unknown-user / unreachable tests still assert no credential leak into `FailureMessage`; the cutover adds no new credential surface and writes no secrets to evidence/probe logs. |
| 6 | Performance & resource management | No issues found. |
| 7 | Design-document adherence | Issue found: IntegrationTests-028 (the live test hand-rolls a field-by-field `LibraryLdapOptions` from the gateway shadow `LdapOptions` defaults instead of binding `MxGateway:Ldap` the way production's `AddZbLdapAuth(configuration, "MxGateway:Ldap")` does, so the live test no longer exercises the production option-binding path and silently omits `ConnectionTimeoutMs` / `ServerCertificateValidationCallback`). |
| 8 | Code organization & conventions | Issue found: IntegrationTests-027 (`DashboardLdapLiveTests` directly consumes `LdapAuthService` / `LdapOptions` from `ZB.MOM.WW.Auth.Ldap` but the IntegrationTests `.csproj` has no direct `PackageReference` — it compiles only via transitive flow through the Server `ProjectReference`). |
| 9 | Testing coverage | No issues found beyond IntegrationTests-026. The role-claim and stop-boundary assertions added in this window strengthen coverage, but the module cannot build, so none of the IntegrationTests run until IntegrationTests-026 is fixed. |
| 10 | Documentation & comments | Issue found: IntegrationTests-029 (`docs/GatewayTesting.md` "Live LDAP" still describes the old in-`DashboardAuthenticator` branches — "rejected by the candidate bind", "yields no candidate" — that the library cutover moved into the shared `LdapAuthService`; the test comments were updated in this diff but the doc prose was not, contrary to CLAUDE.md's same-commit doc rule). |
|
||||
|
||||
### 2026-05-20 re-review (commit `a020350`)
|
||||
|
||||
| # | Category | Result |
|
||||
@@ -506,3 +534,77 @@ The current dev box layout (`C:\Users\dohertj2\Desktop\mxaccessgw`) is safe beca
|
||||
**Recommendation:** Isolate the walker from any ambient ancestor by either (a) constructing an `isolatedRoot` directly under a drive root and pointing the walker at a chain entirely under it (e.g. create `<isolatedRoot>\level1\level2\level3` and start the walk at `level3`, then assert the throw — the walker stops at the drive root regardless of what is on it), (b) refactoring `ResolveRepositoryRoot` to accept an injectable `stopBoundary` parameter for tests and pass `isolatedRoot` as the boundary, or (c) replacing the `Assert.Throws` shape with an explicit upward-walk check that the test owns. Option (a) is the smallest change: prepend a sentinel — e.g. create a dummy `<isolatedRoot>\sentinel-no-markers` and assert nothing about Temp ancestors — and pass the test only when the walker reaches that sentinel without finding a marker. The current shape is acceptable on the documented dev box but should not be the sole regression coverage for IntegrationTests-022.
|
||||
|
||||
**Resolution:** Resolved 2026-05-24 — Took option (b) (inject a stop-boundary) because option (a) does not actually solve the leak: a sentinel chain under `Path.GetTempPath()` still leaves the walker free to ascend past it into Temp / AppData / Users / C:\, so any ambient ancestor with `src/` + `.git`/`.sln`/`.slnx` still wins. Added an optional `stopBoundary` parameter to `IntegrationTestEnvironment.ResolveRepositoryRoot(string startDirectory, string? stopBoundary = null)`. When supplied, the walker checks the boundary for markers and then stops, refusing to ascend past it; production callers (the `MXGATEWAY_LIVE_MXACCESS_WORKER_EXE` resolution path) continue to pass `null` so the walk to drive-root behavior is unchanged. Updated both existing tests (`ResolveRepositoryRoot_AcceptsGitWorktreeFile` and `ResolveRepositoryRoot_NoMarkers_ThrowsInvalidOperationExceptionNamingStartAndMarkers`) to pass their owned temp directory as the boundary, sealing the walker inside a chain the test fully controls. Added a new regression test `ResolveRepositoryRoot_StopBoundary_IsolatesWalkerFromAmbientAncestorMarkers` that deliberately constructs an outer marker-bearing ancestor (`outerRoot/src` + `outerRoot/.git`), an inner boundary, and an isolated start beneath the boundary; first asserts that without the boundary the walker leaks up to `outerRoot` (the precise IntegrationTests-025 failure mode), then asserts that *with* the boundary the same call throws — proving the boundary is the load-bearing isolation. TDD red/green confirmed: the new regression test fails against the pre-fix walker (`Assert.Throws() Failure: No exception was thrown`) and passes once the boundary handling is restored. 
Re-ran the full `IntegrationTestEnvironmentTests` slice with `TMP` / `TEMP` redirected under a deliberately constructed `<temp>\fake-repo-ancestor` directory carrying `src/` and a `.git` file — the original flake repro from the finding — and confirmed all 5 tests pass (the same redirection produced `Assert.Throws() Failure` on the pre-fix code). Build: 0 warnings / 0 errors.
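The stop-boundary behaviour described in this resolution can be sketched as follows. This is a hedged reconstruction, not the repo's code: `RepositoryRootSketch` is a hypothetical class, the marker check (`src/` plus `.git`, `*.sln`, or `*.slnx`) is inferred from the finding text, and the case-insensitive path comparison assumes a Windows dev box. The real `IntegrationTestEnvironment.ResolveRepositoryRoot` may differ in detail.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical re-creation of the walker for illustration only.
public static class RepositoryRootSketch
{
    public static string ResolveRepositoryRoot(string startDirectory, string? stopBoundary = null)
    {
        string? boundary = stopBoundary == null ? null : Normalize(stopBoundary);
        DirectoryInfo? current = new DirectoryInfo(Path.GetFullPath(startDirectory));

        while (current != null)
        {
            if (HasMarkers(current.FullName))
            {
                return current.FullName;
            }

            // The boundary itself was checked for markers above; never ascend past it.
            if (boundary != null &&
                string.Equals(Normalize(current.FullName), boundary, StringComparison.OrdinalIgnoreCase))
            {
                break;
            }

            current = current.Parent; // null once the drive root has been checked
        }

        throw new InvalidOperationException(
            "No repository root found walking up from '" + startDirectory +
            "' (looked for 'src' plus '.git', '*.sln', or '*.slnx').");
    }

    private static string Normalize(string path)
    {
        return Path.GetFullPath(path)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
    }

    // Assumed marker rule: a 'src' directory plus a '.git' directory or file
    // (worktrees use a file), or a solution file.
    private static bool HasMarkers(string directory)
    {
        bool hasSrc = Directory.Exists(Path.Combine(directory, "src"));
        string gitPath = Path.Combine(directory, ".git");
        bool hasGit = Directory.Exists(gitPath) || File.Exists(gitPath);
        bool hasSolution =
            Directory.EnumerateFiles(directory, "*.sln").Any() ||
            Directory.EnumerateFiles(directory, "*.slnx").Any();
        return hasSrc && (hasGit || hasSolution);
    }
}
```

Passing `null` keeps the walk-to-drive-root behaviour for production callers, while a test that passes its own temp directory as the boundary confines the walk to a chain it fully controls.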
|
||||
|
||||
### IntegrationTests-026
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:1098-1101`, `src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs:55-60` |
| Status | Resolved |
|
||||
|
||||
**Description:** The entire IntegrationTests project fails to compile at HEAD (`410acc9`). `GatewayServiceFixture` (in `WorkerLiveMxAccessSmokeTests.cs`) constructs the `GatewayAlarmMonitor` it passes into `MxAccessGatewayService` with the stale three-argument form:
|
||||
|
||||
```csharp
new ZB.MOM.WW.MxGateway.Server.Alarms.GatewayAlarmMonitor(
    sessionManager,
    options,
    _loggerFactory.CreateLogger<...GatewayAlarmMonitor>())
```
|
||||
|
||||
but the production constructor (evolved in-window by `ebf1d95` "monitor resolves watch-list, sends ForcedMode/failover, reflects provider mode into feed + metrics", with later refinements in `9208225` and `410acc9`) now requires **five** parameters: `GatewayAlarmMonitor(ISessionManager sessionManager, IAlarmWatchListResolver watchListResolver, GatewayMetrics metrics, IOptions<GatewayOptions> options, ILogger<GatewayAlarmMonitor> logger)`. `dotnet build src/ZB.MOM.WW.MxGateway.IntegrationTests/...` fails with `CS7036: There is no argument given that corresponds to the required parameter 'options'`. Because this is the only `MxAccessGatewayService` assembly site in the fixture, the whole module — every live opt-in test *and* the non-live `IntegrationTestEnvironmentTests` — cannot build or run. This is a CLAUDE.md "Source Update Workflow" violation: a cross-component Server alarm-monitor change was not propagated to its IntegrationTests caller in the same commit, and "build each affected component" was not honored for the IntegrationTests project. It also silently masks the verification basis for IntegrationTests-022..025's "build is green" resolution claims at this HEAD.
|
||||
|
||||
**Recommendation:** Update the `GatewayAlarmMonitor` construction in `GatewayServiceFixture` to the current 5-arg signature: supply an `IAlarmWatchListResolver` (a minimal test stub returning an empty/representative watch list, or the production resolver if cheap to construct), the existing `_metrics` (`GatewayMetrics`), the existing `options` wrapped as `IOptions<GatewayOptions>` (e.g. `Options.Create(...)`), and the logger. Then run `dotnet build src/ZB.MOM.WW.MxGateway.IntegrationTests/...` to confirm 0 errors and `dotnet test ... --filter FullyQualifiedName~IntegrationTestEnvironmentTests` to confirm the non-live tests pass and the live tests still skip cleanly when the env vars are unset. Add a build of the IntegrationTests project to the verification step whenever `GatewayAlarmMonitor` / `MxAccessGatewayService` constructors change.
|
||||
|
||||
**Resolution:** Resolved 2026-06-15: Confirmed the project failed to build at HEAD (CS7036 on the stale 3-arg `GatewayAlarmMonitor` ctor call in `GatewayServiceFixture`). Updated the construction to the current 5-arg signature — added a new `TestSupport/EmptyAlarmWatchListResolver` singleton stub (`IAlarmWatchListResolver` returning an empty watch-list, avoiding the production resolver's `IGalaxyRepository` dependency), and passed the fixture's existing `_metrics` (`GatewayMetrics`) and `options` (`IOptions<GatewayOptions>`). `dotnet build` now succeeds with 0 errors/warnings; non-live tests pass (5) and all 15 live tests skip cleanly with the env vars unset.
|
||||
|
||||
### IntegrationTests-027
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj`, `src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:4-5,134` |
| Status | Resolved |
|
||||
|
||||
**Description:** After the cutover, `DashboardLdapLiveTests` directly consumes `ZB.MOM.WW.Auth.Ldap.LdapAuthService` and `ZB.MOM.WW.Auth.Abstractions.Ldap.LdapOptions` (`using ZB.MOM.WW.Auth.Ldap; using ZB.MOM.WW.Auth.Abstractions.Ldap;` and `new LdapAuthService(ldapOptions)`). But the IntegrationTests `.csproj` declares no direct `PackageReference` to `ZB.MOM.WW.Auth.Ldap` or `ZB.MOM.WW.Auth.Abstractions` — it has only `ProjectReference`s to Contracts and Server. It compiles solely because the Server's `PackageReference`s to those packages flow transitively (the Server csproj sets no `PrivateAssets`). A project that directly references a library's public types should declare a direct dependency on it; the current shape means the build silently depends on the Server never marking those packages `PrivateAssets="compile"` and on the transitive compile-asset flow staying enabled. If either changes, the IntegrationTests build breaks with a confusing CS0246 far from the cause.
|
||||
|
||||
**Recommendation:** Add explicit `<PackageReference Include="ZB.MOM.WW.Auth.Ldap" Version="0.1.2" />` and `<PackageReference Include="ZB.MOM.WW.Auth.Abstractions" Version="0.1.2" />` (matching the Server's pinned versions, ideally via a shared `Directory.Packages.props` if central package management is in use) to the IntegrationTests project so its direct use of those types is backed by a direct dependency.
|
||||
|
||||
**Resolution:** Resolved 2026-06-15: Confirmed the csproj had only `ProjectReference`s and pulled `LdapAuthService`/`LdapOptions` transitively. Added direct `PackageReference`s `ZB.MOM.WW.Auth.Abstractions` and `ZB.MOM.WW.Auth.Ldap` at `0.1.2` (matching the Server's pinned versions; no central package management exists in this repo). Build remains clean. (The IntegrationTests-028 fix also added `Microsoft.Extensions.Configuration.Json`/`.Binder` at `10.0.7`, pinned to the resolved transitive version to avoid an NU1605 downgrade.)
|
||||
|
||||
### IntegrationTests-028
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Design-document adherence |
| Location | `src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:120-161`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:35` |
| Status | Resolved |
|
||||
|
||||
**Description:** Production wires the shared LDAP provider by binding the `MxGateway:Ldap` configuration section straight onto the shared `LdapOptions` via `AddZbLdapAuth(configuration, "MxGateway:Ldap")`. The live test instead hand-rolls a `LibraryLdapOptions` instance by copying the eleven fields of the gateway *shadow* `LdapOptions` defaults (the `LibraryOptions()` helper). Two consequences:
|
||||
|
||||
1. The shared `LdapOptions` actually exposes **thirteen** settable properties — the hand-copy omits `ConnectionTimeoutMs` and `ServerCertificateValidationCallback` (verified by reflecting `ZB.MOM.WW.Auth.Abstractions` 0.1.2). `ConnectionTimeoutMs` has a non-zero default and directly governs the `AuthenticateAsync_ServerUnreachable_FailsWithoutThrowing` (`Port = 1`) test's timing, so the live test exercises the *shared default* timeout, not whatever an operator (or the gateway config) would set — diverging from the production-bound value.
2. It adds a third manual copy of the shadow→shared field mapping on top of the documented "Review C2 DRIFT WARNING" seam in `Server/Configuration/LdapOptions.cs`. A field added to the shared type is silently dropped by this test until someone remembers to extend `LibraryOptions()`.
|
||||
|
||||
The prior `DashboardAuthenticator` ctor took `IOptions<GatewayOptions>`, so the old test shared the same options object production used; the cutover lost that fidelity. CLAUDE.md treats the live tests as the parity check against the real seeded directory — they should bind options the way production does.
|
||||
|
||||
**Recommendation:** Have the test build the shared `LdapOptions` the same way production does — bind it from the `MxGateway:Ldap` section (e.g. load the gateway `appsettings.json` / a minimal in-memory config and call the same `AddZbLdapAuth` binding path, or resolve the bound `IOptions<LdapOptions>` from a DI container that ran `AddZbLdapAuth`). At minimum, document why the two extra shared fields are intentionally left at their defaults, and add `ConnectionTimeoutMs` to the copy so the unreachable-server test's timeout matches production. Prefer eliminating the hand-copy so the shadow-drift surface does not grow.
|
||||
|
||||
**Resolution:** Resolved 2026-06-15: Confirmed by reflecting `ZB.MOM.WW.Auth.Abstractions` 0.1.2 that the shared `LdapOptions` exposes 13 settable properties while the hand-copy populated only 11 (omitting `ConnectionTimeoutMs` and `ServerCertificateValidationCallback`). Eliminated the field-by-field hand-copy: `LibraryOptions()` now binds the real `MxGateway:Ldap` section from the Server's `appsettings.json` (resolved via `IntegrationTestEnvironment.ResolveRepositoryRoot`) onto the shared `LdapOptions` with `configuration.GetSection("MxGateway:Ldap").Get<LdapOptions>()` — the same section/binding path production's `AddZbLdapAuth(configuration, "MxGateway:Ldap")` uses. Verified the bind yields `ConnectionTimeoutMs=10000` (the shared default the unreachable-server test relies on) and the dev directory connection (localhost:3893, Transport=None, AllowInsecure). A new shared field is now picked up automatically rather than silently dropped.
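The drift this finding describes, a hand-copied options type silently lagging behind the shared one, can be detected mechanically with reflection. The sketch below is illustrative only: `SharedOptions` and `HandCopiedOptions` are hypothetical stand-ins with a handful of properties (the real shared `LdapOptions` has thirteen), and `OptionsDriftSketch` is not part of the repo. Only `ConnectionTimeoutMs`, its default of 10000, and the `localhost:3893` dev connection come from the text above.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical stand-ins and helper, for illustration only.
public static class OptionsDriftSketch
{
    // Stand-in for the shared library options type.
    public sealed class SharedOptions
    {
        public string Server { get; set; } = "localhost";
        public int Port { get; set; } = 3893;
        public int ConnectionTimeoutMs { get; set; } = 10000;
    }

    // Stand-in for a field-by-field copy that was never extended.
    public sealed class HandCopiedOptions
    {
        public string Server { get; set; } = "localhost";
        public int Port { get; set; } = 3893;
    }

    // Names of public instance properties that have a public setter.
    public static string[] SettableProperties(Type type)
    {
        return type
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.GetSetMethod() != null)
            .Select(p => p.Name)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToArray();
    }

    // Properties the shared type exposes that the copy does not carry.
    public static string[] MissingFromCopy(Type shared, Type copy)
    {
        return SettableProperties(shared)
            .Except(SettableProperties(copy), StringComparer.Ordinal)
            .ToArray();
    }
}
```

Binding the configuration section directly onto the shared type, as the resolution does, removes the need for a check like this; the sketch only shows how the omission could have been caught while the hand-copy still existed.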
|
||||
|
||||
### IntegrationTests-029
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `docs/GatewayTesting.md:218-224` |
| Status | Resolved |
|
||||
|
||||
**Description:** The "Live LDAP" section of `docs/GatewayTesting.md` still describes the failure branches in terms of the old `DashboardAuthenticator` internals: "`admin` with a wrong password is rejected by the **candidate bind**" and "an unknown username yields **no candidate**". After the cutover in this diff, the bind/search mechanics (and therefore the "candidate bind" / "candidate is null" branches) live in the shared `LdapAuthService`, not in `DashboardAuthenticator` — which is exactly why the test comments in `DashboardLdapLiveTests.cs` were reworded in this same diff from "Exercises the `LdapException` branch" / "the `candidate is null` branch" to "user-bind-failure branch" / "user-not-found branch". The doc prose was not updated to match. CLAUDE.md requires docs that describe security/auth behavior to change in the same commit as the source; the comments moved but the doc did not, leaving the doc describing branches that no longer exist in `DashboardAuthenticator`.
|
||||
|
||||
**Recommendation:** Reword the `docs/GatewayTesting.md` "Live LDAP" failure-branch sentences to describe observable behavior without referencing the now-internal "candidate bind" mechanics (e.g. "a wrong password is rejected without leaking the password", "an unknown username fails authentication"), and note that bind/search is delegated to the shared `ZB.MOM.WW.Auth.Ldap` provider so the prose stays accurate after the cutover.
|
||||
|
||||
**Resolution:** Resolved 2026-06-15: Reworded the "Live LDAP" failure-branch prose to describe observable behavior ("fails authentication without leaking the password", "an unknown username fails authentication") instead of the now-internal "candidate bind" / "no candidate" mechanics, and added a sentence noting `DashboardAuthenticator` delegates the bind/search to the shared `ZB.MOM.WW.Auth.Ldap` provider (`LdapAuthService`) and only maps groups to roles — matching the in-source test-comment cutover. Verified by inspection.
|
||||
|
||||
+47
-11
@@ -10,17 +10,17 @@ Each module's `findings.md` is the source of truth; this file is generated from
|
||||
|
||||
| Module | Reviewer | Date | Commit | Status | Open | Total |
|
||||
|---|---|---|---|---|---|---|
|
||||
| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 21 |
| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 27 |
| [Client.Java](Client.Java/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 36 |
| [Client.Python](Client.Python/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 26 |
| [Client.Rust](Client.Rust/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 29 |
| [Contracts](Contracts/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 17 |
| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 25 |
| [Server](Server/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 50 |
| [Tests](Tests/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 31 |
| [Worker](Worker/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 25 |
| [Worker.Tests](Worker.Tests/findings.md) | Claude Code | 2026-05-24 | `42b0037` | Re-reviewed | 0 | 30 |
| [Client.Dotnet](Client.Dotnet/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 25 |
| [Client.Go](Client.Go/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 29 |
| [Client.Java](Client.Java/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 39 |
| [Client.Python](Client.Python/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 31 |
| [Client.Rust](Client.Rust/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 32 |
| [Contracts](Contracts/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 19 |
| [IntegrationTests](IntegrationTests/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 29 |
| [Server](Server/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 53 |
| [Tests](Tests/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 35 |
| [Worker](Worker/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 28 |
| [Worker.Tests](Worker.Tests/findings.md) | Claude Code | 2026-06-15 | `410acc9` | Re-reviewed | 0 | 33 |

## Pending findings
@@ -38,6 +38,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Go-001 | High | Resolved | Correctness & logic bugs | `clients/go/mxgateway/errors.go:88-93`, `clients/go/mxgateway/errors.go:117-128` |
| Client.Java-013 | High | Resolved | Testing coverage | `clients/java/mxgateway-cli/src/test/java/com/dohertylan/mxgateway/cli/MxGatewayCliTests.java:212-304`, `clients/java/mxgateway-cli/src/main/java/com/dohertylan/mxgateway/cli/MxGatewayCli.java:1214-1244` |
| Client.Java-032 | High | Resolved | Documentation & comments | `clients/java/README.md:182-183` |
| Client.Java-039 | High | Resolved | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1699` (origin: `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto`, `AlarmFeedMessage.payload` provider-status arm added in commit `1d85db7`) |
| Client.Python-018 | High | Resolved | Code organization & conventions | `clients/python/pyproject.toml:11` |
| Client.Python-022 | High | Resolved | Documentation & comments | `clients/python/README.md:201-202`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:389-420` |
| Client.Rust-001 | High | Resolved | mxaccessgw conventions | `clients/rust/src/options.rs:98,143` |
@@ -46,8 +47,10 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Rust-012 | High | Resolved | mxaccessgw conventions | `clients/rust/src/galaxy.rs:282` |
| Client.Rust-013 | High | Resolved | mxaccessgw conventions | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:414-424` (origin); `clients/rust/src/generated.rs:11-31` (suppression site) |
| Client.Rust-029 | High | Resolved | mxaccessgw conventions | `clients/rust/src/options.rs:98,143`; `clients/rust/src/galaxy.rs:282`; `clients/rust/src/session.rs:664-671` |
| Client.Rust-030 | High | Resolved | Correctness & logic bugs | `clients/rust/crates/mxgw-cli/src/main.rs:1731,1757` (origin: `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto:909-924`, added in commit `1d85db7`) |
| IntegrationTests-001 | High | Resolved | Design-document adherence | `src/MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs:7`, `src/MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs` |
| IntegrationTests-002 | High | Resolved | Design-document adherence | `src/MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:13`, `src/MxGateway.Server/Configuration/LdapOptions.cs:27` |
| IntegrationTests-026 | High | Resolved | Correctness & logic bugs | `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:1098-1101`, `src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs:55-60` |
| Server-003 | High | Resolved | Security | `src/MxGateway.Server/Dashboard/DashboardAuthorizationHandler.cs:39,54-59`, `src/MxGateway.Server/Dashboard/DashboardAuthenticator.cs:236-258` |
| Server-017 | High | Resolved | Security | `src/MxGateway.Server/Security/Authorization/GatewayGrpcScopeResolver.cs:13-27`, `src/MxGateway.Server/Grpc/MxAccessGatewayService.cs:173-247`, `docs/Authorization.md:108-110` |
| Tests-001 | High | Resolved | Testing coverage | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:483-489` |
@@ -55,16 +58,19 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Worker-001 | High | Resolved | Concurrency & thread safety | `src/MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs:204-207` |
| Worker-002 | High | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:545-549` |
| Worker-003 | High | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:399-403`, `:416-419` |
| Worker-026 | High | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/FailoverAlarmConsumer.cs:289-338`, `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessStaSession.cs:307-320` |
| Worker.Tests-001 | High | Resolved | Testing coverage | `src/MxGateway.Worker.Tests/Sta/` (no `StaMessagePumpTests.cs`) |
| Worker.Tests-002 | High | Resolved | Testing coverage | `src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs`, `src/MxGateway.Worker.Tests/MxAccess/MxAccessEventMapperTests.cs` |
| Client.Dotnet-001 | Medium | Resolved | Error handling & resilience | `clients/dotnet/MxGateway.Client/GrpcMxGatewayClientTransport.cs:190-199`, `clients/dotnet/MxGateway.Client/GrpcGalaxyRepositoryClientTransport.cs:131-140` |
| Client.Dotnet-002 | Medium | Resolved | Testing coverage | `clients/dotnet/MxGateway.Client.Tests/FakeGatewayTransport.cs:145-148`, `clients/dotnet/MxGateway.Client.Tests/MxGatewayClientSessionTests.cs:236-256` |
| Client.Dotnet-003 | Medium | Resolved | Concurrency & thread safety | `clients/dotnet/MxGateway.Client/MxGatewaySession.cs:659-663`, `clients/dotnet/MxGateway.Client/MxGatewayClient.cs:230-240` |
| Client.Dotnet-018 | Medium | Resolved | Documentation & comments | `clients/dotnet/README.md:137-138` |
| Client.Dotnet-022 | Medium | Resolved | mxaccessgw conventions | `clients/dotnet/Directory.Build.props:1-21` |
| Client.Go-002 | Medium | Resolved | Error handling & resilience | `clients/go/mxgateway/session.go:440-516` |
| Client.Go-003 | Medium | Resolved | Correctness & logic bugs | `clients/go/cmd/mxgw-go/main.go:517-532` |
| Client.Go-022 | Medium | Resolved | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:398-412,417-519` |
| Client.Go-023 | Medium | Resolved | Concurrency & thread safety | `clients/go/cmd/mxgw-go/main.go:604-606,616-632` |
| Client.Go-028 | Medium | Resolved | Correctness & logic bugs | `scripts/tag-go-module.ps1:42-46` |
| Client.Java-001 | Medium | Resolved | Security | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewaySecrets.java:30-32` |
| Client.Java-002 | Medium | Resolved | Concurrency & thread safety | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxEventStream.java:31,66-92` |
| Client.Java-003 | Medium | Resolved | mxaccessgw conventions | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewayClient.java:119-140` |
@@ -77,12 +83,15 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Java-028 | Medium | Resolved | Documentation & comments | `clients/java/JavaClientDesign.md:23-27` |
| Client.Java-033 | Medium | Resolved | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1078-1098` |
| Client.Java-034 | Medium | Resolved | Correctness & logic bugs | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:182-198` |
| Client.Java-037 | Medium | Resolved | Documentation & comments | `clients/java/README.md:138-149` |
| Client.Python-003 | Medium | Resolved | Error handling & resilience | `clients/python/src/mxgateway/client.py:125-137,155-173` |
| Client.Python-005 | Medium | Resolved | Performance & resource management | `clients/python/src/mxgateway/galaxy.py:117-140` |
| Client.Python-009 | Medium | Resolved | Testing coverage | `clients/python/tests/` |
| Client.Python-013 | Medium | Resolved | Security | `clients/python/src/mxgateway_cli/commands.py:757-762` |
| Client.Python-023 | Medium | Resolved | Security | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:901-906` |
| Client.Python-024 | Medium | Resolved | Code organization & conventions | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:13,48-119` |
| Client.Python-027 | Medium | Resolved | Security | `clients/python/src/zb_mom_ww_mxgateway/client.py:36-54`, `clients/python/src/zb_mom_ww_mxgateway/galaxy.py:47-66`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:165-172,918-930` |
| Client.Python-028 | Medium | Resolved | Error handling & resilience | `clients/python/src/zb_mom_ww_mxgateway/options.py:120-130`, `clients/python/src/zb_mom_ww_mxgateway/client.py:59`, `clients/python/src/zb_mom_ww_mxgateway/galaxy.py:71` |
| Client.Rust-005 | Medium | Resolved | Correctness & logic bugs | `clients/rust/src/session.rs:489-520` |
| Client.Rust-006 | Medium | Resolved | Error handling & resilience | `clients/rust/src/session.rs:531-555` |
| Client.Rust-015 | Medium | Resolved | Error handling & resilience | `clients/rust/crates/mxgw-cli/src/main.rs:1053-1070` |
@@ -90,6 +99,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Rust-018 | Medium | Resolved | Error handling & resilience | `clients/rust/crates/mxgw-cli/src/main.rs:1098-1170`; `scripts/bench-read-bulk.ps1:347-365`; siblings: `clients/go/cmd/mxgw-go/main.go:600-648`, `clients/python/src/mxgateway_cli/commands.py:614-662`, `clients/dotnet/MxGateway.Client.Cli/MxGatewayClientCli.cs:685-770`, `clients/java/mxgateway-cli/src/main/java/com/dohertylan/mxgateway/cli/MxGatewayCli.java:855-940` |
| Client.Rust-022 | Medium | Resolved | Correctness & logic bugs | `clients/rust/src/session.rs:369-391,403-420,427-444,452-469,476-493,631-696,706-724` |
| Client.Rust-024 | Medium | Resolved | Testing coverage | `clients/rust/tests/client_behavior.rs:405-415`; `clients/rust/src/session.rs:369-493`; `clients/rust/src/client.rs:265-291`; `clients/rust/crates/mxgw-cli/src/main.rs:1310-1505` |
| Client.Rust-031 | Medium | Resolved | Error handling & resilience | `clients/rust/src/options.rs:196-240` (`build_tls_config`); `clients/rust/Cargo.toml:40` (tonic features); docs: `clients/rust/src/options.rs:76-101`, `clients/rust/README.md` (TLS trust section), `clients/rust/crates/mxgw-cli/src/main.rs:429-431`, `clients/rust/RustClientDesign.md:202` |
| Contracts-002 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:384-385`, `:95` |
| Contracts-009 | Medium | Resolved | Design-document adherence | `docs/Contracts.md:13-24` |
| IntegrationTests-003 | Medium | Resolved | Correctness & logic bugs | `src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:89-97` |
@@ -113,6 +123,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Server-033 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs:265-323` (`TryRestoreFromDiskAsync`), `:84-99` (`_firstLoad` / `WaitForFirstLoadAsync`); `src/MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs:141-163` (`WaitForCacheBootstrap`) |
| Server-038 | Medium | Resolved | Security | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/EventsHub.cs:23-44` |
| Server-044 | Medium | Resolved | Correctness & logic bugs | `src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:216-254` |
| Server-051 | Medium | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Server/Alarms/AlarmWatchListResolver.cs:64-78` |
| Tests-003 | Medium | Resolved | Performance & resource management | `src/MxGateway.Tests/Security/Authentication/SqliteAuthStoreTests.cs:170-176`, `src/MxGateway.Tests/Security/Authentication/ApiKeyAdminCliRunnerTests.cs:252-258` |
| Tests-004 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs` |
| Tests-005 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs:239-261`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs` |
@@ -122,6 +133,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Tests-020 | Medium | Resolved | Testing coverage | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceConstraintTests.cs:275-347`, `src/MxGateway.Server/Grpc/MxAccessGatewayService.cs:803-829` |
| Tests-026 | Medium | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/EventStreamServiceTests.cs`, `src/ZB.MOM.WW.MxGateway.Server/Grpc/EventStreamService.cs:123-126` |
| Tests-027 | Medium | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:199-240`, `src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:8,73,246-251` |
| Tests-032 | Medium | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs:435-441`, `src/ZB.MOM.WW.MxGateway.Tests/Alarms/GatewayAlarmMonitorProviderModeTests.cs`, `src/ZB.MOM.WW.MxGateway.Tests/Alarms/AlarmFailoverEndToEndTests.cs` |
| Worker-004 | Medium | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:565-588` |
| Worker-005 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs:205-258` (production alarm poll loop) |
| Worker-006 | Medium | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:117-124`, `src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs:386-491` |
@@ -130,6 +142,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Worker-016 | Medium | Resolved | Concurrency & thread safety | `src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs:261-265` |
| Worker-017 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Worker/Sta/StaRuntime.cs:280-288`, `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:602-631` |
| Worker-023 | Medium | Resolved | Error handling & resilience | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:610-668`, `src/MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs:124-153` |
| Worker-027 | Medium | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SyntheticAlarmGuid.cs:38-40` |
| Worker.Tests-003 | Medium | Resolved | Concurrency & thread safety | `src/MxGateway.Worker.Tests/Sta/StaRuntimeTests.cs:46-48` |
| Worker.Tests-004 | Medium | Resolved | Concurrency & thread safety | `src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs:281-329` |
| Worker.Tests-005 | Medium | Resolved | Performance & resource management | `src/MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs:20-31,103-105`, `src/MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs:28-31` |
@@ -138,6 +151,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Worker.Tests-016 | Medium | Resolved | Code organization & conventions | `src/MxGateway.Worker.Tests/MxAccess/AlarmCommandExecutorTests.cs:317-393` |
| Worker.Tests-017 | Medium | Resolved | Testing coverage | `src/MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs` |
| Worker.Tests-018 | Medium | Resolved | Correctness & logic bugs | `src/MxGateway.Worker.Tests/MxAccess/MxAccessLiveComCreationTests.cs:18-31, 35-73, 75-145, 148-220, 222-342` |
| Worker.Tests-031 | Medium | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/FailoverAlarmConsumerTests.cs` (all `FailoverSettings` constructions) |
| Client.Dotnet-004 | Low | Resolved | Error handling & resilience | `clients/dotnet/MxGateway.Client/MxGatewayClient.cs:283-294`, `clients/dotnet/MxGateway.Client/GalaxyRepositoryClient.cs:392-403` |
| Client.Dotnet-005 | Low | Resolved | Correctness & logic bugs | `clients/dotnet/MxGateway.Client/MxGatewaySession.cs:82,124,175` |
| Client.Dotnet-006 | Low | Resolved | Code organization & conventions | `clients/dotnet/MxGateway.Client/MxGatewayClientOptions.cs:50`, `clients/dotnet/MxGateway.Client/MxGatewayClientContractInfo.cs:10-14` |
@@ -155,6 +169,9 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Dotnet-019 | Low | Resolved | Correctness & logic bugs | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:745` |
| Client.Dotnet-020 | Low | Resolved | Error handling & resilience | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:792-810`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:774-780` |
| Client.Dotnet-021 | Low | Resolved | Correctness & logic bugs | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:487`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/MxGatewayClientCli.cs:715` |
| Client.Dotnet-023 | Low | Resolved | Code organization & conventions | `clients/dotnet/Directory.Build.props:17`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/IMxGatewayCliClient.cs:6`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Tests/*.cs` |
| Client.Dotnet-024 | Low | Resolved | Code organization & conventions | `clients/dotnet/Directory.Build.props:12`, `clients/dotnet/ZB.MOM.WW.MxGateway.Client/ZB.MOM.WW.MxGateway.Client.csproj:19-24` |
| Client.Dotnet-025 | Low | Resolved | Concurrency & thread safety | `clients/dotnet/ZB.MOM.WW.MxGateway.Client/LazyBrowseNode.cs:38,41,54,82,94` |
| Client.Go-004 | Low | Resolved | mxaccessgw conventions | `clients/go/mxgateway/alarms_test.go:153-154`, `clients/go/mxgateway/galaxy_test.go:58-59` |
| Client.Go-005 | Low | Resolved | Design-document adherence | `clients/go/mxgateway/client.go:64,68`, `clients/go/mxgateway/galaxy.go:83,87` |
| Client.Go-006 | Low | Resolved | Error handling & resilience | `clients/go/mxgateway/errors.go:9-130` |
@@ -177,6 +194,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Go-025 | Low | Resolved | Correctness & logic bugs | `clients/go/mxgateway/session.go:395-485,495-525` |
| Client.Go-026 | Low | Resolved | Error handling & resilience | `clients/go/cmd/mxgw-go/main.go:1196-1222` |
| Client.Go-027 | Low | Resolved | Code organization & conventions | `clients/go/cmd/mxgw-go/main.go:1195-1206` |
| Client.Go-029 | Low | Resolved | Documentation & comments | `clients/go/README.md:300-303` |
| Client.Java-006 | Low | Resolved | Performance & resource management | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewayClient.java:323-328`, `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/GalaxyRepositoryClient.java:279-284` |
| Client.Java-007 | Low | Resolved | Testing coverage | `clients/java/mxgateway-client/src/test/java/com/dohertylan/mxgateway/client/` |
| Client.Java-008 | Low | Resolved | Error handling & resilience | `clients/java/mxgateway-client/src/main/java/com/dohertylan/mxgateway/client/MxGatewayClient.java:298-304` |
@@ -199,6 +217,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Java-031 | Low | Resolved | mxaccessgw conventions | `clients/java/README.md:13,17,26` |
| Client.Java-035 | Low | Resolved | Testing coverage | `clients/java/zb-mom-ww-mxgateway-client/src/test/java/com/zb/mom/ww/mxgateway/client/MxGatewayClientSessionTests.java` |
| Client.Java-036 | Low | Resolved | Code organization & conventions | `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayAlarmFeedSubscription.java`, `MxGatewayEventSubscription.java`, `MxGatewayActiveAlarmsSubscription.java`, `DeployEventSubscription.java` |
| Client.Java-038 | Low | Resolved | Code organization & conventions | `clients/java/zb-mom-ww-mxgateway-cli/src/main/java/com/zb/mom/ww/mxgateway/cli/MxGatewayCli.java:1347-1393` |
| Client.Python-001 | Low | Resolved | Documentation & comments | `clients/python/pyproject.toml:8,25`, `clients/python/src/mxgateway_cli/commands.py:25` |
| Client.Python-002 | Low | Resolved | Code organization & conventions | `clients/python/src/mxgateway/__init__.py:27` |
| Client.Python-004 | Low | Resolved | Correctness & logic bugs | `clients/python/src/mxgateway_cli/commands.py:386,402-404` |
@@ -217,6 +236,9 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Python-021 | Low | Resolved | Documentation & comments | `clients/python/src/mxgateway_cli/commands.py`, `clients/python/README.md:235-258` |
| Client.Python-025 | Low | Resolved | Testing coverage | `clients/python/tests/test_cli.py`, `clients/python/src/zb_mom_ww_mxgateway/{client.py,session.py}`, `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py` |
| Client.Python-026 | Low | Resolved | Correctness & logic bugs | `clients/python/src/zb_mom_ww_mxgateway_cli/commands.py:674-738` |
| Client.Python-029 | Low | Resolved | Correctness & logic bugs | `clients/python/src/zb_mom_ww_mxgateway/options.py:78-90` |
| Client.Python-030 | Low | Resolved | Code organization & conventions | `clients/python/pyproject.toml:17` |
| Client.Python-031 | Low | Resolved | Testing coverage | `clients/python/tests/test_tls.py:34`, `clients/python/pyproject.toml:53-56` |
| Client.Rust-004 | Low | Resolved | Documentation & comments | `clients/rust/src/version.rs:7` |
| Client.Rust-007 | Low | Resolved | Design-document adherence | `clients/rust/RustClientDesign.md:14-55` |
| Client.Rust-008 | Low | Resolved | Performance & resource management | `clients/rust/src/value.rs:161-261` |
@@ -233,6 +255,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Client.Rust-026 | Low | Resolved | Performance & resource management | `clients/rust/crates/mxgw-cli/src/main.rs:1402-1406,1419-1423` |
| Client.Rust-027 | Low | Resolved | Documentation & comments | `clients/rust/.cargo/config.toml:1-9` |
| Client.Rust-028 | Low | Resolved | mxaccessgw conventions | `clients/rust/crates/mxgw-cli/src/main.rs:1126-1166` |
| Client.Rust-032 | Low | Resolved | Design-document adherence | `clients/rust/RustClientDesign.md`; surface in `clients/rust/src/galaxy.rs:281-379` |
| Contracts-001 | Low | Resolved | Design-document adherence | `docs/Grpc.md:13` (and `:3`, `:32`, `:39`) |
| Contracts-003 | Low | Won't Fix | Code organization & conventions | `src/MxGateway.Contracts/MxGateway.Contracts.csproj:10` |
| Contracts-004 | Low | Resolved | Documentation & comments | `src/MxGateway.Contracts/GatewayContractInfo.cs:3-6` |
@@ -248,6 +271,8 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Contracts-015 | Low | Resolved | Documentation & comments | `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto:571-582` |
| Contracts-016 | Low | Resolved | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto:31-41` (`QueryActiveAlarmsRequest`) |
| Contracts-017 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto:23-29` (the `rpc QueryActiveAlarms` block) |
| Contracts-018 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs:396` (`ActiveAlarmSnapshot_RoundTripsAllFields`) |
| Contracts-019 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto:850-851` (`ActiveAlarmSnapshot`), `:318-324` (`AlarmProviderMode`) |
| IntegrationTests-007 | Low | Resolved | Concurrency & thread safety | `src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:20`, `src/MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs:5`, `src/MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:9` |
| IntegrationTests-008 | Low | Resolved | Code organization & conventions | `src/MxGateway.IntegrationTests/LiveLdapFactAttribute.cs`, `src/MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs`, `src/MxGateway.IntegrationTests/LiveMxAccessFactAttribute.cs` |
| IntegrationTests-009 | Low | Resolved | Documentation & comments | `src/MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:372-375` |
@@ -263,6 +288,9 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| IntegrationTests-023 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:14-29` |
| IntegrationTests-024 | Low | Resolved | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs` (`NullDashboardEventBroadcaster` private class at end of file) |
| IntegrationTests-025 | Low | Resolved | Correctness & logic bugs | `src/ZB.MOM.WW.MxGateway.IntegrationTests/IntegrationTestEnvironmentTests.cs:57-84` (`ResolveRepositoryRoot_NoMarkers_ThrowsInvalidOperationExceptionNamingStartAndMarkers`) |
| IntegrationTests-027 | Low | Resolved | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj`, `src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:4-5,134` |
| IntegrationTests-028 | Low | Resolved | Design-document adherence | `src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:120-161`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:35` |
| IntegrationTests-029 | Low | Resolved | Documentation & comments | `docs/GatewayTesting.md:218-224` |
| Server-007 | Low | Resolved | Performance & resource management | `src/MxGateway.Server/Galaxy/GalaxyHierarchyProjector.cs:55-70` |
| Server-008 | Low | Resolved | Performance & resource management | `src/MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs:111-134,160-189` |
| Server-009 | Low | Resolved | Error handling & resilience | `src/MxGateway.Server/Security/Authentication/AuthSqliteConnectionFactory.cs:15-32` |
@@ -297,6 +325,8 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Server-048 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:463-498` |
| Server-049 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/IDashboardSessionAdminService.cs:5-18`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:8-25` |
| Server-050 | Low | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:42-75,92-125` |
| Server-052 | Low | Resolved | Documentation & comments | `src/ZB.MOM.WW.MxGateway.Server/Alarms/IAlarmWatchListResolver.cs:24-30`, `src/ZB.MOM.WW.MxGateway.Server/Alarms/AlarmWatchListResolver.cs:101-114`, `docs/GatewayConfiguration.md:247` |
| Server-053 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Alarms/AlarmWatchListResolverTests.cs`, `src/ZB.MOM.WW.MxGateway.Tests/Alarms/GatewayAlarmMonitorProviderModeTests.cs` |
| Tests-007 | Low | Resolved | Code organization & conventions | `src/MxGateway.Tests/Gateway/Grpc/MxAccessGatewayServiceTests.cs:682`, `src/MxGateway.Tests/Gateway/Grpc/GalaxyRepositoryGrpcServiceTests.cs:324`, `src/MxGateway.Tests/Gateway/GatewayEndToEndFakeWorkerSmokeTests.cs:460`, `src/MxGateway.Tests/Security/Authorization/GatewayGrpcAuthorizationInterceptorTests.cs:233` |
| Tests-008 | Low | Resolved | mxaccessgw conventions | `src/MxGateway.Tests/Gateway/Sessions/WorkerAlarmRpcDispatcherTests.cs:1-9`, `src/MxGateway.Tests/Gateway/Sessions/NotWiredAlarmRpcDispatcherTests.cs:1-3`, `src/MxGateway.Tests/Gateway/Sessions/SessionManagerAlarmAutoSubscribeTests.cs:1` |
| Tests-009 | Low | Resolved | Documentation & comments | `src/MxGateway.Tests/Gateway/Sessions/SessionManagerTests.cs:36-37,99,365` |
@@ -317,6 +347,9 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Tests-029 | Low | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardSessionAdminServiceTests.cs:61-106,139-222`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs:77-125` |
| Tests-030 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardApiKeyManagementServiceTests.cs:115-163`, `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardApiKeyManagementService.cs:146-177` |
| Tests-031 | Low | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardSnapshotPublisherTests.cs:22-61` |
| Tests-033 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAlarmProviderStatus.cs`, `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardBrowseAndAlarmModelTests.cs:140-195` |
| Tests-034 | Low | Resolved | mxaccessgw conventions | `src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorSeamTests.cs:1-15` |
| Tests-035 | Low | Resolved | Concurrency & thread safety | `src/ZB.MOM.WW.MxGateway.Tests/Alarms/AlarmFailoverEndToEndTests.cs:315-329` |
| Worker-009 | Low | Resolved | Performance & resource management | `src/MxGateway.Worker/Ipc/WorkerFrameReader.cs:31,49`, `src/MxGateway.Worker/Ipc/WorkerFrameWriter.cs:57-58` |
| Worker-010 | Low | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Conversion/VariantConverter.cs:204-226` |
| Worker-011 | Low | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeClient.cs:169-171` |
@@ -331,6 +364,7 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Worker-022 | Low | Resolved | Code organization & conventions | `src/MxGateway.Worker/MxAccess/MxAlarmSnapshot.cs:12`, `:26`, `:49` |
| Worker-024 | Low | Resolved | Concurrency & thread safety | `src/MxGateway.Worker/MxAccess/AlarmCommandHandler.cs:63-187`, `src/MxGateway.Worker/MxAccess/MxAccessStaSession.cs:191-323` |
| Worker-025 | Low | Resolved | Correctness & logic bugs | `src/MxGateway.Worker/Ipc/WorkerPipeSession.cs:111-117` |
| Worker-028 | Low | Resolved | Code organization & conventions | `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SubtagAlarmStateMachine.cs:43-52`, `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SubtagAlarmConsumer.cs:70-75` |
| Worker.Tests-008 | Low | Resolved | Documentation & comments | `src/MxGateway.Worker.Tests/Conversion/VariantConverterTests.cs:175-182` |
| Worker.Tests-009 | Low | Resolved | Code organization & conventions | `src/MxGateway.Worker.Tests/MxAccess/AlarmCommandHandlerTests.cs`, `AlarmDispatcherTests.cs`, `AlarmCommandExecutorTests.cs`, `AlarmRecordTransitionMapperTests.cs`, `WnWrapAlarmConsumerXmlTests.cs` |
| Worker.Tests-010 | Low | Resolved | Correctness & logic bugs | `src/MxGateway.Worker.Tests/MxAccess/MxAccessStaSessionTests.cs:230-258` |
@@ -351,3 +385,5 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Worker.Tests-028 | Low | Resolved | Design-document adherence | `docs/GatewayTesting.md`, `src/MxGateway.Worker.Tests/Probes/` |
| Worker.Tests-029 | Low | Resolved | Code organization & conventions | `src/MxGateway.Worker.Tests/Probes/AlarmsLiveSmokeTests.cs:9`, `src/MxGateway.Worker.Tests/Probes/AlarmClientWmProbeTests.cs:14`, `src/MxGateway.Worker.Tests/Probes/WnWrapConsumerProbeTests.cs:10` |
| Worker.Tests-030 | Low | Resolved | Documentation & comments | `src/MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs:862-890` |
| Worker.Tests-032 | Low | Resolved | Error handling & resilience | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/FailoverAlarmConsumerTests.cs` |
|
||||
| Worker.Tests-033 | Low | Resolved | Testing coverage | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/SubtagAlarmStateMachineTests.cs` |
|
||||
|
||||
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/ZB.MOM.WW.MxGateway.Server` |
| Reviewer | Claude Code |
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Status | Re-reviewed |
| Open findings | 0 |
|
||||
@@ -120,6 +120,38 @@ contention nor the bounded `_events` channel saw any changes in this wave.
| 9 | Testing coverage | No issues found in this module — see Tests-026 in the Tests module for the missing EventsHub broadcast coverage. |
| 10 | Documentation & comments | Issues found: Server-040, Server-043 (both documentation gaps). |
|
||||
### 2026-06-15 re-review (commit 410acc9)
|
||||
|
||||
Re-review pass at `410acc9` over the `42b0037..HEAD` diff. The diff is large (~137 files)
but the bulk is vendored theme/CSS/font asset swaps (`wwwroot`), generated code, and the
shared-library auth refactor / TLS cert-autogen / lazy-browse / canonical-audit waves that
each carry their own design+plan and were verified in passing only. This pass is scoped to
the **alarm-provider subtag-fallback** wave the task called out: the central
`GatewayAlarmMonitor` provider-mode seeding + failover/failback handling, the new
`AlarmWatchListResolver` / `IAlarmWatchListResolver`, `AlarmFallbackOptions` /
`AlarmDiscoveryOptions` / `AlarmSubtagNameOptions` and their `GatewayOptionsValidator`
wiring, the `DashboardAlarmProviderStatus` badge + `AlarmsPage.razor` hub attach, the
provider-mode gauge + `provider_switches` counter (`GatewayMetrics`,
`AlarmProviderSwitchReason`), the Galaxy alarm-attribute discovery query
(`GalaxyRepository.GetAlarmAttributesAsync` / `AlarmAttributesSql` / `GalaxyAlarmAttributeRow`),
the `/auth/login` POST move + configurable `Dashboard:CookieName`, and the
`BrowseChildrenRequest` scope-resolver entry. Prior findings Server-044 through Server-050
are confirmed resolved by the SessionManager/GatewaySession changes in range and remain
closed. New findings filed against this pass: Server-051..053.
|
||||
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Issues found: Server-051 (`AlarmWatchListResolver.ResolveAsync`'s broad `catch (Exception)` swallows `OperationCanceledException`, contradicting the `IAlarmWatchListResolver` cancellation contract). |
| 2 | mxaccessgw conventions | No issues found — file-scoped namespaces, `sealed`, `Async` suffix, Options pattern, MXAccess-aligned naming all hold; no UI component libraries (badge is Bootstrap-only); the alarm SQL is a parameterless constant; no secret/tag-value logging; well-known reason strings centralised in `AlarmProviderReasons`. |
| 3 | Concurrency & thread safety | No issues found — `_providerMode`/`_providerDegraded`/`_providerReason`/`_providerSince` are read/written only under `_sync`; `BroadcastToAll` runs under `_sync`; the reconcile after a mode change is intentionally awaited outside `_sync` to avoid the documented self-deadlock; the provider-mode gauge is serialized on `GatewayMetrics._syncRoot`. |
| 4 | Error handling & resilience | Issues found: Server-051 (cancellation swallowed in the resolver — also an error-handling/contract concern). |
| 5 | Security | No issues found — `BrowseChildren` runs the same `ResolveBrowseSubtrees()` constraint scoping and `MetadataRead` scope as `DiscoverHierarchy`; the configurable `Dashboard:CookieName` falls back to the canonical default and cannot be blanked; the `/auth/login` POST keeps antiforgery + return-URL sanitisation. |
| 6 | Performance & resource management | No issues found in the alarm-fallback code — discovery is a one-shot per subscribe lifecycle; the watch-list is composed once. |
| 7 | Design-document adherence | No issues found — `docs/GatewayConfiguration.md`, `docs/Metrics.md`, `docs/GalaxyRepository.md`, and the `docs/plans/2026-06-13-alarm-subtag-fallback*` / `2026-06-15-forced-subtag-mode-fix.md` plans were landed in the same range and match the code. |
| 8 | Code organization & conventions | No issues found — new alarm types live under `Alarms/`, options under `Configuration/`, metric helper under `Metrics/`, registered via `AddGatewayAlarms`. |
| 9 | Testing coverage | Issues found: Server-053 (`AlarmWatchListResolver` `ExcludeAttributes`-vs-`IncludeAttributes` precedence and the resolver's cancellation contract are untested; no redundant-mode-change guard test). |
| 10 | Documentation & comments | Issues found: Server-052 (`IAlarmWatchListResolver` XML contract claims cancellation propagates while the implementation swallows it; the `Discovery:ExcludeAttributes` doc says "Repository-derived watch-list" while the code also removes matching explicit `IncludeAttributes`). |
|
||||
## Findings
|
||||
|
||||
### Server-001
|
||||
@@ -929,3 +961,64 @@ Today neither call site has a Blazor error boundary, so an unhandled exception l
|
||||
**Recommendation:** Add a general `catch (Exception exception)` after the `SessionManagerException` catch in both `CloseSessionAsync` and `KillWorkerAsync`, log a warning (matching the SessionManagerException pattern), and return `DashboardSessionAdminResult.Fail($"{operation} failed unexpectedly. See the gateway log for details.")`. This makes the result type truly the only output the page sees. Add a regression test using a `ThrowingSessionManager` that throws e.g. `InvalidOperationException` from `KillWorkerAsync` and asserts the service returns a failing result rather than propagating.
|
||||
|
||||
**Resolution:** 2026-05-24 — Added the recommended general `catch (Exception)` arms to both `DashboardSessionAdminService.CloseSessionAsync` and `KillWorkerAsync` (`src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSessionAdminService.cs`), placed after the `SessionManagerException` catches and behind a `catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) throw;` so caller cancellation still propagates cleanly. The new catches log a warning with actor + session id and return `DashboardSessionAdminResult.Fail("{Operation} failed unexpectedly for session {SessionId}. See the gateway log for details.")`, mirroring the SessionManagerException pattern. Regression tests in `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardSessionAdminServiceTests.cs`: `CloseSessionAsync_WhenManagerThrowsUnexpected_ReturnsFriendlyFail` (the `ISessionManager` throws `InvalidOperationException("unexpected")`) and `KillWorkerAsync_WhenManagerThrowsUnexpected_ReturnsFriendlyFail` (throws `IOException("pipe broken")`); both assert the service returns a failing result with a non-blank message rather than propagating. The fake's new `CloseThrowsUnexpected` / `KillThrowsUnexpected` properties hold the configured exception. Confirmed to fail before the fix (raw exception propagated) and pass after.
|
||||
|
||||
### Server-051
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Alarms/AlarmWatchListResolver.cs:64-78` |
| Status | Resolved |
|
||||
**Description:** `AlarmWatchListResolver.ResolveAsync` wraps the Galaxy Repository discovery call in a bare `catch (Exception ex)` that logs a warning and continues with an empty (config-only) discovery set:
|
||||
|
||||
```csharp
try { rows = await _repository.GetAlarmAttributesAsync(cancellationToken)...; }
catch (Exception ex) { _logger.LogWarning(ex, "...continuing with configuration-only watch-list."); rows = []; }
```
|
||||
`OperationCanceledException` / `TaskCanceledException` derive from `Exception`, so a cancellation triggered while `GetAlarmAttributesAsync` is awaiting SQL is **swallowed**, not propagated. The resolver then returns a (config-only or empty) watch-list as though the call completed normally. This directly contradicts the `IAlarmWatchListResolver.ResolveAsync` XML contract, which states: *"Cancellation is the one exception: a triggered cancellationToken still propagates an OperationCanceledException."* In practice the resolver is called from `GatewayAlarmMonitor.SubscribeAlarmsAsync` on the monitor's lifecycle token; if the gateway is shutting down (or the monitor lifecycle is being torn down) mid-discovery, the resolver hides the cancellation and the monitor proceeds to issue `SubscribeAlarms` with a wrong (empty) watch-list instead of unwinding promptly. The `GalaxyRepository.GetAlarmAttributesAsync` SQL path does honour the token (`OpenAsync(ct)` / `ExecuteReaderAsync(ct)` / `ReadAsync(ct)`), so a real cancellation can land inside this catch.
|
||||
|
||||
**Recommendation:** Add a `catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) { throw; }` ahead of the general catch (or filter the general catch with `when (ex is not OperationCanceledException)`), so cancellation propagates per the documented contract while genuine discovery failures still degrade to a config-only list. Add a regression test that cancels the token mid-`GetAlarmAttributesAsync` and asserts `OperationCanceledException` propagates.
|
||||
|
||||
**Resolution:** Resolved 2026-06-15. Confirmed against source: the bare `catch (Exception ex)` swallowed `OperationCanceledException`. Filtered the catch with `when (ex is not OperationCanceledException)` so a real cancellation propagates per the `IAlarmWatchListResolver` contract while genuine discovery failures still degrade to a config-only list. Regression test: `AlarmWatchListResolverTests.ResolveAsync_RepositoryCancelled_PropagatesOperationCanceled` (failed before the fix, passes after).
|
||||
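As a rough illustration of the filtered catch described in the resolution, the sketch below shows a `when (ex is not OperationCanceledException)` filter letting cancellation escape while other failures degrade to an empty list. `DiscoverAsync`, `ResolveAsync`, and the string-list return type are hypothetical stand-ins, not the real `GalaxyRepository` or `AlarmWatchListResolver` signatures.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Genuine failure: degrades to an empty (config-only) list.
List<string> degraded = await ResolveAsync(fail: true, CancellationToken.None);
Console.WriteLine($"degraded count: {degraded.Count}");

// Cancellation: propagates instead of being swallowed.
using CancellationTokenSource cts = new();
cts.Cancel();
try
{
    await ResolveAsync(fail: false, cts.Token);
}
catch (OperationCanceledException)
{
    Console.WriteLine("cancellation propagated");
}

// Hypothetical stand-in for the repository discovery call (assumption, not the real API).
async Task<List<string>> DiscoverAsync(bool fail, CancellationToken ct)
{
    await Task.Delay(10, ct); // throws TaskCanceledException when ct is cancelled
    if (fail) throw new InvalidOperationException("discovery unavailable");
    return new List<string> { "Tank1.Level.HiHi" };
}

// Hypothetical resolver shape: only non-cancellation failures are caught.
async Task<List<string>> ResolveAsync(bool fail, CancellationToken ct)
{
    try
    {
        return await DiscoverAsync(fail, ct);
    }
    catch (Exception ex) when (ex is not OperationCanceledException)
    {
        Console.WriteLine($"discovery failed: {ex.Message}");
        return new List<string>();
    }
}
```

Because `TaskCanceledException` derives from `OperationCanceledException`, the single filter covers both types without a separate rethrowing `catch` arm.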
|
||||
### Server-052
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Alarms/IAlarmWatchListResolver.cs:24-30`, `src/ZB.MOM.WW.MxGateway.Server/Alarms/AlarmWatchListResolver.cs:101-114`, `docs/GatewayConfiguration.md:247` |
| Status | Resolved |
|
||||
**Description:** Two prose-vs-code mismatches in the watch-list resolver:
|
||||
|
||||
1. The `IAlarmWatchListResolver.ResolveAsync` XML `<returns>` promises that a triggered `cancellationToken` propagates an `OperationCanceledException`, but the implementation swallows it (see Server-051). Whichever way Server-051 is resolved, exactly one of the doc or the code is currently wrong; right now the doc over-promises.
|
||||
|
||||
2. `AlarmDiscoveryOptions.ExcludeAttributes` and `docs/GatewayConfiguration.md:247` both describe `ExcludeAttributes` as removing entries from the **"Repository-derived"** watch-list. The implementation's `ordered.RemoveAll(e => excluded.Contains(e.Reference))` runs over the combined list — Galaxy-Repository rows **and** the explicit `Discovery:IncludeAttributes` entries appended just above it — so an exclude entry that matches an explicit include silently removes that include too. The behaviour is defensible (excludes win) but is not what the "Repository-derived" wording says, and an operator who adds an attribute via `IncludeAttributes` and also lists it in `ExcludeAttributes` would be surprised it disappears.
|
||||
|
||||
**Recommendation:** For (1), align the `IAlarmWatchListResolver` doc with whatever Server-051 settles on. For (2), either restrict the exclude to GR-discovered rows (apply `RemoveAll` before appending the `IncludeAttributes` entries) or update the option XML doc and `GatewayConfiguration.md` to say excludes are applied to the merged GR-plus-include list and therefore also suppress matching explicit includes.
|
||||
|
||||
**Resolution:** Resolved 2026-06-15. (1) No longer over-promises: the Server-051 fix makes the implementation propagate `OperationCanceledException`, so the `IAlarmWatchListResolver.ResolveAsync` `<returns>` doc is now accurate and was left unchanged. (2) Kept the "excludes win" code behaviour (excludes applied to the merged GR-plus-include list) and corrected the prose to match: `AlarmDiscoveryOptions.ExcludeAttributes` XML doc and `docs/GatewayConfiguration.md:247` now state the exclude runs after the GR rows and explicit `IncludeAttributes` are combined, so an exclude matching an explicit include suppresses it too. The "excludes win" precedence is pinned by `AlarmWatchListResolverTests.ResolveAsync_ExcludeAlsoSuppressesMatchingExplicitInclude`.
|
||||
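The "excludes win" ordering can be sketched as follows. This is a minimal illustration under stated assumptions: `ComposeWatchList`, its parameters, and the ordinal string comparison are hypothetical, and the real resolver operates on richer entry types than plain strings.

```csharp
using System;
using System.Collections.Generic;

List<string> watchList = ComposeWatchList(
    discovered: new[] { "Tank1.Level.HiHi", "Tank1.Level.LoLo" },
    includes: new[] { "Pump1.Fault" },
    excludes: new[] { "Tank1.Level.LoLo", "Pump1.Fault" });

// The explicit include "Pump1.Fault" is suppressed because it is also excluded.
Console.WriteLine(string.Join(", ", watchList));

// Hypothetical composition helper; names and comparison rules are assumptions.
List<string> ComposeWatchList(
    IEnumerable<string> discovered, IEnumerable<string> includes, IEnumerable<string> excludes)
{
    List<string> ordered = new(discovered);       // Galaxy Repository rows first
    ordered.AddRange(includes);                   // explicit includes appended
    HashSet<string> excluded = new(excludes, StringComparer.Ordinal);
    ordered.RemoveAll(e => excluded.Contains(e)); // exclude runs over the merged list
    return ordered;
}
```

Moving the `RemoveAll` call above the `AddRange` call would give the alternative "Repository-derived only" semantics that the finding weighed and rejected.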
|
||||
### Server-053
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Alarms/AlarmWatchListResolverTests.cs`, `src/ZB.MOM.WW.MxGateway.Tests/Alarms/GatewayAlarmMonitorProviderModeTests.cs` |
| Status | Resolved |
|
||||
**Description:** The new alarm-fallback surface is broadly well-tested (`AlarmWatchListResolverTests`, `GatewayAlarmMonitorProviderModeTests`, `DashboardBrowseAndAlarmModelTests`, `GalaxyAlarmAttributeMappingTests`, `GatewayOptionsValidatorTests`), but two behaviours that the diff introduced have no coverage:
|
||||
|
||||
- **Resolver cancellation contract (Server-051):** no test cancels the token mid-discovery and asserts `OperationCanceledException` propagates. The existing `ResolveAsync_RepositoryThrows_LogsAndReturnsConfigOnlySet` test pins only the swallow path for genuine repository failures, so a cancellation regression test is precisely the case that would have caught the Server-051 bug; its absence is why the contract violation went unnoticed.
- **Exclude-vs-include precedence (Server-052 item 2):** no test exercises a `Discovery:IncludeAttributes` entry that also appears in `ExcludeAttributes`, so the "excludes also drop explicit includes" behaviour is unpinned and would silently change if the merge order were edited.
|
||||
Additionally, `GatewayAlarmMonitor.ApplyProviderModeChangeAsync` increments the `mxgateway.alarms.provider_switches` counter and resets `_providerSince` unconditionally on every `OnAlarmProviderModeChanged` event, with no guard for a redundant event whose `toMode` equals the current mode; there is no test asserting the from==to / no-op behaviour either way.
|
||||
|
||||
**Recommendation:** Add resolver tests for (a) cancellation propagation and (b) an include that is also excluded, plus a `GatewayAlarmMonitorProviderModeTests` test pinning the provider-switch counter behaviour for a same-mode repeat event (whichever semantics the team intends). These lock down the contracts the Server-051/052 findings expose.
|
||||
|
||||
**Resolution:** Resolved 2026-06-15. Added all three missing tests: (a) `AlarmWatchListResolverTests.ResolveAsync_RepositoryCancelled_PropagatesOperationCanceled` (cancellation propagation, also covers Server-051); (b) `AlarmWatchListResolverTests.ResolveAsync_ExcludeAlsoSuppressesMatchingExplicitInclude` (exclude-vs-include precedence, also Server-052 item 2); and (c) `GatewayAlarmMonitorProviderModeTests.ProviderModeChange_RepeatedSameMode_RecordsASwitchForEachEvent`, which pins the existing semantics — each worker-reported `OnAlarmProviderModeChanged` event records a `provider_switches` increment (and resets `_providerSince`) even when `toMode` equals the current mode, since the worker is the authority on when a mode change occurred and the gateway does not synthesize or suppress it.
|
||||
|
||||
@@ -4,13 +4,43 @@
|---|---|
| Module | `src/ZB.MOM.WW.MxGateway.Tests` |
| Reviewer | Claude Code |
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Status | Re-reviewed |
| Open findings | 0 |
|
||||
## Checklist coverage
|
||||
|
||||
### 2026-06-15 re-review (commit `410acc9`)
|
||||
|
||||
Re-review of the `42b0037..410acc9` diff (≈57 files), scoped to the alarm-provider
fallback feature: the end-to-end failover/failback lifecycle test
(`AlarmFailoverEndToEndTests`), the provider-mode/metric tests
(`GatewayAlarmMonitorProviderModeTests`), the watch-list resolver tests
(`AlarmWatchListResolverTests`), the validator additions
(`GatewayOptionsValidatorTests` AlarmFallback block), the dashboard badge model
(`DashboardBrowseAndAlarmModelTests`), the alarm metric tests
(`GatewayMetricsTests`), the Galaxy alarm mapper (`GalaxyAlarmAttributeMappingTests`),
and the new `provider_status` / degraded-provenance protobuf round-trips. The
non-alarm churn in the diff (kill/shutdown SessionManager tests closing prior
Tests-028/029, XML-doc-only additions to `SessionManagerBulkTests`/`GatewaySessionTests`,
browse-tab and TLS tests) was walked but is not the review focus.
|
||||
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found in this diff. The lifecycle test correctly disambiguates the recovery `ProviderStatus` from the baseline by matching on `Reason == "recovered"`; the `ModeString_MapsToForcedProviderMode` `Assert.Empty(WatchList)` is weak (the stub resolver returns `[]` regardless of mode) but not wrong. |
| 2 | mxaccessgw conventions | Issue found: Tests-034 (`GatewayLogRedactorSeamTests.cs` is in the global namespace with redundant `System.Collections.Generic`/`Xunit` usings, `var`, and a non-`sealed` `public class` — the same C# style drift family as the resolved Tests-008). |
| 3 | Concurrency & thread safety | Issue found: Tests-035 (`AlarmFailoverEndToEndTests.DegradedTransition_*`'s second-subscriber `await foreach`-to-`SnapshotComplete` has no `WaitTimeout`, so a regression that never emits `SnapshotComplete` hangs the test instead of failing cleanly). The metric/feed reader races are correctly gated by `baselineReceived` TCS before emitting events. |
| 4 | Error handling & resilience | No issues found in this diff. `AlarmWatchListResolverTests.ResolveAsync_RepositoryThrows_LogsAndReturnsConfigOnlySet` covers the discovery-unavailable degradation path; validator failure paths are well covered. |
| 5 | Security | No issues found in this diff. The redaction seam assertion in `GatewayLogRedactorSeamTests` (despite its style drift) meaningfully pins API-key masking in `ClientIdentity`; secured-bulk credential round-trips are pinned. |
| 6 | Performance & resource management | No issues found in this diff. Monitors/CTSs are disposed; `using GatewayMetrics`/`using GatewayAlarmMonitor` throughout. |
| 7 | Design-document adherence | No issues found in this diff. Tests match the alarm-fallback plan and the forced-vs-failover-degraded badge distinction. |
| 8 | Code organization & conventions | See Tests-034. The two alarm-monitor test files replicate (not share) the `FakeSessionManager`/`StubWatchListResolver` harness; the in-file remark documents this is deliberate to keep the sibling untouched — acceptable, not filed. |
| 9 | Testing coverage | Issues found: Tests-032 (the monitor's `toMode`→`AlarmProviderSwitchReason` derivation — Subtag→Failover, Alarmmgr→Failback — is untested: `Failback` is asserted nowhere and the monitor tests check only the switch *count*, so a swapped/`Unknown` reason regression passes), Tests-033 (`DashboardAlarmProviderStatus.FromFeed` and its non-provider-status `ArgumentException` guard, the `SinceUtc` mapping, the `DegradedLabel` text, and the `Degraded && Mode==Alarmmgr` guard branch are all uncovered). |
| 10 | Documentation & comments | No issues found in this diff. New alarm test files carry orienting class-level summaries; `GalaxyAlarmAttributeMappingTests`'s "derivation" framing slightly overstates the pass-through mapper but is harmless. |
|
||||
### 2026-05-24 re-review of the Tests-013–019 batch
|
||||
|
||||
This pass (commit `a020350`) re-reviews the module after the Tests-013–019 batch was resolved alongside Server-017, Server-021, and Contracts-010.
|
||||
|
||||
| # | Category | Result |
|
||||
@@ -557,3 +587,63 @@ The cancellation tests for `WorkerClient` in `WorkerClientTests` *do* exercise t
|
||||
**Recommendation:** (a) The cheap fix: have `ThrowOnceThenYieldSnapshotService` record `_firstThrowAt = DateTimeOffset.UtcNow` immediately before the `throw`, and change the assertion to `secondSubscribeAt - firstThrowAt >= reconnectDelay - 10ms` — the gap then measures only the reconnect delay, eliminating the variable scheduling baseline. (b) The deeper fix: extend `DashboardSnapshotPublisher` to accept an `ITimeProvider`-style delay seam (or a virtual `DelayAsync` hook) so a `ManualTimeProvider` could advance time deterministically. (a) is preferred for now; (b) belongs as a follow-up if more reconnect-loop tests are added.
|
||||
|
||||
**Resolution:** 2026-05-24 — Applied option (a). Added `FirstThrowAt` to `ThrowOnceThenYieldSnapshotService` and set it via `FirstThrowAt = DateTimeOffset.UtcNow;` immediately before the first-call `throw`. Removed the pre-`StartAsync` `startedAt` baseline; the assertion now reads `gap = secondSubscribeAt - firstThrowAt` (both timestamps captured inside the fake), and the 10 ms slack absorbs the Windows `Task.Delay` quantum without the variable `StartAsync` / scheduling overhead in the baseline. This is the same flake-isolation pattern Tests-006 / Tests-017 used (measuring only the production delay, not test-side setup). Suite green; the test passes deterministically across repeated runs.
|
||||
|
||||
### Tests-032
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs:435-441`, `src/ZB.MOM.WW.MxGateway.Tests/Alarms/GatewayAlarmMonitorProviderModeTests.cs`, `src/ZB.MOM.WW.MxGateway.Tests/Alarms/AlarmFailoverEndToEndTests.cs` |
| Status | Resolved |
|
||||
**Description:** `GatewayAlarmMonitor.HandleProviderModeChanged` derives the provider-switch reason from the target mode: `toMode switch { Subtag => Failover, Alarmmgr => Failback, _ => Unknown }` (lines 435-439), then calls `_metrics.AlarmProviderSwitched(fromModeInt, toModeInt, switchReason)`. No test in the diff asserts this derivation. `GatewayAlarmMonitorProviderModeTests.ProviderModeChange_BroadcastsDegradedStatus_AndIncrementsSwitchMetric` only asserts the switch *count* (`switchCount == 1`) — it never inspects the `from`/`to`/`reason` tags on the measurement. `AlarmFailoverEndToEndTests.ProviderFailoverAndFailback_FullLifecycle` drives both a failover (alarmmgr→subtag) and a failback (subtag→alarmmgr) but asserts only on feed `ProviderStatus` messages, not on the metric tags. The only place the `reason` tag is read is `GatewayMetricsTests.AlarmProviderSwitched_IncrementsCounterWithExpectedTags`, which passes `AlarmProviderSwitchReason.Failover` *explicitly* to the metrics layer — that pins the metrics-side tag formatting, not the monitor's `toMode→reason` mapping. `AlarmProviderSwitchReason.Failback` is asserted nowhere in the suite. A regression that swapped the Failover/Failback arms, or collapsed them to `Unknown`, would pass every existing test while emitting wrong dashboard/observability data for every failback.
|
||||
|
||||
**Recommendation:** Extend `GatewayAlarmMonitorProviderModeTests` (or add a failback case) to capture the `reason` tag through a `MeterListener` and assert it equals `"failover"` on an alarmmgr→subtag change and `"failback"` on a subtag→alarmmgr change, mirroring the tag-capturing pattern already in `GatewayMetricsTests.AlarmProviderSwitched_IncrementsCounterWithExpectedTags`. This pins the monitor's `toMode→AlarmProviderSwitchReason` derivation, not just the count.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed root cause: the existing monitor tests asserted only the switch *count*, and `Failback` was asserted nowhere in the suite, so a swapped/`Unknown` reason arm would pass. Added `GatewayAlarmMonitorProviderModeTests.ProviderModeChange_FailoverThenFailback_RecordsCorrectReasonTags`, which captures the `reason` tag off the `mxgateway.alarms.provider_switches` counter via a `MeterListener` and drives an alarmmgr→subtag change then a subtag→alarmmgr change, asserting the captured reasons are exactly `["failover", "failback"]`. This pins the monitor's `toMode→AlarmProviderSwitchReason` derivation (`ApplyProviderModeChangeAsync`). Test passes against current production code (no production change); no bug found.
|
||||
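The tag-capturing approach from the resolution can be sketched with `System.Diagnostics.Metrics.MeterListener`. This is an illustration only: the meter name `Sketch.Alarms`, the `ReasonFor` helper, and the string-typed modes are assumptions standing in for `GatewayMetrics`, the monitor's enum-based derivation, and the real test harness.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical meter; the counter name matches the one quoted in the finding.
using Meter meter = new("Sketch.Alarms");
Counter<long> switches = meter.CreateCounter<long>("mxgateway.alarms.provider_switches");

List<string> reasons = new();
using MeterListener listener = new();
listener.InstrumentPublished = (instrument, l) =>
{
    // Subscribe only to instruments from the meter under test.
    if (ReferenceEquals(instrument.Meter, meter)) l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    // Record the "reason" tag of every measurement, in order.
    foreach (KeyValuePair<string, object?> tag in tags)
    {
        if (tag.Key == "reason") reasons.Add((string)tag.Value!);
    }
});
listener.Start();

// Drive a failover (to subtag) followed by a failback (to alarmmgr).
switches.Add(1, new KeyValuePair<string, object?>("reason", ReasonFor("subtag")));
switches.Add(1, new KeyValuePair<string, object?>("reason", ReasonFor("alarmmgr")));

Console.WriteLine(string.Join(", ", reasons));

// Hypothetical string-based mirror of the toMode -> reason derivation.
string ReasonFor(string toMode) => toMode switch
{
    "subtag" => "failover",
    "alarmmgr" => "failback",
    _ => "unknown",
};
```

Asserting on the ordered list of captured reasons, rather than on the measurement count alone, is what makes a swapped or collapsed switch arm fail the test.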
|
||||
### Tests-033
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAlarmProviderStatus.cs`, `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Dashboard/DashboardBrowseAndAlarmModelTests.cs:140-195` |
| Status | Resolved |
|
||||
**Description:** The three new badge-mapping tests cover `FromProviderStatus` for green (Alarmmgr/not-degraded), amber (Subtag/degraded), and cyan (Subtag/forced). Several adjacent behaviours of the same projection are uncovered: (1) `DashboardAlarmProviderStatus.FromFeed(AlarmFeedMessage)` — the public entry the dashboard SignalR snapshot path actually calls — and its `ArgumentException` thrown when the message is not a `ProviderStatus` payload have zero coverage in the suite (a grep for `FromFeed` across the test project returns no hits). (2) The `SinceUtc` field (`status.Since?.ToDateTimeOffset()`) is never asserted, so a regression dropping or mis-converting the badge timestamp would not be caught. (3) The `DegradedLabel` constant text ("Subtag monitoring (degraded)") is asserted nowhere — the amber test only checks the `bg-warning` CSS class, so a label swap would pass. (4) The `degraded = status.Degraded || status.Mode == AlarmProviderMode.Subtag` guard's second branch (`Degraded == true` while `Mode == Alarmmgr`) — an explicitly-degraded alarmmgr status — is untested, so the "guard against either being set independently" comment in the product code is unverified.
|
||||
|
||||
**Recommendation:** Add `FromFeed_NonProviderStatusPayload_Throws` (asserting `ArgumentException`) and `FromFeed_ProviderStatusPayload_ProjectsBadge`; assert `SinceUtc` on a status carrying a `Since` timestamp; assert `model.Label == DashboardAlarmProviderStatus.DegradedLabel` in the amber test; and add a `Degraded=true, Mode=Alarmmgr` case asserting it maps to the degraded (amber) badge per the independent-flag guard.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed the four coverage gaps against `DashboardAlarmProviderStatus`. Added to `DashboardBrowseAndAlarmModelTests`: `FromFeed_ProviderStatusPayload_ProjectsBadge` and `FromFeed_NonProviderStatusPayload_Throws` (the latter asserts `ArgumentException` for a `SnapshotComplete` feed message); `FromProviderStatus_WithSinceTimestamp_MapsSinceUtc` (pins `SinceUtc` round-trips the protobuf `Since` timestamp); `FromProviderStatus_Alarmmgr_DegradedFlagSet_WarningBadge` (the `Degraded && Mode==Alarmmgr` independent-flag branch maps to the amber degraded badge); and a `DegradedLabel` text assertion added to the existing amber `FromProviderStatus_Subtag_Degraded_WarningBadge`. All pass against current production code (no production change); no bug found.
|
||||
|
||||
### Tests-034
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | mxaccessgw conventions |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorSeamTests.cs:1-15` |
| Status | Resolved |
|
||||
**Description:** `GatewayLogRedactorSeamTests.cs` diverges from the project's C# style guide and the rest of the test suite: it declares no file-scoped namespace (the class lands in the global namespace, unlike every other test file which sits under `ZB.MOM.WW.MxGateway.Tests.*`); it carries redundant explicit `using System.Collections.Generic;` and `using Xunit;` (both are implicit global usings in this project, enforced elsewhere — see the resolved Tests-008); it uses `var` for `redactor`/`props` where the suite uses explicit types per `docs/style-guides/CSharpStyleGuide.md`; and it declares `public class` rather than the project's `sealed`-by-default convention. The redaction assertion itself is sound (it meaningfully pins API-key masking in `ClientIdentity`), so this is purely the same style-drift family as the previously-filed-and-resolved Tests-008, not a correctness issue.
|
||||
|
||||
**Recommendation:** Add `namespace ZB.MOM.WW.MxGateway.Tests.Diagnostics;` (file-scoped), drop the redundant `System.Collections.Generic`/`Xunit` usings, mark the class `public sealed class`, and replace the two `var` declarations with explicit types (`GatewayLogRedactorSeam` / `Dictionary<string, object?>`).
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed the style drift. Rewrote `GatewayLogRedactorSeamTests.cs` to add the file-scoped `namespace ZB.MOM.WW.MxGateway.Tests.Diagnostics;`, dropped the redundant `using System.Collections.Generic;`/`using Xunit;` (both implicit global usings), marked the class `public sealed class`, and replaced the two `var` declarations with explicit `GatewayLogRedactorSeam` / `Dictionary<string, object?>` types. The single `Redact_MasksApiKeyInClientIdentity` assertion is unchanged and still passes.
|
||||
|
||||
### Tests-035
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/ZB.MOM.WW.MxGateway.Tests/Alarms/AlarmFailoverEndToEndTests.cs:315-329` |
| Status | Resolved |
|
||||
**Description:** In `DegradedTransition_CachedThenReplayed_CarriesDegradedAndSourceProviderToNewSubscriber`, the second-subscriber loop iterates `monitor.StreamAsync(null, newStreamCts.Token)` with no timeout, breaking only when a `SnapshotComplete` payload arrives (lines 317-329). Every other wait in this file routes through `WaitForAsync(..., WaitTimeout)` or `Task.WaitAsync(WaitTimeout)`; this `await foreach` does not. If a regression caused the monitor to stop emitting `SnapshotComplete` for a new subscriber (e.g. a snapshot-replay path that throws before the terminal message), the test would hang on the `await foreach` rather than fail with a `TimeoutException`, relying on the xUnit `longRunningTestSeconds` warning or the CI hard-kill instead of a clean assertion failure. The first subscriber in the same test is correctly bounded by `WaitForAsync`.
|
||||
|
||||
**Recommendation:** Bound the second-subscriber drain with the same `WaitTimeout` used elsewhere, e.g. combine `newStreamCts.Token` with a timeout source via `CancellationTokenSource.CreateLinkedTokenSource` and `CancelAfter(WaitTimeout)`, or wrap the drain in a `Task` awaited via `WaitAsync(WaitTimeout)`, so a missing `SnapshotComplete` surfaces as a deterministic failure rather than a hang.
|
||||
|
||||
**Resolution:** 2026-06-15 — Confirmed the unbounded `await foreach` in `DegradedTransition_CachedThenReplayed_CarriesDegradedAndSourceProviderToNewSubscriber`. Bounded the second-subscriber drain with a `CancellationTokenSource.CreateLinkedTokenSource(newStreamCts.Token, drainTimeoutCts.Token)` where `drainTimeoutCts.CancelAfter(WaitTimeout)`, and wrapped the loop in a `try/catch (OperationCanceledException) when (drainTimeoutCts.IsCancellationRequested)` that rethrows a `TimeoutException`. A regression that never emits `SnapshotComplete` now fails cleanly instead of hanging. Test still passes.
|
||||
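A minimal sketch of the bounded drain, assuming a hypothetical `StreamAsync` that yields string payloads (the real `monitor.StreamAsync` yields feed messages) and a hypothetical `DrainAsync` helper. It shows the linked-token pattern and the translation of a timeout-driven cancellation into a `TimeoutException`.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Healthy stream: the terminal message arrives and the drain returns.
int seen = await DrainAsync(emitTerminal: true, TimeSpan.FromSeconds(5), CancellationToken.None);
Console.WriteLine($"messages drained: {seen}");

// Regressed stream: the terminal message never arrives, so the drain times out.
try
{
    await DrainAsync(emitTerminal: false, TimeSpan.FromMilliseconds(100), CancellationToken.None);
}
catch (TimeoutException ex)
{
    Console.WriteLine(ex.Message);
}

// Hypothetical feed: optionally stalls forever instead of emitting the terminal message.
async IAsyncEnumerable<string> StreamAsync(
    bool emitTerminal, [EnumeratorCancellation] CancellationToken ct = default)
{
    yield return "ProviderStatus";
    if (emitTerminal)
    {
        yield return "SnapshotComplete";
        yield break;
    }
    await Task.Delay(Timeout.Infinite, ct); // simulates a missing SnapshotComplete
}

async Task<int> DrainAsync(bool emitTerminal, TimeSpan waitTimeout, CancellationToken testToken)
{
    using CancellationTokenSource drainTimeoutCts = new();
    drainTimeoutCts.CancelAfter(waitTimeout);
    using CancellationTokenSource linked =
        CancellationTokenSource.CreateLinkedTokenSource(testToken, drainTimeoutCts.Token);
    int count = 0;
    try
    {
        await foreach (string message in StreamAsync(emitTerminal, linked.Token))
        {
            count++;
            if (message == "SnapshotComplete") break;
        }
    }
    catch (OperationCanceledException) when (drainTimeoutCts.IsCancellationRequested)
    {
        // Only the timeout source is translated; caller cancellation still propagates as-is.
        throw new TimeoutException("SnapshotComplete was not observed within the wait timeout.");
    }
    return count;
}
```

Filtering on `drainTimeoutCts.IsCancellationRequested` keeps a cancellation of the outer test token distinguishable from the drain timeout.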
|
||||
@@ -4,11 +4,48 @@
|---|---|
| Module | `src/ZB.MOM.WW.MxGateway.Worker.Tests` |
| Reviewer | Claude Code |
| Review date | 2026-05-24 |
| Commit reviewed | `42b0037` |
| Review date | 2026-06-15 |
| Commit reviewed | `410acc9` |
| Status | Re-reviewed |
| Open findings | 0 |
|
||||
## 2026-06-15 re-review (commit `410acc9`)
|
||||
|
||||
Re-review of the alarm-fallback test additions in `git diff 42b0037..HEAD --
src/ZB.MOM.WW.MxGateway.Worker.Tests/`. New unit suites land for the subtag
fallback (`SubtagAlarmConsumerTests`, `SubtagAlarmStateMachineTests`,
`SyntheticAlarmGuidTests`, `LmxSubtagAlarmSourceTests`) and the auto-failover
composite (`FailoverAlarmConsumerTests`); the existing alarm suites are updated
for the `SubscribeAlarmsCommand`-based handler signature, the
`(eq, affinity, comFactory)` handler-factory delegate, and the new
degraded/source-provider fields. Most of the change is genuinely new coverage
plus a large volume of XML-doc additions on existing test doubles (benign).
|
||||
|
||||
Findings: the failover state-machine transitions (failover at threshold,
failback after stable probes, intermittent-failure reset, before/after-switch
forwarding, ack delegation, `ProbeOnce`-never-re-Subscribes) are all covered;
the acked latch (`OutOfOrderAckThenClear_StillEmitsAckRtn`), the dup-address
guard (`DuplicateActiveSubtag_Throws`), and the exact-match-vs-substring ack
resolution (`AcknowledgeByName_PrefixNameDoesNotFalseMatch`,
`AcknowledgeByGuid_*`) are all pinned. Three coverage gaps remain
(Worker.Tests-031/032/033), all in new alarm-fallback code paths. The two
newest files (`SyntheticAlarmGuidTests`, `LmxSubtagAlarmSourceTests`) omit an
explicit `using Xunit;` but compile via the `<Using Include="Xunit" />` global
using in the csproj, so that is not a finding.
|
||||
|
||||
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found — failover/state-machine/ack tests assert meaningful post-conditions (mode, emitted state, target subtag address) and do not pass for the wrong reason; the prefix-name and unknown-guid negative cases pin the exact-match contract. |
| 2 | mxaccessgw conventions | No issues found — new test methods follow `Method_Scenario_Expectation`; STA-affinity respected (state machine / consumer driven synchronously through internal seams). |
| 3 | Concurrency & thread safety | No issues found — new failover/subtag suites are single-threaded and event-driven; no wall-clock floors or fixed sleeps were introduced (the `MxAccessValueCacheTests` change only deletes the old Worker.Tests-020 comment block). |
| 4 | Error handling & resilience | Issues found: Worker.Tests-032 — the `RunPrimary` `when (ex is not OutOfMemoryException)` filter (the OOM-safe catch path) and the `FailoverSettings` clamp branches are untested. |
| 5 | Security | No issues found — no secrets/credentials; ack-operator identity fields are sentinels. |
| 6 | Performance & resource management | No issues found — `IDisposable` test subjects use `using`; the `LmxSubtagAlarmSource` dispose-idempotency / unadvise-only-advised-handles teardown is regression-tested. |
| 7 | Design-document adherence | No issues found — tests mirror the alarm-fallback plan (degraded flag, synthetic GUID, subtag-ack via ack-comment, single-subscribe primary). |
| 8 | Code organization & conventions | No issues found — new suites live under `MxAccess/`; test doubles are per-file (acceptable for these narrow fakes). |
| 9 | Testing coverage | Issues found: Worker.Tests-031 (`ProbeIntervalSeconds` throttle-active branch never exercised — every test uses `probeIntervalSeconds: 0`), Worker.Tests-033 (`SubtagAlarmStateMachine` ack-while-inactive and priority-subtag branches uncovered). |
| 10 | Documentation & comments | No issues found — test XML docs match assertions; no misleading names observed. |
|
||||
|
||||
## 2026-05-24 re-review (commit `42b0037`)
|
||||
|
||||
**Re-review: no new findings.** `git diff --name-only d692232..42b0037 -- src/ZB.MOM.WW.MxGateway.Worker.Tests` returns empty — the Worker.Tests module has zero source changes since the previous review. All ten checklist categories therefore inherit "No issues found" from the `d692232` pass. The header is bumped to track the latest reviewed commit; Worker.Tests-001..030 remain closed.
|
||||
@@ -533,3 +570,48 @@ findings (Worker.Tests-001 through -030) are unaffected.
|
||||
**Recommendation:** Either (a) reassign `CreateCancelEnvelope` to a sequence value `>` shutdown (or pass the sequence as a parameter, matching `CreateGatewayHelloEnvelope`'s parameter style), so the wire trace reads in ascending order; (b) add an XML-doc note on the cancel test stating that the worker has no inbound monotonicity check and the test ignores envelope sequence ordering; (c) parameterise all four helper methods so each test passes its desired sequence and the literal numbers stop carrying implicit meaning. Option (c) is the cleanest because `CreateGatewayHelloEnvelope` is already parameter-driven for nonce/version.
|
||||
|
||||
**Resolution:** 2026-05-20 — Took option (c): parameterised `CreateGatewayHelloEnvelope`/`CreateCommandEnvelope`/`CreateCancelEnvelope`/`CreateShutdownEnvelope` with a `ulong sequence` argument (defaults 1/2/2/3 respectively, matching the typical Hello/Command/Cancel/Shutdown ordering), so the literal sequence values no longer carry implicit meaning. Updated the cancel-correlation test's wire trace to ascend (Hello=1, Cancel=2, Shutdown=3) and added a comment noting that the worker has no inbound monotonicity check — the parameter exists so multi-frame tests can pin the trace ordering explicitly when needed.
|
||||
|
||||
### Worker.Tests-031
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/FailoverAlarmConsumerTests.cs` (all `FailoverSettings` constructions) |
| Status | Resolved |
|
||||
|
||||
**Description:** Every `FailoverSettings` in `FailoverAlarmConsumerTests` is built with `probeIntervalSeconds: 0`, which deliberately *disables* the probe throttle. The throttle-active branch in `FailoverAlarmConsumer.ProbeOnce` (`src/ZB.MOM.WW.MxGateway.Worker/MxAccess/FailoverAlarmConsumer.cs:211-215`) — where a probe is *skipped* because fewer than `ProbeIntervalSeconds` have elapsed since `lastProbeAtUtc` — is therefore never exercised. This is a genuine production behaviour: the failback cadence is the only thing preventing a degraded worker from hammering the broken primary with a `PollOnce` on every timer tick, and `AlarmCommandHandlerTests.Subscribe_AutoModeWithWatchList_...` wires a real non-zero `FailbackProbeIntervalSeconds = 1` into the handler, so the throttle is on the live path. A regression that inverted the comparison (skipping once the elapsed time is `>=` the interval instead of while it is `<`), dropped the `lastProbeAtUtc` update, or removed the throttle entirely would not be caught by any test. The task brief named "ProbeIntervalSeconds enforcement" as an explicit focus area.
|
||||
|
||||
**Recommendation:** Add a test that constructs `FailoverSettings(threshold: 1, probeIntervalSeconds: <N>, stableProbes: 1)` with a non-zero interval, forces failover, makes the primary healthy, then calls `ProbeOnce()` twice in quick succession and asserts the second call did *not* probe (e.g. assert `primary.Polls` advanced by exactly one and `Mode` is still `Subtag`). Because the throttle reads `DateTime.UtcNow` directly, either accept a coarse same-wall-clock-instant assertion (two back-to-back calls reliably fall inside any interval ≥ 1s) or, preferably, refactor `ProbeOnce` to take an injectable clock so the throttle boundary can be pinned deterministically without wall-clock dependence (consistent with the Worker.Tests-020 manual-time-source approach).
|
||||
|
||||
**Resolution:** 2026-06-15 — Took the coarse same-wall-clock-instant approach (no production-code clock injection needed). Added `FailoverAlarmConsumerTests.ProbeOnce_WithNonZeroInterval_ThrottlesSecondProbeWithinInterval`: builds `FailoverSettings(threshold: 1, probeIntervalSeconds: 3600, stableProbes: 5)`, forces failover to Subtag, makes the primary healthy, then calls `ProbeOnce()` twice back-to-back. The first probe re-polls the primary (`primary.Polls == 1`); the second falls inside the 3600s interval and is throttled, so `primary.Polls` is unchanged and `Mode` stays `Subtag`. `stableProbes: 5` keeps a single clean probe from failing back, so the throttled `ProbeOnce` path stays in scope. A 1-hour interval makes the two back-to-back calls reliably fall inside the window without any timing flakiness.
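
To illustrate the branch this test pins, here is a minimal sketch of a probe throttle. `ProbeThrottleSketch` is a hypothetical reduction, not the real `FailoverAlarmConsumer.ProbeOnce`; the optional clock parameter is an assumption that stands in for the injectable-clock alternative mentioned in the recommendation.

```csharp
using System;

// Hypothetical reduction of the ProbeOnce throttle; not the production type.
public sealed class ProbeThrottleSketch
{
    private readonly int _probeIntervalSeconds;
    private readonly Func<DateTime> _utcNow;
    private DateTime? _lastProbeAtUtc;

    // Number of probes that actually reached the primary.
    public int Polls { get; private set; }

    public ProbeThrottleSketch(int probeIntervalSeconds, Func<DateTime> utcNow = null)
    {
        _probeIntervalSeconds = probeIntervalSeconds;
        _utcNow = utcNow ?? (() => DateTime.UtcNow);
    }

    // Returns true when the primary was probed, false when the call was throttled.
    public bool ProbeOnce()
    {
        DateTime now = _utcNow();
        bool insideInterval =
            _probeIntervalSeconds > 0
            && _lastProbeAtUtc.HasValue
            && (now - _lastProbeAtUtc.Value).TotalSeconds < _probeIntervalSeconds;
        if (insideInterval)
        {
            return false;
        }

        _lastProbeAtUtc = now;
        Polls++;
        return true;
    }
}
```

With an interval of 3600 seconds, two back-to-back calls fall inside the window, so only the first one polls; with an interval of 0 the throttle is disabled and every call polls.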
|
||||
|
||||
### Worker.Tests-032
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/FailoverAlarmConsumerTests.cs` |
| Status | Resolved |
|
||||
|
||||
**Description:** Two resilience branches of `FailoverAlarmConsumer` are uncovered by the new tests. (1) `RunPrimary` catches `Exception ex when (ex is not OutOfMemoryException)` (`FailoverAlarmConsumer.cs:295`) — the OOM-safe catch path the task brief explicitly called out. No test throws `OutOfMemoryException` from the primary to verify it *propagates* (rather than being swallowed and counted toward the failover threshold like every other exception); the `FlakyPrimary` fake throws only `COMException`. A regression that broadened the filter to swallow OOM would convert a fatal allocation failure into a silent failover. (2) The `FailoverSettings` constructor clamps `threshold < 1 → 1` and `stableProbes < 1 → 1` (`FailoverSettings.cs:38-40`); no test passes a sub-1 value to confirm the clamp, so a misconfigured `ConsecutiveFailureThreshold = 0` from the gateway could change failover semantics undetected.
|
||||
|
||||
**Recommendation:** Add a `FlakyPrimary`-style fake (or a flag on the existing one) that throws `OutOfMemoryException` from `PollOnce`, and assert `sut.PollOnce()` rethrows it via `Assert.Throws<OutOfMemoryException>` and that no `ProviderModeChanged` fired. Add a small `FailoverSettings` fact (or `[Theory]`) asserting `new FailoverSettings(0, 0, 0).Threshold == 1` and `.StableProbes == 1` to pin the clamp.
|
||||
|
||||
**Resolution:** 2026-06-15 — Added a `ThrowOutOfMemoryOnPoll` flag to the existing `FlakyPrimary` fake (its `PollOnce` throws `OutOfMemoryException` when set, checked before the `COMException` branch). Regression test `FailoverAlarmConsumerTests.RunPrimary_WhenPrimaryThrowsOutOfMemory_PropagatesAndDoesNotFailOver` drives `PollOnce` through the primary, asserts `Assert.Throws<OutOfMemoryException>`, and asserts no `ProviderModeChanged` fired and `Mode` stays `Alarmmgr` — pinning that the `when (ex is not OutOfMemoryException)` filter lets OOM propagate rather than swallowing it and counting it toward the failover threshold. The clamp is pinned by `FailoverSettings_ClampsSubMinimumValues` (a `[Theory]`): `(0,0,0)→(1,0,1)`, `(-5,-5,-5)→(1,0,1)`, and a pass-through `(3,7,2)→(3,7,2)` to confirm in-range values are not altered.
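
The two behaviours pinned here can be sketched as follows. `FailoverSettingsSketch` and `PrimaryRunner` are hypothetical names; the clamp minimums follow the `(0,0,0)→(1,0,1)` theory data quoted above, and `!(ex is OutOfMemoryException)` is the pre-C# 9 spelling of the `ex is not OutOfMemoryException` filter.

```csharp
using System;

// Hypothetical mirror of the clamp described for FailoverSettings.
public sealed class FailoverSettingsSketch
{
    public int Threshold { get; }
    public int ProbeIntervalSeconds { get; }
    public int StableProbes { get; }

    public FailoverSettingsSketch(int threshold, int probeIntervalSeconds, int stableProbes)
    {
        Threshold = Math.Max(1, threshold);                       // at least one failure
        ProbeIntervalSeconds = Math.Max(0, probeIntervalSeconds); // 0 disables the throttle
        StableProbes = Math.Max(1, stableProbes);                 // at least one clean probe
    }
}

public static class PrimaryRunner
{
    // Returns true when the poll succeeded, false when a non-fatal failure was counted.
    // OutOfMemoryException is excluded by the filter and therefore propagates.
    public static bool RunPrimary(Action pollOnce, ref int consecutiveFailures)
    {
        try
        {
            pollOnce();
            consecutiveFailures = 0;
            return true;
        }
        catch (Exception ex) when (!(ex is OutOfMemoryException))
        {
            consecutiveFailures++;
            return false;
        }
    }
}
```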
|
||||
|
||||
### Worker.Tests-033
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/SubtagAlarmStateMachineTests.cs` |
| Status | Resolved |
|
||||
|
||||
**Description:** `SubtagAlarmStateMachineTests` covers the core transition matrix and the acked latch well, but two branches of the new state machine are unexercised. (1) The ack-while-inactive path in `SubtagAlarmStateMachine.ApplyAcked` (`src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SubtagAlarmStateMachine.cs:156-164`): when `.acked` flips true while the alarm is *not* active, the machine must emit nothing and must *not* set `AckedDuringEpisode` — otherwise a stale ack from a prior episode could mis-latch the next raise into a spurious `ACK_RTN`. No test drives an `.acked` change without a preceding active raise. (2) The priority-subtag path (`SubtagRole.Priority` → `state.Priority = CoerceInt(...)`, lines 76-78): `SubtagAlarmConsumerTests.Subscribe_AdvisesAllSubtagsIncludingAckComment` confirms the priority subtag is *advised*, but no test raises a priority value change and asserts it flows into the emitted/snapshot record's `Priority`, so `CoerceInt` and the priority assignment are untested in the state-machine layer.
|
||||
|
||||
**Recommendation:** Add (a) `AckedTrueWhileInactive_EmitsNothingAndDoesNotLatch` — apply `.acked=true` with no prior active raise, assert `Apply` returns empty, then raise active and clear and assert the clear emits `UnackRtn` (proving the stale ack did not latch); and (b) `PriorityChange_FlowsIntoEmittedRecord` — apply a priority value then an active raise and assert the emitted record's `Priority` equals the supplied value (and a `CoerceInt` string/garbage case falls back).
|
||||
|
||||
**Resolution:** 2026-06-15 — Added both tests to `SubtagAlarmStateMachineTests`. `AckedTrueWhileInactive_EmitsNothingAndDoesNotLatch` applies `.acked=true` with no preceding active raise (asserts `Apply` returns empty), then drives a fresh raise→clear episode and asserts the clear emits `UnackRtn` — proving the stale inactive ack did not latch `AckedDuringEpisode`. `PriorityChange_FlowsIntoEmittedRecord` (the target now includes a `PrioritySubtag`) applies an `int` priority `750` (asserts the priority change emits nothing), raises active and asserts the emitted record's `Priority == 750` (exercising `CoerceInt`'s `int` path and the priority assignment), then applies a non-numeric `"not-a-number"` priority and asserts the snapshot `Priority` is still `750` (the `CoerceInt` string fallback keeps the prior value, not zero).
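
The ack-while-inactive guard can be illustrated with a small latch sketch. `AckLatchSketch` and the `"UnackAlm"`/`"AckAlm"` transition names are hypothetical; only the `AckedDuringEpisode` latch and the `UnackRtn`/`AckRtn` outcomes are taken from the finding, and the real `SubtagAlarmStateMachine` may differ in detail.

```csharp
// Hypothetical reduction of the acked latch; not the production state machine.
public sealed class AckLatchSketch
{
    public bool Active { get; private set; }
    public bool AckedDuringEpisode { get; private set; }

    // Applies an in-alarm value change; returns the emitted transition or null.
    public string ApplyActive(bool inAlarm)
    {
        if (inAlarm && !Active)
        {
            Active = true;
            return "UnackAlm";
        }

        if (!inAlarm && Active)
        {
            Active = false;
            string kind = AckedDuringEpisode ? "AckRtn" : "UnackRtn";
            AckedDuringEpisode = false; // the latch only lives for one episode
            return kind;
        }

        return null;
    }

    // Applies an acked value change; returns the emitted transition or null.
    public string ApplyAcked(bool acked)
    {
        if (!acked || !Active)
        {
            // Ack while inactive: emit nothing and do not latch.
            return null;
        }

        AckedDuringEpisode = true;
        return "AckAlm";
    }
}
```

Because the raise does not reset the latch in this sketch, dropping the `!Active` guard would let a stale ack turn the next clear into `AckRtn`, which is the regression the new test is meant to catch.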
|
||||
|
||||
@@ -4,11 +4,38 @@
|
||||
|---|---|
|
||||
| Module | `src/ZB.MOM.WW.MxGateway.Worker` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-24 |
|
||||
| Commit reviewed | `42b0037` |
|
||||
| Review date | 2026-06-15 |
|
||||
| Commit reviewed | `410acc9` |
|
||||
| Status | Re-reviewed |
|
||||
| Open findings | 0 |
|
||||
|
||||
## 2026-06-15 re-review (commit `410acc9`)
|
||||
|
||||
Re-review of the `42b0037..410acc9` diff — the alarm-provider subtag-fallback
feature (`git diff 42b0037..410acc9 -- src/ZB.MOM.WW.MxGateway.Worker/`). New
substantive code: `SubtagAlarmConsumer`, `SubtagAlarmStateMachine`,
`FailoverAlarmConsumer`, `LmxSubtagAlarmSource`, `SyntheticAlarmGuid`,
`AlarmProviderModeChange`, `FailoverSettings`, `ISubtagAlarmSource` /
`SubtagValueChange`, plus the degraded/`source_provider` propagation in
`AlarmDispatcher` / `MxAccessAlarmEventSink` / `MxAccessEventMapper`, the
`ForcedMode`/watch-list routing and STA-COM-factory threading in
`AlarmCommandHandler` / `MxAccessStaSession`, and the `SubscribeAlarmsCommand`
re-plumb in `MxAccessCommandExecutor`. Three new findings: **Worker-026** (High),
**Worker-027** (Medium), **Worker-028** (Low). Worker-001..025 remain closed.
|
||||
|
||||
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | No issues found. Subtag synthesis (`SubtagAlarmStateMachine` raise/ack/clear, `AckedDuringEpisode` latch, segment-boundary name derivation), exact-match ack resolution (`ResolveTargetByName` avoids the prefix false-positive), and `MapTransition`'s `Unspecified→*Alm` raise path are all sound. |
| 2 | mxaccessgw conventions | No issues found. The synthesis is worker-side and every degraded record/event carries `degraded=true` + `source_provider=SUBTAG`, satisfying the explicit opt-in non-parity exception to the "never synthesize events" rule. The gateway never instantiates COM. net48 constraint respected — `AlarmProviderModeChange`/`FailoverSettings` are plain classes with get-only ctor-assigned props (no init/positional records); no `WriteRecord`-style init usage introduced. |
| 3 | Concurrency & thread safety | Issue found: Worker-026 (an exception in the failover switch path — `SwitchToStandby`'s priming snapshot or either switch's `ProviderModeChanged` handler — escapes the state machine after `active` has already flipped, killing the STA alarm-poll loop with no mode-changed event). STA affinity itself is sound: `LmxSubtagAlarmSource` owns its own apartment-bound `LMXProxyServerClass`, all consumer calls are STA-confined via `AlarmCommandHandler`'s affinity guard, and `Dispose` UnAdvises before tearing handles down so a late pump callback cannot re-enter. |
| 4 | Error handling & resilience | Issue found: Worker-027 (`SyntheticAlarmGuid` uses `MD5.Create()`, which throws on a net48 FIPS-policy host — breaking every subtag transition stamp and snapshot, and feeding Worker-026's poll-loop-kill path). `FailoverSettings` clamps tunables to safe minimums; `LmxSubtagAlarmSource` teardown is best-effort/idempotent. |
| 5 | Security | No issues found. No secret/credential logging on the alarm path; ack comments are operator-supplied alarm metadata, not secrets. Synthetic GUID is non-cryptographic by design and not a security control. |
| 6 | Performance & resource management | No issues found. `LmxSubtagAlarmSource` releases its COM object via `FinalReleaseComObject` and tracks advised-vs-added handles so `Dispose` only UnAdvises what it advised. The standby is armed once and gated-by-active rather than churning subscribe/unsubscribe per switch. |
| 7 | Design-document adherence | No issues found. Implementation matches `docs/plans/2026-06-13-alarm-subtag-fallback-design.md` (auto-failover/failback, ack-comment-write ack, worker-side synthesis, additive proto fields). The probe re-polls the still-subscribed primary (single-subscribe constraint) as the design's "Superseded" notes describe. |
| 8 | Code organization & conventions | Issue found: Worker-028 (the dup-subtag-address guard in `SubtagAlarmStateMachine.Bind` does not cover duplicate `AlarmFullReference` entries, which silently overwrite in `targetsByReference`/`_statesByReference`). One-public-type-per-file is otherwise respected for the new files. |
| 9 | Testing coverage | No standalone finding. New unit suites exist for each major component (`SubtagAlarmConsumerTests`, `SubtagAlarmStateMachineTests`, `FailoverAlarmConsumerTests`, `LmxSubtagAlarmSourceTests`, `SyntheticAlarmGuidTests`), matching the design's test matrix. The switch-path exception fragility (Worker-026) and the dup-reference case (Worker-028) are untested edge cases noted in those findings. |
| 10 | Documentation & comments | No issues found. The new types carry accurate XML docs; the net48-constraint rationale is documented inline on `FailoverSettings`/`AlarmProviderModeChange`; the "why PollOnce only, no re-Subscribe" and probe-throttle behaviour are documented on `FailoverAlarmConsumer.ProbeOnce`. |
|
||||
|
||||
## 2026-05-24 re-review (commit `42b0037`)
|
||||
|
||||
**Re-review: no new findings.** `git diff --name-only d692232..42b0037 -- src/ZB.MOM.WW.MxGateway.Worker` returns empty — the Worker module has zero source changes since the previous review. All ten checklist categories therefore inherit "No issues found" from the `d692232` pass. The header is bumped to track the latest reviewed commit; Worker-001..025 remain closed.
|
||||
@@ -464,3 +491,50 @@ _runtimeSession = _runtimeSessionFactory()
|
||||
Match the pattern `AlarmCommandHandler.Subscribe` already uses for `consumerFactory()` (`AlarmCommandHandler.cs:76-77`).
|
||||
|
||||
**Resolution:** 2026-05-20 — `WorkerPipeSession.RunAsync` now uses `_runtimeSession = _runtimeSessionFactory() ?? throw new InvalidOperationException("Worker runtime session factory returned null.");`, matching the pattern `AlarmCommandHandler.Subscribe` uses for its `consumerFactory()`. A null factory return now produces a clear diagnostic exception at the call site instead of NRE-ing on the next dereference (and the `finally` block's `_runtimeSession?.Dispose()` silently no-oping on a half-initialized session). Regression test `WorkerPipeSessionTests.RunAsync_WhenRuntimeSessionFactoryReturnsNull_ThrowsDiagnosticException` drives `RunAsync` with `() => null!` and asserts the diagnostic `InvalidOperationException` is thrown with the expected message.
|
||||
|
||||
### Worker-026
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | High |
| Category | Concurrency & thread safety |
| Location | `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/FailoverAlarmConsumer.cs:289-338`, `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessStaSession.cs:307-320` |
| Status | Resolved |
|
||||
|
||||
**Description:** `FailoverAlarmConsumer.SwitchToStandby` flips `active = Active.Standby` / `mode = Subtag` first, then calls `_ = standby.SnapshotActiveAlarms();` (the priming side-effect), and only then calls `RaiseModeChanged(...)`. If `standby.SnapshotActiveAlarms()` throws, the exception escapes `SwitchToStandby`, escapes the `catch` in `RunPrimary`, and escapes `FailoverAlarmConsumer.PollOnce`/`Subscribe`. The `SubtagAlarmConsumer.SnapshotActiveAlarms` path is not exception-free: it calls `StampSynthetic` → `SyntheticAlarmGuid.ForReference` (which throws on a FIPS host — see Worker-027) and walks live state. The same exposure exists for `RaiseModeChanged` itself: the attached `AlarmCommandHandler.OnProviderModeChanged` handler runs synchronously and calls `eventQueue.Enqueue(...)`, which throws `MxAccessEventQueueOverflowException` at capacity; that also propagates out of both `SwitchToStandby` and `SwitchToPrimary`.
|
||||
|
||||
When this happens the consumer has **already** transitioned `active`/`mode` to Standby (or Primary) but the `ProviderModeChanged` event is never emitted — so the gateway never learns the feed went degraded. Worse, because the failover calls run on the worker's STA inside `RunAlarmPollLoopAsync`, the escaping exception lands in that loop's trailing `catch (Exception)` arm (`MxAccessStaSession.cs:307-320`), which records a single fault and **permanently stops the alarm poll loop**. The standby is then never pumped or probed again — i.e. a transient primary COM fault that should have produced a clean degraded-mode handoff instead produces a total, undetected alarm outage for the session, defeating the entire purpose of the fallback feature. There is no safe operator workaround short of restarting the session.
|
||||
|
||||
**Recommendation:** Make the switch atomic and exception-isolated: raise `ProviderModeChanged` (and perform the priming snapshot) inside their own `try`/`catch` so a snapshot or handler failure cannot abort the switch or unwind into the poll loop. Order the state flip so the mode-changed notification is guaranteed to fire even if priming fails (e.g. flip state, raise mode-changed in a guarded block, then attempt the priming snapshot in a separate guarded block whose failure is logged/faulted but non-fatal). Add a regression test where the standby's `SnapshotActiveAlarms` throws on the first call after failover, asserting (a) `ProviderModeChanged` still fires and (b) `PollOnce` does not rethrow.
|
||||
|
||||
**Resolution:** 2026-06-15 — Reordered and exception-isolated the failover switch in `FailoverAlarmConsumer`. `SwitchToStandby` now flips `active`/`mode`, then raises `ProviderModeChanged` first (so the gateway always learns the feed went degraded), then primes the standby snapshot via a new `TryPrimeStandbySnapshot()` whose failure is swallowed (`catch (Exception ex) when (ex is not OutOfMemoryException)`) — a priming failure can no longer abort the switch or unwind into the poll loop. `RaiseModeChanged` itself now wraps `ProviderModeChanged?.Invoke` in a `try`/`catch` with the same `when (ex is not OutOfMemoryException)` filter, so a subscriber handler exception (e.g. `AlarmCommandHandler.OnProviderModeChanged`'s `eventQueue.Enqueue` overflowing) cannot escape `SwitchToStandby`/`SwitchToPrimary` into `RunAlarmPollLoopAsync`'s trailing catch and permanently stop alarm polling. `OutOfMemoryException` is deliberately allowed to propagate. The `MxAccessStaSession` poll-loop arm is unchanged — the fix prevents the escape rather than catching it there. Regression tests in `FailoverAlarmConsumerTests`: `Failover_WhenStandbyPrimingSnapshotThrows_StillRaisesModeChangeAndDoesNotRethrow` (standby `SnapshotActiveAlarms` throws on the priming call → `ProviderModeChanged` still fires, `Mode` is Subtag, `Subscribe`/`PollOnce` do not rethrow) and `Failover_WhenModeChangedHandlerThrows_SwitchStillTakesEffectAndDoesNotRethrow` (a throwing `ProviderModeChanged` subscriber → switch still takes effect, no rethrow).
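
The ordering described above (flip state, notify, then prime) can be sketched as below. `FailoverSwitchSketch`, the string-valued `Mode`, and the `Action`-based priming hook are hypothetical simplifications; the real consumer would also log or record a fault in the swallowed branches.

```csharp
using System;

// Hypothetical reduction of the exception-isolated switch; not the production type.
public sealed class FailoverSwitchSketch
{
    private readonly Action _primeStandbySnapshot;

    public event EventHandler<string> ProviderModeChanged;

    public string Mode { get; private set; } = "Alarmmgr";

    public FailoverSwitchSketch(Action primeStandbySnapshot)
    {
        _primeStandbySnapshot = primeStandbySnapshot;
    }

    public void SwitchToStandby()
    {
        Mode = "Subtag";           // 1. flip state
        RaiseModeChanged();        // 2. notify first, so the degraded mode is always reported
        TryPrimeStandbySnapshot(); // 3. prime; a failure here is non-fatal
    }

    private void RaiseModeChanged()
    {
        try
        {
            ProviderModeChanged?.Invoke(this, Mode);
        }
        catch (Exception ex) when (!(ex is OutOfMemoryException))
        {
            // A throwing subscriber must not unwind into the poll loop.
        }
    }

    private void TryPrimeStandbySnapshot()
    {
        try
        {
            _primeStandbySnapshot();
        }
        catch (Exception ex) when (!(ex is OutOfMemoryException))
        {
            // Priming is best-effort; the switch has already taken effect.
        }
    }
}
```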
|
||||
|
||||
### Worker-027
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SyntheticAlarmGuid.cs:38-40` |
| Status | Resolved |
|
||||
|
||||
**Description:** `SyntheticAlarmGuid.ForReference` derives the deterministic alarm GUID via `using MD5 md5 = MD5.Create();`. The worker targets .NET Framework 4.8, where `MD5.Create()` returns `MD5CryptoServiceProvider`. When the host has the Windows FIPS-compliance policy enabled (`Enabled=1` under `HKLM\System\CurrentControlSet\Control\Lsa\FIPSAlgorithmPolicy`), the non-validated `MD5CryptoServiceProvider` constructor throws `InvalidOperationException` ("This implementation is not part of the Windows Platform FIPS validated cryptographic algorithms."). `SyntheticAlarmGuid.ForReference` is on the hot path of the subtag fallback: `SubtagAlarmConsumer.StampSynthetic` calls it for **every** synthesized transition and **every** snapshot record. On a FIPS host the subtag fallback therefore throws on first use; combined with Worker-026 that exception kills the STA alarm-poll loop, so the fallback is not merely degraded but completely non-functional exactly when it is needed (after the primary alarmmgr provider has failed). The comment already notes MD5 is "never for security" — the issue is availability under FIPS policy, not cryptographic strength. The regulated deployment hosts (Zimmer) are a plausible FIPS environment.
|
||||
|
||||
**Recommendation:** Replace `MD5.Create()` with a FIPS-agnostic non-cryptographic 128-bit hash that does not route through the crypto FIPS gate — e.g. compute the 16 GUID bytes from a stable hash that does not use `System.Security.Cryptography` (a fixed FNV-1a / xxHash-style derivation over the UTF-8 bytes), or use `SHA256` truncated to 16 bytes via the managed `SHA256Managed`/`IncrementalHash` only if confirmed FIPS-safe on net48 (it is not guaranteed — prefer the non-crypto route). The mapping only needs determinism and collision resistance for distinct references, not cryptographic properties. Add a test that exercises `ForReference` without depending on a crypto provider.
|
||||
|
||||
**Resolution:** 2026-06-15 — Replaced the `MD5.Create()` derivation in `SyntheticAlarmGuid.ForReference` with a pure-managed FNV-1a hash: two independent 64-bit FNV-1a passes over the UTF-8 bytes (the high pass mixes the byte index into its accumulator to decorrelate the halves) fill the low/high 64 bits of the 128-bit GUID, and the input length is folded in so the empty string is non-degenerate (never `Guid.Empty`). The `using System.Security.Cryptography;` import is gone, so no FIPS-gated `MD5CryptoServiceProvider` is ever constructed — the subtag fallback no longer throws on a FIPS-policy host. The derivation stays deterministic and distinct-per-reference. The existing `SyntheticAlarmGuidTests` (`SameReference_SameGuid`, `DifferentReference_DifferentGuid`, `Reference_ProducesNonEmptyGuid`) pin only those properties — not a specific GUID literal — so they continue to pass unchanged; no test needed a value update. Added regression tests `SyntheticAlarmGuidTests.EmptyReference_ProducesNonEmptyGuid` (length-fold guard against a degenerate all-zero result) and `ForReference_UnderFipsEnforcement_DoesNotThrowAndStaysDeterministic` (sets the managed `UseLegacyFipsThrow` AppContext switch and asserts the derivation still succeeds deterministically; a regression reintroducing a FIPS-gated provider would throw here).
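
A rough sketch of a derivation in the spirit of the one described above. `SyntheticGuidSketch` is hypothetical: the standard 64-bit FNV-1a constants are used, but the exact index mixing and length fold in the real `SyntheticAlarmGuid.ForReference` are not shown in this review, so the mixing below is an assumption and will not reproduce the production GUID values.

```csharp
using System;
using System.Text;

// Hypothetical FNV-1a based GUID derivation; no System.Security.Cryptography involved.
public static class SyntheticGuidSketch
{
    private const ulong FnvOffsetBasis = 14695981039346656037UL; // standard 64-bit FNV offset basis
    private const ulong FnvPrime = 1099511628211UL;              // standard 64-bit FNV prime

    public static Guid ForReference(string reference)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(reference ?? string.Empty);
        ulong low = FnvOffsetBasis;
        ulong high = FnvOffsetBasis;

        unchecked
        {
            for (int i = 0; i < utf8.Length; i++)
            {
                // Low half: plain FNV-1a (xor the byte, then multiply by the prime).
                low = (low ^ utf8[i]) * FnvPrime;

                // High half: also mixes the byte index so the two halves decorrelate
                // (assumed mixing, chosen for illustration).
                high = (high ^ utf8[i] ^ ((ulong)i << 8)) * FnvPrime;
            }

            // Fold the input length into both halves (assumed form of the length fold).
            low = (low ^ (ulong)utf8.Length) * FnvPrime;
            high = (high ^ ~(ulong)utf8.Length) * FnvPrime;
        }

        byte[] bytes = new byte[16];
        BitConverter.GetBytes(low).CopyTo(bytes, 0);
        BitConverter.GetBytes(high).CopyTo(bytes, 8);
        return new Guid(bytes);
    }
}
```

As with the tests named above, only determinism, distinctness, and a non-empty result are worth asserting against a sketch like this, not a specific GUID literal.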
|
||||
|
||||
### Worker-028
|
||||
|
||||
| Field | Value |
|---|---|
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SubtagAlarmStateMachine.cs:43-52`, `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SubtagAlarmConsumer.cs:70-75` |
| Status | Resolved |
|
||||
|
||||
**Description:** `SubtagAlarmStateMachine.Bind` throws `ArgumentException` on a duplicate subtag **item address** (the documented dup-address guard), but neither the state machine nor `SubtagAlarmConsumer` guards against a duplicate `AlarmFullReference` in the watch list. When two `AlarmSubtagTarget` entries share an `AlarmFullReference` but use different subtag addresses, `_statesByReference[target.AlarmFullReference] = state` and `targetsByReference[reference] = target` each silently overwrite the earlier entry, while the earlier target's subtag addresses are still bound to an orphaned `AlarmState`. The orphaned state is mutated by incoming value changes but is invisible to `SnapshotActive` (which iterates only the surviving `_statesByReference.Values`) and to ack resolution (which uses the surviving `targetsByReference`). The result is silently inconsistent synthesized state for that reference. This is a watch-list configuration error (the gateway resolves the watch list), so impact is limited, but the asymmetry — addresses are guarded, references are not — is surprising and silent.
|
||||
|
||||
**Recommendation:** Add a duplicate-`AlarmFullReference` guard symmetric with the dup-address guard: throw a descriptive `ArgumentException` from the `SubtagAlarmStateMachine` (or `SubtagAlarmConsumer`) constructor when two watch-list entries share a reference, so a misconfigured watch list fails fast at subscribe time rather than producing silently inconsistent state. Cover it with a unit test.
|
||||
|
||||
**Resolution:** 2026-06-15 — Added a duplicate-`AlarmFullReference` guard in the `SubtagAlarmStateMachine` constructor symmetric with the existing dup-address guard in `Bind`: before adding each target's `_statesByReference` entry it checks `ContainsKey` (the dictionary is `OrdinalIgnoreCase`, matching the consumer's `targetsByReference` lookup) and throws a descriptive `ArgumentException` ("Duplicate alarm full reference '{reference}' is bound to more than one alarm target."). Because `SubtagAlarmConsumer` constructs the state machine before populating its own `targetsByReference`, this guard fires before the consumer's silent overwrite too, covering both dictionaries from one canonical check. Regression test `SubtagAlarmStateMachineTests.DuplicateAlarmFullReference_Throws` (two targets sharing a reference but using distinct active subtags → `ArgumentException`).
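
A minimal sketch of the fail-fast guard described above. `ReferenceGuardSketch` and its key/value-pair input are hypothetical stand-ins for the state machine constructor and its `AlarmSubtagTarget` list; the `OrdinalIgnoreCase` comparer and the exception message follow the resolution text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reduction of the duplicate-reference guard.
public static class ReferenceGuardSketch
{
    // Key = alarm full reference, Value = any per-target payload (here: the active subtag address).
    public static Dictionary<string, string> BindByReference(
        IEnumerable<KeyValuePair<string, string>> targets)
    {
        var byReference = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (KeyValuePair<string, string> target in targets)
        {
            if (byReference.ContainsKey(target.Key))
            {
                // Fail fast instead of silently overwriting the earlier entry.
                throw new ArgumentException(
                    $"Duplicate alarm full reference '{target.Key}' is bound to more than one alarm target.");
            }

            byReference[target.Key] = target.Value;
        }

        return byReference;
    }
}
```

Because the comparer ignores case, two references that differ only in casing are treated as the same alarm and rejected.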
|
||||
|
||||
@@ -790,3 +790,159 @@ Post-ack transition: kind=Clear …
|
||||
|
||||
10s cadence held throughout; full proto fields populated correctly;
ack registered server-side without errors.
|
||||
|
||||
## Subtag-monitoring fallback provider
|
||||
|
||||
When the wnwrap alarm-manager source fails, the gateway worker switches to
`SubtagAlarmConsumer` — a synthetic alarm source that advises each alarm
attribute's subtags via the existing MXAccess `AddItem`/`Advise` pipeline and
derives alarm transitions from the resulting value-change stream. This is a
non-parity, degraded-mode source; every transition and snapshot it produces
carries `degraded = true`.
|
||||
|
||||
### Watch-list discovery
|
||||
|
||||
`GatewayAlarmMonitor` resolves the subtag watch-list at subscribe time by
|
||||
calling `IAlarmWatchListResolver.GetAlarmAttributesAsync`. The resolver merges:
|
||||
|
||||
1. Galaxy Repository SQL (`GetAlarmAttributesAsync`) — objects that have alarm
|
||||
extensions in the configured area.
|
||||
2. Config overrides — `IncludeAttributes` adds explicit entries;
|
||||
`ExcludeAttributes` removes Repository-derived ones. The config list takes
|
||||
effect even when `UseGalaxyRepository` is `false`.
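A minimal sketch of the merge, assuming the ordering given in the configuration reference (excludes are applied after Repository rows and explicit includes are combined, case-insensitively). The function name and the de-duplication step are assumptions for illustration, not the resolver's actual code.

```python
def resolve_watch_list(repository_attrs, include_attrs, exclude_attrs):
    """Merge Repository-derived and explicitly included attribute paths,
    then drop anything matching an exclude (case-insensitive)."""
    excluded = {attr.upper() for attr in exclude_attrs}
    seen = set()
    merged = []
    for attr in list(repository_attrs) + list(include_attrs):
        key = attr.upper()
        if key in excluded or key in seen:  # de-duplication is an assumption
            continue
        seen.add(key)
        merged.append(attr)
    return merged
```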
|
||||
|
||||
The resolved list is a set of `AlarmSubtagTarget` messages sent to the worker
|
||||
inside `SubscribeAlarmsCommand.watch_list`. Each target carries the composed
|
||||
MXAccess item addresses for the `InAlarm`, `Acked`, `AckMsg`, and `Priority`
|
||||
subtags (confirmed AVEVA `AlarmExtension` field names, verified against the live
|
||||
ZB Galaxy `attribute_definition` rows). The gateway re-runs discovery on its
|
||||
reconcile cadence and pushes an updated watch-list when the model changes.
|
||||
|
||||
Each target's canonical `AlarmFullReference` is composed as
|
||||
`Galaxy!{area}.{reference}` (literal `Galaxy` provider). The `{area}` is the
|
||||
alarm object's **real Galaxy area** — discovered per object via
|
||||
`gobject.area_gobject_id` (`GetAlarmAttributesAsync` projects it as `area_name`)
|
||||
— so the synthesized reference's group matches exactly the area the native
|
||||
alarmmgr (wnwrap) emits for the same alarm (e.g. `TestMachine_001` in `TestArea`
|
||||
yields `Galaxy!TestArea.TestMachine_001.TestAlarm001`). The configured
|
||||
`Discovery.Area` / `DefaultArea` is **only** the fallback for explicit
|
||||
`IncludeAttributes` entries, which carry no discovered area.
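The composition rule can be sketched as follows. The function and parameter names are hypothetical; only the `Galaxy!{area}.{reference}` shape and the fallback rule come from the text above.

```python
def compose_alarm_full_reference(reference, discovered_area=None, fallback_area=""):
    """Compose `Galaxy!{area}.{reference}`.

    `discovered_area` is the object's real Galaxy area when discovery supplied
    one; `fallback_area` stands in for the configured Discovery.Area /
    DefaultArea, used only when no area was discovered.
    """
    area = discovered_area if discovered_area else fallback_area
    return f"Galaxy!{area}.{reference}"
```

For the example in the text, `compose_alarm_full_reference("TestMachine_001.TestAlarm001", "TestArea")` gives `Galaxy!TestArea.TestMachine_001.TestAlarm001`.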
|
||||
|
||||
### Subtag advise and `LmxSubtagAlarmSource`
|
||||
|
||||
`LmxSubtagAlarmSource` (implements `ISubtagAlarmSource`) owns a separate
|
||||
`LMXProxyServerClass` instance on the worker STA — it does not share the
|
||||
session's main MXAccess object. For each watch-list target it calls
|
||||
`AddItem`/`Advise` on the configured subtag addresses. When a subtag value
|
||||
changes, it raises `ValueChanged` on the STA and `SubtagAlarmConsumer`
|
||||
forwards it to `SubtagAlarmStateMachine`.
|
||||
|
||||
`PollOnce()` on the subtag consumer is a no-op — the path is event-driven
|
||||
through `Advise`, not poll-driven.
|
||||
|
||||
### Synthesis rules
|
||||
|
||||
`SubtagAlarmStateMachine` tracks `(active, acked)` per watch-list entry and
|
||||
emits `MxAlarmTransitionEvent` records on change:
|
||||
|
||||
| Subtag change | Emitted transition | Notes |
|---|---|---|
| `InAlarm` false → true | Raise (`UNACK_ALM`) | `original_raise_timestamp` = first observed active time for this episode |
| `Acked` false → true, while `InAlarm` | Acknowledge (`ACK_ALM`) | `AckedDuringEpisode` latch set |
| `InAlarm` true → false | Clear | `AckRtn` if `AckedDuringEpisode` is set, else `UnackRtn` |
| `Acked` true → false, while `InAlarm` | (none) | Latch is NOT cleared; the episode retains its acknowledged status at clear |
|
||||
|
||||
The `AckedDuringEpisode` latch addresses out-of-order subtag delivery:
MXAccess does not guarantee the relative order of the `Acked = false` and
`InAlarm = false` updates, so `Acked` may already read false when the clear
is observed. The latch ensures a clear always emits `ACK_RTN`
when the alarm was acknowledged at any point during the active episode.
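The synthesis rules and the latch can be sketched as a small state machine. This is an illustrative model, not the worker's `SubtagAlarmStateMachine`: the names are hypothetical, transitions are returned as plain strings, and resetting the latch at the start of each episode is an assumption.

```python
from dataclasses import dataclass


@dataclass
class AlarmState:
    active: bool = False
    acked: bool = False
    acked_during_episode: bool = False  # the latch


def apply_change(state, subtag, value):
    """Apply one subtag value change; return the transition name or None."""
    if subtag == "InAlarm":
        if value and not state.active:
            state.active = True
            state.acked_during_episode = False  # assumption: new episode, fresh latch
            return "Raise"
        if not value and state.active:
            state.active = False
            kind = "AckRtn" if state.acked_during_episode else "UnackRtn"
            state.acked_during_episode = False
            return kind
        return None
    if subtag == "Acked":
        previous = state.acked
        state.acked = value
        if value and not previous and state.active:
            state.acked_during_episode = True
            return "Acknowledge"
        return None  # Acked true -> false leaves the latch set
    return None  # unmapped subtags carry no transition
```

Because the latch survives an `Acked` reset, the clear is classified the same way whichever of the two updates is delivered first.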
|
||||
|
||||
`SnapshotActive()` returns one `MxAlarmSnapshotRecord` per currently-active
|
||||
alarm. State mapping:
|
||||
|
||||
- `InAlarm && !Acked` → `UNACK_ALM`
|
||||
- `InAlarm && Acked` → `ACK_ALM`
|
||||
- `!InAlarm` → not included in the snapshot
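The state mapping above, as a short sketch. `snapshot_active` here is a hypothetical helper over a plain dictionary, not the real `SnapshotActive()`.

```python
def snapshot_active(states):
    """Map reference -> (in_alarm, acked) pairs to snapshot states.

    Inactive alarms are omitted, matching the `!InAlarm` rule.
    """
    snapshot = {}
    for reference, (in_alarm, acked) in states.items():
        if not in_alarm:
            continue
        snapshot[reference] = "ACK_ALM" if acked else "UNACK_ALM"
    return snapshot
```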
|
||||
|
||||
### Synthetic GUID
|
||||
|
||||
The alarmmgr provider supplies a native GUID per alarm record. The subtag
|
||||
provider has no native GUID. `SubtagAlarmConsumer` derives a deterministic
|
||||
GUID by hashing `alarm_full_reference` (via `SyntheticAlarmGuid.ForReference`).
|
||||
The same reference always produces the same GUID within a session, so
GUID-based ack routing resolves correctly. The GUID is purely synthetic: it
does not correspond to any AVEVA-internal GUID, either across alarm
references or across gateway restarts.
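One way to derive a deterministic GUID from a reference is to hash it and keep 16 bytes. The sketch below assumes SHA-256 and simple truncation; the actual hash, byte layout, and any case normalization used by `SyntheticAlarmGuid.ForReference` are not stated here, so treat every detail as an assumption.

```python
import hashlib
import uuid


def synthetic_alarm_guid(alarm_full_reference):
    """Derive a deterministic GUID from the alarm reference (illustrative only)."""
    digest = hashlib.sha256(alarm_full_reference.encode("utf-8")).digest()
    return uuid.UUID(bytes=digest[:16])  # first 16 of 32 digest bytes
```

Keeping a dictionary from GUID back to reference, as the ack path does, avoids any need to invert the hash.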
|
||||
|
||||
### Acknowledge in subtag mode
|
||||
|
||||
`AlarmDispatcher` routes ack calls by active provider mode:
|
||||
|
||||
- **Alarm-manager mode:** `AlarmAckByName` on `wwAlarmConsumerClass` (unchanged).
|
||||
- **Subtag mode:** `SubtagAlarmConsumer.AcknowledgeByName` resolves the
|
||||
watch-list entry's `ack_comment_subtag` and issues a `Write(comment)` on
|
||||
the STA via `LmxSubtagAlarmSource`. Writing the `AckMsg` subtag performs
|
||||
the acknowledge in AVEVA (`AckMsg` is the confirmed `AlarmExtension` ack-comment
|
||||
write target).
|
||||
|
||||
If the alarm has no writable ack-comment subtag (`AckComment` config key is
|
||||
empty, or the entry's `ack_comment_subtag` field is empty), the ack call
|
||||
returns a failure code that the gateway surfaces as `FailedPrecondition`.
|
||||
`AcknowledgeByGuid` maps the synthetic GUID back to its reference via an
|
||||
internal dictionary, then calls the same write path.
|
||||
|
||||
`SubtagAlarmConsumer.Subscribe` advises the ack-comment subtag alongside the
|
||||
observed ones (active/acked/priority). This is required: MXAccess rejects a
|
||||
write to an item that has been added but not advised with `E_INVALIDARG`
|
||||
("Value does not fall within the expected range"). Advising it at subscribe
|
||||
time makes it an active item so the later ack write succeeds — its value
|
||||
changes carry no transition (the state machine ignores unmapped addresses).
|
||||
|
||||
### Live validation
|
||||
|
||||
The subtag path was validated against live MXAccess on the dev rig
|
||||
(`DESKTOP-6JL3KKO`, Galaxy `DEV`, `TestMachine_001.TestAlarm001`):
|
||||
|
||||
- `….InAlarm` → `True` (Boolean), `….Acked` → `False` (Boolean),
|
||||
`….Priority` → `500` (Int32), `….AckMsg` → string — confirming the field
|
||||
names **and** the runtime reference shape `<Object>.<AlarmAttr>.<field>`
|
||||
with **no** intermediate alarm-condition segment.
|
||||
- `AcknowledgeByName` (AckMsg write) returned `0` once the ack-comment subtag
|
||||
was advised — confirming the ack-by-comment-write mechanism end to end.
|
||||
|
||||
### Fidelity limitations
|
||||
|
||||
The following fields are not available or have lower quality in subtag mode:
|
||||
|
||||
| Field | Subtag-mode behavior |
|-------|---------------------|
| `alarm_guid` | Synthetic deterministic GUID from `alarm_full_reference`; not an AVEVA-native GUID |
| `original_raise_timestamp` | First observed `active = true` time; no AVEVA-native raise time |
| `transition_timestamp` | `OnDataChange` source timestamp from MXAccess |
| `severity` | From priority subtag if advised; 0 otherwise |
| `category` / `description` | Not populated (no subtag for these) |
| `current_value` / `limit_value` | Not populated unless corresponding subtags are in the watch-list |
| `alarm_type_name` | Not populated |
| `operator_user` / `operator_comment` | Not populated on synthesized raise/clear transitions |
| `retrigger` transition | Not synthesized (no re-alarm counter subtag is observed) |
|
||||
|
||||
Every transition and snapshot record carries `degraded = true` and
|
||||
`source_provider = ALARM_PROVIDER_MODE_SUBTAG`. Clients that require full
|
||||
fidelity must wait for failback to the alarm manager.
|
||||
|
||||
### Provider mode reflection
|
||||
|
||||
When `FailoverAlarmConsumer` switches between providers, it raises
|
||||
`ProviderModeChanged`. `AlarmDispatcher` enqueues an
|
||||
`OnAlarmProviderModeChangedEvent` (carried as an `MxEvent`), which the
|
||||
gateway receives and reflects into:
|
||||
|
||||
- `AlarmFeedMessage.provider_status` emitted to every `StreamAlarms`
|
||||
subscriber.
|
||||
- The `/hubs/alarms` SignalR hub for the dashboard.
|
||||
- Metrics: `mxgateway.alarms.provider_mode` gauge and
|
||||
`mxgateway.alarms.provider_switches` counter.
|
||||
|
||||
On every switch `GatewayAlarmMonitor` also forces a reconcile
|
||||
(`QueryActiveAlarms`) against the now-active provider so the gateway cache
|
||||
reflects the post-switch state without a spurious raise/clear storm.
|
||||
|
||||
@@ -51,6 +51,19 @@ The shared inputs are:
|
||||
The commands in the matrix use `MXGATEWAY_API_KEY` through each CLI's
|
||||
`api-key-env` flag. They must not embed bearer tokens or raw API keys.
|
||||
|
||||
### TLS variant
|
||||
|
||||
The matrix runs over plaintext (`h2c`) by default. A TLS variant exists but stays
|
||||
a manual/opt-in run, consistent with the gate above, because it needs the gateway
|
||||
started with an HTTPS endpoint (an `https://` `MXGATEWAY_ENDPOINT`) and each CLI
|
||||
switched to its TLS flag (`--tls` / `-tls` / `--plaintext=false` /
|
||||
`plaintext=False`). The clients are lenient by default and accept the gateway's
|
||||
auto-generated self-signed certificate without extra trust setup, except the Rust
|
||||
CLI, which is pin-only and needs `--ca-file` or `--require-certificate-validation`
|
||||
(and Python uses trust-on-first-use). See
|
||||
[Gateway Configuration — Automatic self-signed certificate](./GatewayConfiguration.md#automatic-self-signed-certificate)
|
||||
and each client README for the per-client TLS flags.
|
||||
|
||||
## JSON Comparison
|
||||
|
||||
Every command in the matrix requests JSON output. A runner can compare the
|
||||
|
||||
@@ -362,6 +362,107 @@ Dashboard access should require API-key-backed dashboard authentication with
|
||||
is enabled by default through `Dashboard:AllowAnonymousLocalhost`; the bypass is
|
||||
limited to loopback requests.
|
||||
|
||||
## Lazy Browse Is Wire-Only
|
||||
|
||||
Decision: the gateway continues to pull the full Galaxy hierarchy on each
|
||||
deploy. `BrowseChildren` and the lazy dashboard render only avoid sending and
|
||||
DOM-materializing the full tree — they do not push laziness into SQL or cache
|
||||
loading.
|
||||
|
||||
Rationale: snapshot persistence and the dashboard summary both depend on a
|
||||
fully-materialized cache. Lazy SQL would increase per-click latency on a
|
||||
deployment-heavy box, multiply per-session SQL connections, and complicate the
|
||||
cold-start path. Wire-side laziness solves the actual pain (oversized gRPC
|
||||
replies and a heavy DOM) without disturbing the materialization model.
|
||||
|
||||
## TLS Auto-Certificate and Lenient Client Trust
|
||||
|
||||
Decision: when a Kestrel `https://` endpoint is configured without a certificate
|
||||
of its own (and no `Kestrel:Certificates:Default` is set), the gateway generates
|
||||
and persists a self-signed certificate rather than failing to start. Clients
|
||||
connecting over TLS without a pinned CA accept whatever certificate the server
|
||||
presents by default; pinning a CA restores full verification.
|
||||
|
||||
Rationale: `mxaccessgw` is an internal tool with no PKI to issue or distribute
|
||||
certificates. The prior behavior — an `https` endpoint with no certificate
|
||||
fails at startup with Kestrel's opaque "no server certificate was specified"
|
||||
error — pushed operators toward plaintext (`h2c`), exposing the API key and
|
||||
request payloads on the wire. Auto-generating a long-lived, persisted, reused
|
||||
certificate lets TLS "just work" with zero certificate management, while the
|
||||
lenient client default means clients connect to that self-signed certificate
|
||||
without a manual trust step. Both choices are deliberate, not oversights:
|
||||
strict-by-default would force PKI work this tool does not warrant. Plaintext-only
|
||||
deployments are untouched — no certificate or key material is written for them —
|
||||
and an operator who supplies a real certificate transparently overrides the
|
||||
generated one.
|
||||
|
||||
Two clients diverge from "accept any certificate" because their gRPC stacks lack
|
||||
a per-channel skip-verify hook:
|
||||
|
||||
- Python uses trust-on-first-use: it fetches the server's presented certificate
|
||||
over a separate unverified probe and pins it for the channel, and defaults the
|
||||
SNI/target-name override to `localhost` (the generated certificate always
|
||||
carries a `localhost` SAN).
|
||||
- Rust is pin-only: tonic exposes no public hook to inject a custom certificate
|
||||
verifier, so TLS over Rust requires either a pinned CA or an explicit opt-in to
|
||||
system-trust verification; otherwise connecting returns a clear, actionable
|
||||
error.
|
||||
|
||||
See [Gateway Configuration — Automatic self-signed certificate](./GatewayConfiguration.md#automatic-self-signed-certificate)
|
||||
and the per-client READMEs for the as-built behavior.
|
||||
|
||||
## Alarm-Manager to Subtag Fallback
|
||||
|
||||
Decision: add a second alarm provider (subtag monitoring) that the worker
|
||||
activates automatically when the native wnwrap alarm manager fails, and fails
|
||||
back to automatically when the manager recovers.
|
||||
|
||||
### Worker-side synthesis
|
||||
|
||||
Synthesis of alarm transitions from subtag value changes happens entirely in
|
||||
the worker (`SubtagAlarmConsumer` / `SubtagAlarmStateMachine`). The gateway
|
||||
still forwards only events the worker emits and synthesizes nothing itself.
|
||||
This satisfies the parity rule even though the subtag path is inherently
|
||||
non-parity: the parity rule governs where synthesis lives, not whether
|
||||
synthesis is permitted when the native source is unavailable.
|
||||
|
||||
### Degraded is explicit
|
||||
|
||||
Every subtag-mode transition carries `degraded = true` on the
|
||||
`OnAlarmTransitionEvent` and `ActiveAlarmSnapshot` proto messages, and the
|
||||
`AlarmFeedMessage` feed carries an `AlarmProviderStatus` payload on stream
|
||||
open and on every switch. No client can mistake a subtag-mode alarm for an
|
||||
authoritative alarmmgr record. Subtag mode has lower fidelity: synthetic
|
||||
deterministic GUID (SHA-derived from the alarm reference), best-effort
|
||||
original-raise timestamp, narrower field set. Clients that need full fidelity
|
||||
must wait for failback.
|
||||
|
||||
### Failover trigger
|
||||
|
||||
The failover trigger is N consecutive wnwrap COM failures — a `COMException`
|
||||
thrown by `Subscribe` or `PollOnce`, or a failure HRESULT from
|
||||
`GetXmlCurrentAlarms2`. A single poll failure does not trigger a switch; the
|
||||
threshold (default 3, floored at 1) guards against transient COM hiccups. The
|
||||
counter resets on any clean poll so a flapping provider does not permanently
|
||||
latch in subtag mode.
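The trigger logic amounts to a consecutive-failure counter. A minimal sketch, with a hypothetical class name; only the threshold default, the floor of 1, and the reset-on-clean-poll behavior come from the text.

```python
class FailoverCounter:
    """Counts consecutive poll failures and signals when to fail over."""

    def __init__(self, threshold=3):
        self.threshold = max(1, threshold)  # floored at 1
        self.consecutive_failures = 0

    def record_poll(self, succeeded):
        """Return True when the failure threshold has been reached."""
        if succeeded:
            self.consecutive_failures = 0  # any clean poll resets the count
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```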
|
||||
|
||||
### Acknowledge via ack-comment write
|
||||
|
||||
In subtag mode, `AcknowledgeAlarm` writes the operator comment to the alarm
|
||||
attribute's ack-comment subtag (`Fallback:Subtags:AckComment`). The write
|
||||
performs the native ack in AVEVA. This differs from alarmmgr mode, where
|
||||
`AlarmAckByName` on `wwAlarmConsumerClass` is called directly. The `AckComment`
|
||||
subtag name is empty by default; configuring it is required for ack to work in
|
||||
subtag mode. The exact AVEVA subtag names are not hard-coded — the `Subtags`
|
||||
config block exists precisely so names are not guessed without validation
|
||||
against the live MXAccess attribute set.
|
||||
|
||||
### Related documentation
|
||||
|
||||
- [Gateway Configuration — Alarm Fallback options](./GatewayConfiguration.md#alarm-fallback-options)
|
||||
- [Alarm Client Discovery — Subtag provider](./AlarmClientDiscovery.md)
|
||||
- [gRPC Contract — provider_status and degraded fields](./Grpc.md)
|
||||
|
||||
## Later Revisit Items
|
||||
|
||||
These are explicit post-v1 revisit items, not open blockers:
|
||||
|
||||
@@ -36,6 +36,7 @@ The service is defined in
|
||||
| `GetLastDeployTime` | Returns the cached `galaxy.time_of_last_deploy`. Served from the shared hierarchy cache; refreshed in the background. |
|
||||
| `DiscoverHierarchy` | Returns one page of the deployed hierarchy plus each returned object's attributes (configured and built-in — see [Built-in vs configured attributes](#built-in-vs-configured-attributes)). **Served from cache** — see [Hierarchy Cache](#hierarchy-cache). |
|
||||
| `WatchDeployEvents` | **Server-streaming.** The server emits the current state immediately on subscribe (so clients can bootstrap without waiting), then emits one event per detected deploy change. See [Deploy Notifications](#deploy-notifications). |
|
||||
| `BrowseChildren` | Returns the direct children of one parent object (or root objects when `parent` is unset). Filters mirror `DiscoverHierarchy`. Includes a per-child `has_children` hint so UIs can draw expand triangles without an extra round trip. **Served from cache.** |
|
||||
|
||||
`DiscoverHierarchy` is a paged unary RPC. The raw request accepts `page_size`
|
||||
and `page_token`; the server defaults omitted page size to 1000 objects and
|
||||
@@ -52,6 +53,57 @@ alarm-only, historized-only, and `include_attributes = false` for a skeleton
|
||||
tree. All filters are applied with AND semantics, and `total_object_count`
|
||||
reports the post-filter count.
|
||||
|
||||
### BrowseChildren
|
||||
|
||||
`BrowseChildren` is an OPC UA-style lazy expand: clients that walk one level at
|
||||
a time — UI trees, OPC UA address-space bridges — call it instead of paging the
|
||||
full hierarchy with `DiscoverHierarchy`.
|
||||
|
||||
**Parent selection.** The `parent` oneof accepts `parent_gobject_id`,
|
||||
`parent_tag_name`, or `parent_contained_path`. An empty oneof returns root
|
||||
objects — those whose `parent_gobject_id` is 0.
|
||||
|
||||
**Filters.** Category ids, template-chain substring, tag-name glob, alarm-only,
|
||||
historized-only, and `include_attributes` all behave identically to
|
||||
`DiscoverHierarchy` and are AND-combined. One important difference applies to
|
||||
`alarm_bearing_only` and `historized_only`: an ancestor that does not itself
|
||||
carry a matching attribute is still returned when one of its descendants does.
|
||||
This is intentional — without it a UI tree cannot navigate to the matching
|
||||
leaves. `DiscoverHierarchy`'s flat-list semantics filter out such intermediate
|
||||
ancestors; `BrowseChildren` retains them so the path to each match remains
|
||||
traversable.
|
||||
|
||||
**`child_has_children` hint.** The reply carries a boolean parallel to
|
||||
`children`, set true when the child has at least one matching descendant under
|
||||
the same filter set. UIs can use this to decide whether to draw an expand
|
||||
triangle without issuing a follow-up `BrowseChildren` call. Because the hint is
|
||||
computed against the *filtered* descendant set, a branch that contains no
|
||||
matching objects gets `false`, not `true`.
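The hint's semantics can be sketched as a descendant search under the active filter. The adjacency map and the `matches` predicate are hypothetical stand-ins for the cached hierarchy and the request's filter set.

```python
def has_matching_descendant(node_id, children_of, matches):
    """True when any descendant of `node_id` satisfies the filter predicate."""
    stack = list(children_of.get(node_id, []))
    while stack:
        current = stack.pop()
        if matches(current):
            return True
        stack.extend(children_of.get(current, []))
    return False
```

A branch whose descendants all fail the predicate yields `False`, which is why the hint can differ from an unfiltered "has any children" check.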
|
||||
|
||||
**Paging.** Default page size is 500; the server caps any requested size at
|
||||
5000. Page tokens encode `(cache_sequence, parent_id, filter_signature,
|
||||
offset)`. A token from a different cache generation or a different filter set
|
||||
returns `InvalidArgument`. The error messages reference "DiscoverHierarchy
|
||||
page_token" because `BrowseChildren` reuses the same encoding and validation
|
||||
path — if you see that wording in a `BrowseChildren` context it is expected.
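A sketch of a token carrying those four fields and the validation it enables. The base64-over-JSON encoding is purely an assumption for illustration; the gateway's real wire format is not described here.

```python
import base64
import json


def encode_page_token(cache_sequence, parent_id, filter_signature, offset):
    payload = json.dumps([cache_sequence, parent_id, filter_signature, offset])
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_page_token(token, cache_sequence, parent_id, filter_signature):
    """Return the offset, or raise ValueError (standing in for InvalidArgument)."""
    try:
        seq, parent, signature, offset = json.loads(
            base64.urlsafe_b64decode(token.encode("ascii"))
        )
    except (ValueError, TypeError):
        raise ValueError("malformed page_token")
    if seq != cache_sequence:
        raise ValueError("page_token is from a different cache generation")
    if parent != parent_id or signature != filter_signature:
        raise ValueError("page_token does not match this parent or filter set")
    return offset
```

Binding the token to the cache sequence means a background refresh invalidates in-flight paging instead of silently skipping or repeating rows.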
|
||||
|
||||
**Errors.**
|
||||
|
||||
| Condition | Status code |
|-----------|-------------|
| Unknown parent | `NotFound` |
| First load not yet complete after 5 s | `Unavailable` |
| Stale or filter-mismatched page token | `InvalidArgument` |
| Missing `metadata:read` scope | `PermissionDenied` |
| No API key | `Unauthenticated` |
|
||||
|
||||
**Authorization.** Same `metadata:read` scope as the other Galaxy RPCs.
|
||||
`browse_subtrees` API-key constraints intersect with the result set.
|
||||
|
||||
**Sort order.** Areas first, then `OrdinalIgnoreCase` by display name
|
||||
(`browse_name` → `contained_name` → `tag_name`). Matches the dashboard tree so
|
||||
server and dashboard views are consistent.
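The ordering can be expressed as a two-part sort key. In this sketch each child is a plain dictionary and `is_area` is a hypothetical flag; upper-casing approximates `OrdinalIgnoreCase`.

```python
def display_name(child):
    """First non-empty of browse_name, contained_name, tag_name."""
    return child.get("browse_name") or child.get("contained_name") or child.get("tag_name") or ""


def sort_children(children):
    """Areas first, then case-insensitive by display name."""
    return sorted(
        children,
        key=lambda c: (not c.get("is_area", False), display_name(c).upper()),
    )
```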
|
||||
|
||||
## Hierarchy Cache
|
||||
|
||||
The gateway holds a single shared `IGalaxyHierarchyCache`
|
||||
@@ -271,9 +323,13 @@ fields cannot express null. Use it to distinguish "no dimension reported" from
|
||||
```text
|
||||
gRPC client(s)
|
||||
-> GalaxyRepositoryGrpcService (src/ZB.MOM.WW.MxGateway.Server/Grpc/)
|
||||
-DiscoverHierarchy, GetLastDeployTime -> IGalaxyHierarchyCache.Current
-WatchDeployEvents -> IGalaxyDeployNotifier
-TestConnection -> GalaxyRepository (direct SQL)
+DiscoverHierarchy, GetLastDeployTime, BrowseChildren -> IGalaxyHierarchyCache.Current
+WatchDeployEvents -> IGalaxyDeployNotifier
+TestConnection -> GalaxyRepository (direct SQL)
|
||||
|
||||
Dashboard (Blazor)
|
||||
-> IDashboardBrowseService (DashboardBrowseService)
|
||||
-> GalaxyBrowseProjector over IGalaxyHierarchyCache.Current
|
||||
|
||||
GalaxyHierarchyRefreshService (BackgroundService)
|
||||
-> IGalaxyHierarchyCache.RefreshAsync
|
||||
@@ -309,9 +365,17 @@ Component breakdown:
|
||||
(`src/ZB.MOM.WW.MxGateway.Server/Grpc/GalaxyProtoMapper.cs`) converts row models to
|
||||
proto messages. Used by the cache during refresh to materialize the reply
|
||||
once.
|
||||
- `GalaxyBrowseProjector`
|
||||
(`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyBrowseProjector.cs`) projects one level
|
||||
of children out of an immutable cache entry. Memoizes the filtered child list
|
||||
per cache-entry instance so repeated paging is an O(pageSize) slice rather than an
|
||||
O(siblings) filter scan. The memo is keyed on the cache entry reference, so a new
|
||||
entry from the background refresh makes the stale memo unreachable and it is
|
||||
collected with it. `DashboardBrowseService` wraps this projector to drive the
|
||||
dashboard's lazy-expand tree.
|
||||
- `GalaxyRepositoryGrpcService`
|
||||
(`src/ZB.MOM.WW.MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs`) implements
|
||||
-the four RPCs.
+the five RPCs.
|
||||
|
||||
## Configuration
|
||||
|
||||
|
||||
@@ -148,6 +148,7 @@ the affected stream while the MXAccess session remains active.
|
||||
| `MxGateway:Dashboard:Enabled` | `true` | Enables Blazor Server dashboard route mapping. The dashboard mounts at the host root (`/`); there is no separate path-base prefix. |
|
||||
| `MxGateway:Dashboard:AllowAnonymousLocalhost` | `true` | Allows loopback dashboard requests to bypass the dashboard cookie requirement for local development. Remote requests still require dashboard authentication. |
|
||||
| `MxGateway:Dashboard:RequireHttpsCookie` | `true` | Sets the dashboard auth cookie's secure policy. `true` keeps `CookieSecurePolicy.Always` — the cookie is only sent over HTTPS, which matches a production HTTPS deployment. Set to `false` for plain-HTTP dev deployments to use `CookieSecurePolicy.SameAsRequest`; the cookie is still flagged Secure on HTTPS requests, but it can round-trip over HTTP. Browsers drop Secure cookies set over HTTP from non-localhost hosts, so leaving this `true` while serving the dashboard over plain HTTP will break login from any remote browser. |
|
||||
| `MxGateway:Dashboard:CookieName` | `MxGatewayDashboard` | Dashboard auth cookie name. Leave unset (null/blank) to use the default. Override it to give a distinct name to a gateway that shares a hostname with another gateway instance: browser cookies are scoped by host+path but **not** by port, so two instances on the same host would otherwise clobber each other's dashboard session under a shared cookie name. Changing it signs out existing dashboard sessions on next deploy. |
|
||||
| `MxGateway:Dashboard:SnapshotIntervalMilliseconds` | `1000` | Dashboard snapshot refresh interval used by the snapshot SignalR hub and the pages that subscribe to it. |
|
||||
| `MxGateway:Dashboard:RecentFaultLimit` | `100` | Maximum number of fault summaries projected into each dashboard snapshot. |
|
||||
| `MxGateway:Dashboard:RecentSessionLimit` | `200` | Maximum number of session summaries projected into each dashboard snapshot. |
|
||||
@@ -229,6 +230,254 @@ behavior.
|
||||
The alarm monitor is independent of client sessions: `AcknowledgeAlarm` and
|
||||
`StreamAlarms` are session-less RPCs served by the monitor.
|
||||
|
||||
### Alarm fallback options
|
||||
|
||||
The `Fallback` sub-section controls how the alarm feed selects between the
|
||||
native wnwrap alarm-manager provider and the subtag-monitoring fallback.
|
||||
|
||||
| Option | Default | Description |
|
||||
|--------|---------|-------------|
|
||||
| `MxGateway:Alarms:Fallback:Mode` | `Auto` | Provider selection mode. `Auto` uses the alarm manager as primary and fails over to subtag monitoring after consecutive COM failures, then fails back automatically. `ForceAlarmManager` disables failover. `ForceSubtag` forces subtag monitoring on from startup. Values are case-insensitive. |
|
||||
| `MxGateway:Alarms:Fallback:ConsecutiveFailureThreshold` | `3` | Number of consecutive wnwrap COM failures (`COMException` or failure HRESULT from `Subscribe` / `GetXmlCurrentAlarms2`) before the monitor switches to subtag mode. Floored at 1. |
|
||||
| `MxGateway:Alarms:Fallback:FailbackProbeIntervalSeconds` | `30` | While in subtag mode, how often (in seconds) the monitor probes the wnwrap provider to detect recovery. Floored at 1. |
|
||||
| `MxGateway:Alarms:Fallback:FailbackStableProbes` | `3` | Number of consecutive clean wnwrap probes required before the monitor switches back to the alarm manager. Floored at 1. |
|
||||
| `MxGateway:Alarms:Fallback:Discovery:UseGalaxyRepository` | `true` | When `true`, the monitor queries the Galaxy Repository SQL database to build the subtag watch-list for the configured area. |
|
||||
| `MxGateway:Alarms:Fallback:Discovery:Area` | _(empty)_ | Galaxy area to scope the Repository query to. Falls back to `MxGateway:Alarms:DefaultArea` when empty. Ignored when `UseGalaxyRepository` is `false`. This area is **not** used to compose a Repository-derived alarm's canonical `Galaxy!{area}.{reference}`: each discovered alarm uses its object's real Galaxy area (discovered via `gobject.area_gobject_id`), so the reference's group matches what the native alarmmgr emits. `Discovery:Area` / `DefaultArea` is used as the composition area only for explicit `IncludeAttributes` entries, which carry no discovered area. |
|
||||
| `MxGateway:Alarms:Fallback:Discovery:IncludeAttributes` | _(empty)_ | Explicit MXAccess attribute paths to add to the subtag watch-list, supplementing (or replacing, when `UseGalaxyRepository` is `false`) the Repository-derived list. |
|
||||
| `MxGateway:Alarms:Fallback:Discovery:ExcludeAttributes` | _(empty)_ | Attribute paths to remove from the merged watch-list (case-insensitive). The exclude is applied after the Repository-derived rows and the explicit `IncludeAttributes` entries are combined, so an exclude that matches an explicit include suppresses it too — excludes win. Ignored when `UseGalaxyRepository` is `false`. |
|
||||
| `MxGateway:Alarms:Fallback:Subtags:Active` | `InAlarm` | Subtag name for the in-alarm boolean. Confirmed AVEVA `AlarmExtension` field name. |
|
||||
| `MxGateway:Alarms:Fallback:Subtags:Acked` | `Acked` | Subtag name for the acknowledged boolean. Confirmed AVEVA `AlarmExtension` field name. |
|
||||
| `MxGateway:Alarms:Fallback:Subtags:AckComment` | `AckMsg` | Subtag name for the acknowledgement comment write target. Writing this subtag performs the acknowledge in AVEVA. Confirmed AVEVA `AlarmExtension` field name. When empty, the ack-comment write path is disabled. |
|
||||
| `MxGateway:Alarms:Fallback:Subtags:Priority` | `Priority` | Subtag name for the alarm priority / severity value. Confirmed AVEVA `AlarmExtension` field name. |
|
||||
|
||||
Validation rules:
|
||||
|
||||
- `Mode` must be `Auto`, `ForceAlarmManager`, or `ForceSubtag` (case-insensitive).
|
||||
- `Mode = ForceSubtag` with both `UseGalaxyRepository = false` and an empty
|
||||
`IncludeAttributes` list produces a startup validation warning: the subtag
|
||||
provider has no attributes to advise.
|
||||
- `ConsecutiveFailureThreshold`, `FailbackProbeIntervalSeconds`, and
|
||||
`FailbackStableProbes` are floored at 1 by `GatewayOptionsValidator`.
|
||||
|
||||
Full example with every fallback setting written out (the `Fallback` values shown are the defaults):
|
||||
|
||||
```json
{
  "MxGateway": {
    "Alarms": {
      "Enabled": true,
      "SubscriptionExpression": "\\\\SCADA01\\Galaxy!PlantArea",
      "DefaultArea": "PlantArea",
      "ReconcileIntervalSeconds": 30,
      "Fallback": {
        "Mode": "Auto",
        "ConsecutiveFailureThreshold": 3,
        "FailbackProbeIntervalSeconds": 30,
        "FailbackStableProbes": 3,
        "Discovery": {
          "UseGalaxyRepository": true,
          "Area": "",
          "IncludeAttributes": [],
          "ExcludeAttributes": []
        },
        "Subtags": {
          "Active": "InAlarm",
          "Acked": "Acked",
          "AckComment": "AckMsg",
          "Priority": "Priority"
        }
      }
    }
  }
}
```
|
||||
|
||||
The defaults (`InAlarm`/`Acked`/`AckMsg`/`Priority`) are the confirmed AVEVA
|
||||
`AlarmExtension` primitive field names, verified by querying the live ZB Galaxy
|
||||
`attribute_definition` rows. The `Subtags` block exists so names can be
|
||||
overridden without a code change if a site's alarm template uses different
|
||||
attribute names. See `docs/AlarmClientDiscovery.md` for the synthesis rules that
|
||||
depend on these names.
|
||||
|
||||
## Host Endpoints and Transport Security (Kestrel)
|
||||
|
||||
The listening endpoints are **not** part of the `MxGateway` section. The gateway
|
||||
uses the stock ASP.NET Core host (`WebApplication.CreateBuilder`) with no
|
||||
`ConfigureKestrel` call in code, so endpoints come entirely from the standard
|
||||
`Kestrel` configuration section. On the deployed hosts these values are supplied
|
||||
as NSSM environment variables (`Kestrel__Endpoints__...`), not from
|
||||
`appsettings.json`.
|
||||
|
||||
Two named endpoints are bound:
|
||||
|
||||
| Endpoint name | Purpose | Protocol requirement |
|---|---|---|
| `Http` | Public gRPC API (sessions, invoke, events, Galaxy browse) | HTTP/2 |
| `Dashboard` | Blazor dashboard and SignalR hubs | HTTP/1.1 (HTTP/2 optional) |
|
||||
|
||||
Both endpoints share one routing pipeline; the names only select which TCP port
|
||||
serves which traffic. The gRPC endpoint must negotiate **HTTP/2**, which drives
|
||||
the protocol settings below.
|
||||
|
||||
### Plaintext (current deployments)
|
||||
|
||||
Both running hosts (`10.100.0.48` and `wonder-app-vd03`) serve the gRPC port in
|
||||
**cleartext HTTP/2 (`h2c`)**. Because cleartext HTTP/2 has no ALPN to negotiate
|
||||
the protocol, the gRPC endpoint must be pinned to `Http2` with prior knowledge:
|
||||
|
||||
```text
Kestrel__Endpoints__Http__Url=http://0.0.0.0:5120
Kestrel__Endpoints__Http__Protocols=Http2
Kestrel__Endpoints__Dashboard__Url=http://0.0.0.0:5130
```
|
||||
|
||||
In this mode all client↔gateway traffic — including the
|
||||
`authorization: Bearer mxgw_...` API key and any `WriteSecured` / `AuthenticateUser`
|
||||
payloads — crosses the network **unencrypted**. This is acceptable only on a
|
||||
trusted/isolated network segment. Prefer TLS for anything else.
|
||||
|
||||
### TLS
|
||||
|
||||
To encrypt the gRPC channel, give the `Http` endpoint an `https://` URL and a
|
||||
certificate. Over TLS, ALPN negotiates HTTP/2, so the explicit `Protocols=Http2`
|
||||
pin is no longer required (the default `Http1AndHttp2` works for gRPC over TLS).
|
||||
|
||||
`appsettings.json` form:
|
||||
|
||||
```json
|
||||
{
|
||||
"Kestrel": {
|
||||
"Endpoints": {
|
||||
"Http": {
|
||||
"Url": "https://0.0.0.0:5120",
|
||||
"Certificate": {
|
||||
"Path": "C:\\ProgramData\\MxGateway\\certs\\gateway.pfx",
|
||||
"Password": "<pfx-password>"
|
||||
}
|
||||
},
|
||||
"Dashboard": {
|
||||
"Url": "https://0.0.0.0:5130",
|
||||
"Certificate": {
|
||||
"Path": "C:\\ProgramData\\MxGateway\\certs\\gateway.pfx",
|
||||
"Password": "<pfx-password>"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Equivalent NSSM environment-variable form (how config is delivered on the hosts —
|
||||
see [server deploy mechanics in the project notes]):
|
||||
|
||||
```text
|
||||
Kestrel__Endpoints__Http__Url=https://0.0.0.0:5120
|
||||
Kestrel__Endpoints__Http__Certificate__Path=C:\ProgramData\MxGateway\certs\gateway.pfx
|
||||
Kestrel__Endpoints__Http__Certificate__Password=<pfx-password>
|
||||
Kestrel__Endpoints__Dashboard__Url=https://0.0.0.0:5130
|
||||
Kestrel__Endpoints__Dashboard__Certificate__Path=C:\ProgramData\MxGateway\certs\gateway.pfx
|
||||
Kestrel__Endpoints__Dashboard__Certificate__Password=<pfx-password>
|
||||
```
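
The JSON and environment-variable forms above carry the same keys: the ASP.NET Core environment-variable configuration provider treats a double underscore (`__`) as the `:` section separator. The following is a minimal illustrative sketch of that mapping in Python. The helper name `env_to_config` is hypothetical and not part of the gateway, and the sketch ignores the case-insensitivity of real configuration keys.

```python
def env_to_config(env: dict[str, str]) -> dict:
    """Fold `A__B__C=value` pairs into a nested dict (`__` acts as the section separator)."""
    root: dict = {}
    for name, value in env.items():
        parts = name.split("__")
        node = root
        # Walk (or create) one nested section per path segment except the last.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return root


env = {
    "Kestrel__Endpoints__Http__Url": "https://0.0.0.0:5120",
    "Kestrel__Endpoints__Http__Certificate__Path": r"C:\ProgramData\MxGateway\certs\gateway.pfx",
}
config = env_to_config(env)
print(config["Kestrel"]["Endpoints"]["Http"]["Url"])
```

Note that the environment-variable form needs no backslash escaping for the PFX path, whereas the JSON form does.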
|
||||
|
||||
Certificate sourcing options (any standard ASP.NET Core form is accepted):
|
||||
|
||||
| Form | Keys |
|
||||
|---|---|
|
||||
| PFX file | `Certificate:Path` (+ `Certificate:Password` if encrypted) |
|
||||
| PEM pair | `Certificate:Path` (cert) + `Certificate:KeyPath` (private key) |
|
||||
| Windows cert store | `Certificate:Subject`, `Certificate:Store` (e.g. `My`), `Certificate:Location` (`LocalMachine`), `Certificate:AllowInvalid` |
|
||||
|
||||
The certificate's CN/SAN must cover the host name clients dial (or clients must
|
||||
set a server-name override — see below). The dashboard endpoint can keep its own
|
||||
certificate independent of the gRPC endpoint; pair this with
|
||||
`MxGateway:Dashboard:RequireHttpsCookie` (`true`) for production HTTPS.
|
||||
|
||||
### Automatic self-signed certificate
|
||||
|
||||
`mxaccessgw` is an internal tool with no PKI to issue certificates, so requiring
|
||||
an operator to supply one before TLS works pushed deployments toward plaintext.
|
||||
To avoid that, the gateway fills in a self-signed certificate when an HTTPS
|
||||
endpoint is configured without one.
|
||||
|
||||
**Trigger.** At startup the gateway inspects `Kestrel:Endpoints:*`. If any
|
||||
endpoint has an `https://` URL and no `Certificate` subsection of its own, and no
|
||||
`Kestrel:Certificates:Default` is set, the gateway generates (or loads) a
|
||||
persisted self-signed certificate and wires it in as the HTTPS *default* via
|
||||
`ConfigureHttpsDefaults`. All-plaintext deployments are untouched: when no HTTPS
|
||||
endpoint is configured, no certificate or key material is generated or written.
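
The trigger rule can be restated as a small predicate. This is a hedged sketch in Python rather than the gateway's C# code; `needs_default_certificate` is a hypothetical name, and the input is assumed to be the `Kestrel` configuration section already parsed into a nested dict.

```python
def needs_default_certificate(kestrel: dict) -> bool:
    """True when an HTTPS endpoint lacks its own certificate and no global default is set."""
    if kestrel.get("Certificates", {}).get("Default"):
        # An operator-supplied default already covers every HTTPS endpoint.
        return False
    for endpoint in kestrel.get("Endpoints", {}).values():
        url = endpoint.get("Url", "")
        if url.lower().startswith("https://") and not endpoint.get("Certificate"):
            return True
    # All endpoints are plaintext or carry their own certificate: nothing to generate.
    return False


plaintext = {"Endpoints": {"Http": {"Url": "http://0.0.0.0:5120"}}}
https_no_cert = {"Endpoints": {"Http": {"Url": "https://0.0.0.0:5120"}}}
https_with_cert = {
    "Endpoints": {"Http": {"Url": "https://0.0.0.0:5120", "Certificate": {"Path": "gateway.pfx"}}}
}

print(needs_default_certificate(plaintext))        # plaintext host: no material written
print(needs_default_certificate(https_no_cert))    # zero-config TLS path
print(needs_default_certificate(https_with_cert))  # operator certificate wins
```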
|
||||
|
||||
**Generated certificate.** ECDSA P-256, `serverAuth` EKU, validity ≈
|
||||
`ValidityYears` (default 10 years, with one day of clock-skew slack before
|
||||
`notBefore`). SANs cover `localhost`, the machine name (and its FQDN when
|
||||
resolvable), each entry in `AdditionalDnsNames`, and the loopback addresses
|
||||
`127.0.0.1` and `::1`.
|
||||
|
||||
**`MxGateway:Tls:*` options.** All optional; the zero-config path needs none of
|
||||
them.
|
||||
|
||||
| Option | Default | Purpose |
|
||||
|---|---|---|
|
||||
| `Tls:SelfSignedCertPath` | `C:\ProgramData\MxGateway\certs\gateway-selfsigned.pfx` | Where the generated certificate is persisted |
|
||||
| `Tls:ValidityYears` | `10` | Lifetime of the generated certificate (validated 1–100) |
|
||||
| `Tls:AdditionalDnsNames` | `[]` | Extra DNS SANs (e.g. a load-balancer name) |
|
||||
| `Tls:RegenerateIfExpired` | `true` | Replace an expired persisted certificate instead of failing |
|
||||
|
||||
`ValidityYears` is validated by `GatewayOptionsValidator` (range 1–100); the
|
||||
"HTTPS endpoint configured but no certificate available" fail-fast lives in the
|
||||
bootstrap/provider, because the validator only sees the `MxGateway` section, not
|
||||
`Kestrel:Endpoints`.
|
||||
|
||||
**Persistence.** The PFX is written with an **empty** export password — a random
|
||||
in-memory password could not be reused across restarts, which the
|
||||
persist-and-reuse model requires. The private key is instead protected at rest by
|
||||
filesystem permissions: a restrictive ACL on Windows (SYSTEM + Administrators,
|
||||
inherited ACEs stripped) on the `certs` directory and file, and mode `0600` on
|
||||
non-Windows. The write is atomic (hardened temp file, then move). The persisted
|
||||
certificate is reused across restarts (stable thumbprint, so CA-pinning clients
|
||||
keep working) and regenerated only when it is missing, expired (and
|
||||
`RegenerateIfExpired` is `true`), or unreadable/corrupt. If the directory is not
|
||||
writable or the ACL cannot be applied, the gateway fails fast with a diagnostic
|
||||
naming the path rather than falling back to an in-memory certificate.
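
Two parts of the persistence model lend themselves to a short sketch: the reuse/regenerate/fail decision and the atomic write. The Python below is a sketch under stated assumptions, not the gateway's implementation. `decide` and `write_atomically` are hypothetical names, the Windows ACL step is not modeled, and the `0600` mode relies on `tempfile.mkstemp` creating owner-only files on POSIX.

```python
import os
import tempfile
from enum import Enum


class Action(Enum):
    REUSE = "reuse"
    REGENERATE = "regenerate"
    FAIL = "fail"


def decide(exists: bool, readable: bool, expired: bool, regenerate_if_expired: bool = True) -> Action:
    """Map the state of the persisted PFX onto the action described above."""
    if not exists or not readable:
        return Action.REGENERATE  # missing or unreadable/corrupt
    if expired:
        return Action.REGENERATE if regenerate_if_expired else Action.FAIL
    return Action.REUSE


def write_atomically(path: str, data: bytes) -> None:
    """Write to a temp file in the target directory, then move it over the final path."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates the file readable and writable only by the current user.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".pfx")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and destination share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Creating the temp file in the destination directory keeps the final move on one filesystem, which is what makes the replace atomic.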
|
||||
|
||||
**Logging.** On generate or load, the gateway logs the certificate thumbprint,
|
||||
SAN list, and `notAfter` at Information. The PFX bytes, export password, and
|
||||
private key are never logged.
|
||||
|
||||
**Operator override.** The generated certificate is only the HTTPS *default*. To
|
||||
use a real certificate, configure one explicitly — either per endpoint via
|
||||
`Kestrel:Endpoints:<name>:Certificate` (`Path`/`Subject`/`Thumbprint`, etc., as
|
||||
in the table above) or globally via `Kestrel:Certificates:Default`. An
|
||||
explicitly-configured certificate takes precedence, and the gateway then writes
|
||||
no self-signed material.
|
||||
|
||||
### Client side
|
||||
|
||||
Each official client opts into TLS explicitly. For the .NET client
|
||||
(`MxGatewayClientOptions`):
|
||||
|
||||
| Option | Effect |
|
||||
|---|---|
|
||||
| `UseTls` (default `false`) | Enables TLS. Requires an `https://` endpoint; an `https://` endpoint without `UseTls` fails validation, and vice versa. |
|
||||
| `CaCertificatePath` | Pins a custom root (self-signed / private CA) using `CustomRootTrust` chain validation instead of the OS trust store; the .NET client also enforces the certificate hostname/SAN match on this path. |
|
||||
| `RequireCertificateValidation` (default `false`) | Forces OS/system-trust verification on a TLS connection with no pinned CA. Leave `false` for the lenient default. |
|
||||
| `ServerNameOverride` | SNI / certificate host name override when the dialed host differs from the certificate CN/SAN. |
|
||||
|
||||
To pair with the auto-generated self-signed certificate above, the clients are
|
||||
**lenient by default**: a TLS connection with no pinned CA accepts whatever
|
||||
certificate the gateway presents. Pin `CaCertificatePath` to verify, or set
|
||||
`RequireCertificateValidation` to force system-trust verification without
|
||||
pinning. The other language clients expose the equivalent options; the exact
|
||||
behavior differs per stack — Python uses trust-on-first-use and Rust is pin-only.
|
||||
See each client README for the as-built behavior.
|
||||
|
||||
### Gateway↔worker IPC
|
||||
|
||||
Transport security here applies only to the public gRPC channel. The
|
||||
gateway↔worker link is a per-session **named pipe**
|
||||
(`mxaccess-gateway-{gatewayPid}-{sessionId}`), not a network socket. It is not
|
||||
TLS-encrypted and does not need to be: it never leaves the local Windows host and
|
||||
is secured by the OS pipe ACL. See [Worker Frame Protocol](./WorkerFrameProtocol.md).
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [Gateway Process Detailed Design](./GatewayProcessDesign.md)
|
||||
|
||||
+12
-7
@@ -215,13 +215,18 @@ beyond "LDAP is up." See the "Adding a gw-specific group" section of
|
||||
`glauth.md` for the provisioning step that adds `GwAdmin` and grants it to
|
||||
`admin`.
|
||||
|
||||
The suite covers both the success path and the `DashboardAuthenticator` failure
|
||||
branches: `admin` whose LDAP groups resolve to the `Admin` role succeeds and
|
||||
emits the role claim; `readonly` is denied because no group in their `memberOf`
|
||||
appears in `GroupToRole`; `admin` with a wrong password is rejected by the
|
||||
candidate bind without leaking the password into `FailureMessage`; an unknown
|
||||
username yields no candidate; and an unreachable LDAP server is absorbed into a
|
||||
failed result rather than throwing.
|
||||
`DashboardAuthenticator` delegates the LDAP bind and group search to the shared
|
||||
`ZB.MOM.WW.Auth.Ldap` provider (`LdapAuthService`) and only maps the resulting
|
||||
groups to dashboard roles via `DashboardGroupRoleMapper`; the bind/search
|
||||
mechanics that decide each outcome live in that shared provider, not in
|
||||
`DashboardAuthenticator`.
|
||||
|
||||
The suite covers both the success path and the failure outcomes: `admin` whose
|
||||
LDAP groups resolve to the `Admin` role succeeds and emits the role claim;
|
||||
`readonly` is denied because no group in their `memberOf` appears in
|
||||
`GroupToRole`; `admin` with a wrong password fails authentication without leaking
|
||||
the password into `FailureMessage`; an unknown username fails authentication; and
|
||||
an unreachable LDAP server is absorbed into a failed result rather than throwing.
|
||||
|
||||
Run the LDAP live tests explicitly:
|
||||
|
||||
|
||||
@@ -94,6 +94,73 @@ Carrying the enqueue timestamp into the worker layer is what lets queue-wait tim
|
||||
|
||||
`StreamAlarms` is a server-streaming, **session-less** RPC that attaches to the gateway's central alarm feed. The handler delegates to `IGatewayAlarmService.StreamAsync`. The stream opens with one `AlarmFeedMessage` carrying an `active_alarm` per currently-active alarm (the ConditionRefresh snapshot), then a single `snapshot_complete`, then a `transition` for every subsequent raise / acknowledge / clear. It is served by the always-on `GatewayAlarmMonitor`, which owns a single gateway-managed worker session and fans out to every attached client — clients no longer open a session of their own. `alarm_filter_prefix`, when set, scopes the stream to a sub-tree.
|
||||
|
||||
#### Provider status on the alarm feed
|
||||
|
||||
`AlarmFeedMessage` has a fourth `payload` case, `provider_status`, carrying
|
||||
an `AlarmProviderStatus` message:
|
||||
|
||||
```protobuf
|
||||
message AlarmProviderStatus {
|
||||
AlarmProviderMode mode = 1;
|
||||
bool degraded = 2; // true whenever mode == SUBTAG
|
||||
string reason = 3; // human-readable switch reason
|
||||
google.protobuf.Timestamp since = 4;
|
||||
}
|
||||
```
|
||||
|
||||
The gateway emits `provider_status` once when a client first subscribes
|
||||
(immediately after the initial snapshot and before the first live transition)
|
||||
and again on every failover or failback. A late-joining client therefore
|
||||
always learns the current provider mode without waiting for the next switch.
|
||||
|
||||
`AlarmProviderMode` is an enum with three values:
|
||||
|
||||
| Value | Meaning |
|
||||
|-------|---------|
|
||||
| `ALARM_PROVIDER_MODE_UNSPECIFIED` (0) | Default / unset |
|
||||
| `ALARM_PROVIDER_MODE_ALARMMGR` (1) | Native wnwrap alarm-manager source |
|
||||
| `ALARM_PROVIDER_MODE_SUBTAG` (2) | Subtag-monitoring fallback (degraded) |
|
||||
|
||||
#### Degraded and source-provider fields on transitions and snapshots
|
||||
|
||||
`OnAlarmTransitionEvent` and `ActiveAlarmSnapshot` both carry two new fields:
|
||||
|
||||
- `bool degraded` (field 14) — `true` when the record came from the subtag
|
||||
fallback, not the native alarmmgr.
|
||||
- `AlarmProviderMode source_provider` (field 15) — which provider produced
|
||||
this record (`ALARMMGR` or `SUBTAG`).
|
||||
|
||||
Both fields keep their proto3 default values (`false` / `UNSPECIFIED`) in alarmmgr mode,
|
||||
so existing clients that do not read them continue to function without change.
|
||||
Clients that care about provenance — for example, an OPC UA server that
|
||||
applies different quality flags to degraded alarms — should inspect `degraded`
|
||||
before consuming the transition.
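
A client-side consumer of the feed might track provider mode and degraded records roughly as below. This is a sketch under loud assumptions: each message is modeled as a plain `(payload_case, fields)` tuple instead of the generated protobuf type, and `summarize_feed` is a hypothetical helper, not part of any client library.

```python
def summarize_feed(messages):
    """Fold a StreamAlarms-style message sequence into a small summary dict."""
    summary = {
        "active": [],              # alarms from the initial snapshot
        "snapshot_complete": False,
        "mode": None,              # last provider mode seen on provider_status
        "transitions": 0,
        "degraded_transitions": 0,
    }
    for case, fields in messages:
        if case == "active_alarm":
            summary["active"].append(fields)
        elif case == "snapshot_complete":
            summary["snapshot_complete"] = True
        elif case == "provider_status":
            summary["mode"] = fields["mode"]
        elif case == "transition":
            summary["transitions"] += 1
            # Field absent means the proto3 default (false), i.e. native alarmmgr data.
            if fields.get("degraded", False):
                summary["degraded_transitions"] += 1
    return summary


feed = [
    ("active_alarm", {"tag": "Pump1.HiHi"}),
    ("snapshot_complete", {}),
    ("provider_status", {"mode": "ALARM_PROVIDER_MODE_SUBTAG", "degraded": True}),
    ("transition", {"tag": "Pump1.HiHi", "degraded": True}),
    ("transition", {"tag": "Pump2.Lo"}),
]
print(summarize_feed(feed))
```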
|
||||
|
||||
Subtag-mode records are a non-parity source. They carry synthetic GUIDs,
|
||||
best-effort timestamps, and reduced field coverage. See
|
||||
`docs/AlarmClientDiscovery.md` for the full fidelity table.
|
||||
|
||||
#### Provider-mode-changed event
|
||||
|
||||
The worker emits `OnAlarmProviderModeChangedEvent` (family
|
||||
`MX_EVENT_FAMILY_ON_ALARM_PROVIDER_MODE_CHANGED`) on each switch between
|
||||
providers:
|
||||
|
||||
```protobuf
|
||||
message OnAlarmProviderModeChangedEvent {
|
||||
AlarmProviderMode mode = 1;
|
||||
string reason = 2;
|
||||
int32 hresult = 3; // COM HRESULT that triggered failover; 0 on failback
|
||||
google.protobuf.Timestamp at = 4;
|
||||
}
|
||||
```
|
||||
|
||||
This event arrives on the `StreamEvents` stream of the alarm monitor's
|
||||
internal gateway session (not on client sessions). `GatewayAlarmMonitor`
|
||||
consumes it and reflects the new mode into the `StreamAlarms` feed's
|
||||
`provider_status`, the dashboard hub, and metrics. Client sessions do not
|
||||
receive this event directly.
|
||||
|
||||
## Validation Rules
|
||||
|
||||
`MxAccessGrpcRequestValidator` rejects requests with `StatusCode.InvalidArgument` before any session work happens. The rules are intentionally narrow — anything that requires session state (for example, "session does not exist") is left for `ISessionManager` so the validator can stay synchronous and side-effect free.
|
||||
@@ -243,9 +310,27 @@ services.AddGrpc(options => options.Interceptors.Add<GatewayGrpcAuthorizationInt
|
||||
|
||||
Because the interceptor runs before any handler, `MxAccessGatewayService` can safely assume the call has been authorized and that `IGatewayRequestIdentityAccessor.Current` is populated. The handler's only responsibility is to read the identity for `OpenSession` so the session is owned by the authenticated principal; it does not perform any authorization checks of its own. See [Authorization](./Authorization.md) for the policy and identity model.
|
||||
|
||||
## Transport Security
|
||||
|
||||
The gRPC endpoint runs over HTTP/2, in cleartext (`h2c`) or TLS depending on the
|
||||
Kestrel endpoint configuration. The current deployments serve it in cleartext, so
|
||||
the API key and request payloads cross the network unencrypted. The endpoint,
|
||||
protocol pinning, and TLS certificate configuration — plus the corresponding
|
||||
client `UseTls` / `CaCertificatePath` options — are documented in
|
||||
[Host Endpoints and Transport Security](./GatewayConfiguration.md#host-endpoints-and-transport-security-kestrel).
|
||||
|
||||
To make TLS usable without PKI, the gateway can auto-generate and persist a
|
||||
self-signed certificate when an HTTPS endpoint is configured without one, and the
|
||||
language clients are lenient by default — a TLS connection with no pinned CA
|
||||
accepts the presented certificate (with per-stack nuances: Python is
|
||||
trust-on-first-use, Rust is pin-only). See
|
||||
[Automatic self-signed certificate](./GatewayConfiguration.md#automatic-self-signed-certificate)
|
||||
and each client README for the as-built behavior.
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [Contracts](./Contracts.md)
|
||||
- [Sessions](./Sessions.md)
|
||||
- [Authorization](./Authorization.md)
|
||||
- [Gateway Configuration](./GatewayConfiguration.md)
|
||||
- [Gateway Process Design](./GatewayProcessDesign.md)
|
||||
|
||||
+6
-6
@@ -4,7 +4,7 @@ The metrics subsystem exposes counters, histograms, and observable gauges that d
|
||||
|
||||
## Overview
|
||||
|
||||
`GatewayMetrics` is a singleton (registered in `GatewayApplication.cs`) that owns a single `Meter` named `ZB.MOM.WW.MxGateway.Server` and a set of synchronised counters, histograms, and observable gauges. Subsystems call typed mutator methods (`SessionOpened`, `CommandFailed`, `EventReceived`, etc.) rather than touching the `Meter` directly, which keeps the OpenTelemetry instrument names and tag conventions in one place. A `lock (_syncRoot)` block guards the scalar fields used by `GetSnapshot`, while per-event maps use `ConcurrentDictionary<string, long>` so the hot event path avoids the lock.
|
||||
`GatewayMetrics` is a singleton (registered in `GatewayApplication.cs`) that owns a single `Meter` named `ZB.MOM.WW.MxGateway` and a set of synchronised counters, histograms, and observable gauges. Subsystems call typed mutator methods (`SessionOpened`, `CommandFailed`, `EventReceived`, etc.) rather than touching the `Meter` directly, which keeps the OpenTelemetry instrument names and tag conventions in one place. A `lock (_syncRoot)` block guards the scalar fields used by `GetSnapshot`, while per-event maps use `ConcurrentDictionary<string, long>` so the hot event path avoids the lock.
|
||||
|
||||
## Meter and OpenTelemetry Compatibility
|
||||
|
||||
@@ -13,7 +13,7 @@ The meter name is exposed as a constant so that hosting code can register it wit
|
||||
```csharp
|
||||
public sealed class GatewayMetrics : IDisposable
|
||||
{
|
||||
public const string MeterName = "ZB.MOM.WW.MxGateway.Server";
|
||||
public const string MeterName = "ZB.MOM.WW.MxGateway";
|
||||
|
||||
public GatewayMetrics()
|
||||
{
|
||||
@@ -50,12 +50,12 @@ All counters are `Counter<long>`. Tag values come from the call sites listed und
|
||||
|
||||
### Histograms
|
||||
|
||||
Histograms record durations in milliseconds (the `unit` argument on `CreateHistogram`):
|
||||
Histograms record durations in seconds (the `unit` argument on `CreateHistogram`):
|
||||
|
||||
```csharp
|
||||
_workerStartupLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.workers.startup.duration", "ms");
|
||||
_commandLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.commands.duration", "ms");
|
||||
_eventStreamSendLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.events.stream_send.duration", "ms");
|
||||
_workerStartupLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.workers.startup.duration", "s");
|
||||
_commandLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.commands.duration", "s");
|
||||
_eventStreamSendLatencyHistogram = _meter.CreateHistogram<double>("mxgateway.events.stream_send.duration", "s");
|
||||
```
|
||||
|
||||
| Instrument | Tags | What it measures |
|
||||
|
||||
@@ -0,0 +1,240 @@
|
||||
# Client Lazy-Browse Walker Helpers + Per-Language Tests
|
||||
|
||||
Date: 2026-05-28
|
||||
Status: approved, ready for implementation plan
|
||||
|
||||
## Problem
|
||||
|
||||
The `BrowseChildren` RPC shipped (branch `feat/lazy-browse-children`, merged
|
||||
or pending merge), but each language client exposes only the raw generated
|
||||
gRPC stub. Callers must hand-write recursion, sibling pagination, and
|
||||
NotFound translation themselves. Only one client (.NET) has a smoke test,
|
||||
and it is skippable.
|
||||
|
||||
This work adds a small high-level walker to each client and unit tests so
|
||||
callers can build OPC UA-style browse trees without re-implementing the
|
||||
same plumbing five times.
|
||||
|
||||
## Scope
|
||||
|
||||
Each of the five clients (.NET, Python, Rust, Go, Java) gains:
|
||||
|
||||
1. A low-level `BrowseChildren*Async` wrapper on the existing
|
||||
`GalaxyRepositoryClient`, mirroring the existing `DiscoverHierarchy*Async`
|
||||
shape.
|
||||
2. A high-level `LazyBrowseNode` type plus a `BrowseAsync` factory.
|
||||
3. Six unit tests against the language's existing fake-transport fixture.
|
||||
|
||||
Plus a one-time toolchain bootstrap so the Java client builds locally on
|
||||
the macOS dev host (Homebrew install of Temurin 21 + Gradle).
|
||||
|
||||
## Architecture
|
||||
|
||||
`LazyBrowseNode` is shared in shape across languages:
|
||||
|
||||
```text
|
||||
LazyBrowseNode {
|
||||
Object GalaxyObject (immutable, from server)
|
||||
HasChildrenHint bool (server's child_has_children value)
|
||||
Children list<LazyBrowseNode> (empty until Expand)
|
||||
IsExpanded bool
|
||||
ExpandAsync(ct) Task (idempotent; no-op after first call)
|
||||
}
|
||||
```
|
||||
|
||||
`GalaxyRepositoryClient.BrowseAsync(parent?, ct)` returns a list of root
|
||||
`LazyBrowseNode`s. Empty `parent` means structural roots. Each returned
|
||||
node is unexpanded; the caller invokes `ExpandAsync` to fetch direct
|
||||
children. After expand, `Children` is a list of further `LazyBrowseNode`s.
|
||||
|
||||
**Pagination is hidden.** `ExpandAsync` walks `next_page_token` internally
|
||||
until all siblings of this parent are gathered. Callers see one flat
|
||||
`Children` list.
|
||||
|
||||
**Errors:** server `NotFound` becomes a language-idiomatic typed error
|
||||
(`MxGatewayException` in .NET, `GalaxyNotFoundError` in Python,
|
||||
`GalaxyError::NotFound` in Rust, typed error in Go,
|
||||
`GalaxyNotFoundException` in Java).
|
||||
|
||||
**Filters:** `BrowseAsync` accepts a `BrowseChildrenOptions` (or
|
||||
language-equivalent) mirroring the existing `DiscoverHierarchyOptions`. The
|
||||
same options apply to every `ExpandAsync` call rooted from that factory
|
||||
call — stored on the node so child expansions inherit them.
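
The expand semantics above (idempotent, pagination hidden) can be sketched with a fake transport. This is an illustrative Python sketch, not the planned client code: `FakeTransport`, `Page`, and the `(name, has_children)` tuples are hypothetical stand-ins for the generated gRPC types, filter options are omitted, and concurrent `expand` calls are not guarded.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Page:
    children: list              # (name, has_children) tuples
    next_page_token: str = ""   # empty means no further pages


class FakeTransport:
    """Hypothetical stand-in for the BrowseChildren RPC; serves canned pages."""

    def __init__(self, pages):
        self._pages = pages     # {(parent, page_token): Page}
        self.calls = 0

    async def browse_children(self, parent, page_token=""):
        self.calls += 1
        try:
            return self._pages[(parent, page_token)]
        except KeyError:
            raise LookupError(f"unknown parent {parent!r}") from None


@dataclass
class LazyBrowseNode:
    name: str
    has_children_hint: bool
    _transport: FakeTransport = field(repr=False)
    children: list = field(default_factory=list)
    is_expanded: bool = False

    async def expand(self) -> None:
        if self.is_expanded:
            return  # idempotent: no RPC after the first successful expand
        gathered, token = [], ""
        while True:
            page = await self._transport.browse_children(self.name, token)
            gathered.extend(LazyBrowseNode(n, h, self._transport) for n, h in page.children)
            token = page.next_page_token
            if not token:
                break
        # Publish the result only once every page has been gathered.
        self.children = gathered
        self.is_expanded = True


async def main():
    transport = FakeTransport({
        ("Area1", ""): Page([("Pump1", False)], next_page_token="p2"),
        ("Area1", "p2"): Page([("Pump2", False)]),
    })
    root = LazyBrowseNode("Area1", True, transport)
    await root.expand()
    await root.expand()  # second call is a no-op
    return root, transport


root, transport = asyncio.run(main())
print([child.name for child in root.children], transport.calls)
```

Assigning `children` only after the loop finishes means a failure on a later page leaves the node unexpanded, so a retry starts from a clean state.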
|
||||
|
||||
## Per-language API
|
||||
|
||||
Each language adapts to its own idioms; the structure is parallel.
|
||||
|
||||
### .NET (`clients/dotnet/ZB.MOM.WW.MxGateway.Client/GalaxyRepositoryClient.cs`)
|
||||
|
||||
```csharp
|
||||
public sealed class LazyBrowseNode
|
||||
{
|
||||
public GalaxyObject Object { get; }
|
||||
public bool HasChildrenHint { get; }
|
||||
public IReadOnlyList<LazyBrowseNode> Children { get; }
|
||||
public bool IsExpanded { get; }
|
||||
public Task ExpandAsync(CancellationToken ct = default);
|
||||
}
|
||||
|
||||
public Task<IReadOnlyList<LazyBrowseNode>> BrowseAsync(
|
||||
BrowseChildrenOptions? options = null,
|
||||
CancellationToken ct = default);
|
||||
|
||||
public Task<BrowseChildrenReply> BrowseChildrenRawAsync(
|
||||
BrowseChildrenRequest request,
|
||||
CancellationToken ct = default);
|
||||
```
|
||||
|
||||
### Python (`clients/python/src/zb_mom_ww_mxgateway/galaxy.py`)
|
||||
|
||||
```python
|
||||
@dataclass
|
||||
class LazyBrowseNode:
|
||||
object: GalaxyObject
|
||||
has_children_hint: bool
|
||||
children: list["LazyBrowseNode"]
|
||||
is_expanded: bool
|
||||
async def expand(self) -> None: ...
|
||||
|
||||
async def browse(
|
||||
self,
|
||||
options: BrowseChildrenOptions | None = None,
|
||||
) -> list[LazyBrowseNode]: ...
|
||||
```
|
||||
|
||||
### Rust (`clients/rust/src/galaxy.rs`)
|
||||
|
||||
```rust
|
||||
pub struct LazyBrowseNode { /* private fields; Arc<Mutex<>> for Children */ }
|
||||
|
||||
impl LazyBrowseNode {
|
||||
pub fn object(&self) -> &GalaxyObject;
|
||||
pub fn has_children_hint(&self) -> bool;
|
||||
pub fn children(&self) -> Vec<LazyBrowseNode>; // cloned snapshot
|
||||
pub fn is_expanded(&self) -> bool;
|
||||
pub async fn expand(&self) -> Result<(), GalaxyError>;
|
||||
}
|
||||
|
||||
pub async fn browse(
|
||||
&self,
|
||||
options: Option<BrowseChildrenOptions>,
|
||||
) -> Result<Vec<LazyBrowseNode>, GalaxyError>;
|
||||
```
|
||||
|
||||
### Go (`clients/go/mxgateway/galaxy.go`)
|
||||
|
||||
```go
|
||||
type LazyBrowseNode struct { /* unexported */ }
|
||||
func (n *LazyBrowseNode) Object() *pb.GalaxyObject
|
||||
func (n *LazyBrowseNode) HasChildrenHint() bool
|
||||
func (n *LazyBrowseNode) Children() []*LazyBrowseNode
|
||||
func (n *LazyBrowseNode) IsExpanded() bool
|
||||
func (n *LazyBrowseNode) Expand(ctx context.Context) error
|
||||
|
||||
func (c *Client) Browse(
|
||||
ctx context.Context,
|
||||
opts *BrowseChildrenOptions,
|
||||
) ([]*LazyBrowseNode, error)
|
||||
```
|
||||
|
||||
### Java (`clients/java/zb-mom-ww-mxgateway-client/`)
|
||||
|
||||
```java
|
||||
public final class LazyBrowseNode {
|
||||
public GalaxyObject getObject();
|
||||
public boolean hasChildrenHint();
|
||||
public List<LazyBrowseNode> getChildren();
|
||||
public boolean isExpanded();
|
||||
public CompletableFuture<Void> expandAsync();
|
||||
}
|
||||
|
||||
public CompletableFuture<List<LazyBrowseNode>> browseAsync(
|
||||
BrowseChildrenOptions options);
|
||||
```
|
||||
|
||||
If the existing Java client surface is synchronous, mirror that — both
|
||||
sync and async variants are acceptable as long as the choice matches the
|
||||
client's existing convention.
|
||||
|
||||
## Tests
|
||||
|
||||
Each language adds these six facts against its existing fake-transport
|
||||
fixture (`FakeGalaxyRepositoryTransport` in .NET, the equivalent in each
|
||||
other client):
|
||||
|
||||
| # | Test | Purpose |
|
||||
|---|------|---------|
|
||||
| 1 | `Browse_NoParent_ReturnsRoots` | factory returns roots, each unexpanded, hint reflects fake's `child_has_children` |
|
||||
| 2 | `Expand_PopulatesChildrenAndMarksExpanded` | one ExpandAsync call fires one BrowseChildren RPC; Children populated; IsExpanded flips |
|
||||
| 3 | `Expand_CalledTwice_NoSecondRpc` | idempotency — fake records RPC count == 1 |
|
||||
| 4 | `Expand_UnknownParent_ThrowsGalaxyNotFound` | server NotFound surfaces as language-typed error |
|
||||
| 5 | `Expand_MultiPageSiblings_GathersAllPages` | fake returns NextPageToken on first call; helper walks pages until empty; flat Children list |
|
||||
| 6 | `Browse_WithFilter_ForwardsToRequest` | options propagate into the wire request (`tag_name_glob` etc.) |
|
||||
|
||||
No new live-only tests in this batch. The existing
|
||||
`BrowseChildrenSmokeTests` in .NET covers wire compatibility.
|
||||
|
||||
## Java toolchain bootstrap
|
||||
|
||||
The macOS dev host lacks a JVM. Install via Homebrew (one-time):
|
||||
|
||||
```bash
|
||||
brew install temurin@21
|
||||
brew install gradle
|
||||
```
|
||||
|
||||
Verify:
|
||||
```bash
|
||||
java -version # expect 21.x
|
||||
gradle --version # expect 8.x or 9.x
|
||||
```
|
||||
|
||||
If the Temurin install does not set `JAVA_HOME`, add this to your shell init:
|
||||
```bash
|
||||
export JAVA_HOME="$(/usr/libexec/java_home -v 21)"
|
||||
```
|
||||
|
||||
Then verify the existing Java client builds against its committed
|
||||
generated tree:
|
||||
```bash
|
||||
cd clients/java
|
||||
gradle build -x test
|
||||
```
|
||||
|
||||
If build succeeds, regenerate protos to pick up `BrowseChildren`:
|
||||
```bash
|
||||
gradle generateProto
|
||||
```
|
||||
|
||||
This produces the new Java RPC stubs that the walker work depends on.
|
||||
|
||||
**Failure path:** if Homebrew installs but `gradle build` fails for an
|
||||
environmental reason (e.g., proto plugin version mismatch), fall back to
|
||||
"defer Java" — implement the other four clients and document that Java
|
||||
walker work waits for the Windows host. Do not spend more than ~30
|
||||
minutes debugging local Java issues; the Windows host already builds the
|
||||
Java client cleanly.
|
||||
|
||||
## Documentation updates
|
||||
|
||||
Each client's `README.md` "Browsing lazily" snippet (added in commit
|
||||
`0d6193c`) gets one short example block showing the high-level walker
|
||||
in addition to the existing raw-RPC snippet. Approximately three
|
||||
sentences plus a 5-line code block per language.
|
||||
|
||||
No changes to `gateway.md`, `docs/GalaxyRepository.md`, or
|
||||
`docs/DesignDecisions.md` — those describe the wire contract; the
|
||||
walkers are client-side ergonomics, not part of the wire surface.
|
||||
|
||||
## Non-goals
|
||||
|
||||
- Async iterator / streaming walker (rejected in brainstorming —
|
||||
encourages eager-to-completion consumption that defeats laziness).
|
||||
- Explicit `RefreshAsync` on `LazyBrowseNode` (single-shot expand is
|
||||
enough; caller invalidates the tree by re-calling `BrowseAsync`).
|
||||
- Tree-builder helpers that pre-fetch the whole hierarchy (that's just
|
||||
`DiscoverHierarchy` with extra round-trips).
|
||||
- Server changes — the wire contract is final.
|
||||
- Cross-client integration test runner — each client tests in isolation.
|
||||
- Java regen on Mac if Homebrew install fails — defer to Windows host.
|
||||
File diff suppressed because it is too large
@@ -0,0 +1,15 @@
|
||||
{
|
||||
"planPath": "docs/plans/2026-05-28-client-walker-implementation.md",
|
||||
"tasks": [
|
||||
{"id": 23, "subject": "Task 0: Branch state check", "status": "pending"},
|
||||
{"id": 24, "subject": "Task 1: Java toolchain bootstrap", "status": "pending", "blockedBy": [23]},
|
||||
{"id": 25, "subject": "Task 2: .NET LazyBrowseNode walker + 6 tests", "status": "pending", "blockedBy": [23]},
|
||||
{"id": 26, "subject": "Task 3: Python LazyBrowseNode walker + 6 tests", "status": "pending", "blockedBy": [23]},
|
||||
{"id": 27, "subject": "Task 4: Rust LazyBrowseNode walker + 6 tests", "status": "pending", "blockedBy": [23]},
|
||||
{"id": 28, "subject": "Task 5: Go LazyBrowseNode walker + 6 tests", "status": "pending", "blockedBy": [23]},
|
||||
{"id": 29, "subject": "Task 6: Java LazyBrowseNode walker + tests", "status": "pending", "blockedBy": [23, 24]},
|
||||
{"id": 30, "subject": "Task 7: README walker examples for all 5 clients", "status": "pending", "blockedBy": [25, 26, 27, 28]},
|
||||
{"id": 31, "subject": "Task 8: Final integration build + verification", "status": "pending", "blockedBy": [29, 30]}
|
||||
],
|
||||
"lastUpdated": "2026-05-28T18:30:00Z"
|
||||
}
|
||||
@@ -102,12 +102,14 @@ message BrowseChildrenReply {
|
||||
| Condition | Status |
|
||||
|---|---|
|
||||
| Unknown `parent_gobject_id` / `parent_tag_name` / `parent_contained_path` | `NotFound` |
|
||||
| Stale `page_token` (cache deployed forward) | `FailedPrecondition`; current `cache_sequence` in trailers |
|
||||
| Stale `page_token` (cache deployed forward) | `InvalidArgument`; current `cache_sequence` in trailers |
|
||||
| Filter set differs between pages of the same token | `InvalidArgument` |
|
||||
| First load not complete within 5s | `Unavailable` |
|
||||
| API key missing `metadata:read` scope | `PermissionDenied` |
|
||||
| No API key | `Unauthenticated` |
|
||||
|
||||
Stale and filter-changed page tokens both surface as `InvalidArgument` — same contract as `DiscoverHierarchy`, since `BrowseChildren` reuses the same token encoding (`sequence:filter-signature:offset`).
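
The shared rejection path can be illustrated with a token check. The sketch below assumes the three token fields are plain colon-separated text with integer `sequence` and `offset` and a signature that contains no colon; the actual wire encoding may differ. `check_page_token` and the `InvalidArgument` exception are hypothetical names.

```python
class InvalidArgument(ValueError):
    """Stands in for a gRPC InvalidArgument status in this sketch."""


def check_page_token(token: str, current_sequence: int, current_filter_signature: str) -> int:
    """Validate a `sequence:filter-signature:offset` token and return its offset."""
    parts = token.split(":")
    if len(parts) != 3:
        raise InvalidArgument("malformed page_token")
    sequence_text, signature, offset_text = parts
    try:
        sequence, offset = int(sequence_text), int(offset_text)
    except ValueError:
        raise InvalidArgument("malformed page_token") from None
    if sequence != current_sequence:
        # The cache moved forward since the token was issued.
        raise InvalidArgument("stale page_token")
    if signature != current_filter_signature:
        # The filter set differs from the one the token was issued for.
        raise InvalidArgument("filter changed between pages")
    return offset


print(check_page_token("42:ab12:100", current_sequence=42, current_filter_signature="ab12"))
```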
|
||||
|
||||
`browse_subtrees` API-key constraints intersect with the request as today.
|
||||
|
||||
## Server projection
|
||||
@@ -224,7 +226,7 @@ Unit tests (no live MXAccess / Galaxy required):
|
||||
- Ordering matches `DashboardBrowseTreeBuilder` byte-for-byte.
|
||||
- Sibling pagination across multiple pages.
|
||||
- Page-token round trip (serialize → deserialize → same offset).
|
||||
- Stale `page_token` → `FailedPrecondition`.
|
||||
- Stale `page_token` → `InvalidArgument`.
|
||||
- Unknown parent → `NotFound`.
|
||||
- Filter change between pages of the same token → `InvalidArgument`.
|
||||
- `GalaxyRepositoryGrpcServiceTests` — new `BrowseChildren` happy path,
|
||||
|
||||
@@ -0,0 +1,156 @@
|
||||
# Gateway TLS Auto-Certificate and Lenient Client Trust — Design
|
||||
|
||||
Date: 2026-06-01
|
||||
Status: Approved (brainstorming), pending implementation plan
|
||||
|
||||
## Problem
|
||||
|
||||
The gateway can serve gRPC and the dashboard over TLS, but only if an operator
|
||||
supplies a certificate via the Kestrel `https://` endpoint config. With no cert,
|
||||
an `https` endpoint fails at startup with Kestrel's opaque "No server certificate
|
||||
was specified" error. Both current deployments therefore run plaintext (`h2c`),
|
||||
exposing the API key and request payloads on the wire.
|
||||
|
||||
`mxaccessgw` is an internal tool. The goal is for TLS to "just work" with zero PKI
|
||||
management: the gateway fabricates its own long-lived certificate when an HTTPS
|
||||
endpoint is configured without one, and clients accept whatever certificate is
|
||||
presented unless an operator explicitly opts into pinning.
|
||||
|
||||
## Decisions
|
||||
|
||||
1. **Gateway = fill-missing-cert-only.** No new "enable TLS" switch. TLS is still
|
||||
driven by configuring a Kestrel `https://` endpoint. New behavior: when an
|
||||
HTTPS endpoint has no `Certificate` section, the gateway generates/loads a
|
||||
persisted self-signed cert instead of failing. Plaintext-only hosts are
|
||||
untouched — no certificate or key material is ever written for them.
|
||||
2. **Persist & reuse.** The self-signed cert is saved as a PFX under
|
||||
`C:\ProgramData\MxGateway\certs`, reused across restarts, regenerated only if
|
||||
missing, expired, or unreadable. Stable thumbprint; survives restarts; any
|
||||
CA-pinning client keeps working.
|
||||
3. **Clients = lenient TLS, plaintext default.** When a client connects over TLS
|
||||
without a pinned CA, it skips verification (accepts any cert). Pinning a CA file
|
||||
restores full verification. The per-client connection default (mostly
|
||||
plaintext/`http`) does not change — TLS is still opt-in via the endpoint scheme.
|
||||
|
||||
**Scope boundary:** the gateway↔worker named-pipe IPC is unchanged (local,
|
||||
OS-secured by the pipe ACL). This work touches only the public gRPC/dashboard
|
||||
transport and the five language clients.
|
||||
|
||||
## Gateway component
|
||||
|
||||
New type `SelfSignedCertificateProvider` in
|
||||
`src/ZB.MOM.WW.MxGateway.Server/Security/Tls/`.
|
||||
|
||||
1. **Detect need.** Inspect `Kestrel:Endpoints:*` configuration at startup. If any
|
||||
endpoint has an `https://` URL and no `Certificate` subsection, a default cert
|
||||
is needed. If none do, the provider is a no-op (no file written).
|
||||
2. **Load-or-create.** Look for the persisted PFX. If present, valid, and
|
||||
unexpired, load it. Otherwise generate and persist.
|
||||
3. **Generate.** `CertificateRequest` with **ECDSA P-256**, `notBefore = now - 1
|
||||
day` (clock-skew slack), `notAfter = now + ValidityYears`. SANs: `DNS=localhost`,
|
||||
`DNS=<MachineName>`, `DNS=<MachineName.FQDN>` when resolvable, plus
|
||||
`IP=127.0.0.1` and `IP=::1`. Server-auth EKU.
|
||||
4. **Persist securely.** Write the PFX with an **empty** export password (a random
|
||||
in-memory password cannot be reused across restarts, which the persist-and-reuse
|
||||
decision requires); protect the private key with a restrictive ACL (SYSTEM +
|
||||
Administrators + service account) on the `certs` directory and file on Windows,
|
||||
and `0600` on non-Windows; atomic write (temp + rename). After generating, the
|
||||
cert is reloaded from the persisted PFX so Kestrel always serves the on-disk key.
|
||||
5. **Wire into Kestrel.** In `GatewayApplication.CreateBuilder`, add
|
||||
`builder.WebHost.ConfigureKestrel(o => o.ConfigureHttpsDefaults(h =>
|
||||
h.ServerCertificate = cert))`. `ConfigureHttpsDefaults` supplies the cert only
|
||||
for HTTPS endpoints that did not specify their own, so an operator-configured
|
||||
`Kestrel:Endpoints:*:Certificate` transparently overrides it. One hook covers
|
||||
both the gRPC and dashboard ports.
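
The load-or-create and atomic-persist steps above can be sketched in a language-neutral way. This is an illustration in Python, not the C# `SelfSignedCertificateProvider`: the PFX is treated as opaque bytes, and `is_usable` (valid, unexpired, readable) and `generate` are hypothetical callbacks supplied by the caller. Windows ACL handling is out of scope for the sketch; only the `0600` mode on non-Windows is shown.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def load_or_create(
    pfx_path: Path,
    is_usable: Callable[[bytes], bool],
    generate: Callable[[], bytes],
) -> tuple[bytes, bool]:
    """Return (pfx_bytes, was_generated)."""
    try:
        existing = pfx_path.read_bytes()
        if is_usable(existing):
            return existing, False  # reuse: same bytes, same thumbprint
    except OSError:
        pass  # missing or unreadable: fall through and regenerate

    pfx_path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so a crash never leaves a half-written PFX behind.
    fd, tmp_name = tempfile.mkstemp(dir=pfx_path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(generate())
        os.chmod(tmp_name, 0o600)  # owner-only on non-Windows
        os.replace(tmp_name, pfx_path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    # Reload from disk so the caller always serves the persisted key.
    return pfx_path.read_bytes(), True
```

Folding the expiry check into `is_usable` keeps the `RegenerateIfExpired` policy with the caller, which can fail fast instead of calling this function when regeneration is disabled.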
|
||||
|
||||
### New config block `MxGateway:Tls`
|
||||
|
||||
All optional; the zero-config path needs none of them.
|
||||
|
||||
| Option | Default | Purpose |
|---|---|---|
| `Tls:SelfSignedCertPath` | `C:\ProgramData\MxGateway\certs\gateway-selfsigned.pfx` | Where the generated cert lives |
| `Tls:ValidityYears` | `10` | Lifetime of the generated cert |
| `Tls:AdditionalDnsNames` | `[]` | Extra SANs (e.g. a load-balancer name) |
| `Tls:RegenerateIfExpired` | `true` | Auto-replace an expired persisted cert |
|
||||
|
||||
Validated by `GatewayOptionsValidator`: `ValidityYears` in 1–100,
|
||||
`SelfSignedCertPath` is a valid path shape when non-blank, and
|
||||
`AdditionalDnsNames` entries are non-blank. (The "https endpoint exists but cert
|
||||
path is blank" fail-fast lives in the bootstrap/provider, not the validator,
|
||||
because the validator only sees the `MxGateway` section, not `Kestrel:Endpoints`.)
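
The three validator rules can be sketched as below. This is an illustrative Python model, not the C# `GatewayOptionsValidator`; the `TlsOptions` dataclass mirrors the table above, and the "valid path shape" check (rejecting characters that Windows paths disallow) is an assumption about what that rule means.

```python
import re
from dataclasses import dataclass, field


@dataclass
class TlsOptions:
    self_signed_cert_path: str = r"C:\ProgramData\MxGateway\certs\gateway-selfsigned.pfx"
    validity_years: int = 10
    additional_dns_names: list[str] = field(default_factory=list)
    regenerate_if_expired: bool = True


# Assumption: "valid path shape" means no characters Windows forbids in paths.
_INVALID_PATH_CHARS = re.compile(r'[<>"|?*\x00-\x1f]')


def validate_tls_options(options: TlsOptions) -> list[str]:
    """Return a list of validation errors; empty means the block is valid."""
    errors = []
    if not 1 <= options.validity_years <= 100:
        errors.append("Tls:ValidityYears must be between 1 and 100")
    path = options.self_signed_cert_path
    # A blank path is allowed here; the bootstrap/provider handles that case.
    if path.strip() and _INVALID_PATH_CHARS.search(path):
        errors.append("Tls:SelfSignedCertPath contains invalid path characters")
    for index, name in enumerate(options.additional_dns_names):
        if not name or not name.strip():
            errors.append(f"Tls:AdditionalDnsNames[{index}] must not be blank")
    return errors
```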
|
||||
|
||||
**Logging:** on generate/load, log thumbprint + SAN list + `notAfter` at
|
||||
Information. Never log the PFX password or private key.
|
||||
|
||||
## Client lenient-TLS behavior
|
||||
|
||||
Uniform rule: **TLS on + no CA pinned ⇒ skip verification; CA pinned ⇒ full
|
||||
verification.** No transport default changes. Each client also exposes an explicit
|
||||
switch to force-disable leniency (strict-without-pinning) for the future.
|
||||
|
||||
| Client | Mechanism | Effort |
|---|---|---|
| .NET | In `CreateHttpHandler`, when `UseTls` and `CaCertificatePath` empty, set `SslOptions.RemoteCertificateValidationCallback = (_,_,_,_) => true`. CA path keeps existing custom-root validation. | trivial |
| Go | In `buildCredentials`, when TLS and no `CACertFile`/`TLSConfig`, use `tls.Config{InsecureSkipVerify: true, ServerName: override}`. | trivial |
| Java | grpc-netty-shaded 1.76.0 ships `InsecureTrustManagerFactory`. When TLS and no CA, build `GrpcSslContexts.forClient().trustManager(InsecureTrustManagerFactory.INSTANCE)`. | easy |
| Python | grpc-python has no per-channel skip-verify. Fetch the server leaf cert at connect via `ssl.get_server_certificate((host, port))`, pass it as `root_certificates` to `ssl_channel_credentials`, plus `grpc.ssl_target_name_override`. Effectively trusts what is presented (TOFU). | moderate, special-cased |
| Rust | tonic 0.13.1 + rustls (`tls-ring`). Implement a custom `rustls::client::danger::ServerCertVerifier` that accepts everything, build a `rustls::ClientConfig` via `.dangerous().with_custom_certificate_verifier(...)`, feed it to the channel. May require a custom hyper-rustls connector if `ClientTlsConfig` will not take a raw rustls config. **Needs an API spike.** | highest |
|
||||
|
||||
### Honesty caveats
|
||||
|
||||
- **Python** is not literally "ignore the cert"; it pins whatever the server
|
||||
presents on first contact via a separate unverified TLS probe. For a self-signed
|
||||
internal cert this is the intended outcome. Documented as a difference.
|
||||
- **Rust** leniency depends on the tonic 0.13 TLS surface. If a custom verifier is
|
||||
disproportionately invasive, the fallback is to require a CA file for Rust TLS
|
||||
(pin-only) and document Rust as the exception.
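
A hedged sketch of the Python pre-fetch path described above. `ssl.get_server_certificate`, `grpc.ssl_channel_credentials`, `grpc.secure_channel`, and the `grpc.ssl_target_name_override` channel option are existing APIs; the function names, the `fetch` injection point, and the default override name `localhost` (chosen because the generated cert carries a `DNS=localhost` SAN) are assumptions.

```python
import ssl
from typing import Callable


def tofu_channel_inputs(
    host: str,
    port: int,
    target_name: str = "localhost",
    fetch: Callable[[tuple[str, int]], str] = ssl.get_server_certificate,
) -> tuple[bytes, list[tuple[str, str]]]:
    """Probe the server without verification and return (root PEM, channel options)."""
    pem = fetch((host, port))  # unverified TLS probe; returns a PEM string
    options = [("grpc.ssl_target_name_override", target_name)]
    return pem.encode("ascii"), options


def open_lenient_channel(host: str, port: int, target_name: str = "localhost"):
    import grpc  # third-party (grpcio); imported lazily so the helper above stays stdlib-only

    roots, options = tofu_channel_inputs(host, port, target_name)
    # Trust exactly the certificate the server just presented.
    credentials = grpc.ssl_channel_credentials(root_certificates=roots)
    return grpc.secure_channel(f"{host}:{port}", credentials, options=options)
```

A connect failure in the probe surfaces as an `OSError` from `fetch`; the client would wrap that into its existing connect-error type.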
|
||||
|
||||
## Error handling
|
||||
|
||||
Gateway:
|
||||
- Cert dir not writable / ACL fails ⇒ fail fast at startup with a diagnostic naming
|
||||
the path and required permission. No silent in-memory fallback.
|
||||
- Persisted PFX corrupt/unreadable ⇒ warn, regenerate, overwrite.
|
||||
- Persisted cert expired ⇒ regenerate if `RegenerateIfExpired` (default), else fail
|
||||
fast instructing the operator to delete it or enable regeneration.
|
||||
- HTTPS endpoint configured but generation disabled / path empty ⇒ the
bootstrap/provider (not the validator, which cannot see `Kestrel:Endpoints`)
rejects at startup rather than letting Kestrel throw its opaque error.
|
||||
|
||||
Clients: surface unchanged. Skip-verify cannot itself raise. Python's pre-fetch
|
||||
wraps connect failure into the existing connect-error type with the endpoint in the
|
||||
message. Rust pin-only fallback surfaces the existing CA-file error.
|
||||
|
||||
## Documentation (same commit as source, per CLAUDE.md)
|
||||
|
||||
- `docs/GatewayConfiguration.md` — extend the TLS section: auto-generation, the
|
||||
`MxGateway:Tls:*` block, persistence location/ACL, thumbprint logging, operator
|
||||
override via `Kestrel:Endpoints:*:Certificate`.
|
||||
- Each client README + `*ClientDesign.md` — "TLS is lenient by default; pin a CA to
|
||||
verify," with Python TOFU and any Rust caveat noted.
|
||||
- `docs/DesignDecisions.md` — record both posture choices and the why (internal
|
||||
tool, no PKI) so they are not mistaken for an oversight.
|
||||
|
||||
## Testing
|
||||
|
||||
Gateway (`MxGateway.Tests`, no MXAccess):
|
||||
- `SelfSignedCertificateProvider`: SANs, server-auth EKU, `notAfter ≈ now +
|
||||
ValidityYears`, ECDSA P-256.
|
||||
- Load-or-create: valid persisted PFX reused (same thumbprint); expired regenerates
|
||||
when enabled; corrupt regenerates with a warning.
|
||||
- Detection: HTTPS-without-cert engages; all-plaintext no-ops and writes no file;
|
||||
endpoint with its own cert is not overridden.
|
||||
- `GatewayOptionsValidator`: new `Tls:*` rules.
|
||||
- Host integration: `Kestrel:Endpoints:Http:Url=https://127.0.0.1:0` builds and
|
||||
binds (today it throws "no certificate specified").
|
||||
|
||||
Clients: each test project gets a lenient-TLS test against a throwaway self-signed
|
||||
cert — connect with no CA succeeds; pinning a wrong CA fails (proves pinning still
|
||||
verifies). Python exercises the pre-fetch path; mark opt-in if loopback timing is
|
||||
flaky. Standard (non-live) tests; no MXAccess or external services.
|
||||
|
||||
Cross-language: add a TLS variant note to `docs/CrossLanguageSmokeMatrix.md`;
|
||||
running the matrix over TLS stays manual/opt-in, consistent with the existing gate.
|
||||
|
||||
Per-component verification follows CLAUDE.md's source-update table (build + test
|
||||
each touched component independently).
|
||||
File diff suppressed because it is too large
@@ -0,0 +1,18 @@
|
||||
{
  "planPath": "docs/plans/2026-06-01-gateway-cert-autogen-implementation.md",
  "tasks": [
    {"id": 1, "subject": "Task 1: Add TlsOptions config + bind into GatewayOptions", "status": "pending"},
    {"id": 2, "subject": "Task 2: Validate MxGateway:Tls in GatewayOptionsValidator", "status": "pending", "blockedBy": [1]},
    {"id": 3, "subject": "Task 3: SelfSignedCertificateProvider.GenerateCertificate", "status": "pending", "blockedBy": [1]},
    {"id": 4, "subject": "Task 4: SelfSignedCertificateProvider.LoadOrCreate (persist/reuse/regenerate/ACL)", "status": "pending", "blockedBy": [3]},
    {"id": 5, "subject": "Task 5: KestrelTlsInspector (detect HTTPS-without-cert)", "status": "pending"},
    {"id": 6, "subject": "Task 6: Wire auto-cert into GatewayApplication.CreateBuilder", "status": "pending", "blockedBy": [1, 4, 5]},
    {"id": 7, "subject": "Task 7: .NET client lenient TLS by default", "status": "pending"},
    {"id": 8, "subject": "Task 8: Go client lenient TLS by default", "status": "pending"},
    {"id": 9, "subject": "Task 9: Java client lenient TLS by default", "status": "pending"},
    {"id": 10, "subject": "Task 10: Python client lenient TLS via TOFU pre-fetch", "status": "pending"},
    {"id": 11, "subject": "Task 11: Rust client lenient TLS via rustls verifier (spike + fallback)", "status": "pending"},
    {"id": 12, "subject": "Task 12: Documentation", "status": "pending", "blockedBy": [6, 7, 8, 9, 10, 11]}
  ],
  "lastUpdated": "2026-06-01"
}
|
||||
@@ -0,0 +1,316 @@
|
||||
# Alarm Subtag-Monitoring Fallback — Design
|
||||
|
||||
**Date:** 2026-06-13
|
||||
**Status:** Superseded by implementation (merged to `main`). This is the original
|
||||
brainstorming design; a few details below were refined during implementation —
|
||||
see the inline **Superseded** notes. The shipped behaviour is documented in
|
||||
`docs/AlarmClientDiscovery.md`, the client READMEs, and the contracts.
|
||||
**Branch:** `feat/alarm-subtag-fallback`
|
||||
|
||||
## Problem
|
||||
|
||||
The gateway's central alarm feed (`GatewayAlarmMonitor` → worker
|
||||
`WnWrapAlarmConsumer`) depends on the AVEVA wnwrap COM consumer
|
||||
(`WNWRAPCONSUMERLib.wwAlarmConsumerClass`), which polls `GetXmlCurrentAlarms2`
|
||||
on the worker STA. That provider can fail at the COM boundary (the older
|
||||
`aaAlarmManagedClient` crashed on FILETIME marshaling; wnwrap can still return
|
||||
failure HRESULTs or throw `COMException`). When it does, the gateway loses all
|
||||
alarm visibility.
|
||||
|
||||
This design adds a **second alarm source** — direct monitoring of each alarm
|
||||
attribute's subtags (`.active`, `.acked`, …) via the existing MXAccess
|
||||
`AddItem`/`Advise` pipeline — and **fails over to it automatically when the
|
||||
wnwrap provider breaks, then fails back automatically when it recovers**. The
|
||||
subtag source can also be forced on by config.
|
||||
|
||||
## Decisions (locked during brainstorming)
|
||||
|
||||
| Decision | Choice |
|---|---|
| Failover model | **Auto-failover + auto-failback** (both directions, runtime) |
| Watch-list source | **Galaxy Repository SQL discovery + config override** |
| Acknowledge in subtag mode | **Write the operator comment to the alarm's ack-comment subtag** (the write performs the ack) |
| Failure signal | **N consecutive wnwrap COM failures** (Subscribe / `GetXmlCurrentAlarms2` throws or returns a failure HRESULT) |
| Degraded-state visibility | **Both** — explicit field in the gRPC contract **and** dashboard + metrics |
| Synthesis location | **Worker-side** (`Approach A`) — keeps the parity rule "the gateway forwards only events the worker emits; it never synthesizes events" |
|
||||
|
||||
## Core principle
|
||||
|
||||
Subtag monitoring is, by definition, a **non-parity, lower-fidelity** alarm
|
||||
source: it synthesizes alarm transitions from raw data changes, has no native
|
||||
alarm GUID, no native original-raise timestamp, and a narrower field set. Per
|
||||
`CLAUDE.md`, synthesizing events is allowed only as an explicit opt-in
|
||||
non-parity mode. This design satisfies that by (a) doing the synthesis **inside
|
||||
the worker** (so the gateway still only forwards worker-emitted events) and
|
||||
(b) marking every degraded event and the whole feed as degraded so no client
|
||||
mistakes it for the authoritative alarmmgr feed.
|
||||
|
||||
## Architecture
|
||||
|
||||
```
GATEWAY (.NET 10, x64)
┌─────────────────────────────────────────────────────────────────┐
│ GatewayAlarmMonitor (BackgroundService) │
│ • resolves watch-list: Galaxy Repository SQL + config override │
│ • arms the worker with the watch-list at subscribe time │
│ • consumes AlarmProviderModeChanged → reflects mode into feed, │
│ /hubs/alarms dashboard hub, and metrics │
│ • forces a cache reconcile (QueryActiveAlarms) on every switch │
└───────────────────────────────┬───────────────────────────────────┘
                                │ IPC (WorkerEnvelope frames)
                                │ · SubscribeAlarms{ watch_list, failover cfg }
                                │ · AlarmProviderModeChanged{ mode, reason, hresult }
                                │ · OnAlarmTransitionEvent (degraded flag set in subtag mode)
                                ▼
WORKER (.NET FW 4.8, x86, STA)
┌─────────────────────────────────────────────────────────────────┐
│ AlarmDispatcher → FailoverAlarmConsumer : IMxAccessAlarmConsumer │
│ ├─ primary : WnWrapAlarmConsumer (wnwrap COM poll, unchanged) │
│ └─ standby : SubtagAlarmConsumer (AddItem/Advise on subtags) │
│ │
│ FailoverAlarmConsumer owns the state machine: │
│ PrimaryActive ──(N consecutive wnwrap COM failures)──▶ Degraded │
│ Degraded ──(M consecutive clean wnwrap probe polls)──▶ Primary │
│ on each switch: snapshot the now-active provider, hand off │
└─────────────────────────────────────────────────────────────────┘
```
|
||||
|
||||
The failover state machine lives **worker-local** so the switch is instant — no
|
||||
IPC round-trip at the moment alarmmgr dies. The gateway *arms* the standby
|
||||
consumer up front (passes the watch-list at subscribe time) so it is ready
|
||||
before it is ever needed.
|
||||
|
||||
## Components
|
||||
|
||||
### Worker (`src/ZB.MOM.WW.MxGateway.Worker/MxAccess/`)
|
||||
|
||||
**`SubtagAlarmConsumer : IMxAccessAlarmConsumer` (new)** — the standby provider.
|
||||
|
||||
- On `Subscribe`, instead of wnwrap registration it `AddItem`/`Advise`s the
|
||||
configured subtags for each watch-list entry on the existing STA (reuses the
|
||||
worker's item-subscription machinery). Per attribute it advises at minimum
|
||||
`.active` and `.acked`; optionally `.priority`/severity, `.descr`, value/limit
|
||||
if present.
|
||||
- Converts each `OnDataChange` into the same `MxAlarmTransitionEvent` the wnwrap
|
||||
consumer emits, via the synthesis rules below, and raises
|
||||
`AlarmTransitionEmitted`. Marks each as **degraded**.
|
||||
- `SnapshotActiveAlarms()` returns the currently-active set computed from
|
||||
last-known subtag values.
|
||||
- `AcknowledgeByName(...)` resolves the watch-list entry's ack-comment subtag and
|
||||
issues a `Write(comment)` on the STA. `AcknowledgeByGuid(...)` maps the
|
||||
synthetic GUID (see below) back to a reference, then does the same. If the
|
||||
attribute exposes no writable ack-comment subtag, returns a failure code that
|
||||
the gateway surfaces as `FailedPrecondition`.
|
||||
- `PollOnce()` is a no-op (subtag mode is event-driven via Advise).
|
||||
|
||||
**`FailoverAlarmConsumer : IMxAccessAlarmConsumer` (new)** — composite + state
|
||||
machine. Owns the wnwrap consumer (primary) and the subtag consumer (standby),
|
||||
forwards `AlarmTransitionEmitted` from whichever child is active, and raises a
|
||||
new `ProviderModeChanged` event on every switch.
|
||||
|
||||
- **Failure counting:** wraps `Subscribe`/`PollOnce` on the primary; a thrown
|
||||
`COMException` or a failure HRESULT increments a consecutive-failure counter,
|
||||
reset to zero on any clean poll.
|
||||
- **Failover** (`PrimaryActive → Degraded`): at `ConsecutiveFailureThreshold`
|
||||
(default 3), ensures the standby is subscribed (it was armed at subscribe time), sets
|
||||
active = standby, snapshots the standby's active set for hand-off, and emits
|
||||
`ProviderModeChanged(SUBTAG, reason, hresult)`.
|
||||
- **Failback probe** (`Degraded → PrimaryActive`): while degraded, every
|
||||
`FailbackProbeIntervalSeconds` (default 30) it re-attempts wnwrap
|
||||
`Subscribe`+`PollOnce` on the STA. After `FailbackStableProbes` (default 3)
|
||||
consecutive clean polls it switches active = primary, returns the standby to
|
||||
standby, and emits `ProviderModeChanged(ALARMMGR, "recovered")`.
|
||||
- **Hand-off:** on every switch it takes `SnapshotActiveAlarms()` from the
|
||||
now-active provider so the gateway can reconcile and avoid spurious
|
||||
raise/clear storms.
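
The counting rules above can be sketched as a small state machine. This is an illustrative Python model, not the worker's C# `FailoverAlarmConsumer`: the class and method names are hypothetical, the probe timer (`FailbackProbeIntervalSeconds`) is left to the caller, and snapshot hand-off is omitted. The thresholds default to the values stated above.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    ALARMMGR = 1  # primary (wnwrap) is active
    SUBTAG = 2    # degraded: standby (subtag) is active


class FailoverStateMachine:
    def __init__(self, failure_threshold: int = 3, stable_probes: int = 3):
        self.failure_threshold = failure_threshold
        self.stable_probes = stable_probes
        self.mode = Mode.ALARMMGR
        self._failures = 0      # consecutive primary failures
        self._clean_probes = 0  # consecutive clean failback probes

    def record_primary_result(self, ok: bool) -> Optional[Mode]:
        """Feed one primary poll (or failback probe) result.

        Returns the new mode when a switch happens, otherwise None.
        """
        if self.mode is Mode.ALARMMGR:
            # Any clean poll resets the consecutive-failure counter.
            self._failures = 0 if ok else self._failures + 1
            if self._failures >= self.failure_threshold:
                self.mode = Mode.SUBTAG
                self._clean_probes = 0
                return self.mode
        else:
            # While degraded, a failed probe resets the stable-probe streak.
            self._clean_probes = self._clean_probes + 1 if ok else 0
            if self._clean_probes >= self.stable_probes:
                self.mode = Mode.ALARMMGR
                self._failures = 0
                return self.mode
        return None
```

Requiring consecutive results in both directions keeps a single transient COM error, or a single lucky probe, from flapping the provider.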
|
||||
|
||||
**`AlarmDispatcher` / `MxAccessAlarmEventSink` / `AlarmCommandHandler`
|
||||
(changed, minimal)** — `AlarmDispatcher` holds a `FailoverAlarmConsumer` instead
|
||||
of a bare `WnWrapAlarmConsumer`; it subscribes to `ProviderModeChanged` and
|
||||
enqueues a mode-changed worker event. The ack path routes by active mode (native
|
||||
wnwrap ack in alarmmgr mode; ack-comment write in subtag mode), but that routing
|
||||
is entirely inside the consumer — the dispatcher just calls
|
||||
`AcknowledgeByName`/`AcknowledgeByGuid`.
|
||||
|
||||
### Gateway (`src/ZB.MOM.WW.MxGateway.Server/`)
|
||||
|
||||
**Galaxy Repository discovery (new query)** — alongside the existing GR SQL
|
||||
browse RPCs, a query "attributes that have alarms configured, with their
|
||||
ack-comment subtag and area", scoped to the configured area. Merged with the
|
||||
config override (explicit includes/excludes). Produces the watch-list of
|
||||
`AlarmSubtagTarget`s.
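
One way the discovery/override merge could work, sketched over plain alarm references rather than full `AlarmSubtagTarget`s. The precedence (excludes win over both discovery and explicit includes), the case-insensitive comparison, and the function name are assumptions; the design above only states that discovery is merged with explicit includes/excludes.

```python
def merge_watch_list(
    discovered: list[str],
    include: list[str],
    exclude: list[str],
) -> list[str]:
    """Union of discovered and included references, minus excluded ones.

    Order is preserved (discovered first) and duplicates are dropped.
    """
    excluded = {ref.casefold() for ref in exclude}
    merged: dict[str, str] = {}
    for ref in [*discovered, *include]:
        key = ref.casefold()
        if key not in excluded and key not in merged:
            merged[key] = ref
    return list(merged.values())
```

With an empty discovery result (GR SQL unavailable), the watch-list degrades to the explicit includes alone.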
|
||||
|
||||
**`GatewayAlarmMonitor` (changed)** — resolves the watch-list at subscribe time
|
||||
and passes it to the worker; consumes `AlarmProviderModeChanged` and reflects
|
||||
the current provider mode into (a) the `AlarmFeedMessage` provider-status,
|
||||
(b) the `/hubs/alarms` dashboard hub, and (c) metrics; forces a reconcile
|
||||
(`QueryActiveAlarms`) on every switch. Re-runs discovery on its existing
|
||||
reconcile cadence and pushes an updated watch-list when the model changes.
|
||||
|
||||
**`AlarmsOptions` (extended)** — new `Fallback` sub-section (below).
|
||||
|
||||
### Contract (`src/ZB.MOM.WW.MxGateway.Contracts/Protos/`)
|
||||
|
||||
**`mxaccess_gateway.proto`:**
|
||||
|
||||
- `enum AlarmProviderMode { ALARM_PROVIDER_MODE_UNSPECIFIED = 0; ALARMMGR = 1; SUBTAG = 2; }`
|
||||
- New `AlarmFeedMessage` oneof case `AlarmProviderStatus provider_status`,
|
||||
carrying `{ AlarmProviderMode mode; bool degraded; string reason;
|
||||
google.protobuf.Timestamp since; }`. Emitted on stream open and on every
|
||||
change so a late-joining client immediately learns the mode.
|
||||
- Add `bool degraded` + `AlarmProviderMode source_provider` to
|
||||
`OnAlarmTransitionEvent` **and** `ActiveAlarmSnapshot`, so per-item provenance
|
||||
is visible even mid-stream. All additions are new field numbers — backward
|
||||
compatible; existing clients ignore them and keep seeing alarms.
|
||||
|
||||
**`mxaccess_worker.proto`:**
|
||||
|
||||
> **Superseded:** these additions shipped in `mxaccess_gateway.proto`, not
|
||||
> `mxaccess_worker.proto` — the worker imports the gateway proto and the alarm
|
||||
> commands/events live there (`AlarmSubtagTarget`,
|
||||
> `OnAlarmProviderModeChangedEvent`, the extended subscribe command).
|
||||
|
||||
- Extend the alarm-subscribe command with: `AlarmProviderMode forced_mode`
|
||||
(`UNSPECIFIED` = auto), `int32 consecutive_failure_threshold`,
|
||||
`int32 failback_probe_interval_seconds`, `int32 failback_stable_probes`, and
|
||||
`repeated AlarmSubtagTarget watch_list`, where `AlarmSubtagTarget =
|
||||
{ string alarm_full_reference; string source_object_reference;
|
||||
string active_subtag; string acked_subtag; string ack_comment_subtag;
|
||||
string priority_subtag; }`.
|
||||
- New worker→gateway event `AlarmProviderModeChanged { AlarmProviderMode mode;
|
||||
string reason; int32 hresult; google.protobuf.Timestamp at; }`.
|
||||
|
||||
> Generated code under `Generated/` and `clients/*/generated*/` is rebuilt from
|
||||
> these `.proto` files — never hand-edited. Every generated client touched by
|
||||
> the contract is rebuilt per the source-update workflow.
|
||||
|
||||
## Data flow
|
||||
|
||||
### Subtag synthesis rules
|
||||
|
||||
`SubtagAlarmConsumer` keeps last-known `(active, acked)` per watch-list entry and
|
||||
emits transitions on change:
|
||||
|
||||
| Subtag change | Emitted transition | Notes |
|---|---|---|
| `active` false → true | `RAISE` (state `UNACK_ALM`) | `original_raise_timestamp` = first-observed active time |
| `acked` false → true while `active` | `ACKNOWLEDGE` | `operator_user`/`operator_comment` from ack-comment subtag if advised |
| `active` true → false | `CLEAR` | maps to `AckRtn` if acked at clear, else `UnackRtn` |
| `active` stays true, re-alarm | `RETRIGGER` | **only** if a re-alarm counter subtag exists; otherwise not synthesized (documented limitation) |
|
||||
|
||||
Snapshot state mapping for `ActiveAlarmSnapshot.current_state`:
|
||||
`active && !acked → ACTIVE`, `active && acked → ACTIVE_ACKED`,
|
||||
`!active → INACTIVE`.
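
The transition table and snapshot mapping above can be expressed as two pure functions. This is an illustrative Python sketch with hypothetical function names; it covers `RAISE`, `ACKNOWLEDGE`, and `CLEAR` only, and leaves out `RETRIGGER` because that depends on a re-alarm counter subtag that may not exist.

```python
from typing import Optional


def synthesize_transition(
    previous: tuple[bool, bool],
    current: tuple[bool, bool],
) -> Optional[str]:
    """Map a change in (active, acked) to a transition name, or None."""
    was_active, was_acked = previous
    is_active, is_acked = current
    if not was_active and is_active:
        return "RAISE"
    if was_active and not is_active:
        return "CLEAR"
    # An ack only counts while the alarm is active.
    if is_active and not was_acked and is_acked:
        return "ACKNOWLEDGE"
    return None


def snapshot_state(active: bool, acked: bool) -> str:
    """State reported in ActiveAlarmSnapshot.current_state."""
    if not active:
        return "INACTIVE"
    return "ACTIVE_ACKED" if acked else "ACTIVE"
```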
|
||||
|
||||
Field degradation in subtag mode:
|
||||
- `alarm_full_reference` — from the watch-list entry (stable, drives ack-by-ref).
|
||||
- Synthetic, deterministic GUID derived by hashing `alarm_full_reference` so
|
||||
GUID-based ack still resolves; flagged `degraded = true`.
|
||||
- `severity` — from the priority subtag if advised, else 0.
|
||||
- `original_raise_timestamp` — first-observed active time (best effort).
|
||||
- `transition_timestamp` — the `OnDataChange` timestamp.
|
||||
- `category`/`description`/`current_value`/`limit_value` — populated only if the
|
||||
corresponding subtag is advised; otherwise empty.
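
A sketch of a deterministic synthetic GUID and the reverse lookup that GUID-based ack needs. The design above says only that the GUID is derived by hashing `alarm_full_reference`; the use of a name-based (version 5, SHA-1) UUID and the namespace string are assumptions for illustration.

```python
import uuid
from typing import Iterable

# Hypothetical namespace; any fixed UUID works as long as it never changes.
_ALARM_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "urn:example:mxgateway:alarm-subtag")


def synthetic_alarm_guid(alarm_full_reference: str) -> uuid.UUID:
    """Same reference in, same GUID out, across restarts and processes."""
    return uuid.uuid5(_ALARM_NAMESPACE, alarm_full_reference)


def build_guid_index(references: Iterable[str]) -> dict[uuid.UUID, str]:
    """Reverse map used to resolve an ack-by-GUID back to a watch-list reference."""
    return {synthetic_alarm_guid(ref): ref for ref in references}
```

Because the GUID is a pure function of the reference, the index can be rebuilt whenever the watch-list changes without persisting anything.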
|
||||
|
||||
### Acknowledge
|
||||
|
||||
`AcknowledgeAlarm`/`AcknowledgeAlarmByName` are unchanged at the RPC surface.
|
||||
`AlarmDispatcher` only calls `AcknowledgeByName`/`AcknowledgeByGuid`; the consumer routes by active provider mode:
|
||||
- **alarmmgr mode:** native wnwrap `AlarmAckByName`/`AlarmAckByGUID` (unchanged).
|
||||
- **subtag mode:** resolve the target's `ack_comment_subtag`, `Write` the
|
||||
operator comment via the existing worker write path on the STA. No writable
|
||||
ack-comment subtag → `FailedPrecondition`.
|
||||
|
||||
### Provider-mode reflection
|
||||
|
||||
Worker `AlarmProviderModeChanged` → `GatewayAlarmMonitor` → (a) emit/refresh
|
||||
`AlarmFeedMessage.provider_status` to every `StreamAlarms` subscriber, (b) push
|
||||
to `/hubs/alarms`, (c) update metrics, (d) force a reconcile.
|
||||
|
||||
## Error handling
|
||||
|
||||
- **Both providers down** (subtag advise also failing): the monitor stays
|
||||
faulted and keeps retrying both; acknowledge returns `Unavailable`. No silent
|
||||
data loss — the feed reports degraded with reason.
|
||||
- **Empty watch-list in subtag mode** (GR SQL unavailable, no config override):
|
||||
log + metric `alarm_fallback_watchlist_empty`; the feed reports degraded +
|
||||
empty; the gateway keeps re-running discovery on its reconcile cadence and
|
||||
pushes an updated watch-list when one becomes available.
|
||||
- **Switch hand-off:** every switch snapshots the now-active provider and
|
||||
reconciles against the gateway cache to avoid a raise/clear storm.
|
||||
- **STA affinity:** all subtag advise/write and wnwrap probe calls run on the
|
||||
worker STA (reuse the existing affinity guard) to satisfy
|
||||
`ThreadingModel=Apartment`.
|
||||
|
||||
### Metrics
|
||||
|
||||
- `mxgateway_alarm_provider_mode` (gauge: 1 = alarmmgr, 2 = subtag)
|
||||
- `mxgateway_alarm_provider_switch_total{from,to,reason}` (counter)
|
||||
- `mxgateway_alarm_fallback_watchlist_size` (gauge)
|
||||
|
||||
> **Superseded:** the shipped meter names are `mxgateway.alarms.provider_mode`
|
||||
> (gauge) and `mxgateway.alarms.provider_switches{from,to,reason}` (counter,
|
||||
> `reason` bounded to `failover`/`failback`/`unknown`). The watch-list-size /
|
||||
> watch-list-empty gauges were not implemented; an empty watch-list is surfaced
|
||||
> via a warning log and the feed's degraded `ProviderStatus` instead.
|
||||
|
||||
## Configuration
|
||||
|
||||
```jsonc
"MxGateway": {
  "Alarms": {
    "Enabled": true,
    "SubscriptionExpression": "\\\\DESKTOP-6JL3KKO\\Galaxy!DEV",
    "DefaultArea": "DEV",
    "ReconcileIntervalSeconds": 30,
    "Fallback": {
      "Mode": "Auto", // Auto | ForceAlarmManager | ForceSubtag
      "ConsecutiveFailureThreshold": 3,
      "FailbackProbeIntervalSeconds": 30,
      "FailbackStableProbes": 3,
      "Discovery": {
        "UseGalaxyRepository": true,
        "Area": "", // defaults to Alarms.DefaultArea
        "IncludeAttributes": [], // explicit additions
        "ExcludeAttributes": []
      },
      "Subtags": {
        "Active": "active",
        "Acked": "acked",
        "AckComment": "", // verified against MXAccess analysis
        "Priority": "priority"
      }
    }
  }
}
```
|
||||
|
||||
`GatewayOptionsValidator` additions: `Mode = ForceSubtag` with empty discovery
|
||||
result and no explicit `IncludeAttributes` → startup validation warning;
|
||||
threshold/interval/probe values floored at sane minimums.
|
||||
|
||||
## Open item to confirm during implementation
|
||||
|
||||
The exact AVEVA subtag names (`.active`, `.acked`, the ack-comment attribute,
|
||||
priority) must be confirmed against the MXAccess analysis project
|
||||
(`C:\Users\dohertj2\Desktop\mxaccess`, `docs/MXAccess-Public-API.md`) and the
|
||||
live Galaxy before wiring `SubtagAlarmConsumer`. The config `Subtags` block
|
||||
exists precisely so the resolved names are not hard-coded.
|
||||
|
||||
## Testing
|
||||
|
||||
| Layer | Tests |
|---|---|
| Worker unit (`MxGateway.Worker.Tests`, x86) | `SubtagAlarmConsumer` synthesis — feed `OnDataChange` sequences, assert raise/ack/clear transitions, snapshot states, degraded flag, synthetic-GUID stability, ack-comment write routing |
| Worker unit | `FailoverAlarmConsumer` state machine — fake wnwrap throwing after K polls: assert switch at threshold, failback after stable probes, `ProviderModeChanged` emitted, no duplicate transitions across switch (hand-off reconcile) |
| Gateway unit (`MxGateway.Tests`, fake worker) | discovery + config-override merge; `GatewayAlarmMonitor` reflects mode into feed + hub; metrics increment on switch |
| Contract | proto round-trip for new fields; existing alarm tests unchanged (alarmmgr-mode regression — parity preserved) |
| Live (opt-in, `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1`) | real subtag advise + ack-comment write against a live alarm; GR SQL discovery query against the `ZB` DB (gated like existing GR tests) |
|
||||
|
||||
## Docs to update in the same change
|
||||
|
||||
`gateway.md` (alarm provider section), `docs/DesignDecisions.md` (record the
|
||||
fallback decision), `docs/GatewayConfiguration.md` (the `Fallback` block),
|
||||
`docs/AlarmClientDiscovery.md` (subtag provider + synthesis rules),
|
||||
`docs/Grpc.md` (the new `provider_status` / `degraded` fields), and any client
|
||||
READMEs whose generated alarm types gain fields.
|
||||
Some files were not shown because too many files have changed in this diff.