rename: prefix gateway projects/namespaces with ZB.MOM.WW + sln→slnx

Apply the ZB.MOM.WW. prefix to all gateway-side projects, folders,
.csproj/.sln contents, C# namespaces, using directives, generated proto
C# (csharp_namespace + checked-in generated files), InternalsVisibleTo
attributes, project-name string literals (LoadProject, .sln lookups,
worker exe paths, staticwebassets manifest), and the install/script/doc
references that point at any of the above. Migrate the solution from
.sln to .slnx via `dotnet sln migrate` and delete the old file.

External-runtime identifiers are intentionally NOT prefixed so external
configuration keeps working:
- GatewayMetrics.cs MeterName ("MxGateway.Server")
- DashboardAuthenticationDefaults Scheme/Policy ("MxGateway.Dashboard")
- GatewayRequestLoggingMiddleware logger category ("MxGateway.Request")
- StaRuntime thread name ("MxGateway.Worker.STA")
- appsettings.json root section "MxGateway" + env-var prefix
  MxGateway__... and secret-name MxGateway:ApiKeyPepper
- C:\ProgramData\MxGateway\ data dir paths

Also fixes two tests that were not rename-related but became visible
while validating the rename:
- WorkerLiveMxAccessSmokeTests.ShutDownAsync: cancellation that the
  gateway service correctly maps to RpcException(Cancelled) per gRPC
  convention was being misclassified as a stream fault. Added a sibling
  catch on RpcException with StatusCode.Cancelled.
- IntegrationTestEnvironment.ResolveRepositoryRoot: extracted IsRepositoryRoot
  and made it accept either a .git marker OR a .sln/.slnx next to src/
  so the worker-exe walker works in non-git working copies.

clients/proto/proto-inputs.json's protoRoot updated to point at
src/ZB.MOM.WW.MxGateway.Contracts/Protos.

Verified by `dotnet build` and a full `dotnet test` of the .slnx with
MXGATEWAY_RUN_LIVE_{MXACCESS,LDAP,GALAXY}_TESTS=1:
  Tests: 472/472 pass
  Worker.Tests: 280/280 pass (4 dev-rig [Fact(Skip=...)] skipped)
  IntegrationTests: 18/18 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -1,7 +1,7 @@
|
||||
# aaAlarmManagedClient discovery — public surface, 2026-05-01
|
||||
|
||||
Result of running
|
||||
`MxGateway.Worker.Tests.AlarmClientDiscoveryTests.DumpAlarmClientPublicSurface`
|
||||
`ZB.MOM.WW.MxGateway.Worker.Tests.AlarmClientDiscoveryTests.DumpAlarmClientPublicSurface`
|
||||
against the deployed AVEVA assembly:
|
||||
|
||||
- File:
|
||||
@@ -68,7 +68,7 @@ list.
|
||||
## What this means
|
||||
|
||||
The architecture comment on
|
||||
`src/MxGateway.Worker/MxAccess/AlarmClientConsumer.cs` (PR A.5) is
|
||||
`src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmClientConsumer.cs` (PR A.5) is
|
||||
**wrong against this deployed assembly**:
|
||||
|
||||
> "The AVEVA alarm-manager surface (`IAlarmMgrDataProvider`) exposes
|
||||
@@ -89,7 +89,7 @@ never gets invoked at runtime. Until A.2 lands a WM_APP pump,
|
||||
|
||||
## Live runtime probe — 2026-05-01
|
||||
|
||||
`MxGateway.Worker.Tests.AlarmClientWmProbeTests.ProbeAlarmClientWmMessages`
|
||||
`ZB.MOM.WW.MxGateway.Worker.Tests.AlarmClientWmProbeTests.ProbeAlarmClientWmMessages`
|
||||
is a Skip-gated runtime probe that creates a real message-only
|
||||
window, calls `AlarmClient.RegisterConsumer(hWnd, …)` +
|
||||
`Subscribe(@"\Galaxy!", …)`, and pumps for 20s while logging every
|
||||
@@ -505,7 +505,7 @@ Interop.WNWRAPCONSUMERLib.dll`). The COM class is registered in
|
||||
Apartment` — `new wwAlarmConsumerClass()` succeeds via
|
||||
`CoCreateInstance`.
|
||||
|
||||
The probe `MxGateway.Worker.Tests/WnWrapConsumerProbeTests.cs`
|
||||
The probe `ZB.MOM.WW.MxGateway.Worker.Tests/WnWrapConsumerProbeTests.cs`
|
||||
(Skip-gated, archival) drove the captured run. Lifecycle:
|
||||
|
||||
1. `new wwAlarmConsumerClass()` — instantiated.
|
||||
@@ -622,7 +622,7 @@ Replacing `aaAlarmManagedClient.AlarmClient` with
|
||||
alarm-consumer surface unblocks A.2 fully. Outline:
|
||||
|
||||
1. **Reference path:** drop `aaAlarmManagedClient.dll` reference
|
||||
from `MxGateway.Worker.csproj`; add `Interop.WNWRAPCONSUMERLib.dll`
|
||||
from `ZB.MOM.WW.MxGateway.Worker.csproj`; add `Interop.WNWRAPCONSUMERLib.dll`
|
||||
reference from `mxaccessgw/lib/`. (Or commit the interop dll
|
||||
in-tree under `lib/` and reference relatively.)
|
||||
2. **`AlarmClientConsumer` → `WnWrapAlarmConsumer`:** rewrite
|
||||
|
||||
+16
-18
@@ -107,29 +107,20 @@ The gateway keeps API key state in a dedicated SQLite database. SQLite is suffic
|
||||
|
||||
### Connection factory
|
||||
|
||||
`AuthSqliteConnectionFactory` reads `GatewayOptions.Authentication.SqlitePath`, ensures the parent directory exists, and opens the connection in `ReadWriteCreate` mode so first-run installations can create the file without manual provisioning:
|
||||
`AuthSqliteConnectionFactory` reads `GatewayOptions.Authentication.SqlitePath`, ensures the parent directory exists, and builds a connection string in `ReadWriteCreate` mode so first-run installations can create the file without manual provisioning. Connection pooling is enabled and the connection string carries a non-zero `DefaultTimeout`:
|
||||
|
||||
```csharp
|
||||
public SqliteConnection CreateConnection()
|
||||
SqliteConnectionStringBuilder builder = new()
|
||||
{
|
||||
string sqlitePath = options.Value.Authentication.SqlitePath;
|
||||
string? directory = Path.GetDirectoryName(sqlitePath);
|
||||
|
||||
if (!string.IsNullOrWhiteSpace(directory))
|
||||
{
|
||||
Directory.CreateDirectory(directory);
|
||||
}
|
||||
|
||||
SqliteConnectionStringBuilder builder = new()
|
||||
{
|
||||
DataSource = sqlitePath,
|
||||
Mode = SqliteOpenMode.ReadWriteCreate
|
||||
};
|
||||
|
||||
return new SqliteConnection(builder.ToString());
|
||||
}
|
||||
DataSource = sqlitePath,
|
||||
Mode = SqliteOpenMode.ReadWriteCreate,
|
||||
Pooling = true,
|
||||
DefaultTimeout = (int)BusyTimeout.TotalSeconds,
|
||||
};
|
||||
```
|
||||
|
||||
Every store opens its connection through `OpenConnectionAsync`, which opens the connection and then applies `PRAGMA journal_mode=WAL` and `PRAGMA busy_timeout`. WAL is a persistent database-level setting so re-applying it per connection is a cheap no-op; `busy_timeout` is per-connection state. Because `MarkKeyUsedAsync` runs on every authenticated request and `SqliteApiKeyAuditStore` appends on every denial, this lets concurrent readers and writers retry briefly instead of surfacing `SQLITE_BUSY` as a hard failure on the request path.
|
||||
|
||||
### Schema
|
||||
|
||||
`SqliteAuthSchema` declares table names and the current schema version as constants. Three tables are involved:
|
||||
@@ -166,6 +157,8 @@ public static ApiKeyRecord Read(SqliteDataReader reader)
|
||||
|
||||
`SqliteApiKeyAdminStore` (`IApiKeyAdminStore`) implements administrative mutations: `CreateAsync` accepts an `ApiKeyCreateRequest`, `RevokeAsync` sets `revoked_utc` only when not already revoked, and `RotateAsync` replaces `secret_hash`, clears `last_used_utc`, and clears `revoked_utc` so a rotated key is immediately usable.
|
||||
|
||||
Because `RotateAsync` clears `revoked_utc`, rotating a previously revoked key reactivates it. The dashboard API Keys page therefore offers the Rotate (and Revoke) action only for keys whose status is `Active`; a revoked key shows no actions, so an operator cannot un-revoke a deliberately disabled key as a side effect of a rotation.
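
The gating rule can be sketched as a small pure function. This is a minimal illustration, not the dashboard's actual code: `ApiKeyStatus`, `ApiKeyAction`, and `ApiKeyActionPolicy` are hypothetical names, and `Active` is the only status the text above names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical status model; the documentation only names the `Active` status.
public enum ApiKeyStatus { Active, Revoked }

public enum ApiKeyAction { Rotate, Revoke }

public static class ApiKeyActionPolicy
{
    private static readonly IReadOnlyList<ApiKeyAction> ActiveActions =
        new[] { ApiKeyAction.Rotate, ApiKeyAction.Revoke };

    // Only active keys expose actions. Every other status yields an empty list,
    // so a rotation can never reactivate a deliberately revoked key.
    public static IReadOnlyList<ApiKeyAction> AvailableFor(ApiKeyStatus status) =>
        status == ApiKeyStatus.Active ? ActiveActions : Array.Empty<ApiKeyAction>();
}
```

Keeping the rule in one function would let the page render its action buttons from the returned list instead of repeating the status check per button.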
|
||||
|
||||
### Audit trail
|
||||
|
||||
`SqliteApiKeyAuditStore` (`IApiKeyAuditStore`) appends `ApiKeyAuditEntry` values to the `api_key_audit` table and stamps each row with a UTC timestamp inside the store rather than trusting the caller. `ListRecentAsync` returns the most recent rows ordered by `audit_id` descending and projects them into `ApiKeyAuditRecord`. Rows are kept even after the referenced key is revoked because the audit history is the durable record of administrative action; the `key_id` column is nullable to accommodate non-key-scoped events such as `init-db`.
|
||||
@@ -223,6 +216,10 @@ constraints remain fully unconstrained after migration.
|
||||
|
||||
Key ids are restricted by the parser to ASCII letters, digits, periods, and hyphens so they remain safe to embed in the token format and in URL paths used by administrative tooling.
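
A character check of this kind might look like the sketch below. `ApiKeyIdRules` is a hypothetical helper, not the parser's real type, and rejecting empty ids is an assumption of the sketch rather than something the text states.

```csharp
using System;

public static class ApiKeyIdRules
{
    // Returns true when every character is an ASCII letter, digit, period, or hyphen.
    // Explicit ranges are used because char.IsLetterOrDigit would also accept non-ASCII letters.
    public static bool IsValid(string keyId)
    {
        if (string.IsNullOrEmpty(keyId))
        {
            return false; // assumption: an empty id is never valid
        }

        foreach (char c in keyId)
        {
            bool allowed =
                (c >= 'a' && c <= 'z') ||
                (c >= 'A' && c <= 'Z') ||
                (c >= '0' && c <= '9') ||
                c == '.' || c == '-';

            if (!allowed)
            {
                return false;
            }
        }

        return true;
    }
}
```

For example, `svc-01.prod` passes, while `bad/id` fails because `/` would change the meaning of a URL path.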
|
||||
|
||||
The CLI is not the only management surface: the dashboard API Keys page
|
||||
creates, rotates, and revokes keys through the same `IApiKeyAdminStore`. See
|
||||
[Gateway Dashboard Design](./GatewayDashboardDesign.md#api-keys-page).
|
||||
|
||||
## Scope Serialization
|
||||
|
||||
Scopes are persisted as a single TEXT column rather than a join table because the set is small, never queried by membership at the database level, and changes atomically with the owning row. `ApiKeyScopeSerializer.Serialize` writes a JSON array sorted with `StringComparer.Ordinal` so equivalent scope sets produce byte-identical column values, which makes audit diffing and database comparisons deterministic:
|
||||
@@ -276,4 +273,5 @@ Singletons are safe because each operation opens its own short-lived `SqliteConn
|
||||
|
||||
- [Gateway Configuration](./GatewayConfiguration.md)
|
||||
- [Authorization](./Authorization.md)
|
||||
- [Gateway Dashboard Design](./GatewayDashboardDesign.md)
|
||||
- [Diagnostics](./Diagnostics.md)
|
||||
|
||||
+37
-13
@@ -8,7 +8,7 @@ what an authenticated API key can browse, read, or write inside the Galaxy.
|
||||
|
||||
Authorization runs as a single gRPC server interceptor registered for every call on the gateway. It pulls the authenticated identity for the current request, derives the scope that the request type requires, and either lets the call continue or fails the call with a gRPC status. The pipeline keeps service classes free of cross-cutting checks, which matches the `gateway.md` "thin gRPC layer" rule that service handlers translate between contracts and domain code without owning policy.
|
||||
|
||||
The participating types live under `src/MxGateway.Server/Security/Authorization/`:
|
||||
The participating types live under `src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/`:
|
||||
|
||||
- `GatewayGrpcAuthorizationInterceptor` runs the authenticate-then-authorize pipeline for unary and server-streaming calls.
|
||||
- `GatewayGrpcScopeResolver` maps a request message (and, for `MxCommandRequest`, the inner `MxCommandKind`) to the scope string that must be present on the caller.
|
||||
@@ -102,12 +102,18 @@ public string ResolveRequiredScope(object request)
|
||||
CloseSessionRequest => GatewayScopes.SessionClose,
|
||||
StreamEventsRequest => GatewayScopes.EventsRead,
|
||||
MxCommandRequest commandRequest => ResolveCommandScope(commandRequest.Command?.Kind ?? MxCommandKind.Unspecified),
|
||||
AcknowledgeAlarmRequest => GatewayScopes.InvokeWrite,
|
||||
StreamAlarmsRequest => GatewayScopes.EventsRead,
|
||||
TestConnectionRequest or
|
||||
GetLastDeployTimeRequest or
|
||||
DiscoverHierarchyRequest or
|
||||
WatchDeployEventsRequest => GatewayScopes.MetadataRead,
|
||||
_ => GatewayScopes.Admin
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
The `_ => GatewayScopes.Admin` fallback is intentional: any future request type that the resolver does not recognize fails closed, requiring the strongest scope until the resolver is updated.
|
||||
The `_ => GatewayScopes.Admin` fallback is intentional: any future request type that the resolver does not recognize fails closed, requiring the strongest scope until the resolver is updated. `AcknowledgeAlarm` is treated as a write — it mutates alarm state, mirroring `MxCommandKind.Write*` — and `StreamAlarms` shares the alarm/event surface with `StreamEvents` and `MxCommandKind.DrainEvents`, so it carries `events:read`. Both alarm RPCs are session-less: the scope check is the only authorization gate, since there is no per-session ownership to enforce.
|
||||
|
||||
`MxCommandRequest` is special because it multiplexes many MxAccess operations through a single RPC. The resolver inspects the embedded `MxCommandKind` so each operation gets its own scope:
|
||||
|
||||
@@ -117,10 +123,14 @@ private static string ResolveCommandScope(MxCommandKind kind)
|
||||
return kind switch
|
||||
{
|
||||
MxCommandKind.Write or
|
||||
MxCommandKind.Write2 => GatewayScopes.InvokeWrite,
|
||||
MxCommandKind.Write2 or
|
||||
MxCommandKind.WriteBulk or
|
||||
MxCommandKind.Write2Bulk => GatewayScopes.InvokeWrite,
|
||||
|
||||
MxCommandKind.WriteSecured or
|
||||
MxCommandKind.WriteSecured2 or
|
||||
MxCommandKind.WriteSecuredBulk or
|
||||
MxCommandKind.WriteSecured2Bulk or
|
||||
MxCommandKind.AuthenticateUser => GatewayScopes.InvokeSecure,
|
||||
|
||||
MxCommandKind.ArchestraUserToId or
|
||||
@@ -135,7 +145,7 @@ private static string ResolveCommandScope(MxCommandKind kind)
|
||||
}
|
||||
```
|
||||
|
||||
Reads (`Register`, `AddItem`, `Advise`, and any other unspecified kind) fall through to `InvokeRead`, which keeps the matrix small while still separating reads from writes, secured writes, metadata lookups, event drains, and worker shutdown.
|
||||
Reads (`Register`, `AddItem`, `Advise`, `ReadBulk`, and any other unspecified kind) fall through to `InvokeRead`, which keeps the matrix small while still separating reads from writes, secured writes, metadata lookups, event drains, and worker shutdown. The four bulk-write families (`WriteBulk`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk`) are mapped explicitly so a missing arm cannot silently demote a bulk write to a read scope.
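
One way to keep a missing arm from going unnoticed is a guard that walks every command kind and flags any write-named kind that resolves to the read scope. The sketch below is an assumption-laden illustration: `CommandKind` is a trimmed stand-in for `MxCommandKind`, `ScopeGuard` is a hypothetical type, and only the scope strings come from the reference table in this document.

```csharp
using System;
using System.Collections.Generic;

// Trimmed, hypothetical stand-in for MxCommandKind.
public enum CommandKind
{
    Unspecified, Register, AddItem, Advise, ReadBulk,
    Write, Write2, WriteBulk, Write2Bulk,
    WriteSecured, WriteSecured2, WriteSecuredBulk, WriteSecured2Bulk,
}

public static class ScopeGuard
{
    public const string Read = "invoke:read";
    public const string Write = "invoke:write";
    public const string Secure = "invoke:secure";

    // Mirrors the shape of the resolver shown above for the trimmed enum.
    public static string Resolve(CommandKind kind) => kind switch
    {
        CommandKind.Write or CommandKind.Write2 or
        CommandKind.WriteBulk or CommandKind.Write2Bulk => Write,

        CommandKind.WriteSecured or CommandKind.WriteSecured2 or
        CommandKind.WriteSecuredBulk or CommandKind.WriteSecured2Bulk => Secure,

        _ => Read, // every other kind falls through to the read scope
    };

    // Returns each kind whose name starts with "Write" but which the resolver maps to the read scope.
    public static List<CommandKind> FindDemotedWrites(Func<CommandKind, string> resolve)
    {
        var demoted = new List<CommandKind>();
        foreach (CommandKind kind in Enum.GetValues<CommandKind>())
        {
            bool looksLikeWrite = kind.ToString().StartsWith("Write", StringComparison.Ordinal);
            if (looksLikeWrite && resolve(kind) == Read)
            {
                demoted.Add(kind);
            }
        }

        return demoted;
    }
}
```

A unit test could assert that `FindDemotedWrites(ScopeGuard.Resolve)` is empty, which would fail as soon as a new write-named kind is added to the enum without a matching arm. The name-prefix heuristic is a convenience of this sketch; a real guard might use an explicit list instead.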
|
||||
|
||||
## Constraint Enforcement
|
||||
|
||||
@@ -161,12 +171,25 @@ Glob matching is anchored, case-insensitive, and supports `*` and `?`.
|
||||
Subtree and tag glob lists are alternatives: matching either list allows that
|
||||
scope dimension. Empty lists mean unconstrained for that dimension.
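
The glob rules above can be approximated by translating each glob into an anchored regular expression. This is a sketch under stated assumptions, not the gateway's matcher: `GlobSketch` is a hypothetical type, `*` is assumed to match any run of characters (including none and including separators), and subtree matching is not modeled because its rules are not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class GlobSketch
{
    // Anchored, case-insensitive match. `*` becomes ".*" and `?` becomes "."; every
    // other character is escaped so that regex metacharacters in tag names stay literal.
    public static bool IsMatch(string glob, string input)
    {
        string body = Regex.Escape(glob).Replace(@"\*", ".*").Replace(@"\?", ".");
        string pattern = @"\A" + body + @"\z";
        return Regex.IsMatch(
            input,
            pattern,
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Singleline);
    }

    // An empty list means the dimension is unconstrained; otherwise any one glob may match.
    public static bool IsAllowed(IReadOnlyCollection<string> globs, string tagAddress) =>
        globs.Count == 0 || globs.Any(glob => IsMatch(glob, tagAddress));
}
```

Because the pattern is anchored at both ends, `Tank*` matches `Tank01.PV` but not `MyTank1`, and `Tank?.PV` does not match `Tank01.PV` since `?` consumes exactly one character.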
|
||||
|
||||
Constraints are set when a key is created — through the `apikey create-key`
|
||||
flags (see [Authentication](./Authentication.md)) or the dashboard API Keys
|
||||
page create dialog (see
|
||||
[Gateway Dashboard Design](./GatewayDashboardDesign.md#api-keys-page)). The
|
||||
dashboard API Keys page also renders each key's effective constraints.
|
||||
|
||||
The service checks read constraints for `AddItem`, `AddItem2`, `AddItemBulk`,
|
||||
`SubscribeBulk`, and `AdviseItemBulk`. It checks write constraints for
|
||||
`Write`, `Write2`, `WriteSecured`, and `WriteSecured2`. Successful item
|
||||
registrations are tracked per session so later item-handle commands resolve
|
||||
back to the original tag address. If a constrained key presents an unknown item
|
||||
handle, the gateway fails closed.
|
||||
`SubscribeBulk`, `AdviseItemBulk`, and `ReadBulk`. It checks write constraints
|
||||
for `Write`, `Write2`, `WriteSecured`, `WriteSecured2`, `WriteBulk`,
|
||||
`Write2Bulk`, `WriteSecuredBulk`, and `WriteSecured2Bulk`. Bulk commands run
|
||||
through `BulkConstraintPlan` (`ReadBulkConstraintPlan`,
|
||||
`WriteBulkConstraintPlan`, `SubscribeBulkConstraintPlan`), which preserves the
|
||||
caller's input order: each entry is evaluated against the constraint surface,
|
||||
and `BulkConstraintPlan.MergeDeniedInto` re-merges denied entries back into
|
||||
their original index positions so the reply slot at `entries[i]` always
|
||||
corresponds to the request slot at `entries[i]`. Successful item registrations
|
||||
are tracked per session so later item-handle commands resolve back to the
|
||||
original tag address. If a constrained key presents an unknown item handle,
|
||||
the gateway fails closed.
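
The index-preserving merge can be pictured with the sketch below. It is not the real `BulkConstraintPlan`: `BulkMergeSketch<TResult>` and its members are hypothetical, and the sketch assumes that only allowed entries are forwarded and that their results come back in the order they were forwarded.

```csharp
using System;
using System.Collections.Generic;

public sealed class BulkMergeSketch<TResult>
{
    private readonly List<int> _allowedIndexes = new();
    private readonly Dictionary<int, TResult> _deniedResults = new();
    private int _count;

    // Call once per request entry, in input order.
    public void AddAllowed() => _allowedIndexes.Add(_count++);

    public void AddDenied(TResult deniedResult) => _deniedResults[_count++] = deniedResult;

    // Rebuilds a reply of the original length: allowed results go back to the index they
    // came from, and denied results fill the remaining slots.
    public TResult[] Merge(IReadOnlyList<TResult> allowedResults)
    {
        if (allowedResults.Count != _allowedIndexes.Count)
        {
            throw new ArgumentException("One result is required per allowed entry.", nameof(allowedResults));
        }

        var merged = new TResult[_count];
        for (int i = 0; i < _allowedIndexes.Count; i++)
        {
            merged[_allowedIndexes[i]] = allowedResults[i];
        }

        foreach (KeyValuePair<int, TResult> denied in _deniedResults)
        {
            merged[denied.Key] = denied.Value;
        }

        return merged;
    }
}
```

With three entries where the second is denied, merging two allowed results yields a three-slot reply whose middle slot holds the denial, so a caller can still pair reply `i` with request `i`.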
|
||||
|
||||
Non-bulk constraint failures return gRPC `PermissionDenied`. Bulk read
|
||||
commands preserve input order and return a failed `SubscribeResult` for each
|
||||
@@ -182,10 +205,10 @@ blocking constraint; secured values and raw credentials are never logged.
|
||||
|----------|-------|--------------|
|
||||
| `SessionOpen` | `session:open` | `OpenSessionRequest` |
|
||||
| `SessionClose` | `session:close` | `CloseSessionRequest` |
|
||||
| `EventsRead` | `events:read` | `StreamEventsRequest`, `MxCommandKind.DrainEvents` |
|
||||
| `InvokeRead` | `invoke:read` | `MxCommandRequest` for read-style command kinds (`Register`, `AddItem`, `Advise`, and any kind not otherwise mapped) |
|
||||
| `InvokeWrite` | `invoke:write` | `MxCommandKind.Write`, `MxCommandKind.Write2` |
|
||||
| `InvokeSecure` | `invoke:secure` | `MxCommandKind.WriteSecured`, `MxCommandKind.WriteSecured2`, `MxCommandKind.AuthenticateUser` |
|
||||
| `EventsRead` | `events:read` | `StreamEventsRequest`, `StreamAlarmsRequest`, `MxCommandKind.DrainEvents` |
|
||||
| `InvokeRead` | `invoke:read` | `MxCommandRequest` for read-style command kinds (`Register`, `AddItem`, `Advise`, `ReadBulk`, and any kind not otherwise mapped) |
|
||||
| `InvokeWrite` | `invoke:write` | `AcknowledgeAlarmRequest`, `MxCommandKind.Write`, `MxCommandKind.Write2`, `MxCommandKind.WriteBulk`, `MxCommandKind.Write2Bulk` |
|
||||
| `InvokeSecure` | `invoke:secure` | `MxCommandKind.WriteSecured`, `MxCommandKind.WriteSecured2`, `MxCommandKind.WriteSecuredBulk`, `MxCommandKind.WriteSecured2Bulk`, `MxCommandKind.AuthenticateUser` |
|
||||
| `MetadataRead` | `metadata:read` | `MxCommandKind.ArchestraUserToId`, `MxCommandKind.GetSessionState`, `MxCommandKind.GetWorkerInfo`, `GalaxyRepository.TestConnection`, `GalaxyRepository.GetLastDeployTime`, `GalaxyRepository.DiscoverHierarchy`, `GalaxyRepository.WatchDeployEvents` |
|
||||
| `Admin` | `admin` | `MxCommandKind.ShutdownWorker`, the default for any unrecognized request type, and the dashboard authorization policy |
|
||||
|
||||
@@ -252,6 +275,7 @@ Singleton lifetimes are appropriate because none of the three classes hold per-r
|
||||
## Related Documentation
|
||||
|
||||
- [Authentication](./Authentication.md)
|
||||
- [Gateway Dashboard Design](./GatewayDashboardDesign.md)
|
||||
- [Grpc](./Grpc.md)
|
||||
- [GatewayConfiguration](./GatewayConfiguration.md)
|
||||
- [Galaxy Repository Browse](./GalaxyRepository.md)
|
||||
|
||||
@@ -398,7 +398,7 @@ README.md
|
||||
examples/
|
||||
```
|
||||
|
||||
Generated code should be reproducible from `src/MxGateway.Contracts/Protos/`.
|
||||
Generated code should be reproducible from `src/ZB.MOM.WW.MxGateway.Contracts/Protos/`.
|
||||
Do not hand-edit generated code.
|
||||
|
||||
The stable client proto manifest defines the generated-code directories:
|
||||
|
||||
+10
-10
@@ -8,7 +8,7 @@ in [Toolchain Links](./ToolchainLinks.md) when a command is missing from
|
||||
## Shared Inputs
|
||||
|
||||
All clients generate bindings from the shared protobuf files under
|
||||
`src/MxGateway.Contracts/Protos`. Regenerate the published client descriptor
|
||||
`src/ZB.MOM.WW.MxGateway.Contracts/Protos`. Regenerate the published client descriptor
|
||||
after changing either `.proto` file or `clients/proto/proto-inputs.json`:
|
||||
|
||||
```powershell
|
||||
@@ -35,37 +35,37 @@ machine boundary or uses a production certificate.
|
||||
## .NET
|
||||
|
||||
The .NET client uses .NET 10 and references
|
||||
`src/MxGateway.Contracts/MxGateway.Contracts.csproj` for generated C# contract
|
||||
`src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj` for generated C# contract
|
||||
types. `clients/dotnet/generated` remains reserved for client-local generator
|
||||
output if the client later decouples from the contracts project.
|
||||
|
||||
Regenerate the generated C# contract types:
|
||||
|
||||
```powershell
|
||||
dotnet build src/MxGateway.Contracts/MxGateway.Contracts.csproj
|
||||
dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj
|
||||
```
|
||||
|
||||
Build and test from the repository root:
|
||||
|
||||
```powershell
|
||||
dotnet build clients/dotnet/MxGateway.Client.sln
|
||||
dotnet test clients/dotnet/MxGateway.Client.sln --no-build
|
||||
dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.sln
|
||||
dotnet test clients/dotnet/ZB.MOM.WW.MxGateway.Client.sln --no-build
|
||||
```
|
||||
|
||||
Create local package artifacts:
|
||||
|
||||
```powershell
|
||||
$dotnetPackageOutput = Join-Path (Get-Location) 'artifacts/clients/dotnet'
|
||||
dotnet pack clients/dotnet/MxGateway.Client/MxGateway.Client.csproj -c Release -p:PackageOutputPath="$dotnetPackageOutput"
|
||||
dotnet publish clients/dotnet/MxGateway.Client.Cli/MxGateway.Client.Cli.csproj -c Release -o artifacts/clients/dotnet/mxgw-dotnet
|
||||
dotnet pack clients/dotnet/ZB.MOM.WW.MxGateway.Client/ZB.MOM.WW.MxGateway.Client.csproj -c Release -p:PackageOutputPath="$dotnetPackageOutput"
|
||||
dotnet publish clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli/ZB.MOM.WW.MxGateway.Client.Cli.csproj -c Release -o artifacts/clients/dotnet/mxgw-dotnet
|
||||
```
|
||||
|
||||
Run the CLI from source:
|
||||
|
||||
```powershell
|
||||
dotnet run --project clients/dotnet/MxGateway.Client.Cli -- version --json
|
||||
dotnet run --project clients/dotnet/MxGateway.Client.Cli -- smoke --endpoint "http://$env:MXGATEWAY_ENDPOINT" --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json
|
||||
dotnet run --project clients/dotnet/MxGateway.Client.Cli -- smoke --endpoint "https://mxgateway.example.local:5001" --tls --ca-file C:\certs\mxgateway-ca.pem --server-name mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json
|
||||
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- version --json
|
||||
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- smoke --endpoint "http://$env:MXGATEWAY_ENDPOINT" --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json
|
||||
dotnet run --project clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli -- smoke --endpoint "https://mxgateway.example.local:5001" --tls --ca-file C:\certs\mxgateway-ca.pem --server-name mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json
|
||||
```
|
||||
|
||||
## Go
|
||||
|
||||
@@ -21,9 +21,9 @@ records:
|
||||
|
||||
The source files listed by the manifest are:
|
||||
|
||||
- `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto`
|
||||
- `src/MxGateway.Contracts/Protos/mxaccess_worker.proto`
|
||||
- `src/MxGateway.Contracts/Protos/galaxy_repository.proto`
|
||||
- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto`
|
||||
- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto`
|
||||
- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/galaxy_repository.proto`
|
||||
|
||||
`mxaccess_gateway.proto` defines the public gRPC service and shared DTOs.
|
||||
`mxaccess_worker.proto` is included in the descriptor because worker-aware
|
||||
@@ -86,7 +86,7 @@ issues.
|
||||
|
||||
## Language Generation Inputs
|
||||
|
||||
All generators use `src/MxGateway.Contracts/Protos` as the protobuf import
|
||||
All generators use `src/ZB.MOM.WW.MxGateway.Contracts/Protos` as the protobuf import
|
||||
root. The checked-in descriptor is available when a language build prefers a
|
||||
descriptor input, but the `.proto` files remain canonical.
|
||||
|
||||
@@ -94,7 +94,7 @@ Use these commands to regenerate language-specific client bindings:
|
||||
|
||||
| Client | Command |
|
||||
|--------|---------|
|
||||
| .NET | `dotnet build src/MxGateway.Contracts/MxGateway.Contracts.csproj` |
|
||||
| .NET | `dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj` |
|
||||
| Go | `Push-Location clients/go; ./generate-proto.ps1; Pop-Location` |
|
||||
| Rust | `Push-Location clients/rust; cargo check --workspace; Pop-Location` |
|
||||
| Python | `Push-Location clients/python; ./generate-proto.ps1; Pop-Location` |
|
||||
@@ -103,10 +103,10 @@ Use these commands to regenerate language-specific client bindings:
|
||||
.NET generation currently runs through the contracts project:
|
||||
|
||||
```powershell
|
||||
dotnet build src/MxGateway.Contracts/MxGateway.Contracts.csproj
|
||||
dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj
|
||||
```
|
||||
|
||||
Future .NET client projects may either reference `MxGateway.Contracts` or
|
||||
Future .NET client projects may either reference `ZB.MOM.WW.MxGateway.Contracts` or
|
||||
generate client-local files into `clients/dotnet/generated` with `Grpc.Tools`.
|
||||
|
||||
Go clients should generate `mxaccess_gateway.proto` and
|
||||
|
||||
+49
-7
@@ -6,7 +6,7 @@ recreated by the contracts project build.
|
||||
|
||||
## Files
|
||||
|
||||
`src/MxGateway.Contracts/Protos/mxaccess_gateway.proto` defines the public
|
||||
`src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto` defines the public
|
||||
`MxAccessGateway` gRPC service, command payloads, command replies, event DTOs,
|
||||
`MxValue`, `MxArray`, and `MxStatusProxy`.
|
||||
|
||||
@@ -23,19 +23,61 @@ the corresponding MXAccess `AddItem`, `Advise`, `UnAdvise`, and `RemoveItem`
|
||||
calls sequentially on the session STA and preserves input order in the result
|
||||
list.
|
||||
|
||||
`src/MxGateway.Contracts/Protos/mxaccess_worker.proto` defines the named-pipe
|
||||
The command model also includes bulk write/read command kinds:
|
||||
`WriteBulk`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk`, and
|
||||
`ReadBulk`. They are unary `Invoke` payloads on the same `MxAccessGateway`
|
||||
surface (not separate gRPC methods) and exist so a caller can submit one list
|
||||
of items per round trip while preserving MXAccess parity per entry.
|
||||
|
||||
- `WriteBulkCommand` / `Write2BulkCommand` / `WriteSecuredBulkCommand` /
|
||||
`WriteSecured2BulkCommand` each carry a `server_handle` and a `repeated`
|
||||
list of entries (`WriteBulkEntry`, `Write2BulkEntry`,
|
||||
`WriteSecuredBulkEntry`, `WriteSecured2BulkEntry`). Each entry mirrors the
|
||||
single-item command shape — `item_handle` + `value` (+ `timestamp_value` on
|
||||
the `*2` variants, + `current_user_id` / `verifier_user_id` on the secured
|
||||
variants). All four replies use `BulkWriteReply`, which carries
|
||||
`repeated BulkWriteResult`. A `BulkWriteResult` has `server_handle`,
|
||||
`item_handle`, `was_successful`, `optional int32 hresult`, `repeated
|
||||
MxStatusProxy statuses`, and `error_message`. Per-entry failures populate
|
||||
`error_message` + `hresult` and never raise — callers iterate and inspect
|
||||
each entry. The credential-sensitive redaction rules for `WriteSecured` /
|
||||
`WriteSecured2` apply to every `value` inside `WriteSecuredBulkEntry` and
|
||||
`WriteSecured2BulkEntry`.
|
||||
|
||||
- `ReadBulkCommand` carries `server_handle`, `repeated string tag_addresses`,
|
||||
and `uint32 timeout_ms` (0 means use the gateway-configured default). The
|
||||
reply is `BulkReadReply` carrying `repeated BulkReadResult`. A
|
||||
`BulkReadResult` has `server_handle`, `tag_address`, `item_handle`,
|
||||
`was_successful`, `was_cached`, `value`, `quality`, `source_timestamp`,
|
||||
`repeated MxStatusProxy statuses`, and `error_message`. MXAccess has no
|
||||
synchronous `Read`, so `ReadBulk` is dual-mode per entry: when a tag is
|
||||
already advised in the session the worker returns the cached
|
||||
`OnDataChange` payload without touching the subscription
|
||||
(`was_cached = true`); otherwise the worker takes a full
|
||||
`AddItem` + `Advise` + wait-for-first-`OnDataChange` + `UnAdvise` +
|
||||
`RemoveItem` snapshot lifecycle and returns the result
|
||||
(`was_cached = false`). The asymmetry that `BulkReadResult` has no
|
||||
`hresult` field is intentional — `ReadBulk` outcomes are timeout / cache
|
||||
/ lifecycle states rather than MXAccess COM return codes.
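
The per-entry choice between the cached path and the snapshot path can be sketched as follows. All names here are hypothetical: `ReadResult` is a reduced stand-in for `BulkReadResult`, values are simplified to strings, and the `TrySnapshot` delegate stands in for the whole `AddItem` + `Advise` + wait + `UnAdvise` + `RemoveItem` lifecycle, returning `false` on timeout.

```csharp
using System;
using System.Collections.Generic;

public sealed record ReadResult(
    string TagAddress, bool WasSuccessful, bool WasCached, string Value, string ErrorMessage);

public static class ReadBulkSketch
{
    // Stand-in for the full snapshot lifecycle; returns false when no value arrived in time.
    public delegate bool TrySnapshot(string tagAddress, out string value);

    // advisedCache holds the last data-change value for each tag already advised in the session.
    public static List<ReadResult> Read(
        IReadOnlyList<string> tagAddresses,
        IReadOnlyDictionary<string, string> advisedCache,
        TrySnapshot snapshot)
    {
        var results = new List<ReadResult>(tagAddresses.Count);
        foreach (string tag in tagAddresses)
        {
            if (advisedCache.TryGetValue(tag, out var cached))
            {
                // Cached path: the existing subscription is left untouched.
                results.Add(new ReadResult(tag, true, true, cached, string.Empty));
            }
            else if (snapshot(tag, out string fresh))
            {
                results.Add(new ReadResult(tag, true, false, fresh, string.Empty));
            }
            else
            {
                // A failure is reported per entry; the remaining tags are still read.
                results.Add(new ReadResult(tag, false, false, string.Empty, "Timed out waiting for the first data change."));
            }
        }

        return results;
    }
}
```

The result list has one element per requested tag in request order, which matches the input-order guarantee described for the other bulk commands.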
|
||||
|
||||
See `gateway.md` for the full cached-vs-snapshot `ReadBulk` lifecycle and the
|
||||
per-command scope requirements, and `docs/DesignDecisions.md` "Bulk Command
|
||||
Family" for the rationale behind the per-entry result shape (independent
|
||||
success tracking, input-order preservation, no partial-failure exceptions).
|
||||
|
||||
`src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto` defines the named-pipe
|
||||
worker IPC envelope and control messages. It imports
|
||||
`mxaccess_gateway.proto` so the worker and gateway use the same command, reply,
|
||||
event, value, and status shapes.
|
||||
|
||||
`src/MxGateway.Contracts/Protos/galaxy_repository.proto` defines the
|
||||
`src/ZB.MOM.WW.MxGateway.Contracts/Protos/galaxy_repository.proto` defines the
|
||||
`GalaxyRepository` service used by clients to browse the Galaxy Repository
|
||||
(deployed object hierarchy and dynamic attributes). The service is metadata-
|
||||
only and does not share types with `mxaccess_gateway.proto`. See
|
||||
[Galaxy Repository Browse](./GalaxyRepository.md) for the RPC catalog and
|
||||
behavior.
|
||||
|
||||
Generated C# output is written to `src/MxGateway.Contracts/Generated/`. Do not
|
||||
Generated C# output is written to `src/ZB.MOM.WW.MxGateway.Contracts/Generated/`. Do not
|
||||
hand-edit generated files.
|
||||
|
||||
Client generation inputs are published through
|
||||
@@ -49,20 +91,20 @@ generation inputs, output directories, and golden protobuf JSON fixtures.
|
||||
Run the contracts build to regenerate C# protobuf and gRPC code:
|
||||
|
||||
```bash
|
||||
dotnet build src/MxGateway.Contracts/MxGateway.Contracts.csproj
|
||||
dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj
|
||||
```
|
||||
|
||||
Run the focused contract tests after changing either `.proto` file:
|
||||
|
||||
```bash
|
||||
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter ProtobufContractRoundTripTests
|
||||
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter ProtobufContractRoundTripTests
|
||||
```
|
||||
|
||||
The full solution build also regenerates the C# contracts before compiling
|
||||
gateway and test projects:
|
||||
|
||||
```bash
|
||||
dotnet build src/MxGateway.sln
|
||||
dotnet build src/ZB.MOM.WW.MxGateway.slnx
|
||||
```
|
||||
|
||||
Regenerate the client descriptor after changing either `.proto` file:
|
||||
|
||||
@@ -85,7 +85,7 @@ The explicit sequence remains the parity baseline for issue-level validation.
|
||||
Run the matrix shape tests after changing the smoke matrix:
|
||||
|
||||
```bash
|
||||
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~CrossLanguageSmokeMatrixTests
|
||||
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~CrossLanguageSmokeMatrixTests
|
||||
```
|
||||
|
||||
Live execution remains a separate opt-in step because it depends on a running
|
||||
|
||||
@@ -82,6 +82,18 @@ fan-out may be added later with explicit backpressure semantics.
|
||||
Rationale: one subscriber preserves simple event ordering and failure behavior
|
||||
while parity is being proven.
|
||||
|
||||
### Alarms — superseded for the alarm subsystem
|
||||
|
||||
The single-subscriber rule above no longer applies to alarms. The gateway runs
|
||||
an always-on central alarm monitor (`GatewayAlarmMonitor`) that owns one
|
||||
gateway-managed worker session, caches the active-alarm set, and fans it out to
|
||||
any number of clients through the session-less `StreamAlarms` RPC. Per-session
|
||||
alarm auto-subscribe is removed; `AcknowledgeAlarm` is session-less and routes
|
||||
through the monitor. Data-side `StreamEvents` remains one subscriber per
|
||||
session. Rationale: alarm state is gateway-wide, not session-scoped — every
|
||||
client wants the same current set plus updates, and forcing each to own a
|
||||
worker would multiply AVEVA polling load for no benefit.
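
The cache-then-fan-out shape can be sketched with `System.Threading.Channels`. This is an illustration only, not `GatewayAlarmMonitor`: `AlarmFanOut` and `AlarmUpdate` are hypothetical, unbounded channels are used for brevity (a real monitor would need a backpressure policy), and unsubscribe handling is omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;

public sealed record AlarmUpdate(string AlarmId, bool IsActive);

public sealed class AlarmFanOut
{
    private readonly object _gate = new();
    private readonly Dictionary<string, AlarmUpdate> _active = new();
    private readonly List<Channel<AlarmUpdate>> _subscribers = new();

    // A new subscriber first receives the cached active set, then every later update.
    public ChannelReader<AlarmUpdate> Subscribe()
    {
        Channel<AlarmUpdate> channel = Channel.CreateUnbounded<AlarmUpdate>();
        lock (_gate)
        {
            foreach (AlarmUpdate alarm in _active.Values)
            {
                channel.Writer.TryWrite(alarm);
            }

            _subscribers.Add(channel);
        }

        return channel.Reader;
    }

    // Updates the cache and forwards the update to every current subscriber.
    public void Publish(AlarmUpdate update)
    {
        lock (_gate)
        {
            if (update.IsActive)
            {
                _active[update.AlarmId] = update;
            }
            else
            {
                _active.Remove(update.AlarmId);
            }

            foreach (Channel<AlarmUpdate> subscriber in _subscribers)
            {
                subscriber.Writer.TryWrite(update);
            }
        }
    }
}
```

Taking the snapshot and registering the subscriber under the same lock means an update published concurrently is either part of the snapshot or delivered afterwards, but never lost between the two.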
|
||||
|
||||
## Authentication
|
||||
|
||||
Decision: API key authentication for the public gateway.
|
||||
@@ -199,6 +211,57 @@ and failure behavior are easy to compare against direct MXAccess.

 Batch tag registration can be added later if measured setup latency requires it.

+## Bulk Command Family
+
+Decision: the gateway exposes a fixed set of *bulk* command kinds —
+`AddItemBulk`, `AdviseItemBulk`, `RemoveItemBulk`, `UnAdviseItemBulk`,
+`SubscribeBulk`, `UnsubscribeBulk`, `WriteBulk`, `Write2Bulk`,
+`WriteSecuredBulk`, `WriteSecured2Bulk`, `ReadBulk` — that carry a list of
+entries in one round-trip and return one per-entry result. Each command kind
+runs the corresponding single-item MXAccess COM call sequentially on the
+worker STA; per-entry failures populate `was_successful = false` with the
+underlying HRESULT and never throw. There is no transactional / fail-fast
+semantic — bulk here means "one round-trip, per-entry results", not
+"atomic".
+
+Rationale: MXAccess COM itself has no native bulk API for any of these
+operations. Surfacing the per-entry result list keeps parity transparent —
+the caller sees the same per-item HRESULT they would see calling MXAccess
+N times directly — while the bulk shape collapses the gateway/IPC overhead
+to one round-trip per batch and lets the worker keep the STA hot.
+
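The per-entry contract described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the gateway's executor: `BulkEntryResult`, `BulkSketch.Run`, and the `singleItemCall` delegate are hypothetical names standing in for the worker's real types and for the single-item MXAccess COM call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-entry result; the members only mirror the prose above.
public sealed record BulkEntryResult(string Tag, bool WasSuccessful, int HResult);

public static class BulkSketch
{
    // Runs the single-item call once per entry, in order. A failing entry is
    // recorded and the loop continues: no fail-fast and no rollback.
    public static IReadOnlyList<BulkEntryResult> Run(
        IEnumerable<string> tags, Action<string> singleItemCall)
    {
        var results = new List<BulkEntryResult>();
        foreach (string tag in tags)
        {
            try
            {
                singleItemCall(tag);
                results.Add(new BulkEntryResult(tag, true, 0));
            }
            catch (Exception ex)
            {
                // For a COMException, Exception.HResult carries the COM HRESULT.
                results.Add(new BulkEntryResult(tag, false, ex.HResult));
            }
        }
        return results;
    }
}
```

A caller gets exactly one result per requested entry, in request order, and has to inspect each one; a successful return from the bulk call says nothing about individual entries.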
+`ReadBulk` is the only bulk command without a 1:1 MXAccess analogue. Two
+choices were considered:
+
+1. **Cache-then-snapshot** (chosen): when a requested tag is already in the
+   session's item registry AND advised, the worker returns the last cached
+   `OnDataChange` value without touching the subscription
+   (`was_cached = true`). Otherwise it takes the full `AddItem + Advise +
+   wait-for-first-OnDataChange + UnAdvise + RemoveItem` lifecycle itself
+   (`was_cached = false`) and leaves the session exactly as it was before
+   the call. The cache lives on a per-session `MxAccessValueCache`,
+   populated by `MxAccessBaseEventSink` on every `OnDataChange` after the
+   event clears the outbound queue.
+
+2. **Always-snapshot**: take the AddItem-through-RemoveItem lifecycle for
+   every requested tag. Cleaner conceptually but pays the full lifecycle
+   cost on every call and would interfere with existing subscriptions if
+   MXAccess reuses item handles.
+
+The chosen behavior matches what callers actually want from "current
+value" — a free read of an already-streaming tag, and a one-shot snapshot
+otherwise — and never disturbs subscriptions the caller did not create.
+The decision intentionally does NOT synthesize an `OnDataChange` event
+from the snapshot path: the snapshot value reaches the caller through
+`ReadBulk`'s reply payload only, not through the event stream. This
+preserves the "Don't synthesize events" rule that scopes the rest of the
+worker.
+
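The cache-then-snapshot choice reduces to a single branch per tag. The sketch below is illustrative only: `ReadEntry`, `ReadBulkSketch.ReadOne`, the `advisedCache` dictionary, and the `snapshot` delegate are hypothetical stand-ins for `MxAccessValueCache` and the AddItem-through-RemoveItem lifecycle, whose real signatures are not shown in this document.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reply entry; WasCached mirrors the was_cached flag above.
public sealed record ReadEntry(string Tag, object Value, bool WasCached);

public static class ReadBulkSketch
{
    // advisedCache: last OnDataChange value for each tag that is both
    //               registered and advised in the session (assumed shape).
    // snapshot:     stand-in for AddItem + Advise + wait + UnAdvise + RemoveItem.
    public static ReadEntry ReadOne(
        string tag,
        IReadOnlyDictionary<string, object> advisedCache,
        Func<string, object> snapshot)
    {
        if (advisedCache.TryGetValue(tag, out var cached))
        {
            // Already streaming: answer from the cache, leave the subscription alone.
            return new ReadEntry(tag, cached, WasCached: true);
        }

        // Not streaming: take a one-shot snapshot. The value is returned in the
        // reply only; no OnDataChange event is synthesized from it.
        return new ReadEntry(tag, snapshot(tag), WasCached: false);
    }
}
```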
+`ReadBulk`'s wait loop pumps Windows messages on the worker STA
+(`StaRuntime.PumpPendingMessages`) on every poll iteration so the inbound
+MXAccess COM event can dispatch while the bulk executor still holds the
+thread — without the pump the OnDataChange would never deliver.
+
 ## Graceful Worker Shutdown

 Decision: best-effort cleanup before COM release.
+4
-4
@@ -1,6 +1,6 @@
 # Gateway Diagnostics

-The diagnostics subsystem provides structured logging, credential redaction, and request-scoped log enrichment for the gateway. It lives under `src/MxGateway.Server/Diagnostics/` and is wired into the ASP.NET Core pipeline so every gRPC and HTTP request carries the same correlation fields.
+The diagnostics subsystem provides structured logging, credential redaction, and request-scoped log enrichment for the gateway. It lives under `src/ZB.MOM.WW.MxGateway.Server/Diagnostics/` and is wired into the ASP.NET Core pipeline so every gRPC and HTTP request carries the same correlation fields.

 ## Goals

@@ -162,7 +162,7 @@ public static IApplicationBuilder UseGatewayRequestLoggingScope(this IApplicatio
 {
     ILogger logger = context.RequestServices
         .GetRequiredService<ILoggerFactory>()
-        .CreateLogger("MxGateway.Request");
+        .CreateLogger("ZB.MOM.WW.MxGateway.Request");

     using IDisposable? scope = logger.BeginGatewayScope(new GatewayLogScope(
         SessionId: ReadHeader(context, SessionIdHeaderName),
@@ -188,7 +188,7 @@ The scope is keyed off four custom headers and the standard `authorization` head

 The numeric headers use `int.TryParse` and `ulong.TryParse`; missing or unparseable values become `null` and are dropped by `GatewayLogScope.ToDictionary`. This keeps the middleware tolerant of clients that do not yet emit every header, which matters because the earliest call in a session (`OpenSession`) has no `SessionId` to send.

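The tolerant parsing described above can be sketched like this. It is a minimal illustration, not the middleware's code: `HeaderParsingSketch` and its methods are hypothetical names, and the real `GatewayLogScope.ToDictionary` may differ in shape.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderParsingSketch
{
    // A missing (null) or unparseable header becomes null instead of throwing.
    public static int? ParseInt32(string raw) =>
        int.TryParse(raw, out int value) ? value : (int?)null;

    public static ulong? ParseUInt64(string raw) =>
        ulong.TryParse(raw, out ulong value) ? value : (ulong?)null;

    // Keeps only the fields that carry a value, so absent headers simply
    // do not appear in the log scope.
    public static Dictionary<string, object> DropNulls(
        IEnumerable<KeyValuePair<string, object>> fields)
    {
        var scope = new Dictionary<string, object>();
        foreach (var field in fields)
        {
            if (field.Value != null)
            {
                scope[field.Key] = field.Value;
            }
        }
        return scope;
    }
}
```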
-The logger category is `MxGateway.Request`, which lets operators filter the request scope events independently from per-component categories.
+The logger category is `ZB.MOM.WW.MxGateway.Request`, which lets operators filter the request scope events independently from per-component categories.

 ### Pipeline ordering

@@ -213,7 +213,7 @@ The order matters: putting the logging scope first ensures that authentication f

 - `GatewayLogScope.ToDictionary` redacts `ClientIdentity` whenever a scope is materialized.
 - `DashboardRedactor.Redact` delegates to `RedactClientIdentity` for any value containing the `mxgw_` marker, then falls back to a marker-keyword check for fields like `password` or `token`. This keeps dashboard renders aligned with log redaction.
-- `MxGateway.Tests/Diagnostics/GatewayLogRedactorTests.cs` covers each redaction branch, including the assertion that `WriteSecured` values stay redacted even when `valueLoggingEnabled` is true.
+- `ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorTests.cs` covers each redaction branch, including the assertion that `WriteSecured` values stay redacted even when `valueLoggingEnabled` is true.

 ## Related Documentation

+91
-19
@@ -2,7 +2,7 @@

 The gateway exposes a read-only browse surface over the AVEVA System Platform
 Galaxy Repository (the SQL Server database named `ZB`). Clients use it to
-enumerate the deployed object hierarchy and each object's dynamic attributes
+enumerate the deployed object hierarchy and each object's attributes
 before subscribing to runtime values via the existing `MxAccessGateway` RPCs.

 This is a metadata layer: it never reads or writes runtime tag values, never
@@ -19,20 +19,22 @@ ArchestrA IDE renders the deployment tree. Surfacing that data over gRPC lets
 remote clients build a navigable address space without any coupling to the
 COM layer or the host platform.

-The query bodies are kept byte-for-byte identical to the equivalent OPC UA
-server in the OtOpcUa project so the two consumers see the same row sets.
+`HierarchySql` is the object-hierarchy query originally ported from the
+equivalent OPC UA server in the OtOpcUa project. `AttributesSql` has since
+diverged from OtOpcUa — see [Built-in vs configured attributes](#built-in-vs-configured-attributes)
+— and is no longer kept in sync with it.

 ## RPC Surface

 The service is defined in
-`src/MxGateway.Contracts/Protos/galaxy_repository.proto` under package
+`src/ZB.MOM.WW.MxGateway.Contracts/Protos/galaxy_repository.proto` under package
 `galaxy_repository.v1`.

 | RPC | Purpose |
 |-----|---------|
 | `TestConnection` | Connectivity probe. Returns `{ ok: bool }` after a `SELECT 1`. Does not throw on SQL failure — returns `ok = false`. Always hits SQL directly so it remains a true health check. |
 | `GetLastDeployTime` | Returns the cached `galaxy.time_of_last_deploy`. Served from the shared hierarchy cache; refreshed in the background. |
-| `DiscoverHierarchy` | Returns one page of the deployed hierarchy plus each returned object's dynamic attributes. **Served from cache** — see [Hierarchy Cache](#hierarchy-cache). |
+| `DiscoverHierarchy` | Returns one page of the deployed hierarchy plus each returned object's attributes (configured and built-in — see [Built-in vs configured attributes](#built-in-vs-configured-attributes)). **Served from cache** — see [Hierarchy Cache](#hierarchy-cache). |
 | `WatchDeployEvents` | **Server-streaming.** The server emits the current state immediately on subscribe (so clients can bootstrap without waiting), then emits one event per detected deploy change. See [Deploy Notifications](#deploy-notifications). |

 `DiscoverHierarchy` is a paged unary RPC. The raw request accepts `page_size`
@@ -53,7 +55,7 @@ reports the post-filter count.
 ## Hierarchy Cache

 The gateway holds a single shared `IGalaxyHierarchyCache`
-(`src/MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs`) — every
+(`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs`) — every
 `DiscoverHierarchy` and `GetLastDeployTime` request reads from this cache
 rather than hitting SQL. Many clients can browse concurrently with at most
 one SQL query in flight.
@@ -87,10 +89,40 @@ load to complete before returning. If the first load fails or times out,
 the client gets `Unavailable` with a short reason. Once any load completes
 (success or failure), this wait is skipped on subsequent calls.

+### On-disk snapshot
+
+The gateway may lose connectivity to the Galaxy database — and the database is
+often unreachable right when the gateway itself restarts. To keep browse
+working across that gap, the cache persists its dataset to disk:
+
+- After every successful **heavy** refresh (a deploy change), the raw
+  hierarchy and attribute rowsets are written to
+  `MxGateway:Galaxy:SnapshotCachePath`
+  (default `C:\ProgramData\MxGateway\galaxy-snapshot.json`). The write is
+  atomic — a temp file plus rename — so a crash mid-write cannot corrupt the
+  snapshot. Cheap no-change ticks write nothing; the file is already current.
+- On the **first** refresh after startup, before any SQL runs, the cache
+  reloads that file. The restored data is served with `Stale` status —
+  it is last-known data, not live — so clients can browse immediately even
+  when the Galaxy database is unreachable.
+- The first live query then reconciles: if it observes the **same**
+  `time_of_last_deploy` the snapshot was saved at, the entry is promoted to
+  `Healthy` with no heavy re-query (the snapshot is provably current); if it
+  observes a newer deploy, the heavy queries run and replace the snapshot; if
+  the database is still unreachable, the entry stays `Stale`.
+
+`is_alarm` / `is_historized` filters, paging, and the dashboard summary all
+work against a restored snapshot exactly as against a live pull — the restore
+path runs the same materialization. Persistence is disabled by setting
+`MxGateway:Galaxy:PersistSnapshot` to `false`; the snapshot file is then
+neither written nor read, and a cold start with an unreachable database comes
+up `Unavailable` as before. The on-disk file is a cache, not a system of
+record: deleting it only forces the next cold start to wait for live SQL.
+
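The write and reconcile steps above can be sketched as two small functions. This is an assumption-laden illustration rather than the cache's actual code: `SnapshotSketch`, `ReconcileAction`, and the `.tmp` suffix are hypothetical, a `null` live deploy time stands in for "database unreachable", and any deploy time that differs from the snapshot's is treated as a newer deploy.

```csharp
using System;
using System.IO;

public enum ReconcileAction { KeepStale, PromoteToHealthy, RunHeavyQueries }

public static class SnapshotSketch
{
    // Temp file plus rename: readers see either the old file or the new one,
    // never a half-written snapshot.
    public static void WriteAtomically(string path, string json)
    {
        string temp = path + ".tmp";
        File.WriteAllText(temp, json);
        File.Move(temp, path, overwrite: true);
    }

    // Decides what the first live query does with a restored (Stale) snapshot.
    public static ReconcileAction Reconcile(DateTime snapshotDeployTime, DateTime? liveDeployTime)
    {
        if (liveDeployTime is null)
        {
            return ReconcileAction.KeepStale;          // database still unreachable
        }

        return liveDeployTime.Value == snapshotDeployTime
            ? ReconcileAction.PromoteToHealthy         // snapshot is provably current
            : ReconcileAction.RunHeavyQueries;         // deploy changed: re-query and rewrite
    }
}
```

How atomic the rename is depends on the file system and platform; the sketch only shows the ordering (write fully, then swap), which is what protects the snapshot from a crash mid-write.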
 ## Deploy Notifications

 `WatchDeployEvents` is a server-streaming RPC backed by
-`IGalaxyDeployNotifier` (`src/MxGateway.Server/Galaxy/GalaxyDeployNotifier.cs`).
+`IGalaxyDeployNotifier` (`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyDeployNotifier.cs`).
 The notifier maintains a private bounded channel per subscriber so a slow
 client cannot back-pressure other subscribers or the publisher.

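A per-subscriber bounded channel fan-out can be sketched with `System.Threading.Channels`. This is not the notifier's implementation: `FanOutSketch<T>` is a hypothetical type, and the `DropOldest` full mode is an assumption, since this document does not say what the real notifier does when a subscriber's channel fills up.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Channels;

public sealed class FanOutSketch<T>
{
    private readonly ConcurrentDictionary<Guid, Channel<T>> _subscribers = new();
    private readonly int _capacity;

    public FanOutSketch(int capacity) => _capacity = capacity;

    // Each subscriber gets its own bounded channel.
    public (Guid Id, ChannelReader<T> Reader) Subscribe()
    {
        var channel = Channel.CreateBounded<T>(new BoundedChannelOptions(_capacity)
        {
            // Assumption: a full channel discards its oldest item.
            FullMode = BoundedChannelFullMode.DropOldest,
            SingleReader = true,
        });
        var id = Guid.NewGuid();
        _subscribers[id] = channel;
        return (id, channel.Reader);
    }

    public void Unsubscribe(Guid id)
    {
        if (_subscribers.TryRemove(id, out var channel))
        {
            channel.Writer.TryComplete();
        }
    }

    // TryWrite never waits, so a slow reader cannot stall the publisher
    // or the other subscribers.
    public void Publish(T item)
    {
        foreach (var channel in _subscribers.Values)
        {
            channel.Writer.TryWrite(item);
        }
    }
}
```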
@@ -176,6 +208,43 @@ message DiscoverHierarchyReply {
 }
 ```

+### Built-in vs configured attributes
+
+Each `GalaxyObject` carries two kinds of attribute, both surfaced the same way
+in the `attributes` list:
+
+- **Configured (dynamic) attributes** — attributes added in the ArchestrA IDE
+  attribute editor. Stored in the Galaxy `dynamic_attribute` table.
+- **Built-in attributes** — attributes every object inherits from its
+  primitives: the object framework, the engine/platform primitives, and the
+  per-attribute extensions (Alarm, History, Boolean, …). Stored in
+  `attribute_definition` and reached through `primitive_instance`.
+
+Built-in attributes are why an `AppEngine` or `WinPlatform` object reports its
+`Engine.*` and `Alarm*` attributes, and why an alarmed attribute such as
+`TestAlarm001` reports its extension leaves `TestAlarm001.Acked`,
+`TestAlarm001.AckMsg`, `TestAlarm001.ActiveAlarmState`, and so on. An earlier
+version of the browse query returned only configured attributes, so those
+objects came back empty or partial; including built-ins makes the browse
+surface match what System Platform's own Object Viewer shows. Expect roughly
+seven times as many attributes as configured-only — the dashboard attribute
+count reflects this.
+
+Two rules govern the built-in rows:
+
+- **No category filter.** `attribute_definition` uses a different
+  `mx_attribute_category` numbering than `dynamic_attribute`, so only the
+  `_`-prefixed-name and `.Description` exclusions apply to built-ins. (The
+  configured-attribute category allow-list is unchanged.)
+- **`is_historized` / `is_alarm` are always `false` for built-in rows.** Those
+  flags identify a configured attribute that *anchors* a history or alarm
+  extension (e.g. `TestAlarm001`), not the extension's machinery leaves
+  (`TestAlarm001.Acked`). `alarm_bearing_only` and `historized_only` therefore
+  still select the anchor attributes, not their built-in children.
+
+When a configured attribute and a built-in attribute resolve to the same
+reference, the configured attribute wins.
+
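The "configured wins" rule can be illustrated in memory. This is only a sketch of the precedence rule: `AttributeRow` and `AttributeMergeSketch` are hypothetical, the real merge most likely happens inside `AttributesSql`, and the ordinal (case-sensitive) reference comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row shape: just enough to show the precedence rule.
public sealed record AttributeRow(string Reference, bool IsBuiltIn);

public static class AttributeMergeSketch
{
    public static IReadOnlyCollection<AttributeRow> Merge(
        IEnumerable<AttributeRow> configured, IEnumerable<AttributeRow> builtIn)
    {
        var byReference = new Dictionary<string, AttributeRow>(StringComparer.Ordinal);

        // Configured rows go in first and are authoritative.
        foreach (var row in configured)
        {
            byReference[row.Reference] = row;
        }

        // Built-in rows only fill references that no configured row claimed.
        foreach (var row in builtIn)
        {
            byReference.TryAdd(row.Reference, row);
        }

        return byReference.Values;
    }
}
```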
 ### Contained name vs tag name

 Galaxy objects carry two names. `tag_name` is globally unique and is what
@@ -201,7 +270,7 @@ fields cannot express null. Use it to distinguish "no dimension reported" from

 ```text
 gRPC client(s)
--> GalaxyRepositoryGrpcService (src/MxGateway.Server/Grpc/)
+-> GalaxyRepositoryGrpcService (src/ZB.MOM.WW.MxGateway.Server/Grpc/)
 DiscoverHierarchy, GetLastDeployTime -> IGalaxyHierarchyCache.Current
 WatchDeployEvents -> IGalaxyDeployNotifier
 TestConnection -> GalaxyRepository (direct SQL)
@@ -218,29 +287,30 @@ GalaxyHierarchyRefreshService (BackgroundService)

 Component breakdown:

-- `GalaxyRepository` (`src/MxGateway.Server/Galaxy/GalaxyRepository.cs`) holds
-  the SQL. Its constants `HierarchySql` and `AttributesSql` are copied verbatim
-  from the OtOpcUa project; do not edit them in isolation here. The two
-  queries walk template-derivation and package-derivation chains via
-  recursive CTEs and pick the most-derived attribute override per object.
+- `GalaxyRepository` (`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyRepository.cs`) holds
+  the SQL. Both `HierarchySql` and `AttributesSql` walk template-derivation and
+  package-derivation chains via recursive CTEs and pick the most-derived
+  override per object. `HierarchySql` still matches the OtOpcUa original;
+  `AttributesSql` does not — it additionally enumerates built-in primitive
+  attributes (see [Built-in vs configured attributes](#built-in-vs-configured-attributes)).
 - `GalaxyHierarchyCache`
-  (`src/MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs`) holds the most
+  (`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs`) holds the most
   recent immutable `GalaxyHierarchyCacheEntry` (materialized objects +
   precomputed dashboard summary + counts + status). All gRPC clients share the
   same entry.
 - `GalaxyHierarchyRefreshService`
-  (`src/MxGateway.Server/Galaxy/GalaxyHierarchyRefreshService.cs`) is a
+  (`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyHierarchyRefreshService.cs`) is a
   hosted `BackgroundService` that drives `RefreshAsync` on the configured
   interval, with deploy-time gating to avoid unnecessary heavy queries.
 - `GalaxyDeployNotifier`
-  (`src/MxGateway.Server/Galaxy/GalaxyDeployNotifier.cs`) is a thin
+  (`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyDeployNotifier.cs`) is a thin
   per-subscriber-channel fan-out for streaming clients.
 - `GalaxyProtoMapper`
-  (`src/MxGateway.Server/Grpc/GalaxyProtoMapper.cs`) converts row models to
+  (`src/ZB.MOM.WW.MxGateway.Server/Grpc/GalaxyProtoMapper.cs`) converts row models to
   proto messages. Used by the cache during refresh to materialize the reply
   once.
 - `GalaxyRepositoryGrpcService`
-  (`src/MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs`) implements
+  (`src/ZB.MOM.WW.MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs`) implements
   the four RPCs.

 ## Configuration
@@ -251,6 +321,8 @@ Bound to `MxGateway:Galaxy` via `GalaxyRepositoryOptions`.
 |--------|---------|-------------|
 | `MxGateway:Galaxy:ConnectionString` | `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` | SQL Server connection string for the Galaxy Repository. Integrated Security against `localhost` is the dev default; production deployments should override this through the standard double-underscore environment variable form, e.g. `MxGateway__Galaxy__ConnectionString`. |
 | `MxGateway:Galaxy:CommandTimeoutSeconds` | `60` | Per-command SQL timeout. Applies to all three RPCs. |
+| `MxGateway:Galaxy:PersistSnapshot` | `true` | Persists each successful browse dataset to disk and reloads it at startup. See [On-disk snapshot](#on-disk-snapshot). |
+| `MxGateway:Galaxy:SnapshotCachePath` | `C:\ProgramData\MxGateway\galaxy-snapshot.json` | File path for the persisted browse snapshot. Ignored when `PersistSnapshot` is `false`. |

 The connection string is not treated as a secret in dev (`Integrated
 Security`), but production deployments that use SQL authentication should set
@@ -306,7 +378,7 @@ that as a yellow or red status badge plus the truncated error.
 - Failures to reach the Galaxy database surface as `Unavailable`. Detailed
   SQL exceptions are logged at `Warning` and never returned to clients.
 - Integration tests live in
-  `src/MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs`. Set
+  `src/ZB.MOM.WW.MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs`. Set
   `MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1` (and optionally
   `MXGATEWAY_LIVE_GALAXY_CONN`) to run them; otherwise they skip.

@@ -19,7 +19,7 @@ paths, timeouts, queue sizes, enum values, or protocol values are invalid.
       "RunMigrationsOnStartup": true
     },
     "Worker": {
-      "ExecutablePath": "src\\MxGateway.Worker\\bin\\x86\\Release\\MxGateway.Worker.exe",
+      "ExecutablePath": "src\\ZB.MOM.WW.MxGateway.Worker\\bin\\x86\\Release\\ZB.MOM.WW.MxGateway.Worker.exe",
       "WorkingDirectory": null,
       "RequiredArchitecture": "X86",
       "StartupTimeoutSeconds": 30,
@@ -60,7 +60,15 @@ paths, timeouts, queue sizes, enum values, or protocol values are invalid.
     "Galaxy": {
       "ConnectionString": "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;",
       "CommandTimeoutSeconds": 60,
-      "DashboardRefreshIntervalSeconds": 30
+      "DashboardRefreshIntervalSeconds": 30,
+      "PersistSnapshot": true,
+      "SnapshotCachePath": "C:\\ProgramData\\MxGateway\\galaxy-snapshot.json"
+    },
+    "Alarms": {
+      "Enabled": false,
+      "SubscriptionExpression": "",
+      "DefaultArea": "",
+      "ReconcileIntervalSeconds": 30
     }
   }
 }
@@ -86,7 +94,7 @@ When `Mode` is `ApiKey`, `SqlitePath` and `PepperSecretName` must be present.

 | Option | Default | Description |
 |--------|---------|-------------|
-| `MxGateway:Worker:ExecutablePath` | `src\MxGateway.Worker\bin\x86\Release\MxGateway.Worker.exe` | Path to the x86 worker executable launched for each gateway session. The path must be valid and point to a `.exe` file. |
+| `MxGateway:Worker:ExecutablePath` | `src\ZB.MOM.WW.MxGateway.Worker\bin\x86\Release\ZB.MOM.WW.MxGateway.Worker.exe` | Path to the x86 worker executable launched for each gateway session. The path must be valid and point to a `.exe` file. |
 | `MxGateway:Worker:WorkingDirectory` | `null` | Optional working directory for the worker process. When set, it must be a valid filesystem path. |
 | `MxGateway:Worker:RequiredArchitecture` | `X86` | Required Portable Executable architecture for the worker. Supported values are `X86` and `X64`; MXAccess parity uses `X86`. |
 | `MxGateway:Worker:StartupTimeoutSeconds` | `30` | Total startup budget for process launch, startup probe, pipe connect, handshake, and worker readiness. |
@@ -164,10 +172,24 @@ at startup.
 | `MxGateway:Galaxy:ConnectionString` | `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` | SQL Server connection string for the Galaxy Repository (`ZB`) used by the `GalaxyRepository` browse RPCs. Override in production via `MxGateway__Galaxy__ConnectionString`. |
 | `MxGateway:Galaxy:CommandTimeoutSeconds` | `60` | Per-command SQL timeout for all Galaxy browse RPCs. |
 | `MxGateway:Galaxy:DashboardRefreshIntervalSeconds` | `30` | Interval between background refreshes of the dashboard Galaxy summary cache. SQL is hit at most once per interval regardless of dashboard render rate. |
+| `MxGateway:Galaxy:PersistSnapshot` | `true` | Persists the latest successful Galaxy browse dataset to disk. When `true`, the cache reloads that snapshot at startup so clients can still browse last-known data while the Galaxy database is unreachable. The restored data is served with `Stale` status until a live query confirms it. |
+| `MxGateway:Galaxy:SnapshotCachePath` | `C:\ProgramData\MxGateway\galaxy-snapshot.json` | File path for the persisted Galaxy browse snapshot. Ignored when `PersistSnapshot` is `false`. The snapshot is written atomically (temp file plus rename). |

 See [Galaxy Repository Browse](./GalaxyRepository.md) for the RPC surface and
 behavior.

+## Alarm Options
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `MxGateway:Alarms:Enabled` | `false` | Gates the gateway's always-on central alarm monitor. When `true`, the gateway opens one gateway-owned worker session dedicated to alarms, caches the active-alarm set, and fans it out to every client through the `StreamAlarms` RPC — no client opens its own session to see alarms. |
+| `MxGateway:Alarms:SubscriptionExpression` | _(empty)_ | AVEVA alarm-subscription expression the monitor subscribes on startup, in canonical `\\<machine>\Galaxy!<area>` form. The literal `Galaxy` provider is correct regardless of the Galaxy database name. When empty and `Enabled` is `true`, the gateway falls back to `\\<MachineName>\Galaxy!<DefaultArea>` if `DefaultArea` is set. |
+| `MxGateway:Alarms:DefaultArea` | _(empty)_ | Area name used to compose a default subscription when `SubscriptionExpression` is empty. If both are empty while `Enabled` is `true`, the monitor faults with a configuration diagnostic. |
+| `MxGateway:Alarms:ReconcileIntervalSeconds` | `30` | How often the monitor reconciles its in-process alarm cache against the worker's authoritative active-alarm snapshot, catching transitions the live poll-and-diff feed missed. Floored at 5 seconds. |
+
+The alarm monitor is independent of client sessions: `AcknowledgeAlarm` and
+`StreamAlarms` are session-less RPCs served by the monitor.
+
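The fallback and floor rules in the table above can be sketched as two small helpers. The names `AlarmOptionsSketch`, `ResolveSubscription`, and `EffectiveReconcileSeconds` are hypothetical, and throwing `InvalidOperationException` is only a stand-in for the monitor's "configuration diagnostic", whose real form is not shown here.

```csharp
using System;

public static class AlarmOptionsSketch
{
    // Explicit expression wins; otherwise compose \\<MachineName>\Galaxy!<DefaultArea>.
    public static string ResolveSubscription(
        string expression, string defaultArea, string machineName)
    {
        if (!string.IsNullOrWhiteSpace(expression))
        {
            return expression;
        }

        if (!string.IsNullOrWhiteSpace(defaultArea))
        {
            // "Galaxy" is the literal provider name, not the database name.
            return $@"\\{machineName}\Galaxy!{defaultArea}";
        }

        // Stand-in for the monitor faulting with a configuration diagnostic.
        throw new InvalidOperationException(
            "Alarms are enabled but neither SubscriptionExpression nor DefaultArea is set.");
    }

    // The configured interval is floored at 5 seconds.
    public static int EffectiveReconcileSeconds(int configured) => Math.Max(5, configured);
}
```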
 ## Related Documentation

 - [Gateway Process Detailed Design](./GatewayProcessDesign.md)
@@ -34,7 +34,7 @@ SignalR circuit. Bootstrap is sufficient for a basic dashboard.

 ## Hosting Model

-The dashboard is hosted by `MxGateway.Server` alongside the gRPC API. When
+The dashboard is hosted by `ZB.MOM.WW.MxGateway.Server` alongside the gRPC API. When
 `MxGateway:Dashboard:Enabled` is `true`, `MapGatewayDashboard()` maps the
 configured `Dashboard:PathBase` to the Blazor Server app and maps the login,
 logout, and access-denied HTTP endpoints beside it. When dashboard hosting is
@@ -49,6 +49,7 @@ Endpoint layout:
 /dashboard/workers
 /dashboard/events
 /dashboard/galaxy
+/dashboard/apikeys
 /dashboard/settings
 /dashboard/_blazor
 ```
@@ -68,7 +69,7 @@ dashboard as the default web page. Otherwise leave gRPC/API hosting unaffected.
 ## High-Level Components

 ```text
-MxGateway.Server
+ZB.MOM.WW.MxGateway.Server
 Dashboard/
 Components/
 App.razor
@@ -83,6 +84,7 @@ MxGateway.Server
 SessionDetailsPage.razor
 WorkersPage.razor
 EventsPage.razor
+ApiKeysPage.razor
 SettingsPage.razor
 Shared/
 MetricCard.razor
@@ -91,6 +93,9 @@ MxGateway.Server
 DashboardSnapshotService.cs
 DashboardAuthorizationHandler.cs
 DashboardAuthenticator.cs
+DashboardApiKeyAuthorization.cs
+DashboardApiKeyManagementService.cs
+DashboardApiKeySummary.cs
 DashboardSnapshot.cs
 DashboardSessionSummary.cs
 DashboardWorkerSummary.cs
@@ -249,6 +254,99 @@ Show aggregate event diagnostics:
 Do not display full tag values by default. If value display is later added, make
 it opt-in and redacted.

+### Browse page
+
+`/dashboard/browse` lets an operator explore the Galaxy tag hierarchy and watch
+live values. The tree is built in-process by `DashboardBrowseTreeBuilder` from
+`IGalaxyHierarchyCache.Current` — the same cache the Galaxy page reads — so a
+render costs no gRPC call and no SQL round-trip. Each node shows its child
+objects and, when expanded, its attributes with attribute name, data type
+(including array dimension), and the alarm / historized flags. Galaxy SQL
+carries no attribute description, so none is shown. A filter box switches the
+tree to a flat list of matching attributes.
+
+Right-clicking an attribute (or double-clicking it) adds it to the subscription
+panel. The panel shows each subscribed tag's live value, MXAccess data type,
+quality and source timestamp, refreshed every two seconds. The subscription
+panel is the explicit opt-in tag-value surface: it always shows values
+regardless of `Dashboard:ShowTagValues`, which continues to govern only the
+diagnostic session/worker views.
+
+### Alarms page
+
+`/dashboard/alarms` lists the alarms the gateway's central alarm monitor
+currently holds as Active or ActiveAcked, refreshed every three seconds. It
+defaults to showing unacknowledged `Active` alarms; filters add acknowledged
+alarms and narrow by area, severity range, and a reference/source/description
+text search. Cleared alarms are not retained — the gateway holds no
+alarm-history store, so the page reflects only the live active set. The page is
+read-only; it does not acknowledge alarms. If `MxGateway:Alarms:Enabled` is
+false the central monitor never starts, and the page says so instead of showing
+an empty list with no explanation.
+
+### Live data source
+
+Both the Browse subscription panel and the Alarms page read live MXAccess data
+through `IDashboardLiveDataService` (`DashboardLiveDataService`). For tag data
+it owns one shared gateway session for the whole dashboard, opened lazily on
+first use via `ISessionManager` and re-opened transparently when it faults or
+its lease expires. One session means one worker process backs every dashboard
+circuit; all access is serialised so the worker sees one in-flight command at a
+time. Tag reads go through `GatewaySession.SubscribeBulkAsync` / `ReadBulkAsync`.
+
+The Alarms page does **not** use the dashboard session: alarm data comes from
+the gateway's always-on central monitor. `QueryAlarmsAsync` reads
+`IGatewayAlarmService.CurrentAlarms` — the monitor's in-process cache — so the
+dashboard sees the same active-alarm set as every `StreamAlarms` client, with
+no per-dashboard alarm subscription. When `MxGateway:Alarms:Enabled` is false
+the monitor never starts and the cache stays empty.
+
+### API keys page
+
+`/dashboard/apikeys` lists the gateway's API keys and, for authorized
+operators, manages them. It reads key metadata through the same
+`IApiKeyAdminStore` the `apikey` CLI uses, so the dashboard and the CLI act
+on one source of truth.
+
+The table shows one row per key:
+
+- key id,
+- status (`Active` or `Revoked`),
+- display name,
+- scopes,
+- constraints (rendered as `unconstrained` when none are set),
+- created timestamp,
+- last-used timestamp.
+
+Key secrets are never listed. Only the peppered hash is stored, and the page
+never reconstructs a key. See [Authorization](./Authorization.md#constraint-enforcement)
+for what each constraint means and how it is enforced on the gRPC path.
+
+#### Management actions
+
+Create, Rotate, and Revoke controls render only when the signed-in user is
+authorized. `DashboardApiKeyAuthorization.CanManage` requires an authenticated
+principal that is a member of the LDAP `MxGateway:Ldap:RequiredGroup` — the
+same group the dashboard login enforces. An anonymous localhost viewer can read
+the table but sees no action controls.
+
+- **Create** opens a dialog for the key id, display name, scope checkboxes
+  (the `GatewayScopes` catalog), and the optional constraint fields: read and
+  write subtrees, read and write tag globs, browse subtrees, max write
+  classification, and the read-alarm-only / read-historized-only flags.
+- **Rotate** issues a new secret for an existing key id and invalidates the
+  old one.
+- **Revoke** marks a key revoked; a revoked key cannot be un-revoked.
+
+Create and Rotate return the assembled `mxgw_<keyId>_<secret>` token **once**,
+in a one-time banner. It is never shown again, so the operator must copy it
+immediately. This mirrors the `apikey create-key` / `rotate-key` CLI.
+
+Every management action appends an `api_key_audit` entry
+(`dashboard-create-key`, `dashboard-rotate-key`, `dashboard-revoke-key`) with
+the key id and the caller's remote address. Secrets and pepper values are never
+logged.
+
 ### Settings page

 Show read-only effective configuration:
@@ -330,8 +428,8 @@ Suggested configuration:
 ## Styling

 The dashboard serves Bootstrap 5.3.3 assets from
-`src/MxGateway.Server/wwwroot/lib/bootstrap/` and local layout/status styling
-from `src/MxGateway.Server/wwwroot/css/dashboard.css`.
+`src/ZB.MOM.WW.MxGateway.Server/wwwroot/lib/bootstrap/` and local layout/status styling
+from `src/ZB.MOM.WW.MxGateway.Server/wwwroot/css/dashboard.css`.

 Recommended visual language:

@@ -377,7 +475,7 @@ Integration tests should verify:
|
||||
|
||||
The first dashboard slice implements:
|
||||
|
||||
1. Blazor Server hosting in `MxGateway.Server`.
|
||||
1. Blazor Server hosting in `ZB.MOM.WW.MxGateway.Server`.
|
||||
2. local Bootstrap static assets.
|
||||
3. dashboard configuration binding.
|
||||
4. dashboard auth using API key login and HTTP-only cookie.
|
||||
|
@@ -59,7 +59,7 @@ Those belong to the worker.
## High-Level Components

```text
MxGateway.Server
ZB.MOM.WW.MxGateway.Server
  Program / Host
  Configuration
  Grpc
@@ -677,7 +677,7 @@ development only.
Dashboard authentication reuses the API-key verifier and scope model. The
dashboard login endpoint accepts the key in a form post, checks `admin` scope
when `Dashboard:RequireAdminScope` is enabled, and signs in with the
`MxGateway.Dashboard` cookie scheme. The cookie is HTTP-only, secure, strict
`ZB.MOM.WW.MxGateway.Dashboard` cookie scheme. The cookie is HTTP-only, secure, strict
SameSite, and scoped with the `__Host-MxGatewayDashboard` name. Logout clears
that cookie. Login and logout posts use anti-forgery validation, and dashboard
API keys are not accepted in query strings. `Dashboard:AllowAnonymousLocalhost`
@@ -703,15 +703,15 @@ gRPC admin API. It should initialize the auth database, create keys, list keys
without secrets, revoke keys, rotate keys, and print raw secrets only once at
creation.

`MxGateway.Server` exposes local API-key administration as an `apikey`
`ZB.MOM.WW.MxGateway.Server` exposes local API-key administration as an `apikey`
subcommand before the web host starts:

```bash
MxGateway.Server apikey init-db --sqlite-path C:\ProgramData\MxGateway\gateway-auth.db
MxGateway.Server apikey create-key --key-id operator01 --display-name Operator --scopes session:open,events:read
MxGateway.Server apikey list-keys --json
MxGateway.Server apikey revoke-key --key-id operator01
MxGateway.Server apikey rotate-key --key-id operator01 --json
ZB.MOM.WW.MxGateway.Server apikey init-db --sqlite-path C:\ProgramData\MxGateway\gateway-auth.db
ZB.MOM.WW.MxGateway.Server apikey create-key --key-id operator01 --display-name Operator --scopes session:open,events:read
ZB.MOM.WW.MxGateway.Server apikey list-keys --json
ZB.MOM.WW.MxGateway.Server apikey revoke-key --key-id operator01
ZB.MOM.WW.MxGateway.Server apikey rotate-key --key-id operator01 --json
```

The subcommands accept `--sqlite-path`, `--pepper`, and `--json`. `--pepper`
@@ -846,7 +846,7 @@ Suggested configuration shape:
    "RunMigrationsOnStartup": true
  },
  "Worker": {
    "ExecutablePath": "src/MxGateway.Worker/bin/x86/Release/MxGateway.Worker.exe",
    "ExecutablePath": "src/ZB.MOM.WW.MxGateway.Worker/bin/x86/Release/ZB.MOM.WW.MxGateway.Worker.exe",
    "WorkingDirectory": null,
    "RequiredArchitecture": "X86",
    "StartupTimeoutSeconds": 30,
@@ -887,7 +887,7 @@ Suggested configuration shape:

Do not scatter connection or path constants through implementation code.

`MxGateway.Server` binds this section to `GatewayOptions` at startup and
`ZB.MOM.WW.MxGateway.Server` binds this section to `GatewayOptions` at startup and
registers validation with `ValidateOnStart()`. Startup fails before the gateway
begins serving traffic when required authentication settings are missing,
timeouts or queue sizes are not positive, dashboard settings are malformed, or

+260
-25
@@ -7,13 +7,13 @@ provider state.

## Fake Worker Harness

`FakeWorkerHarness` in `src/MxGateway.Tests/Gateway/Workers/Fakes/` provides an
`FakeWorkerHarness` in `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/Fakes/` provides an
in-process worker side for named-pipe IPC tests. It uses the same
`WorkerFrameReader`, `WorkerFrameWriter`, and `WorkerEnvelope` contract as the
gateway so tests exercise real frame validation and worker-client state changes.

Use the harness when a gateway or session test needs worker behavior without
starting `MxGateway.Worker.exe` or loading MXAccess COM. The harness scripts:
starting `ZB.MOM.WW.MxGateway.Worker.exe` or loading MXAccess COM. The harness scripts:

- `WorkerHello` and `WorkerReady` startup,
- command replies with matching correlation ids,
@@ -37,43 +37,196 @@ event, and `CloseSession` without loading MXAccess COM.

## Live MXAccess Smoke

`WorkerLiveMxAccessSmokeTests` in `src/MxGateway.IntegrationTests/` composes the
`WorkerLiveMxAccessSmokeTests` in `src/ZB.MOM.WW.MxGateway.IntegrationTests/` composes the
real gRPC service, `SessionManager`, `SessionWorkerClientFactory`,
`WorkerClient`, `WorkerProcessLauncher`, and `MxGateway.Worker.exe`. It is
`WorkerClient`, `WorkerProcessLauncher`, and `ZB.MOM.WW.MxGateway.Worker.exe`. It is
skipped unless `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1` is set because it creates
the installed MXAccess COM object and depends on live provider state.

The live smoke opens a gateway session, launches the x86 worker, runs
`Register`, `AddItem`, and `Advise`, waits a bounded time for one
`OnDataChange`, and closes the session in a `finally` block so the worker gets a
graceful shutdown request even when a command or event assertion fails.
`Register`, `AddItem`, and `Advise`, waits a bounded time for the first
`OnDataChange` event (skipping any earlier bootstrap/registration-state event),
and closes the session in a `finally` block so the worker gets a graceful
shutdown request even when a command or event assertion fails. Cleanup failures
in that `finally` block are logged rather than thrown, so a real assertion
failure is never masked by a shutdown timeout.

`WorkerLiveMxAccessSmokeTests` additionally covers five MXAccess parity paths the
fake-worker tests cannot validate:

- a `Write` round-trip against an advised item, asserting both that the reply is
  `Ok` / `MxCommandKind.Write` *and* that the worker emits a matching
  `OnWriteComplete` event for the targeted (server, item) handle pair — the
  same round-trip proof used by `scripts/run-client-e2e-tests.ps1`,
- an `AddItem` against an invalid server handle, asserting the MXAccess failure
  surfaces in the command reply without faulting the gateway transport,
- the `UnAdvise` → `RemoveItem` → `Unregister` teardown chain, asserting each
  step replies `Ok` with the matching `MxCommandKind`, that no further
  `OnDataChange` events arrive for the un-advised pair, and that a second
  `RemoveItem` against the freed handle relays a non-`Ok` MXAccess failure,
- a `WriteSecured` round-trip after `AuthenticateUser`, asserting the reply
  carries `MxCommandKind.WriteSecured` and the credential password never
  appears in the diagnostic message (parity for both the secured-write
  ordering rule and the "do not log secrets" contract), and
- an abnormal worker exit (the worker process is killed mid-session) where the
  gateway must transition the session to `SessionState.Faulted` with a
  non-empty fault description carrying a known worker-client classification
  (pipe disconnected / worker faulted / end-of-stream / heartbeat expired).

All six tests are gated by the same `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1`
opt-in variable.

Build the worker before running the smoke:

```bash
dotnet build src/MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86
dotnet build src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj -p:Platform=x86
```

Run the smoke explicitly:

```bash
$env:MXGATEWAY_RUN_LIVE_MXACCESS_TESTS = "1"
dotnet test src/MxGateway.IntegrationTests/MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~WorkerLiveMxAccessSmokeTests
dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~WorkerLiveMxAccessSmokeTests
```

Optional live smoke variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `MXGATEWAY_LIVE_MXACCESS_WORKER_EXE` | First existing `MxGateway.Worker.exe` under `src/MxGateway.Worker/bin/...` | Worker executable path. Set this when running against a packaged worker or a non-default build output. |
| `MXGATEWAY_LIVE_MXACCESS_WORKER_EXE` | First existing `ZB.MOM.WW.MxGateway.Worker.exe` under `src/ZB.MOM.WW.MxGateway.Worker/bin/...` | Worker executable path. Set this when running against a packaged worker or a non-default build output. |
| `MXGATEWAY_LIVE_MXACCESS_ITEM` | `TestChildObject.TestInt` | MXAccess item reference used by `AddItem`. |
| `MXGATEWAY_LIVE_MXACCESS_CLIENT_NAME` | `MxGateway.IntegrationTests` | Client name passed to `Register`. |
| `MXGATEWAY_LIVE_MXACCESS_EVENT_TIMEOUT_SECONDS` | `15` | Maximum wait for the first `OnDataChange`. |
| `MXGATEWAY_LIVE_MXACCESS_CLIENT_NAME` | `ZB.MOM.WW.MxGateway.IntegrationTests` | Client name passed to `Register`. |
| `MXGATEWAY_LIVE_MXACCESS_EVENT_TIMEOUT_SECONDS` | `15` | Maximum wait for the first `OnDataChange` (also used for the `OnWriteComplete` round-trip and the abnormal-exit fault transition). |
| `MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_USER` | `admin` | ArchestrA user name passed to `AuthenticateUser` before the `WriteSecured` parity step. |
| `MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_PASSWORD` | `admin123` | Password paired with the user above. Never logged; the test asserts the value does not appear in the WriteSecured diagnostic message. |

The test output includes session id, worker process id, command status,
HRESULT/status diagnostics, event sequence and handles, close status, and worker
stdout/stderr lines emitted during the run.

## Dev-rig Probes

`src/ZB.MOM.WW.MxGateway.Worker.Tests/Probes/` partitions runtime probes from the regular
Worker.Tests regression suite. The folder is its own
`ZB.MOM.WW.MxGateway.Worker.Tests.Probes` namespace so a discovery filter (e.g. `dotnet
test --filter FullyQualifiedName~ZB.MOM.WW.MxGateway.Worker.Tests.Probes`) can target or
exclude them without enumerating individual class names. The probes are
`[Fact(Skip = "...")]` by default and exist to characterize live AVEVA
behavior on the dev rig, not to gate CI — flip `Skip = null` on the dev box
with installed MXAccess + a running Galaxy provider when running them:

- `AlarmsLiveSmokeTests` — end-to-end smoke for the alarms-over-gateway
  pipeline (`WnWrapAlarmConsumer` + `AlarmDispatcher` +
  `MxAccessAlarmEventSink`) against `\\<machine>\Galaxy!DEV` with the dev rig's
  10-second flip script writing `TestMachine_001.TestAlarm001`.
- `AlarmClientWmProbeTests` — registers as an `AlarmClient` consumer on a real
  hidden message-only window and logs every Win32 message that arrives during
  a fixed pump window. Used to identify the `WM_APP` /
  `RegisterWindowMessage` IDs alarm callbacks use.
- `WnWrapConsumerProbeTests` — instantiates AVEVA's standalone `wnwrapConsumer`
  COM class, subscribes to the dev rig's `\\<machine>\Galaxy!DEV` provider,
  and polls `GetXmlCurrentAlarms2`. The XML payload bypasses the
  `FILETIME→DateTime` auto-marshaling that crashes
  `aaAlarmManagedClient.AlarmClient.GetHighPriAlarm` on this rig.

The probes share the Worker.Tests project (so they can use its `net48`/`x86`
configuration and the installed `ArchestrA.MxAccess` / `aaAlarmManagedClient`
references), but they are not part of the regression contract — a Worker.Tests
run with `Skip` left in place passes them as skipped.

## Live Galaxy Repository

`GalaxyRepositoryLiveTests` in `src/ZB.MOM.WW.MxGateway.IntegrationTests/Galaxy/` exercises
`GalaxyRepository` directly against the `ZB` Galaxy Repository SQL database. It is
skipped unless `MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1` is set because it depends on a
reachable SQL Server instance and deployed Galaxy state — fake-worker tests cannot
cover the SQL browse RPCs.

The suite covers `TestConnectionAsync`, `GetLastDeployTimeAsync`,
`GetHierarchyAsync`, and `GetAttributesAsync`. `GetHierarchyAsync` and
`GetAttributesAsync` assert a non-empty result, so the connected `ZB` database
must contain a deployed Galaxy, not just an empty schema.

Run the Galaxy live tests explicitly:

```bash
$env:MXGATEWAY_RUN_LIVE_GALAXY_TESTS = "1"
dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~GalaxyRepositoryLiveTests
```

Optional live Galaxy variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `MXGATEWAY_LIVE_GALAXY_CONN` | `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` | Galaxy Repository connection string. Set this when the `ZB` database is on a non-default instance or needs SQL authentication. |

The default connection string targets `ZB` on `localhost` with Windows
authentication, which matches the Galaxy Repository conventions in CLAUDE.md.

## Galaxy Filter Safety

`GalaxyFilterInputSafetyTests` in `src/ZB.MOM.WW.MxGateway.Tests/Galaxy/` covers adversarial
input handling for the Galaxy Repository browse filter layer. It runs in the
unit-test project (no live SQL needed) and complements the live SQL coverage in
`GalaxyRepositoryLiveTests`.

The test class re-frames the original "Galaxy SQL injection" concern (Tests-002 in
`code-reviews/Tests/findings.md`). `GalaxyRepository` issues only four *constant*
SQL statements (`HierarchySql`, `AttributesSql`, `SELECT 1`,
`SELECT time_of_last_deploy FROM galaxy`) — no `DiscoverHierarchyRequest` field
is ever concatenated into a SQL string, so there is no dynamic SQL surface and no
`LIKE`-escaping helper to test. All filters (`TagNameGlob`, `RootTagName`,
template-chain, category, contained-path) are applied **in memory** by
`GalaxyHierarchyProjector` / `GalaxyGlobMatcher` against the cached snapshot.

The adversarial-input matrix (`'`, `' OR '1'='1`, `'; DROP TABLE gobject;--`,
`%`, `_`, `100%_off`, `[abc]`, `Pump'001`) pins the following invariants:

- SQL metacharacters (`'`, `;`) and `LIKE`-wildcards (`%`, `_`) are treated as
  opaque literals by `GalaxyGlobMatcher` — they never act as wildcards, never
  spuriously match unrelated text.
- Only `*` and `?` are glob wildcards.
- `GalaxyGlobMatcher` applies a 100 ms regex timeout so a pathological glob
  (e.g. 5 000 `a` characters plus a literal `!`) completes promptly rather than
  catastrophically backtracking.
- `GalaxyHierarchyProjector` returns zero matches (rather than the whole
  hierarchy) for an adversarial `TagNameGlob` or `TemplateChainContains`, and
  surfaces `NotFound` for an adversarial `RootTagName`.
- The `DiscoverHierarchy` RPC end-to-end returns zero matches for adversarial
  `TagNameGlob` rather than faulting.

These invariants are the real security surface of the Galaxy browse path; the
SQL-injection framing does not apply to a constant-query layer.

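The "only `*` and `?` are wildcards, everything else is an opaque literal" invariant can be illustrated with a small glob-to-regex translation. This is a Python sketch of the general technique, not the `GalaxyGlobMatcher` source. The names `glob_to_regex` and `glob_match` are hypothetical, case-sensitive matching is an assumption, and the 100 ms timeout is not reproduced because Python's standard `re` module has no per-match timeout.

```python
import re


def glob_to_regex(glob: str) -> re.Pattern:
    """Translate a glob into an anchored regex.

    Only '*' (any run of characters) and '?' (exactly one character) are
    wildcards. Every other character, including SQL metacharacters and
    LIKE wildcards, is escaped and matched literally.
    """
    parts = []
    for ch in glob:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def glob_match(glob: str, text: str) -> bool:
    """Return True when the whole text matches the glob."""
    return glob_to_regex(glob).fullmatch(text) is not None


# LIKE wildcards and bracket classes are plain text in this scheme.
adversarial = ["'", "' OR '1'='1", "'; DROP TABLE gobject;--", "%", "_", "[abc]"]
hits = [g for g in adversarial if glob_match(g, "TestMachine_001")]
```

Under these assumptions `hits` stays empty: none of the adversarial inputs matches an unrelated tag name, while `glob_match("Test*", "TestMachine_001")` still succeeds.
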
## Live LDAP

`DashboardLdapLiveTests` in `src/ZB.MOM.WW.MxGateway.IntegrationTests/` exercises
`DashboardAuthenticator` against the live GLAuth directory. It is skipped unless
`MXGATEWAY_RUN_LIVE_LDAP_TESTS=1` is set because it binds against the GLAuth
service described in `glauth.md`.

The suite builds the authenticator with a default `GatewayOptions`, so
`LdapOptions.RequiredGroup` keeps its `GwAdmin` default. `GwAdmin` is the
gateway-specific dashboard-admin role and is **not** part of the five baseline
GLAuth role groups — it must be provisioned before the LDAP live tests pass.
`AuthenticateAsync_AdminInGwAdminGroup_Succeeds` fails (rather than skips) when
GLAuth has only the baseline groups, so this is a hard prerequisite beyond "LDAP
is up." See the "Adding a gw-specific group" section of `glauth.md` for the
provisioning step that adds `GwAdmin` and grants it to `admin`.

The suite covers both the success path and the `DashboardAuthenticator` failure
branches: `admin` in `GwAdmin` succeeds; `readonly` is denied for missing group;
`admin` with a wrong password is rejected by the candidate bind without leaking
the password into `FailureMessage`; an unknown username yields no candidate; and
an unreachable LDAP server is absorbed into a failed result rather than throwing.

Run the LDAP live tests explicitly:

```bash
$env:MXGATEWAY_RUN_LIVE_LDAP_TESTS = "1"
dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~DashboardLdapLiveTests
```

## Client E2E Scripts

`scripts/discover-testmachine-tags.ps1` queries the ZB Galaxy Repository for the
@@ -100,11 +253,75 @@ powershell -ExecutionPolicy Bypass -File scripts/discover-testmachine-tags.ps1 -
```

`scripts/run-client-e2e-tests.ps1` drives the .NET, Go, Rust, Python, and Java
client CLIs through a live gateway session. For each client it opens one
session, registers, verifies `SubscribeBulk` and `UnsubscribeBulk` on a bounded
tag subset, adds and advises every discovered test tag, reads a bounded event
stream, then closes the session in a `finally` path. The script writes a JSON
report under `artifacts/e2e/`.
client CLIs through a live gateway session. The gateway and worker are assumed
to be already running at `-Endpoint`; the script does not start or stop them.
For each client it runs these phases, then closes the session in a `finally`
path and writes a JSON report under `artifacts/e2e/`:

1. **Session + register** — opens one session and registers.
2. **Bulk** — verifies `SubscribeBulk` / `UnsubscribeBulk` on a bounded tag
   subset (skip with `-SkipBulk`).
3. **Add-item / advise** — adds and advises every discovered test tag. The
   loop has no `StreamEvents` consumer attached, so advised tags accumulate
   MXAccess change events in the worker event channel
   (`MxGateway:Events:QueueCapacity`); left unbounded it overflows under
   `FailFast` backpressure and faults the worker. Every `-DrainEveryTags`
   advised tags (default 15) the loop connects a short-lived `StreamEvents`
   drain so the gateway pumps that channel empty. `-DrainEveryTags 0` disables
   the drain.
4. **Stream** — asserts a bounded event stream delivers at least one event
   (skip with `-SkipStream`).
5. **Parity** — asserts MXAccess error paths are rejected rather than silently
   succeeding: an invalid item handle and an unknown session id (skip with
   `-SkipParity`).
6. **Auth rejection** — asserts `open-session` is rejected when the API key is
   missing, and (when `-RejectScopeApiKeyEnv` names an insufficient-scope key)
   when the key lacks the required scope. Skip with `-SkipAuth`.
7. **Write round-trip** — *opt-in (`-VerifyWrite`).* Runs right after
   `register`: adds and advises a configurable writable attribute
   (`-WriteAttribute`, default `TestChangingInt`), writes a per-client
   sentinel value, then streams events and asserts an `OnWriteComplete` event
   for that item is observed — proof the write round-tripped through the
   gateway, worker, and MXAccess provider. The written value being echoed back
   in an `OnDataChange` is recorded best-effort (`echoObserved`): a
   provider-driven attribute such as `TestChangingInt` accepts the write but
   immediately overwrites it, so no data-change carries the value back. The
   Rust `stream-events` CLI emits full per-event JSON (`family`, `itemHandle`,
   `value`) so all five clients apply the same checks.

   It is opt-in because it mutates live tag state. The phase fails fast if the
   write command is rejected — e.g. against a gateway whose worker predates
   write support (`MxAccessCommandExecutor` returning `InvalidRequest` for
   `Write`/`Write2`/`WriteSecured`/`WriteSecured2`).
8. **Alarm feed + acknowledge** — *opt-in (`-VerifyAlarms`).* Runs after the
   stream phase. Exercises the two session-less alarm subcommands against the
   gateway's central alarm monitor: `stream-alarms` reads a bounded slice of
   the feed (`-AlarmStreamMax`, default 1 — the feed's first message always
   arrives immediately, whereas later ones depend on live transitions) and
   asserts at least one `AlarmFeedMessage`; `acknowledge-alarm` acknowledges
   `-AlarmReference` (default `Galaxy!TestArea.TestMachine_001.TestAlarm001`)
   and asserts the RPC round-trips. The native ack outcome is not asserted —
   it depends on whether that alarm is currently active.

   It is opt-in because it depends on the gateway's central alarm monitor
   being enabled (`MxGateway:Alarms:Enabled`) and a live alarm provider.

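The periodic drain described in phase 3 above amounts to a counter inside the advise loop. The following Python sketch shows that control flow only; `advise_all` and the `advise` / `drain` callbacks are hypothetical stand-ins for the script's CLI calls, not code from `run-client-e2e-tests.ps1`.

```python
from typing import Callable, Iterable


def advise_all(
    tags: Iterable[str],
    advise: Callable[[str], None],
    drain: Callable[[], None],
    drain_every: int = 15,
) -> int:
    """Advise every tag, draining the event channel every `drain_every` tags.

    A `drain_every` of 0 disables the drain, mirroring `-DrainEveryTags 0`.
    Returns the number of drains performed.
    """
    drains = 0
    for index, tag in enumerate(tags, start=1):
        advise(tag)
        if drain_every > 0 and index % drain_every == 0:
            drain()  # short-lived consumer that empties the queued events
            drains += 1
    return drains


advised = []
drain_count = advise_all(
    [f"Tag{i:03d}" for i in range(40)],
    advise=advised.append,
    drain=lambda: None,
)
```

With 40 tags and the default interval of 15, the sketch drains after the 15th and 30th tag, so the queue never holds more than 15 tags' worth of undelivered events between drains.
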
Each client CLI is driven through one long-lived `batch` process. Every CLI
exposes a `batch` subcommand: a process that reads one command line from stdin,
runs it through the normal subcommand dispatch, writes the JSON result, then a
line containing exactly `__MXGW_BATCH_EOR__`. The harness launches one such
process per client and pipes the ~250 operations of the flow through it, so the
process — and, for the JVM, the runtime — cold-start is paid once per client
instead of once per operation. A command that fails inside the batch process
writes its `{"error":...}` envelope and the loop continues; the harness treats
that envelope as the operation failure (used by the parity and auth phases).

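The framing described above (one command line in, a JSON result out, then the `__MXGW_BATCH_EOR__` marker line) can be consumed with a small reader loop. This Python sketch uses an in-memory stream in place of a real CLI process. The function names and the `sessionId` field are hypothetical, and it assumes the JSON result may span several lines before the marker.

```python
import io
import json

BATCH_EOR = "__MXGW_BATCH_EOR__"


def read_batch_result(stdout) -> dict:
    """Read lines until the end-of-record marker and parse them as JSON."""
    lines = []
    for raw in stdout:
        line = raw.rstrip("\r\n")
        if line == BATCH_EOR:
            return json.loads("\n".join(lines))
        lines.append(line)
    raise EOFError("batch process ended before the end-of-record marker")


def run_command(stdin, stdout, command_line: str) -> dict:
    """Send one command line to the batch process and wait for its result."""
    stdin.write(command_line + "\n")
    stdin.flush()
    return read_batch_result(stdout)


def is_failure(result: dict) -> bool:
    """An {"error": ...} envelope marks the operation as failed."""
    return "error" in result


# In-memory stand-in for the stdout of a batch process.
fake_stdout = io.StringIO("\n".join([
    '{"sessionId": "s1"}', BATCH_EOR,
    '{"error": "unknown session"}', BATCH_EOR,
]) + "\n")
first = read_batch_result(fake_stdout)
second = read_batch_result(fake_stdout)
```

Because a failed command still ends with the marker, the reader stays in sync and the same process can serve the next command, which is what lets the parity and auth phases treat an error envelope as an expected outcome.
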
Before the per-client phases run, the script builds the .NET CLI
(`dotnet build`) and installs the Java CLI (`gradle :mxgateway-cli:installDist`)
once, so the `batch` process launches straight from the compiled exe / the
installed launcher. The Go, Rust, and Python batch processes are launched via
`go run` / `cargo run` / `python -m`, which compile-or-start once when that
single per-client process starts.

Build the gateway and worker, start the gateway, and provide a valid API key
before running the client e2e script:
@@ -121,40 +338,58 @@ powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Clien
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -BulkTagCount 10
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -SkipStream
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -SkipBulk
# Write round-trip (opt-in): point at a writable scalar attribute and its
# value type.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -VerifyWrite -WriteAttribute TestChangingInt -WriteType int32
# Alarm feed + acknowledge (opt-in): needs MxGateway:Alarms:Enabled on the gateway.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -VerifyAlarms -AlarmReference "Galaxy!TestArea.TestMachine_001.TestAlarm001"
# Auth rejection: also assert an insufficient-scope key is denied.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -RejectScopeApiKeyEnv MXGATEWAY_READONLY_API_KEY
# Run all five clients concurrently as isolated child processes.
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Parallel
# Validate the flow offline (prints commands, contacts no gateway).
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -DryRun
powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1 -Endpoint localhost:5000 -ApiKeyEnv MXGATEWAY_API_KEY
```

When `-VerifyWrite` is enabled, the write round-trip fails loudly if the write
command is rejected, if `-WriteAttribute` does not name a writable scalar
attribute, or if no `OnWriteComplete` event is observed for the written item
within `-WriteEchoMaxEvents` (default 200) streamed events. Raise
`-WriteEchoMaxEvents` if the gateway's per-session event backlog is large
enough to push `OnWriteComplete` past that bound.

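The `-WriteEchoMaxEvents` bound is a capped scan over the streamed events. A minimal Python sketch of that check follows, using the per-event JSON fields named earlier (`family`, `itemHandle`); the function name `saw_write_complete` and the sample events are hypothetical.

```python
from itertools import islice
from typing import Iterable, Mapping


def saw_write_complete(
    events: Iterable[Mapping], item_handle: int, max_events: int = 200
) -> bool:
    """Scan at most `max_events` events for an OnWriteComplete on one item."""
    for event in islice(events, max_events):
        if (
            event.get("family") == "OnWriteComplete"
            and event.get("itemHandle") == item_handle
        ):
            return True
    return False


# A backlog of unrelated data changes ahead of the completion event.
backlog = [{"family": "OnDataChange", "itemHandle": 1}] * 5
stream = backlog + [{"family": "OnWriteComplete", "itemHandle": 7}]
found = saw_write_complete(stream, item_handle=7)
missed = saw_write_complete(stream, item_handle=7, max_events=5)
```

The `missed` case shows why the bound may need raising: the completion event is present, but it sits behind more queued events than the scan is allowed to read.
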
## Focused Commands

Run the cross-language smoke matrix tests after changing the documented client
smoke command list:

```bash
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~CrossLanguageSmokeMatrixTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~CrossLanguageSmokeMatrixTests
```

Run the parity fixture matrix tests after changing the integration parity
scenario list:

```bash
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~ParityFixtureMatrixTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~ParityFixtureMatrixTests
```

Run the fake worker tests after changing gateway worker IPC, session startup, or
event streaming behavior:

```bash
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~FakeWorkerHarnessTests
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~SessionWorkerClientFactoryFakeWorkerTests
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~GatewayEndToEndFakeWorkerSmokeTests
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~WorkerClientTests
dotnet test src/MxGateway.Worker.Tests/MxGateway.Worker.Tests.csproj -p:Platform=x86 --filter FullyQualifiedName~WorkerPipeSessionTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~FakeWorkerHarnessTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~SessionWorkerClientFactoryFakeWorkerTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~GatewayEndToEndFakeWorkerSmokeTests
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~WorkerClientTests
dotnet test src/ZB.MOM.WW.MxGateway.Worker.Tests/ZB.MOM.WW.MxGateway.Worker.Tests.csproj -p:Platform=x86 --filter FullyQualifiedName~WorkerPipeSessionTests
```

Run the gateway test project after shared gateway test infrastructure changes:

```bash
dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj
dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj
```

## Related Documentation

+12
-2
@@ -10,7 +10,7 @@ The layer is composed of four collaborators:

| Type | Lifetime | Role |
|------|----------|------|
| `MxAccessGatewayService` | scoped (gRPC) | Implements the four `MxAccessGateway` RPCs, performs exception mapping. |
| `MxAccessGatewayService` | scoped (gRPC) | Implements the six `MxAccessGateway` RPCs, performs exception mapping. |
| `MxAccessGrpcRequestValidator` | singleton | Rejects malformed requests before any session work runs. |
| `MxAccessGrpcMapper` | singleton | Converts public proto types to internal `WorkerCommand`/`WorkerEvent` types and back. |
| `IEventStreamService` (`EventStreamService`) | singleton | Owns the event stream pipeline, including bounded queue and backpressure handling. |
@@ -29,7 +29,7 @@ A second gRPC service, `GalaxyRepositoryGrpcService`, is mapped alongside it. It

## RPC Handlers

`MxAccessGatewayService` derives from the generated `MxAccessGateway.MxAccessGatewayBase` and implements every RPC declared in `mxaccess_gateway.proto`. The proto contract itself is documented in [Contracts](./Contracts.md); this section covers only what the server-side handler does on top of that contract.
`MxAccessGatewayService` derives from the generated `MxAccessGateway.MxAccessGatewayBase` and implements every RPC declared in `mxaccess_gateway.proto` — six in total: `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, `AcknowledgeAlarm`, and `StreamAlarms`. The proto contract itself is documented in [Contracts](./Contracts.md); this section covers only what the server-side handler does on top of that contract.

Public gRPC send and receive message sizes are configured from
`MxGateway:Protocol:MaxGrpcMessageBytes` (default 16 MiB). Official clients use
@@ -86,6 +86,14 @@ Carrying the enqueue timestamp into the worker layer is what lets queue-wait tim

`StreamEvents` is a server-streaming RPC. The handler delegates the full pipeline to `IEventStreamService` and just forwards each `MxEvent` onto the response stream. Keeping the channel and producer/consumer machinery out of the handler means cancellation, exception mapping, and metric bookkeeping live in one place.

### `AcknowledgeAlarm`

`AcknowledgeAlarm` is a unary, **session-less** RPC that acknowledges a single alarm. The handler validates `alarm_full_reference` inline (it does not run through `MxAccessGrpcRequestValidator`) and delegates to `IGatewayAlarmService.AcknowledgeAsync`. The always-on `GatewayAlarmMonitor` routes the ack over its own gateway-managed worker session — clients no longer open a session to acknowledge an alarm. A reference that parses as a canonical GUID forwards to `AcknowledgeAlarmCommand`; a `Provider!Group.Tag` reference forwards to `AcknowledgeAlarmByNameCommand`.

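The GUID-versus-name routing can be sketched as a small classifier. This Python illustration is not the gateway's C# handler: the function name `route_ack` is hypothetical, and "canonical GUID" is assumed here to mean the hyphenated 8-4-4-4-12 hexadecimal form, which the source does not spell out.

```python
import uuid


def route_ack(alarm_full_reference: str) -> str:
    """Pick the worker command name for an acknowledge request."""
    if not alarm_full_reference:
        # Mirrors the inline InvalidArgument check on the reference field.
        raise ValueError("alarm_full_reference must be non-empty")
    try:
        parsed = uuid.UUID(alarm_full_reference)
    except ValueError:
        return "AcknowledgeAlarmByNameCommand"
    # uuid.UUID also accepts braces and hyphen-free input; only the
    # hyphenated form is treated as canonical in this sketch (assumption).
    if str(parsed) == alarm_full_reference.lower():
        return "AcknowledgeAlarmCommand"
    return "AcknowledgeAlarmByNameCommand"


by_name = route_ack("Galaxy!TestArea.TestMachine_001.TestAlarm001")
by_guid = route_ack("123e4567-e89b-12d3-a456-426614174000")
```

Checking the round-tripped string keeps loosely formatted input from being treated as a GUID, which is one reasonable reading of "canonical" and should be verified against the handler.
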
### `StreamAlarms`

`StreamAlarms` is a server-streaming, **session-less** RPC that attaches to the gateway's central alarm feed. The handler delegates to `IGatewayAlarmService.StreamAsync`. The stream opens with one `AlarmFeedMessage` carrying an `active_alarm` per currently-active alarm (the ConditionRefresh snapshot), then a single `snapshot_complete`, then a `transition` for every subsequent raise / acknowledge / clear. It is served by the always-on `GatewayAlarmMonitor`, which owns a single gateway-managed worker session and fans out to every attached client — clients no longer open a session of their own. `alarm_filter_prefix`, when set, scopes the stream to a sub-tree.

||||
## Validation Rules
|
||||
|
||||
`MxAccessGrpcRequestValidator` rejects requests with `StatusCode.InvalidArgument` before any session work happens. The rules are intentionally narrow — anything that requires session state (for example, "session does not exist") is left for `ISessionManager` so the validator can stay synchronous and side-effect free.
|
||||
@@ -96,6 +104,8 @@ Carrying the enqueue timestamp into the worker layer is what lets queue-wait tim
 | `CloseSession` | `session_id` must be non-empty. | `InvalidArgument` |
 | `StreamEvents` | `session_id` must be non-empty. | `InvalidArgument` |
 | `Invoke` | `session_id` non-empty, `command` present, `kind` not `Unspecified`, payload oneof must match `kind`. | `InvalidArgument` |
+| `AcknowledgeAlarm` | `alarm_full_reference` must be non-empty. Validated inline in the handler, not by `MxAccessGrpcRequestValidator`. | `InvalidArgument` |
+| `StreamAlarms` | No required fields — `alarm_filter_prefix` is optional. | — |

 The payload-vs-kind check matters because the `MxCommand.payload` oneof is non-discriminated on the wire — a misaligned client could send `kind = Write` with a `Register` payload and silently confuse the worker. The validator turns that into a clear client error:
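The hunk ends before the validator's own snippet, so the following is only a minimal sketch of a payload-vs-kind check, not the repository's code. `CommandKind`, `PayloadCase`, `CommandSketch`, and `PayloadKindCheck` are hypothetical stand-ins for the generated protobuf types and for `MxAccessGrpcRequestValidator`; a real handler would presumably surface the message as `StatusCode.InvalidArgument`.

```csharp
using System;

// Hypothetical stand-ins for the generated protobuf enum and oneof case.
public enum CommandKind { Unspecified, Register, Write }
public enum PayloadCase { None, Register, Write }

// Hypothetical, simplified view of a command: the declared kind plus
// whichever oneof payload the client actually populated.
public sealed record CommandSketch(CommandKind Kind, PayloadCase Payload);

public static class PayloadKindCheck
{
    // Returns null when the payload matches the declared kind,
    // otherwise a message describing the mismatch.
    public static string? Validate(CommandSketch command)
    {
        if (command.Kind == CommandKind.Unspecified)
        {
            return "kind must not be Unspecified";
        }

        // Map each kind to the single payload case it is allowed to carry.
        PayloadCase expected = command.Kind switch
        {
            CommandKind.Register => PayloadCase.Register,
            CommandKind.Write => PayloadCase.Write,
            _ => PayloadCase.None,
        };

        return command.Payload == expected
            ? null
            : $"payload {command.Payload} does not match kind {command.Kind}";
    }
}
```

Keeping the check a pure function of the request is what lets a validator of this shape stay synchronous and side-effect free, as the Validation Rules paragraph requires.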
@@ -64,9 +64,9 @@ Labels: `area:client-dotnet`, `type:infra`, `priority:p0`

 Deliverables:

-- `clients/dotnet/MxGateway.Client`,
-- `clients/dotnet/MxGateway.Client.Cli`,
-- `clients/dotnet/MxGateway.Client.Tests`,
+- `clients/dotnet/ZB.MOM.WW.MxGateway.Client`,
+- `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Cli`,
+- `clients/dotnet/ZB.MOM.WW.MxGateway.Client.Tests`,
 - optional integration test project,
 - generated protobuf setup.
@@ -22,19 +22,19 @@ Labels: `area:gateway`, `type:infra`, `priority:p0`

 Deliverables:

-- create `src/MxGateway.sln`,
-- create `src/MxGateway.Contracts`,
-- create `src/MxGateway.Server`,
-- create `src/MxGateway.Tests`,
-- create `src/MxGateway.IntegrationTests`,
-- target `MxGateway.Server` to `net10.0`,
+- create `src/ZB.MOM.WW.MxGateway.slnx`,
+- create `src/ZB.MOM.WW.MxGateway.Contracts`,
+- create `src/ZB.MOM.WW.MxGateway.Server`,
+- create `src/ZB.MOM.WW.MxGateway.Tests`,
+- create `src/ZB.MOM.WW.MxGateway.IntegrationTests`,
+- target `ZB.MOM.WW.MxGateway.Server` to `net10.0`,
 - add shared C# build settings in `Directory.Build.props`,
 - add baseline tests.

 Acceptance criteria:

-- `dotnet build src/MxGateway.sln` succeeds,
-- `dotnet test src/MxGateway.sln` succeeds,
+- `dotnet build src/ZB.MOM.WW.MxGateway.slnx` succeeds,
+- `dotnet test src/ZB.MOM.WW.MxGateway.slnx` succeeds,
 - gateway project does not reference MXAccess COM.

 ### Issue: Define Protobuf Contracts
@@ -43,8 +43,8 @@ Labels: `area:contracts`, `type:feature`, `priority:p0`

 Deliverables:

-- `src/MxGateway.Contracts/Protos/mxaccess_gateway.proto`,
-- `src/MxGateway.Contracts/Protos/mxaccess_worker.proto`,
+- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto`,
+- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto`,
 - `MxAccessGateway` service with `OpenSession`, `CloseSession`, `Invoke`, and
   `StreamEvents`,
 - `WorkerEnvelope` and worker IPC messages,
@@ -23,12 +23,12 @@ Labels: `area:worker`, `type:infra`, `priority:p0`

 Deliverables:

-- create `src/MxGateway.Worker`,
+- create `src/ZB.MOM.WW.MxGateway.Worker`,
 - target `.NET Framework 4.8`,
 - platform target `x86`,
 - reference generated worker contracts,
 - reference `ArchestrA.MXAccess.dll`,
-- create `src/MxGateway.Worker.Tests`,
+- create `src/ZB.MOM.WW.MxGateway.Worker.Tests`,
 - document MSBuild command from `docs/ToolchainLinks.md`.

 Acceptance criteria:
+2
-2
@@ -4,7 +4,7 @@ The metrics subsystem exposes counters, histograms, and observable gauges that d

 ## Overview

-`GatewayMetrics` is a singleton (registered in `GatewayApplication.cs`) that owns a single `Meter` named `MxGateway.Server` and a set of synchronised counters, histograms, and observable gauges. Subsystems call typed mutator methods (`SessionOpened`, `CommandFailed`, `EventReceived`, etc.) rather than touching the `Meter` directly, which keeps the OpenTelemetry instrument names and tag conventions in one place. A `lock (_syncRoot)` block guards the scalar fields used by `GetSnapshot`, while per-event maps use `ConcurrentDictionary<string, long>` so the hot event path avoids the lock.
+`GatewayMetrics` is a singleton (registered in `GatewayApplication.cs`) that owns a single `Meter` named `ZB.MOM.WW.MxGateway.Server` and a set of synchronised counters, histograms, and observable gauges. Subsystems call typed mutator methods (`SessionOpened`, `CommandFailed`, `EventReceived`, etc.) rather than touching the `Meter` directly, which keeps the OpenTelemetry instrument names and tag conventions in one place. A `lock (_syncRoot)` block guards the scalar fields used by `GetSnapshot`, while per-event maps use `ConcurrentDictionary<string, long>` so the hot event path avoids the lock.

 ## Meter and OpenTelemetry Compatibility
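The design in the Overview paragraph (typed mutators, a lock for scalar snapshot fields, a `ConcurrentDictionary` for the hot event path) can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `GatewayMetrics`: the class name `MetricsSketch`, the meter name `Example.Gateway`, the instrument name `sessions.opened`, and the accessor members are all hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics.Metrics;

public sealed class MetricsSketch : IDisposable
{
    // Hypothetical meter name, exposed as a constant so hosting code can register it.
    public const string MeterName = "Example.Gateway";

    private readonly Meter _meter = new(MeterName);
    private readonly Counter<long> _sessionsOpened;
    private readonly object _syncRoot = new();
    private readonly ConcurrentDictionary<string, long> _eventsByFamily = new();
    private long _openSessions;

    public MetricsSketch()
    {
        // Instrument names stay in one place instead of being scattered across callers.
        _sessionsOpened = _meter.CreateCounter<long>("sessions.opened");
    }

    // Typed mutator: callers never touch the Meter directly.
    public void SessionOpened()
    {
        _sessionsOpened.Add(1);
        lock (_syncRoot)
        {
            _openSessions++;
        }
    }

    // Hot path: a lock-free per-key update instead of taking _syncRoot.
    public void EventReceived(string family) =>
        _eventsByFamily.AddOrUpdate(family, 1, (_, count) => count + 1);

    public long OpenSessions
    {
        get { lock (_syncRoot) { return _openSessions; } }
    }

    public long EventsFor(string family) =>
        _eventsByFamily.TryGetValue(family, out long count) ? count : 0;

    public void Dispose() => _meter.Dispose();
}
```

The split matters because snapshot reads need several scalar fields to be mutually consistent, whereas per-event counts only need each key to be updated atomically.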
@@ -13,7 +13,7 @@ The meter name is exposed as a constant so that hosting code can register it wit
 ```csharp
 public sealed class GatewayMetrics : IDisposable
 {
-    public const string MeterName = "MxGateway.Server";
+    public const string MeterName = "ZB.MOM.WW.MxGateway.Server";

     public GatewayMetrics()
     {
@@ -33,23 +33,23 @@ project targets .NET Framework 4.8, but the SDK resolver comes from the .NET SDK
 installation:

 ```powershell
-dotnet msbuild src\MxGateway.Worker\MxGateway.Worker.csproj /restore /p:Configuration=Debug /p:Platform=x86
+dotnet msbuild src\ZB.MOM.WW.MxGateway.Worker\ZB.MOM.WW.MxGateway.Worker.csproj /restore /p:Configuration=Debug /p:Platform=x86
 ```

 `docs/ToolchainLinks.md` records the Visual Studio MSBuild executable for
 classic .NET Framework and COM interop builds:

 ```powershell
-& "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\MSBuild\Current\Bin\MSBuild.exe" src\MxGateway.Worker\MxGateway.Worker.csproj /p:Configuration=Debug /p:Platform=x86
+& "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\MSBuild\Current\Bin\MSBuild.exe" src\ZB.MOM.WW.MxGateway.Worker\ZB.MOM.WW.MxGateway.Worker.csproj /p:Configuration=Debug /p:Platform=x86
 ```

 Run the worker tests with the same platform target:

 ```powershell
-dotnet test src\MxGateway.Worker.Tests\MxGateway.Worker.Tests.csproj -p:Platform=x86
+dotnet test src\ZB.MOM.WW.MxGateway.Worker.Tests\ZB.MOM.WW.MxGateway.Worker.Tests.csproj -p:Platform=x86
 ```

-The only MXAccess interop reference belongs in `MxGateway.Worker`. Gateway and
+The only MXAccess interop reference belongs in `ZB.MOM.WW.MxGateway.Worker`. Gateway and
 test projects may reference the worker project for metadata and scaffold tests,
 but they must not reference `ArchestrA.MXAccess.dll` directly.
@@ -132,7 +132,7 @@ credential, or API key values before the message is written.
 ## Internal Components

 ```text
-MxGateway.Worker
+ZB.MOM.WW.MxGateway.Worker
 Program
 Bootstrap
 WorkerOptions
@@ -251,7 +251,7 @@ The loop should update a heartbeat timestamp after:
 - processing an MXAccess event.

 `StaRuntime` implements this runtime boundary in the worker. It starts one
-background thread named `MxGateway.Worker.STA`, sets it to `ApartmentState.STA`,
+background thread named `ZB.MOM.WW.MxGateway.Worker.STA`, sets it to `ApartmentState.STA`,
 initializes COM through `StaComApartmentInitializer`, and runs
 `StaMessagePump`. Commands are scheduled through `InvokeAsync`; the command
 queue signals an `AutoResetEvent` so `MsgWaitForMultipleObjectsEx` can wake the
@@ -655,12 +655,39 @@ the event queue implementation owns those counters.

 The STA watchdog currently emits a `WorkerFault` with
 `WorkerFaultCategory.StaHung` when `LastStaActivityUtc` is older than
-`WorkerPipeSessionOptions.HeartbeatGrace`. The fault includes the current
-command correlation id when a command is active. Command duration and high event
-queue depth remain observable through heartbeat fields until dedicated
-thresholds own those warnings. The worker reports stale STA activity, but the
-gateway owns the final kill decision through its existing heartbeat and worker
-lifecycle policy.
+`WorkerPipeSessionOptions.HeartbeatGrace` **and no command is in flight**.
+`StaRuntime.ProcessQueuedCommands` calls `MarkActivity()` only immediately
+before and after each work item, so a synchronously long-running STA command
+(for example a `ReadBulk` waiting `timeout_ms` for the first `OnDataChange`)
+legitimately freezes `LastStaActivityUtc` for the duration of the wait while
+the worker is healthy. The watchdog is therefore suppressed while the
+heartbeat snapshot's `CurrentCommandCorrelationId` is non-empty: the worker is
+busy executing a command, not hung, and the heartbeat already surfaces the
+in-flight correlation id so the gateway can apply its own per-command timeout
+if it considers the command too slow. The fault still fires on a truly hung
+STA — no command in flight and no activity for longer than `HeartbeatGrace` —
+which is the only case the watchdog can usefully distinguish from a slow
+command. Command duration and high event queue depth remain observable through
+heartbeat fields until dedicated thresholds own those warnings. The worker
+reports stale STA activity, but the gateway owns the final kill decision
+through its existing heartbeat and worker lifecycle policy.
+
+The in-flight-command suppression itself is bounded by
+`WorkerPipeSessionOptions.HeartbeatStuckCeiling` (default 75 seconds = 5 ×
+`HeartbeatGrace`). The motivating case for the suppression is a legitimately
+slow synchronous command — but a genuinely stuck COM call (for example
+against a dead MXAccess provider whose cross-apartment marshaler is
+permanently blocked, or a write completion that never fires) leaves
+`CurrentCommandCorrelationId` non-empty indefinitely. Without an upper bound
+the worker-side `StaHung` watchdog would be permanently defeated for that
+session and only the gateway's per-command timeout would catch the hang —
+losing the worker-originated diagnostic (`StaHung` fault category, the
+stale-by interval) from the gateway audit trail. Once `LastStaActivityUtc`
+has been stale for longer than `HeartbeatStuckCeiling`, the watchdog fires
+`StaHung` regardless of whether a command is in flight, on the assumption
+that no legitimate STA command should run that long without periodically
+refreshing activity. Deployments that legitimately run very long bulk
+operations should raise the ceiling rather than disable it.

 ## Shutdown
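The watchdog rule added in the hunk above (grace period, suppression while a command is in flight, and the stuck ceiling that bounds the suppression) reduces to a small decision function. The sketch below is an assumption-laden illustration rather than the worker's implementation: `StaWatchdogSketch` and `ShouldReportStaHung` are hypothetical names, and only the parameter names mirror identifiers quoted in the text.

```csharp
using System;

public static class StaWatchdogSketch
{
    // Decides whether a StaHung fault should be reported, following the prose rule:
    //   1. Activity within HeartbeatGrace: never report.
    //   2. Stale beyond HeartbeatStuckCeiling: always report, even mid-command.
    //   3. In between: report only when no command is in flight.
    public static bool ShouldReportStaHung(
        DateTime nowUtc,
        DateTime lastStaActivityUtc,
        string currentCommandCorrelationId,
        TimeSpan heartbeatGrace,
        TimeSpan heartbeatStuckCeiling)
    {
        TimeSpan staleFor = nowUtc - lastStaActivityUtc;

        if (staleFor <= heartbeatGrace)
        {
            return false; // recent activity, healthy
        }

        if (staleFor > heartbeatStuckCeiling)
        {
            return true; // suppression is bounded: treat as a stuck COM call
        }

        // Stale but below the ceiling: a non-empty correlation id means
        // "busy executing a command", which suppresses the fault.
        return string.IsNullOrEmpty(currentCommandCorrelationId);
    }
}
```

With the defaults implied by the text (a 75 second ceiling equal to 5 × `HeartbeatGrace`, so a 15 second grace), a command that has blocked the STA for 30 seconds would be left alone, while one that has blocked it for 80 seconds would be reported.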
@@ -807,7 +834,7 @@ tests. `AddItem` uses `TestChildObject.TestInt` by default and accepts an
 override through `MXGATEWAY_LIVE_MXACCESS_ITEM`; `AddItem2` uses the captured
 parity fixture shape `AddItem2("TestInt", "TestChildObject")`.

-`WorkerLiveMxAccessSmokeTests` in `src/MxGateway.IntegrationTests/` uses the
+`WorkerLiveMxAccessSmokeTests` in `src/ZB.MOM.WW.MxGateway.IntegrationTests/` uses the
 same opt-in variable for the gateway-to-worker live smoke. It launches the x86
 worker through `WorkerProcessLauncher`, opens a gateway session, runs
 `Register`, `AddItem`, and `Advise`, waits for one `OnDataChange`, and closes
@@ -88,7 +88,7 @@ into a transport failure when the worker captured HRESULT or status details.
 Run the parity fixture matrix tests after changing the matrix:

 ```bash
-dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter FullyQualifiedName~ParityFixtureMatrixTests
+dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~ParityFixtureMatrixTests
 ```

 Live MXAccess execution remains opt-in. The matrix defines which scenarios to
+10
-3
@@ -16,7 +16,7 @@ All four interfaces (`ISessionManager`, `ISessionRegistry`, `ISessionWorkerClien

 The session id is an opaque string in the form `session-{guid:N}` and the per-session pipe name is `mxaccess-gateway-{ProcessId}-{SessionId}`. Encoding the gateway PID into the pipe name avoids collisions when an old gateway process leaks pipes that the OS has not yet reclaimed.

-`SessionState` itself is the protobuf-generated enum from `MxGateway.Contracts.Proto`, so it is shared between the gateway and clients on the wire.
+`SessionState` itself is the protobuf-generated enum from `ZB.MOM.WW.MxGateway.Contracts.Proto`, so it is shared between the gateway and clients on the wire.

 ```csharp
 public void TransitionTo(SessionState nextState)
@@ -33,12 +33,19 @@ public void TransitionTo(SessionState nextState)
             return;
         }

+        if (_state is SessionState.Closing
+            && nextState is not SessionState.Closed
+            && nextState is not SessionState.Faulted)
+        {
+            return;
+        }
+
         _state = nextState;
     }
 }
 ```

-`Closed` is terminal and `Faulted` only allows a transition to `Closed`. This guards against late callbacks (worker exit, heartbeat timeout) re-animating a session that is already torn down.
+`Closed` is terminal, `Faulted` only allows a transition to `Closed`, and `Closing` only allows a transition to `Closed` or `Faulted`. This guards against late callbacks (worker exit, heartbeat timeout) re-animating a session that is already tearing down or torn down — once `CloseAsync` has set `Closing` under `_syncRoot`, no `TransitionTo(Ready)` from another thread can walk the session back to `Ready`. Both close-related writes (`Closing` and `Closed`) go through `_syncRoot` exactly like every other state write; `_closeLock` only serializes concurrent close attempts.

 ### SessionManager (ISessionManager)
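The transition rules stated for `TransitionTo` in the hunk above can also be read as a small lookup. The sketch below restates them as a pure function so they are easy to unit-test in isolation; it is not the session's actual code. `SessionStateSketch` is a hypothetical subset of the generated `SessionState` enum (only the four members named in the text), and `TransitionRules.IsAllowed` is an invented helper.

```csharp
using System;

// Hypothetical subset of the generated SessionState enum.
public enum SessionStateSketch { Ready, Closing, Faulted, Closed }

public static class TransitionRules
{
    // Mirrors the prose rule:
    //   Closed  -> nothing (terminal)
    //   Faulted -> Closed only
    //   Closing -> Closed or Faulted only
    //   any other state -> unrestricted in this sketch
    public static bool IsAllowed(SessionStateSketch current, SessionStateSketch next) =>
        current switch
        {
            SessionStateSketch.Closed => false,
            SessionStateSketch.Faulted => next == SessionStateSketch.Closed,
            SessionStateSketch.Closing =>
                next is SessionStateSketch.Closed or SessionStateSketch.Faulted,
            _ => true,
        };
}
```

A late `TransitionTo(Ready)` callback arriving after close has begun corresponds to `IsAllowed(Closing, Ready)`, which this table rejects.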
@@ -184,7 +191,7 @@ Sessions open with `MxGateway:Sessions:DefaultLeaseSeconds` (default 1800) added

 ### Close

-`GatewaySession.CloseAsync` is serialized by a per-session `SemaphoreSlim` (`_closeLock`). It transitions to `Closing`, asks the worker client to shut down within `ShutdownTimeout`, and on success transitions to `Closed`. If `WorkerClient.ShutdownAsync` throws, the session falls back to `IWorkerClient.Kill` (forced close):
+`GatewaySession.CloseAsync` is serialized by a per-session `SemaphoreSlim` (`_closeLock`) so only one close runs at a time, but every read/write of `_state` still passes through `_syncRoot` (via `TryBeginClose` and `MarkClosed`). The close path therefore obeys the same lock discipline as `TransitionTo` / `MarkFaulted`: it transitions to `Closing`, asks the worker client to shut down within `ShutdownTimeout`, and on success transitions to `Closed`. `DisposeAsync` waits on `_closeLock` once before disposing the semaphore so an in-flight close's `Release()` cannot race against the dispose. If `WorkerClient.ShutdownAsync` throws, the session falls back to `IWorkerClient.Kill` (forced close):

 ```csharp
 if (_workerClient is not null)
@@ -1,13 +1,13 @@
 # Worker Bootstrap

-The bootstrap layer parses the command-line arguments and environment variables passed to the `MxGateway.Worker` process, validates them against the gateway contract, and produces either a populated `WorkerOptions` instance or a structured failure that maps to a `WorkerExitCode`.
+The bootstrap layer parses the command-line arguments and environment variables passed to the `ZB.MOM.WW.MxGateway.Worker` process, validates them against the gateway contract, and produces either a populated `WorkerOptions` instance or a structured failure that maps to a `WorkerExitCode`.

 ## Overview

 The worker process is a short-lived child of the gateway. The gateway side of this contract lives in [WorkerProcessLauncher](./WorkerProcessLauncher.md). On the worker side, `Program.cs` is a single line that delegates to `WorkerApplication.Run(args)`:

 ```csharp
-using MxGateway.Worker;
+using ZB.MOM.WW.MxGateway.Worker;

 return WorkerApplication.Run(args);
 ```
@@ -1,10 +1,10 @@
 # Worker Conversion Layer

-The conversion layer in `MxGateway.Worker.Conversion` projects COM `VARIANT` payloads, `HRESULT` codes, and `MXSTATUS_PROXY` records into the protobuf wire types in `MxGateway.Contracts.Proto`. The design is parity-first: every projection preserves enough raw metadata that the original COM observation can be reconstructed downstream.
+The conversion layer in `ZB.MOM.WW.MxGateway.Worker.Conversion` projects COM `VARIANT` payloads, `HRESULT` codes, and `MXSTATUS_PROXY` records into the protobuf wire types in `ZB.MOM.WW.MxGateway.Contracts.Proto`. The design is parity-first: every projection preserves enough raw metadata that the original COM observation can be reconstructed downstream.

 ## Overview

-`gateway.md` (sections "Value Model" and "Status Model") requires that the wire format use a value union capable of representing COM `VARIANT` values and arrays, that lossy conversions retain both the typed projection and raw diagnostic metadata, and that `MXSTATUS_PROXY` arrays never collapse to a single success flag. The types in `src/MxGateway.Worker/Conversion/` are the worker-side enforcement of those rules.
+`gateway.md` (sections "Value Model" and "Status Model") requires that the wire format use a value union capable of representing COM `VARIANT` values and arrays, that lossy conversions retain both the typed projection and raw diagnostic metadata, and that `MXSTATUS_PROXY` arrays never collapse to a single success flag. The types in `src/ZB.MOM.WW.MxGateway.Worker/Conversion/` are the worker-side enforcement of those rules.

 The layer is split into three concerns:
@@ -35,17 +35,22 @@ oversized frames, protocol version mismatches, and session mismatches.

 ## Verification

+The frame protocol lives in `ZB.MOM.WW.MxGateway.Worker.Ipc` (`WorkerFrameReader`,
+`WorkerFrameWriter`, `WorkerFrameProtocolOptions`) and is covered by
+`src/ZB.MOM.WW.MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs`. The worker is an
+x86 process, so build and test it with `-p:Platform=x86`.
+
 Run the focused tests after changing the frame protocol:

-```bash
-dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter WorkerFrameProtocolTests
+```powershell
+dotnet test src/ZB.MOM.WW.MxGateway.Worker.Tests/ZB.MOM.WW.MxGateway.Worker.Tests.csproj -p:Platform=x86 --filter WorkerFrameProtocolTests
 ```

-Run the gateway build because the frame protocol is part of
-`MxGateway.Server`:
+Run the x86 worker build because the frame protocol is part of
+`ZB.MOM.WW.MxGateway.Worker`:

-```bash
-dotnet build src/MxGateway.Server/MxGateway.Server.csproj
+```powershell
+dotnet build src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj -p:Platform=x86
 ```

 ## Related Documentation
@@ -60,13 +60,13 @@ optional pipe reservation, records a worker kill metric, and reports a
 Run the focused launcher tests after changing process launch behavior:

 ```bash
-dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj --filter WorkerProcessLauncherTests
+dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter WorkerProcessLauncherTests
 ```

-Run the gateway build because the launcher is part of `MxGateway.Server`:
+Run the gateway build because the launcher is part of `ZB.MOM.WW.MxGateway.Server`:

 ```bash
-dotnet build src/MxGateway.Server/MxGateway.Server.csproj
+dotnet build src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj
 ```

 ## Related Documentation
+3
-3
@@ -4,7 +4,7 @@ The worker STA runtime owns the dedicated single-threaded apartment thread that

 ## Why an STA Is Required

-The installed MXAccess interop assembly declares an `Apartment` threading model (see `gateway.md` under "STA Worker Thread Model"). COM objects with that model must be created and called on a thread initialized as a single-threaded apartment, and any callbacks the object raises (event sink calls) are delivered through the thread's Windows message queue. A plain blocking queue is not sufficient: the STA loop must pump Windows messages so that the COM marshaler can deliver event invocations on the same thread that holds the object. Because of that constraint, every MXAccess operation in the worker is funneled through the types in `src/MxGateway.Worker/Sta/`.
+The installed MXAccess interop assembly declares an `Apartment` threading model (see `gateway.md` under "STA Worker Thread Model"). COM objects with that model must be created and called on a thread initialized as a single-threaded apartment, and any callbacks the object raises (event sink calls) are delivered through the thread's Windows message queue. A plain blocking queue is not sufficient: the STA loop must pump Windows messages so that the COM marshaler can deliver event invocations on the same thread that holds the object. Because of that constraint, every MXAccess operation in the worker is funneled through the types in `src/ZB.MOM.WW.MxGateway.Worker/Sta/`.

 ## Types
@@ -20,13 +20,13 @@ The installed MXAccess interop assembly declares an `Apartment` threading model

 ## STA Thread Initialization

-`StaRuntime`'s constructor configures a background `Thread` named `MxGateway.Worker.STA` and forces it into `ApartmentState.STA` before the thread starts. `Start()` releases the thread and then blocks on `startedEvent` so callers observe a fully-initialized apartment (or a captured `startupException`) before the first `InvokeAsync` call:
+`StaRuntime`'s constructor configures a background `Thread` named `ZB.MOM.WW.MxGateway.Worker.STA` and forces it into `ApartmentState.STA` before the thread starts. `Start()` releases the thread and then blocks on `startedEvent` so callers observe a fully-initialized apartment (or a captured `startupException`) before the first `InvokeAsync` call:

 ```csharp
 staThread = new Thread(ThreadMain)
 {
     IsBackground = true,
-    Name = "MxGateway.Worker.STA"
+    Name = "ZB.MOM.WW.MxGateway.Worker.STA"
 };
 staThread.SetApartmentState(ApartmentState.STA);
 ```