rename: prefix gateway projects/namespaces with ZB.MOM.WW + sln→slnx

Apply the ZB.MOM.WW. prefix to all gateway-side projects, folders,
.csproj/.sln contents, C# namespaces, using directives, generated proto
C# (csharp_namespace + checked-in generated files), InternalsVisibleTo
attributes, project-name string literals (LoadProject, .sln lookups,
worker exe paths, staticwebassets manifest), and the install/script/doc
references that point at any of the above. Migrate the solution from
.sln to .slnx via `dotnet sln migrate` and delete the old file.

External-runtime identifiers are intentionally NOT prefixed so external
configuration keeps working:
- GatewayMetrics.cs MeterName ("MxGateway.Server")
- DashboardAuthenticationDefaults Scheme/Policy ("MxGateway.Dashboard")
- GatewayRequestLoggingMiddleware logger category ("MxGateway.Request")
- StaRuntime thread name ("MxGateway.Worker.STA")
- appsettings.json root section "MxGateway" + env-var prefix
  MxGateway__... and secret-name MxGateway:ApiKeyPepper
- C:\ProgramData\MxGateway\ data dir paths

Also fixes two tests that were not rename-related but became visible
while validating the rename:

- WorkerLiveMxAccessSmokeTests.ShutDownAsync: cancellation that the
  gateway service correctly maps to RpcException(Cancelled) per gRPC
  convention was being misclassified as a stream fault. Added a sibling
  catch on RpcException with StatusCode.Cancelled.

- IntegrationTestEnvironment.ResolveRepositoryRoot: extracted IsRepositoryRoot
  and made it accept either a .git marker OR a .sln/.slnx next to src/
  so the worker-exe walker works in non-git working copies.

clients/proto/proto-inputs.json's protoRoot updated to point at
src/ZB.MOM.WW.MxGateway.Contracts/Protos.

Verified by `dotnet build` and a full `dotnet test` of the .slnx with
MXGATEWAY_RUN_LIVE_{MXACCESS,LDAP,GALAXY}_TESTS=1:
  Tests: 472/472 pass
  Worker.Tests: 280/280 pass (4 dev-rig [Fact(Skip=...)] skipped)
  IntegrationTests: 18/18 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-23 16:22:23 -04:00
parent 867bf18116
commit dc9c0c950c
491 changed files with 32854 additions and 8414 deletions
+71 -12
@@ -104,8 +104,8 @@ Responsibilities:
The gateway must never instantiate or call MXAccess directly.
The gateway observability foundation lives in `MxGateway.Server.Diagnostics`
and `MxGateway.Server.Metrics`. Structured logging scopes carry session,
The gateway observability foundation lives in `ZB.MOM.WW.MxGateway.Server.Diagnostics`
and `ZB.MOM.WW.MxGateway.Server.Metrics`. Structured logging scopes carry session,
worker, correlation, command, and client identity fields with redaction applied
before values enter log state. `GatewayMetrics` exposes counters, gauges, and
histograms through .NET `Meter` and a snapshot API that dashboard services can
@@ -113,13 +113,31 @@ project without binding to a metrics exporter.
`DashboardSnapshotService` projects sessions, workers, metrics, faults, and
effective configuration into immutable DTOs for read-only dashboard rendering.
The Blazor Server dashboard renders those snapshots at `/dashboard`,
`/dashboard/sessions`, `/dashboard/workers`, `/dashboard/events`, and
`/dashboard/settings`. Components subscribe to
`/dashboard/sessions`, `/dashboard/workers`, `/dashboard/events`,
`/dashboard/galaxy`, and `/dashboard/settings`. Components subscribe to
`IDashboardSnapshotService.WatchSnapshotsAsync()` and update on the configured
snapshot interval without mutating session or worker state. The dashboard uses
local Bootstrap CSS and JavaScript plus a small local stylesheet; it does not
use a Blazor UI component library.
`/dashboard/browse` walks the `IGalaxyHierarchyCache` tree and reads subscribed
tag values live through `IDashboardLiveDataService`, which owns one shared,
lazily-opened gateway session for the whole dashboard. `/dashboard/alarms`
reads the central alarm monitor's in-process cache directly. See
`docs/GatewayDashboardDesign.md`.
The gateway runs an always-on central alarm monitor (`GatewayAlarmMonitor`):
one gateway-owned worker session subscribes the configured AVEVA alarm
provider, caches the active-alarm set (reconciled periodically against the
worker's snapshot), and fans it out to every client through the session-less
`StreamAlarms` RPC — the stream opens with the current active-alarm snapshot,
then streams live transitions. `AcknowledgeAlarm` is session-less and routes
through the monitor. Clients never open a worker session to see alarms, and
alarm monitoring is independent of client lifecycle; the monitor re-opens its
session if the worker faults. Gated by `MxGateway:Alarms:Enabled` — see
`docs/DesignDecisions.md` for why this reverses the v1 single-subscriber rule
for the alarm subsystem.
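
The snapshot-then-transitions behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the actual `GatewayAlarmMonitor`: `AlarmEvent`, `AlarmFanOutSketch`, `Publish`, and `Subscribe` are hypothetical names, and the sketch leaves out unsubscription, periodic reconciliation against the worker, and backpressure.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;

// Hypothetical event shape; the real proto messages are not shown in this document.
public sealed record AlarmEvent(string AlarmId, bool IsActive, bool IsSnapshot);

// Hypothetical fan-out: one producer (the monitor session), many readers.
public sealed class AlarmFanOutSketch
{
    private readonly object _gate = new();
    private readonly Dictionary<string, AlarmEvent> _active = new();
    private readonly List<Channel<AlarmEvent>> _subscribers = new();

    // Called whenever an alarm changes state: update the cached
    // active-alarm set, then forward the transition to every subscriber.
    public void Publish(string alarmId, bool isActive)
    {
        var transition = new AlarmEvent(alarmId, isActive, IsSnapshot: false);
        lock (_gate)
        {
            if (isActive)
            {
                _active[alarmId] = transition;
            }
            else
            {
                _active.Remove(alarmId);
            }

            foreach (var subscriber in _subscribers)
            {
                subscriber.Writer.TryWrite(transition);
            }
        }
    }

    // A new subscriber first receives the current active set (flagged as
    // snapshot entries) and only then starts receiving live transitions.
    // Doing both under the same lock keeps the two phases from interleaving.
    public ChannelReader<AlarmEvent> Subscribe()
    {
        var channel = Channel.CreateUnbounded<AlarmEvent>();
        lock (_gate)
        {
            foreach (var alarm in _active.Values)
            {
                channel.Writer.TryWrite(alarm with { IsSnapshot = true });
            }

            _subscribers.Add(channel);
        }

        return channel.Reader;
    }
}
```

Registering the subscriber and replaying the snapshot under one lock is the design point here: a transition published during subscription is delivered either inside the snapshot or after it, never lost between the two.
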
Dashboard routes use the same API-key verifier as gRPC. `/dashboard/login`
accepts the API key in a form body, validates the configured `admin` scope,
and issues an HTTP-only secure cookie for subsequent dashboard requests.
@@ -283,6 +301,44 @@ Core commands:
- `AuthenticateUser`
- `ArchestrAUserToId`
Bulk variants (single gRPC round-trip carries the full list, the worker
runs the per-item MXAccess calls sequentially on its STA, and the reply
returns one result per requested entry — per-entry failures populate
`was_successful = false` + `error_message` and never throw):
- `AddItemBulk`: `repeated string tag_addresses` → `BulkSubscribeReply`.
- `AdviseItemBulk`: `repeated int32 item_handles` → `BulkSubscribeReply`.
- `RemoveItemBulk`: `repeated int32 item_handles` → `BulkSubscribeReply`.
- `UnAdviseItemBulk`: `repeated int32 item_handles` → `BulkSubscribeReply`.
- `SubscribeBulk`: `repeated string tag_addresses` (AddItem + Advise per
  entry, with cleanup on Advise failure) → `BulkSubscribeReply`.
- `UnsubscribeBulk`: `repeated int32 item_handles` (UnAdvise + RemoveItem
  per entry, with independent error tracking) → `BulkSubscribeReply`.
- `WriteBulk`: `repeated WriteBulkEntry` (each `{item_handle, value, user_id}`)
  → `BulkWriteReply` (`repeated BulkWriteResult`). Required scope: `invoke:write`.
- `Write2Bulk`: `repeated Write2BulkEntry` (each adds `timestamp_value`) →
  `BulkWriteReply`. Required scope: `invoke:write`.
- `WriteSecuredBulk`: `repeated WriteSecuredBulkEntry` (each
  `{item_handle, current_user_id, verifier_user_id, value}`) → `BulkWriteReply`.
  Required scope: `invoke:secure`. Same redaction rules as single-item
  `WriteSecured`: per-entry `value` must never reach logs unless an explicit
  redacted value-logging path is enabled.
- `WriteSecured2Bulk`: `repeated WriteSecured2BulkEntry` (each adds
  `timestamp_value`) → `BulkWriteReply`. Required scope: `invoke:secure`.
- `ReadBulk`: `repeated string tag_addresses` + `uint32 timeout_ms`
  → `BulkReadReply` (`repeated BulkReadResult`). MXAccess COM has no
  synchronous `Read`; the worker satisfies this command by returning the
  last cached `OnDataChange` payload when the requested tag is already
  advised (`was_cached = true`, no subscription side-effects), or by
  taking a full `AddItem` + `Advise` + wait-for-first-OnDataChange +
  `UnAdvise` + `RemoveItem` snapshot lifecycle when no live subscription
  exists (`was_cached = false`). Per-tag timeouts surface as
  `was_successful = false` rather than throwing. The cache lives on the
  worker's `MxAccessValueCache`, populated by `MxAccessBaseEventSink` on
  every `OnDataChange` after the event clears the outbound queue.
  Required scope: `invoke:read`. `timeout_ms == 0` uses the worker's
  default (1000 ms).
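
The `ReadBulk` cache-or-snapshot decision described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the worker's code: `ReadBulkSketch` and `BulkReadResultSketch` are hypothetical, the cache is assumed to be keyed by tag address, "already advised" is simplified to "a cached value exists", and the `AddItem` + `Advise` + wait + `UnAdvise` + `RemoveItem` lifecycle is replaced by an injected delegate.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical result shape mirroring the field names used in the text.
public sealed record BulkReadResultSketch(
    string TagAddress, bool WasSuccessful, bool WasCached, object? Value, string? ErrorMessage);

public sealed class ReadBulkSketch
{
    private const uint DefaultTimeoutMs = 1000; // worker default stated in the text

    // Assumption: last OnDataChange payload per tag address.
    private readonly ConcurrentDictionary<string, object?> _lastValues = new();

    // Stand-in for the full snapshot lifecycle; expected to throw
    // (for example TimeoutException) when no value arrives in time.
    private readonly Func<string, TimeSpan, object?> _snapshotRead;

    public ReadBulkSketch(Func<string, TimeSpan, object?> snapshotRead)
    {
        _snapshotRead = snapshotRead;
    }

    // Called from the (assumed) event sink on every data change.
    public void OnDataChange(string tagAddress, object? value)
    {
        _lastValues[tagAddress] = value;
    }

    public IReadOnlyList<BulkReadResultSketch> ReadBulk(IEnumerable<string> tagAddresses, uint timeoutMs)
    {
        // timeout_ms == 0 falls back to the default.
        var timeout = TimeSpan.FromMilliseconds(timeoutMs == 0 ? DefaultTimeoutMs : timeoutMs);
        var results = new List<BulkReadResultSketch>();

        foreach (var tag in tagAddresses)
        {
            // Cached path: no subscription side effects.
            if (_lastValues.TryGetValue(tag, out var cached))
            {
                results.Add(new BulkReadResultSketch(tag, true, true, cached, null));
                continue;
            }

            // Snapshot path: a per-tag failure becomes a failed entry, not an exception.
            try
            {
                var value = _snapshotRead(tag, timeout);
                results.Add(new BulkReadResultSketch(tag, true, false, value, null));
            }
            catch (Exception ex)
            {
                results.Add(new BulkReadResultSketch(tag, false, false, null, ex.Message));
            }
        }

        return results;
    }
}
```

Because every entry is handled inside its own `try`, one slow or invalid tag cannot fail the whole reply, which matches the "one result per requested entry" contract of the bulk commands.
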
Optional diagnostics:
- `Ping`
@@ -579,8 +635,11 @@ Policy:
- command exceptions return structured command fault with HRESULT if known,
- stale sessions are closed by lease timeout,
- stuck workers are killed by process id,
- gateway restart should not attempt to reattach old workers unless explicitly
designed; first version should terminate orphaned workers on startup.
- gateway restart does not reattach old workers; `OrphanWorkerCleanupHostedService`
runs `OrphanWorkerTerminator` once on startup — before the server accepts
sessions — to kill leftover `ZB.MOM.WW.MxGateway.Worker.exe` processes (matched by the
configured worker executable path, or by image name when the x64 gateway cannot
introspect the x86 worker's module) left behind by a previous unclean run.
Because each client owns one worker, a crash or leak affects only that session.
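
The orphan-worker cleanup policy in the list above could be approximated as in the following sketch. It assumes only the behaviour stated there (match by configured executable path, fall back to image name when the module cannot be read); `OrphanWorkerSketch`, `IsOrphanCandidate`, and `TerminateOrphans` are hypothetical names and do not describe the actual `OrphanWorkerTerminator`.

```csharp
#nullable enable
using System;
using System.ComponentModel;
using System.Diagnostics;
using System.IO;

public static class OrphanWorkerSketch
{
    // Pure matching rule. modulePath is null when the process module
    // could not be inspected, in which case only the image name is compared.
    public static bool IsOrphanCandidate(string? modulePath, string processName, string configuredWorkerPath)
    {
        if (modulePath is not null)
        {
            return string.Equals(
                Path.GetFullPath(modulePath),
                Path.GetFullPath(configuredWorkerPath),
                StringComparison.OrdinalIgnoreCase);
        }

        return string.Equals(
            processName,
            Path.GetFileNameWithoutExtension(configuredWorkerPath),
            StringComparison.OrdinalIgnoreCase);
    }

    // Intended to run once at startup, before any session is accepted.
    // Returns the number of processes that were killed.
    public static int TerminateOrphans(string configuredWorkerPath)
    {
        var imageName = Path.GetFileNameWithoutExtension(configuredWorkerPath);
        var killed = 0;

        foreach (var process in Process.GetProcessesByName(imageName))
        {
            using (process)
            {
                string? modulePath = null;
                try
                {
                    modulePath = process.MainModule?.FileName;
                }
                catch (Win32Exception)
                {
                    // Module not readable (for example across bitness); use the name fallback.
                }
                catch (InvalidOperationException)
                {
                    // Process exited between enumeration and inspection.
                }

                if (!IsOrphanCandidate(modulePath, process.ProcessName, configuredWorkerPath))
                {
                    continue;
                }

                try
                {
                    process.Kill();
                    killed++;
                }
                catch (Win32Exception)
                {
                    // Access denied or already terminating; skip this process.
                }
                catch (InvalidOperationException)
                {
                    // Already exited.
                }
            }
        }

        return killed;
    }
}
```

Keeping the matching rule in a separate pure method makes it testable without starting or killing real processes.
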
@@ -667,36 +726,36 @@ Optimizations after parity:
Suggested additions:
```text
src/MxGateway.Contracts/
src/ZB.MOM.WW.MxGateway.Contracts/
Protos/
mxaccess_gateway.proto
mxaccess_worker.proto
Generated/
src/MxGateway.Server/
src/ZB.MOM.WW.MxGateway.Server/
Program.cs
Sessions/
Workers/
Grpc/
Metrics/
src/MxGateway.Worker/
src/ZB.MOM.WW.MxGateway.Worker/
Program.cs
Ipc/
Sta/
MxAccess/
Conversion/
src/MxGateway.Tests/
src/ZB.MOM.WW.MxGateway.Tests/
contract tests
gateway session tests
fake worker tests
src/MxGateway.Worker.Tests/
src/ZB.MOM.WW.MxGateway.Worker.Tests/
value/status conversion tests
STA queue tests
src/MxGateway.IntegrationTests/
src/ZB.MOM.WW.MxGateway.IntegrationTests/
optional live MXAccess tests
```