docs(audit): apply per-cluster judgment fixes across living docs

Resolve audit findings: correct WorkerEnvelope proto/route/metric/session
facts; rewrite auth (ZB.MOM.WW.Auth migration), dashboard (ZB.MOM.WW.Theme),
and StyleGuide (foreign-project copy-paste); document alarm subsystem, Ldap
options, and gateway alarm broker; fix client CLI flags and package paths.
Joseph Doherty
2026-06-03 16:01:28 -04:00
parent f84e0c3474
commit e541339c07
29 changed files with 1102 additions and 432 deletions
+45 -8
@@ -81,11 +81,16 @@ computed against the *filtered* descendant set, a branch that contains no
matching objects gets `false`, not `true`.
**Paging.** Default page size is 500; the server caps any requested size at
-5000. Page tokens encode `(cache_sequence, parent_id, filter_signature,
-offset)`. A token from a different cache generation or a different filter set
-returns `InvalidArgument`. The error messages reference "DiscoverHierarchy
-page_token" because `BrowseChildren` reuses the same encoding and validation
-path — if you see that wording in a `BrowseChildren` context it is expected.
+5000. Page tokens are the colon-delimited triple `sequence:filterSignature:offset`
+— the same encoding `DiscoverHierarchy` uses. The parent selector is not a
+separate token field: it is folded into `filterSignature` along with the rest of
+the filter set (the projector's `ComputeFilterSignature` takes the parent id),
+so a page token implicitly pins the parent. A token from a different cache
+generation (`sequence` mismatch) or a different filter set (`filterSignature`
+mismatch) returns `InvalidArgument`. The error messages reference
+"DiscoverHierarchy page_token" because `BrowseChildren` reuses the same encoding
+and validation path — if you see that wording in a `BrowseChildren` context it is
+expected.
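The token check described above can be sketched roughly as follows. This is an illustrative Python sketch, not the gateway's C# implementation: `PageToken`, `parse_page_token`, `validate_page_token`, the `InvalidArgument` exception class, and the exact error wording are all hypothetical names chosen for the example. It also assumes the filter signature never contains a `:` itself, which the text does not state.

```python
from dataclasses import dataclass


class InvalidArgument(ValueError):
    """Stand-in for the gRPC InvalidArgument status (hypothetical name)."""


@dataclass(frozen=True)
class PageToken:
    sequence: int
    filter_signature: str
    offset: int


def parse_page_token(token: str) -> PageToken:
    # Assumption: the signature contains no ':' so a plain split yields three parts.
    parts = token.split(":")
    if len(parts) != 3:
        raise InvalidArgument("DiscoverHierarchy page_token is malformed")
    sequence, signature, offset = parts
    try:
        return PageToken(int(sequence), signature, int(offset))
    except ValueError:
        raise InvalidArgument("DiscoverHierarchy page_token is malformed") from None


def validate_page_token(token: str, cache_sequence: int, filter_signature: str) -> int:
    """Return the offset to resume from, or raise InvalidArgument."""
    parsed = parse_page_token(token)
    if parsed.offset < 0:
        raise InvalidArgument("DiscoverHierarchy page_token has a negative offset")
    if parsed.sequence != cache_sequence:
        # Token was minted against a different cache generation.
        raise InvalidArgument("DiscoverHierarchy page_token is from a different cache generation")
    if parsed.filter_signature != filter_signature:
        # The signature folds in the parent id, so a different parent also lands here.
        raise InvalidArgument("DiscoverHierarchy page_token does not match the filter set")
    return parsed.offset
```

Because the parent id is part of the signature in this sketch, a token issued for one parent fails the signature comparison when replayed against another, which is the "implicitly pins the parent" behavior.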
**Errors.**
@@ -133,6 +138,15 @@ When SQL is unreachable, the cache retains the previous data and flips
`Status` to `Stale` (or `Unavailable` if no data was ever loaded). A
`SqlException` never bubbles out as the client-facing error.
+The cache also auto-degrades a `Healthy` entry to `Stale` purely on age: when the
+last successful refresh is older than five minutes, the projected status is
+reported as `Stale` even though the data hasn't otherwise changed. This guards
+against a silently wedged refresh loop — if ticks stop succeeding, browse
+results visibly go `Stale` rather than continuing to look fresh. (`Unknown` and
+`Unavailable` entries are returned as-is and not aged.) The first refresh runs at
+service startup, before the interval loop begins, so the cache is populated as
+soon as practical rather than waiting one full interval.
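The age-based degradation can be sketched as a pure function over the stored status and the time of the last successful refresh. This is a hedged Python illustration only; `projected_status` and `STALE_AFTER` are hypothetical names, and treating a `Healthy` entry with no recorded refresh time as `Stale` is an assumption of the sketch rather than documented behavior.

```python
from datetime import datetime, timedelta
from typing import Optional

# Five-minute threshold taken from the text above.
STALE_AFTER = timedelta(minutes=5)


def projected_status(status: str, last_success: Optional[datetime], now: datetime) -> str:
    """Return the status a client would see, aging only Healthy entries."""
    if status != "Healthy":
        # Unknown, Unavailable, and already-Stale entries pass through unchanged.
        return status
    if last_success is None:
        # Assumption: a Healthy entry without a refresh timestamp is not trusted.
        return "Stale"
    if now - last_success > STALE_AFTER:
        # Refresh loop has not succeeded recently; report the data as Stale.
        return "Stale"
    return "Healthy"
```

Computing the projected status at read time means a wedged refresh loop needs no extra timer to surface: the next browse call simply observes the older timestamp.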
### First-load behavior
If a client calls `DiscoverHierarchy` before the background service has
@@ -156,7 +170,10 @@ working across that gap, the cache persists its dataset to disk:
- On the **first** refresh after startup, before any SQL runs, the cache
reloads that file. The restored data is served with `Stale` status —
it is last-known data, not live — so clients can browse immediately even
-when the Galaxy database is unreachable.
+when the Galaxy database is unreachable. The restore also publishes a deploy
+event through `IGalaxyDeployNotifier`, so a `WatchDeployEvents` subscriber that
+attaches before the first live query still sees the restored snapshot's deploy
+state.
- The first live query then reconciles: if it observes the **same**
`time_of_last_deploy` the snapshot was saved at, the entry is promoted to
`Healthy` with no heavy re-query (the snapshot is provably current); if it
@@ -349,6 +366,25 @@ Component breakdown:
override per object. `HierarchySql` still matches the OtOpcUa original;
`AttributesSql` does not — it additionally enumerates built-in primitive
attributes (see [Built-in vs configured attributes](#built-in-vs-configured-attributes)).
+`HierarchySql` restricts the result to a fixed allow-list of object categories
+via `WHERE td.category_id IN (1, 3, 4, 10, 11, 13, 17, 24, 26)` — the same set
+the dashboard's `ResolveCategoryName` map names. Categories outside this set
+(for example, internal framework objects) are never browsed. The mapping:
+| `category_id` | Name |
+|---|---|
+| 1 | WinPlatform |
+| 3 | AppEngine |
+| 4 | InTouchViewApp |
+| 10 | UserDefined |
+| 11 | FieldReference |
+| 13 | Area |
+| 17 | DIObject |
+| 24 | DDESuiteLinkClient |
+| 26 | OPCClient |
+Any other category id renders as `Category {id}` in the dashboard.
- `GalaxyHierarchyCache`
(`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs`) holds the most
recent immutable `GalaxyHierarchyCacheEntry` (materialized objects +
@@ -384,7 +420,7 @@ Bound to `MxGateway:Galaxy` via `GalaxyRepositoryOptions`.
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Galaxy:ConnectionString` | `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` | SQL Server connection string for the Galaxy Repository. Integrated Security against `localhost` is the dev default; production deployments should override this through the standard double-underscore environment variable form, e.g. `MxGateway__Galaxy__ConnectionString`. |
-| `MxGateway:Galaxy:CommandTimeoutSeconds` | `60` | Per-command SQL timeout. Applies to all three RPCs. |
+| `MxGateway:Galaxy:CommandTimeoutSeconds` | `60` | Per-command SQL timeout applied to every SQL command the repository runs (the connectivity probe, the deploy-time poll, and the hierarchy and attribute queries), which back all five Galaxy RPCs. |
| `MxGateway:Galaxy:PersistSnapshot` | `true` | Persists each successful browse dataset to disk and reloads it at startup. See [On-disk snapshot](#on-disk-snapshot). |
| `MxGateway:Galaxy:SnapshotCachePath` | `C:\ProgramData\MxGateway\galaxy-snapshot.json` | File path for the persisted browse snapshot. Ignored when `PersistSnapshot` is `false`. |
@@ -400,7 +436,8 @@ unparsed connection string text.
## Authorization
-All four Galaxy RPCs (including `WatchDeployEvents`) require the
+All five Galaxy RPCs (`TestConnection`, `GetLastDeployTime`,
+`DiscoverHierarchy`, `WatchDeployEvents`, and `BrowseChildren`) require the
`metadata:read` API-key scope. Browse is read-only metadata, equivalent in
privilege to `MxCommandKind.GetSessionState` or `MxCommandKind.GetWorkerInfo`.
The mapping lives in `GatewayGrpcScopeResolver`; see