docs: post-PR-7.2 cleanup — audit + three-track scrub

Audit (three parallel agent passes) found 43 markdown files carrying
stale references to the deleted Galaxy.Host/Proxy/Shared projects
after the v2-mxgw merge. This commit lands the prioritized fixes.

Track 1 — high-traffic in-place rewrites (3 files, ~454 lines deleted)
- README.md (202 → 91 lines): drops .NET 4.8 / x86 / TopShelf install
  text; leads with the multi-driver .NET 10 server identity and points
  at scripts/install/Install-Services.ps1 and the parity rig.
- docs/v2/driver-specs.md §1 Galaxy (~289 → ~66 lines): replaces the
  Tier-C out-of-process spec with a Tier-A in-process description
  matching the current GalaxyDriver code, with the four-section
  GalaxyDriverOptions JSON shape pulled verbatim from
  Config/GalaxyDriverOptions.cs.
- docs/drivers/Galaxy.md (211 → 92 lines): full rewrite around the
  current Browse/Runtime/Health/Config sub-folders.

Track 2 — historical banners (5 files)
- lmx_mxgw.md, lmx_mxgw_impl.md, lmx_backend.md,
  docs/v2/Galaxy.ParityMatrix.md,
  docs/v2/implementation/phase-2-galaxy-out-of-process.md each get a
  " Completed 2026-04-30 — historical record" banner block. lmx_mxgw.md
  also fixes two dead links (`docs/Galaxy.Driver.md` and
  `docs/v2/Galaxy.Driver.md`) → `docs/drivers/Galaxy.md`.

Track 3 — v1 archive sweep (10 git mv + 1 new index + 2 in-place scrubs)
- Moved 10 v1 docs under docs/v1/ preserving subpath structure:
  AlarmTracking, Configuration, DataTypeMapping, HistoricalDataAccess,
  Subscriptions (top-level); drivers/Galaxy-Repository,
  drivers/Galaxy-Test-Fixture; reqs/GalaxyRepositoryReqs,
  reqs/MxAccessClientReqs, reqs/ServiceHostReqs.
- New docs/v1/README.md is the shared archive banner + per-file table.
- docs/README.md repointed to the v1 paths and updated to reflect the
  v2 two-process deploy shape (Server + Admin + optional
  OtOpcUaWonderwareHistorian).
- docs/v2/Galaxy.ParityRig.md got a historical banner + four inline
  scrubs marking the OtOpcUaGalaxyHost service / Driver.Galaxy.Host
  EXE / Driver.Galaxy.ParityTests project as deleted-in-PR-7.2.

The repo's live-reading surface (README + CLAUDE.md + docs/v2/) now
describes only the post-PR-7.2 architecture. v1 docs are preserved as
a labelled archive under docs/v1/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-04-30 08:59:59 -04:00
parent ae7106dfce
commit 006af51768
21 changed files with 322 additions and 671 deletions

View File

@@ -10,289 +10,65 @@
### Summary
Out-of-process **Tier C** driver bridging AVEVA System Platform (Wonderware) Galaxies. The existing v1 implementation is refactored behind the new driver capability interfaces and hosted in a separate Windows service (.NET 4.8 x86) that communicates with the main OtOpcUa server (.NET 10 x64) via named pipes + MessagePack. Hosted out-of-process for **two reasons**: COM/.NET 4.8 x86 bitness constraint **and** Tier C stability isolation (per `driver-stability.md`). FOCAS is the second Tier C driver, also out-of-process — see §7.
### Library & Dependencies
| Component | Package / Source | Version | Target | Notes |
|-----------|------------------|---------|--------|-------|
| **MXAccess COM** | `ArchestrA.MxAccess` (GAC / `lib/ArchestrA.MxAccess.dll`) | version-neutral late-bound | .NET 4.8 x86 | Pinned via `<Reference Include="ArchestrA.MxAccess">` with `EmbedInteropTypes=false`; interfaces: `LMXProxyServer`, `ILMXProxyServerEvents`, `MXSTATUS_PROXY` |
| **Galaxy DB client** | `System.Data.SqlClient` (BCL) | BCL | .NET 4.8 x86 | Direct SQL for hierarchy/attribute/change-detection queries |
| **Wonderware Historian SDK** | `aahClientManaged`, `aahClientCommon` | Historian-shipped | .NET 4.8 x86 | Optional — loaded only when `Historian.Enabled=true` |
| **MessagePack-CSharp** | `MessagePack` NuGet | 2.x | .NET Standard 2.0 (Shared) | IPC serialization; shared contract between Proxy and Host |
| **Named pipes** | `System.IO.Pipes` (BCL) | BCL | both sides | IPC transport, localhost only |
### Required Components
- **AVEVA System Platform / ArchestrA Platform** deployed on the same machine as `Galaxy.Host` (installs MXAccess COM objects into the GAC)
- A **deployed Galaxy** with at least one $WinPlatform object hosting $AppEngine(s) hosting AutomationObjects
- **SQL Server** reachable from `Galaxy.Host` with the Galaxy repository database (default `ZB`); Windows Auth by default
- **32-bit .NET Framework 4.8** runtime on the Host machine (MXAccess is 32-bit COM, no 64-bit variant)
- **STA thread + Win32 message pump** inside the Host process for all COM calls and event callbacks (see §13)
- **Wonderware Historian** installed on-box or reachable via aah SDK — *only* if HDA is enabled
- **No external firewall ports** — MXAccess is local-machine COM/IPC; pipe is localhost-only. Galaxy DB port (default SQL 1433) if the ZB database is remote.
### Connection Settings (per driver instance, from central config DB)
All settings live under a schemaless `DriverConfig` JSON blob on the `DriverInstance` row. Current v1 equivalents (defaults and source file references in parentheses):
**MXAccess** (`MxAccessConfiguration.cs`):
| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `ClientName` | string | `"LmxOpcUa"` | Registration name passed to `LMXProxyServer.Register()` |
| `NodeName` | string? | `null` | Optional ArchestrA node override (null = local) |
| `GalaxyName` | string? | `null` | Optional Galaxy name override |
| `ReadTimeoutSeconds` | int | `5` | Per-read timeout |
| `WriteTimeoutSeconds` | int | `5` | Per-write timeout |
| `RequestTimeoutSeconds` | int | `30` | Outer safety timeout around any MXAccess request |
| `MaxConcurrentOperations` | int | `10` | Pool bound on in-flight MXAccess work items |
| `MonitorIntervalSeconds` | int | `5` | Connectivity heartbeat probe interval |
| `AutoReconnect` | bool | `true` | Replay stored subscriptions on COM reconnect |
| `ProbeTag` | string? | `null` | Optional heartbeat tag for health monitoring |
| `ProbeStaleThresholdSeconds` | int | `60` | Mark connection stale if no probe callback within |
| `RuntimeStatusProbesEnabled` | bool | `true` | Auto-subscribe `ScanState` for $WinPlatform / $AppEngine |
| `RuntimeStatusUnknownTimeoutSeconds` | int | `15` | Grace period before an un-probed host is assumed Stopped |
**Galaxy repository** (`GalaxyRepositoryConfiguration.cs`):
| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `ConnectionString` | string | `Server=localhost;Database=ZB;Integrated Security=true;` | ZB SQL Server connection |
| `ChangeDetectionIntervalSeconds` | int | `30` | Poll interval for `galaxy.time_of_last_deploy` |
| `CommandTimeoutSeconds` | int | `30` | SQL command timeout |
| `ExtendedAttributes` | bool | `false` | Include extended attribute metadata in discovery |
| `Scope` | enum (`Galaxy` \| `LocalPlatform`) | `Galaxy` | Address-space scope filter (commit bc282b6) |
| `PlatformName` | string? | `Environment.MachineName` | Platform to scope to when `Scope=LocalPlatform` |
**IPC** (new for v2):
| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| `PipeName` | string | `otopcua-galaxy-{InstanceId}` | Named pipe name |
| `HostStartupTimeoutMs` | int | `30000` | Proxy wait for Host `Ready` handshake |
| `IpcCallTimeoutMs` | int | `15000` | Per-call RPC timeout |
### Addressing
Galaxy objects carry two names:
- **`contained_name`** — human-readable, scoped to parent; used for OPC UA browse tree
- **`tag_name`** — globally unique system identifier; used for MXAccess runtime references
| Layer | Example |
|-------|---------|
| OPC UA browse path | `TestMachine_001/DelmiaReceiver/DownloadPath` |
| OPC UA NodeId | `ns=<galaxyNs>;s=<tagName>.<AttributeName>` |
| MXAccess reference | `DelmiaReceiver_001.DownloadPath` (passed to `AddItem()`) |
Tag discovery is **dynamic** — driven by the Galaxy repository DB (`gobject`, `dynamic_attribute`, `primitive_instance`, `template_definition`). Optional `Scope=LocalPlatform` filters the hierarchy via the `hosted_by_gobject_id` chain to the subtree rooted at the local $WinPlatform (on a dev Galaxy: 49→3 objects, 4206→386 attributes).
### Data Type Mapping (`MxDataTypeMapper.cs`, `gr/data_type_mapping.md`)
| mx_data_type | Galaxy Type | OPC UA BuiltInType | CLR Type |
|--------------|-------------|--------------------|----------|
| 1 | Boolean | Boolean (i=1) | `bool` |
| 2 | Integer | Int32 (i=6) | `int` |
| 3 | Float | Float (i=10) | `float` |
| 4 | Double | Double (i=11) | `double` |
| 5 | String | String (i=12) | `string` |
| 6 | Time | DateTime (i=13) | `DateTime` |
| 7 | ElapsedTime | Double (i=11) | `double` (seconds) |
| 8 | Reference | String (i=12) | `string` |
| 13 | Enumeration | Int32 (i=6) | `int` |
| 14 / 16 | Custom | String (i=12) | `string` |
| 15 | InternationalizedString | LocalizedText (i=21) | `string` |
| (default) | Unknown | String (i=12) | `string` |
**Arrays**: `is_array=0` → ValueRank `-1` (Scalar); `is_array=1` → ValueRank `1` (OneDimension), ArrayDimensions = `[array_dimension]`.
### Security Classification Mapping (`SecurityClassificationMapper.cs`)
| security_classification | Galaxy Level | OPC UA Write Permission |
|-------------------------|--------------|-------------------------|
| 0 | FreeAccess | `WriteOperate` |
| 1 | Operate | `WriteOperate` |
| 2 | SecuredWrite | — (read-only in v1) |
| 3 | VerifiedWrite | — (read-only in v1) |
| 4 | Tune | `WriteTune` |
| 5 | Configure | `WriteConfigure` |
| 6 | ViewOnly | — (read-only) |
Maps to the OPC UA roles `ReadOnly` / `WriteOperate` / `WriteTune` / `WriteConfigure` defined in the LDAP role provider (see `docs/security.md`).
### Subscription Model — Native MXAccess Advisories
**Galaxy is one of three drivers with native subscriptions (Galaxy, TwinCAT, OPC UA Client).** No polling.
- Mechanism: `LMXProxyServer.AddItem()``AdviseSupervisory(handle, itemHandle)`; callbacks delivered through the `ILMXProxyServerEvents.OnDataChange` COM event
- Callback signature: `MxDataChangeHandler(itemHandle, MXSTATUS_PROXY, value, quality, timestamp)`
- Dispatch: STA COM event → dispatch-thread queue → OPC UA `ClearChangeMasks` fan-out (decouples COM thread from UA stack lock — commit c76ab8f)
- **Stored subscriptions** replayed on reconnect via `ReplayStoredSubscriptionsAsync()`
- **Probe tag** + runtime-status probes provide connection-health visibility (see §14)
- **Bad-quality fan-out**: when a host ($WinPlatform or $AppEngine) ScanState transitions to Stopped, every attribute under that host is immediately published as `BadOutOfService` (commits 7310925, c76ab8f)
### Alarm Model
In-process alarm-condition tracking (v1 baseline; extended in v2 to match `IAlarmSource`):
- **Auto-subscribed attributes per alarm-eligible object**: `InAlarm`, `Priority`, `Description` (cached for severity and message)
- **Filtering**: `AlarmFilterConfiguration.ObjectFilters[]` — include/exclude by template chain (empty = all eligible)
- **Transitions**: `InAlarm` change → OPC UA A&C `AlarmConditionState` event (Active / Return to Normal)
- **Severity**: Galaxy `Priority` (1 = highest) mapped to OPC UA 11000 severity (higher = more severe)
- **Acknowledgment**: local OPC UA ack forwards to MXAccess write on the `Ack` attribute of the alarm-bearing object
### History Model — Wonderware Historian (optional plugin)
- Loaded **at runtime** from `ZB.MOM.WW.LmxOpcUa.Historian.Aveva.dll` when `Historian.Enabled=true`; compile-time optional
- SDK: `aahClientManaged` / `aahClientCommon`
- Supported OPC UA HDA calls:
- `HistoryReadRawModified` (raw values with bounds)
- `HistoryReadProcessed` (Historian aggregates: AVG, MIN, MAX, TIMEAVG, etc. — mapped to OPC UA aggregates)
- Continuation points for paged reads
- Only attributes flagged `historize=1` in the Galaxy DB expose `AccessLevel.HistoryRead`
### Error Mapping — MXAccess → Quality → OPC UA StatusCode
**Byte quality (OPC DA convention)**`QualityMapper.cs`:
| OPC DA Quality | Category |
|----------------|----------|
| `>= 192` | Good |
| `64191` | Uncertain |
| `< 64` | Bad |
**MXAccess error codes → Quality** (`MxErrorCodes.cs`):
| Code | Name | Quality |
|------|------|---------|
| 1008 | `MX_E_InvalidReference` | `BadConfigError` |
| 1012 | `MX_E_WrongDataType` | `BadConfigError` |
| 1013 | `MX_E_NotWritable` | `BadOutOfService` |
| 1014 | `MX_E_RequestTimedOut` | `BadCommFailure` |
| 1015 | `MX_E_CommFailure` | `BadCommFailure` |
| 1016 | `MX_E_NotConnected` | `BadNotConnected` |
**Quality → OPC UA StatusCode** (`QualityMapper.cs`):
| Quality | StatusCode |
|---------|-----------|
| Good | `0x00000000` |
| GoodLocalOverride | `0x00D80000` |
| Uncertain | `0x40000000` |
| Bad (generic) | `0x80000000` |
| BadCommFailure | `0x80050000` |
| BadNotConnected | `0x808A0000` |
| BadOutOfService | `0x808D0000` |
### Change Detection
- `ChangeDetectionService` polls `galaxy.time_of_last_deploy` at `ChangeDetectionIntervalSeconds` (default 30s)
- On timestamp change, `OnGalaxyChanged` fires → Host re-queries hierarchy/attributes → emits `TagSetChanged` over IPC → Proxy implements `IRediscoverable` and rebuilds the affected subtree in the address space
- Platform-scope filter (commit bc282b6) applied during hierarchy load when `Scope=LocalPlatform`
### IPC Contract (Proxy ↔ Host) — `Galaxy.Shared`
.NET Standard 2.0 MessagePack contracts. Every request carries a correlation ID; responses carry the same ID plus success/error.
**Lifecycle / handshake**:
| Message | Direction | Payload |
|---------|-----------|---------|
| `ClientHello` | Proxy → Host | InstanceId, expected protocol version |
| `HostReady` | Host → Proxy | Host version, Galaxy name, capabilities |
| `Shutdown` | Proxy → Host | Graceful stop |
**Tag discovery** (`ITagDiscovery`):
| Message | Direction | Payload |
|---------|-----------|---------|
| `DiscoverHierarchyRequest` | Proxy → Host | `Scope`, `PlatformName` |
| `DiscoverHierarchyResponse` | Host → Proxy | `GalaxyObjectInfo[]` (TagName, ContainedName, ParentTagName, TemplateChain, category) |
| `DiscoverAttributesRequest` | Proxy → Host | `TagName[]` |
| `DiscoverAttributesResponse` | Host → Proxy | `GalaxyAttributeInfo[]` (Name, MxDataType, IsArray, ArrayDim, SecurityClass, Historized, WriteableRuntimeChecked) |
| `TagSetChangedNotification` | Host → Proxy | New deploy timestamp; triggers re-discover |
**Read / Write** (`IReadable`, `IWritable`):
| Message | Direction | Payload |
|---------|-----------|---------|
| `ReadRequest` | Proxy → Host | `TagRef[]` (tag_name + attribute) |
| `ReadResponse` | Host → Proxy | `VtqPayload[]` (value, quality, timestamp, statusCode) |
| `WriteRequest` | Proxy → Host | `(TagRef, Value, ExpectedDataType)[]` |
| `WriteResponse` | Host → Proxy | `(TagRef, StatusCode)[]` |
**Subscription** (`ISubscribable`):
| Message | Direction | Payload |
|---------|-----------|---------|
| `SubscribeRequest` | Proxy → Host | `TagRef[]` + Proxy-generated subscription ID |
| `SubscribeResponse` | Host → Proxy | Per-tag subscribe ack + handle |
| `UnsubscribeRequest` | Proxy → Host | handles |
| `DataChangeNotification` | Host → Proxy (push) | handle, VTQ, sequence number |
| `ProbeHealthNotification` | Host → Proxy (push) | probe tag staleness, `ScanState` transitions, overall connected/disconnected |
**Alarms** (`IAlarmSource`):
| Message | Direction | Payload |
|---------|-----------|---------|
| `AlarmEventNotification` | Host → Proxy (push) | source tag, InAlarm, Priority, Description, severity, transition type |
| `AlarmAckRequest` | Proxy → Host | source tag, user, comment |
**History** (`IHistoryProvider`):
| Message | Direction | Payload |
|---------|-----------|---------|
| `HistoryReadRawRequest` | Proxy → Host | TagRef, start, end, numValues, returnBounds, continuationPoint |
| `HistoryReadRawResponse` | Host → Proxy | values + next continuation point |
| `HistoryReadProcessedRequest` | Proxy → Host | TagRef, aggregateId, start, end, resampleInterval |
| `HistoryReadProcessedResponse` | Host → Proxy | aggregated values |
**Framing**: length-prefixed MessagePack frames over a single `NamedPipeServerStream` in `PipeTransmissionMode.Byte`. Separate outgoing pipe for push notifications or multiplex via message type tag.
### Threading / COM Constraints
- **STA thread** (`StaComThread.cs`) hosts MXAccess: `ApartmentState.STA`, raw Win32 `GetMessage` / `DispatchMessage` loop
- Work items marshaled in via `PostThreadMessage(WM_APP=0x8000)`
- **Per-handle serialization**: LMXProxyServer is not thread-safe — all Read/Write/Subscribe calls on one handle run serially via the STA queue
- **Dispatch thread** (separate from STA thread) drains `_pendingDataChanges` to the OPC UA framework; decouples the STA pump from UA stack locks so a slow subscriber can't back up COM event delivery
- **Reentrancy guards** — event unwiring must precede `Marshal.ReleaseComObject()` on disconnect
### Runtime Status (recent commits bc282b6 / 4b209f6 / 7310925 / c76ab8f / 0003984)
- `GalaxyRuntimeProbeManager` auto-subscribes `<ObjectName>.ScanState` for every $WinPlatform (category 1) and $AppEngine (category 3) in scope
- Per-host state machine: `Unknown → Running | Stopped`; transitions fire `_onHostStopped` / `_onHostRunning` callbacks on the dispatch thread
- **Synthetic OPC UA nodes** expose `ScanState` per host as read-only variables so clients see runtime topology without the dashboard
- **HealthCheck Rule 2e** monitors probe subscription health; a failed probe can no longer leave phantom entries that fan out false `BadOutOfService`
- Generalizes to the driver-agnostic `IHostConnectivityProbe` capability interface in v2 (see `plan.md` §5a)
### Implementation Notes
- **First Tier C out-of-process driver** — uses the `Galaxy.Proxy` / `Galaxy.Host` / `Galaxy.Shared` three-project split. The pattern is reusable; FOCAS is the second adopter (see §7), and any future driver with bitness, licensing, or stability-isolation needs reuses the same template. See `driver-stability.md` for the generalized contract
- `Galaxy.Proxy` (in the main server) implements `IDriver`, `ITagDiscovery`, `IRediscoverable`, `IReadable`, `IWritable`, `ISubscribable`, `IAlarmSource`, `IHistoryProvider`, `IHostConnectivityProbe`
- `Galaxy.Host` owns `MxAccessBridge`, `GalaxyRepository`, alarm tracking, `GalaxyRuntimeProbeManager`, and the Historian plugin — no reference to `Core.Abstractions`
- `Galaxy.Shared` is .NET Standard 2.0, referenced by both sides
- Existing v1 code is the implementation — **refactor in place** (extract capability interfaces first, then move behind IPC — see `plan.md` Decision #55)
- **Parity gate**: v2 driver must pass v1 `IntegrationTests` suite + scripted Client.CLI walkthrough before Phase 3 begins
### Operational Stability Notes
Galaxy has a Tier C deep dive in `driver-stability.md` covering the STA pump, COM object lifetime, subscription replay, recycle policy, and post-mortem contents. Driver-instance specifics:
- **Memory baseline scales with Galaxy size**. Watchdog floor of 200 MB above baseline + 1.5 GB hard ceiling — higher than FOCAS because legitimate Galaxy footprints are larger.
- **Slope tolerance is 5 MB/min** (more permissive than FOCAS) because address-space rebuild on redeploy can transiently allocate large amounts.
- **Known regression-prone failure modes** (closed in commits `c76ab8f` and `7310925`, must remain closed): phantom probe subscription flipping Tick() to Stopped; cross-host quality clear wiping sibling state during recovery; sync-over-async on the OPC UA stack thread; fire-and-forget alarm tasks racing shutdown. Each should have a regression test in the v2 parity suite.
- **STA pump health probe** every 10 s (separate from the proxy↔host heartbeat). A wedged pump is the most likely Tier C failure mode for Galaxy.
- **Recycle preserves cached `time_of_last_deploy` watermark** — the common case (crash unrelated to redeploy) skips full DB rediscovery for faster recovery.
### Namespace Assignment
Galaxy is the canonical **SystemPlatform-kind namespace** driver. It exposes Aveva System Platform / Galaxy objects as OPC UA — these are *processed* values with business meaning attached at Layer 3, not raw equipment signals. Per `plan.md` §4:
- The Galaxy driver's `DriverInstance.NamespaceId` must reference a `Namespace` row with `Kind = 'SystemPlatform'`.
- **UNS naming rules do NOT apply** to the Galaxy hierarchy. Tags belong to `DriverInstanceId + FolderPath` (v1 LmxOpcUa pattern preserved); `Tag.EquipmentId` is NULL.
- The Galaxy hierarchy reflects the gobject parent chain as v1 has always done — no migration to UNS path conventions in v2.
- If a future need arises to expose raw Galaxy gobject data alongside processed (e.g. an Aveva-Wonderware Historian raw signal feed), that becomes a *separate* driver instance assigned to an Equipment-kind namespace, with its own per-equipment mapping.
Galaxy (MXAccess) is a **Tier-A in-process driver** that runs in the OtOpcUa server's .NET 10 AnyCPU process and speaks gRPC to a separately installed `mxaccessgw` (sibling repo at `c:\Users\dohertj2\Desktop\mxaccessgw\`). The gateway owns the MXAccess COM apartment, the STA pump, and the Galaxy Repository / Historian SDK on its own host; the driver itself is platform-agnostic and carries no COM or x86 bitness constraint. Project lives at `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/`.
### Capability Surface
`GalaxyDriver` (in `GalaxyDriver.cs`) implements `IDriver`, `IDisposable`, plus six driver capabilities — eight interfaces total.
| Capability | Source files |
|------------|--------------|
| `ITagDiscovery` | `Browse/GalaxyDiscoverer.cs`, `Browse/GatewayGalaxyHierarchySource.cs`, `Browse/DataTypeMap.cs`, `Browse/SecurityMap.cs`, `Browse/AlarmRefBuilder.cs` |
| `IRediscoverable` | `Browse/DeployWatcher.cs`, `Browse/GatewayGalaxyDeployWatchSource.cs` |
| `IReadable` | `Runtime/GalaxyMxSession.cs`, `Runtime/MxValueDecoder.cs`, `Runtime/StatusCodeMap.cs` |
| `IWritable` | `Runtime/GatewayGalaxyDataWriter.cs` (+ `TracedGalaxyDataWriter.cs`), `Runtime/MxValueEncoder.cs` |
| `ISubscribable` | `Runtime/GatewayGalaxySubscriber.cs` (+ `TracedGalaxySubscriber.cs`), `Runtime/EventPump.cs`, `Runtime/SubscriptionRegistry.cs`, `Runtime/ReconnectSupervisor.cs` |
| `IHostConnectivityProbe` | `Health/HostStatusAggregator.cs`, `Health/HostConnectivityForwarder.cs`, `Health/PerPlatformProbeWatcher.cs` |
History reads + alarm condition tracking now live in the server-layer `IHistoryRouter` and `AlarmConditionService` (PR 7.2). Galaxy no longer carries `IHistoryProvider` or `IAlarmSource` of its own.
### DriverConfig JSON shape
Per `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Config/GalaxyDriverOptions.cs`:
```jsonc
{
"Gateway": {
"Endpoint": "http://localhost:5120",
"ApiKeySecretRef": "secret:galaxy-gw-api-key",
"UseTls": true,
"CaCertificatePath": null,
"ConnectTimeoutSeconds": 10,
"DefaultCallTimeoutSeconds": 30,
"StreamTimeoutSeconds": 0
},
"MxAccess": {
"ClientName": "OtOpcUa",
"PublishingIntervalMs": 1000,
"WriteUserId": 0,
"EventPumpChannelCapacity": 50000
},
"Repository": {
"DiscoverPageSize": 5000,
"WatchDeployEvents": true
},
"Reconnect": {
"InitialBackoffMs": 500,
"MaxBackoffMs": 30000,
"ReplayOnSessionLost": true
}
}
```
`Gateway.ApiKeySecretRef` resolves through the server-side secret store (DPAPI in production, env override in dev) — the API key never appears in cleartext config. `MxAccess.ClientName` MUST be unique per OtOpcUa instance; redundancy pairs enforce uniqueness at install time. `StreamTimeoutSeconds = 0` keeps the `StreamEvents` RPC alive for the lifetime of the driver.
### Performance, tracing, soak
See [Galaxy.Performance.md](Galaxy.Performance.md) for the OpenTelemetry trace map, the per-RPC metric set (`galaxy.events.dropped`, channel headroom, reconnect backoff distribution), and the soak-run profile.
### Parity rig + gateway setup
See [Galaxy.ParityRig.md](Galaxy.ParityRig.md) and the `mxaccessgw` repo for the gateway worker layout and the dev-rig recipe.
---