Doc refresh (task #202) — core architecture docs for multi-driver OtOpcUa

Rewrite seven core-architecture docs to match the shipped multi-driver platform.
The v1 single-driver LmxNodeManager framing is replaced with the Core +
capability-interface model — Galaxy is now one driver of seven, and each doc
points at the current class names + source paths.

What changed per file:
- OpcUaServer.md — OtOpcUaServer as StandardServer host; per-driver
  DriverNodeManager + CapabilityInvoker wiring; Config-DB-driven configuration
  (sp_PublishGeneration, DraftRevisionToken, Admin UI); Phase 6.2
  AuthorizationGate integration.
- AddressSpace.md — GenericDriverNodeManager.BuildAddressSpaceAsync walks
  ITagDiscovery.DiscoverAsync and streams DriverAttributeInfo through
  IAddressSpaceBuilder; CapturingBuilder registers alarm-condition sinks;
  per-driver NodeId schemes replace the fixed ns=1;s=ZB root.
- ReadWriteOperations.md — OnReadValue / OnWriteValue dispatch to
  IReadable.ReadAsync / IWritable.WriteAsync through CapabilityInvoker,
  honoring WriteIdempotentAttribute (#143); two-layer authorization
  (WriteAuthzPolicy + Phase 6.2 AuthorizationGate).
- Subscriptions.md — ISubscribable.SubscribeAsync/UnsubscribeAsync is the
  capability surface; STA-thread story is now Galaxy-specific (StaPump inside
  Driver.Galaxy.Host), other drivers are free-threaded.
- AlarmTracking.md — IAlarmSource is optional; AlarmSurfaceInvoker wraps
  Subscribe/Ack/Unsubscribe with fan-out by IPerCallHostResolver and the
  no-retry AlarmAcknowledge pipeline (#143); CapturingBuilder registers sinks
  at build time.
- DataTypeMapping.md — DriverDataType + SecurityClassification are the
  driver-agnostic enums; per-driver mappers (GalaxyProxyDriver inline,
  AbCipDataType, ModbusDriver, etc.); SecurityClassification is metadata only,
  ACL enforcement is at the server layer.
- IncrementalSync.md — IRediscoverable covers backend-change signals;
  sp_ComputeGenerationDiff + DiffViewer drive generation-level change
  detection; IDriver.ReinitializeAsync is the in-process recovery path.
This commit is contained in:
Joseph Doherty
2026-04-20 01:27:25 -04:00
parent 48970af416
commit 985b7aba26
7 changed files with 290 additions and 699 deletions

View File

@@ -1,82 +1,72 @@
# Address Space
The address space maps the Galaxy object hierarchy and attribute definitions into an OPC UA browse tree. `LmxNodeManager` builds the tree from data queried by `GalaxyRepositoryService`, while `AddressSpaceBuilder` provides a testable in-memory model of the same structure.
Each driver's browsable subtree is built by streaming nodes from the driver's `ITagDiscovery.DiscoverAsync` implementation into an `IAddressSpaceBuilder`. `GenericDriverNodeManager` (`src/ZB.MOM.WW.OtOpcUa.Core/OpcUa/GenericDriverNodeManager.cs`) owns the shared orchestration; `DriverNodeManager` (`src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs`) implements `IAddressSpaceBuilder` against the OPC Foundation stack's `CustomNodeManager2`. The same code path serves Galaxy object hierarchies, Modbus PLC registers, AB CIP tags, TwinCAT symbols, FOCAS CNC parameters, and OPC UA Client aggregations — Galaxy is one driver of seven, not the driver.
## Root ZB Folder
## Driver root folder
Every address space starts with a single root folder node named `ZB` (NodeId `ns=1;s=ZB`). This folder is added under the standard OPC UA `Objects` folder via an `Organizes` reference. The reverse reference is registered through `MasterNodeManager.AddReferences` because `BuildAddressSpace` runs after `CreateAddressSpace` has already consumed the external references dictionary.
Every driver's subtree starts with a root `FolderState` under the standard OPC UA `Objects` folder, wired with an `Organizes` reference. `DriverNodeManager.CreateAddressSpace` creates this folder with `NodeId = ns;s={DriverInstanceId}`, `BrowseName = {DriverInstanceId}`, and `EventNotifier = SubscribeToEvents | HistoryRead` so alarm and history-event subscriptions can target the root. The namespace URI is `urn:OtOpcUa:{DriverInstanceId}`.
The root folder has `EventNotifier = SubscribeToEvents` enabled so alarm events propagate up to clients subscribed at the root level.
## IAddressSpaceBuilder surface
## Area Folders vs Object Nodes
`IAddressSpaceBuilder` (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IAddressSpaceBuilder.cs`) offers three calls:
Galaxy objects fall into two categories based on `template_definition.category_id`:
- `Folder(browseName, displayName)` — creates a child `FolderState` and returns a child builder scoped to it.
- `Variable(browseName, displayName, DriverAttributeInfo attributeInfo)` — creates a `BaseDataVariableState` and returns an `IVariableHandle` the driver keeps for alarm wiring.
- `AddProperty(browseName, DriverDataType, value)` — attaches a `PropertyState` for static metadata (e.g. equipment identification fields).
- **Areas** (`category_id = 13`) become `FolderState` nodes with `FolderType` type definition and `Organizes` references. They represent logical groupings in the Galaxy hierarchy (e.g., production lines, cells).
- **Non-area objects** (AppEngine, Platform, UserDefined, etc.) become `BaseObjectState` nodes with `BaseObjectType` type definition and `HasComponent` references. These represent runtime automation objects that carry attributes.
Drivers drive ordering. Typical pattern: root → folder per equipment → variables per tag. `GenericDriverNodeManager` calls `DiscoverAsync` once on startup and once per rediscovery cycle.
Both node types use `contained_name` as the browse name. When `contained_name` is null or empty, `tag_name` is used as a fallback.
## DriverAttributeInfo → OPC UA variable
## Variable Nodes for Attributes
Each variable carries a `DriverAttributeInfo` (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverAttributeInfo.cs`):
Each Galaxy attribute becomes a `BaseDataVariableState` node under its parent object. The variable is configured with:
| Field | OPC UA target |
|---|---|
| `FullName` | `NodeId.Identifier` — used as the driver-side lookup key for Read/Write/Subscribe |
| `DriverDataType` | mapped to a built-in `DataTypeIds.*` NodeId via `DriverNodeManager.MapDataType` |
| `IsArray` | `ValueRank = OneDimension` when true, `Scalar` otherwise |
| `ArrayDim` | declared array length, carried through as metadata |
| `SecurityClass` | stored in `_securityByFullRef` for `WriteAuthzPolicy` gating on write |
| `IsHistorized` | flips `AccessLevel.HistoryRead` + `Historizing = true` |
| `IsAlarm` | drives the `MarkAsAlarmCondition` pass (see below) |
| `WriteIdempotent` | stored in `_writeIdempotentByFullRef`; fed to `CapabilityInvoker.ExecuteWriteAsync` |
- **DataType** -- Mapped from `mx_data_type` via `MxDataTypeMapper` (see [DataTypeMapping.md](DataTypeMapping.md))
- **ValueRank** -- `OneDimension` (1) for arrays, `Scalar` (-1) for scalars
- **ArrayDimensions** -- Set to `[array_dimension]` when the attribute is an array
- **AccessLevel** -- `CurrentReadOrWrite` or `CurrentRead` based on security classification, with `HistoryRead` added for historized attributes
- **Historizing** -- Set to `true` for attributes with a `HistoryExtension` primitive
- **Initial value** -- `null` with `StatusCode = BadWaitingForInitialData` until the first MXAccess callback delivers a live value
The initial value stays `null` with `StatusCode = BadWaitingForInitialData` until the first Read or `ISubscribable.OnDataChange` push lands.
## Primitive Grouping
## CapturingBuilder + alarm sink registration
Galaxy objects can have primitive components (e.g., alarm extensions, history extensions) that attach sub-attributes to a parent attribute. The address space handles this with a two-pass approach:
`GenericDriverNodeManager.BuildAddressSpaceAsync` wraps the supplied builder in a `CapturingBuilder` before calling `DiscoverAsync`. The wrapper observes every `Variable()` call: when a returned `IVariableHandle.MarkAsAlarmCondition(AlarmConditionInfo)` fires, the sink is registered in the manager's `_alarmSinks` dictionary keyed by the variable's `FullReference`. Subsequent `IAlarmSource.OnAlarmEvent` pushes are routed to the matching sink by `SourceNodeId`. This keeps the alarm-wiring protocol declarative — drivers just flag `DriverAttributeInfo.IsAlarm = true` and the materialization of the OPC UA `AlarmConditionState` node is handled by the server layer. See `docs/AlarmTracking.md`.
### First pass: direct attributes
## NodeId scheme
Attributes with an empty `PrimitiveName` are created as direct variable children of the object node. If a direct attribute shares its name with a primitive group, the variable node reference is saved for the second pass.
### Second pass: primitive child attributes
Attributes with a non-empty `PrimitiveName` are grouped by that name. For each group:
1. If a direct attribute variable with the same name already exists, the primitive's child attributes are added as `HasComponent` children of that variable node. This merges alarm/history sub-attributes (e.g., `InAlarm`, `Priority`) under the parent variable they describe.
2. If no matching direct attribute exists, a new `BaseObjectState` node is created with NodeId `ns=1;s={TagName}.{PrimitiveName}`, and the primitive's attributes are added under it.
This structure means that browsing `TestMachine_001/SomeAlarmAttr` reveals both the process value and its alarm sub-attributes (`InAlarm`, `Priority`, `DescAttrName`) as children.
## NodeId Scheme
All node identifiers use string-based NodeIds in namespace index 1 (`ns=1`):
All nodes live in the driver's namespace (not a shared `ns=1`). Browse paths are driver-defined:
| Node type | NodeId format | Example |
|-----------|---------------|---------|
| Root folder | `ns=1;s=ZB` | `ns=1;s=ZB` |
| Area folder | `ns=1;s={tag_name}` | `ns=1;s=Area_001` |
| Object node | `ns=1;s={tag_name}` | `ns=1;s=TestMachine_001` |
| Scalar variable | `ns=1;s={tag_name}.{attr}` | `ns=1;s=TestMachine_001.MachineID` |
| Array variable | `ns=1;s={tag_name}.{attr}` | `ns=1;s=MESReceiver_001.MoveInPartNumbers` |
| Primitive sub-object | `ns=1;s={tag_name}.{prim}` | `ns=1;s=TestMachine_001.AlarmPrim` |
|---|---|---|
| Driver root | `ns;s={DriverInstanceId}` | `urn:OtOpcUa:galaxy-01;s=galaxy-01` |
| Folder | `ns;s={parent}/{browseName}` | `ns;s=galaxy-01/Area_001` |
| Variable | `ns;s={DriverAttributeInfo.FullName}` | `ns;s=DelmiaReceiver_001.DownloadPath` |
| Alarm condition | `ns;s={FullReference}.Condition` | `ns;s=DelmiaReceiver_001.Temperature.Condition` |
For array attributes, the `[]` suffix present in `full_tag_reference` is stripped from the NodeId. The `full_tag_reference` (with `[]`) is kept internally for MXAccess subscription addressing. This means `MESReceiver_001.MoveInPartNumbers[]` in the Galaxy maps to NodeId `ns=1;s=MESReceiver_001.MoveInPartNumbers`.
For Galaxy the `FullName` stays in the legacy `tag_name.AttributeName` format; Modbus uses `unit:register:type`; AB CIP uses the native `program:tag.member` path; etc. — the shape is the driver's choice.
## Topological Sort
## Per-driver hierarchy examples
The hierarchy query returns objects ordered by `parent_gobject_id, tag_name`, but this does not guarantee that a parent appears before all of its children in all cases. `LmxNodeManager.TopologicalSort` performs a depth-first traversal to produce a list where every parent is guaranteed to precede its children. This allows the build loop to look up parent nodes from `_nodeMap` without forward references.
- **Galaxy Proxy**: walks the DB-snapshot hierarchy (`GalaxyProxyDriver.DiscoverAsync`), streams Area objects as folders and non-area objects as variable-bearing folders, marks `IsAlarm = true` on attributes that have an `AlarmExtension` primitive. The v1 two-pass primitive-grouping logic is retained inside the Galaxy driver.
- **Modbus**: streams one folder per device, one variable per register range from `ModbusDriverOptions`. No alarm surface.
- **AB CIP**: uses `AbCipTemplateCache` to enumerate user-defined types, streams a folder per program with variables keyed on the native tag path.
- **OPC UA Client**: re-exposes a remote server's address space — browses the upstream and relays nodes through the builder.
## Platform Scope Filtering
See `docs/v2/driver-specs.md` for the per-driver discovery contracts.
When `GalaxyRepository.Scope` is set to `LocalPlatform`, the hierarchy and attributes passed to `BuildAddressSpace` are pre-filtered by `PlatformScopeFilter` inside `GalaxyRepositoryService`. The node manager receives only the local platform's objects and their ancestor areas, so the resulting browse tree is a subset of the full Galaxy. The filtering is transparent to `LmxNodeManager` — it builds nodes from whatever data it receives.
## Rediscovery
Clients browsing a `LocalPlatform`-scoped server will see only the areas and objects hosted by that platform. Areas that exist in the Galaxy but contain no local descendants are excluded. See [Galaxy Repository — Platform Scope Filter](GalaxyRepository.md#platform-scope-filter) for the filtering algorithm and configuration.
## Incremental Sync
On address space rebuild (triggered by a Galaxy deploy change), `SyncAddressSpace` uses `AddressSpaceDiff` to identify which `gobject_id` values have changed between the old and new snapshots. Only the affected subtrees are torn down and rebuilt, preserving unchanged nodes and their active subscriptions. Affected subscriptions are snapshot before teardown and replayed after rebuild.
If no previous state is cached (first build), the full `BuildAddressSpace` path runs instead.
Drivers that implement `IRediscoverable` fire `OnRediscoveryNeeded` when their backend signals a change (Galaxy: `time_of_last_deploy` advance; TwinCAT: symbol-version-changed; OPC UA Client: server namespace change). Core re-runs `DiscoverAsync` and diffs — see `docs/IncrementalSync.md`. Static drivers (Modbus, S7) don't implement `IRediscoverable`; their address space only changes when a new generation is published from the Config DB.
## Key source files
- `src/ZB.MOM.WW.OtOpcUa.Host/OpcUa/LmxNodeManager.cs` -- Node manager with `BuildAddressSpace`, `SyncAddressSpace`, and `TopologicalSort`
- `src/ZB.MOM.WW.OtOpcUa.Host/OpcUa/AddressSpaceBuilder.cs` -- Testable in-memory model builder
- `src/ZB.MOM.WW.OtOpcUa.Core/OpcUa/GenericDriverNodeManager.cs` — orchestration + `CapturingBuilder`
- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs` — OPC UA materialization (`IAddressSpaceBuilder` impl + `NestedBuilder`)
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IAddressSpaceBuilder.cs` — builder contract
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/ITagDiscovery.cs` — driver discovery capability
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverAttributeInfo.cs` — per-attribute descriptor

View File

@@ -1,234 +1,76 @@
# Alarm Tracking
`LmxNodeManager` generates OPC UA alarm conditions from Galaxy attributes marked as alarms. The system detects alarm-capable attributes during address space construction, creates `AlarmConditionState` nodes, auto-subscribes to the runtime alarm tags via MXAccess, and reports state transitions as OPC UA events.
Alarm surfacing is an optional driver capability exposed via `IAlarmSource` (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IAlarmSource.cs`). Drivers whose backends have an alarm concept implement it — today: Galaxy (MXAccess alarms), FOCAS (CNC alarms), OPC UA Client (A&C events from the upstream server). Modbus / S7 / AB CIP / AB Legacy / TwinCAT do not implement the interface and the feature is simply absent from their subtrees.
## AlarmInfo Structure
Each tracked alarm is represented by an `AlarmInfo` instance stored in the `_alarmInAlarmTags` dictionary, keyed by the `InAlarm` tag reference:
## IAlarmSource surface
```csharp
private sealed class AlarmInfo
{
public string SourceTagReference { get; set; } // e.g., "Tag_001.Temperature"
public NodeId SourceNodeId { get; set; }
public string SourceName { get; set; } // attribute name for event messages
public bool LastInAlarm { get; set; } // tracks previous state for edge detection
public AlarmConditionState? ConditionNode { get; set; }
public string PriorityTagReference { get; set; } // e.g., "Tag_001.Temperature.Priority"
public string DescAttrNameTagReference { get; set; } // e.g., "Tag_001.Temperature.DescAttrName"
public ushort CachedSeverity { get; set; }
public string CachedMessage { get; set; }
}
Task<IAlarmSubscriptionHandle> SubscribeAlarmsAsync(
IReadOnlyList<string> sourceNodeIds, CancellationToken cancellationToken);
Task UnsubscribeAlarmsAsync(IAlarmSubscriptionHandle handle, CancellationToken cancellationToken);
Task AcknowledgeAsync(IReadOnlyList<AlarmAcknowledgeRequest> acknowledgements,
CancellationToken cancellationToken);
event EventHandler<AlarmEventArgs>? OnAlarmEvent;
```
`LastInAlarm` enables edge detection so only actual transitions (inactive-to-active or active-to-inactive) generate events, not repeated identical values.
The driver fires `OnAlarmEvent` for every transition (`Active`, `Acknowledged`, `Inactive`) with an `AlarmEventArgs` carrying the source node id, condition id, alarm type, message, severity (`AlarmSeverity` enum), and source timestamp.
## Alarm Detection via is_alarm Flag
## AlarmSurfaceInvoker
During `BuildAddressSpace` (and `BuildSubtree` for incremental sync), the node manager scans each non-area Galaxy object for attributes where `IsAlarm == true` and `PrimitiveName` is empty (direct attributes only, not primitive children):
`AlarmSurfaceInvoker` (`src/ZB.MOM.WW.OtOpcUa.Core/Resilience/AlarmSurfaceInvoker.cs`) wraps the three mutating surfaces through `CapabilityInvoker`:
```csharp
var alarmAttrs = objAttrs.Where(a => a.IsAlarm && string.IsNullOrEmpty(a.PrimitiveName)).ToList();
```
- `SubscribeAlarmsAsync` / `UnsubscribeAlarmsAsync` run through the `DriverCapability.AlarmSubscribe` pipeline — retries apply under the tier configuration.
- `AcknowledgeAsync` runs through `DriverCapability.AlarmAcknowledge` which does NOT retry per decision #143. A timed-out ack may have already registered at the plant floor; replay would silently double-acknowledge.
The `IsAlarm` flag originates from the `AlarmExtension` primitive in the Galaxy repository database. When a Galaxy attribute has an associated `AlarmExtension` primitive, the SQL query sets `is_alarm = 1` on the corresponding `GalaxyAttributeInfo`.
Multi-host fan-out: when the driver implements `IPerCallHostResolver`, each source node id is resolved individually and batches are grouped by host so a dead PLC inside a multi-device driver doesn't poison sibling breakers. Single-host drivers fall back to `IDriver.DriverInstanceId` as the pipeline-key host.
For each alarm attribute, the code verifies that a corresponding `InAlarm` sub-attribute variable node exists in `_tagToVariableNode` (constructed from `FullTagReference + ".InAlarm"`). If the variable node is missing, the alarm is skipped -- this prevents creating orphaned alarm conditions for attributes whose extension primitives were not published.
## Condition-node creation via CapturingBuilder
## Template-Based Alarm Object Filter
Alarm-condition nodes are materialized at address-space build time. During `GenericDriverNodeManager.BuildAddressSpaceAsync` the builder is wrapped in a `CapturingBuilder` that observes every `Variable()` call. When a driver calls `IVariableHandle.MarkAsAlarmCondition(AlarmConditionInfo)` on a returned handle, the server-side `DriverNodeManager.VariableHandle` creates a sibling `AlarmConditionState` node and returns an `IAlarmConditionSink`. The wrapper stores the sink in `_alarmSinks` keyed by the variable's full reference, then `GenericDriverNodeManager` registers a forwarder on `IAlarmSource.OnAlarmEvent` that routes each push to the matching sink by `SourceNodeId`. Unknown source ids are dropped silently — they may belong to another driver.
When large galaxies contain more alarm-bearing objects than clients need, `OpcUa.AlarmFilter.ObjectFilters` restricts alarm condition creation to a subset of objects selected by **template name pattern**. The filter is applied at both alarm creation sites -- the full build in `BuildAddressSpace` and the subtree rebuild path triggered by Galaxy redeployment -- so the included set is recomputed on every rebuild against the fresh hierarchy.
The `AlarmConditionState` layout matches OPC UA Part 9:
### Matching rules
- `SourceNode` → the originating variable
- `SourceName` / `ConditionName` → from `AlarmConditionInfo.SourceName`
- Initial state: enabled, inactive, acknowledged, severity per `InitialSeverity`, retain false
- `HasCondition` references wire the source variable ↔ the condition node bidirectionally
- `*` is the only wildcard (glob-style, zero or more characters). All other regex metacharacters are escaped and matched literally.
- Matching is case-insensitive.
- The leading `$` used by Galaxy template `tag_name` values is normalized away on both the stored chain entry and the operator pattern, so `TestMachine*` matches the stored `$TestMachine`.
- Each configured entry may itself be comma-separated for operator convenience (`"TestMachine*, Pump_*"`).
- An empty list disables the filter and restores the prior behavior: every alarm-bearing object is tracked when `AlarmTrackingEnabled=true`.
Drivers flag alarm-bearing variables at discovery time via `DriverAttributeInfo.IsAlarm = true`. The Galaxy driver, for example, sets this on attributes that have an `AlarmExtension` primitive in the Galaxy repository DB; FOCAS sets it on the CNC alarm register.
### What gets included
## State transitions
Every Galaxy object whose **template derivation chain** contains any template matching any pattern is included. The chain walks `gobject.derived_from_gobject_id` from the instance through its immediate template and each ancestor template, up to `$Object`. An instance of `TestCoolMachine` whose chain is `$TestCoolMachine -> $TestMachine -> $UserDefined` matches the pattern `TestMachine` via the ancestor hit.
`ConditionSink.OnTransition` runs under the node manager's `Lock` and maps the `AlarmEventArgs.AlarmType` string to Part 9 state:
Inclusion propagates down the **containment hierarchy**: if an object matches, all of its descendants are included as well, regardless of their own template chains. This lets operators target a parent and pick up all its alarm-bearing children with one pattern.
| AlarmType | Action |
|---|---|
| `Active` | `SetActiveState(true)`, `SetAcknowledgedState(false)`, `Retain = true` |
| `Acknowledged` | `SetAcknowledgedState(true)` |
| `Inactive` | `SetActiveState(false)`; `Retain = false` once both inactive and acknowledged |
Each object is evaluated exactly once. Overlapping matches (multiple patterns hit, or both an ancestor and descendant match independently) never produce duplicate alarm condition subscriptions -- the filter operates on object identity via a `HashSet<int>` of included `GobjectId` values.
Severity is remapped: `AlarmSeverity.Low/Medium/High/Critical` → OPC UA numeric 250 / 500 / 700 / 900. `Message.Value` is set from `AlarmEventArgs.Message` on every transition. `ClearChangeMasks(true)` and `ReportEvent(condition)` fire the OPC UA event notification for clients subscribed to any ancestor notifier.
### Resolution algorithm
## Acknowledge dispatch
`AlarmObjectFilter.ResolveIncludedObjects(hierarchy)` runs once per build:
Alarm acknowledgement initiated by an OPC UA client flows:
1. Compile each pattern into a regex with `IgnoreCase | CultureInvariant | Compiled`.
2. Build a `parent -> children` map from the hierarchy. Orphans (parent id not in the hierarchy) are treated as roots.
3. BFS from each root with a `(nodeId, parentIncluded)` queue and a `visited` set for cycle defense.
4. At each node: if the parent was included OR any chain entry matches any pattern, add the node and mark its subtree as included.
5. Return the `HashSet<int>` of included object IDs. When no patterns are configured the filter is disabled and the method returns `null`, which the alarm loop treats as "no filtering".
1. The SDK invokes the `AlarmConditionState.OnAcknowledge` method delegate.
2. The handler checks the session's roles for `AlarmAck` — drivers never see a request the session wasn't entitled to make.
3. `AlarmSurfaceInvoker.AcknowledgeAsync` is called with the source / condition / comment tuple. The invoker groups by host and runs each batch through the no-retry `AlarmAcknowledge` pipeline.
After each resolution, `UnmatchedPatterns` exposes any raw pattern that matched zero objects so the startup log can warn about operator typos without failing startup.
Drivers return normally for success or throw to signal the ack failed at the backend.
### How the alarm loop applies the filter
## EventNotifier propagation
```csharp
// LmxNodeManager.BuildAddressSpace (and the subtree rebuild path)
if (_alarmTrackingEnabled)
{
var includedIds = ResolveAlarmFilterIncludedIds(sorted); // null if no filter
foreach (var obj in sorted)
{
if (obj.IsArea) continue;
if (includedIds != null && !includedIds.Contains(obj.GobjectId)) continue;
// ... existing alarm-attribute collection + AlarmConditionState creation
}
}
```
Drivers that want hierarchical alarm subscriptions propagate `EventNotifier.SubscribeToEvents` up the containment chain during discovery — the Galaxy driver flips the flag on every ancestor of an alarm-bearing object up to the driver root, mirroring v1 behavior. Clients subscribed at the driver root, a mid-level folder, or the `Objects/` root see alarm events from every descendant with an `AlarmConditionState` sibling. The driver-root `FolderState` is created in `DriverNodeManager.CreateAddressSpace` with `EventNotifier = SubscribeToEvents | HistoryRead` so alarm event subscriptions and alarm history both have a single natural target.
`ResolveAlarmFilterIncludedIds` also emits a one-line summary (`Alarm filter: X of Y objects included (Z pattern(s))`) and per-pattern warnings for patterns that matched nothing. The included count is published to the dashboard via `AlarmFilterIncludedObjectCount`.
## ConditionRefresh
### Runtime telemetry
The OPC UA `ConditionRefresh` service queues the current state of every retained condition back to the requesting monitored items. `DriverNodeManager` iterates the node manager's `AlarmConditionState` collection and queues each condition whose `Retain.Value == true` — matching the Part 9 requirement.
`LmxNodeManager` exposes three read-only properties populated by the filter:
## Key source files
- `AlarmFilterEnabled` -- true when patterns are configured.
- `AlarmFilterPatternCount` -- number of compiled patterns.
- `AlarmFilterIncludedObjectCount` -- number of objects in the most recent included set.
`StatusReportService` reads these into `AlarmStatusInfo.FilterEnabled`, `FilterPatternCount`, and `FilterIncludedObjectCount`. The Alarms panel on the dashboard renders `Filter: N pattern(s), M object(s) included` only when the filter is enabled. See [Status Dashboard](StatusDashboard.md#alarms).
### Validator warning
`ConfigurationValidator.ValidateAndLog()` logs the effective filter at startup and emits a `Warning` if `AlarmFilter.ObjectFilters` is non-empty while `AlarmTrackingEnabled` is `false`, because the filter would have no effect.
## AlarmConditionState Creation
Each detected alarm attribute produces an `AlarmConditionState` node:
```csharp
var condition = new AlarmConditionState(sourceVariable);
condition.Create(SystemContext, conditionNodeId,
new QualifiedName(alarmAttr.AttributeName + "Alarm", NamespaceIndex),
new LocalizedText("en", alarmAttr.AttributeName + " Alarm"),
true);
```
Key configuration on the condition node:
- **SourceNode** -- Set to the OPC UA NodeId of the source variable, linking the condition to the attribute that triggered it.
- **SourceName / ConditionName** -- Set to the Galaxy attribute name for identification in event notifications.
- **AutoReportStateChanges** -- Set to `true` so the OPC UA framework automatically generates event notifications when condition properties change.
- **Initial state** -- Enabled, inactive, acknowledged, severity Medium, retain false.
- **HasCondition references** -- Bidirectional references are added between the source variable and the condition node.
The condition's `OnReportEvent` callback forwards events to `Server.ReportEvent` so they reach clients subscribed at the server level.
### Condition Methods
Each alarm condition supports the following OPC UA Part 9 methods:
- **Acknowledge** (`OnAcknowledge`) -- Writes the acknowledgment message to the Galaxy `AckMsg` tag. Requires the `AlarmAck` role.
- **Confirm** (`OnConfirm`) -- Confirms a previously acknowledged alarm. The SDK manages the `ConfirmedState` transition.
- **AddComment** (`OnAddComment`) -- Attaches an operator comment to the condition for audit trail purposes.
- **Enable / Disable** (`OnEnableDisable`) -- Activates or deactivates alarm monitoring for the specific condition. The SDK manages the `EnabledState` transition.
- **Shelve** (`OnShelve`) -- Supports `TimedShelve`, `OneShotShelve`, and `Unshelve` operations. The SDK manages the `ShelvedStateMachineType` state transitions including automatic timed unshelve.
- **TimedUnshelve** (`OnTimedUnshelve`) -- Automatically called by the SDK when a timed shelve period expires.
### Event Fields
Alarm events include the following fields:
- `EventId` -- Unique GUID for each event, used as reference for Acknowledge/Confirm
- `ActiveState`, `AckedState`, `ConfirmedState` -- State transitions
- `Message` -- Alarm message from Galaxy `DescAttrName` or default text
- `Severity` -- Galaxy Priority clamped to OPC UA range 1-1000
- `Retain` -- True while alarm is active or unacknowledged
- `LocalTime` -- Server timezone offset with daylight saving flag
- `Quality` -- Set to Good for alarm events
## Auto-subscription to Alarm Tags
After alarm condition nodes are created, `SubscribeAlarmTags` opens MXAccess subscriptions for three tags per alarm:
1. **InAlarm** (`Tag_001.Temperature.InAlarm`) -- The boolean trigger for alarm activation/deactivation.
2. **Priority** (`Tag_001.Temperature.Priority`) -- Numeric priority that maps to OPC UA severity.
3. **DescAttrName** (`Tag_001.Temperature.DescAttrName`) -- String description used as the alarm event message.
These subscriptions are opened unconditionally (not ref-counted) because they serve the server's own alarm tracking, not client-initiated monitoring. Tags that do not have corresponding variable nodes in `_tagToVariableNode` are skipped.
## EventNotifier Propagation
When a Galaxy object contains at least one alarm attribute, `EventNotifiers.SubscribeToEvents` is set on the object node **and all its ancestors** up to the root. This allows OPC UA clients to subscribe to events at any level in the hierarchy and receive alarm notifications from all descendants:
```csharp
if (hasAlarms && _nodeMap.TryGetValue(obj.GobjectId, out var objNode))
EnableEventNotifierUpChain(objNode);
```
For example, an alarm on `TestMachine_001.SubObject.Temperature` will be visible to clients subscribed on `SubObject`, `TestMachine_001`, or the root `ZB` folder. The root `ZB` folder also has `EventNotifiers.SubscribeToEvents` enabled during initial construction.
## InAlarm Transition Detection in DispatchLoop
Alarm state changes are detected in the dispatch loop's Phase 1 (outside `Lock`), which runs on the background dispatch thread rather than the STA thread. This placement is intentional because the detection logic reads Priority and DescAttrName values from MXAccess, which would block the STA thread if done inside the `OnMxAccessDataChange` callback.
For each pending data change, the loop checks whether the address matches a key in `_alarmInAlarmTags`:
```csharp
if (_alarmInAlarmTags.TryGetValue(address, out var alarmInfo))
{
var newInAlarm = vtq.Value is true || vtq.Value is 1
|| (vtq.Value is int intVal && intVal != 0);
if (newInAlarm != alarmInfo.LastInAlarm)
{
alarmInfo.LastInAlarm = newInAlarm;
// Read Priority and DescAttrName via MXAccess (outside Lock)
...
pendingAlarmEvents.Add((alarmInfo, newInAlarm));
}
}
```
The boolean coercion handles multiple value representations: `true`, integer `1`, or any non-zero integer. When the value changes state, Priority and DescAttrName are read synchronously from MXAccess to populate `CachedSeverity` and `CachedMessage`. These reads happen outside `Lock` because they call into the STA thread.
Priority values are clamped to the OPC UA severity range (1-1000). Both `int` and `short` types are handled.
## ReportAlarmEvent
`ReportAlarmEvent` runs inside `Lock` during Phase 2 of the dispatch loop. It updates the `AlarmConditionState` and generates an OPC UA event:
```csharp
condition.SetActiveState(SystemContext, active);
condition.Message.Value = new LocalizedText("en", message);
condition.SetSeverity(SystemContext, (EventSeverity)severity);
condition.Retain.Value = active || (condition.AckedState?.Id?.Value == false);
```
Key behaviors:
- **Active state** -- Set to `true` on activation, `false` on clearing.
- **Message** -- Uses `CachedMessage` (from DescAttrName) when available on activation. Falls back to a generated `"Alarm active: {SourceName}"` string. Cleared alarms always use `"Alarm cleared: {SourceName}"`.
- **Severity** -- Set from `CachedSeverity`, which was read from the Priority tag.
- **Retain** -- `true` while the alarm is active or unacknowledged. This keeps the condition visible in condition refresh responses.
- **Acknowledged state** -- Reset to `false` when the alarm activates, requiring explicit client acknowledgment. When role-based auth is active, alarm acknowledgment requires the `AlarmAck` role on the session (checked via `GrantedRoleIds`). Users without this role receive `BadUserAccessDenied`.
The event is reported by walking up the notifier chain from the source variable's parent through all ancestor nodes. Each ancestor with `EventNotifier` set receives the event via `ReportEvent`, so clients subscribed at any level in the Galaxy hierarchy see alarm transitions from descendant objects.
## Condition Refresh Override
The `ConditionRefresh` override iterates all tracked alarms and queues retained conditions to the requesting monitored items:
```csharp
public override ServiceResult ConditionRefresh(OperationContext context,
IList<IEventMonitoredItem> monitoredItems)
{
foreach (var kvp in _alarmInAlarmTags)
{
var info = kvp.Value;
if (info.ConditionNode == null || info.ConditionNode.Retain?.Value != true)
continue;
foreach (var item in monitoredItems)
item.QueueEvent(info.ConditionNode);
}
return ServiceResult.Good;
}
```
Only conditions where `Retain.Value == true` are included. This means only active or unacknowledged alarms appear in condition refresh responses, matching the OPC UA specification requirement that condition refresh returns the current state of all retained conditions.
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IAlarmSource.cs` — capability contract + `AlarmEventArgs`
- `src/ZB.MOM.WW.OtOpcUa.Core/Resilience/AlarmSurfaceInvoker.cs` — per-host fan-out + no-retry ack
- `src/ZB.MOM.WW.OtOpcUa.Core/OpcUa/GenericDriverNodeManager.cs``CapturingBuilder` + alarm forwarder
- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs``VariableHandle.MarkAsAlarmCondition` + `ConditionSink`
- `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/Backend/Alarms/GalaxyAlarmTracker.cs` — Galaxy-specific alarm-event production

View File

@@ -1,84 +1,65 @@
# Data Type Mapping
`MxDataTypeMapper` and `SecurityClassificationMapper` translate Galaxy attribute metadata into OPC UA variable node properties. These mappings determine how Galaxy runtime values are represented to OPC UA clients and whether clients can write to them.
Data-type mapping is driver-defined. Each driver translates its native attribute metadata into two driver-agnostic enums from `Core.Abstractions``DriverDataType` (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverDataType.cs`) and `SecurityClassification` (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/SecurityClassification.cs`) — and populates the `DriverAttributeInfo` record it hands to `IAddressSpaceBuilder.Variable(...)`. Core doesn't interpret the native types; it trusts the driver's translation.
## mx_data_type to OPC UA Type Mapping
## DriverDataType OPC UA built-in type
Each Galaxy attribute carries an `mx_data_type` integer that identifies its data type. `MxDataTypeMapper.MapToOpcUaDataType` maps these to OPC UA built-in type NodeIds:
`DriverNodeManager.MapDataType` (`src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs`) is the single translation table for every driver:
| mx_data_type | Galaxy type | OPC UA type | NodeId | CLR type |
|:---:|-------------|-------------|:------:|----------|
| 1 | Boolean | Boolean | i=1 | `bool` |
| 2 | Integer | Int32 | i=6 | `int` |
| 3 | Float | Float | i=10 | `float` |
| 4 | Double | Double | i=11 | `double` |
| 5 | String | String | i=12 | `string` |
| 6 | Time | DateTime | i=13 | `DateTime` |
| 7 | ElapsedTime | Double | i=11 | `double` |
| 8 | Reference | String | i=12 | `string` |
| 13 | Enumeration | Int32 | i=6 | `int` |
| 14 | Custom | String | i=12 | `string` |
| 15 | InternationalizedString | LocalizedText | i=21 | `string` |
| 16 | Custom | String | i=12 | `string` |
| other | Unknown | String | i=12 | `string` |
| DriverDataType | OPC UA NodeId |
|---|---|
| `Boolean` | `DataTypeIds.Boolean` (i=1) |
| `Int32` | `DataTypeIds.Int32` (i=6) |
| `Float32` | `DataTypeIds.Float` (i=10) |
| `Float64` | `DataTypeIds.Double` (i=11) |
| `String` | `DataTypeIds.String` (i=12) |
| `DateTime` | `DataTypeIds.DateTime` (i=13) |
| anything else | `DataTypeIds.BaseDataType` |
Unknown types default to String. This is a safe fallback because MXAccess delivers values as COM `VARIANT` objects, and string serialization preserves any value that does not have a direct OPC UA counterpart.
The enum also carries `Int16 / Int64 / UInt16 / UInt32 / UInt64 / Reference` members for drivers that need them; the mapping table is extended as those types surface in actual drivers. `Reference` is the Galaxy-style attribute reference — it's encoded as an OPC UA `String` on the wire.
### Why ElapsedTime maps to Double
## Per-driver mappers
Galaxy `ElapsedTime` (mx_data_type 7) represents a duration/timespan. OPC UA has no native `TimeSpan` type. The OPC UA specification defines a `Duration` type alias (NodeId i=290) that is semantically a `Double` representing milliseconds, but the simpler approach is to map directly to `Double` (i=11) representing seconds. This avoids ambiguity about whether the value is in seconds or milliseconds and matches how the Galaxy runtime exposes elapsed time values through MXAccess.
Each driver owns its native → `DriverDataType` translation:
## Array Handling
- **Galaxy Proxy** — `GalaxyProxyDriver.MapDataType(int mxDataType)` and `MapSecurity(int mxSec)` (inline in `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy/GalaxyProxyDriver.cs`). The Galaxy `mx_data_type` integer is sent across the Host↔Proxy pipe and mapped on the Proxy side. Galaxy's full classic 16-entry table (Boolean / Integer / Float / Double / String / Time / ElapsedTime / Reference / Enumeration / Custom / InternationalizedString) is preserved but compressed into the seven-entry `DriverDataType` enum — `ElapsedTime``Float64`, `InternationalizedString``String`, `Reference``Reference`, enumerations → `Int32`.
- **AB CIP** — `src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/AbCipDataType.cs` maps CIP tag type codes.
- **Modbus** — `src/ZB.MOM.WW.OtOpcUa.Driver.Modbus/ModbusDriver.cs` maps register shapes (16-bit signed, 16-bit unsigned, 32-bit float, etc.) including the DirectLogic quirk table in `DirectLogicAddress.cs`.
- **S7 / AB Legacy / TwinCAT / FOCAS / OPC UA Client** — each has its own inline mapper or `*DataType.cs` file per the same pattern.
Galaxy attributes with `is_array = 1` in the repository are exposed as one-dimensional OPC UA array variables.
The driver's mapping is authoritative — when a field type is ambiguous (a `LREAL` that could be bit-reinterpreted, a BCD counter, a string of a particular encoding), the driver decides the exposed OPC UA shape.
### ValueRank
## Array handling
The `ValueRank` property on the OPC UA variable node indicates the array dimensionality:
`DriverAttributeInfo.IsArray = true` flips `ValueRank = OneDimension` on the generated `BaseDataVariableState`; scalars stay at `ValueRank.Scalar`. `DriverAttributeInfo.ArrayDim` carries the declared length. Writing element-by-element (OPC UA `IndexRange`) is a driver-level decision — see `docs/ReadWriteOperations.md`.
| `is_array` | ValueRank | Constant |
|:---:|:---------:|----------|
| 0 | -1 | `ValueRanks.Scalar` |
| 1 | 1 | `ValueRanks.OneDimension` |
## SecurityClassification — metadata, not ACL
### ArrayDimensions
`SecurityClassification` is driver-reported metadata only. Drivers never enforce write permissions themselves — the classification flows into the Server project where `WriteAuthzPolicy.IsAllowed(classification, userRoles)` (`src/ZB.MOM.WW.OtOpcUa.Server/Security/WriteAuthzPolicy.cs`) gates the write against the session's LDAP-derived roles, and (Phase 6.2) the `AuthorizationGate` + permission trie apply on top. This is the "ACL at server layer" invariant recorded in `feedback_acl_at_server_layer.md`.
When `ValueRank = 1`, the `ArrayDimensions` property is set to a single-element `ReadOnlyList<uint>` containing the declared array length from `array_dimension`:
The classification values mirror the v1 Galaxy model so existing Galaxy galaxies keep their published semantics:
```csharp
if (attr.IsArray && attr.ArrayDimension.HasValue)
{
variable.ArrayDimensions = new ReadOnlyList<uint>(
new List<uint> { (uint)attr.ArrayDimension.Value });
}
```
| SecurityClassification | Required role | Write-from-OPC-UA |
|---|---|---|
| `FreeAccess` | — | yes (even anonymous) |
| `Operate` | `WriteOperate` | yes |
| `Tune` | `WriteTune` | yes |
| `Configure` | `WriteConfigure` | yes |
| `SecuredWrite` | `WriteOperate` | yes |
| `VerifiedWrite` | `WriteConfigure` | yes |
| `ViewOnly` | — | no |
The `array_dimension` value is extracted from the `mx_value` binary column in the Galaxy database (bytes 13-16, little-endian int32).
Drivers whose backend has no notion of classification (Modbus, most PLCs) default every tag to `FreeAccess` or `Operate`; drivers whose backend does carry the notion (Galaxy, OPC UA Client relaying `UserAccessLevel`) translate it directly.
### NodeId for array variables
## Historization
Array variables use a NodeId without the `[]` suffix. The `full_tag_reference` stored internally for MXAccess addressing retains the `[]` (e.g., `MESReceiver_001.MoveInPartNumbers[]`), but the OPC UA NodeId strips it to `ns=1;s=MESReceiver_001.MoveInPartNumbers`.
## Security Classification to AccessLevel Mapping
Galaxy attributes carry a `security_classification` value that controls write permissions. `SecurityClassificationMapper.IsWritable` determines the OPC UA `AccessLevel`:
| security_classification | Galaxy level | OPC UA AccessLevel | Writable |
|:---:|--------------|-------------------|:--------:|
| 0 | FreeAccess | CurrentReadOrWrite | Yes |
| 1 | Operate | CurrentReadOrWrite | Yes |
| 2 | SecuredWrite | CurrentRead | No |
| 3 | VerifiedWrite | CurrentRead | No |
| 4 | Tune | CurrentReadOrWrite | Yes |
| 5 | Configure | CurrentReadOrWrite | Yes |
| 6 | ViewOnly | CurrentRead | No |
Most attributes default to Operate (1). The mapper treats SecuredWrite, VerifiedWrite, and ViewOnly as read-only because the OPC UA server does not implement the Galaxy's multi-level authentication model. Allowing writes to SecuredWrite or VerifiedWrite attributes without proper verification would bypass Galaxy security.
For historized attributes, `AccessLevels.HistoryRead` is added to the access level via bitwise OR, enabling OPC UA history read requests when an `IHistorianDataSource` is configured via the runtime-loaded historian plugin.
`DriverAttributeInfo.IsHistorized = true` flips `AccessLevel.HistoryRead` and `Historizing = true` on the variable. The driver must then implement `IHistoryProvider` for HistoryRead service calls to succeed; otherwise the node manager surfaces `BadHistoryOperationUnsupported` per request.
## Key source files
- `src/ZB.MOM.WW.OtOpcUa.Host/Domain/MxDataTypeMapper.cs` -- Type and CLR mapping
- `src/ZB.MOM.WW.OtOpcUa.Host/Domain/SecurityClassificationMapper.cs` -- Write access mapping
- `gr/data_type_mapping.md` -- Reference documentation for the full mapping table
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverDataType.cs` — driver-agnostic type enum
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/SecurityClassification.cs` — write-authz tier metadata
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverAttributeInfo.cs` — per-attribute descriptor
- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs``MapDataType` translation
- `src/ZB.MOM.WW.OtOpcUa.Server/Security/WriteAuthzPolicy.cs` — classification-to-role policy
- Per-driver mappers in each `Driver.*` project

View File

@@ -1,121 +1,65 @@
# Incremental Sync
When a Galaxy redeployment is detected, the OPC UA address space must be updated to reflect the new hierarchy and attributes. Rather than tearing down the entire address space and rebuilding from scratch (which disconnects all clients and drops all subscriptions), `LmxNodeManager` performs an incremental sync that identifies changed objects and rebuilds only the affected subtrees.
Two distinct change-detection paths feed the running server: driver-backend rediscovery (Galaxy's `time_of_last_deploy`, TwinCAT's symbol-version-changed, OPC UA Client's upstream namespace change) and generation-level config publishes from the Admin UI. Both flow into re-runs of `ITagDiscovery.DiscoverAsync`, but they originate differently.
## Cached State
## Driver-backend rediscovery — IRediscoverable
`LmxNodeManager` retains shallow copies of the last-published hierarchy and attributes:
Drivers whose backend has a native change signal implement `IRediscoverable` (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IRediscoverable.cs`):
```csharp
private List<GalaxyObjectInfo>? _lastHierarchy;
private List<GalaxyAttributeInfo>? _lastAttributes;
```
These are updated at the end of every `BuildAddressSpace` or `SyncAddressSpace` call via `new List<T>(source)` to create independent copies. The copies serve as the baseline for the next diff comparison.
On the first call (when `_lastHierarchy` is null), `SyncAddressSpace` falls through to a full `BuildAddressSpace` since there is no baseline to diff against.
## AddressSpaceDiff
`AddressSpaceDiff` is a static helper class that computes the set of changed Galaxy object IDs between two snapshots.
### FindChangedGobjectIds
This method compares old and new hierarchy+attributes and returns a `HashSet<int>` of gobject IDs that have any difference. It detects three categories of changes:
**Added objects** -- Present in new hierarchy but not in old:
```csharp
foreach (var id in newObjects.Keys)
if (!oldObjects.ContainsKey(id))
changed.Add(id);
```
**Removed objects** -- Present in old hierarchy but not in new:
```csharp
foreach (var id in oldObjects.Keys)
if (!newObjects.ContainsKey(id))
changed.Add(id);
```
**Modified objects** -- Present in both but with different properties. `ObjectsEqual` compares `TagName`, `BrowseName`, `ContainedName`, `ParentGobjectId`, and `IsArea`.
**Attribute set changes** -- For objects that exist in both snapshots, attributes are grouped by `GobjectId` and compared pairwise. `AttributeSetsEqual` sorts both lists by `FullTagReference` and `PrimitiveName`, then checks each pair via `AttributesEqual`, which compares `AttributeName`, `FullTagReference`, `MxDataType`, `IsArray`, `ArrayDimension`, `PrimitiveName`, `SecurityClassification`, `IsHistorized`, and `IsAlarm`. A difference in count or any field mismatch marks the owning gobject as changed.
Objects already marked as changed by hierarchy comparison are skipped during attribute comparison to avoid redundant work.
### ExpandToSubtrees
When a Galaxy object changes, its children must also be rebuilt because they may reference the parent's node or have inherited attribute changes. `ExpandToSubtrees` performs a BFS traversal from each changed ID, adding all descendants:
```csharp
public static HashSet<int> ExpandToSubtrees(HashSet<int> changed,
List<GalaxyObjectInfo> hierarchy)
public interface IRediscoverable
{
var childrenByParent = hierarchy.GroupBy(h => h.ParentGobjectId)
.ToDictionary(g => g.Key, g => g.Select(h => h.GobjectId).ToList());
var expanded = new HashSet<int>(changed);
var queue = new Queue<int>(changed);
while (queue.Count > 0)
{
var id = queue.Dequeue();
if (childrenByParent.TryGetValue(id, out var children))
foreach (var childId in children)
if (expanded.Add(childId))
queue.Enqueue(childId);
}
return expanded;
event EventHandler<RediscoveryEventArgs>? OnRediscoveryNeeded;
}
public sealed record RediscoveryEventArgs(string Reason, string? ScopeHint);
```
The expansion runs against both the old and new hierarchy. This is necessary because a removed parent's children appear in the old hierarchy (for teardown) while an added parent's children appear in the new hierarchy (for construction).
The driver fires the event with a reason string (for the diagnostic log) and an optional scope hint — a non-null hint lets Core scope the rebuild surgically to that subtree; null means "the whole address space may have changed".
## SyncAddressSpace Flow
Drivers that implement the capability today:
`SyncAddressSpace` orchestrates the incremental update inside the OPC UA framework `Lock`:
- **Galaxy** — polls `galaxy.time_of_last_deploy` in the Galaxy repository DB and fires on change. This is Galaxy-internal change detection, not the platform-wide mechanism.
- **TwinCAT** — observes ADS symbol-version-changed notifications (`0x0702`).
- **OPC UA Client** — subscribes to the upstream server's `Server/NamespaceArray` change notifications.
1. **Diff** -- Call `FindChangedGobjectIds` with the cached and new snapshots. If no changes are detected, update the cached snapshots and return early.
Static drivers (Modbus, S7, AB CIP, AB Legacy, FOCAS) do not implement `IRediscoverable` — their tags only change when a new generation is published from the Config DB. Core sees absence of the interface and skips change-detection wiring for those drivers (decision #54).
2. **Expand** -- Call `ExpandToSubtrees` on both old and new hierarchies to include descendant objects.
## Config-DB generation publishes
3. **Snapshot subscriptions** -- Before teardown, iterate `_gobjectToTagRefs` for each changed gobject ID and record the current MXAccess subscription ref-counts. These are needed to restore subscriptions after rebuild.
Tag-set changes authored in the Admin UI (UNS edits, CSV imports, driver-config edits) accumulate in a draft generation and commit via `sp_PublishGeneration`. The delta between the currently-published generation and the proposed next one is computed by `sp_ComputeGenerationDiff`, which drives:
4. **Teardown** -- Call `TearDownGobjects` to remove the old nodes and clean up tracking state.
- The **DiffViewer** in Admin (`src/ZB.MOM.WW.OtOpcUa.Admin/Components/Pages/Clusters/DiffViewer.razor`) so operators can preview what will change before clicking Publish.
- The 409-on-stale-draft flow (decision #161) — a UNS drag-reorder preview carries a `DraftRevisionToken` so Confirm returns `409 Conflict / refresh-required` if the draft advanced between preview and commit.
5. **Rebuild** -- Filter the new hierarchy and attributes to only the changed gobject IDs, then call `BuildSubtree` to create the replacement nodes.
After publish, the server's generation applier invokes `IDriver.ReinitializeAsync(driverConfigJson, ct)` on every driver whose `DriverInstance.DriverConfig` row changed in the new generation. Reinitialize is the in-process recovery path for Tier A/B drivers; if it fails the driver is marked `DriverState.Faulted` and its nodes go Bad quality — but the server process stays running. See `docs/v2/driver-stability.md`.
6. **Restore subscriptions** -- For each previously subscribed tag reference that still exists in `_tagToVariableNode` after rebuild, re-open the MXAccess subscription and restore the original ref-count.
Drivers whose discovery depends on Config DB state (Modbus register maps, S7 DBs, AB CIP tag lists) re-run their discovery inside `ReinitializeAsync`; Core then diffs the new node set against the current address space.
7. **Update cache** -- Replace `_lastHierarchy` and `_lastAttributes` with shallow copies of the new data.
## Rebuild flow
## TearDownGobjects
When a rediscovery is triggered (by either source), `GenericDriverNodeManager` re-runs `ITagDiscovery.DiscoverAsync` into the same `CapturingBuilder` it used at first build. The new node set is diffed against the current:
`TearDownGobjects` removes all OPC UA nodes and tracking state for a set of gobject IDs:
1. **Diff** — full-name comparison of the new `DriverAttributeInfo` set against the existing `_variablesByFullRef` map. Added / removed / modified references are partitioned.
2. **Snapshot subscriptions** — before teardown, Core captures the current monitored-item ref-counts for every affected reference so subscriptions can be replayed after rebuild.
3. **Teardown** — removed / modified variable nodes are deleted via `CustomNodeManager2.DeleteNode`. Driver-side subscriptions for those references are unwound via `ISubscribable.UnsubscribeAsync`.
4. **Rebuild** — added / modified references get fresh `BaseDataVariableState` nodes via the standard `IAddressSpaceBuilder.Variable(...)` path. Alarm-flagged references re-register their `IAlarmConditionSink` through `CapturingBuilder`.
5. **Restore subscriptions** — for every captured reference that still exists after rebuild, Core re-opens the driver subscription and restores the original ref-count.
For each gobject ID, it processes the associated tag references from `_gobjectToTagRefs`:
Exceptions during teardown are swallowed per decision #12 — a driver throw must not leave the node tree half-deleted.
1. **Unsubscribe** -- If the tag has an active MXAccess subscription (entry in `_subscriptionRefCounts`), call `UnsubscribeAsync` and remove the ref-count entry.
## Scope hint
2. **Remove alarm tracking** -- Find any `_alarmInAlarmTags` entries whose `SourceTagReference` matches the tag. For each, unsubscribe the InAlarm, Priority, and DescAttrName tags, then remove the alarm entry.
When `RediscoveryEventArgs.ScopeHint` is non-null (e.g. a folder path), Core restricts the diff to that subtree. This matters for Galaxy Platform-scoped deployments where a `time_of_last_deploy` advance may only affect one platform's subtree, and for OPC UA Client where an upstream change may be localized. Null scope falls back to a full-tree diff.
3. **Delete variable node** -- Call `DeleteNode` on the variable's `NodeId`, remove from `_tagToVariableNode`, clean up `_nodeIdToTagReference` and `_tagMetadata`, and decrement `VariableNodeCount`.
## Active subscriptions survive rebuild
4. **Delete object/folder node** -- Remove the gobject's entry from `_nodeMap` and call `DeleteNode`. Non-folder nodes decrement `ObjectNodeCount`.
Subscriptions for unchanged references stay live across rebuilds — their ref-count map is not disturbed. Clients monitoring a stable tag never see a data-change gap during a deploy, only clients monitoring a tag that was genuinely removed see the subscription drop.
All MXAccess calls and `DeleteNode` calls are wrapped in try/catch with ignored exceptions, since teardown must complete even if individual cleanup steps fail.
## Key source files
## BuildSubtree
`BuildSubtree` creates OPC UA nodes for a subset of the Galaxy hierarchy, reusing existing parent nodes from `_nodeMap`.
The method first topologically sorts the input hierarchy (same `TopologicalSort` used by `BuildAddressSpace`) to ensure parents are created before children. For each object:
1. **Find parent** -- Look up `ParentGobjectId` in `_nodeMap`. If the parent was not part of the changed set, it already exists from the previous build. If no parent is found, fall back to the root `ZB` folder. This is the key difference from `BuildAddressSpace` -- subtree builds reuse the existing node tree rather than starting from the root.
2. **Create node** -- Areas become `FolderState` with `Organizes` reference; non-areas become `BaseObjectState` with `HasComponent` reference. The node is added to `_nodeMap`.
3. **Create variable nodes** -- Attributes are processed with the same primitive-grouping logic as `BuildAddressSpace`, creating `BaseDataVariableState` nodes via `CreateAttributeVariable`.
4. **Alarm tracking** -- If `_alarmTrackingEnabled` is set, alarm attributes are detected and `AlarmConditionState` nodes are created using the same logic as the full build. EventNotifier flags are set on parent nodes, and alarm tags are auto-subscribed.
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IRediscoverable.cs` — backend-change capability
- `src/ZB.MOM.WW.OtOpcUa.Core/OpcUa/GenericDriverNodeManager.cs` — discovery orchestration
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IDriver.cs``ReinitializeAsync` contract
- `src/ZB.MOM.WW.OtOpcUa.Admin/Services/GenerationService.cs` — publish-flow driver
- `docs/v2/config-db-schema.md``sp_PublishGeneration` + `sp_ComputeGenerationDiff`
- `docs/v2/admin-ui.md` — DiffViewer + draft-revision-token flow

View File

@@ -1,137 +1,88 @@
# OPC UA Server
The OPC UA server component hosts the Galaxy-backed namespace on a configurable TCP endpoint and exposes deployed System Platform objects and attributes to OPC UA clients.
The OPC UA server component (`src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs`) hosts the OPC UA stack and exposes one browsable subtree per registered driver. The server itself is driver-agnostic — Galaxy/MXAccess, Modbus, S7, AB CIP, AB Legacy, TwinCAT, FOCAS, and OPC UA Client are all plugged in as `IDriver` implementations via the capability interfaces in `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/`.
## Composition
`OtOpcUaServer` subclasses the OPC Foundation `StandardServer` and wires:
- A `DriverHost` (`src/ZB.MOM.WW.OtOpcUa.Core/Hosting/DriverHost.cs`) which registers drivers and holds the per-instance `IDriver` references.
- One `DriverNodeManager` per registered driver (`src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs`), constructed in `CreateMasterNodeManager`. Each manager owns its own namespace URI (`urn:OtOpcUa:{DriverInstanceId}`) and exposes the driver as a subtree under the standard `Objects` folder.
- A `CapabilityInvoker` (`src/ZB.MOM.WW.OtOpcUa.Core/Resilience/CapabilityInvoker.cs`) per driver instance, keyed on `(DriverInstanceId, HostName, DriverCapability)` against the shared `DriverResiliencePipelineBuilder`. Every Read/Write/Discovery/Subscribe/HistoryRead/AlarmSubscribe call on the driver flows through this invoker so the Polly pipeline (retry / timeout / breaker / bulkhead) applies. The OTOPCUA0001 Roslyn analyzer enforces the wrapping at compile time.
- An `IUserAuthenticator` (LDAP in production, injected stub in tests) for `UserName` token validation in the `ImpersonateUser` hook.
- Optional `AuthorizationGate` + `NodeScopeResolver` (Phase 6.2) that sit in front of every dispatch call. In lax mode the gate passes through when the identity lacks LDAP groups so existing integration tests keep working; strict mode (`Authorization:StrictMode = true`) denies those cases.
`OtOpcUaServer.DriverNodeManagers` exposes the materialized list so the hosting layer can walk each one post-start and call `GenericDriverNodeManager.BuildAddressSpaceAsync(manager)` — the manager is passed as its own `IAddressSpaceBuilder`.
## Configuration
`OpcUaConfiguration` defines the server endpoint and session settings. All properties have sensible defaults:
Server wiring used to live in `appsettings.json`. It now flows from the SQL Server **Config DB**: `ServerInstance` + `DriverInstance` + `Tag` + `NodeAcl` rows are published as a *generation* via `sp_PublishGeneration` and loaded into the running process by the generation applier. The Admin UI (Blazor Server, `docs/v2/admin-ui.md`) is the operator surface — drafts accumulate edits; `sp_ComputeGenerationDiff` drives the DiffViewer preview; a UNS drag-reorder carries a `DraftRevisionToken` so Confirm re-checks against the current draft and returns 409 if it advanced (decision #161). See `docs/v2/config-db-schema.md` for the schema.
| Property | Default | Description |
|----------|---------|-------------|
| `BindAddress` | `0.0.0.0` | IP address or hostname the server binds to |
| `Port` | `4840` | TCP port the server listens on |
| `EndpointPath` | `/LmxOpcUa` | URI path appended to the base address |
| `ServerName` | `LmxOpcUa` | Application name presented to clients |
| `GalaxyName` | `ZB` | Galaxy name used in the namespace URI |
| `MaxSessions` | `100` | Maximum concurrent client sessions |
| `SessionTimeoutMinutes` | `30` | Idle session timeout |
| `AlarmTrackingEnabled` | `false` | Enables `AlarmConditionState` nodes for alarm attributes |
| `AlarmFilter.ObjectFilters` | `[]` | Wildcard template-name patterns that scope alarm tracking to matching objects and their descendants (see [Alarm Tracking](AlarmTracking.md#template-based-alarm-object-filter)) |
Environmental knobs that aren't per-tenant (bind address, port, PKI path) still live in `appsettings.json` on the Server project; everything tenant-scoped moved to the Config DB.
The resulting endpoint URL is `opc.tcp://{BindAddress}:{Port}{EndpointPath}`, e.g., `opc.tcp://0.0.0.0:4840/LmxOpcUa`.
## Transport
The namespace URI follows the pattern `urn:{GalaxyName}:LmxOpcUa` and is used as the `ProductUri`. The `ApplicationUri` can be set independently via `OpcUa.ApplicationUri` to support redundant deployments where each instance needs a unique identity. When `ApplicationUri` is null, it defaults to the namespace URI.
The server binds one TCP endpoint per `ServerInstance` (default `opc.tcp://0.0.0.0:4840`). The `ApplicationConfiguration` is built programmatically in the `OpcUaApplicationHost` — there are no UA XML files. Security profiles (`None`, `Basic256Sha256-Sign`, `Basic256Sha256-SignAndEncrypt`) are resolved from the `ServerInstance.Security` JSON at startup; the default profile is still `None` for backward compatibility. User token policies (`Anonymous`, `UserName`) are attached based on whether LDAP is configured. See `docs/security.md` for hardening.
## Programmatic ApplicationConfiguration
## Session impersonation
`OpcUaServerHost` builds the entire `ApplicationConfiguration` in code. There are no XML configuration files. This keeps deployment simple on factory floor machines where editing XML is error-prone.
`OtOpcUaServer.OnImpersonateUser` handles the three token types:
The configuration covers:
- `AnonymousIdentityToken` → default anonymous `UserIdentity`.
- `UserNameIdentityToken``IUserAuthenticator.AuthenticateAsync` validates the credential (`LdapUserAuthenticator` in production). On success, the resolved display name + LDAP-derived roles are wrapped in a `RoleBasedIdentity` that implements `IRoleBearer`. `DriverNodeManager.OnWriteValue` reads these roles via `context.UserIdentity is IRoleBearer` and applies `WriteAuthzPolicy` per write.
- Anything else → `BadIdentityTokenInvalid`.
- **ServerConfiguration** -- base address, session limits, security policies, and user token policies
- **SecurityConfiguration** -- certificate store paths under `%LOCALAPPDATA%\OPC Foundation\pki\`, auto-accept enabled
- **TransportQuotas** -- 4 MB max message/string/byte-string size, 120-second operation timeout, 1-hour security token lifetime
- **TraceConfiguration** -- OPC Foundation SDK tracing is disabled (output path `null`, trace masks `0`); all logging goes through Serilog instead
The Phase 6.2 `AuthorizationGate` runs on top of this baseline: when configured it consults the cluster's permission trie (loaded from `NodeAcl` rows) using the session's `UserAuthorizationState` and can deny Read / HistoryRead / Write / Browse independently per tag. See `docs/v2/acl-design.md`.
## Security Profiles
## Dispatch
The server supports configurable transport security profiles controlled by the `Security` section in `appsettings.json`. The default configuration exposes only `MessageSecurityMode.None` for backward compatibility.
Every service call the stack hands to `DriverNodeManager` is translated to the driver's capability interface and routed through `CapabilityInvoker`:
Supported Phase 1 profiles:
| Profile Name | SecurityPolicy URI | MessageSecurityMode |
| Service | Capability | Invoker method |
|---|---|---|
| `None` | `SecurityPolicy#None` | `None` |
| `Basic256Sha256-Sign` | `SecurityPolicy#Basic256Sha256` | `Sign` |
| `Basic256Sha256-SignAndEncrypt` | `SecurityPolicy#Basic256Sha256` | `SignAndEncrypt` |
| Read | `IReadable.ReadAsync` | `ExecuteAsync(DriverCapability.Read, host, …)` |
| Write | `IWritable.WriteAsync` | `ExecuteWriteAsync(host, isIdempotent, …)` — honors `WriteIdempotentAttribute` (#143) |
| CreateMonitoredItems / DeleteMonitoredItems | `ISubscribable.SubscribeAsync/UnsubscribeAsync` | `ExecuteAsync(DriverCapability.Subscribe, host, …)` |
| HistoryRead (raw / processed / at-time / events) | `IHistoryProvider.*Async` | `ExecuteAsync(DriverCapability.HistoryRead, host, …)` |
| ConditionRefresh / Acknowledge | `IAlarmSource.*Async` | via `AlarmSurfaceInvoker` (fans out per host) |
`SecurityProfileResolver` maps configured profile names to `ServerSecurityPolicy` instances at startup. Unknown names are skipped with a warning, and an empty or invalid list falls back to `None`.
For production deployments, configure `["Basic256Sha256-SignAndEncrypt"]` or `["None", "Basic256Sha256-SignAndEncrypt"]` and set `AutoAcceptClientCertificates` to `false`. See the [Security Guide](security.md) for hardening details.
The host name fed to the invoker comes from `IPerCallHostResolver.ResolveHost(fullReference)` when the driver implements it (multi-host drivers: AB CIP, Modbus with per-device options). Single-host drivers fall back to `DriverInstanceId`, preserving pre-Phase-6.1 pipeline-key semantics (decision #144).
## Redundancy
When `Redundancy.Enabled = true`, `LmxOpcUaServer` exposes the standard OPC UA redundancy nodes on startup:
`Redundancy.Enabled = true` on the `ServerInstance` activates the `RedundancyCoordinator` + `ServiceLevelCalculator` (`src/ZB.MOM.WW.OtOpcUa.Server/Redundancy/`). Standard OPC UA redundancy nodes (`Server/ServerRedundancy/RedundancySupport`, `ServerUriArray`, `Server/ServiceLevel`) are populated on startup; `ServiceLevel` recomputes whenever any driver's `DriverHealth` changes. The apply-lease mechanism prevents two instances from concurrently applying a generation. See `docs/Redundancy.md`.
- `Server/ServerRedundancy/RedundancySupport` — set to `Warm` or `Hot` based on configuration
- `Server/ServerRedundancy/ServerUriArray` — populated with the configured `ServerUris`
- `Server/ServiceLevel` — computed dynamically from role and runtime health
## Server class hierarchy
The `ServiceLevel` is updated whenever MXAccess connection state changes or Galaxy DB health changes. See [Redundancy Guide](Redundancy.md) for full details.
### OtOpcUaServer extends StandardServer
### User token policies
- **`CreateMasterNodeManager`** — Iterates `_driverHost.RegisteredDriverIds`, builds one `DriverNodeManager` per driver with its own `CapabilityInvoker` + resilience options (tier from `DriverTypeRegistry`, per-instance JSON overrides from `DriverInstance.ResilienceConfig` via `DriverResilienceOptionsParser`). The managers are wrapped in a `MasterNodeManager` with no additional core managers.
- **`OnServerStarted`** — Hooks `SessionManager.ImpersonateUser` for LDAP auth. Redundancy + server-capability population happens via `OpcUaApplicationHost`.
- **`LoadServerProperties`** — Manufacturer `OtOpcUa`, Product `OtOpcUa.Server`, ProductUri `urn:OtOpcUa:Server`.
`UserTokenPolicies` are dynamically configured based on the `Authentication` settings in `appsettings.json`:
### ServerCapabilities
- An `Anonymous` user token policy is added when `AllowAnonymous` is `true` (the default).
- A `UserName` user token policy is added when an authentication provider is configured (LDAP or injected).
Both policies can be active simultaneously, allowing clients to connect with or without credentials.
### Session impersonation
When a client presents `UserName` credentials, the server validates them through `IUserAuthenticationProvider`. If the provider also implements `IRoleProvider` (as `LdapAuthenticationProvider` does), LDAP group membership is resolved once during authentication and mapped to custom OPC UA role `NodeId`s in a dedicated `urn:zbmom:lmxopcua:roles` namespace. These role NodeIds are added to the session's `RoleBasedIdentity.GrantedRoleIds`.
Anonymous sessions receive `WellKnownRole_Anonymous`. Authenticated sessions receive `WellKnownRole_AuthenticatedUser` plus any LDAP-derived role NodeIds. Permission checks in `LmxNodeManager` inspect `GrantedRoleIds` directly — no username extraction or side-channel cache is needed.
`AnonymousCanWrite` controls whether anonymous sessions can write, regardless of whether LDAP is enabled.
`OpcUaApplicationHost` populates `Server/ServerCapabilities` with `StandardUA2017`, `en` locale, 100 ms `MinSupportedSampleRate`, 4 MB message caps, and per-operation limits (1000 per Read/Write/Browse/TranslateBrowsePaths/MonitoredItems/HistoryRead; 0 for MethodCall/NodeManagement/HistoryUpdate).
## Certificate handling
On startup, `OpcUaServerHost.StartAsync` calls `CheckApplicationInstanceCertificate(false, minKeySize)` to locate or create a self-signed certificate meeting the configured minimum key size (default 2048). The certificate subject defaults to `CN={ServerName}, O=ZB MOM, DC=localhost` but can be overridden via `Security.CertificateSubject`. Certificate stores use the directory-based store type under the configured `Security.PkiRootPath` (default `%LOCALAPPDATA%\OPC Foundation\pki\`):
Certificate stores default to `%LOCALAPPDATA%\OPC Foundation\pki\` (directory-based):
| Store | Path suffix |
|-------|-------------|
|---|---|
| Own | `pki/own` |
| Trusted issuers | `pki/issuer` |
| Trusted peers | `pki/trusted` |
| Rejected | `pki/rejected` |
`AutoAcceptUntrustedCertificates` is controlled by `Security.AutoAcceptClientCertificates` (default `true`). Set to `false` in production to enforce client certificate trust. When `RejectSHA1Certificates` is `true` (default), client certificates signed with SHA-1 are rejected. Certificate validation events are logged for visibility into accepted and rejected client connections.
## Server class hierarchy
### LmxOpcUaServer extends StandardServer
`LmxOpcUaServer` inherits from the OPC Foundation `StandardServer` base class and overrides two methods:
- **`CreateMasterNodeManager`** -- Instantiates `LmxNodeManager` with the Galaxy namespace URI, the `IMxAccessClient` for runtime I/O, performance metrics, and an optional `IHistorianDataSource` (supplied by the runtime-loaded historian plugin, see [Historical Data Access](HistoricalDataAccess.md)). The node manager is wrapped in a `MasterNodeManager` with no additional core node managers.
- **`OnServerStarted`** -- Configures redundancy, history capabilities, and server capabilities at startup. Called after the server is fully initialized.
- **`LoadServerProperties`** -- Returns server metadata: manufacturer `ZB MOM`, product `LmxOpcUa Server`, and the assembly version as the software version.
### ServerCapabilities
`ConfigureServerCapabilities` populates the `ServerCapabilities` node at startup:
- **ServerProfileArray** -- `StandardUA2017`
- **LocaleIdArray** -- `en`
- **MinSupportedSampleRate** -- 100ms
- **MaxBrowseContinuationPoints** -- 100
- **MaxHistoryContinuationPoints** -- 100
- **MaxArrayLength** -- 65535
- **MaxStringLength / MaxByteStringLength** -- 4MB
- **OperationLimits** -- 1000 nodes per Read/Write/Browse/RegisterNodes/TranslateBrowsePaths/MonitoredItems/HistoryRead; 0 for MethodCall/NodeManagement/HistoryUpdate (not supported)
- **ServerDiagnostics.EnabledFlag** -- `true` (SDK tracks session/subscription counts automatically)
### Session tracking
`LmxOpcUaServer` exposes `ActiveSessionCount` by querying `ServerInternal.SessionManager.GetSessions().Count`. `OpcUaServerHost` surfaces this for status reporting.
## Startup and Shutdown
`OpcUaServerHost.StartAsync` performs the following sequence:
1. Build `ApplicationConfiguration` programmatically
2. Validate the configuration via `appConfig.Validate(ApplicationType.Server)`
3. Create `ApplicationInstance` and check/create the application certificate
4. Instantiate `LmxOpcUaServer` and start it via `ApplicationInstance.Start`
`OpcUaServerHost.Stop` calls `_server.Stop()` and nulls both the server and application instance references. The class implements `IDisposable`, delegating to `Stop`.
`Security.AutoAcceptClientCertificates` (default `true`) and `RejectSHA1Certificates` (default `true`) are honored. The server certificate is always created — even for `None`-only deployments — because `UserName` token encryption needs it.
## Key source files
- `src/ZB.MOM.WW.OtOpcUa.Host/OpcUa/OpcUaServerHost.cs` -- Application lifecycle and programmatic configuration
- `src/ZB.MOM.WW.OtOpcUa.Host/OpcUa/LmxOpcUaServer.cs` -- StandardServer subclass and node manager creation
- `src/ZB.MOM.WW.OtOpcUa.Host/OpcUa/SecurityProfileResolver.cs` -- Profile-name to ServerSecurityPolicy mapping
- `src/ZB.MOM.WW.OtOpcUa.Host/Configuration/OpcUaConfiguration.cs` -- Configuration POCO
- `src/ZB.MOM.WW.OtOpcUa.Host/Configuration/SecurityProfileConfiguration.cs` -- Security configuration POCO
- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs` `StandardServer` subclass + `ImpersonateUser` hook
- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs` — per-driver `CustomNodeManager2` + dispatch surface
- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaApplicationHost.cs` — programmatic `ApplicationConfiguration` + lifecycle
- `src/ZB.MOM.WW.OtOpcUa.Core/Hosting/DriverHost.cs` — driver registration
- `src/ZB.MOM.WW.OtOpcUa.Core/Resilience/CapabilityInvoker.cs` — Polly pipeline entry point
- `src/ZB.MOM.WW.OtOpcUa.Core/Authorization/` — Phase 6.2 permission trie + evaluator
- `src/ZB.MOM.WW.OtOpcUa.Server/Security/AuthorizationGate.cs` — stack-to-evaluator bridge

View File

@@ -1,99 +1,57 @@
# Read/Write Operations
`LmxNodeManager` overrides the OPC UA `Read` and `Write` methods to translate client requests into MXAccess runtime calls. Each override resolves the OPC UA `NodeId` to a Galaxy tag reference, performs the I/O through `IMxAccessClient`, and returns the result with appropriate status codes.
`DriverNodeManager` (`src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs`) wires the OPC UA stack's per-variable `OnReadValue` and `OnWriteValue` hooks to each driver's `IReadable` and `IWritable` capabilities. Every dispatch flows through `CapabilityInvoker` so the Polly pipeline (retry / timeout / breaker / bulkhead) applies uniformly across Galaxy, Modbus, S7, AB CIP, AB Legacy, TwinCAT, FOCAS, and OPC UA Client drivers.
## Read Override
## OnReadValue
The `Read` override in `LmxNodeManager` intercepts value attribute reads for nodes in the Galaxy namespace.
The hook is registered on every `BaseDataVariableState` created by the `IAddressSpaceBuilder.Variable(...)` call during discovery. When the stack dispatches a Read for a node in this namespace:
### Resolution flow
1. If the driver does not implement `IReadable`, the hook returns `BadNotReadable`.
2. The node's `NodeId.Identifier` is used directly as the driver-side full reference — it matches `DriverAttributeInfo.FullName` registered at discovery time.
3. (Phase 6.2) If an `AuthorizationGate` + `NodeScopeResolver` are wired, the gate is consulted first via `IsAllowed(identity, OpcUaOperation.Read, scope)`. A denied read never hits the driver.
4. The call is wrapped by `_invoker.ExecuteAsync(DriverCapability.Read, ResolveHostFor(fullRef), …)`. The resolved host is `IPerCallHostResolver.ResolveHost(fullRef)` for multi-host drivers; single-host drivers fall back to `DriverInstanceId` (decision #144).
5. The first `DataValueSnapshot` from the batch populates the outgoing `value` / `statusCode` / `timestamp`. An empty batch surfaces `BadNoData`; any exception surfaces `BadInternalError`.
1. The base class `Read` runs first, handling non-value attributes (DisplayName, DataType, etc.) through the standard node manager.
2. For each `ReadValueId` where `AttributeId == Attributes.Value`, the override checks whether the node belongs to this namespace (`NamespaceIndex` match).
3. The string-typed `NodeId.Identifier` is looked up in `_nodeIdToTagReference` to find the corresponding `FullTagReference` (e.g., `DelmiaReceiver_001.DownloadPath`).
4. `_mxAccessClient.ReadAsync(tagRef)` retrieves the current value, timestamp, and quality from MXAccess. The async call is synchronously awaited because the OPC UA SDK `Read` override is synchronous.
5. The returned `Vtq` is converted to a `DataValue` via `CreatePublishedDataValue`, which normalizes array values through `NormalizePublishedValue` (substituting a default typed array when the value is null for array nodes).
6. On success, `errors[i]` is set to `ServiceResult.Good`. On exception, the error is set to `BadInternalError`.
The hook is synchronous — the async invoker call is bridged with `AsTask().GetAwaiter().GetResult()` because the OPC UA SDK's value-hook signature is sync. Idempotent-by-construction reads mean this bridge is safe to retry inside the Polly pipeline.
```csharp
if (_nodeIdToTagReference.TryGetValue(nodeIdStr, out var tagRef))
{
var vtq = _mxAccessClient.ReadAsync(tagRef).GetAwaiter().GetResult();
results[i] = CreatePublishedDataValue(tagRef, vtq);
errors[i] = ServiceResult.Good;
}
```
## OnWriteValue
## Write Override
`OnWriteValue` follows the same shape with two additional concerns: authorization and idempotence.
The `Write` override follows a similar pattern but includes access-level enforcement and array element write support.
### Authorization (two layers)
### Access level check
1. **SecurityClassification gate.** Every variable stores its `SecurityClassification` in `_securityByFullRef` at registration time (populated from `DriverAttributeInfo.SecurityClass`). `WriteAuthzPolicy.IsAllowed(classification, userRoles)` runs first, consulting the session's roles via `context.UserIdentity is IRoleBearer`. `FreeAccess` passes anonymously, `ViewOnly` denies everyone, and `Operate / Tune / Configure / SecuredWrite / VerifiedWrite` require `WriteOperate / WriteTune / WriteConfigure` roles respectively. Denial returns `BadUserAccessDenied` without consulting the driver — drivers never enforce ACLs themselves; they only report classification as discovery metadata (feedback `feedback_acl_at_server_layer.md`).
2. **Phase 6.2 permission-trie gate.** When `AuthorizationGate` is wired, it re-runs with the operation derived from `WriteAuthzPolicy.ToOpcUaOperation(classification)`. The gate consults the per-cluster permission trie loaded from `NodeAcl` rows, enforcing fine-grained per-tag ACLs on top of the role-based classification policy. See `docs/v2/acl-design.md`.
The base class `Write` runs first and sets `BadNotWritable` for nodes whose `AccessLevel` does not include `CurrentWrite`. The override skips these nodes:
### Dispatch
```csharp
if (errors[i] != null && errors[i].StatusCode == StatusCodes.BadNotWritable)
continue;
```
`_invoker.ExecuteWriteAsync(host, isIdempotent, callSite, …)` honors the `WriteIdempotentAttribute` semantics per decisions #44-45 and #143:
The `AccessLevel` is set during node creation based on `SecurityClassificationMapper.IsWritable(attr.SecurityClassification)`. Read-only Galaxy attributes (e.g., security classification `FreeRead`) get `AccessLevels.CurrentRead` only.
- `isIdempotent = true` (tag flagged `WriteIdempotent` in the Config DB) → runs through the standard `DriverCapability.Write` pipeline; retry may apply per the tier configuration.
- `isIdempotent = false` (default) → the invoker builds a one-off pipeline with `RetryCount = 0`. A timeout may fire after the device already accepted the pulse / alarm-ack / counter-increment; replay is the caller's decision, not the server's.
### Write flow
The `_writeIdempotentByFullRef` lookup is populated at discovery time from the `DriverAttributeInfo.WriteIdempotent` field.
1. The `NodeId` is resolved to a tag reference via `_nodeIdToTagReference`.
2. The raw value is extracted from `writeValue.Value.WrappedValue.Value`.
3. If the write includes an `IndexRange` (array element write), `TryApplyArrayElementWrite` handles the merge before sending the full array to MXAccess.
4. `_mxAccessClient.WriteAsync(tagRef, value)` sends the value to the Galaxy runtime.
5. On success, `PublishLocalWrite` updates the in-memory node immediately so subscribed clients see the change without waiting for the next MXAccess data change callback.
### Per-write status
### Array element writes via IndexRange
`IWritable.WriteAsync` returns `IReadOnlyList<WriteResult>` — one numeric `StatusCode` per requested write. A non-zero code is surfaced directly to the client; exceptions become `BadInternalError`. The OPC UA stack's pattern of batching per-service is preserved through the full chain.
`TryApplyArrayElementWrite` supports writing individual elements of an array attribute. MXAccess does not support element-level writes, so the method performs a read-modify-write:
## Array element writes
1. Parse the `IndexRange` string as a zero-based integer index. Return `BadIndexRangeInvalid` if parsing fails or the index is negative.
2. Read the current array value from MXAccess via `ReadAsync`.
3. Clone the array and set the element at the target index.
4. `NormalizeIndexedWriteValue` unwraps single-element arrays (OPC UA clients sometimes wrap a scalar in a one-element array).
5. `ConvertArrayElementValue` coerces the value to the array's element type using `Convert.ChangeType`, handling null values by substituting the type's default.
6. The full modified array is written back to MXAccess as a single `WriteAsync` call.
Array-element writes via OPC UA `IndexRange` are driver-specific. The OPC UA stack hands the dispatch an unwrapped `NumericRange` on the `indexRange` parameter of `OnWriteValue`; `DriverNodeManager` passes the full `value` object to `IWritable.WriteAsync` and the driver decides whether to support partial writes. Galaxy performs a read-modify-write inside the Galaxy driver (MXAccess has no element-level writes); other drivers generally accept only full-array writes today.
```csharp
var nextArray = (Array)currentArray.Clone();
nextArray.SetValue(ConvertArrayElementValue(normalizedValue, elementType), index);
updatedArray = nextArray;
```
## HistoryRead
### Role-based write enforcement
`DriverNodeManager.HistoryReadRawModified`, `HistoryReadProcessed`, `HistoryReadAtTime`, and `HistoryReadEvents` route through the driver's `IHistoryProvider` capability with `DriverCapability.HistoryRead`. Drivers without `IHistoryProvider` surface `BadHistoryOperationUnsupported` per node. See `docs/HistoricalDataAccess.md`.
When `AnonymousCanWrite` is `false` in the `Authentication` configuration, the write override enforces role-based access control before dispatching to MXAccess. The check order is:
## Failure isolation
1. The base class `Write` runs first, enforcing `AccessLevel`. Nodes without `CurrentWrite` get `BadNotWritable` and the override skips them.
2. The override checks whether the node is in the Galaxy namespace. Non-namespace nodes are skipped.
3. If `AnonymousCanWrite` is `false`, the override inspects `context.OperationContext.Session` for `GrantedRoleIds`. If the session does not hold `WellKnownRole_AuthenticatedUser`, the error is set to `BadUserAccessDenied` and the write is rejected.
4. If the role check passes (or `AnonymousCanWrite` is `true`), the write proceeds to MXAccess.
Per decision #12, exceptions in the driver's capability call are logged and converted to a per-node `BadInternalError` — they never unwind into the master node manager. This keeps one driver's outage from disrupting sibling drivers in the same server process.
The existing security classification enforcement (ReadOnly nodes getting `BadNotWritable` via `AccessLevel`) still applies first and takes precedence over the role check.
## Key source files
## Value Type Conversion
`CreatePublishedDataValue` wraps the conversion pipeline. `NormalizePublishedValue` checks whether the tag is an array type with a declared `ArrayDimension` and substitutes a default typed array (via `CreateDefaultArrayValue`) when the raw value is null. This prevents OPC UA clients from receiving a null variant for array nodes, which violates the specification for nodes declared with `ValueRank.OneDimension`.
`CreateDefaultArrayValue` uses `MxDataTypeMapper.MapToClrType` to determine the CLR element type, then creates an `Array.CreateInstance` of the declared length. String arrays are initialized with `string.Empty` elements rather than null.
## PublishLocalWrite
After a successful write, `PublishLocalWrite` updates the variable node in memory without waiting for the MXAccess `OnDataChange` callback to arrive:
```csharp
private void PublishLocalWrite(string tagRef, object? value)
{
var dataValue = CreatePublishedDataValue(tagRef, Vtq.Good(value));
variable.Value = dataValue.Value;
variable.StatusCode = dataValue.StatusCode;
variable.Timestamp = dataValue.SourceTimestamp;
variable.ClearChangeMasks(SystemContext, false);
}
```
`ClearChangeMasks` notifies the OPC UA framework that the node value has changed, which triggers data change notifications to any active monitored items. Without this call, subscribed clients would only see the update when the next MXAccess data change event arrives, which could be delayed depending on the subscription interval.
- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs``OnReadValue` / `OnWriteValue` hooks
- `src/ZB.MOM.WW.OtOpcUa.Server/Security/WriteAuthzPolicy.cs` — classification-to-role policy
- `src/ZB.MOM.WW.OtOpcUa.Server/Security/AuthorizationGate.cs` — Phase 6.2 trie gate
- `src/ZB.MOM.WW.OtOpcUa.Core/Resilience/CapabilityInvoker.cs``ExecuteAsync` / `ExecuteWriteAsync`
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IReadable.cs`, `IWritable.cs`, `WriteIdempotentAttribute.cs`

View File

@@ -1,135 +1,60 @@
# Subscriptions
`LmxNodeManager` bridges OPC UA monitored items to MXAccess runtime subscriptions using reference counting and a decoupled dispatch architecture. This design ensures that MXAccess COM callbacks (which run on the STA thread) never contend with the OPC UA framework lock.
Driver-side data-change subscriptions live behind `ISubscribable` (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/ISubscribable.cs`). The interface is deliberately mechanism-agnostic: it covers native subscriptions (Galaxy MXAccess advisory, OPC UA monitored items on an upstream server, TwinCAT ADS notifications) and driver-internal polled subscriptions (Modbus, AB CIP, S7, FOCAS). Core sees the same event shape regardless — drivers fire `OnDataChange` and Core dispatches to the matching OPC UA monitored items.
## Ref-Counted MXAccess Subscriptions
Multiple OPC UA clients can subscribe to the same Galaxy tag simultaneously. Rather than opening duplicate MXAccess subscriptions, `LmxNodeManager` maintains a reference count per tag in `_subscriptionRefCounts`.
### SubscribeTag
`SubscribeTag` increments the reference count for a tag reference. On the first subscription (count goes from 0 to 1), it calls `_mxAccessClient.SubscribeAsync` to open the MXAccess runtime subscription:
## ISubscribable surface
```csharp
internal void SubscribeTag(string fullTagReference)
{
lock (_lock)
{
if (_subscriptionRefCounts.TryGetValue(fullTagReference, out var count))
_subscriptionRefCounts[fullTagReference] = count + 1;
else
{
_subscriptionRefCounts[fullTagReference] = 1;
_ = _mxAccessClient.SubscribeAsync(fullTagReference, (_, _) => { });
}
}
}
Task<ISubscriptionHandle> SubscribeAsync(
IReadOnlyList<string> fullReferences,
TimeSpan publishingInterval,
CancellationToken cancellationToken);
Task UnsubscribeAsync(ISubscriptionHandle handle, CancellationToken cancellationToken);
event EventHandler<DataChangeEventArgs>? OnDataChange;
```
### UnsubscribeTag
A single `SubscribeAsync` call may batch many attributes and returns an opaque handle the caller passes back to `UnsubscribeAsync`. The driver may emit an immediate `OnDataChange` for each subscribed reference (the OPC UA initial-data convention) and then a push per change.
`UnsubscribeTag` decrements the reference count. When the count reaches zero, the MXAccess subscription is closed via `UnsubscribeAsync` and the tag is removed from the dictionary:
Every subscribe / unsubscribe call goes through `CapabilityInvoker.ExecuteAsync(DriverCapability.Subscribe, host, …)` so the per-host pipeline applies.
```csharp
if (count <= 1)
{
_subscriptionRefCounts.Remove(fullTagReference);
_ = _mxAccessClient.UnsubscribeAsync(fullTagReference);
}
else
_subscriptionRefCounts[fullTagReference] = count - 1;
```
## Reference counting at Core
Both methods use `lock (_lock)` (a private object, distinct from the OPC UA framework `Lock`) to serialize ref-count updates without blocking node value dispatches.
Multiple OPC UA clients can monitor the same variable simultaneously. Rather than open duplicate driver subscriptions, Core maintains a ref-count per `(driver, fullReference)` pair: the first OPC UA monitored-item for a reference triggers `ISubscribable.SubscribeAsync` with that single reference; each additional monitored-item just increments the count; decrement-to-zero triggers `UnsubscribeAsync`. Transferred subscriptions (client reconnect → resume session) replay against the same ref-count map so active driver subscriptions are preserved across session migration.
## OnMonitoredItemCreated
## Threading
The OPC UA framework calls `OnMonitoredItemCreated` when a client creates a monitored item. The override resolves the node handle to a tag reference and calls `SubscribeTag`, which opens the MXAccess subscription early so runtime values start arriving before the first publish cycle:
The STA thread story is now driver-specific, not a server-wide concern:
```csharp
protected override void OnMonitoredItemCreated(ServerSystemContext context,
NodeHandle handle, MonitoredItem monitoredItem)
{
base.OnMonitoredItemCreated(context, handle, monitoredItem);
var nodeIdStr = handle?.NodeId?.Identifier as string;
if (nodeIdStr != null && _nodeIdToTagReference.TryGetValue(nodeIdStr, out var tagRef))
SubscribeTag(tagRef);
}
```
- **Galaxy** runs its MXAccess COM objects on a dedicated STA thread with a Win32 message pump (`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/Sta/StaPump.cs`) inside the standalone `Driver.Galaxy.Host` Windows service. The Proxy driver (`Driver.Galaxy.Proxy`) connects to the Host via named pipe and re-exposes the data on a free-threaded surface to Core. Core never touches COM.
- **Modbus / S7 / AB CIP / AB Legacy / TwinCAT / FOCAS** are free-threaded — they run their polling loops on ordinary `Task`s. Their `OnDataChange` fires on thread-pool threads.
- **OPC UA Client** delegates to the OPC Foundation stack's subscription loop.
`OnDeleteMonitoredItemsComplete` performs the inverse, calling `UnsubscribeTag` for each deleted monitored item.
The common contract: drivers are responsible for marshalling from whatever native thread the backend uses onto thread-pool threads before raising `OnDataChange`. Core's dispatch path acquires the OPC UA framework `Lock` and calls `ClearChangeMasks` on the corresponding `BaseDataVariableState` to notify subscribed clients.
## Data Change Dispatch Queue
## Dispatch
MXAccess delivers data change callbacks on the STA thread via the `OnTagValueChanged` event. These callbacks must not acquire the OPC UA framework `Lock` directly because the lock is also held during `Read`/`Write` operations that call into MXAccess (creating a potential deadlock with the STA thread). The solution is a `ConcurrentDictionary<string, Vtq>` named `_pendingDataChanges` that decouples the two threads.
Core's subscription dispatch path:
### Callback handler
1. `ISubscribable.OnDataChange` fires on a thread-pool thread with a `DataChangeEventArgs(subscriptionHandle, fullReference, DataValueSnapshot)`.
2. Core looks up the variable by `fullReference` in the driver's `DriverNodeManager` variable map.
3. Under the OPC UA framework `Lock`, the variable's `Value` / `StatusCode` / `Timestamp` are updated and `ClearChangeMasks(SystemContext, false)` is called.
4. The OPC Foundation stack then enqueues data-change notifications for every monitored-item attached to that variable, honoring each subscription's sampling + filter configuration.
`OnMxAccessDataChange` runs on the STA thread. It stores the latest value in the concurrent dictionary (coalescing rapid updates for the same tag) and signals the dispatch thread:
Batch coalescing — coalescing multiple pushes for the same reference between publish cycles — is done driver-side when the backend natively supports it (Galaxy keeps the v1 coalescing dictionary); otherwise the SDK's own data-change filter suppresses no-change notifications.
```csharp
private void OnMxAccessDataChange(string address, Vtq vtq)
{
Interlocked.Increment(ref _totalMxChangeEvents);
_pendingDataChanges[address] = vtq;
_dataChangeSignal.Set();
}
```
## Initial values
### Dispatch thread architecture
A freshly-built variable carries `StatusCode = BadWaitingForInitialData` until the driver delivers the first value. Drivers whose backends supply an initial read (Galaxy `AdviseSupervisory`, TwinCAT `AddDeviceNotification`) fire `OnDataChange` immediately after `SubscribeAsync` returns. Polled drivers fire the first push when their first poll cycle completes.
A dedicated background thread (`OpcUaDataChangeDispatch`) runs `DispatchLoop`, which waits on an `AutoResetEvent` with a 100ms timeout. The decoupled design exists for two reasons:
## Transferred subscription restoration
1. **Deadlock avoidance** -- The STA thread must not acquire the OPC UA `Lock`. The dispatch thread is a normal background thread that can safely acquire `Lock`.
2. **Batch coalescing** -- Multiple MXAccess callbacks for the same tag between dispatch cycles are collapsed to the latest value via dictionary key overwrite. Under high load, this reduces the number of `ClearChangeMasks` calls.
When an OPC UA session is resumed (client reconnect with `TransferSubscriptions`), Core walks the transferred monitored-items and ensures every referenced `(driver, fullReference)` has a live driver subscription. References already active (in-process migration) skip re-subscribing; references that lost their driver-side handle during the session gap are re-subscribed via `SubscribeAsync`.
The dispatch loop processes changes in two phases:
## Key source files
**Phase 1 (outside Lock):** Drain keys from `_pendingDataChanges`, convert each `Vtq` to a `DataValue` via `CreatePublishedDataValue`, and collect alarm transition events. MXAccess reads for alarm Priority and DescAttrName values also happen in this phase, since they call back into the STA thread.
**Phase 2 (inside Lock):** Apply all prepared updates to variable nodes and call `ClearChangeMasks` on each to trigger OPC UA data change notifications. Alarm events are reported in this same lock scope.
```csharp
lock (Lock)
{
foreach (var (variable, dataValue) in updates)
{
variable.Value = dataValue.Value;
variable.StatusCode = dataValue.StatusCode;
variable.Timestamp = dataValue.SourceTimestamp;
variable.ClearChangeMasks(SystemContext, false);
}
}
```
### ClearChangeMasks
`ClearChangeMasks(SystemContext, false)` is the mechanism that notifies the OPC UA framework a node's value has changed. The framework uses change masks internally to track which nodes have pending notifications for active monitored items. Calling this method causes the server to enqueue data change notifications for all monitoring clients of that node. The `false` parameter indicates that child nodes should not be recursively cleared.
## Transferred Subscription Restoration
When OPC UA sessions are transferred (e.g., client reconnects and resumes a previous session), the framework calls `OnMonitoredItemsTransferred`. The override collects the tag references for all transferred items and calls `RestoreTransferredSubscriptions`.
`RestoreTransferredSubscriptions` groups the tag references by count and, for each tag that does not already have an active ref-count entry, opens a new MXAccess subscription and sets the initial reference count:
```csharp
internal void RestoreTransferredSubscriptions(IEnumerable<string> fullTagReferences)
{
var transferredCounts = fullTagReferences
.GroupBy(tagRef => tagRef, StringComparer.OrdinalIgnoreCase)
.ToDictionary(g => g.Key, g => g.Count(), StringComparer.OrdinalIgnoreCase);
foreach (var kvp in transferredCounts)
{
lock (_lock)
{
if (_subscriptionRefCounts.ContainsKey(kvp.Key))
continue;
_subscriptionRefCounts[kvp.Key] = kvp.Value;
}
_ = _mxAccessClient.SubscribeAsync(kvp.Key, (_, _) => { });
}
}
```
Tags that already have in-memory bookkeeping are skipped to avoid double-counting when the transfer happens within the same server process (normal in-process session migration).
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/ISubscribable.cs` — capability contract
- `src/ZB.MOM.WW.OtOpcUa.Core/Resilience/CapabilityInvoker.cs` — pipeline wrapping
- `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/Sta/StaPump.cs` — Galaxy STA thread + message pump
- Per-driver subscribe implementations in each `Driver.*` project