Doc refresh (task #202) — core architecture docs for multi-driver OtOpcUa

Rewrite seven core-architecture docs to match the shipped multi-driver platform.
The v1 single-driver LmxNodeManager framing is replaced with the Core +
capability-interface model — Galaxy is now one driver of seven, and each doc
points at the current class names + source paths.

What changed per file:
- OpcUaServer.md — OtOpcUaServer as StandardServer host; per-driver
  DriverNodeManager + CapabilityInvoker wiring; Config-DB-driven configuration
  (sp_PublishGeneration, DraftRevisionToken, Admin UI); Phase 6.2
  AuthorizationGate integration.
- AddressSpace.md — GenericDriverNodeManager.BuildAddressSpaceAsync walks
  ITagDiscovery.DiscoverAsync and streams DriverAttributeInfo through
  IAddressSpaceBuilder; CapturingBuilder registers alarm-condition sinks;
  per-driver NodeId schemes replace the fixed ns=1;s=ZB root.
- ReadWriteOperations.md — OnReadValue / OnWriteValue dispatch to
  IReadable.ReadAsync / IWritable.WriteAsync through CapabilityInvoker,
  honoring WriteIdempotentAttribute (#143); two-layer authorization
  (WriteAuthzPolicy + Phase 6.2 AuthorizationGate).
- Subscriptions.md — ISubscribable.SubscribeAsync/UnsubscribeAsync is the
  capability surface; STA-thread story is now Galaxy-specific (StaPump inside
  Driver.Galaxy.Host), other drivers are free-threaded.
- AlarmTracking.md — IAlarmSource is optional; AlarmSurfaceInvoker wraps
  Subscribe/Ack/Unsubscribe with fan-out by IPerCallHostResolver and the
  no-retry AlarmAcknowledge pipeline (#143); CapturingBuilder registers sinks
  at build time.
- DataTypeMapping.md — DriverDataType + SecurityClassification are the
  driver-agnostic enums; per-driver mappers (GalaxyProxyDriver inline,
  AbCipDataType, ModbusDriver, etc.); SecurityClassification is metadata only,
  ACL enforcement is at the server layer.
- IncrementalSync.md — IRediscoverable covers backend-change signals;
  sp_ComputeGenerationDiff + DiffViewer drive generation-level change
  detection; IDriver.ReinitializeAsync is the in-process recovery path.
Joseph Doherty
2026-04-20 01:27:25 -04:00
parent 48970af416
commit 985b7aba26
7 changed files with 290 additions and 699 deletions


@@ -1,121 +1,65 @@
# Incremental Sync
When a Galaxy redeployment is detected, the OPC UA address space must be updated to reflect the new hierarchy and attributes. Rather than tearing down the entire address space and rebuilding from scratch (which disconnects all clients and drops all subscriptions), `LmxNodeManager` performs an incremental sync that identifies changed objects and rebuilds only the affected subtrees.
Two distinct change-detection paths feed the running server: driver-backend rediscovery (Galaxy's `time_of_last_deploy`, TwinCAT's symbol-version-changed, OPC UA Client's upstream namespace change) and generation-level config publishes from the Admin UI. Both flow into re-runs of `ITagDiscovery.DiscoverAsync`, but they originate differently.
## Cached State
## Driver-backend rediscovery — IRediscoverable
`LmxNodeManager` retains shallow copies of the last-published hierarchy and attributes:
Drivers whose backend has a native change signal implement `IRediscoverable` (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IRediscoverable.cs`):
```csharp
private List<GalaxyObjectInfo>? _lastHierarchy;
private List<GalaxyAttributeInfo>? _lastAttributes;
```
These are updated at the end of every `BuildAddressSpace` or `SyncAddressSpace` call via `new List<T>(source)` to create independent copies. The copies serve as the baseline for the next diff comparison.
On the first call (when `_lastHierarchy` is null), `SyncAddressSpace` falls through to a full `BuildAddressSpace` since there is no baseline to diff against.
## AddressSpaceDiff
`AddressSpaceDiff` is a static helper class that computes the set of changed Galaxy object IDs between two snapshots.
### FindChangedGobjectIds
This method compares old and new hierarchy+attributes and returns a `HashSet<int>` of gobject IDs that have any difference. It detects three categories of changes:
**Added objects** -- Present in new hierarchy but not in old:
```csharp
foreach (var id in newObjects.Keys)
if (!oldObjects.ContainsKey(id))
changed.Add(id);
```
**Removed objects** -- Present in old hierarchy but not in new:
```csharp
foreach (var id in oldObjects.Keys)
if (!newObjects.ContainsKey(id))
changed.Add(id);
```
**Modified objects** -- Present in both but with different properties. `ObjectsEqual` compares `TagName`, `BrowseName`, `ContainedName`, `ParentGobjectId`, and `IsArea`.
**Attribute set changes** -- For objects that exist in both snapshots, attributes are grouped by `GobjectId` and compared pairwise. `AttributeSetsEqual` sorts both lists by `FullTagReference` and `PrimitiveName`, then checks each pair via `AttributesEqual`, which compares `AttributeName`, `FullTagReference`, `MxDataType`, `IsArray`, `ArrayDimension`, `PrimitiveName`, `SecurityClassification`, `IsHistorized`, and `IsAlarm`. A difference in count or any field mismatch marks the owning gobject as changed.
Objects already marked as changed by hierarchy comparison are skipped during attribute comparison to avoid redundant work.
### ExpandToSubtrees
When a Galaxy object changes, its children must also be rebuilt because they may reference the parent's node or have inherited attribute changes. `ExpandToSubtrees` performs a BFS traversal from each changed ID, adding all descendants:
```csharp
public static HashSet<int> ExpandToSubtrees(HashSet<int> changed,
List<GalaxyObjectInfo> hierarchy)
public interface IRediscoverable
{
var childrenByParent = hierarchy.GroupBy(h => h.ParentGobjectId)
.ToDictionary(g => g.Key, g => g.Select(h => h.GobjectId).ToList());
var expanded = new HashSet<int>(changed);
var queue = new Queue<int>(changed);
while (queue.Count > 0)
{
var id = queue.Dequeue();
if (childrenByParent.TryGetValue(id, out var children))
foreach (var childId in children)
if (expanded.Add(childId))
queue.Enqueue(childId);
}
return expanded;
event EventHandler<RediscoveryEventArgs>? OnRediscoveryNeeded;
}
public sealed record RediscoveryEventArgs(string Reason, string? ScopeHint);
```
The expansion runs against both the old and new hierarchy. This is necessary because a removed parent's children appear in the old hierarchy (for teardown) while an added parent's children appear in the new hierarchy (for construction).
The driver fires the event with a reason string (for the diagnostic log) and an optional scope hint — a non-null hint lets Core scope the rebuild surgically to that subtree; null means "the whole address space may have changed".
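A minimal sketch of how a consumer might react to this event, assuming nothing about Core's real wiring. The interface and record are copied from the snippet above; `DemoDriver` and `RediscoveryListener` are hypothetical names introduced only for illustration, and the "full" / "subtree" strings stand in for whatever rebuild request Core actually issues.

```csharp
using System;
using System.Collections.Generic;

// Copied from the IRediscoverable snippet above.
public interface IRediscoverable
{
    event EventHandler<RediscoveryEventArgs>? OnRediscoveryNeeded;
}

public sealed record RediscoveryEventArgs(string Reason, string? ScopeHint);

// Hypothetical driver, used only to raise the event in this sketch.
public sealed class DemoDriver : IRediscoverable
{
    public event EventHandler<RediscoveryEventArgs>? OnRediscoveryNeeded;

    public void Raise(string reason, string? scopeHint) =>
        OnRediscoveryNeeded?.Invoke(this, new RediscoveryEventArgs(reason, scopeHint));
}

// Hypothetical listener: records which kind of rebuild each event would request.
public sealed class RediscoveryListener
{
    public List<string> Requests { get; } = new();

    public void Attach(object driver)
    {
        // Drivers that do not implement the capability are skipped.
        if (driver is not IRediscoverable rediscoverable) return;

        rediscoverable.OnRediscoveryNeeded += (_, e) =>
            // A null hint means the whole address space may have changed.
            Requests.Add(e.ScopeHint is null ? "full" : $"subtree:{e.ScopeHint}");
    }
}
```

Attaching a plain `object` is a no-op here, which mirrors the idea that absence of the interface means no change-detection wiring for that driver.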
## SyncAddressSpace Flow
Drivers that implement the capability today:
`SyncAddressSpace` orchestrates the incremental update inside the OPC UA framework `Lock`:
- **Galaxy** — polls `galaxy.time_of_last_deploy` in the Galaxy repository DB and fires on change. This is Galaxy-internal change detection, not the platform-wide mechanism.
- **TwinCAT** — observes ADS symbol-version-changed notifications (`0x0702`).
- **OPC UA Client** — subscribes to the upstream server's `Server/NamespaceArray` change notifications.
1. **Diff** -- Call `FindChangedGobjectIds` with the cached and new snapshots. If no changes are detected, update the cached snapshots and return early.
Static drivers (Modbus, S7, AB CIP, AB Legacy, FOCAS) do not implement `IRediscoverable` — their tags only change when a new generation is published from the Config DB. Core sees absence of the interface and skips change-detection wiring for those drivers (decision #54).
2. **Expand** -- Call `ExpandToSubtrees` on both old and new hierarchies to include descendant objects.
## Config-DB generation publishes
3. **Snapshot subscriptions** -- Before teardown, iterate `_gobjectToTagRefs` for each changed gobject ID and record the current MXAccess subscription ref-counts. These are needed to restore subscriptions after rebuild.
Tag-set changes authored in the Admin UI (UNS edits, CSV imports, driver-config edits) accumulate in a draft generation and commit via `sp_PublishGeneration`. The delta between the currently-published generation and the proposed next one is computed by `sp_ComputeGenerationDiff`, which drives:
4. **Teardown** -- Call `TearDownGobjects` to remove the old nodes and clean up tracking state.
- The **DiffViewer** in Admin (`src/ZB.MOM.WW.OtOpcUa.Admin/Components/Pages/Clusters/DiffViewer.razor`) so operators can preview what will change before clicking Publish.
- The 409-on-stale-draft flow (decision #161) — a UNS drag-reorder preview carries a `DraftRevisionToken` so Confirm returns `409 Conflict / refresh-required` if the draft advanced between preview and commit.
5. **Rebuild** -- Filter the new hierarchy and attributes to only the changed gobject IDs, then call `BuildSubtree` to create the replacement nodes.
After publish, the server's generation applier invokes `IDriver.ReinitializeAsync(driverConfigJson, ct)` on every driver whose `DriverInstance.DriverConfig` row changed in the new generation. Reinitialize is the in-process recovery path for Tier A/B drivers; if it fails the driver is marked `DriverState.Faulted` and its nodes go Bad quality — but the server process stays running. See `docs/v2/driver-stability.md`.
6. **Restore subscriptions** -- For each previously subscribed tag reference that still exists in `_tagToVariableNode` after rebuild, re-open the MXAccess subscription and restore the original ref-count.
Drivers whose discovery depends on Config DB state (Modbus register maps, S7 DBs, AB CIP tag lists) re-run their discovery inside `ReinitializeAsync`; Core then diffs the new node set against the current address space.
7. **Update cache** -- Replace `_lastHierarchy` and `_lastAttributes` with shallow copies of the new data.
## Rebuild flow
## TearDownGobjects
When a rediscovery is triggered (by either source), `GenericDriverNodeManager` re-runs `ITagDiscovery.DiscoverAsync` into the same `CapturingBuilder` it used at first build. The new node set is then diffed against the current address space:
`TearDownGobjects` removes all OPC UA nodes and tracking state for a set of gobject IDs:
1. **Diff** — full-name comparison of the new `DriverAttributeInfo` set against the existing `_variablesByFullRef` map. Added / removed / modified references are partitioned.
2. **Snapshot subscriptions** — before teardown, Core captures the current monitored-item ref-counts for every affected reference so subscriptions can be replayed after rebuild.
3. **Teardown** — removed / modified variable nodes are deleted via `CustomNodeManager2.DeleteNode`. Driver-side subscriptions for those references are unwound via `ISubscribable.UnsubscribeAsync`.
4. **Rebuild** — added / modified references get fresh `BaseDataVariableState` nodes via the standard `IAddressSpaceBuilder.Variable(...)` path. Alarm-flagged references re-register their `IAlarmConditionSink` through `CapturingBuilder`.
5. **Restore subscriptions** — for every captured reference that still exists after rebuild, Core re-opens the driver subscription and restores the original ref-count.
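The diff step in the list above can be sketched as a three-way partition keyed by full reference. This is an illustration under stated assumptions, not the shipped code: `AttrInfo` is a hypothetical stand-in for `DriverAttributeInfo` (whose real fields are not shown here), and `RefDiff` / `RefDiffer` are invented names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for DriverAttributeInfo; field names are assumptions.
public sealed record AttrInfo(string FullRef, string DataType, bool IsAlarm);

// Result of the partition: full references grouped by kind of change.
public sealed record RefDiff(
    IReadOnlyList<string> Added,
    IReadOnlyList<string> Removed,
    IReadOnlyList<string> Modified);

public static class RefDiffer
{
    public static RefDiff Compute(
        IReadOnlyDictionary<string, AttrInfo> current,
        IEnumerable<AttrInfo> discovered)
    {
        // Index the freshly discovered set by full reference.
        var next = discovered.ToDictionary(a => a.FullRef, StringComparer.Ordinal);

        var added = next.Keys
            .Where(k => !current.ContainsKey(k))
            .OrderBy(k => k, StringComparer.Ordinal)
            .ToList();

        var removed = current.Keys
            .Where(k => !next.ContainsKey(k))
            .OrderBy(k => k, StringComparer.Ordinal)
            .ToList();

        // Records compare by value, so any field change marks the reference as modified.
        var modified = next
            .Where(kv => current.TryGetValue(kv.Key, out var old) && old != kv.Value)
            .Select(kv => kv.Key)
            .OrderBy(k => k, StringComparer.Ordinal)
            .ToList();

        return new RefDiff(added, removed, modified);
    }
}
```

In this sketch, references that appear in neither `Removed` nor `Modified` are the ones left alone, so their subscriptions would not need to be captured or replayed.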
For each gobject ID, it processes the associated tag references from `_gobjectToTagRefs`:
Exceptions during teardown are swallowed per decision #12 — a driver throw must not leave the node tree half-deleted.
1. **Unsubscribe** -- If the tag has an active MXAccess subscription (entry in `_subscriptionRefCounts`), call `UnsubscribeAsync` and remove the ref-count entry.
## Scope hint
2. **Remove alarm tracking** -- Find any `_alarmInAlarmTags` entries whose `SourceTagReference` matches the tag. For each, unsubscribe the InAlarm, Priority, and DescAttrName tags, then remove the alarm entry.
When `RediscoveryEventArgs.ScopeHint` is non-null (e.g. a folder path), Core restricts the diff to that subtree. This matters for Galaxy Platform-scoped deployments where a `time_of_last_deploy` advance may only affect one platform's subtree, and for OPC UA Client where an upstream change may be localized. Null scope falls back to a full-tree diff.
3. **Delete variable node** -- Call `DeleteNode` on the variable's `NodeId`, remove from `_tagToVariableNode`, clean up `_nodeIdToTagReference` and `_tagMetadata`, and decrement `VariableNodeCount`.
## Active subscriptions survive rebuild
4. **Delete object/folder node** -- Remove the gobject's entry from `_nodeMap` and call `DeleteNode`. Non-folder nodes decrement `ObjectNodeCount`.
Subscriptions for unchanged references stay live across rebuilds because their ref-count map is not disturbed. Clients monitoring a stable tag never see a data-change gap during a deploy; only clients monitoring a tag that was genuinely removed see the subscription drop.
All MXAccess calls and `DeleteNode` calls are wrapped in try/catch with ignored exceptions, since teardown must complete even if individual cleanup steps fail.
## Key source files
## BuildSubtree
`BuildSubtree` creates OPC UA nodes for a subset of the Galaxy hierarchy, reusing existing parent nodes from `_nodeMap`.
The method first topologically sorts the input hierarchy (same `TopologicalSort` used by `BuildAddressSpace`) to ensure parents are created before children. For each object:
1. **Find parent** -- Look up `ParentGobjectId` in `_nodeMap`. If the parent was not part of the changed set, it already exists from the previous build. If no parent is found, fall back to the root `ZB` folder. This is the key difference from `BuildAddressSpace` -- subtree builds reuse the existing node tree rather than starting from the root.
2. **Create node** -- Areas become `FolderState` with `Organizes` reference; non-areas become `BaseObjectState` with `HasComponent` reference. The node is added to `_nodeMap`.
3. **Create variable nodes** -- Attributes are processed with the same primitive-grouping logic as `BuildAddressSpace`, creating `BaseDataVariableState` nodes via `CreateAttributeVariable`.
4. **Alarm tracking** -- If `_alarmTrackingEnabled` is set, alarm attributes are detected and `AlarmConditionState` nodes are created using the same logic as the full build. EventNotifier flags are set on parent nodes, and alarm tags are auto-subscribed.
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IRediscoverable.cs` — backend-change capability
- `src/ZB.MOM.WW.OtOpcUa.Core/OpcUa/GenericDriverNodeManager.cs` — discovery orchestration
- `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IDriver.cs` — `ReinitializeAsync` contract
- `src/ZB.MOM.WW.OtOpcUa.Admin/Services/GenerationService.cs` — publish-flow driver
- `docs/v2/config-db-schema.md` — `sp_PublishGeneration` + `sp_ComputeGenerationDiff`
- `docs/v2/admin-ui.md` — DiffViewer + draft-revision-token flow