diff --git a/docs/AlarmTracking.md b/docs/AlarmTracking.md index e0ba2c5..c7ccf23 100644 --- a/docs/AlarmTracking.md +++ b/docs/AlarmTracking.md @@ -67,6 +67,53 @@ Drivers that want hierarchical alarm subscriptions propagate `EventNotifier.Subs The OPC UA `ConditionRefresh` service queues the current state of every retained condition back to the requesting monitored items. `DriverNodeManager` iterates the node manager's `AlarmConditionState` collection and queues each condition whose `Retain.Value == true` — matching the Part 9 requirement. +## Alarm historian sink + +Distinct from the live `IAlarmSource` stream and the Part 9 `AlarmConditionState` materialization above, qualifying alarm transitions are **also** persisted to a durable event log for downstream AVEVA Historian ingestion. This is a separate subsystem from the `IHistoryProvider` capability used by `HistoryReadEvents` (see [HistoricalDataAccess.md](HistoricalDataAccess.md#alarm-event-history-vs-ihistoryprovider)): the sink is a *producer* path (server → Historian) that runs independently of any client HistoryRead call. + +### `IAlarmHistorianSink` + +`src/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/IAlarmHistorianSink.cs` defines the intake contract: + +```csharp +Task EnqueueAsync(AlarmHistorianEvent evt, CancellationToken cancellationToken); +HistorianSinkStatus GetStatus(); +``` + +`EnqueueAsync` is fire-and-forget from the producer's perspective — it must never block the emitting thread. The event payload (`AlarmHistorianEvent` — same file) is source-agnostic: `AlarmId`, `EquipmentPath`, `AlarmName`, `AlarmTypeName` (Part 9 subtype name), `Severity`, `EventKind` (free-form transition string — `Activated` / `Cleared` / `Acknowledged` / `Confirmed` / `Shelved` / …), `Message`, `User`, `Comment`, `TimestampUtc`. + +The sink scope is defined to span every alarm source (plan decision #15: scripted, Galaxy-native, AB CIP ALMD, any future `IAlarmSource`), gated per-alarm by a `HistorizeToAveva` toggle on the producer. Today only `Phase7EngineComposer.RouteToHistorianAsync` (`src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7EngineComposer.cs`) is wired — it subscribes to `ScriptedAlarmEngine.OnEvent` and marshals each emission into `AlarmHistorianEvent`. Galaxy-native alarms continue to reach AVEVA Historian via the driver's direct `aahClientManaged` path and do not flow through the sink; the AB CIP ALMD path remains unwired pending a producer-side integration. + +### `SqliteStoreAndForwardSink` + +Default production implementation (`src/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs`). A local SQLite queue absorbs every `EnqueueAsync` synchronously; a background `Timer` drains batches asynchronously to an `IAlarmHistorianWriter` so operator actions are never blocked on historian reachability. + +Queue schema (single table `Queue`): `RowId PK autoincrement`, `AlarmId`, `EnqueuedUtc`, `PayloadJson` (serialized `AlarmHistorianEvent`), `AttemptCount`, `LastAttemptUtc`, `LastError`, `DeadLettered` (bool), plus `IX_Queue_Drain (DeadLettered, RowId)`. Default capacity `1_000_000` non-dead-lettered rows; oldest rows evict with a WARN log past the cap. + +Drain cadence: `StartDrainLoop(tickInterval)` arms a periodic timer. `DrainOnceAsync` reads up to `batchSize` rows (default 100) in `RowId` order and forwards them through `IAlarmHistorianWriter.WriteBatchAsync`, which returns one `HistorianWriteOutcome` per row: + +| Outcome | Action | +|---|---| +| `Ack` | Row deleted. | +| `PermanentFail` | Row flipped to `DeadLettered = 1` with reason. Peers in the batch retry independently. | +| `RetryPlease` | `AttemptCount` bumped; row stays queued. Drain worker enters `BackingOff`. | + +Writer-side exceptions treat the whole batch as `RetryPlease`. + +Backoff ladder on `RetryPlease` (hard-coded): 1s → 2s → 5s → 15s → 60s cap. Reset to 0 on any batch with no retries. `CurrentBackoff` exposes the current step for instrumentation; the drain timer itself fires on `tickInterval`, so the ladder governs write cadence rather than timer period. + +Dead-letter retention defaults to 30 days (plan decision #21). `PurgeAgedDeadLetters` runs each drain pass and deletes rows whose `LastAttemptUtc` is past the cutoff. `RetryDeadLettered()` is an operator action that clears `DeadLettered` + resets `AttemptCount` on every dead-lettered row so they rejoin the main queue. + +### Composition and writer resolution + +`Phase7Composer.ResolveHistorianSink` (`src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7Composer.cs`) scans the registered drivers for one that implements `IAlarmHistorianWriter`. Today that is `GalaxyProxyDriver` via `GalaxyHistorianWriter` (`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy/Ipc/GalaxyHistorianWriter.cs`), which forwards batches over the Galaxy.Host pipe to the `aahClientManaged` alarm schema. When a writer is found, a `SqliteStoreAndForwardSink` is instantiated against `%ProgramData%/OtOpcUa/alarm-historian-queue.db` with a 2 s drain tick and the writer attached. When no driver provides a writer the fallback is the DI-registered `NullAlarmHistorianSink` (`src/ZB.MOM.WW.OtOpcUa.Server/Program.cs`), which silently discards and reports `HistorianDrainState.Disabled`. + +### Status and observability + +`GetStatus()` returns `HistorianSinkStatus(QueueDepth, DeadLetterDepth, LastDrainUtc, LastSuccessUtc, LastError, DrainState)` — two `COUNT(*)` scalars plus last-drain telemetry. `DrainState` is one of `Disabled` / `Idle` / `Draining` / `BackingOff`. + +The Admin UI `/alarms/historian` page surfaces this through `HistorianDiagnosticsService` (`src/ZB.MOM.WW.OtOpcUa.Admin/Services/HistorianDiagnosticsService.cs`), which also exposes `TryRetryDeadLettered` — it calls through to `SqliteStoreAndForwardSink.RetryDeadLettered` when the live sink is the SQLite implementation and returns 0 otherwise. + ## Key source files - `src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IAlarmSource.cs` — capability contract + `AlarmEventArgs` @@ -74,3 +121,8 @@ The OPC UA `ConditionRefresh` service queues the current state of every retained - `src/ZB.MOM.WW.OtOpcUa.Core/OpcUa/GenericDriverNodeManager.cs` — `CapturingBuilder` + alarm forwarder - `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs` — `VariableHandle.MarkAsAlarmCondition` + `ConditionSink` - `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/Backend/Alarms/GalaxyAlarmTracker.cs` — Galaxy-specific alarm-event production +- `src/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/IAlarmHistorianSink.cs` — historian sink intake contract + `AlarmHistorianEvent` + `HistorianSinkStatus` + `IAlarmHistorianWriter` +- `src/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs` — durable queue + drain worker + backoff ladder + dead-letter retention +- `src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7EngineComposer.cs` — `RouteToHistorianAsync` wires scripted-alarm emissions into the sink +- `src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7Composer.cs` — `ResolveHistorianSink` selects `SqliteStoreAndForwardSink` vs `NullAlarmHistorianSink` +- `src/ZB.MOM.WW.OtOpcUa.Admin/Services/HistorianDiagnosticsService.cs` — Admin UI `/alarms/historian` status + retry-dead-lettered operator action diff --git a/docs/DataTypeMapping.md b/docs/DataTypeMapping.md index e4e6d95..047dcf6 100644 --- a/docs/DataTypeMapping.md +++ b/docs/DataTypeMapping.md @@ -35,7 +35,7 @@ The driver's mapping is authoritative — when a field type is ambiguous (a `LRE ## SecurityClassification — metadata, not ACL -`SecurityClassification` is driver-reported metadata only. Drivers never enforce write permissions themselves — the classification flows into the Server project where `WriteAuthzPolicy.IsAllowed(classification, userRoles)` (`src/ZB.MOM.WW.OtOpcUa.Server/Security/WriteAuthzPolicy.cs`) gates the write against the session's LDAP-derived roles, and (Phase 6.2) the `AuthorizationGate` + permission trie apply on top. This is the "ACL at server layer" invariant recorded in `feedback_acl_at_server_layer.md`. +`SecurityClassification` is driver-reported metadata only. Drivers never enforce write permissions themselves — the classification flows into the Server project where `WriteAuthzPolicy.IsAllowed(classification, userRoles)` (`src/ZB.MOM.WW.OtOpcUa.Server/Security/WriteAuthzPolicy.cs`) gates the write against the session's LDAP-derived roles, and (Phase 6.2) the `AuthorizationGate` + permission trie apply on top. This is the "ACL at server layer" invariant documented in `docs/security.md`. The classification values mirror the v1 Galaxy model so existing Galaxy galaxies keep their published semantics: diff --git a/docs/Driver.FOCAS.Cli.md b/docs/Driver.FOCAS.Cli.md new file mode 100644 index 0000000..e558a53 --- /dev/null +++ b/docs/Driver.FOCAS.Cli.md @@ -0,0 +1,158 @@ +# `otopcua-focas-cli` — Fanuc FOCAS test client + +Ad-hoc probe / read / write / subscribe tool for Fanuc CNCs via the FOCAS/2 +protocol. Uses the **same** `FocasDriver` the OtOpcUa server does — PMC R/G/F +file registers, axis bits, parameters, and macro variables — all through +`FocasAddressParser` syntax. + +Sixth of the driver test-client CLIs, added alongside the Tier-C isolation +work tracked in task #220. + +## Architecture note + +FOCAS is a Tier-C driver: `Fwlib32.dll` is a proprietary 32-bit Fanuc library +with a documented habit of crashing its hosting process on network errors. +The target runtime deployment splits the driver into an in-process +`FocasProxyDriver` (.NET 10 x64) and an out-of-process `Driver.FOCAS.Host` +(.NET 4.8 x86 Windows service) that owns the DLL — see +[v2/implementation/focas-isolation-plan.md](v2/implementation/focas-isolation-plan.md) +and +[v2/implementation/phase-6-1-resilience-and-observability.md](v2/implementation/phase-6-1-resilience-and-observability.md) +for topology + supervisor / respawn / back-pressure design. + +The CLI skips the proxy and loads `FocasDriver` directly (via +`FwlibFocasClientFactory`, which P/Invokes `Fwlib32.dll` in the CLI's own +process). There is **no public simulator** for FOCAS; a meaningful probe +requires a real CNC + a licensed `Fwlib32.dll` on `PATH` (or next to the +executable). On a dev box without the DLL, every wire call surfaces as +`BadCommunicationError` — still useful as a "CLI wire-up is correct" signal. + +## Build + run + +```powershell +dotnet build src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli +dotnet run --project src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli -- --help +``` + +Or publish a self-contained binary: + +```powershell +dotnet publish src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli -c Release -o publish/focas-cli +publish/focas-cli/otopcua-focas-cli.exe --help +``` + +## Common flags + +Every command accepts: + +| Flag | Default | Purpose | +|---|---|---| +| `-h` / `--cnc-host` | **required** | CNC IP address or hostname | +| `-p` / `--cnc-port` | `8193` | FOCAS TCP port (FOCAS-over-EIP default) | +| `-s` / `--series` | `Unknown` | CNC series — `Unknown` / `Zero_i_D` / `Zero_i_F` / `Zero_i_MF` / `Zero_i_TF` / `Sixteen_i` / `Thirty_i` / `ThirtyOne_i` / `ThirtyTwo_i` / `PowerMotion_i` | +| `--timeout-ms` | `2000` | Per-operation timeout | +| `--verbose` | off | Serilog debug output | + +## Addressing + +`FocasAddressParser` syntax — the same format the server + `FocasTagDefinition` +use. Common shapes: + +| Address | Meaning | +|---|---| +| `R100` | PMC R-file word register 100 | +| `X0.0` | PMC X-file bit 0 of byte 0 | +| `G50.3` | PMC G-file bit 3 of byte 50 | +| `F1.4` | PMC F-file bit 4 of byte 1 | +| `PARAM:1815/0` | Parameter 1815, axis 0 | +| `MACRO:500` | Macro variable 500 | + +## Data types + +`Bit`, `Byte`, `Int16`, `Int32`, `Float32`, `Float64`, `String`. Default is +`Int16` (matches PMC R-file word width). + +## Commands + +### `probe` — is the CNC reachable? + +Opens a FOCAS session, reads one sample address, prints driver health. + +```powershell +# Default: read R100 as Int16 +otopcua-focas-cli probe -h 192.168.1.50 + +# Explicit series + address +otopcua-focas-cli probe -h 192.168.1.50 -s ThirtyOne_i --address R200 --type Int16 +``` + +### `read` — single address + +```powershell +# PMC R-file word +otopcua-focas-cli read -h 192.168.1.50 -a R100 -t Int16 + +# PMC X-bit +otopcua-focas-cli read -h 192.168.1.50 -a X0.0 -t Bit + +# Parameter (axis 0) +otopcua-focas-cli read -h 192.168.1.50 -a PARAM:1815/0 -t Int32 + +# Macro variable +otopcua-focas-cli read -h 192.168.1.50 -a MACRO:500 -t Float64 +``` + +### `write` — single value + +Values parse per `--type` with invariant culture. Booleans accept +`true` / `false` / `1` / `0` / `yes` / `no` / `on` / `off`. + +```powershell +otopcua-focas-cli write -h 192.168.1.50 -a R100 -t Int16 -v 42 +otopcua-focas-cli write -h 192.168.1.50 -a G50.3 -t Bit -v on +otopcua-focas-cli write -h 192.168.1.50 -a MACRO:500 -t Float64 -v 3.14 +``` + +PMC G/R writes land on a running machine — be careful which file you hit. +Parameter writes may require the CNC to be in MDI mode with the +parameter-write switch enabled. + +**Writes are non-idempotent by default** — a timeout after the CNC already +applied the write will NOT auto-retry (plan decisions #44 + #45). + +### `subscribe` — watch an address until Ctrl+C + +FOCAS has no push model; the shared `PollGroupEngine` handles the tick +loop. + +```powershell +otopcua-focas-cli subscribe -h 192.168.1.50 -a R100 -t Int16 -i 500 +``` + +## Output format + +Identical to the other driver CLIs via `SnapshotFormatter`: + +- `probe` / `read` emit a multi-line block: `Tag / Value / Status / + Source Time / Server Time`. `probe` prefixes it with `CNC`, `Series`, + `Health`, and `Last error` lines. +- `write` emits one line: `Write
: 0x... (Good | + BadCommunicationError | …)`. +- `subscribe` emits one line per change: `[HH:mm:ss.fff]
= + ()`. + +## Typical workflows + +**"Is the CNC alive?"** → `probe`. + +**"Does my parameter write land?"** → `write` + `read` back against the +same address. Check the parameter-write switch + MDI mode if the write +fails. + +**"Why did this macro flip?"** → `subscribe` to the macro, let the +operator reproduce the cycle, watch the HH:mm:ss.fff timeline. + +**"Is the Fwlib32 DLL wired up?"** → `probe` against any host. A +`DllNotFoundException` surfacing as `BadCommunicationError` with a +matching `Last error` line means the driver is loading but the DLL is +missing; anything else means a transport-layer problem. diff --git a/docs/DriverClis.md b/docs/DriverClis.md index 8fe8b2b..4c84e3e 100644 --- a/docs/DriverClis.md +++ b/docs/DriverClis.md @@ -1,6 +1,6 @@ # Driver test-client CLIs -Five shell-level ad-hoc validation tools, one per native-protocol driver family. +Six shell-level ad-hoc validation tools, one per native-protocol driver family. Each mirrors the v1 `otopcua-cli` shape (probe / read / write / subscribe) against the **same driver** the OtOpcUa server uses — so "does the CLI see it?" and "does the server see it?" are the same question. @@ -12,6 +12,7 @@ the **same driver** the OtOpcUa server uses — so "does the CLI see it?" and | `otopcua-ablegacy-cli` | PCCC (SLC / MicroLogix / PLC-5) | [Driver.AbLegacy.Cli.md](Driver.AbLegacy.Cli.md) | | `otopcua-s7-cli` | S7comm / ISO-on-TCP | [Driver.S7.Cli.md](Driver.S7.Cli.md) | | `otopcua-twincat-cli` | Beckhoff ADS | [Driver.TwinCAT.Cli.md](Driver.TwinCAT.Cli.md) | +| `otopcua-focas-cli` | Fanuc FOCAS/2 (CNC) | [Driver.FOCAS.Cli.md](Driver.FOCAS.Cli.md) | The OPC UA client CLI lives separately and predates this suite — see [Client.CLI.md](Client.CLI.md) for `otopcua-cli`. @@ -32,11 +33,11 @@ Every driver CLI exposes the same four verbs: decisions #44, #45). - **`subscribe`** — long-running data-change stream until Ctrl+C. Uses native push where available (TwinCAT ADS notifications) and falls back to polling - (`PollGroupEngine`) where the protocol has no push (Modbus, AB, S7). + (`PollGroupEngine`) where the protocol has no push (Modbus, AB, S7, FOCAS). ## Shared infrastructure -All five CLIs depend on `src/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/`: +All six CLIs depend on `src/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/`: - `DriverCommandBase` — `--verbose` + Serilog configuration + the abstract `Timeout` surface every protocol-specific base overrides with its own @@ -48,9 +49,9 @@ All five CLIs depend on `src/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/`: with a shortlist for `Good` / `Bad*` / `Uncertain`; unknown codes fall back to hex. -Writing a sixth CLI (hypothetical Galaxy / FOCAS) costs roughly 150 lines: -a `{Family}CommandBase` + four thin command classes that hand their flag -values to the already-shipped driver. +Writing a seventh CLI (hypothetical Galaxy / OPC UA Client) costs roughly +150 lines: a `{Family}CommandBase` + four thin command classes that hand +their flag values to the already-shipped driver. ## Typical cross-CLI workflows @@ -86,7 +87,9 @@ values to the already-shipped driver. ## Tracking -Tasks #249 / #250 / #251 shipped the suite. 122 unit tests cumulative -(16 shared-lib + 106 across the five CLIs) — run +Tasks #249 / #250 / #251 shipped the original five. The FOCAS CLI followed +alongside the Tier-C isolation work on task #220 — no CLI-level test +project (hardware-gated). 122 unit tests cumulative across the first five +(16 shared-lib + 106 CLI-specific) — run `dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common.Tests` + `tests/ZB.MOM.WW.OtOpcUa.Driver.*.Cli.Tests` to re-verify. diff --git a/docs/HistoricalDataAccess.md b/docs/HistoricalDataAccess.md index 6b531b6..abe8fe6 100644 --- a/docs/HistoricalDataAccess.md +++ b/docs/HistoricalDataAccess.md @@ -22,6 +22,12 @@ Supporting DTOs live alongside the interface in `Core.Abstractions`: - `HistoricalEvent(EventId, SourceName?, EventTimeUtc, ReceivedTimeUtc, Message?, Severity)` - `HistoricalEventsResult(IReadOnlyList Events, byte[]? ContinuationPoint)` +## Alarm event history vs. `IHistoryProvider` + +`IHistoryProvider.ReadEventsAsync` is the **pull** path: an OPC UA client calls `HistoryReadEvents` against a notifier node and the driver walks its own backend event store to satisfy the request. The Galaxy driver's implementation reads from AVEVA Historian's event schema via `aahClientManaged`; every other driver leaves the default `NotSupportedException` in place. + +There is also a separate **push** path for persisting alarm transitions from any `IAlarmSource` (and the Phase 7 scripted-alarm engine) into a durable event log, independent of any client HistoryRead call. That path is covered by `IAlarmHistorianSink` + `SqliteStoreAndForwardSink` in `src/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/` and is documented in [AlarmTracking.md#alarm-historian-sink](AlarmTracking.md#alarm-historian-sink). The two paths are complementary — the sink populates an external historian's alarm schema; `ReadEventsAsync` reads from whatever event store the driver owns — and share neither interface nor dispatch. + ## Dispatch through `CapabilityInvoker` All four HistoryRead surfaces are wrapped by `CapabilityInvoker` (`Core/Resilience/CapabilityInvoker.cs`) with `DriverCapability.HistoryRead`. The Polly pipeline keyed on `(DriverInstanceId, HostName, DriverCapability.HistoryRead)` provides timeout, circuit-breaker, and bulkhead defaults per the driver's stability tier (see [docs/v2/driver-stability.md](v2/driver-stability.md)). diff --git a/docs/IncrementalSync.md b/docs/IncrementalSync.md index 09daefd..7da2a29 100644 --- a/docs/IncrementalSync.md +++ b/docs/IncrementalSync.md @@ -51,6 +51,10 @@ Exceptions during teardown are swallowed per decision #12 — a driver throw mus When `RediscoveryEventArgs.ScopeHint` is non-null (e.g. a folder path), Core restricts the diff to that subtree. This matters for Galaxy Platform-scoped deployments where a `time_of_last_deploy` advance may only affect one platform's subtree, and for OPC UA Client where an upstream change may be localized. Null scope falls back to a full-tree diff. +## Virtual tags in the rebuild + +Per [ADR-002](v2/implementation/adr-002-driver-vs-virtual-dispatch.md), virtual (scripted) tags live in the same address space as driver tags and flow through the same rebuild. `EquipmentNodeWalker` (`src/ZB.MOM.WW.OtOpcUa.Core/OpcUa/EquipmentNodeWalker.cs`) emits virtual-tag children alongside driver-tag children with `DriverAttributeInfo.Source = NodeSourceKind.Virtual`, and `DriverNodeManager` registers each variable's source in `_sourceByFullRef` so the dispatch branches correctly after rebuild. Virtual-tag script changes published from the Admin UI land through the same generation-publish path — the `VirtualTagEngine` recompiles its script bundle when its config row changes and `DriverNodeManager` re-registers any added/removed virtual variables through the standard diff path. Subscription restoration after rebuild runs through each source's `ISubscribable` — either the driver's or `VirtualTagSource` — without special-casing. + ## Active subscriptions survive rebuild Subscriptions for unchanged references stay live across rebuilds — their ref-count map is not disturbed. Clients monitoring a stable tag never see a data-change gap during a deploy, only clients monitoring a tag that was genuinely removed see the subscription drop. diff --git a/docs/README.md b/docs/README.md index 39e1ca3..7bf09fd 100644 --- a/docs/README.md +++ b/docs/README.md @@ -29,12 +29,21 @@ The project was originally called **LmxOpcUa** (a single-driver Galaxy/MXAccess | [DataTypeMapping.md](DataTypeMapping.md) | Per-driver `DriverAttributeInfo` → OPC UA variable types | | [IncrementalSync.md](IncrementalSync.md) | Address-space rebuild on redeploy + `sp_ComputeGenerationDiff` | | [HistoricalDataAccess.md](HistoricalDataAccess.md) | `IHistoryProvider` as a per-driver optional capability | +| [VirtualTags.md](VirtualTags.md) | `Core.Scripting` + `Core.VirtualTags` — Roslyn script sandbox, engine, dispatch alongside driver tags | +| [ScriptedAlarms.md](ScriptedAlarms.md) | `Core.ScriptedAlarms` — script-predicate `IAlarmSource` + Part 9 state machine | + +Two Core subsystems are shipped without a dedicated top-level doc; see the section in the linked doc: + +| Project | See | +|---------|-----| +| `Core.AlarmHistorian` | [AlarmTracking.md](AlarmTracking.md) § Alarm historian sink | +| `Analyzers` (Roslyn OTOPCUA0001) | [security.md](security.md) § OTOPCUA0001 Analyzer | ### Drivers | Doc | Covers | |-----|--------| -| [drivers/README.md](drivers/README.md) | Index of the seven shipped drivers + capability matrix | +| [drivers/README.md](drivers/README.md) | Index of the eight shipped drivers + capability matrix | | [drivers/Galaxy.md](drivers/Galaxy.md) | Galaxy driver — MXAccess bridge, Host/Proxy split, named-pipe IPC | | [drivers/Galaxy-Repository.md](drivers/Galaxy-Repository.md) | Galaxy-specific discovery via the ZB SQL database | @@ -62,6 +71,7 @@ For Modbus / S7 / AB CIP / AB Legacy / TwinCAT / FOCAS / OPC UA Client specifics | [Driver.AbLegacy.Cli.md](Driver.AbLegacy.Cli.md) | `otopcua-ablegacy-cli` — SLC / MicroLogix / PLC-5 (PCCC) | | [Driver.S7.Cli.md](Driver.S7.Cli.md) | `otopcua-s7-cli` — Siemens S7-300 / S7-400 / S7-1200 / S7-1500 | | [Driver.TwinCAT.Cli.md](Driver.TwinCAT.Cli.md) | `otopcua-twincat-cli` — Beckhoff TwinCAT 2/3 ADS | +| [Driver.FOCAS.Cli.md](Driver.FOCAS.Cli.md) | `otopcua-focas-cli` — Fanuc FOCAS/2 CNC | ### Requirements diff --git a/docs/ReadWriteOperations.md b/docs/ReadWriteOperations.md index c6ed0b4..b5b3e2c 100644 --- a/docs/ReadWriteOperations.md +++ b/docs/ReadWriteOperations.md @@ -2,6 +2,16 @@ `DriverNodeManager` (`src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs`) wires the OPC UA stack's per-variable `OnReadValue` and `OnWriteValue` hooks to each driver's `IReadable` and `IWritable` capabilities. Every dispatch flows through `CapabilityInvoker` so the Polly pipeline (retry / timeout / breaker / bulkhead) applies uniformly across Galaxy, Modbus, S7, AB CIP, AB Legacy, TwinCAT, FOCAS, and OPC UA Client drivers. +## Driver vs virtual dispatch + +Per [ADR-002](v2/implementation/adr-002-driver-vs-virtual-dispatch.md), a single `DriverNodeManager` routes reads and writes across both driver-sourced and virtual (scripted) tags. At discovery time each variable registers a `NodeSourceKind` (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverAttributeInfo.cs`) in the manager's `_sourceByFullRef` lookup; the read/write hooks pattern-match on that value to pick the backend: + +- `NodeSourceKind.Driver` — dispatches to the driver's `IReadable` / `IWritable` through `CapabilityInvoker` (the rest of this doc). +- `NodeSourceKind.Virtual` — dispatches to `VirtualTagSource` (`src/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagSource.cs`), which wraps `VirtualTagEngine`. Writes are rejected with `BadUserAccessDenied` before the branch per Phase 7 decision #6 — scripts are the only write path into virtual tags. +- `NodeSourceKind.ScriptedAlarm` — dispatches to the Phase 7 `ScriptedAlarmReadable` shim. + +ACL enforcement (`WriteAuthzPolicy` + `AuthorizationGate`) runs before the source branch, so the gates below apply uniformly to all three source kinds. + ## OnReadValue The hook is registered on every `BaseDataVariableState` created by the `IAddressSpaceBuilder.Variable(...)` call during discovery. When the stack dispatches a Read for a node in this namespace: @@ -20,7 +30,7 @@ The hook is synchronous — the async invoker call is bridged with `AsTask().Get ### Authorization (two layers) -1. **SecurityClassification gate.** Every variable stores its `SecurityClassification` in `_securityByFullRef` at registration time (populated from `DriverAttributeInfo.SecurityClass`). `WriteAuthzPolicy.IsAllowed(classification, userRoles)` runs first, consulting the session's roles via `context.UserIdentity is IRoleBearer`. `FreeAccess` passes anonymously, `ViewOnly` denies everyone, and `Operate / Tune / Configure / SecuredWrite / VerifiedWrite` require `WriteOperate / WriteTune / WriteConfigure` roles respectively. Denial returns `BadUserAccessDenied` without consulting the driver — drivers never enforce ACLs themselves; they only report classification as discovery metadata (feedback `feedback_acl_at_server_layer.md`). +1. **SecurityClassification gate.** Every variable stores its `SecurityClassification` in `_securityByFullRef` at registration time (populated from `DriverAttributeInfo.SecurityClass`). `WriteAuthzPolicy.IsAllowed(classification, userRoles)` runs first, consulting the session's roles via `context.UserIdentity is IRoleBearer`. `FreeAccess` passes anonymously, `ViewOnly` denies everyone, and `Operate / Tune / Configure / SecuredWrite / VerifiedWrite` require `WriteOperate / WriteTune / WriteConfigure` roles respectively. Denial returns `BadUserAccessDenied` without consulting the driver — drivers never enforce ACLs themselves; they only report classification as discovery metadata (see `docs/security.md`). 2. **Phase 6.2 permission-trie gate.** When `AuthorizationGate` is wired, it re-runs with the operation derived from `WriteAuthzPolicy.ToOpcUaOperation(classification)`. The gate consults the per-cluster permission trie loaded from `NodeAcl` rows, enforcing fine-grained per-tag ACLs on top of the role-based classification policy. See `docs/v2/acl-design.md`. ### Dispatch diff --git a/docs/ScriptedAlarms.md b/docs/ScriptedAlarms.md new file mode 100644 index 0000000..eb49b67 --- /dev/null +++ b/docs/ScriptedAlarms.md @@ -0,0 +1,125 @@ +# Scripted Alarms + +`Core.ScriptedAlarms` is the Phase 7 subsystem that raises OPC UA Part 9 alarms from operator-authored C# predicates rather than from driver-native alarm streams. Scripted alarms are additive: Galaxy, AB CIP, FOCAS, and OPC UA Client drivers keep their native `IAlarmSource` implementations unchanged, and a `ScriptedAlarmSource` simply registers as another source in the same fan-out. Predicates read tags from any source (driver tags or virtual tags) through the shared `ITagUpstreamSource` and emit condition transitions through the engine's Part 9 state machine. + +This file covers the engine internals — predicate evaluation, state machine, persistence, and the engine-to-`IAlarmSource` adapter. The server-side plumbing that turns those emissions into OPC UA `AlarmConditionState` nodes, applies retries, persists alarm transitions to the Historian, and routes operator acks through the session's `AlarmAck` permission lives in [AlarmTracking.md](AlarmTracking.md) and is not repeated here. + +## Definition shape + +`ScriptedAlarmDefinition` (`src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/ScriptedAlarmDefinition.cs`) is the runtime contract the engine consumes. The generation-publish path materialises these from the `ScriptedAlarm` + `Script` config tables via `Phase7EngineComposer.ProjectScriptedAlarms`. + +| Field | Notes | +|---|---| +| `AlarmId` | Stable identity. Also the OPC UA `ConditionId` and the key in `IAlarmStateStore`. Convention: `{EquipmentPath}::{AlarmName}`. | +| `EquipmentPath` | UNS path the alarm hangs under in the address space. ACL scope inherits from the equipment node. | +| `AlarmName` | Browse-tree display name. | +| `Kind` | `AlarmKind` — `AlarmCondition`, `LimitAlarm`, `DiscreteAlarm`, or `OffNormalAlarm`. Controls only the OPC UA ObjectType the node surfaces as; the internal state machine is identical for all four. | +| `Severity` | `AlarmSeverity` enum (`Low` / `Medium` / `High` / `Critical`). Static per decision #13 — the predicate does not compute severity. The DB column is an OPC UA Part 9 1..1000 integer; `Phase7EngineComposer.MapSeverity` bands it into the four-value enum. | +| `MessageTemplate` | String with `{TagPath}` placeholders, resolved at emission time. See below. | +| `PredicateScriptSource` | Roslyn C# script returning `bool`. `true` = condition active; `false` = cleared. | +| `HistorizeToAveva` | When true, every emission is enqueued to `IAlarmHistorianSink`. Default true. Galaxy-native alarms default false since Galaxy historises them directly. | +| `Retain` | Part 9 retain flag — keep the condition visible after clear while un-acked/un-confirmed transitions remain. Default true. | + +Illustrative definition: + +```csharp +new ScriptedAlarmDefinition( + AlarmId: "Plant/Line1/Oven::OverTemp", + EquipmentPath: "Plant/Line1/Oven", + AlarmName: "OverTemp", + Kind: AlarmKind.LimitAlarm, + Severity: AlarmSeverity.High, + MessageTemplate: "Oven {Plant/Line1/Oven/Temp} exceeds limit {Plant/Line1/Oven/TempLimit}", + PredicateScriptSource: "return GetTag(\"Plant/Line1/Oven/Temp\").AsDouble() > GetTag(\"Plant/Line1/Oven/TempLimit\").AsDouble();"); +``` + +## Predicate evaluation + +Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. + +`AlarmPredicateContext` (`AlarmPredicateContext.cs`) is the script's `ScriptContext` subclass: + +- `GetTag(path)` returns a `DataValueSnapshot` from the engine-maintained read cache. Missing path → `DataValueSnapshot(null, 0x80340000u, null, now)` (`BadNodeIdUnknown`). An empty path returns the same. +- `SetVirtualTag(path, value)` throws `InvalidOperationException`. Predicates must be side-effect free per plan decision #6; writes would couple alarm state to virtual-tag state in ways that are near-impossible to reason about. Operators see the rejection in `scripts-*.log`. +- `Now` and `Logger` are provided by the engine. + +Evaluation cadence: + +- On every upstream tag change that any alarm's input set references (`OnUpstreamChange` → `ReevaluateAsync`). The engine maintains an inverse index `tag path → alarm ids` (`_alarmsReferencing`); only affected alarms re-run. +- On a 5-second shelving-check timer (`_shelvingTimer`) for timed-shelve expiry. +- At `LoadAsync` for every alarm, to re-derive `ActiveState` per plan decision #14 (startup recovery). + +If a predicate throws or times out, the engine logs the failure and leaves the prior `ActiveState` intact — it does not synthesise a clear. Operators investigating a broken predicate should never see a phantom clear preceding the error. + +## Part 9 state machine + +`Part9StateMachine` (`Part9StateMachine.cs`) is a pure `static` function set. Every transition takes the current `AlarmConditionState` plus the event, returns a new record and an `EmissionKind`. No I/O, no mutation, trivially unit-testable. Transitions map to OPC UA Part 9: + +- `ApplyPredicate(current, predicateTrue, nowUtc)` — predicate re-evaluation. `Inactive → Active` sets `Acked = Unacknowledged` and `Confirmed = Unconfirmed`; `Active → Inactive` updates `LastClearedUtc` and consumes `OneShot` shelving. Disabled alarms no-op. +- `ApplyAcknowledge` / `ApplyConfirm` — operator ack/confirm. Require a non-empty user string (audit requirement). Each appends an `AlarmComment` with `Kind = "Acknowledge"` / `"Confirm"`. +- `ApplyOneShotShelve` / `ApplyTimedShelve(unshelveAtUtc)` / `ApplyUnshelve` — shelving transitions. `Timed` requires `unshelveAtUtc > nowUtc`. +- `ApplyEnable` / `ApplyDisable` — operator enable/disable. Disabled alarms ignore predicate results until re-enabled; on enable, `ActiveState` is re-derived from the next evaluation. +- `ApplyAddComment(text)` — append-only audit entry, no state change. +- `ApplyShelvingCheck(nowUtc)` — called by the 5s timer; promotes expired `Timed` shelving to `Unshelved` with a `system / AutoUnshelve` audit entry. + +Two invariants the machine enforces: + +1. **Disabled** alarms ignore every predicate evaluation — they never transition `ActiveState` / `AckedState` / `ConfirmedState` until re-enabled. +2. **Shelved** alarms still advance their internal state but emit `EmissionKind.Suppressed` instead of `Activated` / `Cleared`. The engine advances the state record (so startup recovery reflects reality) but `ScriptedAlarmSource` does not publish the suppressed transition to subscribers. `OneShot` expires on the next clear; `Timed` expires at `ShelvingState.UnshelveAtUtc`. + +`EmissionKind` values: `None`, `Suppressed`, `Activated`, `Cleared`, `Acknowledged`, `Confirmed`, `Shelved`, `Unshelved`, `Enabled`, `Disabled`, `CommentAdded`. + +## Message templates + +`MessageTemplate` (`MessageTemplate.cs`) resolves `{path}` placeholders in the configured message at emission time. Syntax: + +- `{path/with/slashes}` — brace-stripped contents are looked up via the engine's tag cache. +- No escaping. Literal braces in messages are not currently supported. +- `ExtractTokenPaths(template)` is called at `LoadAsync` so the engine subscribes to every referenced path (ensuring the value cache is populated before the first resolve). + +Fallback rules: a resolved `DataValueSnapshot` with a non-zero `StatusCode`, a `null` `Value`, or an unknown path becomes `{?}`. The event still fires — the operator sees where the reference broke rather than having the alarm swallowed. + +## State persistence + +`IAlarmStateStore` (`IAlarmStateStore.cs`) is the persistence contract: `LoadAsync(alarmId)`, `LoadAllAsync`, `SaveAsync(state)`, `RemoveAsync(alarmId)`. `InMemoryAlarmStateStore` in the same file is the default for tests and dev deployments without a SQL backend. Stream E wires the production implementation against the `ScriptedAlarmState` config-DB table with audit logging through `Core.Abstractions.IAuditLogger`. + +Persisted scope per plan decision #14: `Enabled`, `Acked`, `Confirmed`, `Shelving`, `LastTransitionUtc`, the `LastAck*` / `LastConfirm*` audit fields, and the append-only `Comments` list. `Active` is **not** trusted across restart — the engine re-runs the predicate at `LoadAsync` so operators never re-ack an alarm that was already acknowledged before an outage, and alarms whose condition cleared during downtime settle to `Inactive` without a spurious clear-event. + +Every mutation the state machine produces is immediately persisted inside the engine's `_evalGate` semaphore, so the store's view is always consistent with the in-memory state. + +## Source integration + +`ScriptedAlarmSource` (`ScriptedAlarmSource.cs`) adapts the engine to the driver-agnostic `IAlarmSource` interface. The existing `AlarmSurfaceInvoker` + `GenericDriverNodeManager` fan-out consumes it the same way it consumes Galaxy / AB CIP / FOCAS sources — there is no scripted-alarm-specific code path in the server plumbing. From that point on, the flow into `AlarmConditionState` nodes, the `AlarmAck` session check, and the Historian sink is shared — see [AlarmTracking.md](AlarmTracking.md). + +Two mapping notes specific to this adapter: + +- `SubscribeAlarmsAsync` accepts a list of source-node-id filters, interpreted as Equipment-path prefixes. Empty list matches every alarm. Each emission is matched against every live subscription — the adapter keeps no per-subscription cursor. +- `IAlarmSource.AcknowledgeAsync` does not carry a user identity. The adapter defaults the audit user to `"opcua-client"` so callers using the base interface still produce an audit entry. The server's Part 9 method handlers (Stream G) call the engine's richer `AcknowledgeAsync` / `ConfirmAsync` / `OneShotShelveAsync` / `TimedShelveAsync` / `UnshelveAsync` / `AddCommentAsync` directly with the authenticated principal instead. + +Emissions map into `AlarmEventArgs` as `AlarmType = Kind.ToString()`, `SourceNodeId = EquipmentPath`, `ConditionId = AlarmId`, `Message = resolved template string`, `Severity` carried verbatim, `SourceTimestampUtc = emission time`. + +## Composition + +`Phase7EngineComposer.Compose` (`src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7EngineComposer.cs`) is the single call site that instantiates the engine. It takes the generation's `Script` / `VirtualTag` / `ScriptedAlarm` rows, the shared `CachedTagUpstreamSource`, an `IAlarmStateStore`, and an `IAlarmHistorianSink`, and returns a `Phase7ComposedSources` the caller owns. When `scriptedAlarms.Count > 0`: + +1. `ProjectScriptedAlarms` resolves each row's `PredicateScriptId` against the script dictionary and produces a `ScriptedAlarmDefinition` list. Unknown or disabled scripts throw immediately — the DB publish guarantees referential integrity but this is a belt-and-braces check. +2. A `ScriptedAlarmEngine` is constructed with the upstream source, the store, a shared `ScriptLoggerFactory` keyed to `scripts-*.log`, and the root Serilog logger. +3. `alarmEngine.OnEvent` is wired to `RouteToHistorianAsync`, which projects each emission into an `AlarmHistorianEvent` and enqueues it on the sink. Fire-and-forget — the SQLite store-and-forward sink is already non-blocking. +4. `LoadAsync(alarmDefs)` runs synchronously on the startup thread: it compiles every predicate, subscribes to the union of predicate inputs and message-template tokens, seeds the value cache, loads persisted state, re-derives `ActiveState` from a fresh predicate evaluation, and starts the 5s shelving timer. Compile failures are aggregated into one `InvalidOperationException` so operators see every bad predicate in one startup log line rather than one at a time. +5. A `ScriptedAlarmSource` is created for the event stream, and a `ScriptedAlarmReadable` (`src/ZB.MOM.WW.OtOpcUa.Server/Phase7/ScriptedAlarmReadable.cs`) is created for OPC UA variable reads on the alarm's active-state node (task #245) — unknown alarm ids return `BadNodeIdUnknown` rather than silently reading `false`. + +Both engine and source are added to `Phase7ComposedSources.Disposables`, which `Phase7Composer` disposes on server shutdown. + +## Key source files + +- `src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/ScriptedAlarmEngine.cs` — orchestrator, cascade wiring, shelving timer, `OnEvent` emission +- `src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/ScriptedAlarmSource.cs` — `IAlarmSource` adapter over the engine +- `src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/ScriptedAlarmDefinition.cs` — runtime definition record +- `src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/Part9StateMachine.cs` — pure-function state machine + `TransitionResult` / `EmissionKind` +- `src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/AlarmConditionState.cs` — persisted state record + `AlarmComment` audit entry + `ShelvingState` +- `src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/AlarmPredicateContext.cs` — script-side `ScriptContext` (read-only, write rejected) +- `src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/AlarmTypes.cs` — `AlarmKind` + the four Part 9 enums +- `src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/MessageTemplate.cs` — `{path}` placeholder resolver +- `src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/IAlarmStateStore.cs` — persistence contract + `InMemoryAlarmStateStore` default +- `src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7EngineComposer.cs` — composition, config-row projection, historian routing +- `src/ZB.MOM.WW.OtOpcUa.Server/Phase7/ScriptedAlarmReadable.cs` — `IReadable` adapter exposing `ActiveState` to OPC UA variable reads diff --git a/docs/Subscriptions.md b/docs/Subscriptions.md index 19505f4..6918aa9 100644 --- a/docs/Subscriptions.md +++ b/docs/Subscriptions.md @@ -2,6 +2,15 @@ Driver-side data-change subscriptions live behind `ISubscribable` (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/ISubscribable.cs`). The interface is deliberately mechanism-agnostic: it covers native subscriptions (Galaxy MXAccess advisory, OPC UA monitored items on an upstream server, TwinCAT ADS notifications) and driver-internal polled subscriptions (Modbus, AB CIP, S7, FOCAS). Core sees the same event shape regardless — drivers fire `OnDataChange` and Core dispatches to the matching OPC UA monitored items. +## Driver vs virtual dispatch + +Per [ADR-002](v2/implementation/adr-002-driver-vs-virtual-dispatch.md), `DriverNodeManager` routes subscriptions across both driver tags and virtual (scripted) tags through the same `ISubscribable` contract. The per-variable `NodeSourceKind` (registered from `DriverAttributeInfo` at discovery) selects the backend: + +- `NodeSourceKind.Driver` — subscribes via the driver's `ISubscribable`, wrapped by `CapabilityInvoker` (the rest of this doc). +- `NodeSourceKind.Virtual` — subscribes via `VirtualTagSource` (`src/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagSource.cs`), which forwards change events emitted by `VirtualTagEngine` as `OnDataChange`. The ref-counting, initial-value, and transfer-restoration behaviour below applies identically. + +Because both kinds expose `ISubscribable`, Core's dispatch, ref-count map, and monitored-item fan-out are unchanged across the source branch. + ## ISubscribable surface ```csharp diff --git a/docs/VirtualTags.md b/docs/VirtualTags.md new file mode 100644 index 0000000..b579b84 --- /dev/null +++ b/docs/VirtualTags.md @@ -0,0 +1,142 @@ +# Virtual Tags + +Virtual tags are OPC UA variable nodes whose values are computed by operator-authored C# scripts against other tags (driver or virtual). They live in the Equipment browse tree alongside driver-sourced variables: a client browsing `Enterprise/Site/Area/Line/Equipment/` sees one flat child list that mixes both kinds, and a read / subscribe on a virtual node looks identical to one on a driver node from the wire. The separation is server-side — `NodeScopeResolver` tags each variable's `NodeSource` (`Driver` / `Virtual` / `ScriptedAlarm`), and `DriverNodeManager` dispatches reads to different backends accordingly. See [ADR-002](v2/implementation/adr-002-driver-vs-virtual-dispatch.md) for the dispatch decision. + +The runtime is split across two projects: `Core.Scripting` holds the Roslyn sandbox + evaluator primitives that are reused by both virtual tags and scripted alarms; `Core.VirtualTags` holds the engine that owns the dependency graph, the evaluation pipeline, and the `ISubscribable` adapter the server dispatches to. + +## Roslyn script sandbox (`Core.Scripting`) + +User scripts are compiled via `Microsoft.CodeAnalysis.CSharp.Scripting` against a `ScriptContext` subclass. `ScriptGlobals` exposes the context as a field named `ctx`, so scripts read `ctx.GetTag("...")` / `ctx.SetVirtualTag("...", ...)` / `ctx.Now` / `ctx.Logger` and return a value. + +### Compile pipeline (`ScriptEvaluator`) + +`ScriptEvaluator.Compile(source)` is a three-step gate: + +1. **Roslyn compile** against `ScriptSandbox.Build(contextType)`. Throws `CompilationErrorException` on syntax / type errors. +2. **`ForbiddenTypeAnalyzer.Analyze`** walks the syntax tree post-compile and resolves every referenced symbol against the deny-list. Throws `ScriptSandboxViolationException` with every offending source span attached. This is defence-in-depth: `ScriptOptions` alone cannot block every BCL namespace because .NET type forwarding routes types through assemblies the allow-list does permit. +3. **Delegate materialization** — `script.CreateDelegate()`. Failures here are Roslyn-internal; user scripts don't reach this step. + +`ScriptSandbox.Build` allow-lists exactly: `System.Private.CoreLib` (primitives + `Math` + `Convert`), `System.Linq`, `Core.Abstractions` (for `DataValueSnapshot` / `DriverDataType`), `Core.Scripting` (for `ScriptContext` + `Deadband`), `Serilog` (for `ILogger`), and the concrete context type's assembly. Pre-imported namespaces: `System`, `System.Linq`, `ZB.MOM.WW.OtOpcUa.Core.Abstractions`, `ZB.MOM.WW.OtOpcUa.Core.Scripting`. + +`ForbiddenTypeAnalyzer.ForbiddenNamespacePrefixes` currently denies `System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Thread`, `System.Runtime.InteropServices`, `Microsoft.Win32`. Matching is by prefix against the resolved symbol's containing namespace, so `System.Net` catches `System.Net.Http.HttpClient` and every subnamespace. `System.Environment` is explicitly allowed. + +### Compile cache (`CompiledScriptCache`) + +`ConcurrentDictionary>>` keyed on `SHA-256(UTF8(source))` rendered to hex. `Lazy` with `ExecutionAndPublication` mode means two threads racing a miss compile exactly once. Failed compiles evict the entry so a corrected retry can succeed (used during Admin UI authoring). No capacity bound — scripts are operator-authored and bounded by the config DB. Whitespace changes miss the cache on purpose. `Clear()` is called on config-publish. + +### Per-evaluation timeout (`TimedScriptEvaluator`) + +Wraps `ScriptEvaluator` with a wall-clock budget. Default `DefaultTimeout = 250ms`. Implementation pushes the inner `RunAsync` onto `Task.Run` (so a CPU-bound script can't hog the calling thread before `WaitAsync` registers its timeout) then awaits `runTask.WaitAsync(Timeout, ct)`. A `TimeoutException` from `WaitAsync` is wrapped as `ScriptTimeoutException`. Caller-supplied `CancellationToken` cancellation wins over the timeout and propagates as `OperationCanceledException` — so a shutdown cancel is not misclassified. **Known leak:** when a CPU-bound script times out, the underlying `ScriptRunner` keeps running on its thread-pool thread until the Roslyn runtime returns (documented trade-off; out-of-process evaluation is a v3 concern). + +### Script logger plumbing + +`ScriptLoggerFactory.Create(scriptName)` returns a per-script Serilog logger with the `ScriptName` structured property bound (constant `ScriptLoggerFactory.ScriptNameProperty`). The root script logger is typically a rolling file sink to `scripts-*.log`. `ScriptLogCompanionSink` is attached to the root pipeline and mirrors script events at `Error` or higher into the main `opcua-*.log` at `Warning` level — operators see script errors in the primary log without drowning it in script-authored Info/Debug noise. Exceptions and the `ScriptName` property are preserved in the mirror. + +### Static dependency extraction (`DependencyExtractor`) + +Parses the script source with `CSharpSyntaxTree.ParseText` (script kind), walks invocation expressions, and records every `ctx.GetTag("literal")` and `ctx.SetVirtualTag("literal", ...)` call. The first argument **must** be a string literal — variables, concatenation, interpolation, and method-returned strings are rejected at publish with a `DependencyRejection` carrying the exact `TextSpan`. This is how the engine builds its change-trigger graph statically; scripts cannot smuggle a dependency past the extractor. + +## Virtual tag engine (`Core.VirtualTags`) + +### `VirtualTagDefinition` + +One row per operator-authored tag. Fields: `Path` (UNS browse path; also the engine's internal id), `DataType` (`DriverDataType` enum; the evaluator coerces the script's return value to this and mismatch surfaces as `BadTypeMismatch`), `ScriptSource` (Roslyn C# script text), `ChangeTriggered` (re-evaluate on any input delta), `TimerInterval` (optional periodic cadence; null disables), `Historize` (route every evaluation result to `IHistoryWriter`). Change-trigger and timer are independent — a tag can be either, both, or neither. + +### `VirtualTagContext` + +Subclass of `ScriptContext`. Constructed fresh per evaluation over a per-run read cache — scripts cannot stash mutable state across runs on `ctx`. `GetTag(path)` serves from the cache; missing-path reads return a `BadNodeIdUnknown`-quality snapshot. `SetVirtualTag(path, value)` routes through the engine's `OnScriptSetVirtualTag` callback so cross-tag writes still participate in change-trigger cascades (writes to non-virtual / non-registered paths log a warning and drop). `Now` is an injectable clock; production wires `DateTime.UtcNow`, tests pin it. + +### `DependencyGraph` + +Directed graph of tag paths. Edges run from a virtual tag to each path it reads. Unregistered paths (driver tags) are implicit leaves; leaf validity is checked elsewhere against the authoritative catalog. Two operations: + +- **`TopologicalSort()`** — Kahn's algorithm. Produces evaluation order such that every node appears after its registered (virtual) dependencies. Throws `DependencyCycleException` (with every cycle, not just one) on offense. +- **`TransitiveDependentsInOrder(nodeId)`** — DFS collects every reachable dependent of a changed upstream then sorts by topological rank. Used by the cascade dispatcher so a single upstream delta recomputes the full downstream closure in one serial pass without needing a second iteration. + +Cycle detection uses an **iterative** Tarjan's SCC implementation (no recursion, deep graphs cannot stack-overflow). Cycles of length > 1 and self-loops both reject; leaf references cannot form cycles with internal nodes. + +### `VirtualTagEngine` lifecycle + +- **`Load(definitions)`** — clears prior state, compiles every script through `DependencyExtractor.Extract` + `ScriptEvaluator.Compile` (wrapped in `TimedScriptEvaluator`), registers each in `_tags` + `_graph`, runs `TopologicalSort` (cycle check), then for every upstream (non-virtual) path subscribes via `ITagUpstreamSource.SubscribeTag` and seeds `_valueCache` with `ReadTag`. Throws `InvalidOperationException` aggregating every compile failure at once so operators see the whole set; throws `DependencyCycleException` on cycles. Re-entrant — supports config-publish reloads by disposing the prior upstream subscriptions first. +- **`EvaluateAllAsync(ct)`** — evaluates every tag once in topological order. Called at startup so virtual tags have a defined initial value before subscriptions start. +- **`EvaluateOneAsync(path, ct)`** — single-tag evaluation. Entry point for `TimerTriggerScheduler` + tests. +- **`Read(path)`** — synchronous last-known-value lookup from `_valueCache`. Returns `BadNodeIdUnknown`-quality for unregistered paths. +- **`Subscribe(path, observer)`** — register a change observer; returns `IDisposable`. Does **not** emit a seed value. +- **`OnUpstreamChange(path, value)`** (internal, wired from the upstream subscription) — updates cache, notifies observers, launches `CascadeAsync` fire-and-forget so the driver's dispatcher isn't blocked. + +Evaluations are **serial across all tags** — `_evalGate` is a `SemaphoreSlim(1, 1)` held around every `EvaluateInternalAsync`. Parallelism is deferred (Phase 7 plan decision #19). Rationale: serial execution preserves the "earlier topological nodes computed before later dependents" invariant when two cascades race. Per-tag error isolation: a script exception or timeout sets that tag's quality to `BadInternalError` and logs a structured error; other tags keep evaluating. `OperationCanceledException` is re-thrown (shutdown path). + +Result coercion: `CoerceResult` maps the script's return value to the declared `DriverDataType` via `Convert.ToXxx`. Coercion failure returns null which the outer pipeline maps to `BadInternalError`; `BadTypeMismatch` is documented in the definition shape (`VirtualTagDefinition.DataType` doc) rather than emitted distinctly today. + +`IHistoryWriter.Record` fires per evaluation when `Historize = true`. The default `NullHistoryWriter` drops silently. + +### `TimerTriggerScheduler` + +Groups `VirtualTagDefinition`s by `TimerInterval`, one `System.Threading.Timer` per unique interval. Each tick evaluates the group's paths serially via `VirtualTagEngine.EvaluateOneAsync`. Errors per-tag log and continue. `Dispose()` cancels an internal `CancellationTokenSource` and disposes every timer. Independent of the change-trigger path — a tag with both triggers fires from both scheduling sources. + +### `ITagUpstreamSource` + +What the engine pulls driver-tag values from. Reads are **synchronous** because user scripts call `ctx.GetTag(path)` inline — a blocking wire call per evaluation would kill throughput. Implementations are expected to serve from a last-known-value cache populated by subscription callbacks. The server's production implementation is `CachedTagUpstreamSource` (see Composition below). + +### `IHistoryWriter` + +Fire-and-forget sink for evaluation results when `VirtualTagDefinition.Historize = true`. Implementations must queue internally and drain on their own cadence — a slow historian must not block script evaluation. `NullHistoryWriter.Instance` is the no-op default. Today no production writer is wired into the virtual-tag path; scripted-alarm emissions flow through `Core.AlarmHistorian` via `Phase7EngineComposer.RouteToHistorianAsync` (a separate concern; see [AlarmTracking.md](AlarmTracking.md)). + +## Dispatch integration + +Per [ADR-002](v2/implementation/adr-002-driver-vs-virtual-dispatch.md) Option B, there is a single `DriverNodeManager`. `VirtualTagSource` implements `IReadable` + `ISubscribable` over a `VirtualTagEngine`: + +- `ReadAsync` fans each path through `engine.Read(...)`. +- `SubscribeAsync` calls `engine.Subscribe` per path and forwards each engine observer callback as an `OnDataChange` event; emits an initial-data callback per OPC UA convention. +- `UnsubscribeAsync` disposes every per-path engine subscription it holds. +- **`IWritable` is deliberately not implemented.** `DriverNodeManager.IsWriteAllowedBySource` rejects OPC UA client writes to virtual nodes with `BadUserAccessDenied` before any dispatch — scripts are the only write path via `ctx.SetVirtualTag`. + +`DriverNodeManager.SelectReadable(source, ...)` picks the `IReadable` based on `NodeSourceKind`. See [ReadWriteOperations.md](ReadWriteOperations.md) and [Subscriptions.md](Subscriptions.md) for the broader dispatch framing. + +## Upstream reads + history + +`ITagUpstreamSource` and `IHistoryWriter` are the two ports the engine requires from its host. Both live in `Core.VirtualTags`. In the Server process: + +- **`CachedTagUpstreamSource`** (`src/ZB.MOM.WW.OtOpcUa.Server/Phase7/CachedTagUpstreamSource.cs`) implements the interface (and the parallel `Core.ScriptedAlarms.ITagUpstreamSource` — identical shape, distinct namespace). A `ConcurrentDictionary` cache. `Push(path, snapshot)` updates the cache and fans out synchronously to every observer. Reads of never-pushed paths return `BadNodeIdUnknown` quality (`UpstreamNotConfigured = 0x80340000`). +- **`DriverSubscriptionBridge`** (`src/ZB.MOM.WW.OtOpcUa.Server/Phase7/DriverSubscriptionBridge.cs`) feeds the cache. For each registered `ISubscribable` driver it batches a single `SubscribeAsync` for every fullRef the script graph references, installs an `OnDataChange` handler that translates driver-opaque fullRefs back to UNS paths via a reverse map, and pushes each delta into `CachedTagUpstreamSource`. Unsubscribes on dispose. The bridge suppresses `OTOPCUA0001` (the Roslyn analyzer that requires `ISubscribable` callers to go through `CapabilityInvoker`) on the documented basis that this is a lifecycle wiring, not per-evaluation hot path. +- **`IHistoryWriter`** — no production implementation is currently wired for virtual tags; `VirtualTagEngine` gets `NullHistoryWriter` by default from `Phase7EngineComposer`. + +## Composition + +`Phase7Composer` (`src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7Composer.cs`) is an `IAsyncDisposable` injected into `OpcUaServerService`: + +1. `PrepareAsync(generationId, ct)` — called after the bootstrap generation loads and before `OpcUaApplicationHost.StartAsync`. Reads the `Script` / `VirtualTag` / `ScriptedAlarm` rows for that generation from the config DB (`OtOpcUaConfigDbContext`). Empty-config fast path returns `Phase7ComposedSources.Empty`. +2. Constructs a `CachedTagUpstreamSource` + hands it to `Phase7EngineComposer.Compose`. +3. `Phase7EngineComposer.Compose` projects `VirtualTag` rows into `VirtualTagDefinition`s (joining `Script` rows by `ScriptId`), instantiates `VirtualTagEngine`, calls `Load`, wraps in `VirtualTagSource`. +4. Builds a `DriverFeed` per driver by mapping the driver's `EquipmentNamespaceContent` to `UNS path → driver fullRef` (path format `/{area}/{line}/{equipment}/{tag}` matching the `EquipmentNodeWalker` browse tree so script literals match the operator-visible UNS), then starts `DriverSubscriptionBridge`. +5. Returns `Phase7ComposedSources` with the `VirtualTagSource` cast as `IReadable`. `OpcUaServerService` passes it to `OpcUaApplicationHost` which threads it into `DriverNodeManager` as `virtualReadable`. + +`DisposeAsync` tears down the bridge first (no more events into the cache), then the engines (cascades + timer ticks stop), then the owned SQLite historian sink if any. + +Definition reload on config publish: `VirtualTagEngine.Load` is re-entrant — a future config-publish handler can call it with a new definition set. That handler is not yet wired; today engine composition happens once per service start against the bootstrapped generation. + +## Key source files + +- `src/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptContext.cs` — abstract `ctx` API scripts see +- `src/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptGlobals.cs` — generic globals wrapper naming the field `ctx` +- `src/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptSandbox.cs` — assembly allow-list + imports +- `src/ZB.MOM.WW.OtOpcUa.Core.Scripting/ForbiddenTypeAnalyzer.cs` — post-compile semantic deny-list +- `src/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptEvaluator.cs` — three-step compile pipeline +- `src/ZB.MOM.WW.OtOpcUa.Core.Scripting/TimedScriptEvaluator.cs` — 250ms default timeout wrapper +- `src/ZB.MOM.WW.OtOpcUa.Core.Scripting/CompiledScriptCache.cs` — SHA-256-keyed compile cache +- `src/ZB.MOM.WW.OtOpcUa.Core.Scripting/DependencyExtractor.cs` — static `ctx.GetTag` / `ctx.SetVirtualTag` inference +- `src/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptLoggerFactory.cs` — per-script Serilog logger +- `src/ZB.MOM.WW.OtOpcUa.Core.Scripting/ScriptLogCompanionSink.cs` — error mirror to main log +- `src/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagDefinition.cs` — per-tag config record +- `src/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagContext.cs` — evaluation-scoped `ctx` +- `src/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/DependencyGraph.cs` — Kahn topo-sort + iterative Tarjan SCC +- `src/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs` — load / evaluate / cascade pipeline +- `src/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/TimerTriggerScheduler.cs` — periodic re-evaluation +- `src/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/ITagUpstreamSource.cs` — driver-tag read + subscribe port +- `src/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/IHistoryWriter.cs` — historize sink port + `NullHistoryWriter` +- `src/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagSource.cs` — `IReadable` + `ISubscribable` adapter +- `src/ZB.MOM.WW.OtOpcUa.Server/Phase7/CachedTagUpstreamSource.cs` — production `ITagUpstreamSource` +- `src/ZB.MOM.WW.OtOpcUa.Server/Phase7/DriverSubscriptionBridge.cs` — driver `ISubscribable` → cache feed +- `src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7EngineComposer.cs` — row projection + engine instantiation +- `src/ZB.MOM.WW.OtOpcUa.Server/Phase7/Phase7Composer.cs` — lifecycle host: load rows, compose, wire bridge +- `src/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs` — `SelectReadable` + `IsWriteAllowedBySource` dispatch kernel diff --git a/docs/drivers/README.md b/docs/drivers/README.md index 6b6f21c..74edd2b 100644 --- a/docs/drivers/README.md +++ b/docs/drivers/README.md @@ -23,7 +23,7 @@ Driver type metadata is registered at startup in `DriverTypeRegistry` (`src/ZB.M | [Galaxy](Galaxy.md) | `Driver.Galaxy.{Shared, Host, Proxy}` | C | MXAccess COM + `aahClientManaged` + SqlClient | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IAlarmSource, IHistoryProvider, IRediscoverable, IHostConnectivityProbe | Out-of-process — Host is its own Windows service (.NET 4.8 x86 for the COM bitness constraint); Proxy talks to Host over a named pipe | | Modbus TCP | `Driver.Modbus` | A | NModbus-derived in-house client | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe | Polled subscriptions via the shared `PollGroupEngine`. DL205 PLCs are covered by `AddressFormat=DL205` (octal V/X/Y/C/T/CT translation) — no separate driver | | Siemens S7 | `Driver.S7` | A | [S7netplus](https://github.com/S7NetPlus/s7netplus) | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe | Single S7netplus `Plc` instance per PLC serialized with `SemaphoreSlim` — the S7 CPU's comm mailbox is scanned at most once per cycle, so parallel reads don't help | -| AB CIP | `Driver.AbCip` | A | libplctag CIP | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver | ControlLogix / CompactLogix. Tag discovery uses the `@tags` walker to enumerate controller-scoped + program-scoped symbols; UDT member resolution via the UDT template reader | +| AB CIP | `Driver.AbCip` | A | libplctag CIP | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver, IAlarmSource | ControlLogix / CompactLogix. Tag discovery uses the `@tags` walker to enumerate controller-scoped + program-scoped symbols; UDT member resolution via the UDT template reader | | AB Legacy | `Driver.AbLegacy` | A | libplctag PCCC | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver | SLC 500 / MicroLogix. File-based addressing (`N7:0`, `F8:0`) — no symbol table, tag list is user-authored in the config DB | | TwinCAT | `Driver.TwinCAT` | B | Beckhoff `TwinCAT.Ads` (`TcAdsClient`) | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver | The only native-notification driver outside Galaxy — ADS delivers `ValueChangedCallback` events the driver forwards straight to `ISubscribable.OnDataChange` without polling. Symbol tree uploaded via `SymbolLoaderFactory` | | FOCAS | `Driver.FOCAS` | C | FANUC FOCAS2 (`Fwlib32.dll` P/Invoke) | IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IHostConnectivityProbe, IPerCallHostResolver | Tier C — FOCAS DLL has crash modes that warrant process isolation. CNC-shaped data model (axes, spindle, PMC, macros, alarms) not a flat tag map | diff --git a/docs/reqs/OpcUaServerReqs.md b/docs/reqs/OpcUaServerReqs.md index c00dd49..bbb22ad 100644 --- a/docs/reqs/OpcUaServerReqs.md +++ b/docs/reqs/OpcUaServerReqs.md @@ -1,6 +1,8 @@ # OPC UA Server — Component Requirements -> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). OPC-001…OPC-013 have been rewritten driver-agnostically — they now describe how the core OPC UA server composes multiple driver subtrees, enforces authorization, and invokes capabilities through the Polly-wrapped dispatch path. OPC-014 through OPC-022 are new and cover capability dispatch, per-host Polly isolation, idempotence-aware write retry, `AuthorizationGate`, `ServiceLevel` reporting, the alarm surface, history surface, server-certificate management, and the transport-security profile matrix. Galaxy-specific behavior has been moved out to `GalaxyRepositoryReqs.md` and `MxAccessClientReqs.md`. +> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). OPC-001…OPC-013 have been rewritten driver-agnostically — they now describe how the core OPC UA server composes multiple driver subtrees, enforces authorization, and invokes capabilities through the Polly-wrapped dispatch path. OPC-014 through OPC-019 are new and cover `AuthorizationGate` + permission trie, dynamic `ServiceLevel` reporting, session management, surgical address-space rebuild on generation apply, server diagnostics nodes, and OpenTelemetry observability hooks. Capability dispatch (OPC-012), per-host Polly isolation (OPC-013), idempotence-aware write retry (OPC-006 + OPC-012), the alarm surface (OPC-008), the history surface (OPC-009), and the transport-security / server-certificate profile matrix (OPC-010) are folded into the renumbered body above. Galaxy-specific behavior has been moved out to `GalaxyRepositoryReqs.md` and `MxAccessClientReqs.md`. +> +> **Reserved** — OPC-020, OPC-021, and OPC-022 are intentionally unallocated and held for future use. An earlier draft of this revision header listed them; no matching requirement bodies were ever pinned down because the scope they were meant to hold is already covered by OPC-006/008/009/010/012/013. Do not recycle these IDs for unrelated requirements without a deliberate renumbering pass. Parent: [HLR-001](HighLevelReqs.md#hlr-001-opc-ua-server), [HLR-003](HighLevelReqs.md#hlr-003-address-space-composition-per-namespace), [HLR-009](HighLevelReqs.md#hlr-009-transport-security-and-authentication), [HLR-010](HighLevelReqs.md#hlr-010-per-driver-instance-resilience), [HLR-013](HighLevelReqs.md#hlr-013-cluster-redundancy)