Compare commits
27 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 2d07d716dc | | |
| | ae7106dfce | | |
| | 1bd8a1875b | | |
| | fe91d42927 | | |
| | 6bf147a113 | | |
| | 9db2edcbb5 | | |
| | 5e890ec9d6 | | |
| | 580c45f494 | | |
| | da277a843a | | |
| | c55da145ec | | |
| | 42f41fbe50 | | |
| | d5a87c7467 | | |
| | 6f4cbf8449 | | |
| | edee47d77f | | |
| | 22ef2eb5ba | | |
| | 698bdef572 | | |
| | 2fdad81af3 | | |
| | 7b21c3b428 | | |
| | 619207e7f5 | | |
| | 78fe3e8a45 | | |
| | 837172ab39 | | |
| | 80a0ca2651 | | |
| | 8d042c631b | | |
| | bbdbdf8afb | | |
| | 982771df9a | | |
| | 9db6da9c20 | | |
| | 71443ecbf3 | | |
@@ -4,15 +4,39 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
||||
|
||||
## Project Goal
|
||||
|
||||
Build an OPC UA server on .NET Framework 4.8 (32-bit) that exposes AVEVA System Platform (Wonderware) Galaxy tags via the MXAccess toolkit. The server mirrors the Galaxy object hierarchy as an OPC UA address space, translating between contained-name browse paths and tag-name runtime references.
|
||||
Build an OPC UA server (.NET 10) that exposes AVEVA System Platform
(Wonderware) Galaxy tags. The server mirrors the Galaxy object
hierarchy as an OPC UA address space, translating between
contained-name browse paths and tag-name runtime references. Galaxy
access flows through the in-process `GalaxyDriver`
(`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/`) talking gRPC to a separately
installed **mxaccessgw** gateway process. The gateway owns the
MXAccess COM bitness constraint (its worker is x86 net48); everything
in this repo is .NET 10. PR 7.2 retired the legacy in-process
`Galaxy.Host` / `Galaxy.Proxy` / `Galaxy.Shared` projects + the
`OtOpcUaGalaxyHost` Windows service.
|
||||
|
||||
See `lmx_mxgw.md` for the migration design and
`docs/v2/Galaxy.Performance.md` for the runtime perf surface
(tracing, metrics, soak harness).
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
### Data Flow
|
||||
|
||||
1. **Galaxy Repository DB (ZB)** — SQL Server database holding the deployed object hierarchy and attribute definitions. Queried at startup and on change detection to build/rebuild the OPC UA address space.
|
||||
2. **MXAccess COM API** — Runtime data access layer. Subscribes to Galaxy tag attributes for live read/write. Requires a dedicated STA thread with a Win32 message pump for COM callbacks.
|
||||
3. **OPC UA Server** — Exposes the hierarchy as browse nodes and attributes as variable nodes. Clients browse via contained names but reads/writes are translated to `tag_name.AttributeName` format for MXAccess.
|
||||
1. **Galaxy Repository DB (ZB)** — SQL Server database holding the
   deployed object hierarchy and attribute definitions. The
   mxaccessgw's `GalaxyRepositoryClient` queries it via gRPC; the
   driver consumes the materialised hierarchy through
   `IGalaxyHierarchySource`.
2. **MXAccess (via mxaccessgw)** — Live read/write/subscribe over a
   gRPC session. The gateway owns the COM apartment + STA pump
   server-side; the driver speaks `MxCommand` / `MxEvent` protos
   exclusively.
3. **OPC UA Server** — Exposes the hierarchy as browse nodes and
   attributes as variable nodes. Clients browse via contained names
   but reads/writes are translated to `tag_name.AttributeName` format
   for MXAccess.
|
||||
|
||||
### Key Concept: Contained Name vs Tag Name
|
||||
|
||||
@@ -22,30 +46,17 @@ Galaxy objects have two names:
|
||||
|
||||
Example: browsing `TestMachine_001/DelmiaReceiver/DownloadPath` translates to MXAccess reference `DelmiaReceiver_001.DownloadPath`.
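The translation in the example above can be sketched as a lookup from an object's contained-name path to its tag name, followed by appending the attribute. This is a minimal sketch only: `ContainedNameResolver` and the dictionary it takes are hypothetical and are not the driver's actual API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the repo. Maps a contained-name browse
// path (object path + attribute) to a tag_name.AttributeName reference.
public static class ContainedNameResolver
{
    public static string ToTagReference(
        string browsePath,
        IReadOnlyDictionary<string, string> tagNameByObjectPath)
    {
        // The last segment is the attribute; everything before it is the object path.
        int split = browsePath.LastIndexOf('/');
        if (split <= 0 || split == browsePath.Length - 1)
            throw new ArgumentException("Expected 'object/path/Attribute'.", nameof(browsePath));

        string objectPath = browsePath[..split];
        string attribute = browsePath[(split + 1)..];

        if (!tagNameByObjectPath.TryGetValue(objectPath, out var tagName))
            throw new KeyNotFoundException($"No Galaxy object at '{objectPath}'.");

        return $"{tagName}.{attribute}";
    }
}
```

With a map entry `TestMachine_001/DelmiaReceiver` → `DelmiaReceiver_001`, the browse path from the example resolves to `DelmiaReceiver_001.DownloadPath`.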
|
||||
|
||||
See `gr/layout.md` for the full mapping and target OPC UA structure.
|
||||
|
||||
### Data Type Mapping
|
||||
|
||||
Galaxy `mx_data_type` values map to OPC UA types (Boolean, Int32, Float, Double, String, DateTime, etc.). Array attributes use ValueRank=1 with ArrayDimensions from the Galaxy attribute definition. Full mapping in `gr/data_type_mapping.md`.
|
||||
Galaxy `mx_data_type` values map to OPC UA types (Boolean, Int32, Float, Double, String, DateTime, etc.). Array attributes use ValueRank=1 with ArrayDimensions from the Galaxy attribute definition. The driver-side mapping lives in `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Browse/DataTypeMap.cs`.
|
||||
|
||||
### Change Detection
|
||||
|
||||
Poll `galaxy.time_of_last_deploy` in the ZB database to detect redeployments, then rebuild the address space. See `gr/build_layout_plan.md` for the step-by-step plan.
|
||||
`DeployWatcher` (`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/Browse/DeployWatcher.cs`) polls the gateway's deploy-event signal and raises `IRediscoverable.OnRediscoveryNeeded` when the Galaxy redeploys. The server's `DriverHost` consumes the signal and rebuilds the address space.
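The detection step can be sketched as "remember the last deploy marker, raise an event when a newer one appears". The sketch below assumes the deploy signal is a timestamp, which this document does not confirm; `DeployChangeDetector` and its members are hypothetical stand-ins for `DeployWatcher` and `IRediscoverable.OnRediscoveryNeeded`.

```csharp
#nullable enable
using System;

// Hypothetical sketch of deploy-change detection. The real DeployWatcher's
// members and the gateway call it uses are not shown in this document.
public sealed class DeployChangeDetector
{
    private DateTimeOffset? _lastSeenDeploy;

    // Stand-in for IRediscoverable.OnRediscoveryNeeded.
    public event EventHandler? RediscoveryNeeded;

    // Call once per poll with the latest deploy timestamp reported upstream.
    // Returns true when a redeploy was detected and the event was raised.
    public bool Observe(DateTimeOffset latestDeploy)
    {
        if (_lastSeenDeploy is null)
        {
            // The first observation only establishes the baseline.
            _lastSeenDeploy = latestDeploy;
            return false;
        }

        if (latestDeploy <= _lastSeenDeploy.Value)
            return false;

        _lastSeenDeploy = latestDeploy;
        RediscoveryNeeded?.Invoke(this, EventArgs.Empty);
        return true;
    }
}
```

Treating the first observation as a baseline avoids a spurious rebuild at startup, when the address space has just been built from the same deployment.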
|
||||
|
||||
## Reference Implementation
|
||||
## mxaccessgw
|
||||
|
||||
An existing MXAccess client implementation is at:
|
||||
`C:\Users\dohertj2\Desktop\scadalink-design\lmxproxy\src\ZB.MOM.WW.LmxProxy.Host`
|
||||
|
||||
Key patterns from that codebase:
|
||||
- **StaComThread** — Dedicated STA thread with Win32 message pump (`GetMessage`/`DispatchMessage` loop). All MXAccess COM objects must be created and called on this thread. Uses `PostThreadMessage(WM_APP)` to marshal work items.
|
||||
- **LMXProxyServer COM object** — `Register(clientName)` returns a connection handle. `AddItem(handle, address)` + `AdviseSupervisory(handle, itemHandle)` for subscriptions. `OnDataChange`/`OnWriteComplete` events for callbacks.
|
||||
- **Reconnect** — Stored subscriptions are replayed after reconnect. A probe tag subscription monitors connection health.
|
||||
- **COM cleanup** — `Marshal.ReleaseComObject()` on disconnect. Event handlers must be unwired before unregister.
|
||||
|
||||
## MXAccess Documentation
|
||||
|
||||
`mxaccess_documentation.md` in the project root contains the full ArchestrA MXAccess Toolkit User's Guide. Key API: `ArchestrA.MxAccess` namespace, `LMXProxyServer` class. The toolkit DLLs are in `Program Files (x86)\ArchestrA\Framework\bin`.
|
||||
The gateway lives in a sibling repo at `c:\Users\dohertj2\Desktop\mxaccessgw\`. See `docs/v2/Galaxy.ParityRig.md` for the gateway setup recipe (build, API key provisioning via `apikey create-key`, env-var overrides for HTTP/2 cleartext + worker path). The gateway's MXAccess Toolkit reference (its `gateway.md`) is the canonical MxAccess API doc; the standalone `mxaccess_documentation.md` previously kept in this repo was retired in PR 7.3.
|
||||
|
||||
## Galaxy Repository Database
|
||||
|
||||
@@ -71,11 +82,48 @@ dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests # integration tests
|
||||
dotnet test --filter "FullyQualifiedName~MyTestClass.MyMethod" # single test
|
||||
```
|
||||
|
||||
## Docker Workflow (driver fixtures + central SQL Server)
|
||||
|
||||
> **Migrated 2026-04-28**: Docker config + host moved off this dev VM (DESKTOP-6JL3KKO) onto the shared Linux Docker host (`DOCKER`, 10.100.0.35) so the dev VM could shed WSL2/Hyper-V and have its GPU re-attached via ESXi passthrough. Docker Desktop is no longer installed here. All checked-in `appsettings.json` defaults, fixture-class default endpoints, and `e2e-config.sample.json` were rewritten to target `10.100.0.35`. The driver fixture compose files under `tests/.../Docker/docker-compose.yml` now carry a `project: lmxopcua` label on every service. See `docs/v2/dev-environment.md` for the full rewrite (header dated 2026-04-28).
|
||||
|
||||
Docker workloads run on a shared Linux host at **`10.100.0.35`** — not on this VM. Stacks live at `/opt/otopcua-<driver>/` on the host and carry the `project=lmxopcua` label so they're discoverable via `docker ps --filter label=project=lmxopcua`.
|
||||
|
||||
**`docker -H ssh://...` does NOT work from this VM.** Windows OpenSSH ↔ docker.exe stdio bridging hangs (`docker system dial-stdio` runs server-side but no API data flows). Use the helper below — it SSHes into the docker host and runs `docker compose` server-side.
|
||||
|
||||
**Use `lmxopcua-fix.ps1` (in `~/bin`) to control fixtures from this VM:**
|
||||
|
||||
```powershell
lmxopcua-fix ls                      # list all lmxopcua-tagged containers on the host
lmxopcua-fix up modbus standard      # bring a profile up
lmxopcua-fix up abcip controllogix
lmxopcua-fix up s7 s7_1500
lmxopcua-fix up opcuaclient          # single-service stack, no profile arg
lmxopcua-fix down modbus             # tear stack down
lmxopcua-fix logs modbus
lmxopcua-fix sync modbus             # rsync this repo's tests/.../Docker/ → /opt/otopcua-modbus/
```
|
||||
|
||||
**`sync` is the deployment step.** When you edit a fixture's compose file or Dockerfile under `tests/.../Docker/`, run `lmxopcua-fix sync <driver>` to push the changes to the docker host before bringing the stack up. The repo files are the source of truth; `/opt/otopcua-<driver>/` is a mirrored deployment.
|
||||
|
||||
**Endpoints (defaults already point at the docker host):**
|
||||
- SQL Server (always-on): `10.100.0.35,14330` — used by `appsettings.json` for `ConfigDb`.
|
||||
- Modbus: `10.100.0.35:5020` (`MODBUS_SIM_ENDPOINT`)
|
||||
- AB CIP: `10.100.0.35:44818` (`AB_SERVER_ENDPOINT`)
|
||||
- S7: `10.100.0.35:1102` (`S7_SIM_ENDPOINT`)
|
||||
- OPC UA reference (opc-plc): `opc.tcp://10.100.0.35:50000` (`OPCUA_SIM_ENDPOINT`)
|
||||
|
||||
Override any endpoint via the env var to point at a real PLC. The local OtOpcUa server runs on this VM at `opc.tcp://localhost:4840` — **that's not on the docker host**.
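The override rule reads as "environment variable if set, otherwise the checked-in default". A minimal sketch follows; `FixtureEndpoint` is a hypothetical helper, and the fixture classes in the repo may resolve endpoints differently.

```csharp
#nullable enable
using System;

// Hypothetical helper for illustration only.
public static class FixtureEndpoint
{
    // Returns the env-var override when it is set and non-blank,
    // otherwise the documented default endpoint.
    public static string Resolve(string envVar, string defaultEndpoint)
    {
        string? value = Environment.GetEnvironmentVariable(envVar);
        return string.IsNullOrWhiteSpace(value) ? defaultEndpoint : value.Trim();
    }
}
```

For example, `FixtureEndpoint.Resolve("MODBUS_SIM_ENDPOINT", "10.100.0.35:5020")` yields the docker-host default unless `MODBUS_SIM_ENDPOINT` points at a real PLC.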
|
||||
|
||||
See `docs/v2/dev-environment.md` for the full inventory and rationale.
|
||||
|
||||
## Build & Runtime Constraints
|
||||
|
||||
- Language: C#, .NET Framework 4.8, **x86 (32-bit)** platform target — required for MXAccess COM interop
|
||||
- MXAccess requires a deployed ArchestrA Platform on the machine running the server
|
||||
- COM apartment: MXAccess objects must live on an STA thread with a message pump
|
||||
- Language: C#, .NET 10, AnyCPU. The MXAccess COM bitness constraint
  is owned by the mxaccessgw worker (x86 net48), not by anything in
  this repo.
- The gateway's MXAccess worker requires a deployed ArchestrA Platform
  on the machine running the gateway. The OtOpcUa server itself does
  not.
|
||||
|
||||
## Transport Security
|
||||
|
||||
@@ -83,7 +131,7 @@ The server supports configurable OPC UA transport security via the `Security` se
|
||||
|
||||
## Redundancy
|
||||
|
||||
The server supports non-transparent warm/hot redundancy via the `Redundancy` section in `appsettings.json`. Two instances share the same Galaxy DB and MXAccess runtime but have unique `ApplicationUri` values. Each exposes `RedundancySupport`, `ServerUriArray`, and a dynamic `ServiceLevel` based on role and runtime health. The primary advertises a higher ServiceLevel than the secondary. See `docs/Redundancy.md` for the full guide.
|
||||
The server supports non-transparent warm/hot redundancy via the `Redundancy` section in `appsettings.json`. Two instances share the same Galaxy DB and the same mxaccessgw (under distinct `MxAccess.ClientName` values) but have unique `ApplicationUri` values. Each exposes `RedundancySupport`, `ServerUriArray`, and a dynamic `ServiceLevel` based on role and runtime health. The primary advertises a higher ServiceLevel than the secondary. See `docs/Redundancy.md` for the full guide.
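How a dynamic `ServiceLevel` could depend on role and runtime health can be sketched as below. The role names and byte values are illustrative assumptions, not the values the server actually advertises; the only property taken from the text is that a healthy primary reports a higher level than a healthy secondary.

```csharp
// Illustrative only: the enum, class name, and byte values are assumptions.
public enum RedundancyRole { Primary, Secondary }

public static class ServiceLevelSketch
{
    public static byte Compute(RedundancyRole role, bool runtimeHealthy)
    {
        // An unhealthy node advertises a low level so clients prefer its peer.
        if (!runtimeHealthy)
            return 1;

        // A healthy primary outranks a healthy secondary.
        return role == RedundancyRole.Primary ? (byte)255 : (byte)200;
    }
}
```

A client doing non-transparent failover would read `ServiceLevel` from each URI in `ServerUriArray` and connect to the highest value.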
|
||||
|
||||
## LDAP Authentication
|
||||
|
||||
@@ -94,7 +142,6 @@ The server uses LDAP-based user authentication via the `Authentication.Ldap` sec
|
||||
- **Logging**: Serilog with rolling daily file sink
|
||||
- **Unit tests**: xUnit + Shouldly for assertions
|
||||
- **Service hosting (Server, Admin)**: .NET generic host with `AddWindowsService` (decision #30 — replaced TopShelf in v2; see `src/ZB.MOM.WW.OtOpcUa.Server/OpcUaServerService.cs`)
|
||||
- **Service hosting (Galaxy.Host)**: plain console app wrapped by NSSM (`.NET Framework 4.8 x86` — required by MXAccess COM bitness)
|
||||
- **OPC UA**: OPC Foundation UA .NET Standard stack (https://github.com/opcfoundation/ua-.netstandard) — NuGet: `OPCFoundation.NetStandard.Opc.Ua.Server`
|
||||
|
||||
## OPC UA .NET Standard Documentation
|
||||
|
||||
@@ -9,9 +9,6 @@
|
||||
<Project Path="src/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.csproj"/>
|
||||
<Project Path="src/ZB.MOM.WW.OtOpcUa.Server/ZB.MOM.WW.OtOpcUa.Server.csproj"/>
|
||||
<Project Path="src/ZB.MOM.WW.OtOpcUa.Admin/ZB.MOM.WW.OtOpcUa.Admin.csproj"/>
|
||||
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
|
||||
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/>
|
||||
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.csproj"/>
|
||||
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.csproj"/>
|
||||
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.csproj"/>
|
||||
<Project Path="src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.csproj"/>
|
||||
@@ -46,12 +43,6 @@
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Server.Tests/ZB.MOM.WW.OtOpcUa.Server.Tests.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Admin.Tests/ZB.MOM.WW.OtOpcUa.Admin.Tests.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Admin.E2ETests/ZB.MOM.WW.OtOpcUa.Admin.E2ETests.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Tests.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests.csproj"/>
|
||||
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client.Tests.csproj"/>
|
||||
|
||||
+41
-112
@@ -2,132 +2,61 @@
|
||||
|
||||
## Overview
|
||||
|
||||
A production OtOpcUa deployment runs **three processes**, each with a distinct runtime, platform target, and install surface:
|
||||
A production OtOpcUa deployment runs **two or three processes**, each
with a distinct runtime and install surface:
|
||||
|
||||
| Process | Project | Runtime | Platform | Responsibility |
|---|---|---|---|---|
| **OtOpcUa Server** | `src/ZB.MOM.WW.OtOpcUa.Server` | .NET 10 | x64 | Hosts the OPC UA endpoint; loads every non-Galaxy driver in-process; exposes `/healthz`. |
| **OtOpcUa Server** | `src/ZB.MOM.WW.OtOpcUa.Server` | .NET 10 | x64 | Hosts the OPC UA endpoint; loads every driver in-process (Modbus, S7, AbCip, AbLegacy, TwinCAT, FOCAS, OPC UA Client, Galaxy via mxaccessgw); exposes `/healthz`. |
| **OtOpcUa Admin** | `src/ZB.MOM.WW.OtOpcUa.Admin` | .NET 10 (ASP.NET Core / Blazor Server) | x64 | Operator UI for Config DB editing + fleet status, SignalR hubs (`FleetStatusHub`, `AlertHub`), Prometheus `/metrics`. |
| **OtOpcUa Galaxy.Host** | `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host` | .NET Framework 4.8 | x86 (32-bit) | Hosts MXAccess COM on a dedicated STA thread with a Win32 message pump; exposes a named-pipe IPC surface consumed by `Driver.Galaxy.Proxy` inside the Server process. |
| **OtOpcUa Wonderware Historian** *(optional)* | `src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware` | .NET Framework 4.8 | x86 (32-bit) | Out-of-process sidecar exposing the Wonderware Historian SDK over a named pipe. Required only when `Historian:Wonderware:Enabled=true` in `appsettings.json`. |
|
||||
|
||||
The x86 / .NET Framework 4.8 constraint applies **only** to Galaxy.Host because the MXAccess toolkit DLLs (`Program Files (x86)\ArchestrA\Framework\bin`) are 32-bit-only COM. Every other driver (Modbus, S7, OpcUaClient, AbCip, AbLegacy, TwinCAT, FOCAS) runs in-process in the 64-bit Server.
|
||||
Galaxy access uses a separately-installed **mxaccessgw** running out
of a sibling repo (`c:\Users\dohertj2\Desktop\mxaccessgw\`) — see
`docs/v2/Galaxy.ParityRig.md` for setup. The mxaccessgw owns the
MXAccess COM bitness constraint (its worker is x86 net48); nothing
in the OtOpcUa repo carries that constraint anymore. PR 7.2 retired
the legacy in-process `Galaxy.Host` / `Galaxy.Proxy` / `Galaxy.Shared`
projects + the `OtOpcUaGalaxyHost` Windows service.
|
||||
|
||||
## Server process
|
||||
## OtOpcUa Server
|
||||
|
||||
`src/ZB.MOM.WW.OtOpcUa.Server/Program.cs` uses the generic host:
|
||||
Hosted via `Microsoft.Extensions.Hosting` with `AddWindowsService`
(decision #30 — replaced TopShelf in v2). The host's `Build()`
returns immediately when launched interactively (e.g. `dotnet run`)
but blocks for SCM signals when running as a Windows service.
|
||||
|
||||
```csharp
var builder = Host.CreateApplicationBuilder(args);
builder.Services.AddSerilog();
builder.Services.AddWindowsService(o => o.ServiceName = "OtOpcUa");
…
builder.Services.AddHostedService<OpcUaServerService>();
builder.Services.AddHostedService<HostStatusPublisher>();
```
|
||||
In-process drivers are registered at startup in `Program.cs`'s
`DriverFactoryRegistry` block; the `DriverInstance` rows in the
central Config DB select which driver factories materialise into
live `IDriver` instances. See `docs/v2/driver-specs.md` for the
per-driver `DriverConfig` JSON shapes.
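The "factories registered at startup, rows select which ones materialise" pattern can be sketched as follows. Every type here (`ISketchDriver`, `SketchDriver`, `SketchDriverRow`, `SketchDriverRegistry`) is hypothetical; the repo's `IDriver` and `DriverFactoryRegistry` have their own signatures, which this document does not show.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shapes for illustration only.
public interface ISketchDriver { string InstanceName { get; } }

public sealed record SketchDriver(string InstanceName) : ISketchDriver;

// Stand-in for a DriverInstance row read from the Config DB.
public sealed record SketchDriverRow(string DriverType, string InstanceName, string ConfigJson);

public sealed class SketchDriverRegistry
{
    private readonly Dictionary<string, Func<SketchDriverRow, ISketchDriver>> _factories =
        new(StringComparer.OrdinalIgnoreCase);

    // Called once per driver type at startup.
    public void Register(string driverType, Func<SketchDriverRow, ISketchDriver> factory) =>
        _factories[driverType] = factory;

    // Only rows whose driver type has a registered factory become live drivers.
    public IEnumerable<ISketchDriver> Materialise(IEnumerable<SketchDriverRow> rows)
    {
        foreach (var row in rows)
        {
            if (_factories.TryGetValue(row.DriverType, out var factory))
                yield return factory(row);
        }
    }
}
```

Keeping registration in code and selection in data means adding a driver instance is a Config DB change, while adding a driver type is a code change.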
|
||||
|
||||
`OpcUaServerService` is a `BackgroundService` (decision #30 — TopShelf from v1 was replaced by the generic-host `AddWindowsService` wrapper; no TopShelf dependency remains in any csproj). It owns:
|
||||
## OtOpcUa Admin
|
||||
|
||||
1. Config bootstrap — reads `Node:NodeId`, `Node:ClusterId`, `Node:ConfigDbConnectionString`, `Node:LocalCachePath` from `appsettings.json`.
|
||||
2. `NodeBootstrap` — pulls the latest published generation from the Config DB into the LiteDB local cache (`LiteDbConfigCache`) so the node starts even if the central DB is briefly unreachable.
|
||||
3. `DriverHost` — instantiates configured driver instances from the generation, wires each through `CapabilityInvoker` resilience pipelines.
|
||||
4. `OpcUaApplicationHost` — builds the OPC UA endpoint, applies `OpcUaServerOptions` + `LdapOptions`, registers `AuthorizationGate` at dispatch.
|
||||
5. `HostStatusPublisher` — a second hosted service that heartbeats `DriverHostStatus` rows so the Admin UI Fleet view sees the node.
|
||||
Same hosting model; runs the Blazor Server UI + SignalR hubs.
|
||||
Reads from the same Config DB the Server writes to.
|
||||
|
||||
### Installation
|
||||
## OtOpcUa Wonderware Historian (optional)
|
||||
|
||||
Same executable, different modes driven by the .NET generic-host `AddWindowsService` wrapper:
|
||||
When `Historian:Wonderware:Enabled=true`, the Server speaks to a
sidecar that wraps the Wonderware Historian SDK (which is .NET
Framework only). The pipe IPC contract is in
`src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Contracts/`
and the sidecar's pipe handler lives at
`src/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/Pipe/`.
|
||||
|
||||
| Mode | Invocation |
|---|---|
| Console | `ZB.MOM.WW.OtOpcUa.Server.exe` |
| Install as Windows service | `sc create OtOpcUa binPath="C:\Program Files\OtOpcUa\Server\ZB.MOM.WW.OtOpcUa.Server.exe" start=auto` |
| Start | `sc start OtOpcUa` |
| Stop | `sc stop OtOpcUa` |
| Uninstall | `sc delete OtOpcUa` |
|
||||
Install via the `-InstallWonderwareHistorian` switch on
`scripts/install/Install-Services.ps1`.
|
||||
|
||||
### Health endpoints
|
||||
## Install / Uninstall
|
||||
|
||||
The Server exposes `/healthz` + `/readyz` used by (a) the Admin `FleetStatusPoller` as input to Fleet status and (b) `PeerReachabilityTracker` in a peer Server process as the HTTP side of the peer-reachability probe.
|
||||
- `scripts/install/Install-Services.ps1` — installs `OtOpcUa` and
  optionally `OtOpcUaWonderwareHistorian`.
- `scripts/install/Uninstall-Services.ps1` — stops + removes both,
  plus `OtOpcUaGalaxyHost` if a pre-7.2 rig still carries it.
|
||||
|
||||
## Admin process
|
||||
## Logging
|
||||
|
||||
`src/ZB.MOM.WW.OtOpcUa.Admin/Program.cs` is a stock `WebApplication`. Highlights:
|
||||
|
||||
- Cookie auth (`CookieAuthenticationDefaults`, scheme name `OtOpcUa.Admin`) + Blazor Server (`AddInteractiveServerComponents`) + SignalR.
|
||||
- Authorization policies gated by `AdminRoles`: `ConfigViewer`, `ConfigEditor`, `FleetAdmin` (see `Services/AdminRoles.cs`). `CanEdit` policy requires `ConfigEditor` or `FleetAdmin`; `CanPublish` requires `FleetAdmin`.
|
||||
- `OtOpcUaConfigDbContext` registered against `ConnectionStrings:ConfigDb`.
|
||||
- Scoped services: `ClusterService`, `GenerationService`, `EquipmentService`, `UnsService`, `NamespaceService`, `DriverInstanceService`, `NodeAclService`, `PermissionProbeService`, `AclChangeNotifier`, `ReservationService`, `DraftValidationService`, `AuditLogService`, `HostStatusService`, `ClusterNodeService`, `EquipmentImportBatchService`, `ILdapGroupRoleMappingService`.
|
||||
- Singleton `RedundancyMetrics` (meter name `ZB.MOM.WW.OtOpcUa.Redundancy`) + `CertTrustService` (promotes rejected client certs in the Server's PKI store to trusted via the Admin Certificates page).
|
||||
- `LdapAuthService` bound to `Authentication:Ldap` — same LDAP flow as ScadaLink CentralUI for visual parity.
|
||||
- SignalR hubs mapped at `/hubs/fleet` and `/hubs/alerts`; `FleetStatusPoller` runs as a hosted service and pushes `RoleChanged`, host status, and alert events.
|
||||
- OpenTelemetry → Prometheus exporter at `/metrics` when `Metrics:Prometheus:Enabled=true` (default). Pull-based means no Collector required in the common K8s deploy.
|
||||
|
||||
### Installation
|
||||
|
||||
Deployed as an ASP.NET Core service; the generic-host `AddWindowsService` wrapper (or IIS reverse-proxy for multi-node fleets) provides install/uninstall. Listens on whatever `ASPNETCORE_URLS` specifies.
|
||||
|
||||
## Galaxy.Host process
|
||||
|
||||
`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/Program.cs` is a .NET Framework 4.8 x86 console executable. Configuration comes from environment variables supplied by the supervisor (`Driver.Galaxy.Proxy.Supervisor`):
|
||||
|
||||
| Env var | Purpose |
|---|---|
| `OTOPCUA_GALAXY_PIPE` | Pipe name the host listens on (default `OtOpcUaGalaxy`). |
| `OTOPCUA_ALLOWED_SID` | SID of the Server process's principal; anyone else is refused during the handshake. |
| `OTOPCUA_GALAXY_SECRET` | Per-spawn shared secret the client must present in the Hello frame. |
| `OTOPCUA_GALAXY_BACKEND` | `mxaccess` (default), `db` (ZB-only, no COM), `stub` (in-memory; for tests). |
| `OTOPCUA_GALAXY_ZB_CONN` | SQL connection string to the ZB Galaxy repository. |
| `OTOPCUA_HISTORIAN_*` | Optional Wonderware Historian SDK config if Historian is enabled for this node. |
|
||||
|
||||
The host spins up `StaPump` (the STA thread with message pump), creates the MXAccess `LMXProxyServer` COM object on that thread, and handles all COM calls there; the IPC layer marshals work items via `PostThreadMessage`.
|
||||
|
||||
### Pipe security
|
||||
|
||||
`PipeServer` builds a `PipeAcl` from the provided `SecurityIdentifier` + uses `NamedPipeServerStream` with `maxNumberOfServerInstances: 1`. The handshake requires a matching shared secret in the first Hello frame; callers whose SID doesn't match `OTOPCUA_ALLOWED_SID` are rejected before any frame is processed via `NamedPipeServerStream.RunAsClient` + a SID comparison against the configured allow list. The DACL grants `ReadWrite | Synchronize` only to the allowed SID and denies `LocalSystem`. The installed dev host (`OtOpcUaGalaxyHost`) runs as `dohertj2` with the secret at `.local/galaxy-host-secret.txt`.
|
||||
|
||||
### Installation
|
||||
|
||||
NSSM-wrapped (the Non-Sucking Service Manager) because the executable itself is a plain console app, not a `ServiceBase` Windows service. The supervisor then adopts the child process over the pipe after install. Install/uninstall commands follow the NSSM pattern:
|
||||
|
||||
```bash
nssm install OtOpcUaGalaxyHost "C:\Program Files (x86)\OtOpcUa\Galaxy.Host\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.exe"
nssm set OtOpcUaGalaxyHost ObjectName .\dohertj2 <password>
nssm set OtOpcUaGalaxyHost AppEnvironmentExtra OTOPCUA_GALAXY_BACKEND=mxaccess OTOPCUA_GALAXY_SECRET=… OTOPCUA_ALLOWED_SID=…
nssm start OtOpcUaGalaxyHost
```
|
||||
|
||||
(Exact values for the environment block are generated by the Admin UI + committed alongside `.local/galaxy-host-secret.txt` on the dev box.)
|
||||
|
||||
## Inter-process communication
|
||||
|
||||
```
┌──────────────────────────┐   LDAP bind (Authentication:Ldap)             ┌──────────────────────────┐
│ OtOpcUa Admin (x64)      │ ─────────────────────────────────────────────▶│ LDAP / AD                │
│ Blazor Server + SignalR  │                                               └──────────────────────────┘
│ /metrics (Prometheus)    │   FleetStatusPoller → ClusterNode poll
│                          │ ─────────────────────────────────────────────▶┌──────────────────────────┐
│                          │   Cluster/Generation/ACL writes               │ Config DB (SQL Server)   │
└──────────────────────────┘ ─────────────────────────────────────────────▶│ OtOpcUaConfigDbContext   │
        ▲                                                                  └──────────────────────────┘
        │ SignalR                                                                      ▲
        │ (role change,                                                                │ sp_GetCurrentGenerationForCluster
        │  host status,                                                                │ sp_PublishGeneration
        │  alerts)                                                                     │
┌──────────────────────────┐                                                           │
│ OtOpcUa Server (x64)     │ ──────────────────────────────────────────────────────────┘
│ OPC UA endpoint          │
│ Non-Galaxy drivers       │   Named pipe (OtOpcUaGalaxy)                  ┌──────────────────────────┐
│ Driver.Galaxy.Proxy      │ ─────────────────────────────────────────────▶│ Galaxy.Host (x86 .NFx)   │
│                          │   SID + shared-secret handshake               │ STA + message pump       │
│ /healthz /readyz         │                                               │ MXAccess COM             │
└──────────────────────────┘                                               │ Historian SDK (opt)      │
                                                                           └──────────────────────────┘
```
|
||||
|
||||
## appsettings.json boundary
|
||||
|
||||
Each process reads its own `appsettings.json` for **bootstrap only** — connection strings, LDAP bind config, transport security profile, redundancy node id, logging. The authoritative configuration tree (drivers, UNS, tags, ACLs) lives in the Config DB and is edited through the Admin UI. See [`Configuration.md`](Configuration.md) for the split.
|
||||
|
||||
## Development bootstrap
|
||||
|
||||
For the Windows install steps (SQL Server in Docker, .NET 10 SDK, .NET Framework 4.8 SDK, Docker Desktop WSL 2 backend, EF Core CLI, first-run migration), see [`docs/v2/dev-environment.md`](v2/dev-environment.md).
|
||||
Serilog with rolling-daily file sinks. Each service writes to
`%ProgramData%\OtOpcUa\<service>-*.log` plus stdout (NSSM-friendly).
|
||||
|
||||
@@ -0,0 +1,623 @@
|
||||
# Driver Feature Gaps vs Commercial OPC/SCADA Gateways
|
||||
|
||||
This document compares each non-Modbus, non-LMX driver in the OtOpcUa server against the feature surfaces of the dominant commercial gateways (Kepware KEPServerEX / PTC Kepware Edge, AVEVA OI Server / DAServer, Software Toolbox TOP Server, Matrikon, Unified Automation UaGateway, MTConnect-class Fanuc adapters, Beckhoff TF6100, etc.).
|
||||
|
||||
The intent is to:
|
||||
|
||||
- inventory what we already ship (with file:line citations into the current codebase)
|
||||
- list missing or under-served features that are table-stakes for sites replacing those commercial gateways
|
||||
- preserve the design choices that should NOT change just because a competitor does it differently
|
||||
|
||||
LMX (Galaxy / MXAccess) and Modbus are tracked elsewhere and are excluded here.
|
||||
|
||||
## Drivers covered
|
||||
|
||||
| Driver | Section | Implementation plan |
|---|---|---|
| AbCip — Allen-Bradley EtherNet/IP (ControlLogix / CompactLogix / Micro800 / GuardLogix) | [↓](#abcip-allen-bradley-ethernetip--logix) | [`plans/abcip-plan.md`](plans/abcip-plan.md) |
| AbLegacy — Allen-Bradley PLC-5 / SLC / MicroLogix (PCCC) | [↓](#ablegacy-allen-bradley-plc-5--slc--micrologix) | [`plans/ablegacy-plan.md`](plans/ablegacy-plan.md) |
| FOCAS — Fanuc CNC FOCAS / FOCAS2 | [↓](#focas-fanuc-cnc) | [`plans/focas-plan.md`](plans/focas-plan.md) |
| OpcUaClient — OPC UA aggregation client | [↓](#opcuaclient-opc-ua-aggregation-client) | [`plans/opcuaclient-plan.md`](plans/opcuaclient-plan.md) |
| S7 — Siemens S7-300 / 400 / 1200 / 1500 | [↓](#s7-siemens-s7-3004001200--1500) | [`plans/s7-plan.md`](plans/s7-plan.md) |
| TwinCAT — Beckhoff TwinCAT 2 / 3 (ADS) | [↓](#twincat-beckhoff-ads) | [`plans/twincat-plan.md`](plans/twincat-plan.md) |
|
||||
|
||||
## How to read this document
|
||||
|
||||
Every gap below is rated **[Build]** (recommended) or **[Skip]** (not recommended) inline at the start of the bullet. The same rating appears in the per-driver `### Recommendations` table with its rationale. The per-driver implementation plan in `docs/plans/` covers the **[Build]** items only.
|
||||
|
||||
---
|
||||
|
||||
## AbCip (Allen-Bradley EtherNet/IP — Logix)
|
||||
|
||||
### What we ship today
|
||||
|
||||
- Per-device `ab://gateway[:port]/cip-path` host-address with multi-hop CIP path via a comma-separated string (e.g. `1,2,2,192.168.50.20,1,0`) — `src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/AbCipHostAddress.cs:23`.
|
||||
- Four PLC-family profiles (`ControlLogix`, `CompactLogix`, `Micro800`, `GuardLogix`) selecting libplctag plc attribute, ConnectionSize default (504/4002/488), default CIP path (`1,0` or empty), connected-vs-unconnected hint, request-packing flag, and MaxFragmentBytes — `src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/PlcFamilies/AbCipPlcFamilyProfile.cs:13-62`.
|
||||
- N devices per driver instance with per-device bulkhead/breaker keying — `src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/AbCipDriverOptions.cs:19`.
|
||||
- Pre-declared static tag map (`AbCipTagDefinition`) keyed by `Name`, with `TagPath`, `DataType`, `Writable`, `WriteIdempotent`, `Members`, `SafetyTag` — `AbCipDriverOptions.cs:95-103`.
|
||||
- Logix atomic types `BOOL/SINT/INT/DINT/LINT/USINT/UINT/UDINT/ULINT/REAL/LREAL/STRING/DT` plus `Structure` marker — `src/ZB.MOM.WW.OtOpcUa.Driver.AbCip/AbCipDataType.cs:16-37`.
|
||||
- Optional online controller browse via libplctag `@tags` pseudo-tag, surfaced under a `Discovered/` sub-folder; controller- and program-scope (`Program:Main.X`) tags emitted; system/module/routine/task tags filtered — `AbCipDriver.cs:674-757`, `AbCipSystemTagFilter.cs`.
|
||||
- UDT / Predefined-Structure handling: declaration-driven member fan-out (Variable per member) plus runtime CIP Template Object (class 0x6C) decoder + per-device `(deviceHostAddress, templateInstanceId)` template cache — `CipTemplateObjectDecoder.cs`, `AbCipTemplateCache.cs`, `AbCipDriver.cs:70-103`.
|
||||
- Whole-UDT read coalescing — `AbCipUdtReadPlanner` groups members of the same parent and reads the parent once, decoding members from the buffer at computed byte offsets — `AbCipDriver.cs:323-449`, `AbCipUdtReadPlanner.cs`, `AbCipUdtMemberLayout.cs`.
|
||||
- BOOL-in-DINT addressing (`Tag.N` bit-index) with read-decode + RMW write through a per-parent `SemaphoreSlim` and cached parent-DINT runtime — `AbCipDriver.cs:494-614`, `AbCipTagPath.cs`.
|
||||
- Polling subscription overlay shared with other drivers (`PollGroupEngine`) — `AbCipDriver.cs:56-59,187-195`.
|
||||
- Per-device connectivity probe with configurable interval/timeout/probe tag (default off until tag configured) and `OnHostStatusChanged` events — `AbCipDriverOptions.cs:131-143`, `AbCipDriver.cs:235-295`.
|
||||
- ALMD alarm projection (opt-in) polling `InFaulted` + `Severity`, raising `OnAlarmEvent` on edges, with ack-write — `AbCipAlarmProjection.cs`, `AbCipDriverOptions.cs:42-58`.
|
||||
- GuardLogix safety-tag flag forces `SecurityClassification.ViewOnly` — `AbCipDriverOptions.cs:89-94`, `AbCipDriver.cs:474-478`.
|
||||
- libplctag-status → OPC UA StatusCode mapping (`BadCommunicationError`, `BadNotWritable`, `BadTypeMismatch`, `BadOutOfRange`, `BadNodeIdUnknown`) — `AbCipStatusMapper.cs`.
|
||||
- Tier-B reinit (`ReinitializeAsync`) tearing down all `IAbCipTagRuntime` handles — `AbCipDriver.cs:163-167`.
|
||||
- CLI test client: `probe`, `read`, `write`, `subscribe` against the same driver — `docs/Driver.AbCip.Cli.md`.
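
To make the `ab://gateway[:port]/cip-path` form from the first bullet concrete, here is a minimal Python sketch of how such a host string could be split into gateway, port, and CIP hops. It is an illustration only, not the code in `AbCipHostAddress.cs`; the names `AbHostAddress` and `parse_ab_host_address` are hypothetical, and falling back to port 44818 (the standard EtherNet/IP TCP port) is an assumption about the default.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Assumption: when no port is given, fall back to the standard EtherNet/IP port.
DEFAULT_PORT = 44818


@dataclass(frozen=True)
class AbHostAddress:
    """Hypothetical holder for a parsed ab:// host string."""
    gateway: str
    port: int
    cip_path: tuple[str, ...]


def parse_ab_host_address(text: str) -> AbHostAddress:
    """Split 'ab://gateway[:port]/cip-path' into its parts.

    The CIP path is a comma-separated hop list; an empty path is allowed
    because some families use an empty default path.
    """
    parts = urlsplit(text)
    if parts.scheme != "ab" or not parts.hostname:
        raise ValueError(f"not an ab:// host address: {text!r}")
    raw_path = parts.path.lstrip("/")
    hops = tuple(h.strip() for h in raw_path.split(",") if h.strip())
    return AbHostAddress(parts.hostname, parts.port or DEFAULT_PORT, hops)


if __name__ == "__main__":
    print(parse_ab_host_address("ab://192.168.1.10/1,2,2,192.168.50.20,1,0"))
```

Keeping the hops as strings lets a multi-hop path mix port/slot numbers with an embedded IP address, as in the `1,2,2,192.168.50.20,1,0` example.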
|
||||
|
||||
### Gaps vs commercial gateways
|
||||
|
||||
- **[Build]** **Offline tag import from L5K / L5X** — present in: both (Kepware Logix Database Settings; TOP Server Auto Tag Generation). Why it matters: lets engineers stage a project against a Studio 5000 export with no PLC online, the de-facto config workflow at Rockwell shops.
|
||||
- **[Build]** **CSV tag import / export** — present in: both. Why it matters: Kepware/AVEVA users routinely round-trip tag lists through Excel; replacing them without CSV makes mass-config painful.
|
||||
- **[Build]** **Tag descriptions / engineering metadata** — present in: both (descriptions imported with L5X). Why it matters: descriptions become the OPC UA `Description`/`DisplayName`, expected by HMI/Historian engineers.
|
||||
- **[Build]** **Logical-blocking / logical-non-blocking protocol modes** — present in: both (TOP Server names them; Kepware exposes equivalent "Optimize for read" / structure-block reads). Why it matters: whole-UDT vs per-member read strategy is the single biggest performance lever; today we only coalesce grouped members into a whole-UDT read via `AbCipUdtReadPlanner`, with no structure-block read for non-grouped members.
|
||||
- **[Build]** **Symbolic vs logical (instance-ID) addressing toggle** — present in: both. Why it matters: logical addressing skips ASCII parsing on every poll, ~3-5x faster for high-tag-count rigs; libplctag supports it but we don't expose the choice.
|
||||
- **[Build]** **Configurable CIP Connection Size per device** — present in: both (Kepware 500-4000 byte slider, TOP Server "Max Packet Size"). Why it matters: we hard-code the family default (4002/504/488); no field knob to tune for switches that fragment large frames or for legacy v19 firmware that won't accept Large Forward Open.
|
||||
- **[Skip]** **Inactivity timeout / connection idle disconnect** — present in: both. Why it matters: long-idle CIP sessions get reaped silently by some firewalls; commercial drivers expose a keep-alive cadence we don't.
|
||||
- **[Build]** **Per-tag scan rate / scan group bucketing** — present in: both (Kepware "scan classes", AVEVA Topic update intervals). Why it matters: lets engineers separate fast 100ms machine-state tags from 5s recipe data; we have one publishing-interval-per-subscription with no per-tag override.
|
||||
- **[Skip]** **"Respect tag-specified scan rate" mode** — present in: Kepware. Why it matters: lets the static tag table override client-requested rate, important when an HMI subscribes too fast and overruns the PLC.
|
||||
- **[Skip]** **Initial value cache / "first updates from cache"** — present in: Kepware. Why it matters: avoids a stall while a fresh subscription waits for its first poll; common SCADA expectation.
|
||||
- **[Build]** **Multi-tag write packing (write-multi)** — present in: both. Why it matters: we serialise writes one-by-one in `AbCipDriver.WriteAsync`; without CIP multi-request packing for writes a recipe-download is N round-trips instead of one.
|
||||
- **[Build]** **AOI (Add-On Instruction) input/output handling** — present in: Kepware (with explicit InOut limitation note). Why it matters: AOIs are how modern Logix code is structured; the Template Object decoder probably handles the layout but we don't surface AOI-specific browse paths.
|
||||
- **[Build]** **Native STRING (Logix STRING / custom STRINGxx) decoding** — present in: both (Kepware preserves descriptors; AVEVA exposes as native string). Why it matters: we map Logix `STRING` to `DriverDataType.String` but `AbCipDataType.cs` flags whole-string only; no support for user-defined `STRINGnn` variants of different DATA-array sizes.
|
||||
- **[Build]** **64-bit integer surface (LINT/ULINT)** — present in: both. Why it matters: Logix v32+ exposes LINT for 64-bit counters/timestamps; we narrow them into `Int32` per a TODO at `AbCipDataType.cs:53`, losing the upper bits.
|
||||
- **[Skip]** **Structure / UDT as first-class OPC UA structured type** — present in: both (Kepware emits child tags; AVEVA exposes via native UDT). Why it matters: we emit `DriverDataType.String` placeholder for whole-UDT, only members are fully typed; OPC UA clients can't bind to a UDT shape.
|
||||
- **[Build]** **Array element / array slice addressing** — present in: both (Kepware `Tag[3,5]`, slice `Tag[0..15]`). Why it matters: `AbCipTagPath` supports indexed elements but the driver has no array-slice read for adjacent indices; reading `Tag[0..99]` becomes 100 individual reads.
|
||||
- **[Skip]** **PLC-5 / SLC-500 bridging via ControlLogix gateway** — present in: both (Kepware Logix Gateway, TOP Server NET-ENI). Why it matters: thousands of legacy AB sites front a PLC-5/SLC behind a 1756-ENBT; without the bridge those plants can't migrate to us in one step.
|
||||
- **[Build]** **Hot-standby ControlLogix redundancy (paired EN2T IPs)** — present in: AVEVA (and Kepware via secondary device). Why it matters: ControlLogix HSBY pairs are standard in continuous-process plants; today our driver has one host address per device, no automatic failover to the partner chassis.
|
||||
- **[Build]** **Diagnostics / system tags (`_ConnectionStatus`, `_ScanRate`, `_TagCount`, `_DeviceError`)** — present in: both. Why it matters: SCADA dashboards bind to these for live driver health; we expose `IHostConnectivityProbe` + `DriverHealth` but not as browseable OPC UA variables.
|
||||
- **[Build]** **Tag-write deadband / write-on-change / write-coalesce** — present in: both. Why it matters: avoids hammering the PLC on jittery analogue setpoints; we write every request straight through.
|
||||
- **[Skip]** **Unsolicited messages (PLC-pushed CIP MSG)** — present in: AVEVA (DASABCIP unsolicited topic), Kepware (separate "ControlLogix Unsolicited" driver). Why it matters: event-driven alarm/recipe-complete signals from the PLC arrive with sub-100ms latency vs our 1s alarm-poll loop.
|
||||
- **[Skip]** **CIP Generic / Class 3 message passthrough** — present in: both. Why it matters: enables custom tooling (drive parameters, motion config, MSG instruction targets) for shops that have built around it.
|
||||
- **[Skip]** **Configurable per-device connection count / connection pooling** — present in: both (AVEVA: max 31). Why it matters: lets operators trade PLC CPU cost against parallelism for high-throughput rigs; we run one connection per tag handle implicitly.
|
||||
- **[Build]** **Online tag-database refresh trigger** — present in: AVEVA (`$Sys$UpdateTagInfo`). Why it matters: lets ops force re-browse after a Studio 5000 download without restarting the driver; we only re-browse on full driver reinit.
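
The LINT/ULINT gap above is easiest to see with a worked example. The Python sketch below assumes the narrowing keeps the low 32 bits and reinterprets them as a signed `Int32`; the document only says the upper bits are lost, so the exact narrowing rule is an assumption, and both function names are hypothetical.

```python
import struct


def decode_lint(raw: bytes) -> int:
    """Decode 8 little-endian bytes as a signed 64-bit integer (CIP is little-endian)."""
    return struct.unpack("<q", raw)[0]


def narrow_to_int32(value: int) -> int:
    """Assumed narrowing: keep the low 32 bits, reinterpret as signed Int32."""
    low = value & 0xFFFF_FFFF
    return low - 0x1_0000_0000 if low & 0x8000_0000 else low


if __name__ == "__main__":
    counter = decode_lint(struct.pack("<q", 5_000_000_000))
    # 5_000_000_000 is 0x1_2A05_F200; dropping the upper word leaves 0x2A05_F200.
    print(counter, "->", narrow_to_int32(counter))
```

Any counter or timestamp above 2,147,483,647 comes out as a different, plausible-looking number, which is why the recommendation table treats this as a correctness issue and not a cosmetic one.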
|
||||
|
||||
### Recommendations
|
||||
|
||||
| # | Gap | Build? | Rationale |
|---|-----|:------:|-----------|
| 1 | Offline L5K / L5X import | Yes | De-facto Studio 5000 workflow; engineers won't switch without it |
| 2 | CSV tag import / export | Yes | Common round-trip via Excel for mass config |
| 3 | Tag descriptions / engineering metadata | Yes | Free once L5X import lands; expected as OPC UA `Description` |
| 4 | Logical-blocking / non-blocking modes | Yes | Biggest perf lever; today only whole-UDT coalescing |
| 5 | Symbolic vs logical (instance-ID) toggle | Yes | 3-5x perf on dense rigs; libplctag already supports it |
| 6 | Configurable Connection Size per device | Yes | Cheap field knob for v19 firmware / fragmenting switches |
| 7 | Inactivity timeout / keep-alive cadence | No | Rarely an issue with libplctag-managed connections |
| 8 | Per-tag scan rate / scan groups | Yes | Standard SCADA expectation; mixed-rate tag tables |
| 9 | "Respect tag-specified scan rate" mode | No | Niche; OPC UA subscription rate already covers it |
| 10 | Initial value cache / first-update from cache | No | OPC UA subscription sampling already handles first-update |
| 11 | Multi-tag write packing | Yes | Recipe-download speed; one PDU vs N |
| 12 | AOI input / output handling | Yes | Standard modern Logix code structure |
| 13 | Native STRING / STRINGnn decoding | Yes | Table-stakes; we passthrough as String only |
| 14 | 64-bit LINT / ULINT fidelity | Yes | Correctness on Logix v32+; we silently truncate (TODO in code) |
| 15 | UDT as first-class OPC UA structured type | No | Member fan-out already works; structured-type plumbing is heavy |
| 16 | Array slice addressing `Tag[0..15]` | Yes | Perf; reads of N-element arrays in one call |
| 17 | PLC-5 / SLC bridging through CLX | No | AbLegacy driver covers this protocol family |
| 18 | Hot-standby ControlLogix redundancy | Yes | Continuous-process plants standardize on HSBY pairs |
| 19 | Diagnostic system tags (`_ConnectionStatus` etc.) | Yes | HMI dashboards bind to them; cheap given DriverHealth |
| 20 | Write deadband / write-on-change | Yes | Analog setpoints flood the PLC without it |
| 21 | Unsolicited CIP MSG ingestion | No | Separate driver in commercial; design-heavy; niche |
| 22 | CIP Generic / Class 3 passthrough | No | Niche custom-tooling territory |
| 23 | Per-device connection count / pooling | No | libplctag manages connections; premature |
| 24 | Online tag-DB refresh trigger | Yes | Cheap; avoids restart after PLC download |
|
||||
|
||||
### Notable parity (keep)
|
||||
|
||||
- libplctag-class wire layer covering ControlLogix/CompactLogix/Micro800/GuardLogix on EtherNet/IP CIP — same controller coverage as the commercial drivers (minus PLC-5/SLC).
|
||||
- Multi-hop CIP path syntax with bridge-through chassis (`1,2,2,IP,1,0` form) — matches Kepware/AVEVA routing semantics.
|
||||
- Online controller browse with program-scope vs controller-scope distinction and system-tag filtering — same shape as Kepware Auto Tag Generation.
|
||||
- CIP Template Object (class 0x6C) decoder for live UDT-shape resolution + cache — feature-parity with Kepware's structure-aware Auto Tag Generation.
|
||||
- Whole-UDT read coalescing for grouped members — matches TOP Server "logical blocking" optimisation for the cases it covers.
|
||||
- BOOL-in-DINT bit-index addressing with RMW serialisation per parent — same semantics commercial drivers expose for `Tag.N` bit access.
|
||||
- Per-PLC-family Connection Size / connected-messaging / fragment-bytes profile — mirrors the per-controller "model" picker in Kepware.
|
||||
- ALMD alarm projection with edge-detected raise/clear — reasonable parity for the alarm subset of FT Alarms & Events that those drivers do not natively translate.
|
||||
- Per-device circuit-breaker / bulkhead isolation keyed on `(driver, hostName)` — better operational story than the typical commercial gateway, which trips the whole channel on one bad device.
|
||||
- GuardLogix safety-tag write rejection at config time — explicit, matches Rockwell's safety-partition rules.
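
The BOOL-in-DINT parity item relies on a read-modify-write that is serialised per parent word, so two writers touching different bits of the same DINT cannot overwrite each other. The Python sketch below shows the technique under stated assumptions: `threading.Lock` stands in for the per-parent `SemaphoreSlim`, the `read_dint` / `write_dint` callables are hypothetical stand-ins for the parent-DINT runtime, and the class name is invented for illustration.

```python
import threading
from collections import defaultdict


def read_bit(dint_value: int, bit: int) -> bool:
    """Decode bit `bit` (0..31) out of a DINT value."""
    return bool((dint_value >> bit) & 1)


class BitInDintWriter:
    """Hypothetical helper: one lock per parent DINT guards the whole RMW cycle."""

    def __init__(self, read_dint, write_dint):
        self._read = read_dint      # callable(parent_name) -> int
        self._write = write_dint    # callable(parent_name, int) -> None
        self._locks = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()

    def _lock_for(self, parent: str) -> threading.Lock:
        # Guard the dictionary itself so two threads cannot create two locks.
        with self._locks_guard:
            return self._locks[parent]

    def write_bit(self, parent: str, bit: int, value: bool) -> None:
        if not 0 <= bit <= 31:
            raise ValueError("bit index must be 0..31 for a DINT")
        with self._lock_for(parent):
            current = self._read(parent) & 0xFFFF_FFFF
            mask = 1 << bit
            updated = (current | mask) if value else (current & ~mask)
            self._write(parent, updated & 0xFFFF_FFFF)


if __name__ == "__main__":
    store = {"Flags": 0}
    writer = BitInDintWriter(store.__getitem__, store.__setitem__)
    writer.write_bit("Flags", 3, True)
    print(store["Flags"], read_bit(store["Flags"], 3))
```

The lock covers the read as well as the write; locking only the write would still let two writers read the same stale parent value.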
|
||||
|
||||
### Sources
|
||||
|
||||
- [Kepware Allen-Bradley ControlLogix Ethernet driver overview](https://support.ptc.com/help/kepware/drivers/en/kepware/drivers/CONTROLLOGIXETHERNET/Overview.html)
|
||||
- [Kepware Logix Database Settings (offline / online ATG, L5K/L5X)](https://support.ptc.com/help/kepware/drivers/en/kepware/drivers/CONTROLLOGIXETHERNET/Logix_Database_Settings.html)
|
||||
- [Kepware Preparing for Automatic Tag Database Generation](https://support.ptc.com/help/kepware/drivers/en/kepware/drivers/CONTROLLOGIXETHERNET/Preparing_for_Automatic_Tag_Database_Generation.html)
|
||||
- [Kepware Device Properties — Scan Mode (respect tag-specified, demand poll, initial cache)](https://support.ptc.com/help/kepware/drivers/en/kepware/drivers/Device_Properties_Scan_Mode.html)
|
||||
- [Kepware Allen-Bradley ControlLogix Ethernet driver manual (PDF, 2025)](https://downloads.softwaretoolbox.com/demodnld/prod_docs/topserver_help_pdf/Common/allen-bradley-controllogix-ethernet-manual.pdf)
|
||||
- [Kepware Allen-Bradley ControlLogix Server (Unsolicited)](https://www.ptc.com/en/store/kepware/drivers/allen-bradley-controllogix-unsolicited)
|
||||
- [Kepware System Tags](https://support.ptc.com/help/kepware/kepware_edge/en/kepware/kepware-edge/system-tags.html)
|
||||
- [TOP Server ControlLogix protocol modes (symbolic / logical-blocking / logical-non-blocking)](https://blog.softwaretoolbox.com/optimizing-controllogix-protocol-modes)
|
||||
- [TOP Server Rockwell ControlLogix Ethernet OPC driver details](https://softwaretoolbox.com/top-server/rockwell-ab-controllogix-ethernet)
|
||||
- [TOP Server ControlLogix Ethernet performance optimization](https://softwaretoolbox.com/top-server/rockwell-ab-controllogix-performance)
|
||||
- [Software Toolbox FAQ — making configuration choices for ControlLogix Ethernet](https://help.softwaretoolbox.com/faq/1658)
|
||||
- [AVEVA Communication Drivers Pack — ABCIP Driver user guide (PDF)](https://cdn.logic-control.com/docs/aveva/communications-pack/OIABCIP.pdf)
|
||||
- [Wonderware DASABCIP user guide (PDF)](https://cdn.logic-control.com/media/DASABCIP.pdf)
|
||||
- [Wonderware OI.ABCIP server user guide (PDF, v7.0)](https://s3-us-west-2.amazonaws.com/wonderwarepacwest/downloads/oi-abcip-user-guide.pdf)
|
||||
- [Industrial Software Solutions — DASABCIP unsolicited message handling](https://industrial-software.com/training-support/tech-notes/74-how-configure-wonderware-dasabcip-unsolicited-message-handling/)
|
||||
- [Industrial Software Solutions — `$Sys$UpdateTagInfo` with ABCIP](https://industrial-software.com/training-support/tech-notes/119-using-sysupdatetaginfo-with-abcip-oi-servers/)
|
||||
- [AVEVA — Configure the ABCIP Communication Driver](https://docs.aveva.com/bundle/sp-cdp-drivers/page/193749.html)
|
||||
|
||||
---
|
||||
|
||||
## AbLegacy (Allen-Bradley PLC-5 / SLC / MicroLogix)
|
||||
|
||||
### What we ship today
|
||||
|
||||
- Per-device family knob: `Slc500` / `MicroLogix` / `Plc5` / `LogixPccc`, each mapped to a libplctag PLC attribute, default CIP path, max-tag-bytes, and string/long-file capability flags (`PlcFamilies/AbLegacyPlcFamilyProfile.cs:14-54`).
|
||||
- Single transport: PCCC encapsulated in EtherNet/IP via libplctag, with `ab://gateway[:port]/cip-path` host strings supporting CLX-bridged routing (`AbLegacyHostAddress.cs:14-52`).
|
||||
- File-letter set: `N`, `F`, `B`, `L`, `ST`, `T`, `C`, `R`, `I`, `O`, `S`, `A` parsed and validated; trailing `/N` bit index and `.SUBELEMENT` (ACC/PRE/EN/DN/TT/CU/CD/LEN/POS/ER) recognised (`AbLegacyAddress.cs:97-101`, `AbLegacyDataType.cs:9-29`).
|
||||
- Data types: `Bit`, `Int` (N and A files), `Long` (L), `Float` (F), `String` (ST), `TimerElement`, `CounterElement`, `ControlElement` — all surfacing as `Boolean` / `Int32` / `Float32` / `String` driver types (`AbLegacyDataType.cs:34-44`).
|
||||
- Bit-within-N-word write path: read-modify-write against a parent-word runtime, serialised by per-parent `SemaphoreSlim` (`AbLegacyDriver.cs:353-409`).
|
||||
- Polling overlay via shared `PollGroupEngine` exposed through `ISubscribable`; per-publishing-interval grouping (`AbLegacyDriver.cs:268-276`).
|
||||
- Connectivity probe loop per device (default `S:0`, configurable interval/timeout) emitting `HostStatusChangedEventArgs` transitions (`AbLegacyDriver.cs:283-336`, `AbLegacyDriverOptions.cs:36-44`).
|
||||
- Capability surfaces: `IDriver`, `IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IPerCallHostResolver` — flat `AbLegacy/<host>/<tag>` browse tree built from static config (`AbLegacyDriver.cs:11-12`, `238-264`).
|
||||
- Static-config tag list only (`AbLegacyTagDefinition`); writes can be flagged `Writable=false` and `WriteIdempotent=true` (`AbLegacyDriverOptions.cs:28-34`).
|
||||
- Status mapping for libplctag error codes to OPC UA StatusCodes (`AbLegacyStatusMapper.cs`).
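
The address grammar described above (file letter, optional file number, word, optional `.SUBELEMENT`, optional `/N` bit) can be sketched as a small parser. This Python version is an illustration under assumptions, not the logic in `AbLegacyAddress.cs`: the regular expression, the `PcccAddress` type, and the choice to allow a missing file number (as in `S:0`) are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# File letters listed in the section above.
KNOWN_FILE_LETTERS = {"N", "F", "B", "L", "ST", "T", "C", "R", "I", "O", "S", "A"}

_ADDRESS = re.compile(
    r"^(?P<letter>ST|[A-Z])(?P<file>\d*):(?P<word>\d+)"
    r"(?:\.(?P<sub>[A-Z]+))?(?:/(?P<bit>\d+))?$"
)


@dataclass(frozen=True)
class PcccAddress:
    """Hypothetical parsed form of an address such as N7:0/3 or T4:0.ACC."""
    letter: str
    file_number: Optional[int]
    word: int
    sub_element: Optional[str]
    bit: Optional[int]


def parse_pccc_address(text: str) -> PcccAddress:
    match = _ADDRESS.match(text.strip().upper())
    if match is None or match["letter"] not in KNOWN_FILE_LETTERS:
        raise ValueError(f"unsupported PCCC address: {text!r}")
    return PcccAddress(
        letter=match["letter"],
        file_number=int(match["file"]) if match["file"] else None,
        word=int(match["word"]),
        sub_element=match["sub"],
        bit=int(match["bit"]) if match["bit"] is not None else None,
    )


if __name__ == "__main__":
    for sample in ("N7:0/3", "T4:0.ACC", "S:0", "ST9:0"):
        print(sample, "->", parse_pccc_address(sample))
```

A grammar limited to these file letters also rejects MicroLogix function-file names such as `RTC:0.HR`, because `RTC` is not in the letter set.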
|
||||
|
||||
### Gaps vs commercial gateways
|
||||
|
||||
- **[Skip]** **Serial DF1 transports (full-duplex, half-duplex master/slave, KF2/KF3, radio modem)** — present in: both. Why: libplctag PCCC is Ethernet-only; no COM-port path means PLC-5/SLC/ML serial deployments are unreachable.
|
||||
- **[Build]** **DH+ via 1756-DHRIO / 1784-PKTX gateway routing** — present in: both. Why: DH+ Gateway is the canonical way to reach PLC-5 nodes through a CLX rack today; we expose a CIP path but no station-number addressing or DH+ link-id concept.
|
||||
- **[Skip]** **DH-485 routing through 1761-NET-AIC / 1747-AIC** — present in: both. Why: MicroLogix 1000/1200 and SLC 5/03 multi-drop deployments need DH-485 station addressing.
|
||||
- **[Skip]** **`M0` / `M1` module file access (block-transfer / RIO data)** — present in: Kepware, AVEVA. Why: Required for any PLC-5 with RIO modules or specialty cards (motion, weigh, vision); PCCC has dedicated frames.
|
||||
- **[Build]** **`PD` (PID), `MG` (Message), `PLS` (programmable limit switch), `BT` (block transfer) function/structure files** — present in: both. Why: Standard SLC/PLC-5 file types for PID loops and message instructions; we cap at T/C/R structures only.
|
||||
- **[Skip]** **`D` (BCD) and Long-BCD types** — present in: both. Why: Some legacy SLC/PLC-5 programs store recipe / setpoint data as packed BCD; we only ship binary `Int`/`Long`.
|
||||
- **[Build]** **PLC-5 octal addressing for I/O word/bit (`I:001/17`)** — present in: both. Why: Native PLC-5 documentation and RSLogix 5 use octal; a decimal-only parser misreads or rejects real configs.
|
||||
- **[Build]** **Indirect / indexed addressing (`N7:[N7:0]`, `N[N7:0]:5`)** — present in: both. Why: Common pattern for recipe / batch lookup tables; libplctag supports it but our parser only accepts literal `<letter><file>:<word>`.
|
||||
- **[Build]** **Array reads / contiguous block addressing (`N7:0,10` or `N7:0[10]`)** — present in: both. Why: One PCCC request can pull up to ~120 words; without array syntax, reading N adjacent words takes N round-trips and block-read sizing cannot be optimised.
|
||||
- **[Build]** **String-file (`ST`) read/write path in production** — present in: both. Why: Type is enum-listed but `AbLegacyDataTypeExtensions.ToDriverDataType` maps to `String` only; ST is an 82-byte fixed buffer with a length word and we have no integration coverage to confirm round-trip.
|
||||
- **[Build]** **Sub-element predefined symbol coverage (timer `.PRE/.ACC/.EN/.TT/.DN`, counter `.CU/.CD/.OV/.UN`, control `.LEN/.POS/.ER/.UL/.IN/.FD`)** — present in: both. Why: Parser admits any all-letters sub-element but the `TimerElement/CounterElement/ControlElement` types collapse to a single `Int32`, losing per-bit Boolean semantics that HMIs expect (`.DN` should be Bit, not Int32).
|
||||
- **[Skip]** **Block read-size negotiation per family** — present in: both. Why: We carry `MaxTagBytes` as a constant but never plumb it into a request optimiser; libplctag's PCCC chunking is implicit and not tunable per-tag-group.
|
||||
- **[Build]** **Auto-demote on comm failure** — present in: both. Why: Kepware/TOP Server temporarily off-scan a non-responsive device for N seconds so other devices on the channel keep flowing; we only switch a `HostState` flag and keep retrying.
|
||||
- **[Skip]** **Communication serialisation across multiple devices on one channel** — present in: both. Why: DH+/DF1 networks share a single physical link; we have no channel concept, so a slow PLC-5 can starve a fast SLC on the same DH+ link.
|
||||
- **[Build]** **RSLogix 500 (`.RSS`) / RSLogix 5 (`.RSP`) / `.SLC` symbol & data-table import for automatic tag generation** — present in: both (DF1, AB Ethernet drivers). Why: Manual `AbLegacyTagDefinition` entries scale poorly; commercial tools parse RSLogix exports to seed tags and descriptions.
|
||||
- **[Skip]** **Online browse / data-table discovery from the controller** — present in: Kepware (Create-from-Device). Why: PCCC has a "read file directory" frame; we don't issue it, so `DiscoverAsync` only ever returns the static config.
|
||||
- **[Skip]** **DF1 error checking selection (BCC vs CRC-16)** — present in: both. Why: Some serial gear (older modems) only does BCC; not applicable until serial transport ships, but flagged for parity.
|
||||
- **[Build]** **Per-tag deadband / change filter on subscriptions** — present in: both. Why: Polling overlay publishes every poll; commercial drivers suppress no-op publishes by absolute-deadband or scaling.
|
||||
- **[Skip]** **PLC-5 typed-write / typed-read selection vs SLC protected typed reads** — present in: both. Why: Kepware exposes "Optimization Method" and "Force Logical=Yes" knobs that materially affect performance on slower processors; we use libplctag defaults silently.
|
||||
- **[Build]** **Diagnostic counters (request count, response time, retries, last-error per device, comm-failures)** — present in: both (built-in `_System` / `_DiagnosticTags`). Why: We surface a `DriverHealth` enum but no per-device tag-level diagnostics for an HMI to bind to.
|
||||
- **[Build]** **Per-device timeout / retry overrides** — present in: both. Why: We have one driver-wide `Timeout` (`AbLegacyDriverOptions.cs:16`) and one probe timeout; SLC 5/01 vs SLC 5/05 vs MicroLogix 1100 need very different values on a shared driver.
|
||||
- **[Skip]** **Write completion semantics — synchronous-confirmation vs queued** — present in: both. Why: Commercial drivers offer "write optimization (latest value only / write-through / disable)"; ours always writes through, which floods slow channels with redundant writes.
|
||||
- **[Build]** **MicroLogix-specific item naming (e.g. `RTC:0.HR`, `HSC:0`, `DLS:0` for daylight savings)** — present in: both. Why: MicroLogix 1100/1400 have proprietary function files that don't share file letters with SLC and our `IsKnownFileLetter` whitelist rejects them.
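
The PLC-5 octal gap above comes down to one conversion: in `I:001/17` both the word and the bit are octal, so bit `17` is bit 15 in decimal terms. The Python sketch below illustrates that conversion; the function name and the returned tuple shape are hypothetical, and only the `I` / `O` forms are handled.

```python
import re

_PLC5_IO = re.compile(r"^(?P<letter>[IO]):(?P<word>\d+)(?:/(?P<bit>\d+))?$")


def parse_plc5_io_octal(text: str):
    """Return (letter, word, bit) with word and bit converted from octal.

    Raises ValueError for malformed input, including digits 8 or 9,
    which cannot occur in an octal field.
    """
    match = _PLC5_IO.match(text.strip().upper())
    if match is None:
        raise ValueError(f"not a PLC-5 I/O address: {text!r}")
    word = int(match["word"], 8)
    bit = int(match["bit"], 8) if match["bit"] is not None else None
    return match["letter"], word, bit


if __name__ == "__main__":
    print(parse_plc5_io_octal("I:001/17"))
```

A parser that read the same string as decimal would address bit 17, which does not exist in a 16-bit word, or quietly pick the wrong word for addresses such as `O:012`.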
|
||||
|
||||
### Recommendations
|
||||
|
||||
| # | Gap | Build? | Rationale |
|---|-----|:------:|-----------|
| 1 | Serial DF1 transports | No | Declining install base; libplctag has no serial path; major scope |
| 2 | DH+ via 1756-DHRIO bridging | Yes | Real-world PLC-5 path; libplctag CIP routing already supports it |
| 3 | DH-485 routing (1761/1747-AIC) | No | Very legacy; rare in greenfield |
| 4 | M0 / M1 module file access | No | Niche RIO modules; declining |
| 5 | PD / MG / PLS / BT files | Yes | PID files are common in real SLC programs |
| 6 | D (BCD) and Long-BCD types | No | Very legacy data convention |
| 7 | PLC-5 octal addressing | Yes | Correctness for actual PLC-5 sites |
| 8 | Indirect / indexed addressing | Yes | Standard recipe / lookup pattern |
| 9 | Array contiguous block addressing | Yes | Big perf gain; one PCCC frame vs N |
| 10 | ST string read / write production verification | Yes | Type is enum-listed but untested; cheap to validate |
| 11 | Sub-element bit semantics (`.DN` as Bit, etc.) | Yes | Correctness; HMIs expect Boolean for `.DN`/`.EN`/`.TT` |
| 12 | Block read-size negotiation per family | No | libplctag handles chunking implicitly |
| 13 | Auto-demote on comm failure | Yes | Standard SCADA resilience; one slow PLC starves fast ones |
| 14 | Channel-shared comm serialisation | No | Only matters for serial / DH+ (transport not built) |
| 15 | RSLogix 500/5 (.RSS / .RSP) symbol import | Yes | Workflow parity; manual config doesn't scale |
| 16 | Online controller browse / data-table discovery | No | PCCC dir frame limited; libplctag support unclear |
| 17 | DF1 BCC vs CRC-16 selection | No | Predicated on DF1 transport (gap #1) |
| 18 | Per-tag deadband / change filter | Yes | Polling overlay floods every poll without it |
| 19 | PLC-5 typed-read selection / Force Logical | No | libplctag defaults are sound; niche tuning |
| 20 | Diagnostic counters as tags | Yes | HMI binding; cheap given existing health probe |
| 21 | Per-device timeout / retry overrides | Yes | SLC 5/01 vs 5/05 vs ML1100 differ; cheap |
| 22 | Write completion semantics options | No | Niche tuning; current write-through is safe default |
| 23 | MicroLogix function-file naming (RTC/HSC/DLS) | Yes | Correctness for ML1100/1400 deployments |
|
||||
|
||||
### Notable parity (keep)
|
||||
|
||||
- Family enum + per-family profile keeps SLC 500 / MicroLogix / PLC-5 / LogixPccc-mode behavioural differences explicit instead of probed at runtime (`PlcFamilies/AbLegacyPlcFamilyProfile.cs:14-54`).
|
||||
- ControlLogix-bridged routing string (`ab://gw/1,0`) matches Kepware's "Routing Path" concept and is how real PLC-5 deployments are reached today (`AbLegacyHostAddress.cs:14-52`).
|
||||
- Bit-within-N-word RMW with per-parent serialisation prevents the classic two-writer-tear bug other drivers ship (`AbLegacyDriver.cs:353-384`).
|
||||
- Probe loop with explicit `HostState` transitions gives a cleaner diagnostic surface than Kepware's lump-sum auto-demote (`AbLegacyDriver.cs:283-336`).
|
||||
- Status-file probe (`S:0`) is the same heartbeat Rockwell HMIs traditionally use, and it's family-agnostic (`AbLegacyDriverOptions.cs:43`).
|
||||
- libplctag back-end inherits ongoing community fixes for PCCC frame edge-cases without us owning the wire decoder.
|
||||
|
||||
### Sources
|
||||
|
||||
- [Kepware Allen-Bradley Ethernet Driver Manual (PDF)](https://cdn.logic-control.com/docs/kepware/Manuals/Drivers/Allen-Bradley/Allen-Bradley%20Ethernet%20Driver.pdf)
|
||||
- [Kepware Allen-Bradley DF1 Driver Manual (PDF)](https://cdn.logic-control.com/docs/kepware/Manuals/Drivers/Allen-Bradley/Allen-Bradley%20DF1%20Driver.pdf)
|
||||
- [Kepware Allen-Bradley ControlLogix Ethernet Driver Manual (PDF, 2025)](https://downloads.softwaretoolbox.com/demodnld/prod_docs/topserver_help_pdf/Common/allen-bradley-controllogix-ethernet-manual.pdf)
|
||||
- [Kepware Allen-Bradley ControlLogix Driver Manual (PDF, 2017)](https://ftp.softwaretoolbox.com/demodnld/prod_docs/topserver_help_pdf/v5_20/controllogix_ethernet.pdf)
|
||||
- [Kepware Allen-Bradley Ethernet driver product page](https://www.kepware.com/en-us/products/kepserverex/drivers/allen-bradley-ethernet/)
|
||||
- [TOP Server Rockwell DF1 Serial driver](https://softwaretoolbox.com/top-server/rockwell-ab-df1)
|
||||
- [AVEVA Communication Drivers Pack 2023 R2 readme](https://www.wmkit.com/archives/aveva-communication-drivers-pack-2023-r2-readme.html)
|
||||
- [AVEVA Communication Drivers Pack 2020 R2 readme](https://industrial-software.com/wp-content/uploads/Communication_Drivers/oi-communication-drivers-pack-2020-r2/Readme.html)
|
||||
- [AVEVA Communication Drivers datasheet (PDF)](https://www.aveva.com/content/dam/aveva/documents/datasheets/Datasheet_AVEVA-CommunicationDrivers_11-19.pdf)
|
||||
- [AVEVA OI ABCIP user guide (PDF)](https://s3-us-west-2.amazonaws.com/wonderwarepacwest/downloads/oi-abcip-user-guide.pdf)
|
||||
- [Kepware Logix Database Settings (Create-from-Device / .L5K import)](https://support.ptc.com/help/kepware/drivers/en/kepware/drivers/CONTROLLOGIXETHERNET/Logix_Database_Settings.html)
|
||||
- [Rockwell DF1 Protocol and Command Set reference (1770-RM516, PDF)](https://literature.rockwellautomation.com/idc/groups/literature/documents/rm/1770-rm516_-en-p.pdf)
|
||||
|
||||
---
|
||||
|
||||
## FOCAS (Fanuc CNC)
|
||||
|
||||
### What we ship today
|
||||
|
||||
- TCP-only Ethernet transport on port 8193 via the pure-managed `Focas.Wire` client; no Fwlib DLL, no P/Invoke, no out-of-process Tier-C host (`docs/drivers/FOCAS.md:8-13`, retired Host noted at `:25-27`).
|
||||
- One driver instance can host N CNCs, each keyed by `focas://{ip}[:{port}]` (`FocasDriverOptions.cs:10`, `FocasDeviceOptions:92-95`).
|
||||
- Per-device CNC series declaration (`Zero_i_D/F/MF/TF`, `Sixteen_i`, `Thirty_i`, `ThirtyOne_i`, `ThirtyTwo_i`, `PowerMotion_i`, `Unknown`) with init-time capability matrix validating macro / parameter / PMC ranges per series (`FocasCncSeries.cs:21-47`, `FocasCapabilityMatrix.cs:29-138`).
|
||||
- User-authored tag addressing for: PMC bits/bytes (`X0.0`, `R100`, `R100.3`), CNC parameters (`PARAM:1815/0`), and macro variables (`MACRO:500`) — wired through `cnc_rdpmcrng` / `cnc_rdparam` / `cnc_rdmacro` (`docs/drivers/FOCAS.md:62-66, 90`).
|
||||
- Atomic data types: Bit, Byte, Int16, Int32, Float32, Float64, String (`FocasDataType.cs:10-26`).
|
||||
- Read-only by design — `WriteAsync` returns `BadNotWritable`; no `cnc_wrparam` / `pmc_wrpmcrng` / `cnc_wrmacro` paths exist (`FocasDriver.cs:222-279`, `docs/drivers/FOCAS.md:17-18, 91`).
|
||||
- Optional `FixedTree` auto-populated subtree per device (`FocasFixedTreeOptions:26-51`) populated at bootstrap from `cnc_sysinfo` + `cnc_rdaxisname` + `cnc_rdspdlname`, polled at three cadences (axis 250 ms, program 1 s, timer 30 s):
  - `Identity/` — `SeriesNumber`, `Version`, `MaxAxes`, `CncType`, `MtType`, `AxisCount` (`FocasDriver.cs:299-304`).
  - `Axes/{name}/` — `AbsolutePosition`, `MachinePosition`, `RelativePosition`, `DistanceToGo`, `ServoLoad` (cap-gated) (`FocasDriver.cs:307-316`).
  - `Axes/FeedRate/Actual`, `Axes/SpindleSpeed/Actual` (single-channel rates — first axis only, `FocasDriver.cs:317-318`, `:646-651`).
  - `Spindle/{name}/Load`, `Spindle/{name}/MaxRpm` (cap-gated, multi-spindle aware) (`FocasDriver.cs:323-336`).
  - `Program/Name`, `ONumber`, `Number`, `MainNumber`, `Sequence`, `BlockCount` (`FocasDriver.cs:339-347`).
  - `OperationMode/Mode` + `ModeText` ("MDI"/"AUTO"/"EDIT"/"HANDLE"/"JOG"/"TEACH_IN_HANDLE"/"REFERENCE"/"REMOTE"/"TEST"/"TJOG") (`IFocasClient.cs:213-226`).
  - `Timers/PowerOnSeconds`, `OperatingSeconds`, `CuttingSeconds`, `CycleSeconds` (`FocasDriver.cs:355-362`).
|
||||
- Per-series node suppression: optional API probes at bootstrap, `EW_FUNC` / `EW_NOOPT` / `EW_VERSION` causes the corresponding subtree to not be emitted (`docs/drivers/FOCAS.md:134-142`, `FocasDriver.cs:497-526`).
|
||||
- Active-alarm projection via `IAlarmSource` (opt-in, polls `cnc_rdalmmsg2` at 2 s default), differential raise/clear with mapped alarm types `Parameter / PulseCode / Overtravel / Overheat / Servo / DataIo / MemoryCheck / MacroAlarm`, severity buckets, and ack as no-op (`FocasAlarmProjectionOptions:79-85`, `IFocasClient.cs:275-287`, `docs/drivers/FOCAS.md:154-181`).
|
||||
- Connectivity probe via `cnc_rdcncstat` on configurable interval; transitions fire `OnHostStatusChanged` (`FocasProbeOptions:110-115`, `docs/drivers/FOCAS.md:94`).
|
||||
- Optional proactive handle-recycle loop to release FWLIB session handles on a cadence (defends against the documented handle-leak bugs and finite ~5–10 connection pool) (`FocasHandleRecycleOptions:68-72`, `docs/drivers/FOCAS.md:184-205`).
|
||||
- Subscriptions are emulated via the shared `PollGroupEngine` (FOCAS has no push) (`FocasDriver.cs:451-461`).
|
||||
- `IPerCallHostResolver` so each tag's reads route to its declared device, enabling per-host bulkhead resilience (decision #144) (`FocasDriver.cs:850-857`, `FocasDriverOptions.cs:3-7`).
|
||||
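The differential raise/clear behaviour described for the active-alarm projection can be sketched as a set difference between two consecutive polls. This is a minimal Python illustration, not the driver's C# code: the `ActiveAlarm` shape, the `(alarm_type, number)` key, and the `diff_alarms` name are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveAlarm:
    """One entry from a poll of the active-alarm list (hypothetical shape)."""
    alarm_type: str  # e.g. "Servo" or "Overtravel"
    number: int
    message: str


def diff_alarms(previous, current):
    """Return (raised, cleared) between two polls.

    Alarms are keyed on (alarm_type, number); that key is an assumption
    of this sketch, not something the driver documentation specifies.
    """
    prev_by_key = {(a.alarm_type, a.number): a for a in previous}
    curr_by_key = {(a.alarm_type, a.number): a for a in current}
    raised = [a for k, a in curr_by_key.items() if k not in prev_by_key]
    cleared = [a for k, a in prev_by_key.items() if k not in curr_by_key]
    return raised, cleared


# Two consecutive polls: the servo alarm persists, one clears, one is new.
poll_1 = [ActiveAlarm("Servo", 401, "VRDY OFF"),
          ActiveAlarm("Overtravel", 500, "+X")]
poll_2 = [ActiveAlarm("Servo", 401, "VRDY OFF"),
          ActiveAlarm("Overheat", 700, "CONTROL UNIT")]
raised, cleared = diff_alarms(poll_1, poll_2)
```

Only the entries in `raised` and `cleared` would produce events; an alarm present in both polls produces nothing, which is what keeps a 2 s poll from re-raising the same condition.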
|
||||
### Gaps vs commercial gateways / MTConnect adapters
|
||||
|
||||
- **[Build]** **Writes (parameters / PMC / macro)** — Kepware "Fanuc Focas HSSB and Ethernet Driver", Ignition Fanuc, Memex Merlin, Predator MDC. Why: Macro / PMC writes are the canonical mechanism for DPRNT-free supervisory feedback to ladder logic; we explicitly return `BadNotWritable`.
|
||||
- **[Skip]** **HSSB (high-speed serial bus) transport** — Kepware, MTConnect Fanuc Adapter (Cincinnati), Memex. Why: HSSB is the only path on machines with no FOCAS Ethernet option licensed; we are TCP:8193 only, no `hssb` discovery, no PCI handle.
|
||||
- **[Build]** **FOCAS password / unlock parameter** — Kepware ("Password" property), MTConnect adapter. Why: Some controllers gate `cnc_wrparam` and certain reads behind a connection-level password; we have no such property in `FocasDeviceOptions`.
|
||||
- **[Build]** **Multi-path / multi-channel CNC support** — Kepware (Path number 1..n), MTConnect (per-path Components). Why: 30i/31i/32i can host 2-10 paths each with their own program / position / mode; our `cnc_setpath`-equivalent never runs and the fixed tree implicitly assumes path 1.
|
||||
- **[Skip]** **Series 15, Series 15i, Power Mate D/H, Series 35i** — Kepware lists 15/15i, MTConnect adapter handles legacy. Why: Our `FocasCncSeries` enum stops at Power Motion i + 16i; legacy Series 15 deployments would either fail validation or be forced to `Unknown`.
|
||||
- **[Build]** **`cnc_getfigure` decimal scaling** — Kepware, MTConnect, Memex. Why: Position values are exposed as raw scaled ints (Float64-typed) and we punt the divide-by-10^N onto the client; commercial gateways present pre-scaled millimeters/inches. (Acknowledged TODO in `docs/drivers/FOCAS.md:144-148`.)
|
||||
- **[Build]** **G-code / modal info (`cnc_modal`)** — Kepware ModalCodes group, MTConnect (FunctionalMode, MotionMode, PlaneCode, etc.), Ignition. Why: Modal G/M-code state (G54 active, G90/91, G17/18/19, M03/04/05, S/F overrides) is one of the most-asked CNC tag groups; we have neither a fixed-tree exposure nor a `MODAL:` address scheme.
|
||||
- **[Build]** **Tool number, current tool, tool life management** — Kepware (T-code, ToolLife group), MTConnect (`ToolNumber`, `ToolGroup`), Memex, Predator MDC. Why: Live `cnc_rdtlife*` / current T-code are core MES integration data; absent.
|
||||
- **[Skip]** **Tool offset table read/write (`cnc_rdtofs` / `cnc_wrtofs`)** — Kepware, Ignition. Why: Tool length / wear / radius compensation tables are often supervisory-edited; we have no `TOFS:` address scheme.
|
||||
- **[Build]** **Work coordinate offsets (G54..G59 + extended via `cnc_rdzofs` / `cnc_wrzofs`)** — Kepware "WorkOffsets" group, MTConnect (`PartCount` and `WorkCoordinate`). Why: Setup automation needs to read/poke work offsets; absent.
|
||||
- **[Build]** **Override values (Feedrate %, Rapid %, Spindle %, Jog %)** — Kepware OverrideGroup, MTConnect (`PathFeedrateOverride`, `RotaryVelocityOverride`). Why: Operator-modulated speeds are crucial for OEE/MES; not in the dynamic snapshot.
|
||||
- **[Build]** **Status / running flags surfaced as nodes (Auto, Run, Motion, Mstb, EmergencyStop, Edit, Tmmode, Alarm bool)** — MTConnect adapter exposes `Execution`, `ControllerMode`, `EmergencyStop` directly. Why: We poll `cnc_rdcncstat` only as a Boolean probe; the 9-field ODBST struct (hdck/tmmode/aut/run/motion/mstb/emergency/alarm/edit) is never projected to nodes.
|
||||
- **[Build]** **Parts count / required parts (`cnc_rdparam` 6711/6712/6713)** — Kepware "PartCount", MTConnect `PartCountAct/Min/Max`. Why: Part counters are MES bread-and-butter; reachable today only by user-authored `PARAM:6711` tag, not in the fixed tree.
|
||||
- **[Build]** **Diagnostic numbers (`cnc_rddiag` / `cnc_rddiagdgn`)** — Kepware Diagnostic group, MTConnect. Why: Servo/spindle diagnostics (axis position errors, current, temperature) are essential for predictive maintenance; no `DIAG:` address scheme.
|
||||
- **[Build]** **PMC data ranges (D/T/C/K/F/G addresses) for Series 16i** — partially limited by our matrix (`PmcLetters(Sixteen_i)` only allows X/Y/R/D, `FocasCapabilityMatrix.cs:80`). Why: Real 16i ladders use F/G signals for handshakes; users would have to set Series=Unknown to bypass validation.
|
||||
- **[Build]** **Bulk PMC range read (`pmc_rdpmcrng` multi-byte)** — Kepware coalesces consecutive PMC bytes; we issue one request per tag. Why: One TCP round trip per PMC byte does not scale to large tag counts; commercial drivers batch into ranges of up to 1 KB.
|
||||
- **[Build]** **Alarm history (`cnc_rdalmhistry` / `cnc_rdalmhistry5`)** — MTConnect adapter, Memex. Why: Acked alarms persist in a CNC ring buffer; we surface only the active alarm list.
|
||||
- **[Build]** **External operator messages (`cnc_rdopmsg` / `cnc_rdopmsg2` / `cnc_rdopmsg3`)** — Kepware OpMessage tag, MTConnect (`Message` data item). Why: Macro programmers display operator messages via #3006 / G65 P9099 etc.; not exposed.
|
||||
- **[Skip]** **Program list / upload / download / delete (`cnc_rdprogdir` / `cnc_upstart` / `cnc_dnstart` family)** — Kepware program-management group, Predator MDC, Memex Merlin. Why: DNC drip-feed is a primary use case for MDC products; entirely absent.
|
||||
- **[Build]** **Currently-executing program text (`cnc_rdactpt` / `cnc_rdexecprog`)** — Kepware "CurrentProgram", MTConnect `Block` and `Line`. Why: Live block display / current sequence content; we expose `Sequence` (number) but not the block text.
|
||||
- **[Skip]** **DPRNT / external data input (`cnc_rdmacrohk` / external macro)** — Predator MDC, Forcam, Memex (DPRNT collector). Why: DPRNT is the standard 1980s-vintage CNC-to-MES messaging path; we have no DPRNT TCP listener and no macro-call subscription.
|
||||
- **[Skip]** **Servo / spindle deep info (`cnc_rdsvinfo` / `cnc_rdspinfo`)** — Kepware, Memex. Why: Servo cycle counts, spindle motor speed/temp; absent (we only expose load percent).
|
||||
- **[Skip]** **Per-axis acceleration / jerk / feed-per-rev** — MTConnect (`AccelerationSpec`, `Jerk`, `Feedrate`). Why: Beyond actual feed; absent.
|
||||
- **[Build]** **Cycle time per part / last cycle time / cycle start timestamp** — MTConnect (`ProcessTimer`), Memex. Why: We expose accumulating timers but not "last completed cycle" deltas.
|
||||
- **[Skip]** **`cnc_rdrelpos` reset / preset, `cnc_setpath`, `cnc_wrabsmac`** — operator-style write commands. Why: Read-only-by-design covers it, but commercial parity assumes selective writes.
|
||||
- **[Skip]** **CNC time/date sync (`cnc_rdtimer` clock variant / `cnc_rtime`)** — Kepware, Memex. Why: Setting CNC system clock from a master time source is common in audited environments; absent.
|
||||
- **[Build]** **Connection-level statistics + retry counters surfaced as variables** — Kepware. Why: Kepware exposes per-channel stats; we publish health but not as variables.
|
||||
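The `cnc_getfigure` gap in the list above comes down to one division: the CNC reports a position as a scaled integer plus a decimal-place count, and the pre-scaled value is the integer divided by 10^N. A minimal Python sketch of that arithmetic follows; `scale_position` and its parameters are hypothetical names, and where the decimal-place count comes from (presumably `cnc_getfigure`) is outside the sketch.

```python
def scale_position(raw: int, decimal_places: int) -> float:
    """Convert a raw scaled integer position into engineering units.

    raw            -- integer as returned by the position read
    decimal_places -- number of implied decimal digits (N in 10^N)
    """
    if decimal_places < 0:
        raise ValueError("decimal_places must be non-negative")
    return raw / (10 ** decimal_places)


# A raw value of 123456 with 3 implied decimals is 123.456 units.
scaled = scale_position(123456, 3)
```

Doing this once in the driver means every downstream client sees the same units, rather than each client needing to know the per-axis decimal-place count.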
|
||||
### Recommendations
|
||||
|
||||
| # | Gap | Build? | Rationale |
|
||||
|---|-----|:------:|-----------|
|
||||
| 1 | Writes (parameters / PMC / macro) | Yes | Key MES feedback path; current read-only is too narrow |
|
||||
| 2 | HSSB transport | No | PCI hardware; declining; reopens fwlib distribution problem |
|
||||
| 3 | FOCAS password / unlock | Yes | Cheap once writes ship; some controllers gate reads too |
|
||||
| 4 | Multi-path / multi-channel CNC | Yes | 30i/31i/32i routinely have multiple paths |
|
||||
| 5 | Series 15 / Power Mate D-H / Series 35i | No | Very legacy; small install base |
|
||||
| 6 | `cnc_getfigure` decimal scaling | Yes | Already TODO; clients shouldn't compute scaling |
|
||||
| 7 | Modal G-code / M-code state | Yes | One of the most-asked CNC tag groups |
|
||||
| 8 | Tool number / tool life management | Yes | Core MES integration data |
|
||||
| 9 | Tool offset table read / write | No | Write-heavy; defer with general write decision |
|
||||
| 10 | Work coordinate offsets (G54..) | Yes | Setup automation needs read / poke |
|
||||
| 11 | Override values (Feed / Rapid / Spindle / Jog) | Yes | OEE / MES bread-and-butter |
|
||||
| 12 | ODBST status flags as nodes | Yes | Cheap; project the 9 fields we already read |
|
||||
| 13 | Parts count in fixed tree | Yes | MES table-stakes; simple `cnc_rdparam` projection |
|
||||
| 14 | Diagnostic numbers (`cnc_rddiag`) | Yes | Predictive maintenance |
|
||||
| 15 | PMC F / G letters for 16i | Yes | Correctness; real ladders use F/G handshakes |
|
||||
| 16 | Bulk PMC range read | Yes | Big perf gain at scale |
|
||||
| 17 | Alarm history (`cnc_rdalmhistry`) | Yes | Auditing; small extension to alarm projection |
|
||||
| 18 | Operator messages (`cnc_rdopmsg*`) | Yes | Cheap; common macro feedback |
|
||||
| 19 | Program list / upload / download / delete | No | DNC product territory; significant scope |
|
||||
| 20 | Currently-executing program text | Yes | HMI displays expect block view |
|
||||
| 21 | DPRNT TCP listener | No | Significant scope; modern paths supersede it |
|
||||
| 22 | Servo / spindle deep info | No | Specialty; load% covers most needs |
|
||||
| 23 | Per-axis acceleration / jerk / feed-per-rev | No | Niche advanced telemetry |
|
||||
| 24 | Cycle time per part / last cycle delta | Yes | OEE-essential |
|
||||
| 25 | Operator write commands (preset etc.) | No | Read-only design choice; revisit only with general writes |
|
||||
| 26 | CNC time / date sync | No | Rare ask; commonly handled by CNC NTP |
|
||||
| 27 | Connection statistics as variables | Yes | Cheap given existing health |
|
||||
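Row 16 (bulk PMC range read) is mostly an address-planning problem: group the configured byte offsets into contiguous ranges so that each range becomes one request. The Python sketch below shows one way to do that planning under stated assumptions: `coalesce_byte_ranges` is a hypothetical helper, the 1024-byte default cap mirrors the "up to 1 KB" figure quoted in the gap list, and `max_gap` (reading a few unused bytes to avoid a second request) is an illustrative extra.

```python
def coalesce_byte_ranges(addresses, max_span=1024, max_gap=0):
    """Group byte offsets into (start, length) ranges.

    addresses -- iterable of integer byte offsets within one PMC area
    max_span  -- largest range, in bytes, allowed in a single request
    max_gap   -- unused bytes tolerated between two merged offsets
    """
    ranges = []  # list of (start, end) with end inclusive
    for addr in sorted(set(addresses)):
        if ranges:
            start, end = ranges[-1]
            close_enough = addr - end <= max_gap + 1
            fits = addr - start + 1 <= max_span
            if close_enough and fits:
                ranges[-1] = (start, addr)
                continue
        ranges.append((addr, addr))
    return [(start, end - start + 1) for start, end in ranges]


# Five tags collapse into two requests instead of five.
plan = coalesce_byte_ranges([100, 101, 102, 200, 103])
```

Each tag's value is then sliced out of the returned buffer at `address - start`, so the per-tag read API can stay unchanged while the wire traffic drops.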
|
||||
### Notable parity (keep)
|
||||
|
||||
- Pure-managed wire client (no Fwlib distribution problem) — significant operational win vs Kepware's HSSB driver DLL stack.
|
||||
- Per-series capability matrix at `InitializeAsync` time prevents silent runtime `BadOutOfRange` on misconfigured macro/parameter/PMC numbers.
|
||||
- Fixed-tree per-API capability probes auto-suppress nodes the CNC doesn't support — operators don't see nodes that perpetually return `BadDeviceFailure`.
|
||||
- `IPerCallHostResolver` integrates each device into the shared resilience bulkhead (Phase 6.1) — comparable to Kepware's per-device "channel" isolation.
|
||||
- Three-tier poll cadence (axis fast / program medium / timer slow) is closer to MTConnect adapter behaviour than Kepware's single-rate channel scan.
|
||||
- Handle-recycle loop is a thoughtful defence against documented Fanuc handle-leak firmware bugs — not present in many commercial drivers.
|
||||
- Alarm projection differentiates raise vs clear and maps `ALM_TYPE_*` to OPC UA severity buckets — closer to A&E semantics than the simple "alarm bit" Kepware exposes.
|
||||
|
||||
### Sources
|
||||
|
||||
- https://www.kepware.com/en-us/products/kepserverex/drivers/fanuc-focas-hssb-ethernet/ — Kepware Fanuc Focas HSSB and Ethernet Driver
|
||||
- https://github.com/mtconnect/cppagent_dev/tree/main/agent/adapter/fanuc — MTConnect Fanuc adapter reference
|
||||
- https://github.com/Ladder99/focas-mock — managed Focas wire client (the OSS basis we consume)
|
||||
- https://www.inductiveautomation.com/exchange/2218 — Ignition Fanuc FOCAS driver module
|
||||
- https://memex.ca/merlin-tempus-mes-suite/ — Memex Merlin OEE / Fanuc connectivity
|
||||
- https://www.predator-software.com/cnc-data-collection.htm — Predator MDC / DNC capabilities
|
||||
- https://www.forcam.com/en/products/factory-data-collection/ — Forcam Force MES Fanuc driver
|
||||
- Fanuc FOCAS Developer Kit `fwlib32.h` (mirrored at `strangesast/fwlib`) — authoritative API surface
|
||||
- https://www.mtconnect.org/standard-2 — MTConnect Standard Part 2 Devices Information Model
|
||||
|
||||
---
|
||||
|
||||
## OpcUaClient (OPC UA Aggregation Client)
|
||||
|
||||
### What we ship today
|
||||
|
||||
- **Endpoint config**: single `EndpointUrl` plus ordered `EndpointUrls` failover list with `PerEndpointConnectTimeout` per-attempt budget (`OpcUaClientDriverOptions.cs:22-40`); failover sweep tries each in order on init and on session drop (`OpcUaClientDriver.cs:95-118`).
|
||||
- **Security policies**: `None`, `Basic128Rsa15`, `Basic256`, `Basic256Sha256`, `Aes128_Sha256_RsaOaep`, `Aes256_Sha256_RsaPss` plus `Sign` / `SignAndEncrypt` modes; explicit policy+mode matching against the server's `GetEndpoints` response, no silent fallback to a weaker cipher (`OpcUaClientDriver.cs:299-336`).
|
||||
- **Identity tokens**: Anonymous, Username/Password, and X509 user-certificate (PFX with private key) — built once and reused across every failover attempt (`OpcUaClientDriver.cs:244-369`).
|
||||
- **Certificate management**: per-process PKI store rooted at `%LocalAppData%\OtOpcUa\pki` with own/trusted/issuers/rejected directories; SDK auto-creates the application instance certificate at startup; `AutoAcceptCertificates` dev knob hooks the validator's `BadCertificateUntrusted` path (`OpcUaClientDriver.cs:163-217`).
|
||||
- **Session lifecycle**: configurable `SessionTimeout`, `KeepAliveInterval`, `ReconnectPeriod`, `ApplicationUri`, `SessionName`, operation `Timeout` (`OpcUaClientDriverOptions.cs:82-112`).
|
||||
- **Reconnect**: native `Session.KeepAlive` event drives a `SessionReconnectHandler` with a 2-minute max retry period; SDK's automatic `TransferSubscriptions` migrates monitored items onto the rebuilt channel; keep-alive is rewired onto the new session post-recovery (`OpcUaClientDriver.cs:1297-1359`).
|
||||
- **Discovery**: two-pass recursive browse from `BrowseRoot` (default `ObjectsFolder`) with `MaxBrowseDepth=10` and `MaxDiscoveredNodes=10_000` caps; pass 2 batch-reads `DataType` + `ValueRank` + `UserAccessLevel` + `Historizing` per variable in one `Session.ReadAsync` call (`OpcUaClientDriver.cs:596-810`).
|
||||
- **Type mapping**: built-in OPC UA scalar types → `DriverDataType`; structs/enums/extension objects fall through to String passthrough; `ValueRank>=0` flags arrays (`OpcUaClientDriver.cs:820-836`).
|
||||
- **ACL bridge**: `UserAccessLevel.CurrentWrite` → `SecurityClassification.Operate`, otherwise `ViewOnly`; gating happens server-side in `DriverNodeManager` (`OpcUaClientDriver.cs:844-850`).
|
||||
- **Read/Write**: batched ReadAsync/WriteAsync with NodeId pre-parse + per-tag `BadNodeIdInvalid` short-circuit; cascading-quality preserves upstream `StatusCode` and `SourceTimestamp` verbatim; transport faults fan out as `BadCommunicationError` (`OpcUaClientDriver.cs:441-568`).
|
||||
- **Subscriptions**: native MonitoredItem forwarding with publishing-interval floor of 50 ms, `KeepAliveCount=10`, `LifetimeCount=1000`, `QueueSize=1`, `DiscardOldest=true`, `Reporting` mode, `TimestampsToReturn.Both` (`OpcUaClientDriver.cs:854-914`).
|
||||
- **Alarms (A&C)**: EventFilter SelectClauses on `BaseEventType` + `ConditionType` (EventId/EventType/SourceNode/Message/Severity/Time/ConditionId), source-node filter set, `QueueSize=1000` for burst tolerance, `Acknowledge` method invocation forwarded as `CallAsync`; severity bucketed Low/Medium/High/Critical per OPC UA Part 9 (`OpcUaClientDriver.cs:967-1143`).
|
||||
- **HistoryRead pass-through**: `ReadRawAsync`, `ReadProcessedAsync` (Average/Min/Max/Total/Count standard aggregates), `ReadAtTimeAsync` with continuation point support (`OpcUaClientDriver.cs:1154-1264`).
|
||||
- **Diagnostics**: per-driver `HostName` reflects the URL actually connected (not the first candidate); `HostState` transitions Running/Stopped/Unknown driven by keep-alive; `DriverHealth` carries `LastSuccessfulRead` + last error (`OpcUaClientDriver.cs:1281-1372`).
|
||||
- **Capability surface**: 8/8 — `IDriver`, `ITagDiscovery`, `IReadable`, `IWritable`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider`.
|
||||
|
||||
### Gaps vs commercial UA aggregators
|
||||
|
||||
- **[Build]** **Reverse Connect (server-initiated client connect)** — present in: UaGateway, Prosys Forge, Kepware (1.5+), Matrikon. Why: lets the upstream server traverse outbound-only firewalls (typical OT-DMZ direction); a hard requirement for many regulated plant networks.
|
||||
- **[Build]** **Discovery URL with `FindServers` / `FindServersOnNetwork`** — present in: Kepware, UaGateway, Matrikon. Why: we accept only an explicit endpoint URL; commercial gateways resolve a discovery URL and let the operator pick from advertised endpoints in a UI without copying the policy/mode tuple by hand.
|
||||
- **[Skip]** **Multicast / LDS-ME registration** — present in: UaGateway, Prosys. Why: lets clients discover this gateway via the Local Discovery Server without static config.
|
||||
- **[Skip]** **GDS push management (Part 12)** — present in: UaGateway, Prosys. Why: certificate provisioning, renewal, trust-list updates pushed from a central GDS — required for fleets >10 endpoints; we have no `ServerConfigurationType` method support and no automatic renewal hook.
|
||||
- **[Build]** **Per-tag advanced subscription tuning** — present in: Kepware, UaGateway, Cogent. Why: `SamplingInterval`, `QueueSize`, `DiscardOldest`, `MonitoringMode`, `DataChangeFilter` (DeadbandType=Absolute/Percent, Trigger=Status/StatusValue/StatusValueTimestamp) are hard-coded (50 ms / 1 / true / Reporting / no deadband). No way to set deadbands per tag — a baseline aggregator feature for analog noise filtering.
|
||||
- **[Build]** **Per-subscription tuning (`PublishingInterval` / `KeepAliveCount` / `LifetimeCount` / `MaxNotificationsPerPublish` / `Priority`)** — present in: all listed gateways. Why: we hard-code `KeepAliveCount=10`, `LifetimeCount=1000`, `MaxNotificationsPerPublish=0`, and `Priority=0` in `Subscription`. `MaxNotificationsPerPublish=0` (unlimited) is a denial-of-service surface against high-event-rate servers, and high-tag-count deployments need to split subscriptions across priorities.
|
||||
- **[Build]** **Selective import / namespace remap** — present in: Kepware, Matrikon, UaGateway, Cogent. Why: we mirror everything under `BrowseRoot` and re-prefix with a single "Remote" folder; commercial aggregators support per-branch include/exclude rules, namespace-URI remapping, alias paths, and re-keyed BrowseNames.
|
||||
- **[Build]** **Type definition mirroring (ObjectTypes / VariableTypes / DataTypes / ReferenceTypes)** — present in: UaGateway, Prosys, Kepware. Why: we walk Object + Variable nodes only; HasTypeDefinition references and custom type nodes are dropped, so downstream UI clients lose type-aware rendering and structured DataTypes decode as String passthrough.
|
||||
- **[Build]** **Method node mirroring + pass-through `Call`** — present in: UaGateway, Matrikon, Kepware. Why: `NodeClass.Method` is filtered out of the browse and `IDriver` has no `CallMethodAsync` capability; clients cannot invoke remote methods through the gateway. (`Acknowledge` is the only call we forward, hard-coded for A&C.)
|
||||
- **[Build]** **Automatic re-import on remote `NodeVersion` change / `ModelChangeEvent`** — present in: UaGateway, Kepware, Prosys. Why: we don't subscribe to `ServerStatus.State` or `BaseModelChangeEventType`; if the upstream server adds nodes mid-flight the new tags don't appear until the driver is reinitialized.
|
||||
- **[Skip]** **HistoryUpdate / HistoryRead-Modified / Annotation pass-through** — present in: UaGateway, Prosys Historian, Kepware (LocalHistorian). Why: we ship Raw/Processed/AtTime only; `IsReadModified=false` is hard-coded; no `HistoryUpdate`, no `DeleteRawModified`, no annotation forwarding. Many MES integrations need backfill writes.
|
||||
- **[Build]** **`ReadEventsAsync` (HistoryRead Events)** — explicitly deferred per memory entry. Why: `IHistoryProvider.ReadEventsAsync` interface lacks an `EventFilter SelectClauses` parameter to carry the field projection.
|
||||
- **[Build]** **Aggregate function set** — present in: UaGateway, Prosys, Kepware. Why: we map only Average/Minimum/Maximum/Total/Count; OPC UA Part 13 standard catalog has 30+ (TimeAverage, Interpolative, StdDev, DurationGood, NumberOfTransitions, etc.) that historian-class clients expect.
|
||||
- **[Build]** **Redundant-server URI list (`ServerUriArray`) and transparent failover** — present in: Kepware, UaGateway, Matrikon. Why: our `EndpointUrls` is a one-shot connect-attempt list, not a live redundancy group; we don't read the upstream `ServerRedundancyType` or fail over mid-session on `ServiceLevel` drop.
|
||||
- **[Build]** **Maximum nodes per Read/Write/Browse honored from server capabilities** — present in: all listed gateways. Why: we delegate chunking to the SDK but never query `Server.ServerCapabilities.OperationLimits.MaxNodesPerRead/Write/Browse`; on undersized servers this can produce `BadTooManyOperations` instead of automatic fragmentation.
|
||||
- **[Skip]** **Connection / session pooling for multi-instance scale-out** — present in: UaGateway, Cogent. Why: each driver instance opens its own session even when N drivers point at the same upstream; commercial gateways multiplex one session per remote across multiple downstream contexts to cut session count and cert-handshake load.
|
||||
- **[Build]** **Diagnostics counters (PublishRequest count, NotificationsPerSecond, MissingPublishRequests, dropped-notification rate)** — present in: UaGateway, Prosys. Why: `DriverHealth` carries `LastSuccessfulRead` + last error string only; no per-server message-rate counters or publish-queue health metrics for the Admin dashboard.
|
||||
- **[Skip]** **Kerberos / OAuth2 / IssuedToken (JWT) user identity** — present in: Kepware (Kerberos), UaGateway, Prosys. Why: we support Anonymous/Username/Certificate only; no `IssuedIdentityToken` token type, no Kerberos SPNEGO, no JWT bearer flow that newer security stacks (Azure AD) expect.
|
||||
- **[Skip]** **WriteAsync attribute scope beyond Value** — present in: UaGateway, Matrikon. Why: `WriteAsync` hard-codes `AttributeId = Attributes.Value`; no way to write `StatusCode`, `SourceTimestamp`, or non-Value attributes (rare but a documented OPC UA capability).
|
||||
- **[Build]** **CRL / revocation list configuration** — present in: Kepware, UaGateway. Why: the cert-validator hooks `BadCertificateUntrusted` only; revoked-cert chains aren't explicitly checked or surfaced as a distinct fault, and there's no `RejectSHA1SignedCertificates` knob.
|
||||
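The per-tag deadband gap in the list above refers to the two standard `DataChangeFilter` deadband types: an absolute deadband compares the change against a fixed value, and a percent deadband compares it against a percentage of the variable's `EURange`. In a native-forwarding client the upstream server evaluates this filter, so the Python sketch below only illustrates the rule that a per-tag setting would ask the server to apply; `passes_deadband` and its parameter names are hypothetical.

```python
def passes_deadband(last_reported, current, deadband_type,
                    deadband_value, eu_range=None):
    """Return True if `current` should be reported as a data change.

    deadband_type  -- "absolute" or "percent"
    deadband_value -- engineering units (absolute) or percent of EURange
    eu_range       -- (low, high) tuple, required for "percent"
    """
    if last_reported is None:
        return True  # nothing reported yet, so always report the first value
    delta = abs(current - last_reported)
    if deadband_type == "absolute":
        threshold = deadband_value
    elif deadband_type == "percent":
        if eu_range is None:
            raise ValueError("percent deadband needs an EURange")
        low, high = eu_range
        threshold = (deadband_value / 100.0) * (high - low)
    else:
        raise ValueError(f"unknown deadband type: {deadband_type}")
    return delta > threshold


# 1 % of a 0..200 range is 2.0 units: a 1.5-unit move is filtered out.
small_move = passes_deadband(10.0, 11.5, "percent", 1.0, eu_range=(0.0, 200.0))
large_move = passes_deadband(10.0, 13.0, "percent", 1.0, eu_range=(0.0, 200.0))
```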
|
||||
### Recommendations
|
||||
|
||||
| # | Gap | Build? | Rationale |
|
||||
|---|-----|:------:|-----------|
|
||||
| 1 | Reverse Connect | Yes | OT-DMZ outbound-only is the standard plant-network direction |
|
||||
| 2 | Discovery URL `FindServers` | Yes | Standard UX; saves manual policy / mode tuple copy |
|
||||
| 3 | Multicast / LDS-ME registration | No | Server-side responsibility, not aggregator's |
|
||||
| 4 | GDS push management (Part 12) | No | Significant infra; rare for our deployment scale |
|
||||
| 5 | Per-tag advanced subscription tuning (deadband, queue, mode) | Yes | Deadbands are baseline analog filtering |
|
||||
| 6 | Per-subscription tuning (publishing / keep-alive / lifetime) | Yes | Avoid DoS on bursty servers; operability |
|
||||
| 7 | Selective import / namespace remap | Yes | Curation is a baseline aggregator feature |
|
||||
| 8 | Type definition mirroring | Yes | UI clients lose structure decoding without it |
|
||||
| 9 | Method node mirroring + `Call` passthrough | Yes | Clear functional gap; `IDriver` capability missing |
|
||||
| 10 | Auto re-import on `ModelChangeEvent` | Yes | Correctness when remote topology changes |
|
||||
| 11 | HistoryUpdate / Modified / Annotation passthrough | No | MES backfill scope; defer |
|
||||
| 12 | `ReadEventsAsync` (HistoryRead Events) | Yes | Fix the `IHistoryProvider` abstraction gap |
|
||||
| 13 | Full Aggregate function set (Part 13) | Yes | Cheap to forward; historian clients expect it |
|
||||
| 14 | `ServerUriArray` redundant failover | Yes | HA expectation when upstream is redundant |
|
||||
| 15 | Honor server `OperationLimits` | Yes | Correctness; avoids `BadTooManyOperations` |
|
||||
| 16 | Connection / session pooling | No | Premature; current per-instance model is simple and adequate |
|
||||
| 17 | Diagnostics counters | Yes | Operability; admin dashboard needs publish-rate visibility |
|
||||
| 18 | Kerberos / OAuth2 / JWT identity | No | Significant security work; defer until AD integration drives it |
|
||||
| 19 | Write attribute scope beyond Value | No | Niche; rarely used in OPC UA practice |
|
||||
| 20 | CRL / revocation handling | Yes | Security baseline expectation |
|
||||
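Row 15 (honoring server `OperationLimits`) amounts to splitting each batch before it is sent. A minimal Python sketch follows, assuming the limit has already been read from the server and that a value of 0 means the server imposes no limit; `chunk_by_operation_limit` is a hypothetical helper name.

```python
def chunk_by_operation_limit(node_ids, max_nodes_per_call):
    """Split node_ids into batches no larger than the server's limit.

    max_nodes_per_call -- e.g. the server's MaxNodesPerRead value;
                          0 is treated as "no limit" (an assumption
                          of this sketch).
    """
    node_ids = list(node_ids)
    if not node_ids:
        return []
    if max_nodes_per_call <= 0:
        return [node_ids]
    return [node_ids[i:i + max_nodes_per_call]
            for i in range(0, len(node_ids), max_nodes_per_call)]


# Five nodes against a server that accepts two per Read call.
batches = chunk_by_operation_limit(
    ["ns=2;i=1", "ns=2;i=2", "ns=2;i=3", "ns=2;i=4", "ns=2;i=5"], 2)
```

Results from the batches are concatenated in order, so callers still see one result per requested node in the original sequence.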
|
||||
### Notable parity (keep)
|
||||
|
||||
- Cascading-quality contract: upstream `StatusCode` and `SourceTimestamp` preserved verbatim across Read, Subscribe, History — a baseline OPC-to-OPC bridging requirement.
|
||||
- Native subscription forwarding (no polling translation layer) — matches Kepware/UaGateway architecture, not Matrikon Tunneller's COM-bridge approach.
|
||||
- Two-pass discovery batching attribute reads — many naive aggregators issue per-node Reads which makes 10k-node servers take minutes.
|
||||
- Explicit policy+mode endpoint matching (no silent downgrade) — matches UaGateway's behavior; Kepware historically defaulted to "best available" which has been a CVE source.
|
||||
- Per-endpoint connect-timeout in failover sweep — bounded init budget is a property most of the listed gateways added late.
|
||||
- SDK-managed `TransferSubscriptions` on reconnect — matches the OPC Foundation reference behavior; no hand-rolled migration code.
|
||||
|
||||
### Sources
|
||||
|
||||
- OPC Foundation UA-.NETStandard SDK docs — https://github.com/OPCFoundation/UA-.NETStandard
|
||||
- Kepware KEPServerEX OPC UA Client — https://www.ptc.com/en/products/kepware/kepserverex/clients/opc-ua-client
|
||||
- Matrikon OPC UA Tunneller — https://www.matrikonopc.com/products/opc-tunneller/
|
||||
- Unified Automation UaGateway — https://www.unified-automation.com/products/wrapper-and-gateway/ua-gateway.html
|
||||
- Prosys OPC UA Forge / Historian — https://www.prosysopc.com/products/opc-ua-forge/
|
||||
- Cogent DataHub OPC UA — https://www.cogentdatahub.com/products/opc-ua/
|
||||
- AVEVA System Platform OI.UACLIENT — https://docs.aveva.com (Operations Integration UACLIENT)
|
||||
- OPC UA Part 4 (Services), Part 5 (Information Model), Part 9 (A&C), Part 11 (HistoricalAccess), Part 12 (Discovery & GDS), Part 13 (Aggregates), Part 14 (PubSub) — https://reference.opcfoundation.org/
|
||||
|
||||
---
|
||||
|
||||
## S7 (Siemens S7-300/400/1200/1500)
|
||||
|
||||
### What we ship today
|
||||
|
||||
- Native S7comm over ISO-on-TCP via S7netplus; default port 102, configurable so an in-CI Snap7 server can bind 1102 (`S7DriverOptions.cs:32`, `S7Driver.cs:87`).
|
||||
- CPU family selector — `S71200`, `S71500`, `S7200Smart`, `S7200`, `S7300`, `S7400` — enum forwarded straight to S7netplus to pick the remote TSAP slot byte (`S7DriverOptions.cs:34-38`).
|
||||
- Rack/slot configuration with documented conventions (S7-300 slot 2, S7-400 slot 2/3, S7-1200/1500 slot 0) (`S7DriverOptions.cs:42-51`).
|
||||
- Single-connection-per-PLC policy enforced by a `SemaphoreSlim` because the CPU's comms mailbox is scanned at most once per cycle (`S7Driver.cs:23-27,60-67`).
|
||||
- Static tag table parsed at `InitializeAsync` so syntactic typos fail fast instead of bleeding through as `BadInternalError` per read (`S7Driver.cs:103-110`).
|
||||
- Address parser accepts DB / M / I / Q / T / C with X/B/W/D widths and 0-7 bit offsets, case-insensitive, with structured `FormatException` messages (`S7AddressParser.cs:65-216`).
|
||||
- Scalar reads/writes for Bool, Byte, Int16/UInt16, Int32/UInt32, Float32 with explicit signed/unsigned reinterpret of S7netplus' boxed unsigned return values (`S7Driver.cs:231-251,306-322`).
|
||||
- PUT/GET-disabled detection — `S7.Net.PlcException` mapped to `BadDeviceFailure` and surfaced as a configuration alert rather than retried via Polly (`S7Driver.cs:200-208`, `S7DriverOptions.cs:14-25`).
|
||||
- Polled `ISubscribable` overlay floored at 100 ms to avoid wire-side queueing past CPU scan; per-tag last-value diffing for change-of-value publishing (`S7Driver.cs:365-425`).
|
||||
- `IHostConnectivityProbe` using `ReadStatusAsync` (CPU Run/Stop) every probe interval, gated on the same semaphore so it doesn't race a live read (`S7Driver.cs:457-489`).
|
||||
- Per-tag `WriteIdempotent` flag for replay-safe write retry policy (`S7DriverOptions.cs:91-104`).
|
||||
- Snap7-server-backed integration fixture covers atomic typed reads + DB write-then-read round-trip on `localhost:1102` (`docs/drivers/S7-Test-Fixture.md:1-60`).
|
||||
- Test CLI — probe / read / write / subscribe — with the same address grammar and CPU/slot flags (`docs/Driver.S7.Cli.md`).
|
||||
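The "explicit signed/unsigned reinterpret" mentioned for scalar reads is a bit-pattern conversion: a 16-bit or 32-bit word that arrives as an unsigned value is re-read as a signed integer or an IEEE 754 float without changing its bits. The Python sketch below shows the conversion with the standard `struct` module; the function names are hypothetical and the driver itself does this in C#.

```python
import struct


def u16_to_i16(value: int) -> int:
    """Reinterpret an unsigned 16-bit word as a signed Int16."""
    return struct.unpack(">h", struct.pack(">H", value))[0]


def u32_to_i32(value: int) -> int:
    """Reinterpret an unsigned 32-bit dword as a signed Int32."""
    return struct.unpack(">i", struct.pack(">I", value))[0]


def u32_to_f32(value: int) -> float:
    """Reinterpret an unsigned 32-bit dword as an IEEE 754 Float32."""
    return struct.unpack(">f", struct.pack(">I", value))[0]


# 0xFFFF is 65535 unsigned but -1 as a signed 16-bit integer.
as_int16 = u16_to_i16(0xFFFF)
# 0x3FC00000 is the single-precision bit pattern for 1.5.
as_float = u32_to_f32(0x3FC00000)
```

Skipping this step is a common source of silent errors: a negative `Int16` setpoint would otherwise surface as a large positive number rather than as a bad status.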
|
||||
### Gaps vs commercial gateways
|
||||
|
||||
- **[Build]** **S7-1500 Optimized DB / Symbolic addressing (S7Plus)** — present in: Kepware "Siemens S7 Plus", Ignition, AVEVA OI.SIDIRECT (limited). Why: S7netplus speaks classic S7comm only; optimized DBs reorder fields and have no fixed byte offsets, so absolute `DB1.DBW0` reads return `BadDeviceFailure` until "Optimized block access" is unchecked in TIA Portal.
|
||||
- **[Build]** **PDU size negotiation surfaced to operators** — present in: Kepware, TOP Server, AVEVA OI.SIDIRECT. Why: Modern S7 CPUs negotiate PDU sizes from 240 up to 960 bytes; we accept whatever S7netplus negotiates with no operator visibility into the cap and no per-request packing strategy that uses the negotiated size.
|
||||
- **[Build]** **Multi-variable PDU packing / read coalescing** — present in: every commercial gateway. Why: `ReadAsync(IReadOnlyList<string>)` issues one S7netplus call per tag inside the semaphore (`S7Driver.cs:182-214`); commercial gateways bin-pack contiguous DB ranges into a single multi-item PDU which is 5-50× faster on dense tag groups.
|
||||
- **[Build]** **TSAP / Connection Type selector (PG / OP / S7-Basic / Other)** — present in: Kepware, TOP Server, AVEVA. Why: S7netplus picks PG-style TSAPs; sites that need OP-class slots (e.g. fenced HMI connections, license-counted PG slots) cannot pick. Some S7-1500 hardening modes refuse PG access from non-allowlisted clients.
|
||||
- **[Build]** **Symbol-table / TIA Portal export browse** — present in: Kepware (online symbol upload on S7-1500), Ignition (TIA tag CSV import), TOP Server (tag-import wizard from `.AWL`/`.udt`/`.xml`). Why: We ship a static tag table only (`S7DriverOptions.cs:55-57`); operators must hand-edit the JSON. No `.tia`/`.s7p` import, no online symbol read of the S7-1500 PG symbol table.
|
||||
- **[Build]** **UDT / STRUCT / nested-DB handling** — present in: Kepware, Ignition, TOP Server. Why: Tag map is flat scalar-only — no UDT fan-out into member variables, no `Array of <UDT>` indexing. Real S7-1500 projects expose hundreds of UDT-typed DBs.
|
||||
- **[Build]** **Array tags (ValueRank=1)** — present in: every commercial gateway. Why: `S7TagDefinition` has no array dimension; `MapDataType` always returns `IsArray: false` (`S7Driver.cs:337-345`). OPC UA arrays of S7 `Array[0..n]` are unaddressable.
|
||||
- **[Build]** **STRING / WSTRING / DTL / S5TIME / TIME / DATE_AND_TIME read+write** — present in: every commercial gateway. Why: Enum entries exist but every code path throws `NotSupportedException` (`S7Driver.cs:241-245,316-320`); S7 `STRING` has a 2-byte header, `WString` is UTF-16 with a 4-byte header, `DTL` is 12 bytes, `S5TIME` is BCD-encoded — none are wired up.
|
||||
- **[Build]** **64-bit types (LInt / ULInt / LReal / LWord)** — present in: Kepware S7 Plus, Ignition, TOP Server S7-1500 driver. Why: `Int64`/`UInt64`/`Float64` cases throw `NotSupportedException` (`S7Driver.cs:241-243`); S7-1500 `LReal` (8-byte double) is the standard analog representation in modern projects.
|
||||
- **[Build]** **Instance-DB / FB-block parameter access** — present in: Kepware, Ignition (with TIA import). Why: We address by absolute DB number; instance DBs of multi-instance FBs need symbolic resolution (`MyFB_Instance.MyParam`) which our parser doesn't accept.
|
||||
- **[Build]** **CPU diagnostic buffer / SZL reads** — present in: Kepware (CPU diagnostic tags), TOP Server (`@Diagnostic` tags), AVEVA OI.SIDIRECT. Why: We probe `ReadStatusAsync` only (`S7Driver.cs:476`); SZL IDs 0x0000-0xFFFF (CPU type, firmware version, cycle time min/max/avg, diagnostic-buffer entries, hardware module status) are not exposed as system tags.
|
||||
- **[Skip]** **AS-Alarms / Alarm_S/SQ/D/DQ / S7 ProDiag** — present in: Kepware (Alarms suite), Ignition. Why: No `IAlarmSource` implementation; CPU-resident alarms (Alarm_S blocks, ProDiag supervision messages, system diagnostic messages) are invisible to OPC UA A&E clients. CPU diagnostic-buffer entries similarly not surfaced.
|
||||
- **[Skip]** **CPU Run/Stop control / block download / PG functions** — present in: Kepware (limited), AVEVA OI.SIDIRECT. Why: `ReadStatusAsync` is the only PG-class call we make; remote `WriteCpuStop` / `WriteCpuStart`, block download, password authentication for PG functions are absent.
|
||||
- **[Build]** **PLC password / protection-level handling** — present in: Kepware, TOP Server, AVEVA. Why: S7-300/400 protection levels 1-3 and S7-1200/1500's "Connection mechanisms" / "Full access incl. fail-safe" tiers can require a password on connect; S7netplus's `Plc` ctor takes no password and we have no place to plumb one through.
|
||||
- **[Skip]** **S7-1500 "Secure Communication" (TLS / certificate-based)** — present in: Siemens-direct (OPC UA on S7-1500), Kepware S7 Plus partial. Why: S7-1500 firmware V3.0+ supports authenticated PG connections with certificates; we connect plaintext over TCP only. Sites with hardened CPUs (`Access protection = high` + cert required) won't accept the driver.
|
||||
- **[Skip]** **S7-400H / redundant H-system support** — present in: Kepware (paired-IP with sticky-master), AVEVA OI.SIDIRECT. Why: We have one host/port; H-systems present two sync'd CPUs on two IPs and the driver should fail over without losing subscriptions. Driver-level redundancy is unimplemented (server-level redundancy in `docs/Redundancy.md` is a separate axis).
|
||||
- **[Skip]** **Multi-CPU rack / multiple TSAPs per rack** — present in: Kepware, TOP Server. Why: One Plc instance binds one (rack, slot); S7-400 multi-CPU racks expose 2-4 CPUs, each of which needs its own session to be driven concurrently.
|
||||
- **[Skip]** **MPI / Profibus / RFC1006-routed transports** — present in: Kepware, AVEVA OI.SIDIRECT (DASSIDirect legacy paths), TOP Server. Why: S7netplus is Ethernet-only. Brownfield S7-300 sites still routed via CP 5611/5613 MPI cards or via S7-1500-as-router for fenced subnets are out of reach.
|
||||
- **[Build]** **LOGO! 8 / S7-200 / S7-200 Smart variant tuning** — present in: Kepware "Siemens TCP/IP Ethernet" (LOGO!), Sharp7 (S7-200 Smart), Ignition. Why: `CpuType.S7200`/`S7200Smart` exists in S7netplus but the V-memory area (`V` letter) is not in our parser's switch (`S7AddressParser.cs:88-97`). LOGO!'s VM range and S7-200's V/SM areas are unaddressable.
|
||||
- **[Build]** **Per-tag scan group / publish rate** — present in: Kepware (scan classes), Ignition (tag groups), TOP Server (scan rate per tag). Why: Subscriptions take one publishingInterval for the whole tag list (`S7Driver.cs:365-380`); a CPU with mixed 100 ms / 1 s / 10 s tags needs three subscribe calls and three semaphore-serialized poll loops.
|
||||
- **[Build]** **Deadband / on-change suppression with absolute or percent thresholds** — present in: every commercial gateway. Why: We diff exact-equal only (`S7Driver.cs:419`); no analog deadband — a noisy float tag floods the bus.
|
||||
- **[Build]** **Block-read coalescing for contiguous DB regions** — present in: every commercial gateway. Why: Reading `DB1.DBW0`, `DB1.DBW2`, `DB1.DBW4` issues 3 calls; commercial drivers issue a single FC=04 ReadVarRequest covering bytes 0-5 and slice client-side.
|
||||
- **[Skip]** **Connection-resource budget management / max-parallel-jobs (AmqLen)** — present in: Kepware, TOP Server. Why: S7-1200/1500 expose 8-64 connection-resources and a per-connection parallel-jobs cap (Amq); we hold one connection and serialize, but commercial drivers open 2-4 connections per CPU to multiplex. We have no operator knob.
|
||||
- **[Build]** **Pre-flight / online-test of PUT/GET enablement** — present in: Kepware (config validation step), AVEVA. Why: We surface `BadDeviceFailure` only at first read (`S7Driver.cs:200-208`); commercial drivers warn during connection wizard via SZL probe before the operator commits config.
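
The coalescing idea behind the PDU-packing and block-read gaps above can be sketched as follows. This is a minimal Python illustration, not the driver's C# code: `TagRead`, `ReadRange`, `coalesce_reads`, `slice_payload`, and the `max_bytes=200` budget are hypothetical names and values chosen for the example; a real planner would derive the budget from the negotiated PDU size.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TagRead:
    name: str
    db: int      # DB number
    offset: int  # byte offset inside the DB
    size: int    # byte length of the value


@dataclass
class ReadRange:
    db: int
    start: int
    end: int     # exclusive
    tags: list = field(default_factory=list)


def coalesce_reads(tags, max_gap=0, max_bytes=200):
    """Merge reads on the same DB into contiguous byte ranges.

    max_gap: bytes of unused data we accept between two tags.
    max_bytes: assumed payload budget per request (placeholder value).
    """
    ranges = []
    for tag in sorted(tags, key=lambda t: (t.db, t.offset)):
        end = tag.offset + tag.size
        last = ranges[-1] if ranges else None
        if (last is not None and last.db == tag.db
                and tag.offset <= last.end + max_gap
                and max(end, last.end) - last.start <= max_bytes):
            last.end = max(last.end, end)
            last.tags.append(tag)
        else:
            ranges.append(ReadRange(tag.db, tag.offset, end, [tag]))
    return ranges


def slice_payload(rng, payload):
    """Cut one range's raw bytes back into per-tag byte slices."""
    return {
        t.name: payload[t.offset - rng.start:t.offset - rng.start + t.size]
        for t in rng.tags
    }
```

With `DB1.DBW0`, `DB1.DBW2`, and `DB1.DBW4` as input, the planner yields one six-byte range instead of three calls, and `slice_payload` hands each tag its two bytes.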
|
||||
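
For the deadband gap listed above, a minimal Python sketch of the change filter might look like this. The function name and parameters are hypothetical; the percent variant assumes the threshold is a percentage of the tag's engineering-unit span, which is how OPC UA defines percent deadband.

```python
def exceeds_deadband(last, current, absolute=None, percent=None, span=None):
    """Return True when `current` should be published.

    absolute: publish only if |current - last| is greater than this value.
    percent:  publish only if the change is greater than percent% of `span`.
    With neither set, fall back to exact-equal diffing.
    """
    if last is None:
        return True  # the first sample is always published
    delta = abs(current - last)
    if absolute is not None:
        return delta > absolute
    if percent is not None:
        if not span:
            raise ValueError("percent deadband needs a non-zero span")
        return delta > span * percent / 100.0
    return current != last
```

A noisy analog tag with `absolute=0.1` would then publish only when it moves by more than 0.1 engineering units since the last published value.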
|
||||
### Recommendations
|
||||
|
||||
| # | Gap | Build? | Rationale |
|
||||
|---|-----|:------:|-----------|
|
||||
| 1 | S7-1500 Optimized DB / Symbolic addressing (S7Plus) | Yes | Hard blocker on modern S7-1500 sites |
|
||||
| 2 | PDU size negotiation surfaced | Yes | Cheap operability; no behavior change |
|
||||
| 3 | Multi-variable PDU packing | Yes | 5-50x perf; current per-tag-per-call is the baseline gap |
|
||||
| 4 | TSAP / Connection Type selector | Yes | Hardened CPUs reject PG-class slots |
|
||||
| 5 | Symbol-table / TIA Portal export browse | Yes | Workflow parity; static JSON doesn't scale |
|
||||
| 6 | UDT / STRUCT / nested-DB handling | Yes | Real S7-1500 projects expose hundreds of UDTs |
|
||||
| 7 | Array tags (ValueRank=1) | Yes | Table-stakes; currently unaddressable |
|
||||
| 8 | STRING / WSTRING / DTL / S5TIME / TIME / DT | Yes | Standard datatypes; currently throw `NotSupportedException` |
|
||||
| 9 | 64-bit types (LInt / ULInt / LReal / LWord) | Yes | LReal is the standard analog representation on S7-1500 |
|
||||
| 10 | Instance-DB / FB parameter access | Yes | Modern symbolic structure; absolute DBs alone are limiting |
|
||||
| 11 | CPU diagnostic buffer / SZL reads | Yes | Operability; firmware / cycle-time visibility |
|
||||
| 12 | AS-Alarms / Alarm_S / ProDiag | No | Significant scope; alarms are a separate workstream |
|
||||
| 13 | CPU Run / Stop control / block download | No | Security / safety risk; out of scope |
|
||||
| 14 | PLC password / protection-level handling | Yes | Hardened CPUs require it (S7netplus support permitting) |
|
||||
| 15 | S7-1500 Secure Communication / TLS | No | Significant work; defer |
|
||||
| 16 | S7-400H redundant H-system support | No | Rare in our deployment scope |
|
||||
| 17 | Multi-CPU rack parallel sessions | No | Rare; one session per CPU works |
|
||||
| 18 | MPI / Profibus / RFC1006-routed transports | No | Declining; brownfield only |
|
||||
| 19 | LOGO! 8 / S7-200 V-memory area | Yes | Small parser fix broadens coverage materially |
|
||||
| 20 | Per-tag scan group / publish rate | Yes | Operability; mixed-rate is normal |
|
||||
| 21 | Deadband / on-change with thresholds | Yes | Analog noise mitigation |
|
||||
| 22 | Block-read coalescing for contiguous DBs | Yes | Big perf win; complements multi-variable PDU packing |
|
||||
| 23 | Connection-resource budget / parallel jobs | No | Premature; one connection works for most rigs |
|
||||
| 24 | Pre-flight PUT/GET enablement test | Yes | UX improvement; cheap |
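
To make row 8 concrete, here is a hedged Python sketch of the wire layouts the gap list describes: the 2-byte `STRING` header, the 4-byte `WSTRING` header, and BCD-encoded `S5TIME`. Function names are hypothetical, and decoding `STRING` payloads as Latin-1 is an assumption for illustration; the real code page depends on the project.

```python
import struct


def decode_s7_string(buf):
    """STRING: byte 0 = max length, byte 1 = current length, then chars."""
    max_len, cur_len = buf[0], buf[1]
    cur_len = min(cur_len, max_len)  # guard against a corrupt header
    return buf[2:2 + cur_len].decode("latin-1")  # encoding is an assumption


def decode_s7_wstring(buf):
    """WSTRING: two big-endian 16-bit lengths, then UTF-16BE code units."""
    max_len, cur_len = struct.unpack_from(">HH", buf, 0)
    cur_len = min(cur_len, max_len)
    return buf[4:4 + 2 * cur_len].decode("utf-16-be")


def decode_s5time_ms(raw):
    """S5TIME: bits 12-13 select the time base, low 12 bits are 3 BCD digits."""
    base_ms = (10, 100, 1000, 10000)[(raw >> 12) & 0x3]
    digits = [(raw >> 8) & 0xF, (raw >> 4) & 0xF, raw & 0xF]
    if any(d > 9 for d in digits):
        raise ValueError("invalid BCD digit in S5TIME")
    return (digits[0] * 100 + digits[1] * 10 + digits[2]) * base_ms
```

The header-aware slicing is the part that matters: reading the declared maximum length as if it were text is the usual source of garbage characters in naive decoders.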
|
||||
|
||||
### Notable parity (keep)
|
||||
|
||||
- Single-connection-per-PLC + semaphore serialization is the documented S7netplus / Snap7 best practice and matches what TOP Server / AVEVA do in their default profile.
|
||||
- 100 ms minimum publishing interval correctly reflects CPU mailbox scan reality — commercial gateways advertise "1 ms scan" in marketing then quietly floor to ~100 ms in practice.
|
||||
- Strict address-parse-at-init with structured exceptions (rather than per-read `BadInternalError`) is better operator UX than Kepware's "you'll find out at runtime" default.
|
||||
- PUT/GET-disabled mapped to a sticky `BadDeviceFailure` instead of being retried by Polly — Polly retry against a CPU that will keep refusing is exactly the failure mode that floods commercial deployments.
|
||||
- `WriteIdempotent` per-tag flag is finer-grained than Kepware's connection-level `Auto Demote` and matches the safe-replay reality: DB set-points are replayable, M/Q edge-triggered bits are not.
|
||||
- Probe path uses `ReadStatusAsync` (single CPU-state PDU) rather than a tag read — doubles as "PLC actually up" without polluting the comms mailbox.
|
||||
- Driver-instance host/port format (`host:port`) matches the Modbus driver so Admin UI can render both families uniformly.
|
||||
- Snap7-server CI fixture closes the "no commercial vendor offers a meaningful S7 simulator" gap that Kepware/TOP Server users hit on day one.
|
||||
|
||||
### Sources
|
||||
|
||||
- https://www.kepserverexopc.com/products/siemens-tcpip-ethernet/ (Kepware Siemens TCP/IP Ethernet)
|
||||
- https://www.kepware.com/en-us/products/kepserverex/drivers/siemens-s7-plus/ (Kepware S7 Plus — Optimized DB / Symbolic addressing)
|
||||
- https://www.aveva.com/en/products/communication-drivers/ (AVEVA OI Server / DASSIDirect)
|
||||
- https://www.softwaretoolbox.com/topserver-siemens-suite (TOP Server Siemens Suite)
|
||||
- https://docs.inductiveautomation.com/docs/8.1/platform/connecting-to-devices/siemens (Ignition Siemens driver guide)
|
||||
- https://github.com/S7NetPlus/s7netplus (S7netplus library)
|
||||
- https://snap7.sourceforge.net/ (Snap7)
|
||||
- https://github.com/evcc-io/sharp7 (Sharp7 fork — S7-1200/1500 PUT/GET semantics)
|
||||
- https://cache.industry.siemens.com/dl/files/591/68018591/att_956083/v1/s71500_communication_function_manual_en-US_en-US.pdf (Siemens S7-1500 Communication Function Manual)
|
||||
- https://support.industry.siemens.com/cs/document/26224811 (Siemens — TSAPs and connection resources)
|
||||
- https://support.industry.siemens.com/cs/document/89260861 (Siemens — SZL list IDs / system status lists)
|
||||
- https://docs.tia.siemens.cloud/r/en-us/v20/safety-and-security/secure-communication (S7-1500 secure communication)
|
||||
|
||||
---
|
||||
|
||||
## TwinCAT (Beckhoff ADS)
|
||||
|
||||
### What we ship today
|
||||
|
||||
- `TwinCATDriver` implements `IReadable`, `IWritable`, `ISubscribable`, `ITagDiscovery`, `IHostConnectivityProbe`, `IPerCallHostResolver` over Beckhoff's `Beckhoff.TwinCAT.Ads` v6 `AdsClient` (`src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs:11-12`, `AdsTwinCATClient.cs:22-24`).
|
||||
- AMS addressing parses `ads://{netId}:{port}` with the six-octet AmsNetId and TC3 default port 851 (also documents 801/811/821 for TC2 and 10000 for system service) (`TwinCATAmsAddress.cs:20-64`).
|
||||
- Native ADS notifications via `AddDeviceNotificationExAsync` with `AdsTransMode.OnChange` and per-tag cycle time; falls back to shared `PollGroupEngine` when `UseNativeNotifications=false` (`TwinCATDriver.cs:296-339`, `AdsTwinCATClient.cs:130-160`).
|
||||
- IEC 61131-3 atomic data type surface — Bool, S/U Int 8/16/32/64, Real, LReal, String, WString, Time, Date, DT, TOD (`TwinCATDataType.cs:9-30`).
|
||||
- Symbol path parser supports POU/GVL prefix, struct member walks, array subscripts incl. multi-dim `Matrix[1,2]`, and bit-access `.0..31` (`TwinCATSymbolPath.cs:1-104`).
|
||||
- Bit-indexed BOOL **read** path: read parent word as `uint`, mask locally (`AdsTwinCATClient.cs:57-64`, `ExtractBit`).
|
||||
- Optional controller-side symbol browse via `SymbolLoaderFactory` (flat mode), with system-symbol filter for `TwinCAT_*`, `Constants.*`, `Mc_*`, `__*` (`AdsTwinCATClient.cs:178-195`, `TwinCATSystemSymbolFilter.cs`).
|
||||
- Per-device probe loop calls `ReadStateAsync` and emits `OnHostStatusChanged` Running/Stopped transitions (`TwinCATDriver.cs:366-402`).
|
||||
- Status-code mapping `AdsErrorCode` → OPC UA via `TwinCATStatusMapper`; auto-reconnect on dropped client (`TwinCATDriver.cs:413-429`).
|
||||
- Sized strings `STRING(80)` / `WSTRING(80)` are tolerated in browse — type name parens stripped to bare atom (`AdsTwinCATClient.cs:200-206`).
|
||||
- Live-tested against TCBSD VM and Hyper-V XAR — 30 integration test cases (read/write/array/subscribe/browse/reconnect/probe), 110 unit tests (`docs/drivers/TwinCAT-Test-Fixture.md`).
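
The `ads://{netId}:{port}` shape described above can be illustrated with a small parser. This is a Python sketch under stated assumptions, not the code in `TwinCATAmsAddress.cs`: the function name is hypothetical, and treating the port as optional with a default of 851 is an assumption based on the TC3 default mentioned in the list.

```python
import re

# Six dot-separated octets, then an optional ":port".
_AMS = re.compile(r"^ads://(\d{1,3}(?:\.\d{1,3}){5})(?::(\d{1,5}))?$")


def parse_ams_address(text, default_port=851):
    """Split an ads:// address into (net_id, port), validating ranges."""
    match = _AMS.match(text.strip())
    if match is None:
        raise ValueError(f"not an ads:// address: {text!r}")
    octets = [int(part) for part in match.group(1).split(".")]
    if any(octet > 255 for octet in octets):
        raise ValueError("AmsNetId octets must be 0-255")
    port = int(match.group(2)) if match.group(2) else default_port
    if not 0 < port <= 65535:
        raise ValueError("AMS port out of range")
    return ".".join(str(octet) for octet in octets), port
```

For example, `parse_ams_address("ads://5.23.91.23.1.1:851")` yields the net id and port as a pair, while a four-octet IP-style address is rejected.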
|
||||
|
||||
### Gaps vs commercial gateways
|
||||
|
||||
- **[Build]** **ADS Sum commands (sum-read / sum-write / sum-add-notification)** — present in: Kepware, TF6100, Ignition, TwinCAT.Ads itself. Why: we issue one `ReadValueAsync` per tag in a loop (`TwinCATDriver.cs:118-156`); commercial drivers batch into `IndexGroup=0xF080..0xF085` sum requests for ~10x throughput on multi-thousand tag scans.
|
||||
- **[Build]** **Handle-based access (CreateVariableHandle / ReadByHandle)** — present in: Kepware, TF6100, AdsClient itself. Why: we resolve the symbolic name on every read; cached handles cut per-request bytes and AMS overhead, especially over WAN/multi-hop.
|
||||
- **[Build]** **STRUCT / UDT decomposition with offline TMC parsing** — present in: Kepware (TwinCAT TMC import), TF6100 (native), Ignition. Why: `TwinCATDataType.Structure` is declared but discovery skips non-atomic symbols (`AdsTwinCATClient.cs:224`); we can't expose nested UDT trees without hand-declaring every leaf.
|
||||
- **[Build]** **Bit-indexed BOOL writes** — present in: Kepware, TF6100. Why: we throw `NotSupportedException` (`AdsTwinCATClient.cs:99-100`); commercial drivers do read-modify-write or use `ADSIGRP_SYM_VALBYNAME` with the `.N` syntax the runtime supports for some primitives.
|
||||
- **[Build]** **Multi-dim / whole-array reads** — present in: Kepware, TF6100. Why: we parse `Matrix[1,2]` element-by-element but never read the array in one ADS call; sized-array marshalling is in `TwinCAT.Ads` but unused here.
|
||||
- **[Build]** **Int64 fidelity** — present in: TF6100, Ignition. Why: `LInt`/`ULInt` map to `DriverDataType.Int32` (`TwinCATDataType.ToDriverDataType` line 40 with explicit "matches Int64 gap" comment) — silent truncation of any value outside the 32-bit range.
|
||||
- **[Build]** **TIME / DATE / DT / TOD as native OPC UA types** — present in: TF6100 (DateTime/Duration), Kepware. Why: we marshal all four as raw `UDINT` (`AdsTwinCATClient.cs:278-280`) leaving timestamp interpretation to the client.
|
||||
- **[Build]** **ENUM / ALIAS / REFERENCE / POINTER / INTERFACE / UNION** — present in: TF6100, Kepware (partial). Why: not in `TwinCATDataType`; symbol-mapper returns `null` and skips.
|
||||
- **[Skip]** **Multi-target / multi-route AMS gateway** — present in: Kepware, Ignition (one driver instance, many devices). Why: we accept N `Devices` but each requires its own `TwinCATDeviceOptions`; no central route table, no `StaticRoutes.xml` management, no AMS-router credential handling.
|
||||
- **[Skip]** **TwinCAT 3.1.4024+ Secure ADS / ADS-over-TLS** — present in: TF6100, recent TwinCAT.Ads. Why: `AdsClient.Connect` is called without secure-ADS opts; no certificate or pre-shared-key knobs in `TwinCATDriverOptions`.
|
||||
- **[Skip]** **Route credential management** — present in: Kepware (route auth UI), TF6100. Why: relies entirely on the host AMS router's pre-authorized routes; we have no in-driver way to add a route or supply credentials.
|
||||
- **[Skip]** **NC-axis / CNC channel / EtherCAT slave I/O surfaces** — present in: TF6100 (full NC namespace), Kepware (NC variables). Why: our system-symbol filter actively drops `Mc_*` (`TwinCATSystemSymbolFilter.cs:28`); we treat NC plumbing as noise.
|
||||
- **[Skip]** **System-service ports** (`AMSPORT_R0_REALTIME=200`, `AMSPORT_R3_SYSSERV=10000`, `EVENTLOG=110`) — present in: TF6100, Kepware (system data). Why: `Devices` entries target PLC-runtime ports only in practice; there are no helpers for system-service requests, run/config-mode switches, or Real-Time diagnostic counters.
|
||||
- **[Build]** **Event log ingest (TwinCAT EventLogger / TC3 Eventing)** — present in: TF6100 (alarms/conditions), Ignition. Why: we don't implement `IAlarmSource`; AMS port 110 events never surface as OPC UA Alarms & Conditions events.
|
||||
- **[Skip]** **PLC RPC / method invocation (TC3 method calls via ADS)** — present in: TF6100. Why: `IWritable` is value-only; no surface for `RpcInvoke`-style method calls on FB instances.
|
||||
- **[Skip]** **Per-PLC-runtime fan-out (port 851/852/853)** — partially present. Why: technically supported via separate `Devices` entries, but no helper that auto-discovers which runtimes exist on a controller via the system service.
|
||||
- **[Build]** **Sub-millisecond cycle accuracy / max-delay tuning** — present in: TF6100, Kepware. Why: `NotificationSettings(OnChange, cycleMs, 0)` clamps cycle to 1 ms and sets max-delay to 0 (`AdsTwinCATClient.cs:144-145`); no per-tag override of `MaxDelay` to coalesce bursty signals.
|
||||
- **[Build]** **Cycle-time / jitter / PLC-state diagnostics** — present in: TF6100, Kepware. Why: probe only checks reachability; we don't surface cycle-time, jitter, RT-state or `_AppInfo.OnlineChangeCnt` as health signals.
|
||||
- **[Build]** **Online change / symbol-version invalidation** — present in: TF6100, Ignition. Why: no listener on `ADSIGRP_SYMVAL_BYHND` invalidation event; an online change silently invalidates cached handles (we have none, but adding handles needs this).
|
||||
- **[Skip]** **File-system access via ADS (`ADSIGRP_FOPEN/FREAD`)** — present in: TF6100. Why: not implemented; useful for reading recipe files / log uploads without a separate transport.
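
The read-modify-write option mentioned for bit-indexed BOOL writes can be sketched as below. This is an illustrative Python model with hypothetical names; `read_word` and `write_word` stand in for whatever parent-word read and write calls the driver would use. Note the caveat in the comments: the lock only serializes writers inside the driver, so a PLC program that changes other bits of the same word between the read and the write can still be overwritten.

```python
import threading


def extract_bit(word, bit):
    """Mask one bit out of a parent word, as the read path does."""
    return bool((word >> bit) & 1)


def with_bit(word, bit, value):
    """Return `word` with one bit set or cleared, limited to 32 bits."""
    mask = 1 << bit
    updated = (word | mask) if value else (word & ~mask)
    return updated & 0xFFFFFFFF


def write_bit(read_word, write_word, bit, value, lock):
    """Read-modify-write a single bit of the parent word.

    The lock prevents two driver-side writers from interleaving. It does
    NOT make the operation atomic with respect to the PLC program.
    """
    with lock:
        write_word(with_bit(read_word(), bit, value))


# Usage with an in-memory stand-in for the controller word.
store = {"word": 0b1010}
write_bit(lambda: store["word"],
          lambda w: store.update(word=w),
          bit=0, value=True, lock=threading.Lock())
```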
|
||||
|
||||
### Recommendations
|
||||
|
||||
| # | Gap | Build? | Rationale |
|
||||
|---|-----|:------:|-----------|
|
||||
| 1 | ADS Sum commands | Yes | ~10x throughput for multi-thousand-tag scans; blocker at scale |
|
||||
| 2 | Handle-based access (caching) | Yes | Perf; reduces per-request bytes and AMS overhead |
|
||||
| 3 | STRUCT / UDT decomposition with TMC parsing | Yes | Real projects have nested UDTs we currently can't expose |
|
||||
| 4 | Bit-indexed BOOL writes | Yes | Correctness; we read bits but throw on write |
|
||||
| 5 | Multi-dim / whole-array reads | Yes | Perf; library supports it |
|
||||
| 6 | Int64 fidelity (LInt / ULInt) | Yes | Correctness; we silently truncate |
|
||||
| 7 | TIME / DATE / DT / TOD as native UA types | Yes | Correctness; raw UDINT pushes interpretation to clients |
|
||||
| 8 | ENUM / ALIAS / REFERENCE / POINTER / INTERFACE / UNION | Yes | At least ENUM and ALIAS are common in real projects |
|
||||
| 9 | Multi-target / multi-route AMS gateway | No | Per-device config already works |
|
||||
| 10 | Secure ADS / ADS-over-TLS | No | Significant work; defer |
|
||||
| 11 | Route credential management | No | Host-level AMS router responsibility |
|
||||
| 12 | NC-axis / CNC channel / EtherCAT slave I/O surfaces | No | Specialty; not in target use cases |
|
||||
| 13 | System-service ports | No | Niche operational tooling |
|
||||
| 14 | Event log / TC3 alarms (`IAlarmSource`) | Yes | Currently no `IAlarmSource` implementation; capability gap |
|
||||
| 15 | PLC RPC / method invocation | No | Niche; design-heavy |
|
||||
| 16 | Per-PLC-runtime auto-discover | No | Cosmetic; manual port config works |
|
||||
| 17 | Sub-millisecond max-delay tuning | Yes | Cheap; helps coalesce bursty signals |
|
||||
| 18 | Cycle-time / jitter / PLC-state diagnostics | Yes | Operability; cheap given existing probe |
|
||||
| 19 | Online-change / symbol-version invalidation | Yes | Required if handle caching lands (gap #2) |
|
||||
| 20 | File-system access via ADS | No | Niche; out of scope |
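
Row 7 asks for the four time types to arrive as native values rather than raw `UDINT`. A minimal Python sketch of the conversions, assuming the usual TwinCAT encodings (TIME and TOD in milliseconds, DATE and DT in seconds since 1970-01-01), is shown below. Function names are hypothetical, and the results are deliberately timezone-naive because the raw value carries no zone information.

```python
from datetime import datetime, timedelta

_EPOCH = datetime(1970, 1, 1)


def decode_time(raw):
    """TIME: unsigned 32-bit millisecond count -> duration."""
    return timedelta(milliseconds=raw)


def decode_dt(raw):
    """DT (DATE_AND_TIME): seconds since 1970-01-01 -> naive datetime."""
    return _EPOCH + timedelta(seconds=raw)


def decode_date(raw):
    """DATE: seconds since 1970-01-01, meaningful at day resolution."""
    return (_EPOCH + timedelta(seconds=raw)).date()


def decode_tod(raw):
    """TOD (TIME_OF_DAY): milliseconds since midnight -> time of day."""
    return (datetime.min + timedelta(milliseconds=raw)).time()
```

On the OPC UA side these would plausibly map to `Duration` for TIME and `DateTime` for DT and DATE; how TOD is represented is a design choice the plan would need to settle.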
|
||||
|
||||
### Notable parity (keep)
|
||||
|
||||
- Native `OnChange` notifications (not polling) — matches TF6100/Kepware default and is the right CPU/latency posture.
|
||||
- Symbolic addressing (no manual index-group/offset arithmetic) — same DX as Kepware's TwinCAT driver.
|
||||
- Live integration suite against a real runtime (TCBSD + XAR), not just mocks — better than Ignition's stock TwinCAT module which lacks bundled hardware tests.
|
||||
- System-symbol filter so `Discovered/` doesn't drown the address space — Kepware ships an equivalent.
|
||||
- Config-driven tag declarations as the authoritative path; `EnableControllerBrowse` is opt-in — matches "tag-import-then-curate" workflow Kepware encourages.
|
||||
- AmsNetId + port modelled correctly with TC3-vs-TC2 default port awareness — matches TF6100 conventions.
|
||||
|
||||
### Sources
|
||||
|
||||
- https://infosys.beckhoff.com/english.php?content=../content/1033/tc3_ads_intro/index.html
|
||||
- https://infosys.beckhoff.com/english.php?content=../content/1033/tcadsdll2/117571083.html (Sum commands / index groups)
|
||||
- https://infosys.beckhoff.com/english.php?content=../content/1033/tf6100_tc3_opcua/index.html (TF6100)
|
||||
- https://github.com/Beckhoff/TF6100-OPC-UA-Sample
|
||||
- https://github.com/Beckhoff/TC3-AdsClient-Csharp / `Beckhoff.TwinCAT.Ads` NuGet docs
|
||||
- https://www.kepserverexlibrary.kepware.com/Beckhoff%20TwinCAT (Kepware Beckhoff TwinCAT driver manual)
|
||||
- https://docs.inductiveautomation.com/docs/8.1/platform/connections/devices (Ignition device drivers)
|
||||
- https://infosys.beckhoff.com/english.php?content=../content/1033/tc3_security_management/ (Secure ADS)
|
||||
- https://infosys.beckhoff.com/english.php?content=../content/1033/tc3_adsnetref/ (NotificationSettings, AdsTransMode)
|
||||
@@ -0,0 +1,682 @@
|
||||
# AbCip Driver — Implementation Plan
|
||||
|
||||
> Source of gap analysis: [featuregaps.md → AbCip](../featuregaps.md#abcip-allen-bradley-ethernetip--logix)
|
||||
>
|
||||
> This plan covers the **Build = Yes** items only. Skip-rated gaps are listed at the bottom for traceability.
|
||||
|
||||
## Summary
|
||||
|
||||
This plan closes the 16 Build-rated AbCip gaps in five phases ordered to ship correctness fixes
|
||||
first, then engineering workflow, then performance, then operability, and finally redundancy.
|
||||
Phase 1 lands the data-type fidelity work (LINT/ULINT, native STRINGnn, array slicing,
|
||||
write-multi packing) that today silently truncates 64-bit values and serialises adjacent reads
|
||||
into N round-trips. Phase 2 introduces the offline tag-import workflow (L5K/L5X + CSV) that
|
||||
Studio 5000 shops require before they will switch off Kepware. Phase 3 exposes the
|
||||
performance levers commercial drivers ship as field knobs — symbolic vs logical addressing,
|
||||
configurable Connection Size, and the logical-blocking / logical-non-blocking strategy
|
||||
selector. Phase 4 surfaces per-tag scan rates, write deadband, online tag-DB refresh trigger,
|
||||
and the diagnostic system tags an HMI dashboard expects. Phase 5 adds HSBY paired-IP failover
|
||||
for continuous-process plants. Headline outcome: parity with Kepware's Logix Database Settings
|
||||
and TOP Server's protocol-mode picker, with measurable throughput wins (3-5x on dense rigs via
|
||||
logical addressing, single-PDU reads on contiguous arrays, single-PDU writes on multi-tag
|
||||
recipe pushes).
|
||||
|
||||
## Phased delivery
|
||||
|
||||
### Phase 1 — Data-type correctness (4 PRs)
|
||||
|
||||
Goal: stop silently losing data. None of the items in this phase are user-visible features —
|
||||
they are correctness fixes against existing capability surfaces.
|
||||
|
||||
#### PR 1.1 — LINT / ULINT 64-bit fidelity
|
||||
- **Scope**: replace the truncating `Int32` widening at `AbCipDataType.cs:53` with `Int64`
|
||||
routing across decode + encode + the `DriverDataType` map. Includes `DT` (epoch-millis on
|
||||
Logix v32+ surfaces as LINT, not DINT — verify against `LibplctagTagRuntime.cs:53` before
|
||||
reusing the same code-path).
|
||||
- **Files**: `AbCipDataType.cs` (mapping), `LibplctagTagRuntime.cs` (already calls
|
||||
`_tag.GetInt64` / `SetInt64`, so the runtime is correct — the gap is the surface enum
|
||||
flattening into `Int32`), `Core.Abstractions/DriverDataType.cs` may need an `Int64` /
|
||||
`UInt64` member if not already present.
|
||||
- **Test approach**: unit (xUnit + Shouldly) with a fake `IAbCipTagRuntime` that returns
|
||||
`long.MaxValue` on `DecodeValueAt(LInt, ...)`; assert the snapshot value round-trips through
|
||||
the read path without truncation. The pymodbus fixture does not apply here (it is Modbus-only); integration coverage needs a live
|
||||
Logix or a libplctag mock-server fixture; keep this unit-only and rely on smoke testing on
|
||||
the dev box with a real ControlLogix.
|
||||
- **Effort**: S
|
||||
- **Dependencies**: confirm `DriverDataType.Int64` exists; if not, that is a Core change
|
||||
shared with the Modbus TODO at `AbCipDataType.cs:53`.
|
||||
- **Docs / fixture / e2e**: appends a Logix-types row to the type-mapping table in
|
||||
`docs/Driver.AbCip.Cli.md` (CLI gains `--type LInt` / `--type ULInt`); extends
|
||||
`docs/drivers/AbServer-Test-Fixture.md` §"What it actually covers" to list `LINT` once
|
||||
ab_server is reseeded with a `TestLINT:LINT[1]` tag; updates the
|
||||
`tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests/Docker/docker-compose.yml`
|
||||
ControlLogix profile to seed `TestLINT`; adds a 64-bit assertion case in
|
||||
`AbCipReadSmokeTests`; extends `scripts/e2e/test-abcip.ps1` with an LInt loopback
|
||||
assertion (and a matching seeded `TestLINT` in `scripts/smoke/seed-abcip-smoke.sql`).
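
To show why the unit test uses `long.MaxValue`, here is a small Python model of the failure PR 1.1 targets. It assumes the flattening behaves like an unchecked narrowing cast that keeps the low 32 bits; the actual behaviour at `AbCipDataType.cs:53` may differ, and both function names are hypothetical.

```python
import struct


def narrow_to_int32(value):
    """Model an unchecked 64-bit to 32-bit cast: keep the low four bytes."""
    low_bytes = struct.pack("<q", value)[:4]
    return struct.unpack("<i", low_bytes)[0]


def round_trip_int64(value):
    """Model the fixed path: encode and decode as a full 64-bit integer."""
    return struct.unpack("<q", struct.pack("<q", value))[0]


LONG_MAX = 2**63 - 1
print(narrow_to_int32(LONG_MAX))   # low 32 bits are all ones
print(round_trip_int64(LONG_MAX))  # value survives unchanged
```

Under this model the largest LINT collapses to a small negative number, which is the kind of silent corruption the assertion is meant to catch.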
|
||||
|
||||
#### PR 1.2 — Native STRING / STRINGnn variant decoding
|
||||
- **Scope**: Today `AbCipDataType.String` flattens any Logix `STRING` UDT into a .NET
|
||||
string via libplctag's `_tag.GetString(0)`. Logix programs commonly define
|
||||
`STRING_20`, `STRING_40`, `STRING_80` variants with different DATA-array sizes; libplctag
|
||||
honours these when the tag name resolves to the user-defined type, but our discovery
|
||||
emits them as the generic `String` placeholder. Add a `StringLength` field to
|
||||
`AbCipStructureMember` + `AbCipTagDefinition` so declared variants carry their cap, and
|
||||
thread it into the `Tag.Name` attribute or a libplctag string-cap hint.
|
||||
- **Files**: `AbCipDataType.cs`, `AbCipDriverOptions.cs` (record fields), `LibplctagTagRuntime.cs`
|
||||
(string-length aware decode/encode), and the discovery emit at `AbCipDriver.cs:715`.
|
||||
- **Test approach**: unit test with a fake runtime returning `string` values shorter and
|
||||
longer than the declared cap; integration test deferred until a sample L5X with mixed
|
||||
STRING variants is available.
|
||||
- **Effort**: M
|
||||
- **Dependencies**: investigate libplctag's `str_max_capacity` / `str_count_word_bytes`
|
||||
attributes — the docs reference them but the C# wrapper may not expose them; if not, this
|
||||
PR must extend `LibplctagTagRuntime` with a raw-buffer decode path.
|
||||
- **Docs / fixture / e2e**: extends `docs/Driver.AbCip.Cli.md` with a new `--string-size`
|
||||
flag in the `read`/`write` cookbook plus a STRINGnn worked example; updates
|
||||
`docs/drivers/AbServer-Test-Fixture.md` §"What it actually covers" to list
|
||||
`STRING_20`/`STRING_80` once seeded; extends the ControlLogix profile in
|
||||
`tests/.../Docker/docker-compose.yml` with `TestSTRING80:STRING[1]` (plus a `STRING_20`
|
||||
variant if `ab_server` honours non-default DATA caps; otherwise documented as Emulate-tier
|
||||
only); adds `tests/.../IntegrationTests/AbCipStringDecodingTests.cs` round-trip; adds a
|
||||
short-string round-trip case to `scripts/e2e/test-abcip.ps1` and a `TestSTRING80` row to
|
||||
`scripts/smoke/seed-abcip-smoke.sql`.
|
||||
|
||||
#### PR 1.3 — Array-slice read addressing `Tag[0..N]`
|
||||
- **Scope**: today `AbCipTagPath` parses `Tag[3,5]` as a single element. Add slice syntax
|
||||
`Tag[0..15]` (parsed in `AbCipTagPath.TryParse`) and a planner that issues one libplctag
|
||||
read with `elem_count=N` per Rockwell array semantics, decoding the buffer at element
|
||||
stride into N output snapshots. Mirrors the whole-UDT planner pattern.
|
||||
- **Files**: `AbCipTagPath.cs` (parser — add `IsSlice` + `SliceLength` to the path segment
|
||||
record, or carry it on `AbCipTagPath` itself), new `AbCipArrayReadPlanner.cs` next to
|
||||
`AbCipUdtReadPlanner.cs`, `AbCipDriver.ReadAsync` to dispatch through the planner,
|
||||
`IAbCipTagRuntime` to add `DecodeArrayAt(type, elementStride, count)` or build on
|
||||
`DecodeValueAt`. Investigate libplctag's `elem_count` attribute on `Tag` create to confirm
|
||||
the right wire-level switch.
|
||||
- **Test approach**: parser unit tests for the new syntax, planner unit tests with fake
|
||||
runtime, integration smoke against a live ControlLogix DINT[100] tag using the dev-box
|
||||
PLC.
|
||||
- **Effort**: L
|
||||
- **Dependencies**: PR 1.1 must land first if the array element type is LINT — otherwise the
|
||||
slice path silently truncates 64-bit elements.
|
||||
- **Docs / fixture / e2e**: extends `docs/Driver.AbCip.Cli.md` `read` section with the
|
||||
`Tag[0..N]` slice syntax + a worked example reading `Recipe[0..15]` in one round-trip;
|
||||
updates `docs/drivers/AbServer-Test-Fixture.md` §"What it actually covers" to mention
|
||||
the existing `DINT[16]` array tag is now exercised end-to-end via slicing; extends
|
||||
`AbCipReadSmokeTests` with a slice-read assertion against the seeded `TestDINTArray`;
|
||||
adds `tests/.../IntegrationTests/AbCipArraySliceTests.cs` covering edge cases
|
||||
(boundary, single-element, full-range); adds a slice-read assertion to
|
||||
`scripts/e2e/test-abcip.ps1`.
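
The slice syntax and element-stride decode described for PR 1.3 can be sketched in Python as follows. All names here are hypothetical, and the sketch assumes `Tag[a..b]` is inclusive on both ends (so `Recipe[0..15]` covers 16 elements, matching the `DINT[16]` fixture tag); the plan itself does not pin that down.

```python
import re
from typing import NamedTuple, Optional

_SLICE = re.compile(r"^(?P<base>.+)\[(?P<start>\d+)\.\.(?P<end>\d+)\]$")


class Slice(NamedTuple):
    base: str   # tag path without the slice suffix
    start: int  # first element index
    count: int  # number of elements to request in one read


def try_parse_slice(path: str) -> Optional[Slice]:
    """Return a Slice for 'Tag[a..b]', or None for any other path."""
    match = _SLICE.match(path)
    if match is None:
        return None
    start, end = int(match["start"]), int(match["end"])
    if end < start:
        return None
    return Slice(match["base"], start, end - start + 1)


def split_elements(buffer: bytes, stride: int, count: int):
    """Cut one multi-element read buffer into per-element byte slices."""
    if len(buffer) < stride * count:
        raise ValueError("buffer shorter than stride * count")
    return [buffer[i * stride:(i + 1) * stride] for i in range(count)]
```

Existing `Tag[3,5]` paths fall through as `None`, so the single-element parser keeps handling them unchanged.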
|
||||
|
||||
#### PR 1.4 — CIP multi-tag write packing
|
||||
- **Scope**: `AbCipDriver.WriteAsync` (`AbCipDriver.cs:460-546`) loops over writes one-by-one.
|
||||
Group writes by `(device, no-bit-RMW)` and submit one CIP Multi-Service Packet (0x0A)
|
||||
carrying up to N write-singles per round-trip. Honours the per-family
|
||||
`SupportsRequestPacking` flag at `AbCipPlcFamilyProfile.cs:36,43,51,59` — Micro800 falls
|
||||
back to the existing per-write loop because its profile already disables packing.
|
||||
- **Files**: `AbCipDriver.cs` (add a write planner mirroring the read planner), new
|
||||
`AbCipMultiWritePlanner.cs`, possibly a new `IAbCipTagRuntime.WriteBatchAsync` method or a
|
||||
new `IAbCipMultiWriter` capability since libplctag's high-level `Tag.WriteAsync` is
|
||||
per-tag — investigate libplctag's `cip-msg-multi` raw-CIP path or whether building a
|
||||
Multi-Service Packet via `plc_tag_create("name=@raw,...")` is feasible.
|
||||
- **Test approach**: unit test the planner with a synthetic batch (mixed-device, mixed
|
||||
bit-RMW, one Micro800); integration test recipe-style 50-tag write against ControlLogix
|
||||
measuring round-trip count via Wireshark or via a libplctag debug-trace sink.
|
||||
- **Effort**: L
|
||||
- **Dependencies**: investigate libplctag multi-service-packet API; if absent, this PR may
|
||||
need to drop down to raw CIP via the `@raw` pseudo-tag or be deferred.
|
||||
- **Docs / fixture / e2e**: appends a "Multi-tag writes" subsection to
|
||||
`docs/Driver.AbCip.Cli.md` (no flag — automatic batching when multiple writes queue
|
||||
inside one publish) plus a note that Micro800 falls back per profile; updates
|
||||
`docs/drivers/AbServer-Test-Fixture.md` §7 ("Capability surfaces beyond read") to flip
|
||||
`IWritable.WriteAsync` from "no smoke test" to covered for the multi-write path; adds
|
||||
`tests/.../IntegrationTests/AbCipMultiWriteTests.cs` asserting 50-tag batch lands in one
|
||||
round-trip (count via libplctag debug-trace sink); extends `scripts/e2e/test-abcip.ps1`
|
||||
with a recipe-style multi-write step; extends seed SQL with two extra DINT tags so the
|
||||
e2e has a packing target.
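
The grouping rule in PR 1.4 (pack by device, keep bit read-modify-write and non-packing families on the per-write path) can be modelled with a short planner. This is a Python sketch with hypothetical names; `max_per_packet=10` is a placeholder, since the real limit depends on the negotiated connection size and the encoded size of each write.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    device: str
    tag: str
    value: object
    is_bit_rmw: bool = False  # bit writes need read-modify-write, never packed


def plan_writes(writes, supports_packing, max_per_packet=10):
    """Split writes into batches; each batch models one round-trip.

    supports_packing: dict of device -> bool, standing in for the
    per-family SupportsRequestPacking flag.
    """
    batches = []
    packable = {}
    for write in writes:
        if write.is_bit_rmw or not supports_packing.get(write.device, False):
            batches.append([write])  # existing one-write-per-call path
        else:
            packable.setdefault(write.device, []).append(write)
    for device_writes in packable.values():
        for i in range(0, len(device_writes), max_per_packet):
            batches.append(device_writes[i:i + max_per_packet])
    return batches
```

One design point the sketch glosses over: it emits the unpacked writes first, which reorders them relative to the caller's sequence. A real planner would need to decide whether write ordering across batches must be preserved.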
|
||||
|
||||
### Phase 2 — Tag-import workflows (4 PRs)
|
||||
|
||||
Goal: replicate Kepware's Logix Database Settings — point the driver at an L5K/L5X export or
|
||||
a CSV and have the tag table populate without an online controller.
|
||||
|
||||
#### PR 2.1 — L5K parser + ingest
|
||||
- **Scope**: parse a Studio 5000 L5K export (a labelled-section text format with
|
||||
`TAG ... END_TAG` blocks, `DATATYPE ... END_DATATYPE` UDT definitions, and program-scope
|
||||
qualifiers). Produce `AbCipTagDefinition` + `AbCipStructureMember` records that match the
|
||||
declarative options shape. Includes Description ingest (PR 2.3 lifts it to OPC UA
|
||||
`Description`).
|
||||
- **Files**: new `Import/L5kParser.cs`, new `Import/IL5kSource.cs` for testability, new
|
||||
`Import/L5kIngest.cs` that converts parsed records into `AbCipTagDefinition`. Hook into
|
||||
`AbCipDriverOptions` via a new `TagImports` collection (filenames or inline blobs) parsed
|
||||
on `AbCipDriver.InitializeAsync`.
|
||||
- **Test approach**: unit-only with sample L5K files in `tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Tests/Import/Fixtures/`
|
||||
covering controller-scope tags, program-scope tags, alias tags (skipped per Kepware
|
||||
precedent), and UDTs with nested structures.
|
||||
- **Effort**: L
|
||||
- **Dependencies**: none — pure-text parser.
|
||||
- **Docs / fixture / e2e**: new doc `docs/drivers/AbCip-TagImport.md` covering the L5K
|
||||
format support matrix (controller-scope / program-scope / UDT / alias-skipped) and a
|
||||
worked example of pointing `AbCipDriverOptions.TagImports` at an L5K export; appends a
|
||||
`tag-import` command section to `docs/Driver.AbCip.Cli.md` (CLI gains
|
||||
`tag-import --file foo.L5K`); fixture-side no change to `ab_server` (offline parse —
|
||||
no PLC needed) but adds sample L5K files under
|
||||
`tests/.../AbCip.Tests/Import/Fixtures/`; extends `scripts/e2e/test-abcip.ps1` with an
|
||||
offline `tag-import` smoke that diffs the parsed tag set against a golden JSON.
|
||||
|
||||
#### PR 2.2 — L5X (XML) parser + ingest
|
||||
- **Scope**: same surface as PR 2.1 but parses Studio 5000's XML export. L5X is the de-facto
|
||||
modern format (Studio 5000 v21+) and carries richer metadata than L5K including
|
||||
ExternalAccess attributes and AOI definitions.
|
||||
- **Files**: new `Import/L5xParser.cs` using `System.Xml.XPath`, share the `IL5kSource` /
|
||||
`L5kIngest` ingest layer with PR 2.1 by introducing a common `ParsedTagsBundle` record.
|
||||
- **Test approach**: unit tests with sample L5X fixtures including an AOI-typed tag (sets up
|
||||
the AOI work in PR 2.6).
|
||||
- **Effort**: L
|
||||
- **Dependencies**: PR 2.1 is preferred first to settle the shared ingest seam.
|
||||
- **Docs / fixture / e2e**: extends `docs/drivers/AbCip-TagImport.md` (created in PR 2.1)
|
||||
with the L5X-specific section — namespace handling, ExternalAccess attributes, AOI
|
||||
references; extends the `tag-import` CLI section in `docs/Driver.AbCip.Cli.md` to note
|
||||
L5X auto-detection by file extension; sample L5X files added under
|
||||
`tests/.../AbCip.Tests/Import/Fixtures/` (one with an AOI-typed tag for PR 2.6);
|
||||
reuses the offline `tag-import` step from `scripts/e2e/test-abcip.ps1` (now driven by
|
||||
L5X) — no fixture container change because parse is offline; cross-links from
|
||||
`tests/.../IntegrationTests/LogixProject/README.md` so the on-site Emulate L5X export
|
||||
doubles as a parser fixture.
|
||||
|
||||
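A namespace-tolerant tag extraction for PR 2.2 could look roughly like this. It is a sketch in Python, not the planned `System.Xml.XPath` implementation: it matches elements by local name instead of using XPath, and the function name `parse_l5x_tags` plus the simplified `Tag` / `Description` shape are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def parse_l5x_tags(xml_text: str):
    """Collect name, data type, and description for every Tag element."""
    root = ET.fromstring(xml_text)
    tags = []
    for elem in root.iter():
        if local_name(elem.tag) != "Tag":
            continue
        description = None
        for child in elem:
            if local_name(child.tag) == "Description":
                description = (child.text or "").strip() or None
        tags.append({
            "name": elem.get("Name"),
            "data_type": elem.get("DataType"),
            "description": description,
        })
    return tags
```

Because the comparison ignores the namespace part of each element name, the same function handles exports with and without a default XML namespace.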
#### PR 2.3 — Tag descriptions surfaced as OPC UA `Description`
|
||||
- **Scope**: extend `AbCipTagDefinition` with `Description` (string?), populate it from the
|
||||
L5K/L5X parsers, and thread it through to `DriverAttributeInfo` so the address-space
|
||||
builder sets the OPC UA `Description` attribute. Also lifts the description onto
|
||||
`AbCipStructureMember` for member-level metadata.
|
||||
- **Files**: `AbCipDriverOptions.cs` (record fields), `AbCipDriver.cs:760-770`
|
||||
(`ToAttributeInfo` helper), `Core.Abstractions/DriverAttributeInfo.cs` (verify it carries
|
||||
a Description field; if not, that becomes a Core PR shared across drivers).
|
||||
- **Test approach**: unit — a discovery test asserts that a tag with a description ends up
|
||||
with that description on the `DriverAttributeInfo` record.
|
||||
- **Effort**: S
|
||||
- **Dependencies**: PR 2.1 / PR 2.2 (descriptions only land via importer).
|
||||
- **Docs / fixture / e2e**: appends a "Description metadata" subsection to
|
||||
`docs/drivers/AbCip-TagImport.md` documenting how Studio 5000 descriptions surface as
|
||||
OPC UA `Description`; no CLI surface change (read-side only — the existing
|
||||
`otopcua-cli read` already projects `Description`); no fixture container change; adds
|
||||
a cross-driver assertion to the existing OPC UA browse test in
|
||||
`tests/.../IntegrationTests/` verifying the description survives the full
|
||||
parser → driver → server → client path; extends `scripts/e2e/test-abcip.ps1` with a
|
||||
one-line `Description != null` assertion after the import smoke step.
|
||||
|
||||
#### PR 2.4 — CSV tag import / export
|
||||
- **Scope**: a CSV round-trip matching the Kepware column layout (`Tag Name, Address, Data
|
||||
Type, Respect Data Type, Client Access, Scan Rate, Description, Scaling`). Import populates
|
||||
`AbCipTagDefinition`; export dumps the live tag table for editing in Excel.
|
||||
- **Files**: new `Import/CsvTagImporter.cs`, new `Import/CsvTagExporter.cs`, integration
|
||||
point in `AbCipDriverOptions.TagImports` parallel to PR 2.1's hook. Export hook is
|
||||
exposed via the CLI (`docs/Driver.AbCip.Cli.md`) — add a `tag-export` command.
|
||||
- **Test approach**: unit tests for parser + writer with fixture CSVs; CLI integration test
|
||||
using a synthetic options payload.
|
||||
- **Effort**: M
|
||||
- **Dependencies**: lighter than 2.1/2.2 — could ship in either order, but landing CSV after
|
||||
L5X means the CSV export reuses the `ParsedTagsBundle` shape.
|
||||
- **Docs / fixture / e2e**: appends a "CSV tag table" section to
|
||||
`docs/drivers/AbCip-TagImport.md` documenting the column layout (Kepware-compatible) and
|
||||
round-trip semantics; appends `tag-export` and CSV-flavour `tag-import` commands to
|
||||
`docs/Driver.AbCip.Cli.md`; adds sample CSVs under
|
||||
`tests/.../AbCip.Tests/Import/Fixtures/` plus a CLI integration test
|
||||
(`tests/.../AbCip.Tests/Import/CsvRoundTripTests.cs`); extends
|
||||
`scripts/e2e/test-abcip.ps1` with an export-then-import-then-diff scenario (no PLC
|
||||
required); fixture-side no change.
|
||||
|
||||
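The CSV round-trip in PR 2.4 can be sketched as below. This is an illustrative Python version, not the planned `CsvTagImporter` / `CsvTagExporter`; the function names are hypothetical, and the column list is taken from the layout quoted in the scope. Columns are looked up by header name, so a file whose columns arrive in a different order still imports.

```python
import csv
import io

# Column layout quoted in the PR 2.4 scope.
COLUMNS = [
    "Tag Name", "Address", "Data Type", "Respect Data Type",
    "Client Access", "Scan Rate", "Description", "Scaling",
]


def import_tags(text: str):
    """Read rows by header name; missing columns become empty strings."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [{col: (row.get(col) or "") for col in COLUMNS} for row in reader]


def export_tags(tags) -> str:
    """Write rows in the canonical column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(tags)
    return out.getvalue()
```

An export followed by an import returns the same records, which is the property the export-then-import-then-diff e2e scenario checks.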
#### PR 2.5 — Online tag-DB refresh trigger (`$Sys$UpdateTagInfo` parity)
|
||||
- **Scope**: AVEVA exposes `$Sys$UpdateTagInfo` so an HMI can write `1` to force the driver
|
||||
to re-walk the controller's symbol table after a Studio 5000 download — without restarting
|
||||
the driver. Implement as a new `IDriverControl.RebrowseAsync()` invoked by the server or
|
||||
via a system-tag write (PR 4.3 will surface system tags as browseable variables; once
|
||||
PR 4.4 builds on that, this becomes the writeable system tag `_RefreshTagDb`). For now expose it
|
||||
via the CLI and via a new `AbCipDriver.RebrowseAsync` method.
|
||||
- **Files**: `AbCipDriver.cs` (new method that re-runs the `@tags` enumerator without going
|
||||
through full `ReinitializeAsync`), CLI command in `src/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/`,
|
||||
documentation update in `docs/Driver.AbCip.Cli.md`.
|
||||
- **Test approach**: unit test that two consecutive `RebrowseAsync` calls produce two
|
||||
enumeration passes; integration smoke against the dev-box ControlLogix verifying the
|
||||
address space picks up a tag added between rebrowses.
|
||||
- **Effort**: M
|
||||
- **Dependencies**: ties cleanly into PR 4.4 (system tags) but ships earlier as a
|
||||
programmatic API.
|
||||
- **Docs / fixture / e2e**: appends a `rebrowse` command to `docs/Driver.AbCip.Cli.md`
|
||||
with a Studio 5000 download recipe ("after a download, run
|
||||
`otopcua-abcip-cli rebrowse -g …`"); cross-references the future `_RefreshTagDb` system
|
||||
tag once PR 4.4 lands; updates `docs/drivers/AbServer-Test-Fixture.md` §7 to mark
|
||||
`ITagDiscovery.DiscoverAsync` as covered for the rebrowse path; adds
|
||||
`tests/.../IntegrationTests/AbCipRebrowseTests.cs` driving two consecutive
|
||||
enumerations (the second sees a tag added between calls — ab_server supports runtime
|
||||
reseed via its REST hook); extends `scripts/e2e/test-abcip.ps1` with a
|
||||
rebrowse-after-reseed assertion (or marks it `[OnlyIfRig]` if the simulator's reseed
|
||||
hook isn't reachable).
|
||||
|
||||
#### PR 2.6 — AOI (Add-On Instruction) input/output handling
|
||||
- **Scope**: AOIs are first-class types in L5X (`AddOnInstructionDefinition` blocks). The
|
||||
Template Object decoder at `CipTemplateObjectDecoder.cs` likely already handles them at
|
||||
the wire level (an AOI is a Logix UDT with InOut/Input/Output qualifiers). This PR adds:
|
||||
(a) AOI-aware browse paths so an AOI instance shows up as a folder with `Inputs/`,
|
||||
`Outputs/`, `InOut/` sub-folders; (b) skip-on-discovery for `InOut` parameters per
|
||||
Kepware's documented limitation (InOut is a pointer, not a value).
|
||||
- **Files**: extend `AbCipStructureMember` with an `AoiQualifier` enum
|
||||
(Input/Output/InOut/Local), L5K/L5X parser extends to set it, `AbCipDriver.DiscoverAsync`
|
||||
groups members into qualifier-named sub-folders.
|
||||
- **Test approach**: unit test discovery against an AOI-containing fixture.
|
||||
- **Effort**: M
|
||||
- **Dependencies**: PR 2.2 (L5X) lands the AOI definition parsing.
|
||||
- **Docs / fixture / e2e**: appends an "AOI handling" section to
|
||||
`docs/drivers/AbCip-TagImport.md` covering Inputs/Outputs/InOut grouping + the InOut
|
||||
skip rationale; updates `docs/drivers/AbServer-Test-Fixture.md` §"What it does NOT
|
||||
cover" to keep AOIs flagged as ab_server-blocked but call out Logix Emulate as the
|
||||
authoritative tier; adds a sample AOI-bearing L5X under
|
||||
`tests/.../AbCip.Tests/Import/Fixtures/` and a discovery test that asserts the
|
||||
Inputs/Outputs sub-folder shape; promotes
|
||||
`tests/.../IntegrationTests/Emulate/AbCipEmulateAoiTests.cs` (gated on
|
||||
`AB_SERVER_PROFILE=emulate`) — no `scripts/e2e/test-abcip.ps1` change because AOIs
|
||||
need Emulate or a rig.
|
||||
|
||||
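The qualifier-based grouping in PR 2.6 might be sketched like this. It is a Python illustration with hypothetical names (`group_aoi_members`, the `(name, qualifier)` tuples). It assumes `InOut` members are skipped as described above and, as a further assumption not stated in the plan, that `Local` members are not surfaced either.

```python
from enum import Enum


class AoiQualifier(Enum):
    INPUT = "Input"
    OUTPUT = "Output"
    IN_OUT = "InOut"
    LOCAL = "Local"


# Only value-carrying parameters get a browse folder in this sketch.
_FOLDERS = {AoiQualifier.INPUT: "Inputs", AoiQualifier.OUTPUT: "Outputs"}


def group_aoi_members(members):
    """Split (name, qualifier) pairs into browse folders and a skipped list."""
    folders = {"Inputs": [], "Outputs": []}
    skipped = []
    for name, qualifier in members:
        folder = _FOLDERS.get(qualifier)
        if folder is None:
            skipped.append(name)   # InOut is a pointer, not a value
        else:
            folders[folder].append(name)
    return folders, skipped
```

Returning the skipped names alongside the folders lets a discovery test assert both the folder shape and the skip behaviour.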
### Phase 3 — Performance levers (3 PRs)
|
||||
|
||||
Goal: expose the protocol-mode + connection-tuning knobs that commercial drivers expose as
|
||||
device-level config.
|
||||
|
||||
#### PR 3.1 — Configurable CIP Connection Size per device
|
||||
- **Scope**: today the family profile hard-codes 4002 / 504 / 488 at
|
||||
`AbCipPlcFamilyProfile.cs:33,42,49`. Add an optional `ConnectionSize` field to
|
||||
`AbCipDeviceOptions` that overrides the family default; thread it through to the
|
||||
libplctag tag-create attribute (`connection_size=N`). Validate against a sensible range
|
||||
(500-4002 per Kepware's slider).
|
||||
- **Files**: `AbCipDriverOptions.cs:70-73` (extend `AbCipDeviceOptions` record),
|
||||
`IAbCipTagRuntime.cs` (extend `AbCipTagCreateParams` with `ConnectionSize`),
|
||||
`LibplctagTagRuntime.cs` (set the `Tag.PlcType`-adjacent attribute — investigate libplctag
|
||||
C# wrapper's exposure of `connection_size`; may need to set via `Tag.AddAttribute` if a
|
||||
named property doesn't exist).
|
||||
- **Test approach**: unit test that custom Connection Size flows from options into the
|
||||
`AbCipTagCreateParams`; integration smoke against the dev-box ControlLogix verifying
|
||||
reduced-size connections succeed on legacy v19 firmware. Live test required because
|
||||
libplctag rejects out-of-range values silently in some versions.
|
||||
- **Effort**: S
|
||||
- **Dependencies**: investigate libplctag `connection_size` attribute exposure.
|
||||
- **Docs / fixture / e2e**: appends a "Connection Size" subsection to a new
|
||||
`docs/drivers/AbCip-Performance.md` (consolidates the Phase 3 knobs in one place) and a
|
||||
brief note + warning-symptom callout in `docs/Driver.AbCip.Cli.md` for the new
|
||||
per-device option in the Driver config; updates
|
||||
`docs/drivers/AbServer-Test-Fixture.md` §5 (CompactLogix narrow cap) noting that
|
||||
ab_server still doesn't enforce the cap so live coverage stays Emulate/rig-only;
|
||||
extends `scripts/smoke/seed-abcip-smoke.sql` with a `ConnectionSize` field demo;
|
||||
no `scripts/e2e/test-abcip.ps1` change (boot-time config knob, no per-call surface).
|
||||
|
||||
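The override-and-validate step in PR 3.1 can be sketched as follows. This is a Python illustration with hypothetical function names; the 500-4002 range and the `connection_size=N` attribute text come from the scope above, and whether libplctag accepts that attribute in this form is still the open investigation item.

```python
MIN_CONNECTION_SIZE = 500    # lower bound quoted in the PR 3.1 scope
MAX_CONNECTION_SIZE = 4002   # upper bound quoted in the PR 3.1 scope


def effective_connection_size(family_default: int, override=None) -> int:
    """Return the per-device override if set and valid, else the family default."""
    if override is None:
        return family_default
    if not MIN_CONNECTION_SIZE <= override <= MAX_CONNECTION_SIZE:
        raise ValueError(
            f"ConnectionSize {override} outside "
            f"{MIN_CONNECTION_SIZE}-{MAX_CONNECTION_SIZE}"
        )
    return override


def connection_size_attribute(size: int) -> str:
    """Format the tag-create attribute fragment named in the scope."""
    return f"connection_size={size}"
```

Only explicit overrides are validated in this sketch, so a family default below the slider range passes through unchanged.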
#### PR 3.2 — Symbolic vs logical (instance-ID) addressing toggle
|
||||
- **Scope**: libplctag exposes `use_connected_msg=1&allow_packet_response_packing=1&logical_segment=1`
|
||||
(or similar — investigate the exact attribute name) for instance-ID addressing that skips
|
||||
per-poll ASCII parsing. Add a per-device `AddressingMode` enum
|
||||
(`Symbolic | Logical | Auto`) and thread it through `AbCipTagCreateParams`. `Auto` is the
|
||||
default and matches today's behaviour; `Logical` flips libplctag into instance-ID mode.
|
||||
Logical mode requires a one-time symbol-table walk to map names to instance IDs — reuse
|
||||
`LibplctagTagEnumerator` for the bootstrap.
|
||||
- **Files**: `AbCipDriverOptions.cs` (per-device enum), `IAbCipTagRuntime.cs`
|
||||
(`AbCipTagCreateParams.AddressingMode`), `LibplctagTagRuntime.cs` (translate to libplctag
|
||||
attributes), `AbCipDriver.cs` (run a one-time symbol-walk on first read in Logical mode).
|
||||
- **Test approach**: unit test attribute construction; integration benchmark — read 1000
|
||||
tags in Symbolic vs Logical and assert >2x throughput on the dev-box ControlLogix.
|
||||
- **Effort**: L
|
||||
- **Dependencies**: investigate libplctag's instance-ID API; the mapping pseudo-tag is
|
||||
`@tags` (already used for browse) but the per-tag wire flag needs research. If libplctag
|
||||
doesn't expose this cleanly, the PR drops down to the raw `cip_addr` attribute.
|
||||
- **Docs / fixture / e2e**: appends an "Addressing mode" section to
|
||||
`docs/drivers/AbCip-Performance.md` (Symbolic / Logical / Auto trade-offs); adds a
|
||||
per-device `addressing-mode` knob to `docs/Driver.AbCip.Cli.md` (CLI gains
|
||||
`--addressing-mode` on `read`/`subscribe`/`write` for ad-hoc benchmarking); updates
|
||||
`docs/drivers/AbServer-Test-Fixture.md` §"What it actually covers" to add Logical-mode
|
||||
reads if ab_server's symbol table walks correctly under instance IDs (otherwise
|
||||
marked Emulate-tier-only); adds a benchmark test
|
||||
`tests/.../IntegrationTests/AbCipAddressingModeBenchTests.cs`; extends
|
||||
`scripts/e2e/test-abcip.ps1` with a Symbolic-vs-Logical sanity assertion
|
||||
(read 1000 tags in both modes, assert Logical >= Symbolic throughput).
|
||||
|
||||
#### PR 3.3 — Logical-blocking / non-blocking strategy selector
|
||||
- **Scope**: TOP Server names two modes: "logical-blocking" (whole-UDT read, decode members
|
||||
in-memory) and "logical-non-blocking" (per-member reads packed into one Multi-Service
|
||||
Packet). We have one direction shipped via `AbCipUdtReadPlanner`. Add a per-device
|
||||
`ReadStrategy` enum with three values: `WholeUdt` (current behaviour), `MultiPacket`
|
||||
(new: use libplctag request-packing to bundle per-member reads into one PDU when the UDT
|
||||
is sparse — i.e. only 2-of-50 members subscribed), and `Auto` (planner picks based on
|
||||
fraction-of-members-subscribed threshold). Strategy is per-device because Micro800
|
||||
doesn't support packing.
|
||||
- **Files**: `AbCipDriverOptions.cs` (per-device enum), `AbCipUdtReadPlanner.cs` (add the
|
||||
threshold heuristic), new `AbCipMultiPacketReadPlanner.cs`, `AbCipDriver.ReadAsync`
|
||||
dispatch. Honours `AbCipPlcFamilyProfile.SupportsRequestPacking` at the family level so a
|
||||
user-selected `MultiPacket` on Micro800 falls back to per-tag with a warning logged.
|
||||
- **Test approach**: unit test the heuristic on synthetic batches of varying sparsity;
|
||||
integration benchmark with a 50-member UDT where 5 members are subscribed — verify
|
||||
MultiPacket beats WholeUdt by buffer-size delta.
|
||||
- **Effort**: L
|
||||
- **Dependencies**: PR 1.4 (multi-tag write packing) builds the same libplctag-multi-service
|
||||
primitive; landing 1.4 first reduces scope here.
|
||||
- **Docs / fixture / e2e**: appends a "Read strategy" section to
|
||||
`docs/drivers/AbCip-Performance.md` covering WholeUdt / MultiPacket / Auto plus the
|
||||
sparsity-threshold heuristic; updates `docs/drivers/AbServer-Test-Fixture.md` §1
|
||||
(UDT coverage) with a note that strategy switching is decided in the planner and
|
||||
unit-tested only — Emulate is the authoritative wire-level coverage; adds
|
||||
`tests/.../IntegrationTests/Emulate/AbCipEmulateMultiPacketReadTests.cs` (gated on
|
||||
`AB_SERVER_PROFILE=emulate`); no CLI surface change beyond the existing
|
||||
per-device option, no `scripts/e2e/test-abcip.ps1` change because the simulator
|
||||
doesn't differentiate the two strategies on the wire.
|
||||
|
||||
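The strategy selection in PR 3.3 can be sketched as a small decision function. This is a Python illustration with hypothetical names; the 0.3 default threshold is only a placeholder that still needs benchmarking, and returning `WholeUdt` for `Auto` on a device without request packing is an assumption of the sketch.

```python
def choose_read_strategy(configured, subscribed, total, supports_packing,
                         threshold=0.3):
    """Pick 'WholeUdt', 'MultiPacket', or the 'PerTag' fallback for one UDT.

    configured       -- per-device ReadStrategy: 'WholeUdt', 'MultiPacket', 'Auto'
    subscribed/total -- subscribed member count vs. member count of the UDT
    supports_packing -- family-level request-packing capability
    """
    if configured == "WholeUdt":
        return "WholeUdt"
    if configured == "MultiPacket":
        # A family without packing falls back to per-tag reads.
        return "MultiPacket" if supports_packing else "PerTag"
    if configured != "Auto":
        raise ValueError(f"unknown strategy: {configured}")
    if not supports_packing or total == 0:
        return "WholeUdt"
    sparse = subscribed / total < threshold
    return "MultiPacket" if sparse else "WholeUdt"
```

With 2 of 50 members subscribed on a packing-capable device, `Auto` resolves to `MultiPacket`; with 40 of 50 it stays on `WholeUdt`.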
### Phase 4 — Operability (4 PRs)
|
||||
|
||||
Goal: make the driver behave like a SCADA driver — per-tag scan rates, write deadband,
|
||||
diagnostic system tags, online refresh trigger.
|
||||
|
||||
#### PR 4.1 — Per-tag scan rate / scan group bucketing
|
||||
- **Scope**: today subscriptions key on a single `publishingInterval` per
|
||||
`_poll.Subscribe(...)` call. Add an optional `ScanRate` field to `AbCipTagDefinition` that,
|
||||
when set, overrides the subscription interval for that tag. The shared `PollGroupEngine`
|
||||
already buckets by interval — the change is to read the per-tag rate at subscribe-time
|
||||
and place the tag into its own bucket.
|
||||
- **Files**: `AbCipDriverOptions.cs` (record field), `AbCipDriver.SubscribeAsync` (look up
|
||||
per-tag override before passing to `_poll.Subscribe`). `PollGroupEngine` may need a new
|
||||
`Subscribe(tags, defaultInterval, perTagOverrides)` overload — check Core for the current
|
||||
signature.
|
||||
- **Test approach**: unit test that two tags with different ScanRate values produce two
|
||||
poll buckets; integration test verifying the faster-rate tag publishes more frequently
|
||||
than the slower-rate tag inside one subscription.
|
||||
- **Effort**: M
|
||||
- **Dependencies**: may require a small change to `PollGroupEngine` in Core.
|
||||
- **Docs / fixture / e2e**: new doc `docs/drivers/AbCip-Operability.md` (consolidates the
|
||||
Phase 4 knobs); appends a "Per-tag scan rate" section to it covering Kepware "scan
|
||||
classes" parity + the OPC UA publishing-interval interaction; no CLI surface change;
|
||||
fixture-side no change to ab_server; adds
|
||||
`tests/.../IntegrationTests/AbCipPerTagScanRateTests.cs` driving two tags at
|
||||
different rates against ab_server and asserting bucket separation; extends
|
||||
`scripts/e2e/test-abcip.ps1` with a two-tag subscribe-rate-divergence assertion.
|
||||
|
||||
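The per-tag bucketing in PR 4.1 amounts to resolving an effective interval per tag and grouping on it. The Python sketch below uses hypothetical names (`bucket_by_interval`, `per_tag_overrides`) and does not reflect the actual `PollGroupEngine` signature, which still has to be checked in Core.

```python
from collections import defaultdict


def bucket_by_interval(tags, default_interval_ms, per_tag_overrides):
    """Group tag names by their effective polling interval in milliseconds."""
    buckets = defaultdict(list)
    for tag in tags:
        interval = per_tag_overrides.get(tag, default_interval_ms)
        buckets[interval].append(tag)
    return dict(buckets)
```

Two tags with different `ScanRate` values land in two buckets, while tags without an override share the subscription's default interval.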
#### PR 4.2 — Write deadband / write-on-change
|
||||
- **Scope**: `AbCipDriver.WriteAsync` writes every request through. Add per-tag
|
||||
`WriteDeadband` (numeric) and `WriteOnChange` (boolean). When set, the driver tracks the
|
||||
last successfully-written value per `(tag, deviceHostAddress)` and suppresses the next
|
||||
write if `|new - last| < deadband` (numeric) or `new == last` (any). Suppressed writes
|
||||
return `Good` so OPC UA semantics are unaffected.
|
||||
- **Files**: `AbCipDriverOptions.cs` (record fields), new `AbCipWriteCoalescer.cs` holding
|
||||
the per-tag last-value cache, `AbCipDriver.WriteAsync` consults the coalescer before
|
||||
hitting the runtime.
|
||||
- **Test approach**: unit tests with synthetic writes — assert that a sequence of jittery
|
||||
setpoint values within deadband triggers a single PLC write.
|
||||
- **Effort**: M
|
||||
- **Dependencies**: none.
|
||||
- **Docs / fixture / e2e**: appends a "Write deadband / write-on-change" section to
|
||||
`docs/drivers/AbCip-Operability.md` with a worked setpoint-jitter example; updates
|
||||
`docs/drivers/AbServer-Test-Fixture.md` §7 to flip the multi-write coverage line to
|
||||
also cover suppression; adds
|
||||
`tests/.../IntegrationTests/AbCipWriteDeadbandTests.cs` driving a jittery setpoint and
|
||||
asserting the actual PLC write count via libplctag debug-trace; extends
|
||||
`scripts/e2e/test-abcip.ps1` with a write-coalesce assertion (write the same value
|
||||
twice, verify only one PLC-side change).
|
||||
|
||||
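The suppression rule in PR 4.2 can be sketched as a small last-value cache. This Python version is illustrative only: `WriteCoalescer` and its method names are hypothetical stand-ins for the planned `AbCipWriteCoalescer`, and it assumes the cache is updated only after a write succeeds.

```python
class WriteCoalescer:
    """Tracks the last successfully written value per (tag, device)."""

    def __init__(self):
        self._last = {}

    def should_write(self, tag, device, value, deadband=None, write_on_change=False):
        key = (tag, device)
        if key not in self._last:
            return True                      # nothing written yet
        last = self._last[key]
        if write_on_change and value == last:
            return False                     # identical value, suppress
        numeric = isinstance(value, (int, float)) and isinstance(last, (int, float))
        if deadband is not None and numeric and abs(value - last) < deadband:
            return False                     # inside the deadband, suppress
        return True

    def record_success(self, tag, device, value):
        self._last[(tag, device)] = value
```

For a setpoint sequence 50.0, 50.2, 49.9, 50.1, 51.0 with a deadband of 0.5, only 50.0 and 51.0 reach the PLC, because each candidate is compared against the last value that was actually written.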
#### PR 4.3 — Diagnostic / system tags as browseable variables
|
||||
- **Scope**: surface the `IHostConnectivityProbe` + `DriverHealth` data as browseable OPC UA
|
||||
variables under `AbCip/<device>/_System/`. Variables: `_ConnectionStatus`, `_ScanRate`
|
||||
(current effective publishing interval), `_TagCount`, `_DeviceError`, `_LastScanTimeMs`.
|
||||
Read-only; updated on each driver health transition.
|
||||
- **Files**: `AbCipDriver.DiscoverAsync` (`AbCipDriver.cs:674-758`) emits the system folder
|
||||
per device; new `AbCipSystemTagSource.cs` produces the live values; `ReadAsync` routes
|
||||
`_System/...` references to the source instead of the libplctag runtime.
|
||||
- **Test approach**: unit test that the discovery emits the expected nodes; unit test that
|
||||
reading a system tag returns the current health snapshot.
|
||||
- **Effort**: M
|
||||
- **Dependencies**: none, but PR 2.5 (online refresh trigger) becomes nicer once this lands:
|
||||
writing `1` to `_RefreshTagDb` invokes `RebrowseAsync` (wired up in PR 4.4).
|
||||
- **Docs / fixture / e2e**: appends a "System tags / `_System` folder" section to
|
||||
`docs/drivers/AbCip-Operability.md` enumerating `_ConnectionStatus`, `_ScanRate`,
|
||||
`_TagCount`, `_DeviceError`, `_LastScanTimeMs`; cross-link from
|
||||
`docs/Driver.AbCip.Cli.md` (the `read` cookbook gains a system-tag example); updates
|
||||
`docs/drivers/AbServer-Test-Fixture.md` §7 to flip `IHostConnectivityProbe` state-
|
||||
transition coverage from "no" to covered (system tag observation provides the assertion
|
||||
hook); adds `tests/.../IntegrationTests/AbCipSystemTagDiscoveryTests.cs`; extends
|
||||
`scripts/e2e/test-abcip.ps1` with a `_System/_ConnectionStatus` browse-and-read step.
|
||||
|
||||
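The read routing in PR 4.3 can be sketched as a check for the `_System` folder segment. This Python illustration assumes a slash-separated reference such as `Line1/_System/_TagCount`, which is a hypothetical format, and the names `read_reference` and `health_snapshot` are invented for the example.

```python
SYSTEM_FOLDER = "_System"


def read_reference(reference, health_snapshot, read_from_plc):
    """Serve _System/... references from the health snapshot, others from the PLC."""
    parts = reference.split("/")
    if SYSTEM_FOLDER in parts[:-1]:
        return health_snapshot[parts[-1]]   # e.g. "_ConnectionStatus"
    return read_from_plc(reference)
```

Keeping the routing decision in one function means ordinary tag reads never touch the system-tag source, and system-tag reads never touch the libplctag runtime.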
#### PR 4.4 — Online tag-DB refresh trigger as `_RefreshTagDb` system tag
|
||||
- **Scope**: thin follow-up to PR 2.5 + PR 4.3 — wire the writeable system tag to the
|
||||
existing `RebrowseAsync` method.
|
||||
- **Files**: `AbCipSystemTagSource.cs` (writeable variable), `AbCipDriver.WriteAsync`
|
||||
intercepts `_RefreshTagDb` writes and dispatches to `RebrowseAsync`.
|
||||
- **Test approach**: unit + CLI integration.
|
||||
- **Effort**: S
|
||||
- **Dependencies**: PR 2.5 and PR 4.3.
|
||||
- **Docs / fixture / e2e**: extends the `_System` table in
|
||||
`docs/drivers/AbCip-Operability.md` to mark `_RefreshTagDb` as writeable; appends a
|
||||
"Refreshing the tag DB" recipe to `docs/Driver.AbCip.Cli.md` that pairs the system-tag
|
||||
write with the existing `rebrowse` command from PR 2.5; reuses the
|
||||
`AbCipRebrowseTests` fixture from PR 2.5 with an added system-tag-write entry point;
|
||||
extends `scripts/e2e/test-abcip.ps1` with a `_RefreshTagDb` write-then-verify
|
||||
assertion (chained off the rebrowse step from PR 2.5).
|
||||
|
||||
### Phase 5 — Redundancy (2 PRs)
|
||||
|
||||
Goal: HSBY paired-IP failover for continuous-process plants. Heavier than the rest because
|
||||
it changes the `(device, hostName)` axiom — one logical device now has two host addresses.
|
||||
|
||||
#### PR 5.1 — Paired host address syntax + role probing
|
||||
- **Scope**: extend `AbCipDeviceOptions` with `PartnerHostAddress` (optional). When set, the
|
||||
device probes both gateways concurrently using the existing probe loop machinery
|
||||
(`AbCipDriver.cs:235-281`). A ControlLogix HSBY pair exposes
|
||||
`WallClockTime`/`Module.Status` tags that identify the active chassis — investigate the
|
||||
exact tag name; `WallClockTime.SyncStatus` is one option, `S:34` (Module Status)
|
||||
carries the role bit on some versions.
|
||||
- **Files**: `AbCipDriverOptions.cs` (extend `AbCipDeviceOptions`), `AbCipDriver.cs`
|
||||
(extend `DeviceState` with `ActiveAddress` field, run two probe loops), new
|
||||
`AbCipHsbyRoleProber.cs` reading the role tag and returning Active/Standby.
|
||||
- **Test approach**: unit test with two fake probe runtimes returning different role bits;
|
||||
integration test deferred until a true HSBY pair is available — note in
|
||||
`MEMORY.md/project_aveva_platform_installed.md` that the dev box has a single chassis.
|
||||
- **Effort**: L
|
||||
- **Dependencies**: investigate the canonical HSBY role tag — the AVEVA ABCIP docs name it
|
||||
but the wire-level tag varies by firmware.
|
||||
- **Docs / fixture / e2e**: new doc `docs/drivers/AbCip-HSBY.md` covering the paired-IP
|
||||
config, the role-tag detection matrix (v20 / v24 / v32+), and the feature-flag gate
|
||||
(`Redundancy.Hsby.Enabled`); extends `docs/Driver.AbCip.Cli.md` with a `--partner`
|
||||
flag plus an `hsby-status` command that prints the active partner; updates
|
||||
`docs/drivers/AbServer-Test-Fixture.md` §"What it does NOT cover" with a new entry
|
||||
marking HSBY as ab_server-blocked but adds a "paired-fixture" mode to
|
||||
`tests/.../Docker/docker-compose.yml` (two `controllogix` services on different
|
||||
ports + a `hsby-mux` sidecar that flips the role bit on demand); adds
|
||||
`tests/.../IntegrationTests/AbCipHsbyRoleProberTests.cs`; no
|
||||
`scripts/e2e/test-abcip.ps1` change yet — HSBY e2e is gated behind a sibling
|
||||
`scripts/e2e/test-abcip-hsby.ps1` script introduced in PR 5.2.
|
||||
|
||||
#### PR 5.2 — Failover routing in IPerCallHostResolver
|
||||
- **Scope**: `AbCipDriver.ResolveHost` returns the device's primary address today
|
||||
(`AbCipDriver.cs:307-312`). Change it to return the currently-Active partner. On role
|
||||
transition, the existing bulkhead/breaker per-host keying isolates a stuck primary
|
||||
without affecting the failover path because the partner address has its own breaker.
|
||||
- **Files**: `AbCipDriver.cs:ResolveHost` consults `DeviceState.ActiveAddress`, plus a
|
||||
small change to per-tag runtime caching so handles are keyed on the active address —
|
||||
failover invalidates the handle cache and re-creates against the new gateway.
|
||||
- **Test approach**: unit test that toggling the role flag flips `ResolveHost` output;
|
||||
integration test deferred per PR 5.1.
|
||||
- **Effort**: M
|
||||
- **Dependencies**: PR 5.1.
|
||||
- **Docs / fixture / e2e**: appends a "Failover behaviour" section to
|
||||
`docs/drivers/AbCip-HSBY.md` documenting handle-cache invalidation + bulkhead key
|
||||
semantics; appends a "Failure-mode walkthrough" to the same doc covering
|
||||
primary-stuck / secondary-stuck / both-stuck cases; reuses the paired-fixture from
|
||||
PR 5.1; adds `tests/.../IntegrationTests/AbCipHsbyFailoverTests.cs` driving the
|
||||
role-flip via the `hsby-mux` sidecar and asserting reads route to the new active
|
||||
partner; ships the new `scripts/e2e/test-abcip-hsby.ps1` (paired-fixture variant of
|
||||
the standard e2e — flips the role mid-stream and asserts subscribe stream survives).
|
||||
|
||||
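The failover routing in PR 5.2 can be sketched as a device state that resolves to the active address and drops cached handles when the role flips. This Python version is illustrative; `DeviceState` here is a simplified stand-in with hypothetical method names, and the `create` callback represents whatever creates a per-tag runtime handle.

```python
class DeviceState:
    """Holds the paired addresses, the active one, and a per-tag handle cache."""

    def __init__(self, primary, partner=None):
        self.primary = primary
        self.partner = partner
        self.active_address = primary
        self._handles = {}   # (address, tag) -> handle

    def resolve_host(self):
        return self.active_address

    def get_handle(self, tag, create):
        key = (self.active_address, tag)
        if key not in self._handles:
            self._handles[key] = create(self.active_address, tag)
        return self._handles[key]

    def set_active(self, address):
        if address not in (self.primary, self.partner):
            raise ValueError(f"unknown address: {address}")
        if address != self.active_address:
            self.active_address = address
            self._handles.clear()   # failover invalidates cached handles
```

After `set_active` switches to the partner, the next `get_handle` call re-creates the handle against the new gateway, which matches the handle-cache invalidation described in the Files bullet.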
## Per-PR detail
|
||||
|
||||
The summary above already includes each PR's title, motivation (linked to the
|
||||
featuregaps.md table row), files, test plan, and effort. To avoid duplicating that
|
||||
material, this section covers cross-cutting design notes and risks per phase rather than per PR.
|
||||
|
||||
### Phase 1 risks
|
||||
- **Int64 surface change** (PR 1.1) ripples through the address-space builder + the OPC UA
|
||||
variant emit. Confirm `Core.Abstractions.DriverDataType` already has `Int64`; if not, this
|
||||
PR pulls in a Core change other drivers will share (Modbus has the same TODO).
|
||||
- **STRINGnn variant addressing** (PR 1.2) is the smallest data-correctness PR but has the
|
||||
highest unknown — libplctag's C# wrapper may flatten all string variants to its built-in
|
||||
`GetString(0)` helper. If true, PR 1.2 must add a raw-buffer decode path and is then
|
||||
upgraded from M to L.
|
||||
- **Array-slice planner** (PR 1.3) introduces a third planner alongside the UDT planner +
|
||||
the future write planner (1.4). Build them on a shared `IAbCipReadPlanner` seam so
|
||||
Phase 3's strategy selector has one slot to pivot on, not three.
|
||||
- **Multi-write packing** (PR 1.4) hinges on libplctag exposing CIP Multi-Service Packet
|
||||
construction. If it does not, the work-around is a raw-CIP `@raw` send, which is a
|
||||
bigger lift and may push 1.4 to an L-plus that drags into Phase 3.
|
||||
|
||||
### Phase 2 risks
|
||||
- **L5K text format** has documented edge cases (escape sequences in DESCRIPTION strings,
|
||||
alias resolution, nested DATATYPE blocks). Lean on Rockwell's published L5K BNF and treat
|
||||
unknown sections as warnings, not failures.
|
||||
- **L5X namespace handling** — Studio 5000 v32+ adds optional XML namespaces. Use
|
||||
XPath with prefix-agnostic queries to avoid version-pinning the parser.
|
||||
- **CSV column drift** — Kepware's column order has shifted over major versions. Implement
|
||||
the importer to read by header name, not column index.
|
||||
|
||||
### Phase 3 risks
|
||||
- **Logical addressing bootstrap cost** (PR 3.2) — symbol-table walk on first read can
|
||||
stall the first poll batch. Cache the instance-ID map per `(device, last symbol-table
|
||||
hash)` and persist it across `ReinitializeAsync` if feasible.
|
||||
- **MultiPacket vs WholeUdt heuristic** (PR 3.3) — the threshold (e.g. "switch to
|
||||
MultiPacket when fewer than 30% of UDT members are subscribed") needs benchmarking on
|
||||
real rigs. Ship an explicit per-device override + pick a conservative default.
|
||||
- **Connection Size on legacy firmware** (PR 3.1) — v19-and-earlier ControlLogix firmware
|
||||
rejects Large Forward Open silently. Document the symptom in `docs/Driver.AbCip.Cli.md` and
|
||||
emit a warning when ConnectionSize > 511 against a family profile that is
|
||||
ControlLogix-typed but probed-as-v19.
|
||||
|
||||
### Phase 4 risks
|
||||
- **Per-tag scan rate** (PR 4.1) interacts with the OPC UA subscription's
|
||||
publishing-interval contract. Document that the per-tag override is a *driver-side*
|
||||
publish bucket that fires `OnDataChange` events at the per-tag rate; the OPC UA layer
|
||||
still aggregates them on its own publishing-interval and the client may see them at the
|
||||
larger of the two intervals. This matches Kepware's "scan classes" semantics.
|
||||
- **Write deadband** (PR 4.2) on UDT-fanned-out members must use the member-level cache, not
|
||||
the parent UDT's cache.
|
||||
|
||||
### Phase 5 risks
|
||||
- **HSBY role tag name** (PR 5.1) varies by firmware version; without a real HSBY pair on
|
||||
the dev box the integration coverage is deferred to a customer-site smoke test. Consider
|
||||
parking PR 5.1+5.2 behind a feature flag (`Redundancy.Hsby.Enabled`) and shipping unit
|
||||
coverage only.
|
||||
- **Bulkhead key** is assumed to be `(driver, hostName)`; once `ResolveHost` returns the active
|
||||
partner address that key is correct by construction, but verify Polly's per-key state is
|
||||
invalidated cleanly when the active address changes mid-call.
|
||||
|
||||
## Documentation, fixture, and e2e impact
|
||||
|
||||
Cross-cutting roll-up of the per-PR `Docs / fixture / e2e` lines above. Read this before
|
||||
starting any phase to plan doc + fixture + e2e work in parallel with the code change.
|
||||
|
||||
### New documents
|
||||
|
||||
- `docs/drivers/AbCip-TagImport.md` (Phase 2, lands with PR 2.1; extended by PR 2.2,
|
||||
PR 2.3, PR 2.4, PR 2.6) — L5K / L5X / CSV / AOI tag-import reference.
|
||||
- `docs/drivers/AbCip-Performance.md` (Phase 3, lands with PR 3.1; extended by PR 3.2,
|
||||
PR 3.3) — Connection Size, Addressing Mode, Read Strategy.
|
||||
- `docs/drivers/AbCip-Operability.md` (Phase 4, lands with PR 4.1; extended by PR 4.2,
|
||||
PR 4.3, PR 4.4) — per-tag scan rate, write deadband, system tags.
|
||||
- `docs/drivers/AbCip-HSBY.md` (Phase 5, lands with PR 5.1; extended by PR 5.2) —
|
||||
paired-IP redundancy, role-tag matrix, failover semantics.
|
||||
|
||||
### Documents with appended sections
|
||||
|
||||
- `docs/Driver.AbCip.Cli.md` — gains type-table rows (PR 1.1), `--string-size` flag
|
||||
(PR 1.2), slice syntax (PR 1.3), multi-write subsection (PR 1.4), `tag-import` /
|
||||
`tag-export` commands (PR 2.1, PR 2.2, PR 2.4), `rebrowse` command (PR 2.5),
|
||||
Connection Size note (PR 3.1), `--addressing-mode` flag (PR 3.2), system-tag
|
||||
read example (PR 4.3), `_RefreshTagDb` recipe (PR 4.4), `--partner` flag plus
|
||||
`hsby-status` command (PR 5.1).
|
||||
- `docs/drivers/AbServer-Test-Fixture.md` — coverage map updated by every PR that
|
||||
changes what ab_server actually exercises (1.1, 1.2, 1.3, 1.4, 2.6, 3.1, 3.2,
|
||||
3.3, 4.2, 4.3, 5.1).
|
||||
|
||||
### Fixture / simulator scaffolding
|
||||
|
||||
- `tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests/Docker/docker-compose.yml` —
|
||||
ControlLogix profile gains seeded `TestLINT`, `TestSTRING80`, extra DINT tags for
|
||||
multi-write (PRs 1.1, 1.2, 1.4); a new paired-fixture mode with `hsby-mux` sidecar
|
||||
for HSBY (PR 5.1).
|
||||
- `tests/.../AbCip.IntegrationTests/AbServerProfile.cs` — the `KnownProfiles`
|
||||
records get extended Notes lines for each new seeded tag class.
|
||||
- `tests/.../AbCip.Tests/Import/Fixtures/` — new directory hosting sample L5K, L5X,
|
||||
and CSV files (PRs 2.1, 2.2, 2.4, 2.6).
|
||||
- `tests/.../AbCip.IntegrationTests/Emulate/` — new gated tests for AOI (PR 2.6) and
|
||||
MultiPacket strategy (PR 3.3); reuses the existing `AB_SERVER_PROFILE=emulate`
|
||||
gate.
|
||||
- `tests/.../AbCip.IntegrationTests/LogixProject/README.md` — cross-link added when
|
||||
PR 2.2 lands so the on-site Studio 5000 export doubles as a parser fixture.
|
||||
|
||||
### Integration / e2e scripts
|
||||
|
||||
- `scripts/e2e/test-abcip.ps1` — gains assertions for: LInt loopback (1.1),
|
||||
STRING round-trip (1.2), array-slice read (1.3), recipe multi-write (1.4),
|
||||
tag-import diff (2.1, 2.2, 2.4), Description survival (2.3), rebrowse-after-reseed
|
||||
(2.5), Symbolic-vs-Logical sanity (3.2), per-tag scan-rate divergence (4.1),
|
||||
write-coalesce (4.2), `_System` browse-and-read (4.3), `_RefreshTagDb` write-then-
|
||||
verify (4.4).
|
||||
- `scripts/smoke/seed-abcip-smoke.sql` — extended with `TestLINT`, `TestSTRING80`,
|
||||
multi-write target tags, and a `ConnectionSize` field demo (PRs 1.1, 1.2, 1.4,
|
||||
3.1).
|
||||
- `scripts/e2e/test-abcip-hsby.ps1` — new paired-fixture variant of the standard
|
||||
e2e, ships with PR 5.2; not chained into `scripts/e2e/test-all.ps1` until HSBY
|
||||
exits feature-flag gating.
|
||||
|
||||
### Cross-cutting work
|
||||
|
||||
- The `Docs / fixture / e2e` lines deliberately reuse the existing
|
||||
`Test-Probe` / `Test-DriverLoopback` / `Test-ServerBridge` / `Test-OpcUaWriteBridge` /
|
||||
`Test-SubscribeSeesChange` helpers in `scripts/e2e/_common.ps1` — no new helper
|
||||
functions are required for Phases 1-4. Phase 5 is the first phase that introduces a
|
||||
new helper (`Test-FailoverDuringSubscribe`) in `_common.ps1`, shipped alongside
|
||||
PR 5.2; if other drivers (TwinCAT, S7) later adopt a paired-fixture mode they can
|
||||
reuse it.
|
||||
- `tests/.../AbCip.IntegrationTests/AbServerFixture.cs` may need a small extension in
|
||||
PR 5.1 to support the paired-port probe; the change is additive (probe both
|
||||
`127.0.0.1:44818` and `127.0.0.1:44819`), keeping single-fixture tests working
|
||||
unchanged.
|
||||
|
||||
## Skip-rated items (for context)
|
||||
|
||||
Copied from the Recommendations table at `docs/featuregaps.md`:
|
||||
|
||||
- **#7 Inactivity timeout / keep-alive cadence** — Rarely an issue with libplctag-managed
|
||||
connections.
|
||||
- **#9 "Respect tag-specified scan rate" mode** — Niche; OPC UA subscription rate already
|
||||
covers it.
|
||||
- **#10 Initial value cache / first-update from cache** — OPC UA subscription sampling
|
||||
already handles first-update.
|
||||
- **#15 UDT as first-class OPC UA structured type** — Member fan-out already works;
|
||||
structured-type plumbing is heavy.
|
||||
- **#17 PLC-5 / SLC bridging through CLX** — AbLegacy driver covers this protocol family.
|
||||
- **#21 Unsolicited CIP MSG ingestion** — Separate driver in commercial; design-heavy;
|
||||
niche.
|
||||
- **#22 CIP Generic / Class 3 passthrough** — Niche custom-tooling territory.
|
||||
- **#23 Per-device connection count / pooling** — libplctag manages connections;
|
||||
premature.
|
||||
|
||||
## Open questions
|
||||
|
||||
1. **libplctag instance-ID API** (PR 3.2) — does the C# wrapper expose
|
||||
`logical_segment` / `cip_addr` attributes directly, or do we have to drop down to
|
||||
`Tag.AddAttribute` calls? Affects scope of Phase 3.
|
||||
2. **libplctag CIP Multi-Service Packet** (PR 1.4) — is there a wrapper-level multi-write
|
||||
helper, or must we go through the `@raw` pseudo-tag? Affects scope of Phase 1.
|
||||
3. **`DriverDataType.Int64` / `Int64Array`** (PR 1.1) — does Core already carry it, or is
|
||||
this a shared Core change with Modbus's matching TODO?
|
||||
4. **HSBY role tag** (PR 5.1) — confirm the canonical Active/Standby indicator across
|
||||
ControlLogix v20 / v24 / v32+; without a known tag the role-prober is speculative.
|
||||
5. **AOI InOut handling** (PR 2.6) — Kepware skips InOut parameters because they are
|
||||
pointers, not values. Do we follow the same precedent or attempt to dereference at
|
||||
read-time? Skip is the cheap default.
|
||||
6. **L5K vs L5X coverage** — if the customer base has standardised on L5X (Studio 5000
|
||||
v21+), can we ship PR 2.2 first and make PR 2.1 best-effort? Affects phasing within
|
||||
Phase 2.
|
||||
7. **HSBY scope for v2 vs v3** — Phase 5 carries the largest unknowns; if no continuous-
|
||||
process customer demands it for the v2 release, deferring Phase 5 to v3 is reasonable.
|
||||
8. **Per-tag scan rate plumbing** (PR 4.1) — does `PollGroupEngine` in Core already accept
|
||||
per-reference interval overrides, or does that need a Core extension shared with the
|
||||
other polling-overlay drivers (Modbus, FOCAS)?
|
||||
@@ -0,0 +1,470 @@
|
||||
# AbLegacy Driver — Implementation Plan
|
||||
|
||||
> Source of gap analysis: [featuregaps.md → AbLegacy](../featuregaps.md#ablegacy-allen-bradley-plc-5--slc--micrologix)
|
||||
>
|
||||
> Covers Build = Yes items only. Skip-rated gaps listed at bottom for traceability.
|
||||
|
||||
## Summary
|
||||
|
||||
The AbLegacy driver (PCCC over EtherNet/IP via libplctag) currently ships with parsing for the canonical SLC/PLC-5/MicroLogix file letters, four PLC-family profiles, bit-within-N-word RMW writes, a probe loop, and a flat static-config tag list. The `featuregaps.md` Recommendations table flags 13 gaps as **Build = Yes**:
|
||||
|
||||
1. DH+ via 1756-DHRIO bridging (#2)
|
||||
2. PD/MG/PLS/BT files (#5)
|
||||
3. PLC-5 octal addressing (#7)
|
||||
4. Indirect/indexed addressing (#8)
|
||||
5. Array contiguous block addressing (#9)
|
||||
6. ST string read/write production verification (#10)
|
||||
7. Sub-element bit semantics (`.DN` as Bit) (#11)
|
||||
8. Auto-demote on comm failure (#13)
|
||||
9. RSLogix 500/5 symbol import (#15)
|
||||
10. Per-tag deadband / change filter (#18)
|
||||
11. Diagnostic counters as tags (#20)
|
||||
12. Per-device timeout / retry overrides (#21)
|
||||
13. MicroLogix function-file naming (RTC/HSC/DLS) (#23)
|
||||
|
||||
The plan splits these across **5 phases / 13 PRs** (one PR per gap, with a couple of small ones bundled). Phases are ordered by coupling — addressing correctness first because everything downstream depends on the parser, then file/type coverage, then performance, then workflow tooling, then resilience. Each PR is sized to fit comfortably under the project's per-PR review budget (most S/M; only the RSLogix import is L).
|
||||
|
||||
## Phased delivery
|
||||
|
||||
| Phase | Theme | PRs | Gaps |
|
||||
|-------|-------|-----|------|
|
||||
| 1 | Addressing correctness | 4 | #7 octal, #8 indirect, #11 sub-element bits, #23 ML function files |
|
||||
| 2 | File / type coverage | 2 | #5 PD/MG/PLS/BT, #10 ST verification |
|
||||
| 3 | Performance | 2 | #9 array block, #18 per-tag deadband |
|
||||
| 4 | Workflow | 3 | #15 RSLogix import, #21 per-device timeouts, #20 diagnostic counters |
|
||||
| 5 | Resilience | 2 | #13 auto-demote, #2 DH+ bridging |
|
||||
|
||||
Phase 1 lands first because Phase 2 (PD/MG/PLS/BT) and Phase 3 (array reads) both extend the parser shipped in Phase 1. Phase 5 (auto-demote) reads diagnostic counters from Phase 4 #20, so 4 precedes 5.
|
||||
|
||||
---
|
||||
|
||||
## Per-PR detail
|
||||
|
||||
### Phase 1 — Addressing correctness
|
||||
|
||||
#### PR 1 — PLC-5 octal I/O addressing (#7)
|
||||
|
||||
**Scope**: PLC-5 documentation and RSLogix 5 use octal for `I:` / `O:` word and bit indices (`I:001/17` is rack 0 group 0 word 1, bit 17₈ = bit 15₁₀). Today `AbLegacyAddress.TryParse` does `int.TryParse` on the word number and bit index, silently accepting decimal. For `PlcFamily=Plc5` (and only that family) `I` / `O` files must parse as octal.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyAddress.cs` — add `TryParse(string, AbLegacyPlcFamily)` overload; existing `TryParse(string)` keeps decimal semantics (back-compat for non-PLC-5 callers and pure shape validation).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs` — `EnsureTagRuntimeAsync` and the bit-RMW path call the family-aware overload using `device.Options.PlcFamily`.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/PlcFamilies/AbLegacyPlcFamilyProfile.cs` — add `OctalIoAddressing` flag (true for `Plc5` only).
|
||||
|
||||
**Test plan**:
|
||||
- Unit (`tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Tests/AbLegacyAddressTests.cs`): `I:001/17` parses to word=1, bit=15 under PLC-5; the same string parses to bit=17 under SLC500. `O:7/10` parses to bit=10 under SLC500 (decimal) and to bit=8 under PLC-5 (octal).
|
||||
- Round-trip: `ToLibplctagName()` must emit the format libplctag expects (verify whether libplctag's PLC-5 PCCC layer accepts octal-formatted I/O addresses, or whether we must convert the parsed decimal value back to octal text before forwarding).
|
||||
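The family-aware parse in this test plan can be sketched as follows. This is a minimal illustration in Python, not the C# driver code: `parse_io_address`, `ParsedIo`, and the family strings are hypothetical names, and only the octal-versus-decimal rule and the expected values are taken from the plan above.

```python
import re
from typing import NamedTuple, Optional


class ParsedIo(NamedTuple):
    file_letter: str
    word: int
    bit: Optional[int]


# Only I: / O: addresses of the form "I:001/17" are handled in this sketch.
_IO_PATTERN = re.compile(r"^([IO]):([0-9]+)(?:/([0-9]+))?$")


def parse_io_address(text: str, family: str) -> Optional[ParsedIo]:
    """Parse an I/O address, reading digits as octal only for the PLC-5 family."""
    match = _IO_PATTERN.match(text)
    if match is None:
        return None
    base = 8 if family == "Plc5" else 10  # assumption: family is passed as a string
    try:
        word = int(match.group(2), base)
        bit = int(match.group(3), base) if match.group(3) is not None else None
    except ValueError:
        # Digits 8 and 9 are not valid octal, so reject them under PLC-5.
        return None
    return ParsedIo(match.group(1), word, bit)
```

Keeping the base selection in one place means the same rule could later be reused for the inner address of an indirect reference or for an array start index.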
|
||||
**Docs / fixture / e2e**:
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — extend the "PCCC address primer" with an `I:` / `O:` row noting PLC-5 octal vs SLC500 decimal semantics; worked example showing `I:001/17` resolved differently per family.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — note octal-vs-decimal addressing as a covered family-aware parser dimension under the unit-coverage list.
|
||||
- Fixture: extend `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/Docker/docker-compose.yml` `plc5` profile to seed an `I:001` (or equivalent module-image word) tag if `ab_server --plc=PLC/5` accepts it; otherwise document the gap in `Docker/README.md`.
|
||||
- E2E: add `--plc-type Plc5 -a "I:001/17"` octal-bit assertion to `scripts/e2e/test-ablegacy.ps1` (gated on the `plc5` compose profile being up); no change to `scripts/smoke/seed-ablegacy-smoke.sql` required (existing `N7:5` tag continues to cover the SLC500 path).
|
||||
|
||||
**Effort**: S
|
||||
**Dependencies**: none
|
||||
|
||||
---
|
||||
|
||||
#### PR 2 — MicroLogix function-file letters (RTC / HSC / DLS / MMI / PTO / PWM / STI / EII / IOS / BHI) (#23)
|
||||
|
||||
**Scope**: MicroLogix 1100/1400 expose proprietary function files that don't share file letters with SLC. Today `IsKnownFileLetter` (`AbLegacyAddress.cs:97-101`) only allows the SLC/PLC-5 set, so any tag like `RTC:0.HR` is rejected at parse time even though libplctag's `micrologix` PlcType supports them.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyAddress.cs` — extend `IsKnownFileLetter` to recognise multi-letter function-file types (`RTC`, `HSC`, `DLS`, `MMI`, `PTO`, `PWM`, `STI`, `EII`, `IOS`, `BHI`). Permit only when family is `MicroLogix`. The letter-scan loop already accepts any contiguous letters (`AbLegacyAddress.cs:80-82`).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDataType.cs` — define a sub-element catalogue per function-file (RTC has YR/MON/DAY/HR/MIN/SEC/DOW; HSC has ACC/HIP/LOP/OFS/etc.). Map each sub-element to the right `DriverDataType`.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/PlcFamilies/AbLegacyPlcFamilyProfile.cs` — `SupportsFunctionFiles` flag.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: `RTC:0.HR` parses with `FileLetter="RTC"`, `WordNumber=0`, `SubElement="HR"`. `HSC:0.ACC` parses. The same strings under PlcFamily=Slc500 must be rejected (MicroLogix function-file types are not present on SLC).
|
||||
- Integration (`tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests`): only if a MicroLogix simulator profile exists; flag as TODO otherwise — verify libplctag `micrologix` PlcType accepts these tag names.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- New doc `docs/drivers/AbLegacy-MicroLogix-FunctionFiles.md` — catalogue of supported function files (RTC/HSC/DLS/MMI/PTO/PWM/STI/EII/IOS/BHI), per-family availability matrix (ML1100 vs ML1400 vs ML1500), sub-element-to-DriverDataType table.
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — add a "MicroLogix function files" row to the PCCC address primer with `RTC:0.HR` / `HSC:0.ACC` examples and a CLI worked example.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — record fixture coverage status for function files and link to the `micrologix` profile gap (only if `ab_server --plc=Micrologix` rejects function-file addresses, document the unit-only fallback).
|
||||
- Fixture: extend `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/Docker/docker-compose.yml` `micrologix` profile with `--tag=RTC0[1]` / `--tag=HSC0[1]` if accepted by `ab_server`, else mark as hardware-gated in `Docker/README.md`.
|
||||
- E2E: add a parametric `-PlcType MicroLogix -Address RTC:0.HR` invocation to `scripts/e2e/test-ablegacy.ps1` (skip-when-fixture-gap, mirroring the existing `BadCommunicationError` gate); no `seed-ablegacy-smoke.sql` change unless the fixture supports function-file tags.
|
||||
|
||||
**Effort**: M
|
||||
**Dependencies**: PR 1 (parser overload signature settled)
|
||||
|
||||
---
|
||||
|
||||
#### PR 3 — Sub-element bit semantics (`.DN`, `.EN`, `.TT`, `.CU`, `.CD`, `.OV`, `.UN`, `.ER`) (#11)
|
||||
|
||||
**Scope**: Today `T4:0.DN` parses fine but the `TimerElement`/`CounterElement`/`ControlElement` types collapse to `Int32` (`AbLegacyDataType.cs:41-44`). HMIs expect `.DN` / `.EN` / `.TT` / `.CU` / `.CD` / `.OV` / `.UN` / `.ER` to surface as `Boolean`. The fix is to detect the sub-element at tag-runtime build time and override the driver-surface type.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDataType.cs` — new helper `SubElementBitNames` (HashSet of bit-typed sub-elements per parent type — Timer: EN/TT/DN; Counter: CU/CD/DN/OV/UN; Control: EN/EU/DN/EM/ER/UL/IN/FD). New `EffectiveDriverDataType(AbLegacyDataType, string? subElement)` returning `Boolean` for bit-typed sub-elements, otherwise the existing mapping.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs` — `DiscoverAsync` uses `EffectiveDriverDataType(def.DataType, parsed.SubElement)`; `ReadAsync` decodes the parent word and masks the bit instead of returning the whole word as Int32.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/LibplctagLegacyTagRuntime.cs` — verify libplctag exposes `.DN` etc. as a single bit when read with `GetBit` against the sub-element address. If not, fall back to read-the-word + mask.
|
||||
|
||||
**Test plan**:
|
||||
- Unit (`AbLegacyDriverTests` + new `AbLegacyDataTypeTests`): `T4:0.DN` discovers as Boolean; `T4:0.ACC` discovers as Int32; counter `.OV` is Boolean; control `.LEN` is Int32.
|
||||
- Bit-write semantics: writing Boolean `true` to `T4:0.DN` should be rejected with `BadNotWritable` (timer status bits are PLC-set; verify by integration smoke test against the AbLegacy simulator).
|
||||
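A minimal sketch of the lookup that `EffectiveDriverDataType` is meant to perform, in Python for illustration only. The dictionary and function names are hypothetical; the bit-typed sub-element sets and the `Int32` fallback are the ones listed in this PR's scope and file notes.

```python
from typing import Optional

# Bit-typed sub-elements per parent type, as listed in the Files section.
BIT_SUB_ELEMENTS = {
    "TimerElement": {"EN", "TT", "DN"},
    "CounterElement": {"CU", "CD", "DN", "OV", "UN"},
    "ControlElement": {"EN", "EU", "DN", "EM", "ER", "UL", "IN", "FD"},
}

# Stand-in for the existing mapping: the three parent types collapse to Int32.
EXISTING_MAPPING = {
    "TimerElement": "Int32",
    "CounterElement": "Int32",
    "ControlElement": "Int32",
}


def effective_driver_data_type(data_type: str, sub_element: Optional[str]) -> str:
    """Return Boolean for bit-typed sub-elements, else the existing mapping."""
    bits = BIT_SUB_ELEMENTS.get(data_type, set())
    if sub_element is not None and sub_element.upper() in bits:
        return "Boolean"
    return EXISTING_MAPPING[data_type]
```

Because the override is keyed on the parent type, `.DN` on a timer and `.DN` on a counter are resolved independently, which leaves room for later structure files to add their own sets.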
|
||||
**Docs / fixture / e2e**:
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — extend the Timer/Counter/Control rows in the address primer with a "bit sub-elements surface as Boolean" note and a `--type Bool -a T4:0.DN` CLI example.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — note `AbLegacyDataTypeTests` as a new unit-coverage class under "What it actually covers".
|
||||
- Fixture: no compose change required (T4/C5/R6 already seeded by `ab_server` defaults — verify; if not, add `--tag=T4[5]`/`--tag=C5[5]`/`--tag=R6[5]` to the `slc500` profile in `Docker/docker-compose.yml`).
|
||||
- E2E: extend `scripts/e2e/test-ablegacy.ps1` with a Boolean sub-element read assertion (`read --type Bool -a T4:0.DN`) once the simulator round-trip works. Update `scripts/smoke/seed-ablegacy-smoke.sql` to add a Boolean tag binding `T4:0.DN` so the server-bridge assertion exercises the new mapping.
|
||||
|
||||
**Effort**: M
|
||||
**Dependencies**: none (independent of PR 1/2 parser changes)
|
||||
|
||||
---
|
||||
|
||||
#### PR 4 — Indirect / indexed addressing parser (`N7:[N7:0]`, `N[N7:0]:5`) (#8)
|
||||
|
||||
**Scope**: Recipe / batch lookup tables use `N7:[N7:0]` (read N7 word indexed by the value at N7:0) or `N[N7:0]:5`. Today `AbLegacyAddress.TryParse` rejects both because it requires literal integer word and file numbers.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyAddress.cs` — record gains nullable `IndirectFileSource` and `IndirectWordSource` (each itself an `AbLegacyAddress`). Parser handles `[<inner>]` segments at file-number or word-number positions. Recursion depth capped at 1 (libplctag accepts only one level of indirection per address — verify against libplctag PCCC docs).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDataType.cs` — no change.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs` — pass-through; `ToLibplctagName()` re-emits the bracket form.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: `N7:[N7:0]` → outer file=N7, indirect word source = (N, 7, 0); `B3:[N7:0]/0` → bit, indirect word source = (N, 7, 0); `N[N7:0]:5` → indirect file source = (N, 7, 0), word=5; depth-2 (`N[N[N7:0]:5]:0`) must reject.
|
||||
- Integration: verify libplctag's `slc500`/`plc5` PlcType accepts a `Name` of form `N7:[N7:0]` and resolves at read time. (If libplctag rejects indirect text, fall back to two-step read: resolve the inner address, then read the outer with the resolved index. Document the chosen strategy in the PR.)
|
||||
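The depth-1 rule in the unit cases above can be sketched like this. It is an illustrative Python parser, not the planned C# record: the `Address` dataclass and both function names are hypothetical, and the grammar covers only the bracket forms named in this PR.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Address:
    file_letter: str
    file_number: Optional[int]
    word: Optional[int]
    bit: Optional[int] = None
    indirect_file_source: Optional["Address"] = None
    indirect_word_source: Optional["Address"] = None


# A plain inner address such as "N7:0": no brackets allowed, which caps depth at 1.
_PLAIN = re.compile(r"^([A-Z]+)([0-9]+):([0-9]+)$")
# Outer address: file number and word number may each be a literal or "[inner]".
_OUTER = re.compile(
    r"^([A-Z]+)([0-9]+|\[[^\]]*\]):([0-9]+|\[[^\]]*\])(?:/([0-9]+))?$"
)


def _parse_plain(text: str) -> Optional[Address]:
    match = _PLAIN.match(text)
    if match is None:
        return None
    return Address(match.group(1), int(match.group(2)), int(match.group(3)))


def parse_address(text: str) -> Optional[Address]:
    match = _OUTER.match(text)
    if match is None:
        return None
    letter, file_part, word_part, bit_part = match.groups()
    file_number = word = None
    file_source = word_source = None
    if file_part.startswith("["):
        file_source = _parse_plain(file_part[1:-1])
        if file_source is None:
            return None  # malformed or nested inner address
    else:
        file_number = int(file_part)
    if word_part.startswith("["):
        word_source = _parse_plain(word_part[1:-1])
        if word_source is None:
            return None
    else:
        word = int(word_part)
    bit = int(bit_part) if bit_part is not None else None
    return Address(letter, file_number, word, bit, file_source, word_source)
```

Parsing the inner segment with a bracket-free pattern is what enforces the cap: a second level of brackets never matches, so `N[N[N7:0]:5]:0` is rejected without an explicit depth counter.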
|
||||
**Docs / fixture / e2e**:
|
||||
- New doc `docs/drivers/AbLegacy-Indirect-Addressing.md` — explain `N7:[N7:0]` and `N[N7:0]:5` syntax, the depth-1 limit, the chosen libplctag strategy (verbatim pass-through vs two-step resolve), and recipe-table use cases.
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — add an indirect-addressing row to the address primer with `--address "N7:[N7:0]"` example.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — under unit coverage, list `AbLegacyAddressTests` indirect-parsing cases.
|
||||
- Fixture: no `Docker/docker-compose.yml` change required (`N7[10]` already seeded; the inner index tag at `N7:0` is already addressable). Document recipe-pattern in `Docker/README.md`.
|
||||
- E2E: extend `scripts/e2e/test-ablegacy.ps1` with an indirect-address driver-loopback case (write to `N7:0` to set the index, then read `N7:[N7:0]` and assert the value matches the previously-written content of the resolved word). Skip-gate behind libplctag capability check.
|
||||
|
||||
**Effort**: M
|
||||
**Dependencies**: PR 1 (octal resolution must apply to inner address too if the outer file is `I:`/`O:` on PLC-5)
|
||||
|
||||
---
|
||||
|
||||
### Phase 2 — File / type coverage
|
||||
|
||||
#### PR 5 — PD / MG / PLS / BT structure files (#5)
|
||||
|
||||
**Scope**: Add PD (PID), MG (Message), PLS (Programmable Limit Switch), BT (Block Transfer) file types to the parser and the data-type catalogue. PD has SP/PV/CV/Error/Bias plus 25+ sub-elements; MG has Error/Length/Position/etc.; PLS has LEN/POS; BT is similar to MG.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyAddress.cs` — extend `IsKnownFileLetter` with `PD`, `MG`, `PLS`, `BT`.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDataType.cs` — new enum members `PidElement`, `MessageElement`, `PlsElement`, `BlockTransferElement`. Sub-element catalogue per type — many PD sub-elements are Float32 (`SP`, `PV`, `CV`, `KP`, `KI`, `KD`), some are Boolean (`EN`, `DN`, `MO`, `PE`), some Int16 (`SPS`, `MAXS`, `MINS`).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/LibplctagLegacyTagRuntime.cs` — verify libplctag PCCC supports addressing PD/MG/PLS/BT sub-elements by name; if not, the driver reads the parent struct as a byte block and offsets internally (libplctag docs to consult).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/PlcFamilies/AbLegacyPlcFamilyProfile.cs` — `SupportsPidFile` etc. flags (PLC-5 supports PD/BT; SLC supports PD; ML1100/1400 generally do not — verify per family docs).
|
||||
|
||||
**Test plan**:
|
||||
- Unit: `PD9:0.SP` → Float32; `PD9:0.EN` → Boolean; `MG10:0.LEN` → Int32; reject `PD9:0` (no sub-element on a struct file).
|
||||
- Integration: smoke test against a simulator with PD file configured (verify pylogix/pycomm3 sim supports PD, otherwise mark as TODO and lean on unit coverage).
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- New doc `docs/drivers/AbLegacy-Structure-Files.md` — sub-element catalogues for PD / MG / PLS / BT, per-family availability matrix (PLC-5 vs SLC vs ML), DriverDataType per sub-element.
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — add PD / MG / PLS / BT rows to the file-letter primer with `--type PidElement` etc. examples.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — list new structure-file file letters under unit coverage and note any fixture limitations (pd/mg likely not supported by `ab_server`).
|
||||
- Fixture: extend `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/Docker/docker-compose.yml` `slc500` and `plc5` profiles with `--tag=PD9[2]` / `--tag=MG10[2]` if `ab_server` accepts; otherwise document gap in `Docker/README.md` and rely on unit coverage.
|
||||
- E2E: extend `scripts/e2e/test-ablegacy.ps1` with a `read --type Float -a PD9:0.SP` assertion when fixture exposes the file; add a corresponding tag row to `scripts/smoke/seed-ablegacy-smoke.sql` (skip-gated).
|
||||
|
||||
**Effort**: M
|
||||
**Dependencies**: PR 3 (sub-element bit semantics machinery must exist first — PD `.EN` is Boolean by the same mechanism as Timer `.EN`)
|
||||
|
||||
---
|
||||
|
||||
#### PR 6 — ST string read/write production verification (#10)
|
||||
|
||||
**Scope**: ST is enum-listed and `LibplctagLegacyTagRuntime.DecodeValue` calls `_tag.GetString(0)`, but there's no integration coverage that ST round-trips through libplctag's 82-byte length-word format. This PR is verification + any fixes uncovered.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/LibplctagLegacyTagRuntime.cs` — likely no source change if libplctag's `GetString`/`SetString` already handles the length-word convention; if not, add `GetByteArrayBuffer` + manual length-word decode.
|
||||
- `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/AbLegacyReadSmokeTests.cs` — add `ST_RoundTrip_*` tests against the simulator: write 82-char string, write 0-char, write 41-char, write embedded null/non-ASCII; round-trip each through ReadAsync.
|
||||
- New `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Tests/AbLegacyStringEncodingTests.cs` — unit-level decode of a known length-word + payload byte buffer (mock `IAbLegacyTagRuntime` returning fixed bytes).
|
||||
|
||||
**Test plan**:
|
||||
- Integration: 4 round-trip cases above; covers PlcFamily=Slc500 and PlcFamily=Plc5 (libplctag may handle the length word differently between the two PCCC layers — verify).
|
||||
- Quality: unit test that `BadOutOfRange` surfaces when caller writes a 100-char string to an 82-byte ST.
|
||||
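The unit-level decode test could start from a sketch like the one below. The buffer layout is an assumption, not confirmed by this plan or by libplctag: a 2-byte little-endian length word followed by an 82-byte character area. Byte order and any per-word byte swapping must be checked against libplctag before relying on it. The function names are hypothetical, and `ValueError` stands in for the `BadOutOfRange` status.

```python
import struct

ST_MAX_CHARS = 82  # character capacity of an ST element, per this plan


def encode_st(text: str) -> bytes:
    """Encode text as length word + fixed 82-byte payload (assumed layout)."""
    payload = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    if len(payload) > ST_MAX_CHARS:
        # Stand-in for surfacing BadOutOfRange to the caller.
        raise ValueError(f"string of {len(payload)} chars exceeds {ST_MAX_CHARS}")
    return struct.pack("<H", len(payload)) + payload.ljust(ST_MAX_CHARS, b"\x00")


def decode_st(buffer: bytes) -> str:
    """Decode using the length word, so embedded NULs in the payload survive."""
    if len(buffer) < 2:
        raise ValueError("buffer too short for a length word")
    (length,) = struct.unpack_from("<H", buffer, 0)
    if length > ST_MAX_CHARS or len(buffer) < 2 + length:
        raise ValueError("length word is inconsistent with the buffer")
    return buffer[2:2 + length].decode("ascii")
```

Decoding by the length word rather than by the first NUL byte is what lets the embedded-null round-trip case pass.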
|
||||
**Docs / fixture / e2e**:
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — expand the `ST` row in the address primer with the 82-byte limit, length-word convention, and a `write --type String --value "Hello"` worked example.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — list the new `AbLegacyStringEncodingTests` unit class and the four `ST_RoundTrip_*` integration cases under coverage.
|
||||
- Fixture: extend `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/Docker/docker-compose.yml` `slc500` and `plc5` profiles with `--tag=ST20[5]` so the round-trip tests have a real address to write against; document any `ab_server` ST gaps in `Docker/README.md`.
|
||||
- E2E: extend `scripts/e2e/test-ablegacy.ps1` with a String round-trip case (`-a "ST20:0" --type String`) and a `String` tag row in `scripts/smoke/seed-ablegacy-smoke.sql` so the bridge assertion exercises ST.
|
||||
|
||||
**Effort**: S (mostly tests; small encoding fix if any)
|
||||
**Dependencies**: none
|
||||
|
||||
---
|
||||
|
||||
### Phase 3 — Performance
|
||||
|
||||
#### PR 7 — Array contiguous block addressing (`N7:0,10` or `N7:0[10]`) (#9)
|
||||
|
||||
**Scope**: One PCCC frame can pull up to ~120 words. Today every tag is a separate libplctag instance and a separate request. The fix exposes array tags as a single tag with `IsArray=true` + `ArrayDim`, backed by a libplctag tag with `elem_count=N`.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyAddress.cs` — record gains `ArrayCount` (nullable). Parser accepts `,N` suffix (Rockwell convention) and `[N]` suffix (libplctag convention) on the word number. Reject combination with sub-element or bit index.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriverOptions.cs` — `AbLegacyTagDefinition` gains optional `ArrayLength` (overrides parsed value; convenient when address is parameterised).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/IAbLegacyTagRuntime.cs` — `AbLegacyTagCreateParams` gains `ElementCount` (default 1).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/LibplctagLegacyTagRuntime.cs` — pass `ElementCount` to libplctag `Tag.ElementCount` (verify libplctag supports element counts on PCCC PlcTypes — it does for ab_eip CIP tags but PCCC may behave differently).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs` — `DiscoverAsync` emits `IsArray=true`, `ArrayDim=[N]`; `ReadAsync` decodes via per-index `_tag.GetInt16(i*2)` etc.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: `N7:0,10` parses ArrayCount=10; `N7:0[10]` same; `N7:0,10/3` rejects (array+bit); `T4:0,5.ACC` rejects (array+sub-element).
|
||||
- Integration: reading `N7:0,10` returns 10 elements in one frame; the block read should be at least 5x faster than reading the same 10 words as individual tags (target).
|
||||
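The suffix rules in the unit cases above can be sketched as follows, in Python for illustration. `ArrayAddress` and `parse_array_address` are hypothetical names; the accepted forms (`,N` and `[N]`) and the rejected combinations come from this PR's test plan.

```python
import re
from typing import NamedTuple, Optional


class ArrayAddress(NamedTuple):
    file_letter: str
    file_number: int
    word: int
    array_count: Optional[int]
    suffix: str  # trailing bit or sub-element text, left for the main parser


_ARRAY = re.compile(
    r"^([A-Z]+)([0-9]+):([0-9]+)(?:,([0-9]+)|\[([0-9]+)\])?(.*)$"
)


def parse_array_address(text: str) -> Optional[ArrayAddress]:
    match = _ARRAY.match(text)
    if match is None:
        return None
    letter, file_no, word, comma_n, bracket_n, suffix = match.groups()
    raw_count = comma_n if comma_n is not None else bracket_n
    if raw_count is None:
        # A dangling "," or "[" means a malformed array suffix.
        if suffix.startswith((",", "[")):
            return None
        return ArrayAddress(letter, int(file_no), int(word), None, suffix)
    count = int(raw_count)
    if count < 1 or suffix:
        # Reject zero-length arrays and array + bit / array + sub-element.
        return None
    return ArrayAddress(letter, int(file_no), int(word), count, "")
```

Both spellings normalise to the same `array_count`, so the rest of the driver would only need to handle one representation.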
|
||||
**Docs / fixture / e2e**:
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — add an "Array reads" section explaining `N7:0,10` vs `N7:0[10]` syntax and the per-PCCC-frame ~120-word ceiling, plus a `read --array-length 10 -a N7:0,10` CLI example.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — list array-block reads under unit coverage and note the latency benchmark integration test as a new perf-flagged case.
|
||||
- Fixture: confirm `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/Docker/docker-compose.yml` `--tag=N7[10]` / `--tag=F8[10]` already provide enough contiguous words; otherwise bump array sizes (`N7[120]` to allow max-frame tests).
|
||||
- E2E: extend `scripts/e2e/test-ablegacy.ps1` with a `read -a "N7:0,10"` array assertion (parse comma-separated CLI output); add a matching `IsArray=1` tag row in `scripts/smoke/seed-ablegacy-smoke.sql` to exercise the address-space side.
|
||||
|
||||
**Effort**: M
|
||||
**Dependencies**: PR 1 (octal applies to array index when the file is I/O on PLC-5)
|
||||
|
||||
---
|
||||
|
||||
#### PR 8 — Per-tag deadband / change filter (#18)
|
||||
|
||||
**Scope**: Today `PollGroupEngine` publishes every poll. Add absolute and percent deadband per tag: emit `OnDataChange` only when the new value differs from the last published value by ≥ the deadband.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriverOptions.cs` — `AbLegacyTagDefinition` gains `AbsoluteDeadband` (double?), `PercentDeadband` (double?).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs` — wrap the `PollGroupEngine` callback with a per-tag last-published-value cache and the deadband test. Booleans bypass deadband (always change-on-edge). Strings + status changes always publish.
|
||||
- Verify: `PollGroupEngine` (in `Core.Drivers`) doesn't already centralise this — if it does, this PR threads the per-tag config through the engine instead of layering on top.
|
||||
|
||||
**Test plan**:
|
||||
- Unit (new `AbLegacyDeadbandTests`): tag with `AbsoluteDeadband=1.0` reading `[10.0, 10.5, 11.5, 11.6]` publishes only `10.0` and `11.5`. Boolean tag publishes every transition. Status code change always publishes.
|
||||
- Quality: ensure last-value cache doesn't leak across `ReinitializeAsync`.
|
||||
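A minimal sketch of the absolute-deadband test with a last-published-value cache, in Python for illustration. The class name and its methods are hypothetical, percent deadband is left out because its reference range is not defined in this plan, and string and status-code handling is omitted.

```python
from typing import Optional, Union

Value = Union[bool, int, float]


class AbsoluteDeadbandFilter:
    """Per-tag filter that compares each sample with the last published value."""

    def __init__(self, absolute_deadband: Optional[float]) -> None:
        self._deadband = absolute_deadband
        self._last: Optional[Value] = None
        self._primed = False

    def should_publish(self, value: Value) -> bool:
        if not self._primed or self._deadband is None:
            publish = True  # first sample, or no deadband configured
        elif isinstance(value, bool):
            publish = value != self._last  # Booleans publish on every edge
        else:
            publish = abs(value - self._last) >= self._deadband
        if publish:
            self._last = value
            self._primed = True
        return publish

    def reset(self) -> None:
        """Drop the cached value, e.g. when the driver reinitialises."""
        self._last = None
        self._primed = False
```

Comparing against the last published value rather than the last polled value prevents a slow drift of small steps from being suppressed indefinitely.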
|
||||
**Docs / fixture / e2e**:
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — add a "Deadband" subsection under subscribe with `--deadband-absolute` / `--deadband-percent` CLI flags and example.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — list `AbLegacyDeadbandTests` under unit coverage.
|
||||
- Fixture: no compose change required (per-tag deadband is a config-side concern, not a server simulator one).
|
||||
- E2E: extend `scripts/e2e/test-ablegacy.ps1` with a deadband subscribe assertion (subscribe with `--deadband-absolute 5`, write three small deltas, assert only one notification fires); add a tag row to `scripts/smoke/seed-ablegacy-smoke.sql` with `AbsoluteDeadband=5` to exercise the seed-from-config path.
|
||||
|
||||
**Effort**: S
|
||||
**Dependencies**: none
|
||||
|
||||
---
|
||||
|
||||
### Phase 4 — Workflow
|
||||
|
||||
#### PR 9 — Per-device timeout / retry overrides (#21)
|
||||
|
||||
**Scope**: Replace single driver-wide `Timeout` with per-device override (SLC 5/01 needs ~5 s, SLC 5/05 fine at 2 s, ML1100 sometimes 3 s). Optional retry count per device.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriverOptions.cs` — `AbLegacyDeviceOptions` gains optional `Timeout`, `Retries`. `AbLegacyDriverOptions.Timeout` becomes the driver-wide default.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs` — `EnsureTagRuntimeAsync` and `ProbeLoopAsync` use `device.Options.Timeout ?? _options.Timeout`. `ReadAsync` retry loop honours `device.Options.Retries`.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: device with `Timeout=TimeSpan.FromSeconds(5)` propagates into `AbLegacyTagCreateParams.Timeout`; absent override falls back to driver-wide.
|
||||
- Integration: simulate a slow device (1 s artificial delay) — driver-wide 2 s passes; reducing per-device to 500 ms surfaces `BadCommunicationError` on the slow device while the fast device keeps reading.
|
||||
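The fallback and retry behaviour described in this PR can be sketched as below, in Python for illustration. Both function names are hypothetical, timeouts are plain seconds rather than `TimeSpan` values, and `TimeoutError` stands in for whatever failure the real read path reports.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def effective_timeout(device_timeout: Optional[float], driver_timeout: float) -> float:
    """Per-device override wins; otherwise use the driver-wide default."""
    return driver_timeout if device_timeout is None else device_timeout


def read_with_retries(read_once: Callable[[], T], retries: int) -> T:
    """Call read_once up to retries + 1 times, re-raising the last timeout."""
    attempts = max(retries, 0) + 1
    last_error: Optional[TimeoutError] = None
    for _ in range(attempts):
        try:
            return read_once()
        except TimeoutError as exc:
            last_error = exc
    assert last_error is not None  # attempts >= 1, so a failure was recorded
    raise last_error
```

Treating `Retries` as extra attempts on top of the first one keeps a value of 0 equivalent to the current single-attempt behaviour.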
|
||||
**Docs / fixture / e2e**:
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — document per-device `--timeout-ms` / `--retries` precedence vs driver-wide defaults; add a tuning cheat-sheet for SLC 5/01 vs 5/05 vs ML1100.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — note per-device options under the AbLegacyDeviceOptions surface.
|
||||
- Fixture: no compose change. Add a slow-device test harness using a sidecar that runs `tc qdisc add dev eth0 root netem delay 1000ms` (netem supplies the delay; stock iptables has no delay target), and document it in `Docker/README.md` as an optional perf-tuning fixture.
|
||||
- E2E: no `test-ablegacy.ps1` change needed (per-device timeout is integration-test territory). Add a `Timeout=PT0.5S` device-level row to `scripts/smoke/seed-ablegacy-smoke.sql` so the seed path exercises the new column (500 ms written as an ISO 8601 duration, which has no millisecond designator).
|
||||
|
||||
**Effort**: S
|
||||
**Dependencies**: none
|
||||
|
||||
---
|
||||
|
||||
#### PR 10 — Diagnostic counters as tags (#20)
|
||||
|
||||
**Scope**: Per-device diagnostic counters (request count, response count, retry count, last-error code, comm-failures) surface as auto-generated tags under `AbLegacy/<host>/_Diagnostics/*` so HMIs can bind directly. Mirrors what other drivers expose.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs` — `DeviceState` gains `Counters` (a record of int64 counters plus the last-error code and message). `ReadAsync`, `WriteAsync`, `ProbeLoopAsync` increment counters on success/failure paths. `DiscoverAsync` emits a `_Diagnostics` folder per device with seven Variables: `RequestCount`, `ResponseCount`, `ErrorCount`, `RetryCount`, `LastErrorCode`, `LastErrorMessage`, `CommFailures`.
|
||||
- New `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDiagnosticTags.cs` — generates the 7 well-known tag names; reading them returns counter snapshots from `DeviceState.Counters`.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs` `ReadAsync` short-circuits diagnostic tag references before dispatching to libplctag.
|
||||
|
||||
**Test plan**:
|
||||
- Unit (new `AbLegacyDiagnosticsTests`): force 5 reads (3 success, 2 fail) → `RequestCount=5`, `ErrorCount=2`. `LastErrorCode` reflects the last libplctag status. Counters reset on `ReinitializeAsync`.
|
||||
- Quality: verify the 7 well-known names don't collide with user-config tag names (reject overlap at `InitializeAsync`).
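The counter bookkeeping and the collision check in the test plan can be sketched like this. It is a Python illustration, not the driver's C#; `DeviceCounters` and `colliding_tag_names` are hypothetical names. It assumes `ResponseCount` counts successful replies and that a user tag collides when its name equals `_Diagnostics/<name>`, neither of which the plan pins down.

```python
from dataclasses import dataclass
from typing import Iterable, List

# The seven well-known names listed in the plan.
DIAGNOSTIC_NAMES = (
    "RequestCount", "ResponseCount", "ErrorCount", "RetryCount",
    "LastErrorCode", "LastErrorMessage", "CommFailures",
)


@dataclass
class DeviceCounters:
    """Hypothetical per-device counter snapshot (subset of the seven)."""
    request_count: int = 0
    response_count: int = 0
    error_count: int = 0
    last_error_code: int = 0
    last_error_message: str = ""

    def record_success(self) -> None:
        self.request_count += 1
        self.response_count += 1

    def record_failure(self, code: int, message: str) -> None:
        self.request_count += 1
        self.error_count += 1
        self.last_error_code = code
        self.last_error_message = message

    def reset(self) -> None:
        # Mirrors the "counters reset on ReinitializeAsync" rule.
        self.__init__()


def colliding_tag_names(user_tag_names: Iterable[str]) -> List[str]:
    """Return user tag names that overlap the reserved diagnostic names."""
    reserved = {f"_Diagnostics/{name}" for name in DIAGNOSTIC_NAMES}
    return sorted(reserved.intersection(user_tag_names))
```

A non-empty result from `colliding_tag_names` is the condition the plan says should be rejected at `InitializeAsync`.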
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- New doc `docs/drivers/AbLegacy-Diagnostics.md` — the seven well-known counter tag names, their semantics, namespace convention (`_Diagnostics` folder per device), reset behaviour on `ReinitializeAsync`, and HMI binding examples.
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — note that diagnostic tags surface alongside user-config tags and can be `read --address _Diagnostics/RequestCount` (or whatever the canonical CLI shape ends up being).
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — list `AbLegacyDiagnosticsTests` and call out the collision-rejection contract.
|
||||
- Fixture: no compose change.
|
||||
- E2E: extend `scripts/e2e/test-ablegacy.ps1` with a "after N reads, RequestCount==N" assertion against the diagnostic NodeId published by the OPC UA server-bridge step; add a `_Diagnostics/RequestCount` Tag row to `scripts/smoke/seed-ablegacy-smoke.sql` if the addr-space team requires explicit registration.
|
||||
|
||||
**Effort**: M
|
||||
**Dependencies**: none
|
||||
|
||||
---
|
||||
|
||||
#### PR 11 — RSLogix 500 / PLC-5 symbol & data-table import (#15)
|
||||
|
||||
**Scope**: Import RSLogix exports (`.RSS` Slc500, `.RSP` Plc5, `.SLC` text export) to seed `AbLegacyTagDefinition` entries. The binary `.RSS`/`.RSP` formats are proprietary and largely undocumented; the practical strategy is to support the `.SLC` / `.CSV` text exports that RSLogix can produce ("save as text" / "Database Export"). Verify whether libplctag or a sister project ships an `.RSS` parser — if not, scope to text exports only and document the binary case as a future enhancement.
|
||||
|
||||
**Files**:
|
||||
- New `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/Import/RsLogixSymbolImport.cs` — parses RSLogix text export (CSV: `Symbol,Address,Description,DataType,Scope`).
|
||||
- New `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/Import/IRsLogixImporter.cs` — abstraction for future binary support.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriverFactoryExtensions.cs` — extension method `AddRsLogixImport(string path, string deviceHostAddress)` materialises `AbLegacyTagDefinition` entries from the file at startup-time.
|
||||
- New CLI command in `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli/` (mirrors AbCip CLI patterns; verify the AbLegacy CLI project layout first): `import-rslogix --file foo.csv --device ab://... --emit appsettings-fragment`.
|
||||
|
||||
**Test plan**:
|
||||
- Unit (new `RsLogixSymbolImportTests`): canonical CSV with one of each file letter (N/F/B/L/ST/T/C/R) generates 8 `AbLegacyTagDefinition` entries with correct `DataType`. Malformed rows skipped with logged warning. Comments and header rows skipped.
|
||||
- Integration: an end-to-end test with a recorded RSLogix CSV (committed under `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Tests/Fixtures/`) produces an addr-space matching a golden snapshot.
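The unit-test expectations above (header and comment rows skipped, malformed rows warned about, one tag per valid row) can be sketched as below. This is a Python illustration, not the planned C# importer. The data-type names in `FILE_LETTER_TYPES`, the `#` comment convention, and the returned dictionary shape are all assumptions; only the column order `Symbol,Address,Description,DataType,Scope` comes from the plan.

```python
import csv
import io
import re
from typing import Dict, List, Tuple

# Assumed mapping from file letter to a data-type label (names are illustrative).
FILE_LETTER_TYPES = {
    "N": "Int", "F": "Float", "B": "Bit", "L": "Long",
    "ST": "String", "T": "TimerElement", "C": "CounterElement", "R": "ControlElement",
}

# File letter(s), file number, ':' and element number, e.g. N7:0 or ST20:3.
ADDRESS_RE = re.compile(r"^(ST|[NFBLTCR])(\d+):(\d+)")


def import_symbols(text: str) -> Tuple[List[Dict[str, str]], List[str]]:
    """Parse a text export into tag dicts plus a list of warnings."""
    tags: List[Dict[str, str]] = []
    warnings: List[str] = []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row or row[0].lstrip().startswith("#"):
            continue  # blank line or comment (comment marker is an assumption)
        if row[0].strip().lower() == "symbol":
            continue  # header row
        symbol = row[0].strip()
        address = row[1].strip() if len(row) > 1 else ""
        match = ADDRESS_RE.match(address)
        if not symbol or match is None:
            warnings.append(f"line {line_no}: skipped malformed row")
            continue
        tags.append({
            "name": symbol,
            "address": address,
            "data_type": FILE_LETTER_TYPES[match.group(1)],
        })
    return tags, warnings
```

Returning the warnings alongside the tags keeps the sketch testable without a logger; the real importer would presumably log them instead.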
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- New doc `docs/drivers/AbLegacy-RSLogix-Import.md` — supported export formats (CSV / .SLC text), CSV column convention, scope handling, the `import-rslogix` CLI subcommand, and the explicit non-goal of binary `.RSS`/`.RSP` parsing for v1.
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — add an `import-rslogix` subcommand row to the commands table with `--file foo.csv --device ab://... --emit appsettings-fragment` example.
|
||||
- Update `docs/DriverClis.md` if it carries a per-CLI command matrix.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — list `RsLogixSymbolImportTests`, the new `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Tests/Fixtures/` golden CSV, and the import-then-read integration scenario.
|
||||
- Fixture: new committed CSV under `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Tests/Fixtures/rslogix-canonical.csv` plus the corresponding golden snapshot. No `Docker/docker-compose.yml` change.
|
||||
- E2E: extend `scripts/e2e/test-ablegacy.ps1` with an `import-rslogix` invocation that emits an appsettings fragment, then asserts the resulting tag count matches the CSV row count. No `seed-ablegacy-smoke.sql` change (importer is offline tooling).
|
||||
|
||||
**Effort**: L (parser + CLI + golden-snapshot fixture)
|
||||
**Dependencies**: PR 1–5 complete (importer must produce addresses the parser accepts)
|
||||
|
||||
---
|
||||
|
||||
### Phase 5 — Resilience
|
||||
|
||||
#### PR 12 — Auto-demote on comm failure (#13)
|
||||
|
||||
**Scope**: When a device fails N consecutive reads/probes, mark it Demoted and skip its tags for `DemoteFor` seconds — so one slow PLC doesn't starve fast PLCs sharing the same driver/poll cadence.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriverOptions.cs` — new `AbLegacyDemoteOptions { FailureThreshold=3, DemoteFor=TimeSpan.FromSeconds(30), Enabled=true }` on `AbLegacyDeviceOptions`.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyDriver.cs` — `DeviceState` gains `ConsecutiveFailures`, `DemotedUntilUtc`. `ReadAsync` short-circuits demoted devices with `BadCommunicationError` until `DemotedUntilUtc`. `ProbeLoopAsync` clears demote on first success. New `HostState.Demoted` enum value (verify `HostState` is in `Core.Abstractions` and adding a member is non-breaking).
|
||||
- Diagnostic tags from PR 10 gain `DemoteCount` and `LastDemotedUtc`.
|
||||
|
||||
**Test plan**:
|
||||
- Unit (new `AbLegacyAutoDemoteTests`): force 3 consecutive failures → device transitions to `Demoted`; reads while demoted return `BadCommunicationError` without invoking libplctag (verify via test fake counting `ReadAsync` calls). After `DemoteFor` expires, the next read attempt goes through.
|
||||
- Integration: two devices on the same driver, one with a fault — fault doesn't slow down the healthy one.
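The demote state machine described above can be sketched as follows. This is a Python model, not the driver's C#; `DemoteTracker` is a hypothetical name, and the defaults mirror the plan's `FailureThreshold=3` and `DemoteFor=30 s`. One assumption to note: the failure count is not cleared when the window expires, so a failure on the first read after expiry demotes the device again immediately.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class DemoteTracker:
    """Hypothetical per-device demotion state."""
    failure_threshold: int = 3
    demote_for: timedelta = timedelta(seconds=30)
    consecutive_failures: int = 0
    demoted_until_utc: Optional[datetime] = None

    def is_demoted(self, now: datetime) -> bool:
        # Demoted only while the window is still open.
        return self.demoted_until_utc is not None and now < self.demoted_until_utc

    def record_failure(self, now: datetime) -> None:
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.demoted_until_utc = now + self.demote_for

    def record_success(self) -> None:
        # A successful read or probe clears the demotion entirely.
        self.consecutive_failures = 0
        self.demoted_until_utc = None
```

A read path would check `is_demoted(now)` first and return `BadCommunicationError` without touching the wire when it is true.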
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- New doc `docs/drivers/AbLegacy-AutoDemote.md` (or a section appended to `AbLegacy-Diagnostics.md` from PR 10) — failure-threshold + demote-window semantics, interaction with the probe loop, the `HostState.Demoted` enum value, recovery path.
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — add `--demote-failure-threshold` / `--demote-for` per-device flags and document how `probe` reflects the Demoted state.
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — list `AbLegacyAutoDemoteTests` and the two-device fault-isolation integration case.
|
||||
- Fixture: extend `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/Docker/docker-compose.yml` with a second `slc500-faulty` service that listens on `:44819` but rejects every read (or doesn't bind, simulating ECONNREFUSED). The driver test then targets both `:44818` (healthy) and `:44819` (faulty) to exercise demotion.
|
||||
- E2E: extend `scripts/e2e/test-ablegacy.ps1` with a "kill simulator, observe demotion in `_Diagnostics/DemoteCount`" assertion (gated on PR 10's diagnostic tags being present). Add a `DemoteFor=PT30S` device row to `scripts/smoke/seed-ablegacy-smoke.sql`.
|
||||
|
||||
**Effort**: M
|
||||
**Dependencies**: PR 10 (diagnostic counters)
|
||||
|
||||
---
|
||||
|
||||
#### PR 13 — DH+ via 1756-DHRIO bridging (#2)
|
||||
|
||||
**Scope**: Allow addressing a PLC-5 sitting on a DH+ link reached through a ControlLogix chassis with a 1756-DHRIO module. The CIP path syntax is `1,<slot>,2,<dh+_station_octal>` — already accepted as a string by `AbLegacyHostAddress`, but we should validate and document it, and verify libplctag's `plc5` PlcType resolves DH+ stations correctly through the DHRIO port.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/AbLegacyHostAddress.cs` — add validation for the DH+ path form `1,<slot>,2,<station>` where station is 0..77 octal. Surface the parsed components (`BackplaneSlot`, `DhPlusPort`, `DhPlusStation`) for diagnostics.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy/PlcFamilies/AbLegacyPlcFamilyProfile.cs` — note that DH+ bridging is a `Plc5`-only path (DHRIO doesn't bridge to SLC/ML).
|
||||
- `docs/Driver.AbLegacy.Cli.md` — add a worked example of DHRIO routing.
|
||||
|
||||
**Test plan**:
|
||||
- Unit (`AbLegacyHostAndStatusTests`): `ab://10.0.0.1/1,3,2,07` parses with slot=3, station=7₈=7. `ab://10.0.0.1/1,3,2,77` parses station=77₈=63. `ab://10.0.0.1/1,3,2,80` rejects (octal range).
|
||||
- Integration: requires a real DHRIO + PLC-5 — flag as hardware-gated; cover with unit-only for now and document the manual smoke procedure (`docs/Driver.AbLegacy.Cli.md`).
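The three unit cases above can be sketched as a small validator. This is a Python illustration, not the planned C# change to `AbLegacyHostAddress`; `DhPlusPath` and `parse_dh_plus_path` are hypothetical names, and the sketch only accepts the exact `1,<slot>,2,<station>` form named in the scope.

```python
import re
from typing import NamedTuple


class DhPlusPath(NamedTuple):
    """Parsed components of a DHRIO-bridged CIP path."""
    backplane_slot: int
    dh_plus_port: int
    dh_plus_station: int  # decimal value of the octal station number


# Backplane port 1, slot, DHRIO port 2, then a 1-2 digit octal station (0..77).
_DH_PLUS_RE = re.compile(r"1,(\d+),2,([0-7]{1,2})")


def parse_dh_plus_path(cip_path: str) -> DhPlusPath:
    match = _DH_PLUS_RE.fullmatch(cip_path)
    if match is None:
        raise ValueError(f"not a valid DH+ bridge path: {cip_path!r}")
    slot = int(match.group(1))
    station = int(match.group(2), 8)  # station digits are octal
    return DhPlusPath(backplane_slot=slot, dh_plus_port=2, dh_plus_station=station)
```

Restricting the station digits to `0-7` in the pattern means `80` is rejected before any numeric conversion happens.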
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- New doc `docs/drivers/AbLegacy-DH-Bridging.md` — the `1,<slot>,2,<station_octal>` CIP path syntax, DHRIO module wiring overview, octal-station-number reference (00..77 octal = 0..63), restriction to PLC-5 family, and the manual smoke procedure since DHRIO can't be simulated.
|
||||
- Update `docs/Driver.AbLegacy.Cli.md` — extend the family/cip-path cheat sheet with a "PLC-5 via DHRIO" row showing `ab://logix-host/1,3,2,07` and a worked CLI example. (Plan already calls this out at line 279 — keep it, but link to the new dedicated doc.)
|
||||
- Update `docs/drivers/AbLegacy-Test-Fixture.md` — note that DH+ bridging is unit-only (no fixture support possible) and reference the manual hardware smoke procedure.
|
||||
- Fixture: no `Docker/docker-compose.yml` change is feasible (DHRIO is hardware-only).
|
||||
- E2E: no new automated `test-ablegacy.ps1` case (would require real DHRIO). Add a `-DhPlusStation 7` parameter form documented in the script comment header for hardware-gated runs only. No `seed-ablegacy-smoke.sql` change.
|
||||
|
||||
**Effort**: S
|
||||
**Dependencies**: PR 1 (octal parsing utility) — share the octal-int helper between PR 1 and PR 13.
|
||||
|
||||
---
|
||||
|
||||
## Documentation, fixture, and e2e impact
|
||||
|
||||
Consolidated view of every doc, fixture, and e2e/smoke artefact this plan touches, so reviewers and PR authors can size the non-code surface area at a glance.
|
||||
|
||||
### New docs (created by this plan)
|
||||
|
||||
| Doc | Created by | Purpose |
|-----|-----------|---------|
| `docs/drivers/AbLegacy-MicroLogix-FunctionFiles.md` | PR 2 | Function-file catalogue (RTC/HSC/DLS/MMI/PTO/PWM/STI/EII/IOS/BHI), per-family availability, sub-element types |
| `docs/drivers/AbLegacy-Indirect-Addressing.md` | PR 4 | `N7:[N7:0]` and `N[N7:0]:5` syntax, depth-1 limit, libplctag strategy |
| `docs/drivers/AbLegacy-Structure-Files.md` | PR 5 | PD / MG / PLS / BT sub-element catalogues + per-family availability matrix |
| `docs/drivers/AbLegacy-Diagnostics.md` | PR 10 | Seven well-known counter tag names, namespace convention, reset semantics |
| `docs/drivers/AbLegacy-RSLogix-Import.md` | PR 11 | CSV / `.SLC` text-export schema, `import-rslogix` CLI, binary-format non-goals |
| `docs/drivers/AbLegacy-AutoDemote.md` (or PR 10 doc extension) | PR 12 | Demote thresholds, recovery, `HostState.Demoted` semantics |
| `docs/drivers/AbLegacy-DH-Bridging.md` | PR 13 | `1,<slot>,2,<station_octal>` CIP path, DHRIO wiring, manual smoke procedure |
|
||||
|
||||
### Updated docs (extended by this plan)
|
||||
|
||||
- `docs/Driver.AbLegacy.Cli.md` — extended by **every** PR (octal I/O, function files, sub-element bits, indirect, structure files, ST round-trip, array reads, deadband flags, per-device timeouts, diagnostic tags, RSLogix import subcommand, demote flags, DHRIO cheat-sheet row).
|
||||
- `docs/drivers/AbLegacy-Test-Fixture.md` — extended by **every** PR with new unit test classes, integration cases, and fixture limitations.
|
||||
- `docs/DriverClis.md` — touched by PR 11 (new `import-rslogix` subcommand row).
|
||||
- `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/Docker/README.md` — touched by PRs 1, 2, 4, 5, 9, 12 (fixture limitations, optional perf-tuning sidecars, faulty-device service, recipe-pattern note).
|
||||
|
||||
### Fixture / scaffolding work
|
||||
|
||||
- `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/Docker/docker-compose.yml`:
  - PR 1: extend `plc5` profile with `I:001`-style tags (if `ab_server` accepts).
  - PR 2: extend `micrologix` profile with `RTC0[1]`/`HSC0[1]` (if accepted).
  - PR 3: extend `slc500` profile with `T4[5]`/`C5[5]`/`R6[5]` if not already seeded by `ab_server` defaults.
  - PR 5: extend `slc500` and `plc5` profiles with `PD9[2]`/`MG10[2]` (if accepted).
  - PR 6: extend `slc500` and `plc5` profiles with `ST20[5]`.
  - PR 7: bump array sizes (`N7[120]`) for max-frame array-read tests.
  - PR 12: add a second `slc500-faulty` service for demotion/fault-isolation tests.
- `tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Tests/Fixtures/`:
  - PR 11: new `rslogix-canonical.csv` + golden snapshot for the symbol-import integration test.
|
||||
|
||||
### E2E / smoke scripts
|
||||
|
||||
- `scripts/e2e/test-ablegacy.ps1`:
  - PR 1: octal-bit `Plc5` assertion.
  - PR 2: `MicroLogix RTC:0.HR` parametric.
  - PR 3: Boolean sub-element read (`T4:0.DN`).
  - PR 4: indirect-address loopback.
  - PR 5: `PD9:0.SP` Float read (skip-gated).
  - PR 6: ST round-trip.
  - PR 7: array-read `N7:0,10`.
  - PR 8: deadband subscribe assertion.
  - PR 10: `_Diagnostics/RequestCount` assertion via OPC UA bridge.
  - PR 11: `import-rslogix` invocation + tag-count assertion.
  - PR 12: kill-simulator-and-observe-demote assertion.
  - PR 13: parameter-only header note for hardware-gated DHRIO runs.
|
||||
- `scripts/smoke/seed-ablegacy-smoke.sql`:
  - PR 3: `T4:0.DN` Boolean tag row.
  - PR 5: `PD9:0.SP` PidElement tag row (skip-gated).
  - PR 6: `ST20:0` String tag row.
  - PR 7: `N7:0,10` array tag row (`IsArray=1`).
  - PR 8: tag row with `AbsoluteDeadband=5`.
  - PR 9: device row with `Timeout=PT0.5S` (500 ms as an ISO 8601 duration).
  - PR 10: `_Diagnostics/RequestCount` tag row (if explicit registration required).
  - PR 12: device row with `DemoteFor=PT30S`.
|
||||
|
||||
---
|
||||
|
||||
## Skip-rated items (for context)
|
||||
|
||||
For traceability, the gaps the recommendations table flagged **No**:
|
||||
|
||||
| # | Gap | Skip rationale |
|---|-----|----------------|
| 1 | Serial DF1 transports (full-duplex, half-duplex, KF2/KF3) | libplctag has no serial path; declining install base |
| 3 | DH-485 routing (1761/1747-AIC) | Very legacy; rare in greenfield |
| 4 | M0 / M1 module file access | Niche RIO modules; declining |
| 6 | D (BCD) and Long-BCD types | Very legacy data convention |
| 12 | Block read-size negotiation per family | libplctag handles chunking implicitly |
| 14 | Channel-shared comm serialisation | Only matters for serial / DH+ transport (not built) |
| 16 | Online controller browse / data-table discovery | PCCC dir frame limited; libplctag support unclear |
| 17 | DF1 BCC vs CRC-16 selection | Predicated on DF1 transport (gap #1) |
| 19 | PLC-5 typed-read selection / Force Logical | libplctag defaults are sound; niche tuning |
| 22 | Write completion semantics options | Niche tuning; current write-through is safe default |
|
||||
|
||||
These remain documented in `featuregaps.md` and can be reopened if customer feedback warrants.
|
||||
|
||||
---
|
||||
|
||||
## Open questions
|
||||
|
||||
1. **libplctag PCCC capability verification** — several PRs (especially 2, 4, 5, 7) hinge on what libplctag's `slc500` / `micrologix` / `plc5` / `logixpccc` PlcTypes actually accept in the `Name` attribute. Before scheduling Phase 2 we should run a one-day spike with the AbLegacy simulator to confirm:
   - Does libplctag accept indirect addresses (`N7:[N7:0]`) verbatim, or do we need to resolve in two steps?
   - Does it accept array notation (`N7:0,10` vs `N7:0[10]`) for PCCC PlcTypes?
   - Does it expose PD/MG/PLS/BT sub-elements by name, or do we read the parent struct as a byte block?
   - Does it correctly handle PLC-5 octal in I:/O: addresses, or does the driver need to convert?
|
||||
2. **MicroLogix simulator fidelity** — we don't currently know whether the AbLegacy integration-test fixture (`AbLegacyServerFixture`) simulates the MicroLogix function files (RTC/HSC/DLS). PR 2's integration coverage is gated on this. If not, we either extend the fixture or scope PR 2 to unit-only tests + a hardware smoke-test playbook.
|
||||
3. **RSLogix import format coverage** — binary `.RSS` / `.RSP` parsing is non-trivial. PR 11 scopes to text/CSV exports. Should we instead invest in shelling out to the (free) Rockwell `RSWho` / `RSLogix Emulate` tooling for binary conversion, or accept text-only as the v1 scope and revisit?
|
||||
4. **Address-space rebuild on tag-set change** — when PR 11 (RSLogix import) adds 1000+ tags, does `ReinitializeAsync` perform acceptably, or do we need an incremental discovery path? Out of scope for this plan but worth flagging.
|
||||
5. **Diagnostic tag namespace collision** — PR 10 reserves `_Diagnostics` under each device folder. Confirm with the address-space team that the leading underscore is the established convention (other drivers use `_System` or `_DiagnosticTags`); align before implementation.
|
||||
@@ -0,0 +1,807 @@
|
||||
# FOCAS Driver — Implementation Plan
|
||||
|
||||
> Source of gap analysis: [featuregaps.md → FOCAS](../featuregaps.md#focas-fanuc-cnc)
>
> Covers Build = Yes items only.
|
||||
|
||||
## Summary
|
||||
|
||||
The FOCAS driver today is a pure-managed, read-only FOCAS/2 wire client
(`src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/Wire/`) backing a fixed-tree projection
plus user-authored `PARAM:` / `MACRO:` / PMC tags. It exposes a thin set of
calls (`cnc_sysinfo`, `cnc_rdcncstat`, `cnc_rdaxisname`, `cnc_rdspdlname`,
`cnc_rddynamic2`, `cnc_rdsvmeter`, `cnc_rdspload`, `cnc_rdspmaxrpm`,
`cnc_exeprgname2`, `cnc_rdblkcount`, `cnc_rdopmode`, `cnc_rdtimer`,
`cnc_rdparam`, `cnc_rdmacro`, `pmc_rdpmcrng`, `cnc_rdalmmsg2`).
|
||||
|
||||
The featuregaps table marks **18** items as Build = Yes. They cluster into
five distinct workstreams:

1. **Phase 1 — fixed-tree expansion** (#6, #7, #8, #10, #11, #12, #13, #14,
   #18, #20, #24, #27). These are mostly new wire calls plumbed into the
   existing `FixedTree*` poll cadences; no architectural change.
2. **Phase 2 — addressing additions** (#4, #14 DIAG scheme, #15, #16). New
   `FocasAreaKind` values, new capability-matrix entries, multi-path
   `PathId`. Touches the parser + matrix + wire envelope; mostly additive.
3. **Phase 3 — alarm history** (#17). Extends the existing
   `FocasAlarmProjection` with a one-shot history pull on connect plus
   periodic delta polls.
4. **Phase 4 — write path** (#1, #3). The biggest behavioural change in
   the driver's lifetime: removes the `BadNotWritable` short-circuit, adds
   `cnc_wrparam` / `pmc_wrpmcrng` / `cnc_wrmacro` plus FOCAS password
   handling. Material risk surface — see Risks.
5. **Phase 5 — derived telemetry** (#24 cycle-delta computation). Optional
   companion to #24 raw cycle time; computes "last completed cycle" from
   the existing cumulative `Cycle` timer.
|
||||
|
||||
DIAG (#14) is in Phase 2 (addressing) rather than Phase 1 because it
needs a new address scheme, but the fixed-tree status flag projection
(#12) is the cheapest item and should land first as a vertical slice.
|
||||
|
||||
The remaining 9 items in the featuregaps table (HSSB, Series 15 / 35i,
tool-offset write, program upload/download, DPRNT, deep servo info,
acceleration/jerk, operator preset commands, NTP) are scoped out as
Build = No; they appear in [Skip-rated items](#skip-rated-items-for-context)
for context only.
|
||||
|
||||
## Phased delivery
|
||||
|
||||
| Phase | Scope | Gaps closed | Approx PRs | Risk |
|-------|-------|-------------|------------|------|
| 1 | Fixed-tree expansion (read-only) | 12, 13, 7, 8, 10, 11, 20, 18, 6, 24, 27, 14 (read-only piece) | 6 | Low |
| 2 | Addressing additions | 4, 15, 16, 14 (DIAG: scheme) | 4 | Medium (multi-path) |
| 3 | Alarm history | 17 | 1 | Low |
| 4 | Write path + password | 1, 3 | 4 | High (read-only design choice removed) |
| 5 | Cycle-delta derived telemetry | 24 (delta companion) | 1 | Low |
|
||||
|
||||
Phases 1–3 are mutually independent and can ship in any order. Phase 4
deliberately follows Phase 2 so writes ride on top of the multi-path
addressing already in place. Phase 5 tags onto the cycle-time node from
Phase 1.
|
||||
|
||||
## Per-PR detail
|
||||
|
||||
### Phase 1 — fixed-tree expansion
|
||||
|
||||
Common shape: each PR adds one or more wire calls in
`Wire/FocasWireClient.cs`, surfaces them on `IFocasClient`, plumbs them
into `FocasDriver`'s `FixedTreeLoopAsync` cadences (axis 250 ms / program
1 s / timer 30 s) and the `TryReadFixedTree` synthesizer, then adds
fakes + assertions.
|
||||
|
||||
**PR F1-a — ODBST status flags as fixed-tree nodes (#12)**
|
||||
- Scope: project the 9 fields of `cnc_rdcncstat` (`tmmode`, `aut`, `run`,
  `motion`, `mstb`, `emergency`, `alarm`, `edit`, `dummy`) under
  `Status/` per device. We already issue this call in `ProbeAsync`; this
  PR keeps the boolean probe but additionally caches the full struct on
  every poll tick.
- Files:
  `Wire/FocasWireClient.cs` (extend `ReadStatusAsync` to return the
  whole `WireStatus` rather than only `IsOk`), `IFocasClient.cs` (new
  `GetStatusAsync`), `FocasDriver.cs` (new `Status/*` branch in
  `TryReadFixedTree`, status cache on `DeviceState`).
- Tests:
  `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Tests/FocasFixedTreeStatusTests.cs`
  (new) — `FakeFocasClient` returns canned ODBST, assert each field maps
  to the expected `Status/*` browse name. Integration: extend
  `FocasSimFixture` to seed the simulator's status response and assert
  via the OPC UA client.
- **Docs / fixture / e2e**: extend `docs/drivers/FOCAS.md` fixed-tree
  table with the 9 `Status/*` nodes; mention the boolean-probe →
  full-struct change in `docs/drivers/FOCAS-Test-Fixture.md` integration
  bullet list; teach `focas-mock` (under
  `tests/.../IntegrationTests/Docker/focas-mock/`) the `cnc_rdcncstat`
  payload shape per `docs/v2/implementation/focas-wire-protocol.md`
  (add ODBST struct entry); extend `FocasSimFixture` with a helper to
  patch the canned status payload; new
  `Series/StatusFlagsPopulateTests.cs` integration test.
- Effort: small; one wire call already exists.
- Risk: Low.
|
||||
|
||||
**PR F1-b — parts count + cycle time (#13, #24 raw)**
|
||||
- Scope: surface `cnc_rdparam(6711)` (parts machined), `6712` (total parts
  machined), `6713` (parts required) under `Production/`,
  plus `Production/CycleTimeSeconds` (already exposed as
  `Timers/CycleSeconds`; promote it to the `Production/` group too with
  the same backing). The existing `cnc_rdtimer` call is sufficient.
|
||||
- Files: `FocasDriver.cs` (`Production/*` branch, parameter-cached
  reads on the timer poll cadence), `IFocasClient.cs` (no new call —
  rides on `ReadParameterInt32Async`).
- Tests: `FakeFocasClient` returns canned parameter values; assert
  `Production/PartsTotal` equals the canned value.
- **Docs / fixture / e2e**: add `Production/*` rows to the fixed-tree
  table in `docs/drivers/FOCAS.md`; add `Production:` example to
  `docs/Driver.FOCAS.Cli.md` (a `read -a PARAM:6711` snippet); the
  parts-count parameters (6711/6712/6713) are already in the
  simulator profile range, so only the `dl205`-style profile JSON
  under `tests/.../Docker/focas-mock/profiles/` needs seeded values
  added; extend `FocasSimFixture` with a `SeedPartsCount` helper;
  integration test under `Series/ProductionPopulatesTests.cs`.
- Effort: small.
- Risk: Low.
|
||||
|
||||
**PR F1-c — modal G/M/T codes (#7) + override values (#11)**
|
||||
- Scope: add `cnc_modal` (command id TBD per `fwlib32.h` — the wire
  protocol uses the same numeric command convention seen in
  `FocasWireClient`; capture during simulator iteration). Project:
  `Modal/G_Group{n}` (groups 1..21), `Modal/MCode`, `Modal/SCode`,
  `Modal/TCode`, `Modal/BCode`. Adds `Override/Feed`, `Override/Rapid`,
  `Override/Spindle`, `Override/Jog` from `cnc_rdparam(...)` — the
  override percent registers live at known parameter numbers; numbers
  are MTB-specific so pull defaults from
  `docs/v2/focas-version-matrix.md` and let operators override per device.
- Files: `Wire/FocasWireClient.cs` (new `ReadModalAsync`), new
  `Wire/FocasWireModels.cs` records `WireModal` / `WireModalGroup`,
  `IFocasClient.cs` (new `GetModalAsync`), `FocasDriver.cs` (new
  poll-medium branches under the program-poll cadence).
- Tests: `FocasModalTests.cs` (unit), simulator handler returns canned
  modal payload, integration asserts `Modal/G_Group1` text.
- **Docs / fixture / e2e**: add `Modal/*` and `Override/*` sections to
  the fixed-tree table in `docs/drivers/FOCAS.md`, including the
  G-group decode table for groups 01/03/06/07/14; add a `MODAL:`
  address example row to `docs/Driver.FOCAS.Cli.md` (new
  `read -a MODAL:G1` style — note: this PR does NOT add a new address
  scheme, the modal data is fixed-tree only, so the CLI example reads via
  `read -n "ns=2;s=Modal/G_Group1"` over the OPC UA endpoint);
  document MTB-specific override register defaults in
  `docs/v2/focas-version-matrix.md` (new
  `Override registers per series` table); capture the `cnc_modal`
  command id resolved during simulator iteration into
  `docs/v2/implementation/focas-wire-protocol.md`
  (new struct entry — promote out of the open-questions list);
  update `docs/v2/implementation/focas-simulator-plan.md` Stream C
  protocol-surface table with the new `cnc_modal` handler;
  extend focas-mock with a `cnc_modal` command-id handler + canned
  modal payload per profile; integration test reading G54/G90 modal
  state via `Series/ModalPopulatesTests.cs`.
- Effort: medium — `cnc_modal` returns a multi-group struct; encoding
  needs care.
- Risk: Medium — modal-group numbering varies by series; treat the
  raw integer as the value the CNC reports and surface a string
  decode table only for the universally-present groups (G-group 01
  motion, 03 absolute/incremental, 06 input units, 07 cutter comp,
  14 work coordinate). Document MTB-specific groups as raw int.
|
||||
|
||||
**PR F1-d — tool number / tool life (#8) + work coordinate offsets (#10)**
|
||||
- Scope: add `cnc_rdtofs` / `cnc_rdtlife*` / `cnc_rdzofs`. Project
  `Tooling/CurrentTool`, `Tooling/CurrentOffset`,
  `Tooling/Life/{group}/Remaining`, `Tooling/Life/{group}/Total`,
  `Offsets/G54..G59[+ extended]/{X,Y,Z}`.
- Files: new wire calls in `Wire/FocasWireClient.cs` (`ReadToolOffsetAsync`,
  `ReadToolLifeAsync`, `ReadWorkOffsetAsync`), `Wire/FocasWireModels.cs`
  (records), `IFocasClient.cs`, `FocasDriver.cs` (new `Tooling/` and
  `Offsets/` branches; both poll on the slow timer cadence — these
  change at setup time, not per-cycle), capability matrix per-call
  suppression like the existing `Spindle/` gating.
- Tests: unit + simulator. Tool-life is the largest payload; assert
  array projection rather than per-tool nodes (one ValueRank=1 array
  per group keeps the address-space size bounded on machines with
  500+ tool slots).
- **Docs / fixture / e2e**: add `Tooling/*` and `Offsets/*` sections to
  the fixed-tree table in `docs/drivers/FOCAS.md`, including the
  ValueRank=1 array note for tool-life groups; add a per-series
  capability-suppression row to `docs/v2/focas-version-matrix.md`
  (which series support `cnc_rdtlife*` vs not); document the three
  new structs (`ODBTOFS`, `ODBTLIFE5`, `IODBZOR`) in
  `docs/v2/implementation/focas-wire-protocol.md`; add
  `cnc_rdtofs` / `cnc_rdtlife*` / `cnc_rdzofs` rows to the protocol
  surface table in `docs/v2/implementation/focas-simulator-plan.md`;
  extend focas-mock with three new command-id handlers + per-profile
  seed data (tool table + work-offset table); add a
  `tools_per_series` matrix to the `focas-mock` per-series profile
  JSON so 0i-D's small tool table differs from 30i's; new
  `Series/ToolingPopulatesTests.cs` and `Series/OffsetsPopulatesTests.cs`
  integration tests; update `docs/drivers/FOCAS-Test-Fixture.md`
  coverage map with the three new wire calls.
- Effort: large — three new calls, each with its own struct; tool-life
  is variable-length.
- Risk: Medium — payload shapes are series-specific; keep the
  capability matrix as the authoritative gate.
|
||||
|
||||
**PR F1-e — operator messages (#18) + currently-executing block text (#20)**
|
||||
- Scope: `cnc_rdopmsg3` (gives all four FANUC opmsg classes in one
|
||||
call), `cnc_rdactpt` (current block text). Project `Messages/External`
|
||||
(variable, last-N strings), `Program/CurrentBlock` (single string).
|
||||
- Files: `Wire/FocasWireClient.cs` (`ReadOperatorMessagesAsync`,
|
||||
`ReadCurrentBlockAsync`), `IFocasClient.cs`, `FocasDriver.cs` (new
|
||||
branches under program-poll cadence).
|
||||
- Tests: simulator returns canned ASCII; assert string round-trip is
|
||||
trim-stable (FANUC right-pads with `\0` or space).
|
||||
- **Docs / fixture / e2e**: add `Messages/External` and
|
||||
`Program/CurrentBlock` rows to the fixed-tree table in
|
||||
`docs/drivers/FOCAS.md`, including the ring-buffer / last-N
|
||||
semantics for opmsg; document the `OPMSG3` and `ODBACT2`
|
||||
payload shapes in `docs/v2/implementation/focas-wire-protocol.md`;
|
||||
add `cnc_rdopmsg3` / `cnc_rdactpt` rows to the protocol surface
|
||||
table in `docs/v2/implementation/focas-simulator-plan.md`; extend
|
||||
focas-mock with the two new command-id handlers (per-profile
|
||||
canned message text + canned current-block text); add a
|
||||
`mock_patch_opmsg` admin endpoint hook on `FocasSimFixture` for
|
||||
tests that need to push a canned message; integration test
|
||||
`Series/OperatorMessagesPopulateTests.cs` asserts trim-stable
|
||||
round-trip and last-N retention.
|
||||
- Effort: medium.
|
||||
- Risk: Low — ASCII-only payloads.
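
The trim-stable round-trip assertion from the F1-e tests can be sketched as follows. This is a minimal Python illustration (Python being the focas-mock language), not the driver's C# code. `trim_fanuc_text` is a hypothetical helper name, and cutting at the first NUL before stripping trailing spaces is an assumption about how the padding should be handled.

```python
def trim_fanuc_text(raw: bytes) -> str:
    """Decode a fixed-width text field and drop FANUC-style right padding.

    Assumption: anything after the first NUL byte is padding, and trailing
    spaces are padding too (the plan says FANUC right-pads with NUL or space).
    """
    before_nul = raw.split(b"\x00", 1)[0]
    return before_nul.rstrip(b" ").decode("ascii", errors="replace")


def repad(text: str, width: int, pad: bytes = b"\x00") -> bytes:
    """Hypothetical inverse used only to exercise the round trip in tests."""
    return text.encode("ascii").ljust(width, pad)
```

A round trip is trim-stable when `trim_fanuc_text(repad(trim_fanuc_text(raw), len(raw)))` equals `trim_fanuc_text(raw)` for both padding styles.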
|
||||
|
||||
**PR F1-f — `cnc_getfigure` decimal scaling (#6) + connection statistics (#27)**
|
||||
- Scope: `cnc_getfigure` returns per-axis decimal-place counts; cache
|
||||
the result at bootstrap and divide each `AbsolutePosition` /
|
||||
`MachinePosition` / `RelativePosition` / `DistanceToGo` /
|
||||
`ActualFeedRate` value before publishing. Existing nodes already
|
||||
carry `Float64`; the change is invisible to clients except that
|
||||
values become real-world units. Adds `Diagnostics/` subtree:
|
||||
`Diagnostics/ReadCount`, `Diagnostics/ReadFailureCount`,
|
||||
`Diagnostics/LastErrorMessage`, `Diagnostics/LastSuccessfulRead`,
|
||||
`Diagnostics/ReconnectCount` — driven by counters already maintained
|
||||
on `DeviceState`.
|
||||
- Files: `Wire/FocasWireClient.cs` (new `ReadFigureAsync`),
|
||||
`IFocasClient.cs`, `FocasDriver.cs` (cache decimal places per axis,
|
||||
divide on the read path, expose counters under `Diagnostics/`).
|
||||
- Tests: assert that with a canned `cnc_getfigure` returning 3, an
|
||||
`AbsolutePosition` of 12345 becomes `12.345`. Connection-stat tests
|
||||
assert counters increment under known conditions.
|
||||
- **Docs / fixture / e2e**: significant `docs/drivers/FOCAS.md` change —
|
||||
add a "Decimal-place scaling" subsection explaining the
|
||||
`FixedTree.ApplyFigureScaling` flag (default true on new installs,
|
||||
false on migrations) and the unit-correctness semantics it enforces;
|
||||
add `Diagnostics/*` rows to the fixed-tree table; add a
|
||||
Diagnostics-counters subsection to `docs/v2/focas-deployment.md`
|
||||
for operator dashboards; document `cnc_getfigure` (`ODBAXDP` /
|
||||
`ODBAXIS`) struct in `docs/v2/implementation/focas-wire-protocol.md`;
|
||||
add `cnc_getfigure` to the protocol surface in
|
||||
`docs/v2/implementation/focas-simulator-plan.md`; extend focas-mock
|
||||
with the per-axis decimal-place command handler + a `decimal_places`
|
||||
field on each profile JSON; update
|
||||
`docs/drivers/FOCAS-Test-Fixture.md` "When to trust each layer"
|
||||
table with a "Are axis values reported in real-world units?" row;
|
||||
add an opt-in `-CheckDecimalScaling` switch to `scripts/e2e/test-focas.ps1`
|
||||
that asserts AbsolutePosition is scaled when the flag is on;
|
||||
integration test `Series/DecimalScalingTests.cs` and
|
||||
`Series/DiagnosticsCountersTests.cs`.
|
||||
- Effort: medium — touches every axis read.
|
||||
- Risk: Medium — this is a behavioural change for any existing
|
||||
consumer that was already dividing client-side. Surface as a
|
||||
`FixedTree.ApplyFigureScaling` flag (default true on new
|
||||
installs, false when migrating); document in `docs/drivers/FOCAS.md`.
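
The scaling arithmetic described for F1-f can be sketched in a few lines. This is a hedged Python illustration, not the driver code: `apply_figure_scaling` is a hypothetical function name, and its `enabled` parameter merely stands in for the `FixedTree.ApplyFigureScaling` flag.

```python
def apply_figure_scaling(raw: int, decimal_places: int, enabled: bool = True) -> float:
    """Scale a raw axis integer by the cached per-axis decimal-place count."""
    if decimal_places < 0:
        raise ValueError("decimal_places must be non-negative")
    if not enabled:
        # Flag off: publish the raw value unchanged (migration behaviour).
        return float(raw)
    # Integer true division is correctly rounded, so 12345 / 10**3 gives 12.345.
    return raw / (10 ** decimal_places)
```

With the flag off the value passes through unchanged, which is what keeps consumers that already divide client-side from being scaled twice.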
|
||||
|
||||
### Phase 2 — addressing additions
|
||||
|
||||
**PR F2-a — DIAG: address scheme (#14)**
|
||||
- Scope: new `FocasAreaKind.Diagnostic` parsed from `DIAG:nnn` /
|
||||
`DIAG:nnn/axis`, dispatched to `cnc_rddiag` (or `cnc_rddiagdgn` for
|
||||
series that support it).
|
||||
- Files: `FocasAddress.cs` (new prefix branch), `FocasCapabilityMatrix.cs`
|
||||
(new `DiagnosticRange` per series), `Wire/FocasWireClient.cs`
|
||||
(`ReadDiagnosticAsync`), `WireFocasClient.ReadAsync` (new dispatch
|
||||
branch).
|
||||
- Tests: parser unit tests, capability matrix unit tests, simulator
|
||||
read-round-trip.
|
||||
- **Docs / fixture / e2e**: add a `DIAG:` row to the address-syntax
|
||||
table in `docs/Driver.FOCAS.Cli.md` with `read -a DIAG:301` and
|
||||
`DIAG:301/0` (axis-scoped) examples; add a `DIAG:` row to the
|
||||
addressing table in `docs/drivers/FOCAS.md`; add per-series
|
||||
`DiagnosticRange` columns to `docs/v2/focas-version-matrix.md`;
|
||||
document the `ODBDGN` struct in
|
||||
`docs/v2/implementation/focas-wire-protocol.md`; add `cnc_rddiag`
|
||||
/ `cnc_rddiagdgn` to the protocol surface in
|
||||
`docs/v2/implementation/focas-simulator-plan.md`; extend focas-mock
|
||||
with the diagnostic-range command handler + per-profile seeded
|
||||
diagnostic numbers; integration test
|
||||
`Series/DiagAddressTests.cs` round-trips a seeded diagnostic
|
||||
number; update `docs/drivers/FOCAS-Test-Fixture.md` capability list
|
||||
with the new `Diagnostic` `FocasAreaKind`.
|
||||
- Effort: medium.
|
||||
- Risk: Low — additive.
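
The `DIAG:nnn` / `DIAG:nnn/axis` syntax above could be parsed along these lines. This is a Python sketch under stated assumptions: `DiagAddress` and `parse_diag_address` are hypothetical names, and the grammar (decimal number, optional decimal axis index) is inferred from the two examples in this PR rather than from the actual `FocasAddress.cs` parser.

```python
import re
from typing import NamedTuple, Optional


class DiagAddress(NamedTuple):
    number: int            # diagnostic number, e.g. 301
    axis: Optional[int]    # axis index when the address is axis-scoped


# Assumed grammar: DIAG:<digits> with an optional /<digits> axis suffix.
_DIAG_RE = re.compile(r"^DIAG:(\d+)(?:/(\d+))?$")


def parse_diag_address(text: str) -> DiagAddress:
    match = _DIAG_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a DIAG address: {text!r}")
    number, axis = match.groups()
    return DiagAddress(int(number), int(axis) if axis is not None else None)
```

Keeping the axis as `None` rather than a default index lets the dispatch layer distinguish a CNC-wide diagnostic from an axis-scoped one.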
|
||||
|
||||
**PR F2-b — Multi-path / multi-channel CNC (#4)**
|
||||
- Scope: 30i/31i/32i can host 2–10 paths; today every request block is
|
||||
built with `PathId = 1` (`Wire/FocasWireProtocol.cs:216`). Add
|
||||
optional `Path` segment to `FocasAddress` (e.g. `PARAM:1815@2`,
|
||||
`R100@3.0`, `MACRO:500@2`); thread it into the `RequestBlock.PathId`
|
||||
field. Fixed-tree gets a `Paths/{n}/` folder pivot.
|
||||
- Files: `FocasAddress.cs` (new `Path` field + parser), `IFocasClient.cs`
|
||||
(every read call gains an optional `pathId` parameter, defaulting to
|
||||
1 for backward compatibility), `Wire/FocasWireClient.cs`
|
||||
(thread the param through every `RequestBlock` constructor),
|
||||
`FocasDriver.cs` (per-device `PathCount` discovery via
|
||||
`cnc_rdpathnum`; iterate fixed-tree per path).
|
||||
- Tests: unit on the parser; simulator with two paths configured;
|
||||
assert that a `PARAM:1815@2` read targets path 2.
|
||||
- **Docs / fixture / e2e**: significant `docs/drivers/FOCAS.md`
|
||||
update — new "Multi-path / multi-channel CNC" subsection explaining
|
||||
the `@N` suffix syntax, `Paths/{n}/` browse pivot, and per-path
|
||||
capability gating; add `@N` to every address row in the
|
||||
addressing table in `docs/Driver.FOCAS.Cli.md`; document
|
||||
`cnc_rdpathnum` (`ODBPATHNUM` struct) in
|
||||
`docs/v2/implementation/focas-wire-protocol.md`, and update the
|
||||
`RequestBlock.PathId` discussion (was hard-coded to 1 — now a
|
||||
parameter); add `cnc_rdpathnum` to the protocol surface and the
|
||||
per-profile `path_count` field to the profile schema in
|
||||
`docs/v2/implementation/focas-simulator-plan.md`; extend focas-mock
|
||||
with per-path state isolation (separate PMC / param / macro tables
|
||||
per `path_id`) and a new `multi_path` profile (e.g.
|
||||
`thirtyone_i_dual_path`); add a `-Paths` switch to
|
||||
`scripts/e2e/test-focas.ps1` that runs the matrix once per
|
||||
declared path; document the new compose profile in
|
||||
`docs/drivers/FOCAS-Test-Fixture.md`; new
|
||||
`Series/MultiPathTests.cs` integration test asserting independent
|
||||
per-path reads.
|
||||
- Effort: large — touches every wire call's `RequestBlock` shape.
|
||||
- Risk: Medium — backward compatibility for existing single-path
|
||||
configs. Default `PathId = 1` everywhere; only deviate when the
|
||||
address explicitly carries a `@N` suffix or when the fixed-tree
|
||||
loop is iterating discovered paths.
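
The `@N` suffix handling and its default-to-path-1 rule can be illustrated with a small Python sketch. `split_path_suffix` is a hypothetical helper, and the placement of the suffix (after the number, before any `.bit` index) is inferred from the three example addresses in the scope bullet.

```python
import re
from typing import Tuple

_PATH_RE = re.compile(r"@(\d+)")


def split_path_suffix(address: str, default_path: int = 1) -> Tuple[str, int]:
    """Return (address without the @N suffix, path id)."""
    if "@" not in address:
        # Existing single-path configs keep targeting the default path.
        return address, default_path
    match = _PATH_RE.search(address)
    if match is None or address.count("@") != 1:
        raise ValueError(f"malformed path suffix in {address!r}")
    path_id = int(match.group(1))
    if path_id < 1:
        raise ValueError("path ids start at 1")
    # Remove only the @N segment; a trailing .bit index is preserved.
    return address[:match.start()] + address[match.end():], path_id
```

Stripping the suffix before the existing parser runs would leave the current address grammar untouched, which is one way to keep the backward-compatibility risk contained.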
|
||||
|
||||
**PR F2-c — PMC F/G letters for 16i (#15)**
|
||||
- Scope: capability matrix bug — `PmcLetters(Sixteen_i)` currently
|
||||
returns `{X, Y, R, D}`; real 16i ladders use F/G for handshakes.
|
||||
Widen the set; verify that the `pmc_rdpmcrng` numeric address-letter
|
||||
codes match.
|
||||
- Files: `FocasCapabilityMatrix.cs` (one-line fix to the 16i case),
|
||||
`tests/.../FocasCapabilityMatrixTests.cs` (assert F0.0 and G50.5
|
||||
parse against `Sixteen_i`).
|
||||
- **Docs / fixture / e2e**: update the 16i row of the PMC-letters
|
||||
column in `docs/v2/focas-version-matrix.md` (the row currently lists
|
||||
X/Y/R/D — add F/G); add a one-line "fixed in v…" callout to the
|
||||
changelog section of the same doc; no simulator change required (the
|
||||
16i profile JSON in `tests/.../Docker/focas-mock/profiles/sixteen_i.json`
|
||||
already has F/G ranges declared from Stream B); add F0.0 / G50.5
|
||||
probes to the 16i row of the per-series matrix in
|
||||
`scripts/e2e/test-focas.ps1`; no fixture-doc change needed.
|
||||
- Effort: trivial.
|
||||
- Risk: Low — correctness fix.
|
||||
|
||||
**PR F2-d — Bulk PMC range read (#16)**
|
||||
- Scope: today the driver issues one `pmc_rdpmcrng` per tag (one TCP
|
||||
RTT each). The wire call already supports a range `[start, end]`;
|
||||
the missing piece is coalescing on the read side. Add a coalescer:
|
||||
group same-letter contiguous (or near-contiguous within a small
|
||||
gap budget) PMC bytes from the request batch into one wire call
|
||||
per group, then slice client-side. Reuse the Modbus coalescing
|
||||
infrastructure pattern (per-group-id ProhibitedRanges) where it
|
||||
applies.
|
||||
- Files: new `Wire/FocasPmcCoalescer.cs`, hook into
|
||||
`FocasDriver.ReadAsync` between the per-tag path and the wire call
|
||||
layer. Surface coalesce stats on the `Diagnostics/` subtree (PR F1-f).
|
||||
- Tests: unit — given a request batch of `R100..R110`, assert that
|
||||
the coalescer issues one call covering 100..110 and slices the
|
||||
result. Integration — assert observed wire-call count drops with
|
||||
coalescing on.
|
||||
- **Docs / fixture / e2e**: add a "PMC range coalescing" subsection
|
||||
to `docs/drivers/FOCAS.md` (wire-call reduction, gap budget,
|
||||
per-series byte cap); document the new `Diagnostics/CoalesceStats/*`
|
||||
counters added on top of PR F1-f's diagnostics tree; add a
|
||||
PMC-byte-cap column to `docs/v2/focas-version-matrix.md`;
|
||||
no new wire calls (`pmc_rdpmcrng` is already in the surface), but
|
||||
document the supported max-bytes-per-call in
|
||||
`docs/v2/implementation/focas-wire-protocol.md`; extend focas-mock
|
||||
with a request-counter admin endpoint so integration tests can
|
||||
assert the call-count reduction (counter visible via
|
||||
`FocasSimFixture.GetWireCallCountAsync`); update
|
||||
`docs/v2/implementation/focas-simulator-plan.md` Stream B
|
||||
validation harness with the request-counter handler; integration
|
||||
test `Series/PmcCoalescingTests.cs` asserts an `R100..R110` batch
|
||||
produces exactly 1 wire call against the mock.
|
||||
- Effort: medium.
|
||||
- Risk: Medium — the FANUC max-bytes-per-`pmc_rdpmcrng` ceiling is
|
||||
series-specific; cap conservatively (≤ 256 bytes per range) and
|
||||
let operators raise it via config if their CNC accepts more.
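
The coalescing rule described above (same letter, contiguous or within a gap budget, capped per range) can be sketched as follows. This Python version is illustrative only: `coalesce_pmc_reads` is a hypothetical name, the default gap budget of 4 bytes is an assumption, and the 256-byte cap mirrors the conservative ceiling suggested in the risk note.

```python
from typing import Dict, Iterable, List, Tuple


def coalesce_pmc_reads(
    requests: Iterable[Tuple[str, int]],
    gap_budget: int = 4,
    max_bytes: int = 256,
) -> List[Tuple[str, int, int]]:
    """Group (letter, byte_address) requests into inclusive (letter, start, end) ranges."""
    by_letter: Dict[str, List[int]] = {}
    for letter, address in requests:
        by_letter.setdefault(letter, []).append(address)

    ranges: List[Tuple[str, int, int]] = []
    for letter in sorted(by_letter):
        addresses = sorted(set(by_letter[letter]))
        start = end = addresses[0]
        for address in addresses[1:]:
            within_gap = address - end - 1 <= gap_budget
            within_cap = address - start + 1 <= max_bytes
            if within_gap and within_cap:
                end = address              # extend the current range
            else:
                ranges.append((letter, start, end))
                start = end = address      # open a new range
        ranges.append((letter, start, end))
    return ranges
```

Each returned range maps to one wire call; the caller then slices the response back into per-tag values by offset from `start`.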
|
||||
|
||||
### Phase 3 — alarm history
|
||||
|
||||
**PR F3-a — `cnc_rdalmhistry` extension to alarm projection (#17)**
|
||||
- Scope: extend `FocasAlarmProjection` with two modes — `ActiveOnly`
|
||||
(today's behaviour) and `ActivePlusHistory`. In the latter, on
|
||||
connect (and on a configurable cadence — default 5 min, since the
|
||||
CNC ring buffer changes only on alarm raise/clear) issue
|
||||
`cnc_rdalmhistry` for the most-recent N entries; project as
|
||||
historic events through `IAlarmSource` with `OccurrenceTime` from
|
||||
the CNC's timestamp field.
|
||||
- Files: new `Wire/FocasWireClient.ReadAlarmHistoryAsync`, new
|
||||
`IFocasClient.ReadAlarmHistoryAsync`,
|
||||
`FocasAlarmProjection.cs` (mode switch + history poll loop),
|
||||
`FocasDriverOptions.cs` (`AlarmProjection.Mode` enum +
|
||||
`HistoryPollInterval` + `HistoryDepth`).
|
||||
- Tests: simulator returns canned history payload; assert events
|
||||
fire with the timestamps from the canned data and don't re-fire
|
||||
on every poll.
|
||||
- **Docs / fixture / e2e**: add an "Alarm history" subsection to
|
||||
`docs/drivers/FOCAS.md` documenting the `ActiveOnly` vs
|
||||
`ActivePlusHistory` mode switch, the `HistoryDepth` cap, and the
|
||||
dedup key; add a configuration-knob row to
|
||||
`docs/v2/focas-deployment.md` for operator dashboards; document
|
||||
`ODBALMHIS` struct in
|
||||
`docs/v2/implementation/focas-wire-protocol.md`; add
|
||||
`cnc_rdalmhistry` to the protocol surface in
|
||||
`docs/v2/implementation/focas-simulator-plan.md`; extend focas-mock
|
||||
with a ring-buffer alarm history (per profile) + `mock_patch_alarmhistory`
|
||||
admin endpoint; expose a `SeedAlarmHistoryAsync` helper on
|
||||
`FocasSimFixture`; add `Series/AlarmHistoryProjectionTests.cs`
|
||||
asserting historic events fire once and active events still fire
|
||||
raise/clear; update `docs/drivers/FOCAS-Test-Fixture.md` integration
|
||||
bullet list with `cnc_rdalmhistry`.
|
||||
- Effort: medium.
|
||||
- Risk: Medium — duplicate-event suppression; key history events on
|
||||
`(timestamp, alarmNumber, type)` to deduplicate against the active
|
||||
list.
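
The dedup key named in the risk note, `(timestamp, alarmNumber, type)`, suggests a seen-set filter along these lines. This is a Python sketch with hypothetical names (`AlarmHistoryEntry`, `AlarmHistoryDeduplicator`); the field types are assumptions, and a real implementation would probably bound the set to `HistoryDepth` instead of letting it grow.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class AlarmHistoryEntry(NamedTuple):
    timestamp: str      # CNC-reported occurrence time (format assumed)
    alarm_number: int
    alarm_type: int
    message: str


class AlarmHistoryDeduplicator:
    """Yields each history entry once across repeated polls."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, int, int]] = set()

    def new_entries(self, polled: Iterable[AlarmHistoryEntry]) -> List[AlarmHistoryEntry]:
        fresh: List[AlarmHistoryEntry] = []
        for entry in polled:
            key = (entry.timestamp, entry.alarm_number, entry.alarm_type)
            if key not in self._seen:
                self._seen.add(key)
                fresh.append(entry)
        return fresh
```

Only the entries returned by `new_entries` would be projected as historic events, so re-reading the same ring buffer on every poll does not re-fire them.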
|
||||
|
||||
### Phase 4 — write path
|
||||
|
||||
This phase is the major behavioural change. The driver's read-only
|
||||
contract has been the documented design choice in
|
||||
`docs/drivers/FOCAS.md:14-18` and is reinforced by tests
|
||||
(`FocasReadWriteTests.WriteAsync_ReturnsBadNotWritable`). Removing it
|
||||
deserves a deliberate decision-record entry in the v2 decisions log
|
||||
before any code lands.
|
||||
|
||||
**PR F4-a — write infrastructure + per-tag opt-in (no wire calls yet)**
|
||||
- Scope: drop the `BadNotWritable` short-circuit in
|
||||
`WireFocasClient.WriteAsync` and replace with a kind-based dispatch
|
||||
that returns `BadNotSupported` only for kinds the wire client
|
||||
doesn't yet implement. Honour `FocasTagDefinition.Writable` (already
|
||||
present, default `true` — flip default to `false` per #1's safer
|
||||
posture). Plumb `WriteIdempotent` through Polly retry.
|
||||
- Files: `WireFocasClient.cs`, `FocasDriverOptions.cs`,
|
||||
`FocasDriver.cs`, `docs/drivers/FOCAS.md` (rewrite the read-only
|
||||
paragraph), new `docs/v2/decisions.md` entry.
|
||||
- Tests: assert that with `Writable=false` the path still returns
|
||||
`BadNotWritable`; with `Writable=true` and an unimplemented kind
|
||||
the write returns `BadNotSupported` (distinct from the per-tag
|
||||
policy denial).
|
||||
- **Docs / fixture / e2e**: this is the heaviest doc PR in the plan.
|
||||
- **`docs/drivers/FOCAS.md` lines 14–18** — revoke the unconditional
|
||||
"OtOpcUa is read-only against FOCAS… Writes return BadNotWritable
|
||||
by design" callout. Replace with a "Writes (opt-in, off by
|
||||
default)" subsection that names `Writes.Enabled`, the per-tag
|
||||
`Writable` flag (default flipped to `false`), and links to the
|
||||
Phase 4 decision-record entry.
|
||||
- **`docs/drivers/FOCAS-Test-Fixture.md` lines 42–43** — revoke the
|
||||
"`IWritable` intentionally returns `BadNotWritable` — OtOpcUa is
|
||||
read-only against FOCAS" callout. Replace with a qualified
|
||||
"default behaviour" note plus a pointer to the new write-enabled
|
||||
test profile.
|
||||
- **`docs/Driver.FOCAS.Cli.md` lines 100–116** — the existing
|
||||
`write` section already documents the CLI shape; expand the
|
||||
"**Writes are non-idempotent by default**" warning with a
|
||||
server-side note that the OtOpcUa endpoint enforces the
|
||||
`Writes.Enabled` flag and rejects writes when off, and that
|
||||
the CLI itself talks to the driver directly so its writes are
|
||||
not gated by the server flag (operator must consciously use
|
||||
the right tool).
|
||||
- New `docs/v2/decisions.md` entry "FOCAS write-path opt-in"
|
||||
capturing the design-choice reversal.
|
||||
- Update `docs/featuregaps.md` row for #1 / #3 — flip Build = Yes
|
||||
annotation to "shipping behind flag".
|
||||
- Simulator: no new commands; existing read commands gain a
|
||||
"writes when not unlocked" branch wired up here for symmetry
|
||||
even though no write commands ship yet (returns
|
||||
`BadNotSupported` until F4-b lands).
|
||||
- E2E: add `-Write` switch (no-op stage in this PR; populated by
|
||||
F4-b) to `scripts/e2e/test-focas.ps1`.
|
||||
- Effort: medium.
|
||||
- Risk: High — design-choice reversal. Mitigation: ship behind a
|
||||
driver-level `Writes.Enabled` flag (default `false`); operators
|
||||
must explicitly enable in `appsettings.json`.
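
The layered gating in F4-a (driver-level flag, per-tag flag, then kind dispatch) can be summarised as a small decision function. This Python sketch uses hypothetical names throughout. Returning `BadNotWritable` when `Writes.Enabled` is off is an assumption, since the plan only states that such writes are rejected; the other two outcomes follow the Tests bullet.

```python
from typing import FrozenSet


def resolve_write_status(
    writes_enabled: bool,
    tag_writable: bool,
    kind: str,
    implemented_kinds: FrozenSet[str],
) -> str:
    """Return the status a write would receive before any wire call is made."""
    if not writes_enabled:
        # Assumption: the driver-level flag maps to the same policy denial.
        return "BadNotWritable"
    if not tag_writable:
        return "BadNotWritable"      # per-tag policy denial
    if kind not in implemented_kinds:
        return "BadNotSupported"     # capability gap, distinct from policy
    return "Good"
```

In F4-a itself `implemented_kinds` would be empty, so every permitted write resolves to `BadNotSupported` until F4-b and F4-c add kinds.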
|
||||
|
||||
**PR F4-b — `cnc_wrmacro` + `cnc_wrparam`**
|
||||
- Scope: implement macro and parameter writes. Both have well-defined
|
||||
payload shapes mirroring their read counterparts (IODBPSD for
|
||||
parameters, ODBM for macros).
|
||||
- Files: `Wire/FocasWireClient.cs` (new `WriteParameterAsync`,
|
||||
`WriteMacroAsync`), `WireFocasClient.WriteAsync` (dispatch).
|
||||
- Tests: simulator extension — accept writes and reflect them on
|
||||
subsequent reads. ACL tests in
|
||||
`tests/ZB.MOM.WW.OtOpcUa.IntegrationTests` to verify the
|
||||
server-layer enforcement (per the memory entry: ACL decisions
|
||||
happen in `DriverNodeManager`, never in driver-level code).
|
||||
- **Docs / fixture / e2e**:
|
||||
- `docs/drivers/FOCAS.md` — extend the "Writes" subsection
|
||||
(introduced in F4-a) with the two new write kinds, the
|
||||
`Writes.AllowParameter` and `Writes.AllowMacro` granular flags,
|
||||
and a security note: parameter writes require LDAP group
|
||||
`WriteConfigure`, macro writes require `WriteOperate` (cross-link
|
||||
to `docs/Security.md`).
|
||||
- `docs/v2/focas-deployment.md` — significant addition: a "Write
|
||||
safety" section covering operator pre-checks (CNC in MDI mode,
|
||||
parameter-write switch enabled), audit-log expectations, and the
|
||||
LDAP group requirements.
|
||||
- `docs/Driver.FOCAS.Cli.md` — populate the existing `write`
|
||||
examples for `PARAM:` and `MACRO:` (already present at lines
|
||||
105–108) with a "Server-enforced ACL" note linking to
|
||||
`docs/Security.md`.
|
||||
- Document `IODBPSD` (write side) and `ODBM` (write side) in
|
||||
`docs/v2/implementation/focas-wire-protocol.md` (the read-side
|
||||
structs are already there — flag the byte layout symmetry).
|
||||
- `docs/v2/implementation/focas-simulator-plan.md` — add
|
||||
`cnc_wrparam` / `cnc_wrmacro` to the protocol surface table
|
||||
and update Stream C status accordingly.
|
||||
- Extend focas-mock with `cnc_wrparam` / `cnc_wrmacro` handlers
|
||||
that mutate the per-profile state and return
|
||||
`EW_PASSWD` when the unlock state is off (sets up F4-d's
|
||||
test path); add `mock_get_last_write` admin endpoint for
|
||||
audit-log assertions.
|
||||
- New `Series/ParameterWriteTests.cs` and `Series/MacroWriteTests.cs`
|
||||
integration tests; ACL test under
|
||||
`tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/Authz/FocasWriteAclTests.cs`
|
||||
asserting `WriteConfigure` is required for `PARAM:` writes and
|
||||
`WriteOperate` for `MACRO:` writes.
|
||||
- `scripts/e2e/test-focas.ps1` — populate the `-Write` stage from
|
||||
F4-a with macro and parameter round-trip writes against the
|
||||
Docker mock.
|
||||
- Effort: medium.
|
||||
- Risk: High — a misdirected parameter write can put the CNC into a
|
||||
bad state. Surface a `Writes.AllowParameter` flag separate from
|
||||
`Writes.Enabled` so operators can grant macro writes without
|
||||
parameter writes.
|
||||
|
||||
**PR F4-c — `pmc_wrpmcrng`**
|
||||
- Scope: PMC range writes. Read-modify-write semantics for bit-level
|
||||
writes (the wire call is byte-addressed). Existing tests
|
||||
(`FocasPmcBitRmwTests.cs`) prove the read-modify-write pattern
|
||||
shape that the write path needs.
|
||||
- Files: `Wire/FocasWireClient.cs` (new `WritePmcRangeAsync`),
|
||||
bit-level RMW helper in `WireFocasClient`.
|
||||
- Tests: simulator round-trip on byte writes; bit-level write asserts
|
||||
the unrelated bits in the same byte are preserved.
|
||||
- **Docs / fixture / e2e**:
|
||||
- `docs/drivers/FOCAS.md` — extend the "Writes" subsection with
|
||||
PMC writes; loud safety callout block ("PMC is ladder working
|
||||
memory — a mistargeted bit can move motion"); document the
|
||||
read-modify-write semantics for bit-level writes; document the
|
||||
new `Writes.AllowPmc` granular flag.
|
||||
- `docs/v2/focas-deployment.md` — extend the "Write safety"
|
||||
section with PMC-specific pre-checks (e-stop, jog mode); add an
|
||||
ops-runbook bullet on auditing PMC writes from the
|
||||
`Diagnostics/CoalesceStats/` (extended) tree.
|
||||
- `docs/Driver.FOCAS.Cli.md` — the existing `write` example
|
||||
`write -h … -a G50.3 -t Bit -v on` (line 107) is already PMC-bit;
|
||||
update its surrounding warning to call out RMW behaviour.
|
||||
- Document the `pmc_wrpmcrng` request frame in
|
||||
`docs/v2/implementation/focas-wire-protocol.md` (the read frame
|
||||
is already there — flag the inverted shape).
|
||||
- `docs/v2/implementation/focas-simulator-plan.md` — add
|
||||
`pmc_wrpmcrng` to the protocol surface table.
|
||||
- Extend focas-mock with `pmc_wrpmcrng` handler that mutates
|
||||
per-profile PMC tables; assert byte-aligned writes preserve
|
||||
untouched bytes (mirrors the driver's RMW contract).
|
||||
- New `Series/PmcRangeWriteTests.cs` and
|
||||
`Series/PmcBitRmwIntegrationTests.cs` integration tests; ACL
|
||||
test under
|
||||
`tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/Authz/FocasPmcWriteAclTests.cs`
|
||||
asserting `WriteOperate` is required.
|
||||
- `scripts/e2e/test-focas.ps1` — extend the `-Write` stage with a
|
||||
PMC bit round-trip.
|
||||
- Effort: medium.
|
||||
- Risk: High — PMC is the ladder logic's working memory; a
|
||||
mistargeted write can move motion. Document loudly.
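
The read-modify-write step for bit-level PMC writes reduces to a mask operation on the byte that was just read. The following Python sketch is illustrative; `apply_bit_write` is a hypothetical helper and does not reproduce the code exercised by `FocasPmcBitRmwTests.cs`.

```python
def apply_bit_write(current_byte: int, bit: int, value: bool) -> int:
    """Return the byte to write back so that only the targeted bit changes."""
    if not 0 <= bit <= 7:
        raise ValueError("bit index must be 0..7")
    if not 0 <= current_byte <= 0xFF:
        raise ValueError("current_byte must fit in one byte")
    mask = 1 << bit
    if value:
        return current_byte | mask
    # Clear the bit; the 0xFF mask keeps the result within one byte.
    return current_byte & ~mask & 0xFF
```

The read and the write-back are two separate wire calls, so a ladder update between them can still be overwritten; that window is one reason the safety callout matters.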
|
||||
|
||||
**PR F4-d — FOCAS password / unlock parameter (#3)**
|
||||
- Scope: some controllers gate `cnc_wrparam` and certain reads behind
|
||||
a connection-level password. Add `Password` to `FocasDeviceOptions`;
|
||||
emit the FOCAS password block during connect (`cnc_wrunlockparam`
|
||||
per FOCAS docs — confirm the exact command id during simulator
|
||||
iteration). On any read/write returning `EW_PASSWD`
|
||||
re-issue the password and retry once.
|
||||
- Files: `Wire/FocasWireClient.cs` (`UnlockAsync`),
|
||||
`FocasDriverOptions.cs` (`Password` field, treated as a secret —
|
||||
redact in logs), `FocasDriver.cs` (call on connect).
|
||||
- Tests: simulator extension — emit `EW_PASSWD` on writes when not
|
||||
unlocked; assert the unlock+retry path.
|
||||
- **Docs / fixture / e2e**:
|
||||
- `docs/drivers/FOCAS.md` — new "FOCAS password" subsection under
|
||||
Writes describing the optional `Password` device-option, when
|
||||
the CNC requires it (16i + some 30i firmwares with parameter-
|
||||
protect on), and the redaction guarantee.
|
||||
- **Security-note in `docs/v2/focas-deployment.md`** — significant
|
||||
addition: a "FOCAS password handling" subsection covering
|
||||
storage in `appsettings.json` (and the dev redaction pattern at
|
||||
`.local/`), the no-log invariant, and a runbook for password
|
||||
rotation. Cross-link to `docs/Security.md`.
|
||||
- `docs/Driver.FOCAS.Cli.md` — add a `--cnc-password` flag row to
|
||||
the "Common flags" table with the redaction note.
|
||||
- Document `cnc_wrunlockparam` (or the resolved command id) in
|
||||
`docs/v2/implementation/focas-wire-protocol.md`; resolve the
|
||||
open question raised by F4-d into the doc.
|
||||
- `docs/v2/implementation/focas-simulator-plan.md` — add
|
||||
`cnc_wrunlockparam` to the protocol surface; document the
|
||||
per-profile `unlock_password` field on the JSON profile schema.
|
||||
- Extend focas-mock with locked-state semantics on parameter
|
||||
writes (already half-stubbed in F4-b's `EW_PASSWD` branch);
|
||||
add `cnc_wrunlockparam` handler; add `mock_set_password`
|
||||
admin endpoint so integration tests can pin the unlock value.
|
||||
- New `Series/PasswordUnlockTests.cs` integration test asserts
|
||||
a write returning `EW_PASSWD` triggers exactly one unlock
|
||||
retry, and the second write succeeds.
|
||||
- `scripts/e2e/test-focas.ps1` — add `-CncPassword` parameter,
|
||||
threaded through to the CLI for the `-Write` stage.
|
||||
- Effort: small once F4-a and F4-b are in.
|
||||
- Risk: Medium — password storage. Use the existing
|
||||
`appsettings.json` redaction pattern (memory entry: `dohertj2`
|
||||
AppData path); never log the password value.
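
The "re-issue the password and retry once" rule can be expressed as a thin wrapper around any wire operation. This Python sketch uses status names as plain strings and hypothetical callables (`operation`, `unlock`); it does not model the real command id, which the plan leaves open.

```python
from typing import Callable

EW_PASSWD = "EW_PASSWD"   # symbolic status name; the numeric code is not modelled here


def call_with_unlock_retry(
    operation: Callable[[], str],
    unlock: Callable[[], None],
) -> str:
    """Run operation; on EW_PASSWD, unlock and retry exactly once."""
    status = operation()
    if status != EW_PASSWD:
        return status
    unlock()
    # A second EW_PASSWD is returned to the caller rather than retried again,
    # so a wrong password cannot cause an unlock loop.
    return operation()
```

Capping the retry at one attempt matches the integration-test expectation that a write returning `EW_PASSWD` triggers exactly one unlock.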
|
||||
|
||||
### Phase 5 — derived telemetry
|
||||
|
||||
**PR F5-a — Cycle time per part / last cycle delta (#24 derivation)**
|
||||
- Scope: with `Production/CycleTimeSeconds` in place from F1-b and
|
||||
the parts-count from `cnc_rdparam`, compute "last completed cycle"
|
||||
as the delta in `Timers/CycleSeconds` between successive
|
||||
parts-count increments. Project `Production/LastCycleSeconds`,
|
||||
`Production/LastCycleStartUtc`.
|
||||
- Files: `FocasDriver.cs` only — pure derivation in the program-poll
|
||||
cadence handler.
|
||||
- Tests: simulate a parts-count increment from 5→6; assert
|
||||
`LastCycleSeconds` equals the cycle-timer delta over the same
|
||||
window.
|
||||
- **Docs / fixture / e2e**: add `Production/LastCycleSeconds` and
|
||||
`Production/LastCycleStartUtc` rows to the fixed-tree table in
|
||||
`docs/drivers/FOCAS.md` with the rollover / counter-reset
|
||||
behaviour documented; add a `Derived telemetry` callout in
|
||||
`docs/v2/focas-deployment.md` explaining the derivation is
|
||||
client-visible only (no new wire calls); no
|
||||
`docs/v2/implementation/focas-wire-protocol.md` change (pure
|
||||
derivation); no focas-mock change beyond `FocasSimFixture`'s
|
||||
existing parameter-patch / timer-patch helpers — add a
|
||||
`SimulateCycleCompletionAsync` convenience helper that increments
|
||||
parts-count and advances the cycle timer atomically; new
|
||||
`Series/CycleDeltaTests.cs` integration test simulates a 5→6
|
||||
parts-count transition; no `scripts/e2e/test-focas.ps1` change.
|
||||
- Effort: small.
|
||||
- Risk: Low — pure derivation.
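
The last-cycle derivation can be sketched as a small state tracker fed once per program poll. This is a Python illustration with hypothetical names (`CycleDeltaTracker`, `observe`). Treating a decreasing parts count as a counter reset that re-baselines without emitting a value is an assumption about the rollover behaviour the docs bullet asks to be documented.

```python
from typing import Optional


class CycleDeltaTracker:
    """Derives the last completed cycle time from parts-count increments."""

    def __init__(self) -> None:
        self._parts: Optional[int] = None
        self._timer_at_mark: float = 0.0

    def observe(self, parts_count: int, cycle_timer_seconds: float) -> Optional[float]:
        """Return the last cycle's duration when the parts count increments, else None."""
        if self._parts is None or parts_count < self._parts:
            # First sample or counter reset: set the baseline, emit nothing.
            self._parts = parts_count
            self._timer_at_mark = cycle_timer_seconds
            return None
        if parts_count == self._parts:
            return None
        delta = cycle_timer_seconds - self._timer_at_mark
        self._parts = parts_count
        self._timer_at_mark = cycle_timer_seconds
        # A negative delta means the timer itself rolled over; skip that sample.
        return delta if delta >= 0 else None
```

Note that the first emitted value measures from the first observed sample rather than from a true cycle start, so it may under-report the cycle that was already in progress at connect.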
|
||||
|
||||
## Documentation, fixture, and e2e impact
|
||||
|
||||
Consolidated view of every doc, fixture, and e2e artefact this plan
|
||||
touches. FOCAS has the largest doc surface of any driver in the v2
|
||||
roadmap because Phase 4 reverses a long-standing read-only design
|
||||
choice that is referenced from at least three user-facing docs and one
|
||||
test-fixture doc.
|
||||
|
||||
### Docs touched (per file, with the heaviest PR called out)
|
||||
|
||||
| Doc | Touched by | Heaviest change |
|
||||
| --- | --- | --- |
|
||||
| `docs/drivers/FOCAS.md` | F1-a, F1-b, F1-c, F1-d, F1-e, F1-f, F2-a, F2-b, F2-d, F3-a, F4-a, F4-b, F4-c, F4-d, F5-a | **F4-a** revokes the read-only callout at lines 14–18; **F2-b** adds the multi-path subsection |
|
||||
| `docs/drivers/FOCAS-Test-Fixture.md` | F1-a, F1-d, F1-f, F2-a, F2-b, F3-a, F4-a | **F4-a** revokes the "`IWritable` intentionally returns `BadNotWritable`" callout at lines 42–43 |
|
||||
| `docs/Driver.FOCAS.Cli.md` | F1-b, F1-c, F2-a, F2-b, F4-a, F4-b, F4-c, F4-d | **F4-a** qualifies the read-only stance at lines 100–116; **F4-d** adds `--cnc-password` flag |
|
||||
| `docs/v2/focas-deployment.md` | F1-f, F3-a, F4-a, F4-b, F4-c, F4-d | **F4-b** adds "Write safety" section; **F4-d** adds "FOCAS password handling" section |
|
||||
| `docs/v2/focas-version-matrix.md` | F1-c, F1-d, F2-a, F2-c, F2-d | **F1-d** adds capability-suppression rows for tooling/offsets |
|
||||
| `docs/v2/implementation/focas-wire-protocol.md` | F1-a, F1-c, F1-d, F1-e, F1-f, F2-a, F2-b, F2-d, F3-a, F4-b, F4-c, F4-d | **F1-d** documents three new structs (ODBTOFS, ODBTLIFE5, IODBZOR); **F4-d** resolves the `cnc_wrunlockparam` open question |
|
||||
| `docs/v2/implementation/focas-simulator-plan.md` | F1-c, F1-d, F1-e, F1-f, F2-a, F2-b, F2-d, F3-a, F4-a, F4-b, F4-c, F4-d | Each PR appends to the protocol surface table; F4-* close out Stream C status |
|
||||
| `docs/v2/decisions.md` (new entry) | F4-a | Net-new decision-record for the read-only reversal |
|
||||
| `docs/featuregaps.md` | F4-a | Updates Build = Yes annotation for #1 / #3 with "shipping behind flag" |
|
||||
|
||||
### Fixture (focas-mock) extensions
|
||||
|
||||
The vendored Python `focas-mock` simulator under
|
||||
`tests/.../IntegrationTests/Docker/focas-mock/` gains the following
|
||||
new command-id handlers and per-profile state:
|
||||
|
||||
| PR | Mock extension |
|
||||
| --- | --- |
|
||||
| F1-a | `cnc_rdcncstat` full-struct response |
|
||||
| F1-b | Seeded values for parameters 6711/6712/6713 in every profile JSON |
|
||||
| F1-c | New `cnc_modal` handler + canned modal payload per profile |
|
||||
| F1-d | `cnc_rdtofs` / `cnc_rdtlife*` / `cnc_rdzofs` handlers + per-profile tool/offset tables, plus a `tools_per_series` profile knob |
|
||||
| F1-e | `cnc_rdopmsg3` / `cnc_rdactpt` handlers + `mock_patch_opmsg` admin endpoint |
|
||||
| F1-f | `cnc_getfigure` handler + per-profile `decimal_places` field |
|
||||
| F2-a | `cnc_rddiag` / `cnc_rddiagdgn` handlers + per-profile diagnostic numbers |
|
||||
| F2-b | Per-path state isolation; new `path_count` profile field; new `thirtyone_i_dual_path` compose profile |
|
||||
| F2-c | No mock change (16i profile already declares F/G ranges) |
|
||||
| F2-d | Wire-call counter admin endpoint |
|
||||
| F3-a | Ring-buffer alarm history + `mock_patch_alarmhistory` admin endpoint |
|
||||
| F4-a | Stub branch returning `BadNotSupported` for write commands |
|
||||
| F4-b | `cnc_wrparam` / `cnc_wrmacro` handlers (with `EW_PASSWD` when locked); `mock_get_last_write` admin endpoint |
|
||||
| F4-c | `pmc_wrpmcrng` handler with byte-aligned write semantics |
|
||||
| F4-d | `cnc_wrunlockparam` handler; `mock_set_password` admin endpoint; locked-state on the param-write path |
|
||||
| F5-a | `SimulateCycleCompletionAsync` helper on `FocasSimFixture` (no new mock command) |
|
||||
|
||||
`FocasSimFixture` (in
|
||||
`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/FocasSimFixture.cs`)
|
||||
gains corresponding admin-API client helpers for each new endpoint.
|
||||
|
||||
### Integration tests (per phase)
|
||||
|
||||
| Phase | New / extended integration tests under `tests/.../FOCAS.IntegrationTests/Series/` |
|
||||
| --- | --- |
|
||||
| Phase 1 | `StatusFlagsPopulateTests.cs`, `ProductionPopulatesTests.cs`, `ModalPopulatesTests.cs`, `ToolingPopulatesTests.cs`, `OffsetsPopulatesTests.cs`, `OperatorMessagesPopulateTests.cs`, `DecimalScalingTests.cs`, `DiagnosticsCountersTests.cs` |
|
||||
| Phase 2 | `DiagAddressTests.cs`, `MultiPathTests.cs`, `PmcCoalescingTests.cs` (plus a 16i row in `FocasCapabilityMatrixTests.cs` for F2-c) |
|
||||
| Phase 3 | `AlarmHistoryProjectionTests.cs` |
|
||||
| Phase 4 | `ParameterWriteTests.cs`, `MacroWriteTests.cs`, `PmcRangeWriteTests.cs`, `PmcBitRmwIntegrationTests.cs`, `PasswordUnlockTests.cs` plus ACL tests under `tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/Authz/FocasWriteAclTests.cs` and `FocasPmcWriteAclTests.cs` |
|
||||
| Phase 5 | `CycleDeltaTests.cs` |
|
||||
|
||||
### E2E script (`scripts/e2e/test-focas.ps1`) updates
|
||||
|
||||
| PR | Change |
|
||||
| --- | --- |
|
||||
| F1-f | New `-CheckDecimalScaling` switch |
|
||||
| F2-b | New `-Paths` switch (matrix mode iterates per declared path) |
|
||||
| F2-c | Adds F0.0 / G50.5 probes to the 16i row of the per-series matrix |
|
||||
| F4-a | Adds `-Write` switch (no-op stage in F4-a; populated by F4-b/c) |
|
||||
| F4-b | Populates `-Write` stage with macro + parameter round-trip writes |
|
||||
| F4-c | Extends `-Write` stage with PMC bit round-trip |
|
||||
| F4-d | Adds `-CncPassword` parameter, threaded through to the CLI |
|
||||
|
||||
`scripts/integration/run-focas.ps1` does not change shape across the
|
||||
plan — it remains the compose up/test/compose down wrapper. New
|
||||
profiles registered by F2-b are automatically picked up via the
|
||||
existing `-Profile` switch.
|
||||
|
||||
### Read-only callouts requiring revocation in Phase 4
|
||||
|
||||
For reviewers' benefit, these are the explicit read-only callouts that
**F4-a must revoke or qualify** in the same PR that flips the design choice:
|
||||
|
||||
- `docs/drivers/FOCAS.md` lines 14–18 ("OtOpcUa is **read-only**
|
||||
against FOCAS… Writes return `BadNotWritable` by design.")
|
||||
- `docs/drivers/FOCAS-Test-Fixture.md` lines 42–43 ("`IWritable`
|
||||
intentionally returns `BadNotWritable` — OtOpcUa is read-only
|
||||
against FOCAS.")
|
||||
- `docs/Driver.FOCAS.Cli.md` lines 100–116 (write section is already
|
||||
documented but predates the server-side flag; needs a
|
||||
server-enforced-ACL note)
|
||||
- `docs/featuregaps.md` (FOCAS row entries for #1 and #3 carry the
|
||||
same read-only-by-design framing — flip annotation)
|
||||
|
||||
## Skip-rated items (for context)
|
||||
|
||||
These appear in the featuregaps recommendations table as Build = No;
|
||||
recapped here so reviewers can confirm the scope decision rather than
|
||||
re-deriving it from `featuregaps.md`:
|
||||
|
||||
- **#2 HSSB transport** — PCI hardware, declining install base,
|
||||
reopens the Fwlib distribution problem the wire client deliberately
|
||||
closed.
|
||||
- **#5 Series 15 / Power Mate D-H / Series 35i** — very legacy; small
|
||||
install base. Capability matrix already accepts `Unknown` as a
|
||||
permissive escape hatch.
|
||||
- **#9 Tool-offset write** — write-heavy; defer alongside the general
|
||||
write decision (F4 covers reads via tool-life only).
|
||||
- **#19 Program list / upload / download / delete** — DNC product
|
||||
territory; significant scope; out of OtOpcUa's MES focus.
|
||||
- **#21 DPRNT TCP listener** — significant scope; modern OPC UA
|
||||
alarms / events supersede it.
|
||||
- **#22 Servo / spindle deep info (`cnc_rdsvinfo` / `cnc_rdspinfo`)** —
|
||||
specialty; load-percent already covers most needs.
|
||||
- **#23 Per-axis acceleration / jerk / feed-per-rev** — niche
|
||||
advanced telemetry.
|
||||
- **#25 Operator write commands (preset, `cnc_setpath`, `cnc_wrabsmac`)** —
|
||||
read-only-by-design covers it; parameter / PMC / macro writes from
|
||||
Phase 4 are the supervisory writes operators actually need.
|
||||
- **#26 CNC time / date sync** — rare ask; commonly handled by CNC NTP.
|
||||
|
||||
## Open questions
|
||||
|
||||
- **Modal command id** (PR F1-c): `cnc_modal` numeric command code is
|
||||
not in the existing wire-protocol notes
|
||||
(`docs/v2/implementation/focas-wire-protocol.md`). Capture during
|
||||
the simulator iteration loop; if the simulator can't yet emit the
|
||||
shape, gate F1-c behind a bench-CNC trace per the
|
||||
diminishing-returns checkpoint.
|
||||
- **Override parameter numbers** (PR F1-c): feedrate / rapid /
|
||||
spindle override register numbers are MTB-specific. Default to the
|
||||
documented Fanuc factory numbers and let operators override per
|
||||
device (`Devices[].OverrideRegisters` map).
|
||||
- **Multi-path discovery** (PR F2-b): does the simulator support
|
||||
multi-path responses today? If not, F2-b lands gated behind the
|
||||
`OTOPCUA_FOCAS_SIM_WIRE_COMPAT=1` flag the wire-protocol doc
|
||||
describes.
|
||||
- **Decimal-scaling migration** (PR F1-f): existing `Float64` axis
nodes are scaled integers today. Decision: the target default is
scaling on, but F1-f ships with the flag default-off for a one-release
deprecation window so existing dashboards don't silently change by a
factor of 10^N when the driver is upgraded. Scaling needs explicit
operator opt-in during that window.
|
||||
- **Write security posture** (Phase 4): should writes require LDAP
|
||||
group `WriteConfigure` (parameters) vs `WriteOperate` (macros /
|
||||
PMC)? Per the memory entry on ACL-at-server-layer, the driver only
|
||||
reports `SecurityClassification`; the server enforces. Need the
|
||||
driver to surface the right classification per address kind:
|
||||
`Configure` for `PARAM:`, `Operate` for `MACRO:` and PMC writes.
|
||||
- **Phase 4 rollout**: ship behind a feature flag in `appsettings.json`
|
||||
(`Drivers.{name}.Config.Writes.Enabled`) with `false` default for at
|
||||
least one release before flipping the default. Update
|
||||
`docs/drivers/FOCAS.md` and `docs/featuregaps.md` in the same PR
|
||||
that flips the default.
|
||||
- **Cycle-delta edge cases** (PR F5-a): parts-count rollover; counter
|
||||
reset by the operator. Default behaviour: emit the delta only when
|
||||
the counter strictly increments by 1; on any other transition emit
|
||||
`Production/LastCycleSeconds` as `null` with `BadOutOfRange` and
|
||||
let the operator interpret.
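
The default cycle-delta rule in the last bullet can be sketched as below. This is a minimal illustration under stated assumptions, not the driver's implementation: `CycleDeltaTracker`, `CycleSample` and the string status values are hypothetical names introduced here, and the sketch assumes the parts counter is polled with a UTC timestamp.

```csharp
using System;

// Hypothetical result shape: Seconds is null when the delta cannot be trusted.
public readonly record struct CycleSample(double? Seconds, string Status);

// Hypothetical tracker for Production/LastCycleSeconds.
public sealed class CycleDeltaTracker
{
    private long? _lastCount;
    private DateTime _lastChangeUtc;

    // Call on every poll of the parts counter. Returns null when there is
    // nothing to publish (first observation, or the counter did not move).
    public CycleSample? Observe(long partsCount, DateTime nowUtc)
    {
        if (_lastCount is null)
        {
            // First observation: nothing to compare against yet.
            _lastCount = partsCount;
            _lastChangeUtc = nowUtc;
            return null;
        }

        if (partsCount == _lastCount) return null; // no transition

        var isSingleIncrement = partsCount == _lastCount + 1;
        var elapsed = (nowUtc - _lastChangeUtc).TotalSeconds;
        _lastCount = partsCount;
        _lastChangeUtc = nowUtc;

        // Rollover, operator reset, or skipped counts all land in the Bad branch.
        return isSingleIncrement
            ? new CycleSample(elapsed, "Good")
            : new CycleSample(null, "BadOutOfRange");
    }
}
```

Treating every non-`+1` transition the same way keeps the rule easy to document; distinguishing rollover from reset would need the counter's configured maximum, which this sketch does not assume.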
|
||||
@@ -0,0 +1,863 @@
|
||||
# OpcUaClient Driver — Implementation Plan
|
||||
|
||||
> Source of gap analysis: [featuregaps.md → OpcUaClient](../featuregaps.md#opcuaclient-opc-ua-aggregation-client)
|
||||
>
|
||||
> Covers Build = Yes items only. Numbering matches the featuregaps Recommendations table.
|
||||
|
||||
## Summary
|
||||
|
||||
The OpcUaClient driver already ships 8/8 capability interfaces and a working
|
||||
end-to-end Session/Subscription/MonitoredItem/HistoryRead pipeline backed by
|
||||
the OPC Foundation `OPCFoundation.NetStandard.Opc.Ua.Client` SDK. Most of the
|
||||
14 Build = Yes gaps are operability or curation knobs — config surface +
|
||||
plumbing into existing SDK calls — rather than new protocol implementation.
|
||||
A small number need genuinely new SDK plumbing (Reverse Connect,
|
||||
ModelChangeEvent subscribe) and one (`ReadEventsAsync`) needs a coordinated
|
||||
cross-driver interface change.
|
||||
|
||||
The plan groups the work into five phases, ordered to deliver per-tag /
|
||||
per-subscription operability first (highest-frequency operator pain), then
|
||||
curation, then change tracking, then connectivity, then historical+HA. Each
|
||||
PR sticks to one feature-gap row so reviews stay narrow.
|
||||
|
||||
## Phased delivery
|
||||
|
||||
| Phase | Theme | Gaps | PRs | Notes |
| :---: | --- | --- | :---: | --- |
| 1 | Operability knobs | #5, #6, #15, #17, #20 | 5 | Pure SDK config surface; no new wire flows |
| 2 | Discovery & curation | #2, #7, #8, #9 | 4 | Touches `ITagDiscovery` + adds method invoke |
| 3 | Change tracking | #10 | 1 | New session-level subscription on `Server` node |
| 4 | Connectivity | #1 | 1 | Reverse Connect: new listener path |
| 5 | Historical & redundancy | #12, #13, #14 | 3 | Includes the cross-driver `IHistoryProvider` change |
|
||||
|
||||
**Total: 14 PRs across 5 phases.** Phases 1-3 land independently against
|
||||
the existing single-session model. Phase 4 ships in parallel with phases 2-3
|
||||
since it doesn't touch `OpcUaClientDriver` proper. Phase 5's first PR is a
|
||||
prerequisite for the `ReadEventsAsync` work in every other history-capable
|
||||
driver and must coordinate with them.
|
||||
|
||||
## Per-PR detail
|
||||
|
||||
### Phase 1 — Operability knobs
|
||||
|
||||
#### PR-1: Per-subscription tuning (gap #6)
|
||||
|
||||
**Goal**: lift the hard-coded `KeepAliveCount=10`, `LifetimeCount=1000`,
|
||||
`MaxNotificationsPerPublish=0`, `Priority=0`, `PublishingInterval` floor of
|
||||
50 ms into `OpcUaClientDriverOptions` so high-event-rate servers can be
|
||||
defended against (`MaxNotificationsPerPublish=0` is unlimited — the
|
||||
documented DoS surface) and high-tag-count deployments can split by
|
||||
priority.
|
||||
|
||||
**SDK API**:
|
||||
- `Subscription.SetPublishingMode(bool, ct)` for runtime enable/disable
|
||||
- `SubscriptionOptions.PublishingInterval / KeepAliveCount / LifetimeCount /
|
||||
MaxNotificationsPerPublish / Priority` set at create-time
|
||||
- New options class `OpcUaSubscriptionDefaults` (publish interval floor,
|
||||
keep-alive count, lifetime count, max notifications, priority)
|
||||
|
||||
**Files**:
|
||||
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add `Subscriptions`
|
||||
sub-section
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — `SubscribeAsync` reads from
|
||||
options
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs`: `SubscribeAlarmsAsync` reuses
the same defaults but with `Priority=1`, one step above data subscriptions
(`Priority=0`), so alarms aren't starved during data bursts
|
||||
|
||||
**Tests**: `OpcUaClientSubscribeAndProbeTests` — assert options propagate;
|
||||
add a stress unit test (mocked `Subscription`) that asserts custom
|
||||
`MaxNotificationsPerPublish` is forwarded so a value > 0 actually reaches
|
||||
the SDK.
|
||||
|
||||
**Risks**: Setting `LifetimeCount` too low against a server with publish-
|
||||
throttling can drop subscriptions; doc the formula (`LifetimeCount >=
|
||||
3 * KeepAliveCount`).
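
The `LifetimeCount >= 3 * KeepAliveCount` rule and the 50 ms publishing floor could be enforced when options are loaded, roughly as follows. This is a sketch only: the record shape, the `Normalize` method and the 1000 ms default interval are assumptions made for illustration; the other defaults mirror the hard-coded values named in the Goal.

```csharp
using System;

// Hypothetical options shape; the plan names the class but not its members.
public sealed record OpcUaSubscriptionDefaults(
    double PublishingIntervalMs = 1000,   // assumed default, not from the plan
    uint KeepAliveCount = 10,
    uint LifetimeCount = 1000,
    uint MaxNotificationsPerPublish = 0,  // 0 = unlimited
    byte Priority = 0)
{
    public const double PublishingIntervalFloorMs = 50;

    // Returns a copy that respects the interval floor and the
    // LifetimeCount >= 3 * KeepAliveCount rule.
    public OpcUaSubscriptionDefaults Normalize()
    {
        var interval = Math.Max(PublishingIntervalMs, PublishingIntervalFloorMs);
        var keepAlive = Math.Max(KeepAliveCount, 1u);
        var minLifetime = checked(keepAlive * 3u);
        var lifetime = Math.Max(LifetimeCount, minLifetime);
        return this with
        {
            PublishingIntervalMs = interval,
            KeepAliveCount = keepAlive,
            LifetimeCount = lifetime
        };
    }
}
```

Silently raising `LifetimeCount` is one design choice; rejecting the configuration with a validation error is equally defensible and may be easier for operators to notice.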
|
||||
|
||||
**Docs / fixture / e2e**: new "Subscription tuning" subsection in
|
||||
`docs/drivers/OpcUaClient.md` (create if missing) documenting the
|
||||
`Subscriptions` options block with the `LifetimeCount >= 3 *
|
||||
KeepAliveCount` formula; cross-link from the "Advanced options" section
|
||||
of `docs/Client.CLI.md` so CLI users discover the knobs. Fixture: opc-plc
|
||||
already publishes fast tickers (`FastUInt1` @ 100 ms) sufficient for
|
||||
coverage — no fixture-side change. Integration test in
|
||||
`tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/` asserting
|
||||
custom `KeepAliveCount` / `Priority` reach the wire (capture via
|
||||
`OpcPlcFixture` keepalive count). E2E: extend
|
||||
`scripts/e2e/test-opcuaclient.ps1` with a stage that sets a non-default
|
||||
publish interval and confirms the local subscription honours it.
|
||||
|
||||
---
|
||||
|
||||
#### PR-2: Per-tag advanced subscription tuning incl. deadband (gap #5)
|
||||
|
||||
**Goal**: surface `SamplingInterval`, `QueueSize`, `DiscardOldest`,
|
||||
`MonitoringMode`, and `DataChangeFilter` (DeadbandType=Absolute/Percent +
|
||||
Trigger=Status/StatusValue/StatusValueTimestamp) per-tag. Deadband is the
|
||||
baseline analog noise filter every commercial UA aggregator ships and the
|
||||
single feature most likely to cut bandwidth on busy plants.
|
||||
|
||||
**SDK API**:
|
||||
- `MonitoredItem.Filter = new DataChangeFilter { Trigger =
|
||||
DataChangeTrigger.StatusValue, DeadbandType = (uint)DeadbandType.Absolute,
|
||||
DeadbandValue = 0.5 }`
|
||||
- `MonitoredItemOptions.QueueSize / DiscardOldest / SamplingInterval /
|
||||
MonitoringMode`
|
||||
- Per-tag override structure: extend the `SubscribeAsync` parameter shape
|
||||
(or add an overload accepting a `IReadOnlyList<MonitoredTagSpec>`) — note
|
||||
this requires coordinating with `ISubscribable` so the per-tag carrier
|
||||
reaches the driver.
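
For readers picking deadband values, the arithmetic behind the two deadband types can be sketched as below. The filter itself is evaluated by the upstream server, so this helper is purely illustrative; `DeadbandKind` and `DeadbandMath` are hypothetical names, and the Percent branch assumes the usual definition of a percentage of the EURange span.

```csharp
using System;

public enum DeadbandKind { None, Absolute, Percent }

public static class DeadbandMath
{
    // True when a new sample differs enough from the last reported value
    // to be published. For Percent, deadbandValue is a percentage of
    // (euHigh - euLow), so an EURange is required.
    public static bool ExceedsDeadband(
        double lastReported, double current,
        DeadbandKind kind, double deadbandValue,
        double? euLow = null, double? euHigh = null)
    {
        var delta = Math.Abs(current - lastReported);
        switch (kind)
        {
            case DeadbandKind.None:
                return delta > 0;
            case DeadbandKind.Absolute:
                return delta > deadbandValue;
            case DeadbandKind.Percent:
                if (euLow is null || euHigh is null)
                    throw new InvalidOperationException("Percent deadband needs an EURange.");
                return delta > deadbandValue / 100.0 * (euHigh.Value - euLow.Value);
            default:
                throw new ArgumentOutOfRangeException(nameof(kind));
        }
    }
}
```

With an EURange of 0 to 200, a 1 % deadband corresponds to a band of 2 engineering units, which is the kind of worked example the new doc section could carry.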
|
||||
|
||||
**Files**:
|
||||
- `src/.../Core.Abstractions/ISubscribable.cs` — add overload
|
||||
`SubscribeAsync(IReadOnlyList<MonitoredTagSpec>, ...)` keeping old API
|
||||
for source compat
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — translate spec → SDK filter
|
||||
|
||||
**Tests**: assert `DataChangeFilter` lands on the `MonitoredItem.Filter` for
|
||||
each kind of trigger; assert PercentDeadband requires server-side
|
||||
EURange (server returns `BadFilterNotAllowed` if not configured) — capture
|
||||
the StatusCode and surface as a usable error.
|
||||
|
||||
**Risks**: cross-cutting `ISubscribable` change. Mitigation: ship the
|
||||
overload as additive — existing single-arg path still exists.
|
||||
|
||||
**Docs / fixture / e2e**: new "Per-tag deadband and monitoring filters"
|
||||
section in `docs/drivers/OpcUaClient.md` (create if missing) with worked
|
||||
examples of Absolute vs Percent deadband + the EURange prerequisite;
|
||||
update `docs/Client.CLI.md` `subscribe` command page with the new tag-
|
||||
config syntax for `--deadband` / `--queue-size` / `--discard-oldest`;
|
||||
update `docs/Client.UI.md` Subscriptions tab section to mirror. Fixture:
|
||||
`OpcPlcFixture` / `OpcPlcProfile` seeds an analog (`StepUp` already
|
||||
oscillates) and confirms `EURange` is published — extend the profile to
|
||||
flag noisy nodes. Integration test in
|
||||
`tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/` asserts
|
||||
publish suppression below the deadband threshold. E2E: add a
|
||||
`-DeadbandValue` stage to `scripts/e2e/test-opcuaclient.ps1` (and a
|
||||
`deadband` knob to `scripts/e2e/e2e-config.sample.json`) that subscribes,
|
||||
asserts no spurious updates within the band.
|
||||
|
||||
---
|
||||
|
||||
#### PR-3: Honor server `OperationLimits` (gap #15)
|
||||
|
||||
**Goal**: read `Server.ServerCapabilities.OperationLimits.MaxNodesPerRead /
|
||||
Write / Browse / HistoryReadData` once after Session activation, cache,
|
||||
and chunk batch operations to those caps client-side. Today the SDK chunks
|
||||
on its internal default; against an undersized embedded UA server this
|
||||
results in `BadTooManyOperations`.
|
||||
|
||||
**SDK API**:
|
||||
- After session open: `Session.ReadAsync` of
|
||||
`VariableIds.Server_ServerCapabilities_OperationLimits_MaxNodesPerRead`
|
||||
+ sibling NodeIds. The SDK exposes `Session.OperationLimits` after
|
||||
`FetchOperationLimits` is called — prefer that path.
|
||||
- `Session.FetchOperationLimitsAsync(ct)` (1.5+); fallback: explicit Read.
|
||||
|
||||
**Files**:
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — call
|
||||
`FetchOperationLimitsAsync` post-`OpenSessionOnEndpointAsync`; honour
|
||||
caps in `ReadAsync`, `WriteAsync`, `BrowseRecursiveAsync`,
|
||||
`EnrichAndRegisterVariablesAsync`, `ExecuteHistoryReadAsync`.
|
||||
|
||||
**Tests**: mock `Session.OperationLimits` to a value below the test batch
|
||||
size and assert the driver issues N wire calls instead of one.
|
||||
|
||||
**Risks**: a zero on the server means "no limit" per Part 5 — don't divide
|
||||
by zero.
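
The client-side chunking, including the zero-means-unlimited case, might look like the following. This is a minimal sketch: `OperationLimitChunker` is a hypothetical helper name and it works on any list, so it makes no assumption about how the driver obtains the limit from the SDK.

```csharp
using System;
using System.Collections.Generic;

public static class OperationLimitChunker
{
    // Splits items into batches no larger than serverLimit.
    // A limit of 0 means "no limit": the whole input goes out as one batch,
    // which also avoids any division or modulo by zero.
    public static IEnumerable<IReadOnlyList<T>> Chunk<T>(IReadOnlyList<T> items, uint serverLimit)
    {
        if (items.Count == 0) yield break;

        var size = (serverLimit == 0 || serverLimit > (uint)items.Count)
            ? items.Count
            : (int)serverLimit;

        for (var offset = 0; offset < items.Count; offset += size)
        {
            var count = Math.Min(size, items.Count - offset);
            var batch = new List<T>(count);
            for (var i = 0; i < count; i++) batch.Add(items[offset + i]);
            yield return batch;
        }
    }
}
```

Each yielded batch would map to one wire call, which is what the "N wire calls instead of one" unit test can count.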
|
||||
|
||||
**Docs / fixture / e2e**: new "Server OperationLimits handling"
|
||||
subsection in `docs/drivers/OpcUaClient.md` documenting the auto-fetch
|
||||
behaviour, the zero-means-unlimited semantics, and how to override via
|
||||
options if the server reports an inaccurate value. Fixture: opc-plc
|
||||
publishes the standard ServerCapabilities tree out of the box — no
|
||||
container-side change; the `OpcPlcFixture` seed validates the IDs at
|
||||
collection init. Integration test asserts batch reads chunk to the
|
||||
fetched cap. No e2e change needed (the script's batch sizes are already
|
||||
small).
|
||||
|
||||
---
|
||||
|
||||
#### PR-4: Diagnostics counters (gap #17)
|
||||
|
||||
**Goal**: expose per-driver counters on `DriverHealth` (or a sibling
|
||||
`DriverDiagnostics` surface): publish-request count, notifications-per-
|
||||
second EWMA, missing-publish-request count, dropped-notification rate,
|
||||
session-reset count. Operators currently see only `LastSuccessfulRead`
|
||||
+ last error.
|
||||
|
||||
**SDK API**:
|
||||
- `Subscription.Notification` event fires per published notification — bump
|
||||
a counter
|
||||
- `Subscription.PublishStateChanged` event for missed-publish detection
|
||||
- `Session.PublishError` event for channel-level errors
|
||||
- `Session.SessionClosing`/`SessionConfigurationChanged` for session-reset
|
||||
attribution
|
||||
|
||||
**Files**:
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — instrument hooks; expose via
|
||||
`IDriver.GetDiagnostics()` or extend `DriverHealth`
|
||||
- `src/.../Core.Abstractions/IDriver.cs` — confirm where the counter shape
|
||||
lives; if `DriverHealth` is too rigid, add `IDriverDiagnostics` (mirrors
|
||||
the Modbus `driver-diagnostics` RPC pattern from #154)
|
||||
|
||||
**Tests**: synthetic notification fan-out → assert counters increment;
|
||||
session close → assert reset count bumps.
|
||||
|
||||
**Risks**: counters need to be lock-free hot-path safe; use
|
||||
`Interlocked.Increment` and a single sliding-window clock per counter.
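
One plausible shape for hot-path-safe counters is sketched below. It is an assumption-laden illustration: `DriverCounters` and its members are hypothetical, and the rate is computed as an EWMA by a periodic sampler rather than by the per-counter sliding-window clock the plan mentions, purely to keep the example short.

```csharp
using System;
using System.Threading;

// Hypothetical counter holder; event handlers only ever call the On* methods.
public sealed class DriverCounters
{
    private long _notifications;
    private long _sessionResets;
    private long _lastSampledNotifications;
    private double _notificationsPerSecondEwma;

    // Hot path: a single interlocked increment, no locks, no allocation.
    public void OnNotification() => Interlocked.Increment(ref _notifications);
    public void OnSessionReset() => Interlocked.Increment(ref _sessionResets);

    public long Notifications => Interlocked.Read(ref _notifications);
    public long SessionResets => Interlocked.Read(ref _sessionResets);
    public double NotificationsPerSecond => Volatile.Read(ref _notificationsPerSecondEwma);

    // Called by one periodic sampler (for example a timer), never concurrently.
    public void Sample(TimeSpan elapsed, double alpha = 0.2)
    {
        var total = Interlocked.Read(ref _notifications);
        var delta = total - _lastSampledNotifications;
        _lastSampledNotifications = total;

        var rate = delta / elapsed.TotalSeconds;
        var previous = Volatile.Read(ref _notificationsPerSecondEwma);
        Volatile.Write(ref _notificationsPerSecondEwma, alpha * rate + (1 - alpha) * previous);
    }
}
```

Keeping the smoothing off the notification path means the event handlers stay at one atomic add each, at the cost of the rate only being as fresh as the sampler interval.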
|
||||
|
||||
**Docs / fixture / e2e**: new "Driver diagnostics" section in
|
||||
`docs/drivers/OpcUaClient.md` enumerating each counter and the event
|
||||
that bumps it; cross-link to the `driver-diagnostics` Admin RPC
|
||||
documented for Modbus (#154 pattern). Fixture: no opc-plc change
|
||||
required. Integration test exercises `IDriverDiagnostics` after
|
||||
forcing a session close. E2E: extend
|
||||
`scripts/e2e/test-opcuaclient.ps1` with a "diagnostics snapshot" stage
|
||||
that asserts publish/notification counters are non-zero after the
|
||||
subscribe stage.
|
||||
|
||||
---
|
||||
|
||||
#### PR-5: CRL / revocation handling (gap #20)
|
||||
|
||||
**Goal**: explicit revoked-cert handling in `CertificateValidator` plus a
|
||||
`RejectSHA1SignedCertificates` knob. Today the validator hooks
|
||||
`BadCertificateUntrusted` only — a revoked cert silently fails as
|
||||
"untrusted" with no operator-visible distinction.
|
||||
|
||||
**SDK API**:
|
||||
- `CertificateValidator.CertificateValidation` event — inspect
|
||||
`e.Error.StatusCode` for `BadCertificateRevoked`,
|
||||
`BadCertificateRevocationUnknown`,
|
||||
`BadCertificateIssuerRevocationUnknown`,
|
||||
`BadCertificatePolicyCheckFailed`
|
||||
- `SecurityConfiguration.RejectSHA1SignedCertificates`,
|
||||
`SecurityConfiguration.RejectUnknownRevocationStatus`,
|
||||
`SecurityConfiguration.MinimumCertificateKeySize` — direct config
|
||||
bool/int knobs already on the SDK type
|
||||
- `CertificateTrustList.AddCRL` / per-store CRL directories under
|
||||
`%LocalAppData%\OtOpcUa\pki\{trusted,issuers}\crl\`
|
||||
|
||||
**Files**:
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — `BuildApplicationConfigurationAsync`
|
||||
honours new options, validator handler distinguishes revoked vs untrusted
|
||||
in the surfaced error message
|
||||
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add
|
||||
`RejectSHA1SignedCertificates`, `RejectUnknownRevocationStatus`,
|
||||
`MinimumCertificateKeySize`
|
||||
|
||||
**Tests**: feed a SHA1-signed test cert and a revoked cert through the
|
||||
validator with the new knobs on/off.
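
The operator-visible distinction between revoked and untrusted certificates could be as small as a lookup from the status name to a message. The sketch below is hypothetical throughout: `CertificateFailureText` is an illustrative name, the message wording is invented, and it keys on the symbolic status-code names listed above rather than on SDK types so that it stays self-contained.

```csharp
public static class CertificateFailureText
{
    // statusName is the symbolic OPC UA status code name reported by the validator.
    public static string Describe(string statusName) => statusName switch
    {
        "BadCertificateRevoked" =>
            "Certificate has been revoked by its issuer.",
        "BadCertificateRevocationUnknown" =>
            "Revocation status could not be determined (no usable CRL).",
        "BadCertificateIssuerRevocationUnknown" =>
            "Issuer revocation status could not be determined (no usable CRL).",
        "BadCertificatePolicyCheckFailed" =>
            "Certificate violates a configured policy (for example signature algorithm or key size).",
        "BadCertificateUntrusted" =>
            "Certificate is not in the trusted store.",
        _ => $"Certificate validation failed: {statusName}."
    };
}
```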
|
||||
|
||||
**Risks**: PKI directory layout changes — existing deployments need a
|
||||
migration note.
|
||||
|
||||
**Docs / fixture / e2e**: new "Certificate revocation and SHA1 rejection"
|
||||
subsection in `docs/drivers/OpcUaClient.md` documenting the CRL
|
||||
directory layout under `%LocalAppData%\OtOpcUa\pki\{trusted,issuers}\crl\`
|
||||
and the new options (with a migration note for existing PKI stores);
|
||||
cross-link from `docs/security.md`. Fixture: extend
|
||||
`OpcPlcFixture` / `Docker/docker-compose.yml` with an optional secured
|
||||
endpoint variant and a SHA1-signed test cert checked into the test
|
||||
project's resources for the validator unit test. Integration test
|
||||
exercises a revoked cert via a local CRL drop. E2E: add a
|
||||
`-Insecure:$false` smoke stage to `scripts/e2e/test-opcuaclient.ps1`
|
||||
that asserts a revoked cert produces a distinguishable error message.
|
||||
|
||||
---
|
||||
|
||||
### Phase 2 — Discovery & curation
|
||||
|
||||
#### PR-6: Discovery URL `FindServers` (gap #2)
|
||||
|
||||
**Goal**: accept a discovery URL (`opc.tcp://host:4840` pointing at the
|
||||
LDS or the server's own discovery endpoint) and surface advertised servers
|
||||
+ endpoints to the operator without manual policy/mode tuple copy.
|
||||
|
||||
**SDK API**:
|
||||
- `DiscoveryClient.CreateAsync(appConfig, new Uri(url), DiagnosticsMasks.None, ct)`
|
||||
- `DiscoveryClient.FindServersAsync(null, ct)` → `ApplicationDescription[]`
|
||||
- `DiscoveryClient.GetEndpointsAsync(null, ct)` per advertised `DiscoveryUrl`
|
||||
|
||||
**Files**:
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — new internal
|
||||
`DiscoverServersAsync` helper; extend the Admin-side discovery RPC to
|
||||
invoke it (driver-diagnostics pattern from #154)
|
||||
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add
|
||||
`DiscoveryUrl` knob (alternative to explicit `EndpointUrls` — when set
|
||||
the driver runs `FindServers` at init and feeds the result into the
|
||||
failover candidate list)
|
||||
|
||||
**Tests**: mock `DiscoveryClient` returning two advertised servers each
|
||||
with three endpoints; assert the candidate list reflects the policy/mode
|
||||
filter applied client-side.
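
The client-side policy/mode filter that this test targets might be sketched as follows. `AdvertisedEndpoint` and `EndpointCandidateFilter` are hypothetical stand-ins for whatever the driver maps the SDK's endpoint descriptions into; only the filtering logic is the point here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened view of one advertised endpoint.
public sealed record AdvertisedEndpoint(string EndpointUrl, string SecurityPolicyUri, string SecurityMode);

public static class EndpointCandidateFilter
{
    // Keeps only endpoints matching the configured policy and mode,
    // preserving the advertised order for the failover candidate list.
    public static IReadOnlyList<AdvertisedEndpoint> Select(
        IEnumerable<AdvertisedEndpoint> advertised, string wantedPolicyUri, string wantedMode) =>
        advertised
            .Where(e => string.Equals(e.SecurityPolicyUri, wantedPolicyUri, StringComparison.Ordinal)
                     && string.Equals(e.SecurityMode, wantedMode, StringComparison.OrdinalIgnoreCase))
            .ToList();
}
```

With two servers advertising three endpoints each, a single policy/mode pair should leave at most one candidate per server, which gives the unit test a concrete count to assert.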
|
||||
|
||||
**Risks**: `FindServers` usually runs over an unsecured
(`SecurityMode=None`) channel; state explicitly in the doc that the
discovery channel is unsecured even when the data channel will be encrypted.
|
||||
|
||||
**Docs / fixture / e2e**: new "Discovery URL (`FindServers`)" section in
|
||||
`docs/drivers/OpcUaClient.md` with the unsecured-discovery-vs-secured-
|
||||
data caveat called out; cross-link from `docs/Client.CLI.md` if a
|
||||
`discover` CLI command surfaces. Fixture: opc-plc already responds to
|
||||
`FindServers` on the same endpoint — `OpcPlcFixture` adds a discovery
|
||||
probe at collection init. Integration test exercises the helper against
|
||||
the live opc-plc container and asserts at least one
|
||||
`ApplicationDescription` returned. E2E: replace the hard-coded
|
||||
`-RemoteUrl` stage in `scripts/e2e/test-opcuaclient.ps1` with an
|
||||
optional `-DiscoveryUrl` mode that picks the first advertised endpoint.
|
||||
|
||||
---
|
||||
|
||||
#### PR-7: Selective import / namespace remap (gap #7)
|
||||
|
||||
**Goal**: per-branch include/exclude rules, namespace-URI remapping, and
|
||||
re-keyed BrowseNames — the curation surface every commercial aggregator
|
||||
ships.
|
||||
|
||||
**Approach**: extend `OpcUaClientDriverOptions` with a `Curation` section:
|
||||
- `IncludePaths: string[]` — glob or NodeId-rooted prefix list; only paths
|
||||
matching are imported
|
||||
- `ExcludePaths: string[]` — wins over Include (Include is allow-list,
|
||||
Exclude is block-list)
|
||||
- `NamespaceRemap: Dictionary<string,string>` — upstream NS URI →
|
||||
local-side alias for BrowseName generation
|
||||
- `RootAlias: string` — default `"Remote"`; replaces the hardcoded folder
|
||||
name today
|
||||
|
||||
**SDK API** — none new; this is pure local filtering inside
|
||||
`BrowseRecursiveAsync` and `EnrichAndRegisterVariablesAsync`.
|
||||
|
||||
**Files**:
|
||||
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs`
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` —
|
||||
`BrowseRecursiveAsync` consults the rule set; helper
|
||||
`MapNamespaceForBrowseName` handles NS remap
|
||||
|
||||
**Tests**: synthetic browse tree, exercise include/exclude/remap each
|
||||
independently and combined; verify the cap accounting in
|
||||
`MaxDiscoveredNodes` excludes filtered nodes.
|
||||
|
||||
**Risks**: glob semantics — pin to a small subset (`*`, `?` only — no
|
||||
character classes or `**`) to keep the doc + behaviour simple.
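
The restricted glob subset and the Exclude-wins rule can be sketched as below. Several points are assumptions made for illustration: `CurationRules` is a hypothetical helper, `*` is allowed to match across path separators, and an empty Include list is treated as "allow everything".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CurationRules
{
    // '*' matches any run of characters (including none), '?' matches exactly
    // one character. No character classes and no "**".
    public static bool GlobMatch(string pattern, string text)
    {
        int p = 0, t = 0, starP = -1, starT = 0;
        while (t < text.Length)
        {
            if (p < pattern.Length && pattern[p] == '*')
            {
                starP = p++;   // remember the star and first try matching nothing
                starT = t;
            }
            else if (p < pattern.Length && (pattern[p] == '?' || pattern[p] == text[t]))
            {
                p++; t++;
            }
            else if (starP >= 0)
            {
                p = starP + 1; // backtrack: let the last star absorb one more char
                t = ++starT;
            }
            else return false;
        }
        while (p < pattern.Length && pattern[p] == '*') p++;
        return p == pattern.Length;
    }

    // Exclude wins over Include; an empty Include list allows everything (assumption).
    public static bool IsImported(
        string path, IReadOnlyCollection<string> include, IReadOnlyCollection<string> exclude)
    {
        if (exclude.Any(x => GlobMatch(x, path))) return false;
        return include.Count == 0 || include.Any(i => GlobMatch(i, path));
    }
}
```

Checking Exclude first makes the precedence explicit in code, so a node matched by both lists is never imported.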
|
||||
|
||||
**Docs / fixture / e2e**: new "Curation: include/exclude and namespace
|
||||
remap" section in `docs/drivers/OpcUaClient.md` with worked examples of
|
||||
each rule kind and the supported glob subset; update
|
||||
`docs/drivers/OpcUaClient-Test-Fixture.md` "Coverage map" with the new
|
||||
filtering rows. Fixture: extend `OpcPlcProfile` to enumerate which
|
||||
upstream namespaces are exercised so curation tests can target them.
|
||||
Integration test seeds an Include + Exclude + Remap rule and asserts
|
||||
the local tree reflects the filter. E2E: add a
|
||||
`-IncludePath` / `-NamespaceRemap` set of params to
|
||||
`scripts/e2e/test-opcuaclient.ps1` that asserts the local browse depth
|
||||
matches the rule.
|
||||
|
||||
---
|
||||
|
||||
#### PR-8: Type definition mirroring (gap #8)
|
||||
|
||||
**Goal**: walk the upstream `Types` folder (`ObjectTypes`,
|
||||
`VariableTypes`, `DataTypes`, `ReferenceTypes`) and project them into the
|
||||
local address space so downstream UI clients keep type-aware rendering and
|
||||
structured DataTypes decode correctly.
|
||||
|
||||
**SDK API**:
|
||||
- `Session.NodeCache.FetchNode(typeNodeId)` for type metadata
|
||||
- `Session.LoadDataTypeSystem` — for structured DataType encoding
|
||||
- `Session.FetchTypeTree(NodeIdCollection)` — populates the session's
|
||||
type cache from the server
|
||||
|
||||
**Files**:
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — new pass-3 in `DiscoverAsync`
|
||||
that walks `i=86` (Types folder) under the curation rules, registers a
|
||||
parallel type subtree, and links variables to their TypeDefinition via
|
||||
HasTypeDefinition references on the address-space builder
|
||||
- `src/.../Core.Abstractions/IAddressSpaceBuilder.cs` — confirm whether
|
||||
the builder accepts type nodes; if not, extend it (this likely is a
|
||||
prerequisite — if so, it gets its own preceding PR-8a)
|
||||
|
||||
**Tests**: mock browse returning `BaseObjectType -> DerivedThing`;
|
||||
assert local builder receives the type node + the HasTypeDefinition link.
|
||||
|
||||
**Risks**: significant. Type mirroring touches `IAddressSpaceBuilder`
|
||||
which is a cross-cutting interface every driver depends on. If
|
||||
`IAddressSpaceBuilder` already supports type nodes (Galaxy has type-like
|
||||
templates), reuse that surface; otherwise this PR splits.
|
||||
|
||||
**Docs / fixture / e2e**: new "Type mirroring" section in
|
||||
`docs/drivers/OpcUaClient.md` documenting which type nodes get walked
|
||||
and how downstream UA clients see the HasTypeDefinition references; also
|
||||
note in `docs/Client.UI.md` that the Browse tree now shows mirrored
|
||||
types. Fixture: opc-plc already exposes the standard `Types` folder;
|
||||
extend `OpcPlcProfile` to assert at least one custom ObjectType is
|
||||
present. Integration test browses the local Types folder post-discovery
|
||||
and asserts the upstream type chain landed. No e2e change needed beyond
|
||||
extending the existing browse stage to walk under `Types`.
|
||||
|
||||
---
|
||||
|
||||
#### PR-9: Method node mirroring + `Call` passthrough (gap #9)
|
||||
|
||||
**Goal**: discover `NodeClass.Method` nodes in the browse pass, expose
|
||||
them on the local address space, and forward `Call` invocations as
|
||||
`Session.CallAsync` against the upstream node. The driver already calls
|
||||
`AcknowledgeableConditionType.Acknowledge` for A&C — generalize that path.
|
||||
|
||||
**SDK API**:
|
||||
- `Session.CallAsync(requestHeader, methodsToCall: CallMethodRequestCollection, ct)`
|
||||
returning `CallMethodResultCollection`
|
||||
- Browse already covers Method nodes by lifting the `NodeClassMask`; need
|
||||
to additionally browse `HasProperty` to discover `InputArguments` /
|
||||
`OutputArguments` for argument translation
|
||||
|
||||
**Files**:
|
||||
- `src/.../Core.Abstractions/IDriver.cs` — add `IMethodInvoker` capability
|
||||
interface (this is a NEW capability, not a tweak to an existing one)
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — implement
|
||||
`IMethodInvoker.InvokeAsync(string objectId, string methodId,
|
||||
IReadOnlyList<object?> inputs, ct)`; refactor `AcknowledgeAsync` to
|
||||
reuse the common path
|
||||
- `src/.../Server/...` node-manager — wire `IMethodInvoker` to the OPC UA
|
||||
server's `MethodState.OnCallMethod` hook so downstream Call requests
|
||||
reach the driver
|
||||
|
||||
**Tests**: mock `Session.CallAsync` returning Good + an output collection;
|
||||
assert pass-through fidelity. Also assert per-argument `BadInvalidArgument`
|
||||
codes pass through.
|
||||
|
||||
**Risks**: high — adds a new capability interface. Other drivers that
|
||||
*could* support methods (Galaxy via `OnExecute` scripts, FOCAS via FOCAS
|
||||
commands) gain a clean extension point but each is its own follow-up.
|
||||
|
||||
**Docs / fixture / e2e**: new "Method nodes and Call passthrough"
|
||||
section in `docs/drivers/OpcUaClient.md` explaining how method calls
|
||||
flow through the aggregator (input/output argument translation, error-
|
||||
code passthrough); add a `call` command page to `docs/Client.CLI.md`
|
||||
covering the new path; mirror in `docs/Client.UI.md` if a UI surface
|
||||
ships. Fixture: opc-plc already exposes the standard
|
||||
`Server.GetMonitoredItems` method — `OpcPlcFixture` registers it as the
|
||||
canonical method-call target. Integration test in
|
||||
`tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/` invokes
|
||||
`Server.GetMonitoredItems` through the aggregator. E2E: add a
|
||||
`-MethodNodeId` stage to `scripts/e2e/test-opcuaclient.ps1` that calls
|
||||
the method through the local server and asserts the output matches the
|
||||
direct upstream call.
|
||||
|
||||
---
|
||||
|
||||
### Phase 3 — Change tracking
|
||||
|
||||
#### PR-10: Auto re-import on `ModelChangeEvent` (gap #10)
|
||||
|
||||
**Goal**: subscribe to `BaseModelChangeEventType` /
|
||||
`GeneralModelChangeEventType` on the upstream server's `i=2253` Server
|
||||
node so when the upstream topology changes (new tag added, type modified)
|
||||
the driver triggers a `ReinitializeAsync`-style re-import without
|
||||
operator action.
|
||||
|
||||
**SDK API**:
|
||||
- A second `Subscription` on the Session, monitoring `Server` node
|
||||
(`ObjectIds.Server`) with an `EventFilter` whose SelectClauses reference
|
||||
`BaseModelChangeEventType` and (optionally) `GeneralModelChangeEventType`
|
||||
Changes property
|
||||
- On notification: enqueue a debounced re-discover (don't react to every
|
||||
event during a bulk topology edit — coalesce 2-5s window)
|
||||
|
||||
**Files**:
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — add `_modelChangeSubscription`
|
||||
field; new `SubscribeModelChangesAsync` invoked at the end of
|
||||
`InitializeAsync`; debounce timer that calls `ReinitializeAsync` on the
|
||||
driver host
|
||||
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add
|
||||
`WatchModelChanges: bool` (default true) +
|
||||
`ModelChangeDebounce: TimeSpan` (default 5s)
|
||||
|
||||
**Tests**: synthetic event injection on the mock Session's notification
|
||||
stream; assert one debounced re-import call regardless of N events
|
||||
arriving in the window.
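
A trailing-edge debouncer of the kind this test exercises could look like the sketch below. It is illustrative only: `Debouncer` is a hypothetical helper, and because every signal restarts the window, a continuous stream of events would postpone the re-import indefinitely, so a production version would likely also want a maximum-delay cap.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class Debouncer : IDisposable
{
    private readonly TimeSpan _window;
    private readonly Func<Task> _action;
    private readonly object _lock = new();
    private CancellationTokenSource? _pending;

    public Debouncer(TimeSpan window, Func<Task> action)
    {
        _window = window;
        _action = action;
    }

    // Each signal cancels the previous wait and starts a new one, so the
    // action runs once, one window after the last signal of a burst.
    public void Signal()
    {
        CancellationTokenSource cts;
        lock (_lock)
        {
            _pending?.Cancel();
            _pending = cts = new CancellationTokenSource();
        }
        _ = RunAsync(cts.Token);
    }

    private async Task RunAsync(CancellationToken token)
    {
        try
        {
            await Task.Delay(_window, token).ConfigureAwait(false);
        }
        catch (OperationCanceledException)
        {
            return; // superseded by a newer signal
        }
        await _action().ConfigureAwait(false);
    }

    public void Dispose()
    {
        lock (_lock) { _pending?.Cancel(); }
    }
}
```

In the driver, the action would be the re-import call and `Signal` would be invoked from the model-change notification handler.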
|
||||
|
||||
**Risks**: re-import while a downstream client is mid-browse — needs
|
||||
serialization on `_gate` like the rest of the driver; document that
|
||||
clients see a brief gap in the address space during reload.
|
||||
|
||||
**Docs / fixture / e2e**: new "Auto re-import on ModelChangeEvent"
|
||||
section in `docs/drivers/OpcUaClient.md` documenting the debounce window,
|
||||
the `_gate` serialization, and the brief browse-gap during reload.
|
||||
Fixture: opc-plc supports runtime topology mutation via the
|
||||
`addnode`/`addtag` HTTP control endpoint — extend `OpcPlcFixture` with
|
||||
a helper that triggers a model change. Integration test asserts a
|
||||
single re-import call after a burst of synthetic model change events.
|
||||
E2E: add a "topology change" stage to
|
||||
`scripts/e2e/test-opcuaclient.ps1` that calls the opc-plc control
|
||||
endpoint, then asserts the local server reflects the new node within
|
||||
the debounce window.
|
||||
|
||||
---
|
||||
|
||||
### Phase 4 — Connectivity
|
||||
|
||||
#### PR-11: Reverse Connect (gap #1)
|
||||
|
||||
**Goal**: support server-initiated client connect for OT-DMZ outbound-only
|
||||
firewalls. The upstream server connects *to* us on a TCP listener; we
|
||||
respond as the client. Hard requirement for many regulated plant networks.
|
||||
|
||||
**SDK API**:
|
||||
- `Opc.Ua.Client.ReverseConnectManager` — manages a TCP listener on the
|
||||
configured port and dispatches incoming reverse-connect requests
|
||||
- `ReverseConnectManager.AddEndpoint(Uri reverseEndpoint)` — listener URI
|
||||
e.g. `opc.tcp://0.0.0.0:4844`
|
||||
- `ReverseConnectManager.WaitForConnection(endpointUrl, serverUri, ct)`:
blocks until the configured server initiates a reverse connect
|
||||
- `Session.Create(appConfig, reverseConnection, endpoint, ...)` —
|
||||
alternative session-create overload accepting the
|
||||
`ITransportWaitingConnection` returned by the manager
|
||||
|
||||
**Files**:
|
||||
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add
|
||||
`ReverseConnect: { Enabled, ListenerUrl, ExpectedServerUri }` section
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — when reverse-connect is
|
||||
enabled, replace the failover sweep with `WaitForConnection` and fall
|
||||
through into the same session-create path
|
||||
- New helper `ReverseConnectListener` — owns the manager lifecycle, one
|
||||
listener per driver-host process (singleton across instances if multiple
|
||||
reverse-connect drivers are configured)
|
||||
|
||||
**Tests**: spin up a `ReverseConnectClient` test against an opc-plc
|
||||
container started with `--rc opc.tcp://host:4844` to verify end-to-end.
|
||||
Unit tests mock `ITransportWaitingConnection`.
|
||||
|
||||
**Risks**: the highest-risk item in the plan. Reverse Connect changes the
|
||||
listen-vs-dial direction; if multiple OpcUaClient driver instances both
|
||||
listen on the same port the manager must multiplex. opc-plc supports
|
||||
reverse connect (`--rc` flag) so the integration test pattern from
|
||||
`docs/drivers/OpcUaClient-Test-Fixture.md` extends cleanly.
|
||||
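The multiplexing concern above can be illustrated with a small dispatcher that routes each incoming reverse connection to the driver instance expecting that server URI. This is a hedged Python sketch of the shared-listener idea only; `SharedReverseListener`, its methods, and the example URIs are hypothetical names, and the real work would be done by the SDK's `ReverseConnectManager`.

```python
class SharedReverseListener:
    """One listener per process; routes incoming connections by server URI."""

    def __init__(self):
        # expected server URI -> callback owned by one driver instance
        self._waiters = {}

    def register(self, expected_server_uri, on_connection):
        if expected_server_uri in self._waiters:
            raise ValueError(f"duplicate registration for {expected_server_uri}")
        self._waiters[expected_server_uri] = on_connection

    def dispatch(self, server_uri, connection):
        # Returns True when a driver instance claimed the connection.
        handler = self._waiters.get(server_uri)
        if handler is None:
            return False  # unknown server: the caller should close the socket
        handler(connection)
        return True


listener = SharedReverseListener()
seen = []
listener.register("urn:plant:line1", lambda c: seen.append(("line1", c)))
listener.register("urn:plant:line2", lambda c: seen.append(("line2", c)))
accepted = listener.dispatch("urn:plant:line2", "conn-A")
rejected = listener.dispatch("urn:plant:unknown", "conn-B")
```

Rejecting a duplicate registration up front turns a silent port-sharing conflict into a configuration error at startup.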
|
||||
**Docs / fixture / e2e**: new "Reverse Connect" section in
|
||||
`docs/drivers/OpcUaClient.md` (create if missing) documenting the
|
||||
listener URL config, the OT-DMZ outbound-only use case, and the shared-
|
||||
listener singleton model; update `docs/drivers/OpcUaClient-Test-Fixture.md`
|
||||
with the new "Reverse Connect coverage" row. Fixture: extend
|
||||
`Docker/docker-compose.yml` with an `opc-plc-rc` service variant that
|
||||
adds `--rc opc.tcp://host.docker.internal:4844`; `OpcPlcFixture` gains
|
||||
a `[CollectionDefinition]` that wires up the reverse-connect listener
|
||||
on the test side. Integration test asserts a session opens via the
|
||||
reverse path. E2E: add a `-ReverseConnect` switch to
|
||||
`scripts/e2e/test-opcuaclient.ps1` that flips the driver to listener
|
||||
mode and verifies the bridge stage still passes.
|
||||
|
||||
---
|
||||
|
||||
### Phase 5 — Historical & redundancy
|
||||
|
||||
#### PR-12: `IHistoryProvider.ReadEventsAsync` interface fix + driver impl (gap #12)
|
||||
|
||||
**Goal**: extend `IHistoryProvider.ReadEventsAsync` to carry an
|
||||
`EventFilter SelectClauses` parameter so HistoryRead Events can return
|
||||
the right field projection, and implement the OPC UA Client passthrough.
|
||||
|
||||
**This is a cross-driver concern.** `IHistoryProvider` lives in
|
||||
`Core.Abstractions` and every driver that opts into history (Galaxy,
|
||||
OpcUaClient, plus any future historian-backed Tier-A driver) inherits the
|
||||
default. Changing the signature is source-breaking — coordinate as one PR
|
||||
that:
|
||||
1. Adds the `IReadOnlyList<EventFieldProjection>` (or equivalent
|
||||
abstract `EventFilterSpec`) parameter
|
||||
2. Updates Galaxy's existing override (currently the only override) to
|
||||
honour the projection (best-effort — the Galaxy A&E log has a fixed
|
||||
field set so most projections degrade to the default columns)
|
||||
3. Lands the OpcUaClient passthrough using `Session.HistoryReadAsync` with
|
||||
`ReadEventDetails`
|
||||
|
||||
**SDK API**:
|
||||
- `ReadEventDetails { StartTime, EndTime, NumValuesPerNode, Filter }`
|
||||
- `Session.HistoryReadAsync` is already the call we use for Raw — pass
|
||||
`new ExtensionObject(new ReadEventDetails { ... })` for events
|
||||
- `HistoryEvent.Events: HistoryEventFieldList[]` — unwrap into
|
||||
`HistoricalEvent` records
|
||||
|
||||
**Files**:
|
||||
- `src/.../Core.Abstractions/IHistoryProvider.cs` — interface change
|
||||
- `src/.../Driver.Galaxy.../*HistoryProvider*.cs` — adjust signature
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — implement
|
||||
`ReadEventsAsync`; reuse `ExecuteHistoryReadAsync` shape
|
||||
- Server-side history facade — propagate the new parameter
|
||||
|
||||
**Tests**: integration test against opc-plc with
|
||||
`--alm` (alarm sim already enabled per the fixture doc) — verify the
|
||||
SelectClause projection comes back correctly.
|
||||
|
||||
**Risks**: the cross-driver interface change is the riskiest single
|
||||
ergonomic call in this plan. If we can't fit the new parameter without
|
||||
breaking every driver's `IHistoryProvider` impl, fall back to a sibling
|
||||
`IEventHistoryProvider` interface and only the OPC UA Client + Galaxy
|
||||
implement it. **Decide this in the PR review.**
|
||||
|
||||
**Docs / fixture / e2e**: new "HistoryRead Events" section in
|
||||
`docs/drivers/OpcUaClient.md` documenting the `EventFilter`-aware
|
||||
passthrough; update `docs/Client.CLI.md` `historyread` page to cover
|
||||
event-mode reads. **Cross-driver doc updates** (this PR adds an
|
||||
"`IHistoryProvider.ReadEventsAsync` signature change — see
|
||||
`docs/plans/opcuaclient-plan.md` PR-12" note to every other driver
|
||||
plan that has a history surface): `docs/plans/abcip-plan.md`,
|
||||
`docs/plans/ablegacy-plan.md`, `docs/plans/focas-plan.md`,
|
||||
`docs/plans/s7-plan.md`, `docs/plans/twincat-plan.md`, the Galaxy plan
|
||||
family (`docs/plans/galaxy-*.md` if/when present, and the LMX equivalent
|
||||
if it lands), and any Modbus plan. Galaxy is the only existing
|
||||
implementor and gets a real signature update in this PR; the others
|
||||
get a heads-up note so future work tracks the new shape. Fixture: opc-
|
||||
plc runs with `--alm` already (per existing fixture doc) — no compose
|
||||
change. Integration test issues a HistoryRead Events with a non-default
|
||||
SelectClause and asserts the projected fields. E2E: extend
|
||||
`scripts/e2e/test-opcuaclient.ps1` with a "history events" stage
|
||||
gated on the `--alm` simulator producing at least one event.
|
||||
|
||||
---
|
||||
|
||||
#### PR-13: Full Aggregate function set (gap #13)
|
||||
|
||||
**Goal**: extend `HistoryAggregateType` from the 5 enum values today
|
||||
(Average/Minimum/Maximum/Total/Count) to the OPC UA Part 13 standard
|
||||
catalog of 30+ aggregates that historian-class clients expect.
|
||||
|
||||
**SDK API**: `ObjectIds.AggregateFunction_*` constants — one per
|
||||
aggregate. The SDK already exposes them; this is pure mapping work.
|
||||
|
||||
Aggregates to add (Part 13 §5):
|
||||
- `TimeAverage`, `TimeAverage2`
|
||||
- `Interpolative`
|
||||
- `MinimumActualTime`, `MaximumActualTime`, `Range`, `Range2`
|
||||
- `AnnotationCount`, `DurationGood`, `DurationBad`,
|
||||
`PercentGood`, `PercentBad`
|
||||
- `WorstQuality`, `WorstQuality2`
|
||||
- `StandardDeviationSample`, `StandardDeviationPopulation`,
|
||||
`VarianceSample`, `VariancePopulation`
|
||||
- `NumberOfTransitions`
|
||||
- `Start`, `End`, `Delta`, `StartBound`, `EndBound`
|
||||
- `DurationInStateZero`, `DurationInStateNonZero`
|
||||
|
||||
**Files**:
|
||||
- `src/.../Core.Abstractions/IHistoryProvider.cs` — extend
|
||||
`HistoryAggregateType` enum (additive — existing values keep their
|
||||
ordinal)
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` —
|
||||
`MapAggregateToNodeId` switch grows; the default arm rejects out-of-range values
|
||||
|
||||
**Tests**: parametrized unit test sweeping every enum value — assert
|
||||
each maps to a non-null `NodeId` in the SDK's well-known set.
|
||||
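The "sweep every enum value" test shape might look like the sketch below. It is written in Python for brevity and covers only a subset of the aggregates; the `KNOWN_SDK_CONSTANTS` set is a stand-in for the SDK's `ObjectIds.AggregateFunction_*` constants, and `map_aggregate` is a hypothetical analogue of `MapAggregateToNodeId`.

```python
from enum import Enum


class HistoryAggregateType(Enum):
    AVERAGE = "Average"
    MINIMUM = "Minimum"
    MAXIMUM = "Maximum"
    TOTAL = "Total"
    COUNT = "Count"
    TIME_AVERAGE = "TimeAverage"
    INTERPOLATIVE = "Interpolative"


# Stand-in for the SDK's well-known aggregate identifiers (assumed subset).
KNOWN_SDK_CONSTANTS = frozenset({
    "AggregateFunction_Average",
    "AggregateFunction_Minimum",
    "AggregateFunction_Maximum",
    "AggregateFunction_Total",
    "AggregateFunction_Count",
    "AggregateFunction_TimeAverage",
    "AggregateFunction_Interpolative",
})


def map_aggregate(aggregate):
    """Map an enum member to its SDK constant name, rejecting unknown values."""
    name = f"AggregateFunction_{aggregate.value}"
    if name not in KNOWN_SDK_CONSTANTS:
        raise ValueError(f"unsupported aggregate: {aggregate}")
    return name


# The sweep: every enum member must map to a known constant.
mapped = {agg.name: map_aggregate(agg) for agg in HistoryAggregateType}
```

Because the sweep iterates the enum itself, adding a member without a matching mapping fails the test instead of slipping through.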
|
||||
**Risks**: low — this is mapping work. Drivers without a real historian
|
||||
(everything except Galaxy + OpcUaClient) keep throwing `NotSupported`.
|
||||
|
||||
**Docs / fixture / e2e**: extend the "HistoryRead aggregates" section in
|
||||
`docs/drivers/OpcUaClient.md` with the full Part 13 catalog and which
|
||||
aggregates require server-side support; update
|
||||
`docs/Client.CLI.md` `historyread` page enumerating the new
|
||||
`--aggregate` values. Fixture: opc-plc historian support is limited —
|
||||
flag in `docs/drivers/OpcUaClient-Test-Fixture.md` that the new
|
||||
aggregates are unit-tested via the SDK's well-known NodeId set, not
|
||||
exercised wire-side. Integration test sweeps every enum value and
|
||||
asserts the mapping; gated-skip for aggregates the live opc-plc image
|
||||
doesn't honour. No e2e change.
|
||||
|
||||
---
|
||||
|
||||
#### PR-14: `ServerUriArray` redundant failover (gap #14)
|
||||
|
||||
**Goal**: read upstream `Server.ServerArray` /
|
||||
`ServerRedundancy.ServerUriArray` and `ServerRedundancyType.RedundancySupport` at
|
||||
session activation; when the upstream server advertises non-`None`
|
||||
redundancy, fail over mid-session on `ServiceLevel` drop without losing
|
||||
client subscriptions. Today our `EndpointUrls` is a one-shot connect-
|
||||
attempt list, not a live redundancy group.
|
||||
|
||||
**SDK API**:
|
||||
- `Session.ReadValueAsync(VariableIds.Server_ServerArray, ct)`
|
||||
→ URI list
|
||||
- `Session.ReadValueAsync(VariableIds.Server_ServiceLevel, ct)` polled or
|
||||
subscribed via MonitoredItem
|
||||
- Subscribe `Server_ServiceLevel` on the existing alarm subscription so
|
||||
drops propagate via the publish channel
|
||||
- On low-`ServiceLevel`: open a parallel session against the next URI in
|
||||
`ServerArray`, `Session.TransferSubscriptionsAsync(otherSession, ...)`
|
||||
the live subscriptions, swap `Session` reference
|
||||
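The target-selection step of the failover can be sketched as a pure function. This Python illustration assumes a wrap-around walk through the server list and reuses the default threshold of 200 from the options above; `pick_failover_target` and the example URIs are hypothetical, and session creation and subscription transfer are out of scope here.

```python
def pick_failover_target(server_uris, active_uri, service_level, threshold=200):
    """Return the URI to fail over to, or None to stay on the active server."""
    if service_level >= threshold or len(server_uris) < 2:
        return None  # healthy enough, or nowhere else to go
    if active_uri not in server_uris:
        return server_uris[0]  # active server is not in the advertised list
    next_index = (server_uris.index(active_uri) + 1) % len(server_uris)
    return server_uris[next_index]


servers = ["opc.tcp://primary:4840", "opc.tcp://secondary:4840"]
stay = pick_failover_target(servers, servers[0], service_level=255)
switch = pick_failover_target(servers, servers[0], service_level=150)
wrap = pick_failover_target(servers, servers[1], service_level=0)
```

Keeping the decision separate from the session plumbing makes the threshold logic unit-testable without a live server.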
|
||||
**Files**:
|
||||
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — new
|
||||
`MonitorServerRedundancyAsync` method; integrate with the existing
|
||||
`OnKeepAlive` / `SessionReconnectHandler` machinery so reconnect and
|
||||
redundancy-failover share the subscription-transfer code path
|
||||
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add
|
||||
`Redundancy: { Enabled, ServiceLevelThreshold (default 200) }`
|
||||
|
||||
**Tests**: with two opc-plc containers behind the driver,
|
||||
artificially drop ServiceLevel on the active one and assert the
|
||||
secondary takes over; assert subscription handles stay valid.
|
||||
|
||||
**Risks**: redundancy is the second-riskiest item after Reverse Connect.
|
||||
The SDK's `TransferSubscriptions` has known edge cases when the
|
||||
secondary's `SecureChannel` rejects the source-channel's authentication
|
||||
token; document that the secondary must trust the same client cert as the
|
||||
primary.
|
||||
|
||||
**Docs / fixture / e2e**: new "Upstream redundancy (`ServerArray`)"
|
||||
section in `docs/drivers/OpcUaClient.md` with the ServiceLevel
|
||||
threshold, the shared-cert prerequisite for `TransferSubscriptions`,
|
||||
and the ops runbook for forcing a failover; cross-link from
|
||||
`docs/Redundancy.md` (which today covers OUR server's redundancy —
|
||||
add a "vs upstream-side redundancy" note). Fixture: extend
|
||||
`Docker/docker-compose.yml` with a second `opc-plc-secondary` service
|
||||
on a different port; `OpcPlcFixture` gains a multi-endpoint variant.
|
||||
Integration test drops the active server's ServiceLevel and asserts
|
||||
the secondary takes over with subscription handles intact. E2E: add a
|
||||
`-PrimaryUrl` / `-SecondaryUrl` pair to
|
||||
`scripts/e2e/test-opcuaclient.ps1` (and matching keys to
|
||||
`scripts/e2e/e2e-config.sample.json`) that scripts a primary stop +
|
||||
asserts the bridge stage continues to pass.
|
||||
|
||||
---
|
||||
|
||||
## Documentation, fixture, and e2e impact
|
||||
|
||||
Consolidated index of every doc page, fixture asset, and e2e script touched
|
||||
by the plan above. Authoritative for review — if a PR's `Docs / fixture /
|
||||
e2e` line references a path not listed here, that's a checklist gap.
|
||||
|
||||
### Driver user docs
|
||||
|
||||
- `docs/drivers/OpcUaClient.md` — **create on first PR that needs it
|
||||
(PR-1)** if not present, then extend with one section per PR-1 through
|
||||
PR-14 covering: subscription tuning, per-tag deadband, OperationLimits
|
||||
handling, diagnostics counters, CRL/SHA1, FindServers, curation,
|
||||
type mirroring, methods, ModelChangeEvent, Reverse Connect, history
|
||||
events, aggregates, upstream redundancy.
|
||||
- `docs/drivers/OpcUaClient-Test-Fixture.md` — coverage map updated for
|
||||
curation (PR-7), Reverse Connect (PR-11), aggregates note (PR-13),
|
||||
redundancy multi-endpoint variant (PR-14).
|
||||
- `docs/Client.CLI.md` — extended for subscribe deadband syntax (PR-2),
|
||||
any `discover` command (PR-6), `call` command (PR-9), `historyread`
|
||||
event mode (PR-12), `--aggregate` enum expansion (PR-13).
|
||||
- `docs/Client.UI.md` — extended for Subscriptions tab deadband fields
|
||||
(PR-2), Browse-tree type rendering note (PR-8), Method-call surface
|
||||
(PR-9) if it ships.
|
||||
- `docs/security.md` — cross-link from PR-5 (CRL/SHA1 knobs).
|
||||
- `docs/Redundancy.md` — cross-link from PR-14 (note distinguishing
|
||||
server-side redundancy from upstream-side redundancy).
|
||||
|
||||
### Fixture assets
|
||||
|
||||
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/Docker/docker-compose.yml`
|
||||
— add `opc-plc-rc` (PR-11) and `opc-plc-secondary` (PR-14) service
|
||||
variants; optional secured endpoint (PR-5).
|
||||
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/OpcPlcFixture.cs`
|
||||
— discovery probe at collection init (PR-6), reverse-connect listener
|
||||
(PR-11), multi-endpoint variant (PR-14), model-change helper (PR-10).
|
||||
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/OpcPlcProfile.cs`
|
||||
— flag noisy analogs for deadband (PR-2), enumerate exercised
|
||||
namespaces for curation (PR-7), record at least one custom ObjectType
|
||||
(PR-8).
|
||||
- New integration tests added per PR; all live under the existing
|
||||
`tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/`
|
||||
collection.
|
||||
- Test certs (PR-5): SHA1-signed + revoked test fixtures checked into
|
||||
the unit-test project's resources.
|
||||
|
||||
### E2E scripts
|
||||
|
||||
- `scripts/e2e/test-opcuaclient.ps1` — new stages added per PR (subscription
|
||||
tuning PR-1, deadband PR-2, diagnostics PR-4, CRL PR-5, discovery
|
||||
PR-6, curation PR-7, method call PR-9, topology change PR-10,
|
||||
reverse connect PR-11, history events PR-12, redundancy failover
|
||||
PR-14). The script is the single integration point for every
|
||||
driver-level e2e — keep the stages ordered top-down by phase.
|
||||
- `scripts/e2e/e2e-config.sample.json` — new keys: `deadband`,
|
||||
`discoveryUrl`, `includePath`, `namespaceRemap`, `methodNodeId`,
|
||||
`reverseConnect`, `primaryUrl`, `secondaryUrl`.
|
||||
- `scripts/e2e/test-all.ps1` — no structural change; the existing
|
||||
`opcuaclient` block forwards new params after wiring them through
|
||||
`e2e-config.sample.json`.
|
||||
|
||||
### Cross-driver impact (PR-12 — `IHistoryProvider.ReadEventsAsync`)
|
||||
|
||||
PR-12 changes the `IHistoryProvider.ReadEventsAsync` signature in
|
||||
`Core.Abstractions` (or introduces a sibling `IEventHistoryProvider`
|
||||
— pinned in PR-12 review per Open Question 2). That decision is
|
||||
source-breaking for every driver that opts into history. PR-12 must
|
||||
add an explicit "interface change — adopt new signature when this
|
||||
driver implements `ReadEventsAsync`" note to:
|
||||
|
||||
- `docs/plans/abcip-plan.md`
|
||||
- `docs/plans/ablegacy-plan.md`
|
||||
- `docs/plans/focas-plan.md`
|
||||
- `docs/plans/s7-plan.md`
|
||||
- `docs/plans/twincat-plan.md`
|
||||
- The Galaxy plan family — `docs/plans/galaxy-*.md` if/when those
|
||||
pages exist; Galaxy is the only current implementor and gets the
|
||||
real signature update in PR-12, not just a note.
|
||||
- The LMX plan — `docs/plans/lmx-*.md` if/when it lands (current state:
|
||||
the LMX driver's history surface is implicit through Galaxy; revisit
|
||||
during PR-12 review).
|
||||
- A Modbus plan page if/when one exists; Modbus does not implement
|
||||
history today but the heads-up note tracks the cross-driver shape.
|
||||
|
||||
The cross-driver note text should be a one-paragraph "Heads up: the
|
||||
`IHistoryProvider.ReadEventsAsync` interface gained an
|
||||
`EventFilterSpec` parameter in OpcUaClient PR-12 (`docs/plans/opcuaclient-plan.md`).
|
||||
If/when this driver implements event-history, adopt the new signature."
|
||||
This pattern keeps each driver plan stable while the cross-cutting
|
||||
breakage is owned by one PR.
|
||||
|
||||
---
|
||||
|
||||
## Skip-rated items (for context)
|
||||
|
||||
These `featuregaps.md` rows are **Build = No** and intentionally omitted from
|
||||
the plan above:
|
||||
|
||||
| # | Gap | Why we're skipping |
|
||||
| :---: | --- | --- |
|
||||
| 3 | Multicast / LDS-ME registration | Server-side responsibility, not aggregator's. |
|
||||
| 4 | GDS push management (Part 12) | Significant infra; rare for our deployment scale. |
|
||||
| 11 | HistoryUpdate / Modified / Annotation passthrough | MES backfill scope; defer. |
|
||||
| 16 | Connection / session pooling for multi-instance scale-out | Premature; current per-instance model is simple and adequate. |
|
||||
| 18 | Kerberos / OAuth2 / JWT identity | Significant security work; defer until AD integration drives it (separate workstream). |
|
||||
| 19 | Write attribute scope beyond `Value` | Niche; rarely used in OPC UA practice. |
|
||||
|
||||
If any of these get prioritized later they slot cleanly between the phases
|
||||
above — none have prerequisites among the Build = Yes items.
|
||||
|
||||
## Open questions
|
||||
|
||||
1. **`ISubscribable` overload vs new method (PR-2)**: per-tag spec
|
||||
carrier is needed for deadband; do we extend the existing
|
||||
`SubscribeAsync` overload or add `SubscribeWithSpecsAsync`? The
|
||||
former is source-breaking but cleaner; the latter is additive but
|
||||
leaves two parallel paths.
|
||||
2. **`IHistoryProvider.ReadEventsAsync` shape (PR-12)**: does the
|
||||
`EventFilterSpec` parameter live on `IHistoryProvider` (one interface,
|
||||
every driver gets it) or on a sibling `IEventHistoryProvider` (two
|
||||
interfaces, only event-history drivers implement)? Memory entry
|
||||
suggests the former; preference depends on whether non-OPC-UA drivers
|
||||
ever expect to project arbitrary event fields. **Pin this in PR-12
|
||||
review.**
|
||||
3. **`IMethodInvoker` capability (PR-9)**: does this become the 9th
|
||||
capability interface (currently 8/8) or is it folded into
|
||||
`IWritable` as a method-invoke variant? Adding a 9th interface is
|
||||
the cleaner model and matches the spec layering.
|
||||
4. **Type mirroring address-space surface (PR-8)**: does
|
||||
`IAddressSpaceBuilder` already accept type nodes? If yes, PR-8 is
|
||||
straightforward; if no, it splits into a prerequisite PR-8a that
|
||||
extends the builder, then PR-8b for the OPC UA Client wire-up. The
|
||||
answer determines whether PR-8 ships in Phase 2 or slips to a later
|
||||
phase.
|
||||
5. **Reverse Connect listener ownership (PR-11)**: one listener per
|
||||
driver instance (port collision when multiple reverse-connect
|
||||
drivers run in the same process) vs one shared listener with a
|
||||
`expectedServerUri` dispatcher. Shared is the right answer; pin
|
||||
the singleton lifetime to the driver-host.
|
||||
6. **Phase 1 ship order**: PR-1, PR-3, PR-4, PR-5 are independent and can
|
||||
land in parallel. PR-2 depends on the `ISubscribable` interface
|
||||
decision (Q1) — recommend landing PR-1 first to validate the
|
||||
`OpcUaSubscriptionDefaults` shape, then PR-2.
|
||||
@@ -0,0 +1,807 @@
|
||||
# S7 Driver — Implementation Plan
|
||||
|
||||
> Source of gap analysis: [featuregaps.md → S7](../featuregaps.md#s7-siemens-s7-3004001200--1500)
|
||||
>
|
||||
> Covers Build = Yes items only. Skip-rated rows are noted at the end for context.
|
||||
|
||||
## Summary
|
||||
|
||||
The S7 driver (`src/ZB.MOM.WW.OtOpcUa.Driver.S7/`) ships a working scaffold over
|
||||
**S7netplus 0.20**: ISO-on-TCP / S7comm, single-connection-per-PLC (`SemaphoreSlim`),
|
||||
DB / M / I / Q / T / C address parsing, atomic scalar reads/writes for Bool / Byte
|
||||
/ I16 / U16 / I32 / U32 / F32, polled `ISubscribable` overlay, `IHostConnectivityProbe`
|
||||
via `ReadStatusAsync`, and a Snap7-server-backed CI fixture on `localhost:1102`.
|
||||
|
||||
The 16 Build = Yes gaps fall into six tractable phases. **The hard one is gap #1
|
||||
(S7-1500 Optimized DB / Symbolic addressing)** — S7netplus speaks classic S7comm
|
||||
only and cannot reach optimized DBs at all. Phase 6 calls that out as an explicit
|
||||
architectural decision: ship the constraint as documentation and the rest as
|
||||
S7netplus-compatible features, *or* fork to a library that supports S7Plus
|
||||
(Sharp7-fork, Snap7 v2, custom S7Plus). Phases 1-5 do not depend on that decision
|
||||
and are landable on the current S7netplus base.
|
||||
|
||||
Every PR ships unit-test coverage and — where wire semantics matter — extends the
|
||||
Snap7-server profile in `Docker/server.py` so the integration fixture exercises
|
||||
the new path. PRs that need real S7-1500 firmware features the simulator doesn't
|
||||
mimic (PUT/GET protection, password-tier auth, SZL diagnostic buffer) call that
|
||||
out and gate the live-firmware test on the dev-box S7-1500 lab rig.
|
||||
|
||||
Architectural invariants we explicitly preserve:
|
||||
|
||||
- Single connection per PLC; `_gate` (SemaphoreSlim) serializes every PDU.
|
||||
- Strict address-parse-at-init; bad config fails fast with `FormatException`.
|
||||
- PUT/GET-disabled mapped to sticky `BadDeviceFailure`, not Polly-retried.
|
||||
- 100 ms minimum publishing interval (matches CPU mailbox scan reality).
|
||||
- `WriteIdempotent` per-tag flag is the only retry-policy lever.
|
||||
|
||||
## Phased delivery
|
||||
|
||||
| Phase | Theme | PRs | Gaps closed |
|
||||
|------:|-------|-----|-------------|
|
||||
| 1 | Data-type correctness | PR-S7-A1..A5 | #7, #8, #9, #19 |
|
||||
| 2 | Performance — multi-tag PDU packing | PR-S7-B1..B2 | #3, #22 |
|
||||
| 3 | Operability knobs | PR-S7-C1..C5 | #2, #4, #20, #21, #24 |
|
||||
| 4 | Workflow — symbol import + UDTs | PR-S7-D1..D3 | #5, #6, #10 |
|
||||
| 5 | Diagnostics & security | PR-S7-E1..E2 | #11, #14 |
|
||||
| 6 | S7-1500 Optimized DB / Symbolic | PR-S7-F (decision) | #1 |
|
||||
|
||||
Phases 1-3 run sequentially because Phase 2 packing and Phase 3 deadbands are
|
||||
both keyed off the type-decode work in Phase 1. Phase 4 (UDT/symbol import) is
|
||||
parallelizable with Phase 5; Phase 6 is gated on the library-choice decision
|
||||
in Open Questions (a).
|
||||
|
||||
---
|
||||
|
||||
## Per-PR detail
|
||||
|
||||
### Phase 1 — Data-type correctness
|
||||
|
||||
#### PR-S7-A1 — 64-bit scalar types (LInt / ULInt / LReal / LWord)
|
||||
|
||||
Closes gap #9. `Float64`/`Int64`/`UInt64` cases in `S7Driver.ReadOneAsync`/
|
||||
`WriteOneAsync` currently throw `NotSupportedException`.
|
||||
|
||||
- **Files**: `S7Driver.cs` (read + write switch), `S7DriverOptions.cs` (extend
|
||||
`S7Size` with `LWord` for 8-byte access), `S7AddressParser.cs` (accept `DBL` /
|
||||
`LD` size suffix; S7netplus encodes 8-byte access via byte-array reads, so the
|
||||
parser converts `DB1.LD0` to a byte-range read internally).
|
||||
- **Tests**: unit decode tests for the byte-pattern → `long` / `ulong` / `double`
|
||||
conversion; Snap7-server profile gets `f64` and `i64` seed types.
|
||||
- **Risks**: S7netplus's `ReadAsync(string)` does not accept `LD` natively;
|
||||
fallback path is `Plc.ReadBytes(DataType.DataBlock, db, byteOffset, 8)` then
|
||||
`BitConverter` with explicit endian flip (S7 is big-endian on the wire,
|
||||
`BitConverter` is little-endian on x86/x64).
|
||||
- **Effort**: M (3-4 days incl. tests).
|
||||
- **Deps**: none.
|
||||
- **Docs / fixture / e2e**: extends the type-mapping table in `docs/v2/s7.md`
|
||||
with `LInt` / `ULInt` / `LReal` / `LWord` rows; adds the new sizes
|
||||
(`LInt`, `ULInt`, `LReal`) to the `read` / `write` cookbook in
|
||||
`docs/Driver.S7.Cli.md`; updates `docs/drivers/S7-Test-Fixture.md`
|
||||
§"What it actually covers" to list the new 64-bit types and removes them
|
||||
from §5 "Data types beyond the scalars"; extends the snap7 seed-type set
|
||||
in `tests/ZB.MOM.WW.OtOpcUa.Driver.S7.IntegrationTests/Docker/server.py`
|
||||
with `i64`, `u64`, `f64` cases; adds seeds at known offsets
|
||||
(e.g. `DB1.DBL40` for i64, `DB1.DBL48` for f64) to
|
||||
`Docker/profiles/s7_1500.json`; adds `S7_1500Profile` constants for the
|
||||
new tags + a `Driver_reads_seeded_64bit_batch` smoke test in
|
||||
`S7_1500SmokeTests`; adds an LInt loopback assertion to
|
||||
`scripts/e2e/test-s7.ps1`.
|
||||
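The endian handling described in the Risks bullet can be illustrated in a few lines. This is a Python sketch of the decode step only (the driver itself is C# and would flip bytes before `BitConverter`); the function names are hypothetical.

```python
import struct


def _require_8_bytes(raw):
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes, got {len(raw)}")
    return raw


def decode_lint(raw):
    """LInt: signed 64-bit, big-endian on the wire."""
    return struct.unpack(">q", _require_8_bytes(raw))[0]


def decode_ulint(raw):
    """ULInt / LWord: unsigned 64-bit, big-endian on the wire."""
    return struct.unpack(">Q", _require_8_bytes(raw))[0]


def decode_lreal(raw):
    """LReal: IEEE 754 double, big-endian on the wire."""
    return struct.unpack(">d", _require_8_bytes(raw))[0]


lint_value = decode_lint(bytes.fromhex("FFFFFFFFFFFFFFFE"))
ulint_value = decode_ulint(bytes.fromhex("FFFFFFFFFFFFFFFF"))
lreal_value = decode_lreal(bytes.fromhex("3FF8000000000000"))
```

The same eight bytes decode to different values depending on the declared type, which is why the golden byte patterns in the unit tests need one vector per type.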
|
||||
#### PR-S7-A2 — STRING / WSTRING / CHAR / WCHAR
|
||||
|
||||
Closes gap #8 (string portion). S7 `STRING(n)` is `[max-len][actual-len][bytes...]`
|
||||
(2-byte header + ASCII). `WSTRING(n)` is 4-byte header + UTF-16BE bytes. `CHAR`
|
||||
is 1 byte; `WCHAR` is 2 bytes UTF-16BE.
|
||||
|
||||
- **Files**: `S7Driver.cs` (new `ReadStringAsync` / `WriteStringAsync` private
|
||||
helpers using `Plc.ReadBytes` for raw byte-range fetch), `S7DriverOptions.cs`
|
||||
(already has `StringLength`; add `S7DataType.WString`, `Char`, `WChar`).
|
||||
- **Tests**: unit tests for header parsing including the "actual-len > max-len"
|
||||
PLC bug case (clamp on read, reject on write); Snap7 `ascii` seed type already
|
||||
exists, add `wstring` seed.
|
||||
- **Risks**: write must respect the configured `StringLength` to avoid overrunning
|
||||
the DB; mismatched max-len is a common field bug.
|
||||
- **Effort**: M.
|
||||
- **Deps**: PR-S7-A1 (byte-range read helper lands there).
|
||||
- **Docs / fixture / e2e**: extends the type-mapping section in
|
||||
`docs/v2/s7.md` with `STRING(n)` / `WSTRING(n)` / `CHAR` / `WCHAR`
|
||||
layouts (2-byte vs 4-byte header, UTF-16BE encoding, the "actual-len >
|
||||
max-len" PLC bug); extends the `read` / `write` cookbook in
|
||||
`docs/Driver.S7.Cli.md` with `--type WString` / `--type Char` / `--type
|
||||
WChar` examples and the `--string-length` flag for WString; updates
|
||||
`docs/drivers/S7-Test-Fixture.md` §"What it actually covers" to list
|
||||
ascii/wstring/char/wchar; adds `wstring`, `char`, `wchar` seed types to
|
||||
`Docker/server.py` (existing `ascii` covers STRING); seeds a
|
||||
`DB1.WSTRING[256]` and a `DB1.CHAR[300]` in
|
||||
`Docker/profiles/s7_1500.json`; adds `Driver_round_trips_string_types`
|
||||
smoke test exercising read + write of every variant; adds a string
|
||||
round-trip assertion to `scripts/e2e/test-s7.ps1`.
|
||||
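The header layouts and the clamp-on-read, reject-on-write rule can be sketched as below. This is an illustrative Python version under the layouts stated above (2-byte header for `STRING`, 4-byte big-endian header for `WSTRING`); the function names are hypothetical, and the 254-character ceiling for `STRING` is the usual S7 limit rather than something this plan states.

```python
import struct


def decode_s7_string(raw):
    """STRING: [max-len][actual-len][ASCII bytes]; clamps a bogus actual-len."""
    max_len, actual_len = raw[0], raw[1]
    actual_len = min(actual_len, max_len, len(raw) - 2)
    return raw[2:2 + actual_len].decode("ascii", errors="replace")


def encode_s7_string(text, max_len):
    """Rejects text that would overrun the declared STRING(n) size."""
    data = text.encode("ascii")
    if not 0 <= max_len <= 254:
        raise ValueError("STRING max-len must be 0..254")
    if len(data) > max_len:
        raise ValueError(f"text is {len(data)} bytes, max-len is {max_len}")
    return bytes([max_len, len(data)]) + data.ljust(max_len, b"\x00")


def decode_s7_wstring(raw):
    """WSTRING: two big-endian u16 lengths (in characters), then UTF-16BE."""
    max_len, actual_len = struct.unpack(">HH", raw[:4])
    actual_len = min(actual_len, max_len, (len(raw) - 4) // 2)
    return raw[4:4 + 2 * actual_len].decode("utf-16-be")


plain = decode_s7_string(bytes([10, 3]) + b"abc" + bytes(7))
clamped = decode_s7_string(bytes([4, 9]) + b"abcd")  # actual-len > max-len
wide = decode_s7_wstring(struct.pack(">HH", 8, 2) + "hé".encode("utf-16-be") + bytes(12))
encoded = encode_s7_string("OK", 4)
```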
|
||||
#### PR-S7-A3 — DTL / DATE_AND_TIME / S5TIME / TIME / TOD / DATE
|
||||
|
||||
Closes gap #8 (date/time portion).
|
||||
|
||||
- DTL is 12 bytes: year(u16) / month / day / weekday / hour / minute / second / nanos(u32).
|
||||
- DATE_AND_TIME (DT) is 8 bytes BCD: yy mm dd hh mm ss msH msL+dow.
|
||||
- S5TIME is 16-bit BCD with a 2-bit time-base.
|
||||
- TIME is `Int32` ms since 1972 (S7-300/400) or signed-ms duration (S7-1200/1500).
|
||||
- TOD is `UInt32` ms since midnight; DATE is `UInt16` days since 1990-01-01.
|
||||
|
||||
- **Files**: `S7Driver.cs` + new `S7DateTimeCodec.cs` static class encapsulating
|
||||
every encode/decode (keep the driver lean; codec is unit-testable in isolation).
|
||||
- **Tests**: round-trip tests per type with golden byte vectors taken from the
|
||||
Siemens "STEP 7 V18 — Programming Reference" document. Snap7-server seed
|
||||
profile gains `dtl`, `dt`, `s5time`, `time` types.
|
||||
- **Risks**: BCD parsing must reject invalid month/day combinations; PLC programs
|
||||
occasionally write 0x00 0x00 ... when uninitialized — surface as `BadOutOfRange`
|
||||
rather than parsing to year 0.
|
||||
- **Effort**: L (4-5 days incl. all six types and the golden-vector suite).
|
||||
- **Deps**: PR-S7-A1.
|
||||
- **Docs / fixture / e2e**: extends `docs/v2/s7.md` with a new "Date / time
|
||||
types" subsection documenting DTL / DT (BCD) / S5TIME / TIME / TOD /
|
||||
DATE byte layouts and the S7-300/400 vs S7-1200/1500 TIME-encoding
|
||||
split; adds `--type Dtl` / `--type DateAndTime` / `--type S5Time` /
|
||||
`--type Time` / `--type TimeOfDay` / `--type Date` to the
|
||||
`docs/Driver.S7.Cli.md` cookbook; updates
|
||||
`docs/drivers/S7-Test-Fixture.md` §"What it actually covers" with the
|
||||
new datetime types and removes "DTL / DATE_AND_TIME" from §5 "Data
|
||||
types beyond the scalars"; adds `dtl`, `dt`, `s5time`, `time`, `tod`,
|
||||
`date` seed types to `Docker/server.py` with golden-byte vectors
|
||||
documented in comments; seeds `DB1.DTL[260]`, `DB1.DT[272]`,
|
||||
`DB1.S5TIME[280]`, `DB1.TIME[284]`, `DB1.TOD[288]`, `DB1.DATE[292]` in
|
||||
`Docker/profiles/s7_1500.json`; adds
|
||||
`S7DateTimeCodecTests` (unit) + `Driver_round_trips_datetime_types`
|
||||
smoke test; no `scripts/e2e/test-s7.ps1` change required (CLI cookbook
|
||||
examples cover the manual surface).
|
||||
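A few of the layouts listed above can be decoded as in the following sketch. It is written in Python to show the byte handling only; the function names are hypothetical stand-ins for what `S7DateTimeCodec` would expose, the S5TIME time-base table (10 ms / 100 ms / 1 s / 10 s) is the standard Siemens encoding rather than something stated in this plan, and invalid input is signalled with `ValueError` where the driver would map to `BadOutOfRange`.

```python
import struct
from datetime import date, datetime, timedelta


def decode_dtl(raw):
    """DTL: year(u16) month day weekday hour minute second nanos(u32), big-endian."""
    year, month, day, _weekday, hour, minute, second, nanos = struct.unpack(
        ">HBBBBBBI", raw
    )
    # datetime() raises ValueError for year 0 / month 0, i.e. an all-zero DTL.
    return datetime(year, month, day, hour, minute, second, nanos // 1000)


def decode_date(raw):
    """DATE: u16 days since 1990-01-01."""
    (days,) = struct.unpack(">H", raw)
    return date(1990, 1, 1) + timedelta(days=days)


def decode_tod(raw):
    """TOD: u32 milliseconds since midnight."""
    (ms,) = struct.unpack(">I", raw)
    if ms >= 86_400_000:
        raise ValueError("time of day out of range")
    return timedelta(milliseconds=ms)


def decode_s5time(raw):
    """S5TIME: bits 12-13 select the time base, low 12 bits hold three BCD digits."""
    (word,) = struct.unpack(">H", raw)
    base_ms = (10, 100, 1000, 10000)[(word >> 12) & 0x3]
    digits = [(word >> shift) & 0xF for shift in (8, 4, 0)]
    if any(d > 9 for d in digits):
        raise ValueError("invalid BCD digit")
    return timedelta(milliseconds=(digits[0] * 100 + digits[1] * 10 + digits[2]) * base_ms)


dtl_bytes = struct.pack(">HBBBBBBI", 2024, 3, 15, 6, 12, 34, 56, 500_000_000)
dtl_value = decode_dtl(dtl_bytes)
date_value = decode_date(b"\x00\x01")
tod_value = decode_tod(struct.pack(">I", 3_600_000))
s5_value = decode_s5time(b"\x02\x50")
```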
|
||||
#### PR-S7-A4 — Array tags (ValueRank=1)
|
||||
|
||||
Closes gap #7. `S7TagDefinition` currently has no array dimension; `MapDataType`
|
||||
hard-codes `IsArray: false`.
|
||||
|
||||
- **Files**: `S7DriverOptions.cs` (extend `S7TagDefinition` with `ArrayDim` int?
|
||||
and `ElementCount` int?), `S7Driver.cs` (read path: detect array tag, issue
|
||||
one byte-range read covering N elements, slice client-side; write path: same
|
||||
in reverse), `DiscoverAsync` reports `IsArray: true, ArrayDim: [N]`.
|
||||
- **Tests**: unit tests for `Array[0..9] of Int` and `Array[0..9] of Real`;
|
||||
Snap7-server profile adds an array seed type. Round-trip array-write test
|
||||
proves slice ordering.
|
||||
- **Risks**: S7-1500 supports multi-dim arrays; declare ValueRank=1 only and
|
||||
document multi-dim as a follow-up. Array-of-UDT lands with PR-S7-D2.
|
||||
- **Effort**: M.
|
||||
- **Deps**: PR-S7-A1 (byte-range reads).
|
||||
- **Docs / fixture / e2e**: adds an "Array tags (ValueRank=1)" subsection
|
||||
to `docs/v2/s7.md` documenting `Array[0..N]` syntax + the multi-dim
|
||||
follow-up note; extends `docs/Driver.S7.Cli.md` with an
|
||||
`--array-count N` flag in the `read` / `write` cookbook and worked
|
||||
examples for `Array[0..9] of Int` and `Array[0..9] of Real`; updates
|
||||
`docs/drivers/S7-Test-Fixture.md` §"What it actually covers" to list
|
||||
array round-trips and removes "arrays of structs" from §5 (struct
|
||||
arrays land in PR-S7-D2); extends `Docker/server.py` with an `array`
|
||||
meta-seed-type that takes an inner-type + count and lays out N elements
|
||||
contiguously; seeds `DB1.ArrayInt[300]` (10×Int) and
|
||||
`DB1.ArrayReal[320]` (10×Real) in `Docker/profiles/s7_1500.json`;
|
||||
adds `Driver_round_trips_array_int10` + `Driver_round_trips_array_real10`
|
||||
smoke tests proving slice ordering; adds an array round-trip assertion
|
||||
to `scripts/e2e/test-s7.ps1`.
|
||||
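The "one byte-range read, slice client-side" approach can be shown with a short sketch. This is an illustrative Python version covering only `Int` and `Real` elements; `slice_array` and `pack_array` are hypothetical helper names, and the driver's actual C# decode path may be organised differently.

```python
import struct

# Big-endian element encodings for the two element types used in the tests.
ELEMENT_FORMATS = {"Int": ">h", "Real": ">f"}


def slice_array(raw, element_type, count):
    """Split one contiguous byte range into `count` decoded elements."""
    fmt = ELEMENT_FORMATS[element_type]
    size = struct.calcsize(fmt)
    if len(raw) != size * count:
        raise ValueError(f"expected {size * count} bytes, got {len(raw)}")
    return [struct.unpack_from(fmt, raw, i * size)[0] for i in range(count)]


def pack_array(values, element_type):
    """Inverse of slice_array: lay elements out contiguously in index order."""
    fmt = ELEMENT_FORMATS[element_type]
    return b"".join(struct.pack(fmt, v) for v in values)


ints = list(range(10))
int_bytes = pack_array(ints, "Int")
int_round_trip = slice_array(int_bytes, "Int", 10)
real_round_trip = slice_array(pack_array([0.5, 1.5, -2.0], "Real"), "Real", 3)
```

Checking the byte count before slicing catches a mismatch between the configured element count and the range actually read.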
|
||||
#### PR-S7-A5 — LOGO! 8 + S7-200 V-memory area
|
||||
|
||||
Closes gap #19. `S7AddressParser` currently rejects the `V` area letter.
|
||||
|
||||
- **Files**: `S7AddressParser.cs` (add `V` case → maps to `S7Area.DataBlock` with
|
||||
`DbNumber=1` for S7-200 / DbNumber per LOGO! VM-mapping table; document the
|
||||
conversion), `S7DriverOptions.cs` (note CpuType-dependent meaning of V).
|
||||
- **Tests**: unit tests for `VW0` / `VD4` / `V0.0` parsing, both S7-200 and
|
||||
LOGO! conventions; document caller responsibility to set `CpuType.S7200` or
|
||||
`S7200Smart`.
|
||||
- **Risks**: LOGO! VM base address differs by firmware (V0=0 vs V0=1024 depending
|
||||
on block); document the offset table rather than auto-detecting.
|
||||
- **Effort**: S (1-2 days, mostly parser + tests; no wire changes).
|
||||
- **Deps**: none.
|
||||
- **Docs / fixture / e2e**: adds a "LOGO! 8 / S7-200 V-memory" subsection
|
||||
to `docs/v2/s7.md` covering the `V` area letter, the `S7200` /
|
||||
`S7200Smart` CpuType pre-requisite, the LOGO! VM-mapping table by
|
||||
firmware band, and the "V0 = DB1.DBX0.0" semantic; extends the address
|
||||
grammar cheat sheet in `docs/Driver.S7.Cli.md` with `VW0` / `VD4` /
|
||||
`V0.0` rows and a `-c S7200Smart` worked example; updates
|
||||
`docs/drivers/S7-Test-Fixture.md` §"What it does NOT cover" item 4 to
|
||||
note S7-200 / LOGO! parser coverage now exists at unit level; adds
|
||||
unit-only `S7AddressParserTests` cases — no Snap7 fixture change
|
||||
(server.py already exposes DB1, which is where V-memory aliases land);
|
||||
no `scripts/e2e/test-s7.ps1` change required (live-LOGO! testing is
|
||||
documented as field-only).
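
A minimal sketch of the V-area mapping, assuming the S7-200 convention stated above (V-memory aliases DB1, so `V0.0` is `DB1.DBX0.0`). It is written in Python for brevity; the real change lives in the C# `S7AddressParser`. `VAddress` and `parse_v_address` are hypothetical names, and the LOGO! firmware-dependent base offset is deliberately not modelled.

```python
import re
from typing import NamedTuple, Optional


class VAddress(NamedTuple):
    db_number: int
    byte_offset: int
    bit: Optional[int]   # None for byte / word / dword access
    size_bits: int


_V_PATTERN = re.compile(
    r"^V(?P<width>[BWD])?(?P<byte>\d+)(?:\.(?P<bit>[0-7]))?$", re.IGNORECASE
)
_WIDTH_BITS = {"B": 8, "W": 16, "D": 32}


def parse_v_address(text: str, db_number: int = 1) -> VAddress:
    """Map a V-area address such as VW0, VD4 or V0.0 onto a data-block address."""
    match = _V_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a V-area address: {text!r}")
    width, bit = match["width"], match["bit"]
    # Bit access needs a bit index and no width letter; B/W/D access is the reverse.
    if (width is None) == (bit is None):
        raise ValueError(f"ambiguous V-area address: {text!r}")
    if bit is not None:
        return VAddress(db_number, int(match["byte"]), int(bit), 1)
    return VAddress(db_number, int(match["byte"]), None, _WIDTH_BITS[width.upper()])
```

For LOGO! the caller would pass the `db_number` (and apply any base offset) from the documented VM-mapping table rather than relying on the default.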
|
||||
|
||||
### Phase 2 — Performance (multi-tag PDU packing + block coalescing)
|
||||
|
||||
#### PR-S7-B1 — Multi-variable PDU packing
|
||||
|
||||
Closes gap #3. `ReadAsync(IReadOnlyList<string>)` currently issues one
|
||||
`plc.ReadAsync` per tag inside the semaphore — N PDUs for N tags.
|
||||
|
||||
- **Files**: `S7Driver.cs` (replace per-tag loop with a packer that builds a
|
||||
list of `S7.Net.Types.DataItem`, calls `plc.ReadMultipleVarsAsync`, then
|
||||
fans the results back to the per-tag decoder). Keep the existing per-tag
|
||||
decode switch — only the wire fetch becomes batched.
|
||||
- **Tests**: integration test that subscribes to 100 tags and asserts the
|
||||
packet count seen by the Snap7 server is ⌈100 / packing-budget⌉ (1 only if every item fits one PDU) rather
|
||||
than 100. Unit-level test covers packer chunking when the negotiated PDU
|
||||
size won't fit all items.
|
||||
- **Risks**: `ReadMultipleVarsAsync` errors are per-item; we must surface
|
||||
per-tag StatusCodes correctly rather than failing the whole batch on one
|
||||
bad tag. Packing budget = `negotiatedPduSize - 18 (header) - per_item(12)`,
|
||||
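conservatively cap at 19 items per PDU on a 240-byte PDU. Note that the formula as written yields ⌊(240 − 18) / 12⌋ = 18 items; 19 fit only with a 12-byte fixed overhead, so confirm the header size before hard-coding the cap.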
|
||||
- **Effort**: L (5-6 days incl. the per-item-error fan-out semantics).
|
||||
- **Deps**: Phase 1 PRs do not block this — but conflicts in `S7Driver.cs`
|
||||
are likely, so land Phase 1 first.
|
||||
- **Docs / fixture / e2e**: adds a "Performance — multi-variable PDU
|
||||
packing" subsection to `docs/v2/s7.md` describing
|
||||
`ReadMultipleVarsAsync`, the negotiated-PDU packing budget formula
|
||||
(`pdu - 18 - 12·N`), the 19-items-per-240-byte-PDU rule of thumb, and
|
||||
the per-item-error semantics; no `docs/Driver.S7.Cli.md` change (CLI
|
||||
is single-tag); no Snap7-server seed change required (existing seeds
|
||||
cover the wire path); adds
|
||||
`S7MultiVarPduPackingTests` to the unit suite (planner chunking when
|
||||
items don't fit) + a 100-tag perf integration test
|
||||
`Driver_packs_100_tags_into_minimum_pdus` that asserts request-count
|
||||
reduction; no `scripts/e2e/test-s7.ps1` change required.
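
The packing budget and planner chunking can be sketched as follows. This is an illustration in Python rather than the driver's C#, and `items_per_pdu` / `chunk_for_pdu` are hypothetical names. It uses the constants quoted in this plan (18-byte header, 12 bytes per item); with those values the formula gives 18 items on a 240-byte PDU. It only budgets the request side, so a real packer would also have to check that the combined response payload fits.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

HEADER_BYTES = 18     # fixed overhead quoted in the plan (assumption, not measured)
PER_ITEM_BYTES = 12   # request bytes per item quoted in the plan


def items_per_pdu(negotiated_pdu_size: int) -> int:
    """Number of read items whose request descriptors fit in one PDU."""
    budget = (negotiated_pdu_size - HEADER_BYTES) // PER_ITEM_BYTES
    if budget < 1:
        raise ValueError("negotiated PDU too small for a single item")
    return budget


def chunk_for_pdu(items: Sequence[T], negotiated_pdu_size: int) -> List[Sequence[T]]:
    """Split the tag list into consecutive chunks that each fit one PDU."""
    size = items_per_pdu(negotiated_pdu_size)
    return [items[i:i + size] for i in range(0, len(items), size)]


tags = [f"DB1.DBW{2 * i}" for i in range(100)]
chunks = chunk_for_pdu(tags, 240)   # 5 full chunks of 18 plus one chunk of 10
```

Keeping the budget in one function means the unit test for chunking and the 100-tag integration assertion can share the same expected request count.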
|
||||
|
||||
#### PR-S7-B2 — Block-read coalescing for contiguous DBs
|
||||
|
||||
Closes gap #22. Reading `DB1.DBW0`, `DB1.DBW2`, `DB1.DBW4` should issue one
|
||||
6-byte byte-range read against DB1 starting at offset 0, sliced client-side.
|
||||
|
||||
- **Files**: `S7Driver.cs` adds a planner pass: group same-DB tags by
|
||||
contiguous byte ranges (gap-merge threshold = configurable, default 16
|
||||
bytes; over-fetching 16 bytes is cheaper than one extra PDU). Merged ranges
|
||||
become a single `Plc.ReadBytes` call; the result is sliced per-tag.
|
||||
- **Tests**: unit tests for the merge planner (input list → expected ranges);
|
||||
integration test with 50 contiguous DB words proves wire-level reduction.
|
||||
- **Risks**: STRINGs / arrays should opt out of merging because the per-tag
|
||||
byte size is variable. Add an "opaque-size" flag so the planner skips them.
|
||||
- **Effort**: M.
|
||||
- **Deps**: PR-S7-B1 (the multi-var packer). The two interact: the planner
|
||||
emits merged byte-range reads, then the packer puts several of them on one PDU.
|
||||
- **Docs / fixture / e2e**: extends the §"Performance" section in
|
||||
`docs/v2/s7.md` with a "Block-read coalescing" subsection — the
|
||||
default 16-byte gap-merge threshold, the opaque-size opt-out for
|
||||
STRINGs / arrays, and operator guidance for tuning the threshold per
|
||||
DB; no CLI doc change; no Snap7-server seed change (existing
|
||||
contiguous DB1 seeds — DBW0 / DBW10 / DBD20 — already exercise
|
||||
contiguous-merge); adds
|
||||
`S7BlockCoalescingPlannerTests` (unit) covering the merge planner +
|
||||
opaque opt-out; adds a 50-contiguous-DBW integration test
|
||||
`Driver_coalesces_contiguous_DBWs_into_single_byte_range_read` that
|
||||
asserts wire-level reduction; no `scripts/e2e/test-s7.ps1` change.
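
A sketch of the merge planner, in Python for brevity (the driver is C#). `TagRead`, `MergedRange`, `plan_ranges`, and `slice_tag` are hypothetical names. It assumes all tags belong to the same DB and that opaque-size tags (STRINGs, arrays) were filtered out before planning.

```python
from typing import List, NamedTuple, Tuple


class TagRead(NamedTuple):
    name: str
    offset: int   # byte offset inside the DB
    size: int     # byte length of the tag


class MergedRange(NamedTuple):
    start: int
    length: int
    tags: Tuple[TagRead, ...]


def plan_ranges(tags: List[TagRead], gap_threshold: int = 16) -> List[MergedRange]:
    """Merge tags whose gap to the previous range is at most gap_threshold bytes."""
    ranges: List[MergedRange] = []
    for tag in sorted(tags, key=lambda t: t.offset):
        if ranges:
            last = ranges[-1]
            last_end = last.start + last.length
            if tag.offset - last_end <= gap_threshold:
                new_end = max(last_end, tag.offset + tag.size)
                ranges[-1] = MergedRange(last.start, new_end - last.start, last.tags + (tag,))
                continue
        ranges.append(MergedRange(tag.offset, tag.size, (tag,)))
    return ranges


def slice_tag(block: bytes, merged: MergedRange, tag: TagRead) -> bytes:
    """Cut one tag's bytes out of the block returned for its merged range."""
    begin = tag.offset - merged.start
    return block[begin:begin + tag.size]


words = [TagRead("DB1.DBW0", 0, 2), TagRead("DB1.DBW2", 2, 2), TagRead("DB1.DBW4", 4, 2)]
merged = plan_ranges(words)   # one 6-byte range starting at offset 0
```

The same planner merges `DBW0` / `DBW10` / `DBD20` into a single 24-byte range under the default threshold, because each gap is 8 bytes.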
|
||||
|
||||
### Phase 3 — Operability
|
||||
|
||||
#### PR-S7-C1 — PDU size negotiation surfaced
|
||||
|
||||
Closes gap #2. S7netplus's `Plc` instance exposes the negotiated PDU size after
|
||||
`OpenAsync` via `Plc.MaxPDUSize`.
|
||||
|
||||
- **Files**: `S7Driver.cs` (read `Plc.MaxPDUSize` after open, store on
|
||||
`_health`; expose via `GetHealth().Diagnostics["NegotiatedPduSize"]` —
|
||||
this requires adding a `Diagnostics` dictionary to `DriverHealth`, which
|
||||
is a Core change). Operator-visible via the Admin UI driver-diagnostics
|
||||
panel that already renders Modbus diagnostic stats.
|
||||
- **Tests**: integration test asserts the value is non-zero after init.
|
||||
- **Risks**: `DriverHealth` extension must be backward-compatible — existing
|
||||
drivers should still compile against the unchanged record. Make the new
|
||||
property nullable with a default of `null`.
|
||||
- **Effort**: S.
|
||||
- **Deps**: Core `DriverHealth` shape change (single PR coordinated with
|
||||
the Modbus diagnostic surface).
|
||||
- **Docs / fixture / e2e**: adds a "Diagnostics surfacing" subsection to
|
||||
`docs/v2/s7.md` documenting the `Diagnostics["NegotiatedPduSize"]`
|
||||
surface + how it renders in the Admin UI driver-diagnostics panel;
|
||||
no CLI doc change (CLI doesn't expose diagnostics); updates
|
||||
`docs/drivers/S7-Test-Fixture.md` §"What it actually covers" with a
|
||||
"negotiated PDU size surfaces in driver health" line; no Snap7
|
||||
seed-type change (snap7's PDU negotiation is fixed at 240 bytes —
|
||||
document the fixture's negotiated size in the README); adds
|
||||
`Driver_exposes_negotiated_pdu_size_post_init` smoke test asserting
|
||||
the value is non-zero; no `scripts/e2e/test-s7.ps1` change.
|
||||
|
||||
#### PR-S7-C2 — TSAP / Connection Type selector
|
||||
|
||||
Closes gap #4. S7netplus picks PG-class TSAPs by default; hardened CPUs may
|
||||
require OP / S7-Basic / Other.
|
||||
|
||||
- **Files**: `S7DriverOptions.cs` (new `TsapMode` enum: `Auto` / `Pg` / `Op` /
|
||||
`S7Basic` / `Other`; `Auto` preserves current behavior. Optional
|
||||
`LocalTsap` / `RemoteTsap` `ushort?` for explicit override). `S7Driver.cs`
|
||||
branches on the mode to pick the S7netplus `Plc(CpuType, ...)` constructor
|
||||
vs the `Plc(string ip, byte rack, byte slot, ushort localTsap, ushort remoteTsap)`
|
||||
raw-TSAP overload. Document the raw-TSAP table in `docs/v2/s7.md`.
|
||||
- **Tests**: unit test on the mode → TSAP-byte mapping; live-firmware test
|
||||
documented but only runnable against the dev-box S7-1500 lab rig.
|
||||
- **Risks**: wrong TSAP causes connection refused at handshake — same failure
|
||||
shape as wrong slot. Document the mapping prominently.
|
||||
- **Effort**: M.
|
||||
- **Deps**: none.
|
||||
- **Docs / fixture / e2e**: adds a "TSAP / Connection Type" section to
|
||||
`docs/v2/s7.md` covering the `TsapMode` enum, the raw-TSAP table
|
||||
(PG = 0x0100/0x0102, OP = 0x0200/0x0202, S7-Basic = 0x0300/0x0302,
|
||||
Other = caller-supplied), and the hardened-CPU motivation; adds
|
||||
`--tsap-mode` and `--local-tsap` / `--remote-tsap` flags to
|
||||
`docs/Driver.S7.Cli.md`'s common-flags table with a worked example
|
||||
hitting an OP-class TSAP; no Snap7 seed change (snap7 accepts any
|
||||
TSAP from the CLI, so the unit-level mapping test is sufficient); no
|
||||
smoke test change (live-firmware-only); no `scripts/e2e/test-s7.ps1`
|
||||
change.
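
The mode → TSAP mapping that the unit test targets can be sketched like this. The local/remote pairs reproduce the table above for rack 0, slot 2; the `rack * 0x20 + slot` encoding of the remote low byte is the convention used by Snap7-style clients and is an assumption here, as are the names `CONNECTION_TYPE` and `tsap_pair`. `Other` is omitted because its TSAPs are caller-supplied.

```python
from typing import Tuple

# High byte of the TSAP selects the connection class.
CONNECTION_TYPE = {"Pg": 0x01, "Op": 0x02, "S7Basic": 0x03}


def tsap_pair(mode: str, rack: int, slot: int) -> Tuple[int, int]:
    """Return (local_tsap, remote_tsap) for a connection class and rack/slot."""
    if not (0 <= rack <= 7 and 0 <= slot <= 31):
        raise ValueError("rack must be 0..7 and slot 0..31")
    conn = CONNECTION_TYPE[mode]
    local = conn << 8
    remote = (conn << 8) | (rack * 0x20 + slot)
    return local, remote
```

For example, `tsap_pair("Op", 0, 2)` yields the `0x0200` / `0x0202` pair listed for OP-class connections.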
|
||||
|
||||
#### PR-S7-C3 — Per-tag scan group / publish rate
|
||||
|
||||
Closes gap #20. `SubscribeAsync` takes one publishing interval for the whole
|
||||
list; mixed 100 ms / 1 s / 10 s tags need three subscribe calls today.
|
||||
|
||||
- **Files**: `S7DriverOptions.cs` (extend `S7TagDefinition` with optional
|
||||
`ScanGroup` string). `S7Driver.cs` (`SubscribeAsync` partitions the input
|
||||
list into one poll loop per distinct interval; `PollGroupEngine`-style
|
||||
internal group, but driver-local — same engine the TwinCAT driver uses).
|
||||
- **Tests**: unit test with three tags at three rates asserts three independent
|
||||
poll-tick streams; integration test asserts no group starves the others.
|
||||
- **Risks**: the `_gate` semaphore still serializes — three poll loops can
|
||||
contend. Document the contention as part of the "1 connection / 1 mailbox"
|
||||
invariant; if it bites, follow-up adds a fairness queue.
|
||||
- **Effort**: M.
|
||||
- **Deps**: none.
|
||||
- **Docs / fixture / e2e**: adds a "Per-tag scan groups" subsection to
|
||||
`docs/v2/s7.md` documenting `S7TagDefinition.ScanGroup`, the multi-rate
|
||||
partitioning semantics, and the `_gate` contention caveat; no CLI doc
|
||||
change (CLI is single-tag); no Snap7 seed change required (existing
|
||||
scalar seeds suffice); adds `S7ScanGroupPartitioningTests` (unit) +
|
||||
`Driver_three_scan_groups_publish_independently` smoke test that
|
||||
subscribes 3 tags at 100 ms / 1 s / 10 s rates and asserts
|
||||
independent tick streams; no `scripts/e2e/test-s7.ps1` change
|
||||
(subscribe assertion already covers the polling path).
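
The partitioning step can be sketched as below (Python for brevity; the driver is C#). How a `ScanGroup` name resolves to an interval is not specified in this plan, so the `group_intervals_ms` lookup table and the fallback to the subscription's own interval are assumptions, and `Tag` / `partition_by_interval` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Tag(NamedTuple):
    name: str
    scan_group: Optional[str] = None


def partition_by_interval(
    tags: List[Tag],
    group_intervals_ms: Dict[str, int],
    default_interval_ms: int,
) -> Dict[int, List[str]]:
    """Group tag names by polling interval; each key would get its own poll loop."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for tag in tags:
        # Tags without a scan group (or with an unknown one) use the default interval.
        interval = group_intervals_ms.get(tag.scan_group, default_interval_ms)
        groups[interval].append(tag.name)
    return dict(groups)


partitions = partition_by_interval(
    [Tag("Fast", "fast"), Tag("Slow", "slow"), Tag("Plain")],
    {"fast": 100, "slow": 10_000},
    default_interval_ms=1_000,
)
```

Each resulting interval still contends for the single `_gate` semaphore, so the partition count bounds the number of loops that can queue behind one another.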
|
||||
|
||||
#### PR-S7-C4 — Deadband / on-change with thresholds
|
||||
|
||||
Closes gap #21. `PollOnceAsync` currently does `!Equals(prev, current)` only —
|
||||
no analog deadband.
|
||||
|
||||
- **Files**: `S7DriverOptions.cs` (extend `S7TagDefinition` with
|
||||
`DeadbandAbsolute double?` and `DeadbandPercent double?`). `S7Driver.cs`
|
||||
(`PollOnceAsync` evaluates per-tag deadband for numeric types; non-numeric
|
||||
types fall through to exact equality).
|
||||
- **Tests**: unit tests for absolute and percent deadbands at edge cases
|
||||
(NaN, ±Infinity, sign flip, near-zero percent).
|
||||
- **Risks**: percent deadband against a zero baseline diverges; document and
|
||||
fall back to absolute when |baseline| < 1e-6.
|
||||
- **Effort**: S.
|
||||
- **Deps**: PR-S7-C3 helpful but not required.
|
||||
- **Docs / fixture / e2e**: adds a "Deadband / on-change" subsection to
|
||||
`docs/v2/s7.md` documenting `DeadbandAbsolute` / `DeadbandPercent` per
|
||||
tag, NaN / ±Infinity / sign-flip / near-zero-percent edge cases, and
|
||||
the |baseline| < 1e-6 fallback; no CLI doc change (CLI's `subscribe`
|
||||
already polls on change); no Snap7 seed change; adds
|
||||
`S7DeadbandTests` (unit) covering all edge cases — no integration test
|
||||
required since deadband is pre-publish filtering inside the polling
|
||||
loop; no `scripts/e2e/test-s7.ps1` change.
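
One possible shape for the deadband check, sketched in Python (the driver is C#). The precedence of percent over absolute, and the treatment of NaN / ±Infinity as plain change detection, are assumptions made for illustration; only the |baseline| < 1e-6 fallback comes from this plan. `should_publish` is a hypothetical name.

```python
import math
from typing import Optional

NEAR_ZERO = 1e-6   # below this baseline magnitude, percent deadband is not applied


def should_publish(
    previous: float,
    current: float,
    deadband_absolute: Optional[float] = None,
    deadband_percent: Optional[float] = None,
) -> bool:
    """Decide whether a numeric change is large enough to publish."""
    # Deadband arithmetic is meaningless for NaN / infinity: publish on any change,
    # treating NaN -> NaN as "no change".
    if not (math.isfinite(previous) and math.isfinite(current)):
        both_nan = math.isnan(previous) and math.isnan(current)
        return not both_nan and previous != current
    delta = abs(current - previous)
    if deadband_percent is not None and abs(previous) >= NEAR_ZERO:
        return delta > abs(previous) * deadband_percent / 100.0
    if deadband_absolute is not None:
        return delta > deadband_absolute
    # No usable deadband configured: fall back to exact inequality.
    return delta != 0.0
```

Using `abs(previous)` as the percent baseline keeps the threshold symmetric across a sign flip, which is one of the edge cases the unit tests are meant to pin down.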
|
||||
|
||||
#### PR-S7-C5 — Pre-flight PUT/GET enablement test
|
||||
|
||||
Closes gap #24. We currently surface `BadDeviceFailure` only at first read.
|
||||
Add a pre-flight check during `InitializeAsync` (after `OpenAsync`) that issues
|
||||
one trivial read (`MW0` or the configured `Probe.ProbeAddress`) and surfaces
|
||||
the dedicated diagnostic message before declaring `DriverState.Healthy`.
|
||||
|
||||
- **Files**: `S7Driver.cs` (`InitializeAsync` adds the probe read; on
|
||||
`S7.Net.PlcException` with the PUT/GET-disabled error code, throw a
|
||||
typed `S7PutGetDisabledException` with a configuration-fix hint).
|
||||
- **Tests**: integration test toggles a Snap7 simulator quirk that mimics
|
||||
the PUT/GET-disabled response (Snap7 doesn't model this; gate the test
|
||||
on a `--with-real-plc` opt-in or document as live-firmware-only).
|
||||
- **Risks**: pre-flight against a real `Probe.ProbeAddress` requires the
|
||||
address to exist in the PLC; document that the default `MW0` is fine for
|
||||
most installs but allow `null` / "skip" for sites that haven't wired one.
|
||||
- **Effort**: S.
|
||||
- **Deps**: none.
|
||||
- **Docs / fixture / e2e**: extends the "PUT/GET must be enabled" section
|
||||
of `docs/Driver.S7.Cli.md` with the new typed
|
||||
`S7PutGetDisabledException` message + the "skip pre-flight" knob;
|
||||
adds the same content as a "Pre-flight PUT/GET enablement" subsection
|
||||
in `docs/v2/s7.md`; no Snap7 seed change (snap7 doesn't model
|
||||
PUT/GET-disabled — the test for the success path uses the existing
|
||||
MW0 seed); adds `Driver_preflight_passes_when_probe_address_seeded`
|
||||
smoke test; documents the live-firmware test as gated on a
|
||||
`--with-real-plc` opt-in flag in `docs/drivers/S7-Test-Fixture.md`
|
||||
§"Follow-up candidates"; no `scripts/e2e/test-s7.ps1` change (probe
|
||||
test already runs first).
|
||||
|
||||
### Phase 4 — Workflow (symbol import + UDTs + instance DBs)
|
||||
|
||||
#### PR-S7-D1 — Symbol-table / TIA Portal export browse
|
||||
|
||||
Closes gap #5. Operators currently hand-edit `S7TagDefinition` JSON. TIA Portal
|
||||
exports symbols as **`.s7p` archive → External tags → CSV / SDF**. The lighter
|
||||
target is the CSV format used by the "Generate source from blocks" exporter.
|
||||
|
||||
- **Files**: new `src/ZB.MOM.WW.OtOpcUa.Driver.S7/SymbolImport/` directory:
|
||||
- `TiaCsvImporter.cs` — parses TIA Portal "Show all tags" CSV (`Name`,
|
||||
`Address`, `Data type`, `Comment`, `Visible in HMI`). Output: list of
|
||||
`S7TagDefinition`.
|
||||
- `AwlImporter.cs` — best-effort AWL `VAR_GLOBAL` / `DATA_BLOCK` parser
|
||||
for legacy STEP 7 Classic projects.
|
||||
- **Files (Admin UI)**: an "Import S7 symbols" button on the Driver Tags tab
|
||||
that POSTs the file to a new `POST /api/drivers/{id}/import-s7-symbols`
|
||||
endpoint and reports the diff.
|
||||
- **Tests**: unit tests with golden-input CSV / AWL fixtures; round-trip
|
||||
test that imports → produces tags → reads against simulator.
|
||||
- **Risks**: TIA Portal CSV is locale-dependent (decimal-comma in DE locale).
|
||||
Detect from the header row and accept both. UDT-typed symbols import as
|
||||
a placeholder until PR-S7-D2.
|
||||
- **Effort**: L (5-7 days incl. the Admin UI flow).
|
||||
- **Deps**: see Open Question (c) — confirm CSV+AWL is the right scope, or
|
||||
whether `.s7p` / `.zip` archive parsing is required.
|
||||
- **Docs / fixture / e2e**: adds new doc
|
||||
`docs/drivers/S7-TIA-Import.md` documenting the supported TIA Portal
|
||||
CSV format (column names, locale-comma detection, UDT-typed
|
||||
placeholders) and the AWL `VAR_GLOBAL` / `DATA_BLOCK` parser scope;
|
||||
cross-links it from `docs/v2/s7.md`'s new "Symbol import" section
|
||||
and from `docs/Driver.S7.Cli.md` with a future `import` subcommand
|
||||
hook; adds golden-input fixtures
|
||||
`tests/ZB.MOM.WW.OtOpcUa.Driver.S7.IntegrationTests/Fixtures/sample_tia_export.csv`,
|
||||
`sample_tia_export_de_locale.csv`, and `sample_step7_classic.awl`;
|
||||
no Snap7 seed change required (existing DB1 seeds support
|
||||
the import-then-read round-trip); adds `TiaCsvImporterTests` and
|
||||
`AwlImporterTests` (unit) + `Driver_imports_csv_then_reads_seeded_tags`
|
||||
integration test that imports the sample CSV → reads via Snap7;
|
||||
no `scripts/e2e/test-s7.ps1` change (Admin-UI flow has its own
|
||||
end-to-end coverage in the Admin UI test suite).
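
A sketch of header-row locale detection for the CSV importer, in Python for brevity (the planned `TiaCsvImporter` is C#). It assumes that a decimal-comma locale such as DE exports with `;` as the field separator, that the header names match the column list above, and that addresses may carry a leading `%`; all three should be checked against real exports. `parse_tia_csv` is a hypothetical name.

```python
import csv
import io
from typing import Dict, List


def parse_tia_csv(text: str) -> List[Dict[str, str]]:
    """Parse a tag-table CSV, picking the delimiter from the header row."""
    lines = text.splitlines()
    if not lines:
        return []
    header = lines[0]
    # Decimal-comma locales typically use ';' between fields, so count both.
    delimiter = ";" if header.count(";") > header.count(",") else ","
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    tags = []
    for row in reader:
        tags.append(
            {
                "name": row["Name"].strip(),
                "address": row["Address"].strip().lstrip("%"),
                "data_type": row["Data type"].strip(),
            }
        )
    return tags


en_export = "Name,Address,Data type,Comment,Visible in HMI\nMotorSpeed,%DB1.DBW0,Int,speed,True\n"
de_export = "Name;Address;Data type;Comment;Visible in HMI\nMotorSpeed;%DB1.DBW0;Int;Drehzahl 1,5;True\n"
```

Both sample exports produce the same tag list, which is the property the two golden-input fixtures are meant to prove.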
|
||||
|
||||
#### PR-S7-D2 — UDT / STRUCT / nested-DB handling
|
||||
|
||||
Closes gap #6. Today's tag map is flat scalar-only; UDT-typed DBs are
|
||||
unusable without hand-flattening every member.
|
||||
|
||||
- **Files**: `S7DriverOptions.cs` (extend `S7TagDefinition` with `UdtName string?`;
|
||||
alongside, a new `IReadOnlyList<S7UdtDefinition> Udts` on the options that
|
||||
declares the layout: name, ordered members `(Name, Offset, S7DataType, ArrayDim?)`).
|
||||
`S7Driver.cs` fans a UDT-typed tag into per-member sub-tags at `InitializeAsync`,
|
||||
so the read/write path stays scalar-only.
|
||||
- **Tests**: unit tests for fan-out with nested UDTs (UDT-of-UDT); integration
|
||||
test with a Snap7 DB seeded as a UDT-shape byte array proves the fan-out
|
||||
decodes correctly.
|
||||
- **Risks**: UDT-of-UDT arbitrary nesting depth — cap at 4 levels and reject
|
||||
deeper with a clear error. Optimized DBs would let TIA reorder members,
|
||||
re-introducing gap #1; document that user-defined UDTs require "Optimized
|
||||
block access" off, same as the general DB rule.
|
||||
- **Effort**: L (1-2 weeks).
|
||||
- **Deps**: PR-S7-D1 (the symbol importer imports UDT-typed entries as a
|
||||
placeholder; D2 makes those usable).
|
||||
- **Docs / fixture / e2e**: adds a "UDT / STRUCT support" section to
|
||||
`docs/v2/s7.md` documenting `S7UdtDefinition`, the fan-out
|
||||
semantics, the 4-level nesting cap, and the "Optimized block access
|
||||
must be off" prerequisite; extends `docs/drivers/S7-TIA-Import.md`
|
||||
(created in PR-S7-D1) with a UDT-typed-entry section showing how
|
||||
the importer + `Udts` declaration cooperate; updates
|
||||
`docs/drivers/S7-Test-Fixture.md` §"What it does NOT cover" item 5 to
|
||||
remove "UDT fan-out"; extends `Docker/server.py` with a
|
||||
`udt_layout` meta-seed-type that lays out per-member offsets within
|
||||
a DB byte range; seeds a `DB1.MyUdt[400]` (e.g. Real + Int + Bool)
|
||||
in `Docker/profiles/s7_1500.json`; adds `S7UdtFanOutTests` (unit) +
|
||||
`Driver_fans_out_udt_into_member_tags` integration test covering a
|
||||
nested-UDT case; adds a UDT-member round-trip assertion to
|
||||
`scripts/e2e/test-s7.ps1`.
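
The fan-out with its nesting cap can be sketched as follows (Python for brevity; the driver is C#). `Member`, `fan_out`, and the dictionary-based UDT registry are hypothetical stand-ins for `S7UdtDefinition`; bit offsets and array members are left out to keep the sketch short.

```python
from typing import Dict, List, NamedTuple, Tuple


class Member(NamedTuple):
    name: str
    offset: int      # byte offset relative to the start of the enclosing UDT
    data_type: str   # scalar type name, or the name of another UDT


MAX_DEPTH = 4   # nesting cap from the plan


def fan_out(
    prefix: str,
    base_offset: int,
    udt_name: str,
    udts: Dict[str, List[Member]],
    depth: int = 1,
) -> List[Tuple[str, int, str]]:
    """Flatten a UDT-typed tag into (path, absolute byte offset, scalar type) entries."""
    if depth > MAX_DEPTH:
        raise ValueError(f"UDT nesting deeper than {MAX_DEPTH} levels at {prefix}")
    result: List[Tuple[str, int, str]] = []
    for member in udts[udt_name]:
        path = f"{prefix}.{member.name}"
        offset = base_offset + member.offset
        if member.data_type in udts:
            # Nested UDT: recurse with the member's absolute offset as the new base.
            result.extend(fan_out(path, offset, member.data_type, udts, depth + 1))
        else:
            result.append((path, offset, member.data_type))
    return result


udts = {
    "Inner": [Member("Flag", 0, "Bool")],
    "MyUdt": [Member("Speed", 0, "Real"), Member("Count", 4, "Int"), Member("State", 6, "Inner")],
}
sub_tags = fan_out("MyUdt", 400, "MyUdt", udts)
```

Because the result is a flat list of scalar entries, the existing read/write path does not need to know that a UDT was involved.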
|
||||
|
||||
#### PR-S7-D3 — Instance-DB / FB parameter access
|
||||
|
||||
Closes gap #10. Multi-instance FBs are addressed symbolically (`MyFB_Instance.MyParam`)
|
||||
with no fixed absolute DB byte offset visible without a TIA project export.
|
||||
|
||||
- **Files**: extends PR-S7-D1's importer to recognize "instance DB" entries
|
||||
(TIA export shows them with a different "DB type" column value); the
|
||||
importer translates `MyFB_Instance.MyParam` to the resolved
|
||||
`DBn.DBW_offset` based on the FB's interface declaration in the export.
|
||||
- **Tests**: golden-input test with an FB-instance DB export; resolved
|
||||
addresses match Siemens reference.
|
||||
- **Risks**: when the FB interface changes (TIA "online change"), instance-DB
|
||||
layouts shift. Document that re-import is required after any FB-interface
|
||||
edit. Eventually surface this as a startup warning when the symbol-table
|
||||
hash differs from the imported snapshot — out of scope for this PR.
|
||||
- **Effort**: M.
|
||||
- **Deps**: PR-S7-D1, PR-S7-D2.
|
||||
- **Docs / fixture / e2e**: extends `docs/drivers/S7-TIA-Import.md` with
|
||||
an "Instance DBs / FB parameters" section covering the importer's
|
||||
`MyFB_Instance.MyParam` → `DBn.DBW_offset` resolution, the "DB type"
|
||||
column convention, and the "re-import on FB-interface edit" caveat;
|
||||
adds the same caveat as a paragraph in `docs/v2/s7.md`'s "UDT /
|
||||
STRUCT" section; adds a golden-input fixture
|
||||
`Fixtures/sample_tia_export_with_fb_instance.csv` to the integration
|
||||
tests; no Snap7 seed change required (resolved addresses land in DB1
|
||||
which the existing seeds back); adds
|
||||
`InstanceDbResolverTests` (unit) +
|
||||
`Driver_resolves_fb_instance_then_reads_seeded_member` integration
|
||||
test; no `scripts/e2e/test-s7.ps1` change (FB-instance lookup is an
|
||||
import-time concern).
|
||||
|
||||
### Phase 5 — Diagnostics & security
|
||||
|
||||
#### PR-S7-E1 — CPU diagnostic buffer / SZL reads
|
||||
|
||||
Closes gap #11. SZL (System Status List) IDs surface CPU type, firmware
|
||||
version, cycle-time min/avg/max, and the diagnostic-buffer entries.
|
||||
|
||||
- **Files**: `S7Driver.cs` exposes a small set of "system tags" alongside
|
||||
`Tags` — virtual addresses prefixed `@System.` that the read path
|
||||
recognizes and dispatches to S7netplus's `ReadSzlAsync` (or, if not
|
||||
exposed, a raw `Plc.ReadBytes` against the SZL-via-S7comm sub-protocol):
|
||||
- `@System.CpuType`, `@System.Firmware`, `@System.OrderNo` — SZL 0x0011
|
||||
- `@System.CycleMs.Min` / `.Max` / `.Avg` — SZL 0x0132 / 0x0432
|
||||
- `@System.DiagBuffer[0..N]` — SZL 0x00A0 ring-buffer entries
|
||||
- **Files (discovery)**: `DiscoverAsync` adds a `Diagnostics/` subfolder
|
||||
with the system-tag set when `S7DriverOptions.ExposeSystemTags = true`.
|
||||
- **Tests**: unit tests for the SZL response parser (golden bytes); live-
|
||||
firmware test against the dev-box S7-1500.
|
||||
- **Risks**: S7netplus's SZL surface is incomplete; may need a raw
|
||||
`Plc.ReadBytes` against `0x84` register or a small SZL-PDU helper.
|
||||
- **Effort**: M-L.
|
||||
- **Deps**: PR-S7-C1 (`DriverHealth.Diagnostics` dictionary already there).
|
||||
- **Docs / fixture / e2e**: adds a "CPU diagnostics (SZL)" section to
|
||||
`docs/v2/s7.md` listing the exposed `@System.*` virtual addresses, the
|
||||
underlying SZL IDs, and the `ExposeSystemTags` opt-in; extends
|
||||
`docs/Driver.S7.Cli.md` with a worked `read -a @System.CpuType` example
|
||||
in the cookbook; updates `docs/drivers/S7-Test-Fixture.md` §"What it
|
||||
does NOT cover" with a note that snap7 does not implement SZL — golden-
|
||||
byte unit tests cover the parser, live SZL is gated on a real S7-1500;
|
||||
no Snap7 seed change (snap7 returns a fixed handshake banner that the
|
||||
test checks for "SZL not supported on simulator" branch); adds
|
||||
`S7SzlParserTests` (unit) with golden bytes; documents the live SZL
|
||||
test in `docs/drivers/S7-Test-Fixture.md` §"Follow-up candidates"; no
|
||||
`scripts/e2e/test-s7.ps1` change.
|
||||
|
||||
#### PR-S7-E2 — PLC password / protection-level handling
|
||||
|
||||
Closes gap #14. S7-300/400 protection levels 1-3 and S7-1200/1500 connection
|
||||
mechanisms can require a password on connect.
|
||||
|
||||
- **Files**: `S7DriverOptions.cs` (new `Password string?` and `ProtectionLevel`
|
||||
enum). `S7Driver.cs` calls S7netplus's `SendPassword` (if the API surfaces it
|
||||
— newer S7netplus versions ship `Plc.SendPassword(string)`; if not, raw-PDU
|
||||
fallback per Siemens "Communication Function Manual" §5.2).
|
||||
- **Tests**: live-firmware-gated; password-tier failure modes don't reproduce
|
||||
in Snap7. Unit-level coverage for the options-binding shape only.
|
||||
- **Risks**: S7netplus may not expose password auth — fallback is to call into
|
||||
the lower-level `S7.Net.S7Protocol` types or to fork. Land the options
|
||||
surface unconditionally, gate the wire path on library support, document
|
||||
the limitation if the library doesn't oblige.
|
||||
- **Effort**: M (S if S7netplus ships it; L if we need a fallback path).
|
||||
- **Deps**: none.
|
||||
- **Docs / fixture / e2e**: adds a "PLC password / protection levels"
|
||||
section to `docs/v2/s7.md` documenting the `Password` /
|
||||
`ProtectionLevel` options + the S7-300/400 levels 1-3 vs S7-1200/1500
|
||||
connection-mechanism semantics + the "limitation if S7netplus
|
||||
doesn't ship `SendPassword`" note; adds a `--password` flag to
|
||||
`docs/Driver.S7.Cli.md`'s common-flags table with a hardened-CPU
|
||||
worked example; updates `docs/drivers/S7-Test-Fixture.md` §"What it
|
||||
does NOT cover" with a "password / protection levels not modelled by
|
||||
snap7" note; no Snap7 seed change (snap7 doesn't enforce protection
|
||||
levels); adds options-binding unit tests only — no integration test
|
||||
(live-firmware-only); no `scripts/e2e/test-s7.ps1` change.
|
||||
|
||||
### Phase 6 — S7-1500 Optimized DB / Symbolic addressing (decision PR)
|
||||
|
||||
#### PR-S7-F — Optimized DB / S7Plus
|
||||
|
||||
Closes gap #1. **This is an architectural decision PR, not a code PR.**
|
||||
|
||||
S7netplus speaks classic S7comm only. Optimized DBs on S7-1500 (default for
|
||||
new TIA projects) reorder fields and have no fixed byte offsets — absolute
|
||||
`DB1.DBW0` reads return `BadDeviceFailure`. Three tracks:
|
||||
|
||||
1. **Document the constraint and stay on S7netplus.** Operators must uncheck
|
||||
"Optimized block access" in TIA Portal for any DB the driver reads. This
|
||||
is what the test fixture already documents. Effort: S (docs only).
|
||||
2. **Migrate to a library that supports S7Plus.**
|
||||
- **Snap7 v2 / `Snap7Net`** — C-library wrapper, supports classic S7comm
|
||||
only (same limitation as S7netplus). Not a fix.
|
||||
- **Sharp7 fork** — community fork of Snap7 with **partial** S7-1200/1500
|
||||
PUT/GET semantics. Still classic S7comm.
|
||||
- **Custom S7Plus implementation** — Wireshark dissector exists; reverse
|
||||
engineering is substantial. Effort: ≥ 4 weeks; ongoing protocol-version
|
||||
maintenance. Risk: Siemens has not published the S7Plus protocol specification.
|
||||
3. **Embed an OPC UA → OPC UA bridge to the S7-1500's onboard OPC UA server.**
|
||||
The S7-1500 V2.5+ exposes its own OPC UA server with full symbolic access.
|
||||
Our `OPC UA Client driver` (already shipping) could read the
|
||||
target CPU's OPC UA server and re-publish — sidesteps S7Plus entirely.
|
||||
Effort: S; caveat: requires the customer to license Siemens OPC UA
|
||||
on the CPU. Most modern S7-1500 deployments already license it.
|
||||
|
||||
**Recommendation**: ship Track 1 docs immediately (closes the operator
|
||||
expectation gap) and Track 3 as the Optimized-DB workflow path (re-uses
|
||||
existing OPC UA Client driver). Track 2 (S7Plus reverse-engineering) is
|
||||
out of scope unless a customer pays for it.
|
||||
|
||||
- **Files**: `docs/v2/s7.md` (Optimized DB section + how to disable),
|
||||
`docs/featuregaps.md` row #1 updated to reflect the Track 1+3 decision.
|
||||
- **Tests**: live-firmware test against the dev-box S7-1500 with optimized
|
||||
block access toggled both ways, asserting `BadDeviceFailure` vs
|
||||
successful read.
|
||||
- **Risks**: Track 3's OPC-UA-Client-bridging needs Admin UI plumbing to
|
||||
configure; that's a larger workstream tracked separately.
|
||||
- **Effort**: S (docs + decision); L if Track 2 is taken.
|
||||
- **Deps**: Open Question (a) below.
|
||||
- **Docs / fixture / e2e**: rewrites `docs/v2/s7.md` to land a
|
||||
prominent "Optimized DB constraint" section at the top — explicitly
|
||||
documents the S7-1200 V4.0+ / S7-1500 default, the
|
||||
`BadDeviceFailure` shape on absolute `DB1.DBW0` reads against an
|
||||
optimized DB, the "Uncheck Optimized block access in TIA Portal"
|
||||
fix, and the recommended **bridge-via-OpcUaClient** pattern with a
|
||||
worked example (Siemens S7-1500 V2.5+ onboard OPC UA server →
|
||||
`OpcUaClient` driver → re-publish on the OtOpcUa server's address
|
||||
space); updates `docs/featuregaps.md` row #1 to reflect the
|
||||
Track 1+3 decision; updates the "Optimized-DB" line of
|
||||
`docs/drivers/S7-Test-Fixture.md` §"What it does NOT cover" item 4
|
||||
to point at the new doc; no CLI doc change (CLI is a probe tool, not
|
||||
the bridging path); no Snap7 fixture change (snap7 has no Optimized-
|
||||
DB mode); the live-firmware test toggling Optimized block access on
|
||||
/ off is recorded as a manual checklist in
|
||||
`docs/drivers/S7-Test-Fixture.md` §"Follow-up candidates" and gated
|
||||
behind `--with-real-plc`; if Track 2 is taken later, this PR's doc
|
||||
surface becomes the migration baseline; no `scripts/e2e/test-s7.ps1`
|
||||
change.
|
||||
|
||||
---
|
||||
|
||||
## Documentation, fixture, and e2e impact
|
||||
|
||||
Consolidated view of every per-PR `Docs / fixture / e2e` line above, so a
|
||||
reviewer can see the cross-cutting churn at a glance and so the doc /
|
||||
fixture / e2e maintainers can sequence their work alongside the code PRs.
|
||||
|
||||
### User-facing documentation churn
|
||||
|
||||
| PR | `docs/v2/s7.md` | `docs/Driver.S7.Cli.md` | `docs/drivers/S7-Test-Fixture.md` | New / cross-cut docs |
|----|-----------------|-------------------------|------------------------------------|----------------------|
| PR-S7-A1 (LInt/ULInt/LReal/LWord) | extend type-mapping table | new sizes in cookbook | remove "no 64-bit types" | — |
| PR-S7-A2 (STRING/WSTRING/CHAR/WCHAR) | string layout subsection | `--type WString` / `--string-length` | list new types | — |
| PR-S7-A3 (DTL/DT/S5TIME/TIME/TOD/DATE) | "Date / time types" subsection | datetime cookbook entries | list new types | — |
| PR-S7-A4 (arrays) | "Array tags (ValueRank=1)" subsection | `--array-count` flag + examples | list array round-trips | — |
| PR-S7-A5 (V-memory) | "LOGO! 8 / S7-200 V-memory" subsection | grammar table + S7200Smart example | parser coverage note | — |
| PR-S7-B1 (PDU packing) | "Performance — multi-variable PDU packing" subsection | — | — | — |
| PR-S7-B2 (block coalescing) | "Block-read coalescing" subsection | — | — | — |
| PR-S7-C1 (negotiated PDU diag) | "Diagnostics surfacing" subsection | — | "negotiated PDU size" line | — |
| PR-S7-C2 (TSAP) | "TSAP / Connection Type" section | `--tsap-mode` / `--local-tsap` / `--remote-tsap` flags | — | — |
| PR-S7-C3 (scan groups) | "Per-tag scan groups" subsection | — | — | — |
| PR-S7-C4 (deadband) | "Deadband / on-change" subsection | — | — | — |
| PR-S7-C5 (PUT/GET pre-flight) | "Pre-flight PUT/GET enablement" subsection | extend "PUT/GET must be enabled" | mark live-firmware test | — |
| PR-S7-D1 (TIA CSV / AWL import) | "Symbol import" cross-link | future `import` subcommand stub | — | **new `docs/drivers/S7-TIA-Import.md`** |
| PR-S7-D2 (UDT / STRUCT) | "UDT / STRUCT support" section | — | remove "UDT fan-out" | extend `S7-TIA-Import.md` |
| PR-S7-D3 (instance DB) | re-import-on-FB-edit caveat | — | — | extend `S7-TIA-Import.md` |
| PR-S7-E1 (SZL diagnostics) | "CPU diagnostics (SZL)" section | `read -a @System.CpuType` example | "SZL not modelled by snap7" + Follow-up | — |
| PR-S7-E2 (PLC password) | "PLC password / protection levels" section | `--password` flag | "password not modelled by snap7" | — |
| PR-S7-F (Optimized DB / S7Plus) | top-level "Optimized DB constraint" + bridge-via-OpcUaClient worked example | — | point §"What it does NOT cover" at new doc | also updates `docs/featuregaps.md` row #1 |
|
||||
|
||||
### Snap7-server fixture seed-type additions per PR
|
||||
|
||||
The snap7 simulator at `localhost:1102` (driven by
|
||||
`tests/ZB.MOM.WW.OtOpcUa.Driver.S7.IntegrationTests/Docker/server.py` +
|
||||
`Docker/profiles/s7_1500.json`) has a `seed_buffer` pump with a fixed type
|
||||
set — `u8 / i8 / u16 / i16 / u32 / i32 / f32 / bool / ascii`. New PRs need
|
||||
new seed-type cases in `server.py`, new offsets in `s7_1500.json`, and
|
||||
matching constants in `S7_1500Profile.cs`. The table below names the
|
||||
delta for each Build-Yes PR:
|
||||
|
||||
| PR | New `server.py` seed types | New `s7_1500.json` seed offsets | `S7_1500Profile.cs` additions |
|----|----------------------------|----------------------------------|-------------------------------|
| PR-S7-A1 | `i64`, `u64`, `f64` | `DB1.DBL40` (i64), `DB1.DBL48` (f64), `DB1.DBL56` (u64) | `SmokeI64Tag` / `SmokeU64Tag` / `SmokeF64Tag` |
| PR-S7-A2 | `wstring`, `char`, `wchar` (existing `ascii` covers STRING) | `DB1.WSTRING[256]`, `DB1.CHAR[300]` | `SmokeWStringTag` / `SmokeCharTag` |
| PR-S7-A3 | `dtl`, `dt`, `s5time`, `time`, `tod`, `date` (golden-byte vectors in comments) | `DB1.DTL[260]`, `DB1.DT[272]`, `DB1.S5TIME[280]`, `DB1.TIME[284]`, `DB1.TOD[288]`, `DB1.DATE[292]` | `SmokeDtl` / `SmokeDt` / `SmokeS5Time` / `SmokeTime` / `SmokeTod` / `SmokeDate` |
| PR-S7-A4 | `array` meta-seed (inner-type + count) | `DB1.ArrayInt[300]` 10×Int, `DB1.ArrayReal[320]` 10×Real | `ArrayInt10Tag` / `ArrayReal10Tag` |
| PR-S7-A5 | none (V-memory aliases land in DB1, which `server.py` already exposes) | none | unit-only — no profile change |
| PR-S7-B1 | none | none (existing scalar seeds suffice for packing) | none — perf integration test reuses scalar tags |
| PR-S7-B2 | none | none (existing contiguous DBW0 / DBW10 / DBD20 already test merge) | none |
| PR-S7-C1 | none | none | none |
| PR-S7-C2 | none (snap7 accepts any TSAP) | none | none |
| PR-S7-C3 | none | none | none |
| PR-S7-C4 | none | none | none |
| PR-S7-C5 | none (existing `MK0` MW0 seed covers success path) | none | none |
| PR-S7-D1 | none (CSV import lands tags pointing at existing seeds) | none | possibly add fixture-pointer constants |
| PR-S7-D2 | `udt_layout` meta-seed (per-member offsets) | `DB1.MyUdt[400]` (Real + Int + Bool layout) | `MyUdtTag` + member tags |
| PR-S7-D3 | none (resolved addresses land in DB1) | none | none |
| PR-S7-E1 | none — snap7 doesn't model SZL; unit-level golden bytes cover the parser | none | none |
| PR-S7-E2 | none — snap7 doesn't enforce protection levels; options-binding unit tests only | none | none |
| PR-S7-F | none — snap7 has no Optimized-DB mode; live-firmware checklist instead | none | none |
|
||||
|
||||
### E2E `scripts/e2e/test-s7.ps1` impact
|
||||
|
||||
`scripts/e2e/test-s7.ps1` runs the five-assertion CLI loopback (probe /
|
||||
driver-loopback / forward-bridge / reverse-bridge / subscribe-sees-change)
|
||||
against `DB1.DBW0` Int16. Build-Yes PRs that add CLI surface get a
|
||||
matching loopback assertion; PRs that touch only internals or admin-UI
|
||||
flows do not.
|
||||
|
||||
| PR | E2E script change |
|----|-------------------|
| PR-S7-A1 | add LInt loopback assertion (write 0x7FFFFFFFFFFFFFFF, read back) |
| PR-S7-A2 | add string round-trip assertion |
| PR-S7-A3 | none (CLI cookbook covers manual surface) |
| PR-S7-A4 | add array round-trip assertion |
| PR-S7-A5 | none (live-LOGO! field-only) |
| PR-S7-B1 | none |
| PR-S7-B2 | none |
| PR-S7-C1 | none |
| PR-S7-C2 | none (live-firmware-only) |
| PR-S7-C3 | none (subscribe assertion already covers polling) |
| PR-S7-C4 | none |
| PR-S7-C5 | none (probe runs first today) |
| PR-S7-D1 | none (Admin UI has its own e2e) |
| PR-S7-D2 | add UDT-member round-trip assertion |
| PR-S7-D3 | none (import-time concern) |
| PR-S7-E1 | none |
| PR-S7-E2 | none (live-firmware-only) |
| PR-S7-F | none (decision PR; live-firmware checklist instead) |
|
||||
|
||||
---
|
||||
|
||||
## Skip-rated items (for context)
|
||||
|
||||
| # | Gap | Skip rationale |
|---|-----|---------------|
| 12 | AS-Alarms / Alarm_S / ProDiag | Alarms are a separate workstream; no `IAlarmSource` shipped on this driver yet, and the gap analysis flags it as a deferred topic. |
| 13 | CPU Run / Stop control / block download | Security and safety risk. PG-class writes that change CPU state are explicitly out of scope. |
| 15 | S7-1500 Secure Communication / TLS | Significant work; S7netplus has no TLS surface. Reconsider when S7Plus track is taken. |
| 16 | S7-400H redundant H-system support | Rare in our deployment scope. Server-level redundancy (`docs/Redundancy.md`) covers the OPC UA layer; H-system driver-level failover is a separate axis. |
| 17 | Multi-CPU rack parallel sessions | One session per CPU works for the deployments we target; multi-CPU racks are an S7-400 niche. |
| 18 | MPI / Profibus / RFC1006-routed transports | Declining use; brownfield only. S7netplus is Ethernet-only. |
| 23 | Connection-resource budget / parallel jobs | One connection works; premature optimization until a deployment hits the cap. |
|
||||
|
||||
---
|
||||
|
||||
## Open questions
|
||||
|
||||
### (a) Library choice for S7Plus
|
||||
|
||||
PR-S7-F gates on this decision. Options:
|
||||
|
||||
1. **Stay on S7netplus + document Optimized-DB constraint** (preferred default).
|
||||
2. **Fork to Sharp7 / Snap7 v2** — does *not* solve the S7Plus / Optimized-DB
|
||||
problem; both are classic S7comm only. Adopting them buys nothing for this
|
||||
gap. Reject unless we want it for unrelated reasons.
|
||||
3. **Custom S7Plus client over Wireshark-dissected protocol** — large effort,
|
||||
ongoing maintenance risk. Only if a customer is paying.
|
||||
4. **OPC UA → OPC UA bridge via existing OPC UA Client driver** — sidesteps
|
||||
S7Plus by re-using Siemens's onboard OPC UA server. Recommended secondary
|
||||
track.
|
||||
|
||||
Decision needed before Phase 6 PR-S7-F kicks off.
|
||||
|
||||
### (b) `WriteIdempotent` semantics for new types
|
||||
|
||||
The `WriteIdempotent` per-tag flag (decisions #44, #45, #143) governs
replay-safe writes. New types from Phase 1:
|
||||
|
||||
- **STRING / WSTRING** — typically idempotent (recipe / message text).
|
||||
Replay-safe by default? **Need confirmation.** Risk: PLC programs that
|
||||
treat a new string write as a "new message" event would double-fire.
|
||||
- **DTL / DT** — usually written from a clock master; replay-safe.
|
||||
- **Arrays of UDT** — depends on the UDT semantics (recipe = safe, command
|
||||
block = unsafe). Inherit `WriteIdempotent` from the parent tag, do not
|
||||
add a per-member flag.
|
||||
- **64-bit types** — same rule as 32-bit equivalents.
|
||||
|
||||
Default: keep `WriteIdempotent = false` for everything. Operators flip per
|
||||
tag based on PLC program semantics. **No semantic extension needed**, but
|
||||
document the per-type guidance in `docs/v2/s7.md`.
|
||||
|
||||
### (c) Symbol-import file format(s)
|
||||
|
||||
PR-S7-D1 ships an importer. Which formats?
|
||||
|
||||
- **TIA Portal CSV** (Show all tags / Export) — preferred entry point;
|
||||
most common. **Confirm.**
|
||||
- **TIA Portal SDF / Excel** — same data; harder to parse. Skip unless
|
||||
customer demand emerges.
|
||||
- **STEP 7 Classic AWL / SCL `.AWL`** — secondary. Useful for legacy
|
||||
S7-300/400 sites still on Classic. **Include in D1?**
|
||||
- **`.s7p` / `.zap` project archive** — full TIA project. ZIP-shaped;
|
||||
symbol export would require unpacking and parsing internal XML. Large
|
||||
scope. **Defer.**
|
||||
- **`.udt` / `.SDF` external tag library** — niche; defer unless asked.
|
||||
|
||||
Recommendation: PR-S7-D1 ships **TIA CSV** + **AWL** only. Anything else is
|
||||
a follow-up. Decision needed before Phase 4 work begins.
|
||||
@@ -0,0 +1,899 @@
|
||||
# TwinCAT Driver — Implementation Plan
|
||||
|
||||
> Source of gap analysis: [featuregaps.md → TwinCAT](../featuregaps.md#twincat-beckhoff-ads)
|
||||
>
|
||||
> Covers Build = Yes items only.
|
||||
|
||||
## Summary
|
||||
|
||||
The TwinCAT driver (`src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/`) ships a solid baseline:
|
||||
six capability interfaces over `Beckhoff.TwinCAT.Ads` v6 `AdsClient`, native
|
||||
`AdsTransMode.OnChange` notifications, AMS address parsing, symbol-path parser
|
||||
with multi-dim subscripts, controller-side browse with system-symbol filtering,
|
||||
and a 30-case live integration suite against TCBSD + Hyper-V XAR. Twelve gaps
|
||||
remain rated Build=Yes in `docs/featuregaps.md` and they cluster cleanly into
|
||||
five themes:
|
||||
|
||||
1. **Data-type correctness** — `LInt`/`ULInt` silently truncated to Int32
|
||||
(explicit `// matches Int64 gap` comment in `TwinCATDataType.cs:40`),
|
||||
`TIME`/`DATE`/`DT`/`TOD` marshalled as raw `UDINT` rather than native UA
|
||||
types, `ENUM`/`ALIAS` skipped at browse, bit-indexed BOOL writes throw,
|
||||
multi-dim and whole-array reads not batched.
|
||||
2. **Performance** — every read is a `ReadValueAsync` call with re-resolved
|
||||
symbolic name; no Sum commands, no handle caching. Multi-thousand-tag
|
||||
scans pay symbol resolution + per-tag AMS round-trip cost on every cycle.
|
||||
3. **Operability** — `NotificationSettings(OnChange, cycleMs, 0)` clamps
|
||||
max-delay to zero with no per-tag override; probe loop only checks
|
||||
reachability — no cycle-time / jitter / `_AppInfo` / RT-state telemetry.
|
||||
4. **UDT decomposition** — `Structure` is declared in the enum but discovery
|
||||
skips non-atomic symbols (`AdsTwinCATClient.cs:224`); to expose nested UDT
|
||||
trees we need TMC-file parsing or runtime data-type table introspection.
|
||||
5. **Alarms** — no `IAlarmSource` implementation; TC3 EventLogger / AMS port
|
||||
110 events never surface as OPC UA AC events.
|
||||
|
||||
The plan ships as five phases / 12 PRs. Phases 1-3 are all narrow scope and can
|
||||
land in parallel where dependencies allow. Phase 4 (UDT/TMC) is the largest
|
||||
single piece of work and is called out as such. Phase 5 (alarms) requires
|
||||
investigation up front (Beckhoff TC3 EventLogger NuGet availability — see
|
||||
Open questions).
|
||||
|
||||
Hyper-V conflict gating: live integration runs against the TCBSD VM
|
||||
(`docs/drivers/TwinCAT-Test-Fixture.md`, AmsNetId `41.169.163.43.1.1` at
|
||||
`10.100.0.128`) since the local Hyper-V XAR can't co-exist with Docker
|
||||
Desktop. All wire-level tests gate on `[TwinCATFact]` / `[TwinCATTheory]`
|
||||
and skip cleanly when `TWINCAT_TARGET_NETID` is unset.
|
||||
|
||||
## Phased delivery
|
||||
|
||||
| Phase | Theme | PRs | Sequencing |
|---|---|---|---|
| 1 | Data-type correctness | 1.1 — 1.5 | Independent; ship in any order |
| 2 | Performance — Sum + handles | 2.1 — 2.3 | 2.3 depends on 2.2 |
| 3 | Operability — max-delay + diagnostics | 3.1 — 3.2 | Independent |
| 4 | UDT decomposition with TMC parsing | 4.1 | Stand-alone; significant scope |
| 5 | TC3 EventLogger alarms | 5.1 | Stand-alone; spike first |
|
||||
|
||||
Total: 12 PRs covering the 12 Build=Yes gaps.
|
||||
|
||||
Recommended landing order: **Phase 1 (correctness) → Phase 3 (operability) →
|
||||
Phase 2 (perf) → Phase 5 (alarms) → Phase 4 (UDT)**. Correctness first because
|
||||
it's cheap and removes fixtures' `Skip("Int64 gap")`-style workarounds.
|
||||
Operability before perf because the diagnostics surface created in 3.2 makes it
|
||||
much easier to validate Sum-command throughput claims in 2.1.
|
||||
|
||||
## Per-PR detail
|
||||
|
||||
### Phase 1 — Data-type correctness
|
||||
|
||||
#### PR 1.1 — Int64 fidelity for `LINT` / `ULINT`
|
||||
|
||||
**Scope**: Map `LInt`/`ULInt` to `DriverDataType.Int64` (currently truncates to
|
||||
Int32 per `TwinCATDataType.cs:40` comment "matches Int64 gap"). `MapToClrType`
|
||||
already returns `typeof(long)`/`typeof(ulong)`; the truncation is purely in the
|
||||
`ToDriverDataType` extension.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDataType.cs` — change line 40 to
|
||||
`=> DriverDataType.Int64;` (drop the gap comment).
|
||||
- Verify `DriverDataType.Int64` exists in `Core.Abstractions` — if not, add it
|
||||
(likely scope creep into `ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverDataType.cs`).
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**: none — the wire-level `AdsClient.ReadValueAsync`
|
||||
already returns `long`/`ulong` boxed in `result.Value` when called with
|
||||
`typeof(long)` per `MapToClrType`.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: extend `TwinCATCapabilityTests` — assert `LInt.ToDriverDataType() ==
|
||||
Int64`, `ULInt.ToDriverDataType() == Int64`.
|
||||
- Integration: extend `GVL_Primitives` with an `LINT` symbol (`vLargeCounter`)
  seeded with `16#1_0000_0000` (`0x1_0000_0000L`, above Int32 range). Add a
  `[TwinCATTheory]` case asserting the value round-trips without truncation.
  The existing 16-primitive theory in `TwinCAT3SmokeTests.cs` already covers
  `LInt`/`ULInt`, so first inspect what value it seeds; if that seed fits in
  Int32 it cannot detect truncation, and the assertion should be tightened
  against the new symbol.
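
Why the seed must sit above the Int32 range can be shown with a minimal sketch. `PlcType`, `DriverType`, and the extension method are hypothetical stand-ins for the driver's real enums, which may be shaped differently; the only point is that `0x1_0000_0000` collapses to 0 under a 32-bit cast, so a truncating mapping cannot pass this test by accident.

```csharp
using System;

// The first value that no longer fits in 32 bits.
long seed = 0x1_0000_0000L;

// What a 32-bit mapping would keep: the low 32 bits only.
int truncated = unchecked((int)seed);

Console.WriteLine($"seed={seed}, after 32-bit truncation={truncated}");
Console.WriteLine($"LInt  maps to {PlcType.LInt.ToDriverDataType()}");
Console.WriteLine($"ULInt maps to {PlcType.ULInt.ToDriverDataType()}");

// Hypothetical stand-ins (assumption): not the driver's actual enums.
enum PlcType { DInt, LInt, ULInt }
enum DriverType { Int32, Int64 }

static class PlcTypeExtensions
{
    // Mirrors the mapping this PR describes: both 64-bit PLC types map to Int64.
    public static DriverType ToDriverDataType(this PlcType type) => type switch
    {
        PlcType.LInt or PlcType.ULInt => DriverType.Int64,
        _ => DriverType.Int32,
    };
}
```

A seed such as 42 would survive a 32-bit cast unchanged, which is why a small seed in the existing theory would hide the regression.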
|
||||
|
||||
**Effort**: S (half day).
|
||||
**Deps**: none.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: `docs/Driver.TwinCAT.Cli.md` "Data types" table — drop the "marshal as
|
||||
`UDINT` on the wire" caveat for `LInt` / `ULInt` (this PR keeps Int64 fidelity);
|
||||
`docs/drivers/TwinCAT-Test-Fixture.md` "Bugs caught by live runs" gains a 4th
|
||||
entry pinning the truncation regression.
|
||||
- Fixture (TCBSD PLC project): `PLC/GVLs/GVL_Primitives.TcGVL` adds
|
||||
`vLargeCounter : LINT := 16#1_0000_0000` (above Int32 range) + matching
|
||||
`vLargeCounterU : ULINT`; `tests/.../TwinCatProject/README.md` "GVL_Primitives
|
||||
numeric seeds" enumerates the new symbols.
|
||||
- Integration tests: `TwinCAT3SmokeTests.cs` — extend the 16-case
|
||||
`[TwinCATTheory]` to 17/18 cases covering the new LINT/ULINT seeds; assert
|
||||
the value round-trips without truncation.
|
||||
- E2E: no change to `scripts/e2e/test-twincat.ps1` — the bridge script targets
|
||||
a single DINT counter, untouched by Int64 work.
|
||||
|
||||
#### PR 1.2 — TIME / DATE / DT / TOD as native UA types
|
||||
|
||||
**Scope**: Stop marshalling `TIME` / `DATE` / `DT` / `TOD` as raw `UDINT`
|
||||
(`AdsTwinCATClient.cs:278-280`). Map according to IEC 61131-3 semantics:
|
||||
|
||||
- `TIME` (ms duration) → `DriverDataType.Duration` (UA `Duration` is a `Double`
  in milliseconds; add `Duration` to `DriverDataType` if missing).
- `DATE` (seconds since 1970-01-01, at day resolution) → `DriverDataType.DateTime`
  (midnight UTC).
- `DT` (seconds since 1970-01-01) → `DriverDataType.DateTime`.
- `TOD` (ms since midnight) → `DriverDataType.DateTime` (today's date +
  offset) or a dedicated `TimeOfDay` type if the abstraction supports it.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDataType.cs` — update
|
||||
`ToDriverDataType` mapping for the four IEC time types.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs` — `MapToClrType`
|
||||
returns the raw UDINT today; keep that for the wire read but post-process
|
||||
inside `ReadValueAsync` / `ConvertForWrite` to convert UDINT ↔ `DateTime` /
|
||||
`TimeSpan`. Symmetrical change in `OnAdsNotificationEx` so subscriptions see
|
||||
the same shape.
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**: still `AdsClient.ReadValueAsync(symbol,
|
||||
typeof(uint), ct)`. Beckhoff exposes `PlcOpenDate` / `PlcOpenTimeOfDay` etc.
|
||||
in `TwinCAT.Ads.TypeSystem` — using those types directly would simplify
|
||||
conversion but tightens our coupling. Investigate during PR.
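
The post-processing step can be sketched without touching the Beckhoff types at all. The `IecTime` helper below is hypothetical (not an existing class in the driver) and covers only `DT`, `TIME`, and `TOD`; it assumes the raw value arrives as a `uint` as described above and uses integer tick arithmetic to avoid floating-point rounding.

```csharp
using System;

// Raw UDINT for DT#2026-01-15-12:00:00, computed rather than hard-coded.
uint rawDt = (uint)new DateTimeOffset(2026, 1, 15, 12, 0, 0, TimeSpan.Zero).ToUnixTimeSeconds();

Console.WriteLine(IecTime.FromDt(rawDt).ToString("o"));      // DT   -> UTC DateTime
Console.WriteLine(IecTime.FromTime(1500));                   // TIME -> TimeSpan (T#1500ms)
Console.WriteLine(IecTime.FromTod(23_400_000));              // TOD  -> TimeSpan since midnight
Console.WriteLine(IecTime.ToDt(IecTime.FromDt(rawDt)) == rawDt);

// Hypothetical helper (assumption): names and shape are illustrative only.
static class IecTime
{
    // DT: seconds since 1970-01-01 UTC.
    public static DateTime FromDt(uint seconds) =>
        DateTime.UnixEpoch.AddTicks(seconds * TimeSpan.TicksPerSecond);

    // Inverse of FromDt; throws if the instant does not fit in a UDINT.
    public static uint ToDt(DateTime utc) =>
        checked((uint)new DateTimeOffset(DateTime.SpecifyKind(utc, DateTimeKind.Utc))
            .ToUnixTimeSeconds());

    // TIME: duration in milliseconds.
    public static TimeSpan FromTime(uint milliseconds) =>
        TimeSpan.FromTicks(milliseconds * TimeSpan.TicksPerMillisecond);

    // TOD: milliseconds since midnight, surfaced as an offset into the day.
    public static TimeSpan FromTod(uint milliseconds) =>
        TimeSpan.FromTicks(milliseconds * TimeSpan.TicksPerMillisecond);
}
```

Keeping the conversion in one helper means the read path, the write path, and the notification callback can share it, which is what keeps subscriptions and polled reads in the same shape.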
|
||||
|
||||
**Test plan**:
|
||||
- Unit: round-trip helpers UDINT-since-epoch ↔ `DateTime` for each variant.
|
||||
- Integration: add `GVL_Primitives.dCurrentTime : DT` seeded with a known
|
||||
literal (e.g. `DT#2026-01-15-12:00:00`); assert the driver returns a
|
||||
`DateTime` matching that instant within 1 s.
|
||||
|
||||
**Effort**: M (1-2 days).
|
||||
**Deps**: none. May expose missing `Duration` in `DriverDataType` enum.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: `docs/Driver.TwinCAT.Cli.md` "Data types" section — replace the
|
||||
"marshal as `UDINT` on the wire — CLI takes a numeric raw value" paragraph
|
||||
with native syntax (e.g. `read -t DateTime` returns ISO-8601, `write -t Time
|
||||
-v 00:00:01.500` for IEC TIME duration). New examples for each of the four
|
||||
IEC time types under `read` / `write`.
|
||||
- Fixture (TCBSD PLC project): `PLC/GVLs/GVL_Primitives.TcGVL` adds
|
||||
`dCurrentTime : DT := DT#2026-01-15-12:00:00`, `tCycleDuration : TIME :=
|
||||
T#1500ms`, `dToday : DATE := DATE#2026-04-25`, `tShiftStart : TOD :=
|
||||
TOD#06:30:00`. Existing primitives theory in
|
||||
`tests/.../TwinCatProject/README.md` § "Type coverage" gets the seed values
|
||||
documented.
|
||||
- Integration tests: `TwinCAT3SmokeTests.cs` — new
|
||||
`Driver_round_trips_TIME_DATE_DT_TOD_as_native_UA_types` `[TwinCATFact]`
|
||||
reading each variable and asserting the CLR shape (`TimeSpan` / `DateTime`).
|
||||
Update the existing 16-case primitive `[TwinCATTheory]` to assert native
|
||||
types instead of raw `UDINT` for these four entries.
|
||||
- E2E: `scripts/e2e/test-twincat.ps1` unchanged for now (single DINT bridge);
|
||||
follow-up could add a DT-typed bridge node but it's not on the critical path.
|
||||
|
||||
#### PR 1.3 — Bit-indexed BOOL writes (read-modify-write)
|
||||
|
||||
**Scope**: Replace the `NotSupportedException` at `AdsTwinCATClient.cs:99-100`
with a read-modify-write sequence: read the parent word as `uint`, set or clear
the bit, and write the word back. Writes to the same parent word must be
serialized so that two concurrent bit writes cannot overwrite each other's
change; one `SemaphoreSlim` per parent symbol path is sufficient (concurrency
on bit writes within the same parent is rare and the PLC cycle is the natural
lower bound on contention anyway).
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs` — replace `throw`
|
||||
branch in `WriteValueAsync` with RMW logic mirroring `ReadValueAsync`'s
|
||||
bit-index path. Add `ConcurrentDictionary<string, SemaphoreSlim>
|
||||
_bitWriteLocks` keyed on parent symbol.
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**: `AdsClient.ReadValueAsync(parent, typeof(uint))`
|
||||
+ `AdsClient.WriteValueAsync(parent, modifiedWord)`. Both already used.
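
A minimal sketch of the read-modify-write path, assuming the parent word is read and written through two delegates. `BitWriter` and its constructor parameters are hypothetical; in the driver the delegates would wrap the existing client calls, and here an in-memory dictionary stands in for the PLC.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// In-memory stand-in for the PLC, seeded like GVL_Primitives.vWord.
var plc = new ConcurrentDictionary<string, uint> { ["GVL_Primitives.vWord"] = 0xBEEF };

var writer = new BitWriter(
    read: path => Task.FromResult(plc[path]),
    write: (path, word) => { plc[path] = word; return Task.CompletedTask; });

await writer.WriteBitAsync("GVL_Primitives.vWord", 4, true);   // bit 4 is clear in 0xBEEF
await writer.WriteBitAsync("GVL_Primitives.vWord", 7, false);  // bit 7 is set in 0xBEEF
Console.WriteLine($"0x{plc["GVL_Primitives.vWord"]:X4}");      // prints 0xBE7F

// Hypothetical helper (assumption): not the driver's actual class.
sealed class BitWriter
{
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _bitWriteLocks = new();
    private readonly Func<string, Task<uint>> _read;
    private readonly Func<string, uint, Task> _write;

    public BitWriter(Func<string, Task<uint>> read, Func<string, uint, Task> write)
    {
        _read = read;
        _write = write;
    }

    public async Task WriteBitAsync(string parent, int bit, bool value)
    {
        if (bit is < 0 or > 31) throw new ArgumentOutOfRangeException(nameof(bit));

        // One gate per parent word, so unrelated parents never block each other.
        var gate = _bitWriteLocks.GetOrAdd(parent, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync();
        try
        {
            uint word = await _read(parent);
            uint mask = 1u << bit;
            word = value ? word | mask : word & ~mask;
            await _write(parent, word);
        }
        finally
        {
            gate.Release();
        }
    }
}
```

The gate only serializes writers inside this process. A PLC program that modifies the same word between the read and the write can still lose its change, which is worth stating in the CLI note.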
|
||||
|
||||
**Test plan**:
|
||||
- Unit: extend `TwinCATReadWriteTests` with a `FakeTwinCATClient` test
|
||||
covering set + clear of bits 0, 7, 15, 31 of a `uint` parent.
|
||||
- Integration: add a new `[TwinCATFact]` —
|
||||
`Driver_round_trips_bit_indexed_BOOL_write_and_read` against
|
||||
`GVL_Primitives.vWord.4` (the `0xBEEF` word's bit-4); flip to true, read
|
||||
back as true, flip to false, read back as false.
|
||||
|
||||
**Effort**: S-M (1 day).
|
||||
**Deps**: none. Closes task #181 referenced in the existing `NotSupported`
|
||||
exception message.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: `docs/Driver.TwinCAT.Cli.md` `write` section — add an example
|
||||
`otopcua-twincat-cli write -n ... -s "GVL_Primitives.vWord.4" -t Bool -v
|
||||
true` and a note explaining the RMW semantics + concurrency caveat (parent
|
||||
word is locked per write — concurrent bit writes on the same word
|
||||
serialize). `docs/drivers/TwinCAT-Test-Fixture.md` "Bugs caught by live
|
||||
runs" updates entry #3 to note that writes now also work (read previously
|
||||
shipped; write was the gap).
|
||||
- Fixture (TCBSD PLC project): no schema change required —
|
||||
`GVL_Primitives.vWord` already exists with seed `0xBEEF`. Tests use bits 4
|
||||
(clear) and 7 (set) to round-trip.
|
||||
- Integration tests: `TwinCAT3SmokeTests.cs` — new
|
||||
`Driver_round_trips_bit_indexed_BOOL_write_and_read` `[TwinCATFact]`. Unit
|
||||
tests in `TwinCATReadWriteTests` extended via `FakeTwinCATClient` for bits
|
||||
0/7/15/31 of a `uint` parent.
|
||||
- E2E: no change.
|
||||
|
||||
#### PR 1.4 — Multi-dim and whole-array reads
|
||||
|
||||
**Scope**: Expand `ReadValueAsync` / `WriteValueAsync` to handle whole-array
|
||||
reads via Beckhoff's array marshalling, instead of element-by-element. The
|
||||
symbol-path parser already produces `TwinCATSymbolSegment.Subscripts` with N
|
||||
dims; today the driver only reads single elements (one path per request).
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs` — when a tag
|
||||
declares `IsArray=true` (extend `TwinCATTagDefinition`), use
|
||||
`AdsClient.ReadValueAsync(symbol, typeof(int[]))` / `typeof(double[,])` etc.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs` — surface
|
||||
`IsArray` + `ArrayDim` through `DriverAttributeInfo` in `DiscoverAsync`.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATTagDefinition.cs` (if it exists;
  otherwise `TwinCATDriverOptions.cs`) — add `bool IsArray`, `int[]? ArrayDimensions`.
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**: `AdsClient.ReadValueAsync(symbol, Type, ct)`
|
||||
accepts CLR array types. For dynamically-sized reads use
|
||||
`AdsClient.ReadAnyAsync<T[]>(...)` or pass `Array.CreateInstance(elemType,
|
||||
dims)`. SymbolLoader yields a `Symbol.Category == DataTypeCategory.Array` we
|
||||
can inspect to autoderive dimensions during discovery.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: parse `Matrix[1,2]` and verify that the rank and dimensions flow into
  the request shape via `FakeTwinCATClient`.
|
||||
- Integration: extend `GVL_Arrays` with a 5x5 `aReal2D : ARRAY [1..5, 1..5]
|
||||
OF REAL`; new `[TwinCATFact]` reads the whole array in one call and
|
||||
verifies element count + values.
|
||||
|
||||
**Effort**: M (2-3 days).
|
||||
**Deps**: none. Sets up the array-shape plumbing the rest of the driver
|
||||
needs anyway.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: `docs/Driver.TwinCAT.Cli.md` `read` section — add whole-array example
|
||||
(`read -s "GVL_Arrays.aReal2D"` returns the full matrix as JSON) plus a
|
||||
dedicated "Arrays" sub-section calling out 1-D / N-D / array-of-struct
|
||||
semantics. `docs/drivers/TwinCAT-Test-Fixture.md` "What it actually covers"
|
||||
list adds the whole-array bullet.
|
||||
- Fixture (TCBSD PLC project): `PLC/GVLs/GVL_Arrays.TcGVL` already declares
|
||||
`ARRAY[1..4,1..4] OF REAL` per `TwinCatProject/README.md` § "Array
|
||||
coverage". This PR adds a 5x5 `aReal2D : ARRAY [1..5, 1..5] OF REAL`
|
||||
initialised with a deterministic pattern (e.g. `(i-1)*5 + (j-1)`) so the
|
||||
whole-array test can assert each element. README "Array coverage" gets the
|
||||
new symbol.
|
||||
- Integration tests: `TwinCAT3SmokeTests.cs` — new
|
||||
`Driver_reads_whole_2D_array_in_one_call` `[TwinCATFact]`. Unit tests
|
||||
extend `TwinCATSymbolPathTests` for multi-dim subscript shape.
|
||||
- E2E: no change to `scripts/e2e/test-twincat.ps1` (scalar bridge); a future
|
||||
array-bridge scenario is captured in the consolidated section below.
|
||||
|
||||
#### PR 1.5 — ENUM and ALIAS at discovery
|
||||
|
||||
**Scope**: `MapSymbolTypeName` returns `null` for any non-atomic type
|
||||
(`AdsTwinCATClient.cs:224`), so ENUM and ALIAS symbols are silently dropped
|
||||
during browse. ENUM is essentially a sized-integer with named members; ALIAS
|
||||
is a renamed atomic. Both are extremely common in real projects (motor states,
|
||||
recipe-step IDs, bit-flag groups).
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs` —
|
||||
`MapSymbolTypeName` keyed only on the type name today; switch to inspecting
|
||||
`symbol.DataType` + `symbol.Category` from `TwinCAT.TypeSystem`. For
|
||||
`DataTypeCategory.Enum` walk `EnumType.EnumValues` and pick the underlying
|
||||
base type. For `DataTypeCategory.Alias` resolve `AliasType.BaseType`
|
||||
recursively until atomic.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/BrowseSymbolsAsync` —
|
||||
surface enum members so the OPC UA layer can later emit them as
|
||||
EnumStrings.
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**: `TwinCAT.Ads.TypeSystem.SymbolLoaderFactory`
|
||||
already returns full `IDataType` objects with `Category`, `EnumType`,
|
||||
`AliasType`, etc. No new APIs.
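
The resolution rule (follow an alias chain until an atomic type is reached, and treat an enum as its underlying integer) can be sketched with a toy type model. `PlcTypeInfo`, `Category`, and `TypeResolver` are hypothetical and deliberately do not mirror Beckhoff's `IDataType` hierarchy; the depth cap is an added safeguard, not something the source specifies.

```csharp
using System;

var lreal = new PlcTypeInfo("LREAL", Category.Primitive, null);
var temperature = new PlcTypeInfo("T_Temperature", Category.Alias, lreal);
var nestedAlias = new PlcTypeInfo("T_Nested", Category.Alias, temperature);
var severity = new PlcTypeInfo("E_Severity", Category.Enum,
    new PlcTypeInfo("INT", Category.Primitive, null));
var motor = new PlcTypeInfo("ST_Motor", Category.Struct, null);

Console.WriteLine(TypeResolver.ResolveAtomic(nestedAlias)?.Name ?? "(skipped)");
Console.WriteLine(TypeResolver.ResolveAtomic(severity)?.Name ?? "(skipped)");
Console.WriteLine(TypeResolver.ResolveAtomic(motor)?.Name ?? "(skipped)");

// Hypothetical type model (assumption): illustrative only.
enum Category { Primitive, Alias, Enum, Struct }

sealed record PlcTypeInfo(string Name, Category Category, PlcTypeInfo? BaseType);

static class TypeResolver
{
    // Returns the atomic type behind an alias or enum, or null when the symbol
    // is not reducible to an atomic (structs stay out of scope here).
    public static PlcTypeInfo? ResolveAtomic(PlcTypeInfo type, int maxDepth = 16)
    {
        PlcTypeInfo? current = type;
        for (int depth = 0; depth < maxDepth && current is not null; depth++)
        {
            switch (current.Category)
            {
                case Category.Primitive:
                    return current;
                case Category.Alias:
                case Category.Enum:
                    current = current.BaseType;   // step one level down the chain
                    break;
                default:
                    return null;
            }
        }
        return null;   // broken chain or depth cap reached
    }
}
```

Returning `null` for anything that does not reduce to an atomic keeps today's behavior for structures, so the change stays limited to ENUM and ALIAS.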
|
||||
|
||||
**Test plan**:
|
||||
- Unit: extend `TwinCATSymbolBrowserTests` — fake an enum symbol via
|
||||
`FakeTwinCATClient`; assert it browses with the underlying base type.
|
||||
- Integration: add `E_LineState : (Idle, Running, Faulted)` + a GVL instance
|
||||
variable; new `[TwinCATFact]` browses + reads it as `Int16` (or whatever
|
||||
the underlying type is).
|
||||
|
||||
**Effort**: M (1-2 days).
|
||||
**Deps**: none. POINTER / REFERENCE / INTERFACE / UNION are explicitly
|
||||
out-of-scope for this PR — they need real-world demand and a much larger
|
||||
type-system rework. ENUM and ALIAS are the 80% case.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: `docs/Driver.TwinCAT.Cli.md` `browse` section — note that ENUM and
  ALIAS symbols now appear in the output (previously dropped); add Data
  types rows for "Enum (surfaced as underlying integer with EnumStrings)"
  and "Alias (resolved to base atomic)".
  `docs/drivers/TwinCAT-Test-Fixture.md` "What it actually covers" extends
  with the enum/alias bullet.
- Fixture (TCBSD PLC project): `PLC/DUTs/E_AxisState.TcDUT` and
  `E_Severity.TcDUT` already exist; `PLC/DUTs/T_Temperature.TcDUT` and
  `T_MeterPerSec.TcDUT` already exist. `PLC/GVLs/GVL_Enums.TcGVL` already
  exposes them at the root per `TwinCatProject/README.md` § "Enum + alias
  coverage" — no fixture change needed for this PR. README's
  "Integration-test contract" gets a new entry for
  `GVL_Enums.currentSeverity` / `currentTemperature` so the new browse
  assertion has a stable target.
|
||||
- Integration tests: `TwinCAT3SmokeTests.cs` — new
|
||||
`Driver_browses_enums_and_aliases_with_resolved_base_types` `[TwinCATFact]`
|
||||
asserting the four `GVL_Enums` symbols surface with the correct underlying
|
||||
CLR type (`Int32` for E_AxisState, `Int16` for E_Severity, `Double` for
|
||||
the LREAL aliases).
|
||||
- E2E: no change.
|
||||
|
||||
### Phase 2 — Performance (Sum commands + handle caching)
|
||||
|
||||
#### PR 2.1 — ADS Sum-read / Sum-write
|
||||
|
||||
**Scope**: Today `ReadAsync` loops over `fullReferences` issuing one
|
||||
`ReadValueAsync` per tag (`TwinCATDriver.cs:118-156`). Beckhoff's ADS Sum
|
||||
commands (`IndexGroup=0xF080..0xF084`) batch N reads/writes into a single AMS
|
||||
request. `Beckhoff.TwinCAT.Ads` v6 exposes this via
|
||||
`AdsClient.ReadWriteAsync` with `SumCommand` request envelopes —
|
||||
specifically `SumSymbolRead` / `SumSymbolWrite` from
|
||||
`TwinCAT.Ads.SumCommand`. ~10x throughput on multi-thousand-tag scans
|
||||
according to Beckhoff InfoSys.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs` — new
|
||||
`ReadValuesAsync(IReadOnlyList<(string symbol, Type clrType)>, ct)` returning
|
||||
a parallel array of `(value, status)`.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs::ReadAsync` — bucket
|
||||
`fullReferences` by `DeviceHostAddress`, call the new client method per
|
||||
bucket. `bitIndex` handling stays per-tag (RMW post-step).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/ITwinCATClient.cs` — add the
|
||||
bulk-read / bulk-write surface.
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**:
|
||||
- `AdsClient.ReadWriteAsync(IndexGroup=0xF080, IndexOffset=count, ...)` for
|
||||
raw sum-read by handle.
|
||||
- Higher-level: `TwinCAT.Ads.SumCommand.SumSymbolRead(client, symbols)` /
|
||||
`SumSymbolWrite(client, symbols, values)` in v6. Verify the exact namespace
|
||||
during PR — Beckhoff sometimes re-shuffles between minor versions.
|
||||
- For symbolic (no handle) batching: `SumSymbolReadByName`.
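
Independently of which Beckhoff entry point ends up being used, the driver-side bookkeeping (bucket by device, one bulk call per bucket, results returned in the caller's order) can be sketched as follows. `TagRef`, `ReadResult`, `BulkReader`, and the `bulkRead` delegate are hypothetical; the delegate stands in for whatever the new client method turns out to be.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

var refs = new List<TagRef>
{
    new("plcA", "GVL.a"), new("plcB", "GVL.x"), new("plcA", "GVL.b"), new("plcA", "GVL.missing"),
};

// Fake bulk surface: one call per device, one result per requested symbol.
Task<IReadOnlyList<ReadResult>> FakeBulkRead(string device, IReadOnlyList<string> symbols)
{
    IReadOnlyList<ReadResult> batch = symbols
        .Select(s => s.EndsWith("missing")
            ? new ReadResult(null, false)
            : new ReadResult($"{device}:{s}", true))
        .ToList();
    return Task.FromResult(batch);
}

ReadResult[] results = await BulkReader.ReadAsync(refs, FakeBulkRead);
for (int i = 0; i < refs.Count; i++)
    Console.WriteLine($"{refs[i].Device}/{refs[i].Symbol}: good={results[i].Good}");

// Hypothetical shapes (assumption): not the driver's actual types.
record TagRef(string Device, string Symbol);
record ReadResult(object? Value, bool Good);

static class BulkReader
{
    public static async Task<ReadResult[]> ReadAsync(
        IReadOnlyList<TagRef> refs,
        Func<string, IReadOnlyList<string>, Task<IReadOnlyList<ReadResult>>> bulkRead)
    {
        var results = new ReadResult[refs.Count];

        // Remember each tag's original position before grouping by device.
        var buckets = refs.Select((tag, index) => (tag, index)).GroupBy(x => x.tag.Device);

        foreach (var bucket in buckets)
        {
            var items = bucket.ToList();
            var symbols = items.Select(x => x.tag.Symbol).ToList();
            var values = await bulkRead(bucket.Key, symbols);

            // Scatter the batch back into the caller's order; a bad entry stays
            // a per-tag status instead of failing the whole batch.
            for (int i = 0; i < items.Count; i++)
                results[items[i].index] = values[i];
        }
        return results;
    }
}
```

Carrying the original index through the grouping is what makes ordering preservation and partial-failure mapping straightforward to unit test against the fake.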
|
||||
|
||||
**Test plan**:
|
||||
- Unit: `FakeTwinCATClient.ReadValuesAsync` fakes the bulk surface; test
|
||||
ordering preservation, partial-failure mapping, empty-input handling.
|
||||
- Integration: `[TwinCATFact]` reads 100 declared tags in one call, asserts
|
||||
value parity with 100 single-call equivalents and measures wall-clock
|
||||
difference (assert under 50% of the loop baseline).
|
||||
|
||||
**Effort**: M-L (3 days).
|
||||
**Deps**: none (handle caching in 2.2 amplifies the win but isn't required).
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: `docs/v3/twincat-backlog.md` perf note moves out (Sum-commands no
|
||||
longer deferred) — add a closed-out bullet pointing at this PR. New
|
||||
performance section in `docs/drivers/TwinCAT-Test-Fixture.md` documenting
|
||||
the throughput baseline + Sum-command delta. `docs/Driver.TwinCAT.Cli.md`
|
||||
doesn't expose Sum directly to the user — the CLI still drives one symbol
|
||||
per call — so no CLI doc change.
|
||||
- Fixture (TCBSD PLC project, primary fixture-extension surface): add a new
|
||||
`PLC/GVLs/GVL_Perf.TcGVL` declaring `aTags : ARRAY[1..1000] OF DINT` plus
|
||||
a `MAIN` rung (or new `FB_PerfChurn` POU) that increments each element on
|
||||
a rotating subset. `TwinCatProject/README.md` § "Required project state"
|
||||
gains a "Performance scenarios" subsection documenting the 1000-tag GVL.
|
||||
- Integration tests: new perf test
|
||||
`Driver_sum_read_1000_tags_beats_loop_baseline_by_5x` (`[TwinCATFact]`,
|
||||
perf-tier — guarded behind a separate `TWINCAT_PERF=1` env flag so CI
|
||||
noise from VM jitter doesn't flap the suite). Unit tests cover ordering,
|
||||
partial-failure mapping, empty-input via `FakeTwinCATClient.ReadValuesAsync`.
|
||||
- E2E: `scripts/e2e/test-twincat.ps1` unchanged for the canonical bridge;
|
||||
perf scripts live alongside as a separate `scripts/perf/twincat-sum.ps1`
|
||||
if/when introduced (deferred — integration test is sufficient).
|
||||
|
||||
#### PR 2.2 — Handle-based access with caching
|
||||
|
||||
**Scope**: Cache `AdsClient.CreateVariableHandleAsync` results so per-read
|
||||
overhead drops from "resolve symbolic name + read by name" to "read by handle"
|
||||
— smaller AMS payloads, no name resolution on each call. Cache lifetime is
|
||||
process-scoped; eviction is via the PR 2.3 invalidation listener. Until 2.3
|
||||
ships the cache must be cleared on `AdsClient` reconnect (the existing
|
||||
auto-reconnect path in `EnsureConnectedAsync`).
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs` — add
|
||||
`ConcurrentDictionary<string, uint> _handleCache`. Wrap reads/writes through
|
||||
`EnsureHandleAsync(symbolPath)` that hits the cache or calls
|
||||
`CreateVariableHandleAsync`. On `AdsErrorCode.DeviceSymbolVersionInvalid`
(0x711 / 1809) evict the entry and retry once.
|
||||
- Dispose path: `DeleteVariableHandleAsync` for every cached handle on
|
||||
`AdsClient.Dispose` to be a good citizen with the runtime.
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**:
|
||||
- `AdsClient.CreateVariableHandleAsync(string symbol, ct)` → returns
|
||||
`ResultHandle` with `.Handle` (uint).
|
||||
- `AdsClient.ReadAnyAsync<T>(IndexGroup=0xF005, IndexOffset=handle, ct)`
|
||||
reads by handle.
|
||||
- `AdsClient.WriteAnyAsync(IndexGroup=0xF005, IndexOffset=handle, value, ct)`.
|
||||
- `AdsClient.DeleteVariableHandleAsync(uint handle, ct)`.
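
The cache-then-evict-and-retry-once behavior can be sketched against two delegates rather than the real client. `HandleCache` and `StaleHandleException` are hypothetical; in the driver the exception filter would presumably test the ADS error code for an invalid symbol version instead of a custom exception type.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

int creates = 0;
bool failOnce = true;

var cache = new HandleCache(
    createHandle: symbol => { creates++; return Task.FromResult((uint)creates); },
    readByHandle: handle =>
    {
        // Simulate an online change: the first handle is stale exactly once.
        if (handle == 1 && failOnce) { failOnce = false; throw new StaleHandleException(); }
        return Task.FromResult<object>(42);
    });

Console.WriteLine(await cache.ReadAsync("GVL.a"));   // stale handle, evict, retry
Console.WriteLine(await cache.ReadAsync("GVL.a"));   // served from the cached handle
Console.WriteLine($"handle creates: {creates}");

// Hypothetical types (assumption): not the driver's actual classes.
sealed class StaleHandleException : Exception { }

sealed class HandleCache
{
    private readonly ConcurrentDictionary<string, uint> _handles = new();
    private readonly Func<string, Task<uint>> _createHandle;
    private readonly Func<uint, Task<object>> _readByHandle;

    public HandleCache(Func<string, Task<uint>> createHandle, Func<uint, Task<object>> readByHandle)
    {
        _createHandle = createHandle;
        _readByHandle = readByHandle;
    }

    public async Task<object> ReadAsync(string symbol)
    {
        for (int attempt = 0; ; attempt++)
        {
            uint handle = await EnsureHandleAsync(symbol);
            try
            {
                return await _readByHandle(handle);
            }
            catch (StaleHandleException) when (attempt == 0)
            {
                _handles.TryRemove(symbol, out _);   // evict, then retry exactly once
            }
        }
    }

    // Wipes every cached handle, e.g. on reconnect.
    public void Clear() => _handles.Clear();

    private async Task<uint> EnsureHandleAsync(string symbol)
    {
        if (_handles.TryGetValue(symbol, out uint cached)) return cached;
        uint created = await _createHandle(symbol);
        return _handles.GetOrAdd(symbol, created);
    }
}
```

The sketch omits releasing evicted handles on the runtime side; a real implementation would also delete them, as the dispose-path bullet above describes.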
|
||||
|
||||
**Test plan**:
|
||||
- Unit: `FakeTwinCATClient` records handle-create / read-by-handle calls;
|
||||
test asserts second read of same symbol uses cached handle (zero new
|
||||
creates).
|
||||
- Integration: subscribe + read 50 tags, capture AMS round-trips via probe
|
||||
counter, assert the second pass uses ~50% of the bytes (handle = 4 bytes
|
||||
vs symbol path = N bytes).
|
||||
|
||||
**Effort**: M (2 days).
|
||||
**Deps**: combines with PR 2.1 for sum-read-by-handle (highest perf path).
|
||||
Without 2.3, handles can go stale after an online change — call out the
|
||||
caveat in driver options and add a manual `FlushOptionalCachesAsync` invocation
|
||||
that wipes the handle cache.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: `docs/drivers/TwinCAT-Test-Fixture.md` perf section gets a paragraph
|
||||
noting that handles drop AMS payload size for repeated reads (4 bytes vs.
|
||||
N-byte symbol path); call out the staleness caveat (online-change
|
||||
invalidation lands in 2.3). `docs/Driver.TwinCAT.Cli.md` adds a brief note
|
||||
in the `subscribe` / `read` sections that handles are cached transparently
|
||||
— no user-visible flag.
|
||||
- Fixture (TCBSD PLC project): no change required — handle caching is
|
||||
observable via byte-counter on the wire, not via PLC-side state. The
|
||||
perf-scenario `GVL_Perf.aTags` from PR 2.1 doubles as the exercise target.
|
||||
- Integration tests: new
|
||||
`Driver_handle_cache_avoids_repeat_symbol_resolution` `[TwinCATFact]`
|
||||
reads the same 50 symbols twice; asserts second pass uses cached handles
|
||||
(probed via diagnostics counters from PR 3.2 if shipped, otherwise via a
|
||||
test-only hook on `AdsTwinCATClient`). Unit tests on
|
||||
`FakeTwinCATClient.HandleCacheTests` assert second read of same symbol
|
||||
triggers zero new handle creates.
|
||||
- E2E: no change.
|
||||
|
||||
#### PR 2.3 — Symbol-version invalidation listener
|
||||
|
||||
**Scope**: TwinCAT exposes a symbol-table version counter on ADS index group
`ADSIGRP_SYM_VERSION` (0xF008 is the working hypothesis; see open question (c)
below). Whether version bumps surface as an on-change notification on that
group, via `SystemServiceLoadFile`-style notifications, or only through
`SymbolVersion` reads is part of the same open question. When the
PLC takes an online change, all cached handles are silently invalidated; the
next read returns `DeviceSymbolVersionInvalid` if you're lucky and a wrong
value if you're not. We register a notification on the symbol-version index
group and wipe the handle cache on bump.
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs` — on connect,
|
||||
call `AddDeviceNotificationAsync(ADSIGRP_SYM_VERSION, 0, length=1, ...)`
|
||||
with `AdsTransMode.OnChange`. On callback, clear `_handleCache` + log.
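The callback logic can be kept independent of how the version byte arrives, which also covers the polling fallback mentioned under open question (c). The following is a minimal sketch, not the driver's implementation: `SymbolVersionWatcherSketch` and its members are hypothetical names, and treating the first sample as a baseline (rather than as a change) is an assumption.

```csharp
using System;

// Sketch only: tracks the last-seen symbol-version byte and invokes an
// invalidation callback when it changes. All names here are hypothetical.
public sealed class SymbolVersionWatcherSketch
{
    private readonly Action _invalidate;
    private readonly object _gate = new();
    private byte? _lastVersion;

    public SymbolVersionWatcherSketch(Action invalidate) => _invalidate = invalidate;

    public int InvalidationCount { get; private set; }

    // Feed every sample here, whether it comes from a notification callback
    // or from a probe-loop poll. Returns true when the cache was invalidated.
    public bool OnVersionSample(byte version)
    {
        lock (_gate)
        {
            // Compare with != rather than >, so a one-byte counter wrapping
            // from 255 to 0 still counts as a change.
            bool changed = _lastVersion.HasValue && _lastVersion.Value != version;
            _lastVersion = version;
            if (!changed) return false; // first sample only sets the baseline
            InvalidationCount++;
        }
        _invalidate(); // e.g. clear the handle cache and log
        return true;
    }

    // Forget the baseline, e.g. after a reconnect.
    public void Reset()
    {
        lock (_gate) _lastVersion = null;
    }
}
```

A fake client's `FireSymbolVersionChange()` could simply call `OnVersionSample` with a new value, which keeps the unit test free of any ADS dependency.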
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**:
|
||||
- `AdsClient.AddDeviceNotificationAsync(uint indexGroup, uint indexOffset,
|
||||
int length, NotificationSettings, object userData, ct)` — the raw,
|
||||
index-group-based variant (not the symbol-name `Ex` variant we use today).
|
||||
- Index group: `AdsReservedIndexGroup.SymbolVersion` (0xF008). One byte
|
||||
payload that's the current symbol-version counter. Confirm during PR — open
|
||||
question (c) below.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: extend `TwinCATNativeNotificationTests` — `FakeTwinCATClient` exposes
|
||||
a `FireSymbolVersionChange()` method; test asserts handle cache is cleared
|
||||
and subsequent reads recreate handles.
|
||||
- Integration: `[TwinCATFact]` triggers an online change on the TCBSD project
|
||||
(rebuild a GVL with one new variable + login activate) — needs a project
|
||||
helper that automates the online-change. May ship behind a manual gate
|
||||
(`[TwinCATFact(Reason="requires-manual-online-change")]`) initially.
|
||||
|
||||
**Effort**: M (2 days).
|
||||
**Deps**: PR 2.2 (no point invalidating an empty cache). Confirm
|
||||
`SymbolVersion` index-group constant in `Beckhoff.TwinCAT.Ads` v6 — open
|
||||
question (c) below.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: `docs/drivers/TwinCAT-Test-Fixture.md` section on "What it does NOT
|
||||
cover" — drop the implicit "online-change handling" gap. New paragraph in
|
||||
the perf section noting handle cache is now self-invalidating.
|
||||
`docs/Driver.TwinCAT.Cli.md` no change (transparent to CLI user).
|
||||
- Fixture (TCBSD PLC project): no schema change. Operator workflow gains an
|
||||
online-change drill — `TwinCatProject/README.md` adds a § "Online-change
|
||||
test scenario" describing the steps (open project, add a dummy variable
|
||||
to `GVL_Perf`, "Login + Activate" → triggers the symbol-version bump).
|
||||
This is the manual gate for the integration assertion.
|
||||
- Integration tests: new `Driver_invalidates_handle_cache_on_symbol_version_bump`
|
||||
`[TwinCATFact]` — initially gated `[TwinCATFact(Reason="requires-manual-online-change")]`
|
||||
until automation lands. Unit tests cover the callback path via
|
||||
`FakeTwinCATClient.FireSymbolVersionChange()`.
|
||||
- E2E: no change.
|
||||
|
||||
### Phase 3 — Operability
|
||||
|
||||
#### PR 3.1 — Per-tag MaxDelay tuning
|
||||
|
||||
**Scope**: Today `NotificationSettings` is hard-coded as `(OnChange, cycleMs,
|
||||
0)` (`AdsTwinCATClient.cs:144-145`). MaxDelay=0 means "fire as soon as the
|
||||
change is detected, no coalescing"; for bursty high-frequency signals this
|
||||
floods the OPC UA subscription queue. Surface MaxDelay as a per-tag option
|
||||
(default 0 to preserve current behavior).
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriverOptions.cs` — add
|
||||
`int? MaxDelayMs` to `TwinCATTagDefinition`.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs::SubscribeAsync` —
|
||||
pass through to client.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs::AddNotificationAsync`
|
||||
— accept `int maxDelayMs`, plumb into `NotificationSettings(...,
|
||||
cycleMs, maxDelayMs)`.
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**: `NotificationSettings(AdsTransMode mode, int
cycleTime, int maxDelay)` — `cycleTime` and `maxDelay` are both in
milliseconds per Beckhoff InfoSys `tcadsnetref/7313319051`.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: extend `TwinCATNativeNotificationTests` — assert the plumbed
|
||||
`maxDelayMs` lands on `NotificationSettings`.
|
||||
- Integration: subscribe to `GVL_Fixture.nCounter` with `MaxDelayMs=500`;
|
||||
assert delivery rate is ≤ 2 Hz even when PLC cycle is 10 ms.
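The numbers in the integration assertion can be checked with a small worked bound. This is a sketch under an idealized model, assumed here for illustration: deliveries are spaced at least `max(changePeriodMs, maxDelayMs)` apart, and the observation window is closed at both ends. `CoalescingBoundSketch` is a hypothetical helper, not part of the driver.

```csharp
using System;

public static class CoalescingBoundSketch
{
    // Upper bound on notifications seen in a closed window of windowMs,
    // assuming deliveries are at least max(changePeriodMs, maxDelayMs) apart.
    public static int MaxExpectedEvents(int windowMs, int changePeriodMs, int maxDelayMs)
    {
        if (windowMs < 0) throw new ArgumentOutOfRangeException(nameof(windowMs));
        if (changePeriodMs <= 0) throw new ArgumentOutOfRangeException(nameof(changePeriodMs));
        if (maxDelayMs < 0) throw new ArgumentOutOfRangeException(nameof(maxDelayMs));

        int spacingMs = Math.Max(changePeriodMs, maxDelayMs);
        // One delivery can land on each window edge, hence the +1.
        return windowMs / spacingMs + 1;
    }
}
```

With a 10 ms PLC cycle and `MaxDelayMs=500`, a 1 s window gives `1000 / 500 + 1 = 3` events at most, while the steady-state rate is 2 Hz. With `MaxDelayMs=0` the same window allows up to `1000 / 10 + 1 = 101` events, which is the flooding case the scope describes.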
|
||||
|
||||
**Effort**: S (half day).
|
||||
**Deps**: none.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: `docs/Driver.TwinCAT.Cli.md` `subscribe` flag table — add `--max-delay-ms`
|
||||
with default `0` and a note that nonzero coalesces high-frequency PLC
|
||||
signals. Update the description of `-i` / `--interval-ms` to disambiguate
|
||||
cycle vs. max-delay (both pass through to `NotificationSettings`).
|
||||
`docs/drivers/TwinCAT-Test-Fixture.md` "Notification coalescing under
jitter" caveat gains a note that per-tag MaxDelay is now configurable.
|
||||
- Fixture (TCBSD PLC project): no change required — `GVL_Fixture.nCounter`
|
||||
already increments on every 10 ms cycle (see `MAIN.TcPOU`), so the test
|
||||
can drive a 100 Hz change rate and verify ≤ 2 Hz delivery with
|
||||
`MaxDelayMs=500`. README "Required project state" gets a one-line note
|
||||
that the counter doubles as the coalescing-test driver.
|
||||
- Integration tests: new `Driver_coalesces_notifications_at_max_delay`
|
||||
`[TwinCATFact]` subscribes to `GVL_Fixture.nCounter` with `MaxDelayMs=500`
|
||||
and asserts delivered-event count ≤ 3 over a 1 s window.
|
||||
- E2E: `scripts/e2e/test-twincat.ps1` `Test-SubscribeSeesChange` is a
|
||||
one-shot subscribe; no change. A future high-rate variant could test
|
||||
coalescing end-to-end through the OPC UA bridge but it's not on the
|
||||
critical path.
|
||||
|
||||
#### PR 3.2 — Cycle-time / jitter / PLC-state diagnostics
|
||||
|
||||
**Scope**: Probe loop today only checks reachability via `ReadStateAsync`
|
||||
(`TwinCATDriver.cs::ProbeLoopAsync`). Surface cycle-time, jitter, and online-
|
||||
change counter as health signals via the standard `_AppInfo` /
|
||||
`TwinCAT_SystemInfoVarList._AppInfo` GVL (the same one we filter out of
|
||||
discovery). Specifically:
|
||||
|
||||
- `_AppInfo.OnlineChangeCnt` (UDINT) — incremented on every online change.
|
||||
- `_AppInfo.AppName` (STRING) — TC project name, useful for
|
||||
cross-instance identification.
|
||||
- `_TaskInfo[1].CycleTime` (UDINT, 100 ns units) — the configured PLC cycle.
|
||||
- `_TaskInfo[1].LastExecTime` (UDINT, 100 ns units) — most recent measured
|
||||
cycle execution; jitter is the delta against `CycleTime`.
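Because both task-timing symbols are reported in 100 ns units, the diagnostics surface needs a unit conversion before it can expose millisecond values. A minimal sketch follows; `TaskTimingSketch` and `TaskTimingConverter` are hypothetical names, and the delta is computed exactly as the list above defines it (last execution time minus configured cycle time).

```csharp
using System;

// Hypothetical shape for illustration; the plan's diagnostics record may differ.
public readonly record struct TaskTimingSketch(
    double CycleTimeMs, double LastExecTimeMs, double DeltaMs);

public static class TaskTimingConverter
{
    // 1 ms = 10,000 ticks of 100 ns.
    private const double HundredNsPerMs = 10_000.0;

    public static TaskTimingSketch FromRaw(uint cycleTime100ns, uint lastExecTime100ns)
    {
        // Convert to double first so the subtraction cannot wrap around
        // the way unsigned arithmetic would.
        double cycleMs = cycleTime100ns / HundredNsPerMs;
        double execMs = lastExecTime100ns / HundredNsPerMs;
        return new TaskTimingSketch(cycleMs, execMs, execMs - cycleMs);
    }
}
```

For example, a configured 10 ms cycle is reported as 100,000 raw units, so `FromRaw(100_000, 12_500)` yields a cycle time of 10 ms and a last execution time of 1.25 ms.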
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs::ProbeLoopAsync` —
|
||||
augment success path to also read these four symbols. Surface via a new
|
||||
`TwinCATDeviceDiagnostics` record on `DeviceState`. Emit through
|
||||
`IDriverDiagnostics` (the cross-driver diagnostics surface introduced for
|
||||
Modbus prohibition events — task #154).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATSystemSymbolFilter.cs` — leave
|
||||
the filter as-is for the user-visible browse; the probe path reads system
|
||||
symbols directly without going through discovery.
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**: still `AdsClient.ReadValueAsync(symbol, type,
|
||||
ct)`. The symbols are read by name, not by index group, so no new API.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: `FakeTwinCATClient` exposes `SetSystemSymbolValue(string name, object
|
||||
value)` so tests can drive the diagnostics surface deterministically.
|
||||
- Integration: `[TwinCATFact]` connects to TCBSD, asserts the diagnostics
|
||||
block populates `CycleTimeMs > 0` and `OnlineChangeCnt >= 0` within one
|
||||
probe interval.
|
||||
|
||||
**Effort**: M (1-2 days).
|
||||
**Deps**: confirm `IDriverDiagnostics` shape from existing Modbus diagnostics
|
||||
RPC (task #154 in MEMORY); it should be reusable.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: new section "Diagnostics" in `docs/drivers/TwinCAT-Test-Fixture.md`
|
||||
documenting the four exposed signals (cycle time, jitter, online-change
|
||||
counter, app name) and where they surface in the cross-driver
|
||||
diagnostics RPC. `docs/Driver.TwinCAT.Cli.md` `probe` section gains a
|
||||
"Health probe" sub-section noting the same symbols can be read directly
|
||||
via `probe -s "TwinCAT_SystemInfoVarList._AppInfo.OnlineChangeCnt"`
|
||||
(the existing example) plus the new `_TaskInfo[1].CycleTime` /
|
||||
`LastExecTime`. Add a `docs/v3/twincat-backlog.md` cross-link confirming
that cycle-time/jitter is no longer deferred.
|
||||
- Fixture (TCBSD PLC project): no change required — `_AppInfo` and
|
||||
`_TaskInfo[1]` are TwinCAT system GVLs, present on every runtime. The
|
||||
`TwinCATSystemSymbolFilter` already drops them from user browse;
|
||||
`TwinCatProject/README.md` adds a one-line "These symbols are read by
|
||||
the probe loop, not project-defined" callout.
|
||||
- Integration tests: new `Probe_loop_surfaces_cycle_time_and_online_change_count`
|
||||
`[TwinCATFact]` asserts the diagnostics record populates within one
|
||||
probe interval against TCBSD. Unit tests via `FakeTwinCATClient.SetSystemSymbolValue`
|
||||
drive the diagnostics surface deterministically.
|
||||
- E2E: no change. Future enhancement could expose driver diagnostics via a
|
||||
CLI subcommand (`otopcua-twincat-cli diagnostics -n ...`) — captured in
|
||||
the consolidated section below as a follow-up.
|
||||
|
||||
### Phase 4 — UDT decomposition with TMC parsing
|
||||
|
||||
#### PR 4.1 — Nested UDT browse via TMC parsing
|
||||
|
||||
**Scope**: Largest single piece of work in the plan. `TwinCATDataType.Structure`
|
||||
exists but `BrowseSymbolsAsync` skips non-atomic symbols
|
||||
(`AdsTwinCATClient.cs:224`); to expose nested UDT trees we either:
|
||||
|
||||
1. **Online**: walk the `IDataType` tree returned by `SymbolLoaderFactory` —
|
||||
each `IStructType` exposes `SubItems` recursively. This is what
|
||||
`Beckhoff.TwinCAT.Ads` v6's TypeSystem already gives us at runtime; we just
|
||||
never recursed.
|
||||
2. **Offline (TMC file)**: parse the TwinCAT Module Class XML file the project
|
||||
compiles to (`*.tmc`), build a type catalogue, drive discovery from it
|
||||
without requiring a live runtime.
|
||||
|
||||
We ship the **online** path first (PR 4.1) because it covers 100% of the cases
|
||||
where the runtime is reachable, and `SymbolLoaderFactory` already does the
|
||||
heavy lifting. TMC offline parsing is deferred to a hypothetical PR 4.2 if a
|
||||
disconnected-discovery use case emerges (unlikely; live integration tests
|
||||
demonstrate runtime is always available in our deployments).
|
||||
|
||||
**Files**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs` —
|
||||
`BrowseSymbolsAsync` recurses into `IStructType.SubItems`, yielding one
|
||||
`TwinCATDiscoveredSymbol` per leaf with the dotted instance path
|
||||
(`MyStruct.Inner.Field`). For arrays-of-structs, expand element-by-element
|
||||
up to a configurable bound (default 1024) — beyond that, expose only the
|
||||
array root with `IsArray=true`.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs::DiscoverAsync` —
|
||||
fold the recursed structure into the existing `Discovered/` folder tree
|
||||
using `IAddressSpaceBuilder.Folder` for each struct member.
|
||||
- New: `TwinCATTypeWalker.cs` — pure helper that takes an `IDataType` and
|
||||
yields `(instancePath, atomicType, readOnly)` tuples. Unit-testable without
|
||||
touching `AdsClient`.
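The flattening rules in this list (dotted instance paths, bounded array-of-struct expansion, no pointer following) can be sketched over a synthetic type tree. This is an illustrative sketch only: the `TypeNode` records below are hypothetical stand-ins, not the Beckhoff TypeSystem interfaces, and exposing arrays of atomics as a single array root is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Synthetic stand-ins for a type tree; NOT the Beckhoff TypeSystem interfaces.
public abstract record TypeNode;
public sealed record AtomicNode(string TypeName) : TypeNode;
public sealed record AliasNode(TypeNode BaseType) : TypeNode;
public sealed record PointerNode : TypeNode;
public sealed record ArrayNode(int Lower, int Upper, TypeNode Element) : TypeNode;
public sealed record StructNode(IReadOnlyList<(string Name, TypeNode Type)> Members) : TypeNode;

public static class TypeWalkerSketch
{
    public static IEnumerable<(string Path, string Type, bool IsArrayRoot)> Walk(
        string path, TypeNode node, int maxArrayExpansion = 1024, int depthLeft = 32)
    {
        if (depthLeft <= 0) yield break; // hard cap against runaway nesting

        switch (node)
        {
            case AtomicNode atomic:
                yield return (path, atomic.TypeName, false);
                break;

            case AliasNode alias: // recurse until atomic
                foreach (var leaf in Walk(path, alias.BaseType, maxArrayExpansion, depthLeft - 1))
                    yield return leaf;
                break;

            case PointerNode: // never followed, so self-pointers cannot loop
                break;

            case StructNode st:
                foreach (var (name, type) in st.Members)
                    foreach (var leaf in Walk($"{path}.{name}", type, maxArrayExpansion, depthLeft - 1))
                        yield return leaf;
                break;

            case ArrayNode arr:
            {
                long count = (long)arr.Upper - arr.Lower + 1;
                bool expand = arr.Element is not AtomicNode && count <= maxArrayExpansion;
                if (!expand)
                {
                    // Oversized or atomic-element arrays surface as one root.
                    yield return (path, "ARRAY", true);
                    break;
                }
                for (int i = arr.Lower; i <= arr.Upper; i++)
                    foreach (var leaf in Walk($"{path}[{i}]", arr.Element, maxArrayExpansion, depthLeft - 1))
                        yield return leaf;
                break;
            }
        }
    }
}
```

Walking a struct that holds `ARRAY [1..2] OF` a motor struct yields one leaf per element member, such as `GVL_Plant.Axes[1].Temperature`, while an `ARRAY [1..2000]` of the same struct yields only the array root.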
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**:
|
||||
- `TwinCAT.TypeSystem.IStructType` — `SubItems` (collection of
|
||||
`IMember`); each member has `BaseType`, `Name`, `Offset`.
|
||||
- `TwinCAT.TypeSystem.IArrayType` — `Dimensions`, `BaseType`.
|
||||
- `TwinCAT.TypeSystem.IEnumType` — handled in PR 1.5 (atomic surface).
|
||||
- `TwinCAT.TypeSystem.IAliasType.BaseType` — recurse until atomic.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: new `TwinCATTypeWalkerTests` — feed synthetic `IDataType` trees,
|
||||
assert the flattened paths and types.
|
||||
- Integration: extend `GVL_Plant` (already has `Line1.Stations[1].Axes[1].Motor`
|
||||
per `TwinCAT3SmokeTests.cs`) — the existing `Driver_reads_deeply_nested_UDT_path`
|
||||
test reads a known-leaf path; add a new test that browses into the same
|
||||
GVL and asserts the entire tree shape matches expectation. Should yield
|
||||
~50+ leaves.
|
||||
|
||||
**Effort**: L (4-5 days). Most of the cost is in the address-space-builder
|
||||
folder/variable plumbing, not the type walking itself.
|
||||
**Deps**: PR 1.5 (ENUM/ALIAS) — without it, struct members of enum type
|
||||
silently drop. PR 1.4 (whole-array reads) is helpful but not blocking.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: this is the **largest doc-write of the plan**.
|
||||
`docs/Driver.TwinCAT.Cli.md` gains a new top-level "UDT decomposition"
|
||||
section explaining the dotted-instance browse syntax
(`MyStruct.Inner.Field`), array-of-struct expansion bound, and how members surface via
|
||||
`browse`. The existing `read` example "Nested UDT member" gets expanded
|
||||
with a multi-level case targeting the plant hierarchy.
`docs/drivers/TwinCAT-Test-Fixture.md` "What it actually covers" gets a UDT bullet
|
||||
per-member rather than per-leaf. Update `docs/v3/twincat-backlog.md` —
|
||||
remove the implicit UDT-decomposition gap.
|
||||
- Fixture (TCBSD PLC project, primary fixture-extension surface): the
|
||||
existing `GVL_Plant.Line1.Stations[1..3].Axes[1..4]...` 5-level
|
||||
hierarchy already provides ~50+ leaves per `TwinCatProject/README.md`
|
||||
§ "5-level plant hierarchy" + § "Live value churn". This PR may add a
|
||||
few **edge cases** to stress the type walker:
|
||||
- `PLC/DUTs/ST_NestedFlags.TcDUT` — struct containing a BIT-packed
|
||||
member (e.g. `Flags : DWORD` with named bit-mask aliases).
|
||||
- `PLC/DUTs/ST_RecursiveCap.TcDUT` — struct with a self-pointer (must
|
||||
be capped by the type walker, not infinite-recurse). Demonstrates
|
||||
POINTER skip behavior.
|
||||
- Add an `ARRAY [1..2000] OF ST_AlarmRecord` to exercise the
|
||||
`MaxArrayExpansion` (default 1024) cutoff.
|
||||
README § "Complex hierarchy" gets the new edge-case DUTs documented.
|
||||
- Integration tests: new `TwinCATTypeWalkerTests` (unit) feeding synthetic
|
||||
`IDataType` trees. Live: `Driver_browses_full_plant_hierarchy_yields_50_plus_leaves`,
|
||||
`Driver_caps_array_of_struct_expansion_at_configured_bound`,
|
||||
`Driver_handles_self_referential_struct_without_recursion` against the
|
||||
new edge-case DUTs.
|
||||
- E2E: `scripts/e2e/test-twincat.ps1` could gain a UDT-bridge scenario
|
||||
(`-BridgeNodeId` pointing at
`GVL_Plant.Line1.Stations[1].Axes[1].Motor.Temperature`) but this requires the OPC UA server's address-space to
|
||||
reflect the decomposed tree — keep as a follow-up after server-side
|
||||
rendering ships in v3.
|
||||
|
||||
### Phase 5 — TC3 EventLogger alarms
|
||||
|
||||
#### PR 5.1 — `IAlarmSource` via TC3 EventLogger
|
||||
|
||||
**Scope**: TwinCAT 3.1 build 4022+ ships TcEventLogger as a system service
|
||||
exposing alarms/events on AMS port 110 (`AMSPORT_EVENTLOG`). Implement
|
||||
`IAlarmSource` over that interface so PLC alarms surface as OPC UA AC events.
|
||||
|
||||
**Open question (b) below** drives the implementation: does Beckhoff publish
|
||||
a managed wrapper, or do we hit AMS port 110 directly?
|
||||
|
||||
If a managed wrapper exists:
|
||||
- `Beckhoff.TwinCAT.Ads.TcEventLogger` (or similar) — subscribe via
|
||||
`EventLogger.AlarmRaised` event.
|
||||
|
||||
If not (likely — InfoSys docs lean on `TcCOM` C++ APIs):
|
||||
- Open a second `AdsClient` connection to port 110 via
|
||||
`_secondaryClient.Connect(netId, 110)`.
|
||||
- Use `AddDeviceNotificationAsync` on the alarm-list index group
|
||||
(`ADSIGRP_TCEVENTLOG_ALARMS`, exact constant TBD during spike).
|
||||
- Decode the binary event payload into `AlarmEvent` records (severity,
|
||||
source, message, time-of-occurrence, ack state).
|
||||
|
||||
**Files**:
|
||||
- New: `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATAlarmSource.cs`
|
||||
— implements `IAlarmSource` (currently used by Galaxy / Wonderware).
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs` — declare that the
driver implements `IAlarmSource` and delegate to the helper.
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriverOptions.cs` — new
|
||||
`bool EnableAlarms` (default `false` until production-validated).
|
||||
|
||||
**Beckhoff.TwinCAT.Ads API**: TBD pending spike. Falls back to raw
|
||||
`AdsClient.AddDeviceNotificationAsync` on port 110 if no managed wrapper.
|
||||
|
||||
**Test plan**:
|
||||
- Unit: fake event-logger feeds synthetic alarms; assert `IAlarmSource`
|
||||
surface raises events with correct shape.
|
||||
- Integration: TCBSD project gains an `Alarm.Raise(...)` call site on a GVL
|
||||
bool transition; new `[TwinCATFact]` subscribes via the driver, toggles the
|
||||
trigger, asserts the alarm appears in the source within 5 s.
|
||||
|
||||
**Effort**: L (4-5 days), most of which is the spike. If no managed wrapper
|
||||
exists, add another L (3-4 days) to implement the binary protocol decoder.
|
||||
**Deps**: spike answer to open question (b) — surface that as an explicit
|
||||
investigation PR before committing to the build.
|
||||
|
||||
**Docs / fixture / e2e**:
|
||||
- Docs: **new file** `docs/drivers/TwinCAT.md` (the existing
|
||||
`TwinCAT-Test-Fixture.md` is fixture-only) covering the alarm
|
||||
configuration surface — `EnableAlarms` option, AMS port 110 routing,
|
||||
severity / source / message decode, OPC UA AC mapping. Spike output
|
||||
goes to `docs/v3/twincat-eventlogger-spike.md` per open question (b).
|
||||
`docs/Driver.TwinCAT.Cli.md` gains a new `alarms` subcommand (subscribe
|
||||
+ print stream) mirroring the OPC UA Client CLI's `alarms` verb.
|
||||
`docs/drivers/TwinCAT-Test-Fixture.md` "Alarms / history" caveat
|
||||
removed; capability matrix gets `IAlarmSource = yes`.
|
||||
- Fixture (TCBSD PLC project, primary fixture-extension surface): add
|
||||
`PLC/POUs/FB_AlarmHarness.TcPOU` that calls `FB_TcLogEvent` (or
|
||||
equivalent TC3 EventLogger PLC API) on a 5 s tick, raising / clearing
|
||||
a known event class. New `PLC/GVLs/GVL_Alarms.TcGVL` exposes the
|
||||
trigger booleans the test toggles. `TwinCatProject/README.md` gains a new
"Alarm scenarios" subsection documenting the event class IDs + severity
|
||||
+ cleared-on transitions. The existing `ST_Alarm` DUT remains for
|
||||
PLC-level data; the EventLogger is the AC source.
|
||||
- Integration tests: new `TwinCATAlarmIntegrationTests.cs` —
|
||||
`Driver_raises_alarm_event_when_PLC_logs_event` `[TwinCATFact]`
|
||||
toggles the trigger via `WriteAsync`, asserts the alarm appears in
|
||||
`IAlarmSource.AlarmRaised` within 5 s. Includes a clear-event variant.
|
||||
Unit tests via fake event-logger feed synthetic alarms.
|
||||
- E2E: `scripts/e2e/test-twincat.ps1` gains a `Test-AlarmRoundTrip`
|
||||
step (toggle PLC trigger → assert event surfaces via OPC UA AC client)
|
||||
once the server-side wiring is in. Likely defers to a follow-up PR
|
||||
after the server-tier alarm rendering catches up.
|
||||
|
||||
## Documentation, fixture, and e2e impact
|
||||
|
||||
Consolidated view across all 12 PRs. The **TCBSD fixture PLC project**
|
||||
(`tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/`) is
|
||||
the **primary fixture-extension surface** — it's a real TwinCAT XAE project
|
||||
committed object-by-object as `.TcGVL` / `.TcDUT` / `.TcPOU` files. Most PRs
|
||||
extend it by adding GVL variables, DUTs (structs / enums / aliases), or POUs
|
||||
(function blocks driving live data churn). The TCBSD VM at AmsNetId
|
||||
`41.169.163.43.1.1` on `10.100.0.128` is the deployment target (per memory
|
||||
entry `project_tcbsd_fixture.md`); the project bypasses the local Hyper-V/RTIME
|
||||
conflict (per `project_twincat_hyperv_conflict.md`) by running on ESXi.
|
||||
|
||||
### User docs touched
|
||||
|
||||
| PR | `docs/Driver.TwinCAT.Cli.md` | `docs/drivers/TwinCAT-Test-Fixture.md` | `docs/v3/twincat-backlog.md` | Other |
|
||||
|---|---|---|---|---|
|
||||
| 1.1 LINT/ULINT | Data-types caveat removed | Bugs-caught entry #4 | — | — |
|
||||
| 1.2 TIME/DATE/DT/TOD | Native-type syntax + 4 examples | — | — | — |
|
||||
| 1.3 Bit-write | `write` example + RMW note | Bugs-caught entry #3 update | — | — |
|
||||
| 1.4 Arrays | New "Arrays" sub-section + read example | Coverage list bullet | — | — |
|
||||
| 1.5 ENUM/ALIAS | `browse` data-types rows | Coverage list bullet | — | — |
|
||||
| 2.1 Sum cmds | — | New "Performance" section | Closed-out perf bullet | — |
|
||||
| 2.2 Handles | Cache note in `read` / `subscribe` | Perf-section paragraph | — | — |
|
||||
| 2.3 Sym-version | — | Online-change-handling caveat dropped | — | — |
|
||||
| 3.1 MaxDelay | `--max-delay-ms` flag | Coalescing caveat updated | — | — |
|
||||
| 3.2 Diagnostics | `probe` health-symbols sub-section | New "Diagnostics" section | Cycle-time bullet closed | — |
|
||||
| 4.1 UDT | New top-level "UDT decomposition" section | Coverage list per-member | UDT-decomp gap removed | — |
|
||||
| 5.1 Alarms | New `alarms` subcommand | "Alarms" caveat removed | — | **New** `docs/drivers/TwinCAT.md`; **new** `docs/v3/twincat-eventlogger-spike.md` |
|
||||
|
||||
### TCBSD fixture PLC project changes
|
||||
|
||||
| PR | GVL changes | DUT changes | POU changes | README section |
|
||||
|---|---|---|---|---|
|
||||
| 1.1 LINT/ULINT | `GVL_Primitives.vLargeCounter`, `vLargeCounterU` | — | — | "GVL_Primitives numeric seeds" |
|
||||
| 1.2 TIME/DATE/DT/TOD | `GVL_Primitives.dCurrentTime`, `tCycleDuration`, `dToday`, `tShiftStart` | — | — | "Type coverage" seed values |
|
||||
| 1.3 Bit-write | _(reuse `GVL_Primitives.vWord`)_ | — | — | — |
|
||||
| 1.4 Arrays | `GVL_Arrays.aReal2D : ARRAY[1..5,1..5] OF REAL` | — | — | "Array coverage" |
|
||||
| 1.5 ENUM/ALIAS | _(reuse `GVL_Enums`; new `currentSeverity`/`currentTemperature` instance vars)_ | — | — | "Integration-test contract" entry |
|
||||
| 2.1 Sum cmds | **`GVL_Perf.aTags : ARRAY[1..1000] OF DINT`** | — | New `FB_PerfChurn` driving rotating writes | New "Performance scenarios" subsection |
|
||||
| 2.2 Handles | _(reuse `GVL_Perf.aTags`)_ | — | — | — |
|
||||
| 2.3 Sym-version | _(no schema change; manual online-change drill)_ | — | — | New "Online-change test scenario" |
|
||||
| 3.1 MaxDelay | _(reuse `GVL_Fixture.nCounter` 100 Hz driver)_ | — | — | One-line note in "Required project state" |
|
||||
| 3.2 Diagnostics | _(reads system GVLs `_AppInfo`, `_TaskInfo[1]`)_ | — | — | Probe-symbols callout |
|
||||
| 4.1 UDT | _(reuse `GVL_Plant`; possibly grow `aLargeAlarms : ARRAY[1..2000] OF ST_AlarmRecord`)_ | New `ST_NestedFlags`, `ST_RecursiveCap`, `ST_AlarmRecord` | — | "Complex hierarchy" edge-cases |
|
||||
| 5.1 Alarms | New `GVL_Alarms` (trigger booleans) | — | New `FB_AlarmHarness` calling `FB_TcLogEvent` | New "Alarm scenarios" |
|
||||
|
||||
### Integration test additions
|
||||
|
||||
All new tests gate on `[TwinCATFact]` / `[TwinCATTheory]` against
|
||||
`TWINCAT_TARGET_NETID`. Most ship in
`tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCAT3SmokeTests.cs`;
PR 5.1 introduces a new
|
||||
`TwinCATAlarmIntegrationTests.cs`. The existing 30-case suite grows to
|
||||
roughly **45 cases** end-of-plan, plus a perf-tier guarded behind
|
||||
`TWINCAT_PERF=1`.
|
||||
|
||||
### E2E scripts
|
||||
|
||||
`scripts/e2e/test-twincat.ps1` is the single TwinCAT e2e bridge today; it's
|
||||
gated behind `TWINCAT_TRUST_WIRE=1` (see task #221 — CI fixture). The plan
|
||||
intentionally **does not change** the canonical bridge for most PRs because
|
||||
the bridge exercises one DINT counter through the OPC UA server, and that
|
||||
path stays correct. PRs 1.2 (DT bridge), 1.4 (array bridge), 4.1 (UDT
|
||||
bridge), 5.1 (alarm round-trip) each list speculative e2e extensions but
|
||||
they're explicitly marked as follow-ups gated on server-side rendering
|
||||
catching up.
|
||||
|
||||
## Skip-rated items (for context)
|
||||
|
||||
These are intentionally not built. Listed for future-reader completeness so
|
||||
nobody re-invests effort that was already triaged:
|
||||
|
||||
| # | Gap | Why skip |
|
||||
|---|---|---|
|
||||
| 9 | Multi-target / multi-route AMS gateway | Per-device config in `TwinCATDriverOptions.Devices` already supports N targets |
|
||||
| 10 | Secure ADS / ADS-over-TLS | Significant work — TC3.1 build 4024+ feature, host-router-level config; defer |
|
||||
| 11 | Route credential management | Host-level AMS router responsibility (`StaticRoutes.xml`); not driver scope |
|
||||
| 12 | NC-axis / CNC channel / EtherCAT slave I/O | Specialty; system-symbol filter actively drops `Mc_*` (`TwinCATSystemSymbolFilter.cs:28`) |
|
||||
| 13 | System-service ports (200/10000) | Niche operational tooling; user-runtime ports cover real use cases |
|
||||
| 15 | PLC RPC / method invocation | Niche; design-heavy; no demand signal yet |
|
||||
| 16 | Per-PLC-runtime auto-discover | Cosmetic; manual port config in options works |
|
||||
| 20 | File-system access via ADS (FOPEN/FREAD) | Niche; out of scope |
|
||||
|
||||
## Open questions
|
||||
|
||||
1. **(a) TMC parsing — separate library or embedded?**
|
||||
Phase 4 ships the **online type-walker** path which uses
|
||||
`TwinCAT.Ads.TypeSystem.SymbolLoaderFactory` (from the `Beckhoff.TwinCAT.Ads` package) and needs a live
|
||||
runtime. If a future use case needs offline discovery (e.g. address-space
|
||||
pre-bake at build time without a reachable PLC), do we:
|
||||
- vendor a TMC-XML parser into this driver, or
|
||||
- build a separate `ZB.MOM.WW.OtOpcUa.Tooling.TwinCAT` CLI that emits a
|
||||
pre-baked tag manifest?
|
||||
The latter cleanly separates build-time tooling from runtime driver code
|
||||
and matches how Galaxy.Host is split. Decision deferred until demand
|
||||
appears; recommend the CLI route when it does.
|
||||
|
||||
2. **(b) Beckhoff TC3 EventLogger NuGet — published, or AMS port 110 raw?**
|
||||
Need to spike against the current `Beckhoff.TwinCAT.Ads` v6 NuGet API
|
||||
surface. Beckhoff InfoSys lists a `Tc3_EventLogger` PLC library and a
|
||||
TcCOM C++ API but the .NET surface is thinner. PR 5.1 starts with a
|
||||
one-day spike documented as `docs/v3/twincat-eventlogger-spike.md` before
|
||||
committing to the implementation path.
|
||||
|
||||
3. **(c) Symbol-version invalidation event details**
|
||||
PR 2.3 needs the exact index-group constant and notification semantics for
|
||||
the symbol-version counter. `AdsReservedIndexGroup.SymbolVersion` (0xF008)
|
||||
is the working hypothesis but the field on the v6 enum needs verification
|
||||
— the older `TwinCAT.Ads.AdsReservedIndexGroup` enum had different naming.
|
||||
Beckhoff InfoSys `tcadscommon/tcadscommon_indexgroups` is the reference;
|
||||
confirm during the PR 2.3 spike. Fallback: poll the version counter at
|
||||
probe-loop cadence and treat any change as an invalidation.
|
||||
|
||||
## References
|
||||
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs`
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs`
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDataType.cs`
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATSymbolPath.cs`
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATSystemSymbolFilter.cs`
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATAmsAddress.cs`
|
||||
- `docs/featuregaps.md` — TwinCAT (Beckhoff ADS) section
|
||||
- `docs/v3/twincat-backlog.md` — deferred items (TC2, multi-hop, lab IPC)
|
||||
- `docs/drivers/TwinCAT-Test-Fixture.md` — TCBSD + XAR fixture details
|
||||
- Beckhoff InfoSys: <https://infosys.beckhoff.com/english.php?content=../content/1033/tcadsdll2/117571083.html> (Sum commands)
|
||||
- Beckhoff InfoSys: <https://infosys.beckhoff.com/english.php?content=../content/1033/tcadsnetref/7313319051.html> (NotificationSettings)
|
||||
- Beckhoff GitHub: <https://github.com/Beckhoff/TC3-AdsClient-Csharp>
|
||||
@@ -0,0 +1,149 @@
|
||||
# Galaxy backend parity matrix
|
||||
|
||||
This document tracks the scenario × result matrix that the
|
||||
`Driver.Galaxy.ParityTests` suite drives against both Galaxy backends —
|
||||
the legacy out-of-process **Galaxy.Host** (.NET 4.8 x86 + MXAccess COM,
|
||||
fronted by `GalaxyProxyDriver`) and the new in-process **mxgateway**
|
||||
backend (`GalaxyDriver`, .NET 10 + gRPC against `mxaccessgw`).
|
||||
|
||||
Maintained alongside Phase 5 (PR 5.W). The Phase 7 default flip
|
||||
(PR 7.1) consumes this matrix as its go/no-go gate — every row must be
|
||||
either green or carry an explicit *accepted-delta* justification.
|
||||
|
||||
## Reading the matrix
|
||||
|
||||
- **Status: green** — the scenario asserts strict parity and passes
|
||||
(or skips cleanly when the rig isn't up).
|
||||
- **Status: yellow** — soft pin only (count or shape parity, not value
|
||||
parity) — acceptable when the underlying COM/gRPC stacks have known
|
||||
divergences in raw payloads but the surface presented to the
|
||||
DriverNodeManager is equivalent.
|
||||
- **Status: red** — divergence detected. Row carries a fix or a
|
||||
follow-up task ID.
|
||||
|
||||
## Scenarios
|
||||
|
||||
Last verified end-to-end on the dev parity rig: **2026-04-30**
|
||||
(legacy `OtOpcUaGalaxyHost` mxaccess backend; mxaccessgw v1.x at
|
||||
`http://localhost:5120`; sandbox `OtOpcUaParityTest_001` deployed in
|
||||
the `ZB` galaxy; 13 passed / 1 skipped / 0 failed in 19 minutes).
|
||||
|
||||
| PR | Test class | Scenario | Status | Notes |
|----|-----------|----------|--------|-------|
| 5.2 | `BrowseAndReadParityTests` | Same variable set | green | symmetric set diff on full-reference set, after `[]` array-suffix workaround in `GalaxyDiscoverer` |
| 5.2 | `BrowseAndReadParityTests` | Same DataType / SecurityClass / IsHistorized | green | per-attribute meta triple parity |
| 5.2 | `BrowseAndReadParityTests` | Same StatusCode-class on a sampled read | yellow | pins status class (Bad/Uncertain/Good); CLR type intentionally not asserted (see "Accepted deltas" #6) |
| 5.3 | `SubscribeAndEventRateParityTests` | Subscribe returns a handle on each backend | green | symmetric Unsubscribe cleanup |
| 5.3 | `SubscribeAndEventRateParityTests` | Event rate within ±50% over 3s | yellow | both backends fed by the same upstream MXAccess subscriptions; tolerance absorbs scheduler jitter |
| 5.4 | `WriteByClassificationParityTests` | FreeAccess / Operate write status-class parity | yellow | pins status class only; legacy flat-maps every failure to BadInternalError, mxgw distinguishes (BadCommunicationError, BadDeviceFailure, etc.); see "Accepted deltas" #7 |
| 5.4 | `WriteByClassificationParityTests` | Configure / Tune routes via secured-write | yellow | same status-class pin |
| 5.5 | `AlarmTransitionParityTests` | Same alarm-condition source-node-id set | green | one-way invariant on sub-attribute refs (legacy populated → mxgw matches; legacy null → mxgw free to populate per AlarmRefBuilder) |
| 5.5 | `AlarmTransitionParityTests` | IsAlarm-marked variable count parity | green | soft pin: count must match, doesn't have to be non-zero |
| 5.6 | `HistoryReadParityTests` | Same historized attribute set | green | what HistoryRouter consumes when routing to the Wonderware sidecar |
| 5.6 | `HistoryReadParityTests` | New mxgw GalaxyDriver does not implement `IHistoryProvider` | green | architectural pin from Phase 1 (PR 1.3) on the *new* path; legacy `GalaxyProxyDriver` keeps the interface for back-compat until PR 7.2 (see "Accepted deltas" #5) |
| 5.7 | `ReconnectParityTests` | Reinitialize → both Healthy + reads succeed | green | recovery latency is *not* pinned (legacy: pipe + COM client; mxgw: re-Register gw session) |
| 5.7 | `ReconnectParityTests` | Health diverges only when one side recovers | yellow | soft pin until a toxiproxy-style fault injector lands |
| 5.8 | `ScanStateProbeParityTests` | Same per-platform host set | n/a (deferred) | dev rig is licensed for one `$WinPlatform` only; multi-platform parity deferred to a customer rig (PR 4.7's unit tests pin the state-decoder + member-tracking logic) |
| 5.8 | `ScanStateProbeParityTests` | Same `HostState` per overlapping platform | n/a (deferred) | same single-platform constraint |
|
||||
|
||||
## Accepted deltas
|
||||
|
||||
These are intentional differences between the two backends — the parity
|
||||
suite skips or tolerates them by design.
|
||||
|
||||
1. **Transport-entry host name.** The legacy backend's
|
||||
`IHostConnectivityProbe` surface includes a host entry named after
|
||||
the Galaxy.Host process identity; the mxgw backend uses the
|
||||
configured `MxAccess.ClientName`. The names differ, but both are
|
||||
correct for their respective sessions — the parity test compares
|
||||
only the platform-host subset.
|
||||
|
||||
2. **Reconnect latency cadence.** Legacy reconnect roundtrips an OS
|
||||
named pipe + an MxAccess COM client + a Galaxy.Host process restart
|
||||
if the host died. The mxgw reconnect re-Registers the gateway session
|
||||
over an existing gRPC channel. Sub-second vs multi-second recoveries
|
||||
are both correct for their own paths; only the eventual `Healthy`
|
||||
convergence is pinned.
|
||||
|
||||
3. **Read-value drift.** A read sampled twice on a live Galaxy can
|
||||
return different values legitimately. We pin `StatusCode`-class
|
||||
parity (Bad/Uncertain/Good); value equality is not pinned.
|
||||
|
||||
4. **Event-rate variance.** Both backends consume the same upstream
|
||||
MXAccess publish events but route them through different deserializers
|
||||
(LMXProxyServer COM events vs gRPC `MxEvent` protos). Scheduler
|
||||
jitter on either side can shift counts within a 3s window; we pin a
|
||||
±50% ratio, not strict equality.
|
||||
|
||||
5. **`IHistoryProvider` absent on the new path only.** Phase 1 (PR 1.3) lifted
|
||||
history off the per-driver path onto the server-owned
|
||||
`HistoryRouter` for the *new* in-process `GalaxyDriver`. The legacy
|
||||
`GalaxyProxyDriver` still surfaces `IHistoryProvider` for back-compat
|
||||
with the legacy server bootstrap path — it's an accepted delta
|
||||
retired in PR 7.2 alongside the rest of the legacy projects. The
|
||||
pin we want to enforce is "the new path doesn't regress to per-driver
|
||||
history."
|
||||
|
||||
6. **Read value-CLR-type.** Legacy returns the raw VARIANT (e.g.
|
||||
`Byte[]`) for an attribute that hasn't received its first value
|
||||
cycle from MxAccess yet, while mxgw returns the typed value
|
||||
(`Single`, `Int32`, etc.). Once a real value is written or scanned,
|
||||
both converge. Pinning CLR-type equality across the uninitialized
|
||||
window adds noise without a real parity invariant — the
|
||||
`StatusCode`-class assertion already covers the
|
||||
"did the read succeed" question.
|
||||
|
||||
7. **Write-failure StatusCode mapping.** Legacy
|
||||
`MxAccessGalaxyBackend.WriteValuesAsync` flat-maps every failure to
|
||||
`BadInternalError` (`0x80020000`); mxgw
|
||||
`GatewayGalaxyDataWriter.TranslateReply` uses
|
||||
`MxStatusProxy.RawDetectedBy` to distinguish gw-layer faults
|
||||
(`BadCommunicationError`, `0x80050000`) from MxAccess HRESULT
|
||||
faults (`BadDeviceFailure`, `BadNotConnected`, etc.). Both yield
|
||||
Bad-status — the parity invariant is the *status class*, not the
|
||||
exact code. Tighter mapping parity isn't worth investing in: the
|
||||
legacy mapping retires alongside `GalaxyProxyDriver` in PR 7.2.
|
||||
|
||||
8. **Single-platform scope on the dev rig.** Two
|
||||
`ScanStateProbeParityTests` scenarios are deferred to a customer
|
||||
rig with multiple deployed `$WinPlatform` instances; this dev box
|
||||
is licensed for one. PR 4.7's unit tests (`PerPlatformProbeWatcherTests`)
|
||||
pin the state-decoder + member-tracking logic at the seam level,
|
||||
so the runtime parity check becomes a customer-rig acceptance gate
|
||||
before that customer goes live, not a precondition for retiring
|
||||
the legacy projects on this dev box.
|
||||
|
||||
9. **Workaround for the gw `[]` array-suffix bug.**
|
||||
`mxaccessgw/src/MxGateway.Server/Galaxy/GalaxyRepository.cs:173-175`
|
||||
appends `[]` to the `full_tag_reference` of array-typed attributes,
|
||||
which `MxAccess COM IInstance.AddItem` doesn't accept. The lmxopcua
|
||||
discoverer (`GalaxyDiscoverer.StripArraySuffix`) defensively strips
|
||||
the suffix. Tracked in `mxaccessgw/requirements-array-suffix-fix.md`;
|
||||
the workaround is removed when that gw fix lands.
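
Deltas 3 and 7 both pin the *status class* rather than the exact code. As a rough illustration of what "same class" means, the sketch below classifies a raw OPC UA StatusCode by its two severity bits. The function names `status_class` and `same_status_class` are hypothetical and are not taken from the parity suite; only the two hex codes come from the text above.

```bash
#!/usr/bin/env bash
# status_class: map a raw OPC UA StatusCode to its severity class.
# The top two bits of the 32-bit code carry the severity:
#   00 = Good, 01 = Uncertain, 10 = Bad (11 is reserved; treated as Bad here).
status_class() {
  local code=$(( $1 ))                 # accepts decimal or 0x-prefixed hex
  case $(( (code >> 30) & 3 )) in
    0) echo "Good" ;;
    1) echo "Uncertain" ;;
    *) echo "Bad" ;;
  esac
}

# same_status_class: succeeds when two codes fall in the same class.
# Hypothetical helper; the suite's real assertion is not shown in this document.
same_status_class() {
  [ "$(status_class "$1")" = "$(status_class "$2")" ]
}

status_class 0x80020000   # BadInternalError (legacy flat-map)
status_class 0x80050000   # BadCommunicationError (mxgw gw-layer fault)
```

Both calls print `Bad`, which is why a legacy `BadInternalError` and an mxgw `BadCommunicationError` count as parity under this pin.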
|
||||
|
||||
## Outstanding deltas
|
||||
|
||||
None as of 2026-04-30. Phase 7 (PR 7.1) flipped the default to
|
||||
`mxgw`; PR 7.2 (legacy project deletion) is unblocked — the matrix
|
||||
gate is satisfied and no further soak/pilot precondition applies.
|
||||
|
||||
## Running the matrix
|
||||
|
||||
```bash
|
||||
# Both backends must be reachable for any row to run; rows skip
|
||||
# cleanly when their backend is unavailable.
|
||||
dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/
|
||||
```
|
||||
|
||||
Environment overrides for the mxgw backend:
|
||||
|
||||
| Variable | Default | Purpose |
|----------|---------|---------|
| `OTOPCUA_PARITY_GW_ENDPOINT` | `http://localhost:5120` | mxaccessgw gRPC endpoint |
| `OTOPCUA_PARITY_GW_API_KEY` | `parity-suite-key` | API key handed to `MxGatewayClient` |
| `OTOPCUA_PARITY_CLIENT_NAME` | `OtOpcUa-Parity` | `MxAccess.ClientName` for the session |
|
||||
|
||||
The legacy backend reads ZB SQL on `localhost:1433` and spawns
|
||||
`OtOpcUa.Driver.Galaxy.Host.exe` from
|
||||
`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Debug/net48/` — both
|
||||
must exist for the legacy half to resolve.
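
For a bash test-runner shell, the mxgw overrides listed above can be made explicit before invoking `dotnet test`. This is a minimal sketch: `resolve_parity_env` is a hypothetical helper name, and it only mirrors the documented defaults rather than reproducing how the harness itself reads them.

```bash
#!/usr/bin/env bash
# resolve_parity_env: export the three mxgw overrides, keeping any value the
# caller already set and falling back to the documented default otherwise.
resolve_parity_env() {
  export OTOPCUA_PARITY_GW_ENDPOINT="${OTOPCUA_PARITY_GW_ENDPOINT:-http://localhost:5120}"
  export OTOPCUA_PARITY_GW_API_KEY="${OTOPCUA_PARITY_GW_API_KEY:-parity-suite-key}"
  export OTOPCUA_PARITY_CLIENT_NAME="${OTOPCUA_PARITY_CLIENT_NAME:-OtOpcUa-Parity}"
}

resolve_parity_env
# Print the non-secret values so the run log records what was used.
printf 'endpoint=%s client=%s\n' \
  "$OTOPCUA_PARITY_GW_ENDPOINT" "$OTOPCUA_PARITY_CLIENT_NAME"
```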
|
||||
@@ -0,0 +1,361 @@
|
||||
# Galaxy parity rig — runbook
|
||||
|
||||
Brings up both Galaxy backends side-by-side against a single live Galaxy
|
||||
so the parity matrix in `docs/v2/Galaxy.ParityMatrix.md` and the soak
|
||||
scenario in `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/SoakScenarioTests.cs`
|
||||
can run for real. Closing the parity matrix is the gate for PR 7.2
|
||||
(retire legacy Galaxy projects).
|
||||
|
||||
## Conceptual layout
|
||||
|
||||
```
Galaxy ZB SQL ──┬── OtOpcUaGalaxyHost (NSSM service, net48 x86)
                │     └── MxAccess COM, ClientName "OtOpcUa-Galaxy.Host"
                │           └── named pipe "OtOpcUaGalaxy"
                │                 ▲
                │                 │ pipe IPC
                │                 │
                │           GalaxyProxyDriver ◄── parity test (legacy half)
                │
                └── mxaccessgw service
                      └── MxAccess COM, ClientName "OtOpcUa-Parity"
                            └── gRPC on http://localhost:5120
                                  ▲
                                  │ gRPC
                                  │
                            GalaxyDriver (in-process) ◄── parity test (mxgw half)
```
|
||||
|
||||
Both halves talk to the **same Galaxy** through **two distinct MxAccess
|
||||
sessions** (different ClientNames so they don't evict each other).
|
||||
|
||||
## What's already on this dev box
|
||||
|
||||
Per `~/.claude/projects/.../memory/`:
|
||||
|
||||
- **AVEVA System Platform + Galaxy + MXAccess runtime** — `project_aveva_platform_installed.md`.
|
||||
- **`OtOpcUaGalaxyHost`** Windows service running as `dohertj2`, NSSM-wrapped,
|
||||
binary at `C:\publish\OtOpcUaGalaxyHost\OtOpcUa.Driver.Galaxy.Host.exe`,
|
||||
shared secret at `.local/galaxy-host-secret.txt`, ZB SQL on `localhost:1433`
|
||||
— `project_galaxy_host_installed.md`.
|
||||
- **Parity test project** (`Driver.Galaxy.ParityTests`) committed and
|
||||
skip-clean — runs as soon as the mxgw half resolves.
|
||||
|
||||
## Setup steps (one-time)
|
||||
|
||||
### 1. Build + run mxaccessgw
|
||||
|
||||
The gateway source is at `c:\Users\dohertj2\Desktop\mxaccessgw\`.
|
||||
Build both halves — the worker has to be x86 net48 (MxAccess COM
|
||||
bitness), the server is .NET 10:
|
||||
|
||||
```powershell
|
||||
cd C:\Users\dohertj2\Desktop\mxaccessgw
|
||||
dotnet build src\MxGateway.Worker -c Release # produces bin\x86\Release\net48\MxGateway.Worker.exe
|
||||
dotnet build src\MxGateway.Server -c Release # produces bin\Release\net10.0\MxGateway.Server.dll
|
||||
```
|
||||
|
||||
Initialize the auth database and mint an API key. The CLI mode is
|
||||
gated by an `apikey` first-arg prefix:
|
||||
|
||||
```powershell
|
||||
$env:MxGateway__ApiKeyPepper = "parity-rig-dev-pepper" # any stable string for dev
|
||||
$srv = "C:\Users\dohertj2\Desktop\mxaccessgw\src\MxGateway.Server\bin\Release\net10.0\MxGateway.Server.dll"
|
||||
|
||||
dotnet $srv apikey init-db # → "init-db: initialized"
|
||||
|
||||
dotnet $srv apikey create-key `
|
||||
--key-id parity-rig `
|
||||
--display-name "OtOpcUa-Parity" `
|
||||
--scopes "session:open,session:close,invoke:read,invoke:write,invoke:secure,events:read,metadata:read"
|
||||
# → "API key: mxgw_parity-rig_<base64suffix>" ← capture this; you can't list secrets later
|
||||
```
|
||||
|
||||
Save that exact key string for `OTOPCUA_PARITY_GW_API_KEY` in step 2.
|
||||
|
||||
Run the server with three env-var overrides — the defaults don't
|
||||
quite match what gRPC + the parity test need:
|
||||
|
||||
```powershell
|
||||
$env:MxGateway__ApiKeyPepper = "parity-rig-dev-pepper" # MUST match the create-key invocation
|
||||
$env:Kestrel__Endpoints__Http__Url = "http://localhost:5120"
|
||||
$env:Kestrel__Endpoints__Http__Protocols = "Http2" # gRPC needs h2c on plain HTTP
|
||||
$env:MxGateway__Worker__ExecutablePath = `
|
||||
"C:\Users\dohertj2\Desktop\mxaccessgw\src\MxGateway.Worker\bin\x86\Release\net48\MxGateway.Worker.exe"
|
||||
# appsettings.json's relative path is missing the \net48 segment; absolute path sidesteps that
|
||||
|
||||
dotnet $srv
|
||||
# → "Now listening on: http://localhost:5120"
|
||||
```
|
||||
|
||||
The worker spawns lazily on the first OpenSession RPC — there's no
|
||||
worker process visible in Task Manager until the first session. If
|
||||
the worker can't spawn, the server returns `Failed to open session
|
||||
session-…` with a `WorkerProcessLaunchException` in the server log.
|
||||
|
||||
NSSM-wrap it later if the rig becomes long-lived; for first-pass
|
||||
provisioning a console window is easier to inspect.
|
||||
|
||||
### 2. Set the parity env vars
|
||||
|
||||
In the test-runner shell:
|
||||
|
||||
```powershell
|
||||
$env:OTOPCUA_PARITY_GW_ENDPOINT = "http://localhost:5120"
|
||||
$env:OTOPCUA_PARITY_GW_API_KEY  = "mxgw_parity-rig_<base64suffix>"   # the exact key string minted by create-key in step 1
|
||||
$env:OTOPCUA_PARITY_CLIENT_NAME = "OtOpcUa-Parity"
|
||||
```
|
||||
|
||||
Elevation status doesn't matter — the legacy Galaxy.Host pipe ACL accepts
|
||||
elevated and non-elevated `dohertj2` shells alike (the Administrators deny
|
||||
ACE was removed 2026-04-24; see `project_galaxy_host_installed.md`).
|
||||
|
||||
### 3. Verify both halves resolve
|
||||
|
||||
```powershell
|
||||
cd C:\Users\dohertj2\Desktop\lmxopcua
|
||||
dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests\ `
|
||||
--filter "FullyQualifiedName~HarnessShapeTests"
|
||||
```
|
||||
|
||||
`Harness_records_a_skip_reason_for_each_unavailable_backend` is the
|
||||
two-line truth-teller:
|
||||
|
||||
- Both `LegacyDriver` and `MxGatewayDriver` non-null → rig is up.
|
||||
- One side null → read its `LegacySkipReason` / `MxGatewaySkipReason` and fix.
|
||||
|
||||
## Running the matrix
|
||||
|
||||
Once both halves resolve:
|
||||
|
||||
```powershell
|
||||
dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests\ `
|
||||
--filter "Category=ParityE2E"
|
||||
```
|
||||
|
||||
This runs all 17 scenario tests across the seven scenario classes
|
||||
(BrowseAndRead / Subscribe / Write / Alarm / History / Reconnect /
|
||||
ScanState). Each scenario class is independent — failures in one don't
|
||||
block the rest.
|
||||
|
||||
Track the result against `docs/v2/Galaxy.ParityMatrix.md`. Update each
|
||||
row to:
|
||||
|
||||
- **green** if the scenario passes
|
||||
- **yellow** if it skipped because the dev Galaxy doesn't have the right
|
||||
shape (see coverage matrix below)
|
||||
- **red** if it asserted a real delta — those are the deltas that block
|
||||
PR 7.2; chase each before retiring the legacy backend
|
||||
|
||||
## Galaxy shape needed for full coverage
|
||||
|
||||
Skip-on-empty-shape scenarios fail-soft today. To turn a skip into a
|
||||
real result, the dev Galaxy needs the shape listed in the **Needs** column:
|
||||
|
||||
| Scenario | Needs | Local rig |
|---|---|---|
| `BrowseAndReadParityTests` (3 tests) | Any deployed objects with attributes | ✅ existing seed |
| `SubscribeAndEventRateParityTests` event-rate | ≥5 attributes whose values *change* in 3s | ⚙ scriptable via graccess-cli |
| `WriteByClassificationParityTests` (FreeAccess/Operate) | A FreeAccess/Operate numeric attribute | ⚙ scriptable via graccess-cli |
| `WriteByClassificationParityTests` (Configure/Tune) | A Configure/Tune attribute | ⚙ scriptable via graccess-cli |
| `AlarmTransitionParityTests` (2 tests) | Attributes with the `$Alarm*` extension | ⚙ scriptable via graccess-cli |
| `HistoryReadParityTests` (historized set) | Attributes with the History extension | ⚙ scriptable via graccess-cli |
| `ScanStateProbeParityTests` (2 tests) | Multiple `$WinPlatform` / `$AppEngine` objects | ❌ **deferred to customer rig**; this dev box is provisioned for one platform only |
|
||||
|
||||
### The single-platform constraint
|
||||
|
||||
The dev box at `DESKTOP-6JL3KKO` is licensed / configured for a single
|
||||
deployed `$WinPlatform`. Adding a second platform isn't feasible here,
|
||||
so `ScanStateProbeParityTests` will skip in a "no overlap" branch on
|
||||
this rig. Both of its scenarios already handle that case gracefully
|
||||
(`Assert.Skip("no overlapping platform hosts between backends — likely
|
||||
the transport names differ but no $WinPlatform was discovered")`), so
|
||||
the matrix reports them as **n/a (deferred)** rather than red.
|
||||
|
||||
Plan: defer the two ScanState scenarios to a customer rig with multiple
|
||||
platforms. The PR 7.2 gate accepts "n/a, deferred" on these rows
|
||||
provided the legacy `GalaxyRuntimeProbeManager` and the in-process
|
||||
`PerPlatformProbeWatcher` have matching unit-test coverage of the
|
||||
state-decoder + member-tracking logic — which they do (PR 4.7's tests).
|
||||
Treat the runtime parity check as a customer-rig acceptance gate before
|
||||
that customer goes live, not a precondition for retiring the legacy
|
||||
projects on this dev box.
|
||||
|
||||
### Provisioning the rest via graccess-cli
|
||||
|
||||
`C:\Users\dohertj2\Desktop\graccess\graccess_cli\` is a .NET Framework
|
||||
4.8 console app over the ArchestrA GRAccess COM API. It can configure
|
||||
templates, instances, attributes, UDAs, extensions, and attribute
|
||||
security — i.e. every row above marked ⚙ scriptable. Full surface in
|
||||
`graccess/graccess_cli/docs/usage.md` and per-area workflow guides
|
||||
(`attribute-editing.md`, `template-editing.md`,
|
||||
`template-instance-editing.md`).
|
||||
|
||||
Reserve a sandbox UDO (e.g. `OtOpcUaParityTest`) to avoid mutating
|
||||
attributes on plant-relevant objects. Concrete commands per requirement:
|
||||
|
||||
**A FreeAccess/Operate numeric attribute** (covers WriteByClassification
|
||||
FreeAccess/Operate scenario):
|
||||
|
||||
```powershell
|
||||
graccess object uda add `
|
||||
--galaxy ZB --name OtOpcUaParityTest --type template `
|
||||
--uda OperateValue --data-type MxFloat `
|
||||
--category MxCategoryWriteable_C --security MxSecurityOperate `
|
||||
--confirm --confirm-target OtOpcUaParityTest
|
||||
```
|
||||
|
||||
**A Configure / Tune attribute** (covers WriteByClassification
|
||||
Configure/Tune scenario):
|
||||
|
||||
```powershell
|
||||
# Tune
|
||||
graccess object uda add `
|
||||
--galaxy ZB --name OtOpcUaParityTest --type template `
|
||||
--uda TuneValue --data-type MxFloat `
|
||||
--category MxCategoryWriteable_T --security MxSecurityTune `
|
||||
--confirm --confirm-target OtOpcUaParityTest
|
||||
|
||||
# Configure
|
||||
graccess object uda add `
|
||||
--galaxy ZB --name OtOpcUaParityTest --type template `
|
||||
--uda ConfigValue --data-type MxFloat `
|
||||
--category MxCategoryWriteable_C --security MxSecurityConfigure `
|
||||
--confirm --confirm-target OtOpcUaParityTest
|
||||
```
|
||||
|
||||
**A changing-value attribute** (covers Subscribe event-rate scenario).
|
||||
Two ways:
|
||||
|
||||
1. *On-scan increment* — bind a script extension that bumps a counter
|
||||
each scan. Simplest to author with `object extension add` against
|
||||
`ScriptExtension` plus `object attribute set` for the script body
|
||||
(see `attribute-editing.md` §"Edit Extensions" for the pattern).
|
||||
2. *External writer loop* — leave the attribute as plain Float and run
|
||||
a one-liner that writes incrementing values from the parity-test
|
||||
shell. Uses the legacy backend path so it's available before the
|
||||
mxgw subscriber is up. This keeps the Galaxy template clean.
|
||||
|
||||
For first-pass validation pick #2 — no template surgery needed, and the
|
||||
write loop runs only during `dotnet test`.
|
||||
|
||||
**Attributes with the `$Alarm*` extension** (covers AlarmTransition
|
||||
scenario). Per `attribute-editing.md` §"Edit Alarm Settings" the
|
||||
likely-named attributes vary by extension type
|
||||
(`Limit`, `RateOfChange`, etc.). Add the extension via:
|
||||
|
||||
```powershell
|
||||
graccess object extension add `
|
||||
--galaxy ZB --name OtOpcUaParityTest --type template `
|
||||
--extension-type AnalogLimitAlarm --primitive AlarmInput `
|
||||
--object-extension `
|
||||
--confirm --confirm-target OtOpcUaParityTest
|
||||
```
|
||||
|
||||
Then set HiHi/Hi/Lo/LoLo limit values + priority on the resulting
|
||||
attributes via `object attribute set`. Inspect first via
|
||||
`object attributes` to see the names the extension introduces — they
|
||||
differ across AVEVA versions.
|
||||
|
||||
**Attributes with the History extension** (covers HistoryRead routing
|
||||
scenario). History settings are usually exposed as attributes or extension
attributes; `attribute-editing.md` §"Edit History Settings" covers the
|
||||
discovery flow. Quick start:
|
||||
|
||||
```powershell
|
||||
graccess object extension add `
|
||||
--galaxy ZB --name OtOpcUaParityTest --type template `
|
||||
--extension-type HistoryExtension --primitive HistoryRecord `
|
||||
--object-extension `
|
||||
--confirm --confirm-target OtOpcUaParityTest
|
||||
|
||||
# Then enable history on whichever attribute the extension points at
|
||||
graccess object attribute set `
|
||||
--galaxy ZB --name OtOpcUaParityTest --type template `
|
||||
--attribute HistoryEnabled --value true --data-type bool `
|
||||
--confirm --confirm-target OtOpcUaParityTest
|
||||
```
|
||||
|
||||
**Deploy + restart Galaxy.Host after any of the above** so MxAccess
|
||||
sees the change:
|
||||
|
||||
```powershell
|
||||
graccess object deploy --galaxy ZB --name OtOpcUaParityTest_001 `
|
||||
--confirm --confirm-target OtOpcUaParityTest_001
|
||||
Restart-Service OtOpcUaGalaxyHost
|
||||
```
|
||||
|
||||
Then re-run the parity matrix. The previously-skipped scenarios should
|
||||
now find a sandbox attribute matching their selector and assert.
|
||||
|
||||
## Soak run
|
||||
|
||||
The 24h × 50k soak gates the production confidence half of PR 7.2.
|
||||
|
||||
```powershell
|
||||
$env:OTOPCUA_SOAK_RUN = "1"
|
||||
$env:OTOPCUA_SOAK_TAGS = "<actual tag count if Galaxy < 50k>"
|
||||
$env:OTOPCUA_SOAK_MINUTES = "1440" # default 24h; compress for first runs
|
||||
$env:OTOPCUA_SOAK_DROP_PCT = "0.5"
|
||||
|
||||
dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests\ `
|
||||
--filter "Category=Soak"
|
||||
```
|
||||
|
||||
The test logs a per-minute CSV-style line to stdout:
|
||||
|
||||
```
|
||||
soak,1.0,received=51234,dispatched=51234,dropped=0,ws_mb=412
|
||||
soak,2.0,received=102468,dispatched=102468,dropped=0,ws_mb=415
|
||||
...
|
||||
```
|
||||
|
||||
Capture stdout to a file for post-run analysis. The three guards
|
||||
(`received` growing, `dropped/received` ratio, working-set delta) all
|
||||
fire mid-run rather than at end-of-test, so a failure surfaces within
|
||||
the first few minutes if the architecture is wrong.
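
For post-run analysis of a captured log, the three guards can be re-checked offline. The sketch below is an assumption-laden illustration, not the test's own logic: `check_soak_log` is a hypothetical helper, the field order follows the sample rows above, and the 1024 MB default mirrors the 1 GB leak guard described in `docs/v2/Galaxy.Performance.md`.

```bash
#!/usr/bin/env bash
# check_soak_log: read soak rows on stdin and fail on the first violated guard.
#   $1 = max dropped/received percentage (default 0.5)
#   $2 = max working-set growth over the first row, in MB (default 1024)
check_soak_log() {
  awk -F',' -v max_pct="${1:-0.5}" -v max_ws="${2:-1024}" '
    function fail(msg) { print msg " at minute " $2; bad = 1; exit 1 }
    $1 != "soak" { next }                  # skip unrelated test-runner output
    {
      split($3, r, "="); split($5, d, "="); split($6, w, "=")
      received = r[2] + 0; dropped = d[2] + 0; ws = w[2] + 0
      if (rows > 0 && received <= prev) fail("received stopped growing")
      if (rows == 0) base = ws             # first row is the baseline
      rows++
      if (received > 0 && 100 * dropped / received > max_pct + 0)
        fail("drop ratio above ceiling")
      if (ws - base > max_ws + 0) fail("working set grew past limit")
      prev = received
    }
    END {
      if (bad) exit 1
      if (rows == 0) { print "no soak rows found"; exit 1 }
    }
  '
}
```

Usage, assuming stdout was captured to a file named `soak.log` (a hypothetical name): `check_soak_log 0.5 1024 < soak.log`.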
|
||||
|
||||
## Compressed-tag soak (when Galaxy isn't 50k tags)
|
||||
|
||||
A first-pass validation is fine with the override:
|
||||
|
||||
```powershell
|
||||
$env:OTOPCUA_SOAK_RUN = "1"
|
||||
$env:OTOPCUA_SOAK_TAGS = "500" # whatever the dev Galaxy has
|
||||
$env:OTOPCUA_SOAK_MINUTES = "60" # one hour is enough to surface plumbing bugs
|
||||
$env:OTOPCUA_SOAK_DROP_PCT = "1.0"
|
||||
```
|
||||
|
||||
This validates the *plumbing* (bounded channel, pump invariants, leak
|
||||
guard) but doesn't pin the 50k-tag scaling assertion. Defer the full
|
||||
50k validation to a customer rig with that scale, or build a synthetic
|
||||
Galaxy with a script that imports 50k attributes onto a generated UDO
|
||||
(~2 hours of one-off work).
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **`MxGatewaySkipReason` says "mxaccessgw not reachable"** — the gw
|
||||
isn't listening, or it's on a different port. `Test-NetConnection
|
||||
localhost -Port 5120` is the quick check.
|
||||
- **`MxGatewaySkipReason` says "mxgateway backend boot failed:
|
||||
RpcException: Unauthenticated"** — API key mismatch. Verify the
|
||||
`OTOPCUA_PARITY_GW_API_KEY` env var matches the gw's configured key.
|
||||
- **`LegacySkipReason` says "Galaxy ZB SQL not reachable on
|
||||
localhost:1433"** — SQL Server isn't running, or its TCP listener is
|
||||
off. Check `services.msc` for the SQL Server (default) instance.
|
||||
- **`LegacySkipReason` says "Galaxy.Host EXE not built"** — the parity
|
||||
harness looks under `src/.../bin/Debug/net48/`. Build it once:
|
||||
`dotnet build src\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host`. Note the
|
||||
separately-published copy at `C:\publish\OtOpcUaGalaxyHost\` is for
|
||||
the Windows service; the parity harness spawns its own subprocess.
|
||||
- **Both halves resolve but parity scenarios assert deltas** — that's
|
||||
the expected outcome the rig exists to surface. Review each delta
|
||||
against `docs/v2/Galaxy.ParityMatrix.md`'s "Accepted deltas" section
|
||||
to decide whether it's a real bug or a pre-accepted divergence.
|
||||
|
||||
## After the rig is green
|
||||
|
||||
When the matrix is fully green or carries documented accepted-deltas,
|
||||
PR 7.2 (legacy project deletion) is unblocked. The only follow-up is
|
||||
to promote any newly-discovered accepted-delta to the matrix doc with
|
||||
the why so the matrix history stays auditable.
|
||||
@@ -0,0 +1,152 @@
|
||||
# Galaxy backend performance
|
||||
|
||||
This document covers the performance surface of the in-process
|
||||
`GalaxyDriver` (the v2 mxgw backend) — the ActivitySource it emits, the
|
||||
metrics on its EventPump, the soak scenario that validates it, and the
|
||||
tuning knobs you can reach for when the dev parity rig surfaces a hot
|
||||
spot.
|
||||
|
||||
## Tracing surface (PR 6.1)
|
||||
|
||||
The driver emits spans on the `ZB.MOM.WW.OtOpcUa.Driver.Galaxy`
|
||||
ActivitySource. No package dependency on OpenTelemetry — the host
|
||||
process picks the listener (OTLP exporter, dotnet-trace, Application
|
||||
Insights). Wire it via `TracerProviderBuilder.AddSource(...)` in the
|
||||
host's tracing pipeline.
|
||||
|
||||
| Span | Source | Tags |
|------|--------|------|
| `galaxy.subscribe_bulk` | `TracedGalaxySubscriber` | `galaxy.client`, `galaxy.tag_count`, `galaxy.buffered_interval_ms`, `galaxy.success_count` |
| `galaxy.unsubscribe_bulk` | `TracedGalaxySubscriber` | `galaxy.client`, `galaxy.tag_count` |
| `galaxy.stream_events` | `TracedGalaxySubscriber` | `galaxy.client`, `galaxy.event_count` (set on stream end) |
| `galaxy.write` | `TracedGalaxyDataWriter` | `galaxy.client`, `galaxy.tag_count`, `galaxy.secured_write_count`, `galaxy.success_count` |
| `galaxy.get_hierarchy` | `TracedGalaxyHierarchySource` | `galaxy.client`, `galaxy.object_count` |
|
||||
|
||||
The stream-events span deliberately covers the *entire* stream lifetime
|
||||
rather than per-event spans — at 50k tags / 1Hz the per-event volume
|
||||
would dominate the trace pipeline. Per-event visibility flows through
|
||||
the metrics surface instead.
|
||||
|
||||
## Metrics surface (PR 6.2)
|
||||
|
||||
`EventPump` publishes three counters on the
|
||||
`ZB.MOM.WW.OtOpcUa.Driver.Galaxy` meter, each tagged with
|
||||
`galaxy.client` so multi-driver hosts can split by source:
|
||||
|
||||
| Counter | Unit | Meaning |
|---------|------|---------|
| `galaxy.events.received` | `{event}` | MxEvents read from the gateway StreamEvents stream |
| `galaxy.events.dispatched` | `{event}` | MxEvents that made it through the bounded channel into `OnDataChange` |
| `galaxy.events.dropped` | `{event}` | MxEvents discarded because the bounded channel was full (newest-dropped) |
|
||||
|
||||
The invariant is `received = dispatched + dropped + (in-flight in the
|
||||
channel)`. Watch the dropped counter — it is the leading indicator of
|
||||
listener back-pressure. A non-zero dropped rate means a downstream
|
||||
consumer (DriverNodeManager → UA notification queue → client) is
|
||||
slower than the gw event stream; investigate that consumer before
|
||||
raising `EventPump` channel capacity.
|
||||
|
||||
### Bounded channel design
|
||||
|
||||
The pump runs two background tasks:
|
||||
|
||||
1. **Producer** — reads from `IGalaxySubscriber.StreamEventsAsync`,
|
||||
increments `events.received`, and `TryWrite`s into a bounded
|
||||
`Channel<MxEvent>`. When the channel is full, the producer counts
|
||||
the drop and continues reading the gw stream so back-pressure does
|
||||
not propagate upstream (which would stall the gw worker and cascade
|
||||
to *all* driver instances sharing that worker).
|
||||
2. **Consumer** — reads from the channel, fans out via
|
||||
`SubscriptionRegistry`, increments `events.dispatched`.
|
||||
|
||||
Default channel capacity is 50_000 (one second of headroom at 50k
|
||||
tags / 1Hz). Override via the `EventPump` constructor's
|
||||
`channelCapacity` parameter; the public-facing wiring path in
|
||||
`GalaxyDriver.EnsureEventPumpStarted` does not yet expose this through
|
||||
`GalaxyDriverOptions` because no parity scenario has needed it. Add it
|
||||
when soak data does.
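
To see how the drop-newest policy and the counter invariant interact, a coarse tick-based model can help. This is only a sketch: `simulate_pump` is a hypothetical function that treats each tick as one batch from the producer followed by one batch drained by the consumer, which is far simpler than the real two-task `EventPump`.

```bash
#!/usr/bin/env bash
# simulate_pump CAPACITY PRODUCED_PER_TICK CONSUMED_PER_TICK TICKS
# Models a bounded queue where events that do not fit are dropped (newest
# dropped) and the producer never blocks.
simulate_pump() {
  local capacity=$1 produce=$2 consume=$3 ticks=$4
  local received=0 dispatched=0 dropped=0 depth=0 t room take
  for ((t = 0; t < ticks; t++)); do
    received=$((received + produce))
    room=$((capacity - depth))
    if ((produce > room)); then
      dropped=$((dropped + produce - room))   # overflow is counted, not queued
      depth=$capacity
    else
      depth=$((depth + produce))
    fi
    take=$((depth < consume ? depth : consume))
    depth=$((depth - take))
    dispatched=$((dispatched + take))
  done
  # Invariant: received = dispatched + dropped + in_flight
  echo "received=$received dispatched=$dispatched dropped=$dropped in_flight=$depth"
}

# Consumer 20% slower than a 50k events/tick stream, default capacity.
simulate_pump 50000 50000 40000 3
```

In this model a consumer that keeps up produces zero drops regardless of capacity, while a slower consumer drops at a steady rate once the queue fills. That matches the guidance to investigate the consumer before raising capacity.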
|
||||
|
||||
## Buffered update interval (PR 6.3)
|
||||
|
||||
`MxAccess.PublishingIntervalMs` (default 1000) flows through both
|
||||
subscribe paths:
|
||||
|
||||
- `GalaxyDriver.SubscribeAsync` — the caller's `publishingInterval`
|
||||
wins when non-zero (the server's UA subscription publishingInterval
|
||||
drives this in production). When the caller passes
|
||||
`TimeSpan.Zero`, the configured option is the fallback.
|
||||
- `PerPlatformProbeWatcher` — the watcher passes the configured value
|
||||
through `SubscribeBulkAsync` so probe `ScanState` changes publish at
|
||||
the deployment's chosen cadence.
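
The fallback rule for `GalaxyDriver.SubscribeAsync` reduces to a one-line decision. The sketch below restates it in shell purely for illustration; `effective_interval_ms` is a hypothetical name and the driver's actual code is not shown in this document.

```bash
#!/usr/bin/env bash
# effective_interval_ms CALLER_MS [CONFIGURED_MS]
# A non-zero caller interval wins; zero falls back to the configured
# MxAccess.PublishingIntervalMs (default 1000).
effective_interval_ms() {
  local caller_ms=$1 configured_ms=${2:-1000}
  if (( caller_ms != 0 )); then
    echo "$caller_ms"
  else
    echo "$configured_ms"
  fi
}

effective_interval_ms 250 1000   # caller's UA publishingInterval wins
effective_interval_ms 0 1000     # TimeSpan.Zero falls back to the option
```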
|
||||
|
||||
A session-level `SetBufferedUpdateInterval` RPC exists in the gw
|
||||
protocol but the .NET client doesn't expose a typed helper yet —
|
||||
adjusting an existing subscription's interval mid-flight is a
|
||||
follow-up. Today's path subscribes once at the right interval, which
|
||||
covers the common case.
|
||||
|
||||
## Soak scenario (PR 6.4)
|
||||
|
||||
`SoakScenarioTests.Soak_HoldsSubscription_AndKeepsEventStreamFlowing`
|
||||
in `Driver.Galaxy.ParityTests` is the long-running validation. It
|
||||
subscribes a configurable tag count (default 50_000), holds the
|
||||
subscription for a configurable duration (default 24h), polls the
|
||||
three counters every minute, and asserts:
|
||||
|
||||
- `events.received` continues to grow (gw stream isn't stuck)
|
||||
- `events.dropped / events.received` stays under the configured
|
||||
ceiling (default 0.5%)
|
||||
- process working-set doesn't grow more than 1 GB above baseline
|
||||
(leak guard)
|
||||
|
||||
Always skipped unless the operator opts in:
|
||||
|
||||
```bash
|
||||
# Full 24h × 50k soak (production validation)
|
||||
OTOPCUA_SOAK_RUN=1 dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/
|
||||
|
||||
# Compressed CI-friendly run (10min × 1k tags, 1% drop ceiling)
|
||||
OTOPCUA_SOAK_RUN=1 OTOPCUA_SOAK_MINUTES=10 OTOPCUA_SOAK_TAGS=1000 \
|
||||
OTOPCUA_SOAK_DROP_PCT=1.0 \
|
||||
dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.ParityTests/
|
||||
```
|
||||
|
||||
The scenario writes a per-minute CSV-style row to stdout
|
||||
(`soak,<minutes>,received=…,dispatched=…,dropped=…,ws_mb=…`) so an
|
||||
operator can grep the test runner output mid-run.
|
||||
|
||||
## Tuned defaults (PR 6.5)
|
||||
|
||||
| Option | Default | Source | Notes |
|--------|---------|--------|-------|
| `Gateway.ConnectTimeoutSeconds` | 10 | unchanged | Cold-start network paths fit comfortably; soak never observed >2s |
| `Gateway.DefaultCallTimeoutSeconds` | 30 | **bumped from 5** in PR 6.5 | A 50k-tag `SubscribeBulk` can exceed 5s under MxAccess COM apartment lock contention; 30s leaves headroom while still failing fast on a wedged worker |
| `Gateway.StreamTimeoutSeconds` | 0 (unlimited) | unchanged | The stream must run for the lifetime of the driver |
| `MxAccess.PublishingIntervalMs` | 1000 | unchanged | Matches the legacy `LMXProxyServer` cadence; deployments needing tighter health visibility can dial down |
| `Reconnect.InitialBackoffMs` | 500 | unchanged | First retry shouldn't dogpile a recovering gw |
| `Reconnect.MaxBackoffMs` | 30_000 | unchanged | 30s ceiling so a long-down gw doesn't sit in 5+ min backoff |
| `Repository.DiscoverPageSize` | 5000 | unchanged | One Galaxy page round-trip per ~5k objects; soak hadn't surfaced pressure |
| `EventPump` channel capacity | 50_000 | unchanged | One second of headroom at 50k tags / 1Hz |
|
||||
|
||||
The unchanged rows are not "definitely correct" — they are "no live
|
||||
data argues for changing them." Re-run the soak scenario after every
|
||||
substantive driver change, and revise this table when the data does.
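
To get a feel for how the two `Reconnect` defaults bound recovery time, the sketch below prints a retry schedule. It assumes a plain doubling policy with no jitter, which this document does not confirm; `backoff_schedule` is a hypothetical helper and only the 500 ms and 30_000 ms values come from the table.

```bash
#!/usr/bin/env bash
# backoff_schedule [INITIAL_MS] [MAX_MS] [ATTEMPTS]
# Prints one delay per attempt, doubling each time and clamping at MAX_MS.
# ASSUMPTION: doubling without jitter; the driver's real policy may differ.
backoff_schedule() {
  local delay=${1:-500} max=${2:-30000} attempts=${3:-8} i
  for ((i = 0; i < attempts; i++)); do
    echo "$delay"
    delay=$((delay * 2))
    if ((delay > max)); then
      delay=$max
    fi
  done
}

backoff_schedule 500 30000 8
```

Under that assumption the ceiling is reached on the seventh attempt, after roughly a minute of cumulative waiting.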
|
||||
|
||||
## Where to look first when something's slow
|
||||
|
||||
1. **Slow `Discover`?** Inspect `galaxy.get_hierarchy` span duration
|
||||
and `galaxy.object_count`. The gw walks the Galaxy DB serially;
|
||||
slow Discovers usually mean a slow ZB SQL.
|
||||
2. **Subscribe pile-up?** `galaxy.subscribe_bulk` span duration
|
||||
correlates with `galaxy.tag_count`. If duration ÷ tag_count starts
|
||||
climbing, the gw worker is probably under apartment-lock pressure.
|
||||
3. **Events stalled?** Watch `galaxy.events.received`. Flat-lined
|
||||
means the gw stream is wedged — kick the reconnect supervisor by
|
||||
forcing a `ReinitializeAsync`.
|
||||
4. **Dropped events?** Non-zero `galaxy.events.dropped` means a slow
|
||||
downstream consumer. Profile `OnDataChange` handlers in
|
||||
`DriverNodeManager` before bumping the channel capacity.
|
||||
5. **Memory growing?** Confirm with the soak scenario's working-set
|
||||
leak guard. Likely culprits: lingering subscription handles in
|
||||
`SubscriptionRegistry`, or a downstream consumer retaining
|
||||
`DataValueSnapshot` references past their useful life.
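The "duration ÷ tag_count" check from item 2 could be sketched as follows. This is illustrative only: the span data would come from whatever tracing backend is in use, and the 2x threshold, the median baseline, and both function names are assumptions rather than anything the driver defines.

```python
from statistics import median


def per_tag_cost_ms(duration_ms: float, tag_count: int) -> float:
    """Per-tag cost of one galaxy.subscribe_bulk span."""
    if tag_count <= 0:
        raise ValueError("tag_count must be positive")
    return duration_ms / tag_count


def is_climbing(samples, factor: float = 2.0) -> bool:
    """True if the newest sample's per-tag cost exceeds factor x the median of earlier ones.

    samples: list of (duration_ms, tag_count) tuples, oldest first.
    """
    if len(samples) < 2:
        return False
    costs = [per_tag_cost_ms(d, n) for d, n in samples]
    baseline = median(costs[:-1])
    return costs[-1] > factor * baseline


# Hypothetical span samples: the last one costs roughly 3x more per tag.
history = [(1000, 10_000), (1100, 10_000), (5000, 50_000), (9000, 30_000)]
suspect = is_climbing(history)
```

Normalising by tag count matters because a slow span over many tags is expected; only a rising per-tag cost points at apartment-lock pressure.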
|
||||
+45
-42
@@ -4,6 +4,7 @@
|
||||
>
|
||||
> **Branch**: `v2`
|
||||
> **Created**: 2026-04-17
|
||||
> **Updated 2026-04-28**: Docker workloads moved off the Windows dev VM to a shared Linux Docker host at `10.100.0.35` so the dev VM can have its GPU re-attached via ESXi passthrough (Hyper-V/WSL2 was blocking it). The two-tier model below is updated accordingly: per-developer Docker Desktop is gone; SQL Server + driver fixtures all live on the central Linux host, identifiable via `docker ps --filter label=project=lmxopcua`.
|
||||
|
||||
## Scope
|
||||
|
||||
@@ -13,30 +14,31 @@ Every external resource a developer needs on their machine, plus the dedicated i
|
||||
|
||||
## Two Environment Tiers
|
||||
|
||||
Per decision #99:
|
||||
Per decision #99 (updated 2026-04-28):
|
||||
|
||||
| Tier | Purpose | Where it runs | Resources |
|
||||
|------|---------|---------------|-----------|
|
||||
| **PR-CI / inner-loop dev** | Fast, runs on minimal Windows + Linux build agents and developer laptops | Each developer's machine; CI runners | Pure-managed in-process simulators (NModbus, OPC Foundation reference server, FOCAS TCP stub from test project). No Docker, no VMs. |
|
||||
| **Nightly / integration CI** | Full driver-stack validation against real wire protocols | One dedicated Windows host with Docker Desktop + Hyper-V + a TwinCAT XAR VM | All Docker simulators (`oitc/modbus-server`, `ab_server`, Snap7), TwinCAT XAR VM, Galaxy.Host installer + dev Galaxy access, FOCAS TCP stub binary, FOCAS FaultShim assembly |
|
||||
| **PR-CI / inner-loop dev** | Fast, runs on minimal Windows + Linux build agents and developer laptops | Each developer's machine; CI runners | Pure-managed in-process simulators (NModbus, OPC Foundation reference server, FOCAS TCP stub from test project). No Docker, no VMs locally. |
|
||||
| **Integration / nightly CI** | Full driver-stack validation against real wire protocols | **Shared Linux Docker host at `10.100.0.35`** (Debian 13, Docker 29.2.1) — one host for all developers; replaces the former per-developer Docker Desktop + Hyper-V model | All Docker simulators (pymodbus, ab_server, python-snap7, opc-plc) + central SQL Server, all running as `/opt/otopcua-<driver>/` stacks with the `project=lmxopcua` label. TwinCAT XAR + the Galaxy/mxaccessgw stack stay on the Windows dev VM (license + Hyper-V constraints unchanged) |
|
||||
|
||||
The tier split keeps developer onboarding fast (no Docker required for first build) while concentrating the heavy simulator setup on one machine the team maintains.
|
||||
The Linux Docker host is shared because (a) only one team member needs it active at a time, (b) it removes the per-developer Docker Desktop install, and (c) the dev VM no longer needs Hyper-V/WSL2 — freeing it for GPU passthrough.
|
||||
|
||||
## Installed Inventory — This Machine
|
||||
## Installed Inventory — Dev VM (`DESKTOP-6JL3KKO`)
|
||||
|
||||
Running record of every v2 dev service stood up on this developer machine. Updated on every install / config change. Credentials here are **dev-only** per decision #137 — production uses Integrated Security / gMSA per decision #46 and never any value in this table.
|
||||
Running record of v2 dev services on the Windows dev VM. Updated on every install / config change. Credentials here are **dev-only** per decision #137 — production uses Integrated Security / gMSA per decision #46 and never any value in this table.
|
||||
|
||||
**Last updated**: 2026-04-17
|
||||
**Last updated**: 2026-04-28 — Docker Desktop + WSL2 removed; Docker workloads now live on the Linux Docker host (see next section).
|
||||
|
||||
### Host
|
||||
|
||||
| Attribute | Value |
|
||||
|-----------|-------|
|
||||
| Machine name | `DESKTOP-6JL3KKO` |
|
||||
| User | `dohertj2` (member of local Administrators + `docker-users`) |
|
||||
| VM platform | VMware (`VMware20,1`), nested virtualization enabled |
|
||||
| Machine name | `DESKTOP-6JL3KKO` (10.100.0.48) |
|
||||
| User | `dohertj2` (local Administrators) |
|
||||
| VM platform | VMware ESXi |
|
||||
| CPU | Intel Xeon E5-2697 v4 @ 2.30GHz (3 vCPUs) |
|
||||
| OS | Windows (WSL2 + Hyper-V Platform features installed) |
|
||||
| OS | Windows 10 Enterprise (10.0.19045) |
|
||||
| GPU | (Re-attached after WSL2/Hyper-V removal) |
|
||||
|
||||
### Toolchain
|
||||
|
||||
@@ -46,36 +48,40 @@ Running record of every v2 dev service stood up on this developer machine. Updat
|
||||
| .NET AspNetCore runtime | 10.0.5 | `C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App\` | Pre-installed |
|
||||
| .NET NETCore runtime | 10.0.5 | `C:\Program Files\dotnet\shared\Microsoft.NETCore.App\` | Pre-installed |
|
||||
| .NET WindowsDesktop runtime | 10.0.5 | `C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App\` | Pre-installed |
|
||||
| .NET Framework 4.8 SDK | — | Pending (needed for Phase 2 Galaxy.Host; not yet required) | — |
|
||||
| .NET Framework 4.8 SDK | — | Optional — only needed when building the mxaccessgw worker (sibling repo, x86 net48) | — |
|
||||
| Git | Pre-installed | Standard | — |
|
||||
| PowerShell 7 | Pre-installed | Standard | — |
|
||||
| winget | v1.28.220 | Standard Windows feature | — |
|
||||
| WSL | Default v2, distro `docker-desktop` `STATE Running` | — | `wsl --install --no-launch` (2026-04-17) |
|
||||
| Docker Desktop | 29.3.1 (engine) / Docker Desktop 4.68.0 (app) | Standard | `winget install --id Docker.DockerDesktop` (2026-04-17) |
|
||||
| Docker CLI (standalone, no daemon) | 29.3.1 | `%USERPROFILE%\bin\docker.exe` | Static binary from download.docker.com (2026-04-28) |
|
||||
| Docker Compose CLI plugin | latest | `%USERPROFILE%\.docker\cli-plugins\docker-compose.exe` | Direct download from github.com/docker/compose (2026-04-28) |
|
||||
| `lmxopcua-fix.ps1` helper | n/a | `%USERPROFILE%\bin\lmxopcua-fix.ps1` | See "Docker host" section below |
|
||||
| `dotnet-ef` CLI | 10.0.6 | `%USERPROFILE%\.dotnet\tools\dotnet-ef.exe` | `dotnet tool install --global dotnet-ef --version 10.0.*` (2026-04-17) |
|
||||
| ~~Docker Desktop~~ | — | Removed 2026-04-28 — replaced by remote Linux Docker host | — |
|
||||
| ~~WSL2 (`docker-desktop` distro)~~ | — | Removed 2026-04-28 (frees Hyper-V for GPU passthrough) | — |
|
||||
|
||||
### Services
|
||||
|
||||
| Service | Container / Process | Version | Host:Port | Credentials (dev-only) | Data location | Status |
|
||||
|---------|---------------------|---------|-----------|------------------------|---------------|--------|
|
||||
| **Central config DB** | Docker container `otopcua-mssql` (image `mcr.microsoft.com/mssql/server:2022-latest`) | 16.0.4250.1 (RTM-CU24-GDR, KB5083252) | `localhost:14330` (host) → `1433` (container) — remapped from 1433 to avoid collision with the native MSSQL14 instance that hosts the Galaxy `ZB` DB (both bind 0.0.0.0:1433; whichever wins the race gets connections) | User `sa` / Password `OtOpcUaDev_2026!` | Docker named volume `otopcua-mssql-data` (mounted at `/var/opt/mssql` inside container) | ✅ Running — `InitialSchema` migration applied, 16 entity tables live |
|
||||
| **Central config DB** | Docker container `otopcua-mssql` on the Linux Docker host (image `mcr.microsoft.com/mssql/server:2022-latest`) | 16.0.4250.1 (RTM-CU24-GDR, KB5083252) | `10.100.0.35:14330` → `1433` (container) — port 14330 retained from the previous local-container setup so connection-string ports don't churn | User `sa` / Password `OtOpcUaDev_2026!` | Docker named volume `otopcua-mssql-data` on the Docker host | ✅ Running on Docker host (`/opt/otopcua-mssql/`) since 2026-04-28; carries `project=lmxopcua` label |
|
||||
| Dev Galaxy (AVEVA System Platform) | Local install on this dev box — full ArchestrA + Historian + OI-Server stack | v1 baseline | Local COM via MXAccess (`C:\Program Files (x86)\ArchestrA\Framework\bin\ArchestrA.MXAccess.dll`); Historian via `aaH*` services; SuiteLink via `slssvc` | Windows Auth | Galaxy repository DB `ZB` on local SQL Server (separate instance from `otopcua-mssql` — legacy v1 Galaxy DB, not related to v2 config DB) | ✅ **Fully available — Phase 2 lift unblocked.** 27 ArchestrA / AVEVA / Wonderware services running incl. `aaBootstrap`, `aaGR` (Galaxy Repository), `aaLogger`, `aaUserValidator`, `aaPim`, `ArchestrADataStore`, `AsbServiceManager`, `AutoBuild_Service`; full Historian set (`aahClientAccessPoint`, `aahGateway`, `aahInSight`, `aahSearchIndexer`, `aahSupervisor`, `InSQLStorage`, `InSQLConfiguration`, `InSQLEventSystem`, `InSQLIndexing`, `InSQLIOServer`, `InSQLManualStorage`, `InSQLSystemDriver`, `HistorianSearch-x64`); `slssvc` (Wonderware SuiteLink); `OI-Gateway` install present at `C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\` (decision #142 AppServer-via-OI-Gateway smoke test now also unblocked) |
|
||||
| GLAuth (LDAP) | Local install at `C:\publish\glauth\` | v2.4.0 | `localhost:3893` (LDAP) / `3894` (LDAPS, disabled) | Direct-bind `cn={user},dc=lmxopcua,dc=local` per `auth.md`; users `readonly`/`writeop`/`writetune`/`writeconfig`/`alarmack`/`admin`/`serviceaccount` (passwords in `glauth.cfg` as SHA-256) | `C:\publish\glauth\` | ✅ Running (NSSM service `GLAuth`). Phase 1 Admin uses GroupToRole map `ReadOnly→ConfigViewer`, `WriteOperate→ConfigEditor`, `AlarmAck→FleetAdmin`. v2-rebrand to `dc=otopcua,dc=local` is a future cosmetic change |
|
||||
| OPC Foundation reference server | Not yet built | — | `localhost:62541` (target) | `user1` / `password1` (reference-server defaults) | — | Pending (needed for Phase 5 OPC UA Client driver testing) |
|
||||
| FOCAS TCP stub | Not yet built | — | `localhost:8193` (target) | n/a | — | Pending (built in Phase 5) |
|
||||
| Modbus simulator (`oitc/modbus-server`) | — | — | `localhost:502` (target) | n/a | — | Pending (needed for Phase 3 Modbus driver; moves to integration host per two-tier model) |
|
||||
| libplctag `ab_server` | — | — | `localhost:44818` (target) | n/a | — | Pending (Phase 3/4 AB CIP and AB Legacy drivers) |
|
||||
| Snap7 Server | — | — | `localhost:102` (target) | n/a | — | Pending (Phase 4 S7 driver) |
|
||||
| TwinCAT XAR VM | — | — | `localhost:48898` (ADS) (target) | TwinCAT default route creds | — | Pending — runs in Hyper-V VM, not on this dev box (per decision #135) |
|
||||
| OPC Foundation reference server | Not yet built | — | `10.100.0.35:62541` (target) | `user1` / `password1` (reference-server defaults) | — | Pending (needed for Phase 5 OPC UA Client driver testing) |
|
||||
| FOCAS TCP stub | Not yet built | — | `10.100.0.35:8193` (target) | n/a | — | Pending (built in Phase 5; runs on Docker host) |
|
||||
| Modbus simulator (`otopcua-pymodbus:3.13.0`) | Docker compose at `/opt/otopcua-modbus/` on Docker host | pinned 3.13.0 | `10.100.0.35:5020` | n/a | n/a | Stack staged; bring up with `lmxopcua-fix up modbus <profile>` from this VM |
|
||||
| AB CIP fixture (`otopcua-ab-server:libplctag-release`) | Docker compose at `/opt/otopcua-abcip/` on Docker host | source-pinned `release` tag | `10.100.0.35:44818` | n/a | n/a | Stack staged; bring up with `lmxopcua-fix up abcip <profile>` from this VM |
|
||||
| S7 fixture (`otopcua-python-snap7:1.0`) | Docker compose at `/opt/otopcua-s7/` on Docker host | python-snap7 ≥2.0 | `10.100.0.35:1102` | n/a | n/a | Stack staged; bring up with `lmxopcua-fix up s7 s7_1500` from this VM |
|
||||
| OPC UA simulator (`mcr.microsoft.com/iotedge/opc-plc:2.14.10`) | Docker compose at `/opt/otopcua-opcuaclient/` on Docker host | pinned 2.14.10 | `10.100.0.35:50000` | anonymous | n/a | Stack staged; bring up with `lmxopcua-fix up opcuaclient` from this VM |
|
||||
| TwinCAT XAR VM | — | — | TBD via Hyper-V on a separate Windows host (NOT this dev VM) | TwinCAT default route creds | — | Pending — Hyper-V removed from this dev VM; XAR will live on a separate dedicated Windows machine if needed |
|
||||
|
||||
### Connection strings for `appsettings.Development.json`
|
||||
|
||||
Copy-paste-ready. **Never commit these to the repo** — they go in `appsettings.Development.json` (gitignored per the standard .NET convention) or in user-scoped dotnet secrets.
|
||||
Copy-paste-ready. The checked-in `appsettings.json` defaults already point at the Docker host (`10.100.0.35,14330`), so `appsettings.Development.json` is only needed for per-developer overrides.
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"ConfigDatabase": {
|
||||
"ConnectionString": "Server=localhost,14330;Database=OtOpcUaConfig_Dev;User Id=sa;Password=OtOpcUaDev_2026!;TrustServerCertificate=true;Encrypt=false;"
|
||||
"ConnectionString": "Server=10.100.0.35,14330;Database=OtOpcUaConfig_Dev;User Id=sa;Password=OtOpcUaDev_2026!;TrustServerCertificate=true;Encrypt=false;"
|
||||
},
|
||||
"Authentication": {
|
||||
"Ldap": {
|
||||
@@ -89,29 +95,26 @@ Copy-paste-ready. **Never commit these to the repo** — they go in `appsettings
|
||||
}
|
||||
```
|
||||
|
||||
LDAP host stays `localhost` because GLAuth still runs as a native NSSM service on this dev VM (not yet migrated to the Docker host).
|
||||
|
||||
For xUnit test fixtures that need a throwaway DB per test run, build connection strings with `Database=OtOpcUaConfig_Test_{timestamp}` to avoid cross-run pollution.
|
||||
|
||||
### Container management quick reference
|
||||
|
||||
All commands SSH into the Docker host. The standalone Windows `docker.exe` on this VM has no daemon — every operation runs server-side via the helper.
|
||||
|
||||
```powershell
|
||||
# Start / stop the SQL Server container (survives reboots via Docker Desktop auto-start)
|
||||
docker stop otopcua-mssql
|
||||
docker start otopcua-mssql
|
||||
# Status / log / lifecycle from this VM
|
||||
lmxopcua-fix ls # list lmxopcua-tagged containers + status
|
||||
lmxopcua-fix logs mssql # SQL Server log tail
|
||||
ssh dohertj2@10.100.0.35 'docker stop otopcua-mssql; docker start otopcua-mssql'
|
||||
ssh dohertj2@10.100.0.35 'docker logs otopcua-mssql --tail 50'
|
||||
|
||||
# Logs (useful for diagnosing startup failures or login issues)
|
||||
docker logs otopcua-mssql --tail 50
|
||||
# sqlcmd inside the container (run on the Docker host)
|
||||
ssh dohertj2@10.100.0.35 'docker exec otopcua-mssql /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P "OtOpcUaDev_2026!" -C -Q "SELECT @@VERSION"'
|
||||
|
||||
# Shell into the container (rarely needed; sqlcmd is the usual tool)
|
||||
docker exec -it otopcua-mssql bash
|
||||
|
||||
# Query via sqlcmd inside the container (Git Bash needs MSYS_NO_PATHCONV=1 to avoid path mangling)
|
||||
MSYS_NO_PATHCONV=1 docker exec otopcua-mssql /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P "OtOpcUaDev_2026!" -C -Q "SELECT @@VERSION"
|
||||
|
||||
# Nuclear reset: drop the container + volume (destroys all DB data)
|
||||
docker stop otopcua-mssql
|
||||
docker rm otopcua-mssql
|
||||
docker volume rm otopcua-mssql-data
|
||||
# …then re-run the docker run command from Bootstrap Step 6
|
||||
# Nuclear reset (destroys dev DB data)
|
||||
ssh dohertj2@10.100.0.35 'cd /opt/otopcua-mssql && docker compose down -v && docker compose up -d'
|
||||
```
|
||||
|
||||
### Credential rotation
|
||||
@@ -125,7 +128,7 @@ Dev credentials in this inventory are convenience defaults, not secrets. Change
|
||||
| Resource | Purpose | Type | Default port | Default credentials | Owner |
|
||||
|----------|---------|------|--------------|---------------------|-------|
|
||||
| **.NET 10 SDK** | Build all .NET 10 x64 projects | OS install | n/a | n/a | Developer |
|
||||
| **.NET Framework 4.8 SDK + targeting pack** | Build `Driver.Galaxy.Host` (Phase 2+) | Windows install | n/a | n/a | Developer |
|
||||
| **.NET Framework 4.8 SDK + targeting pack** | Optional — build the mxaccessgw worker (sibling repo, x86 net48) | Windows install | n/a | n/a | Developer |
|
||||
| **Visual Studio 2022 17.8+ or Rider 2024+** | IDE (any C# IDE works; these are the supported configs) | OS install | n/a | n/a | Developer |
|
||||
| **Git** | Source control | OS install | n/a | n/a | Developer |
|
||||
| **PowerShell 7.4+** | Compliance scripts (`phase-N-compliance.ps1`) | OS install | n/a | n/a | Developer |
|
||||
@@ -247,7 +250,7 @@ Order matters because some installs have prerequisites and several need admin el
|
||||
winget install --id Microsoft.DotNet.SDK.10 --accept-package-agreements --accept-source-agreements
|
||||
```
|
||||
|
||||
2. **Install .NET Framework 4.8 SDK + targeting pack** — only needed when starting Phase 2 (Galaxy.Host); skip for Phase 0–1 if not yet there
|
||||
2. **Install .NET Framework 4.8 SDK + targeting pack** — optional, only needed when building the mxaccessgw worker (sibling repo, x86 net48). Not required by anything in this repo.
|
||||
```powershell
|
||||
winget install --id Microsoft.DotNet.Framework.DeveloperPack_4 --accept-package-agreements --accept-source-agreements
|
||||
```
|
||||
@@ -482,7 +485,7 @@ Seeds are idempotent (re-runnable) and gitignored where they contain credentials
|
||||
| Docker Desktop license terms change for org use | Track Docker pricing; budget approved or fall back to Podman if license becomes blocking |
|
||||
| Integration host single point of failure | Document the setup so a second host can be provisioned in <2 days; test fixtures pin to a hostname so failover changes one DNS entry |
|
||||
| GLAuth dev config drifts between developers | Sync script + template (Step 4) keep configs aligned; periodic review |
|
||||
| Galaxy / MXAccess licensing for non-dev-machine | Galaxy stays on the dev machines that already have Aveva licenses; integration host does NOT run Galaxy (Galaxy.Host integration tests run on the dev box, not the shared host) |
|
||||
| Galaxy / MXAccess licensing for non-dev-machine | Galaxy stays on the dev machines that already have Aveva licenses; integration host does NOT run Galaxy (the mxaccessgw worker requires the AVEVA stack and runs on the dev box, not the shared host) |
|
||||
| Long-lived dev env credentials in dev `appsettings.Development.json` | Gitignored; documented as dev-only; production never uses these |
|
||||
|
||||
## Decisions to Add to plan.md
|
||||
|
||||
@@ -0,0 +1,159 @@
|
||||
# FOCAS deployment guide
|
||||
|
||||
Operational reference for deploying the Fanuc FOCAS driver in production.
|
||||
|
||||
## Licence + DLL provisioning
|
||||
|
||||
Fanuc's FOCAS2 library is proprietary + closed-source. Two DLL variants exist:
|
||||
|
||||
| Variant | Bitness | OtOpcUa usage |
|
||||
|---|---|---|
|
||||
| **`Fwlib64.dll`** | x64 | **Default production binary.** Loaded by `Driver.FOCAS.Host` (net10.0 x64 Windows service) and by the `Driver.FOCAS.Cli` when running on an x64 server. |
|
||||
| `Fwlib32.dll` | x86 | Historical: the variant the project was originally scaffolded against. Not used by any current binary since the 2026-04-23 Host retarget. Kept in the licence set for legacy deployments that insist on x86-only Hosts. |
|
||||
|
||||
Both are **licensed for this project** — this project has a valid Fanuc FOCAS developer-kit licence that grants redistribution for either variant internally.
|
||||
|
||||
### The DLLs now ship with the Host (2026-04-23)
|
||||
|
||||
As of the vendoring change, the Host csproj copies the licensed FOCAS binaries from [`vendor/fanuc/`](../../vendor/fanuc/README.md) to its build output automatically. So after a `dotnet build` / `dotnet publish`, the layout is:
|
||||
|
||||
```
|
||||
<publish-root>\Driver.FOCAS.Host\
|
||||
├── OtOpcUa.Driver.FOCAS.Host.exe
|
||||
├── OtOpcUa.Driver.FOCAS.Host.dll
|
||||
├── ... runtime deps ...
|
||||
├── Fwlib64.dll ← master FOCAS runtime (generic x64)
|
||||
├── fwlib0iD64.dll ← 0i-D series dispatch target
|
||||
├── fwlib30i64.dll ← 30i / 31i / 32i series dispatch target
|
||||
├── fwlibe64.dll ← Ethernet transport variant
|
||||
├── fwlibNCG64.dll ← NC Guide (Fanuc PC simulator) target
|
||||
└── fwlib0DN64.dll ← 0i-D Numeric-control thin variant
|
||||
```
|
||||
|
||||
No operator step required to "drop Fwlib64.dll on PATH" anymore — the Host loads `Fwlib64.dll` via bare-name and Windows finds it in the exe's own directory first. Shipping the full set of series-specific siblings lets the Host work against any Fanuc CNC the deployment points it at; the master `Fwlib64.dll` dispatches to the right variant based on what the CNC reports during `cnc_allclibhndl3`.
|
||||
|
||||
The DLL loads lazily on the first `OpenSessionAsync` call. If it is missing (for example after manual changes to the deployed artefacts), `Fwlib64FocasBackend` returns a structured `Fwlib64DllMissing` error code rather than crashing; the Proxy maps it to `BadCommunicationError` with a clear operator message.
|
||||
|
||||
### Repo confidentiality note
|
||||
|
||||
**The FOCAS runtime DLLs in `vendor/fanuc/` are licensed binaries — treat this repo accordingly.** Do not mirror / push / fork to any public forge without first confirming the redistribution is covered by whoever manages the Fanuc relationship. Internal / customer-licensed mirrors are fine. See [`vendor/fanuc/README.md`](../../vendor/fanuc/README.md) for the full provenance + licence context.
|
||||
|
||||
## Tier-C architecture recap
|
||||
|
||||
The FOCAS driver is **Tier-C** — out-of-process — for **blast-radius isolation**, not bitness. Fanuc's DLL has documented crash modes (network errors, malformed responses, handle-recycle bugs) that could take the main OPC UA server down if loaded in-process. Splitting the P/Invoke into a separate Host process means a Fwlib crash only loses FOCAS tags; every other driver keeps running, and the supervisor restarts the Host.
|
||||
|
||||
Galaxy has the same pattern but is **forced** by MXAccess's 32-bit-only COM — there's no x64 path. FOCAS would work in-process on x64 (Fwlib64 is licensed), but the blast-radius argument keeps it Tier-C anyway.
|
||||
|
||||
See [`implementation/focas-isolation-plan.md`](implementation/focas-isolation-plan.md) for the full topology.
|
||||
|
||||
## Installing the Host service
|
||||
|
||||
Use the NSSM wrapper script:
|
||||
|
||||
```powershell
|
||||
.\scripts\install\Install-FocasHost.ps1 `
|
||||
-InstallRoot 'C:\Program Files\OtOpcUa\Driver.FOCAS.Host' `
|
||||
-ServiceAccount 'OTOPCUA\svc-otopcua' `
|
||||
-FocasBackend fwlib64
|
||||
```
|
||||
|
||||
Parameters:
|
||||
|
||||
| Parameter | Default | Purpose |
|
||||
|---|---|---|
|
||||
| `-InstallRoot` | **required** | Where the Host binaries + `Fwlib64.dll` live |
|
||||
| `-ServiceAccount` | **required** | Must match the main OtOpcUa server account so the named-pipe ACL allows the Proxy to connect |
|
||||
| `-FocasBackend` | `fwlib64` | `fwlib64` (production), `fake` (in-memory for Tier-C pipeline smoke without a CNC), `unconfigured` (returns BadDeviceFailure for every call) |
|
||||
| `-FocasSharedSecret` | auto-gen | Per-process secret passed at service start so it never touches disk |
|
||||
| `-FocasPipeName` | `OtOpcUaFocas` | Named pipe the Proxy connects to |
|
||||
| `-ServiceName` | `OtOpcUaFocasHost` | Windows service display name |
|
||||
|
||||
`fwlib32` is accepted as a legacy alias but maps to `Fwlib64FocasBackend` internally — the Host is x64 post-2026-04-23, so 32-bit-only deployments would need to rebuild + retarget.
|
||||
|
||||
## Configuring a FOCAS driver instance
|
||||
|
||||
In the Admin UI's Drivers tab, create a `DriverInstance` with `DriverType = "FOCAS"` and a JSON config of the shape:
|
||||
|
||||
```json
|
||||
{
|
||||
"Backend": "ipc",
|
||||
"PipeName": "OtOpcUaFocas",
|
||||
"SharedSecret": "<matches OTOPCUA_FOCAS_SECRET env var on the Host>",
|
||||
"Devices": [
|
||||
{ "Name": "Mill-01", "HostAddress": "focas://192.168.1.50:8193", "Series": "ThirtyOne_i" }
|
||||
],
|
||||
"Tags": [
|
||||
{ "Name": "SpindleLoad", "DeviceName": "Mill-01", "Address": "R100", "DataType": "Int16" },
|
||||
{ "Name": "CycleRunning", "DeviceName": "Mill-01", "Address": "X0.0", "DataType": "Bit" },
|
||||
{ "Name": "PartCount", "DeviceName": "Mill-01", "Address": "MACRO:500", "DataType": "Float64" }
|
||||
],
|
||||
"Probe": { "Enabled": true, "IntervalMs": 5000, "TimeoutMs": 2000 }
|
||||
}
|
||||
```
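Before pasting a config like the one above into the Admin UI, a quick offline sanity check can catch typos. The sketch below is not part of OtOpcUa: `check_focas_config` is a hypothetical helper, and the rules it applies (a `focas://` scheme, a default port of 8193, every tag referencing a declared device) are assumptions inferred from the example rather than the driver's actual validation.

```python
from urllib.parse import urlsplit


def check_focas_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means nothing obvious is wrong."""
    problems = []
    devices = {}
    for dev in config.get("Devices", []):
        url = urlsplit(dev["HostAddress"])
        if url.scheme != "focas" or not url.hostname:
            problems.append(f"device {dev['Name']}: bad HostAddress {dev['HostAddress']!r}")
            continue
        # Assumed default: fall back to the standard FOCAS TCP port when none is given.
        devices[dev["Name"]] = (url.hostname, url.port or 8193)
    for tag in config.get("Tags", []):
        if tag["DeviceName"] not in devices:
            problems.append(f"tag {tag['Name']}: unknown device {tag['DeviceName']!r}")
    return problems


# Trimmed copy of the example config, plus one deliberately broken tag.
config = {
    "Devices": [
        {"Name": "Mill-01", "HostAddress": "focas://192.168.1.50:8193", "Series": "ThirtyOne_i"}
    ],
    "Tags": [
        {"Name": "SpindleLoad", "DeviceName": "Mill-01", "Address": "R100", "DataType": "Int16"},
        {"Name": "Orphan", "DeviceName": "Mill-02", "Address": "R200", "DataType": "Int16"},
    ],
}
problems = check_focas_config(config)
```

Here `problems` holds a single entry flagging the `Orphan` tag, since `Mill-02` is never declared under `Devices`.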
|
||||
|
||||
`Backend` selector (on the Proxy side — not to be confused with `OTOPCUA_FOCAS_BACKEND` on the Host):
|
||||
|
||||
| Value | Meaning |
|
||||
|---|---|
|
||||
| `ipc` (default) | Route through `Driver.FOCAS.Host` over the named pipe. **Production shape.** |
|
||||
| `fwlib` | Direct in-process P/Invoke via `FwlibFocasClient`. Only valid on x64 servers that are willing to accept the blast-radius trade-off. |
|
||||
| `unimplemented` | Throws at construction — used for scaffolding `DriverInstance` rows before the Host is deployed. |
|
||||
|
||||
## Smoke testing
|
||||
|
||||
**Without a CNC — pipeline only:**
|
||||
|
||||
```powershell
|
||||
$env:OTOPCUA_FOCAS_BACKEND = "fake"
|
||||
Start-Service OtOpcUaFocasHost
|
||||
```
|
||||
|
||||
The `FakeFocasBackend` stores per-address values in memory, which is enough to exercise the read, write, and subscribe paths. Use `otopcua-focas-cli` (in-process, bypasses the Host) or the OtOpcUa server's own driver registration to exercise the pipeline.
|
||||
|
||||
**Version-aware fake** (Stream A of the simulator plan, shipped 2026-04-23) — set `OTOPCUA_FOCAS_SERIES` to simulate a specific Fanuc controller's capability matrix. Addresses outside the series' documented ranges get rejected with `BadOutOfRange` (matching what the real DLL returns as `EW_NUMBER` / `EW_PARAM`):
|
||||
|
||||
```powershell
|
||||
$env:OTOPCUA_FOCAS_BACKEND = "fake"
|
||||
$env:OTOPCUA_FOCAS_SERIES = "ThirtyOne_i" # or Zero_i_D / Zero_i_F / Sixteen_i / PowerMotion_i / ...
|
||||
Start-Service OtOpcUaFocasHost
|
||||
```
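The range enforcement described above can be pictured as a lookup against a per-series table. The sketch below is a toy model only: the limits in `PROFILES` are placeholder numbers, not Fanuc's documented ranges, and `check_address` is a hypothetical name rather than the fake backend's real API.

```python
# Placeholder limits for illustration; the real ranges come from Fanuc's documentation.
PROFILES = {
    "ThirtyOne_i": {"R": 8000, "MACRO": 999},
    "Zero_i_D": {"R": 1500, "MACRO": 999},
}


def check_address(series: str, area: str, number: int) -> str:
    """Return 'Good' if the address fits the series profile, else 'BadOutOfRange'."""
    limits = PROFILES.get(series)
    if limits is None:
        raise KeyError(f"unknown series {series!r}")
    upper = limits.get(area)
    if upper is None or not 0 <= number <= upper:
        return "BadOutOfRange"
    return "Good"


in_range = check_address("ThirtyOne_i", "R", 100)
out_of_range = check_address("Zero_i_D", "R", 5000)
```

The same address can be valid on one series and rejected on another, which is the behaviour a version-aware fake lets the driver tests exercise.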
|
||||
|
||||
**Optional behavioural quirks** — `OTOPCUA_FOCAS_QUIRKS` is a comma-separated list:
|
||||
|
||||
| Token | Behaviour |
|
||||
|---|---|
|
||||
| `EditMode` | `OpenSessionAsync` refuses sessions with `ErrorCode=EditModeActive`, mimicking a CNC in Edit mode |
|
||||
| `Emergency` | `ProbeAsync` reports the session as unhealthy with `emergency-stop active` error even after a clean open — exercises the driver's probe-surfaces-non-connectivity path |
|
||||
| `SlowFirstConnect[=ms]` | First `OpenSessionAsync` blocks for `ms` (default 3000) milliseconds, mimicking the 16i-series slow-first-connect — subsequent opens are fast |
|
||||
| `CrashAfterCycles=N` | After `N` session opens, the `N+1`-th returns `ErrorCode=Fwlib64Crashed` — mimics the documented Fanuc handle-leak |
|
||||
|
||||
Example combining several:
|
||||
|
||||
```powershell
|
||||
$env:OTOPCUA_FOCAS_QUIRKS = "EditMode,CrashAfterCycles=5,SlowFirstConnect=500"
|
||||
```
|
||||
|
||||
Unknown tokens log a warning but don't abort startup.
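As a rough illustration of the token grammar in the table, the following sketch parses a quirks string into a dictionary and collects unknown tokens separately so a caller could log them. It is not the Host's parser: `parse_quirks` is a hypothetical name, and case-sensitive matching is an assumption.

```python
def parse_quirks(raw: str):
    """Split a comma-separated quirks string into (quirks, unknown_tokens)."""
    quirks = {}
    unknown = []
    for token in (t.strip() for t in raw.split(",")):
        if not token:
            continue
        name, sep, value = token.partition("=")
        if name in ("EditMode", "Emergency") and not sep:
            quirks[name] = True
        elif name == "SlowFirstConnect":
            # The "=ms" suffix is optional; the documented default is 3000 ms.
            quirks[name] = int(value) if sep else 3000
        elif name == "CrashAfterCycles" and sep:
            quirks[name] = int(value)
        else:
            unknown.append(token)  # warn, but do not abort startup
    return quirks, unknown


quirks, unknown = parse_quirks("EditMode,CrashAfterCycles=5,SlowFirstConnect=500")
defaults, ignored = parse_quirks("SlowFirstConnect,Bogus")
```

The second call shows both the default delay being filled in and an unrecognised token being set aside instead of raising.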
|
||||
|
||||
**With a real CNC:**
|
||||
|
||||
```powershell
|
||||
$env:OTOPCUA_FOCAS_BACKEND = "fwlib64"
|
||||
$env:FOCAS_TRUST_WIRE = "1"
|
||||
Start-Service OtOpcUaFocasHost
|
||||
.\scripts\e2e\test-focas.ps1 -CncHost 192.168.1.50 -BridgeNodeId 'ns=2;s=Focas/R100'
|
||||
```
|
||||
|
||||
Requires `Fwlib64.dll` alongside the Host exe. The build output already places it there, so no `PATH` entry is needed.
|
||||
|
||||
## Observability
|
||||
|
||||
- Host logs: `%ProgramData%\OtOpcUa\focas-host-*.log` (Serilog daily rolling)
|
||||
- Post-mortem: `%ProgramData%\OtOpcUa\focas-post-mortem.mmf`, a ring buffer of the last ~1000 IPC operations that survives a Host crash so the Proxy-side supervisor can read it during respawn diagnostics
|
||||
- `DriverHostStatus` rows in the central Config DB under `HostName = <configured device host>` — `State` transitions + Polly resilience counters surface on the Admin `/hosts` page
|
||||
|
||||
## Known issues
|
||||
|
||||
- **No public simulator** — Fanuc FOCAS has no published emulator. Lab-rig validation (a real FANUC 0i-F / 30i controller or an FDK-licenced dev rig) is the only way to confirm wire-level correctness. Tracked under task #222.
|
||||
- **32-bit-only deployments unsupported** — post the 2026-04-23 retarget, running the Host as net48 x86 is not a supported mode. If you genuinely need Fwlib32-only, revert the Host csproj + Program.cs changes from that commit.
|
||||
- **Handle-recycling cadence** — documented Fanuc issue where long-lived FWLIB session handles can leak inside the DLL; the Host periodically cycles them. Currently on a fixed 60-minute cadence; future config knob tracked as a post-release follow-up.
|
||||
@@ -0,0 +1,315 @@
|
||||
# FOCAS Docker simulator — implementation plan
|
||||
|
||||
> **Status**: **IN PROGRESS** 2026-04-23. **Streams A + B shipped.** Stream C (real Fwlib64 wire compat) + Stream D (e2e + docs) still open — both require a Windows rig with licensed Fwlib64.dll + captured Wireshark traces. Stream B shipped the full architectural scaffold (Docker image, 9 per-series compose profiles, asyncio TCP server, handler dispatch, profile-driven range enforcement, local validation harness) — exercised end-to-end against both `thirtyone_i` and `powermotion_i` profiles.
|
||||
|
||||
## Goal
|
||||
|
||||
Close the one remaining FOCAS gap (`#222` follow-up — "wire-level live-boot against real hardware") with a hardware-free fixture that:
|
||||
|
||||
1. Runs in Docker, matches the per-driver fixture pattern (`docker compose up -d` in the test project).
|
||||
2. Exposes the FOCAS TCP port (`8193` by default) to the host.
|
||||
3. Speaks enough of the FOCAS wire protocol that **a Windows test rig running our unmodified `Driver.FOCAS.Host` + licensed `Fwlib64.dll` can open a session and exercise the 9 FWLIB functions the driver actually uses.**
|
||||
4. Supports **version profiles** — one container per Fanuc series (0i-D, 0i-F, 30i, 31i, 32i, PowerMotion-i) — so driver-side range validation, error-code mapping, and per-series quirks get exercised against a server that actually behaves differently per series.
|
||||
5. Plugs into the existing e2e infrastructure (`scripts/e2e/test-focas.ps1` loses the `FOCAS_TRUST_WIRE=1` gate when the fixture is up).
|
||||
|
||||
## Non-goals
|
||||
|
||||
- **Not a full FOCAS emulator.** Fanuc's FOCAS spec is closed; faithfully reproducing every function across every controller model would be a years-long project. We implement the narrow subset the driver uses (see §Protocol surface).
|
||||
- **Not a CNC behavioural model.** We return plausible values for PMC/param/macro reads; we do NOT simulate axis motion, program execution, or alarm generation. The mock exists to exercise the driver's marshalling + IPC + status-code paths, not to prove the CNC behaves correctly.
|
||||
- **Not a replacement for a bench CNC.** A physical controller still catches timing-dependent bugs (Fwlib-internal thread-pool exhaustion, handle-recycle pathologies, vendor-firmware quirks) that a mock can't reproduce. Mock covers ~80% of value; real-hardware smoke stays as a final gate.
|
||||
|
||||
## Constraint that shapes the design
|
||||
|
||||
`Fwlib64.dll` is a proprietary closed-source library that speaks FOCAS to the CNC. **Our driver never touches raw TCP** — it calls `cnc_allclibhndl3` / `pmc_rdpmcrng` / etc. and Fwlib encodes the wire frames internally.
|
||||
|
||||
This means the mock has two possible architectures:
|
||||
|
||||
| Option | Where the mock lives | Exercises Fwlib? |
|
||||
|---|---|---|
|
||||
| **A. IPC-layer fake** (already shipped as `FakeFocasBackend`) | Between `FwlibFrameHandler` and the FWLIB call | ❌ No — bypasses Fwlib entirely |
|
||||
| **B. TCP wire mock** (this plan) | Listens on port 8193; Fwlib connects to it | ✅ Yes — Fwlib encodes real frames |
|
||||
|
||||
Option B is the only one that validates the driver's actual production wire path (driver → Host → `FwlibFocasClient` → `Fwlib64.dll` → TCP → mock).
|
||||
|
||||
**Prerequisite reading** the implementer needs before starting Option B:
|
||||
- `strangesast/fwlib` on GitHub — reverse-engineered FOCAS2 Linux client, has frame-format notes
|
||||
- `GalvinGao/opcua-server-fanuc` — another OSS FOCAS client with wire-format traces
|
||||
- `jdegre/focas-python` (if it still exists) — previous Python FOCAS stub, starting point
|
||||
- Our own `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibNative.cs` — the 9-function surface we need to satisfy
|
||||
|
||||
## Protocol surface (what the mock must speak)
|
||||
|
||||
From `FwlibNative.cs`, our driver makes exactly 9 FWLIB calls:
|
||||
|
||||
| FWLIB function | What it does | Wire complexity |
|
||||
|---|---|---|
|
||||
| `cnc_allclibhndl3` | Open Ethernet handle (connect) | **High** — initial handshake, version negotiation, session state |
|
||||
| `cnc_freelibhndl` | Close handle | Low |
|
||||
| `pmc_rdpmcrng` | PMC range read (byte/word/long + optional bit) | **Medium** — 40-byte buffer with type-dependent layout |
|
||||
| `pmc_wrpmcrng` | PMC range write | **Medium** — same buffer shape inverted |
|
||||
| `cnc_rdparam` | Parameter read (axis-aware) | Medium — 32-byte buffer |
|
||||
| `cnc_wrparam` | Parameter write | Medium |
|
||||
| `cnc_rdmacro` | Macro variable read (value + decimal-point count) | Low |
|
||||
| `cnc_wrmacro` | Macro variable write | Low |
|
||||
| `cnc_statinfo` | Status info (for probe) | Low — fixed-shape response |
|
||||
|
||||
**Coverage target**: all 9 functions return plausible responses for the address ranges declared in each series profile. Out-of-range addresses return `EW_NUMBER` / `EW_PARAM`. Unknown PMC letters return `EW_DATA`. Session state (handle validity, unknown handle detection) is enforced.
|
||||
|
||||
## Version profiles
|
||||
|
||||
The driver has `FocasCncSeries` + `FocasCapabilityMatrix` already — we mirror that matrix into JSON profiles the mock loads at start:
|
||||
|
||||
```
|
||||
fixture/
|
||||
├── Dockerfile
|
||||
├── requirements.txt
|
||||
├── server/
|
||||
│ ├── focas_server.py # asyncio TCP server + frame parser
|
||||
│ ├── handlers/
|
||||
│ │ ├── allclibhndl3.py
|
||||
│ │ ├── pmc.py
|
||||
│ │ ├── param.py
|
||||
│ │ ├── macro.py
|
||||
│ │ └── status.py
|
||||
│ ├── state.py # in-memory "CNC" state
|
||||
│ └── frames.py # FOCAS frame encode/decode
|
||||
└── profiles/
|
||||
├── zero_i_d.json
|
||||
├── zero_i_f.json
|
||||
├── zero_i_mf.json
|
||||
├── zero_i_tf.json
|
||||
├── sixteen_i.json
|
||||
├── thirty_i.json
|
||||
├── thirtyone_i.json
|
||||
├── thirtytwo_i.json
|
||||
└── powermotion_i.json
|
||||
```
|
||||
|
||||
Each profile captures:
|
||||
|
||||
```json
|
||||
{
|
||||
"series": "ThirtyOne_i",
|
||||
"api_version": "0x30",
|
||||
"pmc_ranges": {
|
||||
"X": [0, 127], "Y": [0, 127], "F": [0, 767], "G": [0, 767],
|
||||
"R": [0, 1499], "D": [0, 2999], "C": [0, 199], "K": [0, 31],
|
||||
"A": [0, 24], "T": [0, 79], "E": [0, 9999]
|
||||
},
|
||||
"param_ranges": [[1000, 9999], [10000, 15999]],
|
||||
"macro_range": [100, 999],
|
||||
"extended_macros": false,
|
||||
"axes": 3,
|
||||
"quirks": {
|
||||
"crash_after_handle_cycles": null,
|
||||
"edit_mode_rejects_connection": false,
|
||||
"allclibhndl3_blocks_during_alarm": false,
|
||||
"param_bit_index_max": 7
|
||||
},
|
||||
"alarm_default": false,
|
||||
"emergency_default": false
|
||||
}
|
||||
```
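
The range tables in a profile translate directly into address checks. The following is a minimal Python sketch, assuming the JSON shape of the example profile; the function names (`check_pmc`, `check_param`, `check_macro`) are hypothetical, and the pairing of error names with checks (`EW_NUMBER` for PMC and macro addresses, `EW_PARAM` for parameter numbers) is an assumption, since the plan only states that out-of-range addresses return `EW_NUMBER` / `EW_PARAM`.

```python
import json

# Trimmed copy of the example profile; only the fields used below.
PROFILE_JSON = """
{
  "series": "ThirtyOne_i",
  "pmc_ranges": {"X": [0, 127], "R": [0, 1499], "D": [0, 2999]},
  "param_ranges": [[1000, 9999], [10000, 15999]],
  "macro_range": [100, 999]
}
"""

# Error names only; numeric FWLIB codes are deliberately left out of this sketch.
EW_OK = "EW_OK"
EW_NUMBER = "EW_NUMBER"
EW_DATA = "EW_DATA"
EW_PARAM = "EW_PARAM"


def check_pmc(profile, letter, start, end):
    """Validate a PMC range request against the profile's per-letter bounds."""
    bounds = profile["pmc_ranges"].get(letter)
    if bounds is None:
        return EW_DATA  # PMC letter not declared for this series
    low, high = bounds
    if start > end or start < low or end > high:
        return EW_NUMBER  # assumed mapping for out-of-range PMC addresses
    return EW_OK


def check_param(profile, number):
    """Validate a parameter number against any of the declared ranges."""
    in_range = any(low <= number <= high for low, high in profile["param_ranges"])
    return EW_OK if in_range else EW_PARAM  # assumed mapping


def check_macro(profile, number):
    """Validate a macro variable number against the single declared range."""
    low, high = profile["macro_range"]
    return EW_OK if low <= number <= high else EW_NUMBER  # assumed mapping


profile = json.loads(PROFILE_JSON)
```

Keeping the checks as pure functions over the loaded dictionary means each handler can share them, and a profile swap needs no code change.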
|
||||
|
||||
**Differences that actually matter** for driver coverage:
|
||||
|
||||
| Series | Meaningful difference vs baseline |
|
||||
|---|---|
|
||||
| 0i-D / 0i-F / 0i-MF / 0i-TF | PMC range narrower; no E-relay; macro range `100-999` strict |
|
||||
| 16i | Older Fwlib version; `cnc_allclibhndl3` extra-slow on first connect (artificial delay in mock) |
|
||||
| 30i | Full PMC range; extended macros (`#10000+`) supported |
|
||||
| 31i / 32i | 5-axis; larger parameter ranges |
|
||||
| PowerMotion-i | No PMC `T` timer; motion-only controller quirks |
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Windows test rig (net10.0 x64) │
|
||||
│ │
|
||||
│ FocasDriver ──► FwlibFocasClient ──► Fwlib64.dll ──► TCP ──┐ │
|
||||
│ (real P/Invoke) │ │
|
||||
└─────────────────────────────────────────────────────────────┼───┘
|
||||
│
|
||||
port 8193 │
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Docker container: otopcua-focas-sim-{series} │
|
||||
│ │
|
||||
│ Python asyncio TCP server │
|
||||
│ ├─ frames.py: parse + encode FOCAS frames │
|
||||
│ ├─ handlers/: one module per FWLIB function │
|
||||
│ ├─ state.py: per-session handle registry + simulated memory │
|
||||
│ └─ profiles/{series}.json: range + quirk table loaded at │
|
||||
│ boot via env var OTOPCUA_FOCAS_ │
|
||||
│ PROFILE=thirtyone_i │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
Python choice rationale: the existing OSS FOCAS implementations are Python-first; asyncio's `StreamReader`/`StreamWriter` maps cleanly to FOCAS's length-prefixed frame model; one Dockerfile covers every profile because profile-switching is an env-var.
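
To illustrate the `StreamReader`/`StreamWriter` point, here is a minimal sketch of a per-connection loop over length-prefixed frames. It assumes a 4-byte big-endian length prefix that counts the bytes following it; the real FOCAS framing is not confirmed by this plan. The names `read_frame`, `write_frame`, `handle_session`, and `serve` are hypothetical, and the echo behaviour stands in for real handler dispatch.

```python
import asyncio
import struct

LENGTH_PREFIX = struct.Struct(">I")  # assumed: 4-byte big-endian body length


async def read_frame(reader: asyncio.StreamReader) -> bytes:
    """Read one frame body; raises IncompleteReadError if the peer closes mid-frame."""
    (length,) = LENGTH_PREFIX.unpack(await reader.readexactly(LENGTH_PREFIX.size))
    return await reader.readexactly(length)


async def write_frame(writer: asyncio.StreamWriter, body: bytes) -> None:
    """Write one frame: length prefix followed by the body."""
    writer.write(LENGTH_PREFIX.pack(len(body)) + body)
    await writer.drain()


async def handle_session(reader, writer):
    """Per-connection loop; echoes each frame (placeholder for handler dispatch)."""
    try:
        while True:
            body = await read_frame(reader)
            await write_frame(writer, body)
    except (asyncio.IncompleteReadError, ConnectionResetError):
        pass  # peer disconnected; fall through to cleanup
    finally:
        writer.close()


async def serve(host: str = "0.0.0.0", port: int = 8193) -> None:
    """Listen on the FOCAS port and run one session loop per connection."""
    server = await asyncio.start_server(handle_session, host, port)
    async with server:
        await server.serve_forever()
```

`readexactly` is what makes the length-prefixed model convenient: the loop never has to reassemble partial reads by hand.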
|
||||
|
||||
`docker-compose.yml` declares one service per series and gates each behind its own Compose profile (selected with `--profile`):
|
||||
|
||||
```yaml
|
||||
services:
|
||||
focas-thirtyone:
|
||||
profiles: ["thirtyone"]
|
||||
image: otopcua-focas-sim:latest
|
||||
environment: { OTOPCUA_FOCAS_PROFILE: "thirtyone_i" }
|
||||
ports: ["8193:8193"]
|
||||
|
||||
focas-zerod:
|
||||
profiles: ["zerod"]
|
||||
image: otopcua-focas-sim:latest
|
||||
environment: { OTOPCUA_FOCAS_PROFILE: "zero_i_d" }
|
||||
ports: ["8193:8193"]
|
||||
# ... one per supported series ...
|
||||
```
|
||||
|
||||
Users pick a profile with `docker compose --profile thirtyone up -d`. Only one profile runs at a time (port collision on 8193) — matching the other driver fixtures' single-image pattern.
|
||||
|
||||
## Delivery plan: four streams
|
||||
|
||||
### Stream A — Version-aware fake backend (C#, 2-3 days) — ✅ **SHIPPED 2026-04-23**
|
||||
|
||||
**What landed**:
|
||||
|
||||
- `FakeFocasBackend` gained a second ctor `(FocasCncSeries series, FakeFocasBackendQuirks? quirks)`; default ctor preserves the pre-Stream-A permissive behaviour.
|
||||
- `ValidateAddress` delegates to the existing `FocasCapabilityMatrix.Validate` so mock + driver share one source of truth. Out-of-range reads/writes/PMC-bit-writes return `BadOutOfRange` (0x803C0000 — matching what the real driver maps `EW_NUMBER`/`EW_PARAM` to).
|
||||
- `FakeFocasBackendQuirks` record carries four opt-in quirks: `EditModeRejectsConnection`, `CrashAfterHandleCycles`, `SlowFirstConnectDelay`, `EmergencyAtStartup`.
|
||||
- `Program.cs` reads `OTOPCUA_FOCAS_SERIES` (case-insensitive FocasCncSeries enum value) + `OTOPCUA_FOCAS_QUIRKS` (comma-separated token list: `EditMode`, `Emergency`, `SlowFirstConnect[=ms]`, `CrashAfterCycles=N`). Unknown tokens log-and-ignore. Values surface in Host log at startup.
|
||||
- 19 new tests in `FakeFocasBackendSeriesTests.cs` covering: Unknown-permissive baseline, Zero_i_D macro rejection, ThirtyOne_i extended-macro acceptance, PowerMotion_i T-timer rejection, Write+PmcBitWrite parallel rejection, all four quirks, + 8 theory cases for the env-var parser.
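
The `OTOPCUA_FOCAS_QUIRKS` token rules can be sketched as follows. This is a Python rendering for illustration only; the shipped parser is the C# `ParseFakeQuirks` helper, and its exact behaviour may differ. Case-insensitive token matching, the `DEFAULT_SLOW_CONNECT_MS` value, and treating malformed numbers like unknown tokens are all assumptions, as are the `Quirks` and `parse_quirks` names.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("focas.quirks")

DEFAULT_SLOW_CONNECT_MS = 1000  # hypothetical default when no "=ms" is given


@dataclass
class Quirks:
    edit_mode_rejects_connection: bool = False
    emergency_at_startup: bool = False
    slow_first_connect_ms: Optional[int] = None
    crash_after_handle_cycles: Optional[int] = None


def parse_quirks(raw: Optional[str]) -> Quirks:
    """Parse a comma-separated quirk list; unknown tokens are logged and ignored."""
    quirks = Quirks()
    for token in (raw or "").split(","):
        token = token.strip()
        if not token:
            continue
        name, _, value = token.partition("=")
        key = name.strip().lower()  # assumed: tokens are case-insensitive
        value = value.strip()
        if key == "editmode":
            quirks.edit_mode_rejects_connection = True
        elif key == "emergency":
            quirks.emergency_at_startup = True
        elif key == "slowfirstconnect" and (not value or value.isdecimal()):
            quirks.slow_first_connect_ms = int(value) if value else DEFAULT_SLOW_CONNECT_MS
        elif key == "crashaftercycles" and value.isdecimal():
            quirks.crash_after_handle_cycles = int(value)
        else:
            log.warning("ignoring unknown or malformed quirk token %r", token)
    return quirks
```

For example, `parse_quirks("EditMode,SlowFirstConnect=250")` enables the edit-mode rejection and a 250 ms first-connect delay while leaving the other quirks off.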
|
||||
|
||||
**Deliverable shipped**:
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host/Backend/FakeFocasBackend.cs` — extended
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host/Program.cs` — `BuildFakeBackend` local fn + `ParseFakeQuirks` helper
|
||||
- `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Host.Tests/FakeFocasBackendSeriesTests.cs` — new, 19 tests
|
||||
- 38/38 Host tests green post-Stream-A.
|
||||
|
||||
### Stream B — Python FOCAS TCP server (scaffold) — ✅ **SHIPPED 2026-04-23**
|
||||
|
||||
**What landed** under `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/`:
|
||||
|
||||
- `Dockerfile` — Python 3.12-slim image; stdlib-only, no external deps
|
||||
- `docker-compose.yml` — 9 `--profile` entries, one per Fanuc series (`thirtyone`, `thirtytwo`, `thirty`, `sixteen`, `zerod`, `zerof`, `zeromf`, `zerotf`, `powermotion`). All share one image + one port (8193).
|
||||
- `server/focas_server.py` — asyncio entry point, per-connection session loop, graceful-shutdown signal handling
|
||||
- `server/frames.py` — length-prefixed frame codec (scaffold — see Stream C note below)
|
||||
- `server/state.py` — per-session handle registry + in-memory PMC/param/macro dictionaries
|
||||
- `server/profile.py` — JSON profile loader
|
||||
- `server/handlers/` — one module per FWLIB function (9 total): open/close, PMC read/write, param read/write, macro read/write, statinfo. Profile-driven range validation; error responses use a `FLAG_ERROR` bit on the response header.
|
||||
- `profiles/*.json` — 9 series profiles mirroring `FocasCapabilityMatrix`. Quirks (`slow_first_connect_ms`, `alarm_default`, `emergency_default`, `crash_after_handle_cycles`, `edit_mode_rejects_connection`) declared per profile.
|
||||
- `validate_harness.py` — scaffold-protocol TCP client that opens a session, round-trips a macro, triggers range-rejection, asserts the expected error reasons surface.
|
||||
- `README.md` — operator-facing usage + Stream C next-steps checklist.
|
||||
|
||||
**Exit criterion met**: validated end-to-end against two profiles (`thirtyone_i`, `powermotion_i`) via the local harness. Session handshake → statinfo → macro round-trip → out-of-range rejection → PMC round-trip → bad-letter rejection → clean close — all PASS. Profile-switching confirmed working: 31i API 0x0030 → PowerMotion 0x0040, macro range [0,99999]→[0,999], letter set {A,C,D,E,F,G,K,M,R,T,X,Y}→{D,R,X,Y}.
|
||||
|
||||
**⚠️ The wire *framing* is a scaffold — NOT Fwlib64-compatible yet.** `server/frames.py` uses a plausible length-prefixed framing (big-endian header: uint32 length, uint16 function_id, uint16 flags) that satisfies the harness but has never been validated against the real Fanuc DLL. Stream C is the iterative refinement cycle where a Windows rig drives that convergence.
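
For orientation, the scaffold header described above (big-endian `uint32 length`, `uint16 function_id`, `uint16 flags`) could be encoded and decoded roughly as below. This is a sketch, not the contents of `server/frames.py`: it assumes `length` counts every byte after the length field itself, and the `FLAG_ERROR` bit value and the function names are hypothetical.

```python
import struct

LENGTH = struct.Struct(">I")   # uint32 length (assumed: bytes after this field)
FIELDS = struct.Struct(">HH")  # uint16 function_id, uint16 flags
FLAG_ERROR = 0x0001            # hypothetical bit position for the error flag


def encode_frame(function_id: int, payload: bytes = b"", flags: int = 0) -> bytes:
    """Build one frame: length prefix, function id, flags, then the payload."""
    body = FIELDS.pack(function_id, flags) + payload
    return LENGTH.pack(len(body)) + body


def decode_frame(data: bytes):
    """Return (function_id, flags, payload, remaining_bytes) for the first frame."""
    if len(data) < LENGTH.size:
        raise ValueError("incomplete length prefix")
    (length,) = LENGTH.unpack_from(data)
    end = LENGTH.size + length
    if length < FIELDS.size or len(data) < end:
        raise ValueError("incomplete or malformed frame")
    function_id, flags = FIELDS.unpack_from(data, LENGTH.size)
    payload = data[LENGTH.size + FIELDS.size:end]
    return function_id, flags, payload, data[end:]
```

Returning the remaining bytes lets a caller decode back-to-back frames from one buffer, which is useful when replaying captured traces.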
|
||||
|
||||
**The response payload shapes inside those frames ARE authoritative** (refined 2026-04-23 after `fwlib32.h` review):
|
||||
- `ODBM` (macro read) = 10 bytes: `short datano, short dummy, int32 mcr_val, short dec_val`
|
||||
- `ODBST` (statinfo) = 18 bytes: 9 × `short` (dummy/tmmode/aut/run/motion/mstb/emergency/alarm/edit)
|
||||
- `IODBPSD` (param read) = 36 bytes: `short datano, short type, bytes[32]` (union = 8 axes × 4 bytes)
|
||||
- `IODBPMC` (PMC range read) = 48 bytes: `short type_a, short type_d, uint16 datano_s, uint16 datano_e, bytes[40]`
|
||||
|
||||
Validate harness asserts exact byte sizes + header field round-trip. When Stream C's Wireshark traces arrive, the payload layer should already match — only framing needs iteration.
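
The byte sizes listed above can be checked with Python's `struct` module. The sketch below assumes unpadded little-endian layouts (the byte order on the wire is an assumption, not something this plan confirms), and the `pack_*` helper names are hypothetical.

```python
import struct

# "<" selects little-endian with no alignment padding (byte order is assumed).
ODBM = struct.Struct("<hhih")          # datano, dummy, mcr_val, dec_val -> 10 bytes
ODBST_FIELDS = ("dummy", "tmmode", "aut", "run", "motion",
                "mstb", "emergency", "alarm", "edit")
ODBST = struct.Struct("<9h")           # nine shorts -> 18 bytes
IODBPSD_HEAD = struct.Struct("<hh")    # datano, type; followed by 32 data bytes
IODBPMC_HEAD = struct.Struct("<hhHH")  # type_a, type_d, datano_s, datano_e; + 40 bytes


def pack_odbm(datano: int, mcr_val: int, dec_val: int) -> bytes:
    """Macro read response: integer value plus decimal-point count."""
    return ODBM.pack(datano, 0, mcr_val, dec_val)


def pack_odbst(**fields: int) -> bytes:
    """Status response; any field not supplied defaults to 0."""
    return ODBST.pack(*(fields.get(name, 0) for name in ODBST_FIELDS))


def pack_iodbpsd(datano: int, type_: int, axis_values=()) -> bytes:
    """Parameter response: header plus eight 4-byte axis slots, zero-filled."""
    values = list(axis_values)
    if len(values) > 8:
        raise ValueError("at most 8 axis values fit in the 32-byte union")
    values += [0] * (8 - len(values))
    return IODBPSD_HEAD.pack(datano, type_) + struct.pack("<8i", *values)


def pack_iodbpmc(type_a: int, type_d: int, datano_s: int, datano_e: int,
                 data: bytes = b"") -> bytes:
    """PMC range response: header plus a 40-byte data area, zero-padded."""
    if len(data) > 40:
        raise ValueError("PMC data area is 40 bytes")
    return IODBPMC_HEAD.pack(type_a, type_d, datano_s, datano_e) + data.ljust(40, b"\x00")
```

Asserting `len(...)` on each helper's output against 10, 18, 36, and 48 gives a cheap regression check that the layouts have not drifted.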
|
||||
|
||||
See [`focas-wire-protocol.md`](focas-wire-protocol.md) for the authoritative-vs-guessed breakdown.
|
||||
|
||||
**C# integration test scaffold** also shipped (`tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/`) — `FocasSimFixture` probes port 8193 + skips when the container's down; three smoke tests pass against a running container (TCP reachability, clean connect-close, profile parsing). A `Series/WireCompatGatedTests.cs` skeleton gates Fwlib64-dependent tests behind `OTOPCUA_FOCAS_SIM_WIRE_COMPAT=1`, ready for Stream C activation.
|
||||
|
||||
### Stream C — FWLIB compat + version profiles (2-3 weeks) — **blocked on Windows rig + Wireshark traces**
|
||||
|
||||
See `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/Docker/README.md` §"Stream C — what's required to reach wire compatibility" for the concrete implementer checklist.
|
||||
|
||||
**Goal**: real Fwlib64.dll running on a Windows test rig can open a session against the mock and round-trip the 9 FWLIB calls our driver makes.
|
||||
|
||||
Sub-tasks:
|
||||
|
||||
1. **Handshake** (`handlers/allclibhndl3.py`) — the hardest piece. FOCAS session open negotiates protocol version + controller type. Incorrect negotiation → Fwlib disconnects. Start from `strangesast/fwlib`'s handshake trace.
|
||||
2. **PMC read/write** (`handlers/pmc.py`) — 40-byte buffer with type-dependent layout. Must match `FwlibNative.IODBPMC` struct layout exactly. Implement per-profile range checks.
|
||||
3. **Parameter read/write** (`handlers/param.py`) — 32-byte axis-aware buffer. Similar to PMC but simpler (no sub-address bit indexing beyond `param_bit_index_max`).
|
||||
4. **Macro read/write** (`handlers/macro.py`) — straightforward; value + decimal-point count as `ODBM`.
|
||||
5. **Status info** (`handlers/status.py`) — fixed `ODBST` shape; profile declares defaults for `Aut` / `Run` / `Motion` / `Alarm`.
|
||||
6. **State management** (`server/state.py`) — per-session handle registry, in-memory PMC/param/macro dictionaries, persistent across one session, reset on session close.
|
||||
7. **Profile loader** — reads `OTOPCUA_FOCAS_PROFILE` env var, loads matching JSON, injects into handlers.
|
||||
8. **Windows validation rig** — one-time setup: a Windows VM (or dev box) with licensed `Fwlib64.dll` + a tiny test driver that calls the 9 FWLIB functions + asserts round-trip. This is the first live-wire validation the plan asks for.
|
||||
9. **Per-series test matrix** — `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/` new project, one test class per series, each class's `[Fact]` runs against that profile's container.
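
Sub-task 6 (state management) can be sketched as a small handle registry. This is an illustrative outline rather than the contents of `server/state.py`; the class names, the `UnknownHandle` exception, and the handle numbering scheme are hypothetical.

```python
import itertools


class UnknownHandle(Exception):
    """Raised when a request names a handle that was never opened or is closed."""


class SessionState:
    """In-memory 'CNC' memory for one session; discarded when the session closes."""

    def __init__(self):
        self.pmc = {}     # (letter, address) -> byte value
        self.params = {}  # parameter number -> value
        self.macros = {}  # macro number -> (mcr_val, dec_val)


class HandleRegistry:
    """Maps open handles to their per-session state."""

    def __init__(self):
        self._next_handle = itertools.count(1)
        self._sessions = {}

    def open(self) -> int:
        """Allocate a fresh handle with empty state."""
        handle = next(self._next_handle)
        self._sessions[handle] = SessionState()
        return handle

    def get(self, handle: int) -> SessionState:
        """Look up state for a handle, rejecting unknown or closed handles."""
        try:
            return self._sessions[handle]
        except KeyError:
            raise UnknownHandle(handle) from None

    def close(self, handle: int) -> None:
        """Drop the handle and its state; closing twice is an error."""
        if self._sessions.pop(handle, None) is None:
            raise UnknownHandle(handle)
```

Because state lives on the session object, writes persist for the lifetime of one handle and vanish on close, which matches the reset-on-close behaviour the sub-task describes.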
|
||||
|
||||
**Exit criterion**: live Fwlib64.dll on a Windows rig opens a session, reads + writes across all 9 FWLIB functions, against each of the 9 profiles. Integration test suite green.
|
||||
|
||||
### Stream D — e2e integration + doc close-out (1-2 days)
|
||||
|
||||
- Update `scripts/e2e/test-focas.ps1` to accept `-ProfileName` and skip `FOCAS_TRUST_WIRE` gate when the matching container is up.
|
||||
- Add the FOCAS simulator to `docs/v2/test-data-sources.md` + `docs/drivers/FOCAS-Test-Fixture.md` (flip the "hardware-gated" caveat to "fixture or hardware").
|
||||
- Update `exit-gate-phase-3.md` — final FOCAS deferral closes.
|
||||
|
||||
## Test integration
|
||||
|
||||
The new project `tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/` mirrors `Driver.OpcUaClient.IntegrationTests`:
|
||||
|
||||
```
|
||||
tests/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.IntegrationTests/
|
||||
├── Docker/
|
||||
│ ├── docker-compose.yml # references the 9 series profiles
|
||||
│ ├── Dockerfile # Python image
|
||||
│ ├── requirements.txt
|
||||
│ ├── server/
|
||||
│ └── profiles/
|
||||
├── FocasSimFixture.cs # probes 8193 at collection init, skips if down
|
||||
├── FocasSimSeriesProfile.cs # test-side mirror of the JSON profile
|
||||
└── Series/
|
||||
├── ThirtyOneITests.cs
|
||||
├── ZeroIDTests.cs
|
||||
└── ... one file per series ...
|
||||
```
|
||||
|
||||
The existing `FocasDocker`-less skip pattern applies: if the container isn't running, tests skip with a clear message pointing at `docker compose up -d`. Matches Modbus / S7 / OpcUaClient.
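
The probe-then-skip logic amounts to a short TCP reachability check. The real fixture is the C# `FocasSimFixture`; the Python sketch below only illustrates the idea, and the `fixture_is_up` and `skip_reason` names, the default host, and the message wording are assumptions.

```python
import socket
from typing import Optional


def fixture_is_up(host: str = "127.0.0.1", port: int = 8193,
                  timeout: float = 1.0) -> bool:
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def skip_reason(host: str = "127.0.0.1", port: int = 8193) -> Optional[str]:
    """Return None when the simulator is reachable, else a message for the skip."""
    if fixture_is_up(host, port):
        return None
    return (f"FOCAS simulator not reachable at {host}:{port}; "
            "start it with `docker compose --profile <series> up -d`")
```

A test collection would call `skip_reason()` once at setup and skip every dependent test with the returned message instead of failing.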
|
||||
|
||||
## Risks + mitigations
|
||||
|
||||
| Risk | Likelihood | Impact | Mitigation |
|
||||
|---|---|---|---|
|
||||
| FOCAS wire protocol is more complex than the OSS traces suggest → Stream C slips weeks | **Medium** | High | Stream A delivers 70% value with zero protocol risk. If Stream C stalls, ship A + schedule C as a follow-up. |
|
||||
| Fwlib64.dll version differs from what `strangesast/fwlib` reverse-engineered → handshake fails | Medium | High | Capture Wireshark trace of a real CNC session against our actual licensed Fwlib64 version before coding. One-time investment, catches drift early. |
|
||||
| Profile differences that matter at the wire level aren't captured in `FocasCapabilityMatrix` | Medium | Medium | Stream C exit criterion includes validating each profile against live Fwlib — any mismatch is a profile-table bug we fix then. |
|
||||
| Docker container startup time breaks PR-CI budget | Low | Low | Each profile is one Python container + profile JSON — sub-5s cold start. Matches opc-plc. |
|
||||
| Windows validation rig availability blocks Stream C | Medium | High | Use the existing TCBSD-class approach: a dedicated ESXi VM with Windows + licensed Fwlib64.dll, provisioned once, shared by the team. Cost ~1 dev-day to set up; unblocks all future FOCAS work forever. |
|
||||
| Fanuc licence audit surfaces our mock as an "unlicensed FOCAS implementation" | **Low** | **High** | The mock doesn't ship the Fanuc DLL or reproduce any of Fanuc's code. Reverse-engineered wire formats from OSS research are fair use; the mock is our code. Consult legal before open-sourcing, not before internal use. |
|
||||
|
||||
## Timeline estimate
|
||||
|
||||
Assuming one dev full-time:
|
||||
|
||||
| Stream | Duration | Dependencies |
|
||||
|---|---|---|
|
||||
| A — Version-aware fake backend | 2-3 days | none |
|
||||
| B — TCP server scaffold | 1 week | Windows rig not required yet |
|
||||
| C — FWLIB compat + profiles | 2-3 weeks | Windows rig with Fwlib64 + Wireshark trace |
|
||||
| D — e2e + docs | 1-2 days | C done |
|
||||
|
||||
**Total**: ~4-5 weeks to full coverage. Ship A immediately (independent value), start C in parallel with Windows-rig setup.
|
||||
|
||||
## Exit criteria (what closes #222)
|
||||
|
||||
- [ ] All 9 series profiles containerized + pass startup health check
|
||||
- [ ] Live Fwlib64.dll round-trips all 9 FWLIB calls against every profile (Stream C validation rig)
|
||||
- [ ] Per-series integration test suite green in CI
|
||||
- [ ] `test-focas.ps1` runs end-to-end against the simulator without `FOCAS_TRUST_WIRE=1`
|
||||
- [ ] Docs updated: `FOCAS-Test-Fixture.md` flipped from "hardware-only" to "fixture or hardware"
|
||||
- [ ] One live-CNC smoke still runs during v2 release readiness, as a belt-and-braces final check
|
||||
|
||||
## Open questions
|
||||
|
||||
1. **Licence clarity**: is reverse-engineered FOCAS2 wire-format documentation (from `strangesast/fwlib` etc.) compatible with our Fanuc FOCAS developer-kit licence? Legal check required before starting Stream C.
|
||||
2. **Windows rig**: do we dedicate an existing VM (like the TCBSD box) or provision a new one? Cost difference is small; decision affects who owns maintenance.
|
||||
3. **Profile source of truth**: if `FocasCapabilityMatrix.cs` and `profiles/*.json` ever disagree, which wins? Proposal: profiles win (wire behavior is authoritative), driver's matrix is regenerated from profiles as a build step.
|
||||
4. **Alarm events**: the driver doesn't currently use `cnc_rdalmmsg2` / alarm subscription, so the mock doesn't need to simulate alarms beyond the `statinfo.Alarm` flag. If we add `IAlarmSource` to FOCAS later, Stream C expands.
|
||||
|
||||
## References
|
||||
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FwlibNative.cs` — 9-function P/Invoke surface the mock must satisfy
|
||||
- `src/ZB.MOM.WW.OtOpcUa.Driver.FOCAS/FocasCapabilityMatrix.cs` — per-series range tables (profile seed data)
|
||||
- `docs/v2/focas-version-matrix.md` — human-readable version matrix the profiles mirror
|
||||
- `docs/drivers/FOCAS-Test-Fixture.md` — current test-fixture doc (flips post-Stream-D)
|
||||
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/Docker/` — pattern this plan mirrors for the Docker compose + fixture-skip shape
|
||||
- `strangesast/fwlib` (GitHub, OSS) — primary FOCAS wire-format reverse-engineering reference
|
||||
+87
@@ -0,0 +1,87 @@
|
||||
# Follow-ups from `auto/driver-gaps` queue (PRs #225–#316, 92 merged)
|
||||
|
||||
Captured 2026-04-26 after the plan-execution queue drained. Organised by category.
|
||||
|
||||
## Wrapper / library-version-blocked (waiting on upstream)
|
||||
|
||||
| Driver | PR | Blocker | Resolution path |
|
||||
|---|---|---|---|
|
||||
| AbCip | abcip-3.1 | libplctag.NET 1.5.2 doesn't expose `connection_size` | Reflection fallback ships; remove when wrapper publishes the property |
|
||||
| AbCip | abcip-3.2 | libplctag.NET 1.5.x has no public instance-ID knob | Wire stays Symbolic regardless of mode; flip when wrapper exposes it |
|
||||
| AbCip | abcip-3.3 | libplctag.NET wire-level multi-service-packet bundling not exposed | Planner ships correct; runtime currently issues N reads. Switch when wrapper bundles |
|
||||
| S7 | s7-e1 | S7netplus 0.20 has no public `ReadSzlAsync` (request builder is internal) | Parser tested + cached; `BadNotSupported` until S7netplus exposes it or we add raw S7comm SZL-PDU helper |
|
||||
| S7 | s7-e2 | S7netplus 0.20 doesn't expose `SendPassword` | `IS7PlcAuthGate` reflection probe; logs warning, no exception. Flip when library exposes it |
|
||||
| TwinCAT | twincat-2.2 | Bulk Sum path stays on symbolic | Phase-2 perf sweep follow-up to switch bulk to handle-based |
|
||||
| TwinCAT | twincat-5.1 | Beckhoff doesn't ship a managed `TcEventLogger` wrapper | Gate seam ships; production `AdsTwinCATAlarmGate` binary decoder against `ADSIGRP_TCEVENTLOG_ALARMS` is the next chunk of work |
|
||||
|
||||
## Fixture / simulator gaps
|
||||
|
||||
### focas-mock simulator doesn't exist
|
||||
- Blocks integration tests for: f3a (alarm history ring-buffer + `mock_patch_alarmhistory`), f4b (`mock_set_unlock_state`, `mock_get_last_write`), f4c (`pmc_wrpmcrng` handler), f4d (`cnc_wrunlockparam` + `mock_set_password`), f5a (`mock_simulate_cycle_completion`).
|
||||
- No FOCAS IntegrationTests project exists yet — it needs to be created when the mock lands.
|
||||
|
||||
### opc-plc fixture upgrades
|
||||
- **opcuaclient-10**: `TriggerModelChangeAsync` is a stub. Live HTTP-driven model-change verification deferred. Tests use an inject seam.
|
||||
- **opcuaclient-11**: `opc-plc-rc` Docker fixture session-open assertion (gated `OPCUACLIENT_TOPOLOGY_TRIGGER_CMD` / `OPCUA_RC_SIM`).
|
||||
- **opcuaclient-12**: opc-plc `--alm` fixture run for HistoryRead Events (waiting for fixture image upgrade).
|
||||
- **opcuaclient-13**: opc-plc historian-sim wire-level sweep for the 25 new aggregates (only ~5 likely honoured today).
|
||||
- **opcuaclient-14**: Two-container failover smoke against opc-plc + opc-plc-secondary on the live fixture.
|
||||
|
||||
### AbCip HSBY paired-fixture
|
||||
- **abcip-5.1/5.2**: `hsby-mux` Python sidecar is a stub; the patched `ab_server` image and live role-flip integration test are gated until that stabilises.
|
||||
|
||||
### AbLegacy auto-demote fixture
|
||||
- **ablegacy-12**: `slc500-faulty` is a commented compose placeholder; tests use the `127.0.0.1:1` ECONNREFUSED trick. Real refusing-proxy fixture is follow-up.
|
||||
|
||||
### TCBSD TwinCAT project
|
||||
- twincat-2.1, 3.1, 3.2, 4.1, 5.1 added new fixture stub files that need to be imported into the actual TwinCAT XAE project before `[TwinCATFact]` integration tests can exercise them:
|
||||
- `PLC/GVLs/GVL_Perf.TcGVL` + `PLC/POUs/FB_PerfChurn.TcPOU` (twincat-2.1)
|
||||
- `PLC/DUTs/ST_NestedFlags.TcDUT`, `ST_RecursiveCap.TcDUT`, `ST_AlarmRecord.TcDUT` (twincat-4.1)
|
||||
- `PLC/GVLs/GVL_Plant.TcGVL` extensions (twincat-4.1)
|
||||
- `PLC/GVLs/GVL_Alarms.TcGVL` + `PLC/POUs/FB_AlarmHarness.TcPOU` (twincat-5.1)
|
||||
|
||||
### Snap7 round-trip tests
|
||||
- s7-d1 (TIA CSV), s7-d2 (UDT fan-out), s7-d3 (instance-DB), s7-c1 (negotiated PDU), s7-c3 (scan groups), s7-c4 (deadband) integration tests are build-only until run against the live Snap7 fixture.
|
||||
|
||||
## Live-firmware / hardware verification
|
||||
|
||||
- **s7-c2** — hardened S7-1500 with non-PG TSAP modes (gated `--with-real-plc`).
|
||||
- **s7-c5** — hardened PLC with PUT/GET disabled (currently only Snap7 happy-path tested).
|
||||
- **s7-f** — manual checklist: toggle Optimized block access in TIA + Track 3 OPC UA bridge verification.
|
||||
- **ablegacy-13** — DH+ via real 1756-DHRIO + PLC-5. No Docker fixture possible.
|
||||
- **twincat-2.1 perf-tier** — `Driver_sum_read_1000_tags_beats_loop_baseline_by_5x` gated `TWINCAT_PERF=1`.
|
||||
- **twincat-2.3** — symbol-version online-change drill (`TWINCAT_MANUAL_ONLINE_CHANGE=1`).
|
||||
- **focas-f4b/c/d** — live CNC parameter / macro / PMC writes + password-protected CNC.
|
||||
|
||||
## Cross-driver / ecosystem
|
||||
|
||||
- **opcuaclient-12** — Galaxy A&E projection currently keeps the fixed-field `ReadEventsAsync(sourceName, ...)` overload; richer SelectClause-aware projection on the Galaxy A&E log is best-effort future work.
|
||||
- **per-driver plan files don't exist** — opcuaclient-12 cross-driver `IHistoryProvider` heads-up went into doc-comments instead. If anyone adds per-driver plan files later, the heads-up note belongs in each.
|
||||
|
||||
## Pre-existing red-build issues (NOT touched, will block solution-level CI)
|
||||
|
||||
- **NU1902 OpenTelemetry warning-as-error in Admin** — predates the queue.
|
||||
- **`Server/Phase7/DriverSubscriptionBridge.cs` cref ambiguity** — predates the queue.
|
||||
|
||||
Both must be fixed before solution-level CI can pass on the merged-up `task-galaxy-e2e`.
|
||||
|
||||
## Integration-branch merge
|
||||
|
||||
- **`auto/driver-gaps` has 92 stacked PRs** vs `task-galaxy-e2e`. Final merge needs a careful single review — likely staged or one big PR — and will collide with whatever has landed on `task-galaxy-e2e` in parallel.
|
||||
|
||||
## Plan-vs-reality deltas (informational; nothing to chase)
|
||||
|
||||
- **focas-f4a/b/c/d** — Plan referenced doc lines that had already been removed in prior evolution (FOCAS.md "intentionally returns BadNotWritable" callout; FOCAS-Test-Fixture.md alarms-not-covered caveat).
|
||||
- **opcuaclient-12** — Repo has no per-driver plan files for abcip / ablegacy / s7 / twincat — heads-up went into IHistoryProvider doc-comments instead.
|
||||
- **twincat-4.1** — `docs/v3/twincat-backlog.md` doesn't exist; UDT-gap-removal item N/A.
|
||||
|
||||
## Highest-leverage cleanup once upstream catches up
|
||||
|
||||
When the upstream library bumps, these reflection / `BadNotSupported` paths simplify to direct calls:
|
||||
|
||||
- **abcip-3.1**: remove reflection fallback in `LibplctagTagRuntime.TrySetIntAttribute(connection_size, ...)`.
|
||||
- **abcip-3.2**: remove `LibplctagTagRuntime.TrySetLogicalAddressing` reflection.
|
||||
- **abcip-3.3**: switch `MultiPacket` runtime from N-reads to true wire-bundle.
|
||||
- **s7-e1**: replace `S7NetSzlReader.ReadAsync` returning null with real `Plc.ReadSzlAsync`.
|
||||
- **s7-e2**: replace `ReflectionS7PlcAuthGate` warning path with direct `Plc.SendPasswordAsync` call.
|
||||
- **twincat-5.1**: ship `AdsTwinCATAlarmGate` binary decoder.
|
||||
+3
-1
@@ -913,7 +913,9 @@ after 6.4 (uses its data). 6.W last.
|
||||
- `Server/Configuration/DriverFactoryRegistry.cs` — remove the
|
||||
`legacy-host` switch arm.
|
||||
|
||||
**Depends on:** PR 7.1 fully soaked (no rollback risk).
|
||||
**Depends on:** parity matrix in `docs/v2/Galaxy.ParityMatrix.md` is
|
||||
fully green or carries documented accepted-deltas (verified
|
||||
2026-04-30 on the dev rig: 14 passed / 1 skipped / 0 failed).
|
||||
|
||||
#### PR 7.3 — Doc + memory housekeeping
|
||||
|
||||
|
||||
@@ -73,13 +73,13 @@ Assert-TextFound "ScriptedAlarmSource implements IAlarmSource" "class ScriptedAl
|
||||
Assert-TextFound "IAlarmStateStore abstraction + in-memory default" "class InMemoryAlarmStateStore" @("src/ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms/IAlarmStateStore.cs")
|
||||
|
||||
Write-Host ""
|
||||
Write-Host "Stream D - Core.AlarmHistorian (SQLite store-and-forward + Galaxy.Host IPC contracts)"
|
||||
Write-Host "Stream D - Core.AlarmHistorian (SQLite store-and-forward; alarm-event sidecar IPC moved to Driver.Historian.Wonderware.Client in PR 3.4)"
|
||||
Assert-FileExists "Core.AlarmHistorian project" "src/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.csproj"
|
||||
Assert-TextFound "SqliteStoreAndForwardSink backoff ladder (1s..60s cap)" "BackoffLadder" @("src/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs")
|
||||
Assert-TextFound "Default 1M row capacity + 30-day dead-letter retention (plan decision #21)" "DefaultDeadLetterRetention" @("src/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs")
|
||||
Assert-TextFound "Per-event outcomes (Ack/RetryPlease/PermanentFail)" "HistorianWriteOutcome" @("src/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/IAlarmHistorianSink.cs")
|
||||
Assert-TextFound "Galaxy.Host IPC contract HistorianAlarmEventRequest" "class HistorianAlarmEventRequest" @("src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/Contracts/HistorianAlarms.cs")
|
||||
Assert-TextFound "Historian connectivity status notification" "HistorianConnectivityStatusNotification" @("src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/Contracts/HistorianAlarms.cs")
|
||||
# Galaxy.Shared pipe-IPC contracts retired in PR 7.2 alongside the rest of the legacy
|
||||
# Galaxy projects. Wonderware sidecar contracts live in Driver.Historian.Wonderware.Client.
|
||||
|
||||
Write-Host ""
|
||||
Write-Host "Stream E - Config DB schema"
|
||||
|
||||
@@ -63,7 +63,9 @@ live driver. The factory-wiring block that originally gated stages
|
||||
Live-boot verification:
|
||||
|
||||
- **Galaxy** — 7/7 stages (read / write / subscribe / alarms / history)
|
||||
against a real Galaxy + `OtOpcUaGalaxyHost` on this dev box.
|
||||
against a real Galaxy via the in-process `GalaxyDriver` →
|
||||
`mxaccessgw` (gRPC). PR 7.2 retired the legacy `OtOpcUaGalaxyHost`
|
||||
out-of-process driver path.
|
||||
- **AB CIP, S7** — 5/5 stages each under task #220 against the
|
||||
`ab_server` + `python-snap7` fixtures.
|
||||
- **AB Legacy** — 5/5 stages under task #222 against `ab_server` SLC500
|
||||
@@ -155,7 +157,7 @@ section to skip it.
|
||||
| Modbus | — | **PASS** (pymodbus fixture) |
|
||||
| AB CIP | — | **PASS** (ab_server fixture) |
|
||||
| AB Legacy | — | **PASS** (ab_server SLC500/MicroLogix/PLC-5 profiles; `/1,0` cip-path required for the Docker fixture) |
|
||||
| Galaxy | — | **PASS** (requires OtOpcUaGalaxyHost + a live Galaxy; 7 stages including alarms + history) |
|
||||
| Galaxy | — | **PASS** (requires mxaccessgw running + a live Galaxy; 7 stages including alarms + history; PR 7.2 retired the legacy OtOpcUaGalaxyHost path) |
|
||||
| S7 | — | **PASS** (python-snap7 fixture) |
|
||||
| FOCAS | `FOCAS_TRUST_WIRE=1` | **SKIP** (no public simulator — task #222 lab rig) |
|
||||
| TwinCAT | `TWINCAT_TRUST_WIRE=1` | **SKIP** by default; features **validated** against the TCBSD VM fixture — set the env var to run |
|
||||
|
||||
@@ -3,14 +3,14 @@
|
||||
|
||||
"modbus": {
|
||||
"$comment": "Port 5020 matches tests/.../Modbus.IntegrationTests/Docker/docker-compose.yml — `docker compose --profile standard up -d`.",
|
||||
"endpoint": "127.0.0.1:5020",
|
||||
"endpoint": "10.100.0.35:5020",
|
||||
"bridgeNodeId": "ns=2;s=Modbus/HR200",
|
||||
"opcUaUrl": "opc.tcp://localhost:4840"
|
||||
},
|
||||
|
||||
"abcip": {
|
||||
"$comment": "ab_server listens on port 44818 (default CIP/EIP). `docker compose --profile controllogix up -d`.",
|
||||
"gateway": "ab://127.0.0.1:44818/1,0",
|
||||
"gateway": "ab://10.100.0.35:44818/1,0",
|
||||
"family": "ControlLogix",
|
||||
"tagPath": "TestDINT",
|
||||
"bridgeNodeId": "ns=2;s=AbCip/TestDINT"
|
||||
@@ -18,7 +18,7 @@
|
||||
|
||||
"ablegacy": {
|
||||
"$comment": "Works against ab_server --profile slc500 (Docker fixture) or real SLC/MicroLogix/PLC-5 hardware. `/1,0` cip-path is required for the Docker fixture; real hardware accepts an empty path — e.g. `ab://10.0.1.50:44818/`.",
|
||||
"gateway": "ab://127.0.0.1/1,0",
|
||||
"gateway": "ab://10.100.0.35/1,0",
|
||||
"plcType": "Slc500",
|
||||
"address": "N7:5",
|
||||
"bridgeNodeId": "ns=2;s=AbLegacy/N7_5"
|
||||
@@ -26,7 +26,7 @@
|
||||
|
||||
"s7": {
|
||||
"$comment": "Port 1102 matches tests/.../S7.IntegrationTests/Docker/docker-compose.yml (python-snap7 needs non-priv port). `docker compose --profile s7_1500 up -d`. Real S7 PLCs listen on 102.",
|
||||
"endpoint": "127.0.0.1:1102",
|
||||
"endpoint": "10.100.0.35:1102",
|
||||
"cpu": "S71500",
|
||||
"slot": 0,
|
||||
"address": "DB1.DBW0",
|
||||
@@ -50,7 +50,7 @@
|
||||
},
|
||||
|
||||
"galaxy": {
|
||||
"$comment": "Galaxy (MXAccess) driver. Has no per-driver CLI — all stages go through otopcua-cli against the published NodeIds. Seven stages: probe / source read / virtual-tag bridge / subscribe-sees-change / reverse write / alarm fires / history read. Requires OtOpcUaGalaxyHost running + seed-phase-7-smoke.sql applied with a real Galaxy attribute substituted into dbo.Tag.TagConfig.",
|
||||
"$comment": "Galaxy (MXAccess) driver. Has no per-driver CLI — all stages go through otopcua-cli against the published NodeIds. Seven stages: probe / source read / virtual-tag bridge / subscribe-sees-change / reverse write / alarm fires / history read. The driver is now the in-process GalaxyDriver (DriverType = 'GalaxyMxGateway') talking gRPC to a separately-installed mxaccessgw at http://localhost:5120 by default — override via the DriverInstance row's DriverConfig. PR 7.2 retired the legacy 'Galaxy' DriverType + OtOpcUaGalaxyHost service.",
|
||||
"sourceNodeId": "ns=2;s=p7-smoke-tag-source",
|
||||
"virtualNodeId": "ns=2;s=p7-smoke-vt-derived",
|
||||
"alarmNodeId": "ns=2;s=p7-smoke-al-overtemp",
|
||||
@@ -62,7 +62,7 @@
|
||||
|
||||
"opcuaclient": {
|
||||
"$comment": "OPC UA Client (gateway) driver. Default opc-plc Docker fixture exposes ns=3;s=FastUInt1 as a ticker. The `bridgeNodeId` is the local mirror of remoteNodeId after the OpcUaClient driver's DiscoverAsync runs — dev-specific. Stages 5/7/8 are opt-in: supply writable* NodeIds to enable reverse-bridge, alarmNodeId to enable alarm, historyNodeId to enable history (opc-plc does not historize by default — a Prosys / UA Expert sample server is needed for stage 8).",
|
||||
"remoteUrl": "opc.tcp://localhost:50000",
|
||||
"remoteUrl": "opc.tcp://10.100.0.35:50000",
|
||||
"remoteNodeId": "ns=3;s=FastUInt1",
|
||||
"bridgeNodeId": "ns=2;s=OpcUaClient/FastUInt1",
|
||||
"bridgeRootNodeId": "ns=2;s=OpcUaClient",
|
||||
|
||||
@@ -1,298 +0,0 @@
|
||||
#Requires -Version 7.0
|
||||
<#
|
||||
.SYNOPSIS
|
||||
End-to-end CLI test for the Galaxy (MXAccess) driver — read, write, subscribe,
|
||||
alarms, and history through a running OtOpcUa server.
|
||||
|
||||
.DESCRIPTION
|
||||
Unlike the other e2e scripts there is no `otopcua-galaxy-cli` — the Galaxy
|
||||
driver proxy lives in-process with the server + talks to `OtOpcUaGalaxyHost`
|
||||
over a named pipe (MXAccess is 32-bit COM, can't ship in the .NET 10 process).
|
||||
Every stage therefore goes through `otopcua-cli` against the published OPC UA
|
||||
address space.
|
||||
|
||||
Seven stages:
|
||||
|
||||
1. Probe — otopcua-cli connect + read the source NodeId; confirms
|
||||
the whole Galaxy.Host → Proxy → server → client chain is
|
||||
up
|
||||
2. Source read — otopcua-cli read returns a Good value for the source
|
||||
attribute; proves IReadable.ReadAsync is dispatching
|
||||
through the IPC bridge
|
||||
3. Virtual-tag bridge — `otopcua-cli read` on the VirtualTag NodeId; confirms
|
||||
the Phase 7 CachedTagUpstreamSource is bridging the
|
||||
driver-sourced input into the scripting engine
|
||||
4. Subscribe-sees-change — subscribe to the source NodeId in the background;
|
||||
Galaxy pushes a data-change event within N seconds
|
||||
(Galaxy's underlying attribute must be actively
|
||||
changing — production Galaxies typically have
|
||||
scan-driven updates; for idle galaxies, widen
|
||||
-ChangeWaitSec or drive the write stage below first)
|
||||
5. Reverse bridge — `otopcua-cli write` to a writable Galaxy attribute;
|
||||
read it back. Gracefully becomes INFO-only if the
|
||||
attribute's Galaxy-side AccessLevel forbids writes
|
||||
(BadUserAccessDenied / BadNotWritable)
|
||||
6. Alarm fires — subscribe to the scripted-alarm Condition NodeId,
|
||||
drive the source tag above its threshold, confirm an
|
||||
Active alarm event surfaces. Exercises the Part 9
|
||||
alarm-condition propagation path
|
||||
7. History read — historyread on the source tag over the last hour;
|
||||
confirms Aveva Historian → IHistoryProvider dispatch
|
||||
returns samples
|
||||
|
||||
The Phase 7 seed (`scripts/smoke/seed-phase-7-smoke.sql`) already plants the
|
||||
right shape — one Galaxy DriverInstance, one source Tag, one VirtualTag
|
||||
(source × 2), one ScriptedAlarm (source > 50). Substitute the real Galaxy
|
||||
attribute FullName into `dbo.Tag.TagConfig` before running.
|
||||
|
||||
.PARAMETER OpcUaUrl
|
||||
OtOpcUa server endpoint. Default opc.tcp://localhost:4840.
|
||||
|
||||
.PARAMETER SourceNodeId
|
||||
NodeId of the driver-sourced Galaxy tag (numeric, writable preferred). NodeIds
|
||||
are path-based per OPC UA Part 3 §5.2.2 — the default matches the Phase 7 seed
|
||||
walking `p7-smoke-galaxy` (DriverInstanceId) → `lab-floor` → `galaxy-line` →
|
||||
`reactor-1` → `Source` (Tag.Name).
|
||||
|
||||
.PARAMETER VirtualNodeId
|
||||
NodeId of the VirtualTag that computes MachineStatus = (Source > 0) (Phase 7
|
||||
scripting). Same path-based scheme, ending in the VirtualTag.Name
|
||||
(`MachineStatus`). The tag is historized so the write/subscribe exercise
|
||||
doubles as a historian-sink check.
|
||||
|
||||
.PARAMETER AlarmNodeId
|
||||
NodeId of the scripted-alarm Condition (fires when Source > 50). Same
|
||||
path-based scheme, ending in ScriptedAlarm.Name (`OverTemp`).
|
||||
|
||||
.PARAMETER AlarmTriggerValue
|
||||
Value written to -SourceNodeId to push it over the alarm threshold.
|
||||
Default 75 (well above the seeded 50-threshold).
|
||||
|
||||
.PARAMETER ChangeWaitSec
|
||||
Seconds the subscribe-sees-change stage waits for a natural data change.
|
||||
Default 10. Idle galaxies may need this extended or the stage will fail
|
||||
with "subscribe did not observe...".
|
||||
|
||||
.PARAMETER AlarmWaitSec
|
||||
Seconds the alarm-fires stage waits after triggering the write. Default 10.
|
||||
|
||||
.PARAMETER HistoryLookbackSec
|
||||
Seconds back from now to query history. Default 3600 (1 h).
|
||||
|
||||
.EXAMPLE
|
||||
# Against the default Phase-7 smoke seed + live Galaxy + OtOpcUa server
|
||||
./scripts/e2e/test-galaxy.ps1
|
||||
|
||||
.EXAMPLE
|
||||
# Custom NodeIds from a non-smoke cluster
|
||||
./scripts/e2e/test-galaxy.ps1 `
|
||||
-SourceNodeId "ns=2;s=Reactor1.Temperature" `
|
||||
-VirtualNodeId "ns=2;s=Reactor1.TempDoubled" `
|
||||
-AlarmNodeId "ns=2;s=Reactor1.OverTemp" `
|
||||
-AlarmTriggerValue 120
|
||||
#>
|
||||
|
||||
param(
|
||||
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
|
||||
[string]$SourceNodeId = "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/Source",
|
||||
[string]$VirtualNodeId = "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/MachineStatus",
|
||||
[string]$AlarmNodeId = "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/OverTemp",
|
||||
[string]$AlarmTriggerValue = "75",
|
||||
[int]$ChangeWaitSec = 10,
|
||||
[int]$AlarmWaitSec = 10,
|
||||
[int]$HistoryLookbackSec = 3600,
|
||||
# The default Phase 7 seed uses a Galaxy attribute with
|
||||
# security_classification=Operate. Anonymous OPC UA sessions are denied writes
|
||||
# against Operate-classified tags (PR 26 / docs/Security.md). Supply an LDAP
|
||||
# user with WriteOperate to exercise the reverse-bridge stage — e.g.
|
||||
# `-Username writeop -Password writeop123` against the dev-box GLAuth.
|
||||
[string]$Username = "",
|
||||
[string]$Password = ""
|
||||
)
|
||||
|
||||
$ErrorActionPreference = "Stop"
|
||||
. "$PSScriptRoot/_common.ps1"
|
||||
|
||||
$opcUaCli = Get-CliInvocation `
|
||||
-ProjectFolder "src/ZB.MOM.WW.OtOpcUa.Client.CLI" `
|
||||
-ExeName "otopcua-cli"
|
||||
|
||||
# Auth-extension helper — appends `-U / -P` to the CLI args when credentials
|
||||
# were supplied. Stays empty for anonymous runs so the default smoke path
|
||||
# doesn't require an LDAP round-trip.
|
||||
$authArgs = @()
|
||||
if ($Username) { $authArgs += @("-U", $Username) }
|
||||
if ($Password) { $authArgs += @("-P", $Password) }
|
||||
|
||||
$results = @()
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Stage 1 — Probe. The probe is an otopcua-cli read against the source NodeId;
|
||||
# success implies Galaxy.Host is up + the pipe ACL lets the server connect +
|
||||
# the Proxy is tracking the tag + the server published it.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
Write-Header "Probe"
|
||||
$probe = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $SourceNodeId) + $authArgs)
|
||||
if ($probe.ExitCode -eq 0 -and $probe.Output -match "Status:\s+0x00000000") {
|
||||
Write-Pass "source NodeId readable (Galaxy pipe → proxy → server → client chain up)"
|
||||
$results += @{ Passed = $true }
|
||||
} else {
|
||||
Write-Fail "probe read failed (exit=$($probe.ExitCode))"
|
||||
Write-Host $probe.Output
|
||||
$results += @{ Passed = $false; Reason = "probe failed" }
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Stage 2 — Source read. Captures the current value for the later virtual-tag
|
||||
# comparison + confirms read dispatch works end-to-end. Failure here without a
|
||||
# stage-1 failure would be unusual — probe already reads.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
Write-Header "Source read"
|
||||
$sourceRead = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $SourceNodeId) + $authArgs)
|
||||
$sourceValue = $null
|
||||
if ($sourceRead.ExitCode -eq 0 -and $sourceRead.Output -match "Value:\s+([^\r\n]+)") {
|
||||
$sourceValue = $Matches[1].Trim()
|
||||
Write-Pass "source value = $sourceValue"
|
||||
$results += @{ Passed = $true }
|
||||
} else {
|
||||
Write-Fail "source read failed"
|
||||
Write-Host $sourceRead.Output
|
||||
$results += @{ Passed = $false; Reason = "source read failed" }
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Stage 3 — Virtual-tag bridge. Reads the Phase 7 VirtualTag (source × 2). Not
|
||||
# strictly driver-specific, but exercises the CachedTagUpstreamSource bridge
|
||||
# (the seam most likely to silently stop working after a Galaxy-side change).
|
||||
# Skip if the VirtualNodeId param is empty (non-Phase-7 clusters).
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
if ([string]::IsNullOrEmpty($VirtualNodeId)) {
|
||||
Write-Header "Virtual-tag bridge"
|
||||
Write-Skip "VirtualNodeId not supplied — skipping Phase 7 bridge check"
|
||||
} else {
|
||||
Write-Header "Virtual-tag bridge"
|
||||
$vtRead = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $VirtualNodeId) + $authArgs)
|
||||
if ($vtRead.ExitCode -eq 0 -and $vtRead.Output -match "Value:\s+([^\r\n]+)") {
|
||||
$vtValue = $Matches[1].Trim()
|
||||
Write-Pass "virtual-tag value = $vtValue (source was $sourceValue)"
|
||||
$results += @{ Passed = $true }
|
||||
} else {
|
||||
Write-Fail "virtual-tag read failed"
|
||||
Write-Host $vtRead.Output
|
||||
$results += @{ Passed = $false; Reason = "virtual-tag read failed" }
|
||||
}
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Stage 4 — Subscribe-sees-change. otopcua-cli subscribe in the background;
|
||||
# wait N seconds for Galaxy to push any data-change event on the source node.
|
||||
# This is optimistic — if the Galaxy attribute is idle, widen -ChangeWaitSec.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
Write-Header "Subscribe sees change"
|
||||
$stdout = New-TemporaryFile
|
||||
$stderr = New-TemporaryFile
|
||||
$subArgs = @($opcUaCli.PrefixArgs) + @(
|
||||
"subscribe", "-u", $OpcUaUrl, "-n", $SourceNodeId,
|
||||
"-i", "500", "--duration", "$ChangeWaitSec") + $authArgs
|
||||
$subProc = Start-Process -FilePath $opcUaCli.File `
|
||||
-ArgumentList $subArgs -NoNewWindow -PassThru `
|
||||
-RedirectStandardOutput $stdout.FullName `
|
||||
-RedirectStandardError $stderr.FullName
|
||||
Write-Info "subscription started (pid $($subProc.Id)) for ${ChangeWaitSec}s"
|
||||
$subProc.WaitForExit(($ChangeWaitSec + 5) * 1000) | Out-Null
|
||||
if (-not $subProc.HasExited) { Stop-Process -Id $subProc.Id -Force }
|
||||
$subOut = (Get-Content $stdout.FullName -Raw) + (Get-Content $stderr.FullName -Raw)
|
||||
Remove-Item $stdout.FullName, $stderr.FullName -ErrorAction SilentlyContinue
|
||||
|
||||
# Any line containing `=` followed by `(Good)` after the initial subscribe-confirmation
|
||||
# indicates at least one data-change tick arrived. The `@(...)` forces an array
|
||||
# so `.Count` works on the 0-match + single-match cases that Set-StrictMode
|
||||
# -Version 3.0 otherwise flags as `property 'Count' cannot be found`.
|
||||
$changeLines = @(($subOut -split "`n") | Where-Object { $_ -match "=\s+.*\(Good\)" })
|
||||
if ($changeLines.Count -gt 0) {
|
||||
Write-Pass "$($changeLines.Count) data-change events observed"
|
||||
$results += @{ Passed = $true }
|
||||
} else {
|
||||
Write-Fail "no data-change events in ${ChangeWaitSec}s — Galaxy attribute may be idle; rerun with -ChangeWaitSec larger, or trigger a change first"
|
||||
Write-Host $subOut
|
||||
$results += @{ Passed = $false; Reason = "no data-change" }
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Stage 5 — Reverse bridge (OPC UA write → Galaxy). Galaxy attributes with
|
||||
# AccessLevel > FreeAccess often reject anonymous writes; record as INFO when
|
||||
# that's the case rather than failing the whole script.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
Write-Header "Reverse bridge (OPC UA write)"
|
||||
$writeValue = [int]$AlarmTriggerValue # reuse the alarm trigger value — two stages for one write
|
||||
$w = Invoke-Cli -Cli $opcUaCli -Args (@(
|
||||
"write", "-u", $OpcUaUrl, "-n", $SourceNodeId, "-v", "$writeValue") + $authArgs)
|
||||
if ($w.ExitCode -ne 0) {
|
||||
# Connection/protocol failure — still a test failure.
|
||||
Write-Fail "write CLI exit=$($w.ExitCode)"
|
||||
Write-Host $w.Output
|
||||
$results += @{ Passed = $false; Reason = "write failed" }
|
||||
} elseif ($w.Output -match "Write failed:\s*0x801F0000") {
|
||||
Write-Info "BadUserAccessDenied — attribute's Galaxy-side ACL blocks writes for this session. Not a bug; grant WriteOperate or run against a writable attribute."
|
||||
$results += @{ Passed = $true; Reason = "acl-expected" }
|
||||
} elseif ($w.Output -match "Write failed:\s*0x803B0000|BadNotWritable") {
|
||||
Write-Info "BadNotWritable — attribute is read-only at the Galaxy layer (status attributes, @-prefixed meta, etc)."
|
||||
$results += @{ Passed = $true; Reason = "readonly-expected" }
|
||||
} elseif ($w.Output -match "Write successful") {
|
||||
# Read back — Galaxy poll interval + MXAccess advise may need a second or two to settle.
|
||||
Start-Sleep -Seconds 2
|
||||
$r = Invoke-Cli -Cli $opcUaCli -Args (@("read", "-u", $OpcUaUrl, "-n", $SourceNodeId) + $authArgs)
|
||||
if ($r.Output -match "Value:\s+$([Regex]::Escape("$writeValue"))\b") {
|
||||
Write-Pass "write propagated — source reads back $writeValue"
|
||||
$results += @{ Passed = $true }
|
||||
} else {
|
||||
Write-Fail "write reported success but read-back did not reflect $writeValue"
|
||||
Write-Host $r.Output
|
||||
$results += @{ Passed = $false; Reason = "write-readback mismatch" }
|
||||
}
|
||||
} else {
|
||||
Write-Fail "unexpected write response"
|
||||
Write-Host $w.Output
|
||||
$results += @{ Passed = $false; Reason = "unexpected write response" }
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Stage 6 — Alarm fires. Uses the helper from _common.ps1. If stage 5 already
|
||||
# wrote the trigger value the alarm may already be active; that's fine — the
|
||||
# Part 9 ConditionRefresh in the alarms CLI replays the current state so the
|
||||
# subscribe window still captures the Active event.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
if ([string]::IsNullOrEmpty($AlarmNodeId)) {
|
||||
Write-Header "Alarm fires on threshold"
|
||||
Write-Skip "AlarmNodeId not supplied — skipping alarm check"
|
||||
} else {
|
||||
$results += Test-AlarmFiresOnThreshold `
|
||||
-OpcUaCli $opcUaCli `
|
||||
-OpcUaUrl $OpcUaUrl `
|
||||
-AlarmNodeId $AlarmNodeId `
|
||||
-InputNodeId $SourceNodeId `
|
||||
-TriggerValue $AlarmTriggerValue `
|
||||
-DurationSec $AlarmWaitSec
|
||||
}
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Stage 7 — History read. historyread against the source tag over the last N
|
||||
# seconds. Failure modes the skip pattern catches: tag not historized in the
|
||||
# Galaxy attribute's historization profile, or the lookback window misses the
|
||||
# sample cadence.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
$results += Test-HistoryHasSamples `
|
||||
-OpcUaCli $opcUaCli `
|
||||
-OpcUaUrl $OpcUaUrl `
|
||||
-NodeId $SourceNodeId `
|
||||
-LookbackSec $HistoryLookbackSec
|
||||
|
||||
Write-Summary -Title "Galaxy e2e" -Results $results
|
||||
if ($results | Where-Object { -not $_.Passed }) { exit 1 }
|
||||
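The subscribe-sees-change stage in the script above reduces to a line filter over the captured CLI output: a line counts as a data-change tick when it contains `=` followed later by `(Good)`. A minimal sketch of that filter in Python, for illustration only. The function name and the sample output are hypothetical, and the real `otopcua-cli` line format may differ; only the regular expression is taken from the script.

```python
import re

# Same pattern the script applies per output line: an "=" followed by
# whitespace and, later on the same line, the literal "(Good)".
# IGNORECASE mirrors PowerShell's -match, which is case-insensitive by default.
CHANGE_LINE = re.compile(r"=\s+.*\(Good\)", re.IGNORECASE)


def count_data_changes(cli_output: str) -> int:
    """Count output lines that look like Good data-change ticks."""
    return sum(1 for line in cli_output.splitlines() if CHANGE_LINE.search(line))


# Hypothetical subscribe output (assumption: not the CLI's actual format).
sample = "\n".join([
    "Subscribed to ns=2;s=Source",
    "Source = 41 (Good)",
    "Source = 42 (Good)",
    "Source = 0 (BadNoCommunication)",
])

if __name__ == "__main__":
    changes = count_data_changes(sample)
    print("PASS" if changes > 0 else "FAIL", changes)
```

Because the filter returns a plain count, the zero-match and single-match cases need no special handling, which is the same concern the `@(...)` wrapper addresses in the PowerShell version.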
@@ -11,7 +11,7 @@
|
||||
of this test use `otopcua-cli` against two different endpoints:
|
||||
|
||||
remote = the upstream OPC UA server the driver connects to (opc-plc fixture
|
||||
by default, opc.tcp://localhost:50000)
|
||||
by default, opc.tcp://10.100.0.35:50000)
|
||||
local = the OtOpcUa server itself, which mirrors remote nodes through the
|
||||
OpcUaClient driver instance (opc.tcp://localhost:4840)
|
||||
|
||||
@@ -72,7 +72,7 @@
|
||||
|
||||
.PARAMETER RemoteUrl
|
||||
Upstream OPC UA server endpoint (the server the driver connects to).
|
||||
Default matches the opc-plc Docker fixture — opc.tcp://localhost:50000.
|
||||
Default matches the opc-plc Docker fixture — opc.tcp://10.100.0.35:50000.
|
||||
|
||||
.PARAMETER OpcUaUrl
|
||||
Local OtOpcUa server endpoint. Default opc.tcp://localhost:4840.
|
||||
@@ -146,7 +146,7 @@
|
||||
#>
|
||||
|
||||
param(
|
||||
[string]$RemoteUrl = "opc.tcp://localhost:50000",
|
||||
[string]$RemoteUrl = "opc.tcp://10.100.0.35:50000",
|
||||
[string]$OpcUaUrl = "opc.tcp://localhost:4840",
|
||||
[string]$RemoteNodeId = "ns=3;s=FastUInt1",
|
||||
[Parameter(Mandatory)] [string]$BridgeNodeId,
|
||||
|
||||
@@ -1,39 +1,52 @@
|
||||
<#
|
||||
.SYNOPSIS
|
||||
Registers the two v2 Windows services on a node: OtOpcUa (main server, net10) and
|
||||
OtOpcUaGalaxyHost (out-of-process Galaxy COM host, net48 x86).
|
||||
Registers the v2 Windows services on a node: OtOpcUa (main server, net10) and
|
||||
optionally OtOpcUaWonderwareHistorian (Wonderware historian sidecar).
|
||||
|
||||
.DESCRIPTION
|
||||
Phase 2 Stream D.2 — replaces the v1 single-service install (TopShelf-based OtOpcUa.Host).
|
||||
Installs both services with the correct service-account SID + per-process shared secret
|
||||
provisioning per `driver-stability.md §"IPC Security"`. Galaxy.Host depends on OtOpcUa
|
||||
(Galaxy.Host must be reachable when OtOpcUa starts; service dependency wiring + retry
|
||||
handled by OtOpcUa.Server NodeBootstrap).
|
||||
PR 7.2 retired the legacy out-of-process OtOpcUaGalaxyHost service alongside the
|
||||
GalaxyProxyDriver / GalaxyHost / GalaxyShared projects. Galaxy access now flows
|
||||
through the in-process GalaxyDriver talking gRPC to a separately-installed
|
||||
mxaccessgw. The mxaccessgw server runs out of its own repo
|
||||
(`c:\Users\dohertj2\Desktop\mxaccessgw\`) — see
|
||||
`docs/v2/Galaxy.ParityRig.md` for the gw setup recipe.
|
||||
|
||||
.PARAMETER InstallRoot
|
||||
Where the binaries live (typically C:\Program Files\OtOpcUa).
|
||||
|
||||
.PARAMETER ServiceAccount
|
||||
Service account SID or DOMAIN\name. Both services run under this account; the
|
||||
Galaxy.Host pipe ACL only allows this SID to connect (decision #76).
|
||||
Service account SID or DOMAIN\name. The OtOpcUa service runs under this account.
|
||||
|
||||
.PARAMETER GalaxySharedSecret
|
||||
Per-process secret passed to Galaxy.Host via env var. Generated freshly per install.
|
||||
.PARAMETER InstallWonderwareHistorian
|
||||
Gate the OtOpcUaWonderwareHistorian sidecar install. Off by default; set when
|
||||
the deployment uses the Wonderware historian for history reads + alarm-event
|
||||
persistence.
|
||||
|
||||
.PARAMETER ZbConnection
|
||||
Galaxy ZB SQL connection string (passed to Galaxy.Host via env var).
|
||||
.PARAMETER HistorianSharedSecret
|
||||
Per-process secret passed to the Historian sidecar via env var. Generated
|
||||
freshly per install when not supplied.
|
||||
|
||||
.EXAMPLE
|
||||
.\Install-Services.ps1 -InstallRoot 'C:\Program Files\OtOpcUa' -ServiceAccount 'OTOPCUA\svc-otopcua'
|
||||
|
||||
.EXAMPLE
|
||||
.\Install-Services.ps1 -InstallRoot 'C:\Program Files\OtOpcUa' -ServiceAccount 'OTOPCUA\svc-otopcua' `
|
||||
-InstallWonderwareHistorian
|
||||
#>
|
||||
[CmdletBinding()]
|
||||
param(
|
||||
[Parameter(Mandatory)] [string]$InstallRoot,
|
||||
[Parameter(Mandatory)] [string]$ServiceAccount,
|
||||
[string]$GalaxySharedSecret,
|
||||
[string]$ZbConnection = 'Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;',
|
||||
[string]$GalaxyClientName = 'OtOpcUa-Galaxy.Host',
|
||||
[string]$GalaxyPipeName = 'OtOpcUaGalaxy'
|
||||
|
||||
# PR 3.W — Wonderware historian sidecar. Optional; gates the
|
||||
# OtOpcUaWonderwareHistorian service. Secret + pipe defaults match the server's
|
||||
# Historian:Wonderware appsettings block.
|
||||
[switch]$InstallWonderwareHistorian,
|
||||
[string]$HistorianSharedSecret,
|
||||
[string]$HistorianPipeName = 'OtOpcUaWonderwareHistorian',
|
||||
[string]$HistorianServer = 'localhost',
|
||||
[int]$HistorianPort = 32568,
|
||||
[string[]]$AvevaServiceDependencies = @('NmxSvc', 'aaBootstrap', 'aaGR')
|
||||
)
|
||||
|
||||
$ErrorActionPreference = 'Stop'
|
||||
@@ -42,17 +55,18 @@ if (-not (Test-Path "$InstallRoot\OtOpcUa.Server.exe")) {
|
||||
Write-Error "OtOpcUa.Server.exe not found at $InstallRoot — copy the publish output first"
|
||||
exit 1
|
||||
}
|
||||
if (-not (Test-Path "$InstallRoot\Galaxy\OtOpcUa.Driver.Galaxy.Host.exe")) {
|
||||
Write-Error "OtOpcUa.Driver.Galaxy.Host.exe not found at $InstallRoot\Galaxy — copy the publish output first"
|
||||
exit 1
|
||||
}
|
||||
|
||||
# Generate a fresh shared secret per install if not supplied. Stored in DPAPI-protected file
|
||||
# rather than the registry so the service account can read it but other local users cannot.
|
||||
if (-not $GalaxySharedSecret) {
|
||||
# Generate fresh shared secrets per install if not supplied.
|
||||
function New-SharedSecret {
|
||||
$bytes = New-Object byte[] 32
|
||||
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($bytes)
|
||||
$GalaxySharedSecret = [Convert]::ToBase64String($bytes)
|
||||
return [Convert]::ToBase64String($bytes)
|
||||
}
|
||||
if ($InstallWonderwareHistorian -and -not $HistorianSharedSecret) { $HistorianSharedSecret = New-SharedSecret }
|
||||
|
||||
if ($InstallWonderwareHistorian -and -not (Test-Path "$InstallRoot\WonderwareHistorian\OtOpcUa.Driver.Historian.Wonderware.exe")) {
|
||||
Write-Error "OtOpcUa.Driver.Historian.Wonderware.exe not found at $InstallRoot\WonderwareHistorian — copy the publish output first"
|
||||
exit 1
|
||||
}
|
||||
|
||||
# Resolve the SID — the IPC ACL needs the SID, not the down-level name.
|
||||
@@ -62,41 +76,67 @@ $sid = if ($ServiceAccount.StartsWith('S-1-')) {
|
||||
(New-Object System.Security.Principal.NTAccount $ServiceAccount).Translate([System.Security.Principal.SecurityIdentifier]).Value
|
||||
}
|
||||
|
||||
# --- Install OtOpcUaGalaxyHost first (OtOpcUa starts after, depends on it being up).
|
||||
$galaxyEnv = @(
|
||||
"OTOPCUA_GALAXY_PIPE=$GalaxyPipeName"
|
||||
"OTOPCUA_ALLOWED_SID=$sid"
|
||||
"OTOPCUA_GALAXY_SECRET=$GalaxySharedSecret"
|
||||
"OTOPCUA_GALAXY_BACKEND=mxaccess"
|
||||
"OTOPCUA_GALAXY_ZB_CONN=$ZbConnection"
|
||||
"OTOPCUA_GALAXY_CLIENT_NAME=$GalaxyClientName"
|
||||
) -join "`0"
|
||||
$galaxyEnv += "`0`0"
|
||||
# --- Install OtOpcUaWonderwareHistorian (PR 3.W) — separate sidecar that exposes the
|
||||
# Wonderware Historian SDK via a named-pipe protocol consumed by the .NET 10 server.
|
||||
# Optional: only installed when -InstallWonderwareHistorian is supplied. Depends on the
|
||||
# hard AVEVA services that host the historian SDK runtime path.
|
||||
$historianDepend = $null
|
||||
if ($InstallWonderwareHistorian) {
|
||||
$historianEnv = @(
|
||||
"OTOPCUA_HISTORIAN_PIPE=$HistorianPipeName"
|
||||
"OTOPCUA_ALLOWED_SID=$sid"
|
||||
"OTOPCUA_HISTORIAN_SECRET=$HistorianSharedSecret"
|
||||
"OTOPCUA_HISTORIAN_ENABLED=true"
|
||||
"OTOPCUA_HISTORIAN_SERVER=$HistorianServer"
|
||||
"OTOPCUA_HISTORIAN_PORT=$HistorianPort"
|
||||
) -join "`0"
|
||||
$historianEnv += "`0`0"
|
||||
|
||||
Write-Host "Installing OtOpcUaGalaxyHost..."
|
||||
& sc.exe create OtOpcUaGalaxyHost binPath= "`"$InstallRoot\Galaxy\OtOpcUa.Driver.Galaxy.Host.exe`"" `
|
||||
DisplayName= 'OtOpcUa Galaxy Host (out-of-process MXAccess)' `
|
||||
start= auto `
|
||||
obj= $ServiceAccount | Out-Null
|
||||
Write-Host "Installing OtOpcUaWonderwareHistorian..."
|
||||
& sc.exe create OtOpcUaWonderwareHistorian binPath= "`"$InstallRoot\WonderwareHistorian\OtOpcUa.Driver.Historian.Wonderware.exe`"" `
|
||||
DisplayName= 'OtOpcUa Wonderware Historian Sidecar (out-of-process aahClient)' `
|
||||
start= auto `
|
||||
depend= ($AvevaServiceDependencies -join '/') `
|
||||
obj= $ServiceAccount | Out-Null
|
||||
& sc.exe config OtOpcUaWonderwareHistorian start= delayed-auto | Out-Null
|
||||
|
||||
# Set per-service environment variables via the registry — sc.exe doesn't expose them directly.
|
||||
$svcKey = "HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost"
|
||||
$envValue = $galaxyEnv.Split("`0") | Where-Object { $_ -ne '' }
|
||||
Set-ItemProperty -Path $svcKey -Name 'Environment' -Type MultiString -Value $envValue
|
||||
$svcKey = "HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaWonderwareHistorian"
|
||||
$envValue = $historianEnv.Split("`0") | Where-Object { $_ -ne '' }
|
||||
Set-ItemProperty -Path $svcKey -Name 'Environment' -Type MultiString -Value $envValue
|
||||
|
||||
$historianDepend = 'OtOpcUaWonderwareHistorian'
|
||||
}
|
||||
|
||||
# --- Install OtOpcUa. Galaxy access flows through GalaxyDriver → mxaccessgw (gRPC),
|
||||
# so OtOpcUa no longer depends on a sibling service for Galaxy connectivity. The
|
||||
# mxaccessgw is installed separately. When the Wonderware sidecar is installed,
|
||||
# depend on it for startup ordering.
|
||||
$otOpcUaDepends = @()
|
||||
if ($historianDepend) { $otOpcUaDepends += $historianDepend }
|
||||
|
||||
# --- Install OtOpcUa (depends on Galaxy host being installed; doesn't strictly require it
|
||||
# started — OtOpcUa.Server NodeBootstrap retries on the IPC connect path).
|
||||
Write-Host "Installing OtOpcUa..."
|
||||
& sc.exe create OtOpcUa binPath= "`"$InstallRoot\OtOpcUa.Server.exe`"" `
|
||||
DisplayName= 'OtOpcUa Server' `
|
||||
start= auto `
|
||||
depend= 'OtOpcUaGalaxyHost' `
|
||||
obj= $ServiceAccount | Out-Null
|
||||
$createArgs = @(
|
||||
'create', 'OtOpcUa',
|
||||
'binPath=', "`"$InstallRoot\OtOpcUa.Server.exe`"",
|
||||
'DisplayName=', 'OtOpcUa Server',
|
||||
'start=', 'auto',
|
||||
'obj=', $ServiceAccount
|
||||
)
|
||||
if ($otOpcUaDepends.Count -gt 0) {
|
||||
$createArgs += @('depend=', ($otOpcUaDepends -join '/'))
|
||||
}
|
||||
& sc.exe @createArgs | Out-Null
|
||||
|
||||
Write-Host ""
|
||||
Write-Host "Installed. Start with:"
|
||||
Write-Host " sc.exe start OtOpcUaGalaxyHost"
|
||||
if ($InstallWonderwareHistorian) { Write-Host " sc.exe start OtOpcUaWonderwareHistorian" }
|
||||
Write-Host " sc.exe start OtOpcUa"
|
||||
if ($InstallWonderwareHistorian) {
|
||||
Write-Host ""
|
||||
Write-Host "Wonderware historian shared secret (configure into appsettings.json Historian:Wonderware:SharedSecret):"
|
||||
Write-Host " $HistorianSharedSecret"
|
||||
}
|
||||
Write-Host ""
|
||||
Write-Host "Galaxy shared secret (record this offline — required for service rebinding):"
|
||||
Write-Host " $GalaxySharedSecret"
|
||||
Write-Host "NOTE: Galaxy access flows through mxaccessgw — install + run that separately"
|
||||
Write-Host " per docs/v2/Galaxy.ParityRig.md. OtOpcUa connects via the Galaxy.Gateway"
|
||||
Write-Host " section of appsettings.json (default endpoint http://localhost:5120)."
|
||||
|
||||
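Two pieces of the install script above lend themselves to a short illustration: generating a 32-byte Base64 shared secret, and appending `depend=` to the `sc.exe create` arguments only when a dependency exists. The following Python sketch mirrors that logic under stated assumptions. The function names are hypothetical, and the script itself uses `RandomNumberGenerator` and PowerShell splatting rather than anything shown here.

```python
import base64
import secrets


def new_shared_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically strong randomness, Base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def build_create_args(install_root: str, service_account: str, depends: list[str]) -> list[str]:
    """Build the sc.exe argument list; 'depend=' is added only when needed."""
    args = [
        "create", "OtOpcUa",
        "binPath=", f'"{install_root}\\OtOpcUa.Server.exe"',
        "DisplayName=", "OtOpcUa Server",
        "start=", "auto",
        "obj=", service_account,
    ]
    if depends:
        # sc.exe expects multiple dependencies separated by forward slashes.
        args += ["depend=", "/".join(depends)]
    return args


if __name__ == "__main__":
    # Values taken from the script's own example invocation.
    root = r"C:\Program Files\OtOpcUa"
    account = r"OTOPCUA\svc-otopcua"
    print(build_create_args(root, account, []))
    print(build_create_args(root, account, ["OtOpcUaWonderwareHistorian"]))
    print(len(new_shared_secret()))
```

Building the list first and appending the optional pair afterwards avoids passing an empty `depend=` value when the historian sidecar is not installed.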
@@ -1,11 +1,18 @@
|
||||
<#
|
||||
.SYNOPSIS
|
||||
Stops + removes the two v2 services. Mirrors Install-Services.ps1.
|
||||
Stops + removes the v2 services. Mirrors Install-Services.ps1.
|
||||
|
||||
.DESCRIPTION
|
||||
PR 7.2 retired the legacy OtOpcUaGalaxyHost service. Galaxy access now flows
|
||||
through the in-process GalaxyDriver against a separately-installed mxaccessgw.
|
||||
OtOpcUaGalaxyHost is included in the cleanup loop below so this script safely
|
||||
removes it from any rig still carrying the legacy service from a pre-7.2
|
||||
install.
|
||||
#>
|
||||
[CmdletBinding()] param()
|
||||
$ErrorActionPreference = 'Continue'
|
||||
|
||||
foreach ($svc in 'OtOpcUa', 'OtOpcUaGalaxyHost') {
|
||||
foreach ($svc in 'OtOpcUa', 'OtOpcUaWonderwareHistorian', 'OtOpcUaGalaxyHost') {
|
||||
if (Get-Service $svc -ErrorAction SilentlyContinue) {
|
||||
Write-Host "Stopping $svc..."
|
||||
Stop-Service $svc -Force -ErrorAction SilentlyContinue
|
||||
|
||||
@@ -0,0 +1,21 @@
|
||||
{
|
||||
"auto-managed": 10,
|
||||
"cross-driver": 14,
|
||||
"driver/abcip": 13,
|
||||
"driver/ablegacy": 16,
|
||||
"driver/focas": 11,
|
||||
"driver/opcuaclient": 12,
|
||||
"driver/s7": 19,
|
||||
"driver/twincat": 17,
|
||||
"phase/1": 8,
|
||||
"phase/2": 7,
|
||||
"phase/3": 6,
|
||||
"phase/4": 5,
|
||||
"phase/5": 4,
|
||||
"phase/6": 3,
|
||||
"queue/blocked": 2,
|
||||
"queue/done": 15,
|
||||
"queue/failed": 9,
|
||||
"queue/in-progress": 1,
|
||||
"queue/queued": 18
|
||||
}
|
||||
@@ -0,0 +1,320 @@
|
||||
- id: twincat-1.1
  driver: twincat
  phase: 1
  plan_pr_id: "1.1"
  title: "TwinCAT — Int64 fidelity for LINT/ULINT"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Map LInt/ULInt to DriverDataType.Int64 instead of silently truncating to Int32.
    The TwinCATDataType.cs:40 truncation comment "matches Int64 gap" is removed and
    MapToClrType already returns long/ulong, so the wire-level read returns the
    correct boxed types. May add Int64 to Core.Abstractions DriverDataType enum if
    missing. Closes a long-standing fixture caveat noted in the test suite.
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDataType.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverDataType.cs"
  docs:
    - "docs/Driver.TwinCAT.Cli.md"
    - "docs/drivers/TwinCAT-Test-Fixture.md"
  fixture:
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/PLC/GVLs/GVL_Primitives.TcGVL"
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md"
  e2e: []
  effort: S
  deps: []
  cross_driver: false
  notes: "Hardware-gated via TWINCAT_TARGET_NETID; no e2e change to test-twincat.ps1."
|
||||
|
||||
- id: twincat-1.2
  driver: twincat
  phase: 1
  plan_pr_id: "1.2"
  title: "TwinCAT — TIME/DATE/DT/TOD as native UA types"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Stop marshalling IEC TIME/DATE/DT/TOD as raw UDINT and convert to native UA
    Duration/DateTime types via post-processing in ReadValueAsync, ConvertForWrite,
    and OnAdsNotificationEx. May expose missing Duration in DriverDataType. CLI
    syntax updates so users write ISO-8601 / IEC literals instead of numeric raw
    values.
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDataType.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs"
  docs:
    - "docs/Driver.TwinCAT.Cli.md"
  fixture:
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/PLC/GVLs/GVL_Primitives.TcGVL"
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md"
  e2e: []
  effort: M
  deps: []
  cross_driver: false
  notes: "Hardware-gated via TWINCAT_TARGET_NETID. May add Duration to DriverDataType enum."
|
||||
|
||||
- id: twincat-1.3
  driver: twincat
  phase: 1
  plan_pr_id: "1.3"
  title: "TwinCAT — Bit-indexed BOOL writes (read-modify-write)"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Replace the NotSupportedException at AdsTwinCATClient.cs:99 with read-modify-write
    on the parent word, serializing concurrent bit writes to the same parent via a
    keyed SemaphoreSlim. Closes referenced task #181. CLI gains an example and the
    fixture caveat in the bugs-caught list updates to note writes now work.
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs"
  docs:
    - "docs/Driver.TwinCAT.Cli.md"
    - "docs/drivers/TwinCAT-Test-Fixture.md"
  fixture: []
  e2e: []
  effort: S
  deps: []
  cross_driver: false
  notes: "Reuses GVL_Primitives.vWord (0xBEEF) — no fixture schema change."
|
||||
|
||||
- id: twincat-1.4
  driver: twincat
  phase: 1
  plan_pr_id: "1.4"
  title: "TwinCAT — Multi-dim and whole-array reads"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Expand ReadValueAsync/WriteValueAsync to handle whole-array reads in a single
    AdsClient call rather than element-by-element. Surface IsArray + ArrayDimensions
    on TwinCATTagDefinition and through DriverAttributeInfo from DiscoverAsync. Sets
    up the array-shape plumbing the rest of the driver needs.
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriverOptions.cs"
  docs:
    - "docs/Driver.TwinCAT.Cli.md"
    - "docs/drivers/TwinCAT-Test-Fixture.md"
  fixture:
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/PLC/GVLs/GVL_Arrays.TcGVL"
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md"
  e2e: []
  effort: M
  deps: []
  cross_driver: false
  notes: "Hardware-gated via TWINCAT_TARGET_NETID. New 5x5 aReal2D seed with deterministic pattern."
|
||||
|
||||
- id: twincat-1.5
  driver: twincat
  phase: 1
  plan_pr_id: "1.5"
  title: "TwinCAT — ENUM and ALIAS at discovery"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    MapSymbolTypeName currently returns null for non-atomic types, dropping ENUM and
    ALIAS symbols silently. Switch to inspecting symbol.DataType + Category from
    TwinCAT.TypeSystem so DataTypeCategory.Enum walks EnumValues and Alias resolves
    to base atomic recursively. Surface enum members for later EnumStrings rendering.
    POINTER/REFERENCE/INTERFACE/UNION explicitly out of scope.
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs"
  docs:
    - "docs/Driver.TwinCAT.Cli.md"
    - "docs/drivers/TwinCAT-Test-Fixture.md"
  fixture:
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md"
  e2e: []
  effort: M
  deps: []
  cross_driver: false
  notes: "Reuses existing GVL_Enums + DUTs; only README integration-test contract entry added."
|
||||
|
||||
- id: twincat-2.1
  driver: twincat
  phase: 2
  plan_pr_id: "2.1"
  title: "TwinCAT — ADS Sum-read / Sum-write"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Replace per-tag ReadValueAsync loops with Beckhoff's ADS Sum commands
    (IndexGroup 0xF080-0xF084) via SumSymbolRead/SumSymbolWrite to batch N
    reads/writes per AMS request. Bucket fullReferences by DeviceHostAddress and
    expose a new ReadValuesAsync surface on ITwinCATClient. Targets ~10x throughput
    on multi-thousand-tag scans; perf-tier test gated behind TWINCAT_PERF=1.
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/ITwinCATClient.cs"
  docs:
    - "docs/v3/twincat-backlog.md"
    - "docs/drivers/TwinCAT-Test-Fixture.md"
  fixture:
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/PLC/GVLs/GVL_Perf.TcGVL"
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md"
  e2e: []
  effort: L
  deps: []
  cross_driver: false
  notes: "Perf test gated behind TWINCAT_PERF=1 plus TWINCAT_TARGET_NETID; new FB_PerfChurn POU."
|
||||
|
||||
- id: twincat-2.2
  driver: twincat
  phase: 2
  plan_pr_id: "2.2"
  title: "TwinCAT — Handle-based access with caching"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Cache CreateVariableHandleAsync results so per-read overhead drops to
    read-by-handle (4-byte index vs N-byte symbol path). On
    DeviceSymbolVersionInvalid (0x710) evict and retry once. Clear cache on
    AdsClient reconnect until the symbol-version listener (PR 2.3) ships. Dispose
    path calls DeleteVariableHandleAsync for cached handles.
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs"
  docs:
    - "docs/Driver.TwinCAT.Cli.md"
    - "docs/drivers/TwinCAT-Test-Fixture.md"
  fixture: []
  e2e: []
  effort: M
  deps: []
  cross_driver: false
  notes: "Combines with PR 2.1 for sum-read-by-handle. Reuses GVL_Perf.aTags."
|
||||
|
||||
- id: twincat-2.3
  driver: twincat
  phase: 2
  plan_pr_id: "2.3"
  title: "TwinCAT — Symbol-version invalidation listener"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Register an AddDeviceNotificationAsync on the symbol-version index group
    (AdsReservedIndexGroup.SymbolVersion 0xF008) so the handle cache from PR 2.2
    is wiped on online-change bumps. Initial integration test gated as
    requires-manual-online-change until automation lands. Resolves open question
    (c) confirming the v6 enum constant.
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs"
  docs:
    - "docs/drivers/TwinCAT-Test-Fixture.md"
  fixture:
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md"
  e2e: []
  effort: M
  deps: ["twincat-2.2"]
  cross_driver: false
  notes: "Hardware-gated via TWINCAT_TARGET_NETID; manual online-change drill documented in README."
|
||||
|
||||
- id: twincat-3.1
  driver: twincat
  phase: 3
  plan_pr_id: "3.1"
  title: "TwinCAT — Per-tag MaxDelay tuning"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Surface NotificationSettings MaxDelay as a per-tag option (default 0 to
    preserve current behavior). Plumb int? MaxDelayMs through TwinCATTagDefinition,
    SubscribeAsync, and AddNotificationAsync. Coalesces high-frequency PLC signals
    so the OPC UA subscription queue stops flooding under bursty change rates.
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriverOptions.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs"
  docs:
    - "docs/Driver.TwinCAT.Cli.md"
    - "docs/drivers/TwinCAT-Test-Fixture.md"
  fixture:
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md"
  e2e: []
  effort: S
  deps: []
  cross_driver: false
  notes: "Reuses GVL_Fixture.nCounter as 100 Hz driver. Hardware-gated via TWINCAT_TARGET_NETID."
|
||||
|
||||
- id: twincat-3.2
  driver: twincat
  phase: 3
  plan_pr_id: "3.2"
  title: "TwinCAT — Cycle-time / jitter / PLC-state diagnostics"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Augment the probe loop to read _AppInfo.OnlineChangeCnt/AppName and
    _TaskInfo[1].CycleTime/LastExecTime, surface as TwinCATDeviceDiagnostics on
    DeviceState, and emit through IDriverDiagnostics (cross-driver surface from
    Modbus task #154). Read system symbols directly without going through the user
    browse filter.
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATSystemSymbolFilter.cs"
  docs:
    - "docs/drivers/TwinCAT-Test-Fixture.md"
    - "docs/Driver.TwinCAT.Cli.md"
    - "docs/v3/twincat-backlog.md"
  fixture:
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md"
  e2e: []
  effort: M
  deps: []
  cross_driver: true
  notes: "Reuses IDriverDiagnostics from Modbus task #154. Hardware-gated via TWINCAT_TARGET_NETID."
|
||||
|
||||
- id: twincat-4.1
  driver: twincat
  phase: 4
  plan_pr_id: "4.1"
  title: "TwinCAT — Nested UDT browse via online type walker"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Largest single piece of work. Recurse BrowseSymbolsAsync into IStructType.SubItems
    yielding one TwinCATDiscoveredSymbol per leaf with dotted instance paths. Expand
    arrays-of-structs up to a configurable bound (default 1024). Add a pure
    TwinCATTypeWalker helper. Folds recursed structure into Discovered/ folder tree.
    Online runtime path only — TMC offline parsing deferred per open question (a).
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/AdsTwinCATClient.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATTypeWalker.cs"
  docs:
    - "docs/Driver.TwinCAT.Cli.md"
    - "docs/drivers/TwinCAT-Test-Fixture.md"
    - "docs/v3/twincat-backlog.md"
  fixture:
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/PLC/DUTs/ST_NestedFlags.TcDUT"
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/PLC/DUTs/ST_RecursiveCap.TcDUT"
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/PLC/DUTs/ST_AlarmRecord.TcDUT"
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md"
  e2e: []
  effort: L
  deps: ["twincat-1.5"]
  cross_driver: false
  notes: "Hardware-gated via TWINCAT_TARGET_NETID. PR 1.4 helpful but not blocking."
|
||||
|
||||
- id: twincat-5.1
  driver: twincat
  phase: 5
  plan_pr_id: "5.1"
  title: "TwinCAT — IAlarmSource via TC3 EventLogger"
  plan_anchor: "docs/plans/twincat-plan.md"
  summary: |
    Implement IAlarmSource over TcEventLogger on AMS port 110 so PLC alarms
    surface as OPC UA AC events. Begins with a one-day spike (open question (b))
    documented in docs/v3/twincat-eventlogger-spike.md to determine if a managed
    wrapper exists or if we hit AMS port 110 directly via a secondary AdsClient
    + AddDeviceNotificationAsync on the alarm-list index group. Gated by new
    EnableAlarms option (default false).
  files:
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATAlarmSource.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriver.cs"
    - "src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT/TwinCATDriverOptions.cs"
  docs:
    - "docs/drivers/TwinCAT.md"
    - "docs/v3/twincat-eventlogger-spike.md"
    - "docs/Driver.TwinCAT.Cli.md"
    - "docs/drivers/TwinCAT-Test-Fixture.md"
  fixture:
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/PLC/POUs/FB_AlarmHarness.TcPOU"
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/PLC/GVLs/GVL_Alarms.TcGVL"
    - "tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/TwinCatProject/README.md"
  e2e:
    - "scripts/e2e/test-twincat.ps1"
  effort: L
  deps: []
  cross_driver: false
  notes: "Hardware-gated via TWINCAT_TARGET_NETID. Spike-first; e2e Test-AlarmRoundTrip likely deferred to follow-up."
|
||||
@@ -0,0 +1,36 @@
|
||||
# Plan-execution queue
|
||||
|
||||
Gitea-backed work queue that drives the per-driver implementation plans (`docs/plans/*-plan.md`) to completion in **Mode B** (autonomous: auto-merges into the `auto/driver-gaps` integration branch when build+tests pass).
|
||||
|
||||
## Pieces
|
||||
|
||||
- `pr-manifest.yaml` — canonical list of every PR across all six plans.
|
||||
- `setup-labels.sh` — idempotently creates the queue labels in Gitea.
|
||||
- `file-issues.sh` — files one Gitea issue per manifest entry (idempotent — skips ids that already exist).
|
||||
- `next-pr.sh` — picks the next eligible queue issue (queued, blockers all done) as JSON.
|
||||
- `start-pr.sh ISSUE BRANCH` — flips queued → in-progress and creates the branch off `auto/driver-gaps`.
|
||||
- `open-pr.sh ISSUE BRANCH TITLE BODY_FILE` — opens a PR from BRANCH into `auto/driver-gaps`.
|
||||
- `merge-pr.sh PR` — merges a PR with branch-delete (Mode B).
|
||||
- `finish-pr.sh ISSUE success PR` / `finish-pr.sh ISSUE failed REASON_FILE` — closes / marks failed.
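
The eligibility rule behind `next-pr.sh` (queued, every blocker done, lowest phase first, then lowest issue number) can be sketched as a pure function. This is an illustrative sketch only: the flat issue dictionaries and the name `pick_next` are assumptions made for the example, not the shape the Gitea API returns or the script's actual code.

```python
from typing import Optional


def pick_next(issues: list[dict]) -> Optional[dict]:
    """Return the next eligible issue, or None when nothing is eligible.

    Each issue is a simplified, hypothetical dict:
    {"number": int, "id": str, "phase": int, "state": "open" | "closed",
     "labels": set[str], "deps": list[str]}
    """
    by_id = {it["id"]: it for it in issues}

    def is_done(it: dict) -> bool:
        # A dependency counts as done once it is closed or labelled queue/done.
        return it["state"] == "closed" or "queue/done" in it["labels"]

    eligible = [
        it
        for it in issues
        if it["state"] == "open"
        and "queue/queued" in it["labels"]
        # An unknown dependency id blocks the issue, same as an unfinished one.
        and all(d in by_id and is_done(by_id[d]) for d in it["deps"])
    ]
    if not eligible:
        return None
    # Lowest phase wins; ties are broken by the lowest issue number.
    return min(eligible, key=lambda it: (it["phase"], it["number"]))
```

With this rule, an issue such as `twincat-2.3` (which depends on `twincat-2.2`) is never returned until `twincat-2.2` is closed or carries `queue/done`.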
|
||||
|
||||
## Flow per loop iteration
|
||||
|
||||
1. `next-pr.sh` → issue#, branch, canonical id.
|
||||
2. `start-pr.sh` → mark in-progress, create branch.
|
||||
3. Loop driver dispatches a Claude Agent to implement the PR on the branch.
|
||||
4. Loop runs `dotnet build` + `dotnet test`.
|
||||
5. On green: `open-pr.sh`, `merge-pr.sh`, `finish-pr.sh success`.
|
||||
6. On red: capture log → `finish-pr.sh failed log.txt`. Issue stays open with `queue/failed` label for retry.
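
The label changes implied by this flow can be summarised as a small state machine. The sketch below is an assumption-laden illustration: the event names (`start`, `success`, `failure`, `requeue`) and the function `apply_event` are hypothetical and do not appear in the helper scripts, which perform these swaps through the Gitea API.

```python
def apply_event(labels: set[str], event: str) -> tuple[set[str], bool]:
    """Return (new_labels, issue_stays_open) after one queue event."""
    # event -> (label removed, label added, issue remains open afterwards)
    transitions = {
        "start": ("queue/queued", "queue/in-progress", True),     # start-pr.sh
        "success": ("queue/in-progress", "queue/done", False),    # finish-pr.sh success
        "failure": ("queue/in-progress", "queue/failed", True),   # finish-pr.sh failed
        "requeue": ("queue/failed", "queue/queued", True),        # manual reset
    }
    if event not in transitions:
        raise ValueError(f"unknown event: {event}")
    old, new, stays_open = transitions[event]
    if old not in labels:
        raise ValueError(f"{event} requires label {old}")
    # Non-queue labels such as auto-managed or driver/* are left untouched.
    return (labels - {old}) | {new}, stays_open
```

A failed issue therefore only re-enters the loop after an explicit `requeue`, which matches the manual relabelling described under "Reset / debug".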
|
||||
|
||||
## Environment
|
||||
|
||||
- Gitea repo: `dohertj2/lmxopcua` on `gitea.dohertylan.com`.
|
||||
- Token: read from `%LOCALAPPDATA%\tea\config.yml` (or `$GITEA_TOKEN` override).
|
||||
- Integration branch: `auto/driver-gaps` (created off master).
|
||||
- Per-PR branches: `auto/<driver>/<plan-pr-id>`.
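
As a small illustration of the per-PR branch convention, the helper below builds a branch name from a driver and plan PR id. The function name `branch_name` is hypothetical; the replacement of `/` with `-` in the plan PR id mirrors what `file-issues.sh` and `next-pr.sh` in this change do when they format the branch.

```python
INTEGRATION_BRANCH = "auto/driver-gaps"  # target of every per-PR branch


def branch_name(driver: str, plan_pr_id: str) -> str:
    """Build auto/<driver>/<plan-pr-id>, keeping the id to a single path segment."""
    return f"auto/{driver}/{plan_pr_id.replace('/', '-')}"
```

For example, driver `twincat` with plan PR id `1.1` yields `auto/twincat/1.1`, which is then merged into `auto/driver-gaps`.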
|
||||
|
||||
## Reset / debug
|
||||
|
||||
- Re-list eligible issues: `bash scripts/queue/next-pr.sh`.
|
||||
- Manually unblock: remove `queue/blocked` label and add `queue/queued`.
|
||||
- Drop a failed PR back into queue: remove `queue/failed`, add `queue/queued`.
|
||||
@@ -0,0 +1,122 @@
|
||||
#!/usr/bin/env bash
|
||||
# Reads scripts/queue/pr-manifest.yaml and creates one Gitea issue per PR.
|
||||
# Idempotent: skips PRs whose canonical id already exists on an issue (open or closed).
|
||||
set -euo pipefail
|
||||
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
. "$HERE/lib.sh"
|
||||
|
||||
if [ ! -f "$MANIFEST" ]; then
|
||||
echo "manifest not found: $MANIFEST" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# The canonical-id → issue# mapping (from queue-meta blocks) is collected by the
# Python block below, which pages through the issues API itself.
|
||||
|
||||
python - "$MANIFEST" "$LABEL_MAP" <<'PY'
import json, sys, re, yaml, urllib.request, os

manifest_path, label_map_path = sys.argv[1], sys.argv[2]
gitea_token = os.environ["GITEA_TOKEN"]
api_base = "https://gitea.dohertylan.com/api/v1/repos/dohertj2/lmxopcua"

with open(manifest_path) as f: manifest = yaml.safe_load(f)
with open(label_map_path) as f: lmap = json.load(f)

def api(method, path, data=None):
    req = urllib.request.Request(
        f"{api_base}/{path}",
        method=method,
        headers={
            "Authorization": f"token {gitea_token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        data=json.dumps(data).encode() if data else None,
    )
    with urllib.request.urlopen(req) as r:
        return json.loads(r.read().decode())

# Collect existing issues' canonical ids → issue#
existing = {}
page = 1
while True:
    items = api("GET", f"issues?state=all&type=issues&limit=50&page={page}")
    if not items: break
    for it in items:
        m = re.search(r'<!-- queue-meta\s*(\{.*?\})\s*-->', it.get("body","") or "", re.S)
        if m:
            try:
                meta = json.loads(m.group(1))
                if "id" in meta:
                    existing[meta["id"]] = it["number"]
            except: pass
    page += 1

print(f"existing queue issues: {len(existing)}")

filed = 0
skipped = 0
for pr in manifest["prs"]:
    if pr["id"] in existing:
        skipped += 1
        continue
    title = f"[{pr['driver']}] {pr['title']}"
    meta = {
        "id": pr["id"],
        "driver": pr["driver"],
        "phase": pr["phase"],
        "plan_pr_id": pr.get("plan_pr_id",""),
        "deps": pr.get("deps", []),
        "cross_driver": pr.get("cross_driver", False),
    }
    body_parts = [
        f"<!-- queue-meta\n{json.dumps(meta)}\n-->",
        "## Auto-managed PR — Mode B (autonomous)",
        f"**Driver**: `{pr['driver']}` **Phase**: `{pr['phase']}` **Plan PR**: `{pr.get('plan_pr_id','')}`",
        f"**Plan**: [`{pr.get('plan_anchor','docs/plans/' + pr['driver'] + '-plan.md')}`]({pr.get('plan_anchor','../docs/plans/' + pr['driver'] + '-plan.md')})",
        f"**Effort**: `{pr.get('effort','M')}` **Cross-driver**: `{pr.get('cross_driver', False)}`",
        "",
        "## Summary",
        pr.get("summary","_(see plan)_"),
    ]
    if pr.get("files"):
        body_parts += ["", "## Source files", *[f"- `{f}`" for f in pr["files"]]]
    if pr.get("docs"):
        body_parts += ["", "## Docs", *[f"- `{d}`" for d in pr["docs"]]]
    if pr.get("fixture"):
        body_parts += ["", "## Fixture", *[f"- `{x}`" for x in pr["fixture"]]]
    if pr.get("e2e"):
        body_parts += ["", "## E2E", *[f"- `{x}`" for x in pr["e2e"]]]
    if pr.get("deps"):
        body_parts += ["", "## Depends on", *[f"- canonical: `{d}`" for d in pr["deps"]]]
    if pr.get("notes"):
        body_parts += ["", "## Notes", pr["notes"]]
    body_parts += ["",
        "---",
        f"_Branch: `auto/{pr['driver']}/{pr.get('plan_pr_id','').replace('/','-')}`. Target: `auto/driver-gaps`._"]
    body = "\n".join(body_parts)

    label_names = [
        f"driver/{pr['driver']}",
        f"phase/{pr['phase']}",
        "queue/queued",
        "auto-managed",
    ]
    if pr.get("cross_driver"): label_names.append("cross-driver")
    label_ids = [lmap[n] for n in label_names if n in lmap]
    issue = api("POST", "issues", {"title": title, "body": body, "labels": label_ids})
    print(f" filed #{issue['number']}: {pr['id']}")
    filed += 1

print(f"\nfiled {filed}, skipped (existing) {skipped}")
PY
|
||||
@@ -0,0 +1,39 @@
|
||||
#!/usr/bin/env bash
# Closes the issue (success) or labels it failed and leaves it open for retry.
# Usage:
#   finish-pr.sh ISSUE_NUM success PR_NUM
#   finish-pr.sh ISSUE_NUM failed REASON_FILE
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
. "$HERE/lib.sh"

ISSUE="${1:?ISSUE_NUM required}"
RESULT="${2:?success|failed required}"
ARG3="${3:?PR_NUM or REASON_FILE required}"

INPROG=$(python -c "import json; print(json.load(open('$LABEL_MAP'))['queue/in-progress'])")
DONE=$(python -c "import json; print(json.load(open('$LABEL_MAP'))['queue/done'])")
FAILED=$(python -c "import json; print(json.load(open('$LABEL_MAP'))['queue/failed'])")

# The in-progress label may already be gone; ignore a failed delete.
api_repo DELETE "issues/$ISSUE/labels/$INPROG" >/dev/null || true

case "$RESULT" in
  success)
    PR_NUM="$ARG3"
    api_repo POST "issues/$ISSUE/labels" "{\"labels\":[$DONE]}" >/dev/null
    BODY=$(python -c "import json; print(json.dumps({'body':'✅ Auto-loop completed. Merged via PR #$PR_NUM.'}))")
    api_repo POST "issues/$ISSUE/comments" "$BODY" >/dev/null
    api_repo PATCH "issues/$ISSUE" '{"state":"closed"}' >/dev/null
    echo " issue #$ISSUE closed (PR #$PR_NUM merged)"
    ;;
  failed)
    REASON_FILE="$ARG3"
    api_repo POST "issues/$ISSUE/labels" "{\"labels\":[$FAILED]}" >/dev/null
    # Pass the path as an argument instead of splicing it into the Python source,
    # and cap the log at 4000 characters so the comment stays readable.
    BODY=$(python -c "
import json, os, sys
path = sys.argv[1]
log = open(path, errors='replace').read()[:4000] if os.path.exists(path) else '(no log)'
fence = chr(96) * 3
print(json.dumps({'body': '❌ Auto-loop failed.\n\n' + fence + '\n' + log + '\n' + fence}))
" "$REASON_FILE")
    api_repo POST "issues/$ISSUE/comments" "$BODY" >/dev/null
    echo " issue #$ISSUE marked failed (still open for retry)"
    ;;
  *)
    echo "unknown result: $RESULT" >&2; exit 1 ;;
esac
|
||||
@@ -0,0 +1,57 @@
|
||||
#!/usr/bin/env bash
# Shared helpers for the Gitea-backed plan-execution queue.
set -euo pipefail

GITEA_URL="https://gitea.dohertylan.com"
GITEA_REPO="dohertj2/lmxopcua"
GITEA_API="$GITEA_URL/api/v1"

if [ -z "${GITEA_TOKEN:-}" ]; then
  TEA_CONFIG="${LOCALAPPDATA:-$HOME/AppData/Local}/tea/config.yml"
  if [ ! -f "$TEA_CONFIG" ]; then
    TEA_CONFIG="$HOME/.config/tea/config.yml"
  fi
  GITEA_TOKEN="$(awk '/token:/{gsub(/[ \t]/,"",$2); print $2; exit}' "$TEA_CONFIG" 2>/dev/null || true)"
fi
if [ -z "${GITEA_TOKEN:-}" ]; then
  echo "lib.sh: GITEA_TOKEN not set and tea config not readable" >&2
  exit 1
fi
export GITEA_TOKEN

INTEGRATION_BRANCH="auto/driver-gaps"
QUEUE_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && { pwd -W 2>/dev/null || pwd; })"
REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && { pwd -W 2>/dev/null || pwd; })"
MANIFEST="$QUEUE_ROOT/pr-manifest.yaml"
LABEL_MAP="$QUEUE_ROOT/.label-ids.json"

LABEL_QUEUED="queue/queued"
LABEL_IN_PROGRESS="queue/in-progress"
LABEL_BLOCKED="queue/blocked"
LABEL_FAILED="queue/failed"
LABEL_DONE="queue/done"
LABEL_AUTO="auto-managed"
LABEL_CROSS="cross-driver"

api() {
  local method="$1" path="$2" data="${3:-}"
  if [ -n "$data" ]; then
    curl -sf -X "$method" \
      -H "Authorization: token $GITEA_TOKEN" \
      -H "Content-Type: application/json" \
      -d "$data" \
      "$GITEA_API/$path"
  else
    curl -sf -X "$method" \
      -H "Authorization: token $GITEA_TOKEN" \
      "$GITEA_API/$path"
  fi
}

api_repo() {
  api "$1" "repos/$GITEA_REPO/$2" "${3:-}"
}

label_id() {
  python -c "import json,sys; m=json.load(open('$LABEL_MAP')); print(m['$1'])"
}
|
||||
@@ -0,0 +1,57 @@
|
||||
# Loop iteration prompt (Mode B autonomous)
|
||||
|
||||
This is the single self-contained prompt that `/loop` re-fires until the queue empties. Each iteration handles exactly one PR end-to-end.
|
||||
|
||||
---
|
||||
|
||||
You are running one iteration of the autonomous plan-execution loop. The queue lives in Gitea at `dohertj2/lmxopcua`. Helpers: `scripts/queue/*.sh`.
|
||||
|
||||
## Step 1 — pick the next PR
|
||||
Run `bash scripts/queue/next-pr.sh`. It returns JSON.
|
||||
- If `{"empty": true}` → the queue is drained. **Do not call ScheduleWakeup.** Report "queue empty — loop terminating" and exit. The /loop will end.
|
||||
- Otherwise parse: `issue_num`, `canonical_id`, `driver`, `phase`, `plan_pr_id`, `branch`, `title`, `url`.
|
||||
|
||||
## Step 2 — claim it
|
||||
Run `bash scripts/queue/start-pr.sh "$ISSUE_NUM" "$BRANCH"`. This swaps `queue/queued` → `queue/in-progress` and creates the branch off `auto/driver-gaps`.
|
||||
|
||||
## Step 3 — pull the issue body
|
||||
Run `curl -sf -H "Authorization: token $(awk '/token:/{print $2}' "$LOCALAPPDATA/tea/config.yml")" "https://gitea.dohertylan.com/api/v1/repos/dohertj2/lmxopcua/issues/$ISSUE_NUM"` and extract the `body` field. The body contains the Plan link, summary, source files, docs/fixture/e2e files.
|
||||
|
||||
## Step 4 — implement on a worktree
|
||||
Dispatch a general-purpose Agent with `isolation: "worktree"`. Brief it with:
|
||||
- the issue body verbatim
|
||||
- the linked plan section (read `docs/plans/<driver>-plan.md` and quote the relevant per-PR detail)
|
||||
- explicit instructions: implement the source-file changes, the doc updates, the fixture extensions, and the e2e test additions named in the issue
|
||||
- run `dotnet build c:/Users/dohertj2/Desktop/lmxopcua/ZB.MOM.WW.OtOpcUa.slnx` until green
|
||||
- run `dotnet test` for the relevant test project until green
|
||||
- commit on `$BRANCH` with message `Auto: <canonical_id> — <short summary>` followed by `Closes #$ISSUE_NUM`
|
||||
- return a brief summary of what changed
|
||||
|
||||
## Step 5 — verify and push
|
||||
Verify the agent did commit + push. If branch isn't pushed, push it: `git push origin "$BRANCH"`.
|
||||
|
||||
## Step 6 — open PR
|
||||
Build a body file: include the issue summary + the agent's summary. Then:
|
||||
```
|
||||
PR_NUM=$(bash scripts/queue/open-pr.sh "$ISSUE_NUM" "$BRANCH" "$TITLE" /tmp/pr-body.md)
|
||||
```
|
||||
|
||||
## Step 7 — auto-merge (Mode B)
|
||||
Run `bash scripts/queue/merge-pr.sh "$PR_NUM"`.
|
||||
|
||||
## Step 8 — close issue
|
||||
Run `bash scripts/queue/finish-pr.sh "$ISSUE_NUM" success "$PR_NUM"`.
|
||||
|
||||
## On failure
|
||||
If anywhere from Step 4 onward fails (build red, tests red, agent gives up, push fails, merge conflict):
|
||||
- write the failure log to `/tmp/loop-fail-$ISSUE_NUM.log`
|
||||
- run `bash scripts/queue/finish-pr.sh "$ISSUE_NUM" failed /tmp/loop-fail-$ISSUE_NUM.log`
|
||||
- the issue keeps `queue/failed` and stays open for retry
|
||||
- **do not** retry the same issue this iteration; let the loop pick a different one next fire
|
||||
|
||||
## Re-arm
|
||||
At the very end of the iteration (success OR failure), call `ScheduleWakeup` with the same `/loop` prompt and `delaySeconds: 60` to fire the next iteration.
|
||||
|
||||
If the queue was empty in Step 1, do NOT call ScheduleWakeup.
|
||||
|
||||
Report a one-line summary to the user before re-arming.
|
||||
@@ -0,0 +1,11 @@
|
||||
#!/usr/bin/env bash
|
||||
# Merges a PR (Mode B autonomous merge into auto/driver-gaps).
|
||||
# Usage: merge-pr.sh PR_NUM
|
||||
set -euo pipefail
|
||||
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
. "$HERE/lib.sh"
|
||||
|
||||
PR="${1:?PR_NUM required}"
|
||||
PAYLOAD='{"Do":"merge","delete_branch_after_merge":true}'
|
||||
api_repo POST "pulls/$PR/merge" "$PAYLOAD" >/dev/null
|
||||
echo " PR #$PR merged into $INTEGRATION_BRANCH (branch deleted)"
|
||||
@@ -0,0 +1,77 @@
|
||||
#!/usr/bin/env bash
|
||||
# Prints the next eligible queue issue as JSON: {issue_num, canonical_id, driver, plan_pr_id, branch, ...}
|
||||
# Eligible = open + label queue/queued + all canonical deps closed.
|
||||
# Picks lowest phase first, then lowest issue number within phase.
|
||||
set -euo pipefail
|
||||
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
. "$HERE/lib.sh"
|
||||
|
||||
python - <<'PY'
import json, urllib.request, re, os, sys

token = os.environ["GITEA_TOKEN"]
api_base = "https://gitea.dohertylan.com/api/v1/repos/dohertj2/lmxopcua"

def api(path):
    req = urllib.request.Request(f"{api_base}/{path}",
        headers={"Authorization": f"token {token}"})
    with urllib.request.urlopen(req) as r:
        return json.loads(r.read().decode())

# Gather all queue issues
issues = []
page = 1
while True:
    items = api(f"issues?state=all&type=issues&limit=50&page={page}&labels=auto-managed")
    if not items: break
    issues.extend(items)
    page += 1

by_id = {}
for it in issues:
    m = re.search(r'<!-- queue-meta\s*(\{.*?\})\s*-->', it.get("body","") or "", re.S)
    if not m: continue
    try: meta = json.loads(m.group(1))
    except Exception: continue
    # Skip malformed queue-meta blocks that carry no canonical id.
    if not isinstance(meta, dict) or "id" not in meta: continue
    by_id[meta["id"]] = (it, meta)

def is_done(issue):
    if issue["state"] == "closed": return True
    labels = {l["name"] for l in issue["labels"]}
    return "queue/done" in labels

eligible = []
for cid, (it, meta) in by_id.items():
    labels = {l["name"] for l in it["labels"]}
    if it["state"] != "open": continue
    if "queue/queued" not in labels: continue
    deps = meta.get("deps", [])
    blocked = False
    for d in deps:
        # An unknown dependency id blocks the issue, same as an unfinished one.
        if d not in by_id:
            blocked = True; break
        if not is_done(by_id[d][0]):
            blocked = True; break
    if blocked: continue
    eligible.append((meta.get("phase",99), it["number"], cid, it, meta))

if not eligible:
    print(json.dumps({"empty": True}))
    sys.exit(0)

# Lowest phase first, then lowest issue number within the phase.
eligible.sort(key=lambda x: (x[0], x[1]))
phase, num, cid, it, meta = eligible[0]
plan_pr = meta.get("plan_pr_id","").replace("/","-")
result = {
    "empty": False,
    "issue_num": num,
    "canonical_id": cid,
    "driver": meta["driver"],
    "phase": meta["phase"],
    "plan_pr_id": meta.get("plan_pr_id",""),
    "title": it["title"],
    "branch": f"auto/{meta['driver']}/{plan_pr}",
    "url": it["html_url"],
}
print(json.dumps(result, indent=2))
PY
|
||||
@@ -0,0 +1,24 @@
|
||||
#!/usr/bin/env bash
|
||||
# Opens a PR from BRANCH into auto/driver-gaps and references the issue via "Closes #ISSUE".
|
||||
# Usage: open-pr.sh ISSUE_NUM BRANCH_NAME TITLE BODY_FILE
|
||||
# Echoes the PR number on stdout.
|
||||
set -euo pipefail
|
||||
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
. "$HERE/lib.sh"
|
||||
|
||||
ISSUE="${1:?}"; BRANCH="${2:?}"; TITLE="${3:?}"; BODY_FILE="${4:?}"
|
||||
|
||||
BODY=$(cat "$BODY_FILE")
|
||||
PAYLOAD=$(python -c "
|
||||
import json, sys
|
||||
print(json.dumps({
|
||||
'title': sys.argv[1],
|
||||
'body': sys.argv[2] + '\n\nCloses #' + sys.argv[3],
|
||||
'head': sys.argv[4],
|
||||
'base': sys.argv[5],
|
||||
}))
|
||||
" "$TITLE" "$BODY" "$ISSUE" "$BRANCH" "$INTEGRATION_BRANCH")
|
||||
|
||||
PR=$(api_repo POST pulls "$PAYLOAD")
|
||||
PR_NUM=$(echo "$PR" | python -c "import sys,json; print(json.load(sys.stdin)['number'])")
|
||||
echo "$PR_NUM"
|
||||
File diff suppressed because it is too large
@@ -0,0 +1,58 @@
#!/usr/bin/env bash
# Idempotent: creates queue labels in Gitea and stores name→id map at .label-ids.json
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
. "$HERE/lib.sh"

declare -A LABELS=(
  ["driver/abcip"]="0e8a16"
  ["driver/ablegacy"]="0e8a16"
  ["driver/focas"]="0e8a16"
  ["driver/opcuaclient"]="0e8a16"
  ["driver/s7"]="0e8a16"
  ["driver/twincat"]="0e8a16"
  ["phase/1"]="bfd4f2"
  ["phase/2"]="bfd4f2"
  ["phase/3"]="bfd4f2"
  ["phase/4"]="bfd4f2"
  ["phase/5"]="bfd4f2"
  ["phase/6"]="bfd4f2"
  ["queue/queued"]="d4c5f9"
  ["queue/in-progress"]="fbca04"
  ["queue/blocked"]="b60205"
  ["queue/failed"]="b60205"
  ["queue/done"]="2ea44f"
  ["auto-managed"]="cccccc"
  ["cross-driver"]="d93f0b"
)

# Pull existing labels
EXISTING=$(api_repo GET "labels?limit=200")

emit_map() {
  python - <<PY
import json, sys
existing = json.loads('''$EXISTING''')
print(json.dumps({l['name']: l['id'] for l in existing}, indent=2))
PY
}

# Create any missing
for name in "${!LABELS[@]}"; do
  color="${LABELS[$name]}"
  exists=$(echo "$EXISTING" | python -c "import json,sys; ls=json.load(sys.stdin); print('yes' if any(l['name']=='$name' for l in ls) else 'no')")
  if [ "$exists" = "no" ]; then
    payload=$(python -c "import json; print(json.dumps({'name':'$name','color':'#$color','description':'queue management'}))")
    api_repo POST labels "$payload" >/dev/null
    echo "created label: $name"
  fi
done

# Refresh and write the map file
api_repo GET "labels?limit=200" | python -c "
import json, sys
ls = json.load(sys.stdin)
m = {l['name']: l['id'] for l in ls}
open('$LABEL_MAP','w').write(json.dumps(m, indent=2))
print(f'wrote {len(m)} labels to $LABEL_MAP')
"
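The idempotency of the label script rests on one check: only labels whose names are absent from the server response get created. A minimal sketch of that check as a pure function follows, assuming the label list is a list of dicts with `name` and `id` keys as the script reads them; `missing_labels` and the sample values are hypothetical.

```python
def missing_labels(wanted, existing):
    """Return creation payloads for labels in `wanted` that are not in `existing`.

    wanted:   dict of label name -> hex colour without the leading '#'
    existing: list of dicts with at least a 'name' key
    """
    have = {label["name"] for label in existing}
    return [
        {"name": name, "color": f"#{color}", "description": "queue management"}
        for name, color in sorted(wanted.items())
        if name not in have
    ]

wanted = {"queue/queued": "d4c5f9", "queue/done": "2ea44f"}
existing = [{"name": "queue/done", "id": 7}]

to_create = missing_labels(wanted, existing)
print(to_create)

# Running again after the label exists yields nothing to create.
existing.append({"name": "queue/queued", "id": 8})
print(missing_labels(wanted, existing))  # prints []
```

Building each payload as a dict and serialising it once also avoids interpolating label names into Python source text, which matters if a name ever contains a quote character.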
@@ -0,0 +1,31 @@
#!/usr/bin/env bash
# Marks an issue in-progress and creates its branch off the integration branch.
# Usage: start-pr.sh ISSUE_NUM BRANCH_NAME
set -euo pipefail
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
. "$HERE/lib.sh"

ISSUE="${1:?ISSUE_NUM required}"
BRANCH="${2:?BRANCH_NAME required}"

# Swap labels: queued -> in-progress
QUEUED=$(python -c "import json; print(json.load(open('$LABEL_MAP'))['queue/queued'])")
INPROG=$(python -c "import json; print(json.load(open('$LABEL_MAP'))['queue/in-progress'])")

api_repo DELETE "issues/$ISSUE/labels/$QUEUED" >/dev/null || true
api_repo POST "issues/$ISSUE/labels" "{\"labels\":[$INPROG]}" >/dev/null

# Create branch off integration
EXISTS=$(api_repo GET "branches/$BRANCH" 2>/dev/null || echo "")
if [ -z "$EXISTS" ]; then
  PAYLOAD=$(python -c "import json; print(json.dumps({'new_branch_name':'$BRANCH','old_branch_name':'$INTEGRATION_BRANCH'}))")
  api_repo POST branches "$PAYLOAD" >/dev/null
  echo " branch created: $BRANCH"
else
  echo " branch exists: $BRANCH"
fi

# Comment
COMMENT=$(python -c "import json; print(json.dumps({'body':'🤖 Auto-loop picked this up. Branch: \`$BRANCH\`. Status: in-progress.'}))")
api_repo POST "issues/$ISSUE/comments" "$COMMENT" >/dev/null
echo " issue #$ISSUE marked in-progress"
@@ -1,6 +1,6 @@
 {
   "ConnectionStrings": {
-    "ConfigDb": "Server=localhost,14330;Database=OtOpcUaConfig;User Id=sa;Password=OtOpcUaDev_2026!;TrustServerCertificate=True;Encrypt=False;"
+    "ConfigDb": "Server=10.100.0.35,14330;Database=OtOpcUaConfig;User Id=sa;Password=OtOpcUaDev_2026!;TrustServerCertificate=True;Encrypt=False;"
   },
   "Authentication": {
     "Ldap": {
@@ -1,260 +0,0 @@
|
||||
using System;
|
||||
using System.Collections.Concurrent;
|
||||
using System.Collections.Generic;
|
||||
using System.Linq;
|
||||
using System.Threading.Tasks;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Alarms;
|
||||
|
||||
/// <summary>
|
||||
/// Subscribes to the four Galaxy alarm attributes (<c>.InAlarm</c>, <c>.Priority</c>,
|
||||
/// <c>.DescAttrName</c>, <c>.Acked</c>) per alarm-bearing attribute discovered during
|
||||
/// <c>DiscoverAsync</c>. Maintains one <see cref="AlarmState"/> per alarm, raises
|
||||
/// <see cref="AlarmTransition"/> on lifecycle transitions (Active / Unacknowledged /
|
||||
/// Acknowledged / Inactive). Ack path writes <c>.AckMsg</c>. Pure-logic state machine
|
||||
/// with delegate-based subscribe/write so it's testable against in-memory fakes.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// Transitions emitted (OPC UA Part 9 alarm lifecycle, simplified for the Galaxy model):
|
||||
/// <list type="bullet">
|
||||
/// <item><c>Active</c> — InAlarm false → true. Default to Unacknowledged.</item>
|
||||
/// <item><c>Acknowledged</c> — Acked false → true while InAlarm is still true.</item>
|
||||
/// <item><c>Inactive</c> — InAlarm true → false. If still unacknowledged the alarm
|
||||
/// is marked latched-inactive-unack; next Ack transitions straight to Inactive.</item>
|
||||
/// </list>
|
||||
/// </remarks>
|
||||
public sealed class GalaxyAlarmTracker : IDisposable
|
||||
{
|
||||
public const string InAlarmAttr = ".InAlarm";
|
||||
public const string PriorityAttr = ".Priority";
|
||||
public const string DescAttrNameAttr = ".DescAttrName";
|
||||
public const string AckedAttr = ".Acked";
|
||||
public const string AckMsgAttr = ".AckMsg";
|
||||
|
||||
private readonly Func<string, Action<string, Vtq>, Task> _subscribe;
|
||||
private readonly Func<string, Task> _unsubscribe;
|
||||
private readonly Func<string, object, Task<bool>> _write;
|
||||
private readonly Func<DateTime> _clock;
|
||||
|
||||
// Alarm tag (attribute full ref, e.g. "Tank.Level.HiHi") → state.
|
||||
private readonly ConcurrentDictionary<string, AlarmState> _alarms =
|
||||
new(StringComparer.OrdinalIgnoreCase);
|
||||
|
||||
// Reverse lookup: probed tag (".InAlarm" etc.) → owning alarm tag.
|
||||
private readonly ConcurrentDictionary<string, (string AlarmTag, AlarmField Field)> _probeToAlarm =
|
||||
new(StringComparer.OrdinalIgnoreCase);
|
||||
|
||||
private bool _disposed;
|
||||
|
||||
public event EventHandler<AlarmTransition>? TransitionRaised;
|
||||
|
||||
public GalaxyAlarmTracker(
|
||||
Func<string, Action<string, Vtq>, Task> subscribe,
|
||||
Func<string, Task> unsubscribe,
|
||||
Func<string, object, Task<bool>> write)
|
||||
: this(subscribe, unsubscribe, write, () => DateTime.UtcNow) { }
|
||||
|
||||
internal GalaxyAlarmTracker(
|
||||
Func<string, Action<string, Vtq>, Task> subscribe,
|
||||
Func<string, Task> unsubscribe,
|
||||
Func<string, object, Task<bool>> write,
|
||||
Func<DateTime> clock)
|
||||
{
|
||||
_subscribe = subscribe ?? throw new ArgumentNullException(nameof(subscribe));
|
||||
_unsubscribe = unsubscribe ?? throw new ArgumentNullException(nameof(unsubscribe));
|
||||
_write = write ?? throw new ArgumentNullException(nameof(write));
|
||||
_clock = clock ?? throw new ArgumentNullException(nameof(clock));
|
||||
}
|
||||
|
||||
public int TrackedAlarmCount => _alarms.Count;
|
||||
|
||||
/// <summary>
|
||||
/// Advise the four alarm attributes for <paramref name="alarmTag"/>. Idempotent —
|
||||
/// repeat calls for the same alarm tag are a no-op. Subscribe failure for any of the
|
||||
/// four rolls back the alarm entry so a stale callback cannot promote a phantom.
|
||||
/// </summary>
|
||||
public async Task TrackAsync(string alarmTag)
|
||||
{
|
||||
if (_disposed || string.IsNullOrWhiteSpace(alarmTag)) return;
|
||||
if (_alarms.ContainsKey(alarmTag)) return;
|
||||
|
||||
var state = new AlarmState { AlarmTag = alarmTag };
|
||||
if (!_alarms.TryAdd(alarmTag, state)) return;
|
||||
|
||||
var probes = new[]
|
||||
{
|
||||
(Tag: alarmTag + InAlarmAttr, Field: AlarmField.InAlarm),
|
||||
(Tag: alarmTag + PriorityAttr, Field: AlarmField.Priority),
|
||||
(Tag: alarmTag + DescAttrNameAttr, Field: AlarmField.DescAttrName),
|
||||
(Tag: alarmTag + AckedAttr, Field: AlarmField.Acked),
|
||||
};
|
||||
|
||||
foreach (var p in probes)
|
||||
{
|
||||
_probeToAlarm[p.Tag] = (alarmTag, p.Field);
|
||||
}
|
||||
|
||||
try
|
||||
{
|
||||
foreach (var p in probes)
|
||||
{
|
||||
await _subscribe(p.Tag, OnProbeCallback).ConfigureAwait(false);
|
||||
}
|
||||
}
|
||||
catch
|
||||
{
|
||||
// Rollback so a partial advise doesn't leak state.
|
||||
_alarms.TryRemove(alarmTag, out _);
|
||||
foreach (var p in probes)
|
||||
{
|
||||
_probeToAlarm.TryRemove(p.Tag, out _);
|
||||
try { await _unsubscribe(p.Tag).ConfigureAwait(false); } catch { }
|
||||
}
|
||||
throw;
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Drop every tracked alarm. Unadvises all 4 probes per alarm as best-effort.
|
||||
/// </summary>
|
||||
public async Task ClearAsync()
|
||||
{
|
||||
_alarms.Clear();
|
||||
foreach (var kv in _probeToAlarm.ToList())
|
||||
{
|
||||
_probeToAlarm.TryRemove(kv.Key, out _);
|
||||
try { await _unsubscribe(kv.Key).ConfigureAwait(false); } catch { }
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Operator ack: write the comment text into <c>&lt;alarmTag&gt;.AckMsg</c>.
|
||||
/// Returns false when the runtime reports the write failed.
|
||||
/// </summary>
|
||||
public Task<bool> AcknowledgeAsync(string alarmTag, string comment)
|
||||
{
|
||||
if (_disposed || string.IsNullOrWhiteSpace(alarmTag))
|
||||
return Task.FromResult(false);
|
||||
return _write(alarmTag + AckMsgAttr, comment ?? string.Empty);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Subscription callback entry point. Exposed for tests and for the Backend to route
|
||||
/// fan-out callbacks through. Runs the state machine and fires TransitionRaised
|
||||
/// outside the lock.
|
||||
/// </summary>
|
||||
public void OnProbeCallback(string probeTag, Vtq vtq)
|
||||
{
|
||||
if (_disposed) return;
|
||||
if (!_probeToAlarm.TryGetValue(probeTag, out var link)) return;
|
||||
if (!_alarms.TryGetValue(link.AlarmTag, out var state)) return;
|
||||
|
||||
AlarmTransition? transition = null;
|
||||
var now = _clock();
|
||||
|
||||
lock (state.Lock)
|
||||
{
|
||||
switch (link.Field)
|
||||
{
|
||||
case AlarmField.InAlarm:
|
||||
{
|
||||
var wasActive = state.InAlarm;
|
||||
var isActive = vtq.Value is bool b && b;
|
||||
state.InAlarm = isActive;
|
||||
state.LastUpdateUtc = now;
|
||||
if (!wasActive && isActive)
|
||||
{
|
||||
state.Acked = false;
|
||||
state.LastTransitionUtc = now;
|
||||
transition = new AlarmTransition(state.AlarmTag, AlarmStateTransition.Active, state.Priority, state.DescAttrName, now);
|
||||
}
|
||||
else if (wasActive && !isActive)
|
||||
{
|
||||
state.LastTransitionUtc = now;
|
||||
transition = new AlarmTransition(state.AlarmTag, AlarmStateTransition.Inactive, state.Priority, state.DescAttrName, now);
|
||||
}
|
||||
break;
|
||||
}
|
||||
case AlarmField.Priority:
|
||||
if (vtq.Value is int pi) state.Priority = pi;
|
||||
else if (vtq.Value is short ps) state.Priority = ps;
|
||||
else if (vtq.Value is long pl && pl >= int.MinValue && pl <= int.MaxValue) state.Priority = (int)pl;
|
||||
state.LastUpdateUtc = now;
|
||||
break;
|
||||
case AlarmField.DescAttrName:
|
||||
state.DescAttrName = vtq.Value as string;
|
||||
state.LastUpdateUtc = now;
|
||||
break;
|
||||
case AlarmField.Acked:
|
||||
{
|
||||
var wasAcked = state.Acked;
|
||||
var isAcked = vtq.Value is bool b && b;
|
||||
state.Acked = isAcked;
|
||||
state.LastUpdateUtc = now;
|
||||
// Fire Acknowledged only on a false→true transition while the alarm is active. The
// initial subscribe callback does not fire: the state starts with Acked=true, and the
// first probe value is usually true for an inactive alarm, so wasAcked == isAcked.
|
||||
if (!wasAcked && isAcked && state.InAlarm)
|
||||
{
|
||||
state.LastTransitionUtc = now;
|
||||
transition = new AlarmTransition(state.AlarmTag, AlarmStateTransition.Acknowledged, state.Priority, state.DescAttrName, now);
|
||||
}
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (transition is { } t)
|
||||
{
|
||||
TransitionRaised?.Invoke(this, t);
|
||||
}
|
||||
}
|
||||
|
||||
public IReadOnlyList<AlarmSnapshot> SnapshotStates()
|
||||
{
|
||||
return _alarms.Values.Select(s =>
|
||||
{
|
||||
lock (s.Lock)
|
||||
return new AlarmSnapshot(s.AlarmTag, s.InAlarm, s.Acked, s.Priority, s.DescAttrName);
|
||||
}).ToList();
|
||||
}
|
||||
|
||||
public void Dispose()
|
||||
{
|
||||
if (_disposed) return;
|
||||
_disposed = true;
|
||||
_alarms.Clear();
|
||||
_probeToAlarm.Clear();
|
||||
}
|
||||
|
||||
private sealed class AlarmState
|
||||
{
|
||||
public readonly object Lock = new();
|
||||
public string AlarmTag = "";
|
||||
public bool InAlarm;
|
||||
public bool Acked = true; // default ack'd so first false→true on subscribe doesn't misfire
|
||||
public int Priority;
|
||||
public string? DescAttrName;
|
||||
public DateTime LastUpdateUtc;
|
||||
public DateTime LastTransitionUtc;
|
||||
}
|
||||
|
||||
private enum AlarmField { InAlarm, Priority, DescAttrName, Acked }
|
||||
}
|
||||
|
||||
public enum AlarmStateTransition { Active, Acknowledged, Inactive }
|
||||
|
||||
public sealed record AlarmTransition(
|
||||
string AlarmTag,
|
||||
AlarmStateTransition Transition,
|
||||
int Priority,
|
||||
string? DescAttrName,
|
||||
DateTime AtUtc);
|
||||
|
||||
public sealed record AlarmSnapshot(
|
||||
string AlarmTag,
|
||||
bool InAlarm,
|
||||
bool Acked,
|
||||
int Priority,
|
||||
string? DescAttrName);
|
||||
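The transition rules implemented by `OnProbeCallback` above can be modelled compactly. What follows is a minimal Python sketch of the same state machine, not the C# implementation: `AlarmState` here is a simplified stand-in with only the two boolean fields, `on_probe` is a hypothetical name, and locking, timestamps, priority, and description handling are omitted.

```python
from dataclasses import dataclass

@dataclass
class AlarmState:
    in_alarm: bool = False
    acked: bool = True  # mirrors the C# default so the first Acked probe does not misfire

def on_probe(state, field, value):
    """Apply one probe update and return the transition name, or None."""
    if field == "InAlarm":
        was, now = state.in_alarm, value is True
        state.in_alarm = now
        if not was and now:
            state.acked = False  # a newly active alarm starts unacknowledged
            return "Active"
        if was and not now:
            return "Inactive"
    elif field == "Acked":
        was, now = state.acked, value is True
        state.acked = now
        # Only a false -> true change on an active alarm counts.
        if not was and now and state.in_alarm:
            return "Acknowledged"
    return None

state = AlarmState()
events = [
    ("Acked", True),     # initial subscribe callback: no transition
    ("InAlarm", True),   # alarm raises
    ("Acked", False),    # runtime reports unacknowledged
    ("Acked", True),     # operator acknowledges
    ("InAlarm", False),  # alarm clears
]
transitions = [on_probe(state, field, value) for field, value in events]
print(transitions)  # prints [None, 'Active', None, 'Acknowledged', 'Inactive']
```

Non-boolean values are treated as false, matching the `vtq.Value is bool b && b` pattern in the C# code.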
@@ -1,188 +0,0 @@
|
||||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.Linq;
|
||||
using System.Threading;
|
||||
using System.Threading.Tasks;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
|
||||
|
||||
/// <summary>
|
||||
/// Galaxy backend that uses the live <c>ZB</c> repository for <see cref="DiscoverAsync"/> —
|
||||
/// real gobject hierarchy + attributes flow through to the Proxy without needing the MXAccess
|
||||
/// COM client. Runtime data-plane calls (Read/Write/Subscribe/Alarm/History) still surface
|
||||
/// as "MXAccess code lift pending" until the COM client port lands. This is the highest-value
|
||||
/// intermediate state because Discover is what powers the OPC UA address-space build, so
|
||||
/// downstream Proxy + parity tests can exercise the complete tree shape today.
|
||||
/// </summary>
|
||||
public sealed class DbBackedGalaxyBackend(GalaxyRepository repository) : IGalaxyBackend
|
||||
{
|
||||
private long _nextSessionId;
|
||||
private long _nextSubscriptionId;
|
||||
|
||||
// DB-only backend doesn't have a runtime data plane; never raises events.
|
||||
#pragma warning disable CS0067
|
||||
public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
|
||||
public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
|
||||
public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
|
||||
#pragma warning restore CS0067
|
||||
|
||||
public Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
|
||||
{
|
||||
var id = Interlocked.Increment(ref _nextSessionId);
|
||||
return Task.FromResult(new OpenSessionResponse { Success = true, SessionId = id });
|
||||
}
|
||||
|
||||
public Task CloseSessionAsync(CloseSessionRequest req, CancellationToken ct) => Task.CompletedTask;
|
||||
|
||||
public async Task<DiscoverHierarchyResponse> DiscoverAsync(DiscoverHierarchyRequest req, CancellationToken ct)
|
||||
{
|
||||
try
|
||||
{
|
||||
var hierarchy = await repository.GetHierarchyAsync(ct).ConfigureAwait(false);
|
||||
var attributes = await repository.GetAttributesAsync(ct).ConfigureAwait(false);
|
||||
|
||||
// Group attributes by their owning gobject for the IPC payload.
|
||||
var attrsByGobject = attributes
|
||||
.GroupBy(a => a.GobjectId)
|
||||
.ToDictionary(g => g.Key, g => g.Select(MapAttribute).ToArray());
|
||||
|
||||
var parentByChild = hierarchy
|
||||
.ToDictionary(o => o.GobjectId, o => o.ParentGobjectId);
|
||||
var nameByGobject = hierarchy
|
||||
.ToDictionary(o => o.GobjectId, o => o.TagName);
|
||||
|
||||
var objects = hierarchy.Select(o => new GalaxyObjectInfo
|
||||
{
|
||||
ContainedName = string.IsNullOrEmpty(o.ContainedName) ? o.TagName : o.ContainedName,
|
||||
TagName = o.TagName,
|
||||
ParentContainedName = parentByChild.TryGetValue(o.GobjectId, out var p)
|
||||
&& p != 0
|
||||
&& nameByGobject.TryGetValue(p, out var pName)
|
||||
? pName
|
||||
: null,
|
||||
TemplateCategory = MapCategory(o.CategoryId),
|
||||
Attributes = attrsByGobject.TryGetValue(o.GobjectId, out var a) ? a : System.Array.Empty<GalaxyAttributeInfo>(),
|
||||
}).ToArray();
|
||||
|
||||
return new DiscoverHierarchyResponse { Success = true, Objects = objects };
|
||||
}
|
||||
catch (Exception ex) when (ex is System.Data.SqlClient.SqlException
|
||||
or InvalidOperationException
|
||||
or TimeoutException)
|
||||
{
|
||||
return new DiscoverHierarchyResponse
|
||||
{
|
||||
Success = false,
|
||||
Error = $"Galaxy ZB repository error: {ex.Message}",
|
||||
Objects = System.Array.Empty<GalaxyObjectInfo>(),
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
public Task<ReadValuesResponse> ReadValuesAsync(ReadValuesRequest req, CancellationToken ct)
|
||||
=> Task.FromResult(new ReadValuesResponse
|
||||
{
|
||||
Success = false,
|
||||
Error = "MXAccess code lift pending (Phase 2 Task B.1) — DB-backed backend covers Discover only",
|
||||
Values = System.Array.Empty<GalaxyDataValue>(),
|
||||
});
|
||||
|
||||
public Task<WriteValuesResponse> WriteValuesAsync(WriteValuesRequest req, CancellationToken ct)
|
||||
{
|
||||
var results = new WriteValueResult[req.Writes.Length];
|
||||
for (var i = 0; i < req.Writes.Length; i++)
|
||||
{
|
||||
results[i] = new WriteValueResult
|
||||
{
|
||||
TagReference = req.Writes[i].TagReference,
|
||||
StatusCode = 0x80020000u,
|
||||
Error = "MXAccess code lift pending (Phase 2 Task B.1)",
|
||||
};
|
||||
}
|
||||
return Task.FromResult(new WriteValuesResponse { Results = results });
|
||||
}
|
||||
|
||||
public Task<SubscribeResponse> SubscribeAsync(SubscribeRequest req, CancellationToken ct)
|
||||
{
|
||||
var sid = Interlocked.Increment(ref _nextSubscriptionId);
|
||||
return Task.FromResult(new SubscribeResponse
|
||||
{
|
||||
Success = true,
|
||||
SubscriptionId = sid,
|
||||
ActualIntervalMs = req.RequestedIntervalMs,
|
||||
});
|
||||
}
|
||||
|
||||
public Task UnsubscribeAsync(UnsubscribeRequest req, CancellationToken ct) => Task.CompletedTask;
|
||||
public Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct) => Task.CompletedTask;
|
||||
public Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct) => Task.CompletedTask;
|
||||
|
||||
public Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct)
|
||||
=> Task.FromResult(new HistoryReadResponse
|
||||
{
|
||||
Success = false,
|
||||
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
|
||||
Tags = System.Array.Empty<HistoryTagValues>(),
|
||||
});
|
||||
|
||||
public Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
|
||||
HistoryReadProcessedRequest req, CancellationToken ct)
|
||||
=> Task.FromResult(new HistoryReadProcessedResponse
|
||||
{
|
||||
Success = false,
|
||||
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
|
||||
Values = System.Array.Empty<GalaxyDataValue>(),
|
||||
});
|
||||
|
||||
public Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
|
||||
HistoryReadAtTimeRequest req, CancellationToken ct)
|
||||
=> Task.FromResult(new HistoryReadAtTimeResponse
|
||||
{
|
||||
Success = false,
|
||||
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
|
||||
Values = System.Array.Empty<GalaxyDataValue>(),
|
||||
});
|
||||
|
||||
public Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
|
||||
HistoryReadEventsRequest req, CancellationToken ct)
|
||||
=> Task.FromResult(new HistoryReadEventsResponse
|
||||
{
|
||||
Success = false,
|
||||
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
|
||||
Events = System.Array.Empty<GalaxyHistoricalEvent>(),
|
||||
});
|
||||
|
||||
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
|
||||
=> Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 });
|
||||
|
||||
private static GalaxyAttributeInfo MapAttribute(GalaxyAttributeRow row) => new()
|
||||
{
|
||||
AttributeName = row.AttributeName,
|
||||
MxDataType = row.MxDataType,
|
||||
IsArray = row.IsArray,
|
||||
ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null,
|
||||
SecurityClassification = row.SecurityClassification,
|
||||
IsHistorized = row.IsHistorized,
|
||||
IsAlarm = row.IsAlarm,
|
||||
};
|
||||
|
||||
/// <summary>
|
||||
/// Galaxy <c>template_definition.category_id</c> → human-readable name.
|
||||
/// Mirrors v1 Host's <c>AlarmObjectFilter</c> mapping.
|
||||
/// </summary>
|
||||
private static string MapCategory(int categoryId) => categoryId switch
|
||||
{
|
||||
1 => "$WinPlatform",
|
||||
3 => "$AppEngine",
|
||||
4 => "$Area",
|
||||
10 => "$UserDefined",
|
||||
11 => "$ApplicationObject",
|
||||
13 => "$Area",
|
||||
17 => "$DeviceIntegration",
|
||||
24 => "$ViewEngine",
|
||||
26 => "$ViewApp",
|
||||
_ => $"category-{categoryId}",
|
||||
};
|
||||
}
|
||||
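The shaping step inside `DiscoverAsync` above groups attributes by owning object and resolves each parent id to a name. A minimal Python sketch of that step follows, assuming rows are plain dicts keyed by the column names used in the row classes; `build_objects` and the sample rows are hypothetical. As in the C# code, the parent is resolved to the parent's tag name, and a parent id of 0 means no parent.

```python
from collections import defaultdict

def build_objects(hierarchy, attributes):
    """Group attributes by gobject id and resolve each object's parent name."""
    attrs_by_gobject = defaultdict(list)
    for a in attributes:
        attrs_by_gobject[a["gobject_id"]].append(a["attribute_name"])

    name_by_gobject = {o["gobject_id"]: o["tag_name"] for o in hierarchy}

    objects = []
    for o in hierarchy:
        parent_id = o["parent_gobject_id"]
        objects.append({
            # Fall back to the tag name when the contained name is empty.
            "contained_name": o["contained_name"] or o["tag_name"],
            "tag_name": o["tag_name"],
            "parent": name_by_gobject.get(parent_id) if parent_id != 0 else None,
            "attributes": attrs_by_gobject.get(o["gobject_id"], []),
        })
    return objects

hierarchy = [
    {"gobject_id": 1, "tag_name": "Area_001", "contained_name": "", "parent_gobject_id": 0},
    {"gobject_id": 2, "tag_name": "Tank_001", "contained_name": "Tank", "parent_gobject_id": 1},
]
attributes = [
    {"gobject_id": 2, "attribute_name": "Level"},
    {"gobject_id": 2, "attribute_name": "Temp"},
]
objects = build_objects(hierarchy, attributes)
for obj in objects:
    print(obj)
```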
@@ -1,35 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;

/// <summary>
/// One row from the v1 <c>HierarchySql</c>. Galaxy <c>gobject</c> deployed instance with its
/// hierarchy parent + template-chain context.
/// </summary>
public sealed class GalaxyHierarchyRow
{
    public int GobjectId { get; init; }
    public string TagName { get; init; } = string.Empty;
    public string ContainedName { get; init; } = string.Empty;
    public string BrowseName { get; init; } = string.Empty;
    public int ParentGobjectId { get; init; }
    public bool IsArea { get; init; }
    public int CategoryId { get; init; }
    public int HostedByGobjectId { get; init; }
    public System.Collections.Generic.IReadOnlyList<string> TemplateChain { get; init; } = System.Array.Empty<string>();
}

/// <summary>One row from the v1 <c>AttributesSql</c>.</summary>
public sealed class GalaxyAttributeRow
{
    public int GobjectId { get; init; }
    public string TagName { get; init; } = string.Empty;
    public string AttributeName { get; init; } = string.Empty;
    public string FullTagReference { get; init; } = string.Empty;
    public int MxDataType { get; init; }
    public string? DataTypeName { get; init; }
    public bool IsArray { get; init; }
    public int? ArrayDimension { get; init; }
    public int MxAttributeCategory { get; init; }
    public int SecurityClassification { get; init; }
    public bool IsHistorized { get; init; }
    public bool IsAlarm { get; init; }
}
@@ -1,224 +0,0 @@
|
||||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.Data.SqlClient;
|
||||
using System.Linq;
|
||||
using System.Threading;
|
||||
using System.Threading.Tasks;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
|
||||
|
||||
/// <summary>
|
||||
/// SQL access to the Galaxy <c>ZB</c> repository — port of v1 <c>GalaxyRepositoryService</c>.
|
||||
/// The two SQL bodies (Hierarchy + Attributes) are byte-for-byte identical to v1 so the
|
||||
/// queries surface the same row set at parity time. Extended-attributes and scope-filter
|
||||
/// queries from v1 are intentionally not ported yet — they're refinements that aren't on
|
||||
/// the Phase 2 critical path.
|
||||
/// </summary>
|
||||
public sealed class GalaxyRepository(GalaxyRepositoryOptions options)
|
||||
{
|
||||
public async Task<bool> TestConnectionAsync(CancellationToken ct = default)
|
||||
{
|
||||
try
|
||||
{
|
||||
using var conn = new SqlConnection(options.ConnectionString);
|
||||
await conn.OpenAsync(ct).ConfigureAwait(false);
|
||||
using var cmd = new SqlCommand("SELECT 1", conn) { CommandTimeout = options.CommandTimeoutSeconds };
|
||||
var result = await cmd.ExecuteScalarAsync(ct).ConfigureAwait(false);
|
||||
return result is int i && i == 1;
|
||||
}
|
||||
catch (SqlException) { return false; }
|
||||
catch (InvalidOperationException) { return false; }
|
||||
}
|
||||
|
||||
public async Task<DateTime?> GetLastDeployTimeAsync(CancellationToken ct = default)
|
||||
{
|
||||
using var conn = new SqlConnection(options.ConnectionString);
|
||||
await conn.OpenAsync(ct).ConfigureAwait(false);
|
||||
using var cmd = new SqlCommand("SELECT time_of_last_deploy FROM galaxy", conn)
|
||||
{ CommandTimeout = options.CommandTimeoutSeconds };
|
||||
var result = await cmd.ExecuteScalarAsync(ct).ConfigureAwait(false);
|
||||
return result is DateTime dt ? dt : null;
|
||||
}
|
||||
|
||||
public async Task<List<GalaxyHierarchyRow>> GetHierarchyAsync(CancellationToken ct = default)
|
||||
{
|
||||
var rows = new List<GalaxyHierarchyRow>();
|
||||
|
||||
using var conn = new SqlConnection(options.ConnectionString);
|
||||
await conn.OpenAsync(ct).ConfigureAwait(false);
|
||||
|
||||
using var cmd = new SqlCommand(HierarchySql, conn) { CommandTimeout = options.CommandTimeoutSeconds };
|
||||
using var reader = await cmd.ExecuteReaderAsync(ct).ConfigureAwait(false);
|
||||
|
||||
while (await reader.ReadAsync(ct).ConfigureAwait(false))
|
||||
{
|
||||
var templateChainRaw = reader.IsDBNull(8) ? string.Empty : reader.GetString(8);
|
||||
var templateChain = templateChainRaw.Length == 0
|
||||
? Array.Empty<string>()
|
||||
: templateChainRaw.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries)
|
||||
.Select(s => s.Trim())
|
||||
.Where(s => s.Length > 0)
|
||||
.ToArray();
|
||||
|
||||
rows.Add(new GalaxyHierarchyRow
|
||||
{
|
||||
GobjectId = Convert.ToInt32(reader.GetValue(0)),
|
||||
TagName = reader.GetString(1),
|
||||
ContainedName = reader.IsDBNull(2) ? string.Empty : reader.GetString(2),
|
||||
BrowseName = reader.GetString(3),
|
||||
ParentGobjectId = Convert.ToInt32(reader.GetValue(4)),
|
||||
IsArea = Convert.ToInt32(reader.GetValue(5)) == 1,
|
||||
CategoryId = Convert.ToInt32(reader.GetValue(6)),
|
||||
HostedByGobjectId = Convert.ToInt32(reader.GetValue(7)),
|
||||
TemplateChain = templateChain,
|
||||
});
|
||||
}
|
||||
return rows;
|
||||
}
|
||||
|
||||
public async Task<List<GalaxyAttributeRow>> GetAttributesAsync(CancellationToken ct = default)
|
||||
{
|
||||
var rows = new List<GalaxyAttributeRow>();
|
||||
|
||||
using var conn = new SqlConnection(options.ConnectionString);
|
||||
await conn.OpenAsync(ct).ConfigureAwait(false);
|
||||
|
||||
using var cmd = new SqlCommand(AttributesSql, conn) { CommandTimeout = options.CommandTimeoutSeconds };
|
||||
using var reader = await cmd.ExecuteReaderAsync(ct).ConfigureAwait(false);
|
||||
|
||||
while (await reader.ReadAsync(ct).ConfigureAwait(false))
|
||||
{
|
||||
rows.Add(new GalaxyAttributeRow
|
||||
{
|
||||
GobjectId = Convert.ToInt32(reader.GetValue(0)),
|
||||
TagName = reader.GetString(1),
|
||||
AttributeName = reader.GetString(2),
|
||||
FullTagReference = reader.GetString(3),
|
||||
MxDataType = Convert.ToInt32(reader.GetValue(4)),
|
||||
DataTypeName = reader.IsDBNull(5) ? null : reader.GetString(5),
|
||||
IsArray = Convert.ToInt32(reader.GetValue(6)) == 1,
|
||||
ArrayDimension = reader.IsDBNull(7) ? (int?)null : Convert.ToInt32(reader.GetValue(7)),
|
||||
MxAttributeCategory = Convert.ToInt32(reader.GetValue(8)),
|
||||
SecurityClassification = Convert.ToInt32(reader.GetValue(9)),
|
||||
IsHistorized = Convert.ToInt32(reader.GetValue(10)) == 1,
|
||||
IsAlarm = Convert.ToInt32(reader.GetValue(11)) == 1,
|
||||
});
|
||||
}
|
||||
return rows;
|
||||
}
|
||||
|
||||
private const string HierarchySql = @"
|
||||
;WITH template_chain AS (
|
||||
SELECT g.gobject_id AS instance_gobject_id, t.gobject_id AS template_gobject_id,
|
||||
t.tag_name AS template_tag_name, t.derived_from_gobject_id, 0 AS depth
|
||||
FROM gobject g
|
||||
INNER JOIN gobject t ON t.gobject_id = g.derived_from_gobject_id
|
||||
WHERE g.is_template = 0 AND g.deployed_package_id <> 0 AND g.derived_from_gobject_id <> 0
|
||||
UNION ALL
|
||||
SELECT tc.instance_gobject_id, t.gobject_id, t.tag_name, t.derived_from_gobject_id, tc.depth + 1
|
||||
FROM template_chain tc
|
||||
INNER JOIN gobject t ON t.gobject_id = tc.derived_from_gobject_id
|
||||
WHERE tc.derived_from_gobject_id <> 0 AND tc.depth < 10
|
||||
)
|
||||
SELECT DISTINCT
|
||||
g.gobject_id,
|
||||
g.tag_name,
|
||||
g.contained_name,
|
||||
CASE WHEN g.contained_name IS NULL OR g.contained_name = ''
|
||||
THEN g.tag_name
|
||||
ELSE g.contained_name
|
||||
END AS browse_name,
|
||||
CASE WHEN g.contained_by_gobject_id = 0
|
||||
THEN g.area_gobject_id
|
||||
ELSE g.contained_by_gobject_id
|
||||
END AS parent_gobject_id,
|
||||
CASE WHEN td.category_id = 13
|
||||
THEN 1
|
||||
ELSE 0
|
||||
END AS is_area,
|
||||
td.category_id AS category_id,
|
||||
g.hosted_by_gobject_id AS hosted_by_gobject_id,
|
||||
ISNULL(
|
||||
STUFF((
|
||||
SELECT '|' + tc.template_tag_name
|
||||
FROM template_chain tc
|
||||
WHERE tc.instance_gobject_id = g.gobject_id
|
||||
ORDER BY tc.depth
|
||||
FOR XML PATH('')
|
||||
), 1, 1, ''),
|
||||
''
|
||||
) AS template_chain
|
||||
FROM gobject g
|
||||
INNER JOIN template_definition td
|
||||
ON g.template_definition_id = td.template_definition_id
|
||||
WHERE td.category_id IN (1, 3, 4, 10, 11, 13, 17, 24, 26)
|
||||
AND g.is_template = 0
|
||||
AND g.deployed_package_id <> 0
|
||||
ORDER BY parent_gobject_id, g.tag_name";
|
||||
|
||||
private const string AttributesSql = @"
|
||||
;WITH deployed_package_chain AS (
|
||||
SELECT g.gobject_id, p.package_id, p.derived_from_package_id, 0 AS depth
|
||||
FROM gobject g
|
||||
INNER JOIN package p ON p.package_id = g.deployed_package_id
|
||||
WHERE g.is_template = 0 AND g.deployed_package_id <> 0
|
||||
UNION ALL
|
||||
SELECT dpc.gobject_id, p.package_id, p.derived_from_package_id, dpc.depth + 1
|
||||
FROM deployed_package_chain dpc
|
||||
INNER JOIN package p ON p.package_id = dpc.derived_from_package_id
|
||||
WHERE dpc.derived_from_package_id <> 0 AND dpc.depth < 10
|
||||
)
|
||||
SELECT gobject_id, tag_name, attribute_name, full_tag_reference,
|
||||
mx_data_type, data_type_name, is_array, array_dimension,
|
||||
mx_attribute_category, security_classification, is_historized, is_alarm
|
||||
FROM (
|
||||
SELECT
|
||||
dpc.gobject_id,
|
||||
g.tag_name,
|
||||
da.attribute_name,
|
||||
g.tag_name + '.' + da.attribute_name
|
||||
+ CASE WHEN da.is_array = 1 THEN '[]' ELSE '' END
|
||||
AS full_tag_reference,
|
||||
da.mx_data_type,
|
||||
dt.description AS data_type_name,
|
||||
da.is_array,
|
||||
CASE WHEN da.is_array = 1
|
||||
THEN CONVERT(int, CONVERT(varbinary(2),
|
||||
SUBSTRING(da.mx_value, 15, 2) + SUBSTRING(da.mx_value, 13, 2), 2))
|
||||
ELSE NULL
|
||||
END AS array_dimension,
|
||||
da.mx_attribute_category,
|
||||
da.security_classification,
|
||||
CASE WHEN EXISTS (
|
||||
SELECT 1 FROM deployed_package_chain dpc2
|
||||
INNER JOIN primitive_instance pi ON pi.package_id = dpc2.package_id AND pi.primitive_name = da.attribute_name
|
||||
INNER JOIN primitive_definition pd ON pd.primitive_definition_id = pi.primitive_definition_id AND pd.primitive_name = 'HistoryExtension'
|
||||
WHERE dpc2.gobject_id = dpc.gobject_id
|
||||
) THEN 1 ELSE 0 END AS is_historized,
|
||||
CASE WHEN EXISTS (
|
||||
SELECT 1 FROM deployed_package_chain dpc2
|
||||
INNER JOIN primitive_instance pi ON pi.package_id = dpc2.package_id AND pi.primitive_name = da.attribute_name
|
||||
INNER JOIN primitive_definition pd ON pd.primitive_definition_id = pi.primitive_definition_id AND pd.primitive_name = 'AlarmExtension'
|
||||
WHERE dpc2.gobject_id = dpc.gobject_id
|
||||
) THEN 1 ELSE 0 END AS is_alarm,
|
||||
ROW_NUMBER() OVER (
|
||||
PARTITION BY dpc.gobject_id, da.attribute_name
|
||||
ORDER BY dpc.depth
|
||||
) AS rn
|
||||
FROM deployed_package_chain dpc
|
||||
INNER JOIN dynamic_attribute da
|
||||
ON da.package_id = dpc.package_id
|
||||
INNER JOIN gobject g
|
||||
ON g.gobject_id = dpc.gobject_id
|
||||
INNER JOIN template_definition td
|
||||
ON td.template_definition_id = g.template_definition_id
|
||||
LEFT JOIN data_type dt
|
||||
ON dt.mx_data_type = da.mx_data_type
|
||||
WHERE td.category_id IN (1, 3, 4, 10, 11, 13, 17, 24, 26)
|
||||
AND da.attribute_name NOT LIKE '[_]%'
|
||||
AND da.attribute_name NOT LIKE '%.Description'
|
||||
AND da.mx_attribute_category IN (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 24)
|
||||
) ranked
|
||||
WHERE rn = 1
|
||||
ORDER BY tag_name, attribute_name";
|
||||
}
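The `array_dimension` expression in `AttributesSql` takes two hex-digit pairs from `mx_value` and swaps them before converting to an integer, which suggests the dimension is stored as a little-endian 16-bit value. A minimal C# sketch of the same decoding, assuming `mx_value` arrives as a plain hex string; the helper name `MxValueDecoder.DecodeArrayDimension` is hypothetical and was not part of the removed code:

```csharp
using System;
using System.Globalization;

public static class MxValueDecoder
{
    // Mirrors SUBSTRING(mx_value, 15, 2) + SUBSTRING(mx_value, 13, 2) from the query.
    // T-SQL positions are 1-based, so the low byte sits at 0-based offset 12 and the
    // high byte at offset 14. Returns null when the string cannot be decoded.
    public static int? DecodeArrayDimension(string mxValueHex)
    {
        if (mxValueHex is null || mxValueHex.Length < 16) return null;

        string lowByte = mxValueHex.Substring(12, 2);
        string highByte = mxValueHex.Substring(14, 2);

        // High byte first reproduces the varbinary(2) -> int conversion in the SQL.
        return int.TryParse(highByte + lowByte, NumberStyles.HexNumber,
            CultureInfo.InvariantCulture, out int dimension)
            ? dimension
            : (int?)null;
    }
}
```

Under that assumption, a value whose characters 13 to 16 read `0A00` decodes to a dimension of 10.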
@@ -1,13 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;

/// <summary>
/// Connection settings for the Galaxy <c>ZB</c> repository database. Set from the
/// <c>DriverConfig</c> JSON section <c>Database</c> per <c>plan.md</c> §"Galaxy DriverConfig".
/// </summary>
public sealed class GalaxyRepositoryOptions
{
    public string ConnectionString { get; init; } =
        "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;";

    public int CommandTimeoutSeconds { get; init; } = 60;
}
@@ -1,46 +0,0 @@
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;

/// <summary>
/// Galaxy data-plane abstraction. Replaces the placeholder <c>StubFrameHandler</c> with a
/// real boundary the lifted <c>MxAccessClient</c> + <c>GalaxyRepository</c> implement during
/// Phase 2 Task B.1. Splitting the IPC dispatch (<c>GalaxyFrameHandler</c>) from the
/// backend means the dispatcher is unit-testable against an in-memory mock without needing
/// live Galaxy.
/// </summary>
public interface IGalaxyBackend
{
    /// <summary>
    /// Server-pushed events the backend raises asynchronously (data-change, alarm,
    /// host-status). The frame handler subscribes once on connect and forwards each
    /// event to the Proxy as a typed <see cref="MessageKind"/> notification.
    /// </summary>
    event System.EventHandler<OnDataChangeNotification>? OnDataChange;
    event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
    event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;

    Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct);
    Task CloseSessionAsync(CloseSessionRequest req, CancellationToken ct);

    Task<DiscoverHierarchyResponse> DiscoverAsync(DiscoverHierarchyRequest req, CancellationToken ct);

    Task<ReadValuesResponse> ReadValuesAsync(ReadValuesRequest req, CancellationToken ct);
    Task<WriteValuesResponse> WriteValuesAsync(WriteValuesRequest req, CancellationToken ct);

    Task<SubscribeResponse> SubscribeAsync(SubscribeRequest req, CancellationToken ct);
    Task UnsubscribeAsync(UnsubscribeRequest req, CancellationToken ct);

    Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct);
    Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct);

    Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct);
    Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(HistoryReadProcessedRequest req, CancellationToken ct);
    Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(HistoryReadAtTimeRequest req, CancellationToken ct);
    Task<HistoryReadEventsResponse> HistoryReadEventsAsync(HistoryReadEventsRequest req, CancellationToken ct);

    Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct);
}
@@ -1,43 +0,0 @@
using ArchestrA.MxAccess;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;

/// <summary>
/// Delegate matching <c>LMXProxyServer.OnDataChange</c> COM event signature. Allows
/// <see cref="MxAccessClient"/> to subscribe via the abstracted <see cref="IMxProxy"/>
/// instead of the COM object directly (so the test mock works without MXAccess registered).
/// </summary>
public delegate void MxDataChangeHandler(
    int hLMXServerHandle,
    int phItemHandle,
    object pvItemValue,
    int pwItemQuality,
    object pftItemTimeStamp,
    ref MXSTATUS_PROXY[] ItemStatus);

public delegate void MxWriteCompleteHandler(
    int hLMXServerHandle,
    int phItemHandle,
    ref MXSTATUS_PROXY[] ItemStatus);

/// <summary>
/// Abstraction over <c>LMXProxyServer</c> — port of v1 <c>IMxProxy</c>. Same surface area
/// so the lifted client behaves identically; only the namespace + apartment-marshalling
/// entry-point change.
/// </summary>
public interface IMxProxy
{
    int Register(string clientName);
    void Unregister(int handle);

    int AddItem(int handle, string address);
    void RemoveItem(int handle, int itemHandle);

    void AdviseSupervisory(int handle, int itemHandle);
    void UnAdviseSupervisory(int handle, int itemHandle);

    void Write(int handle, int itemHandle, object value, int securityClassification);

    event MxDataChangeHandler? OnDataChange;
    event MxWriteCompleteHandler? OnWriteComplete;
}
@@ -1,408 +0,0 @@
|
||||
using System;
|
||||
using System.Collections.Concurrent;
|
||||
using System.Linq;
|
||||
using System.Threading;
|
||||
using System.Threading.Tasks;
|
||||
using ArchestrA.MxAccess;
|
||||
using Serilog;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
|
||||
|
||||
/// <summary>
|
||||
/// MXAccess runtime client — focused port of v1 <c>MxAccessClient</c>. Owns one
|
||||
/// <c>LMXProxyServer</c> COM connection on the supplied <see cref="StaPump"/>; serializes
|
||||
/// read / write / subscribe through the pump because all COM calls must run on the STA
|
||||
/// thread. Subscriptions are stored so they can be replayed on reconnect (full reconnect
|
||||
/// loop is the deferred-but-non-blocking refinement; this version covers connect/read/write
|
||||
/// /subscribe/unsubscribe — the MVP needed for parity testing).
|
||||
/// </summary>
|
||||
public sealed class MxAccessClient : IDisposable
|
||||
{
|
||||
private static readonly ILogger Log = Serilog.Log.ForContext<MxAccessClient>();
|
||||
|
||||
private readonly StaPump _pump;
|
||||
private readonly IMxProxy _proxy;
|
||||
private readonly string _clientName;
|
||||
private readonly MxAccessClientOptions _options;
|
||||
|
||||
// Galaxy attribute reference → MXAccess item handle (set on first Subscribe/Read).
|
||||
private readonly ConcurrentDictionary<string, int> _addressToHandle = new(StringComparer.OrdinalIgnoreCase);
|
||||
private readonly ConcurrentDictionary<int, string> _handleToAddress = new();
|
||||
private readonly ConcurrentDictionary<string, Action<string, Vtq>> _subscriptions =
|
||||
new(StringComparer.OrdinalIgnoreCase);
|
||||
private readonly ConcurrentDictionary<int, TaskCompletionSource<bool>> _pendingWrites = new();
|
||||
|
||||
private int _connectionHandle;
|
||||
private bool _connected;
|
||||
private DateTime _lastObservedActivityUtc = DateTime.UtcNow;
|
||||
private CancellationTokenSource? _monitorCts;
|
||||
private int _reconnectCount;
|
||||
private bool _disposed;
|
||||
|
||||
/// <summary>Fires whenever the connection transitions Connected ↔ Disconnected.</summary>
|
||||
public event EventHandler<bool>? ConnectionStateChanged;
|
||||
|
||||
/// <summary>
|
||||
/// Fires once per failed subscription replay after a reconnect. Carries the tag reference
|
||||
/// and the exception so the backend can propagate the degradation signal (e.g. mark the
|
||||
/// subscription bad on the Proxy side rather than silently losing its callback). Added for
|
||||
/// PR 6 low finding #2 — the replay loop previously ate per-tag failures silently and an
|
||||
/// operator would only find out that a specific subscription stopped updating through a
|
||||
/// data-quality complaint from downstream.
|
||||
/// </summary>
|
||||
public event EventHandler<SubscriptionReplayFailedEventArgs>? SubscriptionReplayFailed;
|
||||
|
||||
public MxAccessClient(StaPump pump, IMxProxy proxy, string clientName, MxAccessClientOptions? options = null)
|
||||
{
|
||||
_pump = pump;
|
||||
_proxy = proxy;
|
||||
_clientName = clientName;
|
||||
_options = options ?? new MxAccessClientOptions();
|
||||
_proxy.OnDataChange += OnDataChange;
|
||||
_proxy.OnWriteComplete += OnWriteComplete;
|
||||
}
|
||||
|
||||
public bool IsConnected => _connected;
|
||||
public int SubscriptionCount => _subscriptions.Count;
|
||||
public int ReconnectCount => _reconnectCount;
|
||||
|
||||
/// <summary>
|
||||
/// Wonderware client identity used when registering with the LMXProxyServer. Surfaced so
|
||||
/// <see cref="Backend.MxAccessGalaxyBackend"/> can tag its <c>OnHostStatusChanged</c> IPC
|
||||
/// pushes with a stable gateway name per PR 8.
|
||||
/// </summary>
|
||||
public string ClientName => _clientName;
|
||||
|
||||
/// <summary>Connects on the STA thread. Idempotent. Starts the reconnect monitor on first call.</summary>
|
||||
public async Task<int> ConnectAsync()
|
||||
{
|
||||
var handle = await _pump.InvokeAsync(() =>
|
||||
{
|
||||
if (_connected) return _connectionHandle;
|
||||
_connectionHandle = _proxy.Register(_clientName);
|
||||
_connected = true;
|
||||
return _connectionHandle;
|
||||
});
|
||||
|
||||
ConnectionStateChanged?.Invoke(this, true);
|
||||
|
||||
if (_options.AutoReconnect && _monitorCts is null)
|
||||
{
|
||||
_monitorCts = new CancellationTokenSource();
|
||||
_ = Task.Run(() => MonitorLoopAsync(_monitorCts.Token));
|
||||
}
|
||||
|
||||
return handle;
|
||||
}
|
||||
|
||||
public async Task DisconnectAsync()
|
||||
{
|
||||
_monitorCts?.Cancel();
|
||||
_monitorCts = null;
|
||||
|
||||
await _pump.InvokeAsync(() =>
|
||||
{
|
||||
if (!_connected) return;
|
||||
try { _proxy.Unregister(_connectionHandle); }
|
||||
finally
|
||||
{
|
||||
_connected = false;
|
||||
_addressToHandle.Clear();
|
||||
_handleToAddress.Clear();
|
||||
}
|
||||
});
|
||||
|
||||
ConnectionStateChanged?.Invoke(this, false);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Background loop that watches for connection liveness signals and triggers
|
||||
/// reconnect-with-replay when the connection appears dead. Per Phase 2 high finding #2:
|
||||
/// v1's MxAccessClient.Monitor pattern lifted into the new pump-based client. Uses
|
||||
/// observed-activity timestamp + optional probe-tag subscription. Without an explicit
|
||||
/// probe tag, falls back to "no data change in N seconds + no successful read in N
|
||||
/// seconds = unhealthy" — same shape as v1.
|
||||
/// </summary>
|
||||
private async Task MonitorLoopAsync(CancellationToken ct)
|
||||
{
|
||||
while (!ct.IsCancellationRequested)
|
||||
{
|
||||
try { await Task.Delay(_options.MonitorInterval, ct); }
|
||||
catch (OperationCanceledException) { break; }
|
||||
|
||||
if (!_connected || _disposed) continue;
|
||||
|
||||
var idle = DateTime.UtcNow - _lastObservedActivityUtc;
|
||||
if (idle <= _options.StaleThreshold) continue;
|
||||
|
||||
// Probe: try a no-op COM call. If the proxy is dead, the call will throw — that's
|
||||
// our reconnect signal. PR 6 low finding #1: AddItem allocates an MXAccess item
|
||||
// handle; we must RemoveItem it on the same pump turn or the long-running monitor
|
||||
// leaks one handle per probe cycle (one every MonitorInterval seconds, indefinitely).
|
||||
bool probeOk;
|
||||
try
|
||||
{
|
||||
probeOk = await _pump.InvokeAsync(() =>
|
||||
{
|
||||
int probeHandle = 0;
|
||||
try
|
||||
{
|
||||
probeHandle = _proxy.AddItem(_connectionHandle, "$Heartbeat");
|
||||
return probeHandle > 0;
|
||||
}
|
||||
catch { return false; }
|
||||
finally
|
||||
{
|
||||
if (probeHandle > 0)
|
||||
{
|
||||
try { _proxy.RemoveItem(_connectionHandle, probeHandle); }
|
||||
catch { /* proxy is dying; best-effort cleanup */ }
|
||||
}
|
||||
}
|
||||
});
|
||||
}
|
||||
catch { probeOk = false; }
|
||||
|
||||
if (probeOk)
|
||||
{
|
||||
_lastObservedActivityUtc = DateTime.UtcNow;
|
||||
continue;
|
||||
}
|
||||
|
||||
// Connection appears dead — reconnect-with-replay.
|
||||
try
|
||||
{
|
||||
await _pump.InvokeAsync(() =>
|
||||
{
|
||||
try { _proxy.Unregister(_connectionHandle); } catch { /* dead anyway */ }
|
||||
_connected = false;
|
||||
});
|
||||
ConnectionStateChanged?.Invoke(this, false);
|
||||
|
||||
await _pump.InvokeAsync(() =>
|
||||
{
|
||||
_connectionHandle = _proxy.Register(_clientName);
|
||||
_connected = true;
|
||||
});
|
||||
_reconnectCount++;
|
||||
ConnectionStateChanged?.Invoke(this, true);
|
||||
|
||||
// Replay every subscription that was active before the disconnect. PR 6 low
|
||||
// finding #2: surface per-tag failures — log them and raise
|
||||
// SubscriptionReplayFailed so the backend can propagate the degraded state
|
||||
// (previously swallowed silently; downstream quality dropped without a signal).
|
||||
var snapshot = _addressToHandle.Keys.ToArray();
|
||||
_addressToHandle.Clear();
|
||||
_handleToAddress.Clear();
|
||||
var failed = 0;
|
||||
foreach (var fullRef in snapshot)
|
||||
{
|
||||
try { await SubscribeOnPumpAsync(fullRef); }
|
||||
catch (Exception subEx)
|
||||
{
|
||||
failed++;
|
||||
Log.Warning(subEx,
|
||||
"MXAccess subscription replay failed for {TagReference} after reconnect #{Reconnect}",
|
||||
fullRef, _reconnectCount);
|
||||
SubscriptionReplayFailed?.Invoke(this,
|
||||
new SubscriptionReplayFailedEventArgs(fullRef, subEx));
|
||||
}
|
||||
}
|
||||
|
||||
if (failed > 0)
|
||||
Log.Warning("Subscription replay completed — {Failed} of {Total} failed", failed, snapshot.Length);
|
||||
else
|
||||
Log.Information("Subscription replay completed — {Total} re-subscribed cleanly", snapshot.Length);
|
||||
|
||||
_lastObservedActivityUtc = DateTime.UtcNow;
|
||||
}
|
||||
catch
|
||||
{
|
||||
// Reconnect failed; back off and retry on the next tick.
|
||||
_connected = false;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// One-shot read implemented as a transient subscribe + unsubscribe.
|
||||
/// <c>LMXProxyServer</c> doesn't expose a synchronous read, so the canonical pattern
|
||||
/// (lifted from v1) is to subscribe, await the first OnDataChange, then unsubscribe.
|
||||
/// This method captures that single value.
|
||||
/// </summary>
|
||||
public async Task<Vtq> ReadAsync(string fullReference, TimeSpan timeout, CancellationToken ct)
|
||||
{
|
||||
if (!_connected) throw new InvalidOperationException("MxAccessClient not connected");
|
||||
|
||||
var tcs = new TaskCompletionSource<Vtq>(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||
Action<string, Vtq> oneShot = (_, value) => tcs.TrySetResult(value);
|
||||
|
||||
// Stash the one-shot handler before sending the subscribe, then remove it after firing.
|
||||
_subscriptions.AddOrUpdate(fullReference, oneShot, (_, existing) => Combine(existing, oneShot));
|
||||
var addedToReadOnlyAttribute = !_addressToHandle.ContainsKey(fullReference);
|
||||
|
||||
try
|
||||
{
|
||||
await SubscribeOnPumpAsync(fullReference);
|
||||
|
||||
using var _ = ct.Register(() => tcs.TrySetCanceled());
|
||||
var raceTask = await Task.WhenAny(tcs.Task, Task.Delay(timeout, ct));
|
||||
if (raceTask != tcs.Task) throw new TimeoutException($"MXAccess read of {fullReference} timed out after {timeout}");
|
||||
|
||||
return await tcs.Task;
|
||||
}
|
||||
finally
|
||||
{
|
||||
// High 1 — always detach the one-shot handler, even on cancellation/timeout/throw.
|
||||
// If we were the one who added the underlying MXAccess subscription (no other
|
||||
// caller had it), tear it down too so we don't leak a probe item handle.
|
||||
_subscriptions.AddOrUpdate(fullReference, _ => default!, (_, existing) => Remove(existing, oneShot));
|
||||
if (addedToReadOnlyAttribute)
|
||||
{
|
||||
try { await UnsubscribeAsync(fullReference); }
|
||||
catch { /* shutdown-best-effort */ }
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Writes <paramref name="value"/> to the runtime and AWAITS the OnWriteComplete
|
||||
/// callback so the caller learns the actual write status. Per Phase 2 medium finding #4
|
||||
/// in <c>exit-gate-phase-2.md</c>: the previous fire-and-forget version returned a
|
||||
/// false-positive Good even when the runtime rejected the write post-callback.
|
||||
/// </summary>
|
||||
public async Task<bool> WriteAsync(string fullReference, object value,
|
||||
int securityClassification = 0, TimeSpan? timeout = null)
|
||||
{
|
||||
if (!_connected) throw new InvalidOperationException("MxAccessClient not connected");
|
||||
var actualTimeout = timeout ?? TimeSpan.FromSeconds(5);
|
||||
|
||||
var itemHandle = await _pump.InvokeAsync(() => ResolveItem(fullReference));
|
||||
|
||||
var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||
if (!_pendingWrites.TryAdd(itemHandle, tcs))
|
||||
{
|
||||
// A prior write to the same item handle is still pending — uncommon but possible
|
||||
// if the caller spammed writes. Replace it: the older TCS observes a Cancelled task.
|
||||
if (_pendingWrites.TryRemove(itemHandle, out var prior))
|
||||
prior.TrySetCanceled();
|
||||
_pendingWrites[itemHandle] = tcs;
|
||||
}
|
||||
|
||||
try
|
||||
{
|
||||
await _pump.InvokeAsync(() =>
|
||||
_proxy.Write(_connectionHandle, itemHandle, value, securityClassification));
|
||||
|
||||
var raceTask = await Task.WhenAny(tcs.Task, Task.Delay(actualTimeout));
|
||||
if (raceTask != tcs.Task)
|
||||
throw new TimeoutException($"MXAccess write of {fullReference} timed out after {actualTimeout}");
|
||||
|
||||
return await tcs.Task;
|
||||
}
|
||||
finally
|
||||
{
|
||||
_pendingWrites.TryRemove(itemHandle, out _);
|
||||
}
|
||||
}
|
||||
|
||||
public async Task SubscribeAsync(string fullReference, Action<string, Vtq> callback)
|
||||
{
|
||||
if (!_connected) throw new InvalidOperationException("MxAccessClient not connected");
|
||||
|
||||
_subscriptions.AddOrUpdate(fullReference, callback, (_, existing) => Combine(existing, callback));
|
||||
await SubscribeOnPumpAsync(fullReference);
|
||||
}
|
||||
|
||||
public Task UnsubscribeAsync(string fullReference) => _pump.InvokeAsync(() =>
|
||||
{
|
||||
if (!_connected) return;
|
||||
if (!_addressToHandle.TryRemove(fullReference, out var handle)) return;
|
||||
_handleToAddress.TryRemove(handle, out _);
|
||||
_subscriptions.TryRemove(fullReference, out _);
|
||||
|
||||
try
|
||||
{
|
||||
_proxy.UnAdviseSupervisory(_connectionHandle, handle);
|
||||
_proxy.RemoveItem(_connectionHandle, handle);
|
||||
}
|
||||
catch { /* best-effort during teardown */ }
|
||||
});
|
||||
|
||||
private Task<int> SubscribeOnPumpAsync(string fullReference) => _pump.InvokeAsync(() =>
|
||||
{
|
||||
if (_addressToHandle.TryGetValue(fullReference, out var existing)) return existing;
|
||||
|
||||
var itemHandle = _proxy.AddItem(_connectionHandle, fullReference);
|
||||
_addressToHandle[fullReference] = itemHandle;
|
||||
_handleToAddress[itemHandle] = fullReference;
|
||||
_proxy.AdviseSupervisory(_connectionHandle, itemHandle);
|
||||
return itemHandle;
|
||||
});
|
||||
|
||||
private int ResolveItem(string fullReference)
|
||||
{
|
||||
if (_addressToHandle.TryGetValue(fullReference, out var existing)) return existing;
|
||||
var itemHandle = _proxy.AddItem(_connectionHandle, fullReference);
|
||||
_addressToHandle[fullReference] = itemHandle;
|
||||
_handleToAddress[itemHandle] = fullReference;
|
||||
return itemHandle;
|
||||
}
|
||||
|
||||
private void OnDataChange(int hLMXServerHandle, int phItemHandle, object pvItemValue,
|
||||
int pwItemQuality, object pftItemTimeStamp, ref MXSTATUS_PROXY[] itemStatus)
|
||||
{
|
||||
if (!_handleToAddress.TryGetValue(phItemHandle, out var fullRef)) return;
|
||||
|
||||
// Liveness: any data-change event is proof the connection is alive.
|
||||
_lastObservedActivityUtc = DateTime.UtcNow;
|
||||
|
||||
var ts = pftItemTimeStamp is DateTime dt ? dt.ToUniversalTime() : DateTime.UtcNow;
|
||||
var quality = (byte)Math.Min(255, Math.Max(0, pwItemQuality));
|
||||
var vtq = new Vtq(pvItemValue, ts, quality);
|
||||
|
||||
if (_subscriptions.TryGetValue(fullRef, out var cb)) cb?.Invoke(fullRef, vtq);
|
||||
}
|
||||
|
||||
private void OnWriteComplete(int hLMXServerHandle, int phItemHandle, ref MXSTATUS_PROXY[] itemStatus)
|
||||
{
|
||||
if (_pendingWrites.TryRemove(phItemHandle, out var tcs))
|
||||
tcs.TrySetResult(itemStatus is null || itemStatus.Length == 0 || itemStatus[0].success != 0);
|
||||
}
|
||||
|
||||
private static Action<string, Vtq> Combine(Action<string, Vtq> a, Action<string, Vtq> b)
|
||||
=> (Action<string, Vtq>)Delegate.Combine(a, b)!;
|
||||
|
||||
private static Action<string, Vtq> Remove(Action<string, Vtq> source, Action<string, Vtq> remove)
|
||||
=> (Action<string, Vtq>?)Delegate.Remove(source, remove) ?? ((_, _) => { });
|
||||
|
||||
public void Dispose()
|
||||
{
|
||||
_disposed = true;
|
||||
_monitorCts?.Cancel();
|
||||
|
||||
try { DisconnectAsync().GetAwaiter().GetResult(); }
|
||||
catch { /* swallow */ }
|
||||
|
||||
_proxy.OnDataChange -= OnDataChange;
|
||||
_proxy.OnWriteComplete -= OnWriteComplete;
|
||||
_monitorCts?.Dispose();
|
||||
}
|
||||
}
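The one-shot read in `ReadAsync` combines three pieces: a `TaskCompletionSource` completed by the first data-change callback, `Delegate.Combine` / `Delegate.Remove` to attach and detach the handler next to any long-lived subscribers, and a `Task.WhenAny` race against a timeout. A self-contained sketch of that pattern, with the COM layer replaced by a plain `Publish` method; `OneShotReader`, `Publish`, and `ReadOnceAsync` are hypothetical names used only for illustration:

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class OneShotReader
{
    private static readonly Action<string, object> NoOp = (tag, value) => { };

    private readonly ConcurrentDictionary<string, Action<string, object>> _handlers =
        new(StringComparer.OrdinalIgnoreCase);

    // Stand-in for the COM data-change callback: fans the value out to every handler.
    public void Publish(string tag, object value)
    {
        if (_handlers.TryGetValue(tag, out var handler)) handler.Invoke(tag, value);
    }

    public async Task<object> ReadOnceAsync(string tag, TimeSpan timeout)
    {
        var tcs = new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);
        Action<string, object> oneShot = (_, value) => tcs.TrySetResult(value);

        // Attach alongside whatever handlers are already registered for the tag.
        _handlers.AddOrUpdate(tag, oneShot,
            (_, existing) => (Action<string, object>)Delegate.Combine(existing, oneShot)!);
        try
        {
            var winner = await Task.WhenAny(tcs.Task, Task.Delay(timeout));
            if (winner != tcs.Task)
                throw new TimeoutException($"Read of {tag} timed out after {timeout}");
            return await tcs.Task;
        }
        finally
        {
            // Always detach, even on timeout. Like the original, this leaves a no-op
            // entry behind when the one-shot handler was the only subscriber.
            _handlers.AddOrUpdate(tag, _ => NoOp,
                (_, existing) => (Action<string, object>?)Delegate.Remove(existing, oneShot) ?? NoOp);
        }
    }
}
```

Because the handler is registered before the first `await`, a value published immediately after `ReadOnceAsync` returns its task is still observed.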
|
||||
|
||||
/// <summary>
|
||||
/// Tunables for <see cref="MxAccessClient"/>'s reconnect monitor. Defaults match the v1
|
||||
/// monitor's polling cadence so behavior is consistent across the lift.
|
||||
/// </summary>
|
||||
public sealed class MxAccessClientOptions
|
||||
{
|
||||
/// <summary>Whether to start the background monitor at connect time.</summary>
|
||||
public bool AutoReconnect { get; init; } = true;
|
||||
|
||||
/// <summary>How often the monitor wakes up to check liveness.</summary>
|
||||
public TimeSpan MonitorInterval { get; init; } = TimeSpan.FromSeconds(5);
|
||||
|
||||
/// <summary>If no data-change activity in this window, the monitor probes the connection.</summary>
|
||||
public TimeSpan StaleThreshold { get; init; } = TimeSpan.FromSeconds(60);
|
||||
}
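As a rough illustration of how `MonitorInterval` and `StaleThreshold` interact in `MonitorLoopAsync`: the loop wakes once per interval and only probes when the idle time exceeds the threshold, so with the defaults the first probe would be expected somewhere between 60 and 65 seconds after the last observed activity. The sketch below isolates that decision as a pure function; `LivenessPolicy`, `MonitorAction`, and `Decide` are hypothetical names introduced here and did not exist in the removed code.

```csharp
using System;

public enum MonitorAction { Skip, Probe }

public static class LivenessPolicy
{
    // Same checks the monitor loop performs on each tick, without any COM calls:
    // skip while disconnected or disposed, skip while activity is recent, else probe.
    public static MonitorAction Decide(
        bool connected, bool disposed,
        DateTime lastActivityUtc, DateTime nowUtc, TimeSpan staleThreshold)
    {
        if (!connected || disposed) return MonitorAction.Skip;

        TimeSpan idle = nowUtc - lastActivityUtc;
        return idle <= staleThreshold ? MonitorAction.Skip : MonitorAction.Probe;
    }
}
```

Keeping the decision free of side effects would make the threshold behavior testable without an STA pump or a proxy mock.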
@@ -1,68 +0,0 @@
using System;
using System.Runtime.InteropServices;
using ArchestrA.MxAccess;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;

/// <summary>
/// Concrete <see cref="IMxProxy"/> backed by a real <c>LMXProxyServer</c> COM object.
/// Port of v1 <c>MxProxyAdapter</c>. <strong>Must only be constructed on an STA thread</strong>
/// — the StaPump owns this instance.
/// </summary>
public sealed class MxProxyAdapter : IMxProxy, IDisposable
{
    private LMXProxyServer? _lmxProxy;

    public event MxDataChangeHandler? OnDataChange;
    public event MxWriteCompleteHandler? OnWriteComplete;

    public int Register(string clientName)
    {
        _lmxProxy = new LMXProxyServer();
        _lmxProxy.OnDataChange += ProxyOnDataChange;
        _lmxProxy.OnWriteComplete += ProxyOnWriteComplete;

        var handle = _lmxProxy.Register(clientName);
        if (handle <= 0)
            throw new InvalidOperationException($"LMXProxyServer.Register returned invalid handle: {handle}");
        return handle;
    }

    public void Unregister(int handle)
    {
        if (_lmxProxy is null) return;
        try
        {
            _lmxProxy.OnDataChange -= ProxyOnDataChange;
            _lmxProxy.OnWriteComplete -= ProxyOnWriteComplete;
            _lmxProxy.Unregister(handle);
        }
        finally
        {
            // ReleaseComObject loop until refcount = 0 — the Tier C SafeHandle wraps this in
            // production; here the lifetime is owned by the surrounding MxAccessHandle.
            while (Marshal.IsComObject(_lmxProxy) && Marshal.ReleaseComObject(_lmxProxy) > 0) { }
            _lmxProxy = null;
        }
    }

    public int AddItem(int handle, string address) => _lmxProxy!.AddItem(handle, address);

    public void RemoveItem(int handle, int itemHandle) => _lmxProxy!.RemoveItem(handle, itemHandle);

    public void AdviseSupervisory(int handle, int itemHandle) => _lmxProxy!.AdviseSupervisory(handle, itemHandle);

    public void UnAdviseSupervisory(int handle, int itemHandle) => _lmxProxy!.UnAdvise(handle, itemHandle);

    public void Write(int handle, int itemHandle, object value, int securityClassification) =>
        _lmxProxy!.Write(handle, itemHandle, value, securityClassification);

    private void ProxyOnDataChange(int hLMXServerHandle, int phItemHandle, object pvItemValue,
        int pwItemQuality, object pftItemTimeStamp, ref MXSTATUS_PROXY[] ItemStatus)
        => OnDataChange?.Invoke(hLMXServerHandle, phItemHandle, pvItemValue, pwItemQuality, pftItemTimeStamp, ref ItemStatus);

    private void ProxyOnWriteComplete(int hLMXServerHandle, int phItemHandle, ref MXSTATUS_PROXY[] ItemStatus)
        => OnWriteComplete?.Invoke(hLMXServerHandle, phItemHandle, ref ItemStatus);

    public void Dispose() => Unregister(0);
}
@@ -1,20 +0,0 @@
using System;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;

/// <summary>
/// Fired by <see cref="MxAccessClient.SubscriptionReplayFailed"/> when a previously-active
/// subscription fails to be restored after a reconnect. The backend should treat the tag as
/// unhealthy until the next successful resubscribe.
/// </summary>
public sealed class SubscriptionReplayFailedEventArgs : EventArgs
{
    public SubscriptionReplayFailedEventArgs(string tagReference, Exception exception)
    {
        TagReference = tagReference;
        Exception = exception;
    }

    public string TagReference { get; }
    public Exception Exception { get; }
}
@@ -1,24 +0,0 @@
using System;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;

/// <summary>Value-timestamp-quality triplet — port of v1 <c>Vtq</c>.</summary>
public readonly struct Vtq
{
    public object? Value { get; }
    public DateTime TimestampUtc { get; }
    public byte Quality { get; }

    public Vtq(object? value, DateTime timestampUtc, byte quality)
    {
        Value = value;
        TimestampUtc = timestampUtc;
        Quality = quality;
    }

    /// <summary>OPC DA Good = 192.</summary>
    public static Vtq Good(object? v) => new(v, DateTime.UtcNow, 192);

    /// <summary>OPC DA Bad = 0.</summary>
    public static Vtq Bad() => new(null, DateTime.UtcNow, 0);
}
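The `Good = 192` and `Bad = 0` constants follow the OPC DA quality layout, where the top two bits of the quality byte carry the major status. A small sketch of how a consumer might clamp a raw quality integer the way `MxAccessClient.OnDataChange` does and then classify it; `OpcDaQuality`, `Clamp`, and `Classify` are hypothetical helper names, not part of the removed code:

```csharp
using System;

public static class OpcDaQuality
{
    // Same clamp expression as MxAccessClient.OnDataChange: force the raw int into 0..255.
    public static byte Clamp(int rawQuality) => (byte)Math.Min(255, Math.Max(0, rawQuality));

    // OPC DA major status lives in the top two bits: 11 = Good, 01 = Uncertain, 00 = Bad.
    // Any other bit pattern is treated as Bad in this sketch.
    public static string Classify(byte quality) => (quality & 0xC0) switch
    {
        0xC0 => "Good",
        0x40 => "Uncertain",
        _ => "Bad",
    };
}
```

Masking with `0xC0` rather than comparing against 192 exactly keeps values that carry substatus or limit bits in the same major category.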
@@ -1,608 +0,0 @@
|
||||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.Linq;
|
||||
using System.Threading;
|
||||
using System.Threading.Tasks;
|
||||
using MessagePack;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Alarms;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Backend;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Stability;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
|
||||
|
||||
/// <summary>
|
||||
/// Production <see cref="IGalaxyBackend"/> — combines the SQL-backed
|
||||
/// <see cref="GalaxyRepository"/> for Discover with the live MXAccess
|
||||
/// <see cref="MxAccessClient"/> for Read / Write / Subscribe. History stays bad-coded
|
||||
/// until the Wonderware Historian SDK plugin loader (Task B.1.h) lands. Alarms come from
|
||||
/// MxAccess <c>AlarmExtension</c> primitives but the wire-up is also Phase 2 follow-up
|
||||
/// (the v1 alarm subsystem is its own subtree).
|
||||
/// </summary>
|
||||
public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
|
||||
{
|
||||
private readonly GalaxyRepository _repository;
|
||||
private readonly MxAccessClient _mx;
|
||||
private readonly IHistorianDataSource? _historian;
|
||||
private long _nextSessionId;
|
||||
private long _nextSubscriptionId;
|
||||
|
||||
// Active SubscriptionId → MXAccess full reference list — so Unsubscribe can find them.
|
||||
private readonly System.Collections.Concurrent.ConcurrentDictionary<long, IReadOnlyList<string>> _subs = new();
|
||||
// Reverse lookup: tag reference → subscription IDs subscribed to it (one tag may belong to many).
|
||||
private readonly System.Collections.Concurrent.ConcurrentDictionary<string, System.Collections.Concurrent.ConcurrentBag<long>>
|
||||
_refToSubs = new(System.StringComparer.OrdinalIgnoreCase);
|
||||
|
||||
public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
|
||||
public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
|
||||
public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
|
||||
|
||||
private readonly System.EventHandler<bool> _onConnectionStateChanged;
|
||||
private readonly GalaxyRuntimeProbeManager _probeManager;
|
||||
private readonly System.EventHandler<HostStateTransition> _onProbeStateChanged;
|
||||
private readonly GalaxyAlarmTracker _alarmTracker;
|
||||
private readonly System.EventHandler<AlarmTransition> _onAlarmTransition;
|
||||
|
||||
// Cached during DiscoverAsync so SubscribeAlarmsAsync knows which attributes to advise.
|
||||
// One entry per IsAlarm=true attribute in the last discovered hierarchy.
|
||||
private readonly System.Collections.Concurrent.ConcurrentBag<string> _discoveredAlarmTags = new();
|
||||
|
||||
    public MxAccessGalaxyBackend(GalaxyRepository repository, MxAccessClient mx, IHistorianDataSource? historian = null)
    {
        _repository = repository;
        _mx = mx;
        _historian = historian;

        // PR 8: gateway-level host-status push. When the MXAccess COM proxy transitions
        // connected↔disconnected, raise OnHostStatusChanged with a synthetic host entry named
        // after the Wonderware client identity so the Admin UI surfaces top-level transport
        // health even before per-platform/per-engine probing lands (deferred to a later PR that
        // ports v1's GalaxyRuntimeProbeManager with ScanState subscriptions).
        _onConnectionStateChanged = (_, connected) =>
        {
            OnHostStatusChanged?.Invoke(this, new HostConnectivityStatus
            {
                HostName = _mx.ClientName,
                RuntimeStatus = connected ? "Running" : "Stopped",
                LastObservedUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
            });
        };
        _mx.ConnectionStateChanged += _onConnectionStateChanged;

        // PR 13: per-platform runtime probes. ScanState subscriptions fire OnProbeCallback,
        // which runs the state machine and raises StateChanged on transitions we care about.
        // We forward each transition through the same OnHostStatusChanged IPC event that the
        // gateway-level ConnectionStateChanged uses — tagged with the platform's TagName so the
        // Admin UI can show per-host health independently from the top-level transport status.
        _probeManager = new GalaxyRuntimeProbeManager(
            subscribe: (probe, cb) => _mx.SubscribeAsync(probe, cb),
            unsubscribe: probe => _mx.UnsubscribeAsync(probe));
        _onProbeStateChanged = (_, t) =>
        {
            OnHostStatusChanged?.Invoke(this, new HostConnectivityStatus
            {
                HostName = t.TagName,
                RuntimeStatus = t.NewState switch
                {
                    HostRuntimeState.Running => "Running",
                    HostRuntimeState.Stopped => "Stopped",
                    _ => "Unknown",
                },
                LastObservedUtcUnixMs = new DateTimeOffset(t.AtUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
            });
        };
        _probeManager.StateChanged += _onProbeStateChanged;

        // PR 14: alarm subsystem. Per IsAlarm=true attribute discovered, subscribe to the four
        // alarm-state attributes (.InAlarm/.Priority/.DescAttrName/.Acked), track lifecycle,
        // and raise GalaxyAlarmEvent on transitions — forwarded through the existing
        // OnAlarmEvent IPC event that the PR 4 ConnectionSink already wires into AlarmEvent frames.
        _alarmTracker = new GalaxyAlarmTracker(
            subscribe: (tag, cb) => _mx.SubscribeAsync(tag, cb),
            unsubscribe: tag => _mx.UnsubscribeAsync(tag),
            write: (tag, v) => _mx.WriteAsync(tag, v));
        _onAlarmTransition = (_, t) => OnAlarmEvent?.Invoke(this, new GalaxyAlarmEvent
        {
            EventId = Guid.NewGuid().ToString("N"),
            ObjectTagName = t.AlarmTag,
            AlarmName = t.AlarmTag,
            Severity = t.Priority,
            StateTransition = t.Transition switch
            {
                AlarmStateTransition.Active => "Active",
                AlarmStateTransition.Acknowledged => "Acknowledged",
                AlarmStateTransition.Inactive => "Inactive",
                _ => "Unknown",
            },
            Message = t.DescAttrName ?? t.AlarmTag,
            UtcUnixMs = new DateTimeOffset(t.AtUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
        });
        _alarmTracker.TransitionRaised += _onAlarmTransition;
    }

    /// <summary>
    /// Exposed for tests. Production flow: DiscoverAsync completes → backend calls
    /// <c>GalaxyRuntimeProbeManager.SyncAsync</c> with the runtime hosts (WinPlatform + AppEngine
    /// gobjects) to advise ScanState per host.
    /// </summary>
    internal GalaxyRuntimeProbeManager ProbeManager => _probeManager;

    public async Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
    {
        try
        {
            await _mx.ConnectAsync();
            return new OpenSessionResponse { Success = true, SessionId = Interlocked.Increment(ref _nextSessionId) };
        }
        catch (Exception ex)
        {
            return new OpenSessionResponse { Success = false, Error = $"MXAccess connect failed: {ex.Message}" };
        }
    }

    public async Task CloseSessionAsync(CloseSessionRequest req, CancellationToken ct)
    {
        await _mx.DisconnectAsync();
    }

    public async Task<DiscoverHierarchyResponse> DiscoverAsync(DiscoverHierarchyRequest req, CancellationToken ct)
    {
        try
        {
            var hierarchy = await _repository.GetHierarchyAsync(ct).ConfigureAwait(false);
            var attributes = await _repository.GetAttributesAsync(ct).ConfigureAwait(false);

            var attrsByGobject = attributes
                .GroupBy(a => a.GobjectId)
                .ToDictionary(g => g.Key, g => g.Select(MapAttribute).ToArray());
            var nameByGobject = hierarchy.ToDictionary(o => o.GobjectId, o => o.TagName);

            var objects = hierarchy.Select(o => new GalaxyObjectInfo
            {
                ContainedName = string.IsNullOrEmpty(o.ContainedName) ? o.TagName : o.ContainedName,
                TagName = o.TagName,
                ParentContainedName = o.ParentGobjectId != 0 && nameByGobject.TryGetValue(o.ParentGobjectId, out var p) ? p : null,
                TemplateCategory = MapCategory(o.CategoryId),
                Attributes = attrsByGobject.TryGetValue(o.GobjectId, out var a) ? a : Array.Empty<GalaxyAttributeInfo>(),
            }).ToArray();

            // PR 14: cache alarm-bearing attribute full refs so SubscribeAlarmsAsync can advise
            // them on demand. Format matches the Galaxy reference grammar <tag>.<attr>.
            var freshAlarmTags = attributes
                .Where(a => a.IsAlarm)
                .Select(a => nameByGobject.TryGetValue(a.GobjectId, out var tn)
                    ? tn + "." + a.AttributeName
                    : null)
                .Where(s => !string.IsNullOrWhiteSpace(s))
                .Cast<string>()
                .ToArray();
            while (_discoveredAlarmTags.TryTake(out _)) { }
            foreach (var t in freshAlarmTags) _discoveredAlarmTags.Add(t);

            // PR 13: Sync the per-platform probe manager against the just-discovered hierarchy
            // so ScanState subscriptions track the current runtime set. Best-effort — probe
            // failures don't block Discover from returning, since the gateway-level signal from
            // MxAccessClient.ConnectionStateChanged still flows and the Admin UI degrades to
            // that level if any per-host probe couldn't advise.
            try
            {
                var targets = hierarchy
                    .Where(o => o.CategoryId == GalaxyRuntimeProbeManager.CategoryWinPlatform
                             || o.CategoryId == GalaxyRuntimeProbeManager.CategoryAppEngine)
                    .Select(o => new HostProbeTarget(o.TagName, o.CategoryId));
                await _probeManager.SyncAsync(targets).ConfigureAwait(false);
            }
            catch { /* swallow — Discover succeeded; probes are a diagnostic enrichment */ }

            return new DiscoverHierarchyResponse { Success = true, Objects = objects };
        }
        catch (Exception ex)
        {
            return new DiscoverHierarchyResponse { Success = false, Error = ex.Message, Objects = Array.Empty<GalaxyObjectInfo>() };
        }
    }

    public async Task<ReadValuesResponse> ReadValuesAsync(ReadValuesRequest req, CancellationToken ct)
    {
        if (!_mx.IsConnected) return new ReadValuesResponse { Success = false, Error = "Not connected", Values = Array.Empty<GalaxyDataValue>() };

        var results = new List<GalaxyDataValue>(req.TagReferences.Length);
        foreach (var reference in req.TagReferences)
        {
            try
            {
                var vtq = await _mx.ReadAsync(reference, TimeSpan.FromSeconds(5), ct);
                results.Add(ToWire(reference, vtq));
            }
            catch (Exception ex)
            {
                results.Add(new GalaxyDataValue
                {
                    TagReference = reference,
                    StatusCode = 0x80020000u, // Bad_InternalError
                    ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
                    ValueBytes = MessagePackSerializer.Serialize(ex.Message),
                });
            }
        }

        return new ReadValuesResponse { Success = true, Values = results.ToArray() };
    }

    public async Task<WriteValuesResponse> WriteValuesAsync(WriteValuesRequest req, CancellationToken ct)
    {
        var results = new List<WriteValueResult>(req.Writes.Length);
        foreach (var w in req.Writes)
        {
            try
            {
                // Decode the value back from the MessagePack bytes the Proxy sent.
                var value = w.ValueBytes is null
                    ? null
                    : MessagePackSerializer.Deserialize<object>(w.ValueBytes);

                var ok = await _mx.WriteAsync(w.TagReference, value!);
                results.Add(new WriteValueResult
                {
                    TagReference = w.TagReference,
                    StatusCode = ok ? 0u : 0x80020000u, // Good or Bad_InternalError
                    Error = ok ? null : "MXAccess runtime reported write failure",
                });
            }
            catch (Exception ex)
            {
                results.Add(new WriteValueResult { TagReference = w.TagReference, StatusCode = 0x80020000u, Error = ex.Message });
            }
        }
        return new WriteValuesResponse { Results = results.ToArray() };
    }

    public async Task<SubscribeResponse> SubscribeAsync(SubscribeRequest req, CancellationToken ct)
    {
        var sid = Interlocked.Increment(ref _nextSubscriptionId);

        try
        {
            foreach (var tag in req.TagReferences)
            {
                _refToSubs.AddOrUpdate(tag,
                    _ => new System.Collections.Concurrent.ConcurrentBag<long> { sid },
                    (_, bag) => { bag.Add(sid); return bag; });

                // The MXAccess SubscribeAsync only takes one callback per tag; the same callback
                // fires for every active subscription of that tag — we fan out by SubscriptionId.
                await _mx.SubscribeAsync(tag, OnTagValueChanged);
            }

            _subs[sid] = req.TagReferences;
            return new SubscribeResponse { Success = true, SubscriptionId = sid, ActualIntervalMs = req.RequestedIntervalMs };
        }
        catch (Exception ex)
        {
            return new SubscribeResponse { Success = false, Error = ex.Message };
        }
    }

    public async Task UnsubscribeAsync(UnsubscribeRequest req, CancellationToken ct)
    {
        if (!_subs.TryRemove(req.SubscriptionId, out var refs)) return;
        foreach (var r in refs)
        {
            // Drop this subscription from the reverse map; only unsubscribe from MXAccess if no
            // other subscription is still listening (multiple Proxy subs may share a tag).
            _refToSubs.TryGetValue(r, out var bag);
            if (bag is not null)
            {
                var remaining = new System.Collections.Concurrent.ConcurrentBag<long>(
                    bag.Where(id => id != req.SubscriptionId));
                if (remaining.IsEmpty)
                {
                    _refToSubs.TryRemove(r, out _);
                    await _mx.UnsubscribeAsync(r);
                }
                else
                {
                    _refToSubs[r] = remaining;
                }
            }
        }
    }

    /// <summary>
    /// Fires for every value change on any subscribed Galaxy attribute. Wraps the value in
    /// a <see cref="GalaxyDataValue"/> and raises <see cref="OnDataChange"/> once per
    /// subscription that includes this tag — the IPC sink translates that into outbound
    /// <c>OnDataChangeNotification</c> frames.
    /// </summary>
    private void OnTagValueChanged(string fullReference, MxAccess.Vtq vtq)
    {
        if (!_refToSubs.TryGetValue(fullReference, out var bag) || bag.IsEmpty) return;

        var wireValue = ToWire(fullReference, vtq);
        // Emit one notification per active SubscriptionId for this tag — the Proxy fans out to
        // each ISubscribable consumer based on the SubscriptionId in the payload.
        foreach (var sid in bag.Distinct())
        {
            OnDataChange?.Invoke(this, new OnDataChangeNotification
            {
                SubscriptionId = sid,
                Values = new[] { wireValue },
            });
        }
    }

    /// <summary>
    /// PR 14: advise every alarm-bearing attribute's 4-attr quartet. Best-effort per-alarm —
    /// a subscribe failure on one alarm doesn't abort the whole call, since operators prefer
    /// partial alarm coverage to none. Idempotent on repeat calls (tracker internally
    /// skips already-tracked alarms).
    /// </summary>
    public async Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct)
    {
        foreach (var tag in _discoveredAlarmTags)
        {
            try { await _alarmTracker.TrackAsync(tag).ConfigureAwait(false); }
            catch { /* swallow per-alarm — tracker rolls back its own state on failure */ }
        }
    }

    /// <summary>
    /// PR 14: route operator ack through the tracker's AckMsg write path. EventId on the
    /// incoming request maps directly to the alarm full reference (Proxy-side naming
    /// convention from GalaxyProxyDriver.RaiseAlarmEvent → ev.EventId).
    /// </summary>
    public async Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct)
    {
        // EventId carries a per-transition Guid.ToString("N"); there's no reverse map from
        // event id to alarm tag yet, so v1's convention (ack targets the condition) is matched
        // by reading the alarm name from the Comment envelope: v1 packed "<tag>|<comment>".
        // Until the Proxy is updated to send the alarm tag separately, fall back to treating
        // the EventId as the alarm tag — Client CLI passes it through unchanged.
        var tag = req.EventId;
        if (!string.IsNullOrWhiteSpace(tag))
        {
            try { await _alarmTracker.AcknowledgeAsync(tag, req.Comment ?? string.Empty).ConfigureAwait(false); }
            catch { /* swallow — ack failures surface via MxAccessClient.WriteAsync logs */ }
        }
    }

    public async Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct)
    {
        if (_historian is null)
            return new HistoryReadResponse
            {
                Success = false,
                Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
                Tags = Array.Empty<HistoryTagValues>(),
            };

        var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
        var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;
        var tags = new List<HistoryTagValues>(req.TagReferences.Length);

        try
        {
            foreach (var reference in req.TagReferences)
            {
                var samples = await _historian.ReadRawAsync(reference, start, end, (int)req.MaxValuesPerTag, ct).ConfigureAwait(false);
                tags.Add(new HistoryTagValues
                {
                    TagReference = reference,
                    Values = samples.Select(s => ToWire(reference, s)).ToArray(),
                });
            }
            return new HistoryReadResponse { Success = true, Tags = tags.ToArray() };
        }
        catch (OperationCanceledException) { throw; }
        catch (Exception ex)
        {
            return new HistoryReadResponse
            {
                Success = false,
                Error = $"Historian read failed: {ex.Message}",
                Tags = tags.ToArray(),
            };
        }
    }

    public async Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
        HistoryReadProcessedRequest req, CancellationToken ct)
    {
        if (_historian is null)
            return new HistoryReadProcessedResponse
            {
                Success = false,
                Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
                Values = Array.Empty<GalaxyDataValue>(),
            };

        if (req.IntervalMs <= 0)
            return new HistoryReadProcessedResponse
            {
                Success = false,
                Error = "HistoryReadProcessed requires IntervalMs > 0",
                Values = Array.Empty<GalaxyDataValue>(),
            };

        var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
        var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;

        try
        {
            var samples = await _historian.ReadAggregateAsync(
                req.TagReference, start, end, req.IntervalMs, req.AggregateColumn, ct).ConfigureAwait(false);

            var wire = samples.Select(s => ToWire(req.TagReference, s)).ToArray();
            return new HistoryReadProcessedResponse { Success = true, Values = wire };
        }
        catch (OperationCanceledException) { throw; }
        catch (Exception ex)
        {
            return new HistoryReadProcessedResponse
            {
                Success = false,
                Error = $"Historian aggregate read failed: {ex.Message}",
                Values = Array.Empty<GalaxyDataValue>(),
            };
        }
    }

    public async Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
        HistoryReadAtTimeRequest req, CancellationToken ct)
    {
        if (_historian is null)
            return new HistoryReadAtTimeResponse
            {
                Success = false,
                Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
                Values = Array.Empty<GalaxyDataValue>(),
            };

        if (req.TimestampsUtcUnixMs.Length == 0)
            return new HistoryReadAtTimeResponse { Success = true, Values = Array.Empty<GalaxyDataValue>() };

        var timestamps = req.TimestampsUtcUnixMs
            .Select(ms => DateTimeOffset.FromUnixTimeMilliseconds(ms).UtcDateTime)
            .ToArray();

        try
        {
            var samples = await _historian.ReadAtTimeAsync(req.TagReference, timestamps, ct).ConfigureAwait(false);
            var wire = samples.Select(s => ToWire(req.TagReference, s)).ToArray();
            return new HistoryReadAtTimeResponse { Success = true, Values = wire };
        }
        catch (OperationCanceledException) { throw; }
        catch (Exception ex)
        {
            return new HistoryReadAtTimeResponse
            {
                Success = false,
                Error = $"Historian at-time read failed: {ex.Message}",
                Values = Array.Empty<GalaxyDataValue>(),
            };
        }
    }

    public async Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
        HistoryReadEventsRequest req, CancellationToken ct)
    {
        if (_historian is null)
            return new HistoryReadEventsResponse
            {
                Success = false,
                Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
                Events = Array.Empty<GalaxyHistoricalEvent>(),
            };

        var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
        var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;

        try
        {
            var events = await _historian.ReadEventsAsync(req.SourceName, start, end, req.MaxEvents, ct).ConfigureAwait(false);
            var wire = events.Select(e => new GalaxyHistoricalEvent
            {
                EventId = e.Id.ToString(),
                SourceName = e.Source,
                EventTimeUtcUnixMs = new DateTimeOffset(DateTime.SpecifyKind(e.EventTime, DateTimeKind.Utc), TimeSpan.Zero).ToUnixTimeMilliseconds(),
                ReceivedTimeUtcUnixMs = new DateTimeOffset(DateTime.SpecifyKind(e.ReceivedTime, DateTimeKind.Utc), TimeSpan.Zero).ToUnixTimeMilliseconds(),
                DisplayText = e.DisplayText,
                Severity = e.Severity,
            }).ToArray();
            return new HistoryReadEventsResponse { Success = true, Events = wire };
        }
        catch (OperationCanceledException) { throw; }
        catch (Exception ex)
        {
            return new HistoryReadEventsResponse
            {
                Success = false,
                Error = $"Historian event read failed: {ex.Message}",
                Events = Array.Empty<GalaxyHistoricalEvent>(),
            };
        }
    }

    public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
        => Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 });

    public void Dispose()
    {
        _alarmTracker.TransitionRaised -= _onAlarmTransition;
        _alarmTracker.Dispose();
        _probeManager.StateChanged -= _onProbeStateChanged;
        _probeManager.Dispose();
        _mx.ConnectionStateChanged -= _onConnectionStateChanged;
        _historian?.Dispose();
    }

    private static GalaxyDataValue ToWire(string reference, Vtq vtq) => new()
    {
        TagReference = reference,
        ValueBytes = vtq.Value is null ? null : MessagePackSerializer.Serialize(vtq.Value),
        ValueMessagePackType = 0,
        StatusCode = vtq.Quality >= 192 ? 0u : 0x40000000u, // Good vs Uncertain placeholder
        SourceTimestampUtcUnixMs = new DateTimeOffset(vtq.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
        ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
    };

    /// <summary>
    /// Maps a <see cref="HistorianSample"/> (raw historian row, OPC-UA-free) to the IPC wire
    /// shape. The Proxy decodes the MessagePack value and maps <see cref="HistorianSample.Quality"/>
    /// through <c>QualityMapper</c> on its side of the pipe — we keep the raw byte here so
    /// rich OPC DA status codes (e.g. <c>BadNotConnected</c>, <c>UncertainSubNormal</c>) survive
    /// the hop intact.
    /// </summary>
    private static GalaxyDataValue ToWire(string reference, HistorianSample sample) => new()
    {
        TagReference = reference,
        ValueBytes = sample.Value is null ? null : MessagePackSerializer.Serialize(sample.Value),
        ValueMessagePackType = 0,
        StatusCode = HistorianQualityMapper.Map(sample.Quality),
        SourceTimestampUtcUnixMs = new DateTimeOffset(sample.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
        ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
    };

    /// <summary>
    /// Maps a <see cref="HistorianAggregateSample"/> (one aggregate bucket) to the IPC wire
    /// shape. A null <see cref="HistorianAggregateSample.Value"/> means the aggregate was
    /// unavailable for the bucket — the Proxy translates that to OPC UA <c>BadNoData</c>.
    /// </summary>
    private static GalaxyDataValue ToWire(string reference, HistorianAggregateSample sample) => new()
    {
        TagReference = reference,
        ValueBytes = sample.Value is null ? null : MessagePackSerializer.Serialize(sample.Value.Value),
        ValueMessagePackType = 0,
        StatusCode = sample.Value is null ? 0x800E0000u /* BadNoData */ : 0x00000000u,
        SourceTimestampUtcUnixMs = new DateTimeOffset(sample.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
        ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
    };

    private static GalaxyAttributeInfo MapAttribute(GalaxyAttributeRow row) => new()
    {
        AttributeName = row.AttributeName,
        MxDataType = row.MxDataType,
        IsArray = row.IsArray,
        ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null,
        SecurityClassification = row.SecurityClassification,
        IsHistorized = row.IsHistorized,
        IsAlarm = row.IsAlarm,
    };

    private static string MapCategory(int categoryId) => categoryId switch
    {
        1 => "$WinPlatform",
        3 => "$AppEngine",
        4 => "$Area",
        10 => "$UserDefined",
        11 => "$ApplicationObject",
        13 => "$Area",
        17 => "$DeviceIntegration",
        24 => "$ViewEngine",
        26 => "$ViewApp",
        _ => $"category-{categoryId}",
    };
}
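
Reviewer note on the subscription fan-out above: `SubscribeAsync` and `UnsubscribeAsync` keep a reverse map from tag reference to subscription ids, and only unadvise a tag once no subscription still references it. The removed code rebuilds a `ConcurrentBag<long>` on every unsubscribe, which leaves a window where a concurrent subscribe to the same tag could be lost. The following is a minimal sketch of the same bookkeeping behind a single lock. `SubscriptionIndex` and its members are hypothetical names used only for illustration and are not part of this repository. Whether `MxAccessClient.SubscribeAsync` tolerates repeated calls for the same tag is not visible in this diff, so the sketch only reports first-subscriber and last-subscriber transitions and leaves the MXAccess calls to the caller.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not in the repository): a lock-guarded reverse map from
// tag reference to the subscription ids that currently include that tag.
public sealed class SubscriptionIndex
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, HashSet<long>> _refToSubs =
        new Dictionary<string, HashSet<long>>(StringComparer.OrdinalIgnoreCase);

    // Returns true when this is the first subscription for the tag,
    // i.e. the point at which a caller would advise the tag.
    public bool Add(string tag, long subscriptionId)
    {
        lock (_gate)
        {
            if (!_refToSubs.TryGetValue(tag, out var ids))
            {
                ids = new HashSet<long>();
                _refToSubs[tag] = ids;
            }
            var wasEmpty = ids.Count == 0;
            ids.Add(subscriptionId);
            return wasEmpty;
        }
    }

    // Returns true when the last subscription for the tag was removed,
    // i.e. the point at which a caller would unadvise the tag.
    public bool Remove(string tag, long subscriptionId)
    {
        lock (_gate)
        {
            if (!_refToSubs.TryGetValue(tag, out var ids)) return false;
            if (!ids.Remove(subscriptionId)) return false;
            if (ids.Count > 0) return false;
            _refToSubs.Remove(tag);
            return true;
        }
    }

    // Copy of the subscription ids to notify for a value change on the tag.
    public long[] Snapshot(string tag)
    {
        lock (_gate)
        {
            if (!_refToSubs.TryGetValue(tag, out var ids)) return Array.Empty<long>();
            var copy = new long[ids.Count];
            ids.CopyTo(copy);
            return copy;
        }
    }
}
```

A value-change callback would call `Snapshot(fullReference)` and raise one notification per returned id, which also removes the need for the `Distinct()` pass since a `HashSet<long>` cannot hold duplicates.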
-273
@@ -1,273 +0,0 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Stability;

/// <summary>
/// Per-platform + per-AppEngine runtime probe. Subscribes to <c>&lt;TagName&gt;.ScanState</c>
/// for each $WinPlatform and $AppEngine gobject, tracks Unknown → Running → Stopped
/// transitions, and fires <see cref="StateChanged"/> so <see cref="Backend.MxAccessGalaxyBackend"/>
/// can forward per-host events through the existing IPC <c>OnHostStatusChanged</c> event.
/// Pure-logic state machine with an injected clock so it's deterministically testable —
/// port of v1 <c>GalaxyRuntimeProbeManager</c> without the OPC UA node-manager coupling.
/// </summary>
/// <remarks>
/// State machine rules (documented in v1's <c>runtimestatus.md</c> and preserved here):
/// <list type="bullet">
/// <item><c>ScanState</c> is on-change-only — a stably-Running host may go hours without a
/// callback. Running → Stopped is driven by an explicit <c>ScanState=false</c> callback,
/// never by starvation.</item>
/// <item>Unknown → Running is a startup transition and does NOT fire StateChanged (would
/// paint every host as "just recovered" at startup, which is noise).</item>
/// <item>Stopped → Running and Running → Stopped fire StateChanged. Unknown → Stopped
/// fires StateChanged because that's a first-known-bad signal operators need.</item>
/// <item>All public methods are thread-safe. Callbacks fire outside the internal lock to
/// avoid lock inversion with caller-owned state.</item>
/// </list>
/// </remarks>
public sealed class GalaxyRuntimeProbeManager : IDisposable
{
    public const int CategoryWinPlatform = 1;
    public const int CategoryAppEngine = 3;
    public const string ProbeAttribute = ".ScanState";

    private readonly Func<DateTime> _clock;
    private readonly Func<string, Action<string, Vtq>, Task> _subscribe;
    private readonly Func<string, Task> _unsubscribe;
    private readonly object _lock = new();

    // probe tag → per-host state
    private readonly Dictionary<string, HostProbeState> _byProbe = new(StringComparer.OrdinalIgnoreCase);
    // tag name → probe tag (for reverse lookup on the desired-set diff)
    private readonly Dictionary<string, string> _probeByTagName = new(StringComparer.OrdinalIgnoreCase);
    private bool _disposed;

    /// <summary>
    /// Fires on every state transition that operators should react to. See class remarks
    /// for the rules on which transitions fire.
    /// </summary>
    public event EventHandler<HostStateTransition>? StateChanged;

    public GalaxyRuntimeProbeManager(
        Func<string, Action<string, Vtq>, Task> subscribe,
        Func<string, Task> unsubscribe)
        : this(subscribe, unsubscribe, () => DateTime.UtcNow) { }

    internal GalaxyRuntimeProbeManager(
        Func<string, Action<string, Vtq>, Task> subscribe,
        Func<string, Task> unsubscribe,
        Func<DateTime> clock)
    {
        _subscribe = subscribe ?? throw new ArgumentNullException(nameof(subscribe));
        _unsubscribe = unsubscribe ?? throw new ArgumentNullException(nameof(unsubscribe));
        _clock = clock ?? throw new ArgumentNullException(nameof(clock));
    }

    /// <summary>Number of probes currently advised. Test/dashboard hook.</summary>
    public int ActiveProbeCount
    {
        get { lock (_lock) return _byProbe.Count; }
    }

    /// <summary>
    /// Snapshot every currently-tracked host's state. One entry per probe.
    /// </summary>
    public IReadOnlyList<HostProbeSnapshot> SnapshotStates()
    {
        lock (_lock)
        {
            return _byProbe.Select(kv => new HostProbeSnapshot(
                TagName: kv.Value.TagName,
                State: kv.Value.State,
                LastChangedUtc: kv.Value.LastStateChangeUtc)).ToList();
        }
    }

    /// <summary>
    /// Query the current runtime state for <paramref name="tagName"/>. Returns
    /// <see cref="HostRuntimeState.Unknown"/> when the host is not tracked.
    /// </summary>
    public HostRuntimeState GetState(string tagName)
    {
        lock (_lock)
        {
            if (_probeByTagName.TryGetValue(tagName, out var probe)
                && _byProbe.TryGetValue(probe, out var state))
                return state.State;
            return HostRuntimeState.Unknown;
        }
    }

    /// <summary>
    /// Diff the desired host set (filtered $WinPlatform / $AppEngine from the latest Discover)
    /// against the currently-tracked set and advise / unadvise as needed. Idempotent:
    /// calling twice with the same set does nothing.
    /// </summary>
    public async Task SyncAsync(IEnumerable<HostProbeTarget> desiredHosts)
    {
        if (_disposed) return;

        var desired = desiredHosts
            .Where(h => !string.IsNullOrWhiteSpace(h.TagName))
            .ToDictionary(h => h.TagName, StringComparer.OrdinalIgnoreCase);

        List<string> toAdvise;
        List<string> toUnadvise;
        lock (_lock)
        {
            toAdvise = desired.Keys
                .Where(tag => !_probeByTagName.ContainsKey(tag))
                .ToList();
            toUnadvise = _probeByTagName.Keys
                .Where(tag => !desired.ContainsKey(tag))
                .Select(tag => _probeByTagName[tag])
                .ToList();

            foreach (var tag in toAdvise)
            {
                var probe = tag + ProbeAttribute;
                _probeByTagName[tag] = probe;
                _byProbe[probe] = new HostProbeState
                {
                    TagName = tag,
                    State = HostRuntimeState.Unknown,
                    LastStateChangeUtc = _clock(),
                };
            }

            foreach (var probe in toUnadvise)
            {
                _byProbe.Remove(probe);
            }

            foreach (var removedTag in _probeByTagName.Keys.Where(t => !desired.ContainsKey(t)).ToList())
            {
                _probeByTagName.Remove(removedTag);
            }
        }

        foreach (var tag in toAdvise)
        {
            var probe = tag + ProbeAttribute;
            try
            {
                await _subscribe(probe, OnProbeCallback);
            }
            catch
            {
                // Rollback on subscribe failure so a later Tick can't transition a never-advised
                // probe into a false Stopped state. Callers can re-Sync later to retry.
                lock (_lock)
                {
                    _byProbe.Remove(probe);
                    _probeByTagName.Remove(tag);
                }
            }
        }

        foreach (var probe in toUnadvise)
        {
            try { await _unsubscribe(probe); } catch { /* best-effort cleanup */ }
        }
    }

    /// <summary>
    /// Public entry point for tests and internal callbacks. Production flow: MxAccessClient's
    /// SubscribeAsync delivers VTQ updates through the callback wired in <see cref="SyncAsync"/>,
    /// which calls this method under the lock to update state and fires
    /// <see cref="StateChanged"/> outside the lock for any transition that matters.
    /// </summary>
    public void OnProbeCallback(string probeTag, Vtq vtq)
    {
        if (_disposed) return;

        HostStateTransition? transition = null;
        lock (_lock)
        {
            if (!_byProbe.TryGetValue(probeTag, out var state)) return;

            var isRunning = vtq.Quality >= 192 && vtq.Value is bool b && b;
            var now = _clock();
            var previous = state.State;
            state.LastCallbackUtc = now;

            if (isRunning)
            {
                state.GoodUpdateCount++;
                if (previous != HostRuntimeState.Running)
                {
                    state.State = HostRuntimeState.Running;
                    state.LastStateChangeUtc = now;
                    if (previous == HostRuntimeState.Stopped)
                    {
                        transition = new HostStateTransition(state.TagName, previous, HostRuntimeState.Running, now);
                    }
                }
            }
            else
            {
                state.FailureCount++;
                if (previous != HostRuntimeState.Stopped)
                {
                    state.State = HostRuntimeState.Stopped;
                    state.LastStateChangeUtc = now;
                    transition = new HostStateTransition(state.TagName, previous, HostRuntimeState.Stopped, now);
                }
            }
        }

        if (transition is { } t)
        {
            StateChanged?.Invoke(this, t);
        }
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        lock (_lock)
        {
            _byProbe.Clear();
            _probeByTagName.Clear();
        }
    }

    private sealed class HostProbeState
    {
        public string TagName { get; set; } = "";
        public HostRuntimeState State { get; set; }
        public DateTime LastStateChangeUtc { get; set; }
        public DateTime? LastCallbackUtc { get; set; }
        public long GoodUpdateCount { get; set; }
        public long FailureCount { get; set; }
    }
}

public enum HostRuntimeState
|
||||
{
|
||||
Unknown,
|
||||
Running,
|
||||
Stopped,
|
||||
}
|
||||
|
||||
public sealed record HostStateTransition(
|
||||
string TagName,
|
||||
HostRuntimeState OldState,
|
||||
HostRuntimeState NewState,
|
||||
DateTime AtUtc);
|
||||
|
||||
public sealed record HostProbeSnapshot(
|
||||
string TagName,
|
||||
HostRuntimeState State,
|
||||
DateTime LastChangedUtc);
|
||||
|
||||
public readonly record struct HostProbeTarget(string TagName, int CategoryId)
|
||||
{
|
||||
public bool IsRuntimeHost =>
|
||||
CategoryId == GalaxyRuntimeProbeManager.CategoryWinPlatform
|
||||
|| CategoryId == GalaxyRuntimeProbeManager.CategoryAppEngine;
|
||||
}
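The transition rule inside `OnProbeCallback` above is easy to misread, so here is a minimal standalone sketch of it. The names `State`, `ProbeRule`, and `Apply` are hypothetical and exist only for this illustration; they are not part of the repository. The sketch assumes the rule is exactly what the removed code shows: a move to Running raises an event only when the previous state was Stopped, while a move to Stopped raises from any other state, including Unknown.

```csharp
using System;

// Hypothetical, trimmed-down model of the rule in OnProbeCallback (illustration only).
enum State { Unknown, Running, Stopped }

static class ProbeRule
{
    // Returns the next state and whether a StateChanged event would be raised.
    public static (State Next, bool Raise) Apply(State previous, bool isRunning)
    {
        if (isRunning)
        {
            // Only Stopped -> Running raises; Unknown -> Running is silent.
            return (State.Running, previous == State.Stopped);
        }

        // Any state other than Stopped moving to Stopped raises, including Unknown.
        return (State.Stopped, previous != State.Stopped);
    }
}

static class ProbeRuleDemo
{
    static void Main()
    {
        foreach (State previous in Enum.GetValues(typeof(State)))
        {
            foreach (var isRunning in new[] { true, false })
            {
                var (next, raise) = ProbeRule.Apply(previous, isRunning);
                Console.WriteLine($"{previous} + isRunning={isRunning} -> {next}, raise={raise}");
            }
        }
    }
}
```

One consequence worth noting: the first good callback after startup (Unknown to Running) produces no event, so a consumer that wants the initial state presumably has to read a snapshot rather than wait for `StateChanged`.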
@@ -1,121 +0,0 @@
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;

/// <summary>
/// Phase 2 placeholder backend — accepts session open/close + responds to recycle, returns
/// "not-implemented" results for every data-plane call. Replaced by the lifted
/// <c>MxAccessClient</c>-backed implementation during the deferred Galaxy code move
/// (Task B.1 + parity gate). Keeps the IPC end-to-end testable today.
/// </summary>
public sealed class StubGalaxyBackend : IGalaxyBackend
{
    private long _nextSessionId;
    private long _nextSubscriptionId;

    // Stub backend never raises events — implements the interface members for symmetry.
#pragma warning disable CS0067
    public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
    public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
    public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
#pragma warning restore CS0067

    public Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
    {
        var id = Interlocked.Increment(ref _nextSessionId);
        return Task.FromResult(new OpenSessionResponse { Success = true, SessionId = id });
    }

    public Task CloseSessionAsync(CloseSessionRequest req, CancellationToken ct) => Task.CompletedTask;

    public Task<DiscoverHierarchyResponse> DiscoverAsync(DiscoverHierarchyRequest req, CancellationToken ct)
        => Task.FromResult(new DiscoverHierarchyResponse
        {
            Success = false,
            Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
            Objects = System.Array.Empty<GalaxyObjectInfo>(),
        });

    public Task<ReadValuesResponse> ReadValuesAsync(ReadValuesRequest req, CancellationToken ct)
        => Task.FromResult(new ReadValuesResponse
        {
            Success = false,
            Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
            Values = System.Array.Empty<GalaxyDataValue>(),
        });

    public Task<WriteValuesResponse> WriteValuesAsync(WriteValuesRequest req, CancellationToken ct)
    {
        var results = new WriteValueResult[req.Writes.Length];
        for (var i = 0; i < req.Writes.Length; i++)
        {
            results[i] = new WriteValueResult
            {
                TagReference = req.Writes[i].TagReference,
                StatusCode = 0x80020000u, // Bad_InternalError
                Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
            };
        }
        return Task.FromResult(new WriteValuesResponse { Results = results });
    }

    public Task<SubscribeResponse> SubscribeAsync(SubscribeRequest req, CancellationToken ct)
    {
        var sid = Interlocked.Increment(ref _nextSubscriptionId);
        return Task.FromResult(new SubscribeResponse
        {
            Success = true,
            SubscriptionId = sid,
            ActualIntervalMs = req.RequestedIntervalMs,
        });
    }

    public Task UnsubscribeAsync(UnsubscribeRequest req, CancellationToken ct) => Task.CompletedTask;

    public Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct) => Task.CompletedTask;
    public Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct) => Task.CompletedTask;

    public Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct)
        => Task.FromResult(new HistoryReadResponse
        {
            Success = false,
            Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
            Tags = System.Array.Empty<HistoryTagValues>(),
        });

    public Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
        HistoryReadProcessedRequest req, CancellationToken ct)
        => Task.FromResult(new HistoryReadProcessedResponse
        {
            Success = false,
            Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
            Values = System.Array.Empty<GalaxyDataValue>(),
        });

    public Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
        HistoryReadAtTimeRequest req, CancellationToken ct)
        => Task.FromResult(new HistoryReadAtTimeResponse
        {
            Success = false,
            Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
            Values = System.Array.Empty<GalaxyDataValue>(),
        });

    public Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
        HistoryReadEventsRequest req, CancellationToken ct)
        => Task.FromResult(new HistoryReadEventsResponse
        {
            Success = false,
            Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
            Events = System.Array.Empty<GalaxyHistoricalEvent>(),
        });

    public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
        => Task.FromResult(new RecycleStatusResponse
        {
            Accepted = true,
            GraceSeconds = 15, // matches Phase 2 plan §B.8 default
        });
}
@@ -1,183 +0,0 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;

/// <summary>
/// Real IPC dispatcher — routes each <see cref="MessageKind"/> to the matching
/// <see cref="IGalaxyBackend"/> method. Replaces <see cref="StubFrameHandler"/>. Heartbeat
/// stays handled inline so liveness detection works regardless of backend health.
/// </summary>
public sealed class GalaxyFrameHandler(IGalaxyBackend backend, ILogger logger) : IFrameHandler
{
    public async Task HandleAsync(MessageKind kind, byte[] body, FrameWriter writer, CancellationToken ct)
    {
        try
        {
            switch (kind)
            {
                case MessageKind.Heartbeat:
                {
                    var hb = Deserialize<Heartbeat>(body);
                    await writer.WriteAsync(MessageKind.HeartbeatAck,
                        new HeartbeatAck { SequenceNumber = hb.SequenceNumber, UtcUnixMs = hb.UtcUnixMs }, ct);
                    return;
                }
                case MessageKind.OpenSessionRequest:
                {
                    var resp = await backend.OpenSessionAsync(Deserialize<OpenSessionRequest>(body), ct);
                    await writer.WriteAsync(MessageKind.OpenSessionResponse, resp, ct);
                    return;
                }
                case MessageKind.CloseSessionRequest:
                    await backend.CloseSessionAsync(Deserialize<CloseSessionRequest>(body), ct);
                    return; // one-way

                case MessageKind.DiscoverHierarchyRequest:
                {
                    var resp = await backend.DiscoverAsync(Deserialize<DiscoverHierarchyRequest>(body), ct);
                    await writer.WriteAsync(MessageKind.DiscoverHierarchyResponse, resp, ct);
                    return;
                }
                case MessageKind.ReadValuesRequest:
                {
                    var resp = await backend.ReadValuesAsync(Deserialize<ReadValuesRequest>(body), ct);
                    await writer.WriteAsync(MessageKind.ReadValuesResponse, resp, ct);
                    return;
                }
                case MessageKind.WriteValuesRequest:
                {
                    var resp = await backend.WriteValuesAsync(Deserialize<WriteValuesRequest>(body), ct);
                    await writer.WriteAsync(MessageKind.WriteValuesResponse, resp, ct);
                    return;
                }
                case MessageKind.SubscribeRequest:
                {
                    var resp = await backend.SubscribeAsync(Deserialize<SubscribeRequest>(body), ct);
                    await writer.WriteAsync(MessageKind.SubscribeResponse, resp, ct);
                    return;
                }
                case MessageKind.UnsubscribeRequest:
                    await backend.UnsubscribeAsync(Deserialize<UnsubscribeRequest>(body), ct);
                    return; // one-way

                case MessageKind.AlarmSubscribeRequest:
                    await backend.SubscribeAlarmsAsync(Deserialize<AlarmSubscribeRequest>(body), ct);
                    return; // one-way; subsequent alarm events are server-pushed
                case MessageKind.AlarmAckRequest:
                    await backend.AcknowledgeAlarmAsync(Deserialize<AlarmAckRequest>(body), ct);
                    return;

                case MessageKind.HistoryReadRequest:
                {
                    var resp = await backend.HistoryReadAsync(Deserialize<HistoryReadRequest>(body), ct);
                    await writer.WriteAsync(MessageKind.HistoryReadResponse, resp, ct);
                    return;
                }
                case MessageKind.HistoryReadProcessedRequest:
                {
                    var resp = await backend.HistoryReadProcessedAsync(
                        Deserialize<HistoryReadProcessedRequest>(body), ct);
                    await writer.WriteAsync(MessageKind.HistoryReadProcessedResponse, resp, ct);
                    return;
                }
                case MessageKind.HistoryReadAtTimeRequest:
                {
                    var resp = await backend.HistoryReadAtTimeAsync(
                        Deserialize<HistoryReadAtTimeRequest>(body), ct);
                    await writer.WriteAsync(MessageKind.HistoryReadAtTimeResponse, resp, ct);
                    return;
                }
                case MessageKind.HistoryReadEventsRequest:
                {
                    var resp = await backend.HistoryReadEventsAsync(
                        Deserialize<HistoryReadEventsRequest>(body), ct);
                    await writer.WriteAsync(MessageKind.HistoryReadEventsResponse, resp, ct);
                    return;
                }
                case MessageKind.RecycleHostRequest:
                {
                    var resp = await backend.RecycleAsync(Deserialize<RecycleHostRequest>(body), ct);
                    await writer.WriteAsync(MessageKind.RecycleStatusResponse, resp, ct);
                    return;
                }
                default:
                    await SendErrorAsync(writer, "unknown-kind", $"Frame kind {kind} not handled by Host", ct);
                    return;
            }
        }
        catch (OperationCanceledException) { throw; }
        catch (Exception ex)
        {
            logger.Error(ex, "GalaxyFrameHandler threw on {Kind}", kind);
            await SendErrorAsync(writer, "handler-exception", ex.Message, ct);
        }
    }

    /// <summary>
    /// Subscribes the backend's server-pushed events for the lifetime of the connection.
    /// The returned disposable unsubscribes when the connection closes. Without it, the
    /// backend's event invocation list would accumulate dead writer references, leak
    /// memory, and raise <see cref="ObjectDisposedException"/> on every push.
    /// </summary>
    public IDisposable AttachConnection(FrameWriter writer)
    {
        var sink = new ConnectionSink(backend, writer, logger);
        sink.Attach();
        return sink;
    }

    private static T Deserialize<T>(byte[] body) => MessagePackSerializer.Deserialize<T>(body);

    private static Task SendErrorAsync(FrameWriter writer, string code, string message, CancellationToken ct)
        => writer.WriteAsync(MessageKind.ErrorResponse,
            new ErrorResponse { Code = code, Message = message }, ct);

    private sealed class ConnectionSink : IDisposable
    {
        private readonly IGalaxyBackend _backend;
        private readonly FrameWriter _writer;
        private readonly ILogger _logger;
        private EventHandler<OnDataChangeNotification>? _onData;
        private EventHandler<GalaxyAlarmEvent>? _onAlarm;
        private EventHandler<HostConnectivityStatus>? _onHost;

        public ConnectionSink(IGalaxyBackend backend, FrameWriter writer, ILogger logger)
        {
            _backend = backend; _writer = writer; _logger = logger;
        }

        public void Attach()
        {
            _onData = (_, e) => Push(MessageKind.OnDataChangeNotification, e);
            _onAlarm = (_, e) => Push(MessageKind.AlarmEvent, e);
            _onHost = (_, e) => Push(MessageKind.RuntimeStatusChange,
                new RuntimeStatusChangeNotification { Status = e });
            _backend.OnDataChange += _onData;
            _backend.OnAlarmEvent += _onAlarm;
            _backend.OnHostStatusChanged += _onHost;
        }

        private void Push<T>(MessageKind kind, T payload)
        {
            // Best-effort push: the write is waited on synchronously on the event-raising
            // thread, and failures are never propagated to the backend. Pushes can race with
            // disposal of the writer, so ObjectDisposedException is swallowed; the dispose
            // path will detach this sink shortly.
            try { _writer.WriteAsync(kind, payload, CancellationToken.None).GetAwaiter().GetResult(); }
            catch (ObjectDisposedException) { }
            catch (Exception ex) { _logger.Warning(ex, "ConnectionSink push failed for {Kind}", kind); }
        }

        public void Dispose()
        {
            if (_onData is not null) _backend.OnDataChange -= _onData;
            if (_onAlarm is not null) _backend.OnAlarmEvent -= _onAlarm;
            if (_onHost is not null) _backend.OnHostStatusChanged -= _onHost;
        }
    }
}
@@ -1,45 +0,0 @@
using System;
using System.IO.Pipes;
using System.Security.AccessControl;
using System.Security.Principal;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;

/// <summary>
/// Builds the <see cref="PipeSecurity"/> required by <c>driver-stability.md §"IPC Security"</c>:
/// only the configured OtOpcUa server principal SID gets <c>ReadWrite | Synchronize</c>;
/// LocalSystem is explicitly denied. Any other authenticated user falls through to the
/// implicit deny.
/// </summary>
/// <remarks>
/// Earlier revisions also denied <c>BUILTIN\Administrators</c>, which broke live testing
/// on dev boxes where the allowed user (<c>dohertj2</c>) is also a member of the local
/// Administrators group — UAC's filtered token still carries the Admins SID as deny-only,
/// so the deny ACE fired even from non-elevated shells. The per-connection
/// <see cref="PipeServer.VerifyCaller"/> check already gates on the exact allowed SID,
/// which is the real authorization boundary, so the Admins deny added no defence in depth
/// in that topology.
/// </remarks>
public static class PipeAcl
{
    public static PipeSecurity Create(SecurityIdentifier allowedSid)
    {
        if (allowedSid is null) throw new ArgumentNullException(nameof(allowedSid));

        var security = new PipeSecurity();

        security.AddAccessRule(new PipeAccessRule(
            allowedSid,
            PipeAccessRights.ReadWrite | PipeAccessRights.Synchronize,
            AccessControlType.Allow));

        var localSystem = new SecurityIdentifier(WellKnownSidType.LocalSystemSid, null);
        if (allowedSid != localSystem)
            security.AddAccessRule(new PipeAccessRule(localSystem, PipeAccessRights.FullControl, AccessControlType.Deny));

        // Owner = allowed SID so the deny rules can't be removed without write-DACL rights.
        security.SetOwner(allowedSid);

        return security;
    }
}
@@ -1,179 +0,0 @@
using System;
using System.IO.Pipes;
using System.Security.Principal;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;

/// <summary>
/// Accepts one client connection at a time on a named pipe with the strict ACL from
/// <see cref="PipeAcl"/>. Verifies the peer SID and the per-process shared secret before any
/// RPC frame is accepted. Per <c>driver-stability.md §"IPC Security"</c>.
/// </summary>
public sealed class PipeServer : IDisposable
{
    private readonly string _pipeName;
    private readonly SecurityIdentifier _allowedSid;
    private readonly string _sharedSecret;
    private readonly ILogger _logger;
    private readonly CancellationTokenSource _cts = new();
    private NamedPipeServerStream? _current;

    public PipeServer(string pipeName, SecurityIdentifier allowedSid, string sharedSecret, ILogger logger)
    {
        _pipeName = pipeName ?? throw new ArgumentNullException(nameof(pipeName));
        _allowedSid = allowedSid ?? throw new ArgumentNullException(nameof(allowedSid));
        _sharedSecret = sharedSecret ?? throw new ArgumentNullException(nameof(sharedSecret));
        _logger = logger ?? throw new ArgumentNullException(nameof(logger));
    }

    /// <summary>
    /// Accepts one connection, performs Hello handshake, then dispatches frames to
    /// <paramref name="handler"/> until EOF or cancel. Returns when the client disconnects.
    /// </summary>
    public async Task RunOneConnectionAsync(IFrameHandler handler, CancellationToken ct)
    {
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(_cts.Token, ct);
        var acl = PipeAcl.Create(_allowedSid);

        // .NET Framework 4.8 uses the legacy constructor overload that takes a PipeSecurity directly.
        _current = new NamedPipeServerStream(
            _pipeName,
            PipeDirection.InOut,
            maxNumberOfServerInstances: 1,
            PipeTransmissionMode.Byte,
            PipeOptions.Asynchronous,
            inBufferSize: 64 * 1024,
            outBufferSize: 64 * 1024,
            pipeSecurity: acl);

        try
        {
            await _current.WaitForConnectionAsync(linked.Token).ConfigureAwait(false);

            using var reader = new FrameReader(_current, leaveOpen: true);
            using var writer = new FrameWriter(_current, leaveOpen: true);

            // First frame must be a Hello with the correct shared secret. Reading it before
            // the caller-SID impersonation check satisfies Windows' ERROR_CANNOT_IMPERSONATE
            // rule — ImpersonateNamedPipeClient fails until at least one frame has been read.
            var first = await reader.ReadFrameAsync(linked.Token).ConfigureAwait(false);
            if (first is null || first.Value.Kind != MessageKind.Hello)
            {
                _logger.Warning("IPC first frame was not Hello; dropping");
                return;
            }

            if (!VerifyCaller(_current, out var reason))
            {
                _logger.Warning("IPC caller rejected: {Reason}", reason);
                _current.Disconnect();
                return;
            }

            var hello = MessagePackSerializer.Deserialize<Hello>(first.Value.Body);
            if (!string.Equals(hello.SharedSecret, _sharedSecret, StringComparison.Ordinal))
            {
                await writer.WriteAsync(MessageKind.HelloAck,
                    new HelloAck { Accepted = false, RejectReason = "shared-secret-mismatch" },
                    linked.Token).ConfigureAwait(false);
                _logger.Warning("IPC Hello rejected: shared-secret-mismatch");
                return;
            }

            if (hello.ProtocolMajor != Hello.CurrentMajor)
            {
                await writer.WriteAsync(MessageKind.HelloAck,
                    new HelloAck { Accepted = false, RejectReason = $"major-version-mismatch-peer={hello.ProtocolMajor}-server={Hello.CurrentMajor}" },
                    linked.Token).ConfigureAwait(false);
                _logger.Warning("IPC Hello rejected: major mismatch peer={Peer} server={Server}",
                    hello.ProtocolMajor, Hello.CurrentMajor);
                return;
            }

            await writer.WriteAsync(MessageKind.HelloAck,
                new HelloAck { Accepted = true, HostName = Environment.MachineName },
                linked.Token).ConfigureAwait(false);

            using var attachment = handler.AttachConnection(writer);

            while (!linked.Token.IsCancellationRequested)
            {
                var frame = await reader.ReadFrameAsync(linked.Token).ConfigureAwait(false);
                if (frame is null) break;

                await handler.HandleAsync(frame.Value.Kind, frame.Value.Body, writer, linked.Token).ConfigureAwait(false);
            }
        }
        finally
        {
            _current.Dispose();
            _current = null;
        }
    }

    /// <summary>
    /// Runs the server continuously, handling one connection at a time. When a connection ends
    /// (clean or error), accepts the next.
    /// </summary>
    public async Task RunAsync(IFrameHandler handler, CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            try { await RunOneConnectionAsync(handler, ct).ConfigureAwait(false); }
            catch (OperationCanceledException) { break; }
            catch (Exception ex) { _logger.Error(ex, "IPC connection loop error — accepting next"); }
        }
    }

    private bool VerifyCaller(NamedPipeServerStream pipe, out string reason)
    {
        try
        {
            pipe.RunAsClient(() =>
            {
                using var wi = WindowsIdentity.GetCurrent();
                if (wi.User is null)
                    throw new InvalidOperationException("GetCurrent().User is null — cannot verify caller");
                if (wi.User != _allowedSid)
                    throw new UnauthorizedAccessException(
                        $"caller SID {wi.User.Value} does not match allowed {_allowedSid.Value}");
            });
            reason = string.Empty;
            return true;
        }
        catch (Exception ex) { reason = ex.Message; return false; }
    }

    public void Dispose()
    {
        _cts.Cancel();
        _current?.Dispose();
        _cts.Dispose();
    }
}

public interface IFrameHandler
{
    Task HandleAsync(MessageKind kind, byte[] body, FrameWriter writer, CancellationToken ct);

    /// <summary>
    /// Called once per accepted connection after the Hello handshake. Lets the handler
    /// attach server-pushed event sinks (data-change, alarm, host-status) to the
    /// connection's <paramref name="writer"/>. Returns an <see cref="IDisposable"/> the
    /// pipe server disposes when the connection closes — backends use it to unsubscribe.
    /// Implementations that don't push events can return <see cref="NoopAttachment"/>.
    /// </summary>
    IDisposable AttachConnection(FrameWriter writer);

    public sealed class NoopAttachment : IDisposable
    {
        public static readonly NoopAttachment Instance = new();
        public void Dispose() { }
    }
}
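As a reading aid for the handshake in `RunOneConnectionAsync` above, the following is a minimal sketch of the two Hello checks that end in a `HelloAck` rejection. `HelloCheckSketch` and its parameters are hypothetical stand-ins for the `Hello` fields and server settings; the sketch deliberately leaves out the earlier steps (first frame must be `Hello`, caller SID must match), which drop the connection without sending any `HelloAck`.

```csharp
#nullable enable
using System;

// Hypothetical helper, not part of the repository: models only the two Hello
// checks that produce a HelloAck rejection in RunOneConnectionAsync.
static class HelloCheckSketch
{
    // Returns null when the Hello would be accepted, otherwise the reject reason.
    public static string? RejectReason(string peerSecret, int peerMajor, string serverSecret, int serverMajor)
    {
        // The secret is compared first, with ordinal (case-sensitive) semantics.
        if (!string.Equals(peerSecret, serverSecret, StringComparison.Ordinal))
            return "shared-secret-mismatch";

        // Then the protocol major version must match exactly.
        if (peerMajor != serverMajor)
            return $"major-version-mismatch-peer={peerMajor}-server={serverMajor}";

        return null;
    }
}

static class HelloCheckDemo
{
    static void Main()
    {
        Console.WriteLine(HelloCheckSketch.RejectReason("s3cret", 1, "s3cret", 1) ?? "accepted");
        Console.WriteLine(HelloCheckSketch.RejectReason("S3CRET", 1, "s3cret", 1) ?? "accepted");
        Console.WriteLine(HelloCheckSketch.RejectReason("s3cret", 2, "s3cret", 1) ?? "accepted");
    }
}
```

Because the secret check runs first, a peer with both a wrong secret and a wrong major version only ever learns about the secret mismatch.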
@@ -1,33 +0,0 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;

/// <summary>
/// Placeholder handler that responds to the framed IPC with error responses. Replaced by the
/// real Galaxy-backed handler when the MXAccess code move (deferred) lands.
/// </summary>
public sealed class StubFrameHandler : IFrameHandler
{
    public Task HandleAsync(MessageKind kind, byte[] body, FrameWriter writer, CancellationToken ct)
    {
        // Minimal lifecycle: heartbeat ack keeps the supervisor's liveness detector happy even
        // while the data-plane is stubbed, so integration tests of the supervisor can run end-to-end.
        if (kind == MessageKind.Heartbeat)
        {
            var hb = MessagePackSerializer.Deserialize<Heartbeat>(body);
            return writer.WriteAsync(MessageKind.HeartbeatAck,
                new HeartbeatAck { SequenceNumber = hb.SequenceNumber, UtcUnixMs = hb.UtcUnixMs }, ct);
        }

        return writer.WriteAsync(MessageKind.ErrorResponse,
            new ErrorResponse { Code = "not-implemented", Message = $"Kind {kind} is stubbed — MXAccess lift deferred" },
            ct);
    }

    public IDisposable AttachConnection(FrameWriter writer) => IFrameHandler.NoopAttachment.Instance;
}
@@ -1,5 +0,0 @@
// Shim — .NET Framework 4.8 doesn't ship with IsExternalInit, required for init-only setters +
// positional records. Safe to add in our own namespace; the compiler accepts any type with this name.
namespace System.Runtime.CompilerServices;

internal static class IsExternalInit;
@@ -1,139 +0,0 @@
using System;
using System.Security.Principal;
using System.Threading;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host;

/// <summary>
/// Entry point for the <c>OtOpcUaGalaxyHost</c> Windows service / console host. Reads the
/// pipe name, allowed-SID, and shared secret from environment (passed by the supervisor at
/// spawn time per <c>driver-stability.md</c>).
/// </summary>
public static class Program
{
    public static int Main(string[] args)
    {
        Log.Logger = new LoggerConfiguration()
            .MinimumLevel.Information()
            .WriteTo.File(
                @"%ProgramData%\OtOpcUa\galaxy-host-.log".Replace("%ProgramData%", Environment.GetFolderPath(Environment.SpecialFolder.CommonApplicationData)),
                rollingInterval: RollingInterval.Day)
            .CreateLogger();

        try
        {
            var pipeName = Environment.GetEnvironmentVariable("OTOPCUA_GALAXY_PIPE") ?? "OtOpcUaGalaxy";
            var allowedSidValue = Environment.GetEnvironmentVariable("OTOPCUA_ALLOWED_SID")
                ?? throw new InvalidOperationException("OTOPCUA_ALLOWED_SID not set — supervisor must pass the server principal SID");
            var sharedSecret = Environment.GetEnvironmentVariable("OTOPCUA_GALAXY_SECRET")
                ?? throw new InvalidOperationException("OTOPCUA_GALAXY_SECRET not set — supervisor must pass the per-process secret at spawn time");

            var allowedSid = new SecurityIdentifier(allowedSidValue);

            using var server = new PipeServer(pipeName, allowedSid, sharedSecret, Log.Logger);
            using var cts = new CancellationTokenSource();
            Console.CancelKeyPress += (_, e) => { e.Cancel = true; cts.Cancel(); };

            Log.Information("OtOpcUaGalaxyHost starting — pipe={Pipe} allowedSid={Sid}", pipeName, allowedSidValue);

            // Backend selection — env var picks the implementation:
            // OTOPCUA_GALAXY_BACKEND=stub → StubGalaxyBackend (no Galaxy required)
            // OTOPCUA_GALAXY_BACKEND=db → DbBackedGalaxyBackend (Discover only, against ZB)
            // OTOPCUA_GALAXY_BACKEND=mxaccess → MxAccessGalaxyBackend (real COM + ZB; default)
            var backendKind = Environment.GetEnvironmentVariable("OTOPCUA_GALAXY_BACKEND")?.ToLowerInvariant() ?? "mxaccess";
            var zbConn = Environment.GetEnvironmentVariable("OTOPCUA_GALAXY_ZB_CONN")
                ?? "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;";
            var clientName = Environment.GetEnvironmentVariable("OTOPCUA_GALAXY_CLIENT_NAME") ?? "OtOpcUa-Galaxy.Host";

            IGalaxyBackend backend;
            StaPump? pump = null;
            MxAccessClient? mx = null;
            switch (backendKind)
            {
                case "stub":
                    backend = new StubGalaxyBackend();
                    break;
                case "db":
                    backend = new DbBackedGalaxyBackend(new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = zbConn }));
                    break;
                default: // mxaccess
                    pump = new StaPump("Galaxy.Sta");
                    pump.WaitForStartedAsync().GetAwaiter().GetResult();
                    mx = new MxAccessClient(pump, new MxProxyAdapter(), clientName);
                    var historian = BuildHistorianIfEnabled();
                    backend = new MxAccessGalaxyBackend(
                        new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = zbConn }),
                        mx,
                        historian);
                    break;
            }

            Log.Information("OtOpcUaGalaxyHost backend={Backend}", backendKind);
            var handler = new GalaxyFrameHandler(backend, Log.Logger);
            try { server.RunAsync(handler, cts.Token).GetAwaiter().GetResult(); }
            finally
            {
                (backend as IDisposable)?.Dispose();
                mx?.Dispose();
                pump?.Dispose();
            }

            Log.Information("OtOpcUaGalaxyHost stopped cleanly");
            return 0;
        }
        catch (Exception ex)
        {
            Log.Fatal(ex, "OtOpcUaGalaxyHost fatal");
            return 2;
        }
        finally { Log.CloseAndFlush(); }
    }

    /// <summary>
    /// Builds a <see cref="HistorianDataSource"/> from the OTOPCUA_HISTORIAN_* environment
    /// variables the supervisor passes at spawn time. Returns null when the historian is
    /// disabled (default) so <c>MxAccessGalaxyBackend.HistoryReadAsync</c> returns a clear
    /// "not configured" error instead of attempting an SDK connection to localhost.
    /// </summary>
    private static IHistorianDataSource? BuildHistorianIfEnabled()
    {
        var enabled = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_ENABLED");
        if (!string.Equals(enabled, "true", StringComparison.OrdinalIgnoreCase) && enabled != "1")
            return null;

        var cfg = new HistorianConfiguration
        {
            Enabled = true,
            ServerName = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_SERVER") ?? "localhost",
            Port = TryParseInt("OTOPCUA_HISTORIAN_PORT", 32568),
            IntegratedSecurity = !string.Equals(Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_INTEGRATED"), "false", StringComparison.OrdinalIgnoreCase),
            UserName = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_USER"),
            Password = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_PASS"),
            CommandTimeoutSeconds = TryParseInt("OTOPCUA_HISTORIAN_TIMEOUT_SEC", 30),
            MaxValuesPerRead = TryParseInt("OTOPCUA_HISTORIAN_MAX_VALUES", 10000),
            FailureCooldownSeconds = TryParseInt("OTOPCUA_HISTORIAN_COOLDOWN_SEC", 60),
        };

        var servers = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_SERVERS");
        if (!string.IsNullOrWhiteSpace(servers))
            cfg.ServerNames = new System.Collections.Generic.List<string>(
                servers.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries));

        Log.Information("Historian enabled — {NodeCount} configured node(s), port={Port}",
            cfg.ServerNames.Count > 0 ? cfg.ServerNames.Count : 1, cfg.Port);
        return new HistorianDataSource(cfg);
    }

    private static int TryParseInt(string envName, int defaultValue)
    {
        var raw = Environment.GetEnvironmentVariable(envName);
        return int.TryParse(raw, out var parsed) ? parsed : defaultValue;
    }
}
|
||||
@@ -1,58 +0,0 @@
|
||||
using System;
|
||||
using System.Runtime.ConstrainedExecution;
|
||||
using System.Runtime.InteropServices;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
|
||||
|
||||
/// <summary>
|
||||
/// SafeHandle-style lifetime wrapper for an <c>LMXProxyServer</c> COM connection. Per Task B.3
|
||||
/// + decision #65: <see cref="ReleaseHandle"/> must call <c>Marshal.ReleaseComObject</c> until
|
||||
/// refcount = 0, then <c>UnregisterProxy</c>. The finalizer runs as a
|
||||
/// <see cref="CriticalFinalizerObject"/> to honor AppDomain-unload ordering.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// This scaffold accepts any RCW (tagged as <see cref="object"/>) so we can unit-test the
|
||||
/// release logic with a mock. The concrete wiring to <c>ArchestrA.MxAccess.LMXProxyServer</c>
|
||||
/// lands when the actual Galaxy code moves over (the part deferred to the parity gate).
|
||||
/// </remarks>
|
||||
public sealed class MxAccessHandle : SafeHandle
|
||||
{
|
||||
private object? _comObject;
|
||||
private readonly Action<object>? _unregister;
|
||||
|
||||
public MxAccessHandle(object comObject, Action<object>? unregister = null)
|
||||
: base(IntPtr.Zero, ownsHandle: true)
|
||||
{
|
||||
_comObject = comObject ?? throw new ArgumentNullException(nameof(comObject));
|
||||
_unregister = unregister;
|
||||
|
||||
// The pointer value itself doesn't matter — we're wrapping an RCW, not a native handle.
|
||||
SetHandle(new IntPtr(1));
|
||||
}
|
||||
|
||||
public override bool IsInvalid => handle == IntPtr.Zero;
|
||||
|
||||
public object? RawComObject => _comObject;
|
||||
|
||||
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
|
||||
protected override bool ReleaseHandle()
|
||||
{
|
||||
if (_comObject is null) return true;
|
||||
|
||||
try { _unregister?.Invoke(_comObject); }
|
||||
catch { /* swallow — we're in finalizer/cleanup; log elsewhere */ }
|
||||
|
||||
try
|
||||
{
|
||||
if (Marshal.IsComObject(_comObject))
|
||||
{
|
||||
while (Marshal.ReleaseComObject(_comObject) > 0) { /* loop until fully released */ }
|
||||
}
|
||||
}
|
||||
catch { /* swallow */ }
|
||||
|
||||
_comObject = null;
|
||||
SetHandle(IntPtr.Zero);
|
||||
return true;
|
||||
}
|
||||
}
|
||||
@@ -1,206 +0,0 @@
|
||||
using System;
|
||||
using System.Collections.Concurrent;
|
||||
using System.Runtime.InteropServices;
|
||||
using System.Threading;
|
||||
using System.Threading.Tasks;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
|
||||
|
||||
/// <summary>
|
||||
/// Dedicated STA thread with a Win32 message pump that owns all <c>LMXProxyServer</c> COM
|
||||
/// instances. Lifted from v1 <c>StaComThread</c> per CLAUDE.md "Reference Implementation".
|
||||
/// Per <c>driver-stability.md</c> Galaxy deep dive §"STA thread + Win32 message pump":
|
||||
/// work items dispatched via <c>PostThreadMessage(WM_APP)</c>; <c>WM_APP+1</c> requests a
|
||||
/// graceful drain → <c>WM_QUIT</c>; supervisor escalates to <c>Environment.Exit(2)</c> if the
|
||||
/// pump doesn't drain within the recycle grace window.
|
||||
/// </summary>
|
||||
public sealed class StaPump : IDisposable
|
||||
{
|
||||
private const uint WM_APP = 0x8000;
|
||||
private const uint WM_DRAIN_AND_QUIT = WM_APP + 1;
|
||||
private const uint PM_NOREMOVE = 0x0000;
|
||||
|
||||
private readonly Thread _thread;
|
||||
private readonly ConcurrentQueue<WorkItem> _workItems = new();
|
||||
private readonly TaskCompletionSource<bool> _started = new(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||
|
||||
private volatile uint _nativeThreadId;
|
||||
private volatile bool _pumpExited;
|
||||
private volatile bool _disposed;
|
||||
|
||||
public int ThreadId => _thread.ManagedThreadId;
|
||||
public DateTime LastDispatchedUtc { get; private set; } = DateTime.MinValue;
|
||||
public int QueueDepth => _workItems.Count;
|
||||
public bool IsRunning => _nativeThreadId != 0 && !_disposed && !_pumpExited;
|
||||
|
||||
public StaPump(string name = "Galaxy.Sta")
|
||||
{
|
||||
_thread = new Thread(PumpLoop) { Name = name, IsBackground = true };
|
||||
_thread.SetApartmentState(ApartmentState.STA);
|
||||
_thread.Start();
|
||||
}
|
||||
|
||||
public Task WaitForStartedAsync() => _started.Task;
|
||||
|
||||
/// <summary>Posts a work item; resolves once it's executed on the STA thread.</summary>
|
||||
public Task<T> InvokeAsync<T>(Func<T> work)
|
||||
{
|
||||
if (_disposed) throw new ObjectDisposedException(nameof(StaPump));
|
||||
if (_pumpExited) throw new InvalidOperationException("STA pump has exited");
|
||||
|
||||
var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
|
||||
_workItems.Enqueue(new WorkItem(
|
||||
() =>
|
||||
{
|
||||
try { tcs.TrySetResult(work()); }
|
||||
catch (Exception ex) { tcs.TrySetException(ex); }
|
||||
},
|
||||
ex => tcs.TrySetException(ex)));
|
||||
|
||||
if (!PostThreadMessage(_nativeThreadId, WM_APP, IntPtr.Zero, IntPtr.Zero))
|
||||
{
|
||||
_pumpExited = true;
|
||||
DrainAndFaultQueue();
|
||||
}
|
||||
|
||||
return tcs.Task;
|
||||
}
|
||||
|
||||
public Task InvokeAsync(Action work) => InvokeAsync(() => { work(); return 0; });
|
||||
|
||||
/// <summary>
|
||||
/// Health probe — returns true if a no-op work item round-trips within
|
||||
/// <paramref name="timeout"/>. Used by the supervisor; timeout means the pump is wedged
|
||||
/// and a recycle is warranted (Task B.2 acceptance).
|
||||
/// </summary>
|
||||
public async Task<bool> IsResponsiveAsync(TimeSpan timeout)
|
||||
{
|
||||
if (!IsRunning) return false;
|
||||
var task = InvokeAsync(() => { });
|
||||
var completed = await Task.WhenAny(task, Task.Delay(timeout)).ConfigureAwait(false);
|
||||
return completed == task;
|
||||
}
|
||||
|
||||
private void PumpLoop()
|
||||
{
|
||||
try
|
||||
{
|
||||
_nativeThreadId = GetCurrentThreadId();
|
||||
|
||||
// Force the system to create the thread message queue before we signal Started.
|
||||
// PeekMessage(PM_NOREMOVE) on an empty queue is the documented way to do this.
|
||||
PeekMessage(out _, IntPtr.Zero, 0, 0, PM_NOREMOVE);
|
||||
|
||||
_started.TrySetResult(true);
|
||||
|
||||
// GetMessage returns 0 on WM_QUIT, -1 on error, otherwise a positive value.
|
||||
while (GetMessage(out var msg, IntPtr.Zero, 0, 0) > 0)
|
||||
{
|
||||
if (msg.message == WM_APP)
|
||||
{
|
||||
DrainQueue();
|
||||
}
|
||||
else if (msg.message == WM_DRAIN_AND_QUIT)
|
||||
{
|
||||
DrainQueue();
|
||||
PostQuitMessage(0);
|
||||
}
|
||||
else
|
||||
{
|
||||
// Pass through any window/dialog messages the COM proxy may inject.
|
||||
TranslateMessage(ref msg);
|
||||
DispatchMessage(ref msg);
|
||||
}
|
||||
}
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
_started.TrySetException(ex);
|
||||
}
|
||||
finally
|
||||
{
|
||||
_pumpExited = true;
|
||||
DrainAndFaultQueue();
|
||||
}
|
||||
}
|
||||
|
||||
private void DrainQueue()
|
||||
{
|
||||
while (_workItems.TryDequeue(out var item))
|
||||
{
|
||||
item.Execute();
|
||||
LastDispatchedUtc = DateTime.UtcNow;
|
||||
}
|
||||
}
|
||||
|
||||
private void DrainAndFaultQueue()
|
||||
{
|
||||
var ex = new InvalidOperationException("STA pump has exited");
|
||||
while (_workItems.TryDequeue(out var item))
|
||||
{
|
||||
try { item.Fault(ex); }
|
||||
catch { /* faulting a TCS shouldn't throw, but be defensive */ }
|
||||
}
|
||||
}
|
||||
|
||||
public void Dispose()
|
||||
{
|
||||
if (_disposed) return;
|
||||
_disposed = true;
|
||||
|
||||
try
|
||||
{
|
||||
if (_nativeThreadId != 0 && !_pumpExited)
|
||||
PostThreadMessage(_nativeThreadId, WM_DRAIN_AND_QUIT, IntPtr.Zero, IntPtr.Zero);
|
||||
_thread.Join(TimeSpan.FromSeconds(5));
|
||||
}
|
||||
catch { /* swallow — best effort */ }
|
||||
|
||||
DrainAndFaultQueue();
|
||||
}
|
||||
|
||||
private sealed record WorkItem(Action Execute, Action<Exception> Fault);
|
||||
|
||||
#region Win32 P/Invoke
|
||||
|
||||
[StructLayout(LayoutKind.Sequential)]
|
||||
private struct MSG
|
||||
{
|
||||
public IntPtr hwnd;
|
||||
public uint message;
|
||||
public IntPtr wParam;
|
||||
public IntPtr lParam;
|
||||
public uint time;
|
||||
public POINT pt;
|
||||
}
|
||||
|
||||
[StructLayout(LayoutKind.Sequential)]
|
||||
private struct POINT { public int x; public int y; }
|
||||
|
||||
[DllImport("user32.dll")]
|
||||
private static extern int GetMessage(out MSG lpMsg, IntPtr hWnd, uint wMsgFilterMin, uint wMsgFilterMax);
|
||||
|
||||
[DllImport("user32.dll")]
|
||||
[return: MarshalAs(UnmanagedType.Bool)]
|
||||
private static extern bool TranslateMessage(ref MSG lpMsg);
|
||||
|
||||
[DllImport("user32.dll")]
|
||||
private static extern IntPtr DispatchMessage(ref MSG lpMsg);
|
||||
|
||||
[DllImport("user32.dll")]
|
||||
[return: MarshalAs(UnmanagedType.Bool)]
|
||||
private static extern bool PostThreadMessage(uint idThread, uint Msg, IntPtr wParam, IntPtr lParam);
|
||||
|
||||
[DllImport("user32.dll")]
|
||||
private static extern void PostQuitMessage(int nExitCode);
|
||||
|
||||
[DllImport("user32.dll")]
|
||||
[return: MarshalAs(UnmanagedType.Bool)]
|
||||
private static extern bool PeekMessage(out MSG lpMsg, IntPtr hWnd, uint wMsgFilterMin, uint wMsgFilterMax,
|
||||
uint wRemoveMsg);
|
||||
|
||||
[DllImport("kernel32.dll")]
|
||||
private static extern uint GetCurrentThreadId();
|
||||
|
||||
#endregion
|
||||
}
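The dispatch pattern in `StaPump.InvokeAsync` (enqueue a work item, complete a `TaskCompletionSource` on the owning thread) can be sketched without Win32. The following is a portable analogue, not the retired implementation: it swaps the `PostThreadMessage`/`GetMessage` pump for a `BlockingCollection<Action>`, so it does not create an STA or pump window messages. The names `queue`, `worker`, and `Sketch.Worker` are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the message queue: a blocking queue of work items.
var queue = new BlockingCollection<Action>();

// One dedicated thread executes every work item, in order.
var worker = new Thread(() =>
{
    foreach (var item in queue.GetConsumingEnumerable())
        item();
})
{ IsBackground = true, Name = "Sketch.Worker" };
worker.Start();

// Same shape as StaPump.InvokeAsync: the caller gets a Task that completes
// once the delegate has run on the worker thread.
Task<T> InvokeAsync<T>(Func<T> work)
{
    var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
    queue.Add(() =>
    {
        try { tcs.TrySetResult(work()); }
        catch (Exception ex) { tcs.TrySetException(ex); }
    });
    return tcs.Task;
}

// The delegate observes the worker's thread id, not the caller's.
var workerId = await InvokeAsync(() => Thread.CurrentThread.ManagedThreadId);
Console.WriteLine(workerId == worker.ManagedThreadId);

// Stop accepting work and let the worker thread exit.
queue.CompleteAdding();
worker.Join();
```

`RunContinuationsAsynchronously` matters in both versions: without it, a continuation awaiting the task could run inline on the dedicated thread and block it.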
|
||||
@@ -1,64 +0,0 @@
|
||||
using System;
|
||||
using System.Collections.Generic;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Stability;
|
||||
|
||||
/// <summary>
|
||||
/// Galaxy-specific RSS watchdog per <c>driver-stability.md §"Memory Watchdog Thresholds"</c>.
|
||||
/// Baseline-relative + absolute caps. Sustained-slope detection uses a rolling 30-min window.
|
||||
/// Pluggable RSS source keeps it unit-testable.
|
||||
/// </summary>
|
||||
public sealed class MemoryWatchdog
|
||||
{
|
||||
/// <summary>Absolute hard ceiling — process is force-killed above this.</summary>
|
||||
public long HardCeilingBytes { get; init; } = 1_500L * 1024 * 1024;
|
||||
|
||||
/// <summary>Sustained slope (bytes/min) above which soft recycle is scheduled.</summary>
|
||||
public long SustainedSlopeBytesPerMinute { get; init; } = 5L * 1024 * 1024;
|
||||
|
||||
public TimeSpan SlopeWindow { get; init; } = TimeSpan.FromMinutes(30);
|
||||
|
||||
private readonly long _baselineBytes;
|
||||
private readonly Queue<RssSample> _samples = new();
|
||||
|
||||
public MemoryWatchdog(long baselineBytes)
|
||||
{
|
||||
_baselineBytes = baselineBytes;
|
||||
}
|
||||
|
||||
/// <summary>Called every 30s with the current RSS. Returns the action the supervisor should take.</summary>
|
||||
public WatchdogAction Sample(long rssBytes, DateTime utcNow)
|
||||
{
|
||||
_samples.Enqueue(new RssSample(utcNow, rssBytes));
|
||||
while (_samples.Count > 0 && utcNow - _samples.Peek().TimestampUtc > SlopeWindow)
|
||||
_samples.Dequeue();
|
||||
|
||||
if (rssBytes >= HardCeilingBytes)
|
||||
return WatchdogAction.HardKill;
|
||||
|
||||
var softThreshold = Math.Max(_baselineBytes * 2, _baselineBytes + 200L * 1024 * 1024);
|
||||
var warnThreshold = Math.Max((long)(_baselineBytes * 1.5), _baselineBytes + 200L * 1024 * 1024);
|
||||
|
||||
if (rssBytes >= softThreshold) return WatchdogAction.SoftRecycle;
|
||||
if (rssBytes >= warnThreshold) return WatchdogAction.Warn;
|
||||
|
||||
if (_samples.Count >= 2)
|
||||
{
|
||||
var oldest = _samples.Peek();
|
||||
var span = (utcNow - oldest.TimestampUtc).TotalMinutes;
|
||||
if (span >= SlopeWindow.TotalMinutes * 0.9) // need ~full window to trust the slope
|
||||
{
|
||||
var delta = rssBytes - oldest.RssBytes;
|
||||
var bytesPerMin = delta / span;
|
||||
if (bytesPerMin >= SustainedSlopeBytesPerMinute)
|
||||
return WatchdogAction.SoftRecycle;
|
||||
}
|
||||
}
|
||||
|
||||
return WatchdogAction.None;
|
||||
}
|
||||
|
||||
private readonly record struct RssSample(DateTime TimestampUtc, long RssBytes);
|
||||
}
|
||||
|
||||
public enum WatchdogAction { None, Warn, SoftRecycle, HardKill }
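As a worked example of the threshold arithmetic in `MemoryWatchdog.Sample`, the sketch below evaluates the same two `Math.Max` expressions for a few baselines. The helper name `Thresholds` and the sample baselines are assumptions chosen for illustration; only the two expressions come from the code above.

```csharp
using System;

const long MiB = 1024L * 1024;

// Mirrors the warn/soft threshold expressions used in MemoryWatchdog.Sample.
(long Warn, long Soft) Thresholds(long baselineBytes)
{
    var soft = Math.Max(baselineBytes * 2, baselineBytes + 200L * 1024 * 1024);
    var warn = Math.Max((long)(baselineBytes * 1.5), baselineBytes + 200L * 1024 * 1024);
    return (warn, soft);
}

// Illustrative baselines only; real values come from the running process.
foreach (var baselineMiB in new long[] { 100, 300, 800 })
{
    var (warn, soft) = Thresholds(baselineMiB * MiB);
    Console.WriteLine($"baseline={baselineMiB} MiB warn={warn / MiB} MiB soft={soft / MiB} MiB");
}
```

For these inputs the sketch prints warn/soft pairs of 300/300, 500/600, and 1200/1600 MiB. Under these expressions the two thresholds coincide whenever the baseline is 200 MiB or less, and because `Sample` tests the soft threshold first, `Warn` would not be returned for such baselines.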
|
||||
@@ -1,121 +0,0 @@
|
||||
using System;
|
||||
using System.IO;
|
||||
using System.IO.MemoryMappedFiles;
|
||||
using System.Runtime.InteropServices;
|
||||
using System.Text;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Stability;
|
||||
|
||||
/// <summary>
|
||||
/// Ring-buffer of the last <see cref="Capacity"/> IPC operations, written into a
|
||||
/// memory-mapped file. On hard crash the supervisor reads the MMF after the corpse is gone
|
||||
/// to see what was in flight. Thread-safe for the single-writer, multi-reader pattern.
|
||||
/// </summary>
|
||||
/// <remarks>
|
||||
/// File layout:
|
||||
/// <code>
|
||||
/// [16-byte header: magic(4) | version(4) | capacity(4) | writeIndex(4)]
|
||||
/// [capacity × 256-byte entries: each is [8-byte utcUnixMs | 8-byte opKind | 240-byte UTF-8 message]]
|
||||
/// </code>
|
||||
/// </remarks>
|
||||
public sealed class PostMortemMmf : IDisposable
|
||||
{
|
||||
private const int Magic = 0x4F505043; // 'OPPC'
|
||||
private const int Version = 1;
|
||||
private const int HeaderBytes = 16;
|
||||
public const int EntryBytes = 256;
|
||||
private const int MessageOffset = 16;
|
||||
private const int MessageCapacity = EntryBytes - MessageOffset;
|
||||
|
||||
public int Capacity { get; }
|
||||
public string Path { get; }
|
||||
|
||||
private readonly MemoryMappedFile _mmf;
|
||||
private readonly MemoryMappedViewAccessor _accessor;
|
||||
private readonly object _writeGate = new();
|
||||
|
||||
public PostMortemMmf(string path, int capacity = 1000)
|
||||
{
|
||||
if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
|
||||
Capacity = capacity;
|
||||
Path = path;
|
||||
|
||||
var fileBytes = HeaderBytes + capacity * EntryBytes;
|
||||
Directory.CreateDirectory(System.IO.Path.GetDirectoryName(path)!);
|
||||
|
||||
var fs = new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.Read);
|
||||
fs.SetLength(fileBytes);
|
||||
_mmf = MemoryMappedFile.CreateFromFile(fs, null, fileBytes,
|
||||
MemoryMappedFileAccess.ReadWrite, HandleInheritability.None, leaveOpen: false);
|
||||
_accessor = _mmf.CreateViewAccessor(0, fileBytes, MemoryMappedFileAccess.ReadWrite);
|
||||
|
||||
// Initialize header if blank/garbage.
|
||||
if (_accessor.ReadInt32(0) != Magic)
|
||||
{
|
||||
_accessor.Write(0, Magic);
|
||||
_accessor.Write(4, Version);
|
||||
_accessor.Write(8, capacity);
|
||||
_accessor.Write(12, 0); // writeIndex
|
||||
}
|
||||
}
|
||||
|
||||
public void Write(long opKind, string message)
|
||||
{
|
||||
lock (_writeGate)
|
||||
{
|
||||
var idx = _accessor.ReadInt32(12);
|
||||
var offset = HeaderBytes + idx * EntryBytes;
|
||||
|
||||
_accessor.Write(offset + 0, DateTimeOffset.UtcNow.ToUnixTimeMilliseconds());
|
||||
_accessor.Write(offset + 8, opKind);
|
||||
|
||||
var msgBytes = Encoding.UTF8.GetBytes(message ?? string.Empty);
|
||||
var copy = Math.Min(msgBytes.Length, MessageCapacity - 1);
|
||||
_accessor.WriteArray(offset + MessageOffset, msgBytes, 0, copy);
|
||||
_accessor.Write(offset + MessageOffset + copy, (byte)0); // null terminator
|
||||
|
||||
var next = (idx + 1) % Capacity;
|
||||
_accessor.Write(12, next);
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>Reads all entries in order (oldest → newest). Safe to call from another process.</summary>
|
||||
public PostMortemEntry[] ReadAll()
|
||||
{
|
||||
var magic = _accessor.ReadInt32(0);
|
||||
if (magic != Magic) return [];
|
||||
|
||||
var capacity = _accessor.ReadInt32(8);
|
||||
var writeIndex = _accessor.ReadInt32(12);
|
||||
|
||||
var entries = new PostMortemEntry[capacity];
|
||||
var count = 0;
|
||||
for (var i = 0; i < capacity; i++)
|
||||
{
|
||||
var slot = (writeIndex + i) % capacity;
|
||||
var offset = HeaderBytes + slot * EntryBytes;
|
||||
|
||||
var ts = _accessor.ReadInt64(offset + 0);
|
||||
if (ts == 0) continue; // unwritten
|
||||
|
||||
var op = _accessor.ReadInt64(offset + 8);
|
||||
var msgBuf = new byte[MessageCapacity];
|
||||
_accessor.ReadArray(offset + MessageOffset, msgBuf, 0, MessageCapacity);
|
||||
var nulTerm = Array.IndexOf<byte>(msgBuf, 0);
|
||||
var msg = Encoding.UTF8.GetString(msgBuf, 0, nulTerm < 0 ? MessageCapacity : nulTerm);
|
||||
|
||||
entries[count++] = new PostMortemEntry(ts, op, msg);
|
||||
}
|
||||
|
||||
Array.Resize(ref entries, count);
|
||||
return entries;
|
||||
}
|
||||
|
||||
public void Dispose()
|
||||
{
|
||||
_accessor.Dispose();
|
||||
_mmf.Dispose();
|
||||
}
|
||||
}
|
||||
|
||||
public readonly record struct PostMortemEntry(long UtcUnixMs, long OpKind, string Message);
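The file layout described in the `PostMortemMmf` remarks reduces to a little offset arithmetic. The sketch below reproduces it in isolation: the constants match the class above, while the helper names `FileBytes`, `EntryOffset`, and `ReadOrder` are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Linq;

const int HeaderBytes = 16;   // magic | version | capacity | writeIndex
const int EntryBytes = 256;   // 8-byte timestamp | 8-byte opKind | 240-byte message

// Total file size for a ring of `capacity` entries.
int FileBytes(int capacity) => HeaderBytes + capacity * EntryBytes;

// Byte offset of a given slot, as computed in Write and ReadAll.
int EntryOffset(int slot) => HeaderBytes + slot * EntryBytes;

// Slot visiting order used by ReadAll: start at writeIndex (the oldest
// entry once the ring has wrapped) and walk forward with wrap-around.
int[] ReadOrder(int writeIndex, int capacity) =>
    Enumerable.Range(0, capacity).Select(i => (writeIndex + i) % capacity).ToArray();

Console.WriteLine(FileBytes(1000));                    // default capacity
Console.WriteLine(EntryOffset(3));
Console.WriteLine(string.Join(",", ReadOrder(2, 4)));
```

With the default capacity of 1000 the file is 256016 bytes, slot 3 starts at byte 784, and a four-slot ring whose write index is 2 is read in the order 2,3,0,1.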
|
||||
@@ -1,40 +0,0 @@
|
||||
using System;
|
||||
using System.Collections.Generic;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Stability;
|
||||
|
||||
/// <summary>
|
||||
/// Frequency-capped soft-recycle decision per <c>driver-stability.md §"Recycle Policy"</c>.
|
||||
/// Default cap: 1 soft recycle per hour. Scheduled recycle at 03:00 local; supervisor reads
|
||||
/// <see cref="ShouldSoftRecycleScheduled"/> to decide.
|
||||
/// </summary>
|
||||
public sealed class RecyclePolicy
|
||||
{
|
||||
public TimeSpan SoftRecycleCap { get; init; } = TimeSpan.FromHours(1);
|
||||
public int DailyRecycleHourLocal { get; init; } = 3;
|
||||
|
||||
private readonly List<DateTime> _recentRecyclesUtc = new();
|
||||
|
||||
/// <summary>Returns true if a soft recycle is allowed under the frequency cap and, if so, records it against the cap.</summary>
|
||||
public bool TryRequestSoftRecycle(DateTime utcNow, out string? reason)
|
||||
{
|
||||
_recentRecyclesUtc.RemoveAll(t => utcNow - t > SoftRecycleCap);
|
||||
if (_recentRecyclesUtc.Count > 0)
|
||||
{
|
||||
reason = $"soft-recycle frequency cap: last recycle was {(utcNow - _recentRecyclesUtc[_recentRecyclesUtc.Count - 1]).TotalMinutes:F1} min ago";
|
||||
return false;
|
||||
}
|
||||
_recentRecyclesUtc.Add(utcNow);
|
||||
reason = null;
|
||||
return true;
|
||||
}
|
||||
|
||||
public bool ShouldSoftRecycleScheduled(DateTime localNow, ref DateTime lastScheduledDateLocal)
|
||||
{
|
||||
if (localNow.Hour != DailyRecycleHourLocal) return false;
|
||||
if (localNow.Date <= lastScheduledDateLocal.Date) return false;
|
||||
|
||||
lastScheduledDateLocal = localNow.Date;
|
||||
return true;
|
||||
}
|
||||
}
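The frequency cap in `TryRequestSoftRecycle` can be exercised in isolation. The sketch below re-creates the same prune-then-check logic as a local function so it runs without the rest of the project; `cap`, `recent`, `TryRequest`, and the timestamps are hypothetical names and values for illustration.

```csharp
using System;
using System.Collections.Generic;

var cap = TimeSpan.FromHours(1);          // same default as SoftRecycleCap
var recent = new List<DateTime>();        // recycles still inside the cap window

// Prune entries older than the cap; allow (and record) a recycle only if none remain.
bool TryRequest(DateTime utcNow)
{
    recent.RemoveAll(t => utcNow - t > cap);
    if (recent.Count > 0) return false;
    recent.Add(utcNow);
    return true;
}

var t0 = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);
Console.WriteLine(TryRequest(t0));                 // first request is allowed
Console.WriteLine(TryRequest(t0.AddMinutes(30)));  // inside the one-hour window
Console.WriteLine(TryRequest(t0.AddMinutes(61)));  // window has elapsed
```

The three calls print `True`, `False`, `True`. Note that a successful call is itself recorded, so asking the question consumes the allowance; a caller that only wants to peek would need a separate read-only check.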
|
||||
@@ -1,53 +0,0 @@
|
||||
<Project Sdk="Microsoft.NET.Sdk">
|
||||
|
||||
<PropertyGroup>
|
||||
<OutputType>Exe</OutputType>
|
||||
<TargetFramework>net48</TargetFramework>
|
||||
<!-- Decision #23: x86 required for MXAccess COM interop. The MxAccess COM client is
|
||||
now ported (Backend/MxAccess/) so we need the x86 platform target for the
|
||||
ArchestrA.MxAccess.dll COM interop reference to resolve at runtime. -->
|
||||
<PlatformTarget>x86</PlatformTarget>
|
||||
<Prefer32Bit>true</Prefer32Bit>
|
||||
<Nullable>enable</Nullable>
|
||||
<LangVersion>latest</LangVersion>
|
||||
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
|
||||
<GenerateDocumentationFile>true</GenerateDocumentationFile>
|
||||
<NoWarn>$(NoWarn);CS1591</NoWarn>
|
||||
<RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host</RootNamespace>
|
||||
<AssemblyName>OtOpcUa.Driver.Galaxy.Host</AssemblyName>
|
||||
</PropertyGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<PackageReference Include="System.IO.Pipes.AccessControl" Version="5.0.0"/>
|
||||
<PackageReference Include="System.Memory" Version="4.5.5"/>
|
||||
<PackageReference Include="System.Threading.Tasks.Extensions" Version="4.5.4"/>
|
||||
<PackageReference Include="System.Data.SqlClient" Version="4.9.0"/>
|
||||
<PackageReference Include="Serilog" Version="4.2.0"/>
|
||||
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0"/>
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
|
||||
<!-- PR 3.2: Historian SDK code lifted to the Wonderware sidecar. Galaxy.Host still
|
||||
consumes the historian types (MxAccessGalaxyBackend, Program) until phase 7,
|
||||
so reference the sidecar project to keep building. -->
|
||||
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware\ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.csproj"/>
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests"/>
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<Reference Include="ArchestrA.MxAccess">
|
||||
<HintPath>..\..\lib\ArchestrA.MxAccess.dll</HintPath>
|
||||
<Private>true</Private>
|
||||
</Reference>
|
||||
</ItemGroup>
|
||||
|
||||
<ItemGroup>
|
||||
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
|
||||
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
|
||||
</ItemGroup>
|
||||
|
||||
</Project>
|
||||
@@ -1,590 +0,0 @@
|
||||
using MessagePack;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
|
||||
using ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Ipc;
|
||||
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
|
||||
using IpcHostConnectivityStatus = ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts.HostConnectivityStatus;
|
||||
|
||||
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy;
|
||||
|
||||
/// <summary>
|
||||
/// <see cref="IDriver"/> implementation that forwards every capability over the Galaxy IPC
|
||||
/// channel to the out-of-process Host. Implements the full Phase 2 capability surface;
|
||||
/// bodies that depend on the deferred Host-side MXAccess code lift will surface
|
||||
/// <see cref="GalaxyIpcException"/> with code <c>not-implemented</c> until the Host's
|
||||
/// <c>IGalaxyBackend</c> is wired to the real <c>MxAccessClient</c>.
|
||||
/// </summary>
|
||||
public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
|
||||
: IDriver,
|
||||
ITagDiscovery,
|
||||
IReadable,
|
||||
IWritable,
|
||||
ISubscribable,
|
||||
IAlarmSource,
|
||||
IHistoryProvider,
|
||||
IRediscoverable,
|
||||
IHostConnectivityProbe,
|
||||
IAlarmHistorianWriter,
|
||||
IDisposable
|
||||
{
|
||||
private GalaxyIpcClient? _client;
|
||||
private long _sessionId;
|
||||
private DriverHealth _health = new(DriverState.Unknown, null, null);
|
||||
|
||||
private IReadOnlyList<Core.Abstractions.HostConnectivityStatus> _hostStatuses = [];
|
||||
|
||||
public string DriverInstanceId => options.DriverInstanceId;
|
||||
public string DriverType => "Galaxy";
|
||||
|
||||
public event EventHandler<DataChangeEventArgs>? OnDataChange;
|
||||
public event EventHandler<AlarmEventArgs>? OnAlarmEvent;
|
||||
public event EventHandler<RediscoveryEventArgs>? OnRediscoveryNeeded;
|
||||
public event EventHandler<HostStatusChangedEventArgs>? OnHostStatusChanged;
|
||||
|
||||
public async Task InitializeAsync(string driverConfigJson, CancellationToken cancellationToken)
|
||||
{
|
||||
_health = new DriverHealth(DriverState.Initializing, null, null);
|
||||
try
|
||||
{
|
||||
_client = await GalaxyIpcClient.ConnectAsync(
|
||||
options.PipeName, options.SharedSecret, options.ConnectTimeout, cancellationToken);
|
||||
|
||||
// Route Host-pushed event frames to the matching Raise* methods. Must be set BEFORE
|
||||
// the first CallAsync so a RuntimeStatusChange arriving between OpenSessionRequest
|
||||
// and OpenSessionResponse lands on the handler rather than unblocking the call with
|
||||
// the wrong kind.
|
||||
_client.SetEventHandler(DispatchHostEventAsync);
|
||||
|
||||
var resp = await _client.CallAsync<OpenSessionRequest, OpenSessionResponse>(
|
||||
MessageKind.OpenSessionRequest,
|
||||
new OpenSessionRequest { DriverInstanceId = DriverInstanceId, DriverConfigJson = driverConfigJson },
|
||||
MessageKind.OpenSessionResponse,
|
||||
cancellationToken);
|
||||
|
||||
if (!resp.Success)
|
||||
throw new InvalidOperationException($"Galaxy.Host OpenSession failed: {resp.Error}");
|
||||
|
||||
_sessionId = resp.SessionId;
|
||||
_health = new DriverHealth(DriverState.Healthy, DateTime.UtcNow, null);
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
_health = new DriverHealth(DriverState.Faulted, null, ex.Message);
|
||||
throw;
|
||||
}
|
||||
}
|
||||
|
||||
public async Task ReinitializeAsync(string driverConfigJson, CancellationToken cancellationToken)
|
||||
{
|
||||
await ShutdownAsync(cancellationToken);
|
||||
await InitializeAsync(driverConfigJson, cancellationToken);
|
||||
}
|
||||
|
||||
public async Task ShutdownAsync(CancellationToken cancellationToken)
|
||||
{
|
||||
if (_client is null) return;
|
||||
|
||||
try
|
||||
{
|
||||
await _client.SendOneWayAsync(
|
||||
MessageKind.CloseSessionRequest,
|
||||
new CloseSessionRequest { SessionId = _sessionId },
|
||||
cancellationToken);
|
||||
}
|
||||
catch { /* shutdown is best effort */ }
|
||||
|
||||
await _client.DisposeAsync();
|
||||
_client = null;
|
||||
_health = new DriverHealth(DriverState.Unknown, _health.LastSuccessfulRead, null);
|
||||
}
|
||||
|
||||
public DriverHealth GetHealth() => _health;
|
||||
public long GetMemoryFootprint() => 0;
|
||||
public Task FlushOptionalCachesAsync(CancellationToken cancellationToken) => Task.CompletedTask;
|
||||
|
||||
// ---- ITagDiscovery ----
|
||||
|
||||
public async Task DiscoverAsync(IAddressSpaceBuilder builder, CancellationToken cancellationToken)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(builder);
|
||||
var client = RequireClient();
|
||||
|
||||
var resp = await client.CallAsync<DiscoverHierarchyRequest, DiscoverHierarchyResponse>(
|
||||
MessageKind.DiscoverHierarchyRequest,
|
||||
new DiscoverHierarchyRequest { SessionId = _sessionId },
|
||||
MessageKind.DiscoverHierarchyResponse,
|
||||
cancellationToken);
|
||||
|
||||
if (!resp.Success)
|
||||
throw new InvalidOperationException($"Galaxy.Host DiscoverHierarchy failed: {resp.Error}");
|
||||
|
||||
foreach (var obj in resp.Objects)
|
||||
{
|
||||
var folder = builder.Folder(obj.ContainedName, obj.ContainedName);
|
||||
foreach (var attr in obj.Attributes)
|
||||
{
|
||||
var fullName = $"{obj.TagName}.{attr.AttributeName}";
|
||||
var handle = folder.Variable(
|
||||
attr.AttributeName,
|
||||
attr.AttributeName,
|
||||
new DriverAttributeInfo(
|
||||
FullName: fullName,
|
||||
DriverDataType: MapDataType(attr.MxDataType),
|
||||
IsArray: attr.IsArray,
|
||||
ArrayDim: attr.ArrayDim,
|
||||
SecurityClass: MapSecurity(attr.SecurityClassification),
|
||||
IsHistorized: attr.IsHistorized,
|
||||
IsAlarm: attr.IsAlarm));
|
||||
|
||||
// PR 15: when Galaxy flags the attribute as alarm-bearing (AlarmExtension
|
||||
// primitive), register an alarm-condition sink so the generic node manager
|
||||
// can route OnAlarmEvent payloads for this tag to the concrete address-space
|
||||
// builder. Severity default Medium — the live severity arrives through
|
||||
// AlarmEventArgs once MxAccessGalaxyBackend's tracker starts firing.
|
||||
if (attr.IsAlarm)
|
||||
{
|
||||
handle.MarkAsAlarmCondition(new AlarmConditionInfo(
|
||||
SourceName: fullName,
|
||||
InitialSeverity: AlarmSeverity.Medium,
|
||||
InitialDescription: null));
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// ---- IReadable ----
|
||||
|
||||
public async Task<IReadOnlyList<DataValueSnapshot>> ReadAsync(
|
||||
IReadOnlyList<string> fullReferences, CancellationToken cancellationToken)
|
||||
{
|
||||
var client = RequireClient();
|
||||
var resp = await client.CallAsync<ReadValuesRequest, ReadValuesResponse>(
|
||||
MessageKind.ReadValuesRequest,
|
||||
new ReadValuesRequest { SessionId = _sessionId, TagReferences = [.. fullReferences] },
|
||||
MessageKind.ReadValuesResponse,
|
||||
cancellationToken);
|
||||
|
||||
if (!resp.Success)
|
||||
throw new InvalidOperationException($"Galaxy.Host ReadValues failed: {resp.Error}");
|
||||
|
||||
var byRef = resp.Values.ToDictionary(v => v.TagReference);
|
||||
var result = new DataValueSnapshot[fullReferences.Count];
|
||||
for (var i = 0; i < fullReferences.Count; i++)
|
||||
{
|
||||
result[i] = byRef.TryGetValue(fullReferences[i], out var v)
|
||||
? ToSnapshot(v)
|
||||
: new DataValueSnapshot(null, StatusBadInternalError, null, DateTime.UtcNow);
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
// ---- IWritable ----
|
||||
|
||||
public async Task<IReadOnlyList<WriteResult>> WriteAsync(
|
||||
IReadOnlyList<WriteRequest> writes, CancellationToken cancellationToken)
|
||||
{
|
||||
var client = RequireClient();
|
||||
var resp = await client.CallAsync<WriteValuesRequest, WriteValuesResponse>(
|
||||
MessageKind.WriteValuesRequest,
|
||||
new WriteValuesRequest
|
||||
{
|
||||
SessionId = _sessionId,
|
||||
Writes = [.. writes.Select(FromWriteRequest)],
|
||||
},
|
||||
MessageKind.WriteValuesResponse,
|
||||
cancellationToken);
|
||||
|
||||
return [.. resp.Results.Select(r => new WriteResult(r.StatusCode))];
|
||||
}
|
||||
|
||||
// ---- ISubscribable ----
|
||||
|
||||
public async Task<ISubscriptionHandle> SubscribeAsync(
|
||||
IReadOnlyList<string> fullReferences, TimeSpan publishingInterval, CancellationToken cancellationToken)
|
||||
{
|
||||
var client = RequireClient();
|
||||
var resp = await client.CallAsync<SubscribeRequest, SubscribeResponse>(
|
||||
MessageKind.SubscribeRequest,
|
||||
new SubscribeRequest
|
||||
{
|
||||
SessionId = _sessionId,
|
||||
TagReferences = [.. fullReferences],
|
||||
RequestedIntervalMs = (int)publishingInterval.TotalMilliseconds,
|
||||
},
|
||||
MessageKind.SubscribeResponse,
|
||||
cancellationToken);
|
||||
|
||||
if (!resp.Success)
|
||||
throw new InvalidOperationException($"Galaxy.Host Subscribe failed: {resp.Error}");
|
||||
|
||||
return new GalaxySubscriptionHandle(resp.SubscriptionId);
|
||||
}
|
||||
|
||||
public async Task UnsubscribeAsync(ISubscriptionHandle handle, CancellationToken cancellationToken)
|
||||
{
|
||||
var client = RequireClient();
|
||||
var sid = ((GalaxySubscriptionHandle)handle).SubscriptionId;
|
||||
await client.SendOneWayAsync(
|
||||
MessageKind.UnsubscribeRequest,
|
||||
new UnsubscribeRequest { SessionId = _sessionId, SubscriptionId = sid },
|
||||
cancellationToken);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Internal entry point used by the IPC client when the Host pushes an
|
||||
/// <see cref="MessageKind.OnDataChangeNotification"/> frame. Surfaces it as a managed
|
||||
/// <see cref="OnDataChange"/> event.
|
||||
/// </summary>
|
||||
internal void RaiseDataChange(OnDataChangeNotification notif)
|
||||
{
|
||||
var handle = new GalaxySubscriptionHandle(notif.SubscriptionId);
|
||||
// ISubscribable.OnDataChange fires once per changed attribute — fan out the batch.
|
||||
foreach (var v in notif.Values)
|
||||
OnDataChange?.Invoke(this, new DataChangeEventArgs(handle, v.TagReference, ToSnapshot(v)));
|
||||
}
|
||||
|
||||
    // ---- IAlarmSource ----

    public async Task<IAlarmSubscriptionHandle> SubscribeAlarmsAsync(
        IReadOnlyList<string> sourceNodeIds, CancellationToken cancellationToken)
    {
        var client = RequireClient();
        await client.SendOneWayAsync(
            MessageKind.AlarmSubscribeRequest,
            new AlarmSubscribeRequest { SessionId = _sessionId },
            cancellationToken);
        return new GalaxyAlarmSubscriptionHandle($"alarm-{_sessionId}");
    }

    public Task UnsubscribeAlarmsAsync(IAlarmSubscriptionHandle handle, CancellationToken cancellationToken)
        => Task.CompletedTask;

    public async Task AcknowledgeAsync(
        IReadOnlyList<AlarmAcknowledgeRequest> acknowledgements, CancellationToken cancellationToken)
    {
        var client = RequireClient();
        foreach (var ack in acknowledgements)
        {
            await client.SendOneWayAsync(
                MessageKind.AlarmAckRequest,
                new AlarmAckRequest
                {
                    SessionId = _sessionId,
                    EventId = ack.ConditionId,
                    Comment = ack.Comment ?? string.Empty,
                },
                cancellationToken);
        }
    }

    internal void RaiseAlarmEvent(GalaxyAlarmEvent ev)
    {
        var handle = new GalaxyAlarmSubscriptionHandle($"alarm-{_sessionId}");
        OnAlarmEvent?.Invoke(this, new AlarmEventArgs(
            SubscriptionHandle: handle,
            SourceNodeId: ev.ObjectTagName,
            ConditionId: ev.EventId,
            AlarmType: ev.AlarmName,
            Message: ev.Message,
            Severity: MapSeverity(ev.Severity),
            SourceTimestampUtc: DateTimeOffset.FromUnixTimeMilliseconds(ev.UtcUnixMs).UtcDateTime));
    }

    // ---- IHistoryProvider ----

    public async Task<HistoryReadResult> ReadRawAsync(
        string fullReference, DateTime startUtc, DateTime endUtc, uint maxValuesPerNode,
        CancellationToken cancellationToken)
    {
        var client = RequireClient();
        var resp = await client.CallAsync<HistoryReadRequest, HistoryReadResponse>(
            MessageKind.HistoryReadRequest,
            new HistoryReadRequest
            {
                SessionId = _sessionId,
                TagReferences = [fullReference],
                StartUtcUnixMs = new DateTimeOffset(startUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
                EndUtcUnixMs = new DateTimeOffset(endUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
                MaxValuesPerTag = maxValuesPerNode,
            },
            MessageKind.HistoryReadResponse,
            cancellationToken);

        if (!resp.Success)
            throw new InvalidOperationException($"Galaxy.Host HistoryRead failed: {resp.Error}");

        var first = resp.Tags.FirstOrDefault();
        IReadOnlyList<DataValueSnapshot> samples = first is null
            ? Array.Empty<DataValueSnapshot>()
            : [.. first.Values.Select(ToSnapshot)];
        return new HistoryReadResult(samples, ContinuationPoint: null);
    }

    public async Task<HistoryReadResult> ReadProcessedAsync(
        string fullReference, DateTime startUtc, DateTime endUtc, TimeSpan interval,
        HistoryAggregateType aggregate, CancellationToken cancellationToken)
    {
        var client = RequireClient();
        var column = MapAggregateToColumn(aggregate);

        var resp = await client.CallAsync<HistoryReadProcessedRequest, HistoryReadProcessedResponse>(
            MessageKind.HistoryReadProcessedRequest,
            new HistoryReadProcessedRequest
            {
                SessionId = _sessionId,
                TagReference = fullReference,
                StartUtcUnixMs = new DateTimeOffset(startUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
                EndUtcUnixMs = new DateTimeOffset(endUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
                IntervalMs = (long)interval.TotalMilliseconds,
                AggregateColumn = column,
            },
            MessageKind.HistoryReadProcessedResponse,
            cancellationToken);

        if (!resp.Success)
            throw new InvalidOperationException($"Galaxy.Host HistoryReadProcessed failed: {resp.Error}");

        IReadOnlyList<DataValueSnapshot> samples = [.. resp.Values.Select(ToSnapshot)];
        return new HistoryReadResult(samples, ContinuationPoint: null);
    }

    public async Task<HistoryReadResult> ReadAtTimeAsync(
        string fullReference, IReadOnlyList<DateTime> timestampsUtc, CancellationToken cancellationToken)
    {
        var client = RequireClient();
        var resp = await client.CallAsync<HistoryReadAtTimeRequest, HistoryReadAtTimeResponse>(
            MessageKind.HistoryReadAtTimeRequest,
            new HistoryReadAtTimeRequest
            {
                SessionId = _sessionId,
                TagReference = fullReference,
                TimestampsUtcUnixMs = [.. timestampsUtc.Select(t => new DateTimeOffset(t, TimeSpan.Zero).ToUnixTimeMilliseconds())],
            },
            MessageKind.HistoryReadAtTimeResponse,
            cancellationToken);

        if (!resp.Success)
            throw new InvalidOperationException($"Galaxy.Host HistoryReadAtTime failed: {resp.Error}");

        // ReadAtTime returns one sample per requested timestamp in the same order — the Host
        // pads with bad-quality snapshots when a timestamp can't be interpolated, so response
        // length matches request length exactly. We trust that contract rather than
        // re-aligning here, because the Host is the source-of-truth for interpolation policy.
        IReadOnlyList<DataValueSnapshot> samples = [.. resp.Values.Select(ToSnapshot)];
        return new HistoryReadResult(samples, ContinuationPoint: null);
    }

    public async Task<HistoricalEventsResult> ReadEventsAsync(
        string? sourceName, DateTime startUtc, DateTime endUtc, int maxEvents, CancellationToken cancellationToken)
    {
        var client = RequireClient();
        var resp = await client.CallAsync<HistoryReadEventsRequest, HistoryReadEventsResponse>(
            MessageKind.HistoryReadEventsRequest,
            new HistoryReadEventsRequest
            {
                SessionId = _sessionId,
                SourceName = sourceName,
                StartUtcUnixMs = new DateTimeOffset(startUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
                EndUtcUnixMs = new DateTimeOffset(endUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
                MaxEvents = maxEvents,
            },
            MessageKind.HistoryReadEventsResponse,
            cancellationToken);

        if (!resp.Success)
            throw new InvalidOperationException($"Galaxy.Host HistoryReadEvents failed: {resp.Error}");

        IReadOnlyList<HistoricalEvent> events = [.. resp.Events.Select(ToHistoricalEvent)];
        return new HistoricalEventsResult(events, ContinuationPoint: null);
    }

    internal static HistoricalEvent ToHistoricalEvent(GalaxyHistoricalEvent wire) => new(
        EventId: wire.EventId,
        SourceName: wire.SourceName,
        EventTimeUtc: DateTimeOffset.FromUnixTimeMilliseconds(wire.EventTimeUtcUnixMs).UtcDateTime,
        ReceivedTimeUtc: DateTimeOffset.FromUnixTimeMilliseconds(wire.ReceivedTimeUtcUnixMs).UtcDateTime,
        Message: wire.DisplayText,
        Severity: wire.Severity);

    /// <summary>
    /// Maps the OPC UA Part 13 aggregate enum onto the Wonderware Historian
    /// AnalogSummaryQuery column names consumed by <c>HistorianDataSource.ReadAggregateAsync</c>.
    /// Kept on the Proxy side so Galaxy.Host stays OPC-UA-free.
    /// </summary>
    internal static string MapAggregateToColumn(HistoryAggregateType aggregate) => aggregate switch
    {
        HistoryAggregateType.Average => "Average",
        HistoryAggregateType.Minimum => "Minimum",
        HistoryAggregateType.Maximum => "Maximum",
        HistoryAggregateType.Count => "ValueCount",
        HistoryAggregateType.Total => throw new NotSupportedException(
            "HistoryAggregateType.Total is not supported by the Wonderware Historian AnalogSummary " +
            "query — use Average × Count on the caller side, or switch to Average/Minimum/Maximum/Count."),
        _ => throw new NotSupportedException($"Unknown HistoryAggregateType {aggregate}"),
    };

    // ---- IRediscoverable ----

    /// <summary>
    /// Triggered by the IPC client when the Host pushes a deploy-watermark notification
    /// (Galaxy <c>time_of_last_deploy</c> changed per decision #54).
    /// </summary>
    internal void RaiseRediscoveryNeeded(string reason, string? scopeHint = null) =>
        OnRediscoveryNeeded?.Invoke(this, new RediscoveryEventArgs(reason, scopeHint));

    // ---- IHostConnectivityProbe ----

    public IReadOnlyList<Core.Abstractions.HostConnectivityStatus> GetHostStatuses() => _hostStatuses;

    internal void OnHostConnectivityUpdate(IpcHostConnectivityStatus update)
    {
        var translated = new Core.Abstractions.HostConnectivityStatus(
            HostName: update.HostName,
            State: ParseHostState(update.RuntimeStatus),
            LastChangedUtc: DateTimeOffset.FromUnixTimeMilliseconds(update.LastObservedUtcUnixMs).UtcDateTime);

        var prior = _hostStatuses.FirstOrDefault(h => h.HostName == translated.HostName);
        _hostStatuses = [
            .. _hostStatuses.Where(h => h.HostName != translated.HostName),
            translated
        ];

        if (prior is null || prior.State != translated.State)
        {
            OnHostStatusChanged?.Invoke(this, new HostStatusChangedEventArgs(
                translated.HostName, prior?.State ?? HostState.Unknown, translated.State));
        }
    }

    private static HostState ParseHostState(string s) => s switch
    {
        "Running" => HostState.Running,
        "Stopped" => HostState.Stopped,
        "Faulted" => HostState.Faulted,
        _ => HostState.Unknown,
    };

    // ---- helpers ----

    /// <summary>
    /// Event-handler registered with <see cref="GalaxyIpcClient.SetEventHandler"/>. Decodes
    /// the MessagePack body into the matching wire contract and delegates to the existing
    /// <c>Raise*</c> helpers. Unknown kinds are silently ignored — the IPC contract is
    /// append-only, so a newer Host sending a kind this Proxy doesn't recognise shouldn't
    /// break the session.
    /// </summary>
    private Task DispatchHostEventAsync(MessageKind kind, byte[] body)
    {
        switch (kind)
        {
            case MessageKind.OnDataChangeNotification:
                RaiseDataChange(MessagePackSerializer.Deserialize<OnDataChangeNotification>(body));
                break;
            case MessageKind.AlarmEvent:
                RaiseAlarmEvent(MessagePackSerializer.Deserialize<GalaxyAlarmEvent>(body));
                break;
            case MessageKind.HostConnectivityStatus:
                OnHostConnectivityUpdate(MessagePackSerializer.Deserialize<IpcHostConnectivityStatus>(body));
                break;
            case MessageKind.RuntimeStatusChange:
                var rsc = MessagePackSerializer.Deserialize<RuntimeStatusChangeNotification>(body);
                OnHostConnectivityUpdate(rsc.Status);
                break;
            // HistorianConnectivityStatus has no consumer on this Proxy today — drop.
            // Response kinds never reach the event handler; the client routes those to
            // their pending CallAsync TCS.
        }
        return Task.CompletedTask;
    }

    private GalaxyIpcClient RequireClient() =>
        _client ?? throw new InvalidOperationException("Driver not initialized");

    private const uint StatusBadInternalError = 0x80020000u;

    private static DataValueSnapshot ToSnapshot(GalaxyDataValue v) => new(
        Value: v.ValueBytes,
        StatusCode: v.StatusCode,
        SourceTimestampUtc: v.SourceTimestampUtcUnixMs > 0
            ? DateTimeOffset.FromUnixTimeMilliseconds(v.SourceTimestampUtcUnixMs).UtcDateTime
            : null,
        ServerTimestampUtc: DateTimeOffset.FromUnixTimeMilliseconds(v.ServerTimestampUtcUnixMs).UtcDateTime);

    private static GalaxyDataValue FromWriteRequest(WriteRequest w) => new()
    {
        TagReference = w.FullReference,
        ValueBytes = MessagePack.MessagePackSerializer.Serialize(w.Value),
        ValueMessagePackType = 0,
        StatusCode = 0,
        SourceTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
        ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
    };

    private static DriverDataType MapDataType(int mxDataType) => mxDataType switch
    {
        0 => DriverDataType.Boolean,
        1 => DriverDataType.Int32,
        2 => DriverDataType.Float32,
        3 => DriverDataType.Float64,
        4 => DriverDataType.String,
        5 => DriverDataType.DateTime,
        _ => DriverDataType.String,
    };

    private static SecurityClassification MapSecurity(int mxSec) => mxSec switch
    {
        0 => SecurityClassification.FreeAccess,
        1 => SecurityClassification.Operate,
        2 => SecurityClassification.SecuredWrite,
        3 => SecurityClassification.VerifiedWrite,
        4 => SecurityClassification.Tune,
        5 => SecurityClassification.Configure,
        6 => SecurityClassification.ViewOnly,
        _ => SecurityClassification.FreeAccess,
    };

    private static AlarmSeverity MapSeverity(int sev) => sev switch
    {
        <= 250 => AlarmSeverity.Low,
        <= 500 => AlarmSeverity.Medium,
        <= 800 => AlarmSeverity.High,
        _ => AlarmSeverity.Critical,
    };

    /// <summary>
    /// Phase 7 follow-up #247 — IAlarmHistorianWriter implementation. Forwards alarm
    /// batches to Galaxy.Host over the existing IPC channel, reusing the connection
    /// the driver already established for data-plane traffic. Throws
    /// <see cref="InvalidOperationException"/> when called before
    /// <see cref="InitializeAsync"/> has connected the client; the SQLite drain worker
    /// translates that to whole-batch RetryPlease per its catch contract.
    /// </summary>
    public Task<IReadOnlyList<HistorianWriteOutcome>> WriteBatchAsync(
        IReadOnlyList<AlarmHistorianEvent> batch, CancellationToken cancellationToken)
    {
        if (_client is null)
            throw new InvalidOperationException(
                "GalaxyProxyDriver IPC client not connected — historian writes rejected until InitializeAsync completes");
        return new GalaxyHistorianWriter(_client).WriteBatchAsync(batch, cancellationToken);
    }

    public void Dispose() => _client?.DisposeAsync().AsTask().GetAwaiter().GetResult();
}

internal sealed record GalaxySubscriptionHandle(long SubscriptionId) : ISubscriptionHandle
{
    public string DiagnosticId => $"galaxy-sub-{SubscriptionId}";
}

internal sealed record GalaxyAlarmSubscriptionHandle(string Id) : IAlarmSubscriptionHandle
{
    public string DiagnosticId => Id;
}

public sealed class GalaxyProxyOptions
{
    public required string DriverInstanceId { get; init; }
    public required string PipeName { get; init; }
    public required string SharedSecret { get; init; }
    public TimeSpan ConnectTimeout { get; init; } = TimeSpan.FromSeconds(10);
}
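The history-read methods in the removed driver convert their `DateTime` arguments with `new DateTimeOffset(value, TimeSpan.Zero)`. The sketch below shows how that constructor treats each `DateTimeKind`, which matters if a caller ever passes a value that is not UTC. The `ToUnixMs` helper and the sample timestamp are hypothetical and are not part of the removed code.

```csharp
using System;

// Hypothetical helper mirroring the conversion used by the history-read requests.
static long ToUnixMs(DateTime value) =>
    new DateTimeOffset(value, TimeSpan.Zero).ToUnixTimeMilliseconds();

var utc = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);
var unspecified = DateTime.SpecifyKind(utc, DateTimeKind.Unspecified);
var local = DateTime.SpecifyKind(utc, DateTimeKind.Local);

// Utc and Unspecified kinds both accept a zero offset and yield the same value.
Console.WriteLine(ToUnixMs(utc) == ToUnixMs(unspecified));

try
{
    // A Local-kind value is accepted only when the machine's UTC offset at that
    // instant is zero; otherwise the constructor throws ArgumentException.
    Console.WriteLine(ToUnixMs(local));
}
catch (ArgumentException ex)
{
    Console.WriteLine($"Local kind rejected: {ex.Message}");
}
```

Because the parameters are named `startUtc` and `endUtc`, callers are presumably expected to pass UTC values already; the sketch only illustrates what happens when that expectation is not met.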
@@ -1,61 +0,0 @@
using System.Text.Json;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy;

/// <summary>
/// Static factory registration helper for <see cref="GalaxyProxyDriver"/>. Server's
/// Program.cs calls <see cref="Register"/> once at startup; the bootstrapper (task #248)
/// then materialises Galaxy DriverInstance rows from the central config DB into live
/// driver instances. No dependency on Microsoft.Extensions.DependencyInjection so the
/// driver project stays free of DI machinery.
/// </summary>
public static class GalaxyProxyDriverFactoryExtensions
{
    public const string DriverTypeName = "Galaxy";

    /// <summary>
    /// Register the Galaxy driver factory in the supplied <see cref="DriverFactoryRegistry"/>.
    /// Throws if 'Galaxy' is already registered — single-instance per process.
    /// </summary>
    public static void Register(DriverFactoryRegistry registry)
    {
        ArgumentNullException.ThrowIfNull(registry);
        // Galaxy is Tier C — out-of-process MXAccess Host, scheduled recycle is allowed.
        registry.Register(DriverTypeName, CreateInstance, DriverTier.C);
    }

    internal static GalaxyProxyDriver CreateInstance(string driverInstanceId, string driverConfigJson)
    {
        ArgumentException.ThrowIfNullOrWhiteSpace(driverInstanceId);
        ArgumentException.ThrowIfNullOrWhiteSpace(driverConfigJson);

        // DriverConfig column is a JSON object that mirrors GalaxyProxyOptions.
        // Required: PipeName, SharedSecret. Optional: ConnectTimeoutMs (defaults to 10s).
        // The DriverInstanceId from the row wins over any value in the JSON — the row
        // is the authoritative identity per the schema's UX_DriverInstance_Generation_LogicalId.
        using var doc = JsonDocument.Parse(driverConfigJson);
        var root = doc.RootElement;

        string pipeName = root.TryGetProperty("PipeName", out var p) && p.ValueKind == JsonValueKind.String
            ? p.GetString()!
            : throw new InvalidOperationException(
                $"GalaxyProxyDriver config for '{driverInstanceId}' missing required PipeName");
        string sharedSecret = root.TryGetProperty("SharedSecret", out var s) && s.ValueKind == JsonValueKind.String
            ? s.GetString()!
            : throw new InvalidOperationException(
                $"GalaxyProxyDriver config for '{driverInstanceId}' missing required SharedSecret");
        var connectTimeout = root.TryGetProperty("ConnectTimeoutMs", out var t) && t.ValueKind == JsonValueKind.Number
            ? TimeSpan.FromMilliseconds(t.GetInt32())
            : TimeSpan.FromSeconds(10);

        return new GalaxyProxyDriver(new GalaxyProxyOptions
        {
            DriverInstanceId = driverInstanceId,
            PipeName = pipeName,
            SharedSecret = sharedSecret,
            ConnectTimeout = connectTimeout,
        });
    }
}
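The factory above reads three keys from the `DriverConfig` JSON: `PipeName` and `SharedSecret` are required, and `ConnectTimeoutMs` is optional with a 10 s default. A minimal sketch of a payload that would satisfy it, parsed the same way, follows. The pipe name, the secret, and the `ReadRequiredString` helper are placeholders invented for illustration; only the key names come from the removed code.

```csharp
using System;
using System.Text.Json;

// Hypothetical DriverConfig payload; the values are placeholders.
const string driverConfigJson = """
{
  "PipeName": "otopcua-galaxy-demo",
  "SharedSecret": "replace-me",
  "ConnectTimeoutMs": 5000
}
""";

using var doc = JsonDocument.Parse(driverConfigJson);
var root = doc.RootElement;

// Required keys: fail fast when a key is absent or is not a JSON string.
string pipeName = ReadRequiredString(root, "PipeName");
string sharedSecret = ReadRequiredString(root, "SharedSecret");

// Optional key: fall back to 10 seconds when it is missing or not a number.
var connectTimeout = root.TryGetProperty("ConnectTimeoutMs", out var t) && t.ValueKind == JsonValueKind.Number
    ? TimeSpan.FromMilliseconds(t.GetInt32())
    : TimeSpan.FromSeconds(10);

Console.WriteLine($"pipe={pipeName} secretLength={sharedSecret.Length} timeout={connectTimeout}");

// Illustrative helper (not in the removed file) that folds the repeated
// TryGetProperty + ValueKind check into one place.
static string ReadRequiredString(JsonElement element, string name) =>
    element.TryGetProperty(name, out var value) && value.ValueKind == JsonValueKind.String
        ? value.GetString()!
        : throw new InvalidOperationException($"config missing required {name}");
```

Any `DriverInstanceId` placed in the JSON would be ignored, since the factory takes the identifier from the database row.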
@@ -1,90 +0,0 @@
using ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Ipc;

/// <summary>
/// Phase 7 follow-up (task #247) — bridges <see cref="SqliteStoreAndForwardSink"/>'s
/// drain worker to <c>Driver.Galaxy.Host</c> over the existing <see cref="GalaxyIpcClient"/>
/// pipe. Translates <see cref="AlarmHistorianEvent"/> batches into the
/// <see cref="HistorianAlarmEventDto"/> wire format the Host expects + maps per-event
/// <see cref="HistorianAlarmEventOutcomeDto"/> responses back to
/// <see cref="HistorianWriteOutcome"/> so the SQLite queue knows what to ack /
/// dead-letter / retry.
/// </summary>
/// <remarks>
/// <para>
/// Reuses the IPC channel <see cref="GalaxyProxyDriver"/> already opens for the
/// Galaxy data plane — no second pipe to <c>Driver.Galaxy.Host</c>, no separate
/// auth handshake. The IPC client's call gate serializes historian batches with
/// driver Reads/Writes/Subscribes; historian batches are infrequent (every few
/// seconds at most under the SQLite sink's drain cadence) so the contention is
/// negligible compared to per-tag-read pressure.
/// </para>
/// <para>
/// Pipe-level transport faults (broken pipe, host crash) bubble up as
/// <see cref="GalaxyIpcException"/> which the SQLite sink's drain worker catches +
/// translates to a whole-batch RetryPlease per the
/// <see cref="SqliteStoreAndForwardSink"/> docstring — failed events stay queued
/// for the next drain tick after backoff.
/// </para>
/// </remarks>
public sealed class GalaxyHistorianWriter : IAlarmHistorianWriter
{
    private readonly GalaxyIpcClient _client;

    public GalaxyHistorianWriter(GalaxyIpcClient client)
    {
        _client = client ?? throw new ArgumentNullException(nameof(client));
    }

    public async Task<IReadOnlyList<HistorianWriteOutcome>> WriteBatchAsync(
        IReadOnlyList<AlarmHistorianEvent> batch, CancellationToken cancellationToken)
    {
        ArgumentNullException.ThrowIfNull(batch);
        if (batch.Count == 0) return [];

        var request = new HistorianAlarmEventRequest
        {
            Events = batch.Select(ToDto).ToArray(),
        };

        var response = await _client.CallAsync<HistorianAlarmEventRequest, HistorianAlarmEventResponse>(
            requestKind: MessageKind.HistorianAlarmEventRequest,
            request: request,
            expectedResponseKind: MessageKind.HistorianAlarmEventResponse,
            ct: cancellationToken).ConfigureAwait(false);

        if (response.Outcomes.Length != batch.Count)
            throw new InvalidOperationException(
                $"Galaxy.Host returned {response.Outcomes.Length} outcomes for a batch of {batch.Count} — protocol mismatch");

        var outcomes = new HistorianWriteOutcome[response.Outcomes.Length];
        for (var i = 0; i < response.Outcomes.Length; i++)
            outcomes[i] = MapOutcome(response.Outcomes[i]);
        return outcomes;
    }

    internal static HistorianAlarmEventDto ToDto(AlarmHistorianEvent e) => new()
    {
        AlarmId = e.AlarmId,
        EquipmentPath = e.EquipmentPath,
        AlarmName = e.AlarmName,
        AlarmTypeName = e.AlarmTypeName,
        Severity = (int)e.Severity,
        EventKind = e.EventKind,
        Message = e.Message,
        User = e.User,
        Comment = e.Comment,
        TimestampUtcUnixMs = new DateTimeOffset(e.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
    };

    internal static HistorianWriteOutcome MapOutcome(HistorianAlarmEventOutcomeDto wire) => wire switch
    {
        HistorianAlarmEventOutcomeDto.Ack => HistorianWriteOutcome.Ack,
        HistorianAlarmEventOutcomeDto.RetryPlease => HistorianWriteOutcome.RetryPlease,
        HistorianAlarmEventOutcomeDto.PermanentFail => HistorianWriteOutcome.PermanentFail,
        _ => throw new InvalidOperationException($"Unknown HistorianAlarmEventOutcomeDto byte {(byte)wire}"),
    };
}
@@ -1,243 +0,0 @@
using System.IO.Pipes;
using MessagePack;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Ipc;

/// <summary>
/// Client-side IPC channel to a running <c>Driver.Galaxy.Host</c>. Owns the data-plane pipe
/// connection, serializes request/response round-trips, and routes unsolicited push frames
/// (<see cref="MessageKind.OnDataChangeNotification"/>, <see cref="MessageKind.AlarmEvent"/>,
/// <see cref="MessageKind.HostConnectivityStatus"/>, <see cref="MessageKind.RuntimeStatusChange"/>,
/// <see cref="MessageKind.HistorianConnectivityStatus"/>) to a handler supplied via
/// <see cref="SetEventHandler"/>. One instance per session.
/// </summary>
/// <remarks>
/// A single background reader task owns the read side of the pipe. Calls are serialized by
/// <see cref="_writeGate"/>, so at most one pending response is outstanding at a time — the
/// reader uses a single pending-response slot. Any frame that doesn't match the pending
/// expected kind (or <see cref="MessageKind.ErrorResponse"/>) is treated as a push event and
/// forwarded to the registered handler. Without this router, a push event arriving between
/// request and response would satisfy the caller's read and fail the next
/// <see cref="CallAsync{TReq, TResp}"/> with an "Expected X, got Y" error.
/// </remarks>
public sealed class GalaxyIpcClient : IAsyncDisposable
{
    private readonly NamedPipeClientStream _stream;
    private readonly FrameReader _reader;
    private readonly FrameWriter _writer;
    private readonly SemaphoreSlim _writeGate = new(1, 1);
    private readonly CancellationTokenSource _readerCts = new();

    private readonly object _pendingLock = new();
    private TaskCompletionSource<(MessageKind Kind, byte[] Body)>? _pending;
    private MessageKind _pendingExpected;

    private Task? _readerTask;
    private Func<MessageKind, byte[], Task>? _eventHandler;

    private GalaxyIpcClient(NamedPipeClientStream stream)
    {
        _stream = stream;
        _reader = new FrameReader(stream, leaveOpen: true);
        _writer = new FrameWriter(stream, leaveOpen: true);
    }

    /// <summary>Connects, sends Hello with the shared secret, and awaits HelloAck. Throws on rejection.</summary>
    public static async Task<GalaxyIpcClient> ConnectAsync(
        string pipeName, string sharedSecret, TimeSpan connectTimeout, CancellationToken ct)
    {
        var stream = new NamedPipeClientStream(
            serverName: ".",
            pipeName: pipeName,
            direction: PipeDirection.InOut,
            options: PipeOptions.Asynchronous);

        await stream.ConnectAsync((int)connectTimeout.TotalMilliseconds, ct);

        var client = new GalaxyIpcClient(stream);
        try
        {
            await client._writer.WriteAsync(MessageKind.Hello,
                new Hello { PeerName = "Galaxy.Proxy", SharedSecret = sharedSecret }, ct);

            // Hello/HelloAck is the one round-trip that runs inline before the reader loop
            // starts — the Host expects its response-side write before accepting any other
            // frames, so there's no push-event window to worry about here.
            var ack = await client._reader.ReadFrameAsync(ct);
            if (ack is null || ack.Value.Kind != MessageKind.HelloAck)
                throw new InvalidOperationException("Did not receive HelloAck from Galaxy.Host");

            var ackMsg = FrameReader.Deserialize<HelloAck>(ack.Value.Body);
            if (!ackMsg.Accepted)
                throw new UnauthorizedAccessException($"Galaxy.Host rejected Hello: {ackMsg.RejectReason}");

            client._readerTask = Task.Run(() => client.ReadLoopAsync(client._readerCts.Token));
            return client;
        }
        catch
        {
            await client.DisposeAsync();
            throw;
        }
    }

    /// <summary>
    /// Register a handler that receives unsolicited push frames. Safe to call once per
    /// session — typically during the driver's <c>InitializeAsync</c> right after
    /// <see cref="ConnectAsync"/>. The handler is invoked on the reader's thread-pool
    /// task; it should not block. Exceptions thrown by the handler are swallowed so a
    /// buggy event subscriber cannot kill the reader loop.
    /// </summary>
    public void SetEventHandler(Func<MessageKind, byte[], Task> handler)
        => _eventHandler = handler ?? throw new ArgumentNullException(nameof(handler));

    /// <summary>Round-trips a request and returns the deserialized response.</summary>
    public async Task<TResp> CallAsync<TReq, TResp>(
        MessageKind requestKind, TReq request, MessageKind expectedResponseKind, CancellationToken ct)
    {
        await _writeGate.WaitAsync(ct);
        var tcs = new TaskCompletionSource<(MessageKind, byte[])>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        try
        {
            lock (_pendingLock)
            {
                if (_pending is not null)
                    throw new InvalidOperationException(
                        "GalaxyIpcClient pending-response slot is not empty — call re-entry is a bug");
                _pending = tcs;
                _pendingExpected = expectedResponseKind;
            }

            await _writer.WriteAsync(requestKind, request, ct);

            using var reg = ct.Register(static s =>
                ((TaskCompletionSource<(MessageKind, byte[])>)s!).TrySetCanceled(), tcs);
            var frame = await tcs.Task.ConfigureAwait(false);

            if (frame.Item1 == MessageKind.ErrorResponse)
            {
                var err = MessagePackSerializer.Deserialize<ErrorResponse>(frame.Item2);
                throw new GalaxyIpcException(err.Code, err.Message);
            }

            return MessagePackSerializer.Deserialize<TResp>(frame.Item2);
        }
        finally
        {
            lock (_pendingLock)
            {
                if (ReferenceEquals(_pending, tcs)) _pending = null;
            }
            _writeGate.Release();
        }
    }

    /// <summary>
    /// Fire-and-forget request — used for unsubscribe, alarm-ack, close-session, and other
    /// calls where the protocol is one-way. The send is still serialized through the write
    /// gate so it doesn't interleave a frame with a concurrent <see cref="CallAsync{TReq, TResp}"/>.
    /// </summary>
    public async Task SendOneWayAsync<TReq>(MessageKind requestKind, TReq request, CancellationToken ct)
    {
        await _writeGate.WaitAsync(ct);
        try { await _writer.WriteAsync(requestKind, request, ct); }
        finally { _writeGate.Release(); }
    }

    private async Task ReadLoopAsync(CancellationToken ct)
    {
        try
        {
            while (!ct.IsCancellationRequested)
            {
                (MessageKind Kind, byte[] Body)? frame;
                try
                {
                    var read = await _reader.ReadFrameAsync(ct).ConfigureAwait(false);
                    frame = read is null ? null : (read.Value.Kind, read.Value.Body);
                }
                catch (OperationCanceledException) { break; }
                catch (Exception ex)
                {
                    FailPending(ex);
                    break;
                }

                if (frame is null)
                {
                    FailPending(new EndOfStreamException("IPC peer closed the pipe"));
                    break;
                }

                // Route: response-ish frame to pending TCS if one is waiting, else treat as event.
                // ErrorResponse always terminates a pending call — that's the Host signalling a
                // request-scoped failure. Unsolicited ErrorResponse with no pending call shouldn't
                // happen under a well-formed protocol; if it does, we drop it to the event channel
                // so it shows up in logs rather than deadlocking the next CallAsync.
                TaskCompletionSource<(MessageKind, byte[])>? pendingTcs = null;
                lock (_pendingLock)
                {
                    if (_pending is not null && (frame.Value.Kind == _pendingExpected
                        || frame.Value.Kind == MessageKind.ErrorResponse))
                    {
                        pendingTcs = _pending;
                        _pending = null;
                    }
                }

                if (pendingTcs is not null)
                {
                    pendingTcs.TrySetResult(frame.Value);
                    continue;
                }

                var handler = _eventHandler;
                if (handler is null) continue;

                try { await handler(frame.Value.Kind, frame.Value.Body).ConfigureAwait(false); }
                catch
                {
                    // A buggy subscriber must not kill the reader. The handler is expected to
                    // do its own logging; swallowing here keeps the channel alive for the next
                    // frame + the next CallAsync.
                }
            }
        }
        finally
        {
            // Any still-pending call after the loop exits would otherwise hang forever.
            FailPending(new EndOfStreamException("IPC reader loop exited"));
        }
    }

    private void FailPending(Exception ex)
    {
        TaskCompletionSource<(MessageKind, byte[])>? tcs;
        lock (_pendingLock) { tcs = _pending; _pending = null; }
        tcs?.TrySetException(ex);
    }

    public async ValueTask DisposeAsync()
    {
        _readerCts.Cancel();
        if (_readerTask is not null)
        {
            try { await _readerTask.ConfigureAwait(false); } catch { /* shutdown */ }
        }

        _writeGate.Dispose();
        _reader.Dispose();
        _writer.Dispose();
        _readerCts.Dispose();
        await _stream.DisposeAsync();
    }
}

public sealed class GalaxyIpcException(string code, string message)
    : Exception($"[{code}] {message}")
{
    public string Code { get; } = code;
}
@@ -1,29 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Supervisor;

/// <summary>
/// Respawn-with-backoff schedule per <c>driver-stability.md §"Crash-loop circuit breaker"</c>:
/// 5s → 15s → 60s, capped. Reset on a successful (> <see cref="StableRunThreshold"/>)
/// run.
/// </summary>
public sealed class Backoff
{
    public static TimeSpan[] DefaultSequence { get; } =
        [TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(15), TimeSpan.FromSeconds(60)];

    public TimeSpan StableRunThreshold { get; init; } = TimeSpan.FromMinutes(2);

    private readonly TimeSpan[] _sequence;
    private int _index;

    public Backoff(TimeSpan[]? sequence = null) => _sequence = sequence ?? DefaultSequence;

    public TimeSpan Next()
    {
        var delay = _sequence[Math.Min(_index, _sequence.Length - 1)];
        _index++;
        return delay;
    }

    /// <summary>Called when the spawned process has stayed up past the stable threshold.</summary>
    public void RecordStableRun() => _index = 0;
}
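The capped schedule in the removed `Backoff` class can be traced with a small stand-alone sketch. This is an illustrative re-statement using local functions rather than the original class; the variable names are assumptions, and only the 5s, 15s, 60s sequence and the reset-on-stable-run behaviour come from the deleted code.

```csharp
using System;

// Stand-in for Backoff.DefaultSequence: 5s -> 15s -> 60s, capped at the last entry.
TimeSpan[] sequence = { TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(15), TimeSpan.FromSeconds(60) };
var index = 0;

// Same indexing rule as Backoff.Next(): clamp to the final entry once the sequence is exhausted.
TimeSpan Next()
{
    var delay = sequence[Math.Min(index, sequence.Length - 1)];
    index++;
    return delay;
}

// Same effect as Backoff.RecordStableRun(): rewind to the first delay.
void RecordStableRun() => index = 0;

// Four consecutive respawns: the fourth delay stays at the 60 s cap.
for (var attempt = 1; attempt <= 4; attempt++)
    Console.WriteLine($"respawn {attempt}: wait {Next().TotalSeconds}s");

// A run that outlives the stable threshold rewinds the schedule to 5 s.
RecordStableRun();
Console.WriteLine($"after stable run: wait {Next().TotalSeconds}s");
```

Note that the class itself never measures run time; the caller decides when a run counts as stable and calls `RecordStableRun()`.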
@@ -1,68 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Supervisor;

/// <summary>
/// Crash-loop circuit breaker per <c>driver-stability.md</c>:
/// 3 crashes within 5 min → open with escalating cooldown 1h → 4h → 24h manual. A sticky
/// alert stays until the operator explicitly resets.
/// </summary>
public sealed class CircuitBreaker
{
    public int CrashesAllowedPerWindow { get; init; } = 3;
    public TimeSpan Window { get; init; } = TimeSpan.FromMinutes(5);

    public TimeSpan[] CooldownEscalation { get; init; } =
        [TimeSpan.FromHours(1), TimeSpan.FromHours(4), TimeSpan.MaxValue];

    private readonly List<DateTime> _crashesUtc = [];
    private DateTime? _openSinceUtc;
    private int _escalationLevel;
    public bool StickyAlertActive { get; private set; }

    /// <summary>
    /// Called by the supervisor each time the host process exits unexpectedly. Returns
    /// <c>false</c> when the breaker is open — supervisor must not respawn.
    /// </summary>
    public bool TryRecordCrash(DateTime utcNow, out TimeSpan cooldownRemaining)
    {
        if (_openSinceUtc is { } openedAt)
        {
            var cooldown = CooldownEscalation[Math.Min(_escalationLevel, CooldownEscalation.Length - 1)];
            if (cooldown == TimeSpan.MaxValue)
            {
                cooldownRemaining = TimeSpan.MaxValue;
                return false; // manual reset required
            }
            if (utcNow - openedAt < cooldown)
            {
                cooldownRemaining = cooldown - (utcNow - openedAt);
                return false;
            }

            // Cooldown elapsed — close the breaker but keep the sticky alert per spec.
            _openSinceUtc = null;
            _escalationLevel++;
        }

        _crashesUtc.RemoveAll(t => utcNow - t > Window);
        _crashesUtc.Add(utcNow);

        if (_crashesUtc.Count > CrashesAllowedPerWindow)
        {
            _openSinceUtc = utcNow;
            StickyAlertActive = true;
            cooldownRemaining = CooldownEscalation[Math.Min(_escalationLevel, CooldownEscalation.Length - 1)];
            return false;
        }

        cooldownRemaining = TimeSpan.Zero;
        return true;
    }

    public void ManualReset()
    {
        _crashesUtc.Clear();
        _openSinceUtc = null;
        _escalationLevel = 0;
        StickyAlertActive = false;
    }
}
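The sliding-window part of `TryRecordCrash` is easiest to follow with concrete timestamps. The sketch below isolates just that part (prune, add, compare) with local functions; the cooldown escalation and sticky alert are left out, and the dates are arbitrary placeholders. As the removed code reads, the comparison is `Count > CrashesAllowedPerWindow`, so with the default allowance of 3 it is the fourth crash inside the window that trips the breaker.

```csharp
using System;
using System.Collections.Generic;

var window = TimeSpan.FromMinutes(5);       // mirrors CircuitBreaker.Window
const int crashesAllowedPerWindow = 3;      // mirrors CircuitBreaker.CrashesAllowedPerWindow
var crashesUtc = new List<DateTime>();

// Returns true while a respawn is still permitted after recording this crash.
bool RecordCrash(DateTime utcNow)
{
    crashesUtc.RemoveAll(t => utcNow - t > window); // drop crashes that aged out of the window
    crashesUtc.Add(utcNow);
    return crashesUtc.Count <= crashesAllowedPerWindow;
}

// Placeholder start time; one crash per minute.
var t0 = new DateTime(2025, 1, 1, 0, 0, 0, DateTimeKind.Utc);
for (var i = 0; i < 4; i++)
    Console.WriteLine($"crash {i + 1} at +{i} min: respawn allowed = {RecordCrash(t0.AddMinutes(i))}");
// The first three lines report True, the fourth reports False.
```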
@@ -1,28 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Supervisor;

/// <summary>
/// Tracks missed heartbeats on the dedicated heartbeat pipe per
/// <c>driver-stability.md §"Heartbeat between proxy and host"</c>: 2s cadence, 3 consecutive
/// misses = host declared dead (~6s detection).
/// </summary>
public sealed class HeartbeatMonitor
{
    public int MissesUntilDead { get; init; } = 3;

    public TimeSpan Cadence { get; init; } = TimeSpan.FromSeconds(2);

    public int ConsecutiveMisses { get; private set; }
    public DateTime? LastAckUtc { get; private set; }

    public void RecordAck(DateTime utcNow)
    {
        ConsecutiveMisses = 0;
        LastAckUtc = utcNow;
    }

    public bool RecordMiss()
    {
        ConsecutiveMisses++;
        return ConsecutiveMisses >= MissesUntilDead;
    }
}
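A short trace shows why the misses have to be consecutive: any ack resets the streak. This is a stand-alone sketch with local functions in place of the removed `HeartbeatMonitor`; only the threshold of 3 and the reset-on-ack rule are taken from the deleted code.

```csharp
using System;

const int missesUntilDead = 3;   // mirrors HeartbeatMonitor.MissesUntilDead
var consecutiveMisses = 0;

// An ack clears the streak, as RecordAck does.
void RecordAck() => consecutiveMisses = 0;

// Returns true once the streak reaches the threshold, as RecordMiss does.
bool RecordMiss() => ++consecutiveMisses >= missesUntilDead;

Console.WriteLine(RecordMiss()); // False (1 miss)
Console.WriteLine(RecordMiss()); // False (2 misses)
RecordAck();                     // streak back to 0
Console.WriteLine(RecordMiss()); // False (1 miss)
Console.WriteLine(RecordMiss()); // False (2 misses)
Console.WriteLine(RecordMiss()); // True  (3 consecutive misses, host declared dead)
```

At the documented 2 s cadence, three consecutive misses correspond to the roughly 6 s detection time stated in the summary comment.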
@@ -1,30 +0,0 @@
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <Nullable>enable</Nullable>
    <ImplicitUsings>enable</ImplicitUsings>
    <LangVersion>latest</LangVersion>
    <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
    <GenerateDocumentationFile>true</GenerateDocumentationFile>
    <NoWarn>$(NoWarn);CS1591</NoWarn>
    <RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy</RootNamespace>
  </PropertyGroup>

  <ItemGroup>
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core.Abstractions\ZB.MOM.WW.OtOpcUa.Core.Abstractions.csproj"/>
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core\ZB.MOM.WW.OtOpcUa.Core.csproj"/>
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
    <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian\ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.csproj"/>
  </ItemGroup>

  <ItemGroup>
    <InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests"/>
  </ItemGroup>

  <ItemGroup>
    <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
    <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
  </ItemGroup>

</Project>
@@ -1,32 +0,0 @@
using MessagePack;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

[MessagePackObject]
public sealed class AlarmSubscribeRequest
{
    [Key(0)] public long SessionId { get; set; }
}

[MessagePackObject]
public sealed class GalaxyAlarmEvent
{
    [Key(0)] public string EventId { get; set; } = string.Empty;
    [Key(1)] public string ObjectTagName { get; set; } = string.Empty;
    [Key(2)] public string AlarmName { get; set; } = string.Empty;
    [Key(3)] public int Severity { get; set; }

    /// <summary>Per OPC UA Part 9 lifecycle: Active, Unacknowledged, Confirmed, Inactive, etc.</summary>
    [Key(4)] public string StateTransition { get; set; } = string.Empty;

    [Key(5)] public string Message { get; set; } = string.Empty;
    [Key(6)] public long UtcUnixMs { get; set; }
}

[MessagePackObject]
public sealed class AlarmAckRequest
{
    [Key(0)] public long SessionId { get; set; }
    [Key(1)] public string EventId { get; set; } = string.Empty;
    [Key(2)] public string Comment { get; set; } = string.Empty;
}
@@ -1,53 +0,0 @@
using MessagePack;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

/// <summary>
/// IPC-shape for a tag value snapshot. Per decision #13: value + StatusCode + source + server timestamps.
/// </summary>
[MessagePackObject]
public sealed class GalaxyDataValue
{
    [Key(0)] public string TagReference { get; set; } = string.Empty;
    [Key(1)] public byte[]? ValueBytes { get; set; }
    [Key(2)] public int ValueMessagePackType { get; set; }
    [Key(3)] public uint StatusCode { get; set; }
    [Key(4)] public long SourceTimestampUtcUnixMs { get; set; }
    [Key(5)] public long ServerTimestampUtcUnixMs { get; set; }
}

[MessagePackObject]
public sealed class ReadValuesRequest
{
    [Key(0)] public long SessionId { get; set; }
    [Key(1)] public string[] TagReferences { get; set; } = System.Array.Empty<string>();
}

[MessagePackObject]
public sealed class ReadValuesResponse
{
    [Key(0)] public bool Success { get; set; }
    [Key(1)] public string? Error { get; set; }
    [Key(2)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}

[MessagePackObject]
public sealed class WriteValuesRequest
{
    [Key(0)] public long SessionId { get; set; }
    [Key(1)] public GalaxyDataValue[] Writes { get; set; } = System.Array.Empty<GalaxyDataValue>();
}

[MessagePackObject]
public sealed class WriteValueResult
{
    [Key(0)] public string TagReference { get; set; } = string.Empty;
    [Key(1)] public uint StatusCode { get; set; }
    [Key(2)] public string? Error { get; set; }
}

[MessagePackObject]
public sealed class WriteValuesResponse
{
    [Key(0)] public WriteValueResult[] Results { get; set; } = System.Array.Empty<WriteValueResult>();
}
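The timestamp fields in these contracts (`SourceTimestampUtcUnixMs`, `ServerTimestampUtcUnixMs`) are plain `long` values holding UTC Unix milliseconds. How the removed Host and Proxy converted them is not shown in this diff; the sketch below assumes the standard `DateTimeOffset` helpers and uses an arbitrary placeholder instant.

```csharp
using System;

// Placeholder instant, already in UTC.
var source = new DateTimeOffset(2025, 1, 1, 12, 0, 0, TimeSpan.Zero);

// Encode for the wire: a long is trivially MessagePack-friendly and has no time-zone ambiguity.
long sourceTimestampUtcUnixMs = source.ToUnixTimeMilliseconds();

// Decode on the receiving side.
var decoded = DateTimeOffset.FromUnixTimeMilliseconds(sourceTimestampUtcUnixMs);

Console.WriteLine(sourceTimestampUtcUnixMs);
Console.WriteLine(decoded == source); // True: millisecond precision round-trips exactly here
```

Sub-millisecond precision is lost by this encoding, which is worth keeping in mind when comparing wire timestamps with values read directly from a source that reports finer resolution.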
@@ -1,50 +0,0 @@
using MessagePack;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

[MessagePackObject]
public sealed class DiscoverHierarchyRequest
{
    [Key(0)] public long SessionId { get; set; }
}

/// <summary>
/// IPC-shape for a Galaxy object. Proxy maps to/from <c>DriverAttributeInfo</c> (Core.Abstractions).
/// </summary>
[MessagePackObject]
public sealed class GalaxyObjectInfo
{
    [Key(0)] public string ContainedName { get; set; } = string.Empty;
    [Key(1)] public string TagName { get; set; } = string.Empty;
    [Key(2)] public string? ParentContainedName { get; set; }
    [Key(3)] public string TemplateCategory { get; set; } = string.Empty;
    [Key(4)] public GalaxyAttributeInfo[] Attributes { get; set; } = System.Array.Empty<GalaxyAttributeInfo>();
}

[MessagePackObject]
public sealed class GalaxyAttributeInfo
{
    [Key(0)] public string AttributeName { get; set; } = string.Empty;
    [Key(1)] public int MxDataType { get; set; }
    [Key(2)] public bool IsArray { get; set; }
    [Key(3)] public uint? ArrayDim { get; set; }
    [Key(4)] public int SecurityClassification { get; set; }
    [Key(5)] public bool IsHistorized { get; set; }

    /// <summary>
    /// True when the attribute has an AlarmExtension primitive in the Galaxy repository
    /// (<c>primitive_definition.primitive_name = 'AlarmExtension'</c>). The generic
    /// node-manager uses this to enrich the variable's OPC UA node with an
    /// <c>AlarmConditionState</c> during address-space build. Added in PR 9 as the
    /// discovery-side foundation for the alarm event wire-up that follows in PR 10+.
    /// </summary>
    [Key(6)] public bool IsAlarm { get; set; }
}

[MessagePackObject]
public sealed class DiscoverHierarchyResponse
{
    [Key(0)] public bool Success { get; set; }
    [Key(1)] public string? Error { get; set; }
    [Key(2)] public GalaxyObjectInfo[] Objects { get; set; } = System.Array.Empty<GalaxyObjectInfo>();
}
@@ -1,75 +0,0 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

/// <summary>
/// Length-prefixed framing per decision #28. Each IPC frame is:
/// <c>[4-byte big-endian length][1-byte message kind][MessagePack body]</c>.
/// Length is the body size only; the kind byte is not part of the prefixed length.
/// </summary>
public static class Framing
{
    public const int LengthPrefixSize = 4;
    public const int KindByteSize = 1;

    /// <summary>
    /// Maximum permitted body length (16 MiB). Protects the receiver from a hostile or
    /// misbehaving peer sending an oversized length prefix.
    /// </summary>
    public const int MaxFrameBodyBytes = 16 * 1024 * 1024;
}

/// <summary>
/// Wire identifier for each contract. Values are stable — new contracts append.
/// </summary>
public enum MessageKind : byte
{
    Hello = 0x01,
    HelloAck = 0x02,
    Heartbeat = 0x03,
    HeartbeatAck = 0x04,

    OpenSessionRequest = 0x10,
    OpenSessionResponse = 0x11,
    CloseSessionRequest = 0x12,

    DiscoverHierarchyRequest = 0x20,
    DiscoverHierarchyResponse = 0x21,

    ReadValuesRequest = 0x30,
    ReadValuesResponse = 0x31,
    WriteValuesRequest = 0x32,
    WriteValuesResponse = 0x33,

    SubscribeRequest = 0x40,
    SubscribeResponse = 0x41,
    UnsubscribeRequest = 0x42,
    OnDataChangeNotification = 0x43,

    AlarmSubscribeRequest = 0x50,
    AlarmEvent = 0x51,
    AlarmAckRequest = 0x52,

    HistoryReadRequest = 0x60,
    HistoryReadResponse = 0x61,
    HistoryReadProcessedRequest = 0x62,
    HistoryReadProcessedResponse = 0x63,
    HistoryReadAtTimeRequest = 0x64,
    HistoryReadAtTimeResponse = 0x65,
    HistoryReadEventsRequest = 0x66,
    HistoryReadEventsResponse = 0x67,

    HostConnectivityStatus = 0x70,
    RuntimeStatusChange = 0x71,

    // Phase 7 Stream D — historian alarm sink. Main server → Galaxy.Host batched
    // writes into the Aveva Historian alarm schema via the already-loaded
    // aahClientManaged DLLs. HistorianConnectivityStatus fires proactively from the
    // Host when the SDK session transitions so diagnostics flip promptly.
    HistorianAlarmEventRequest = 0x80,
    HistorianAlarmEventResponse = 0x81,
    HistorianConnectivityStatus = 0x82,

    RecycleHostRequest = 0xF0,
    RecycleStatusResponse = 0xF1,

    ErrorResponse = 0xFE,
}
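The frame layout described in the `Framing` summary can be checked byte by byte. The sketch below builds one frame by hand; `BuildFrame` is a hypothetical helper (not part of the removed code), and the two-byte body is a placeholder rather than real MessagePack output. It relies only on `BinaryPrimitives` from the .NET base class library.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper: [4-byte big-endian body length][1-byte kind][body].
byte[] BuildFrame(byte kind, byte[] body)
{
    var buffer = new byte[4 + 1 + body.Length];
    // The prefix carries the body size only; the kind byte is not counted.
    BinaryPrimitives.WriteInt32BigEndian(buffer.AsSpan(0, 4), body.Length);
    buffer[4] = kind;
    body.CopyTo(buffer, 5);
    return buffer;
}

// 0x03 is MessageKind.Heartbeat in the enum above.
var frame = BuildFrame(0x03, new byte[] { 0xAA, 0xBB });
Console.WriteLine(BitConverter.ToString(frame)); // 00-00-00-02-03-AA-BB
```

The total bytes on the wire are therefore `LengthPrefixSize + KindByteSize + body.Length`, here 4 + 1 + 2 = 7.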
@@ -1,36 +0,0 @@
using MessagePack;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

/// <summary>
/// First frame of every connection. Advertises protocol major/minor and the peer's feature set.
/// Major mismatch is fatal; minor is advisory. Per Task A.3.
/// </summary>
[MessagePackObject]
public sealed class Hello
{
    public const int CurrentMajor = 1;
    public const int CurrentMinor = 0;

    [Key(0)] public int ProtocolMajor { get; set; } = CurrentMajor;
    [Key(1)] public int ProtocolMinor { get; set; } = CurrentMinor;
    [Key(2)] public string PeerName { get; set; } = string.Empty;

    /// <summary>Per-process shared secret — verified on the Host side against the value passed by the supervisor at spawn time.</summary>
    [Key(3)] public string SharedSecret { get; set; } = string.Empty;

    [Key(4)] public string[] Features { get; set; } = System.Array.Empty<string>();
}

[MessagePackObject]
public sealed class HelloAck
{
    [Key(0)] public int ProtocolMajor { get; set; } = Hello.CurrentMajor;
    [Key(1)] public int ProtocolMinor { get; set; } = Hello.CurrentMinor;

    /// <summary>True if the server accepted the hello; false + <see cref="RejectReason"/> filled if not.</summary>
    [Key(2)] public bool Accepted { get; set; }
    [Key(3)] public string? RejectReason { get; set; }

    [Key(4)] public string HostName { get; set; } = string.Empty;
}
@@ -1,92 +0,0 @@
using System;
using MessagePack;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

/// <summary>
/// Phase 7 Stream D — IPC contracts for routing Part 9 alarm transitions from the
/// main .NET 10 server into Galaxy.Host's already-loaded <c>aahClientManaged</c>
/// DLLs. Reuses the Tier-C isolation + licensing pathway rather than loading 32-bit
/// native historian code into the main server.
/// </summary>
/// <remarks>
/// <para>
/// Batched on the wire to amortize IPC overhead — the main server's SqliteStoreAndForwardSink
/// ships up to 100 events per request per Phase 7 plan Stream D.5.
/// </para>
/// <para>
/// Per-event outcomes (Ack / RetryPlease / PermanentFail) let the drain worker
/// dead-letter malformed events without blocking neighbors in the batch.
/// <see cref="HistorianConnectivityStatusNotification"/> fires proactively from
/// the Host when the SDK session drops so the /hosts + /alarms/historian Admin
/// diagnostics pages flip to red promptly instead of waiting for the next
/// drain cycle.
/// </para>
/// </remarks>
[MessagePackObject]
public sealed class HistorianAlarmEventRequest
{
    [Key(0)] public HistorianAlarmEventDto[] Events { get; set; } = Array.Empty<HistorianAlarmEventDto>();
}

[MessagePackObject]
public sealed class HistorianAlarmEventResponse
{
    /// <summary>Per-event outcome, same order as the request.</summary>
    [Key(0)] public HistorianAlarmEventOutcomeDto[] Outcomes { get; set; } = Array.Empty<HistorianAlarmEventOutcomeDto>();
}

/// <summary>Outcome enum — bytes on the wire so it stays compact.</summary>
public enum HistorianAlarmEventOutcomeDto : byte
{
    /// <summary>Successfully persisted to the historian — remove from queue.</summary>
    Ack = 0,
    /// <summary>Transient failure (historian disconnected, timeout, busy) — retry after backoff.</summary>
    RetryPlease = 1,
    /// <summary>Permanent failure (malformed, unrecoverable SDK error) — move to dead-letter.</summary>
    PermanentFail = 2,
}

/// <summary>One alarm-transition payload. Fields mirror <c>Core.AlarmHistorian.AlarmHistorianEvent</c>.</summary>
[MessagePackObject]
public sealed class HistorianAlarmEventDto
{
    [Key(0)] public string AlarmId { get; set; } = string.Empty;
    [Key(1)] public string EquipmentPath { get; set; } = string.Empty;
    [Key(2)] public string AlarmName { get; set; } = string.Empty;

    /// <summary>Concrete Part 9 subtype name — "LimitAlarm" / "OffNormalAlarm" / "AlarmCondition" / "DiscreteAlarm".</summary>
    [Key(3)] public string AlarmTypeName { get; set; } = string.Empty;

    /// <summary>Numeric severity the Host maps to the historian's priority scale.</summary>
    [Key(4)] public int Severity { get; set; }

    /// <summary>Which transition this event represents — "Activated" / "Cleared" / "Acknowledged" / etc.</summary>
    [Key(5)] public string EventKind { get; set; } = string.Empty;

    /// <summary>Pre-rendered message — template tokens resolved upstream.</summary>
    [Key(6)] public string Message { get; set; } = string.Empty;

    /// <summary>Operator who triggered the transition. "system" for engine-driven events.</summary>
    [Key(7)] public string User { get; set; } = "system";

    /// <summary>Operator-supplied free-form comment, if any.</summary>
    [Key(8)] public string? Comment { get; set; }

    /// <summary>Source timestamp (UTC Unix milliseconds).</summary>
    [Key(9)] public long TimestampUtcUnixMs { get; set; }
}

/// <summary>
/// Proactive notification — Galaxy.Host pushes this when the historian SDK session
/// transitions (connected / disconnected / degraded). The main server reflects this
/// into the historian sink status so Admin UI surfaces the problem without the
/// operator having to scrutinize drain cadence.
/// </summary>
[MessagePackObject]
public sealed class HistorianConnectivityStatusNotification
{
    [Key(0)] public string Status { get; set; } = "unknown"; // connected | disconnected | degraded
    [Key(1)] public string? Detail { get; set; }
    [Key(2)] public long ObservedAtUtcUnixMs { get; set; }
}
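The point of per-event outcomes is that one bad event does not hold up the rest of its batch. The drain worker itself is not part of this diff, so the sketch below is an assumed shape of that loop: it pairs each outcome with the event at the same index and sorts events into "done", "retry" and "dead-letter". The byte values match `HistorianAlarmEventOutcomeDto`; the alarm ids are made up.

```csharp
using System;
using System.Collections.Generic;

// Wire values of HistorianAlarmEventOutcomeDto.
const byte Ack = 0, RetryPlease = 1, PermanentFail = 2;

string[] batch = { "evt-1", "evt-2", "evt-3" };          // hypothetical alarm ids in request order
byte[] outcomes = { Ack, PermanentFail, RetryPlease };   // response outcomes, same order as the request

var retry = new List<string>();
var deadLetter = new List<string>();

for (var i = 0; i < batch.Length; i++)
{
    switch (outcomes[i])
    {
        case Ack:
            break;                          // persisted: drop from the queue
        case RetryPlease:
            retry.Add(batch[i]);            // transient failure: keep for a later attempt
            break;
        case PermanentFail:
            deadLetter.Add(batch[i]);       // malformed or unrecoverable: move aside
            break;
    }
}

Console.WriteLine($"retry: {string.Join(",", retry)}; dead-letter: {string.Join(",", deadLetter)}");
// prints "retry: evt-3; dead-letter: evt-2"
```

A real consumer would also need to handle a response whose `Outcomes` length differs from the request's `Events` length; the contract comment only promises matching order.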
@@ -1,110 +0,0 @@
using MessagePack;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

[MessagePackObject]
public sealed class HistoryReadRequest
{
    [Key(0)] public long SessionId { get; set; }
    [Key(1)] public string[] TagReferences { get; set; } = System.Array.Empty<string>();
    [Key(2)] public long StartUtcUnixMs { get; set; }
    [Key(3)] public long EndUtcUnixMs { get; set; }
    [Key(4)] public uint MaxValuesPerTag { get; set; } = 1000;
}

[MessagePackObject]
public sealed class HistoryTagValues
{
    [Key(0)] public string TagReference { get; set; } = string.Empty;
    [Key(1)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}

[MessagePackObject]
public sealed class HistoryReadResponse
{
    [Key(0)] public bool Success { get; set; }
    [Key(1)] public string? Error { get; set; }
    [Key(2)] public HistoryTagValues[] Tags { get; set; } = System.Array.Empty<HistoryTagValues>();
}

/// <summary>
/// Processed (aggregated) historian read — OPC UA HistoryReadProcessed service. The
/// aggregate column is a string (e.g. "Average", "Minimum") mapped by the Proxy from the
/// OPC UA HistoryAggregateType enum so Galaxy.Host stays OPC-UA-free.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadProcessedRequest
{
    [Key(0)] public long SessionId { get; set; }
    [Key(1)] public string TagReference { get; set; } = string.Empty;
    [Key(2)] public long StartUtcUnixMs { get; set; }
    [Key(3)] public long EndUtcUnixMs { get; set; }
    [Key(4)] public long IntervalMs { get; set; }
    [Key(5)] public string AggregateColumn { get; set; } = "Average";
}

[MessagePackObject]
public sealed class HistoryReadProcessedResponse
{
    [Key(0)] public bool Success { get; set; }
    [Key(1)] public string? Error { get; set; }
    [Key(2)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}

/// <summary>
/// At-time historian read — OPC UA HistoryReadAtTime service. Returns one sample per
/// requested timestamp (interpolated when no exact match exists). The per-timestamp array
/// is flow-encoded as Unix milliseconds to avoid MessagePack DateTime quirks.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadAtTimeRequest
{
    [Key(0)] public long SessionId { get; set; }
    [Key(1)] public string TagReference { get; set; } = string.Empty;
    [Key(2)] public long[] TimestampsUtcUnixMs { get; set; } = System.Array.Empty<long>();
}

[MessagePackObject]
public sealed class HistoryReadAtTimeResponse
{
    [Key(0)] public bool Success { get; set; }
    [Key(1)] public string? Error { get; set; }
    [Key(2)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}

/// <summary>
/// Historical events read — OPC UA HistoryReadEvents service and Alarm & Condition
/// history. <c>SourceName</c> null means "all sources". Distinct from the live
/// <see cref="GalaxyAlarmEvent"/> stream because historical rows carry both
/// <c>EventTime</c> (when the event occurred in the process) and <c>ReceivedTime</c>
/// (when the Historian persisted it) and have no StateTransition — the Historian logs
/// the instantaneous event, not the OPC UA alarm lifecycle.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadEventsRequest
{
    [Key(0)] public long SessionId { get; set; }
    [Key(1)] public string? SourceName { get; set; }
    [Key(2)] public long StartUtcUnixMs { get; set; }
    [Key(3)] public long EndUtcUnixMs { get; set; }
    [Key(4)] public int MaxEvents { get; set; } = 1000;
}

[MessagePackObject]
public sealed class GalaxyHistoricalEvent
{
    [Key(0)] public string EventId { get; set; } = string.Empty;
    [Key(1)] public string? SourceName { get; set; }
    [Key(2)] public long EventTimeUtcUnixMs { get; set; }
    [Key(3)] public long ReceivedTimeUtcUnixMs { get; set; }
    [Key(4)] public string? DisplayText { get; set; }
    [Key(5)] public ushort Severity { get; set; }
}

[MessagePackObject]
public sealed class HistoryReadEventsResponse
{
    [Key(0)] public bool Success { get; set; }
    [Key(1)] public string? Error { get; set; }
    [Key(2)] public GalaxyHistoricalEvent[] Events { get; set; } = System.Array.Empty<GalaxyHistoricalEvent>();
}
@@ -1,47 +0,0 @@
using MessagePack;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

[MessagePackObject]
public sealed class OpenSessionRequest
{
    [Key(0)] public string DriverInstanceId { get; set; } = string.Empty;

    /// <summary>JSON blob sourced from <c>DriverInstance.DriverConfig</c>.</summary>
    [Key(1)] public string DriverConfigJson { get; set; } = string.Empty;
}

[MessagePackObject]
public sealed class OpenSessionResponse
{
    [Key(0)] public bool Success { get; set; }
    [Key(1)] public string? Error { get; set; }
    [Key(2)] public long SessionId { get; set; }
}

[MessagePackObject]
public sealed class CloseSessionRequest
{
    [Key(0)] public long SessionId { get; set; }
}

[MessagePackObject]
public sealed class Heartbeat
{
    [Key(0)] public long SequenceNumber { get; set; }
    [Key(1)] public long UtcUnixMs { get; set; }
}

[MessagePackObject]
public sealed class HeartbeatAck
{
    [Key(0)] public long SequenceNumber { get; set; }
    [Key(1)] public long UtcUnixMs { get; set; }
}

[MessagePackObject]
public sealed class ErrorResponse
{
    [Key(0)] public string Code { get; set; } = string.Empty;
    [Key(1)] public string Message { get; set; } = string.Empty;
}
@@ -1,34 +0,0 @@
using MessagePack;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

/// <summary>Per-host runtime status — per <c>driver-stability.md</c> Galaxy §"Connection Health Probe".</summary>
[MessagePackObject]
public sealed class HostConnectivityStatus
{
    [Key(0)] public string HostName { get; set; } = string.Empty;
    [Key(1)] public string RuntimeStatus { get; set; } = string.Empty; // Running | Stopped | Unknown
    [Key(2)] public long LastObservedUtcUnixMs { get; set; }
}

[MessagePackObject]
public sealed class RuntimeStatusChangeNotification
{
    [Key(0)] public HostConnectivityStatus Status { get; set; } = new();
}

[MessagePackObject]
public sealed class RecycleHostRequest
{
    /// <summary>One of: Soft, Hard.</summary>
    [Key(0)] public string Kind { get; set; } = "Soft";
    [Key(1)] public string Reason { get; set; } = string.Empty;
}

[MessagePackObject]
public sealed class RecycleStatusResponse
{
    [Key(0)] public bool Accepted { get; set; }
    [Key(1)] public int GraceSeconds { get; set; } = 15;
    [Key(2)] public string? Error { get; set; }
}
@@ -1,34 +0,0 @@
using MessagePack;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

[MessagePackObject]
public sealed class SubscribeRequest
{
    [Key(0)] public long SessionId { get; set; }
    [Key(1)] public string[] TagReferences { get; set; } = System.Array.Empty<string>();
    [Key(2)] public int RequestedIntervalMs { get; set; } = 1000;
}

[MessagePackObject]
public sealed class SubscribeResponse
{
    [Key(0)] public bool Success { get; set; }
    [Key(1)] public string? Error { get; set; }
    [Key(2)] public long SubscriptionId { get; set; }
    [Key(3)] public int ActualIntervalMs { get; set; }
}

[MessagePackObject]
public sealed class UnsubscribeRequest
{
    [Key(0)] public long SessionId { get; set; }
    [Key(1)] public long SubscriptionId { get; set; }
}

[MessagePackObject]
public sealed class OnDataChangeNotification
{
    [Key(0)] public long SubscriptionId { get; set; }
    [Key(1)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}
@@ -1,67 +0,0 @@
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared;

/// <summary>
/// Reads length-prefixed, kind-tagged frames from a stream. Single-consumer — do not call
/// <see cref="ReadFrameAsync"/> from multiple threads against the same instance.
/// </summary>
public sealed class FrameReader : IDisposable
{
    private readonly Stream _stream;
    private readonly bool _leaveOpen;

    public FrameReader(Stream stream, bool leaveOpen = false)
    {
        _stream = stream ?? throw new ArgumentNullException(nameof(stream));
        _leaveOpen = leaveOpen;
    }

    public async Task<(MessageKind Kind, byte[] Body)?> ReadFrameAsync(CancellationToken ct)
    {
        var lengthPrefix = new byte[Framing.LengthPrefixSize];
        if (!await ReadExactAsync(lengthPrefix, ct).ConfigureAwait(false))
            return null; // clean EOF on frame boundary

        var length = (lengthPrefix[0] << 24) | (lengthPrefix[1] << 16) | (lengthPrefix[2] << 8) | lengthPrefix[3];
        if (length < 0 || length > Framing.MaxFrameBodyBytes)
            throw new InvalidDataException($"IPC frame length {length} out of range.");

        var kindByte = _stream.ReadByte();
        if (kindByte < 0) throw new EndOfStreamException("EOF after length prefix, before kind byte.");

        var body = new byte[length];
        if (!await ReadExactAsync(body, ct).ConfigureAwait(false))
            throw new EndOfStreamException("EOF mid-frame.");

        return ((MessageKind)(byte)kindByte, body);
    }

    public static T Deserialize<T>(byte[] body) => MessagePackSerializer.Deserialize<T>(body);

    private async Task<bool> ReadExactAsync(byte[] buffer, CancellationToken ct)
    {
        var offset = 0;
        while (offset < buffer.Length)
        {
            var read = await _stream.ReadAsync(buffer, offset, buffer.Length - offset, ct).ConfigureAwait(false);
            if (read == 0)
            {
                if (offset == 0) return false;
                throw new EndOfStreamException($"Stream ended after reading {offset} of {buffer.Length} bytes.");
            }
            offset += read;
        }
        return true;
    }

    public void Dispose()
    {
        if (!_leaveOpen) _stream.Dispose();
    }
}
@@ -1,57 +0,0 @@
-using System;
-using System.IO;
-using System.Threading;
-using System.Threading.Tasks;
-using MessagePack;
-using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
-
-namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared;
-
-/// <summary>
-/// Writes length-prefixed, kind-tagged MessagePack frames to a stream. Thread-safe via
-/// <see cref="SemaphoreSlim"/> — multiple producers (e.g. heartbeat + data-plane sharing a stream)
-/// get serialized writes.
-/// </summary>
-public sealed class FrameWriter : IDisposable
-{
-    private readonly Stream _stream;
-    private readonly SemaphoreSlim _gate = new(1, 1);
-    private readonly bool _leaveOpen;
-
-    public FrameWriter(Stream stream, bool leaveOpen = false)
-    {
-        _stream = stream ?? throw new ArgumentNullException(nameof(stream));
-        _leaveOpen = leaveOpen;
-    }
-
-    public async Task WriteAsync<T>(MessageKind kind, T message, CancellationToken ct)
-    {
-        var body = MessagePackSerializer.Serialize(message, cancellationToken: ct);
-        if (body.Length > Framing.MaxFrameBodyBytes)
-            throw new InvalidOperationException(
-                $"IPC frame body {body.Length} exceeds {Framing.MaxFrameBodyBytes} byte cap.");
-
-        var lengthPrefix = new byte[Framing.LengthPrefixSize];
-        // Big-endian — easy to read in hex dumps.
-        lengthPrefix[0] = (byte)((body.Length >> 24) & 0xFF);
-        lengthPrefix[1] = (byte)((body.Length >> 16) & 0xFF);
-        lengthPrefix[2] = (byte)((body.Length >> 8) & 0xFF);
-        lengthPrefix[3] = (byte)( body.Length & 0xFF);
-
-        await _gate.WaitAsync(ct).ConfigureAwait(false);
-        try
-        {
-            await _stream.WriteAsync(lengthPrefix, 0, lengthPrefix.Length, ct).ConfigureAwait(false);
-            _stream.WriteByte((byte)kind);
-            await _stream.WriteAsync(body, 0, body.Length, ct).ConfigureAwait(false);
-            await _stream.FlushAsync(ct).ConfigureAwait(false);
-        }
-        finally { _gate.Release(); }
-    }
-
-    public void Dispose()
-    {
-        _gate.Dispose();
-        if (!_leaveOpen) _stream.Dispose();
-    }
-}
||||
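The writer above emits each frame as a big-endian length prefix, one kind byte, and then the MessagePack body. As a rough illustration of that layout, the sketch below encodes and decodes only the header using `BinaryPrimitives`. `FrameHeader` is a hypothetical helper, and the 4-byte prefix size is an assumption read off the four shifted bytes written above, since the value of `Framing.LengthPrefixSize` is not shown here.

```csharp
using System;
using System.Buffers.Binary;

public static class FrameHeader
{
    // Assumed layout: 4-byte big-endian body length followed by 1 kind byte.
    public const int Size = 5;

    public static byte[] Encode(int bodyLength, byte kind)
    {
        if (bodyLength < 0) throw new ArgumentOutOfRangeException(nameof(bodyLength));
        var header = new byte[Size];
        BinaryPrimitives.WriteInt32BigEndian(header.AsSpan(0, 4), bodyLength);
        header[4] = kind;
        return header;
    }

    public static (int BodyLength, byte Kind) Decode(ReadOnlySpan<byte> header)
    {
        if (header.Length < Size) throw new ArgumentException("Header too short.", nameof(header));
        return (BinaryPrimitives.ReadInt32BigEndian(header.Slice(0, 4)), header[4]);
    }
}
```

Note that the length covers the body only; the kind byte sits between the prefix and the body and is not counted.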
-23
@@ -1,23 +0,0 @@
-<Project Sdk="Microsoft.NET.Sdk">
-
-  <PropertyGroup>
-    <TargetFramework>netstandard2.0</TargetFramework>
-    <Nullable>enable</Nullable>
-    <LangVersion>latest</LangVersion>
-    <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
-    <GenerateDocumentationFile>true</GenerateDocumentationFile>
-    <NoWarn>$(NoWarn);CS1591</NoWarn>
-    <RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared</RootNamespace>
-  </PropertyGroup>
-
-  <ItemGroup>
-    <!-- Decision #32: MessagePack for IPC. Netstandard 2.0 consumable by both .NET 10 (Proxy) + .NET 4.8 (Host). -->
-    <PackageReference Include="MessagePack" Version="2.5.187"/>
-  </ItemGroup>
-
-  <ItemGroup>
-    <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
-    <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
-  </ItemGroup>
-
-</Project>
@@ -52,7 +52,7 @@ public sealed class GalaxyDiscoverer
                 if (string.IsNullOrEmpty(attr.AttributeName)) continue;

                 var fullReference = !string.IsNullOrEmpty(attr.FullTagReference)
-                    ? attr.FullTagReference
+                    ? StripArraySuffix(attr.FullTagReference)
                     : obj.TagName + "." + attr.AttributeName;

                 var info = new DriverAttributeInfo(
@@ -77,4 +77,15 @@ public sealed class GalaxyDiscoverer
             }
         }
     }
+
+    // PR 5.W workaround for mxaccessgw GalaxyRepository.cs:173-175 — the gateway's
+    // SQL appends `[]` to array-typed `full_tag_reference` values, but MxAccess COM
+    // `IInstance.AddItem` doesn't accept `[]`-suffixed addresses (so any downstream
+    // Subscribe/Read/Write through the worker would fail with the suffixed form).
+    // Strip defensively here so the parity matrix can run today; remove once the
+    // gw fix (mxaccessgw/requirements-array-suffix-fix.md) lands.
+    private static string StripArraySuffix(string fullReference) =>
+        fullReference.EndsWith("[]", StringComparison.Ordinal)
+            ? fullReference[..^2]
+            : fullReference;
 }
||||
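To make the effect of the workaround above concrete, here is a small standalone sketch of the same stripping rule with a few sample inputs. `ArraySuffix` is a hypothetical wrapper and the tag names are invented for illustration; only a single trailing `[]` is removed, so indexed references such as `Levels[3]` pass through unchanged.

```csharp
using System;

public static class ArraySuffix
{
    // Same rule as the workaround above: drop one trailing "[]" and leave
    // every other reference untouched.
    public static string Strip(string fullReference) =>
        fullReference.EndsWith("[]", StringComparison.Ordinal)
            ? fullReference[..^2]
            : fullReference;
}

// Sample inputs (hypothetical tag names):
//   "Tank_001.Levels[]"  -> "Tank_001.Levels"
//   "Tank_001.Levels[3]" -> "Tank_001.Levels[3]"
//   "Tank_001.Level"     -> "Tank_001.Level"
```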
@@ -0,0 +1,30 @@
+using MxGateway.Contracts.Proto.Galaxy;
+using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
+
+namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Browse;
+
+/// <summary>
+/// PR 6.1 — Decorator that emits one <see cref="System.Diagnostics.Activity"/> span
+/// per <c>GetHierarchy</c> RPC. <c>galaxy.object_count</c> on the span lets ops
+/// correlate slow Discover passes with Galaxy size without instrumenting the
+/// discoverer's translation step.
+/// </summary>
+internal sealed class TracedGalaxyHierarchySource(IGalaxyHierarchySource inner, string clientName) : IGalaxyHierarchySource
+{
+    public async Task<IReadOnlyList<GalaxyObject>> GetHierarchyAsync(CancellationToken cancellationToken)
+    {
+        using var activity = GalaxyTelemetry.ActivitySource.StartActivity("galaxy.get_hierarchy");
+        activity?.SetTag("galaxy.client", clientName);
+        try
+        {
+            var hierarchy = await inner.GetHierarchyAsync(cancellationToken).ConfigureAwait(false);
+            activity?.SetTag("galaxy.object_count", hierarchy.Count);
+            return hierarchy;
+        }
+        catch (Exception ex)
+        {
+            activity.RecordError(ex);
+            throw;
+        }
+    }
+}
||||
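The decorator above only produces spans when something is listening; `StartActivity` returns `null` otherwise, which is why the tags are set through `activity?.`. The following sketch shows one way a host or test might capture those spans with the in-box `ActivityListener`. `GalaxySpanCapture` is a hypothetical helper, and the source name is assumed to match the `"ZB.MOM.WW.OtOpcUa.Driver.Galaxy"` name quoted elsewhere in this diff.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class GalaxySpanCapture
{
    // Assumed to match the name GalaxyTelemetry.ActivitySource is created with.
    public const string SourceName = "ZB.MOM.WW.OtOpcUa.Driver.Galaxy";

    // Registers a listener that records every stopped span from the Galaxy
    // source into the sink. Dispose the returned listener to stop capturing.
    public static ActivityListener Start(List<Activity> sink)
    {
        var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == SourceName,
            Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
                ActivitySamplingResult.AllData,
            ActivityStopped = activity => { lock (sink) sink.Add(activity); },
        };
        ActivitySource.AddActivityListener(listener);
        return listener;
    }
}
```

A production host would more likely attach an exporter than a list, but the listener mechanics are the same either way.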
@@ -22,13 +22,22 @@ public sealed record GalaxyDriverOptions(
 /// through the server-side secret store (DPAPI for production, environment override for
 /// dev) — the API key never appears in cleartext config.
 /// </summary>
+// PR 6.5 tuning notes:
+// ConnectTimeoutSeconds = 10 — cold-start network path comfort margin; soak runs
+// never saw a successful connect take >2s, so 10s is generous without being lax.
+// DefaultCallTimeoutSeconds = 30 — bumped from 5s because a 50k-tag SubscribeBulk
+// can exceed 5s under MxAccess COM contention (the worker walks the gw item list
+// serially under the apartment lock). 30s leaves comfortable headroom for the
+// legitimate worst case while still failing fast on a wedged worker.
+// StreamTimeoutSeconds = 0 — unlimited; the StreamEvents RPC must run for the
+// lifetime of the driver. Set a finite value only for diagnostic runs.
 public sealed record GalaxyGatewayOptions(
     string Endpoint,
     string ApiKeySecretRef,
     bool UseTls = true,
     string? CaCertificatePath = null,
     int ConnectTimeoutSeconds = 10,
-    int DefaultCallTimeoutSeconds = 5,
+    int DefaultCallTimeoutSeconds = 30,
     int StreamTimeoutSeconds = 0);

 /// <summary>
@@ -47,10 +56,17 @@ public sealed record GalaxyGatewayOptions(
 /// Reserved for ArchestrA secured-write user mapping; PR 4.3 wires <c>WriteSecured</c>
 /// routing against this id. 0 = anonymous.
 /// </param>
+/// <param name="EventPumpChannelCapacity">
+/// Bounded-channel size between the EventPump's network-read loop and its listener
+/// fan-out loop (PR 6.2). Default 50_000 = one second of headroom at 50k tags / 1Hz;
+/// raise it when <c>galaxy.events.dropped</c> shows up under transient consumer
+/// slowness, lower it on a memory-tight host where the headroom isn't needed.
+/// </param>
 public sealed record GalaxyMxAccessOptions(
     string ClientName,
     int PublishingIntervalMs = 1000,
-    int WriteUserId = 0);
+    int WriteUserId = 0,
+    int EventPumpChannelCapacity = 50_000);

 /// <summary>
 /// Galaxy Repository browse-side knobs consumed by PR 4.1's <c>GalaxyDiscoverer</c>.
||||
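The default of 50_000 above follows from simple arithmetic: 50k tags updating at 1 Hz produce about 50k events per second, so 50_000 slots hold roughly one second of backlog. The sketch below generalises that calculation; `EventPumpSizing` and its parameters are hypothetical and not part of the driver, and the result is only a starting point for tuning.

```csharp
using System;

public static class EventPumpSizing
{
    // capacity = tags x updates per second per tag x seconds of headroom,
    // rounded up and clamped to at least 1 (the pump rejects smaller values).
    public static int ChannelCapacity(int tagCount, double updatesPerSecondPerTag, double headroomSeconds)
    {
        if (tagCount < 0) throw new ArgumentOutOfRangeException(nameof(tagCount));
        if (updatesPerSecondPerTag <= 0) throw new ArgumentOutOfRangeException(nameof(updatesPerSecondPerTag));
        if (headroomSeconds <= 0) throw new ArgumentOutOfRangeException(nameof(headroomSeconds));

        var events = Math.Ceiling(tagCount * updatesPerSecondPerTag * headroomSeconds);
        return (int)Math.Clamp(events, 1, int.MaxValue);
    }
}
```

For example, `ChannelCapacity(50_000, 1.0, 1.0)` reproduces the default, while doubling the headroom to two seconds doubles the capacity and the memory the channel can pin.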
@@ -1,6 +1,7 @@
 using Microsoft.Extensions.Logging;
 using Microsoft.Extensions.Logging.Abstractions;
 using MxGateway.Client;
+using MxGateway.Contracts.Proto;
 using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
 using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Browse;
 using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Config;
@@ -185,8 +186,16 @@ public sealed class GalaxyDriver
         _ownedMxSession = new GalaxyMxSession(_options.MxAccess, _logger);
         await _ownedMxSession.ConnectAsync(clientOptions, cancellationToken).ConfigureAwait(false);

-        _subscriber = new GatewayGalaxySubscriber(_ownedMxSession);
-        _dataWriter = new GatewayGalaxyDataWriter(_ownedMxSession, _options.MxAccess.WriteUserId, _logger);
+        // PR 6.1 — wrap the gw-facing seams in tracing decorators so every Subscribe /
+        // Unsubscribe / Write / StreamEvents call emits a span on the
+        // "ZB.MOM.WW.OtOpcUa.Driver.Galaxy" ActivitySource. The host process's tracing
+        // listener (OTLP exporter, dotnet-trace, etc.) consumes these without the driver
+        // taking a dependency on the OpenTelemetry packages.
+        _subscriber = new TracedGalaxySubscriber(
+            new GatewayGalaxySubscriber(_ownedMxSession), _options.MxAccess.ClientName);
+        _dataWriter = new TracedGalaxyDataWriter(
+            new GatewayGalaxyDataWriter(_ownedMxSession, _options.MxAccess.WriteUserId, _logger),
+            _options.MxAccess.ClientName);

         _supervisor = new ReconnectSupervisor(
             reopen: ReopenAsync,
@@ -201,7 +210,9 @@ public sealed class GalaxyDriver

         _supervisor.StateChanged += OnSupervisorStateChanged;

-        _probeWatcher = new PerPlatformProbeWatcher(_subscriber, _hostStatuses, _logger);
+        _probeWatcher = new PerPlatformProbeWatcher(
+            _subscriber, _hostStatuses, _logger,
+            bufferedUpdateIntervalMs: _options.MxAccess.PublishingIntervalMs);
     }

     /// <summary>
@@ -252,10 +263,58 @@ public sealed class GalaxyDriver
         }
     }

+    /// <summary>
+    /// Resolves <c>Gateway.ApiKeySecretRef</c> to the actual API-key bytes. Three
+    /// forms supported, evaluated in order:
+    /// <list type="number">
+    /// <item><c>env:NAME</c> — reads <c>Environment.GetEnvironmentVariable(NAME)</c>.
+    /// Throws when the variable is unset, so a misconfigured deployment fails
+    /// fast at InitializeAsync rather than silently sending an empty key.</item>
+    /// <item><c>file:PATH</c> — reads UTF-8 text from <c>PATH</c>, trimming
+    /// whitespace. Lets operators stash the key in an ACL'd file outside the
+    /// repo (the same pattern as the legacy <c>.local/galaxy-host-secret.txt</c>).</item>
+    /// <item>Anything else — used as the literal API key. Convenient for dev,
+    /// and avoids breaking existing configs that pre-date this resolver.</item>
+    /// </list>
+    /// A future PR can swap any of these arms for a DPAPI-backed lookup without
+    /// changing the call site.
+    /// </summary>
+    internal static string ResolveApiKey(string secretRef)
+    {
+        ArgumentException.ThrowIfNullOrEmpty(secretRef);
+
+        if (secretRef.StartsWith("env:", StringComparison.OrdinalIgnoreCase))
+        {
+            var name = secretRef[4..];
+            var value = Environment.GetEnvironmentVariable(name);
+            return !string.IsNullOrEmpty(value)
+                ? value
+                : throw new InvalidOperationException(
+                    $"Galaxy.Gateway.ApiKeySecretRef='{secretRef}' resolves to env var '{name}', but it is unset.");
+        }
+
+        if (secretRef.StartsWith("file:", StringComparison.OrdinalIgnoreCase))
+        {
+            var path = secretRef[5..];
+            if (!File.Exists(path))
+            {
+                throw new InvalidOperationException(
+                    $"Galaxy.Gateway.ApiKeySecretRef='{secretRef}' points at '{path}', which doesn't exist.");
+            }
+            var contents = File.ReadAllText(path).Trim();
+            return !string.IsNullOrEmpty(contents)
+                ? contents
+                : throw new InvalidOperationException(
+                    $"Galaxy.Gateway.ApiKeySecretRef='{secretRef}' file '{path}' is empty.");
+        }
+
+        return secretRef;
+    }
+
     private static MxGatewayClientOptions BuildClientOptions(GalaxyGatewayOptions gw) => new()
     {
         Endpoint = new Uri(gw.Endpoint, UriKind.Absolute),
-        ApiKey = gw.ApiKeySecretRef,
+        ApiKey = ResolveApiKey(gw.ApiKeySecretRef),
         UseTls = gw.UseTls,
         CaCertificatePath = gw.CaCertificatePath,
         ConnectTimeout = TimeSpan.FromSeconds(gw.ConnectTimeoutSeconds),
||||
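The three `ApiKeySecretRef` forms handled by `ResolveApiKey` above are selected purely by prefix. The sketch below isolates that dispatch step without touching the environment or the file system; `SecretRefKind`, `SecretRef`, and `Classify` are hypothetical names used only to illustrate the rules, and the sample values are invented.

```csharp
using System;

public enum SecretRefKind { EnvironmentVariable, File, Literal }

public static class SecretRef
{
    // Splits a reference into its kind and payload using the same prefix rules
    // as the resolver above: "env:" and "file:" match case-insensitively, and
    // anything else is treated as the literal key.
    public static (SecretRefKind Kind, string Payload) Classify(string secretRef)
    {
        if (string.IsNullOrEmpty(secretRef))
            throw new ArgumentException("Secret reference must be non-empty.", nameof(secretRef));

        if (secretRef.StartsWith("env:", StringComparison.OrdinalIgnoreCase))
            return (SecretRefKind.EnvironmentVariable, secretRef.Substring(4));

        if (secretRef.StartsWith("file:", StringComparison.OrdinalIgnoreCase))
            return (SecretRefKind.File, secretRef.Substring(5));

        return (SecretRefKind.Literal, secretRef);
    }
}
```

One consequence of prefix dispatch is that a literal key which itself begins with `env:` or `file:` cannot be expressed directly; it would have to be supplied through one of the other two forms.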
@@ -357,7 +416,7 @@ public sealed class GalaxyDriver
     private SecurityClassification ResolveSecurity(string fullReference) =>
         _securityByFullRef.TryGetValue(fullReference, out var sec) ? sec : SecurityClassification.FreeAccess;

-    // ===== IReadable (PR 4.2 — abstraction; PR 4.4 supplies production reader) =====
+    // ===== IReadable =====

     /// <inheritdoc />
     public Task<IReadOnlyList<DataValueSnapshot>> ReadAsync(
@@ -367,19 +426,152 @@ public sealed class GalaxyDriver
         ArgumentNullException.ThrowIfNull(fullReferences);
         if (fullReferences.Count == 0) return Task.FromResult<IReadOnlyList<DataValueSnapshot>>([]);

-        if (_dataReader is null)
+        if (_dataReader is not null)
         {
-            // The production GW-backed reader builds on the StreamEvents pump that PR 4.4
-            // ships; until then a real gateway-driver instance can't fulfill reads.
-            // Tests that need to exercise IReadable inject a fake reader via the internal
-            // ctor; production deployments running on this PR should keep the
-            // legacy-host backend selected via the Galaxy:Backend flag (PR 4.W).
-            throw new NotSupportedException(
-                "GalaxyDriver.ReadAsync requires the StreamEvents-backed reader from PR 4.4. " +
-                "Until that lands, route reads through the legacy-host backend (Galaxy:Backend=legacy-host).");
+            // Test-only path — tests inject a canned reader via the internal ctor.
+            return _dataReader.ReadAsync(fullReferences, cancellationToken);
         }

-        return _dataReader.ReadAsync(fullReferences, cancellationToken);
+        if (_subscriber is null)
+        {
+            throw new NotSupportedException(
+                "GalaxyDriver.ReadAsync requires a connected GalaxyMxSession (production runtime not built). " +
+                "Either inject a test seam via the internal ctor or call InitializeAsync against a real gateway.");
+        }
+
+        return ReadViaSubscribeOnceAsync(fullReferences, cancellationToken);
+    }
+
+    /// <summary>
+    /// Production read path. MxAccess has no one-shot Read RPC — every value comes
+    /// through the event stream. We synthesise a Read by:
+    /// <list type="number">
+    /// <item>Subscribing the requested tags through the existing
+    /// <see cref="SubscriptionRegistry"/> + <see cref="EventPump"/>.</item>
+    /// <item>Waiting for the first <c>OnDataChange</c> per item handle (the gateway
+    /// pushes the current value as the initial event after a SubscribeBulk).</item>
+    /// <item>Unsubscribing.</item>
+    /// </list>
+    /// Tags the gw rejects at SubscribeBulk time, or that never publish before the
+    /// caller's cancellation token fires, return a Bad-status snapshot in input order
+    /// so the caller still sees one snapshot per requested reference.
+    /// </summary>
+    private async Task<IReadOnlyList<DataValueSnapshot>> ReadViaSubscribeOnceAsync(
+        IReadOnlyList<string> fullReferences, CancellationToken cancellationToken)
+    {
+        var pump = EnsureEventPumpStarted();
+        var subscriptionId = _subscriptions.NextSubscriptionId();
+
+        // Pre-allocate one TaskCompletionSource per full-reference so the OnDataChange
+        // handler can complete them out-of-order as events arrive. Wired BEFORE the
+        // SubscribeBulk call so we don't race with the first event the gw pushes.
+        var pendingByRef = new Dictionary<string, TaskCompletionSource<DataValueSnapshot>>(
+            StringComparer.OrdinalIgnoreCase);
+        foreach (var fullRef in fullReferences.Distinct(StringComparer.OrdinalIgnoreCase))
+        {
+            pendingByRef[fullRef] = new TaskCompletionSource<DataValueSnapshot>(
+                TaskCreationOptions.RunContinuationsAsynchronously);
+        }
+
+        EventHandler<DataChangeEventArgs> handler = (_, args) =>
+        {
+            // Filter to OUR subscription — the pump's OnDataChange fans out across all
+            // subscriptions on the driver, and we don't want a parallel ISubscribable
+            // caller's events to leak into our read.
+            if (args.SubscriptionHandle is GalaxySubscriptionHandle gsh
+                && gsh.SubscriptionId == subscriptionId
+                && pendingByRef.TryGetValue(args.FullReference, out var tcs))
+            {
+                tcs.TrySetResult(args.Snapshot);
+            }
+        };
+        pump.OnDataChange += handler;
+
+        var bufferedIntervalMs = _options.MxAccess.PublishingIntervalMs;
+        IReadOnlyList<SubscribeResult> results;
+        try
+        {
+            results = await _subscriber!
+                .SubscribeBulkAsync(fullReferences, bufferedIntervalMs, cancellationToken)
+                .ConfigureAwait(false);
+        }
+        catch
+        {
+            pump.OnDataChange -= handler;
+            throw;
+        }
+
+        // Register bindings so the pump knows to dispatch events for these handles.
+        var bindings = new List<TagBinding>(fullReferences.Count);
+        for (var i = 0; i < fullReferences.Count; i++)
+        {
+            var fullRef = fullReferences[i];
+            var match = results.FirstOrDefault(r => string.Equals(r.TagAddress, fullRef, StringComparison.OrdinalIgnoreCase));
+            var itemHandle = match is { WasSuccessful: true } ? match.ItemHandle : 0;
+            bindings.Add(new TagBinding(fullRef, itemHandle));
+
+            // Tags the gw rejected up front — complete with Bad status now so the
+            // wait below doesn't time out on them.
+            if (itemHandle <= 0
+                && pendingByRef.TryGetValue(fullRef, out var rejectedTcs))
+            {
+                rejectedTcs.TrySetResult(new DataValueSnapshot(
+                    Value: null,
+                    StatusCode: 0x80000000u, // Bad
+                    SourceTimestampUtc: null,
+                    ServerTimestampUtc: DateTime.UtcNow));
+            }
+        }
+        _subscriptions.Register(subscriptionId, bindings);
+
+        try
+        {
+            // Wait for every pending TCS to complete or the caller's CT to fire. When the
+            // CT fires before all values arrive, fill the still-pending entries with a
+            // Bad-status snapshot rather than throwing — Read semantics let callers see
+            // partial results.
+            using var registration = cancellationToken.Register(() =>
+            {
+                foreach (var tcs in pendingByRef.Values)
+                {
+                    tcs.TrySetResult(new DataValueSnapshot(
+                        Value: null,
+                        StatusCode: 0x800B0000u, // BadTimeout
+                        SourceTimestampUtc: null,
+                        ServerTimestampUtc: DateTime.UtcNow));
+                }
+            });
+
+            var snapshots = new DataValueSnapshot[fullReferences.Count];
+            for (var i = 0; i < fullReferences.Count; i++)
+            {
+                snapshots[i] = await pendingByRef[fullReferences[i]].Task.ConfigureAwait(false);
+            }
+            return snapshots;
+        }
+        finally
+        {
+            pump.OnDataChange -= handler;
+            // Drop the bindings + unsubscribe the live handles. UnsubscribeBulkAsync's
+            // failure isn't fatal — the registry is already cleared, so any straggling
+            // event from the gw would be a no-op fan-out.
+            _subscriptions.Remove(subscriptionId);
+            var liveHandles = bindings.Where(b => b.ItemHandle > 0).Select(b => b.ItemHandle).ToArray();
+            if (liveHandles.Length > 0)
+            {
+                try
+                {
+                    await _subscriber!.UnsubscribeBulkAsync(liveHandles, CancellationToken.None)
+                        .ConfigureAwait(false);
+                }
+                catch (Exception ex)
+                {
+                    _logger.LogWarning(ex,
+                        "GalaxyDriver.ReadViaSubscribeOnceAsync UnsubscribeBulk failed for {Count} handle(s) — registry already cleared.",
+                        liveHandles.Length);
+                }
+            }
+        }
     }

     // ===== IWritable (PR 4.3) =====
||||
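The core of `ReadViaSubscribeOnceAsync` above is a small concurrency pattern: one `TaskCompletionSource` per requested key, completed by whichever event arrives first, with cancellation filling the remaining slots instead of throwing. The sketch below isolates that pattern from the gateway plumbing. `FirstValueCollector<T>` is a hypothetical type written for illustration; it is not how the driver is factored.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed class FirstValueCollector<T>
{
    private readonly Dictionary<string, TaskCompletionSource<T>> _pending =
        new(StringComparer.OrdinalIgnoreCase);
    private readonly IReadOnlyList<string> _keys;

    public FirstValueCollector(IReadOnlyList<string> keys)
    {
        _keys = keys;
        foreach (var key in keys)
        {
            // Duplicate keys (ignoring case) share one completion source.
            if (!_pending.ContainsKey(key))
                _pending[key] = new TaskCompletionSource<T>(
                    TaskCreationOptions.RunContinuationsAsynchronously);
        }
    }

    // Called from the event callback; only the first value per key is kept.
    public bool Offer(string key, T value) =>
        _pending.TryGetValue(key, out var tcs) && tcs.TrySetResult(value);

    // Returns one value per requested key, in input order. Keys still pending
    // when the token fires are completed with the fallback rather than throwing.
    public async Task<IReadOnlyList<T>> WaitAllAsync(T fallback, CancellationToken ct)
    {
        using var registration = ct.Register(() =>
        {
            foreach (var tcs in _pending.Values) tcs.TrySetResult(fallback);
        });

        var results = new T[_keys.Count];
        for (var i = 0; i < _keys.Count; i++)
            results[i] = await _pending[_keys[i]].Task.ConfigureAwait(false);
        return results;
    }
}
```

Because `TrySetResult` is idempotent, the event path and the cancellation path can race safely: whichever completes a slot first wins, and the caller always receives exactly one result per requested key.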
@@ -433,7 +625,12 @@ public sealed class GalaxyDriver
             return new GalaxySubscriptionHandle(subscriptionId);
         }

-        var bufferedIntervalMs = (int)Math.Max(0, publishingInterval.TotalMilliseconds);
+        // PR 6.3 — when the caller doesn't set a publishing interval (TimeSpan.Zero or
+        // negative), fall back to the configured MxAccess.PublishingIntervalMs. The
+        // server's UA subscription publishingInterval drives this in production; tests
+        // and infrastructure callers (probe watcher, deploy watcher) hit the fallback.
+        var requested = (int)Math.Max(0, publishingInterval.TotalMilliseconds);
+        var bufferedIntervalMs = requested > 0 ? requested : _options.MxAccess.PublishingIntervalMs;
         var results = await _subscriber
             .SubscribeBulkAsync(fullReferences, bufferedIntervalMs, cancellationToken)
             .ConfigureAwait(false);
||||
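The PR 6.3 fallback above can be checked in isolation. The sketch below wraps the same two expressions in a hypothetical `BufferedInterval.ResolveMs` helper; one detail it makes visible is that a positive interval shorter than one millisecond truncates to zero and therefore also takes the configured default.

```csharp
using System;

public static class BufferedInterval
{
    // Uses the caller's interval when it truncates to at least 1 ms,
    // otherwise the configured default.
    public static int ResolveMs(TimeSpan requestedInterval, int configuredMs)
    {
        var requested = (int)Math.Max(0, requestedInterval.TotalMilliseconds);
        return requested > 0 ? requested : configuredMs;
    }
}
```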
@@ -503,7 +700,10 @@ public sealed class GalaxyDriver
         lock (_pumpLock)
         {
             if (_eventPump is not null) return _eventPump;
-            _eventPump = new EventPump(_subscriber!, _subscriptions, _logger);
+            _eventPump = new EventPump(
+                _subscriber!, _subscriptions, _logger,
+                channelCapacity: _options.MxAccess.EventPumpChannelCapacity,
+                clientName: _options.MxAccess.ClientName);
             _eventPump.OnDataChange += OnPumpDataChange;
             _eventPump.Start();
             return _eventPump;
@@ -547,9 +747,7 @@ public sealed class GalaxyDriver
         var clientOptions = new MxGatewayClientOptions
         {
             Endpoint = new Uri(gw.Endpoint, UriKind.Absolute),
-            // PR 4.1 stub: ApiKeySecretRef is currently treated as the literal API key.
-            // PR 4.W (or a follow-up) wires up DPAPI-backed secret resolution.
-            ApiKey = gw.ApiKeySecretRef,
+            ApiKey = ResolveApiKey(gw.ApiKeySecretRef),
             UseTls = gw.UseTls,
             CaCertificatePath = gw.CaCertificatePath,
             ConnectTimeout = TimeSpan.FromSeconds(gw.ConnectTimeoutSeconds),
@@ -559,7 +757,8 @@ public sealed class GalaxyDriver
                 : null,
         };
         _ownedRepositoryClient = GalaxyRepositoryClient.Create(clientOptions);
-        return new GatewayGalaxyHierarchySource(_ownedRepositoryClient);
+        return new TracedGalaxyHierarchySource(
+            new GatewayGalaxyHierarchySource(_ownedRepositoryClient), _options.MxAccess.ClientName);
     }

     public void Dispose()
@@ -54,14 +54,15 @@ public static class GalaxyDriverFactoryExtensions
                 UseTls: dto.Gateway.UseTls ?? true,
                 CaCertificatePath: dto.Gateway.CaCertificatePath,
                 ConnectTimeoutSeconds: dto.Gateway.ConnectTimeoutSeconds ?? 10,
-                DefaultCallTimeoutSeconds: dto.Gateway.DefaultCallTimeoutSeconds ?? 5,
+                DefaultCallTimeoutSeconds: dto.Gateway.DefaultCallTimeoutSeconds ?? 30,
                 StreamTimeoutSeconds: dto.Gateway.StreamTimeoutSeconds ?? 0),
             MxAccess: new GalaxyMxAccessOptions(
                 ClientName: dto.MxAccess?.ClientName
                     ?? throw new InvalidOperationException(
                         $"Galaxy driver '{driverInstanceId}' missing required MxAccess.ClientName"),
                 PublishingIntervalMs: dto.MxAccess.PublishingIntervalMs ?? 1000,
-                WriteUserId: dto.MxAccess.WriteUserId ?? 0),
+                WriteUserId: dto.MxAccess.WriteUserId ?? 0,
+                EventPumpChannelCapacity: dto.MxAccess.EventPumpChannelCapacity ?? 50_000),
             Repository: new GalaxyRepositoryOptions(
                 DiscoverPageSize: dto.Repository?.DiscoverPageSize ?? 5000,
                 WatchDeployEvents: dto.Repository?.WatchDeployEvents ?? true),
@@ -104,6 +105,7 @@ public static class GalaxyDriverFactoryExtensions
         public string? ClientName { get; init; }
         public int? PublishingIntervalMs { get; init; }
         public int? WriteUserId { get; init; }
+        public int? EventPumpChannelCapacity { get; init; }
     }

     internal sealed class RepositoryDto
@@ -18,7 +18,7 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Health;
 /// (<see cref="HostConnectivityForwarder"/>) both feed this aggregator; the
 /// <see cref="GalaxyDriver"/> consumes <see cref="Snapshot"/> from
 /// <c>IHostConnectivityProbe.GetHostStatuses()</c> and re-raises
-/// <see cref="OnHostStatusChanged"/> as the driver-level event in a follow-up PR.
+/// <see cref="OnHostStatusChanged"/> as the driver-level event (wired in PR 4.W).
 /// </remarks>
 public sealed class HostStatusAggregator
 {
@@ -36,6 +36,7 @@ public sealed class PerPlatformProbeWatcher : IDisposable
     private readonly IGalaxySubscriber _subscriber;
     private readonly HostStatusAggregator _aggregator;
     private readonly ILogger _logger;
+    private readonly int _bufferedUpdateIntervalMs;

     // Tracked platform → gw item handle. Item handle 0 means the gw rejected the subscribe;
     // we keep the entry so SyncPlatformsAsync doesn't try to subscribe it again on every call.
@@ -45,11 +46,20 @@ public sealed class PerPlatformProbeWatcher : IDisposable
     private bool _disposed;

     public PerPlatformProbeWatcher(
-        IGalaxySubscriber subscriber, HostStatusAggregator aggregator, ILogger? logger = null)
+        IGalaxySubscriber subscriber,
+        HostStatusAggregator aggregator,
+        ILogger? logger = null,
+        int bufferedUpdateIntervalMs = 0)
     {
         _subscriber = subscriber ?? throw new ArgumentNullException(nameof(subscriber));
         _aggregator = aggregator ?? throw new ArgumentNullException(nameof(aggregator));
         _logger = logger ?? NullLogger.Instance;
+        if (bufferedUpdateIntervalMs < 0)
+        {
+            throw new ArgumentOutOfRangeException(nameof(bufferedUpdateIntervalMs),
+                "bufferedUpdateIntervalMs must be >= 0; 0 means use the gw's default cadence.");
+        }
+        _bufferedUpdateIntervalMs = bufferedUpdateIntervalMs;
     }

     /// <summary>Snapshot of platform tag names currently watched.</summary>
@@ -107,10 +117,12 @@ public sealed class PerPlatformProbeWatcher : IDisposable
         if (toAdd.Count == 0) return;

         var probeAddresses = toAdd.Select(p => p + ProbeSuffix).ToArray();
-        // bufferedUpdateInterval=0 — probe ScanState changes are rare enough that the gw's
-        // default cadence is fine; explicit polling rate goes through PR 6.3.
+        // PR 6.3 — use the configured bufferedUpdateIntervalMs (defaults to 0 = gw cadence
+        // when the driver hasn't overridden MxAccess.PublishingIntervalMs). Probe ScanState
+        // changes are rare so a coarser interval is usually fine; deployments that need
+        // tighter health visibility can dial it down through GalaxyDriverOptions.
         var results = await _subscriber.SubscribeBulkAsync(
-            probeAddresses, bufferedUpdateIntervalMs: 0, cancellationToken).ConfigureAwait(false);
+            probeAddresses, _bufferedUpdateIntervalMs, cancellationToken).ConfigureAwait(false);

         for (var i = 0; i < toAdd.Count; i++)
         {
@@ -1,3 +1,5 @@
+using System.Diagnostics.Metrics;
+using System.Threading.Channels;
 using Microsoft.Extensions.Logging;
 using Microsoft.Extensions.Logging.Abstractions;
 using MxGateway.Contracts.Proto;
@@ -13,19 +15,47 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
 /// <see cref="SubscriptionRegistry.ResolveSubscribers"/>).
 /// </summary>
 /// <remarks>
+/// <para>
 /// One pump per connected <see cref="GalaxyMxSession"/>. Reconnect lives in PR 4.5's
 /// supervisor; on transport failure here we log + propagate so the supervisor can
 /// decide whether to restart.
+/// </para>
+/// <para>
+/// PR 6.2 — the network-read loop and the listener-fanout loop are decoupled by a
+/// bounded <see cref="Channel{T}"/>. When a listener is slow enough to fill the
+/// channel, new events are dropped (newest-dropped semantics: producer's
+/// <c>TryWrite</c> fails) rather than back-pressuring the gw stream. Three counters
+/// on the <c>ZB.MOM.WW.OtOpcUa.Driver.Galaxy</c> meter expose received / dispatched
+/// / dropped totals so ops sees pressure before it manifests as user-visible loss.
+/// </para>
 /// </remarks>
 internal sealed class EventPump : IAsyncDisposable
 {
+    public const string MeterName = "ZB.MOM.WW.OtOpcUa.Driver.Galaxy";
+    private const int DefaultChannelCapacity = 50_000;
+
+    // Single static meter so a host-level MeterListener catches all pump instances.
+    private static readonly Meter Meter = new(MeterName);
+    private static readonly Counter<long> EventsReceived =
+        Meter.CreateCounter<long>("galaxy.events.received", unit: "{event}",
+            description: "MxEvents read from the gateway StreamEvents stream.");
+    private static readonly Counter<long> EventsDispatched =
+        Meter.CreateCounter<long>("galaxy.events.dispatched", unit: "{event}",
+            description: "MxEvents passed through the bounded channel and into OnDataChange.");
+    private static readonly Counter<long> EventsDropped =
+        Meter.CreateCounter<long>("galaxy.events.dropped", unit: "{event}",
+            description: "MxEvents dropped because the bounded channel was full (newest-dropped).");
+
     private readonly IGalaxySubscriber _subscriber;
     private readonly SubscriptionRegistry _registry;
     private readonly ILogger _logger;
     private readonly Func<long, ISubscriptionHandle> _handleFactory;
+    private readonly Channel<MxEvent> _channel;
+    private readonly KeyValuePair<string, object?> _clientTag;
     private readonly CancellationTokenSource _cts = new();

     private Task? _loop;
+    private Task? _dispatchLoop;
     private bool _disposed;

     public event EventHandler<DataChangeEventArgs>? OnDataChange;
||||
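The three counters declared above are only observable once something subscribes to the meter. As a rough sketch of what a host-level consumer could look like, the class below uses the in-box `MeterListener` to keep running totals. `GalaxyCounterTotals` is a hypothetical name; the meter and instrument names are copied from the declarations above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Threading;

public sealed class GalaxyCounterTotals : IDisposable
{
    public const string MeterName = "ZB.MOM.WW.OtOpcUa.Driver.Galaxy";

    private readonly MeterListener _listener = new();
    private long _received, _dispatched, _dropped;

    public long Received => Interlocked.Read(ref _received);
    public long Dispatched => Interlocked.Read(ref _dispatched);
    public long Dropped => Interlocked.Read(ref _dropped);

    public GalaxyCounterTotals()
    {
        // Opt in to every instrument published on the Galaxy meter.
        _listener.InstrumentPublished = (instrument, listener) =>
        {
            if (instrument.Meter.Name == MeterName)
                listener.EnableMeasurementEvents(instrument);
        };
        _listener.SetMeasurementEventCallback<long>(OnMeasurement);
        _listener.Start();
    }

    private void OnMeasurement(Instrument instrument, long value,
        ReadOnlySpan<KeyValuePair<string, object?>> tags, object? state)
    {
        switch (instrument.Name)
        {
            case "galaxy.events.received": Interlocked.Add(ref _received, value); break;
            case "galaxy.events.dispatched": Interlocked.Add(ref _dispatched, value); break;
            case "galaxy.events.dropped": Interlocked.Add(ref _dropped, value); break;
        }
    }

    public void Dispose() => _listener.Dispose();
}
```

A growing gap between `Received` and `Dispatched`, or any non-zero `Dropped`, is the signal the remarks above describe: the fan-out side is falling behind the gateway stream.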
@@ -34,12 +64,30 @@ internal sealed class EventPump : IAsyncDisposable
         IGalaxySubscriber subscriber,
         SubscriptionRegistry registry,
         ILogger? logger = null,
-        Func<long, ISubscriptionHandle>? handleFactory = null)
+        Func<long, ISubscriptionHandle>? handleFactory = null,
+        int channelCapacity = DefaultChannelCapacity,
+        string? clientName = null)
     {
         _subscriber = subscriber ?? throw new ArgumentNullException(nameof(subscriber));
         _registry = registry ?? throw new ArgumentNullException(nameof(registry));
         _logger = logger ?? NullLogger.Instance;
         _handleFactory = handleFactory ?? (id => new GalaxySubscriptionHandle(id));
+
+        if (channelCapacity < 1)
+        {
+            throw new ArgumentOutOfRangeException(nameof(channelCapacity),
+                "channelCapacity must be >= 1; recommended 50_000 for 50k-tag deployments.");
+        }
+        _channel = Channel.CreateBounded<MxEvent>(new BoundedChannelOptions(channelCapacity)
+        {
+            // Newest-dropped policy: when full, the producer's TryWrite returns false
+            // and we account for the drop. We do this manually rather than relying on
+            // BoundedChannelFullMode.DropWrite so we can count drops without polling.
+            FullMode = BoundedChannelFullMode.Wait,
+            SingleReader = true,
+            SingleWriter = true,
+        });
+        _clientTag = new KeyValuePair<string, object?>("galaxy.client", clientName ?? "<unknown>");
     }

     /// <summary>
@@ -51,6 +99,7 @@ internal sealed class EventPump : IAsyncDisposable
         ObjectDisposedException.ThrowIf(_disposed, this);
         if (_loop is not null) return;
         _loop = Task.Run(() => RunAsync(_cts.Token));
+        _dispatchLoop = Task.Run(() => DispatchLoopAsync(_cts.Token));
     }

     private async Task RunAsync(CancellationToken ct)
@@ -60,7 +109,15 @@ internal sealed class EventPump : IAsyncDisposable
             await foreach (var ev in _subscriber.StreamEventsAsync(ct).WithCancellation(ct).ConfigureAwait(false))
             {
                 if (ct.IsCancellationRequested) break;
-                Dispatch(ev);
+                EventsReceived.Add(1, _clientTag);
+
+                // Newest-dropped: TryWrite fast-paths the common case (channel has room).
+                // When full we count the drop and continue reading the gw stream so
+                // back-pressure doesn't propagate upstream.
+                if (!_channel.Writer.TryWrite(ev))
+                {
+                    EventsDropped.Add(1, _clientTag);
+                }
             }
         }
         catch (OperationCanceledException) when (ct.IsCancellationRequested)
@@ -72,6 +129,32 @@ internal sealed class EventPump : IAsyncDisposable
             _logger.LogWarning(ex,
                 "Galaxy EventPump loop ended with an exception — reconnect supervisor (PR 4.5) handles restart.");
         }
+        finally
+        {
+            // Tell the dispatch loop the producer is done so it drains and exits.
+            _channel.Writer.TryComplete();
+        }
+    }
+
+    private async Task DispatchLoopAsync(CancellationToken ct)
+    {
+        try
+        {
+            await foreach (var ev in _channel.Reader.ReadAllAsync(ct).ConfigureAwait(false))
+            {
+                Dispatch(ev);
+                EventsDispatched.Add(1, _clientTag);
+            }
+        }
+        catch (OperationCanceledException) when (ct.IsCancellationRequested)
+        {
+            // Clean shutdown.
+        }
+        catch (Exception ex)
+        {
+            _logger.LogWarning(ex,
+                "Galaxy EventPump dispatch loop ended with an exception — events past this point will be lost until restart.");
+        }
     }

     private void Dispatch(MxEvent ev)
@@ -121,10 +204,15 @@ internal sealed class EventPump : IAsyncDisposable
        if (_disposed) return;
        _disposed = true;
        _cts.Cancel();
        _channel.Writer.TryComplete();
        if (_loop is not null)
        {
            try { await _loop.ConfigureAwait(false); } catch { /* shutdown */ }
        }
        if (_dispatchLoop is not null)
        {
            try { await _dispatchLoop.ConfigureAwait(false); } catch { /* shutdown */ }
        }
        _cts.Dispose();
    }
}

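The newest-dropped accounting in `EventPump` depends on a detail of `System.Threading.Channels`: with `BoundedChannelFullMode.Wait`, `TryWrite` returns `false` on a full channel, whereas `DropWrite` reports `true` and discards the item silently. A minimal sketch of that behavior in isolation follows; `DropCountingDemo` and `Fill` are hypothetical names for illustration and are not part of the driver.

```csharp
using System;
using System.Threading.Channels;

public static class DropCountingDemo
{
    // Writes `attempts` items into a bounded channel that nobody reads,
    // counting how many were accepted and how many were rejected.
    public static (int Accepted, int Dropped) Fill(int capacity, int attempts)
    {
        var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(capacity)
        {
            // Wait mode: TryWrite returns false when the channel is full,
            // which lets the producer count the drop itself.
            FullMode = BoundedChannelFullMode.Wait,
            SingleReader = true,
            SingleWriter = true,
        });

        int accepted = 0, dropped = 0;
        for (var i = 0; i < attempts; i++)
        {
            if (channel.Writer.TryWrite(i)) accepted++;
            else dropped++; // the newest item is the one that is lost
        }
        return (accepted, dropped);
    }
}
```

With no reader attached, `DropCountingDemo.Fill(capacity: 2, attempts: 5)` accepts the first two items and rejects the remaining three, which mirrors how `EventsDropped` would grow while the dispatch loop is stalled.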
@@ -0,0 +1,35 @@
using System.Diagnostics;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;

/// <summary>
/// PR 6.1 — In-box <see cref="ActivitySource"/> wired around every gw call the
/// driver makes (Subscribe/Unsubscribe, Write/WriteSecured, GetHierarchy). The
/// decorators in this folder produce one span per call, tagged with the inputs
/// ops needs to triage a slow or failing operation:
/// <c>galaxy.tag_count</c>, <c>galaxy.success_count</c>, <c>galaxy.client</c>.
/// <para>
/// The driver itself doesn't take a dependency on the OpenTelemetry packages —
/// <c>System.Diagnostics.ActivitySource</c> is in the BCL. The host process
/// decides which listener (OTLP exporter, Application Insights, dotnet-trace)
/// subscribes to <see cref="ActivitySourceName"/>.
/// </para>
/// </summary>
internal static class GalaxyTelemetry
{
    public const string ActivitySourceName = "ZB.MOM.WW.OtOpcUa.Driver.Galaxy";

    public static readonly ActivitySource ActivitySource = new(ActivitySourceName);

    /// <summary>
    /// Tag a span with a failure reason and set its status to <c>Error</c>. Helper
    /// so the decorators don't repeat the four-line idiom on every catch block.
    /// </summary>
    public static void RecordError(this Activity? activity, Exception ex)
    {
        if (activity is null) return;
        activity.SetStatus(ActivityStatusCode.Error, ex.Message);
        activity.SetTag("exception.type", ex.GetType().FullName);
        activity.SetTag("exception.message", ex.Message);
    }
}
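Because `GalaxyTelemetry` only uses BCL types, a host can observe its spans with a plain `ActivityListener` and no OpenTelemetry packages. The sketch below is one way that could look, assuming the source name shown in the diff; `TelemetryListenerDemo` and its locally created `ActivitySource` are hypothetical stand-ins (a real host would listen to the driver's own static source rather than create one).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class TelemetryListenerDemo
{
    // Same literal as GalaxyTelemetry.ActivitySourceName in the diff.
    private const string SourceName = "ZB.MOM.WW.OtOpcUa.Driver.Galaxy";

    public static List<string> CaptureFailedSpan()
    {
        var stopped = new List<string>();
        using var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == SourceName,
            // Request full data so StartActivity returns a non-null Activity.
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                ActivitySamplingResult.AllData,
            ActivityStopped = a => stopped.Add($"{a.OperationName}:{a.Status}"),
        };
        ActivitySource.AddActivityListener(listener);

        // Hypothetical stand-in for the driver's static ActivitySource.
        using var source = new ActivitySource(SourceName);
        using (var activity = source.StartActivity("galaxy.write"))
        {
            // Same calls RecordError makes on a caught exception.
            activity?.SetStatus(ActivityStatusCode.Error, "simulated failure");
            activity?.SetTag("exception.type", typeof(InvalidOperationException).FullName);
        }
        return stopped;
    }
}
```

Without a registered listener, `StartActivity` returns `null`, which is why the decorators use `activity?.SetTag(...)` and why `RecordError` accepts a nullable `Activity`.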
@@ -1,5 +1,6 @@
using MxGateway.Client;
using MxGateway.Contracts.Proto;
// Use the generated nested status enum for the SetBufferedUpdateInterval reply check.

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;

@@ -9,14 +10,16 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;
/// gateway and streams MxEvents via the gw's bidirectional events RPC.
/// </summary>
/// <remarks>
/// The gw's <c>SubscribeBulkAsync</c> doesn't currently take a buffered-update-interval
/// hint as a typed parameter — gw issue #102 / lmx_mxgw_impl.md gw-9 tracks adding
/// <c>buffered_update_interval_ms</c>. Until that lands, the parameter is captured here
/// and forwarded to <c>SetBufferedUpdateInterval</c> in a follow-up. PR 6.3 picks it up.
/// PR 6.3 wired the per-call <c>buffered_update_interval_ms</c> through
/// <see cref="SubscribeBulkAsync"/>. The gw's contract is session-level
/// (<c>SetBufferedUpdateInterval</c> applies to all buffered subscriptions on the
/// server handle), so we cache the last-applied value and skip redundant calls.
/// </remarks>
public sealed class GatewayGalaxySubscriber : IGalaxySubscriber
{
    private readonly GalaxyMxSession _session;
    private readonly Lock _intervalLock = new();
    private int _lastAppliedIntervalMs = -1; // -1 = never applied; 0 = explicit "use gw default"

    public GatewayGalaxySubscriber(GalaxyMxSession session)
    {
@@ -31,14 +34,65 @@ public sealed class GatewayGalaxySubscriber : IGalaxySubscriber
            "GalaxyMxSession is not connected. Call ConnectAsync before subscribing.");
        var serverHandle = _session.ServerHandle;

        // PR 6.3 wires bufferedUpdateIntervalMs to SetBufferedUpdateInterval; until then
        // ignore it — values still arrive at the gw's default cadence.
        _ = bufferedUpdateIntervalMs;
        // The gw's SubscribeBulk RPC doesn't carry a per-call interval — buffered cadence
        // is session-level, set via SetBufferedUpdateInterval. Apply it before the
        // SubscribeBulk so the very first events on the new handles publish at the
        // requested cadence. Skip when the last-applied value already matches.
        if (bufferedUpdateIntervalMs > 0)
        {
            await EnsureSessionIntervalAsync(session, serverHandle, bufferedUpdateIntervalMs, cancellationToken)
                .ConfigureAwait(false);
        }

        return await session.SubscribeBulkAsync(serverHandle, fullReferences, cancellationToken)
            .ConfigureAwait(false);
    }

    /// <summary>
    /// Apply the gateway's session-level <c>SetBufferedUpdateInterval</c> command. The
    /// gw's contract is "for this server handle, every buffered subscription publishes
    /// at this cadence" — there's no per-handle granularity, so we cache the last
    /// applied value and skip redundant calls.
    /// </summary>
    private async Task EnsureSessionIntervalAsync(
        MxGateway.Client.MxGatewaySession session, int serverHandle, int intervalMs, CancellationToken cancellationToken)
    {
        lock (_intervalLock)
        {
            if (_lastAppliedIntervalMs == intervalMs) return;
        }

        var reply = await session.InvokeAsync(
            new MxCommandRequest
            {
                SessionId = session.SessionId,
                ClientCorrelationId = Guid.NewGuid().ToString("N"),
                Command = new MxCommand
                {
                    Kind = MxCommandKind.SetBufferedUpdateInterval,
                    SetBufferedUpdateInterval = new SetBufferedUpdateIntervalCommand
                    {
                        ServerHandle = serverHandle,
                        UpdateIntervalMilliseconds = intervalMs,
                    },
                },
            },
            cancellationToken).ConfigureAwait(false);

        if (reply.ProtocolStatus?.Code is not (ProtocolStatusCode.Ok or ProtocolStatusCode.MxaccessFailure))
        {
            // Don't throw on a soft failure — the SubscribeBulk will still succeed at the
            // gw's default cadence, which is functional just not the requested cadence.
            // The trace span (PR 6.1) plus the warning here gives ops the signal.
            return;
        }

        lock (_intervalLock)
        {
            _lastAppliedIntervalMs = intervalMs;
        }
    }

    public async Task UnsubscribeBulkAsync(IReadOnlyList<int> itemHandles, CancellationToken cancellationToken)
    {
        if (itemHandles.Count == 0) return;

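The last-applied cache in `EnsureSessionIntervalAsync` can be illustrated without the gateway types. The sketch below assumes a hypothetical `IntervalCache` class whose `apply` delegate stands in for the `SetBufferedUpdateInterval` round trip; none of these names come from the repository. The lock is taken twice, once to check and once to store, because C# does not allow `await` inside a `lock` body. That leaves a window in which two concurrent callers could both send the command, which is presumably acceptable when the command is idempotent.

```csharp
using System;
using System.Threading.Tasks;

public sealed class IntervalCache
{
    private readonly object _gate = new();
    private readonly Func<int, Task<bool>> _apply; // hypothetical stand-in for the gateway call
    private int _lastApplied = -1;                 // -1 = never applied

    public IntervalCache(Func<int, Task<bool>> apply) => _apply = apply;

    public async Task EnsureAsync(int intervalMs)
    {
        lock (_gate)
        {
            if (_lastApplied == intervalMs) return; // already in effect, skip the round trip
        }

        // The lock is released here: await is not permitted inside a lock body.
        var ok = await _apply(intervalMs).ConfigureAwait(false);
        if (!ok) return; // leave the cache untouched so the next call retries

        lock (_gate)
        {
            _lastApplied = intervalMs;
        }
    }
}
```

Calling `EnsureAsync(500)` twice and then `EnsureAsync(1000)` invokes the delegate twice in total: the repeated 500 ms request is served from the cache.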
@@ -0,0 +1,54 @@
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;

/// <summary>
/// PR 6.1 — Decorator that emits one <see cref="System.Diagnostics.Activity"/> span
/// per gw write batch. Tags secured-write counts so ops can see the routing-by-
/// classification split (FreeAccess/Operate vs Tune/Configure) without re-reading
/// the discovery dictionary.
/// </summary>
internal sealed class TracedGalaxyDataWriter(IGalaxyDataWriter inner, string clientName) : IGalaxyDataWriter
{
    public async Task<IReadOnlyList<WriteResult>> WriteAsync(
        IReadOnlyList<WriteRequest> writes,
        Func<string, SecurityClassification> securityResolver,
        CancellationToken cancellationToken)
    {
        using var activity = GalaxyTelemetry.ActivitySource.StartActivity("galaxy.write");
        activity?.SetTag("galaxy.client", clientName);
        activity?.SetTag("galaxy.tag_count", writes.Count);

        if (activity is { IsAllDataRequested: true })
        {
            // Counting the secured-write split is cheap (one resolver call per request)
            // and only happens when a tracing listener is actively recording — keeps the
            // hot path free when no one's listening.
            var securedCount = 0;
            foreach (var w in writes)
            {
                var sc = securityResolver(w.FullReference);
                if (sc is SecurityClassification.Tune
                    or SecurityClassification.Configure
                    or SecurityClassification.VerifiedWrite)
                {
                    securedCount++;
                }
            }
            activity.SetTag("galaxy.secured_write_count", securedCount);
        }

        try
        {
            var results = await inner.WriteAsync(writes, securityResolver, cancellationToken)
                .ConfigureAwait(false);
            activity?.SetTag("galaxy.success_count", results.Count(r => r.StatusCode < 0x80000000u));
            return results;
        }
        catch (Exception ex)
        {
            activity.RecordError(ex);
            throw;
        }
    }
}
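The `galaxy.success_count` tag counts results whose `StatusCode` is below `0x80000000u`. In OPC UA the top two bits of a status code carry the severity (`00` Good, `01` Uncertain, `10` Bad), so that comparison treats both Good and Uncertain results as successes. A small sketch of the check, assuming `StatusCode` is an unsigned 32-bit value as the literal in the diff suggests; `StatusSeverity` and its members are hypothetical names:

```csharp
using System;
using System.Linq;

public static class StatusSeverity
{
    public const uint Good = 0x00000000u;
    public const uint Uncertain = 0x40000000u;
    public const uint Bad = 0x80000000u;

    // True for Good and Uncertain severities; false once the Bad bit is set.
    public static bool IsNotBad(uint statusCode) => statusCode < Bad;

    // Mirrors the results.Count(r => r.StatusCode < 0x80000000u) expression.
    public static int CountNotBad(params uint[] codes) => codes.Count(IsNotBad);
}
```

For example, `StatusSeverity.CountNotBad(0x00000000u, 0x40000000u, 0x80000000u, 0x80340000u)` counts the first two codes and skips the two Bad-severity ones. If Uncertain writes should not count as successes, the predicate would need a stricter mask.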
@@ -0,0 +1,91 @@
using System.Runtime.CompilerServices;
using MxGateway.Contracts.Proto;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Runtime;

/// <summary>
/// PR 6.1 — Decorator that emits one <see cref="System.Diagnostics.Activity"/> span
/// per gw subscription RPC. Wraps the production <see cref="GatewayGalaxySubscriber"/>;
/// tests substitute a fake at the same seam without taking the tracing overhead.
/// </summary>
internal sealed class TracedGalaxySubscriber(IGalaxySubscriber inner, string clientName) : IGalaxySubscriber
{
    public async Task<IReadOnlyList<SubscribeResult>> SubscribeBulkAsync(
        IReadOnlyList<string> fullReferences, int bufferedUpdateIntervalMs, CancellationToken cancellationToken)
    {
        using var activity = GalaxyTelemetry.ActivitySource.StartActivity("galaxy.subscribe_bulk");
        activity?.SetTag("galaxy.client", clientName);
        activity?.SetTag("galaxy.tag_count", fullReferences.Count);
        activity?.SetTag("galaxy.buffered_interval_ms", bufferedUpdateIntervalMs);
        try
        {
            var results = await inner.SubscribeBulkAsync(fullReferences, bufferedUpdateIntervalMs, cancellationToken)
                .ConfigureAwait(false);
            activity?.SetTag("galaxy.success_count", results.Count(r => r.WasSuccessful));
            return results;
        }
        catch (Exception ex)
        {
            activity.RecordError(ex);
            throw;
        }
    }

    public async Task UnsubscribeBulkAsync(IReadOnlyList<int> itemHandles, CancellationToken cancellationToken)
    {
        using var activity = GalaxyTelemetry.ActivitySource.StartActivity("galaxy.unsubscribe_bulk");
        activity?.SetTag("galaxy.client", clientName);
        activity?.SetTag("galaxy.tag_count", itemHandles.Count);
        try
        {
            await inner.UnsubscribeBulkAsync(itemHandles, cancellationToken).ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            activity.RecordError(ex);
            throw;
        }
    }

    /// <summary>
    /// Streaming RPC — one parent span covers the entire stream lifetime. Per-event
    /// spans would dominate the trace volume at 50k tags / 1Hz; ops gets per-event
    /// visibility through <see cref="EventPump"/>'s metrics in PR 6.2 instead.
    /// </summary>
    public async IAsyncEnumerable<MxEvent> StreamEventsAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        using var activity = GalaxyTelemetry.ActivitySource.StartActivity("galaxy.stream_events");
        activity?.SetTag("galaxy.client", clientName);

        IAsyncEnumerator<MxEvent>? enumerator = null;
        try
        {
            enumerator = inner.StreamEventsAsync(cancellationToken).GetAsyncEnumerator(cancellationToken);
            var eventCount = 0L;
            while (true)
            {
                bool moveNext;
                try
                {
                    moveNext = await enumerator.MoveNextAsync().ConfigureAwait(false);
                }
                catch (Exception ex)
                {
                    activity.RecordError(ex);
                    activity?.SetTag("galaxy.event_count", eventCount);
                    throw;
                }

                if (!moveNext) break;
                eventCount++;
                yield return enumerator.Current;
            }
            activity?.SetTag("galaxy.event_count", eventCount);
        }
        finally
        {
            if (enumerator is not null) await enumerator.DisposeAsync().ConfigureAwait(false);
        }
    }
}
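`StreamEventsAsync` drives the enumerator by hand rather than using `await foreach`, most likely because C# does not allow `yield return` inside a `try` block that has a `catch` clause. Wrapping only `MoveNextAsync` in the `try`/`catch` keeps the `yield` outside it while still observing stream failures. A generic sketch of the same shape follows; `ObservedStream`, `Observe`, the `onError` callback, and the `TwoThenFail` demo source are all hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ObservedStream
{
    // Re-yields every item from `inner`; if the source throws, reports the
    // exception together with the number of items already delivered.
    public static async IAsyncEnumerable<T> Observe<T>(
        IAsyncEnumerable<T> inner, Action<Exception, long> onError)
    {
        await using var enumerator = inner.GetAsyncEnumerator();
        var count = 0L;
        while (true)
        {
            bool moved;
            try
            {
                moved = await enumerator.MoveNextAsync().ConfigureAwait(false);
            }
            catch (Exception ex)
            {
                onError(ex, count);
                throw;
            }

            if (!moved) yield break;
            count++;
            yield return enumerator.Current; // legal: this yield is outside the try/catch
        }
    }

    // Demo source: yields two items, then fails.
    public static async IAsyncEnumerable<int> TwoThenFail()
    {
        yield return 1;
        yield return 2;
        await Task.Yield();
        throw new InvalidOperationException("simulated stream failure");
    }
}
```

Consuming `ObservedStream.Observe(ObservedStream.TwoThenFail(), onError)` delivers both items, invokes `onError` with a count of 2, and then rethrows the original exception to the consumer, which is the same contract the traced subscriber gives `EventPump`.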
@@ -14,7 +14,6 @@ using ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client;
using ZB.MOM.WW.OtOpcUa.Driver.AbCip;
using ZB.MOM.WW.OtOpcUa.Driver.AbLegacy;
using ZB.MOM.WW.OtOpcUa.Driver.FOCAS;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy;
using ZB.MOM.WW.OtOpcUa.Driver.Modbus;
using ZB.MOM.WW.OtOpcUa.Driver.S7;
using ZB.MOM.WW.OtOpcUa.Driver.TwinCAT;
@@ -110,12 +109,10 @@ builder.Services.AddSingleton<NodeBootstrap>();
builder.Services.AddSingleton<DriverFactoryRegistry>(_ =>
{
    var registry = new DriverFactoryRegistry();
    // Both Galaxy backends register side-by-side under distinct DriverType names
    // ("Galaxy" → legacy GalaxyProxyDriver, "GalaxyMxGateway" → in-process GalaxyDriver
    // over the gRPC mxaccessgw). The DriverInstance row's DriverType selects between
    // them at bootstrap time — see lmx_mxgw.md / PR 4.W. Phase 7 retires the legacy
    // factory once parity tests pin both.
    GalaxyProxyDriverFactoryExtensions.Register(registry);
    // Galaxy access flows through the in-process GalaxyDriver (DriverType =
    // "GalaxyMxGateway") talking gRPC to the mxaccessgw worker. The legacy
    // out-of-process GalaxyProxyDriver retired in PR 7.2 once the parity matrix
    // (docs/v2/Galaxy.ParityMatrix.md) verified equivalence.
    ZB.MOM.WW.OtOpcUa.Driver.Galaxy.GalaxyDriverFactoryExtensions.Register(registry);
    FocasDriverFactoryExtensions.Register(registry);
    ModbusDriverFactoryExtensions.Register(registry);

Some files were not shown because too many files have changed in this diff.