Commit Graph

63 Commits

Author SHA1 Message Date
Joseph Doherty 52997ee164 feat(observability): F13d Prometheus + OpenTelemetry instrumentation
v2-ci / build (push) Failing after 38s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
OtOpcUaTelemetry (Commons/Observability) centralizes the project's Meter
+ ActivitySource so all instrumentation points emit through a single
named surface. Counters cover the hot paths:

  otopcua.deploy.applied               (outcome=ack|reject)
  otopcua.deploy.apply.duration        (s, histogram)
  otopcua.driver.lifecycle             (event=spawn|spawn_stub|stop|fault)
  otopcua.virtualtag.eval              (outcome=ok|fail|skip)
  otopcua.scriptedalarm.transition     (state=activated|acknowledged|cleared)
  otopcua.opcua.sink.write             (kind=value|alarm|rebuild)
  otopcua.redundancy.service_level_change (level=byte)

Plus two ActivitySource spans:

  otopcua.deploy.apply                 wraps DriverHostActor.ApplyAndAck
  otopcua.opcua.address_space_rebuild  wraps OpcUaPublishActor.HandleRebuild

Instruments are no-op until a listener attaches, so tests + dev hosts
pay nothing for unread telemetry.

Host Program.cs gains AddOtOpcUaObservability() (binds the OtOpcUa Meter
+ ActivitySource to OpenTelemetry, attaches a Prometheus exporter) and
MapOtOpcUaMetrics() (mounts /metrics scrape endpoint). Driver-side
internals + ASP.NET request metrics deliberately stay off — the scrape
payload is scoped to OtOpcUa signals only.

Tests use MeterListener + ActivityListener to verify
VirtualTagActor.eval, OpcUaPublishActor.AttributeValueUpdate, and
RebuildAddressSpace actually emit on the central instruments. Runtime
suite is 72 / 72 green (+3).

Closes #105. Path A (F13b/c/d) complete; next batch options: #85 UNS
folder hierarchy in SDK, or F8b/F9b production engine bindings.
2026-05-26 10:29:40 -04:00
Joseph Doherty 21eac21409 feat(opcua,host): F13c LDAP-bound UserName validator
Adds IOpcUaUserAuthenticator seam in OpcUaServer.Security with a deny-all
NullOpcUaUserAuthenticator default. OpcUaApplicationHost subscribes to
SessionManager.ImpersonateUser after _application.Start so UserName tokens
flow through the authenticator and either attach a UserIdentity to the
session (Allow) or set IdentityValidationError = BadIdentityTokenRejected
(Deny / authenticator exception). Anonymous + X509 tokens fall through to
SDK defaults.

LdapOpcUaUserAuthenticator (Host project) bridges to the same
ILdapAuthService that AddOtOpcUaAuth uses for Admin cookies / JWT, so a
single LDAP source-of-truth governs both Admin control plane and OPC UA
data plane. Program.cs registers LdapOptions + LdapAuthService +
IOpcUaUserAuthenticator on driver-role hosts; admin-only nodes are
unchanged.

OtOpcUaServerHostedService threads the resolved authenticator into
OpcUaApplicationHost so the seam respects Host DI.

10 new tests: 6 in OpcUaServer.Tests cover the pure HandleImpersonation
static method (success / denial / anonymous fallthrough / authenticator-
throw / null-username / Null authenticator); 4 in Host.IntegrationTests
cover the LdapOpcUaUserAuthenticator adapter (LDAP allow → Allow with
roles, LDAP deny → Deny, exception → backend-error denial, display-name
fallback). OpcUaServer suite is 40 / 40 green.

Closes #104. Unblocks Task 60 (dual-endpoint + ServiceLevel tests) once
#81 residual lands.
2026-05-26 10:21:37 -04:00
Joseph Doherty 50787823d3 feat(host,runtime): #108 Host DI bindings — OPC UA server + deferred sink
v2-ci / build (push) Failing after 45s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
Wires the OPC UA SDK into the fused Host's lifecycle on driver-role
nodes + spawns OpcUaPublishActor with the proper sink/publisher/dbFactory/
applier resolution. The full read+write data path is now live in
production: Deploy → DriverHost → OpcUaPublish → SDK NodeManager →
subscribed OPC UA clients.

DeferredAddressSpaceSink (Commons.OpcUa):
  - Thread-safe wrapper IOpcUaAddressSpaceSink that delegates to an
    inner sink swapped in at runtime. Needed because Akka actors
    resolve the sink at construction time, but the production sink
    (SdkAddressSpaceSink wrapping OtOpcUaNodeManager) only exists
    after the SDK StandardServer has started.
  - Defaults to NullOpcUaAddressSpaceSink so calls before swap are
    safe; SetSink(null) reverts (for graceful shutdown).

OtOpcUaServerHostedService (Host.OpcUa):
  - IHostedService that owns the OPC UA SDK lifecycle. Reads
    OpcUaApplicationHostOptions from the 'OpcUa' config section,
    creates an OtOpcUaSdkServer, boots it through OpcUaApplicationHost,
    then swaps a real SdkAddressSpaceSink into the DeferredAddressSpaceSink
    singleton.
  - SDK boot failure is logged + non-fatal — the rest of the host
    (admin UI, driver actors) keeps running. Stop reverts to null sink.

WithOtOpcUaRuntimeActors (Runtime):
  - Now spawns OpcUaPublishActor (new actor) + threads its ActorRef
    into DriverHostActor's Props so successful applies trigger the
    address-space rebuild pipeline.
  - Phase7Applier is constructed here from the resolved sink + a
    logger; OpcUaPublishActor takes both.
  - Prepends the opcua-synchronized-dispatcher HOCON so the extension
    is self-contained — consumers (Host, tests) don't need to redeclare
    the dispatcher block.
  - New OpcUaPublishActorKey + OpcUaPublishActorName for actor-registry
    resolution.
  - AddOtOpcUaRuntime now also TryAddSingleton's NullOpcUaAddressSpaceSink
    + NullServiceLevelPublisher so admin-only nodes (or tests that
    don't bind the Deferred sink) stay safe.

Host.Program.cs (driver-role only):
  - Binds DeferredAddressSpaceSink as singleton + as IOpcUaAddressSpaceSink
  - AddHostedService<OtOpcUaServerHostedService>()

Tests: OpcUaServer 24 -> 28 (+4 DeferredAddressSpaceSink unit tests),
Runtime 69 -> 69 (existing ServiceCollectionExtensionsTests extended
to verify the new mux + publish actor registration).

All 6 v2 test suites green: 177 tests passing.

Closes #108. Engine-wiring is now production-bound end-to-end on
driver-role nodes — Deploy reaches real OPC UA Variable nodes that
subscribed clients see.
2026-05-26 10:02:15 -04:00
Joseph Doherty 3e3f7588bd feat(runtime,host): close F7 — driver subscribe + write paths + Host DI
v2-ci / build (push) Failing after 42s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
Three pieces landed in one batch, closing F7-residual + Host DI #106:

Runtime/DriverInstanceActor:
  - Subscribe / Unsubscribe message contracts; the Connected state
    handles them via IDriver.ISubscribable. On every OnDataChange
    event the actor publishes AttributeValuePublished to its parent
    (DriverHostActor → OpcUaPublishActor). OPC UA StatusCode is
    mapped to the 3-state OpcUaQuality enum via severity bits
    (00=Good, 01=Uncertain, 10/11=Bad).
  - DetachSubscription tears the handler off the driver on
    DisconnectObserved, Unsubscribe, and PostStop so a stale handler
    never pushes to a dead actor.
  - WriteAttribute now dispatches IWritable.WriteAsync (batch of one)
    with a 5s CancellationTokenSource; status-code propagated to
    WriteAttributeResult on non-Good results.

Host:
  - New ProjectReferences to Core + every cross-platform driver
    assembly (AbCip/AbLegacy/FOCAS/Galaxy/Modbus/S7/TwinCAT).
    Galaxy is net10 (gRPC client to mxaccessgw); the COM-bound net48
    Wonderware Historian driver stays out of the Host's reference
    closure — its .Client gRPC wrapper is what binds for historian
    needs.
  - New DriverFactoryBootstrap.AddOtOpcUaDriverFactories() registers
    a singleton DriverFactoryRegistry, invokes each driver's
    Register(registry, loggerFactory), and binds IDriverFactory to
    DriverFactoryRegistryAdapter. Replaces the F7 NullDriverFactory
    default so deploys actually materialise real IDriver instances
    on driver-role nodes. ShouldStub() still gates per-platform
    behaviour at spawn time.
  - Program.cs wires AddOtOpcUaDriverFactories() before AddAkka so
    the runtime extension can resolve IDriverFactory from DI.

Tests: Runtime 46 -> 52 (+6):
- Write returns success when StatusCode = Good
- Write propagates non-Good status code in failure Reason
- Subscribe forwards OnDataChange to parent as AttributeValuePublished
- Quality translation: Uncertain (0x40...) and Bad (0x80...)
- Subscribe against non-ISubscribable returns failure
- DisconnectObserved detaches handler so late events are dropped

All 6 v2 test suites green: 152 tests passing.

Closes F7. F7-residual sub-tasks #110 (subscribe) and #111 (write)
both shipped. Host DI binding #106 shipped.
2026-05-26 09:28:34 -04:00
Joseph Doherty 850d6774ea feat(adminui): F15 Phase A — shell + auth + fleet + hosts pages
v2-ci / build (push) Failing after 38s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
Implements Phase A of the F15 rebuild plan: minimum-viable Admin surface
with a working sign-in path and a fleet-state landing page. Decisions Q1–Q5
of docs/v2/AdminUI-rebuild-plan.md were taken as recommended.

- App.razor (moved into AdminUI library from the Host stub; vendored
  Bootstrap from RCL wwwroot — no public CDN, air-gap safe)
- Routes.razor (AuthorizeRouteView enforces page-level [Authorize])
- RedirectToLogin.razor (preserves returnUrl through the auth hop)
- Login.razor (static SSR, posts to /auth/login; Q5 wording about
  generic-vs-specific LDAP errors)
- Account.razor (identity + fleet roles + raw LDAP groups; Q4 — no
  per-cluster grants; fleet-wide LDAP-group → role mapping only)
- Fleet.razor (per-node deployment status: reads NodeDeploymentState
  + unions with IClusterRoleInfo.MembersWithRole("driver") so freshly-
  joined nodes appear as "waiting"; 10s auto-refresh)
- Hosts.razor (Akka cluster topology: members, status, roles, role-
  leader; 5s auto-refresh)

Host's stub App.razor deleted; Program.cs now points at
AdminUI.Components.App via an added using.

All 104 v2 tests remain green.
2026-05-26 07:49:35 -04:00
Joseph Doherty 686138123f feat(runtime): F11 — HistorianAdapterActor wired to IAlarmHistorianSink
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been cancelled
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been cancelled
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been cancelled
v2-ci / build (push) Has been cancelled
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been cancelled
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been cancelled
v2-ci / integration (push) Has been cancelled
Reshapes the placeholder buffered-counter actor into a thin fire-and-forget
bridge over the existing IAlarmHistorianSink contract. Default sink is
NullAlarmHistorianSink; production deployments override the DI binding to
SqliteStoreAndForwardSink wrapping WonderwareHistorianClient (the v1
components in src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware*
are reused verbatim — actor is just a mailbox-friendly entry point).

- HistorianAdapterActor.Props(IAlarmHistorianSink?) — null defaults to NullAlarmHistorianSink
- Receive<AlarmHistorianEvent>: fire-and-forget sink.EnqueueAsync
- Receive<GetStatus>: returns sink.GetStatus() (queue depth + drain state)
- ServiceCollectionExtensions.AddOtOpcUaRuntime registers the default sink
- WithOtOpcUaRuntimeActors spawns the actor + registers HistorianAdapterActorKey
- Program.cs calls AddOtOpcUaRuntime when hasDriver

Tests: 2 new (forward-to-sink + GetStatus). Runtime suite 17 → 18.
2026-05-26 07:18:08 -04:00
Joseph Doherty f18c285cca feat(adminui): FleetStatusSignalRBridge — DPS → SignalR forwarding (F16)
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been cancelled
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been cancelled
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been cancelled
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been cancelled
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been cancelled
v2-ci / integration (push) Has been cancelled
v2-ci / build (push) Has been cancelled
New per-admin-node actor that subscribes to the fleet-status DistributedPubSub
topic + forwards every FleetStatusChanged snapshot to all SignalR clients
connected to FleetStatusHub via IHubContext.

Wired via WithOtOpcUaSignalRBridges (new AkkaConfigurationBuilder extension in
AdminUI.Hubs) — Program.cs calls it inside the if(hasAdmin) block alongside
WithOtOpcUaControlPlaneSingletons.

Per-node subscription rather than cluster-singleton: every admin node forwards
its own snapshots to its own connected clients. Simpler than singleton
coordination + acceptable because the messages are small and SignalR fan-out
is per-node anyway.
2026-05-26 07:01:08 -04:00
Joseph Doherty d6fac2d81d test(host): 2-node integration test harness + consolidate to one ActorSystem (Task 58)
Builds TwoNodeClusterHarness: two in-process Host-equivalent nodes sharing
an in-memory ConfigDb. Forms a 2-member Akka cluster. ClusterFormationTests
proves both nodes see each other as admin+driver role members.

Fixes a real production bug uncovered while wiring the harness — Program.cs
ran two separate ActorSystems (one from AddOtOpcUaCluster.AkkaHostedService
with cluster HOCON, one from Akka.Hosting.AddAkka with bare HOCON). Cluster
singletons landed on the bare ActorSystem and could not actually form a
cluster ("Configuration does not contain `akka.cluster` node").

Consolidation:
- AddOtOpcUaCluster now only binds AkkaClusterOptions + registers IClusterRoleInfo
- New WithOtOpcUaClusterBootstrap pushes embedded HOCON + Remote/Cluster options
  into Akka.Hosting's AkkaConfigurationBuilder
- AkkaHostedService.cs deleted — Akka.Hosting now owns the lifecycle
- Program.cs + harness call WithOtOpcUaClusterBootstrap inside AddAkka

Why not WebApplicationFactory<Program>? Program.cs reads OTOPCUA_ROLES from
process env (shared across in-process WAFs); the harness replays Program.cs's
DI graph from a clean WebApplicationBuilder per node with per-node config
overrides. Same production extensions, isolated config + Kestrel + Akka ports.

Tests: 93 v2 tests pass (was 91 + 2 new cluster formation), 0 skipped.
2026-05-26 06:27:04 -04:00
Joseph Doherty 09d6676e1f feat(runtime): WithOtOpcUaRuntimeActors extension for driver-role node startup (F19)
Mirrors WithOtOpcUaControlPlaneSingletons for the driver role. Spawns
DriverHostActor + DbHealthProbeActor on the host's ActorSystem and
registers both under marker keys. Host's Program.cs now calls it when
the node carries the driver role, so driver-only and admin+driver
deployments both auto-bootstrap the per-node actors.

Integration test covers the registration round-trip via Microsoft.Extensions.Hosting
+ Akka.Hosting AddAkka.
2026-05-26 06:09:37 -04:00
Joseph Doherty fa1d685ccd feat(host): health endpoints + per-environment appsettings layout 2026-05-26 05:23:01 -04:00
Joseph Doherty e2b357f89a feat(host): role-gated Program.cs composes all v2 components 2026-05-26 05:22:59 -04:00
Joseph Doherty c168c1c9c6 feat(migration): add Migrate-To-V2.ps1 idempotent migration runner 2026-05-26 04:26:01 -04:00
Joseph Doherty 30a2104fa5 feat(scaffold): introduce 8 v2 component projects
Adds the empty project skeletons that subsequent v2 tasks fill in:

  src/Core/ZB.MOM.WW.OtOpcUa.Commons      (types, interfaces, message contracts)
  src/Core/ZB.MOM.WW.OtOpcUa.Cluster      (Akka.Hosting + cluster wiring)
  src/Server/ZB.MOM.WW.OtOpcUa.Security   (cookie+JWT auth, LDAP)
  src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane (admin-role cluster singletons)
  src/Server/ZB.MOM.WW.OtOpcUa.Runtime    (per-node driver actors)
  src/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer (OPC UA SDK application host)
  src/Server/ZB.MOM.WW.OtOpcUa.AdminUI    (Razor class library)
  src/Server/ZB.MOM.WW.OtOpcUa.Host       (single fused web binary)

Each project sets TreatWarningsAsErrors=true in its own csproj (per the
Directory.Build.props deviation note in the previous commit). NuGetAuditSuppress
entries cover transitive vulnerability advisories the new strictness surfaces:

  - GHSA-g94r-2vxg-569j (OpenTelemetry.Api 1.9.0 via Akka.Cluster.Hosting/Tools)
  - GHSA-h958-fxgg-g7w3 (Opc.Ua.Core 1.5.374.126 via OpcUaServer)
  - GHSA-37gx-xxp4-5rgx + GHSA-w3x6-4m5h-cxqf (legacy advisories already accepted)

OpcUaServer pins OPCFoundation.NetStandard.Opc.Ua.Configuration to 1.5.374.126
via VersionOverride to match Opc.Ua.Server's transitive Opc.Ua.Core (same
constraint as the legacy Server project).

Runtime does NOT project-reference any concrete Driver.* assemblies; drivers
load reflectively at runtime (Phase 6). Runtime gets the IDriver contract
through Core.Abstractions instead.

Host's Microsoft.Extensions.Hosting.WindowsServices is conditional on the
Windows OS so the project builds on macOS dev machines.

Build verification: dotnet build -> 438 warnings (all pre-existing xUnit1051
in legacy Server.Tests/Admin.Tests), 0 errors. Closes Task 9 (build green
smoke check, no separate commit).
2026-05-26 03:44:56 -04:00