Compare commits
5 Commits
a8becc9c46
...
8ac71db464
| Author | SHA1 | Date | |
|---|---|---|---|
| 8ac71db464 | |||
| 7e3b56c27d | |||
| e40615dad5 | |||
| 1689901c0e | |||
| 3c3fef911c |
@@ -0,0 +1,20 @@
# Multi-stage build of OtOpcUa.Host targeting linux-x64. Used by docker-dev/docker-compose.yml
# to spin four host containers (admin-a, admin-b, driver-a, driver-b) from a single image;
# Compose drives OTOPCUA_ROLES + Cluster:* env per container to differentiate them.

FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build
WORKDIR /src
COPY . .
RUN dotnet restore ZB.MOM.WW.OtOpcUa.slnx
RUN dotnet publish src/Server/ZB.MOM.WW.OtOpcUa.Host/ZB.MOM.WW.OtOpcUa.Host.csproj \
    -c Release -o /app --no-restore

FROM mcr.microsoft.com/dotnet/aspnet:10.0 AS runtime
WORKDIR /app
COPY --from=build /app ./

EXPOSE 9000
EXPOSE 4053
EXPOSE 4840

ENTRYPOINT ["dotnet", "OtOpcUa.Host.dll"]
@@ -0,0 +1,62 @@
# docker-dev

Mac-friendly four-node OtOpcUa fleet for manual UI exercise + integration smoke tests. Spins up an Akka cluster + SQL Server + OpenLDAP + Traefik in front of two admin nodes.

## Stack

| Service | Role | Ports |
|---|---|---|
| `sql` | SQL Server 2022 (`ConfigDb` backing store) | host `14330` → container `1433` |
| `ldap` | OpenLDAP with dev users `alice` / `bob` | host `3893` → container `1389` |
| `admin-a` | OtOpcUa.Host, `OTOPCUA_ROLES=admin`, cluster seed | internal `9000` |
| `admin-b` | OtOpcUa.Host, `OTOPCUA_ROLES=admin`, joins admin-a | internal `9000` |
| `driver-a` | OtOpcUa.Host, `OTOPCUA_ROLES=driver` | host `4840` → container `4840` |
| `driver-b` | OtOpcUa.Host, `OTOPCUA_ROLES=driver` | host `4841` → container `4840` |
| `traefik` | Routes `:80` to whichever admin-* currently passes `/health/active` | host `80`, dashboard `8080` |

All four host containers share an Akka cluster bound to port `4053` inside the Compose network. The Akka `PublicHostname` of each container matches its Compose service name; the seed-node list points at `admin-a`, so the other three join through it.

## Bring up

```bash
# from the repo root
docker compose -f docker-dev/docker-compose.yml up -d --build

# wait ~15 seconds for SQL to come up + the cluster to form

open http://localhost          # Blazor admin UI via Traefik
open http://localhost:8080     # Traefik dashboard
```

The first build takes a few minutes (.NET SDK image + restore + publish). Subsequent rebuilds are faster with Docker's layer cache.

## Auth (dev only)

Use one of the LDAP dev users from `LDAP_USERS` in `docker-compose.yml`:

| Username | Password |
|---|---|
| `alice` | `alice123` |
| `bob` | `bob123` |

The Compose file places both users in `ou=FleetAdmin`, so the dev role mapping resolves to `FleetAdmin`.

## Tear down

```bash
docker compose -f docker-dev/docker-compose.yml down -v
```

The `-v` flag drops the SQL + LDAP volumes; omit it to keep ConfigDb state across restarts.

## Failover smoke

1. Watch the Traefik dashboard at `http://localhost:8080`. Both `admin-a` and `admin-b` should be listed as healthy in the `otopcua-admin` service.
2. Run `docker compose -f docker-dev/docker-compose.yml stop admin-a`. `admin-b` should pick up the admin role-leader within ~15 s (Akka split-brain stable-after). Traefik will route traffic to `admin-b` once its `/health/active` returns 200.
3. Run `docker compose -f docker-dev/docker-compose.yml start admin-a`. `admin-a` rejoins as a follower; `admin-b` keeps the leader role until something disturbs it.
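
The failover in step 2 can also be timed from a terminal instead of watched on the dashboard. The following is a minimal Python sketch, not part of the repository. It polls a probe until it succeeds and reports the elapsed time. It assumes that `/health/active` is reachable through Traefik on host port 80, which the catch-all `PathPrefix` router rule suggests; the names `http_probe` and `wait_until_healthy` are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_probe(url: str, timeout_s: float = 2.0) -> bool:
    """Return True when the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts and non-2xx answers (HTTPError).
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll probe() until it succeeds.

    Returns the elapsed seconds, or None if timeout_s passes first.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)


# Example call (assumed URL, run right after stopping admin-a):
# wait_until_healthy(lambda: http_probe("http://localhost/health/active"))
```

Keeping the probe separate from the wait loop means the same loop can be pointed at a different check, for example an OPC UA endpoint probe, without changes.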

## Notes

- This compose is for the **local Mac/Linux developer rig**. The team's CI + soak runs go to the remote docker host at `10.100.0.35` (see `docs/v2/dev-environment.md`); the file there mirrors this one with adjusted port bindings.
- The OPC UA driver endpoints (`opc.tcp://localhost:4840`, `opc.tcp://localhost:4841`) are reachable directly from the host; Traefik is only in front of the admin HTTP surface.
- Galaxy + Wonderware drivers can't run in Linux containers (they need the Windows-only mxaccessgw + Historian SDK). On non-Windows, `DriverInstanceActor.ShouldStub(driverType, roles)` returns `true` for those types and the actor goes straight to a `Stubbed` state that returns deterministic success.
@@ -0,0 +1,130 @@
# docker-dev/: Mac-friendly four-node fleet for v2 development + manual UI exercise.
#
# Stack:
#   sql       SQL Server 2022 (ConfigDb backing store)
#   ldap      OpenLDAP with the dev users from C:\publish\glauth\auth.md mirrored in
#   admin-a   OtOpcUa.Host with OTOPCUA_ROLES=admin (cluster seed)
#   admin-b   OtOpcUa.Host with OTOPCUA_ROLES=admin (joins admin-a)
#   driver-a  OtOpcUa.Host with OTOPCUA_ROLES=driver (joins via admin-a)
#   driver-b  OtOpcUa.Host with OTOPCUA_ROLES=driver (joins via admin-a)
#   traefik   Routes :80 to whichever admin-* currently passes /health/active
#
# Usage:
#   docker compose -f docker-dev/docker-compose.yml up -d --build
#   open http://localhost          # Blazor admin UI via Traefik
#   open http://localhost:8080     # Traefik dashboard
#
# Tear-down: docker compose -f docker-dev/docker-compose.yml down -v

name: otopcua-dev

services:

  sql:
    image: mcr.microsoft.com/mssql/server:2022-latest
    environment:
      ACCEPT_EULA: "Y"
      SA_PASSWORD: "OtOpcUa!Dev123"
      MSSQL_PID: Developer
    ports:
      - "14330:1433"
    healthcheck:
      test: ["CMD-SHELL", "/opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P 'OtOpcUa!Dev123' -No -Q 'SELECT 1' || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 20

  ldap:
    image: bitnami/openldap:2.6
    environment:
      LDAP_ROOT: "dc=lmxopcua,dc=local"
      LDAP_ADMIN_USERNAME: "admin"
      LDAP_ADMIN_PASSWORD: "ldapadmin"
      LDAP_USERS: "alice,bob"
      LDAP_PASSWORDS: "alice123,bob123"
      LDAP_USER_DC: "ou=FleetAdmin"
    ports:
      - "3893:1389"

  admin-a: &otopcua-host
    build:
      context: ..
      dockerfile: docker-dev/Dockerfile
    image: otopcua-host:dev
    depends_on:
      sql: { condition: service_healthy }
    environment:
      OTOPCUA_ROLES: "admin"
      ASPNETCORE_URLS: "http://+:9000"
      ConnectionStrings__ConfigDb: "Server=sql,1433;Database=OtOpcUa;User Id=sa;Password=OtOpcUa!Dev123;TrustServerCertificate=True;"
      Cluster__Hostname: "0.0.0.0"
      Cluster__Port: "4053"
      Cluster__PublicHostname: "admin-a"
      Cluster__SeedNodes__0: "akka.tcp://otopcua@admin-a:4053"
      Cluster__Roles__0: "admin"
      Security__Jwt__SigningKey: "docker-dev-signing-key-with-at-least-32-bytes-of-utf8-content-12345"
      Security__Jwt__Issuer: "otopcua-dev"
      Security__Jwt__Audience: "otopcua-dev"
      Authentication__Ldap__Server: "ldap"
      Authentication__Ldap__Port: "1389"
      Authentication__Ldap__AllowInsecureLdap: "true"

  admin-b:
    <<: *otopcua-host
    environment:
      OTOPCUA_ROLES: "admin"
      ASPNETCORE_URLS: "http://+:9000"
      ConnectionStrings__ConfigDb: "Server=sql,1433;Database=OtOpcUa;User Id=sa;Password=OtOpcUa!Dev123;TrustServerCertificate=True;"
      Cluster__Hostname: "0.0.0.0"
      Cluster__Port: "4053"
      Cluster__PublicHostname: "admin-b"
      Cluster__SeedNodes__0: "akka.tcp://otopcua@admin-a:4053"
      Cluster__Roles__0: "admin"
      Security__Jwt__SigningKey: "docker-dev-signing-key-with-at-least-32-bytes-of-utf8-content-12345"
      Security__Jwt__Issuer: "otopcua-dev"
      Security__Jwt__Audience: "otopcua-dev"
      Authentication__Ldap__Server: "ldap"
      Authentication__Ldap__Port: "1389"
      Authentication__Ldap__AllowInsecureLdap: "true"

  driver-a:
    <<: *otopcua-host
    environment:
      OTOPCUA_ROLES: "driver"
      ConnectionStrings__ConfigDb: "Server=sql,1433;Database=OtOpcUa;User Id=sa;Password=OtOpcUa!Dev123;TrustServerCertificate=True;"
      Cluster__Hostname: "0.0.0.0"
      Cluster__Port: "4053"
      Cluster__PublicHostname: "driver-a"
      Cluster__SeedNodes__0: "akka.tcp://otopcua@admin-a:4053"
      Cluster__Roles__0: "driver"
    ports:
      - "4840:4840"

  driver-b:
    <<: *otopcua-host
    environment:
      OTOPCUA_ROLES: "driver"
      ConnectionStrings__ConfigDb: "Server=sql,1433;Database=OtOpcUa;User Id=sa;Password=OtOpcUa!Dev123;TrustServerCertificate=True;"
      Cluster__Hostname: "0.0.0.0"
      Cluster__Port: "4053"
      Cluster__PublicHostname: "driver-b"
      Cluster__SeedNodes__0: "akka.tcp://otopcua@admin-a:4053"
      Cluster__Roles__0: "driver"
    ports:
      - "4841:4840"

  traefik:
    image: traefik:v3.1
    command:
      - --entrypoints.web.address=:80
      - --providers.file.filename=/etc/traefik/dynamic.yml
      - --providers.file.watch=true
      - --api.insecure=true
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - ./traefik-dynamic.yml:/etc/traefik/dynamic.yml:ro
    depends_on:
      - admin-a
      - admin-b
@@ -0,0 +1,21 @@
# docker-dev companion to scripts/install/traefik-dynamic.yml. Same routing rules,
# but the upstream targets are the Compose service names (admin-a, admin-b) on
# port 9000 instead of the Windows hostnames a bare-metal deployment would use.

http:
  routers:
    otopcua-admin:
      entryPoints: ["web"]
      rule: "PathPrefix(`/`)"
      service: otopcua-admin

  services:
    otopcua-admin:
      loadBalancer:
        servers:
          - url: "http://admin-a:9000"
          - url: "http://admin-b:9000"
        healthCheck:
          path: /health/active
          interval: 5s
          timeout: 2s
+6
-3
@@ -9,10 +9,13 @@ The project was originally called **LmxOpcUa** (a single-driver Galaxy/MXAccess
|
|||||||
|
|
||||||
## Platform overview
|
## Platform overview
|
||||||
|
|
||||||
- **Core** owns the OPC UA stack, address space, session/security/subscription machinery.
|
> **v2 (2026-05-26):** the separate `OtOpcUa.Server` + `OtOpcUa.Admin` services fused into a single role-gated `OtOpcUa.Host` binary, joined by an Akka.NET cluster. See [v2 design](plans/2026-05-26-akka-hosting-alignment-design.md) for the architectural decision.
|
||||||
|
|
||||||
|
- **Core** owns shared abstractions (driver capability contracts, scripting, virtual tags, alarm historian).
|
||||||
- **Drivers** plug in via capability interfaces in `ZB.MOM.WW.OtOpcUa.Core.Abstractions`: `IDriver`, `IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider`, `IPerCallHostResolver`. Each driver opts into whichever it supports.
|
- **Drivers** plug in via capability interfaces in `ZB.MOM.WW.OtOpcUa.Core.Abstractions`: `IDriver`, `IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider`, `IPerCallHostResolver`. Each driver opts into whichever it supports.
|
||||||
- **Server** is the OPC UA endpoint process (net10, AnyCPU). Hosts every driver in-process. The Galaxy driver reaches MXAccess via gRPC to a separately-installed **mxaccessgw** sidecar (sibling repo); it is no longer hosted from this repo.
|
- **Host** (`src/Server/ZB.MOM.WW.OtOpcUa.Host`) is the single fused binary (.NET 10, AnyCPU). `OTOPCUA_ROLES` env decides what to mount: `admin` (Blazor + control-plane singletons), `driver` (OPC UA endpoint + per-node actors), or both. See [ServiceHosting.md](ServiceHosting.md).
|
||||||
- **Admin** is the Blazor Server operator UI (net10, x64). Owns the Config DB draft/publish flow, ACL + role-grant authoring, fleet status + `/metrics` scrape endpoint.
|
- **Cluster + ControlPlane + Runtime + AdminUI + Security** sit between Core and Host. The cluster glues per-node actors into one logical fleet; the control-plane singletons (deploy coordinator, audit writer, redundancy state) live on the admin role-leader. See [Redundancy.md](Redundancy.md).
|
||||||
|
- The Galaxy driver still reaches MXAccess via gRPC to a separately-installed **mxaccessgw** sidecar (sibling repo).
|
||||||
|
|
||||||
## Where to find what
|
## Where to find what
|
||||||
|
|
||||||
|
|||||||
+64
-74
@@ -1,103 +1,93 @@
|
|||||||
# Redundancy
|
# Redundancy (v2)
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
OtOpcUa supports OPC UA **non-transparent** warm/hot redundancy. Two (or more) OtOpcUa Server processes run side-by-side, share the same Config DB, the same driver backends (Galaxy ZB, MXAccess runtime, remote PLCs), and advertise the same OPC UA node tree. Each process owns a distinct `ApplicationUri`; OPC UA clients see both endpoints via the standard `ServerUriArray` and pick one based on the `ServiceLevel` that each server publishes.
|
OtOpcUa supports OPC UA **non-transparent** warm/hot redundancy. Two or more `OtOpcUa.Host` processes run side-by-side, share the same Config DB, and join the same Akka.NET cluster. Each process owns a distinct `ApplicationUri`; OPC UA clients see both endpoints via the standard `ServerUriArray` and pick one based on the `ServiceLevel` byte that each server publishes.
|
||||||
|
|
||||||
The redundancy surface lives in `src/Server/ZB.MOM.WW.OtOpcUa.Server/Redundancy/`:
|
> **v2 change.** v1's operator-managed `ClusterNode.RedundancyRole` column + `RedundancyCoordinator` / `ApplyLeaseRegistry` / `PeerHttpProbeLoop` are gone. Primary/secondary is now derived from **Akka cluster role-leader** for the `driver` role. The operator no longer writes a role into the DB; cluster topology + health drive ServiceLevel automatically.
|
||||||
|
|
||||||
| Class | Role |
|
The runtime pieces live in:
|
||||||
|---|---|
|
|
||||||
| `RedundancyCoordinator` | Process-singleton; owns the current `RedundancyTopology` loaded from the `ClusterNode` table. `RefreshAsync` re-reads after `sp_PublishGeneration` so operator role swaps take effect without a process restart. CAS-style swap (`Interlocked.Exchange`) means readers always see a coherent snapshot. |
|
|
||||||
| `RedundancyTopology` | Immutable `(ClusterId, Self, Peers, ServerUriArray, ValidityFlags)` snapshot. |
|
|
||||||
| `ApplyLeaseRegistry` | Tracks in-progress `sp_PublishGeneration` apply leases keyed on `(ConfigGenerationId, PublishRequestId)`. `await using` the disposable scope guarantees every exit path (success / exception / cancellation) decrements the lease; a stale-lease watchdog force-closes any lease older than `ApplyMaxDuration` (default 10 minutes) so a crashed publisher can't pin the node at `PrimaryMidApply`. |
|
|
||||||
| `PeerReachabilityTracker` | Maintains last-known reachability for each peer node over two independent probes — OPC UA ping and HTTP `/healthz`. Both must succeed for `peerReachable = true`. |
|
|
||||||
| `RecoveryStateManager` | Gates transitions out of the `Recovering*` bands; requires dwell + publish-witness satisfaction before allowing a return to nominal. |
|
|
||||||
| `ServiceLevelCalculator` | Pure function `(role, selfHealthy, peerUa, peerHttp, applyInProgress, recoveryDwellMet, topologyValid, operatorMaintenance) → byte`. |
|
|
||||||
| `RedundancyStatePublisher` | Orchestrates inputs into the calculator, pushes the resulting byte to the OPC UA `ServiceLevel` variable via an edge-triggered `OnStateChanged` event, and fires `OnServerUriArrayChanged` when the topology's `ServerUriArray` shifts. |
|
|
||||||
|
|
||||||
## Data model
|
| Component | Project | Role |
|
||||||
|
|
||||||
Per-node redundancy state lives in the Config DB `ClusterNode` table (`src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ClusterNode.cs`):
|
|
||||||
|
|
||||||
| Column | Role |
|
|
||||||
|---|---|
|
|
||||||
| `NodeId` | Unique node identity; matches `Node:NodeId` in the server's bootstrap `appsettings.json`. |
|
|
||||||
| `ClusterId` | Foreign key into `ServerCluster`. |
|
|
||||||
| `RedundancyRole` | `Primary`, `Secondary`, or `Standalone` (`RedundancyRole` enum in `Configuration/Enums`). |
|
|
||||||
| `ServiceLevelBase` | Per-node base value used to bias nominal ServiceLevel output. |
|
|
||||||
| `ApplicationUri` | Unique-per-node OPC UA ApplicationUri advertised in endpoint descriptions. |
|
|
||||||
|
|
||||||
`ServerUriArray` is derived from the set of peer `ApplicationUri` values at topology-load time and republished when the topology changes.
|
|
||||||
|
|
||||||
## ServiceLevel matrix
|
|
||||||
|
|
||||||
`ServiceLevelCalculator` produces one of the following bands (see `ServiceLevelBand` enum in the same file):
|
|
||||||
|
|
||||||
| Band | Byte | Meaning |
|
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| `Maintenance` | 0 | Operator-declared maintenance. |
|
| `ServiceLevelCalculator` | `OtOpcUa.ControlPlane.Redundancy` | Pure function `(NodeHealthInputs) → byte`. No side effects. |
|
||||||
| `NoData` | 1 | Self-reported unhealthy (`/healthz` fails). |
|
| `RedundancyStateActor` | `OtOpcUa.ControlPlane.Redundancy` | Admin-role cluster singleton; subscribes to cluster topology events, debounces 250ms, broadcasts `RedundancyStateChanged` on the `redundancy-state` DPS topic. |
|
||||||
| `InvalidTopology` | 2 | More than one Primary detected; both nodes self-demote. |
|
| `DbHealthProbeActor` | `OtOpcUa.Runtime.Health` | Per-node; runs `SELECT 1` against ConfigDb every 5s. Read by health endpoint + redundancy calc. |
|
||||||
| `RecoveringBackup` | 30 | Backup post-fault, dwell not met. |
|
| `PeerOpcUaProbeActor` | `OtOpcUa.Runtime.Health` | Per-node; pings peer `opc.tcp://peer:4840` (real probe call is staged for follow-up F12). |
|
||||||
| `BackupMidApply` | 50 | Backup inside a publish-apply window. |
|
| `ClusterRoleInfo` | `OtOpcUa.Cluster` | Live view of cluster membership + role-leader; exposes `IClusterRoleInfo` to the rest of the host. |
|
||||||
| `IsolatedBackup` | 80 | Primary unreachable; Backup says "take over if asked" — does **not** auto-promote (non-transparent model). |
|
|
||||||
| `AuthoritativeBackup` | 100 | Backup nominal. |
|
|
||||||
| `RecoveringPrimary` | 180 | Primary post-fault, dwell not met. |
|
|
||||||
| `PrimaryMidApply` | 200 | Primary inside a publish-apply window. |
|
|
||||||
| `IsolatedPrimary` | 230 | Primary with unreachable peer, retains authority. |
|
|
||||||
| `AuthoritativePrimary` | 255 | Primary nominal. |
|
|
||||||
|
|
||||||
The reserved bands (0 Maintenance, 1 NoData, 2 InvalidTopology) take precedence over operational states per OPC UA Part 5 §6.3.34. Operational values occupy 2..255 so spec-compliant clients that treat "<3 = unhealthy" keep working.
|
## ServiceLevel tiers (Part 5 §6.5)
|
||||||
|
|
||||||
Standalone nodes (single-instance deployments) report `AuthoritativePrimary` when healthy and `PrimaryMidApply` during publish.
|
`ServiceLevelCalculator.Compute(NodeHealthInputs)` returns a byte in 0..255 by tier:
|
||||||
|
|
||||||
## Publish fencing and split-brain prevention
|
| Tier | Byte | Condition |
|
||||||
|
|---|---|---|
|
||||||
|
| Down | 0 | Member status is not `Up` or `Joining` (leaving, removed, exiting). |
|
||||||
|
| Critically degraded | 100 | ConfigDb unreachable AND data is stale. |
|
||||||
|
| Stale | 200 | Data stale but ConfigDb reachable. |
|
||||||
|
| Healthy follower | 240 | DB ok + OPC UA probe ok + not stale. |
|
||||||
|
| Healthy leader | 250 | Healthy + this node is the `driver` role-leader. |
|
||||||
|
|
||||||
Any Admin-triggered `sp_PublishGeneration` acquires an apply lease through `ApplyLeaseRegistry.BeginApplyLease`. While the lease is held:
|
Drivers write their computed byte into the OPC UA `ServiceLevel` Variable on each refresh. Clients with the standard redundancy heuristic ("pick the highest ServiceLevel") therefore prefer the role-leader and fall back to followers on its degradation.
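
The tier table above can be read as an ordered set of checks. The following Python sketch is an illustration only; the real `ServiceLevelCalculator` is C# and its exact truth table lives in the design document. The field names on `NodeHealthInputs` are hypothetical, and input combinations that the table does not list (for example a fresh node whose ConfigDb is unreachable) return `None` here rather than a guessed byte.

```python
from dataclasses import dataclass
from typing import Optional

# Tier bytes copied from the table above.
DOWN = 0
CRITICALLY_DEGRADED = 100
STALE = 200
HEALTHY_FOLLOWER = 240
HEALTHY_LEADER = 250


@dataclass(frozen=True)
class NodeHealthInputs:
    # Hypothetical field names; the real record shape is not shown in this document.
    member_status: str            # Akka member status, e.g. "Up", "Joining", "Leaving"
    db_reachable: bool            # result of the ConfigDb health probe
    opcua_probe_ok: bool          # result of the peer OPC UA probe
    data_stale: bool
    is_driver_role_leader: bool


def compute_service_level(inputs: NodeHealthInputs) -> Optional[int]:
    """Map health inputs to a ServiceLevel byte, following the tier table."""
    if inputs.member_status not in ("Up", "Joining"):
        return DOWN
    if inputs.data_stale:
        return STALE if inputs.db_reachable else CRITICALLY_DEGRADED
    if inputs.db_reachable and inputs.opcua_probe_ok:
        return HEALTHY_LEADER if inputs.is_driver_role_leader else HEALTHY_FOLLOWER
    # Combination not covered by the table; the real calculator decides this case.
    return None
```

Because the function has no side effects, every row of the table can be checked with a plain assertion, which is the main benefit of keeping the calculator pure.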
|
||||||
|
|
||||||
- The calculator reports `PrimaryMidApply` / `BackupMidApply` — clients see the band shift and cut over to the unaffected peer rather than racing against a half-applied generation.
|
## Data flow
|
||||||
- `RedundancyCoordinator.RefreshAsync` is called at the end of the apply window so the post-publish topology becomes visible exactly once, atomically.
|
|
||||||
- The watchdog force-closes any lease older than `ApplyMaxDuration`; a stuck publisher therefore cannot strand a node at `PrimaryMidApply`.
|
|
||||||
|
|
||||||
Because role transitions are **operator-driven** (write `RedundancyRole` in the Config DB + publish), the Backup never auto-promotes. An `IsolatedBackup` at 80 is the signal that the operator should intervene; auto-failover is intentionally out of scope for the non-transparent model (decision #154).
|
```
|
||||||
|
Cluster topology event ──┐
|
||||||
|
DB health probe ─────────┤
|
||||||
|
OPC UA peer probe ───────┤
|
||||||
|
▼
|
||||||
|
RedundancyStateActor (admin singleton)
|
||||||
|
│ debounce 250ms
|
||||||
|
▼
|
||||||
|
DPS topic "redundancy-state"
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
Driver nodes' OpcUaPublishActor
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
ServiceLevelCalculator → byte
|
||||||
|
│
|
||||||
|
▼
|
||||||
|
OPC UA ServiceLevel Variable
|
||||||
|
```
|
||||||
|
|
||||||
## Metrics
|
The admin singleton is the cluster's only `RedundancyStateActor`. If the admin leader fails over, the new admin node spins up its replacement, re-subscribes to cluster events, and publishes a fresh snapshot from the current `Cluster.State`. There is no DB-persisted state to recover.
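
The 250 ms debounce in the data flow can be illustrated with a small timeline function. This Python sketch assumes a trailing-edge debounce, meaning a snapshot is published once 250 ms pass with no further topology event. The document only says the actor "debounces 250ms", so that reading and the function name `debounce_publish_times` are assumptions.

```python
from typing import List


def debounce_publish_times(event_times_ms: List[int], quiet_ms: int = 250) -> List[int]:
    """Return the times at which a trailing-edge debounce would publish.

    event_times_ms must be sorted ascending. An event leads to a publish
    only if no later event arrives within quiet_ms of it.
    """
    publishes = []
    for index, event_time in enumerate(event_times_ms):
        is_last = index + 1 == len(event_times_ms)
        if is_last or event_times_ms[index + 1] - event_time >= quiet_ms:
            publishes.append(event_time + quiet_ms)
    return publishes
```

Under this reading, a burst of membership events during a node restart collapses into one `RedundancyStateChanged` broadcast instead of one per event.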
|
||||||
|
|
||||||
`RedundancyMetrics` in `src/Server/ZB.MOM.WW.OtOpcUa.Admin/Services/RedundancyMetrics.cs` registers the `ZB.MOM.WW.OtOpcUa.Redundancy` meter on the Admin process. Instruments:
|
## Configuration
|
||||||
|
|
||||||
| Name | Kind | Tags | Description |
|
Per-node identity comes from `appsettings.json` + the `OTOPCUA_ROLES` env var:
|
||||||
|---|---|---|---|
|
|
||||||
| `otopcua.redundancy.role_transition` | Counter<long> | `cluster.id`, `node.id`, `from_role`, `to_role` | Incremented every time `FleetStatusPoller` observes a `RedundancyRole` change on a `ClusterNode` row. |
|
|
||||||
| `otopcua.redundancy.primary_count` | ObservableGauge<long> | `cluster.id` | Primary-role nodes per cluster — should be exactly 1 in nominal state. |
|
|
||||||
| `otopcua.redundancy.secondary_count` | ObservableGauge<long> | `cluster.id` | Secondary-role nodes per cluster. |
|
|
||||||
| `otopcua.redundancy.stale_count` | ObservableGauge<long> | `cluster.id` | Nodes whose `LastSeenAt` exceeded the stale threshold. |
|
|
||||||
|
|
||||||
Admin `Program.cs` wires OpenTelemetry to the Prometheus exporter when `Metrics:Prometheus:Enabled=true` (default), exposing the meter under `/metrics`. The endpoint is intentionally unauthenticated — fleet conventions put it behind a reverse-proxy basic-auth gate if needed.
|
```json
|
||||||
|
{
|
||||||
|
"Cluster": {
|
||||||
|
"Hostname": "0.0.0.0",
|
||||||
|
"Port": 4053,
|
||||||
|
"PublicHostname": "node-a.lan",
|
||||||
|
"SeedNodes": ["akka.tcp://otopcua@node-a.lan:4053"],
|
||||||
|
"Roles": ["admin", "driver"]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
## Real-time notifications (Admin UI)
|
```
|
||||||
|
OTOPCUA_ROLES=admin,driver
|
||||||
|
```
|
||||||
|
|
||||||
`FleetStatusPoller` in `src/Server/ZB.MOM.WW.OtOpcUa.Admin/Hubs/` polls the `ClusterNode` table, records role transitions, updates `RedundancyMetrics.SetClusterCounts`, and pushes a `RoleChanged` SignalR event onto `FleetStatusHub` when a transition is observed. `RedundancyTab.razor` subscribes with `_hub.On<RoleChangedMessage>("RoleChanged", …)` so connected Admin sessions see role swaps the moment they happen.
|
Both nodes share the same `ConfigDb` connection string; `Cluster.PublicHostname` + `Roles` are what makes them distinct in cluster gossip. The first node bootstraps the cluster (its address goes in `SeedNodes`); the second node joins via the same `SeedNodes` list.
|
||||||
|
|
||||||
## Configuring a redundant pair
|
There is no longer a `Node:NodeId` setting, no `ClusterNode.RedundancyRole`, no `ServiceLevelBase`. NodeId is derived as `host:port` of the cluster `PublicHostname` (see `ClusterRoleInfo.LocalNode` for the formula).
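
The docker-dev compose file supplies the same `Cluster` settings through environment variables such as `Cluster__SeedNodes__0`. .NET configuration treats `__` in an environment variable name as the section separator and addresses array elements by index. The Python sketch below shows that mapping for the JSON above; `flatten_to_env` is a hypothetical helper written for illustration, not something shipped with the project.

```python
from typing import Any, Dict


def flatten_to_env(section: Any, prefix: str = "") -> Dict[str, str]:
    """Flatten nested JSON-style config into .NET-style environment variable names."""
    if isinstance(section, dict):
        items = list(section.items())
    elif isinstance(section, list):
        # Array elements are addressed by their index: SeedNodes__0, SeedNodes__1, ...
        items = [(str(index), value) for index, value in enumerate(section)]
    else:
        # Leaf value: environment variables are always strings.
        return {prefix: str(section)}

    env: Dict[str, str] = {}
    for key, value in items:
        child = f"{prefix}__{key}" if prefix else key
        env.update(flatten_to_env(value, child))
    return env


settings = {
    "Cluster": {
        "Hostname": "0.0.0.0",
        "Port": 4053,
        "PublicHostname": "node-a.lan",
        "SeedNodes": ["akka.tcp://otopcua@node-a.lan:4053"],
        "Roles": ["admin", "driver"],
    }
}
env = flatten_to_env(settings)
```

For the example settings, `env` contains keys such as `Cluster__Port`, `Cluster__SeedNodes__0` and `Cluster__Roles__1`, matching the naming used in `docker-compose.yml`.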
|
||||||
|
|
||||||
Redundancy is configured **in the Config DB, not appsettings.json**. The fields that must differ between the two instances:
|
## Split-brain
|
||||||
|
|
||||||
| Field | Location | Instance 1 | Instance 2 |
|
`akka.conf` configures Akka's split-brain resolver with `active-strategy = keep-oldest`, `stable-after = 15s`, and `failure-detector.threshold = 10.0`. Under a clean partition, the side that contains the oldest member stays up and the other side downs itself within ~15 seconds. The `RedundancyStateActor` on the surviving partition re-computes from the post-partition `Cluster.State`.
|
||||||
|---|---|---|---|
|
|
||||||
| `NodeId` | `appsettings.json` `Node:NodeId` (bootstrap) | `node-a` | `node-b` |
|
|
||||||
| `ClusterNode.ApplicationUri` | Config DB | `urn:node-a:OtOpcUa` | `urn:node-b:OtOpcUa` |
|
|
||||||
| `ClusterNode.RedundancyRole` | Config DB | `Primary` | `Secondary` |
|
|
||||||
| `ClusterNode.ServiceLevelBase` | Config DB | typically 255 | typically 100 |
|
|
||||||
|
|
||||||
Shared between instances: `ClusterId`, Config DB connection string, published generation, cluster-level ACLs, UNS hierarchy, driver instances.
|
There is no operator-driven role swap during a partition. Failover is what the cluster does automatically.
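
The keep-oldest decision can be sketched as a single lookup: the partition that holds the oldest member survives. The Python below is a simplified illustration under that assumption. Akka's resolver has further options that it ignores (for example, how an oldest member that ends up alone is handled), and the function name and the join order used in the example are hypothetical.

```python
from typing import List, Set


def surviving_side(partitions: List[Set[str]], join_order: List[str]) -> Set[str]:
    """Return the partition that contains the oldest member.

    join_order lists members from oldest to youngest.
    """
    oldest = join_order[0]
    for side in partitions:
        if oldest in side:
            return side
    raise ValueError("oldest member is not present in any partition")


# Assumed join order for the four-node dev fleet: the seed node joined first.
order = ["admin-a", "admin-b", "driver-a", "driver-b"]
split = [{"admin-a", "driver-a"}, {"admin-b", "driver-b"}]
survivors = surviving_side(split, order)
```

In this example `survivors` is the side containing `admin-a`, so `admin-b` and `driver-b` would down themselves once the partition has been stable for the configured `stable-after` period.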
|
||||||
|
|
||||||
Role swaps, stand-alone promotions, and base-level adjustments all happen through the Admin UI `RedundancyTab` — the operator edits the `ClusterNode` row in a draft generation and publishes. `RedundancyCoordinator.RefreshAsync` picks up the new topology without a process restart.
|
|
||||||
|
|
||||||
## Client-side failover
|
## Client-side failover
|
||||||
|
|
||||||
The OtOpcUa Client CLI at `src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI` supports `-F` / `--failover-urls` for automatic client-side failover; for long-running subscriptions the CLI monitors session KeepAlive and reconnects to the next available server, recreating the subscription on the new endpoint. See [`Client.CLI.md`](Client.CLI.md) for the command reference.
|
The OtOpcUa Client CLI at `src/Client/ZB.MOM.WW.OtOpcUa.Client.CLI` supports `-F` / `--failover-urls` for automatic client-side failover; for long-running subscriptions the CLI monitors session KeepAlive and reconnects to the next available server, recreating the subscription on the new endpoint. See [`Client.CLI.md`](Client.CLI.md).
|
||||||
|
|
||||||
## Depth reference
|
## Depth reference
|
||||||
|
|
||||||
For the full decision trail and implementation plan — topology invariants, peer-probe cadence, recovery-dwell policy, compliance-script guard against enum-value drift — see `docs/v2/plan.md` §Phase 6.3.
|
For the full design — message contracts, tiered calculator truth table, recovery semantics — see `docs/plans/2026-05-26-akka-hosting-alignment-design.md` §6.
|
||||||
|
|||||||
+55
-41
@@ -1,62 +1,76 @@
|
|||||||
# Service Hosting
|
# Service Hosting (v2)
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
A production OtOpcUa deployment runs **two or three processes**, each
|
A production OtOpcUa deployment runs **one binary per node**, plus the optional Wonderware historian sidecar:
|
||||||
with a distinct runtime and install surface:
|
|
||||||
|
|
||||||
| Process | Project | Runtime | Platform | Responsibility |
|
| Process | Project | Runtime | Platform | Responsibility |
|
||||||
|---|---|---|---|---|
|
|---|---|---|---|---|
|
||||||
| **OtOpcUa Server** | `src/Server/ZB.MOM.WW.OtOpcUa.Server` | .NET 10 | x64 | Hosts the OPC UA endpoint; loads every driver in-process (Modbus, S7, AbCip, AbLegacy, TwinCAT, FOCAS, OPC UA Client, Galaxy via mxaccessgw); exposes `/healthz`. |
|
| **OtOpcUa Host** | `src/Server/ZB.MOM.WW.OtOpcUa.Host` | .NET 10 | AnyCPU | Single fused binary. `OTOPCUA_ROLES` env decides what to mount: `admin` (Blazor + auth + control-plane singletons), `driver` (OPC UA endpoint + per-driver actors), or both. |
|
||||||
| **OtOpcUa Admin** | `src/Server/ZB.MOM.WW.OtOpcUa.Admin` | .NET 10 (ASP.NET Core / Blazor Server) | x64 | Operator UI for Config DB editing + fleet status, SignalR hubs (`FleetStatusHub`, `AlertHub`), Prometheus `/metrics`. |
|
| **OtOpcUa Wonderware Historian** *(optional)* | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware` | .NET Framework 4.8 | x86 (32-bit) | Out-of-process sidecar exposing the Wonderware Historian SDK over a named pipe. Required only when `Historian:Wonderware:Enabled=true`. |
|
||||||
| **OtOpcUa Wonderware Historian** *(optional)* | `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware` | .NET Framework 4.8 | x86 (32-bit) | Out-of-process sidecar exposing the Wonderware Historian SDK over a named pipe. Required only when `Historian:Wonderware:Enabled=true` in `appsettings.json`. |
|
|
||||||
|
|
||||||
Galaxy access uses a separately-installed **mxaccessgw** running out
|
Galaxy access still uses the separately-installed **mxaccessgw** sidecar (see `docs/v2/Galaxy.ParityRig.md`); the gateway owns the MXAccess COM bitness constraint (its worker is x86 net48). Nothing in the OtOpcUa repo carries that constraint anymore.
|
||||||
of a sibling repo (`c:\Users\dohertj2\Desktop\mxaccessgw\`) — see
|
|
||||||
`docs/v2/Galaxy.ParityRig.md` for setup. The mxaccessgw owns the
|
|
||||||
MXAccess COM bitness constraint (its worker is x86 net48); nothing
|
|
||||||
in the OtOpcUa repo carries that constraint anymore. PR 7.2 retired
|
|
||||||
the legacy in-process `Galaxy.Host` / `Galaxy.Proxy` / `Galaxy.Shared`
|
|
||||||
projects + the `OtOpcUaGalaxyHost` Windows service.
|
|
||||||
|
|
||||||
## OtOpcUa Server
|
> **v2 change.** v1's separate `OtOpcUa.Server` + `OtOpcUa.Admin` Windows services merged into a single role-gated `OtOpcUa.Host` binary. Two installers became one (with a `-Roles` parameter). The whole DI graph is composed in `OtOpcUa.Host/Program.cs`; per-role wiring is conditional on the env var.
|
||||||
|
|
||||||
Hosted via `Microsoft.Extensions.Hosting` with `AddWindowsService`
|
## Role gating
|
||||||
(decision #30 — replaced TopShelf in v2). The host's `Build()`
|
|
||||||
returns immediately when launched interactively (e.g. `dotnet run`)
|
|
||||||
but blocks for SCM signals when running as a Windows service.
|
|
||||||
|
|
||||||
In-process drivers are registered at startup in `Program.cs`'s
|
`Program.cs` reads `OTOPCUA_ROLES`, parses it with `RoleParser`, and conditionally registers services:
|
||||||
`DriverFactoryRegistry` block; the `DriverInstance` rows in the
|
|
||||||
central Config DB select which driver factories materialise into
|
|
||||||
live `IDriver` instances. See `docs/v2/driver-specs.md` for the
|
|
||||||
per-driver `DriverConfig` JSON shapes.
|
|
||||||
|
|
||||||
## OtOpcUa Admin
|
| Role present | Wires |
|
||||||
|
|---|---|
|
||||||
|
| `admin` | `AddOtOpcUaAuth`, `AddAdminUI`, `AddSignalR`, `AddOtOpcUaAdminClients`, `MapOtOpcUaAuth`, `MapAdminUI<App>`, `MapOtOpcUaHubs`, `WithOtOpcUaControlPlaneSingletons` (5 admin singletons via `Akka.Hosting`) |
|
||||||
|
| `driver` | `WithOtOpcUaRuntimeActors` (DriverHostActor + DbHealthProbeActor) — and the OPC UA endpoint on port 4840 |
|
||||||
|
| Either / both | `AddOtOpcUaConfigDb`, `AddOtOpcUaCluster`, `AddOtOpcUaHealth` (`/health/ready`, `/health/active`, `/healthz`) |
|
||||||
|
|
||||||
Same hosting model; runs the Blazor Server UI + SignalR hubs.
|
Single-node dev: `OTOPCUA_ROLES=admin,driver`. Production: typically two admin nodes (HA pair) + N driver nodes.
|
||||||
Reads from the same Config DB the Server writes to.
|
|
||||||
|
## Akka cluster
|
||||||
|
|
||||||
|
The host joins an Akka.NET cluster bound to the address in `appsettings.json::Cluster`:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"Cluster": {
|
||||||
|
"Hostname": "0.0.0.0",
|
||||||
|
"Port": 4053,
|
||||||
|
"PublicHostname": "node-a.lan",
|
||||||
|
"SeedNodes": ["akka.tcp://otopcua@node-a.lan:4053"],
|
||||||
|
"Roles": ["admin", "driver"]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- `WithOtOpcUaClusterBootstrap` (in `OtOpcUa.Cluster`) loads the embedded HOCON (split-brain resolver, pinned dispatcher, failure detector tuning) and overlays remote endpoint + cluster options.
|
||||||
|
- All cluster singletons + per-node actors live on this single ActorSystem — there is no second Akka instance.
|
||||||
|
|
||||||
|
See [Redundancy.md](Redundancy.md) for the role-leader + ServiceLevel story.
|
||||||
|
|
||||||
|
## Health endpoints
|
||||||
|
|
||||||
|
Both admin and driver nodes expose:
|
||||||
|
|
||||||
|
| Path | Status meaning |
|
||||||
|
|---|---|
|
||||||
|
| `/healthz` | Process alive. |
|
||||||
|
| `/health/ready` | ConfigDb reachable + cluster member state is `Up`. |
|
||||||
|
| `/health/active` | Admin-role leader (the node Traefik or an HA LB should route traffic to). |
|
||||||
|
|
||||||
|
Used by Traefik for the active-leader-only routing pattern (see [Task 63 traefik docs](v2/Architecture-v2.md) — TODO).
|
||||||
|
|
||||||
## OtOpcUa Wonderware Historian (optional)
|
## OtOpcUa Wonderware Historian (optional)
|
||||||
|
|
||||||
When `Historian:Wonderware:Enabled=true`, the Server speaks to a
|
Unchanged from v1. Pipe IPC contract lives in `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Contracts/`; sidecar pipe handler in `src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/Pipe/`. Install via `scripts/install/Install-Services.ps1 -InstallWonderwareHistorian`.
|
||||||
sidecar that wraps the Wonderware Historian SDK (which is .NET
|
|
||||||
Framework only). The pipe IPC contract is in
|
|
||||||
`src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Client/Contracts/`
|
|
||||||
and the sidecar's pipe handler lives at
|
|
||||||
`src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware/Pipe/`.
|
|
||||||
|
|
||||||
Install via the `-InstallWonderwareHistorian` switch on
|
|
||||||
`scripts/install/Install-Services.ps1`.
|
|
||||||
|
|
||||||
## Install / Uninstall
|
## Install / Uninstall
|
||||||
|
|
||||||
- `scripts/install/Install-Services.ps1` — installs `OtOpcUa` and
|
- `scripts/install/Install-Services.ps1 -Roles admin,driver` — installs `OtOpcUaHost`. v2 rewrite tracked as plan Task 62.
|
||||||
optionally `OtOpcUaWonderwareHistorian`.
|
- `scripts/install/Uninstall-Services.ps1` — stops + removes the host service (and the historian sidecar if installed).
|
||||||
- `scripts/install/Uninstall-Services.ps1` — stops + removes both,
|
|
||||||
plus `OtOpcUaGalaxyHost` if a pre-7.2 rig still carries it.
|
|
||||||
|
|
||||||
## Logging
|
## Logging
|
||||||
|
|
||||||
Serilog with rolling-daily file sinks. Each service writes to
|
Serilog with rolling-daily file sinks. Each host writes to `logs/otopcua-*.log` plus stdout (NSSM/systemd-friendly). Per-environment log level overrides go in `appsettings.{Environment}.json`.
|
||||||
`%ProgramData%\OtOpcUa\<service>-*.log` plus stdout (NSSM-friendly).
|
|
||||||
|
## Depth reference
|
||||||
|
|
||||||
|
For the full host-architecture rationale (why fused vs. split, role-gating tradeoffs, multi-node deployment shapes), see `docs/plans/2026-05-26-akka-hosting-alignment-design.md` §3-4.
|
||||||
|
|||||||
@@ -71,10 +71,10 @@
|
|||||||
{"id": 59, "subject": "Task 59: Deploy + failover integration tests", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [60], "blockedBy": [58], "commit": "5cfbe8b", "deviation": "Happy-path + idempotency landed. Failover scenarios (kill-mid-apply, split-brain, restart-during-deploy) deferred as F22 — they need node-down/restart primitives on the harness. Two production bugs fixed in this commit: (1) coordinator missing DPS subscription for ACKs, (2) NodeId collision on shared loopback host."},
|
{"id": 59, "subject": "Task 59: Deploy + failover integration tests", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [60], "blockedBy": [58], "commit": "5cfbe8b", "deviation": "Happy-path + idempotency landed. Failover scenarios (kill-mid-apply, split-brain, restart-during-deploy) deferred as F22 — they need node-down/restart primitives on the harness. Two production bugs fixed in this commit: (1) coordinator missing DPS subscription for ACKs, (2) NodeId collision on shared loopback host."},
|
||||||
{"id": 60, "subject": "Task 60: OPC UA dual-endpoint + ServiceLevel tests", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [59], "blockedBy": [58]},
|
{"id": 60, "subject": "Task 60: OPC UA dual-endpoint + ServiceLevel tests", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [59], "blockedBy": [58]},
|
||||||
{"id": 61, "subject": "Task 61: E2E test infrastructure + GitHub Actions CI", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [], "blockedBy": [59,60]},
|
{"id": 61, "subject": "Task 61: E2E test infrastructure + GitHub Actions CI", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [], "blockedBy": [59,60]},
|
||||||
{"id": 62, "subject": "Task 62: Rewrite Install-Services.ps1", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [63,64,65], "blockedBy": [53]},
|
{"id": 62, "subject": "Task 62: Rewrite Install-Services.ps1", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [63,64,65], "blockedBy": [53], "commit": "e40615d"},
|
||||||
{"id": 63, "subject": "Task 63: Traefik config + docker-dev compose", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [62,64,65], "blockedBy": [53]},
|
{"id": 63, "subject": "Task 63: Traefik config + docker-dev compose", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [62,64,65], "blockedBy": [53], "commit": "7e3b56c", "deviation": "Untested on macOS (no local Docker). Compose file should work — exercise + adjust on first run against a real Docker host."},
|
||||||
{"id": 64, "subject": "Task 64: Update existing docs (Redundancy, ServiceHosting, security)", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [62,63,65], "blockedBy": [57]},
|
{"id": 64, "subject": "Task 64: Update existing docs (Redundancy, ServiceHosting, security)", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [62,63,65], "blockedBy": [57], "commit": "3c3fef9", "deviation": "Redundancy.md + ServiceHosting.md full rewrites. security.md v2 banner only — full per-section rewrite waits for F15 (Admin pages migration) since security.md references many pages that will move. README.md platform-overview updated."},
|
||||||
{"id": 65, "subject": "Task 65: New v2 docs (Architecture-v2, Cluster, ControlPlane, Runtime)", "status": "pending", "classification": "standard", "estMinutes": 5, "parallelizableWith": [62,63,64], "blockedBy": [57]},
|
{"id": 65, "subject": "Task 65: New v2 docs (Architecture-v2, Cluster, ControlPlane, Runtime)", "status": "completed", "classification": "standard", "estMinutes": 5, "parallelizableWith": [62,63,64], "blockedBy": [57], "commit": "1689901"},
|
||||||
{"id": "F1", "subject": "Follow-up: AuthEndpoints integration tests against fused Host", "status": "completed", "classification": "small", "estMinutes": 10, "parallelizableWith": ["F2"], "blockedBy": [53], "commit": "463512d", "origin": "Deviation from Task 29 (commit 38ea0c5) — deferred until Task 53 wires AddOtOpcUaAuth/MapOtOpcUaAuth in Program. Add WebApplicationFactory<OtOpcUa.Host.Program> tests for /auth/login (204/401/503), /auth/ping (401/200), /auth/token (200+JWT), /auth/logout (204+cookie clear) using a stub ILdapAuthService.", "deviation": "Used HostBuilder + TestServer directly (Security.Tests/AuthEndpointsIntegrationTests) instead of WebApplicationFactory<Program> — Host needs Akka cluster bootstrap that's out of scope for this contract test. Cluster-mode auth coverage belongs in Task 58."},
|
{"id": "F1", "subject": "Follow-up: AuthEndpoints integration tests against fused Host", "status": "completed", "classification": "small", "estMinutes": 10, "parallelizableWith": ["F2"], "blockedBy": [53], "commit": "463512d", "origin": "Deviation from Task 29 (commit 38ea0c5) — deferred until Task 53 wires AddOtOpcUaAuth/MapOtOpcUaAuth in Program. Add WebApplicationFactory<OtOpcUa.Host.Program> tests for /auth/login (204/401/503), /auth/ping (401/200), /auth/token (200+JWT), /auth/logout (204+cookie clear) using a stub ILdapAuthService.", "deviation": "Used HostBuilder + TestServer directly (Security.Tests/AuthEndpointsIntegrationTests) instead of WebApplicationFactory<Program> — Host needs Akka cluster bootstrap that's out of scope for this contract test. Cluster-mode auth coverage belongs in Task 58."},
|
||||||
{"id": "F2", "subject": "Follow-up: Replace JwtBearer BuildServiceProvider antipattern with IPostConfigureOptions", "status": "completed", "classification": "small", "estMinutes": 5, "parallelizableWith": ["F1"], "blockedBy": [], "commit": "45a8c79", "origin": "Deviation from Task 26 (commit 207fc6a) — AddOtOpcUaAuth uses services.BuildServiceProvider().CreateScope() inside .AddJwtBearer lambda (ASP0000). Refactor to IPostConfigureOptions<JwtBearerOptions> so validation parameters resolve lazily from the real request provider."},
|
{"id": "F2", "subject": "Follow-up: Replace JwtBearer BuildServiceProvider antipattern with IPostConfigureOptions", "status": "completed", "classification": "small", "estMinutes": 5, "parallelizableWith": ["F1"], "blockedBy": [], "commit": "45a8c79", "origin": "Deviation from Task 26 (commit 207fc6a) — AddOtOpcUaAuth uses services.BuildServiceProvider().CreateScope() inside .AddJwtBearer lambda (ASP0000). Refactor to IPostConfigureOptions<JwtBearerOptions> so validation parameters resolve lazily from the real request provider."},
|
||||||
{"id": "F3", "subject": "Follow-up: Add EventId unique column to ConfigAuditLog for cross-restart audit idempotency", "status": "pending", "classification": "small", "estMinutes": 15, "parallelizableWith": ["F4"], "blockedBy": [], "origin": "Deviation from Task 33 — AuditWriterActor only dedups in-buffer; ConfigAuditLog lacks EventId column so a duplicate AuditEvent that arrives after a flush becomes a duplicate row. Add nullable EventId Guid + filtered unique index, migration, and refactor AuditWriterActor.WrapDetails away."},
|
{"id": "F3", "subject": "Follow-up: Add EventId unique column to ConfigAuditLog for cross-restart audit idempotency", "status": "pending", "classification": "small", "estMinutes": 15, "parallelizableWith": ["F4"], "blockedBy": [], "origin": "Deviation from Task 33 — AuditWriterActor only dedups in-buffer; ConfigAuditLog lacks EventId column so a duplicate AuditEvent that arrives after a flush becomes a duplicate row. Add nullable EventId Guid + filtered unique index, migration, and refactor AuditWriterActor.WrapDetails away."},
|
||||||
|
|||||||
@@ -1,5 +1,19 @@
|
|||||||
# Security
|
# Security
|
||||||
|
|
||||||
|
> **v2 status (2026-05-26).** The four security concerns below are unchanged in v2.
|
||||||
|
> Paths + project names moved: `OtOpcUa.Server/Security/` → `OtOpcUa.Security/`
|
||||||
|
> (`Ldap/`, `Jwt/`, `Endpoints/AuthEndpoints.cs`), `OtOpcUa.Admin` is gone (its
|
||||||
|
> auth + role-grant pages live in `OtOpcUa.AdminUI`), and Admin auth policies
|
||||||
|
> register in `OtOpcUa.Host/Program.cs` via `AddOtOpcUaAuth` rather than in a
|
||||||
|
> separate Admin process. The v2 `Security:Jwt` section adds JWT bearer auth
|
||||||
|
> alongside the existing cookie scheme (`AddJwtBearer` wired via
|
||||||
|
> `IPostConfigureOptions<JwtBearerOptions>` in `OtOpcUa.Security`). DataProtection
|
||||||
|
> keys persist to the shared `ConfigDb.DataProtectionKeys` table so cookies
|
||||||
|
> survive failover between admin-role nodes.
|
||||||
|
>
|
||||||
|
> See `docs/plans/2026-05-26-akka-hosting-alignment-design.md` §5 for the v2
|
||||||
|
> auth + DataProtection rationale.
|
||||||
|
|
||||||
OtOpcUa has four independent security concerns. This document covers all four:
|
OtOpcUa has four independent security concerns. This document covers all four:
|
||||||
|
|
||||||
1. **Transport security** — OPC UA secure channel (signing, encryption, X.509 trust).
|
1. **Transport security** — OPC UA secure channel (signing, encryption, X.509 trust).
|
||||||
|
|||||||
@@ -0,0 +1,127 @@
|
|||||||
|
# OtOpcUa v2 Architecture
|
||||||
|
|
||||||
|
Single-page tour of the v2 layout. For decision history + tradeoffs, see [`2026-05-26-akka-hosting-alignment-design.md`](../plans/2026-05-26-akka-hosting-alignment-design.md).
|
||||||
|
|
||||||
|
## Big picture
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────┐
|
||||||
|
│ OtOpcUa.Host │ (fused binary)
|
||||||
|
│ │
|
||||||
|
│ reads OTOPCUA_ROLES env, mounts: │
|
||||||
|
│ ┌─────────────────────────────────────┐ │
|
||||||
|
│ │ admin → Blazor + auth + control- │ │
|
||||||
|
│ │ plane singletons │ │
|
||||||
|
│ │ driver → OPC UA endpoint + │ │
|
||||||
|
│ │ per-node actors │ │
|
||||||
|
│ └─────────────────────────────────────┘ │
|
||||||
|
└─────────────────────────────────────────────┘
|
||||||
|
│
|
||||||
|
│ joins
|
||||||
|
▼
|
||||||
|
┌─────────────────────────────────────────────┐
|
||||||
|
│ Akka.NET cluster │
|
||||||
|
│ (split-brain resolver: keep-oldest, 15s) │
|
||||||
|
└─────────────────────────────────────────────┘
|
||||||
|
|
||||||
|
shared by every node: ┌─────────────────┐
|
||||||
|
│ ConfigDb (SQL) │ live-edit + Deployment artifacts + audit
|
||||||
|
└─────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
The v1 setup was two separate Windows services (`OtOpcUa.Server` + `OtOpcUa.Admin`) talking through the DB. v2 collapses them into one binary with role gating, and adds an Akka cluster so admin singletons can drive deploys and the redundancy story is automatic.
|
||||||
|
|
||||||
|
## Project layout
|
||||||
|
|
||||||
|
```
|
||||||
|
src/Core/ shared abstractions, no Server deps
|
||||||
|
ZB.MOM.WW.OtOpcUa.Commons types + Akka message contracts + interfaces
|
||||||
|
ZB.MOM.WW.OtOpcUa.Cluster HOCON, AkkaClusterOptions, IClusterRoleInfo
|
||||||
|
ZB.MOM.WW.OtOpcUa.Configuration EF Core DbContext + entities
|
||||||
|
|
||||||
|
src/Server/ server-side projects
|
||||||
|
ZB.MOM.WW.OtOpcUa.Security cookie+JWT auth, LDAP, JwtTokenService
|
||||||
|
ZB.MOM.WW.OtOpcUa.ControlPlane admin-role cluster singletons
|
||||||
|
ZB.MOM.WW.OtOpcUa.Runtime driver-role per-node actors
|
||||||
|
ZB.MOM.WW.OtOpcUa.OpcUaServer OPC UA endpoint facade + Phase7Composer
|
||||||
|
ZB.MOM.WW.OtOpcUa.AdminUI Blazor Razor class library
|
||||||
|
ZB.MOM.WW.OtOpcUa.Host fused binary (Program.cs)
|
||||||
|
```
|
||||||
|
|
||||||
|
| Project | Role | Doc |
|
||||||
|
|---|---|---|
|
||||||
|
| Cluster | Bootstrap + cluster topology view | [Cluster.md](Cluster.md) |
|
||||||
|
| ControlPlane | Admin singletons (deploy, audit, fleet, redundancy) | [ControlPlane.md](ControlPlane.md) |
|
||||||
|
| Runtime | Driver-role actor tree | [Runtime.md](Runtime.md) |
|
||||||
|
| Security | Cookie+JWT auth, LDAP, /auth/* endpoints | [../security.md](../security.md) |
|
||||||
|
| OpcUaServer | OPC UA endpoint host + composer | [../OpcUaServer.md](../OpcUaServer.md) |
|
||||||
|
| Host | Role-gated DI graph + Program.cs | [../ServiceHosting.md](../ServiceHosting.md) |
|
||||||
|
|
||||||
|
## Role gating
|
||||||
|
|
||||||
|
`Program.cs` reads `OTOPCUA_ROLES` once (per process) and decides what to wire:
|
||||||
|
|
||||||
|
```csharp
|
||||||
|
var roles = RoleParser.Parse(Environment.GetEnvironmentVariable("OTOPCUA_ROLES"));
|
||||||
|
var hasAdmin = roles.Contains("admin");
|
||||||
|
var hasDriver = roles.Contains("driver");
|
||||||
|
|
||||||
|
builder.Services.AddOtOpcUaConfigDb(builder.Configuration);
|
||||||
|
builder.Services.AddOtOpcUaCluster(builder.Configuration);
|
||||||
|
|
||||||
|
builder.Services.AddAkka("otopcua", (ab, sp) =>
|
||||||
|
{
|
||||||
|
ab.WithOtOpcUaClusterBootstrap(sp); // HOCON + remote + cluster options
|
||||||
|
if (hasAdmin) ab.WithOtOpcUaControlPlaneSingletons();
|
||||||
|
if (hasDriver) ab.WithOtOpcUaRuntimeActors();
|
||||||
|
});
|
||||||
|
|
||||||
|
if (hasAdmin)
|
||||||
|
{
|
||||||
|
builder.Services.AddOtOpcUaAuth(builder.Configuration);
|
||||||
|
builder.Services.AddAdminUI();
|
||||||
|
// SignalR, AdminOpsClient, etc.
|
||||||
|
}
|
||||||
|
|
||||||
|
builder.Services.AddOtOpcUaHealth();
|
||||||
|
```
|
||||||
|
|
||||||
|
There is a **single** ActorSystem. Cluster singletons + per-node actors share it via the `Akka.Hosting` registry. This was a v2 fix (the initial Phase 9 wiring ran two ActorSystems by mistake; see commit `d6fac2d`).
|
||||||
|
|
||||||
|
## Live-edit vs draft/publish
|
||||||
|
|
||||||
|
v1 had `ConfigGeneration(Draft|Published)` with every live-edit entity FK'd to a generation. Edits accumulated in a Draft until Publish promoted them.
|
||||||
|
|
||||||
|
v2 removes that entirely:
|
||||||
|
|
||||||
|
- No `ConfigGeneration` table, no `GenerationId` columns.
|
||||||
|
- Every live-edit entity has a `RowVersion` (`IsRowVersion()`) for last-write-wins.
|
||||||
|
- Audit goes to `ConfigEdit` (per-row delta) and `ConfigAuditLog` (event-level).
|
||||||
|
- Deploys snapshot the *current* DB state into an immutable `Deployment.ArtifactBlob` + its `RevisionHash`. That artifact is what driver nodes apply.
|
||||||
|
|
||||||
|
See [ControlPlane.md § Deploy flow](ControlPlane.md#deploy-flow) for the end-to-end dispatch + ACK + seal sequence.
|
||||||
|
|
||||||
|
## NodeId
|
||||||
|
|
||||||
|
Each cluster member has a `NodeId` derived as `{PublicHostname}:{Port}` of the Akka remote endpoint. `ClusterRoleInfo.LocalNode` + `ConfigPublishCoordinator.DiscoverDriverNodes()` use the same formula so they always agree. The port suffix makes loopback test deployments distinguishable (commit `5cfbe8b`); in production the hostname alone is already unique.
|
||||||
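As a minimal sketch of the `{PublicHostname}:{Port}` formula described above: `NodeIdSketch` and its `Format` method are hypothetical names used only for illustration, and the repository's actual `NodeId` type may be shaped differently.

```csharp
using System;

// Hypothetical helper; the repository's NodeId type and factory may look different.
public static class NodeIdSketch
{
    // Builds the "{PublicHostname}:{Port}" identity of the Akka remote endpoint.
    public static string Format(string publicHostname, int port)
    {
        if (string.IsNullOrWhiteSpace(publicHostname))
            throw new ArgumentException("PublicHostname is required.", nameof(publicHostname));

        return $"{publicHostname}:{port}";
    }
}
```

With this shape, two loopback test nodes on ports 4053 and 4054 get different identities even though they share a hostname.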
|
|
||||||
|
## Health endpoints
|
||||||
|
|
||||||
|
| Path | Returns 200 when… |
|
||||||
|
|---|---|
|
||||||
|
| `/healthz` | Process is alive (no checks). |
|
||||||
|
| `/health/ready` | DB reachable + this node is `Up` in the cluster. |
|
||||||
|
| `/health/active` | This node is the admin role-leader (used by Traefik/HA-LB to pin traffic). |
|
||||||
|
|
||||||
|
## What lives where (quick map)
|
||||||
|
|
||||||
|
| Concern | Project | Entry point |
|
||||||
|
|---|---|---|
|
||||||
|
| Read OTOPCUA_ROLES | `Cluster.RoleParser` | static `Parse(string?)` |
|
||||||
|
| Cluster lifecycle | `Cluster.WithOtOpcUaClusterBootstrap` | extension on `AkkaConfigurationBuilder` |
|
||||||
|
| Local node identity | `Cluster.IClusterRoleInfo.LocalNode` | DI singleton |
|
||||||
|
| Admin singletons | `ControlPlane.WithOtOpcUaControlPlaneSingletons` | extension on `AkkaConfigurationBuilder` |
|
||||||
|
| Driver actors | `Runtime.WithOtOpcUaRuntimeActors` | extension on `AkkaConfigurationBuilder` |
|
||||||
|
| Auth pipeline | `Security.AddOtOpcUaAuth` + `MapOtOpcUaAuth` | extensions on `IServiceCollection` / `IEndpointRouteBuilder` |
|
||||||
|
| OPC UA facade | `OpcUaServer.OpcUaApplicationHost` | runtime host, started by driver-role startup |
|
||||||
|
| Health endpoints | `Host.Health.AddOtOpcUaHealth` + `MapOtOpcUaHealth` | extensions on `IServiceCollection` / `IEndpointRouteBuilder` |
|
||||||
@@ -0,0 +1,102 @@
|
|||||||
|
# OtOpcUa.Cluster
|
||||||
|
|
||||||
|
Akka.NET cluster bootstrap + topology view. Used by every other server-side project to talk to the live cluster.
|
||||||
|
|
||||||
|
Path: `src/Core/ZB.MOM.WW.OtOpcUa.Cluster/`
|
||||||
|
|
||||||
|
## Public surface
|
||||||
|
|
||||||
|
| Type | Role |
|
||||||
|
|---|---|
|
||||||
|
| `AkkaClusterOptions` | DI-bound options from `appsettings.json::Cluster`. Hostname/Port/PublicHostname/SeedNodes/Roles. |
|
||||||
|
| `IClusterRoleInfo` (interface in Commons) | Live view of cluster membership + role-leader topology. Thread-safe + event-raising. |
|
||||||
|
| `ClusterRoleInfo` | Implementation. Subscribes to `ClusterEvent.IMemberEvent` + `RoleLeaderChanged` + `LeaderChanged`. |
|
||||||
|
| `HoconLoader.LoadBaseConfig()` | Reads the embedded `Resources/akka.conf`. |
|
||||||
|
| `RoleParser.Parse(string?)` | Parses `OTOPCUA_ROLES` env var into a deduped `string[]`. |
|
||||||
|
| `ServiceCollectionExtensions.AddOtOpcUaCluster(configuration)` | Binds options + registers `IClusterRoleInfo` singleton. **Does not** start an ActorSystem. |
|
||||||
|
| `WithOtOpcUaClusterBootstrap(serviceProvider)` | Extension on `AkkaConfigurationBuilder`. Loads embedded HOCON + applies `WithRemoting(...)` + `WithClustering(...)` from options. |
|
||||||
|
|
||||||
|
## Bootstrap flow
|
||||||
|
|
||||||
|
```csharp
|
||||||
|
// Program.cs
|
||||||
|
builder.Services.AddOtOpcUaCluster(builder.Configuration);
|
||||||
|
|
||||||
|
builder.Services.AddAkka("otopcua", (ab, sp) =>
|
||||||
|
{
|
||||||
|
ab.WithOtOpcUaClusterBootstrap(sp); // HOCON + remote + cluster
|
||||||
|
// …singletons + node actors layered on
|
||||||
|
});
|
||||||
|
```
|
||||||
|
|
||||||
|
Order matters: `AddOtOpcUaCluster` must come before `AddAkka` so the options binding has run by the time the `AddAkka` lambda fires. Inside the lambda, `WithOtOpcUaClusterBootstrap` resolves `IOptions<AkkaClusterOptions>` from `sp` and writes them into the Akka builder.
|
||||||
|
|
||||||
|
The single ActorSystem this produces is what every other v2 piece runs on. There is no second Akka instance; running two was a Phase 9 bug that commit `d6fac2d` consolidated away.
|
||||||
|
|
||||||
|
## Embedded HOCON
|
||||||
|
|
||||||
|
`src/Core/ZB.MOM.WW.OtOpcUa.Cluster/Resources/akka.conf` contains:
|
||||||
|
|
||||||
|
| Setting | Value | Why |
|
||||||
|
|---|---|---|
|
||||||
|
| `akka.actor.provider` | `cluster` | Required for `Cluster.Get(system)` to work. |
|
||||||
|
| `akka.cluster.split-brain-resolver.active-strategy` | `keep-oldest` | On partition, the side that does not contain the oldest member downs itself. |
|
||||||
|
| `akka.cluster.split-brain-resolver.stable-after` | `15s` | Time before SBR acts. |
|
||||||
|
| `akka.cluster.failure-detector.threshold` | `10.0` | Higher than default (8.0) for GC-pause tolerance. |
|
||||||
|
| `opcua-synchronized-dispatcher.type` | `PinnedDispatcher` | Dedicated thread for `OpcUaPublishActor` so SDK calls stay marshalled. |
|
||||||
|
|
||||||
|
The Cluster.Tests project verifies these key values stay correct (`HoconLoaderTests`).
|
||||||
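For orientation, the settings in the table above would read roughly as follows in HOCON. This is a sketch assembled only from the listed keys and values; the embedded `akka.conf` may contain further settings (and a different layout) that this document does not show.

```hocon
# Sketch built from the table above; not a copy of the embedded akka.conf.
akka {
  actor.provider = cluster
  cluster {
    split-brain-resolver {
      active-strategy = keep-oldest
      stable-after = 15s
    }
    failure-detector.threshold = 10.0
  }
}

# Dedicated-thread dispatcher used by OpcUaPublishActor.
opcua-synchronized-dispatcher {
  type = PinnedDispatcher
}
```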
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"Cluster": {
|
||||||
|
"Hostname": "0.0.0.0",
|
||||||
|
"Port": 4053,
|
||||||
|
"PublicHostname": "node-a.lan",
|
||||||
|
"SeedNodes": ["akka.tcp://otopcua@node-a.lan:4053"],
|
||||||
|
"Roles": ["admin", "driver"]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- `Hostname`: interface to bind. `0.0.0.0` listens on every interface.
|
||||||
|
- `Port`: TCP port for cluster gossip. Default 4053.
|
||||||
|
- `PublicHostname`: address advertised in cluster gossip. Must be reachable by every other node.
|
||||||
|
- `SeedNodes`: contact points that new nodes use to join. List one or two stable nodes; the first seed node in the list bootstraps the cluster by joining itself.
|
||||||
|
- `Roles`: free-form tags Akka gossip propagates. v2 uses `admin` + `driver`; per-role wiring in `Program.cs` reads `OTOPCUA_ROLES` env var, not this list — these two should stay in sync.
|
||||||
|
|
||||||
|
## IClusterRoleInfo
|
||||||
|
|
||||||
|
Any component in the host that needs the local node's identity or a view of the other cluster members can inject `IClusterRoleInfo`:
|
||||||
|
|
||||||
|
```csharp
|
||||||
|
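public sealed class MyService
|
||||||
|
{
|
||||||
|
private readonly IClusterRoleInfo _cluster;
|
||||||
|
public NodeId Self => _cluster.LocalNode;
|
||||||
|
public IReadOnlyList<NodeId> Drivers => _cluster.MembersWithRole("driver");
|
||||||
|
public NodeId? AdminLeader => _cluster.RoleLeader("admin");
|
||||||
|
|
|
||||||
|
// Explicit constructor (instead of a primary constructor) so the event subscription can run at construction time.
|
||||||
|
public MyService(IClusterRoleInfo cluster)
|
||||||
|
{
|
||||||
|
_cluster = cluster;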
|
||||||
|
cluster.RoleLeaderChanged += (_, e) =>
|
||||||
|
Console.WriteLine($"role={e.Role}: {e.PreviousLeader} → {e.NewLeader}");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
`LocalNode` is `{PublicHostname}:{Port}` (the port suffix lets loopback test deployments stay distinct; production hostnames are already unique). `ConfigPublishCoordinator` uses the same `{host}:{port}` formula so the expected-ack set and the driver self-identification agree (commit `5cfbe8b`).
|
||||||
|
|
||||||
|
## Lifecycle
|
||||||
|
|
||||||
|
Akka.Hosting owns the lifecycle: its `IHostedService` starts the ActorSystem at host start and runs `CoordinatedShutdown` with `ClusterLeavingReason` on host stop. The Cluster project does not register its own `IHostedService` (the v1 `AkkaHostedService` was deleted in commit `d6fac2d`).
|
||||||
|
|
||||||
|
## Tests
|
||||||
|
|
||||||
|
`tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests/` covers:
|
||||||
|
|
||||||
|
- `HoconLoaderTests` — embedded resource loads + key settings parse correctly.
|
||||||
|
- `RoleParserTests` — comma-split + dedup + trim semantics.
|
||||||
|
|
||||||
|
Cross-project integration is in `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/` (cluster formation, deploy round-trip).
|
||||||
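The comma-split, trim, and dedup semantics that `RoleParserTests` covers could look roughly like the sketch below. `RoleParserSketch` is a hypothetical stand-in rather than the repository's `RoleParser`, and the case-insensitive dedup is an assumption.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for Cluster.RoleParser; the real implementation may differ.
public static class RoleParserSketch
{
    // Splits a comma-separated role list, trims each entry, drops empty
    // entries, and keeps only the first occurrence of each role.
    public static string[] Parse(string? raw)
    {
        if (string.IsNullOrWhiteSpace(raw))
            return Array.Empty<string>();

        return raw
            .Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .Distinct(StringComparer.OrdinalIgnoreCase) // assumption: case-insensitive dedup
            .ToArray();
    }
}
```

Under these assumptions, `OTOPCUA_ROLES=" admin, driver ,admin,,"` would yield two roles, and an unset variable would yield an empty array.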
@@ -0,0 +1,99 @@
|
|||||||
|
# OtOpcUa.ControlPlane
|
||||||
|
|
||||||
|
Five admin-role cluster singletons that drive the v2 deploy, audit, fleet, and redundancy stories. Path: `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/`.
|
||||||
|
|
||||||
|
## Singletons
|
||||||
|
|
||||||
|
| Actor | File | Marker key | Role |
|
||||||
|
|---|---|---|---|
|
||||||
|
| `ConfigPublishCoordinator` | `Coordinators/ConfigPublishCoordinator.cs` | `ConfigPublishCoordinatorKey` | Dispatches `DispatchDeployment`, collects `ApplyAck`s, seals/fails/times-out. |
|
||||||
|
| `AdminOperationsActor` | `AdminOperations/AdminOperationsActor.cs` | `AdminOperationsActorKey` | Receives `StartDeployment` from the UI, snapshots ConfigDb via `ConfigComposer`, persists `Deployment` row + `ConfigEdit` marker, tells the coordinator to dispatch. |
|
||||||
|
| `AuditWriterActor` | `Audit/AuditWriterActor.cs` | `AuditWriterActorKey` | Batched `ConfigAuditLog` writer. Flushes every 500 events or 5 s. In-buffer dedup; cross-restart dedup tracked as F3. |
|
||||||
|
| `FleetStatusBroadcaster` | `Fleet/FleetStatusBroadcaster.cs` | `FleetStatusBroadcasterKey` | Aggregates per-node `FleetNodeStatus` heartbeats; publishes `FleetStatusChanged` on the `fleet-status` DPS topic (SignalR bridge tracked as F16). |
|
||||||
|
| `RedundancyStateActor` | `Redundancy/RedundancyStateActor.cs` | `RedundancyStateActorKey` | Cluster-event subscriber; debounces 250 ms; publishes `RedundancyStateChanged` on the `redundancy-state` DPS topic. |
|
||||||
|
|
||||||
|
All five register via `WithOtOpcUaControlPlaneSingletons()` (extension on `AkkaConfigurationBuilder`). Each uses `ClusterSingletonOptions { Role = "admin" }`, so the singleton runs on the oldest admin-role member and migrates to the next-oldest admin node on failover.
|
||||||
|
|
||||||
|
```csharp
|
||||||
|
// Program.cs (admin role only)
|
||||||
|
builder.Services.AddAkka("otopcua", (ab, sp) =>
|
||||||
|
{
|
||||||
|
ab.WithOtOpcUaClusterBootstrap(sp);
|
||||||
|
if (hasAdmin) ab.WithOtOpcUaControlPlaneSingletons();
|
||||||
|
if (hasDriver) ab.WithOtOpcUaRuntimeActors();
|
||||||
|
});
|
||||||
|
```
|
||||||
|
|
||||||
|
Resolve from anywhere via `IRequiredActor<T>` or the `ActorRegistry`:
|
||||||
|
|
||||||
|
```csharp
|
||||||
|
public sealed class AdminOperationsClient(ActorRegistry registry) : IAdminOperationsClient
|
||||||
|
{
|
||||||
|
private readonly IActorRef _proxy = registry.Get<AdminOperationsActorKey>();
|
||||||
|
// ...
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Deploy flow
|
||||||
|
|
||||||
|
```
|
||||||
|
UI → IAdminOperationsClient.StartDeploymentAsync(createdBy)
|
||||||
|
│ Ask the AdminOperationsActor singleton proxy
|
||||||
|
▼
|
||||||
|
AdminOperationsActor
|
||||||
|
│ ConfigComposer.SnapshotAndFlattenAsync(db) → ConfigArtifact(blob, revHash)
|
||||||
|
│ insert Deployment(Dispatching) + ConfigEdit marker
|
||||||
|
│ Tell coordinator → DispatchDeployment
|
||||||
|
▼
|
||||||
|
ConfigPublishCoordinator
|
||||||
|
│ DiscoverDriverNodes() → expected ACK set (host:port per member)
|
||||||
|
│ insert NodeDeploymentState(Applying) per driver
|
||||||
|
│ Publish DispatchDeployment on "deployments" topic
|
||||||
|
│ Start apply-deadline timer (2 min default)
|
||||||
|
▼ DistributedPubSub
|
||||||
|
DriverHostActor (on each driver node — subscribed to "deployments")
|
||||||
|
│ PreStart subscribed; current state Steady(rev)
|
||||||
|
│ if currentRev == msg.rev → immediate ApplyAck(Applied) (idempotent)
|
||||||
|
│ else Become(Applying) → write NodeDeploymentStatus → ApplyAck
|
||||||
|
▼ via "deployment-acks" topic
|
||||||
|
ConfigPublishCoordinator (subscribed to "deployment-acks" in PreStart)
|
||||||
|
│ PersistNodeAck + collect
|
||||||
|
│ all-Applied → Sealed
|
||||||
|
│ any-Failed → PartiallyFailed
|
||||||
|
│ deadline → TimedOut
|
||||||
|
```
|
||||||
|
|
||||||
|
The dedicated `deployment-acks` topic + coordinator subscription was added in commit `5cfbe8b`. Before that, ACKs were published back on `deployments` and the coordinator (not subscribed) silently dropped them — deployments hung at `AwaitingApplyAcks` forever in multi-node tests.
|
||||||
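The terminal-state rules at the bottom of the deploy-flow diagram (all-Applied → Sealed, any-Failed → PartiallyFailed, deadline → TimedOut) can be sketched as a pure function. All type and member names here are hypothetical, and the precedence (a single `Failed` ack decides the outcome immediately) is an assumption; the real coordinator is an actor that also persists state.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum AckStatus { Applied, Failed }
public enum DeployOutcome { Pending, Sealed, PartiallyFailed, TimedOut }

// Hypothetical pure helper; not the ConfigPublishCoordinator implementation.
public static class DeployOutcomeSketch
{
    public static DeployOutcome Evaluate(
        IReadOnlyCollection<string> expectedNodes,
        IReadOnlyDictionary<string, AckStatus> acks,
        bool deadlineElapsed)
    {
        // Assumption: one Failed ack is enough to mark the deployment PartiallyFailed.
        if (acks.Values.Any(status => status == AckStatus.Failed))
            return DeployOutcome.PartiallyFailed;

        // No failures so far: seal once every expected node has acknowledged.
        if (expectedNodes.All(node => acks.ContainsKey(node)))
            return DeployOutcome.Sealed;

        // Acks are still missing: time out only after the apply deadline has passed.
        return deadlineElapsed ? DeployOutcome.TimedOut : DeployOutcome.Pending;
    }
}
```

Keeping the decision separate from the actor's message handling would let it be tested without an ActorSystem, in the same spirit as the pure `ServiceLevelCalculator`.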
|
|
||||||
|
### Failover recovery
|
||||||
|
|
||||||
|
If the admin singleton fails over mid-deploy, the new instance's `PreStart` queries `NodeDeploymentState` for any `Dispatching`/`AwaitingApplyAcks` row, rebuilds `_expectedAcks` + `_acks` from persisted state, and resumes the deadline timer. See `Coordinators/ConfigPublishCoordinator.cs::PreStart`.
|
||||||
|
|
||||||
|
## ConfigComposer
|
||||||
|
|
||||||
|
Pure function `SnapshotAndFlattenAsync(db) → ConfigArtifact(byte[], string)`:
|
||||||
|
|
||||||
|
1. Reads every live-edit table.
2. Serialises to a stable byte[] (deterministic ordering).
3. Computes SHA-256 over the bytes → 64-hex `RevisionHash`.
|
||||||
|
|
||||||
|
Same DB state → same artifact + same hash. That's what makes the `NoChanges` outcome work (AdminOperations compares the proposed hash to the last sealed deployment's hash).
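
The determinism property can be illustrated with a short Python sketch. It is not the project's serialiser: `snapshot_and_flatten`, the dict-of-rows input shape, and the use of canonical JSON as the stable encoding are all assumptions chosen for the example. Only the SHA-256 step and the 64-hex result come from the description above.

```python
import hashlib
import json


def snapshot_and_flatten(tables):
    """Return (blob, revision_hash) for a dict of table name -> list of row dicts."""
    # Sort rows by their canonical JSON form so read order never affects the bytes.
    canonical = {
        name: sorted(rows, key=lambda row: json.dumps(row, sort_keys=True))
        for name, rows in tables.items()
    }
    # sort_keys orders table names and column names; compact separators fix whitespace.
    blob = json.dumps(canonical, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return blob, hashlib.sha256(blob).hexdigest()
```

Two snapshots that differ only in row or key order produce identical bytes and therefore the same hash, which is the property a `NoChanges` comparison relies on.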
|
||||||
|
|
||||||
|
## ServiceLevelCalculator
|
||||||
|
|
||||||
|
Pure function exposed at `Redundancy/ServiceLevelCalculator.Compute(NodeHealthInputs)`. Returns the OPC UA `ServiceLevel` byte per the truth table in [Redundancy.md](../Redundancy.md#servicelevel-tiers-part-5-65). No side effects; trivially unit-testable.
|
||||||
|
|
||||||
|
## DPS topics
|
||||||
|
|
||||||
|
| Topic | Publisher | Subscribers |
|---|---|---|
| `deployments` | ConfigPublishCoordinator | DriverHostActor (per-node) |
| `deployment-acks` | DriverHostActor | ConfigPublishCoordinator |
| `fleet-status` | FleetStatusBroadcaster | (SignalR bridge, F16) |
| `redundancy-state` | RedundancyStateActor | (per-node ServiceLevel calc, F10) |
|
||||||
|
|
||||||
|
## Tests
|
||||||
|
|
||||||
|
`tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/` — 29 tests covering coordinator (happy path, timeout, failover recovery), AdminOps (StartDeployment outcomes), AuditWriter (batching, dedup), FleetStatusBroadcaster (heartbeat staleness), RedundancyStateActor (debounce, snapshot), ConfigComposer (purity), ServiceLevelCalculator (truth table).
|
||||||
|
|
||||||
|
Multi-node tests (cross-ActorSystem) are in `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/`.
|
||||||
@@ -0,0 +1,126 @@
|
|||||||
|
# OtOpcUa.Runtime
|
||||||
|
|
||||||
|
Driver-role actor tree — one set per node. Path: `src/Server/ZB.MOM.WW.OtOpcUa.Runtime/`.
|
||||||
|
|
||||||
|
## Actor tree
|
||||||
|
|
||||||
|
```
DriverHostActor (per node)
│   state machine: Steady ⇄ Applying ⇄ Stale
│
├──▶ DriverInstanceActor (per configured DriverInstance row)
│      state: Connecting → Connected → Reconnecting (or Stubbed)
│
├──▶ VirtualTagActor (per VirtualTag row)
│      compiles + evaluates expression, publishes derived value
│
├──▶ ScriptedAlarmActor (per ScriptedAlarm row)
│      state: Inactive ⇄ Active ⇄ Acknowledged
│
├──▶ OpcUaPublishActor (per node, pinned dispatcher)
│      marshalled OPC UA SDK writes + RebuildAddressSpace
│
├──▶ HistorianAdapterActor (per node)
│      pipe IPC to Wonderware historian sidecar
│
├──▶ PeerOpcUaProbeActor (per peer node)
│      opc.tcp ping → redundancy-state DPS topic
│
└──▶ DbHealthProbeActor (per node)
       cached SELECT 1; consumed by /health/ready + redundancy calc
```
|
||||||
|
|
||||||
|
## Public surface
|
||||||
|
|
||||||
|
| Type | File |
|---|---|
| `WithOtOpcUaRuntimeActors()` | `ServiceCollectionExtensions.cs`: extension on `AkkaConfigurationBuilder`. Spawns `DriverHostActor` + `DbHealthProbeActor` on the host's ActorSystem. |
| `DriverHostActor` | `Drivers/DriverHostActor.cs` |
| `DriverInstanceActor` | `Drivers/DriverInstanceActor.cs` |
| `VirtualTagActor` | `VirtualTags/VirtualTagActor.cs` |
| `ScriptedAlarmActor` | `ScriptedAlarms/ScriptedAlarmActor.cs` |
| `OpcUaPublishActor` | `OpcUa/OpcUaPublishActor.cs` |
| `HistorianAdapterActor` | `Historian/HistorianAdapterActor.cs` |
| `PeerOpcUaProbeActor` | `Health/PeerOpcUaProbeActor.cs` |
| `DbHealthProbeActor` | `Health/DbHealthProbeActor.cs` |
|
||||||
|
|
||||||
|
Marker keys for registry lookup: `DriverHostActorKey`, `DbHealthProbeActorKey`.
|
||||||
|
|
||||||
|
## DriverHostActor
|
||||||
|
|
||||||
|
Per-node supervisor with three Become states:
|
||||||
|
|
||||||
|
| State | Meaning |
|---|---|
| `Steady(rev)` | Caught up. `DispatchDeployment` with `msg.rev == currentRev` → immediate `ApplyAck(Applied)` (idempotent). New rev → `Become(Applying)`. |
| `Applying(id)` | Apply in progress. Further `DispatchDeployment` for in-flight ID → debug-log + ignore. For new ID → defer via `Self.Forward`. |
| `Stale` | ConfigDb unreachable on bootstrap. Periodic `RetryConfigDbConnection` tries to advance to `Steady`. |
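
The `DispatchDeployment` handling in the table above can be read as a lookup on the current state. The following Python sketch is illustrative only: `handle_dispatch`, the action names it returns, and the `"unhandled"` result for `Stale` are assumptions, since the table does not say how a dispatch is treated while the node is `Stale`.

```python
def handle_dispatch(state, current_rev, in_flight_id, deployment_id, rev):
    """Map an incoming DispatchDeployment to an action name (hypothetical labels)."""
    if state == "Steady":
        # Same revision: acknowledge immediately, which makes redelivery harmless.
        return "ack_applied" if rev == current_rev else "become_applying"
    if state == "Applying":
        # Duplicate of the in-flight deployment is dropped; a different one is deferred.
        return "ignore" if deployment_id == in_flight_id else "defer"
    # Assumption: behaviour in Stale is not described, so the sketch takes no action.
    return "unhandled"
```

The immediate ACK in `Steady` is what makes redelivery safe: a coordinator that re-publishes after a failover gets the same answer without the node re-applying anything.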
|
||||||
|
|
||||||
|
`PreStart`:
|
||||||
|
|
||||||
|
1. Subscribe to `deployments` DPS topic.
2. Read most-recent `NodeDeploymentState` for this node from ConfigDb.
3. If `Applied` → restore `_currentRevision`, `Become(Steady)`.
4. If `Applying` (orphan from crash) → replay apply (idempotent).
5. If `Failed` → `Become(Steady)` at last known rev.
6. DB unreachable → `Become(Stale)`, start retry timer.
|
||||||
|
|
||||||
|
ACK publishing: when no `_coordinatorOverride` is set (production), `SendAck` publishes on the dedicated `deployment-acks` DPS topic which the coordinator subscribes to (commit `5cfbe8b`).
|
||||||
|
|
||||||
|
## DriverInstanceActor
|
||||||
|
|
||||||
|
Per-driver-instance child. State machine:
|
||||||
|
|
||||||
|
- `Connecting` → first attempt to reach the underlying driver
- `Connected` → subscriptions active, reads/writes flow
- `Reconnecting` → temporary disconnect; backoff retry
- `Stubbed` → DEV-STUB mode for Windows-only drivers (Galaxy, Wonderware Historian) on non-Windows or when `roles` contains `dev`
|
||||||
|
|
||||||
|
`ShouldStub(driverType, roles)` returns `true` for `"Galaxy" | "Historian.Wonderware"` on non-Windows; the actor goes straight to `Stubbed` and returns deterministic success without touching real hardware. Wiring this into the DriverHost child-spawn path is follow-up F20 (folds into F7).
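
As a rough illustration of the stub decision, here is a Python sketch rather than the actual C# `ShouldStub`. The explicit `is_windows` parameter is hypothetical (added so the check is testable without inspecting the host OS), and the `dev` role branch follows the `Stubbed` state description above rather than a confirmed implementation detail.

```python
WINDOWS_ONLY_DRIVERS = {"Galaxy", "Historian.Wonderware"}


def should_stub(driver_type, roles, is_windows):
    """Return True when a Windows-only driver should run in DEV-STUB mode."""
    if driver_type not in WINDOWS_ONLY_DRIVERS:
        return False  # Cross-platform drivers always use their real implementation.
    # Assumption: stubbing applies off Windows, or when the node carries the dev role.
    return (not is_windows) or ("dev" in roles)
```

Passing the platform in as an argument keeps the predicate a plain function of its inputs, so a unit test can cover both platforms from one machine.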
|
||||||
|
|
||||||
|
Engine wiring (subscription publishing, ApplyDelta diff, bad-quality-on-disconnect, write path, supervisor backoff) is stubbed — tracked as F7. Tests exercise message contracts, not engine behaviour.
|
||||||
|
|
||||||
|
## VirtualTagActor / ScriptedAlarmActor
|
||||||
|
|
||||||
|
Skeleton state machines + message handlers. Engine work:
|
||||||
|
|
||||||
|
- `VirtualTagEngine.Evaluate()` not yet called from `VirtualTagActor.DependencyValueChanged` (F8).
- `AlarmConditionService` not yet called from `ScriptedAlarmActor` (F9).
- `ScriptedAlarmState` DB persistence on `PreRestart` not wired (F9).
|
||||||
|
|
||||||
|
## OpcUaPublishActor
|
||||||
|
|
||||||
|
The only actor on the **pinned dispatcher** (`opcua-synchronized-dispatcher` from `akka.conf`). All OPC UA SDK address-space writes go through it so the SDK's threading model isn't violated.
|
||||||
|
|
||||||
|
Message contracts are defined; actual SDK calls are stubbed (counters only). Real address-space writes + `ServiceLevel` Variable updates + `RebuildAddressSpace` after a deploy land in F10 (gated on F13 — full `OpcUaApplicationHost` extraction).
|
||||||
|
|
||||||
|
## HistorianAdapterActor, PeerOpcUaProbeActor
|
||||||
|
|
||||||
|
Both have message contracts wired. Engine integration deferred:
|
||||||
|
|
||||||
|
- `HistorianAdapterActor`: named-pipe IPC to the Wonderware historian sidecar + `SqliteStoreAndForwardSink` (F11).
- `PeerOpcUaProbeActor`: real `opc.tcp://peer:4840` ping (F12). Current stub always returns `Ok=true`.
|
||||||
|
|
||||||
|
## DbHealthProbeActor
|
||||||
|
|
||||||
|
`Ask<DbHealthStatus>` returns cached state (refreshed every 5 s by an internal `SELECT 1`). Consumed by `/health/ready` and `RedundancyStateActor`.
|
||||||
|
|
||||||
|
## Lifecycle wiring
|
||||||
|
|
||||||
|
```csharp
// Program.cs (role-gated: each node mounts only the actors for its roles)
builder.Services.AddAkka("otopcua", (ab, sp) =>
{
    ab.WithOtOpcUaClusterBootstrap(sp);
    if (hasAdmin) ab.WithOtOpcUaControlPlaneSingletons();
    if (hasDriver) ab.WithOtOpcUaRuntimeActors();
});
```
|
||||||
|
|
||||||
|
`WithOtOpcUaRuntimeActors` resolves `IDbContextFactory<OtOpcUaConfigDbContext>` + `IClusterRoleInfo` from DI, then spawns `DbHealthProbeActor` and `DriverHostActor` as top-level `/user/` actors. Both register marker keys in `ActorRegistry` so the registry lookup works from anywhere.
|
||||||
|
|
||||||
|
## Tests
|
||||||
|
|
||||||
|
`tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests/` — 16 tests covering DriverHostActor (Steady ack, Applying transitions, Stale recovery), DriverInstanceActor (state machine, stub mode), VirtualTagActor + ScriptedAlarmActor (message contracts), OpcUaPublishActor (props + message acceptance), DbHealthProbe + PeerOpcUaProbe (probe loop), and the `WithOtOpcUaRuntimeActors` registration round-trip.
|
||||||
|
|
||||||
|
End-to-end deploy from admin → driver via the cluster is in `tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests/DeployHappyPathTests.cs`.
|
||||||
@@ -1,46 +1,63 @@
|
|||||||
<#
|
<#
|
||||||
.SYNOPSIS
|
.SYNOPSIS
|
||||||
Registers the v2 Windows services on a node: OtOpcUa (main server, net10) and
|
Registers the v2 Windows service on a node: OtOpcUaHost (fused binary, .NET 10)
|
||||||
optionally OtOpcUaWonderwareHistorian (Wonderware historian sidecar).
|
and optionally OtOpcUaWonderwareHistorian (Wonderware historian sidecar, net48 x86).
|
||||||
|
|
||||||
.DESCRIPTION
|
.DESCRIPTION
|
||||||
PR 7.2 retired the legacy out-of-process OtOpcUaGalaxyHost service alongside the
|
v2 consolidates the legacy OtOpcUa + OtOpcUaAdmin services into a single role-gated
|
||||||
GalaxyProxyDriver / GalaxyHost / GalaxyShared projects. Galaxy access now flows
|
OtOpcUaHost binary. The -Roles parameter sets the OTOPCUA_ROLES service env so
|
||||||
through the in-process GalaxyDriver talking gRPC to a separately-installed
|
Program.cs decides what to mount (admin / driver / both). The Wonderware historian
|
||||||
mxaccessgw. The mxaccessgw server runs out of its own repo
|
sidecar logic is unchanged from v1; install it with -InstallWonderwareHistorian.
|
||||||
(`c:\Users\dohertj2\Desktop\mxaccessgw\`) — see
|
|
||||||
`docs/v2/Galaxy.ParityRig.md` for the gw setup recipe.
|
Galaxy access flows through the mxaccessgw sibling repo (separate service); see
|
||||||
|
docs/v2/Galaxy.ParityRig.md for the gateway setup.
|
||||||
|
|
||||||
.PARAMETER InstallRoot
|
.PARAMETER InstallRoot
|
||||||
Where the binaries live (typically C:\Program Files\OtOpcUa).
|
Where the binaries live (typically C:\Program Files\OtOpcUa). The OtOpcUaHost
|
||||||
|
service runs OtOpcUa.Host.exe from this directory; publish the Host project there
|
||||||
|
with `dotnet publish -c Release -r win-x64 --self-contained` first.
|
||||||
|
|
||||||
.PARAMETER ServiceAccount
|
.PARAMETER ServiceAccount
|
||||||
Service account SID or DOMAIN\name. The OtOpcUa service runs under this account.
|
Service account SID or DOMAIN\name. The OtOpcUaHost service runs under this account.
|
||||||
|
|
||||||
|
.PARAMETER Roles
|
||||||
|
Comma-separated cluster roles for this node. One of:
|
||||||
|
- "admin,driver" — single-node dev or all-in-one production node
|
||||||
|
- "admin" — admin-only HA pair member (Blazor + control-plane singletons)
|
||||||
|
- "driver" — driver-only node (OPC UA endpoint + per-node actors)
|
||||||
|
Written to the service env as OTOPCUA_ROLES.
|
||||||
|
|
||||||
|
.PARAMETER HttpPort
|
||||||
|
HTTP port for the AdminUI + auth endpoints. Default 9000. Written as ASPNETCORE_URLS.
|
||||||
|
Ignored on driver-only nodes (no Blazor surface).
|
||||||
|
|
||||||
.PARAMETER InstallWonderwareHistorian
|
.PARAMETER InstallWonderwareHistorian
|
||||||
Gate the OtOpcUaWonderwareHistorian sidecar install. Off by default; set when
|
Gate the OtOpcUaWonderwareHistorian sidecar install. Off by default; set when the
|
||||||
the deployment uses the Wonderware historian for history reads + alarm-event
|
deployment uses the Wonderware historian for history reads + alarm-event persistence.
|
||||||
persistence.
|
|
||||||
|
|
||||||
.PARAMETER HistorianSharedSecret
|
.PARAMETER HistorianSharedSecret
|
||||||
Per-process secret passed to the Historian sidecar via env var. Generated
|
Per-process secret passed to the historian sidecar via env var. Generated freshly
|
||||||
freshly per install when not supplied.
|
per install when not supplied.
|
||||||
|
|
||||||
.EXAMPLE
|
.EXAMPLE
|
||||||
.\Install-Services.ps1 -InstallRoot 'C:\Program Files\OtOpcUa' -ServiceAccount 'OTOPCUA\svc-otopcua'
|
.\Install-Services.ps1 -InstallRoot 'C:\Program Files\OtOpcUa' `
|
||||||
|
-ServiceAccount 'OTOPCUA\svc-otopcua' -Roles 'admin,driver'
|
||||||
|
|
||||||
.EXAMPLE
|
.EXAMPLE
|
||||||
.\Install-Services.ps1 -InstallRoot 'C:\Program Files\OtOpcUa' -ServiceAccount 'OTOPCUA\svc-otopcua' `
|
.\Install-Services.ps1 -InstallRoot 'C:\Program Files\OtOpcUa' `
|
||||||
|
-ServiceAccount 'OTOPCUA\svc-otopcua' -Roles 'driver' `
|
||||||
-InstallWonderwareHistorian
|
-InstallWonderwareHistorian
|
||||||
#>
|
#>
|
||||||
[CmdletBinding()]
|
[CmdletBinding()]
|
||||||
param(
|
param(
|
||||||
[Parameter(Mandatory)] [string]$InstallRoot,
|
[Parameter(Mandatory)] [string]$InstallRoot,
|
||||||
[Parameter(Mandatory)] [string]$ServiceAccount,
|
[Parameter(Mandatory)] [string]$ServiceAccount,
|
||||||
|
[Parameter(Mandatory)] [ValidateSet('admin', 'driver', 'admin,driver', 'driver,admin')]
|
||||||
|
[string]$Roles,
|
||||||
|
[int]$HttpPort = 9000,
|
||||||
|
|
||||||
# PR 3.W — Wonderware historian sidecar. Optional; gates the
|
# Wonderware historian sidecar. Optional; gates the OtOpcUaWonderwareHistorian
|
||||||
# OtOpcUaWonderwareHistorian service. Secret + pipe defaults match the server's
|
# service. Secret + pipe defaults match the server's Historian:Wonderware appsettings.
|
||||||
# Historian:Wonderware appsettings block.
|
|
||||||
[switch]$InstallWonderwareHistorian,
|
[switch]$InstallWonderwareHistorian,
|
||||||
[string]$HistorianSharedSecret,
|
[string]$HistorianSharedSecret,
|
||||||
[string]$HistorianPipeName = 'OtOpcUaWonderwareHistorian',
|
[string]$HistorianPipeName = 'OtOpcUaWonderwareHistorian',
|
||||||
@@ -51,18 +68,19 @@ param(
|
|||||||
|
|
||||||
$ErrorActionPreference = 'Stop'
|
$ErrorActionPreference = 'Stop'
|
||||||
|
|
||||||
if (-not (Test-Path "$InstallRoot\OtOpcUa.Server.exe")) {
|
if (-not (Test-Path "$InstallRoot\OtOpcUa.Host.exe")) {
|
||||||
Write-Error "OtOpcUa.Server.exe not found at $InstallRoot — copy the publish output first"
|
Write-Error "OtOpcUa.Host.exe not found at $InstallRoot — copy the publish output first"
|
||||||
exit 1
|
exit 1
|
||||||
}
|
}
|
||||||
|
|
||||||
# Generate fresh shared secrets per install if not supplied.
|
|
||||||
function New-SharedSecret {
|
function New-SharedSecret {
|
||||||
$bytes = New-Object byte[] 32
|
$bytes = New-Object byte[] 32
|
||||||
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($bytes)
|
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($bytes)
|
||||||
return [Convert]::ToBase64String($bytes)
|
return [Convert]::ToBase64String($bytes)
|
||||||
}
|
}
|
||||||
if ($InstallWonderwareHistorian -and -not $HistorianSharedSecret) { $HistorianSharedSecret = New-SharedSecret }
|
if ($InstallWonderwareHistorian -and -not $HistorianSharedSecret) {
|
||||||
|
$HistorianSharedSecret = New-SharedSecret
|
||||||
|
}
|
||||||
|
|
||||||
if ($InstallWonderwareHistorian -and -not (Test-Path "$InstallRoot\WonderwareHistorian\OtOpcUa.Driver.Historian.Wonderware.exe")) {
|
if ($InstallWonderwareHistorian -and -not (Test-Path "$InstallRoot\WonderwareHistorian\OtOpcUa.Driver.Historian.Wonderware.exe")) {
|
||||||
Write-Error "OtOpcUa.Driver.Historian.Wonderware.exe not found at $InstallRoot\WonderwareHistorian — copy the publish output first"
|
Write-Error "OtOpcUa.Driver.Historian.Wonderware.exe not found at $InstallRoot\WonderwareHistorian — copy the publish output first"
|
||||||
@@ -76,10 +94,7 @@ $sid = if ($ServiceAccount.StartsWith('S-1-')) {
|
|||||||
(New-Object System.Security.Principal.NTAccount $ServiceAccount).Translate([System.Security.Principal.SecurityIdentifier]).Value
|
(New-Object System.Security.Principal.NTAccount $ServiceAccount).Translate([System.Security.Principal.SecurityIdentifier]).Value
|
||||||
}
|
}
|
||||||
|
|
||||||
# --- Install OtOpcUaWonderwareHistorian (PR 3.W) — separate sidecar that exposes the
|
# --- OtOpcUaWonderwareHistorian sidecar (optional, unchanged from v1) -------
|
||||||
# Wonderware Historian SDK via a named-pipe protocol consumed by the .NET 10 server.
|
|
||||||
# Optional: only installed when -InstallWonderwareHistorian is supplied. Depends on the
|
|
||||||
# hard AVEVA services that host the historian SDK runtime path.
|
|
||||||
$historianDepend = $null
|
$historianDepend = $null
|
||||||
if ($InstallWonderwareHistorian) {
|
if ($InstallWonderwareHistorian) {
|
||||||
$historianEnv = @(
|
$historianEnv = @(
|
||||||
@@ -87,14 +102,10 @@ if ($InstallWonderwareHistorian) {
|
|||||||
"OTOPCUA_ALLOWED_SID=$sid"
|
"OTOPCUA_ALLOWED_SID=$sid"
|
||||||
"OTOPCUA_HISTORIAN_SECRET=$HistorianSharedSecret"
|
"OTOPCUA_HISTORIAN_SECRET=$HistorianSharedSecret"
|
||||||
"OTOPCUA_HISTORIAN_ENABLED=true"
|
"OTOPCUA_HISTORIAN_ENABLED=true"
|
||||||
# Default-on when the historian sidecar is installed; flip to false for a
|
|
||||||
# read-only deployment that still loads aahClientManaged for reads but
|
|
||||||
# rejects WriteAlarmEvents frames.
|
|
||||||
"OTOPCUA_HISTORIAN_ALARM_WRITE_ENABLED=true"
|
"OTOPCUA_HISTORIAN_ALARM_WRITE_ENABLED=true"
|
||||||
"OTOPCUA_HISTORIAN_SERVER=$HistorianServer"
|
"OTOPCUA_HISTORIAN_SERVER=$HistorianServer"
|
||||||
"OTOPCUA_HISTORIAN_PORT=$HistorianPort"
|
"OTOPCUA_HISTORIAN_PORT=$HistorianPort"
|
||||||
) -join "`0"
|
)
|
||||||
$historianEnv += "`0`0"
|
|
||||||
|
|
||||||
Write-Host "Installing OtOpcUaWonderwareHistorian..."
|
Write-Host "Installing OtOpcUaWonderwareHistorian..."
|
||||||
& sc.exe create OtOpcUaWonderwareHistorian binPath= "`"$InstallRoot\WonderwareHistorian\OtOpcUa.Driver.Historian.Wonderware.exe`"" `
|
& sc.exe create OtOpcUaWonderwareHistorian binPath= "`"$InstallRoot\WonderwareHistorian\OtOpcUa.Driver.Historian.Wonderware.exe`"" `
|
||||||
@@ -105,36 +116,59 @@ if ($InstallWonderwareHistorian) {
|
|||||||
& sc.exe config OtOpcUaWonderwareHistorian start= delayed-auto | Out-Null
|
& sc.exe config OtOpcUaWonderwareHistorian start= delayed-auto | Out-Null
|
||||||
|
|
||||||
$svcKey = "HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaWonderwareHistorian"
|
$svcKey = "HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaWonderwareHistorian"
|
||||||
$envValue = $historianEnv.Split("`0") | Where-Object { $_ -ne '' }
|
Set-ItemProperty -Path $svcKey -Name 'Environment' -Type MultiString -Value $historianEnv
|
||||||
Set-ItemProperty -Path $svcKey -Name 'Environment' -Type MultiString -Value $envValue
|
|
||||||
|
& sc.exe failure OtOpcUaWonderwareHistorian reset= 86400 actions= restart/5000/restart/30000/restart/60000 | Out-Null
|
||||||
|
|
||||||
$historianDepend = 'OtOpcUaWonderwareHistorian'
|
$historianDepend = 'OtOpcUaWonderwareHistorian'
|
||||||
}
|
}
|
||||||
|
|
||||||
# --- Install OtOpcUa. Galaxy access flows through GalaxyDriver → mxaccessgw (gRPC),
|
# --- OtOpcUaHost (the fused v2 binary) --------------------------------------
|
||||||
# so OtOpcUa no longer depends on a sibling service for Galaxy connectivity. The
|
$normalisedRoles = ($Roles -split ',' | ForEach-Object { $_.Trim() } | Sort-Object -Unique) -join ','
|
||||||
# mxaccessgw is installed separately. When the Wonderware sidecar is installed,
|
|
||||||
# depend on it for startup ordering.
|
|
||||||
$otOpcUaDepends = @()
|
|
||||||
if ($historianDepend) { $otOpcUaDepends += $historianDepend }
|
|
||||||
|
|
||||||
Write-Host "Installing OtOpcUa..."
|
$hasAdmin = $normalisedRoles -split ',' -contains 'admin'
|
||||||
|
|
||||||
|
$hostEnv = @(
|
||||||
|
"OTOPCUA_ROLES=$normalisedRoles",
|
||||||
|
'DOTNET_ENVIRONMENT=Production'
|
||||||
|
)
|
||||||
|
if ($hasAdmin) {
|
||||||
|
$hostEnv += "ASPNETCORE_URLS=http://+:$HttpPort"
|
||||||
|
}
|
||||||
|
|
||||||
|
$hostDepends = @()
|
||||||
|
if ($historianDepend) { $hostDepends += $historianDepend }
|
||||||
|
|
||||||
|
Write-Host "Installing OtOpcUaHost (roles=$normalisedRoles)..."
|
||||||
$createArgs = @(
|
$createArgs = @(
|
||||||
'create', 'OtOpcUa',
|
'create', 'OtOpcUaHost',
|
||||||
'binPath=', "`"$InstallRoot\OtOpcUa.Server.exe`"",
|
'binPath=', "`"$InstallRoot\OtOpcUa.Host.exe`"",
|
||||||
'DisplayName=', 'OtOpcUa Server',
|
'DisplayName=', "OtOpcUa Host ($normalisedRoles)",
|
||||||
'start=', 'auto',
|
'start=', 'auto',
|
||||||
'obj=', $ServiceAccount
|
'obj=', $ServiceAccount
|
||||||
)
|
)
|
||||||
if ($otOpcUaDepends.Count -gt 0) {
|
if ($hostDepends.Count -gt 0) {
|
||||||
$createArgs += @('depend=', ($otOpcUaDepends -join '/'))
|
$createArgs += @('depend=', ($hostDepends -join '/'))
|
||||||
}
|
}
|
||||||
& sc.exe @createArgs | Out-Null
|
& sc.exe @createArgs | Out-Null
|
||||||
|
|
||||||
|
# Env block via registry MultiString (sc.exe doesn't take env directly).
|
||||||
|
$svcKey = "HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaHost"
|
||||||
|
Set-ItemProperty -Path $svcKey -Name 'Environment' -Type MultiString -Value $hostEnv
|
||||||
|
|
||||||
|
# Restart-on-failure: 5s, 30s, 60s; reset counter after a clean 24h run.
|
||||||
|
& sc.exe failure OtOpcUaHost reset= 86400 actions= restart/5000/restart/30000/restart/60000 | Out-Null
|
||||||
|
|
||||||
Write-Host ""
|
Write-Host ""
|
||||||
Write-Host "Installed. Start with:"
|
Write-Host "Installed OtOpcUaHost:"
|
||||||
|
Write-Host " Roles: $normalisedRoles"
|
||||||
|
if ($hasAdmin) { Write-Host " HTTP port: $HttpPort" }
|
||||||
|
Write-Host " Binary: $InstallRoot\OtOpcUa.Host.exe"
|
||||||
|
Write-Host " Account: $ServiceAccount"
|
||||||
|
Write-Host ""
|
||||||
|
Write-Host "Start with:"
|
||||||
if ($InstallWonderwareHistorian) { Write-Host " sc.exe start OtOpcUaWonderwareHistorian" }
|
if ($InstallWonderwareHistorian) { Write-Host " sc.exe start OtOpcUaWonderwareHistorian" }
|
||||||
Write-Host " sc.exe start OtOpcUa"
|
Write-Host " sc.exe start OtOpcUaHost"
|
||||||
if ($InstallWonderwareHistorian) {
|
if ($InstallWonderwareHistorian) {
|
||||||
Write-Host ""
|
Write-Host ""
|
||||||
Write-Host "Wonderware historian shared secret (configure into appsettings.json Historian:Wonderware:SharedSecret):"
|
Write-Host "Wonderware historian shared secret (configure into appsettings.json Historian:Wonderware:SharedSecret):"
|
||||||
@@ -142,5 +176,5 @@ if ($InstallWonderwareHistorian) {
|
|||||||
}
|
}
|
||||||
Write-Host ""
|
Write-Host ""
|
||||||
Write-Host "NOTE: Galaxy access flows through mxaccessgw — install + run that separately"
|
Write-Host "NOTE: Galaxy access flows through mxaccessgw — install + run that separately"
|
||||||
Write-Host " per docs/v2/Galaxy.ParityRig.md. OtOpcUa connects via the Galaxy.Gateway"
|
Write-Host " per docs/v2/Galaxy.ParityRig.md. OtOpcUaHost connects via the"
|
||||||
Write-Host " section of appsettings.json (default endpoint http://localhost:5120)."
|
Write-Host " Galaxy.Gateway section of appsettings.json (default http://localhost:5120)."
|
||||||
|
|||||||
@@ -0,0 +1,68 @@
|
|||||||
|
<#
|
||||||
|
.SYNOPSIS
|
||||||
|
Installs Traefik as a Windows service that routes admin HTTP traffic to whichever
|
||||||
|
OtOpcUa.Host node holds the admin role-leader (via /health/active).
|
||||||
|
|
||||||
|
.DESCRIPTION
|
||||||
|
Downloads the Traefik Windows binary into $InstallRoot, drops traefik.yml +
|
||||||
|
traefik-dynamic.yml from this directory next to it, and registers Traefik as a
|
||||||
|
Windows service via sc.exe with restart-on-failure.
|
||||||
|
|
||||||
|
Companion to Install-Services.ps1. Run on the box that fronts the admin HTTP
|
||||||
|
traffic (typically a separate node from OtOpcUaHost, or co-located on the
|
||||||
|
primary admin node).
|
||||||
|
|
||||||
|
.PARAMETER InstallRoot
|
||||||
|
Where the Traefik binary + config land. Default 'C:\Program Files\Traefik'.
|
||||||
|
|
||||||
|
.PARAMETER TraefikVersion
|
||||||
|
Traefik version to download. Default 'v3.1.6'.
|
||||||
|
|
||||||
|
.EXAMPLE
|
||||||
|
.\Install-Traefik.ps1 -InstallRoot 'C:\Program Files\Traefik'
|
||||||
|
#>
|
||||||
|
[CmdletBinding()]
|
||||||
|
param(
|
||||||
|
[string]$InstallRoot = 'C:\Program Files\Traefik',
|
||||||
|
[string]$TraefikVersion = 'v3.1.6'
|
||||||
|
)
|
||||||
|
|
||||||
|
$ErrorActionPreference = 'Stop'
|
||||||
|
|
||||||
|
if (-not (Test-Path $InstallRoot)) {
|
||||||
|
New-Item -ItemType Directory -Path $InstallRoot | Out-Null
|
||||||
|
}
|
||||||
|
|
||||||
|
$zip = Join-Path $env:TEMP "traefik-$TraefikVersion.zip"
|
||||||
|
$url = "https://github.com/traefik/traefik/releases/download/$TraefikVersion/traefik_${TraefikVersion}_windows_amd64.zip"
|
||||||
|
|
||||||
|
Write-Host "Downloading Traefik $TraefikVersion..."
|
||||||
|
Invoke-WebRequest -Uri $url -OutFile $zip
|
||||||
|
Expand-Archive -Path $zip -DestinationPath $InstallRoot -Force
|
||||||
|
Remove-Item $zip
|
||||||
|
|
||||||
|
$scriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
|
||||||
|
Copy-Item -Force (Join-Path $scriptDir 'traefik.yml') $InstallRoot
|
||||||
|
Copy-Item -Force (Join-Path $scriptDir 'traefik-dynamic.yml') (Join-Path $InstallRoot 'dynamic.yml')
|
||||||
|
|
||||||
|
# Traefik reads dynamic.yml from /etc/traefik on Linux; on Windows place it next to the
|
||||||
|
# binary and point the file provider at it. Edit traefik.yml's `filename:` if you want
|
||||||
|
# to change the location.
|
||||||
|
(Get-Content -Raw (Join-Path $InstallRoot 'traefik.yml')) `
|
||||||
|
-replace '/etc/traefik/dynamic.yml', (Join-Path $InstallRoot 'dynamic.yml').Replace('\', '/') `
|
||||||
|
| Set-Content (Join-Path $InstallRoot 'traefik.yml')
|
||||||
|
|
||||||
|
Write-Host "Installing Traefik Windows service..."
|
||||||
|
& sc.exe create OtOpcUaTraefik binPath= "`"$InstallRoot\traefik.exe`" --configFile=`"$InstallRoot\traefik.yml`"" `
|
||||||
|
DisplayName= 'OtOpcUa Traefik (admin HTTP front door)' `
|
||||||
|
start= auto | Out-Null
|
||||||
|
|
||||||
|
& sc.exe failure OtOpcUaTraefik reset= 86400 actions= restart/5000/restart/30000/restart/60000 | Out-Null
|
||||||
|
|
||||||
|
Write-Host ""
|
||||||
|
Write-Host "Installed OtOpcUaTraefik. Edit:"
|
||||||
|
Write-Host " $InstallRoot\dynamic.yml (router + service definitions)"
|
||||||
|
Write-Host "Start with:"
|
||||||
|
Write-Host " sc.exe start OtOpcUaTraefik"
|
||||||
|
Write-Host ""
|
||||||
|
Write-Host "Traefik dashboard: http://localhost:8080 (turn off api.insecure in production)"
|
||||||
@@ -43,11 +43,11 @@ function Test-NssmService([string]$Name) {
|
|||||||
# Step 1: Stop in reverse dependency order
|
# Step 1: Stop in reverse dependency order
|
||||||
# ------------------------------------------------------------------------
|
# ------------------------------------------------------------------------
|
||||||
|
|
||||||
Step "Stopping services (OtOpcUa → OtOpcUaWonderwareHistorian → MxAccessGw)"
|
Step "Stopping services (OtOpcUaHost > OtOpcUaWonderwareHistorian > MxAccessGw)"
|
||||||
|
|
||||||
foreach ($name in @('OtOpcUa', 'OtOpcUaWonderwareHistorian', 'MxAccessGw')) {
|
foreach ($name in @('OtOpcUaHost', 'OtOpcUaWonderwareHistorian', 'MxAccessGw')) {
|
||||||
if (Test-NssmService $name) {
|
if (Test-NssmService $name) {
|
||||||
Run { nssm stop $name } "stop $name"
|
Run { Stop-Service $name -Force -ErrorAction SilentlyContinue } "stop $name"
|
||||||
}
|
}
|
||||||
else {
|
else {
|
||||||
Write-Host " ($name not installed; skipping)" -ForegroundColor DarkGray
|
Write-Host " ($name not installed; skipping)" -ForegroundColor DarkGray
|
||||||
@@ -56,7 +56,7 @@ foreach ($name in @('OtOpcUa', 'OtOpcUaWonderwareHistorian', 'MxAccessGw')) {
|
|||||||
|
|
||||||
if (-not $WhatIf) {
|
if (-not $WhatIf) {
|
||||||
Start-Sleep -Seconds 3
|
Start-Sleep -Seconds 3
|
||||||
Get-Process MxGateway.Server, MxGateway.Worker, OtOpcUa.Server, OtOpcUa.Driver.Historian.Wonderware -ErrorAction SilentlyContinue |
|
Get-Process MxGateway.Server, MxGateway.Worker, OtOpcUa.Host, OtOpcUa.Driver.Historian.Wonderware -ErrorAction SilentlyContinue |
|
||||||
ForEach-Object {
|
ForEach-Object {
|
||||||
Write-Host " killing residual process $($_.ProcessName) (PID=$($_.Id))" -ForegroundColor DarkYellow
|
Write-Host " killing residual process $($_.ProcessName) (PID=$($_.Id))" -ForegroundColor DarkYellow
|
||||||
Stop-Process -Id $_.Id -Force -ErrorAction SilentlyContinue
|
Stop-Process -Id $_.Id -Force -ErrorAction SilentlyContinue
|
||||||
@@ -109,14 +109,14 @@ Run {
|
|||||||
# Step 4: Refresh OtOpcUa + Wonderware historian sidecar
|
# Step 4: Refresh OtOpcUa + Wonderware historian sidecar
|
||||||
# ------------------------------------------------------------------------
|
# ------------------------------------------------------------------------
|
||||||
|
|
||||||
Step "Publishing OtOpcUa server + Wonderware historian sidecar from $RepoRoot"
|
Step "Publishing OtOpcUa.Host + Wonderware historian sidecar from $RepoRoot"
|
||||||
|
|
||||||
Run {
|
Run {
|
||||||
& dotnet publish "$RepoRoot\src\Server\ZB.MOM.WW.OtOpcUa.Server" `
|
& dotnet publish "$RepoRoot\src\Server\ZB.MOM.WW.OtOpcUa.Host" `
|
||||||
-c Release -o (Join-Path $PublishRoot "lmxopcua") | Out-Null
|
-c Release -o (Join-Path $PublishRoot "lmxopcua") | Out-Null
|
||||||
& dotnet publish "$RepoRoot\src\Drivers\ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware" `
|
& dotnet publish "$RepoRoot\src\Drivers\ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware" `
|
||||||
-c Release -o (Join-Path $PublishRoot "lmxopcua\WonderwareHistorian") | Out-Null
|
-c Release -o (Join-Path $PublishRoot "lmxopcua\WonderwareHistorian") | Out-Null
|
||||||
} "dotnet publish (Server + sidecar)"
|
} "dotnet publish (Host + sidecar)"
|
||||||
|
|
||||||
# ------------------------------------------------------------------------
|
# ------------------------------------------------------------------------
|
||||||
# Step 5: Service env block — ensure OTOPCUA_HISTORIAN_ALARM_WRITE_ENABLED
|
# Step 5: Service env block — ensure OTOPCUA_HISTORIAN_ALARM_WRITE_ENABLED
|
||||||
@@ -143,16 +143,16 @@ if (Test-NssmService 'OtOpcUaWonderwareHistorian') {
|
|||||||
# Step 6: Start in forward dependency order
|
# Step 6: Start in forward dependency order
|
||||||
# ------------------------------------------------------------------------
|
# ------------------------------------------------------------------------
|
||||||
|
|
||||||
Step "Starting services (MxAccessGw → OtOpcUaWonderwareHistorian → OtOpcUa)"
|
Step "Starting services (MxAccessGw > OtOpcUaWonderwareHistorian > OtOpcUaHost)"
|
||||||
|
|
||||||
foreach ($pair in @(
|
foreach ($pair in @(
|
||||||
@{ Name = 'MxAccessGw'; Wait = 4 },
|
@{ Name = 'MxAccessGw'; Wait = 4 },
|
||||||
@{ Name = 'OtOpcUaWonderwareHistorian'; Wait = 4 },
|
@{ Name = 'OtOpcUaWonderwareHistorian'; Wait = 4 },
|
||||||
@{ Name = 'OtOpcUa'; Wait = 8 }
|
@{ Name = 'OtOpcUaHost'; Wait = 8 }
|
||||||
)) {
|
)) {
|
||||||
$name = $pair.Name
|
$name = $pair.Name
|
||||||
if (Test-NssmService $name) {
|
if (Test-NssmService $name) {
|
||||||
Run { nssm start $name } "start $name"
|
Run { Start-Service $name } "start $name"
|
||||||
if (-not $WhatIf) { Start-Sleep -Seconds $pair.Wait }
|
if (-not $WhatIf) { Start-Sleep -Seconds $pair.Wait }
|
||||||
}
|
}
|
||||||
else {
|
else {
|
||||||
@@ -167,7 +167,7 @@ foreach ($pair in @(
|
|||||||
Step "Smoke verification"
|
Step "Smoke verification"
|
||||||
|
|
||||||
if (-not $WhatIf) {
|
if (-not $WhatIf) {
|
||||||
foreach ($name in @('MxAccessGw', 'OtOpcUaWonderwareHistorian', 'OtOpcUa')) {
|
foreach ($name in @('MxAccessGw', 'OtOpcUaWonderwareHistorian', 'OtOpcUaHost')) {
|
||||||
if (Test-NssmService $name) {
|
if (Test-NssmService $name) {
|
||||||
$status = (Get-Service $name).Status
|
$status = (Get-Service $name).Status
|
||||||
$color = if ($status -eq 'Running') { 'Green' } else { 'Red' }
|
$color = if ($status -eq 'Running') { 'Green' } else { 'Red' }
|
||||||
|
|||||||
@@ -3,16 +3,17 @@
|
|||||||
Stops + removes the v2 services. Mirrors Install-Services.ps1.
|
Stops + removes the v2 services. Mirrors Install-Services.ps1.
|
||||||
|
|
||||||
.DESCRIPTION
|
.DESCRIPTION
|
||||||
PR 7.2 retired the legacy OtOpcUaGalaxyHost service. Galaxy access now flows
|
Removes the v2 OtOpcUaHost service plus the optional OtOpcUaWonderwareHistorian
|
||||||
through the in-process GalaxyDriver against a separately-installed mxaccessgw.
|
sidecar. Also cleans up legacy service names from prior installs:
|
||||||
OtOpcUaGalaxyHost is included in the cleanup loop below so this script safely
|
- OtOpcUa (v1 server) — replaced by OtOpcUaHost in v2
|
||||||
removes it from any rig still carrying the legacy service from a pre-7.2
|
- OtOpcUaAdmin (v1 admin) — fused into OtOpcUaHost in v2
|
||||||
install.
|
- OtOpcUaGalaxyHost (pre-7.2 Galaxy host) — long-retired
|
||||||
#>
|
#>
|
||||||
[CmdletBinding()] param()
|
[CmdletBinding()] param()
|
||||||
$ErrorActionPreference = 'Continue'
|
$ErrorActionPreference = 'Continue'
|
||||||
|
|
||||||
foreach ($svc in 'OtOpcUa', 'OtOpcUaWonderwareHistorian', 'OtOpcUaGalaxyHost') {
|
foreach ($svc in 'OtOpcUaHost', 'OtOpcUaWonderwareHistorian',
|
||||||
|
'OtOpcUa', 'OtOpcUaAdmin', 'OtOpcUaGalaxyHost') {
|
||||||
if (Get-Service $svc -ErrorAction SilentlyContinue) {
|
if (Get-Service $svc -ErrorAction SilentlyContinue) {
|
||||||
Write-Host "Stopping $svc..."
|
Write-Host "Stopping $svc..."
|
||||||
Stop-Service $svc -Force -ErrorAction SilentlyContinue
|
Stop-Service $svc -Force -ErrorAction SilentlyContinue
|
||||||
|
|||||||
@@ -0,0 +1,24 @@
|
|||||||
|
# Dynamic (file-provider) Traefik config for the OtOpcUa admin HTTP routing.
|
||||||
|
# Picked up by traefik.yml's file provider (with watch: true) so router/service
|
||||||
|
# edits hot-reload without a Traefik restart.
|
||||||
|
|
||||||
|
http:
|
||||||
|
routers:
|
||||||
|
otopcua-admin:
|
||||||
|
entryPoints: ["web"]
|
||||||
|
rule: "HostRegexp(`otopcua.*`)"
|
||||||
|
service: otopcua-admin
|
||||||
|
|
||||||
|
services:
|
||||||
|
otopcua-admin:
|
||||||
|
loadBalancer:
|
||||||
|
servers:
|
||||||
|
- url: "http://admin-a:9000"
|
||||||
|
- url: "http://admin-b:9000"
|
||||||
|
healthCheck:
|
||||||
|
path: /health/active
|
||||||
|
interval: 5s
|
||||||
|
timeout: 2s
|
||||||
|
# Default expected status is 2xx or 3xx. Followers return 503 from
|
||||||
|
# /health/active so Traefik will drop them from the balancer
|
||||||
|
# within the next interval after a leadership change.
|
||||||
@@ -0,0 +1,30 @@
|
|||||||
|
# Traefik static configuration for the OtOpcUa fleet HTTP front door.
|
||||||
|
#
|
||||||
|
# Routes admin-role HTTP traffic (Blazor + auth + SignalR + /auth/*) to whichever
|
||||||
|
# OtOpcUa.Host node currently holds the admin role-leader. Uses the /health/active
|
||||||
|
# endpoint as the active-leader signal: a node returns 200 only when it is the
|
||||||
|
# Akka admin role-leader; followers return 503 and Traefik routes around them.
|
||||||
|
#
|
||||||
|
# OPC UA traffic is NOT routed through Traefik — clients connect directly to
|
||||||
|
# opc.tcp://node:4840 on every driver node and use the standard ServiceLevel
|
||||||
|
# heuristic for failover.
|
||||||
|
|
||||||
|
entryPoints:
|
||||||
|
web:
|
||||||
|
address: ":80"
|
||||||
|
|
||||||
|
providers:
|
||||||
|
file:
|
||||||
|
filename: /etc/traefik/dynamic.yml
|
||||||
|
watch: true
|
||||||
|
|
||||||
|
api:
|
||||||
|
insecure: true
|
||||||
|
dashboard: true
|
||||||
|
|
||||||
|
log:
|
||||||
|
level: INFO
|
||||||
|
format: common
|
||||||
|
|
||||||
|
accessLog:
|
||||||
|
format: common
|
||||||
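The leader-only routing described in the two Traefik files above can be sketched as a small simulation. This is an illustration, not part of the change set: `healthy_servers` and the probe results are hypothetical. It assumes Traefik's default behaviour of treating a 2xx or 3xx health-check response as healthy, with `/health/active` returning 200 on the admin role-leader and 503 on followers, as the config comments state.

```python
from typing import Dict, List


def healthy_servers(probe_results: Dict[str, int]) -> List[str]:
    """Return the servers whose last health probe returned 2xx or 3xx.

    Mirrors the default rule a health-checking load balancer applies:
    anything outside 200-399 removes the server from rotation.
    """
    return [url for url, status in probe_results.items() if 200 <= status < 400]


# Hypothetical probe results: admin-a holds the admin role-leader,
# admin-b is a follower and answers 503 on /health/active.
before_failover = {"http://admin-a:9000": 200, "http://admin-b:9000": 503}

# After a leadership change the roles swap, so the next probe
# interval moves all traffic to admin-b.
after_failover = {"http://admin-a:9000": 503, "http://admin-b:9000": 200}

print(healthy_servers(before_failover))
print(healthy_servers(after_failover))
```

With a 5s interval, up to one interval can pass after a leadership change before the balancer notices. During an election, when neither node reports 200, the list is empty and requests have no healthy backend to reach.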