diff --git a/docs/Client.CLI.md b/docs/Client.CLI.md index 32ed038..cdab19d 100644 --- a/docs/Client.CLI.md +++ b/docs/Client.CLI.md @@ -2,9 +2,9 @@ ## Overview -`ZB.MOM.WW.OtOpcUa.Client.CLI` is a cross-platform command-line client for the LmxOpcUa OPC UA server. It targets .NET 10 and uses the shared `IOpcUaClientService` from `Client.Shared` for all OPC UA operations. Commands are routed and parsed by [CliFx](https://github.com/Tyrrrz/CliFx). +`ZB.MOM.WW.OtOpcUa.Client.CLI` is a cross-platform command-line client for the OtOpcUa OPC UA server. It targets .NET 10 and uses the shared `IOpcUaClientService` from `Client.Shared` for all OPC UA operations. Commands are routed and parsed by [CliFx](https://github.com/Tyrrrz/CliFx). -The CLI is the primary tool for operators and developers to test and interact with the server from a terminal. It supports all core operations: connectivity testing, browsing, reading, writing, subscriptions, alarm monitoring, history reads, and redundancy queries. +The CLI is the primary tool for operators and developers to test and interact with the server from a terminal. It supports all core operations: connectivity testing, browsing, reading, writing, subscriptions, alarm monitoring, history reads, and redundancy queries. Any driver surface exposed by the server (Galaxy, Modbus, S7, AB CIP, AB Legacy, TwinCAT, FOCAS, OPC UA Client) is reachable through these commands — the CLI is driver-agnostic because everything below the OPC UA endpoint is. ## Build and Run @@ -14,7 +14,7 @@ dotnet build dotnet run -- [options] ``` -The executable name is `lmxopcua-cli`. +The executable name is still `lmxopcua-cli` — a residual from the pre-v2 rename (`Program.cs:SetExecutableName`). 
Scripts + operator muscle memory depend on the name; flipping it to `otopcua-cli` is a follow-up that also needs to move the client-side PKI store folder ({LocalAppData}/LmxOpcUaClient/pki/ — used by the shared client for its application certificate) so trust relationships survive the rename. ## Architecture @@ -54,7 +54,7 @@ lmxopcua-cli write -u opc.tcp://localhost:4840 -n "ns=2;s=MyNode" -v 42 -U opera When `-F` is provided, the shared service tries the primary URL first, then each failover URL in order. For long-running commands (`subscribe`, `alarms`), the service monitors the session via keep-alive and automatically reconnects to the next available server on failure. ```bash -lmxopcua-cli connect -u opc.tcp://localhost:4840/LmxOpcUa -F opc.tcp://localhost:4841/LmxOpcUa +lmxopcua-cli connect -u opc.tcp://localhost:4840/OtOpcUa -F opc.tcp://localhost:4841/OtOpcUa ``` ### Transport Security @@ -67,7 +67,7 @@ When `sign` or `encrypt` is specified, the shared service: 4. Fails with a clear error if no matching endpoint is found ```bash -lmxopcua-cli browse -u opc.tcp://localhost:4840/LmxOpcUa -S encrypt -U admin -P secret -r -d 2 +lmxopcua-cli browse -u opc.tcp://localhost:4840/OtOpcUa -S encrypt -U admin -P secret -r -d 2 ``` ### Verbose Logging @@ -81,14 +81,14 @@ The `--verbose` flag switches Serilog output from `Warning` to `Debug` level, sh Tests connectivity to an OPC UA server. Creates a session, prints connection metadata, and disconnects. ```bash -lmxopcua-cli connect -u opc.tcp://localhost:4840/LmxOpcUa -U admin -P admin123 +lmxopcua-cli connect -u opc.tcp://localhost:4840/OtOpcUa -U admin -P admin123 ``` Output: ```text -Connected to: opc.tcp://localhost:4840/LmxOpcUa -Server: LmxOpcUa +Connected to: opc.tcp://localhost:4840/OtOpcUa +Server: OtOpcUa Server Security Mode: None Security Policy: http://opcfoundation.org/UA/SecurityPolicy#None Connection successful. @@ -99,7 +99,7 @@ Connection successful. 
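The failover ordering described earlier (primary URL first, then each `-F` URL in order) can be sketched as below. This is a hedged illustration, not the actual `IOpcUaClientService` implementation; `try_connect` and `connect_with_failover` are invented names standing in for whatever session-creation call the shared service really makes.

```python
def connect_with_failover(primary, failovers, try_connect):
    """Attempt the primary endpoint first, then each failover URL in order.

    `try_connect` is a stand-in for the real session-creation call; it
    returns a session object or raises ConnectionError.
    """
    last_error = None
    for url in [primary, *failovers]:
        try:
            return url, try_connect(url)
        except ConnectionError as err:
            last_error = err  # remember why this endpoint failed, try the next
    raise ConnectionError(f"all endpoints failed: {last_error}")


# Example: the primary endpoint refuses, the first failover accepts.
def fake_connect(url):
    if url.endswith(":4840/OtOpcUa"):
        raise ConnectionError("refused")
    return "session"


url, session = connect_with_failover(
    "opc.tcp://localhost:4840/OtOpcUa",
    ["opc.tcp://localhost:4841/OtOpcUa"],
    fake_connect,
)
```

For the long-running commands, the same ordered list is walked again on keep-alive failure, which is what makes the automatic reconnect land on "the next available server".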
Reads the current value of a single node and prints the value, status code, and timestamps. ```bash -lmxopcua-cli read -u opc.tcp://localhost:4840/LmxOpcUa -n "ns=3;s=DEV.ScanState" -U admin -P admin123 +lmxopcua-cli read -u opc.tcp://localhost:4840/OtOpcUa -n "ns=3;s=DEV.ScanState" -U admin -P admin123 ``` | Flag | Description | @@ -135,10 +135,10 @@ Browses the OPC UA address space starting from the Objects folder or a specified ```bash # Browse top-level Objects folder -lmxopcua-cli browse -u opc.tcp://localhost:4840/LmxOpcUa -U admin -P admin123 +lmxopcua-cli browse -u opc.tcp://localhost:4840/OtOpcUa -U admin -P admin123 # Browse a specific node recursively to depth 3 -lmxopcua-cli browse -u opc.tcp://localhost:4840/LmxOpcUa -U admin -P admin123 -r -d 3 -n "ns=3;s=ZB" +lmxopcua-cli browse -u opc.tcp://localhost:4840/OtOpcUa -U admin -P admin123 -r -d 3 -n "ns=3;s=ZB" ``` | Flag | Description | @@ -166,12 +166,12 @@ Reads historical data from a node. Supports raw history reads and aggregate (pro ```bash # Raw history -lmxopcua-cli historyread -u opc.tcp://localhost:4840/LmxOpcUa \ +lmxopcua-cli historyread -u opc.tcp://localhost:4840/OtOpcUa \ -n "ns=1;s=TestMachine_001.TestHistoryValue" \ --start "2026-03-25" --end "2026-03-30" # Aggregate: 1-hour average -lmxopcua-cli historyread -u opc.tcp://localhost:4840/LmxOpcUa \ +lmxopcua-cli historyread -u opc.tcp://localhost:4840/OtOpcUa \ -n "ns=1;s=TestMachine_001.TestHistoryValue" \ --start "2026-03-25" --end "2026-03-30" \ --aggregate Average --interval 3600000 @@ -203,10 +203,10 @@ Subscribes to alarm events on a node. 
Prints structured alarm output including s ```bash # Subscribe to alarm events on the Server node -lmxopcua-cli alarms -u opc.tcp://localhost:4840/LmxOpcUa +lmxopcua-cli alarms -u opc.tcp://localhost:4840/OtOpcUa # Subscribe to a specific source node with condition refresh -lmxopcua-cli alarms -u opc.tcp://localhost:4840/LmxOpcUa \ +lmxopcua-cli alarms -u opc.tcp://localhost:4840/OtOpcUa \ -n "ns=1;s=TestMachine_001" --refresh ``` @@ -221,7 +221,7 @@ lmxopcua-cli alarms -u opc.tcp://localhost:4840/LmxOpcUa \ Reads the OPC UA redundancy state from a server: redundancy mode, service level, server URIs, and application URI. ```bash -lmxopcua-cli redundancy -u opc.tcp://localhost:4840/LmxOpcUa -U admin -P admin123 +lmxopcua-cli redundancy -u opc.tcp://localhost:4840/OtOpcUa -U admin -P admin123 ``` Example output: @@ -230,9 +230,9 @@ Example output: Redundancy Mode: Warm Service Level: 200 Server URIs: - - urn:localhost:LmxOpcUa:instance1 - - urn:localhost:LmxOpcUa:instance2 -Application URI: urn:localhost:LmxOpcUa:instance1 + - urn:localhost:OtOpcUa:instance1 + - urn:localhost:OtOpcUa:instance2 +Application URI: urn:localhost:OtOpcUa:instance1 ``` ## Testing diff --git a/docs/Client.UI.md b/docs/Client.UI.md index 3c9c861..6839055 100644 --- a/docs/Client.UI.md +++ b/docs/Client.UI.md @@ -2,7 +2,7 @@ ## Overview -`ZB.MOM.WW.OtOpcUa.Client.UI` is a cross-platform Avalonia desktop application for connecting to and interacting with the LmxOpcUa OPC UA server. It targets .NET 10 and uses the shared `IOpcUaClientService` from `Client.Shared` for all OPC UA operations. +`ZB.MOM.WW.OtOpcUa.Client.UI` is a cross-platform Avalonia desktop application for connecting to and interacting with the OtOpcUa OPC UA server. It targets .NET 10 and uses the shared `IOpcUaClientService` from `Client.Shared` for all OPC UA operations. 
The UI provides a single-window interface for browsing the address space, reading and writing values, monitoring live subscriptions, managing alarms, and querying historical data. @@ -43,7 +43,7 @@ The application uses a single-window layout with five main areas: │ │ │ ││ │ (lazy-load) │ └──────────────────────────────────────────────┘│ ├──────────────┴──────────────────────────────────────────────┤ -│ Connected to opc.tcp://... | LmxOpcUa | Session: ... | 3 subs│ +│ Connected to opc.tcp://... | OtOpcUa Server | Session: ... | 3 subs│ └─────────────────────────────────────────────────────────────┘ ``` @@ -55,7 +55,7 @@ The top bar provides the endpoint URL, Connect, and Disconnect buttons. The **Co | Setting | Description | |---------|-------------| -| Endpoint URL | OPC UA server endpoint (e.g., `opc.tcp://localhost:4840/LmxOpcUa`) | +| Endpoint URL | OPC UA server endpoint (e.g., `opc.tcp://localhost:4840/OtOpcUa`) | | Username / Password | Credentials for `UserName` token authentication | | Security Mode | Transport security: None, Sign, SignAndEncrypt | | Failover URLs | Comma-separated backup endpoints for redundancy failover | @@ -65,7 +65,7 @@ The top bar provides the endpoint URL, Connect, and Disconnect buttons. The **Co ### Settings Persistence -Connection settings are saved to `{LocalAppData}/LmxOpcUaClient/settings.json` after each successful connection and on window close. The settings are reloaded on next launch, including: +Connection settings are saved to `{LocalAppData}/LmxOpcUaClient/settings.json` after each successful connection and on window close. The folder name is a residual from the pre-v2 rename (the `Client.Shared` session factory still calls itself `LmxOpcUaClient` at `OpcUaClientService.cs:428`); renaming to `OtOpcUaClient` is a follow-up that needs a migration shim so existing users don't lose their settings on upgrade. 
The settings are reloaded on next launch, including:

- All connection parameters
- Active subscription node IDs (restored after reconnection)
diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 0000000..e06888c
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,83 @@
+# OtOpcUa documentation
+
+Two tiers of documentation live here:
+
+- **Current reference** at the top level (`docs/*.md`) — describes what's shipped today. Start here for operator + integrator reference.
+- **Implementation history + design notes** at `docs/v2/*.md` — the authoritative plan + decision log the current reference is built from. Start here when you need the *why* behind an architectural choice, or when a top-level doc says "see plan.md § X".
+
+The project was originally called **LmxOpcUa** (a single-driver Galaxy/MXAccess OPC UA server) and has since become **OtOpcUa**, a multi-driver OPC UA server platform. Any lingering `LmxOpcUa` string or path you see in the docs is a deliberate residual (executable name `lmxopcua-cli`, client PKI folder `{LocalAppData}/LmxOpcUaClient/`) — fixing those requires migration shims and is tracked as follow-up work.
+
+## Platform overview
+
+- **Core** owns the OPC UA stack, address space, session/security/subscription machinery.
+- **Drivers** plug in via capability interfaces in `ZB.MOM.WW.OtOpcUa.Core.Abstractions`: `IDriver`, `IReadable`, `IWritable`, `ITagDiscovery`, `ISubscribable`, `IHostConnectivityProbe`, `IAlarmSource`, `IHistoryProvider`, `IPerCallHostResolver`. Each driver opts into whichever it supports.
+- **Server** is the OPC UA endpoint process (net10, x64). Hosts every driver except Galaxy in-process; talks to Galaxy via a named pipe because MXAccess COM is 32-bit-only.
+- **Admin** is the Blazor Server operator UI (net10, x64). Owns the Config DB draft/publish flow, ACL + role-grant authoring, fleet status + `/metrics` scrape endpoint.
+- **Galaxy.Host** is a .NET Framework 4.8 x86 Windows service that wraps MXAccess COM on an STA thread for the Galaxy driver.
+
+## Where to find what
+
+### Architecture + data-path reference
+
+| Doc | Covers |
+|-----|--------|
+| [OpcUaServer.md](OpcUaServer.md) | Top-level server architecture — Core, driver dispatch, Config DB, generations |
+| [AddressSpace.md](AddressSpace.md) | `GenericDriverNodeManager` + `ITagDiscovery` + `IAddressSpaceBuilder` |
+| [ReadWriteOperations.md](ReadWriteOperations.md) | OPC UA Read/Write → `CapabilityInvoker` → `IReadable`/`IWritable` |
+| [Subscriptions.md](Subscriptions.md) | Monitored items → `ISubscribable` + per-driver subscription refcount |
+| [AlarmTracking.md](AlarmTracking.md) | `IAlarmSource` + `AlarmSurfaceInvoker` + OPC UA alarm conditions |
+| [DataTypeMapping.md](DataTypeMapping.md) | Per-driver `DriverAttributeInfo` → OPC UA variable types |
+| [IncrementalSync.md](IncrementalSync.md) | Address-space rebuild on redeploy + `sp_ComputeGenerationDiff` |
+| [HistoricalDataAccess.md](HistoricalDataAccess.md) | `IHistoryProvider` as a per-driver optional capability |
+
+### Drivers
+
+| Doc | Covers |
+|-----|--------|
+| [drivers/README.md](drivers/README.md) | Index of the eight shipped drivers + capability matrix |
+| [drivers/Galaxy.md](drivers/Galaxy.md) | Galaxy driver — MXAccess bridge, Host/Proxy split, named-pipe IPC |
+| [drivers/Galaxy-Repository.md](drivers/Galaxy-Repository.md) | Galaxy-specific discovery via the ZB SQL database |
+
+For Modbus / S7 / AB CIP / AB Legacy / TwinCAT / FOCAS / OPC UA Client specifics, see [v2/driver-specs.md](v2/driver-specs.md).
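The capability opt-in model described under "Platform overview" (each driver implements only the interfaces that apply to its protocol, and the core checks what a given instance supports) can be sketched with structural typing. The `ModbusLikeDriver` class and its method names are invented for illustration and mirror only the *shape* of the real C# capability interfaces:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Readable(Protocol):          # stands in for IReadable
    def read(self, tag: str): ...


@runtime_checkable
class Writable(Protocol):          # stands in for IWritable
    def write(self, tag: str, value): ...


class ModbusLikeDriver:
    """Hypothetical driver that opts into reads but not writes."""

    def read(self, tag: str):
        return 42  # a real driver would issue a protocol read here


drv = ModbusLikeDriver()
can_read = isinstance(drv, Readable)    # True: structurally implements Readable
can_write = isinstance(drv, Writable)   # False: no write() method
```

In the real platform the equivalent check is an interface cast at dispatch time; a capability the driver did not implement is simply not offered for that instance's nodes.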
+ +### Operational + +| Doc | Covers | +|-----|--------| +| [Configuration.md](Configuration.md) | appsettings bootstrap + Config DB + Admin UI draft/publish | +| [security.md](security.md) | Transport security profiles, LDAP auth, ACL trie, role grants, OTOPCUA0001 analyzer | +| [Redundancy.md](Redundancy.md) | `RedundancyCoordinator`, `ServiceLevelCalculator`, apply-lease, Prometheus metrics | +| [ServiceHosting.md](ServiceHosting.md) | Three-process deploy (Server + Admin + Galaxy.Host) install/uninstall | +| [StatusDashboard.md](StatusDashboard.md) | Pointer — superseded by [v2/admin-ui.md](v2/admin-ui.md) | + +### Client tooling + +| Doc | Covers | +|-----|--------| +| [Client.CLI.md](Client.CLI.md) | `lmxopcua-cli` — command-line client | +| [Client.UI.md](Client.UI.md) | Avalonia desktop client | + +### Requirements + +| Doc | Covers | +|-----|--------| +| [reqs/HighLevelReqs.md](reqs/HighLevelReqs.md) | HLRs — numbered system-level requirements | +| [reqs/OpcUaServerReqs.md](reqs/OpcUaServerReqs.md) | OPC UA server-layer reqs | +| [reqs/ServiceHostReqs.md](reqs/ServiceHostReqs.md) | Per-process hosting reqs | +| [reqs/ClientRequirements.md](reqs/ClientRequirements.md) | Client CLI + UI reqs | +| [reqs/GalaxyRepositoryReqs.md](reqs/GalaxyRepositoryReqs.md) | Galaxy-scoped repository reqs | +| [reqs/MxAccessClientReqs.md](reqs/MxAccessClientReqs.md) | Galaxy-scoped MXAccess reqs | +| [reqs/StatusDashboardReqs.md](reqs/StatusDashboardReqs.md) | Pointer — superseded by Admin UI | + +## Implementation history (`docs/v2/`) + +Design decisions + phase plans + execution notes. 
Load-bearing cross-references from the top-level docs:
+
+- [v2/plan.md](v2/plan.md) — authoritative v2 vision doc + numbered decision log (referenced as "decision #N" elsewhere)
+- [v2/admin-ui.md](v2/admin-ui.md) — Admin UI spec
+- [v2/acl-design.md](v2/acl-design.md) — data-plane ACL + permission-trie design (Phase 6.2)
+- [v2/config-db-schema.md](v2/config-db-schema.md) — Config DB schema reference
+- [v2/driver-specs.md](v2/driver-specs.md) — per-driver addressing + quirks for every shipped protocol
+- [v2/dev-environment.md](v2/dev-environment.md) — dev-box bootstrap
+- [v2/test-data-sources.md](v2/test-data-sources.md) — integration-test simulator matrix (includes the pinned libplctag `ab_server` version for AB CIP tests)
+- [v2/implementation/phase-*-*.md](v2/implementation/) — per-phase execution plans with exit-gate evidence
diff --git a/docs/reqs/HighLevelReqs.md b/docs/reqs/HighLevelReqs.md
index a977567..6771ed8 100644
--- a/docs/reqs/HighLevelReqs.md
+++ b/docs/reqs/HighLevelReqs.md
@@ -1,47 +1,94 @@
 # High-Level Requirements
+> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). The original 2025 text described a single-process Galaxy/MXAccess server called LmxOpcUa. Today the project is the **OtOpcUa** multi-driver OPC UA platform deployed as three cooperating processes (Server, Admin, Galaxy.Host). The Galaxy integration is one of eight shipped drivers. HLR-001 through HLR-008 have been rewritten driver-agnostically; the original HLR-009 has been retired (the embedded Status Dashboard is superseded by the Admin UI) and its number reassigned to transport security and authentication. HLR-010 through HLR-018 are new and cover per-driver resilience, Config DB / draft-publish, local cache fallback, cluster redundancy, fleet-wide identifier uniqueness, the Admin UI, audit logging, Prometheus metrics, and the Roslyn capability-wrapping analyzer.
+
 ## HLR-001: OPC UA Server
 
-The system shall expose an OPC UA server endpoint that OPC UA clients can connect to for browsing, reading, and writing Galaxy tag data.
+The system shall expose an OPC UA server endpoint that OPC UA clients can connect to for browsing, reading, writing, subscribing, acknowledging alarms, and reading historical values. Data is sourced from one or more **driver instances** that plug into the common core; OPC UA clients see a single unified address space per endpoint regardless of how many drivers are active behind it.
 
-## HLR-002: Galaxy Hierarchy as OPC UA Address Space
+## HLR-002: Multi-Driver Plug-In Model
 
-The system shall build an OPC UA address space that mirrors the System Platform Galaxy object hierarchy, using contained names for browse structure and tag names for runtime data access.
+The system shall support pluggable driver modules that bind to specific data sources. v2.0 ships eight drivers: Galaxy (AVEVA System Platform via MXAccess), Modbus TCP (including DL205 via `AddressFormat=DL205`), Allen-Bradley CIP (ControlLogix/CompactLogix), Allen-Bradley Legacy (SLC/MicroLogix via PCCC), Siemens S7, Beckhoff TwinCAT (ADS), FANUC FOCAS, and OPC UA Client (aggregation/gateway). Drivers implement only the capability interfaces (`IDriver`, `ITagDiscovery`, `IReadable`, `IWritable`, `ISubscribable`, `IAlarmSource`, `IHistoryProvider`, `IHostConnectivityProbe`, `IPerCallHostResolver`, `IRediscoverable`) defined in `ZB.MOM.WW.OtOpcUa.Core.Abstractions` that apply to their protocol. Multiple instances of the same driver type are supported; each instance binds to its own OPC UA namespace index.
 
-## HLR-003: MXAccess Runtime Data Access
+## HLR-003: Address Space Composition per Namespace
 
-The system shall use the MXAccess toolkit to subscribe to, read, and write Galaxy tag attribute values at runtime on behalf of connected OPC UA clients.
+The system shall build the OPC UA address space by composing per-driver subtrees into a single endpoint. Each driver instance owns one namespace and registers its nodes via the core-provided `IAddressSpaceBuilder` streaming API.
The Galaxy driver continues to mirror the deployed ArchestrA object hierarchy (contained-name browse paths) in a namespace of kind `SystemPlatform`. Native-protocol drivers populate a namespace of kind `Equipment` whose browse structure conforms to the canonical 5-level Unified Namespace hierarchy (`Enterprise / Site / Area / Line / Equipment`), with `Signal` nodes as leaves under each equipment node.
 
 ## HLR-004: Data Type Mapping
 
-The system shall map Galaxy attribute data types (mx_data_type) to appropriate OPC UA built-in types, including support for array attributes.
+Each driver shall map its native data types to OPC UA built-in types via `DriverDataType` conversions, including support for arrays (ValueRank=1 with ArrayDimensions). Type mapping is driver-specific — `docs/DataTypeMapping.md` covers Galaxy/MXAccess; each other driver's spec in `docs/v2/driver-specs.md` covers its own mapping. Unknown/unmapped driver types shall default to String per the driver's spec.
 
-## HLR-005: Dynamic Address Space Rebuild
+## HLR-005: Live Data Access
 
-The system shall detect Galaxy deployment changes (via `galaxy.time_of_last_deploy`) and rebuild the OPC UA address space to reflect the current deployed state.
+For every data-path operation (read, write, subscribe notification, alarm event, history read, tag rediscovery, host connectivity probe), the system shall route the call through the capability interface owned by the target driver instance. Reads and subscriptions shall deliver a `DataValueSnapshot` carrying value, OPC UA `StatusCode`, and source timestamp regardless of the underlying protocol. Every async capability invocation at dispatch shall pass through `Core.Resilience.CapabilityInvoker`.
 
-## HLR-006: Windows Service Hosting
+## HLR-006: Change Detection and Rediscovery
 
-The system shall run as a Windows service (via TopShelf) with support for install, uninstall, and interactive console modes.
+Drivers whose backend has a native change signal (e.g.
Galaxy's `time_of_last_deploy`, OPC UA Client receiving `ServerStatusChange`) shall implement the optional `IRediscoverable` interface so the core can rebuild only the affected subtree. Drivers whose tag set is static relative to a published config generation are not required to implement `IRediscoverable`; their address-space structure changes only via a new published Config DB generation (see HLR-012). -## HLR-007: Logging +## HLR-007: Service Hosting -The system shall log operational events to rolling daily log files using Serilog. +The system shall be deployed as three cooperating Windows services: -## HLR-008: Connection Resilience +- **OtOpcUa.Server** — .NET 10 x64, `Microsoft.Extensions.Hosting` + `AddWindowsService`, hosts all non-Galaxy drivers in-process and the OPC UA endpoint. +- **OtOpcUa.Admin** — .NET 10 x64 Blazor Server web app, hosts the admin UI, SignalR hubs for live updates, `/metrics` Prometheus endpoint, and audit log writers. +- **OtOpcUa.Galaxy.Host** — .NET Framework 4.8 x86 (TopShelf), hosts MXAccess COM + Galaxy Repository SQL + Historian plugin. Talks to `Driver.Galaxy.Proxy` inside `OtOpcUa.Server` via a named pipe (MessagePack over length-prefixed frames, per-process shared secret, SID-restricted ACL). -The system shall automatically reconnect to MXAccess after connection loss, replaying active subscriptions upon reconnect. +## HLR-008: Logging -## HLR-009: Status Dashboard +The system shall log operational events to rolling daily file sinks using Serilog on every process. Plain-text is on by default; structured JSON (CompactJsonFormatter) is opt-in via `Serilog:WriteJson = true` so SIEMs (Splunk, Datadog) can ingest without a regex parser. -The system shall host an embedded HTTP status dashboard (similar to the LmxProxy dashboard) providing at-a-glance operational visibility including connection state, health, subscription statistics, and operation metrics. 
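To make the HLR-008 opt-in concrete, a node's logging section might look like the fragment below. This is a sketch only: `Serilog:WriteJson` is the flag named above, but the surrounding key layout is assumed rather than copied from a shipped `appsettings.json`.

```json
{
  "Serilog": {
    "WriteJson": true
  }
}
```

With `WriteJson` left at its default of `false`, the same daily rolling file sinks write plain text.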
+## HLR-009: Transport Security and Authentication + +The system shall support configurable OPC UA transport-security profiles (`None`, `Basic256Sha256-Sign`, `Basic256Sha256-SignAndEncrypt`, `Aes128_Sha256_RsaOaep-Sign`, `Aes128_Sha256_RsaOaep-SignAndEncrypt`, `Aes256_Sha256_RsaPss-Sign`, `Aes256_Sha256_RsaPss-SignAndEncrypt`) resolved at startup by `SecurityProfileResolver`. UserName-token authentication shall be validated against LDAP (production: Active Directory; dev: GLAuth). The server certificate is always created even for `None`-only deployments because UserName token encryption depends on it. + +## HLR-010: Per-Driver-Instance Resilience + +Every async capability call at dispatch shall pass through `Core.Resilience.CapabilityInvoker`, which runs a Polly v8 pipeline keyed on `(DriverInstanceId, HostName, DriverCapability)`. Retry and circuit-breaker strategies are per capability per decision #143: Read / Discover / Probe / Subscribe / AlarmSubscribe / HistoryRead retry automatically; Write and AlarmAcknowledge do **not** retry unless the tag or capability is explicitly marked with `WriteIdempotentAttribute`. A driver-instance circuit-breaker trip sets Bad quality on that instance's nodes only; other drivers are unaffected (decision #144 — per-host Polly isolation). + +## HLR-011: Config DB and Draft/Publish + +Cluster topology, driver instances, namespaces, UNS hierarchy, equipment, tags, node ACLs, poll groups, and role grants shall live in a central MSSQL Config DB, not in `appsettings.json`. Changes accumulate in a draft generation that is validated and then atomically published. Each published generation gets a monotonically increasing `GenerationNumber` scoped per cluster. Nodes poll the DB for new published generations and diff-apply surgically against an atomic snapshot. `appsettings.json` is reduced to bootstrap-only fields (Config DB connection, NodeId, ClusterId, LDAP, security profile, redundancy role, logging, local cache path). 
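HLR-011's "bootstrap-only" scope can be illustrated with a sketch of what remains in `appsettings.json` once everything else lives in the Config DB. Every key name and value below is illustrative, inferred from the field list above rather than copied from a real file:

```json
{
  "ConfigDb": { "ConnectionString": "Server=cfg-sql;Database=OtOpcUaConfig;Integrated Security=true" },
  "NodeId": "node-a",
  "ClusterId": "cluster-1",
  "Ldap": { "Host": "ldap.example.local", "Port": 636 },
  "SecurityProfile": "Basic256Sha256-SignAndEncrypt",
  "RedundancyRole": "Primary",
  "LocalCachePath": "C:/ProgramData/OtOpcUa/cache",
  "Serilog": { "WriteJson": false }
}
```

Everything topological (drivers, namespaces, UNS hierarchy, equipment, tags, ACLs, poll groups, role grants) arrives via published generations, not via this file.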
+ +## HLR-012: Local Cache Fallback + +Each node shall maintain a sealed LiteDB local cache of the most recent successfully applied generation. If the central Config DB is unreachable at startup, the node shall boot from its cached generation and log a warning. Cache reads are the Polly `Fallback` leg of the Config DB pipeline. + +## HLR-013: Cluster Redundancy + +The system shall support non-transparent OPC UA redundancy via 2-node clusters sharing a Config DB generation. `RedundancyCoordinator` + `ServiceLevelCalculator` compute a dynamic OPC UA `ServiceLevel` reflecting role (Primary/Secondary), publish state (current generation applied vs mid-apply), health (driver circuit-breaker state), and apply-lease state. Clients select an endpoint by `ServerUriArray` + `ServiceLevel` per the OPC UA spec; there is no VIP or load balancer. Single-node deployments use the same model with `NodeCount = 1`. + +## HLR-014: Fleet-Wide Identifier Uniqueness + +Equipment identifiers that integrate with external systems (`ZTag` for ERP, `SAPID` for SAP PM) shall be unique fleet-wide (across all clusters), not just within a cluster. The Admin UI enforces this at draft-publish time via the `ExternalIdReservation` table, which reserves external IDs across clusters so two clusters cannot publish the same ZTag or SAPID. `EquipmentUuid` is immutable and globally unique (UUIDv4). `EquipmentId` and `MachineCode` are unique within a cluster. + +## HLR-015: Admin UI Operator Surface + +The system shall provide a Blazor Server Admin UI (`OtOpcUa.Admin`) as the sole write path into the Config DB. 
Capabilities include: cluster + node management, driver-instance CRUD with schemaless JSON editors, UNS drag-and-drop hierarchy editor, CSV-driven equipment import with fleet-wide external-id reservation, draft/publish with a 6-section diff viewer (Drivers / Namespaces / UNS / Equipment / Tags / ACLs), node-ACL editor producing a permission trie, LDAP role grants, redundancy tab, live cluster-generation state via SignalR, audit log viewer. Users authenticate via cookie-auth over LDAP bind; three admin roles (`ConfigViewer`, `ConfigEditor`, `FleetAdmin`) gate UI operations. + +## HLR-016: Audit Logging + +Every publish event and every ACL / role-grant change shall produce an immutable audit log row in the Config DB via `AuditLogService` with the acting principal, timestamp, action, before/after generation numbers, and affected entity ids. Audit rows are never mutated or deleted. + +## HLR-017: Prometheus Metrics + +The Admin service shall expose a `/metrics` endpoint using OpenTelemetry → Prometheus. Core / Server shall emit driver health (per `DriverInstanceId`), Polly circuit-breaker states (per `DriverInstanceId` + `HostName` + `DriverCapability`), capability-call duration histograms, subscription counts, session counts, memory-tracking gauges (Phase 6.1), publish durations, and Config-DB apply-status gauges. + +## HLR-018: Roslyn Analyzer OTOPCUA0001 + +All direct call sites to capability-interface methods (`IReadable.ReadAsync`, `IWritable.WriteAsync`, `ITagDiscovery.DiscoverAsync`, `ISubscribable.SubscribeAsync`, `IAlarmSource.SubscribeAlarmsAsync` / `AcknowledgeAsync`, `IHistoryProvider.*`, `IHostConnectivityProbe.*`) made outside `Core.Resilience.CapabilityInvoker` shall produce Roslyn diagnostic **OTOPCUA0001** at build time. The analyzer is shipped in `ZB.MOM.WW.OtOpcUa.Analyzers` and referenced by every project that could host a capability call, guaranteeing that resilience cannot be accidentally bypassed. 
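The write-idempotency rule from HLR-010, which OTOPCUA0001 exists to protect from being bypassed, can be sketched as a simple decision function. Names here are illustrative, not the actual Polly pipeline configuration:

```python
# Hedged sketch of the HLR-010 retry split: side-effect-free capabilities
# retry automatically; Write and AlarmAcknowledge do not, unless the target
# is explicitly marked idempotent (the WriteIdempotentAttribute case).
RETRYABLE = {"Read", "Discover", "Probe", "Subscribe", "AlarmSubscribe", "HistoryRead"}


def may_retry(capability: str, write_idempotent: bool = False) -> bool:
    if capability in RETRYABLE:
        return True
    # Retrying a write could double-apply a side effect, so only an
    # explicit idempotency marker opts these capabilities in.
    return capability in {"Write", "AlarmAcknowledge"} and write_idempotent


checks = [
    may_retry("Read"),                          # retries automatically
    may_retry("Write"),                         # never retried by default
    may_retry("Write", write_idempotent=True),  # opted in explicitly
]
```

The real pipeline additionally keys circuit-breaker state on `(DriverInstanceId, HostName, DriverCapability)`, so a tripped breaker affects only that instance's nodes.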
+ +## Retired HLRs + +- **HLR-009 (Status Dashboard)** — retired. Superseded by the Admin UI (HLR-015). See `docs/v2/admin-ui.md`. ## Component-Level Requirements Detailed requirements are broken out into the following documents: - [OPC UA Server Requirements](OpcUaServerReqs.md) -- [MXAccess Client Requirements](MxAccessClientReqs.md) -- [Galaxy Repository Requirements](GalaxyRepositoryReqs.md) -- [Service Host Requirements](ServiceHostReqs.md) -- [Status Dashboard Requirements](StatusDashboardReqs.md) +- [Galaxy Driver — Repository Requirements](GalaxyRepositoryReqs.md) (Galaxy driver only) +- [Galaxy Driver — MXAccess Client Requirements](MxAccessClientReqs.md) (Galaxy driver only) +- [Service Host Requirements](ServiceHostReqs.md) (all three processes) +- [Client Requirements](ClientRequirements.md) (Client CLI + Client UI) +- [Status Dashboard Requirements](StatusDashboardReqs.md) (retired — pointer only)