Doc refresh (task #205) — requirements updated for multi-driver OtOpcUa three-process deploy
Per-file summary:

- docs/reqs/OpcUaServerReqs.md: rewritten driver-agnostic. OPC-001..OPC-013 re-scoped to multi-driver address-space composition + capability dispatch; OPC-014 AuthorizationGate + permission trie; OPC-015 dynamic ServiceLevel via RedundancyCoordinator; OPC-017 surgical generation-apply rebuild; OPC-012 capability dispatch via CapabilityInvoker (decision #143 idempotence-aware retry); OPC-013 per-host Polly isolation (decision #144); OPC-019 OpenTelemetry metrics. Transport-security profile matrix (OPC-010) + UserName/LDAP (OPC-011) preserved.
- docs/reqs/GalaxyRepositoryReqs.md: scope clarified as Galaxy-driver-only (not platform). GR-001..GR-004 tied to ITagDiscovery.DiscoverAsync + IRediscoverable; all SQL runs inside OtOpcUa.Galaxy.Host and streams to Proxy via named pipe. GR-008 capability wrapping via CapabilityInvoker added. Cross-links to docs/v2/driver-specs.md + docs/GalaxyRepository.md.
- docs/reqs/MxAccessClientReqs.md: scope clarified as Galaxy-Host-only. MXA-001..MXA-009 preserved (STA pump, register/unregister, subscription refcount, auto-reconnect, probe, COM cleanup, operation metrics, error translation). MXA-010 Proxy-side capability wrapping + MXA-011 pipe ACL + per-process shared secret (OTOPCUA_ALLOWED_SID / OTOPCUA_GALAXY_SECRET) added.
- docs/reqs/ServiceHostReqs.md: rewritten for three-process deployment. Shared section (SVC-SHARED-001/002) for Serilog + bootstrap-only appsettings. SRV-* for OtOpcUa.Server (net10 x64, Microsoft.Extensions.Hosting + AddWindowsService, in-process driver hosting, redundancy-node bootstrap). ADM-* for OtOpcUa.Admin (Blazor Server, cookie+LDAP auth, CanEdit/CanPublish policies, sole DB writer, Prometheus /metrics, audit logging). GHX-* for OtOpcUa.Galaxy.Host (TopShelf, net48 x86, named-pipe IPC bootstrap, STA backend lifecycle, crash handling tied to supervisor).
- docs/reqs/ClientRequirements.md: restructured as numbered, verifiable requirements. SHR-* for Client.Shared (single IOpcUaClientService, ConnectionSettings, failover, cross-platform certs, type-coercing write, UI-thread neutrality). CLI-001..CLI-011 cover connect/read/write/browse/subscribe/historyread/alarms/redundancy. UI-001..UI-008 cover connection panel, tree browser, each tab, connection-state reflection, cross-platform build. Reference design content (IOpcUaClientService shape, models, view-model map, mock layout) preserved.
- docs/reqs/StatusDashboardReqs.md: retired cleanly. Replaced with a pointer to docs/v2/admin-ui.md + HLR-015 / HLR-016 / HLR-017 / ADM-*. Mapping table shows each retired DASH-001..DASH-009 requirement's replacement (live cluster-node view via SignalR, Prometheus metrics, driver-instance detail views, etc.). Note that a formal AdminUiReqs.md can be written later if needed for cert compliance.

HighLevelReqs.md was already at the target shape (HLR-001..HLR-018 with Revision header noting retired HLR-009) as of commit f217636; verified identical and no additional edit required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -1,8 +1,10 @@
|
||||
# OPC UA Client Requirements
|
||||
|
||||
## Overview
|
||||
> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). The Client surface (shared library + CLI + UI) shipped for v2 is preserved; this refresh restructures the document into numbered, directly-verifiable requirements (CLI-* and UI-* prefixes) layered on top of the existing detailed design content. Requirement coverage added for the `redundancy` command, alarm subscribe/ack round-trip, history-read, and UI tree-browser drag-to-subscribe behaviors. Original design-spec material for `ConnectionSettings`, `IOpcUaClientService`, models, and view-models is retained as reference-level details below the numbered requirements.
|
||||
|
||||
Three new .NET 10 cross-platform projects provide a shared OPC UA client library, a CLI tool, and an Avalonia desktop UI. All projects target Windows and macOS.
|
||||
Parent: [HLR-001](HighLevelReqs.md#hlr-001-opc-ua-server), [HLR-009](HighLevelReqs.md#hlr-009-transport-security-and-authentication), [HLR-013](HighLevelReqs.md#hlr-013-cluster-redundancy)
|
||||
|
||||
See also: `docs/Client.CLI.md`, `docs/Client.UI.md`.
|
||||
|
||||
## Projects
|
||||
|
||||
@@ -10,134 +12,161 @@ Three new .NET 10 cross-platform projects providing a shared OPC UA client libra
|
||||
|---------|------|---------|
| `ZB.MOM.WW.OtOpcUa.Client.Shared` | Class library | Core OPC UA client, models, interfaces |
| `ZB.MOM.WW.OtOpcUa.Client.CLI` | Console app | Command-line interface using CliFx |
| `ZB.MOM.WW.OtOpcUa.Client.UI` | Avalonia app | Desktop UI with tree browser, subscriptions, alarms |
| `ZB.MOM.WW.OtOpcUa.Client.Shared.Tests` | Test project | Unit tests for shared library |
| `ZB.MOM.WW.OtOpcUa.Client.CLI.Tests` | Test project | Unit tests for CLI commands |
| `ZB.MOM.WW.OtOpcUa.Client.UI.Tests` | Test project | Unit tests for UI view models |
| `ZB.MOM.WW.OtOpcUa.Client.UI` | Avalonia app | Desktop UI |
| `ZB.MOM.WW.OtOpcUa.Client.Shared.Tests` | Test project | Shared-library unit tests |
| `ZB.MOM.WW.OtOpcUa.Client.CLI.Tests` | Test project | CLI command tests |
| `ZB.MOM.WW.OtOpcUa.Client.UI.Tests` | Test project | ViewModel unit tests |
|
||||
|
||||
## Shared Requirements (Client.Shared)
|
||||
|
||||
### SHR-001: Single Service Interface
|
||||
|
||||
The Client.Shared library shall expose a single service interface `IOpcUaClientService` covering connect, disconnect, read, write, browse, subscribe, alarm-subscribe, alarm-ack, history-read-raw, history-read-aggregate, and get-redundancy-info operations.
|
||||
|
||||
### SHR-002: ConnectionSettings Model
|
||||
|
||||
The library shall expose a `ConnectionSettings` record with the fields: `EndpointUrl` (required), `FailoverUrls[]`, `Username`, `Password`, `SecurityMode` (None/Sign/SignAndEncrypt; default None), `SessionTimeoutSeconds` (default 60), `AutoAcceptCertificates` (default true), `CertificateStorePath`.
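
The field list and defaults in SHR-002 can be pictured as a small immutable record. What follows is a language-neutral sketch in Python, not the library's C# definition: the snake_case names and the `SecurityMode` enum are illustrative renderings of the fields named above, and the endpoint URL in the usage line is a made-up example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class SecurityMode(Enum):
    NONE = "None"
    SIGN = "Sign"
    SIGN_AND_ENCRYPT = "SignAndEncrypt"


@dataclass(frozen=True)
class ConnectionSettings:
    # Required: there is no default, so omitting it is an error.
    endpoint_url: str
    # Optional fields, with the defaults listed in SHR-002.
    failover_urls: Tuple[str, ...] = ()
    username: Optional[str] = None
    password: Optional[str] = None
    security_mode: SecurityMode = SecurityMode.NONE
    session_timeout_seconds: int = 60
    auto_accept_certificates: bool = True
    # None stands for "use the platform-appropriate default location".
    certificate_store_path: Optional[str] = None


# Hypothetical endpoint, used only to show that every other field is defaulted.
settings = ConnectionSettings("opc.tcp://localhost:4840")
```

Making the record immutable is a design choice of this sketch: it keeps a settings object safe to share between the CLI commands and the UI view models once a connection attempt has started.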
|
||||
|
||||
### SHR-003: Automatic Failover
|
||||
|
||||
The library shall monitor session keep-alive and automatically fail over across `FailoverUrls` when the primary endpoint is unreachable, emitting a `ConnectionStateChanged` event on each transition (Disconnected / Connecting / Connected / Reconnecting).
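
One way to read SHR-003 is as a walk over an ordered list of candidate endpoints once keep-alive fails. The Python sketch below is only an illustration under stated assumptions: the function name `fail_over`, the `try_connect` callback, the "primary first, then `FailoverUrls` in listed order" ordering, and the exact sequence of emitted states are all assumptions, and the URLs are invented.

```python
from enum import Enum
from typing import Callable, Iterable, List, Optional


class ConnectionState(Enum):
    DISCONNECTED = "Disconnected"
    CONNECTING = "Connecting"
    CONNECTED = "Connected"
    RECONNECTING = "Reconnecting"


def fail_over(
    candidates: Iterable[str],
    try_connect: Callable[[str], bool],
    on_state_changed: Callable[[ConnectionState], None],
) -> Optional[str]:
    """Try each candidate endpoint in order after a keep-alive failure.

    Returns the URL that accepted the session, or None if all failed.
    """
    on_state_changed(ConnectionState.RECONNECTING)
    for url in candidates:
        if try_connect(url):
            on_state_changed(ConnectionState.CONNECTED)
            return url
    on_state_changed(ConnectionState.DISCONNECTED)
    return None


# Stubbed transport: the primary is down, the first failover endpoint answers.
reachable = {"opc.tcp://node-b:4840"}
events: List[ConnectionState] = []
active = fail_over(
    ["opc.tcp://node-a:4840", "opc.tcp://node-b:4840"],
    try_connect=lambda url: url in reachable,
    on_state_changed=events.append,
)
```

Passing the connect attempt and the event sink as callables keeps the sketch free of any OPC UA stack dependency, which mirrors the requirement that unit tests run without a live server.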
|
||||
|
||||
### SHR-004: Cross-Platform Certificate Store
|
||||
|
||||
The library shall auto-generate a client certificate on first use and store it in a cross-platform path (default `{AppData}/OtOpcUaClient/pki/`). Server certificates are auto-accepted when `AutoAcceptCertificates = true`.
|
||||
|
||||
### SHR-005: Type-Coercing Write
|
||||
|
||||
The library's `WriteValueAsync(NodeId, object)` shall read the node's current value to determine target type and coerce the input value before sending.
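
The read-then-coerce idea in SHR-005 can be sketched as follows. This is a simplified Python illustration, not the library's `ConvertValue` logic: it covers only four Python types standing in for OPC UA built-in types, and the accepted boolean spellings are an assumption.

```python
from typing import Any


def coerce_to_current_type(current_value: Any, raw_input: str) -> Any:
    """Convert raw_input to the type of the node's current value."""
    target = type(current_value)
    if target is bool:
        # bool("false") is True in Python, so the text is parsed explicitly.
        lowered = raw_input.strip().lower()
        if lowered in ("true", "1"):
            return True
        if lowered in ("false", "0"):
            return False
        raise ValueError(f"cannot interpret {raw_input!r} as a boolean")
    if target in (int, float, str):
        # int("4.2") raises ValueError, so a lossy write is rejected, not truncated.
        return target(raw_input)
    raise TypeError(f"no coercion rule for {target.__name__}")


# The node currently holds a float, so the text "42" is sent as 42.0.
value_to_send = coerce_to_current_type(1.5, "42")
```

Rejecting inputs that do not parse cleanly, instead of silently truncating them, is a choice made for this sketch; the source does not say how the real conversion treats such inputs.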
|
||||
|
||||
### SHR-006: UI-Thread Dispatch Neutrality
|
||||
|
||||
The library shall not assume any specific synchronization context. Events (`DataChanged`, `AlarmEvent`, `ConnectionStateChanged`) are raised on the OPC UA stack thread; each consumer is responsible for marshalling them to its own thread where needed (for example, the UI dispatches to its UI thread).
|
||||
|
||||
---
|
||||
|
||||
## CLI Requirements (Client.CLI)
|
||||
|
||||
### CLI-001: Command Surface
|
||||
|
||||
The CLI shall expose the following commands: `connect`, `read`, `write`, `browse`, `subscribe`, `historyread`, `alarms`, `redundancy`.
|
||||
|
||||
### CLI-002: Common Options
|
||||
|
||||
All CLI commands shall accept the options `-u, --url` (required), `-U, --username`, `-P, --password`, `-S, --security none|sign|encrypt`, `-F, --failover-urls` (comma-separated), `--verbose`.
|
||||
|
||||
### CLI-003: Connect Command
|
||||
|
||||
The `connect` command shall attempt to establish a session using the supplied options and print `Connected` plus the resolved endpoint's `ServerUriArray` and `ApplicationUri` on success, or a diagnostic error message on failure.
|
||||
|
||||
### CLI-004: Read Command
|
||||
|
||||
The `read -n <NodeId>` command shall print `NodeId`, `Value`, `StatusCode`, `SourceTimestamp`, `ServerTimestamp` one per line.
|
||||
|
||||
### CLI-005: Write Command
|
||||
|
||||
The `write -n <NodeId> -v <value>` command shall coerce the value to the node's current type (per SHR-005) and print the resulting `StatusCode`. A `Bad_UserAccessDenied` result is printed verbatim so operators see the authorization outcome.
|
||||
|
||||
### CLI-006: Browse Command
|
||||
|
||||
The `browse [-n <parent>] [-r] [-d <depth>]` command shall list child nodes under `parent` (or the `Objects` folder if omitted). `-r` enables recursion up to `-d` depth (default 1).
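
The interaction between `-r` and `-d` in CLI-006 can be illustrated with a depth-limited walk. This Python sketch assumes that depth 1 means "direct children only", so `-r` with the default depth lists a single level; the `walk` function, the `browse_stub` helper, and the node names are hypothetical.

```python
from typing import Callable, Dict, Iterator, List, Tuple


def walk(
    browse: Callable[[str], List[str]],
    parent: str,
    recursive: bool = False,
    max_depth: int = 1,
    depth: int = 1,
) -> Iterator[Tuple[int, str]]:
    """Yield (depth, node) pairs, descending only while depth < max_depth."""
    for child in browse(parent):
        yield depth, child
        if recursive and depth < max_depth:
            yield from walk(browse, child, recursive, max_depth, depth + 1)


# Stub address space standing in for one-level browse calls.
TREE: Dict[str, List[str]] = {
    "Objects": ["Plant"],
    "Plant": ["Line1", "Line2"],
    "Line1": ["Speed"],
}


def browse_stub(node: str) -> List[str]:
    return TREE.get(node, [])


# Roughly what `browse -r -d 2` would list, indented by depth.
for level, node in walk(browse_stub, "Objects", recursive=True, max_depth=2):
    print("  " * (level - 1) + node)
```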
|
||||
|
||||
### CLI-007: Subscribe Command
|
||||
|
||||
The `subscribe -n <NodeId> -i <intervalMs>` command shall create a monitored item with a publishing interval of `intervalMs` milliseconds, print each `DataChanged` event as `<timestamp> <nodeId> <value> <status>` until Ctrl-C, and then cleanly unsubscribe.
|
||||
|
||||
### CLI-008: Historyread Command
|
||||
|
||||
The `historyread -n <NodeId> --start <utc> --end <utc> [--max <n>] [--aggregate <type> --interval <ms>]` command shall print raw values or aggregate buckets. Supported aggregate types: Average, Minimum, Maximum, Count, Start, End.
|
||||
|
||||
### CLI-009: Alarms Command
|
||||
|
||||
The `alarms [-n <source>] [-i <intervalMs>]` command shall subscribe to alarm events, print each event as `<time> <source> <condition> <severity> <state> <acked> <message>`, accept `ack <conditionId>` commands interactively, and support `refresh` to trigger `RequestConditionRefreshAsync`.
|
||||
|
||||
### CLI-010: Redundancy Command
|
||||
|
||||
The `redundancy` command shall call `GetRedundancyInfoAsync` and print `Mode`, `ServiceLevel`, `ApplicationUri`, and `ServerUris` (one per line). This output is suitable for redundancy-failover smoke tests.
|
||||
|
||||
### CLI-011: Logging
|
||||
|
||||
The CLI shall use Serilog console sink at `Warning` minimum by default; `--verbose` raises to `Debug`.
|
||||
|
||||
---
|
||||
|
||||
## UI Requirements (Client.UI)
|
||||
|
||||
### UI-001: Connection Panel
|
||||
|
||||
The UI shall present a top-bar connection panel with fields for Endpoint URL, Username, Password, Security mode, and a Connect / Disconnect button. The resolved `RedundancyInfo` is displayed next to the bar on successful connect.
|
||||
|
||||
### UI-002: Tree Browser
|
||||
|
||||
The UI shall present a left-pane tree browser backed by `IOpcUaClientService.BrowseAsync`, lazy-loading children on node expansion (one level per `BrowseAsync` call).
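
Lazy loading as described in UI-002 amounts to deferring the browse call until a node is first expanded and then reusing the result. The Python sketch below is an illustration only: `TreeNode`, `expand`, and the stubbed browse function are hypothetical stand-ins for the view model and `BrowseAsync`, and caching children after the first expansion is an assumption.

```python
from typing import Callable, Dict, List, Optional


class TreeNode:
    """A tree node that fetches its children on first expansion only."""

    def __init__(self, node_id: str, browse: Callable[[str], List[str]]):
        self.node_id = node_id
        self._browse = browse
        self._children: Optional[List["TreeNode"]] = None  # None = not loaded

    @property
    def is_loaded(self) -> bool:
        return self._children is not None

    def expand(self) -> List["TreeNode"]:
        if self._children is None:
            # Exactly one browse call per node, one level deep.
            self._children = [
                TreeNode(child_id, self._browse)
                for child_id in self._browse(self.node_id)
            ]
        return self._children


# Stub that records every browse call so the laziness is observable.
TREE: Dict[str, List[str]] = {"Objects": ["Plant"], "Plant": ["Line1", "Line2"]}
calls: List[str] = []


def browse_stub(node_id: str) -> List[str]:
    calls.append(node_id)
    return TREE.get(node_id, [])


root = TreeNode("Objects", browse_stub)
children = root.expand()
```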
|
||||
|
||||
### UI-003: Read/Write Tab
|
||||
|
||||
The UI shall provide a Read/Write tab that auto-reads the selected tree node's current value, displays `Value` + `StatusCode` + `SourceTimestamp`, and accepts a write value with a Send button.
|
||||
|
||||
### UI-004: Subscriptions Tab
|
||||
|
||||
The UI shall provide a Subscriptions tab that lists active monitored items (columns: NodeId, Value, Status, Timestamp), supports Add and Remove, and dispatches `DataChanged` events to the Avalonia UI thread via `Dispatcher.UIThread.Post`.
|
||||
|
||||
### UI-005: Alarms Tab
|
||||
|
||||
The UI shall provide an Alarms tab that supports SubscribeAlarms / UnsubscribeAlarms / RefreshConditions commands, displays live alarm events, and supports `Acknowledge` on selected events. Acknowledgment failure (including `Bad_UserAccessDenied`) is surfaced to the user.
|
||||
|
||||
### UI-006: History Tab
|
||||
|
||||
The UI shall provide a History tab with inputs for StartTime, EndTime, MaxValues, AggregateType, Interval, a Read command, and a results table with columns (Timestamp, Value, Status).
|
||||
|
||||
### UI-007: Connection State Reflects in UI
|
||||
|
||||
All tabs shall reflect the connection state — when disconnected, all action commands are disabled; the status bar shows `Disconnected` / `Connecting` / `Connected` / `Reconnecting` tied to the `ConnectionStateChanged` event.
|
||||
|
||||
### UI-008: Cross-Platform
|
||||
|
||||
The UI shall build and run on Windows (win-x64) and macOS (osx-arm64 / osx-x64). No platform-specific OPC UA stack APIs are used.
|
||||
|
||||
---
|
||||
|
||||
## Technology Stack
|
||||
|
||||
- .NET 10, C#
|
||||
- OPC UA: OPCFoundation.NetStandard.Opc.Ua.Client
|
||||
- OPC UA: `OPCFoundation.NetStandard.Opc.Ua.Client`
|
||||
- Logging: Serilog
|
||||
- CLI: CliFx
|
||||
- UI: Avalonia 11.x with CommunityToolkit.Mvvm
|
||||
- Tests: xUnit 3, Shouldly, Microsoft.Testing.Platform runner
|
||||
|
||||
## Client.Shared
|
||||
## Client.Shared — Design Detail
|
||||
|
||||
### ConnectionSettings Model
|
||||
### IOpcUaClientService Interface (reference)
|
||||
|
||||
```
|
||||
EndpointUrl: string (required)
|
||||
FailoverUrls: string[] (optional)
|
||||
Username: string? (optional, first-class property)
|
||||
Password: string? (optional, first-class property)
|
||||
SecurityMode: enum (None, Sign, SignAndEncrypt) — default None
|
||||
SessionTimeoutSeconds: int — default 60
|
||||
AutoAcceptCertificates: bool — default true
|
||||
CertificateStorePath: string? — default platform-appropriate location
|
||||
```
|
||||
**Lifecycle:** `ConnectAsync(ConnectionSettings)`, `DisconnectAsync()`, `IsConnected`.
|
||||
|
||||
### IOpcUaClientService Interface
|
||||
**Read/Write:** `ReadValueAsync(NodeId)`, `WriteValueAsync(NodeId, object)`.
|
||||
|
||||
Single service interface covering all OPC UA operations:
|
||||
**Browse:** `BrowseAsync(NodeId? parent)` → `BrowseResult[]` (NodeId, DisplayName, NodeClass, HasChildren); lazy-load compatible.
|
||||
|
||||
**Lifecycle:**
|
||||
- `ConnectAsync(ConnectionSettings)` — connect to server, handle endpoint discovery, security, auth
|
||||
- `DisconnectAsync()` — close session cleanly
|
||||
- `IsConnected` property
|
||||
**Subscribe:** `SubscribeAsync(NodeId, int intervalMs)`, `UnsubscribeAsync(NodeId)`, `event DataChanged(NodeId, DataValue)`.
|
||||
|
||||
**Read/Write:**
|
||||
- `ReadValueAsync(NodeId)` — returns DataValue (value, status, timestamps)
|
||||
- `WriteValueAsync(NodeId, object value)` — auto-detects target type, returns StatusCode
|
||||
**Alarms:** `SubscribeAlarmsAsync(NodeId? source, int intervalMs)`, `UnsubscribeAlarmsAsync()`, `AcknowledgeAsync(conditionId, comment)`, `RequestConditionRefreshAsync()`, `event AlarmEvent(AlarmEventArgs)`.
|
||||
|
||||
**Browse:**
|
||||
- `BrowseAsync(NodeId? parent)` — returns list of BrowseResult (NodeId, DisplayName, NodeClass)
|
||||
- Lazy-load compatible (browse one level at a time)
|
||||
**History:** `HistoryReadRawAsync(NodeId, start, end, maxValues)`, `HistoryReadAggregateAsync(NodeId, start, end, AggregateType, intervalMs)`.
|
||||
|
||||
**Subscribe:**
|
||||
- `SubscribeAsync(NodeId, int intervalMs)` — create monitored item subscription
|
||||
- `UnsubscribeAsync(NodeId)` — remove monitored item
|
||||
- `event DataChanged` — fires on value change with (NodeId, DataValue)
|
||||
|
||||
**Alarms:**
|
||||
- `SubscribeAlarmsAsync(NodeId? source, int intervalMs)` — subscribe to alarm events
|
||||
- `UnsubscribeAlarmsAsync()` — remove alarm subscription
|
||||
- `RequestConditionRefreshAsync()` — trigger condition refresh
|
||||
- `event AlarmEvent` — fires on alarm state change with AlarmEventArgs
|
||||
|
||||
**History:**
|
||||
- `HistoryReadRawAsync(NodeId, DateTime start, DateTime end, int maxValues)` — raw historical values
|
||||
- `HistoryReadAggregateAsync(NodeId, DateTime start, DateTime end, AggregateType, double intervalMs)` — aggregated values
|
||||
|
||||
**Redundancy:**
|
||||
- `GetRedundancyInfoAsync()` — returns RedundancyInfo (mode, service level, server URIs, app URI)
|
||||
|
||||
**Failover:**
|
||||
- Automatic failover across FailoverUrls with keep-alive monitoring
|
||||
- `event ConnectionStateChanged` — fires on connect/disconnect/failover
|
||||
**Redundancy:** `GetRedundancyInfoAsync()` → `RedundancyInfo` (Mode, ServiceLevel, ServerUris, ApplicationUri).
|
||||
|
||||
### Models
|
||||
|
||||
- `BrowseResult`: NodeId, DisplayName, NodeClass, HasChildren
|
||||
- `AlarmEventArgs`: SourceName, ConditionName, Severity, Message, Retain, ActiveState, AckedState, Time
|
||||
- `RedundancyInfo`: Mode, ServiceLevel, ServerUris, ApplicationUri
|
||||
- `ConnectionState`: enum (Disconnected, Connecting, Connected, Reconnecting)
|
||||
- `AggregateType`: enum (Average, Minimum, Maximum, Count, Start, End)
|
||||
- `BrowseResult` — NodeId, DisplayName, NodeClass, HasChildren
|
||||
- `AlarmEventArgs` — SourceName, ConditionName, Severity, Message, Retain, ActiveState, AckedState, Time
|
||||
- `RedundancyInfo` — Mode, ServiceLevel, ServerUris, ApplicationUri
|
||||
- `ConnectionState` — enum (Disconnected, Connecting, Connected, Reconnecting)
|
||||
- `AggregateType` — enum (Average, Minimum, Maximum, Count, Start, End)
|
||||
|
||||
### Type Conversion
|
||||
---
|
||||
|
||||
Port the existing `ConvertValue` logic from the CLI tool: reads the current node value to determine the target type, then coerces the input value.
|
||||
|
||||
### Certificate Management
|
||||
|
||||
- Cross-platform certificate store path (default: `{AppData}/LmxOpcUaClient/pki/`)
|
||||
- Auto-generate client certificate on first use
|
||||
- Auto-accept untrusted server certificates (configurable)
|
||||
|
||||
### Logging
|
||||
|
||||
Serilog with `ILogger` passed via constructor or `Log.ForContext<T>()`. No sinks configured in the library — consumers configure sinks.
|
||||
|
||||
## Client.CLI
|
||||
|
||||
### Commands
|
||||
|
||||
All 8 commands:
|
||||
|
||||
| Command | Description |
|
||||
|---------|-------------|
|
||||
| `connect` | Test server connectivity |
|
||||
| `read` | Read a node value |
|
||||
| `write` | Write a value to a node |
|
||||
| `browse` | Browse address space (with depth/recursive) |
|
||||
| `subscribe` | Monitor node for value changes |
|
||||
| `historyread` | Read historical data (raw + aggregates) |
|
||||
| `alarms` | Subscribe to alarm events |
|
||||
| `redundancy` | Query redundancy state |
|
||||
|
||||
All commands use the shared `IOpcUaClientService`. Each command:
|
||||
1. Creates `ConnectionSettings` from CLI options
|
||||
2. Creates `OpcUaClientService`
|
||||
3. Calls the appropriate method
|
||||
4. Formats and prints results
|
||||
|
||||
### Common Options (all commands)
|
||||
|
||||
- `-u, --url` (required): Endpoint URL
|
||||
- `-U, --username`: Username
|
||||
- `-P, --password`: Password
|
||||
- `-S, --security`: Security mode (none/sign/encrypt)
|
||||
- `-F, --failover-urls`: Comma-separated failover endpoints
|
||||
|
||||
### Logging
|
||||
|
||||
Serilog console sink at Warning level by default, with `--verbose` flag for Debug.
|
||||
|
||||
## Client.UI
|
||||
|
||||
### Window Layout
|
||||
## Client.UI — View Layout (reference)
|
||||
|
||||
Single-window Avalonia application:
|
||||
|
||||
@@ -146,82 +175,43 @@ Single-window Avalonia application:
|
||||
│ [Endpoint URL] [User] [Pass] [Security▼] [Connect] │
|
||||
│ Redundancy: Mode=Warm ServiceLevel=200 AppUri=... │
|
||||
├──────────────┬──────────────────────────────────────────┤
|
||||
│ │ ┌─Read/Write─┬─Subscriptions─┬─Alarms─┬─History─┐│
|
||||
│ Address │ │ Node: ns=3;s=Tag.Attr ││
|
||||
│ Space │ │ Value: 42.5 ││
|
||||
│ Tree │ │ Status: Good ││
|
||||
│ Browser │ │ [Write: ____] [Send] ││
|
||||
│ │ │ ││
|
||||
│ (lazy-load) │ │ ││
|
||||
│ │ └──────────────────────────────────────┘│
|
||||
│ │ ┌Read/Write┬Subscriptions┬Alarms┬History┐│
|
||||
│ Address │ │ Node: ns=3;s=Tag.Attr ││
|
||||
│ Space │ │ Value: 42.5 Status: Good ││
|
||||
│ Tree │ │ [Write: ____] [Send] ││
|
||||
│ Browser │ └───────────────────────────────────────┘│
|
||||
├──────────────┴──────────────────────────────────────────┤
|
||||
│ Status: Connected | Session: abc123 | 3 subscriptions │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Views and ViewModels (CommunityToolkit.Mvvm)
|
||||
### ViewModels (CommunityToolkit.Mvvm)
|
||||
|
||||
**MainWindowViewModel:**
|
||||
- Connection settings properties (bound to top bar inputs)
|
||||
- ConnectCommand / DisconnectCommand (RelayCommand)
|
||||
- ConnectionState property
|
||||
- RedundancyInfo property
|
||||
- SelectedTreeNode property
|
||||
- StatusMessage property
|
||||
|
||||
**BrowseTreeViewModel:**
|
||||
- Root nodes collection (ObservableCollection)
|
||||
- Lazy-load children on expand via `BrowseAsync`
|
||||
- TreeNodeViewModel: NodeId, DisplayName, NodeClass, Children, IsExpanded, HasChildren
|
||||
|
||||
**ReadWriteViewModel:**
|
||||
- SelectedNode (from tree selection)
|
||||
- CurrentValue, Status, SourceTimestamp
|
||||
- WriteValue input + WriteCommand
|
||||
- Auto-read on node selection
|
||||
|
||||
**SubscriptionsViewModel:**
|
||||
- ActiveSubscriptions collection (ObservableCollection)
|
||||
- AddSubscription / RemoveSubscription commands
|
||||
- Live value updates dispatched to UI thread
|
||||
- Columns: NodeId, Value, Status, Timestamp
|
||||
|
||||
**AlarmsViewModel:**
|
||||
- AlarmEvents collection (ObservableCollection)
|
||||
- SubscribeCommand / UnsubscribeCommand / RefreshCommand
|
||||
- MonitoredNode property
|
||||
- Live alarm events dispatched to UI thread
|
||||
|
||||
**HistoryViewModel:**
|
||||
- SelectedNode (from tree selection)
|
||||
- StartTime, EndTime, MaxValues, AggregateType, Interval
|
||||
- ReadCommand
|
||||
- Results collection (ObservableCollection)
|
||||
- Columns: Timestamp, Value, Status
|
||||
|
||||
### UI Thread Dispatch
|
||||
|
||||
All events from `IOpcUaClientService` must be dispatched to the Avalonia UI thread via `Dispatcher.UIThread.Post()` before updating ObservableCollections.
|
||||
- `MainWindowViewModel` — connection fields, connect/disconnect commands, `ConnectionState`, `RedundancyInfo`, `SelectedTreeNode`, `StatusMessage`.
|
||||
- `BrowseTreeViewModel` — root collection (`ObservableCollection<TreeNodeViewModel>`), lazy-load on expand.
|
||||
- `ReadWriteViewModel` — auto-read on selection, `WriteValue` + `WriteCommand`.
|
||||
- `SubscriptionsViewModel` — `ActiveSubscriptions`, `AddSubscription`, `RemoveSubscription`, live `DataChanged` dispatch to UI thread.
|
||||
- `AlarmsViewModel` — `AlarmEvents`, Subscribe / Unsubscribe / Refresh / Acknowledge commands.
|
||||
- `HistoryViewModel` — `StartTime`, `EndTime`, `MaxValues`, `AggregateType`, `Interval`, `ReadCommand`, `Results`.
|
||||
|
||||
## Test Projects
|
||||
|
||||
### Client.Shared.Tests
|
||||
- ConnectionSettings validation
|
||||
- Type conversion (ConvertValue)
|
||||
- BrowseResult model construction
|
||||
- AlarmEventArgs model construction
|
||||
- `ConnectionSettings` validation
|
||||
- Type conversion
|
||||
- `BrowseResult` / `AlarmEventArgs` / `RedundancyInfo` model construction
|
||||
- FailoverUrl parsing
|
||||
|
||||
### Client.CLI.Tests
|
||||
- Command option parsing (via CliFx test infrastructure)
|
||||
- Output formatting
|
||||
- Output formatting for each command
|
||||
|
||||
### Client.UI.Tests
|
||||
- ViewModel property change notifications
|
||||
- Command can-execute logic
|
||||
- Tree node lazy-load behavior (with mocked IOpcUaClientService)
|
||||
- ViewModel property-change notifications
|
||||
- Command `CanExecute` logic
|
||||
- Tree lazy-load behavior (with mocked `IOpcUaClientService`)
|
||||
|
||||
### Test Framework
|
||||
- xUnit 3 with Microsoft.Testing.Platform runner
|
||||
- Shouldly for assertions
|
||||
- No live OPC UA server required — mock IOpcUaClientService for unit tests
|
||||
- Shouldly
|
||||
- No live OPC UA server required — mock `IOpcUaClientService` for unit tests
|
||||
|
||||
@@ -1,106 +1,113 @@
|
||||
# Galaxy Repository — Component Requirements
|
||||
# Galaxy Driver — Galaxy Repository Requirements
|
||||
|
||||
Parent: [HLR-002](HighLevelReqs.md#hlr-002-galaxy-hierarchy-as-opc-ua-address-space), [HLR-005](HighLevelReqs.md#hlr-005-dynamic-address-space-rebuild)
|
||||
> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). Scope clarified: this document is **Galaxy-driver-specific**. Galaxy is one of seven drivers in the OtOpcUa platform; the requirements below describe the SQL-side of the Galaxy driver (hierarchy/attribute/change-detection queries against the ZB database) that backs the Galaxy driver's `ITagDiscovery.DiscoverAsync` and `IRediscoverable` implementations. All Galaxy-specific SQL runs inside `OtOpcUa.Galaxy.Host` (.NET 4.8 x86 Windows service); the in-server `Driver.Galaxy.Proxy` calls it over a named pipe. For platform-wide tag discovery requirements see `OpcUaServerReqs.md` OPC-002. For deeper spec see `docs/GalaxyRepository.md` and `docs/v2/driver-specs.md`.
|
||||
|
||||
Parent: [HLR-002](HighLevelReqs.md#hlr-002-multi-driver-plug-in-model), [HLR-003](HighLevelReqs.md#hlr-003-address-space-composition-per-namespace), [HLR-006](HighLevelReqs.md#hlr-006-change-detection-and-rediscovery)
|
||||
|
||||
Driver scope: Galaxy only. Namespace kind: `SystemPlatform`.
|
||||
|
||||
## GR-001: Hierarchy Extraction
|
||||
|
||||
The system shall query the Galaxy Repository database to extract all deployed objects with their parent-child containment relationships, contained names, and tag names.
|
||||
The Galaxy driver's `ITagDiscovery.DiscoverAsync` implementation shall query the ZB Galaxy Repository database to extract all deployed objects with their parent-child containment relationships, contained names, and tag names.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Executes `queries/hierarchy.sql` against the ZB database.
|
||||
- Executes `queries/hierarchy.sql` against the ZB database from within `OtOpcUa.Galaxy.Host`.
|
||||
- Returns a list of objects with: `gobject_id`, `tag_name`, `contained_name`, `browse_name`, `parent_gobject_id`, `is_area`.
|
||||
- Objects with `parent_gobject_id = 0` are children of the root ZB node.
|
||||
- Objects with `parent_gobject_id = 0` become children of the root ZB node inside the `SystemPlatform` namespace.
|
||||
- Only deployed, non-template objects matching the category filter (areas, engines, user-defined objects, etc.) are returned.
|
||||
- Query completes within 10 seconds on a typical Galaxy (hundreds of objects). Log a Warning if it takes longer.
|
||||
- Query completes within 10 seconds on a typical Galaxy (hundreds of objects). Log Warning if it takes longer.
|
||||
|
||||
### Details
|
||||
|
||||
- Results are ordered by `parent_gobject_id, tag_name` for deterministic tree building.
|
||||
- If the query returns zero rows, log a Warning (Galaxy may have no deployed objects, or the DB connection may be misconfigured).
|
||||
- Orphan detection: if a row references a `parent_gobject_id` that does not exist in the result set and is not 0, log a Warning and skip that node.
|
||||
- Empty result → Warning logged (Galaxy may have no deployed objects, or the DB connection may be misconfigured).
|
||||
- Orphan detection: a row referencing a non-existent `parent_gobject_id` (and not 0) is skipped with a Warning.
|
||||
- Rows are streamed to the core via `IAddressSpaceBuilder.AddFolder` / `AddObject` calls over the Galaxy named pipe; the Host side does not buffer the full tree in memory.
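
The orphan rule above can be sketched as a filter over hierarchy rows. This Python example is illustrative rather than the driver's code: the function name and logger name are hypothetical, rows are modeled as dictionaries keyed by the column names from the acceptance criteria, and for brevity it works on a list already held in memory, which a streaming implementation would avoid.

```python
import logging
from typing import Dict, List

log = logging.getLogger("galaxy.discovery")  # hypothetical logger name


def drop_orphans(rows: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Keep rows whose parent is the root (0) or present in the result set."""
    known_ids = {row["gobject_id"] for row in rows}
    kept = []
    for row in rows:
        parent = row["parent_gobject_id"]
        if parent != 0 and parent not in known_ids:
            log.warning(
                "Skipping orphan gobject_id=%s (parent %s not in result set)",
                row["gobject_id"], parent,
            )
            continue
        kept.append(row)
    return kept


rows = [
    {"gobject_id": 1, "parent_gobject_id": 0},   # child of the root node
    {"gobject_id": 2, "parent_gobject_id": 1},   # parent is present
    {"gobject_id": 3, "parent_gobject_id": 99},  # parent missing: orphan
]
kept = drop_orphans(rows)
```

The sketch follows the literal wording of the rule, so a descendant of a skipped node is still kept because its parent id does appear in the result set; whether such descendants should also be dropped is not stated in the source.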
|
||||
|
||||
---
|
||||
|
||||
## GR-002: Attribute Extraction
|
||||
|
||||
The system shall query user-defined (dynamic) attributes for deployed objects, including data type, array flag, and array dimensions.
|
||||
The Galaxy driver shall query user-defined (dynamic) attributes for deployed objects, including data type, array flag, and array dimensions.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Executes `queries/attributes.sql` using the template chain CTE to resolve inherited attributes.
|
||||
- Returns: `gobject_id`, `tag_name`, `attribute_name`, `full_tag_reference`, `mx_data_type`, `is_array`, `array_dimension`, `security_classification`.
|
||||
- Attributes starting with `_` are filtered out by the query.
|
||||
- `array_dimension` is correctly extracted from the `mx_value` hex bytes (positions 13-16, little-endian uint16).
|
||||
- `array_dimension` is extracted from the `mx_value` hex bytes (positions 13-16, little-endian uint16).
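
The `array_dimension` extraction can be illustrated with a short decoding sketch. It rests on an assumption that should be checked against `queries/attributes.sql`: `mx_value` is treated here as a hex string, and "positions 13-16" is read as 1-based character positions, giving four hex digits (two bytes) that form a little-endian uint16. The function name and the sample value are invented.

```python
def array_dimension(mx_value_hex: str) -> int:
    """Decode hex characters 13..16 (1-based) as a little-endian uint16."""
    digits = mx_value_hex[12:16]  # 0-based slice for 1-based positions 13..16
    if len(digits) != 4:
        raise ValueError("mx_value is too short to carry an array dimension")
    return int.from_bytes(bytes.fromhex(digits), byteorder="little")


# Invented sample: twelve filler digits, then bytes 0x2C 0x01, then a tail.
sample = "000000000000" + "2C01" + "FFFF"
dimension = array_dimension(sample)  # 0x012C
```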
|
||||
|
||||
### Details
|
||||
|
||||
- CTE recursion depth is limited to 10 levels (per the query). This is sufficient for Galaxy template hierarchies.
|
||||
- If `mx_data_type` is null or not in the known set (1-8, 13-16), default to String.
|
||||
- If `gobject_id` from an attribute row does not match any hierarchy object, skip that attribute (object may not be deployed).
|
||||
- CTE recursion depth is limited to 10 levels.
|
||||
- `mx_data_type` not in the known set (1-8, 13-16) defaults to String.
|
||||
- `gobject_id` that doesn't match a hierarchy object is skipped (object may not be deployed).
|
||||
- Each emitted attribute is reported via `DriverAttributeInfo` to the core through `IAddressSpaceBuilder.AddVariable`.
|
||||
|
||||
---
|
||||
|
||||
## GR-003: Change Detection
|
||||
## GR-003: Change Detection and IRediscoverable
|
||||
|
||||
The system shall poll `galaxy.time_of_last_deploy` at a configurable interval to detect when a new deployment has occurred.
|
||||
The Galaxy driver shall implement `IRediscoverable` by polling `galaxy.time_of_last_deploy` on a configurable interval to detect when a new deployment has occurred.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Polls `SELECT time_of_last_deploy FROM galaxy` at a configurable interval (`GalaxyRepository:ChangeDetectionIntervalSeconds`, default 30 seconds).
|
||||
- Polls `SELECT time_of_last_deploy FROM galaxy` at a configurable interval (`Galaxy:ChangeDetectionIntervalSeconds`, default 30 seconds).
|
||||
- Compares the returned timestamp to the last known value stored in memory.
|
||||
- If different, triggers a rebuild (re-run hierarchy + attributes queries, notify OPC UA server).
|
||||
- First poll after startup always triggers an initial build.
|
||||
- If the query fails (SQL timeout, connection error), log Warning and retry at next interval. Do not trigger a rebuild on failure.
|
||||
- If different, raises the `IRediscoverable.RediscoveryNeeded` signal so the core re-runs `ITagDiscovery.DiscoverAsync` and surgically rebuilds the Galaxy namespace subtree (per OPC-017).
|
||||
- First poll after startup always triggers an initial discovery.
|
||||
- Query failure → Warning logged; no rediscovery triggered; retry at next interval.
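
The acceptance criteria above describe a small state machine, which the following Python sketch tries to capture. It is an illustration under assumptions, not the driver's `IRediscoverable` implementation: the class name `DeployWatcher`, the `read_last_deploy` callable (standing in for the SQL query), and the `on_rediscovery_needed` callback are hypothetical, and the timer that would call `poll_once` every interval is left out.

```python
import logging
from datetime import datetime
from typing import Callable, Optional

log = logging.getLogger("galaxy.changedetection")  # hypothetical logger name


class DeployWatcher:
    """Compares each polled deploy timestamp with the last one seen."""

    def __init__(
        self,
        read_last_deploy: Callable[[], datetime],
        on_rediscovery_needed: Callable[[], None],
    ):
        self._read = read_last_deploy
        self._notify = on_rediscovery_needed
        self._last_seen: Optional[datetime] = None  # nothing seen before first poll

    def poll_once(self) -> bool:
        """Run one poll; return True if rediscovery was signalled."""
        try:
            current = self._read()
        except Exception as exc:
            # Failure: warn, signal nothing, and let the next interval retry.
            log.warning("time_of_last_deploy query failed: %s", exc)
            return False
        if current != self._last_seen:  # exact equality, not a range check
            self._last_seen = current
            self._notify()
            return True
        return False
```

Starting with no stored timestamp means the first successful poll always differs from the stored value, which is one simple way to get the "first poll triggers an initial discovery" behavior without a separate flag.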
|
||||
|
||||
### Details
|
||||
|
||||
- Polling runs on a background timer thread, not blocking the STA thread.
|
||||
- `time_of_last_deploy` is a datetime column. Compare using exact equality (not range).
|
||||
- Polling runs on a background `Task` inside `OtOpcUa.Galaxy.Host`, not on the STA message-pump thread.
|
||||
- `time_of_last_deploy` is a `datetime` column; compared using exact equality (not a range).
|
||||
- Signal delivery to the Proxy happens via a server-push message on the Galaxy named pipe.
|
||||
|
||||
---
|
||||
|
||||
## GR-004: Rebuild on Change
|
||||
## GR-004: Rediscovery Data Flow
|
||||
|
||||
When a deployment change is detected, the system shall re-query hierarchy and attributes and provide the updated structure to the OPC UA server for address space rebuild.
|
||||
On a deployment change, the Galaxy driver shall re-query hierarchy + attributes and stream the updated structure to the core for surgical namespace rebuild.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- On change detection, re-query both hierarchy and attributes.
|
||||
- Provide the new data set to the OPC UA server component for address space replacement.
|
||||
- Log at Information level: "Galaxy deployment change detected. Rebuilding address space. ({ObjectCount} objects, {AttributeCount} attributes)".
|
||||
- Log total rebuild time at Information level.
|
||||
- If the re-query fails, log Error and keep the existing address space (do not clear it).
|
||||
- On change signal, re-run `GR-001` (hierarchy) and `GR-002` (attributes) queries.
|
||||
- Stream the new tree to the core via `IAddressSpaceBuilder` over the named pipe.
|
||||
- Log at Information level: `"Galaxy deployment change detected. Rebuilding. ({ObjectCount} objects, {AttributeCount} attributes)"`.
|
||||
- Log total rediscovery duration at Information level.
|
||||
- On re-query failure: Error logged; existing Galaxy subtree is retained.
|
||||
|
||||
### Details
|
||||
|
||||
- Rebuild is not atomic from the DB perspective — hierarchy and attributes are two separate queries. This is acceptable; deployment is an infrequent operation.
|
||||
- Raise an event/callback that the OPC UA server subscribes to: `OnGalaxyChanged(hierarchyData, attributeData)`.
|
||||
- Rediscovery is not atomic from the DB perspective — hierarchy and attributes are two separate queries. Acceptable; Galaxy deployment is an infrequent operation.
|
||||
- The core owns the diff/surgical apply per OPC-017; the Galaxy driver only streams the new authoritative tree.
|
||||
|
||||
---
|
||||
|
||||
## GR-005: Connection Configuration
|
||||
|
||||
Database connection parameters shall be configurable via appsettings.json (connection string using Windows Authentication by default).
|
||||
Galaxy DB connection parameters shall be configurable via environment variables passed from the `OtOpcUa.Galaxy.Host` supervisor at spawn time.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Connection string in `appsettings.json` under `GalaxyRepository:ConnectionString`.
|
||||
- Default: `Server=localhost;Database=ZB;Integrated Security=true` (Windows Auth).
|
||||
- ADO.NET `SqlConnection` used for queries (.NET Framework 4.8 built-in).
|
||||
- Connection string via `OTOPCUA_GALAXY_ZB_CONN` environment variable.
|
||||
- Default: `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` (Windows Auth).
|
||||
- ADO.NET `SqlConnection` used for queries (.NET Framework 4.8).
|
||||
- Connection is opened per-query (not kept open). Connection pooling handles efficiency.
|
||||
- If the initial connection test at startup fails, log Error with the connection string and continue attempting (change detection polls will keep retrying).
|
||||
- If the initial connection test at startup fails, log Error with the connection string sanitized and continue attempting (change-detection polls keep retrying).
|
||||
|
||||
### Details
|
||||
|
||||
- Command timeout: configurable via `GalaxyRepository:CommandTimeoutSeconds`, default 30 seconds.
|
||||
- No ORM. Raw ADO.NET with `SqlCommand` and `SqlDataReader`. SQL text is embedded as constants (not dynamically constructed).
|
||||
- Command timeout: `Galaxy:CommandTimeoutSeconds` in Config DB driver JSON (default 30 seconds).
|
||||
- No ORM. Raw ADO.NET with `SqlCommand` and `SqlDataReader`. SQL text embedded as constants.
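
As an illustration of the configuration rules above, the sketch below resolves the connection string from `OTOPCUA_GALAXY_ZB_CONN` with the documented default, and produces a log-safe copy. The requirement does not define what "sanitized" means; this sketch assumes it means dropping password keys. `GalaxyDbConfig`, `Resolve`, and `Sanitize` are hypothetical names; `DbConnectionStringBuilder` is the standard ADO.NET type.

```csharp
using System;
using System.Data.Common;

// Hypothetical helper; names are illustrative, not taken from the codebase.
public static class GalaxyDbConfig
{
    public const string EnvVar = "OTOPCUA_GALAXY_ZB_CONN";

    public const string Default =
        "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;";

    // getEnv is injected so the lookup can be exercised without touching the real environment.
    public static string Resolve(Func<string, string> getEnv)
    {
        string value = getEnv(EnvVar);
        return string.IsNullOrWhiteSpace(value) ? Default : value;
    }

    // Assumption: "sanitized" means the password never reaches the log.
    public static string Sanitize(string connectionString)
    {
        var builder = new DbConnectionStringBuilder { ConnectionString = connectionString };
        foreach (var key in new[] { "Password", "Pwd" })
        {
            builder.Remove(key); // keys are case-insensitive; a missing key is a no-op
        }
        return builder.ConnectionString;
    }
}
```

In the Host, `Resolve(Environment.GetEnvironmentVariable)` would supply the real value, and only `Sanitize(...)` output would be passed to the logger.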
|
||||
|
||||
---
|
||||
|
||||
## GR-006: Query Safety
|
||||
|
||||
All SQL queries shall be static read-only SELECT statements. No writes to the Galaxy Repository database.
|
||||
All Galaxy SQL queries shall be static read-only SELECT statements. No writes to the Galaxy Repository database.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
@@ -112,10 +119,23 @@ All SQL queries shall be static read-only SELECT statements. No writes to the Ga
|
||||
|
||||
## GR-007: Startup Validation
|
||||
|
||||
On startup, the Galaxy Repository component shall validate database connectivity.
|
||||
On startup, the Galaxy driver's DB component inside `OtOpcUa.Galaxy.Host` shall validate database connectivity.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Execute a simple test query (`SELECT 1`) against the configured database.
|
||||
- If the database is unreachable, log an Error but do not prevent service startup.
|
||||
- The service runs in degraded mode (empty address space) until the database becomes available and the next change detection poll succeeds.
|
||||
- Execute a simple test query (`SELECT 1`) against the configured Galaxy DB.
|
||||
- If the database is unreachable, log Error but do not prevent Host startup.
|
||||
- The Galaxy driver runs in degraded mode (empty SystemPlatform namespace) until the database becomes available and the next change-detection poll succeeds.
|
||||
- In degraded mode the Galaxy driver instance reports `DriverHealth.Unavailable`, so its Polly circuit stays open until the first successful discovery.
|
||||
|
||||
---
|
||||
|
||||
## GR-008: Capability Wrapping
|
||||
|
||||
All calls into the Galaxy DB component from the Proxy side shall route through `CapabilityInvoker.InvokeAsync(DriverCapability.Discover, …)`.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- `Driver.Galaxy.Proxy.DiscoverAsync` is a thin capability-invoker call that sends a MessagePack request over the named pipe to the Host's DB component.
|
||||
- Roslyn analyzer **OTOPCUA0001** validates there are no direct discovery calls bypassing the invoker.
|
||||
- Polly pipeline for `DriverCapability.Discover` on the Galaxy driver instance carries Timeout + Retry + CircuitBreaker.
|
||||
|
||||
@@ -1,6 +1,10 @@
|
||||
# MXAccess Client — Component Requirements
|
||||
# Galaxy Driver — MXAccess Client Requirements
|
||||
|
||||
Parent: [HLR-003](HighLevelReqs.md#hlr-003-mxaccess-runtime-data-access), [HLR-008](HighLevelReqs.md#hlr-008-connection-resilience)
|
||||
> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). Scope narrowed: this document covers the MXAccess surface **inside `OtOpcUa.Galaxy.Host`** (.NET Framework 4.8 x86 Windows service). The in-server `Driver.Galaxy.Proxy` implements the `IReadable` / `IWritable` / `ISubscribable` / `IAlarmSource` / `IHistoryProvider` capability interfaces and routes every wire call through the named pipe to this Host process. The STA thread + reconnect playback + subscription refcount requirements from v1 are preserved; what changed is where they live (Host service, not the Server process). MXA-010 (proxy-side wrapping) and MXA-011 (pipe ACL / shared secret) are new.
|
||||
|
||||
Parent: [HLR-002](HighLevelReqs.md#hlr-002-multi-driver-plug-in-model), [HLR-005](HighLevelReqs.md#hlr-005-live-data-access), [HLR-007](HighLevelReqs.md#hlr-007-service-hosting)
|
||||
|
||||
Driver scope: Galaxy only. Process scope: `OtOpcUa.Galaxy.Host` (Host side) and `Driver.Galaxy.Proxy` (server-side forwarder).
|
||||
|
||||
## MXA-001: STA Thread with Message Pump
|
||||
|
||||
@@ -8,165 +12,194 @@ All MXAccess COM objects shall be created and called on a dedicated STA thread r
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- A dedicated thread is created with `ApartmentState.STA` before any MXAccess COM objects are instantiated.
|
||||
- The thread runs a Win32 message pump using `GetMessage`/`TranslateMessage`/`DispatchMessage` loop.
|
||||
- A dedicated thread is created with `ApartmentState.STA` before any MXAccess COM object is instantiated; implementation lives in `StaPump` inside `OtOpcUa.Galaxy.Host`.
|
||||
- The thread runs a Win32 message pump using `GetMessage` / `TranslateMessage` / `DispatchMessage`.
|
||||
- Work items are marshalled to the STA thread via `PostThreadMessage(WM_APP)` and a concurrent queue.
|
||||
- The STA thread processes work items between message pump iterations.
|
||||
- All COM object creation (`LMXProxyServer` constructor), method calls, and event callbacks happen on this thread.
|
||||
- All COM object creation (`LMXProxyServer`), method calls, and event callbacks happen on this thread.
|
||||
- Thread name `Galaxy.Sta` (for diagnostics).
|
||||
|
||||
### Details
|
||||
|
||||
- Thread name: `MxAccess-STA` (for diagnostics).
|
||||
- If the STA thread dies unexpectedly, log Fatal and trigger service shutdown. Do not attempt to create a replacement thread (COM objects on the dead thread are unrecoverable).
|
||||
- `RunAsync(Action)` method returns a `Task` that completes when the action executes on the STA thread. Callers can `await` it.
|
||||
- If the STA thread dies unexpectedly, log Fatal and trigger Host service shutdown. The supervisor restarts the Host under its driver-stability policy (`docs/v2/driver-stability.md`). COM objects on the dead thread are unrecoverable; no in-process recovery is attempted.
|
||||
- `RunAsync(Action)` returns a `Task` that completes when the action executes on the STA thread. Callers can `await` it.
|
||||
|
||||
---
|
||||
|
||||
## MXA-002: Connection Lifecycle
|
||||
|
||||
The client shall support Register/Unregister lifecycle with the LMXProxyServer COM object, tracking the connection handle.
|
||||
The Host shall support Register/Unregister lifecycle with the `LMXProxyServer` COM object, tracking the connection handle.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- `Register(clientName)` is called on the STA thread and returns a positive connection handle on success.
|
||||
- If Register returns handle <= 0, throw with descriptive error.
|
||||
- Handle ≤ 0 → descriptive error thrown; Host reports `DriverHealth.Unavailable` via the pipe so the Proxy reports Bad quality to the core.
|
||||
- `Unregister(handle)` is called during disconnect after all subscriptions are removed.
|
||||
- Client name: configurable via `MxAccess:ClientName`, default `LmxOpcUa`. Must be unique per MXAccess registration.
|
||||
- Client name comes from `OTOPCUA_GALAXY_CLIENT_NAME` environment variable; default `OtOpcUa-Galaxy.Host`. Must be unique per MXAccess registration (a cluster's Primary and Secondary each get their own client-name suffix via node override).
|
||||
- Connection state transitions: Disconnected → Connecting → Connected → Disconnecting → Disconnected (and Error from any state).
|
||||
|
||||
### Details
|
||||
|
||||
- `ConnectedSince` timestamp (UTC) is recorded after successful Register.
|
||||
- `ReconnectCount` is tracked for diagnostics and dashboard display.
|
||||
- State change events are raised for dashboard and health check consumption.
|
||||
- `ConnectedSince` (UTC) recorded after successful Register.
|
||||
- `ReconnectCount` tracked for diagnostics and `/metrics`.
|
||||
- State changes are emitted over the pipe as `DriverHealth` updates.
|
||||
|
||||
---
|
||||
|
||||
## MXA-003: Tag Subscription
|
||||
|
||||
The client shall support subscribing to tags via AddItem + AdviseSupervisory, receiving value updates through OnDataChange callbacks.
|
||||
The Host shall support subscribing to tags via AddItem + AdviseSupervisory, receiving value updates through OnDataChange callbacks.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Subscribe sequence: `AddItem(handle, address)` returns item handle, then `AdviseSupervisory(handle, itemHandle)` starts the subscription.
|
||||
- `OnDataChange` callback delivers value, quality (integer), timestamp, and MXSTATUS_PROXY array.
|
||||
- `OnDataChange` callback delivers value, quality, timestamp, and MXSTATUS_PROXY array.
|
||||
- Item address format: `tag_name.AttributeName` for scalars, `tag_name.AttributeName[]` for whole arrays.
|
||||
- If AddItem fails (e.g., tag does not exist), log Warning and return failure to caller.
|
||||
- Bidirectional maps of `address ↔ itemHandle` are maintained for callback resolution.
|
||||
- AddItem failure → Warning logged, failure propagated over the pipe to the Proxy.
|
||||
- Bidirectional maps of `address ↔ itemHandle` maintained for callback resolution.
|
||||
- Multi-client refcounting: two Proxy-side subscribe calls for the same address produce one MXAccess subscription; refcount decrement on the last unsubscribe triggers `UnAdvise` / `RemoveItem`.
|
||||
|
||||
### Details
|
||||
|
||||
- Use `AdviseSupervisory` (not `Advise`) because this is a background service with no interactive user session. AdviseSupervisory allows secured/verified writes without user authentication.
|
||||
- Stored subscriptions dictionary maps address to callback for reconnect replay.
|
||||
- On reconnect, all entries in stored subscriptions are re-subscribed (AddItem + AdviseSupervisory with new handles).
|
||||
- `AdviseSupervisory` (not `Advise`) is used because this is a background service without an interactive user session.
|
||||
- Stored subscriptions dictionary maps address → callback for reconnect replay.
|
||||
- On reconnect, every entry in stored subscriptions is re-subscribed (AddItem + AdviseSupervisory with new handles).
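
The multi-client refcounting criterion (two subscribe calls for one address produce a single MXAccess subscription, and the last unsubscribe tears it down) can be sketched as follows. `SubscriptionRefCounter` is a hypothetical name; the actual `AddItem` / `AdviseSupervisory` / `UnAdvise` / `RemoveItem` calls are left to the caller, which would run them on the STA thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; names are illustrative, not taken from the codebase.
public sealed class SubscriptionRefCounter
{
    private readonly Dictionary<string, int> _counts =
        new Dictionary<string, int>(StringComparer.Ordinal);
    private readonly object _gate = new object();

    // Returns true when this is the first reference, i.e. the caller
    // should create the underlying subscription (AddItem + AdviseSupervisory).
    public bool Acquire(string address)
    {
        lock (_gate)
        {
            int count;
            _counts.TryGetValue(address, out count); // count stays 0 when absent
            _counts[address] = count + 1;
            return count == 0;
        }
    }

    // Returns true when the last reference was released, i.e. the caller
    // should tear the subscription down (UnAdvise + RemoveItem).
    public bool Release(string address)
    {
        lock (_gate)
        {
            int count;
            if (!_counts.TryGetValue(address, out count)) return false; // unknown address
            if (count > 1)
            {
                _counts[address] = count - 1;
                return false;
            }
            _counts.Remove(address);
            return true;
        }
    }
}
```

Returning a boolean instead of invoking COM calls inside the lock keeps the counter independent of the STA marshalling path.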
|
||||
|
||||
---
|
||||
|
||||
## MXA-004: Tag Read/Write
|
||||
|
||||
The client shall support synchronous-style read and write operations, marshalled to the STA thread, with configurable timeouts.
|
||||
The Host shall support synchronous-style read and write operations, marshalled to the STA thread, with configurable timeouts.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Read: implemented as subscribe-get-first-value-unsubscribe pattern (AddItem → AdviseSupervisory → wait for OnDataChange → UnAdvise → RemoveItem).
|
||||
- Read pattern: prefer cached subscription value; fall back to subscribe-get-first-value-unsubscribe (AddItem → AdviseSupervisory → wait for OnDataChange → UnAdvise → RemoveItem).
|
||||
- Write: AddItem → AdviseSupervisory → `Write()` → await `OnWriteComplete` callback → cleanup.
|
||||
- Read timeout: configurable via `MxAccess:ReadTimeoutSeconds`, default 5 seconds.
|
||||
- Write timeout: configurable via `MxAccess:WriteTimeoutSeconds`, default 5 seconds. On timeout, log Warning and return timeout error.
|
||||
- Concurrent operation limit: configurable semaphore via `MxAccess:MaxConcurrentOperations`, default 10.
|
||||
- Read timeout: `Galaxy:ReadTimeoutSeconds` in driver config (default 5 seconds) — enforced on the Host side in addition to the Proxy-side Polly `Timeout` leg.
|
||||
- Write timeout: `Galaxy:WriteTimeoutSeconds` (default 5 seconds) — enforced similarly.
|
||||
- Concurrent operation limit: configurable semaphore (`Galaxy:MaxConcurrentOperations`, default 10).
|
||||
- All operations marshalled to the STA thread.
|
||||
|
||||
### Details
|
||||
|
||||
- Write uses security classification -1 (no security). Galaxy runtime handles security enforcement.
|
||||
- `OnWriteComplete` callback: check MXSTATUS_PROXY `success` field. If 0, extract detail code and propagate error.
|
||||
- COM exceptions (`COMException` with HRESULT) are caught and translated to meaningful error messages.
|
||||
- Write uses security classification `-1` (no security). Galaxy runtime enforces security; OtOpcUa authorization is enforced server-side before the call ever reaches the pipe (per OPC-014 `AuthorizationGate`).
|
||||
- `OnWriteComplete`: check `MXSTATUS_PROXY.success`. If 0, extract detail code and propagate as an error over the pipe.
|
||||
- COM exceptions translated to meaningful error messages.
|
||||
|
||||
---
|
||||
|
||||
## MXA-005: Auto-Reconnect
|
||||
|
||||
The client shall monitor connection health and automatically reconnect on failure, replaying all stored subscriptions after reconnect.
|
||||
The Host shall monitor connection health and automatically reconnect on failure, replaying all stored subscriptions after reconnect.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Monitor loop runs on a background thread, checking connection health at configurable interval (`MxAccess:MonitorIntervalSeconds`, default 5 seconds).
|
||||
- If disconnected, attempt reconnect. On success, replay all stored subscriptions.
|
||||
- On reconnect failure, log Warning and retry at next interval (no exponential backoff — reconnect as quickly as possible on a plant-floor service).
|
||||
- Monitor loop runs on a background thread at `Galaxy:MonitorIntervalSeconds` (default 5 seconds).
|
||||
- On disconnect, attempt reconnect. On success, replay all stored subscriptions.
|
||||
- On reconnect failure, log Warning and retry at next interval (no exponential backoff inside the Host; the Proxy-side Polly pipeline handles cross-process backoff against pipe failures).
|
||||
- Reconnect count is incremented on each successful reconnect.
|
||||
- Monitor loop is cancellable (for clean shutdown).
|
||||
- Monitor loop is cancellable for clean Host shutdown.
|
||||
|
||||
### Details
|
||||
|
||||
- Reconnect cleans up old COM objects before creating new ones.
|
||||
- After reconnect, probe subscription is re-established first, then stored subscriptions.
|
||||
- No max retry limit — keep trying indefinitely until service is stopped.
|
||||
- After reconnect, probe subscription (MXA-006) is re-established first, then stored subscriptions.
|
||||
- No max retry limit — keep trying indefinitely until the Host service is stopped.
|
||||
|
||||
---
|
||||
|
||||
## MXA-006: Probe-Based Health Monitoring
|
||||
|
||||
The client shall optionally subscribe to a configurable probe tag and use OnDataChange callback staleness to detect silent connection failures.
|
||||
The Host shall optionally subscribe to a configurable probe tag and use OnDataChange callback staleness to detect silent connection failures.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Subscribe to a configurable probe tag (a known-good Galaxy attribute that changes periodically).
|
||||
- Probe tag address configured via `Galaxy:ProbeTag`. If unset, probe monitoring is disabled.
|
||||
- Track `_lastProbeValueTime` (UTC) updated on each OnDataChange for the probe tag.
|
||||
- If `DateTime.UtcNow - _lastProbeValueTime > staleThreshold`, force disconnect and reconnect.
|
||||
- Probe tag address: configurable via `MxAccess:ProbeTag`. If not configured, probe monitoring is disabled.
|
||||
- Stale threshold: configurable via `MxAccess:ProbeStaleThresholdSeconds`, default 60 seconds.
|
||||
- Stale threshold: `Galaxy:ProbeStaleThresholdSeconds` (default 60 seconds).
|
||||
- Implements `IHostConnectivityProbe` on the Proxy side so the core's `CapabilityInvoker` records probe outcomes with `DriverCapability.Probe` telemetry.
|
||||
|
||||
### Details
|
||||
|
||||
- The probe tag should be an attribute that the Galaxy runtime updates regularly (e.g., a platform heartbeat or area-level timestamp). The specific tag is site-dependent.
|
||||
- After forced reconnect, reset `_lastProbeValueTime` to `DateTime.UtcNow` to give the new connection a full threshold window.
|
||||
- The probe tag should be an attribute the Galaxy runtime updates regularly (platform heartbeat, area timestamp). Specific tag is site-dependent.
|
||||
- After forced reconnect, reset `_lastProbeValueTime` to `DateTime.UtcNow`.
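
The staleness rule (`DateTime.UtcNow - _lastProbeValueTime > staleThreshold`, with a reset after a forced reconnect) might be isolated as shown below. This is a sketch under assumptions: `ProbeStalenessMonitor` and the injected clock are hypothetical, and the timestamp is held as ticks behind `Interlocked` because the Host is an x86 process, where a plain 64-bit field is not read or written atomically.

```csharp
using System;
using System.Threading;

// Hypothetical helper; names are illustrative, not taken from the codebase.
public sealed class ProbeStalenessMonitor
{
    private readonly TimeSpan _staleThreshold;
    private readonly Func<DateTime> _utcNow;
    private long _lastProbeTicks;

    public ProbeStalenessMonitor(TimeSpan staleThreshold, Func<DateTime> utcNow)
    {
        _staleThreshold = staleThreshold;
        _utcNow = utcNow;
        _lastProbeTicks = utcNow().Ticks; // a new connection gets a full window
    }

    // Call from the probe tag's OnDataChange handler and after a forced reconnect.
    public void MarkFresh()
    {
        Interlocked.Exchange(ref _lastProbeTicks, _utcNow().Ticks);
    }

    // True when the probe has been silent for strictly longer than the threshold.
    public bool IsStale()
    {
        long last = Interlocked.Read(ref _lastProbeTicks);
        return _utcNow().Ticks - last > _staleThreshold.Ticks;
    }
}
```

The monitor loop from MXA-005 would call `IsStale()` on each tick and force a disconnect/reconnect when it returns true.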
|
||||
|
||||
---
|
||||
|
||||
## MXA-007: COM Cleanup
|
||||
|
||||
On disconnect or disposal, the client shall unwire event handlers, unadvise/remove all items, unregister, and release COM objects via Marshal.ReleaseComObject.
|
||||
On disconnect or disposal, the Host shall unwire event handlers, unadvise/remove all items, unregister, and release COM objects via `Marshal.ReleaseComObject`.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Cleanup order: UnAdvise all active subscriptions → RemoveItem all items → unwire OnDataChange and OnWriteComplete event handlers → Unregister → `Marshal.ReleaseComObject`.
|
||||
- Cleanup order: UnAdvise all active subscriptions → RemoveItem all items → unwire OnDataChange and OnWriteComplete handlers → Unregister → `Marshal.ReleaseComObject`.
|
||||
- On dispose: run disconnect if still connected, then dispose STA thread.
|
||||
- Each cleanup step is wrapped in try/catch (cleanup must not throw).
|
||||
- After cleanup: handle maps are cleared, pending write TCS entries are abandoned, COM reference is set to null.
|
||||
- Each cleanup step wrapped in try/catch (cleanup must not throw).
|
||||
- After cleanup: handle maps cleared, pending write TCS entries abandoned, COM reference set to null.
|
||||
|
||||
### Details
|
||||
|
||||
- `_storedSubscriptions` is NOT cleared on disconnect (preserved for reconnect replay). Only cleared on Dispose.
|
||||
- Event handlers must be unwired BEFORE Unregister, or callbacks may fire on a dead object.
|
||||
- `Marshal.ReleaseComObject` in a finally block, always, even if earlier steps fail.
|
||||
- Stored subscriptions are NOT cleared on disconnect (preserved for reconnect replay). Only cleared on Dispose.
|
||||
- Event handlers unwired BEFORE Unregister (else callbacks may fire on a dead object).
|
||||
- `Marshal.ReleaseComObject` in a `finally` block, always.
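
One way to express the cleanup contract above (fixed step order, every step guarded, release always attempted) is a small runner that takes the steps as delegates. This is only a sketch: `ComCleanup.Run` is a hypothetical helper, and the real steps (UnAdvise, RemoveItem, handler unwiring, Unregister, `Marshal.ReleaseComObject`) are supplied by the caller.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; names are illustrative, not taken from the codebase.
public static class ComCleanup
{
    // Runs each named step in order. A failing step is logged and never
    // prevents the later steps; the release step runs in a finally block.
    public static void Run(
        IEnumerable<KeyValuePair<string, Action>> steps,
        Action release,
        Action<string, Exception> log)
    {
        try
        {
            foreach (var step in steps)
            {
                try { step.Value(); }
                catch (Exception ex) { log(step.Key, ex); }
            }
        }
        finally
        {
            // Release is itself guarded: cleanup must not throw.
            try { release(); }
            catch (Exception ex) { log("ReleaseComObject", ex); }
        }
    }
}
```

Passing the steps as an ordered list keeps the documented sequence visible at the call site, so a reviewer can compare it against the acceptance criterion line by line.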
|
||||
|
||||
---
|
||||
|
||||
## MXA-008: Operation Metrics
|
||||
|
||||
The MXAccess client shall record timing and success/failure for Read, Write, and Subscribe operations.
|
||||
The MXAccess Host shall record timing and success/failure for Read, Write, and Subscribe operations.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Each operation records: duration (ms), success/failure.
|
||||
- Metrics are available for the status dashboard: count, success rate, avg/min/max/P95 latency.
|
||||
- Uses a rolling 1000-entry buffer for percentile calculation.
|
||||
- Metrics are exposed via a queryable interface consumed by the status report service.
|
||||
|
||||
### Details
|
||||
|
||||
- Uses an `ITimingScope` pattern: `using (var scope = metrics.BeginOperation("read")) { ... }` for automatic timing and success tracking.
|
||||
- Metrics are periodically logged at Debug level for diagnostics.
|
||||
- Each operation records duration (ms) + success/failure.
|
||||
- Metrics exposed over the pipe to the Proxy, which re-publishes them via OpenTelemetry → Prometheus under `DriverInstanceId = "galaxy-*"`, `HostName = "galaxy.host"`.
|
||||
- Rolling 1000-entry buffer for percentile calculation.
|
||||
- Uses an `ITimingScope` pattern: `using (var scope = metrics.BeginOperation("read")) { ... }`.
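
The rolling 1000-entry buffer could be implemented as a fixed-size ring that overwrites the oldest sample. The sketch below is an assumption about the shape, not the Host's actual metrics class: `RollingLatencyBuffer` is a hypothetical name, and the percentile uses the nearest-rank method, which the requirement does not prescribe.

```csharp
using System;

// Hypothetical helper; names are illustrative, not taken from the codebase.
public sealed class RollingLatencyBuffer
{
    private readonly double[] _samples;
    private readonly object _gate = new object();
    private int _next;   // index the next sample is written to
    private int _count;  // number of valid samples, capped at capacity

    public RollingLatencyBuffer(int capacity = 1000)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _samples = new double[capacity];
    }

    public void Record(double durationMs)
    {
        lock (_gate)
        {
            _samples[_next] = durationMs;          // overwrites the oldest entry once full
            _next = (_next + 1) % _samples.Length;
            if (_count < _samples.Length) _count++;
        }
    }

    // Nearest-rank percentile (p in 0..100) over the retained samples; 0 when empty.
    public double Percentile(double p)
    {
        double[] copy;
        lock (_gate)
        {
            if (_count == 0) return 0;
            copy = new double[_count];
            Array.Copy(_samples, copy, _count);    // valid samples always occupy [0, _count)
        }
        Array.Sort(copy);
        int rank = (int)Math.Ceiling(p * copy.Length / 100.0);
        return copy[Math.Max(rank, 1) - 1];
    }
}
```

Sorting a copy outside the lock keeps `Record` cheap on the hot path; at 1000 entries the sort cost per metrics scrape is small.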
|
||||
|
||||
---
|
||||
|
||||
## MXA-009: Error Code Translation
|
||||
|
||||
The client shall translate known MXAccess error codes from MXSTATUS_PROXY.detail into human-readable messages for logging and OPC UA status propagation.
|
||||
The Host shall translate known MXAccess error codes from `MXSTATUS_PROXY.detail` into human-readable messages for logging and OPC UA status propagation.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Error 1008 → "User lacks security permission"
|
||||
- Error 1012 → "Secured write required (one signature)"
|
||||
- Error 1013 → "Verified write required (two signatures)"
|
||||
- Unknown error codes are logged with their numeric value.
|
||||
- Translated messages are included in OPC UA StatusCode descriptions and log entries.
|
||||
- Unknown error codes logged with their numeric value.
|
||||
- Translated messages flow back through the pipe and surface in OPC UA `StatusCode` descriptions and Server logs.
|
||||
- Errors 1008 / 1012 / 1013 on write operations map to `Bad_UserAccessDenied` at the OPC UA surface.
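
The translation table above is small enough to sketch directly. `MxErrorTranslator` and its members are hypothetical names; the message strings and the three codes are taken from the acceptance criteria, and the wording of the fallback message for unknown codes is an assumption.

```csharp
using System.Collections.Generic;

// Hypothetical helper; names are illustrative, not taken from the codebase.
public static class MxErrorTranslator
{
    private static readonly Dictionary<int, string> Known = new Dictionary<int, string>
    {
        { 1008, "User lacks security permission" },
        { 1012, "Secured write required (one signature)" },
        { 1013, "Verified write required (two signatures)" },
    };

    // Human-readable text for logs and StatusCode descriptions.
    public static string Describe(int detail)
    {
        string message;
        return Known.TryGetValue(detail, out message)
            ? message
            : "Unknown MXAccess error " + detail; // unknown codes keep their numeric value
    }

    // True when a write failure with this detail code should surface
    // as Bad_UserAccessDenied at the OPC UA layer.
    public static bool IsAccessDenied(int detail)
    {
        return detail == 1008 || detail == 1012 || detail == 1013;
    }
}
```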
|
||||
|
||||
---
|
||||
|
||||
## MXA-010: Proxy-Side Capability Wrapping
|
||||
|
||||
`Driver.Galaxy.Proxy` shall implement the capability interfaces as thin forwarders that serialize every call through the named pipe and route every call through `CapabilityInvoker`.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- `Driver.Galaxy.Proxy` implements `IDriver` + `IReadable` + `IWritable` + `ISubscribable` + `ITagDiscovery` + `IRediscoverable` + `IAlarmSource` + `IHistoryProvider` + `IHostConnectivityProbe`.
|
||||
- Each implementation uses `CapabilityInvoker.InvokeAsync(DriverCapability.<...>, …)` — direct pipe calls bypassing the invoker are caught by Roslyn **OTOPCUA0001**.
|
||||
- Each method serializes a MessagePack request frame, sends over the pipe, awaits the response frame, deserializes, returns.
|
||||
- Pipe disconnect mid-call → `CapabilityInvoker`'s circuit breaker counts the failure; sustained disconnect opens the circuit and Galaxy nodes surface Bad quality until the pipe reconnects.
|
||||
- Proxy tolerates Host service restarts — it automatically reconnects and replays subscription setup (parallel to MXA-005 but across the IPC boundary).
|
||||
|
||||
---
|
||||
|
||||
## MXA-011: Pipe Security
|
||||
|
||||
The named pipe between Proxy and Host shall be restricted to the Server's runtime principal via SID-based ACL and authenticated with a per-process shared secret.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Pipe name from `OTOPCUA_GALAXY_PIPE` environment variable; default `OtOpcUaGalaxy`.
|
||||
- Allowed SID passed as `OTOPCUA_ALLOWED_SID` — only the declared principal (typically the Server service account) can open the pipe; `Administrators` is explicitly NOT granted (per the `project_galaxy_host_installed` memory note).
|
||||
- Shared secret passed via `OTOPCUA_GALAXY_SECRET` at spawn time; the Proxy must present the matching secret on the opening handshake.
|
||||
- Secret is process-scoped (regenerated per Host restart) and never persisted to disk or Config DB.
|
||||
- Pipe ACL denials are logged as Warning with the rejected principal SID.
|
||||
|
||||
### Details
|
||||
|
||||
- Environment variables are passed by the supervisor launching the Host (`docs/v2/driver-stability.md`).
|
||||
- Dev-box secret is stored at `.local/galaxy-host-secret.txt` for NSSM-wrapped development runs (memory note: `project_galaxy_host_installed`).
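
The requirement does not specify how the Host compares the presented secret on the opening handshake. A reasonable sketch, assuming the secret travels as a UTF-8 string, is a comparison whose running time does not depend on where the first mismatch occurs. `PipeHandshake.SecretMatches` is a hypothetical name; the loop is hand-written because the Host targets .NET Framework 4.8, where `CryptographicOperations.FixedTimeEquals` is not available.

```csharp
using System;
using System.Text;

// Hypothetical helper; names are illustrative, not taken from the codebase.
public static class PipeHandshake
{
    // expected: value of OTOPCUA_GALAXY_SECRET as seen by the Host.
    // presented: secret sent by the Proxy on the opening handshake.
    public static bool SecretMatches(string expected, string presented)
    {
        // A missing or empty expected secret never authenticates anyone.
        if (string.IsNullOrEmpty(expected) || presented == null) return false;

        byte[] a = Encoding.UTF8.GetBytes(expected);
        byte[] b = Encoding.UTF8.GetBytes(presented);

        // Accumulate differences instead of returning at the first mismatch.
        int diff = a.Length ^ b.Length;
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            diff |= a[i] ^ b[i];
        }
        return diff == 0;
    }
}
```

The comparison still reveals the secret's length through timing, which is generally considered acceptable for fixed-length random secrets.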
|
||||
|
||||
@@ -1,234 +1,266 @@
|
||||
# OPC UA Server — Component Requirements
|
||||
|
||||
Parent: [HLR-001](HighLevelReqs.md#hlr-001-opc-ua-server), [HLR-002](HighLevelReqs.md#hlr-002-galaxy-hierarchy-as-opc-ua-address-space), [HLR-004](HighLevelReqs.md#hlr-004-data-type-mapping)
|
||||
> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). OPC-001…OPC-013 have been rewritten driver-agnostically — they now describe how the core OPC UA server composes multiple driver subtrees, enforces authorization, and invokes capabilities through the Polly-wrapped dispatch path. OPC-014 through OPC-022 are new and cover capability dispatch, per-host Polly isolation, idempotence-aware write retry, `AuthorizationGate`, `ServiceLevel` reporting, the alarm surface, history surface, server-certificate management, and the transport-security profile matrix. Galaxy-specific behavior has been moved out to `GalaxyRepositoryReqs.md` and `MxAccessClientReqs.md`.
|
||||
|
||||
Parent: [HLR-001](HighLevelReqs.md#hlr-001-opc-ua-server), [HLR-003](HighLevelReqs.md#hlr-003-address-space-composition-per-namespace), [HLR-009](HighLevelReqs.md#hlr-009-transport-security-and-authentication), [HLR-010](HighLevelReqs.md#hlr-010-per-driver-instance-resilience), [HLR-013](HighLevelReqs.md#hlr-013-cluster-redundancy)
|
||||
|
||||
## OPC-001: Server Endpoint
|
||||
|
||||
The OPC UA server shall listen on a configurable TCP port (default 4840) using the OPC Foundation .NET Standard stack.
|
||||
The OPC UA server shall listen on a configurable TCP endpoint using the OPC Foundation .NET Standard stack and expose a single endpoint URL per cluster node.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Server starts and accepts TCP connections on the configured port.
|
||||
- Port is read from `appsettings.json` under `OpcUa:Port`; defaults to 4840 if absent.
|
||||
- Endpoint URL format: `opc.tcp://<hostname>:<port>/LmxOpcUa`.
|
||||
- If the port is in use at startup, log an Error and fail to start (do not silently pick another port).
|
||||
- Security policy: None (no certificate validation). This is an internal plant-floor service.
|
||||
- Endpoint URL comes from `ClusterNode.EndpointUrl` in the Config DB (default form `opc.tcp://<hostname>:<port>/OtOpcUa`).
|
||||
- `ApplicationName` and `ApplicationUri` come from `ClusterNode` fields; `ApplicationUri` is unique per node so redundancy `ServerUriArray` entries are distinguishable.
|
||||
- Port defaults to 4840. If the port is in use at startup the server shall log Error and fail to start (no silent port reassignment).
|
||||
- Uses `OPCFoundation.NetStandard.Opc.Ua.Server` NuGet.
|
||||
- Endpoint URL logged at Information level on startup.
|
||||
|
||||
### Details
|
||||
|
||||
- Configurable items: port (default 4840), endpoint path (default `/LmxOpcUa`), server application name (default `LmxOpcUa`).
|
||||
- Server shall use the `OPCFoundation.NetStandard.Opc.Ua.Server` NuGet package.
|
||||
- On startup, log the endpoint URL at Information level.
|
||||
- Node-local `appsettings.json` only carries the `Config DB connection + NodeId + ClusterId` bootstrap — actual endpoint topology comes from the Config DB per HLR-011.
|
||||
|
||||
---
|
||||
|
||||
## OPC-002: Address Space Structure
|
||||
## OPC-002: Address Space Composition
|
||||
|
||||
The server shall create folder nodes for areas and object nodes for automation objects, organized in the same parent-child hierarchy as the Galaxy.
|
||||
The server shall compose an address space by mounting each active driver instance's subtree under a dedicated OPC UA namespace.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- The root folder node has BrowseName `ZB` (hardcoded Galaxy name).
|
||||
- Objects where `is_area = 1` are created as FolderType nodes (organizational).
|
||||
- Objects where `is_area = 0` are created as BaseObjectType nodes.
|
||||
- Parent-child relationships use Organizes references (for areas) and HasComponent references (for contained objects).
|
||||
- A client browsing Root → Objects → ZB → DEV → TestArea → TestMachine_001 → DelmiaReceiver sees the same structure as `gr/layout.md`.
|
||||
|
||||
### Details
|
||||
|
||||
- NodeIds use a string-based identifier scheme: `ns=1;s=<tag_name>` for object nodes, `ns=1;s=<tag_name>.<attribute_name>` for variable nodes.
|
||||
- Infrastructure objects (AppEngines, Platforms) are included in the tree but may have no variable children.
|
||||
- When `contained_name` is null or empty, fall back to `tag_name` as the BrowseName.
|
||||
- Each `DriverInstance` in the current published generation registers one `IDriver` implementation in the core.
|
||||
- Each driver's `ITagDiscovery.DiscoverAsync` result is streamed into the core via `IAddressSpaceBuilder` — `AddFolder` / `AddVariable` calls; the driver does not buffer the whole tree.
|
||||
- Each driver instance gets its own namespace index; `NamespaceUri` comes from the `Namespace` row in the Config DB.
|
||||
- Each cluster has at most one namespace per `Kind` (`Equipment`, `SystemPlatform`, future `Simulated`); enforced by UNIQUE on `(ClusterId, Kind)` in the DB.
|
||||
- Galaxy driver subtree preserves the contained-name browse structure from the deployed Galaxy (moved to `GalaxyRepositoryReqs.md`).
|
||||
- Equipment-kind drivers populate the canonical 5-level UNS structure (`Enterprise/Site/Area/Line/Equipment`), with each signal as a leaf under its `Equipment` node.
|
||||
|
||||
---
|
||||
|
||||
## OPC-003: Variable Nodes for Attributes
|
||||
## OPC-003: Variable Nodes and Access Levels
|
||||
|
||||
Each user-defined attribute on a deployed object shall be represented as an OPC UA variable node under its parent object node.
|
||||
Each tag produced by a driver's `ITagDiscovery` shall become an OPC UA variable node.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Each row from `attributes.sql` creates one variable node under the matching object node (matched by `gobject_id`).
|
||||
- Variable node BrowseName and DisplayName are set to `attribute_name`.
|
||||
- Variable node stores `full_tag_reference` as its runtime MXAccess address.
|
||||
- Variable node AccessLevel is set based on the attribute's `security_classification` per the mapping in `gr/data_type_mapping.md`.
|
||||
- FreeAccess (0), Operate (1), Tune (4), Configure (5) → AccessLevel = CurrentRead | CurrentWrite (3).
|
||||
- SecuredWrite (2), VerifiedWrite (3), ViewOnly (6) → AccessLevel = CurrentRead (1).
|
||||
- Objects with no user-defined attributes still appear as object nodes with zero children.
|
||||
|
||||
### Details
|
||||
|
||||
- Security classification determines the OPC UA AccessLevel and UserAccessLevel attributes on each variable node. The OPC UA stack enforces read-only access for nodes with CurrentRead-only access level.
|
||||
- Attributes whose names start with `_` are already filtered by the SQL query.
|
||||
- Variable node `BrowseName` and `DisplayName` come from `DriverAttributeInfo`.
|
||||
- `DataType` is resolved from `DriverDataType` per each driver's spec in `docs/v2/driver-specs.md`.
|
||||
- `AccessLevel` and `UserAccessLevel` are derived from the tag's `SecurityClassification` and the session's effective permissions walked through the node-ACL permission trie (see OPC-014 `AuthorizationGate`).
|
||||
- Scalar attributes produce `ValueRank = Scalar`; array attributes produce `ValueRank = OneDimension` with `ArrayDimensions` set from the driver's attribute info.
|
||||
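The classification-to-AccessLevel mapping and the ValueRank rule can be illustrated with a minimal Python sketch. It is not the server's code: the numeric classification values and the read/write split are taken from the Galaxy acceptance criteria listed above, the function names are hypothetical, and the per-session narrowing of `UserAccessLevel` through the permission trie is deliberately left out.

```python
from enum import IntEnum

# OPC UA AccessLevel bit flags: CurrentRead = 1, CurrentWrite = 2.
CURRENT_READ = 0x01
CURRENT_WRITE = 0x02


class SecurityClassification(IntEnum):
    # Numeric values as listed in the Galaxy criteria above.
    FREE_ACCESS = 0
    OPERATE = 1
    SECURED_WRITE = 2
    VERIFIED_WRITE = 3
    TUNE = 4
    CONFIGURE = 5
    VIEW_ONLY = 6


# Classifications that map to CurrentRead | CurrentWrite.
_WRITABLE = frozenset({
    SecurityClassification.FREE_ACCESS,
    SecurityClassification.OPERATE,
    SecurityClassification.TUNE,
    SecurityClassification.CONFIGURE,
})


def access_level(classification: SecurityClassification) -> int:
    """Return the AccessLevel bit mask for a classification (hypothetical helper)."""
    level = CURRENT_READ
    if classification in _WRITABLE:
        level |= CURRENT_WRITE
    return level


def value_rank(is_array: bool) -> int:
    """Scalar is -1 and OneDimension is 1."""
    return 1 if is_array else -1
```

For example, `access_level(SecurityClassification.TUNE)` yields 3 (read and write), while `access_level(SecurityClassification.VIEW_ONLY)` yields 1 (read only).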
|
||||
---
|
||||
|
||||
## OPC-004: Browse Name Translation
|
||||
## OPC-004: Namespace Index Allocation
|
||||
|
||||
Browse names shall use contained names (human-readable, scoped to parent). The server shall internally translate browse paths to tag_name references for MXAccess operations.
|
||||
The server shall register one OPC UA namespace per active driver instance.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- A variable node browsed as `ZB/DEV/TestArea/TestMachine_001/DelmiaReceiver/DownloadPath` correctly translates to MXAccess reference `DelmiaReceiver_001.DownloadPath`.
|
||||
- Translation uses the `tag_name` stored on the parent object node, not the browse path.
|
||||
- No runtime path parsing — the mapping is baked into each node at build time.
|
||||
|
||||
### Details
|
||||
|
||||
- Each variable node stores its `full_tag_reference` (e.g., `DelmiaReceiver_001.DownloadPath`) at address-space build time. Read/write operations use this stored reference directly.
|
||||
- Namespace index 0 remains the standard OPC UA namespace.
|
||||
- Each driver instance's `Namespace.Uri` becomes a registered namespace; its index is assigned deterministically at startup from the published generation's driver ordering.
|
||||
- All variable NodeIds use the driver's namespace index; NodeId identifiers are string-shaped and stable across restarts of the same generation.
|
||||
- Namespace index reshuffles are a publish-time concern; clients reconciling server-relative NodeIds must re-resolve namespace URIs after a new generation is applied.
|
||||
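A short Python sketch of deterministic index allocation, and of why clients must re-resolve namespace URIs after a new generation. It assumes the standard OPC UA convention that index 1 holds the server's own URI, which this document does not state; `build_namespace_table` and `resolve_node_id` are hypothetical names.

```python
OPC_UA_NAMESPACE = "http://opcfoundation.org/UA/"


def build_namespace_table(server_uri, driver_namespace_uris):
    """Build the namespace array: index 0 is the OPC UA namespace, index 1 the
    server URI (assumption), then one entry per driver in published order."""
    table = [OPC_UA_NAMESPACE, server_uri]
    for uri in driver_namespace_uris:
        if uri in table:
            raise ValueError(f"duplicate namespace URI: {uri}")
        table.append(uri)
    return table


def resolve_node_id(table, namespace_uri, identifier):
    """Turn a (namespace URI, string identifier) pair into a server-relative NodeId."""
    return f"ns={table.index(namespace_uri)};s={identifier}"


# The same generation always yields the same indexes.
gen1 = build_namespace_table("urn:example:server", ["urn:example:a", "urn:example:b"])
# A new generation that reorders drivers reshuffles the indexes.
gen2 = build_namespace_table("urn:example:server", ["urn:example:b", "urn:example:a"])
```

A client that cached `ns=3;s=Line1.Speed` against `gen1` would point at the wrong namespace under `gen2`; resolving through the URI gives `ns=2;s=Line1.Speed` instead.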
|
||||
---
|
||||
|
||||
## OPC-005: Data Type Mapping
|
||||
## OPC-005: Read Operations
|
||||
|
||||
Variable nodes shall use OPC UA data types mapped from Galaxy mx_data_type values per the mapping in `gr/data_type_mapping.md`.
|
||||
The server shall fulfill OPC UA `Read` requests by invoking `IReadable.ReadAsync` on the target driver instance, dispatched through `CapabilityInvoker`.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Every `mx_data_type` value in the mapping table produces the correct OPC UA DataType NodeId on the variable node.
|
||||
- Unknown/unmapped `mx_data_type` values default to String (i=12).
|
||||
- ElapsedTime (type 7) maps to Double representing seconds.
|
||||
|
||||
### Details
|
||||
|
||||
- Full mapping table in `gr/data_type_mapping.md`.
|
||||
- DateTime conversion: Galaxy may store local time; convert to UTC for OPC UA.
|
||||
- LocalizedText (type 15): use empty locale string with the text value.
|
||||
- Every read call at dispatch passes through `Core.Resilience.CapabilityInvoker.InvokeAsync(DriverCapability.Read, …)`.
|
||||
- Returned `DataValueSnapshot` is converted to an OPC UA `DataValue` with `StatusCode`, source timestamp, and server timestamp.
|
||||
- If the owning driver instance's Polly circuit is open, the read returns Bad quality immediately without hitting the wire.
|
||||
- Reads on a node the session has no `Read` bit for in the permission trie return `Bad_UserAccessDenied` before the capability is invoked (OPC-014).
|
||||
- Read timeout is the Polly timeout leg on the `Read` capability; its duration is per-`(DriverInstanceId, HostName)` and comes from the Config DB.
|
||||
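The ordering of checks on the read path (authorization first, then circuit state, then the driver call) can be sketched as follows. This is an illustration in Python under stated assumptions, not the `CapabilityInvoker` implementation: `dispatch_read`, `ReadResult`, and the plain `"Bad"` status string are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Any, Callable, Set


@dataclass
class ReadResult:
    status: str
    value: Any = None


def dispatch_read(session_bits: Set[str], circuit_open: bool,
                  read_from_driver: Callable[[], Any]) -> ReadResult:
    """Apply the checks in the order the criteria above describe."""
    # 1. Authorization is checked before the capability is invoked.
    if "Read" not in session_bits:
        return ReadResult("Bad_UserAccessDenied")
    # 2. An open circuit short-circuits without hitting the wire.
    if circuit_open:
        return ReadResult("Bad")
    # 3. Only now is the driver actually called.
    return ReadResult("Good", read_from_driver())
```

In both rejection cases `read_from_driver` is never called, which is the property the acceptance criteria ask for.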
|
||||
---
|
||||
|
||||
## OPC-006: Array Support
|
||||
## OPC-006: Write Operations
|
||||
|
||||
Attributes marked as arrays shall have ValueRank=1 and ArrayDimensions set to the attribute's array_dimension value.
|
||||
The server shall fulfill OPC UA `Write` requests by invoking `IWritable.WriteAsync` through `CapabilityInvoker` with **idempotence-aware** retry policy.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- `is_array = 1` produces ValueRank = 1 (OneDimension) and ArrayDimensions = `[array_dimension]`.
|
||||
- `is_array = 0` produces ValueRank = -1 (Scalar) and no ArrayDimensions.
|
||||
- MXAccess reference for array attributes uses `tag_name.attribute[]` (whole array) format.
|
||||
|
||||
### Details
|
||||
|
||||
- Individual array element access (`tag_name.attribute[n]`) is not required for initial implementation. Whole-array read/write only.
|
||||
- If `array_dimension` is null or 0 when `is_array = 1`, log a Warning and default to ArrayDimensions = [0] (variable-length).
|
||||
- Writes dispatch through `CapabilityInvoker.InvokeAsync(DriverCapability.Write, …)`.
|
||||
- Writes **do not auto-retry** unless the tag's `TagConfig.WriteIdempotent = true`, or the driver's capability is marked with `[WriteIdempotent]` (decision #143).
|
||||
- Writes on a node the session lacks the required permission bit for (`WriteOperate`, `WriteTune`, or `WriteConfigure` derived from the tag's `SecurityClassification`) return `Bad_UserAccessDenied` before the capability runs.
|
||||
- A write into an open circuit returns a driver-shaped error (`Bad_NoCommunication` / `Bad_ServerNotConnected`) without hitting the wire.
|
||||
- The server shall coerce the written OPC UA value to the driver's expected native type using the node's `DriverDataType` before calling `WriteAsync`.
|
||||
- Writes to a NodeId not currently in the address space return `Bad_NodeIdUnknown`.
|
||||
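The idempotence-aware retry rule can be reduced to a few lines. The sketch below is in Python and makes several assumptions: `invoke_write` is a hypothetical helper, `ConnectionError` stands in for whatever transient fault the real pipeline classifies as retryable, and the attempt count of 3 is arbitrary.

```python
def invoke_write(write, *, idempotent: bool, max_attempts: int = 3):
    """Call write(); retry transient failures only when the write is idempotent."""
    # A non-idempotent write gets exactly one attempt, so a value is never
    # applied twice because the first attempt's outcome was unknown.
    attempts = max_attempts if idempotent else 1
    last_error = None
    for _ in range(attempts):
        try:
            return write()
        except ConnectionError as exc:  # stand-in for a transient fault
            last_error = exc
    raise last_error
```

A write that fails once and then succeeds completes on the second attempt when `idempotent=True`, and surfaces the first failure to the caller when `idempotent=False`.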
|
||||
---
|
||||
|
||||
## OPC-007: Read Operations
|
||||
## OPC-007: Subscriptions and Monitored Items
|
||||
|
||||
The server shall fulfill OPC UA Read requests by reading the corresponding tag value from MXAccess using the tag_name.AttributeName reference.
|
||||
The server shall map OPC UA `CreateMonitoredItems` / `DeleteMonitoredItems` to `ISubscribable.SubscribeAsync` / `UnsubscribeAsync` on the owning driver instance.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- OPC UA Read request for a variable node results in a read via MXAccess using the node's stored `full_tag_reference`.
|
||||
- Returned value is converted from the COM variant to the OPC UA data type specified on the node.
|
||||
- OPC UA StatusCode reflects MXAccess quality: Good maps to Good, Bad/Uncertain map appropriately.
|
||||
- If MXAccess is not connected, return StatusCode = Bad_NotConnected.
|
||||
- Read timeout: configurable, default 5 seconds. On timeout, return Bad_Timeout.
|
||||
|
||||
### Details
|
||||
|
||||
- Prefer cached subscription-delivered values over on-demand reads to reduce COM round-trips.
|
||||
- If no subscription is active for the tag, perform an on-demand read (AddItem, AdviseSupervisory, wait for first OnDataChange, then UnAdvise/RemoveItem).
|
||||
- Concurrency: semaphore-limited to configurable max (default 10) concurrent MXAccess operations.
|
||||
- Subscription setup dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.Subscribe, …)`.
|
||||
- Two OPC UA monitored items against the same tag produce exactly one driver-side subscription (ref-counted); last unsubscribe releases the driver-side resource.
|
||||
- `OnDataChange` callbacks from the driver arrive as `DataValueSnapshot` and are forwarded to all OPC UA monitored items on that tag.
|
||||
- Driver-side quality maps to OPC UA `StatusCode` per the driver's spec.
|
||||
- When the owning driver's circuit opens, subscribed items publish Bad quality; when the circuit closes again, they resume publishing with the cached or freshly sampled value.
|
||||
- Across generation applies that preserve a tag's NodeId, existing OPC UA monitored items are preserved (no re-subscribe required on the client).
|
||||
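The ref-counting rule (many monitored items, one driver-side subscription per tag) might look like the following Python sketch. `SubscriptionRefCounter` and its callbacks are hypothetical; the real server would route the two callbacks through `ISubscribable.SubscribeAsync` / `UnsubscribeAsync`.

```python
class SubscriptionRefCounter:
    """Tracks monitored items per tag and drives one driver subscription per tag."""

    def __init__(self, subscribe, unsubscribe):
        self._subscribe = subscribe      # called on the first monitored item
        self._unsubscribe = unsubscribe  # called when the last one is removed
        self._counts = {}

    def add(self, tag):
        count = self._counts.get(tag, 0)
        if count == 0:
            self._subscribe(tag)
        self._counts[tag] = count + 1

    def remove(self, tag):
        count = self._counts.get(tag, 0)
        if count == 0:
            return  # nothing to release
        if count == 1:
            del self._counts[tag]
            self._unsubscribe(tag)
        else:
            self._counts[tag] = count - 1
```

Adding two monitored items for the same tag triggers a single `subscribe` call, and only the second `remove` triggers `unsubscribe`.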
|
||||
---
|
||||
|
||||
## OPC-008: Write Operations
|
||||
## OPC-008: Alarm Surface
|
||||
|
||||
The server shall fulfill OPC UA Write requests by writing to the corresponding tag via MXAccess.
|
||||
The server shall expose the OPC UA alarm and condition model backed by each driver's `IAlarmSource` (where implemented).
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- OPC UA Write request results in an MXAccess `Write()` call with completion confirmed via `OnWriteComplete()` callback.
|
||||
- Write timeout: configurable, default 5 seconds. On timeout, log Warning and return Bad_Timeout.
|
||||
- MXSTATUS_PROXY with `success = 0` causes the OPC UA write to return Bad_InternalError with the detail message.
|
||||
- MXAccess errors 1008 (no permission), 1012 (secured write), 1013 (verified write) return Bad_UserAccessDenied.
|
||||
- Write to a non-existent tag returns Bad_NodeIdUnknown.
|
||||
- The server shall attempt to convert the written value to the expected Galaxy data type before passing to Write().
|
||||
|
||||
### Details
|
||||
|
||||
- Write uses security classification -1 (no security). Galaxy runtime handles security enforcement.
|
||||
- Write sequence: uses existing subscription handle if available, otherwise AddItem + AdviseSupervisory + Write + await OnWriteComplete + cleanup.
|
||||
- Concurrent write limit: same semaphore as reads (configurable, default 10).
|
||||
- Drivers implementing `IAlarmSource` (today: Galaxy, FOCAS, OPC UA Client) produce alarm events that the core maps onto OPC UA `ConditionType` / `AlarmConditionType` instances in the driver's namespace.
|
||||
- `AlarmSubscribe` dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.AlarmSubscribe, …)` and retries on transient failure.
|
||||
- `AlarmAcknowledge` from the OPC UA client dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.AlarmAcknowledge, …)` and **does not retry** (decision #143 — ack is a write-shaped operation).
|
||||
- Alarm-ack requires the `AlarmAck` permission bit for the tag / equipment node; otherwise `Bad_UserAccessDenied`.
|
||||
- Drivers that do not implement `IAlarmSource` contribute no alarm nodes; the core does not synthesize placeholder conditions.
|
||||
|
||||
---
|
||||
|
||||
## OPC-009: Subscriptions
|
||||
## OPC-009: Historical Access
|
||||
|
||||
The server shall support OPC UA subscriptions by mapping them to MXAccess advisory subscriptions and forwarding data change notifications.
|
||||
The server shall surface OPC UA Historical Access (HA) via each driver's `IHistoryProvider` (where implemented).
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- OPC UA CreateMonitoredItems results in MXAccess `AdviseSupervisory()` subscriptions for the requested tags.
|
||||
- Data changes from `OnDataChange` callback are forwarded as OPC UA notifications to all subscribed clients.
|
||||
- Shared subscriptions: if two OPC UA clients subscribe to the same tag, only one MXAccess subscription exists (ref-counted).
|
||||
- Last subscriber unsubscribing triggers UnAdvise/RemoveItem on the MXAccess side.
|
||||
- After MXAccess reconnect, all active MXAccess subscriptions are re-established automatically.
|
||||
|
||||
### Details
|
||||
|
||||
- Publishing interval from the OPC UA subscription request is honored on the OPC UA side; MXAccess delivers changes as fast as it receives them.
|
||||
- OPC UA quality mapping from MXAccess quality integers: 192+ = Good, 64-191 = Uncertain, 0-63 = Bad.
|
||||
- OnDataChange with MXSTATUS_PROXY failure: deliver notification with Bad quality to subscribed clients.
|
||||
- `HistoryRead` for `Raw`, `Processed`, `AtTime`, and `Events` dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.HistoryRead, …)`.
|
||||
- Drivers implementing `IHistoryProvider` today: Galaxy (Wonderware Historian), OPC UA Client (proxy to remote historian).
|
||||
- Drivers not implementing `IHistoryProvider` return `Bad_HistoryOperationUnsupported` for history requests on their nodes.
|
||||
- History reads require the `Read` permission bit on the target node.
|
||||
|
||||
---
|
||||
|
||||
## OPC-010: Address Space Rebuild
|
||||
## OPC-010: Transport Security Profiles
|
||||
|
||||
When a Galaxy deployment change is detected, the server shall rebuild the address space without dropping existing OPC UA client connections where possible.
|
||||
The server shall offer OPC UA transport-security profiles resolved at startup by `SecurityProfileResolver`.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- When Galaxy Repository detects a deployment change, the OPC UA address space is updated.
|
||||
- Only changed gobject subtrees are torn down and rebuilt; unchanged nodes, subscriptions, and alarm tracking remain intact.
|
||||
- Existing OPC UA client sessions are preserved — clients stay connected.
|
||||
- Subscriptions for tags on unchanged objects continue to work without interruption.
|
||||
- Subscriptions for tags that no longer exist receive a Bad_NodeIdUnknown status notification.
|
||||
- Sync is logged at Information level with the number of changed gobjects.
|
||||
|
||||
### Details
|
||||
|
||||
- Uses incremental subtree sync: compares previous hierarchy+attributes with new, identifies changed gobject IDs, expands to include child subtrees, tears down only affected subtrees, and rebuilds them.
|
||||
- First build (no cached state) performs a full build.
|
||||
- If no changes are detected, the sync is a no-op (logged and skipped).
|
||||
- Alarm tracking and MXAccess subscriptions for unchanged objects are not disrupted.
|
||||
- Falls back to full rebuild behavior if the entire hierarchy changes.
|
||||
- Supported profiles: `None`, `Basic256Sha256-Sign`, `Basic256Sha256-SignAndEncrypt`, `Aes128_Sha256_RsaOaep-Sign`, `Aes128_Sha256_RsaOaep-SignAndEncrypt`, `Aes256_Sha256_RsaPss-Sign`, `Aes256_Sha256_RsaPss-SignAndEncrypt`.
|
||||
- Active profile list comes from `OpcUa.SecurityProfile` in `appsettings.json` (bootstrap config) or Config DB (per-cluster override).
|
||||
- Server certificate is created at first startup even when only `None` is enabled, because UserName-token encryption depends on an ApplicationInstanceCertificate.
|
||||
- Certificate store root path is configurable (default `%ProgramData%/OtOpcUa/pki/`).
|
||||
- `AutoAcceptUntrustedClientCertificates` is a config flag; production deployments set it to `false` and operators add trusted client certs via the Admin UI Cert Trust screen.
|
||||
|
||||
---
|
||||
|
||||
## OPC-011: Server Diagnostics Node
|
||||
## OPC-011: UserName Authentication
|
||||
|
||||
The server shall expose a ServerStatus node under the standard OPC UA Server object with ServerState, CurrentTime, and StartTime. This is required by the OPC UA specification for compliant servers.
|
||||
The server shall validate `UserNameIdentityToken` credentials against LDAP (production: Active Directory; dev: GLAuth).
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- ServerState reports Running during normal operation.
|
||||
- CurrentTime returns the server's current UTC time.
|
||||
- StartTime returns the UTC time when the service started.
|
||||
- If `Ldap.Enabled = false`, all UserName tokens are rejected (`Bad_UserAccessDenied`).
|
||||
- When enabled, the server performs an LDAP bind using the supplied credentials via `LdapUserAuthenticator`.
|
||||
- On successful bind, group memberships resolved from LDAP are mapped through `LdapOptions.GroupToRole` to produce the session's permission bits (`ReadOnly`, `WriteOperate`, `WriteTune`, `WriteConfigure`, `AlarmAck`).
|
||||
- `LdapAuthenticationProvider` implements both `IUserAuthenticationProvider` and `IRoleProvider`.
|
||||
- UserName tokens are always carried on an encrypted secure channel (either Sign-and-Encrypt transport, or encrypted token using the server certificate even on a `None` channel).
|
||||
|
||||
---
|
||||
|
||||
## OPC-012: Namespace Configuration
|
||||
## OPC-012: Capability Dispatch via CapabilityInvoker
|
||||
|
||||
The server shall register a namespace URI at namespace index 1. All application-specific NodeIds shall use this namespace.
|
||||
Every async capability-interface call the server makes shall route through `Core.Resilience.CapabilityInvoker`.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Namespace URI: `urn:ZB:LmxOpcUa` (Galaxy name is configurable).
|
||||
- All object and variable NodeIds created from Galaxy data use namespace index 1.
|
||||
- Standard OPC UA nodes remain in namespace 0.
|
||||
- `CapabilityInvoker.InvokeAsync` resolves a Polly resilience pipeline keyed on `(DriverInstanceId, HostName, DriverCapability)`.
|
||||
- Read / Discover / Probe / Subscribe / AlarmSubscribe / HistoryRead pipelines carry Timeout + Retry + CircuitBreaker strategies.
|
||||
- Write / AlarmAcknowledge pipelines carry Timeout + CircuitBreaker only; Retry is enabled only when the tag or capability carries `[WriteIdempotent]` (decision #143).
|
||||
- Roslyn diagnostic **OTOPCUA0001** fires on any direct call to a capability-interface method from outside `CapabilityInvoker` (enforced via `ZB.MOM.WW.OtOpcUa.Analyzers`).
|
||||
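How the strategy set depends on the capability can be summarised in a small Python sketch. It only restates the criteria above as data; `strategies_for` is a hypothetical helper and does not reflect how the real pipelines are built with Polly.

```python
RETRYABLE = {"Read", "Discover", "Probe", "Subscribe", "AlarmSubscribe", "HistoryRead"}
WRITE_SHAPED = {"Write", "AlarmAcknowledge"}


def strategies_for(capability: str, write_idempotent: bool = False):
    """Return the resilience strategies a pipeline for this capability carries."""
    if capability in RETRYABLE:
        return ["Timeout", "Retry", "CircuitBreaker"]
    if capability in WRITE_SHAPED:
        # Write-shaped calls retry only when explicitly marked idempotent.
        if write_idempotent:
            return ["Timeout", "Retry", "CircuitBreaker"]
        return ["Timeout", "CircuitBreaker"]
    raise ValueError(f"unknown capability: {capability}")
```

So `strategies_for("Write")` omits `Retry`, while `strategies_for("Write", write_idempotent=True)` includes it.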
|
||||
---
|
||||
|
||||
## OPC-013: Session Management
|
||||
## OPC-013: Per-Host Polly Isolation
|
||||
|
||||
Polly pipelines shall be keyed per `(DriverInstanceId, HostName, DriverCapability)` so that a failing device in one driver does not trip the circuit for another device on the same driver or any other driver (decision #144).
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- A driver serving `N` devices has `N × capabilityCount` distinct pipelines.
|
||||
- Circuit-breaker state transitions are telemetry-published per pipeline and appear on the Admin UI + `/metrics`.
|
||||
- A host-scope fault (e.g. shared PLC gateway) naturally trips all devices behind that host but leaves other hosts untouched.
|
||||
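The isolation property follows directly from the pipeline key. The Python sketch below models only that keying; the consecutive-failure threshold is an invented simplification of a real circuit breaker, and `PipelineRegistry` is a hypothetical name.

```python
class PipelineRegistry:
    """Tracks circuit state per (DriverInstanceId, HostName, DriverCapability)."""

    def __init__(self, failure_threshold: int = 3):
        self._threshold = failure_threshold  # assumed; not taken from the spec
        self._failures = {}

    def record_failure(self, driver_instance_id, host_name, capability):
        key = (driver_instance_id, host_name, capability)
        self._failures[key] = self._failures.get(key, 0) + 1

    def is_open(self, driver_instance_id, host_name, capability) -> bool:
        key = (driver_instance_id, host_name, capability)
        return self._failures.get(key, 0) >= self._threshold


registry = PipelineRegistry()
for _ in range(3):
    registry.record_failure("modbus-1", "plc-a", "Read")
```

After three failures, only the `("modbus-1", "plc-a", "Read")` circuit is open; `plc-b` on the same driver, and other capabilities on `plc-a`, are unaffected.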
|
||||
---
|
||||
|
||||
## OPC-014: Authorization Gate and Permission Trie
|
||||
|
||||
`Security.AuthorizationGate` shall enforce node-level permissions on every browse, read, write, subscribe, alarm-ack, and history call before dispatch.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Permission bits for the session are assembled at login from LDAP group → role → permission mapping plus Config-DB `NodeAcl` rows that modify permission inheritance along the browse tree.
|
||||
- The permission trie walks from the addressed node toward the root, inheriting permissions unless a `NodeAcl` overrides; first match wins.
|
||||
- Missing `Read` bit → `Bad_UserAccessDenied` on Read / Subscribe / HistoryRead.
|
||||
- Missing `Write*` bit (matching the tag's `SecurityClassification`) → `Bad_UserAccessDenied` on Write.
|
||||
- Missing `AlarmAck` bit → `Bad_UserAccessDenied` on acknowledge.
|
||||
- Authorization decisions are made at the server layer only — drivers never enforce authorization and only expose `SecurityClassification` metadata.
|
||||
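The leaf-to-root walk with first-match-wins semantics can be sketched in Python as below. This is an assumption-laden illustration: it models `NodeAcl` rows as a dictionary from browse path to a full replacement permission set, whereas the real rows may modify rather than replace inherited bits; `effective_permissions` is a hypothetical name.

```python
def effective_permissions(node_path, node_acls, role_bits):
    """Walk from the addressed node toward the root; the first ACL found wins.

    node_path: browse path such as "Ent/Site/Area/Line/Eq/Speed".
    node_acls: dict mapping a browse path to the permission bits set there.
    role_bits: bits from the LDAP group-to-role mapping, used when no ACL matches.
    """
    segments = node_path.split("/")
    while segments:
        acl = node_acls.get("/".join(segments))
        if acl is not None:
            return acl
        segments.pop()  # move one level toward the root
    return role_bits


acls = {
    "Ent/Site": {"Read"},
    "Ent/Site/Area/Line": {"Read", "WriteOperate"},
}
```

With these rows, a signal under `Ent/Site/Area/Line` resolves to read plus operate-write, anything else under `Ent/Site` is read-only, and paths with no matching row fall back to the role's bits.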
|
||||
---
|
||||
|
||||
## OPC-015: ServiceLevel Reporting
|
||||
|
||||
The server shall expose a dynamic `ServiceLevel` value computed by `RedundancyCoordinator` + `ServiceLevelCalculator`.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- `ServiceLevel` reflects: redundancy role (Primary higher than Secondary), publish state (current generation applied > mid-apply > failed-apply), driver health (any driver instance in open circuit lowers the value), apply-lease state.
|
||||
- `ServiceLevel` is exposed as a Variable under the standard `Server` object and is readable by any authenticated client.
|
||||
- Clients that observe Primary's `ServiceLevel` drop below Secondary's should fail over per the OPC UA spec.
|
||||
- Single-node deployments (`NodeCount = 1`) always publish their node as Primary.
|
||||
|
||||
---
|
||||
|
||||
## OPC-016: Session Management
|
||||
|
||||
The server shall support multiple concurrent OPC UA client sessions.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Maximum concurrent sessions: configurable, default 100.
|
||||
- Session timeout: configurable, default 30 minutes of inactivity.
|
||||
- Expired sessions are cleaned up and their subscriptions removed.
|
||||
- Session count is reported to the status dashboard.
|
||||
- Maximum concurrent sessions and session timeout come from Config DB cluster settings (default 100 sessions, 30-minute idle timeout).
|
||||
- Expired sessions are cleaned up and their subscriptions and monitored items removed.
|
||||
- Active session count is reported as a Prometheus gauge on the Admin `/metrics` endpoint.
|
||||
|
||||
---
|
||||
|
||||
## OPC-017: Address Space Rebuild on Generation Apply
|
||||
|
||||
When a new Config DB generation is applied, the server shall surgically update only the affected driver subtrees.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Apply compares the previous generation to the incoming generation and produces per-driver add / modify / remove sets.
|
||||
- Existing OPC UA sessions, subscriptions, and monitored items are preserved across apply whenever the target NodeId survives the generation change.
|
||||
- Tags that no longer exist post-apply emit `Bad_NodeIdUnknown` on their subscribed monitored items.
|
||||
- During apply, the node's `ServiceLevel` is lowered (per `ServiceLevelCalculator`) so redundancy partners temporarily take precedence.
|
||||
- Galaxy subtree rebuilds triggered by `IRediscoverable` (Galaxy deployment change) are scoped to the Galaxy driver's namespace and follow the same preservation rule (OPC-010 from the v1 file, now subsumed).
|
||||
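The per-driver add / modify / remove sets amount to a keyed diff between two generations. A minimal Python sketch, assuming each generation can be viewed as a mapping from driver instance id to `{NodeId: tag definition}`; that shape and both function names are hypothetical.

```python
def diff_driver(previous, incoming):
    """Diff one driver's tags, keyed by NodeId."""
    prev_ids, new_ids = previous.keys(), incoming.keys()
    return {
        "add": sorted(new_ids - prev_ids),
        "modify": sorted(n for n in prev_ids & new_ids if previous[n] != incoming[n]),
        "remove": sorted(prev_ids - new_ids),
    }


def diff_generation(previous, incoming):
    """Produce add / modify / remove sets for every driver in either generation."""
    drivers = previous.keys() | incoming.keys()
    return {d: diff_driver(previous.get(d, {}), incoming.get(d, {}))
            for d in sorted(drivers)}
```

NodeIds that appear in neither `add` nor `remove` survive the apply, so their monitored items can be kept; NodeIds in `remove` are the ones whose monitored items receive `Bad_NodeIdUnknown`.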
|
||||
---
|
||||
|
||||
## OPC-018: Server Diagnostics Nodes
|
||||
|
||||
The server shall expose standard OPC UA `Server` object nodes required by the spec.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- `ServerStatus` / `ServerState` / `CurrentTime` / `StartTime` populated and compliant with the OPC UA 1.05 spec.
|
||||
- `ServerCapabilities` declares historical access capabilities for namespaces that have an `IHistoryProvider`-backed driver.
|
||||
- `ServerRedundancy.RedundancySupport` reflects the cluster's redundancy mode (`None` / `Warm` / `Hot`).
|
||||
- `ServerRedundancy.ServerUriArray` lists both cluster members' `ApplicationUri` values.
|
||||
|
||||
---
|
||||
|
||||
## OPC-019: Observability Hooks
|
||||
|
||||
The server shall emit OpenTelemetry metrics consumed by the Admin `/metrics` Prometheus endpoint.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Counters: capability calls per `DriverInstanceId` + `DriverCapability`, OPC UA requests per method, alarm events emitted, history reads, generation apply attempts.
|
||||
- Histograms: capability-call duration per `DriverInstanceId` + `DriverCapability`, OPC UA request duration per method.
|
||||
- Gauges: circuit-breaker state per pipeline, active OPC UA sessions, active monitored items, subscription queue depth, `ServiceLevel` value, memory-tracking watermarks (Phase 6.1).
|
||||
- Metric cardinality is bounded — `DriverInstanceId` and `HostName` are the only high-cardinality labels, both controlled by the Config DB.
|
||||
|
||||
@@ -1,117 +1,265 @@
|
||||
# Service Host — Component Requirements
|
||||
|
||||
Parent: [HLR-006](HighLevelReqs.md#hlr-006-windows-service-hosting), [HLR-007](HighLevelReqs.md#hlr-007-logging)
|
||||
> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). v1 was a single Windows service; v2 ships **three cooperating Windows services** and the service-host requirements are rewritten per-process. SVC-001…SVC-006 from v1 are preserved in spirit (TopShelf, Serilog, config loading, graceful shutdown, startup sequence, unhandled-exception handling) but are now scoped to the process they apply to. SRV-* prefixes the Server process, ADM-* the Admin process, GHX-* the Galaxy Host process. A shared-requirements section at the top covers cross-process concerns (Serilog, logging rotation, bootstrap config scope).
|
||||
|
||||
## SVC-001: TopShelf Hosting
|
||||
Parent: [HLR-007](HighLevelReqs.md#hlr-007-service-hosting), [HLR-008](HighLevelReqs.md#hlr-008-logging), [HLR-011](HighLevelReqs.md#hlr-011-config-db-and-draft-publish)
|
||||
|
||||
The application shall use TopShelf for Windows service lifecycle (install, uninstall, start, stop) and interactive console mode for development.
|
||||
## Shared Requirements (all three processes)
|
||||
|
||||
### Acceptance Criteria
|
||||
### SVC-SHARED-001: Serilog Logging
|
||||
|
||||
- TopShelf HostFactory configures the service with name `LmxOpcUa`, display name `LMX OPC UA Server`.
|
||||
- Service installs via command line: `ZB.MOM.WW.OtOpcUa.Host.exe install`.
|
||||
- Service uninstalls via: `ZB.MOM.WW.OtOpcUa.Host.exe uninstall`.
|
||||
- Service runs as LocalSystem account (needed for MXAccess COM access and Windows Auth to SQL Server).
|
||||
- Interactive console mode (exe with no args) works for development/debugging.
|
||||
- `StartAutomatically` is set for Windows service registration.
|
||||
Every process shall use Serilog with a rolling daily file sink at Information level minimum, plus a console sink, plus an opt-in `CompactJsonFormatter` file sink.
|
||||
|
||||
### Details
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Platform target: x86 (32-bit) — required for MXAccess COM interop.
|
||||
- Service description: "OPC UA server exposing System Platform Galaxy tags via MXAccess."
|
||||
- Console sink active on every process (for interactive / debug mode).
|
||||
- Rolling daily file sink:
|
||||
- Server: `logs/otopcua-YYYYMMDD.log`
|
||||
- Admin: `logs/otopcua-admin-YYYYMMDD.log`
|
||||
- Galaxy Host: `%ProgramData%\OtOpcUa\galaxy-host-YYYYMMDD.log`
|
||||
- Retention count and min level configurable via `Serilog:*` in each process's `appsettings.json`.
|
||||
- JSON sink opt-in via `Serilog:WriteJson = true` (emits `*.json.log` alongside the plain-text file) for SIEM ingestion.
|
||||
- `Log.CloseAndFlush()` invoked in a `finally` block on shutdown.
|
||||
- Structured logging (Serilog message templates) — no `string.Format`.
|
||||
|
||||
---
|
||||
|
||||
## SVC-002: Serilog Logging
|
||||
### SVC-SHARED-002: Bootstrap Configuration Scope
|
||||
|
||||
The application shall configure Serilog with a rolling daily file sink and console sink, with log files retained for a configurable number of days (default 31).
|
||||
`appsettings.json` is bootstrap-only per HLR-011. Operational configuration (clusters, drivers, namespaces, tags, ACLs, poll groups) lives in the Config DB.
|
||||
|
||||
### Acceptance Criteria
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Console sink active (for interactive/debug mode).
|
||||
- Rolling daily file sink writing to `logs/lmxopcua-YYYYMMDD.log`.
|
||||
- Retained file count: configurable, default 31 days.
|
||||
- Minimum log level: configurable, default Information.
|
||||
- Log file path: configurable, default `logs/lmxopcua-.log`.
|
||||
- Serilog is initialized before any other component (first thing in Main).
|
||||
- `Log.CloseAndFlush()` called in finally block on exit.
|
||||
|
||||
### Details
|
||||
|
||||
- Structured logging with Serilog message templates (not string.Format).
|
||||
- Log output includes timestamp, level, source context, message, and exception.
|
||||
- Fatal exceptions are caught at the top level and logged before exit.
|
||||
- `appsettings.json` may contain only: Config DB connection string, `Node:NodeId`, `Node:ClusterId`, `Node:LocalCachePath`, `OpcUa:*` security bootstrap fields, `Ldap:*` bootstrap fields, `Serilog:*`, `Redundancy:*` role id.
|
||||
- Any attempt to configure driver instances, tags, or equipment through `appsettings.json` shall be rejected at startup with a descriptive error.
|
||||
- Invalid or missing required bootstrap fields are detected at startup with a clear error (`"Node:NodeId not configured"` style).
|
||||
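The bootstrap-only rule is effectively an allow-list check over flattened configuration keys. The Python sketch below illustrates it under stated assumptions: the key `ConnectionStrings:ConfigDb` is a hypothetical name for the Config DB connection string, the rejection message wording is invented, and `validate_bootstrap` is not an actual API.

```python
ALLOWED_EXACT = {
    "ConnectionStrings:ConfigDb",  # hypothetical key name
    "Node:NodeId",
    "Node:ClusterId",
    "Node:LocalCachePath",
}
ALLOWED_PREFIXES = ("OpcUa:", "Ldap:", "Serilog:", "Redundancy:")
REQUIRED = ("Node:NodeId", "Node:ClusterId", "ConnectionStrings:ConfigDb")


def validate_bootstrap(settings):
    """Return a list of startup errors for a flattened {"Section:Key": value} map."""
    errors = []
    for key in REQUIRED:
        if not settings.get(key):
            errors.append(f"{key} not configured")
    for key in settings:
        if key not in ALLOWED_EXACT and not key.startswith(ALLOWED_PREFIXES):
            errors.append(f"{key} is not a bootstrap setting; configure it in the Config DB")
    return errors
```

An empty list means startup may proceed; any entry would cause the process to fail fast with that message.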
|
||||
---
|
||||
|
||||
## SVC-003: Configuration
|
||||
## OtOpcUa.Server — Service Host Requirements (SRV-*)
|
||||
|
||||
The application shall load configuration from appsettings.json with support for environment-specific overrides (appsettings.*.json) and environment variables.
|
||||
### SRV-001: Microsoft.Extensions.Hosting + AddWindowsService
|
||||
|
||||
### Acceptance Criteria
|
||||
The Server shall use `Host.CreateApplicationBuilder(args)` with `AddWindowsService(o => o.ServiceName = "OtOpcUa")` to run as a Windows service.
|
||||
|
||||
- `appsettings.json` is the primary configuration file.
|
||||
- Environment-specific overrides via `appsettings.{environment}.json`.
|
||||
- Configuration sections: `OpcUa`, `MxAccess`, `GalaxyRepository`, `Dashboard`.
|
||||
- Missing optional configuration keys use documented defaults (service does not crash).
|
||||
- Invalid configuration (e.g., port = -1) is detected at startup with a clear error message.
|
||||
#### Acceptance Criteria
|
||||
|
||||
### Details
|
||||
|
||||
- Config is loaded once at startup. No hot-reload (service restart required for config changes). This is appropriate for an industrial service.
|
||||
- All configurable values and their defaults are documented in `appsettings.json`.
|
||||
- Service name `OtOpcUa`.
|
||||
- Installs via standard `sc.exe` tooling or the build-provided installer.
|
||||
- Runs as a configured service account (typically a domain service account with Config DB read access; Windows Auth to SQL Server).
|
||||
- Console mode (running `ZB.MOM.WW.OtOpcUa.Server.exe` with no Windows service context) works for development and debugging.
|
||||
- Platform target: .NET 10 x64 (default per decision in `plan.md` §3).
|
||||
|
||||
---
|
||||
|
||||
## SVC-004: Graceful Shutdown
|
||||
### SRV-002: Startup Sequence
|
||||
|
||||
On service stop, the application shall gracefully shut down all components and flush logs before exiting.
|
||||
The Server shall start components in a defined order, with failure handling at each step.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- TopShelf WhenStopped triggers orderly shutdown.
|
||||
- Shutdown sequence: (1) stop change detection polling, (2) stop OPC UA server (stop accepting new sessions, complete pending operations), (3) disconnect MXAccess (cleanup all COM objects), (4) stop status dashboard HTTP listener, (5) flush Serilog.
|
||||
- Shutdown completes within 30 seconds (Windows SCM timeout).
|
||||
- All IDisposable components are disposed in reverse-creation order.
|
||||
|
||||
### Details
|
||||
|
||||
- `CancellationTokenSource` signals all background loops (monitor, change detection, HTTP listener) to stop.
|
||||
- Log "Service shutdown complete" at Information level as the final log entry before flush.
|
||||
|
||||
---
|
||||
|
||||
## SVC-005: Startup Sequence
|
||||
|
||||
The service shall start components in a defined order, with failure handling at each step.
|
||||
|
||||
### Acceptance Criteria
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Startup sequence:
|
||||
1. Load configuration
|
||||
2. Initialize Serilog
|
||||
3. Start STA thread
|
||||
4. Connect to MXAccess
|
||||
5. Query Galaxy Repository for initial build
|
||||
6. Build OPC UA address space
|
||||
7. Start OPC UA server listener
|
||||
8. Start change detection polling
|
||||
9. Start status dashboard HTTP listener
|
||||
- Failure in steps 1-4 prevents startup (service fails to start).
|
||||
- Failure in steps 5-9 logs Error but allows the service to run in degraded mode.
|
||||
|
||||
### Details
|
||||
|
||||
- Degraded mode means the service is running but may have an empty address space (waiting for Galaxy DB) or no dashboard (port conflict). MXAccess connection is the minimum required for the service to be useful.
|
||||
1. Load `appsettings.json` bootstrap configuration + initialize Serilog.
2. Validate bootstrap fields (NodeId, ClusterId, Config DB connection).
3. Initialize `OpcUaApplicationHost` (server-certificate resolution via `SecurityProfileResolver`).
4. Connect to Config DB; request current published generation for `ClusterId`.
5. If unreachable, fall back to `LiteDbConfigCache` (latest applied generation).
6. Apply generation: register driver instances, build namespaces, wire capability pipelines.
7. Start `OpcUaServerService` hosted service (opens endpoint listener).
8. Start `HostStatusPublisher` (pushes `ClusterNodeGenerationState` to Config DB for Admin UI SignalR consumers).
9. Start `RedundancyCoordinator` + `ServiceLevelCalculator`.
|
||||
- Failure in steps 1-3 prevents startup.
|
||||
- Failure in steps 4-6 logs Error and enters degraded mode (empty namespaces, `DriverHealth.Unavailable` on every driver, `ServiceLevel = 0`).
|
||||
- Failure in steps 7-9 logs Error and shuts down (endpoint is non-optional).
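
The three failure bands above can be sketched as a small step runner. This is an illustrative Python sketch, not the Server's .NET implementation: `run_startup`, `Outcome`, and the step callables are hypothetical names. It also assumes that a degraded node keeps running the remaining steps so the endpoint still opens with empty namespaces.

```python
from enum import Enum
from typing import Callable, Sequence


class Outcome(Enum):
    RUNNING = "running"
    DEGRADED = "degraded"        # a step in 4-6 failed: empty namespaces, ServiceLevel = 0
    FAILED_TO_START = "failed"   # a step in 1-3 failed: the service does not start
    SHUT_DOWN = "shut_down"      # a step in 7-9 failed: the endpoint is non-optional


def run_startup(steps: Sequence[Callable[[], None]],
                log: Callable[[str], None]) -> Outcome:
    """Run the nine startup steps in order and apply the SRV-002 failure bands."""
    if len(steps) != 9:
        raise ValueError("SRV-002 defines exactly nine startup steps")
    outcome = Outcome.RUNNING
    for number, step in enumerate(steps, start=1):
        try:
            step()
        except Exception as exc:
            log(f"Error: startup step {number} failed: {exc}")
            if number <= 3:
                return Outcome.FAILED_TO_START
            if number <= 6:
                # Assumption: keep going so the listener can still open.
                outcome = Outcome.DEGRADED
            else:
                return Outcome.SHUT_DOWN
    return outcome
```

A caller would map `FAILED_TO_START` and `SHUT_DOWN` to a non-zero exit so that Windows service recovery can act on it; that mapping is also an assumption, not something the requirement states.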
|
||||
|
||||
---
|
||||
|
||||
## SVC-006: Unhandled Exception Handling
|
||||
### SRV-003: Graceful Shutdown
|
||||
|
||||
The service shall handle unexpected crashes gracefully.
|
||||
On service stop, the Server shall gracefully shut down all driver instances, the OPC UA listener, and flush logs before exiting.
|
||||
|
||||
### Acceptance Criteria
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Register `AppDomain.CurrentDomain.UnhandledException` handler that logs Fatal before the process terminates.
|
||||
- TopShelf service recovery is configured: restart on failure with 60-second delay.
|
||||
- Fatal-level log entry includes the full exception details.
|
||||
- `IHostApplicationLifetime.ApplicationStopping` triggers orderly shutdown.
|
||||
- Shutdown sequence: stop `HostStatusPublisher` → stop driver instances (disconnect each via `IDriver.DisposeAsync`, which for Galaxy tears down the named pipe) → stop OPC UA server (stop accepting new sessions, complete pending reads/writes) → flush Serilog.
|
||||
- Shutdown completes within 30 seconds (Windows SCM timeout).
|
||||
- All `IDisposable` / `IAsyncDisposable` components disposed in reverse-creation order.
|
||||
- Final log entry: `"OtOpcUa.Server shutdown complete"` at Information level.
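
The reverse-creation-order and 30-second criteria can be illustrated with a small registry. This is a Python sketch under stated assumptions, not the Server's code: `ShutdownRegistry` and its methods are hypothetical, and the choice to keep disposing after one component fails is an assumption rather than something SRV-003 specifies.

```python
import time
from typing import Callable, List, Tuple


class ShutdownRegistry:
    """Tracks components in creation order and disposes them newest-first."""

    def __init__(self, budget_seconds: float = 30.0) -> None:
        self._components: List[Tuple[str, Callable[[], None]]] = []
        self._budget = budget_seconds  # mirrors the Windows SCM timeout

    def register(self, name: str, dispose: Callable[[], None]) -> None:
        self._components.append((name, dispose))

    def shutdown(self, log: Callable[[str], None]) -> bool:
        """Dispose everything in reverse order; return True if within budget."""
        started = time.monotonic()
        while self._components:
            name, dispose = self._components.pop()  # last created, first disposed
            try:
                dispose()
            except Exception as exc:
                # Assumption: one failing component must not block the rest.
                log(f"Error disposing {name}: {exc}")
        within_budget = (time.monotonic() - started) <= self._budget
        log("OtOpcUa.Server shutdown complete")  # final entry before the flush
        return within_budget
```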
|
||||
|
||||
---
|
||||
|
||||
### SRV-004: Unhandled Exception Handling
|
||||
|
||||
The Server shall handle unexpected crashes gracefully.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Registers `AppDomain.CurrentDomain.UnhandledException` handler that logs Fatal before the process terminates.
|
||||
- Windows service recovery configured: restart on failure with 60-second delay.
|
||||
- Fatal log entry includes full exception details.
|
||||
|
||||
---
|
||||
|
||||
### SRV-005: Drivers Hosted In-Process
|
||||
|
||||
All drivers except Galaxy run in-process within the Server.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Modbus TCP, AB CIP, AB Legacy, S7, TwinCAT, FOCAS, OPC UA Client drivers are resolved from the DI container and managed by `DriverHost`.
|
||||
- Galaxy driver in-process component is `Driver.Galaxy.Proxy`, which forwards to `OtOpcUa.Galaxy.Host` over the named pipe (see GHX-*).
|
||||
- Each driver instance's lifecycle (connect, discover, subscribe, dispose) is orchestrated by `DriverHost`.
|
||||
|
||||
---
|
||||
|
||||
### SRV-006: Redundancy-Node Bootstrap
|
||||
|
||||
The Server shall bootstrap its redundancy identity from `appsettings.json` and the Config DB.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- `Node:NodeId` + `Node:ClusterId` identify this node uniquely; the `RedundancyCoordinator` looks up `ClusterNode.RedundancyRole` (Primary / Secondary) from the Config DB.
|
||||
- Two nodes of the same cluster connect to the same Config DB and the same ClusterId but have different NodeIds and different `ApplicationUri` values.
|
||||
- Missing or ambiguous `(ClusterId, NodeId)` causes startup failure.
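
The "missing or ambiguous" rule amounts to requiring exactly one matching row. A minimal Python sketch follows; the `ClusterNode` shape and the `resolve_role` helper are hypothetical stand-ins for the Config DB lookup, not the actual schema or API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ClusterNode:
    cluster_id: str
    node_id: str
    redundancy_role: str  # "Primary" or "Secondary"


def resolve_role(rows: Iterable[ClusterNode], cluster_id: str, node_id: str) -> str:
    """Return the redundancy role, or raise so that startup fails."""
    matches = [r for r in rows
               if r.cluster_id == cluster_id and r.node_id == node_id]
    if not matches:
        raise LookupError(f"No ClusterNode row for ({cluster_id}, {node_id})")
    if len(matches) > 1:
        raise LookupError(f"Ambiguous ClusterNode rows for ({cluster_id}, {node_id})")
    return matches[0].redundancy_role
```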
|
||||
|
||||
---
|
||||
|
||||
## OtOpcUa.Admin — Service Host Requirements (ADM-*)
|
||||
|
||||
### ADM-001: ASP.NET Core Blazor Server
|
||||
|
||||
The Admin app shall use `WebApplication.CreateBuilder` with Razor Components (`AddRazorComponents().AddInteractiveServerComponents()`), SignalR, and cookie authentication.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Blazor Server (not WebAssembly) per `plan.md` §Tech Stack.
|
||||
- Hosts SignalR hubs for live cluster state (used by `ClusterNodeGenerationState` views, crash-loop alerts, etc.).
|
||||
- Runs as a Windows service via `AddWindowsService` OR as a standard ASP.NET Core process behind IIS / reverse proxy (site decides).
|
||||
- Platform target: .NET 10 x64.
|
||||
|
||||
---
|
||||
|
||||
### ADM-002: Authentication and Authorization
|
||||
|
||||
Admin users authenticate via LDAP bind with cookie auth; three admin roles gate operations.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Cookie auth scheme: `OtOpcUa.Admin`, 8-hour expiry, path `/login` for challenge.
|
||||
- LDAP bind via `LdapAuthService`; user group memberships map to admin roles (`ConfigViewer`, `ConfigEditor`, `FleetAdmin`).
|
||||
- Authorization policies:
  - `CanEdit` requires `ConfigEditor` or `FleetAdmin`.
  - `CanPublish` requires `FleetAdmin`.
  - View-only access requires `ConfigViewer` (or higher).
|
||||
- Unauthenticated requests to any Admin page redirect to `/login`.
|
||||
- Per-cluster role grants layer on top: a `ConfigEditor` with no grant for cluster X can view that cluster but cannot edit it.
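
The role-to-policy mapping and the per-cluster layer can be sketched as plain predicates. This Python sketch is illustrative only: the function names and the `edit_grants` set are hypothetical. Whether `FleetAdmin` bypasses per-cluster grants is not stated in the criteria, so the sketch assumes every role needs a grant to edit a given cluster.

```python
from typing import AbstractSet

VIEW_ROLES = frozenset({"ConfigViewer", "ConfigEditor", "FleetAdmin"})
EDIT_ROLES = frozenset({"ConfigEditor", "FleetAdmin"})


def can_view(roles: AbstractSet[str]) -> bool:
    """View-only access: ConfigViewer or higher."""
    return bool(roles & VIEW_ROLES)


def can_edit(roles: AbstractSet[str]) -> bool:
    """CanEdit policy: ConfigEditor or FleetAdmin."""
    return bool(roles & EDIT_ROLES)


def can_publish(roles: AbstractSet[str]) -> bool:
    """CanPublish policy: FleetAdmin only."""
    return "FleetAdmin" in roles


def can_edit_cluster(roles: AbstractSet[str],
                     edit_grants: AbstractSet[str],
                     cluster_id: str) -> bool:
    """Global policy first, then the per-cluster grant (assumed for all roles)."""
    return can_edit(roles) and cluster_id in edit_grants
```

For example, a `ConfigEditor` holding a grant for cluster `A` only passes `can_edit_cluster` for `A` and still passes `can_view` for any cluster.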
|
||||
|
||||
---
|
||||
|
||||
### ADM-003: Config DB as Sole Write Path
|
||||
|
||||
The Admin service shall be the only process with write access to the Config DB.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- EF Core `OtOpcUaConfigDbContext` configured with the SQL login / connection string that has read+write permission on config tables.
|
||||
- Server nodes connect with a read-only principal (`grant SELECT` only).
|
||||
- Admin writes produce draft-generation rows; publish writes are atomic and transactional.
|
||||
- Every write is audited via `AuditLogService` per ADM-006.
|
||||
|
||||
---
|
||||
|
||||
### ADM-004: Prometheus /metrics Endpoint
|
||||
|
||||
The Admin service shall expose an OpenTelemetry → Prometheus metrics endpoint at `/metrics`.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- `OpenTelemetry.Metrics` registered with Prometheus exporter.
|
||||
- `/metrics` scrapeable without authentication (standard Prometheus pattern) OR gated behind an infrastructure allow-list (site-configurable).
|
||||
- Exports metrics from Server nodes of managed clusters (aggregated via Config DB heartbeat telemetry) plus Admin-local metrics (login attempts, publish duration, active sessions).
|
||||
|
||||
---
|
||||
|
||||
### ADM-005: Graceful Shutdown
|
||||
|
||||
On shutdown, the Admin service shall disconnect SignalR clients cleanly, finish in-flight DB writes, and flush Serilog.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- `IHostApplicationLifetime.ApplicationStopping` closes SignalR hub connections gracefully.
|
||||
- In-flight publish transactions are given up to 30 seconds to complete.
|
||||
- Final log entry: `"OtOpcUa.Admin shutdown complete"`.
|
||||
|
||||
---
|
||||
|
||||
### ADM-006: Audit Logging
|
||||
|
||||
Every publish and every ACL / role-grant change shall produce an immutable audit row via `AuditLogService`.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Audit rows include: timestamp (UTC), acting principal (LDAP DN + display name), action, entity kind + id, before/after generation number where applicable, session id, source IP.
|
||||
- Audit rows are never mutated or deleted by application code.
|
||||
- Audit table schema enforces immutability via DB permissions (no UPDATE / DELETE granted to the Admin app's principal).
|
||||
|
||||
---
|
||||
|
||||
## OtOpcUa.Galaxy.Host — Service Host Requirements (GHX-*)
|
||||
|
||||
### GHX-001: TopShelf Windows Service Hosting
|
||||
|
||||
The Galaxy Host shall use TopShelf for Windows service lifecycle (install, uninstall, start, stop) and interactive console mode.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Service name `OtOpcUaGalaxyHost`, display name `OtOpcUa Galaxy Host`.
|
||||
- Installs via `ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.exe install`.
|
||||
- Uninstalls via `ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.exe uninstall`.
|
||||
- Runs as a configured user account (typically the same account as the Server, or a dedicated Galaxy service account with ArchestrA platform access).
|
||||
- Interactive console mode (no args) for development / debugging.
|
||||
- Platform target: **.NET Framework 4.8 x86** — required for MXAccess COM 32-bit interop.
|
||||
- Development deployments may use NSSM in place of TopShelf (memory: `project_galaxy_host_installed`).
|
||||
|
||||
### Details
|
||||
|
||||
- Service description: "OtOpcUa Galaxy Host — MXAccess + Galaxy Repository backend for the Galaxy driver, named-pipe IPC to OtOpcUa.Server."
|
||||
|
||||
---
|
||||
|
||||
### GHX-002: Named-Pipe IPC Bootstrap
|
||||
|
||||
The Host shall open a named pipe on startup whose name, ACL, and shared secret come from environment variables supplied by the supervisor at spawn time.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- `OTOPCUA_GALAXY_PIPE` → pipe name (default `OtOpcUaGalaxy`).
|
||||
- `OTOPCUA_ALLOWED_SID` → SID of the principal allowed to connect; any other principal is denied at the ACL layer.
|
||||
- `OTOPCUA_GALAXY_SECRET` → per-process shared secret; `Driver.Galaxy.Proxy` must present it on handshake.
|
||||
- `OTOPCUA_GALAXY_BACKEND` → `stub` / `db` / `mxaccess` (default `mxaccess`) — selects which backend implementation is loaded.
|
||||
- If `OTOPCUA_ALLOWED_SID` or `OTOPCUA_GALAXY_SECRET` is missing at startup, the Host throws with a descriptive error.
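
The environment contract above (two defaults, two required values) can be sketched as follows. This is a Python illustration, not the Host's .NET Framework code: `PipeBootstrap` and `load_bootstrap` are hypothetical names, and rejecting an unknown backend value is an assumption that goes beyond what the criteria state.

```python
import os
from dataclasses import dataclass
from typing import Mapping

VALID_BACKENDS = ("stub", "db", "mxaccess")
REQUIRED = ("OTOPCUA_ALLOWED_SID", "OTOPCUA_GALAXY_SECRET")


@dataclass(frozen=True)
class PipeBootstrap:
    pipe_name: str
    allowed_sid: str
    secret: str
    backend: str


def load_bootstrap(env: Mapping[str, str] = os.environ) -> PipeBootstrap:
    """Read the supervisor-supplied variables; fail fast on missing secrets."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variable(s): " + ", ".join(missing))
    backend = env.get("OTOPCUA_GALAXY_BACKEND", "mxaccess")
    if backend not in VALID_BACKENDS:  # assumption: unknown values are rejected
        raise RuntimeError(f"Unknown OTOPCUA_GALAXY_BACKEND: {backend!r}")
    return PipeBootstrap(
        pipe_name=env.get("OTOPCUA_GALAXY_PIPE", "OtOpcUaGalaxy"),
        allowed_sid=env["OTOPCUA_ALLOWED_SID"],
        secret=env["OTOPCUA_GALAXY_SECRET"],
        backend=backend,
    )
```

Passing the environment as a mapping keeps the check testable without mutating the real process environment.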
|
||||
|
||||
---
|
||||
|
||||
### GHX-003: Backend Lifecycle
|
||||
|
||||
The Host shall instantiate the STA pump + MXAccess backend + Galaxy Repository + optional Historian plugin in a defined order and tear them down cleanly on shutdown.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- Startup (mxaccess backend): initialize Serilog → resolve env vars → create `PipeServer` → start `StaPump` → create `MxAccessClient` on STA thread → initialize `GalaxyRepository` → optionally initialize Historian plugin → begin pipe request handling.
|
||||
- Shutdown: stop pipe → dispose MxAccessClient (MXA-007 COM cleanup) → dispose STA pump → flush Serilog.
|
||||
- Shutdown must complete within 30 seconds (Windows SCM timeout).
|
||||
- `Console.CancelKeyPress` triggers the same sequence in console mode.
|
||||
|
||||
---
|
||||
|
||||
### GHX-004: Unhandled Exception Handling
|
||||
|
||||
The Host shall log Fatal on crash and let the supervisor restart it.
|
||||
|
||||
#### Acceptance Criteria
|
||||
|
||||
- `AppDomain.CurrentDomain.UnhandledException` handler logs Fatal with full exception details before termination.
|
||||
- The supervisor's driver-stability policy (`docs/v2/driver-stability.md`) governs restart behavior — backoff, crash-loop detection, and alerting live there, not in the Host.
|
||||
- Server-side: `Driver.Galaxy.Proxy` detects the pipe disconnect, opens its capability circuit, and reports Bad quality on Galaxy nodes; it reconnects automatically once the Host is back.
|
||||
|
||||
@@ -1,157 +1,29 @@
|
||||
# Status Dashboard — Component Requirements
|
||||
# Status Dashboard — Retired
|
||||
|
||||
Parent: [HLR-009](HighLevelReqs.md#hlr-009-status-dashboard)
|
||||
> **Revision** — Retired 2026-04-19 (task #205). The embedded HTTP Status Dashboard hosted inside the v1 LmxOpcUa service (`Dashboard:Port 8081`) has been **superseded by the Admin UI** introduced in OtOpcUa v2. The requirements formerly numbered DASH-001 through DASH-009 no longer apply.
|
||||
|
||||
Reference: LmxProxy Status Dashboard (see `dashboard.JPG` in project root).
|
||||
## What replaces it
|
||||
|
||||
## DASH-001: Embedded HTTP Endpoint
|
||||
Operator surface is now the **OtOpcUa Admin** Blazor Server web app:
|
||||
|
||||
The service shall host a lightweight HTTP listener on a configurable port serving a self-contained HTML status dashboard page (no external dependencies).
|
||||
- Canonical design doc: `docs/v2/admin-ui.md`
|
||||
- High-level operator surface requirement: [HLR-015](HighLevelReqs.md#hlr-015-admin-ui-operator-surface)
|
||||
- Service-host requirements for the Admin process: [ServiceHostReqs.md → ADM-*](ServiceHostReqs.md#otopcua-admin---service-host-requirements-adm-)
|
||||
- Cross-cluster metrics endpoint: `/metrics` on the Admin app — see [HLR-017](HighLevelReqs.md#hlr-017-prometheus-metrics).
|
||||
- Audit log: see [HLR-016](HighLevelReqs.md#hlr-016-audit-logging) and `AuditLogService`.
|
||||
|
||||
### Acceptance Criteria
|
||||
## Mapping from retired DASH-* requirements to today's surface
|
||||
|
||||
- Uses `System.Net.HttpListener` on a configurable port (`Dashboard:Port`, default 8081).
|
||||
- Routes:
|
||||
- `GET /` → HTML dashboard
|
||||
- `GET /api/status` → JSON status report
|
||||
- `GET /api/health` → 200 OK if healthy, 503 if unhealthy
|
||||
- Only GET requests accepted; other methods return 405.
|
||||
- Unknown paths return 404.
|
||||
- All responses include `Cache-Control: no-cache, no-store, must-revalidate` headers.
|
||||
- Dashboard can be disabled via config (`Dashboard:Enabled`, default true).
|
||||
| Retired requirement | Replacement |
|---------------------|-------------|
| DASH-001 Embedded HTTP listener | Admin UI (Blazor Server) hosted in the `OtOpcUa.Admin` process. |
| DASH-002 Connection panel | Admin UI cluster-node view (live via SignalR) shows per-driver connection state. |
| DASH-003 Health panel | Admin UI renders `DriverHealth` + Polly circuit state per driver instance; cluster-level rollup on the cluster dashboard. |
| DASH-004 Subscriptions panel | Prometheus gauges (session count, monitored-item count, driver-subscription count) exposed via `/metrics`. |
| DASH-005 Operations table | Capability-call duration histograms + counts exposed via `/metrics`; Admin UI renders latency summaries per `DriverInstanceId`. |
| DASH-006 Footer (last-updated + version) | Admin UI footer; version stamped from the assembly version of the Admin app. |
| DASH-007 Auto-refresh | Admin UI uses SignalR push for live updates — no meta-refresh. |
| DASH-008 JSON status API | Prometheus `/metrics` endpoint is the programmatic surface. |
| DASH-009 Galaxy info panel | Admin UI Galaxy-driver-instance detail view (driver config, last discovery time, Galaxy DB connection state, MXAccess pipe health). |
|
||||
|
||||
### Details
|
||||
|
||||
- HTTP prefix: `http://+:{port}/` to bind to all interfaces.
|
||||
- If HttpListener fails to start (port conflict, missing URL reservation), log Error and continue service startup without the dashboard.
|
||||
- HTML page is self-contained: inline CSS, no external resources (no CDN, no JavaScript frameworks).
|
||||
|
||||
---
|
||||
|
||||
## DASH-002: Connection Panel
|
||||
|
||||
The dashboard shall display a Connection panel showing MXAccess connection state.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Shows: **Connected** (True/False), **State** (Connected/Disconnected/Reconnecting/Error), **Connected Since** (UTC timestamp).
|
||||
- Green left border when Connected, red when Disconnected/Error, yellow when Reconnecting.
|
||||
- "Connected Since" shows "N/A" when not connected.
|
||||
- Data sourced from MXAccess client's connection state properties.
|
||||
|
||||
### Details
|
||||
|
||||
- Timestamp format: `yyyy-MM-dd HH:mm:ss UTC`.
|
||||
- Panel title: "Connection".
|
||||
|
||||
---
|
||||
|
||||
## DASH-003: Health Panel
|
||||
|
||||
The dashboard shall display a Health panel showing overall service health.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Three states: **Healthy** (green text), **Degraded** (yellow text), **Unhealthy** (red text).
|
||||
- Includes a health message string explaining the status.
|
||||
- Health rules:
|
||||
- Not connected to MXAccess → Unhealthy
|
||||
- Success rate < 50% with > 100 total operations → Degraded
|
||||
- Connected with acceptable success rate → Healthy
|
||||
|
||||
### Details
|
||||
|
||||
- Health message examples: "LmxOpcUa is healthy", "MXAccess client is not connected", "Average success rate is below 50%".
|
||||
- Green left border for Healthy, yellow for Degraded, red for Unhealthy.
|
||||
|
||||
---
|
||||
|
||||
## DASH-004: Subscriptions Panel
|
||||
|
||||
The dashboard shall display a Subscriptions panel showing subscription statistics.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Shows: **Clients** (connected OPC UA client count), **Tags** (total variable nodes in address space), **Active** (active MXAccess subscriptions), **Delivered** (cumulative data change notifications delivered).
|
||||
- Values update on each dashboard refresh.
|
||||
- Zero values shown as "0", not blank.
|
||||
|
||||
### Details
|
||||
|
||||
- "Tags" is the count of variable nodes, not object/folder nodes.
|
||||
- "Active" is the count of distinct MXAccess item subscriptions (after ref-counting — the number of actual AdviseSupervisory calls, not the number of OPC UA monitored items).
|
||||
- "Delivered" is a running counter since service start (not reset on reconnect).
|
||||
|
||||
---
|
||||
|
||||
## DASH-005: Operations Table
|
||||
|
||||
The dashboard shall display an operations metrics table showing performance statistics.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Table with columns: **Operation**, **Count**, **Success Rate**, **Avg (ms)**, **Min (ms)**, **Max (ms)**, **P95 (ms)**.
|
||||
- Rows: Read, Write, Subscribe, Browse.
|
||||
- Empty cells show em-dash ("—") when no data available (count = 0).
|
||||
- Success rate displayed as percentage (e.g., "99.8%").
|
||||
- Latency values rounded to 1 decimal place.
|
||||
|
||||
### Details
|
||||
|
||||
- Metrics sourced from the PerformanceMetrics component (1000-entry rolling buffer for percentile calculation).
|
||||
- "Browse" row tracks OPC UA browse operations.
|
||||
- "Subscribe" row tracks OPC UA CreateMonitoredItems operations.
|
||||
|
||||
---
|
||||
|
||||
## DASH-006: Footer
|
||||
|
||||
The dashboard shall display a footer with last-updated time and service identification.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Format: "Last updated: {timestamp} UTC | Service: ZB.MOM.WW.OtOpcUa.Host v{version}".
|
||||
- Timestamp is the server-side UTC time when the HTML was generated.
|
||||
- Version is read from the assembly version (`Assembly.GetExecutingAssembly().GetName().Version`).
|
||||
|
||||
---
|
||||
|
||||
## DASH-007: Auto-Refresh
|
||||
|
||||
The dashboard page shall auto-refresh to show current status without manual reload.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- HTML page includes `<meta http-equiv="refresh" content="10">` for 10-second auto-refresh.
|
||||
- No JavaScript required for refresh (pure HTML meta-refresh).
|
||||
- Refresh interval: configurable via `Dashboard:RefreshIntervalSeconds`, default 10 seconds.
|
||||
|
||||
---
|
||||
|
||||
## DASH-008: JSON Status API
|
||||
|
||||
The `/api/status` endpoint shall return a JSON object with all dashboard data for programmatic consumption.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Response Content-Type: `application/json`.
|
||||
- JSON structure includes: connection state, health status, subscription statistics, and operation metrics.
|
||||
- Same data as the HTML dashboard, structured for machine consumption.
|
||||
- Suitable for integration with external monitoring tools.
|
||||
|
||||
---
|
||||
|
||||
## DASH-009: Galaxy Info Panel
|
||||
|
||||
The dashboard shall display a Galaxy Info panel showing Galaxy Repository state.
|
||||
|
||||
### Acceptance Criteria
|
||||
|
||||
- Shows: **Galaxy Name** (e.g., ZB), **DB Status** (Connected/Disconnected), **Last Deploy** (timestamp from `galaxy.time_of_last_deploy`), **Objects** (count), **Attributes** (count), **Last Rebuild** (timestamp of last address space rebuild).
|
||||
- Provides visibility into the Galaxy Repository component's state independently of MXAccess connection status.
|
||||
|
||||
### Details
|
||||
|
||||
- "DB Status" reflects whether the most recent change detection poll succeeded.
|
||||
- "Last Deploy" shows the raw `time_of_last_deploy` value from the Galaxy database.
|
||||
- "Objects" and "Attributes" show counts from the most recent successful hierarchy/attribute query.
|
||||
A formal requirements-level doc for the Admin UI (AdminUiReqs.md) is not yet written — the design doc at `docs/v2/admin-ui.md` serves as the authoritative reference until formal cert-compliance requirements are needed.
|
||||
|
||||