docs(dcl): document primary/backup endpoint redundancy across requirements and test infra
@@ -68,6 +68,8 @@ Central cluster only. Sites have no user interface.
### Site & Data Connection Management (Admin Role)
- Create, edit, and delete site definitions, including Akka node addresses (NodeA/NodeB) and gRPC node addresses (GrpcNodeA/GrpcNodeB).
- Define data connections and assign them to sites (name, protocol type, connection details).
- **Data connection form**: "Primary Endpoint Configuration" (required JSON text area) and optional "Backup Endpoint Configuration" (collapsible section, hidden by default, revealed via "Add Backup Endpoint" button; "Remove Backup" button when editing an existing backup). "Failover Retry Count" numeric input (default 3, min 1, max 20) is visible only when a backup endpoint is configured.
- **Data connection list page**: Shows Primary Config and Backup Config columns. Active Endpoint column populated from health reports.
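
The form rules above can be sketched as a small validation routine. This is a minimal illustration in Python, not the project's actual implementation: the function name `validate_connection_form`, its parameters, and the error strings are hypothetical, and checking that the text parses as JSON is an assumption drawn from the "JSON text area" wording.

```python
import json

# Limits taken from the form description above.
DEFAULT_RETRY_COUNT = 3
MIN_RETRY_COUNT, MAX_RETRY_COUNT = 1, 20


def _is_json(text):
    """Return True if text parses as JSON (assumed requirement)."""
    try:
        json.loads(text)
    except (TypeError, ValueError):
        return False
    return True


def validate_connection_form(primary, backup=None, retry_count=None):
    """Hypothetical validator: returns a list of errors, empty when valid."""
    errors = []

    if not primary or not primary.strip():
        errors.append("Primary Endpoint Configuration is required")
    elif not _is_json(primary):
        errors.append("Primary Endpoint Configuration must be valid JSON")

    if backup is None:
        # No backup endpoint: the retry count input is hidden, so it is ignored.
        return errors

    if not _is_json(backup):
        errors.append("Backup Endpoint Configuration must be valid JSON")

    count = DEFAULT_RETRY_COUNT if retry_count is None else retry_count
    if not MIN_RETRY_COUNT <= count <= MAX_RETRY_COUNT:
        errors.append("Failover Retry Count must be between 1 and 20")

    return errors
```

Skipping the retry-count check when no backup is present mirrors the form hiding that input in the same situation.
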

### Area Management (Admin Role)
- Define hierarchical area structures per site.
@@ -104,9 +104,46 @@ LmxProxy is a gRPC-based protocol for communicating with LMX data servers. The D

**Test Infrastructure**: The `infra/lmxfakeproxy/` project provides a fake LmxProxy server that bridges to the OPC UA test server. It implements the full `scada.ScadaService` proto, enabling end-to-end testing of `RealLmxProxyClient` without a Windows LmxProxy deployment. See [test_infra_lmxfakeproxy.md](../test_infra/test_infra_lmxfakeproxy.md) for setup.

## Endpoint Redundancy

Data connections support an optional backup endpoint for automatic failover when the active endpoint becomes unreachable. Both endpoints use the same protocol.

**Entity fields:**

| Field | Type | Notes |
|-------|------|-------|
| `PrimaryConfiguration` | string? (max 4000) | Required. Renamed from `Configuration` |
| `BackupConfiguration` | string? (max 4000) | Optional. Null = no backup |
| `FailoverRetryCount` | int (default 3) | Retries on active endpoint before switching |

**Failover state machine:**

```
Connected → disconnect → push bad quality → retry active endpoint (5s)
→ N failures (≥ FailoverRetryCount) → switch to other endpoint
→ dispose adapter, create fresh adapter with other config
→ reconnect → ReSubscribeAll → Connected
```

- **Round-robin**: primary → backup → primary → backup. There is no preferred endpoint after the first failover; the connection stays on whichever endpoint is working.
- **No auto-failback**: The connection remains on the active endpoint until it fails.
- **Single-endpoint connections** (no backup): Retry indefinitely on the same endpoint, preserving existing behavior.
- **Adapter lifecycle on failover**: The actor disposes the current `IDataConnection` adapter and creates a fresh one via `DataConnectionFactory.Create()` with the other endpoint's configuration. Clean slate — no stale state.
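
The endpoint-selection rules above can be sketched as a small state holder. This is an illustrative Python sketch under stated assumptions, not the actor's real code: the class name `EndpointSelector` and its methods are hypothetical, and resetting the failure counter after a switch is an assumption the text does not spell out. Disposing and recreating the adapter is left to the caller.

```python
class EndpointSelector:
    """Hypothetical tracker for the active endpoint and its failure count."""

    def __init__(self, primary_config, backup_config=None, failover_retry_count=3):
        self.primary_config = primary_config
        self.backup_config = backup_config
        self.failover_retry_count = failover_retry_count
        self.active = "Primary"
        self.failures = 0

    @property
    def active_config(self):
        """Configuration the next adapter should be created with."""
        return self.primary_config if self.active == "Primary" else self.backup_config

    def on_connect_failed(self):
        """Record a failed retry; return True when the caller should fail over."""
        self.failures += 1
        if self.backup_config is None:
            # Single-endpoint connection: keep retrying the same endpoint.
            return False
        if self.failures >= self.failover_retry_count:
            # Round-robin: flip to the other endpoint, with no preferred side.
            self.active = "Backup" if self.active == "Primary" else "Primary"
            self.failures = 0  # assumption: counter restarts on the new endpoint
            return True
        return False

    def on_connected(self):
        """A successful connection clears the count; there is no auto-failback."""
        self.failures = 0
```

When `on_connect_failed()` returns `True`, the caller would dispose the current adapter and build a fresh one from `active_config`, then re-subscribe.
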

**Health reporting:**

- `DataConnectionHealthReport` includes `ActiveEndpoint`: `"Primary"`, `"Backup"`, or `"Primary (no backup)"`.
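
The three `ActiveEndpoint` values can be derived from two facts: which endpoint is active and whether a backup is configured. A minimal Python sketch follows; the helper name `active_endpoint_label` and its parameters are hypothetical, and only the three string values come from the text above.

```python
def active_endpoint_label(active, has_backup):
    """Hypothetical helper returning the ActiveEndpoint string for a health report."""
    if not has_backup:
        # Without a backup, the connection can only ever be on the primary.
        return "Primary (no backup)"
    if active not in ("Primary", "Backup"):
        raise ValueError(f"unknown endpoint: {active!r}")
    return active
```
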

**Site event log entries:**

- `DataConnectionFailover` (Warning): connection name, from-endpoint, to-endpoint, failure count.
- `DataConnectionRestored` (Info): connection name, active endpoint.

See [`2026-03-22-primary-backup-data-connections-design.md`](../plans/2026-03-22-primary-backup-data-connections-design.md) for the full design.

## Connection Configuration Reference

All settings are parsed from the data connection's configuration JSON dictionaries (`PrimaryConfiguration` and optional `BackupConfiguration`, stored as `IDictionary<string, string>` connection details). Both endpoints use the same protocol-specific keys. Invalid numeric values fall back to defaults silently.
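
The silent fallback for invalid numeric values can be illustrated with a small lookup helper. This is a Python sketch of the behaviour described above, not the project's parser: the function name `get_int_setting` and the key `TimeoutMs` in the usage note are hypothetical.

```python
def get_int_setting(details, key, default):
    """Hypothetical lookup: read an integer from string-to-string connection details.

    Missing keys and values that do not parse as integers return the
    default without raising, matching the silent fallback described above.
    """
    raw = details.get(key)
    if raw is None:
        return default
    try:
        return int(raw)
    except (TypeError, ValueError):
        return default
```

For example, `get_int_setting({"TimeoutMs": "abc"}, "TimeoutMs", 5000)` yields `5000`. The same helper would apply to either endpoint's dictionary, since both use the same protocol-specific keys.
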

### OPC UA Settings

@@ -65,6 +65,7 @@
- Additional protocols can be added by implementing the common interface.
- The Data Connection Layer is a **clean data pipe**: it publishes tag value updates to Instance Actors but performs no evaluation of triggers or alarm conditions.
- **Initial attribute quality**: Attributes bound to a data connection start with **uncertain** quality when the Instance Actor initializes. The quality remains uncertain until the first value update is received from the Data Connection Layer. This distinguishes "never received a value" from "received a known-good value" or "connection lost" (bad quality).
- Data connections support optional **backup endpoints** with automatic failover after a configurable retry count. On failover, all subscriptions are transparently re-created on the new endpoint.

### 2.5 Scale
- Approximately **10 sites**.
@@ -64,6 +64,8 @@ API key (ReadWrite): `c4559c7c6acc60a997135c1381162e3c30f4572ece78dd933c1a626e6f

Full details: [`lmxproxy/instances_config.md`](../../lmxproxy/instances_config.md)

**Primary/backup testing**: The dual OPC UA test servers (ports 50000 and 50010) in local Docker and the dual LmxProxy v2 instances on windev (ports 50100 and 50101) provide primary/backup endpoint pairs for testing Data Connection Layer failover. Use `docker compose stop opcua` to simulate primary failure and verify automatic failover to the backup.

## Connection Strings

For use in `appsettings.Development.json`: