# Redundancy

## Overview

LmxOpcUa supports OPC UA **non-transparent redundancy** in Warm or Hot mode. In a non-transparent redundancy deployment, two independent server instances run side by side. Both connect to the same Galaxy repository database and the same MXAccess runtime, but each maintains its own OPC UA sessions and subscriptions. Clients discover the redundant set through the `ServerUriArray` exposed in each server's address space and are responsible for managing failover between the two endpoints.

When redundancy is disabled (the default), the server reports `RedundancySupport.None` and a fixed `ServiceLevel` of 255.

## Namespace vs Application Identity

Both servers in the redundant set share the same **namespace URI** so that clients see identical node IDs regardless of which instance they are connected to. The namespace URI follows the pattern `urn:{GalaxyName}:LmxOpcUa` (e.g., `urn:ZB:LmxOpcUa`).

The **ApplicationUri**, on the other hand, must be unique per instance. This is how the OPC UA stack and clients distinguish one server from the other within the redundant set. Each instance sets its own ApplicationUri via the `OpcUa.ApplicationUri` configuration property (e.g., `urn:localhost:LmxOpcUa:instance1` and `urn:localhost:LmxOpcUa:instance2`).

When redundancy is disabled, `ApplicationUri` defaults to `urn:{GalaxyName}:LmxOpcUa` if left null.

## Configuration

### Redundancy Section

| Property | Type | Default | Description |
|---|---|---|---|
| `Enabled` | bool | `false` | Enables non-transparent redundancy. When false, the server reports `RedundancySupport.None` and `ServiceLevel = 255`. |
| `Mode` | string | `"Warm"` | The redundancy mode advertised to clients. Valid values: `Warm`, `Hot`. |
| `Role` | string | `"Primary"` | This instance's role in the redundant pair. Valid values: `Primary`, `Secondary`. The Primary advertises a higher ServiceLevel than the Secondary when both are healthy. |
| `ServerUris` | string[] | `[]` | The ApplicationUri values of all servers in the redundant set. Must include this instance's own `OpcUa.ApplicationUri`. Should contain at least 2 entries. |
| `ServiceLevelBase` | int | `200` | The base ServiceLevel when the server is fully healthy. Valid range: 1-255. The Secondary automatically receives `ServiceLevelBase - 50`. |

### OpcUa.ApplicationUri

| Property | Type | Default | Description |
|---|---|---|---|
| `ApplicationUri` | string | `null` | Explicit application URI for this server instance. When null, defaults to `urn:{GalaxyName}:LmxOpcUa`. **Required when redundancy is enabled** -- each instance needs a unique identity. |

## ServiceLevel Computation

ServiceLevel is a standard OPC UA diagnostic value (0-255) that indicates server health. Clients in a redundant deployment should prefer the server advertising the highest ServiceLevel.

**Baseline values:**

| Role | Baseline |
|---|---|
| Primary | `ServiceLevelBase` (default 200) |
| Secondary | `ServiceLevelBase - 50` (default 150) |

**Penalties applied to the baseline:**

| Condition | Penalty |
|---|---|
| MXAccess disconnected | -100 |
| Galaxy DB unreachable | -50 |
| Both MXAccess and DB down | ServiceLevel forced to 0 |

The final value is clamped to the range 0-255.

**Examples (with default ServiceLevelBase = 200):**

| Scenario | Primary | Secondary |
|---|---|---|
| Both healthy | 200 | 150 |
| MXAccess down | 100 | 50 |
| DB down | 150 | 100 |
| Both down | 0 | 0 |

## Two-Instance Deployment

When deploying a redundant pair, the following configuration properties must differ between the two instances. All other settings (GalaxyName, ConnectionString, etc.) are shared.

| Property | Instance 1 (Primary) | Instance 2 (Secondary) |
|---|---|---|
| `OpcUa.Port` | 4840 | 4841 |
| `OpcUa.ServerName` | `LmxOpcUa-1` | `LmxOpcUa-2` |
| `OpcUa.ApplicationUri` | `urn:localhost:LmxOpcUa:instance1` | `urn:localhost:LmxOpcUa:instance2` |
| `Dashboard.Port` | 8081 | 8082 |
| `MxAccess.ClientName` | `LmxOpcUa-1` | `LmxOpcUa-2` |
| `Redundancy.Role` | `Primary` | `Secondary` |

### Instance 1 -- Primary (appsettings.json)

```json
{
  "OpcUa": {
    "Port": 4840,
    "ServerName": "LmxOpcUa-1",
    "GalaxyName": "ZB",
    "ApplicationUri": "urn:localhost:LmxOpcUa:instance1"
  },
  "MxAccess": {
    "ClientName": "LmxOpcUa-1"
  },
  "Dashboard": {
    "Port": 8081
  },
  "Redundancy": {
    "Enabled": true,
    "Mode": "Warm",
    "Role": "Primary",
    "ServerUris": [
      "urn:localhost:LmxOpcUa:instance1",
      "urn:localhost:LmxOpcUa:instance2"
    ],
    "ServiceLevelBase": 200
  }
}
```

### Instance 2 -- Secondary (appsettings.json)

```json
{
  "OpcUa": {
    "Port": 4841,
    "ServerName": "LmxOpcUa-2",
    "GalaxyName": "ZB",
    "ApplicationUri": "urn:localhost:LmxOpcUa:instance2"
  },
  "MxAccess": {
    "ClientName": "LmxOpcUa-2"
  },
  "Dashboard": {
    "Port": 8082
  },
  "Redundancy": {
    "Enabled": true,
    "Mode": "Warm",
    "Role": "Secondary",
    "ServerUris": [
      "urn:localhost:LmxOpcUa:instance1",
      "urn:localhost:LmxOpcUa:instance2"
    ],
    "ServiceLevelBase": 200
  }
}
```

## CLI `redundancy` Command

The CLI tool at `tools/opcuacli-dotnet/` includes a `redundancy` command that reads the redundancy state from a running server.

```bash
dotnet run -- redundancy -u opc.tcp://localhost:4840/LmxOpcUa
dotnet run -- redundancy -u opc.tcp://localhost:4841/LmxOpcUa
```

The command reads the following standard OPC UA nodes and displays their values:

- **Redundancy Mode** -- from `Server_ServerRedundancy_RedundancySupport` (None, Warm, or Hot)
- **Service Level** -- from `Server_ServiceLevel` (0-255)
- **Server URIs** -- from `Server_ServerRedundancy_ServerUriArray` (list of ApplicationUri values in the redundant set)
- **Application URI** -- from `Server_ServerArray` (this instance's ApplicationUri)

Example output for a healthy Primary:

```
Redundancy Mode: Warm
Service Level: 200
Server URIs:
  - urn:localhost:LmxOpcUa:instance1
  - urn:localhost:LmxOpcUa:instance2
Application URI: urn:localhost:LmxOpcUa:instance1
```

The command also supports `--username`/`--password` and `--security` options for authenticated or encrypted connections.

## Troubleshooting

**Mismatched ServerUris between instances** -- Both instances must list the exact same set of ApplicationUri values in `Redundancy.ServerUris`. If they differ, clients may not discover the full redundant set. Check the startup log for the `Redundancy.ServerUris` line on each instance.

**ServiceLevel stuck at 255** -- This indicates redundancy is not enabled. When `Redundancy.Enabled` is false (the default), the server always reports `ServiceLevel = 255` and `RedundancySupport.None`. Verify that `Redundancy.Enabled` is set to `true` in the configuration and that the configuration section is correctly bound.

**ApplicationUri not set** -- The configuration validator rejects startup when redundancy is enabled but `OpcUa.ApplicationUri` is null or empty. Each instance must have a unique ApplicationUri. Check the error log for: `OpcUa.ApplicationUri must be set when redundancy is enabled`.

**Both servers report the same ServiceLevel** -- Verify that one instance has `Redundancy.Role` set to `Primary` and the other to `Secondary`. Both set to `Primary` (or both to `Secondary`) will produce identical baseline values, preventing clients from distinguishing the preferred server.

**ServerUriArray not readable** -- When `RedundancySupport` is `None` (redundancy disabled), the OPC UA SDK may not expose the `ServerUriArray` node or it may return an empty value. The CLI `redundancy` command handles this gracefully by catching the read error. Enable redundancy to populate this array.
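
**Checking an observed ServiceLevel by hand** -- When a reported value looks wrong, it can help to recompute the expected number from the rules in the ServiceLevel Computation section. The following Python snippet is a minimal sketch of those rules, not the server's actual implementation; the function name `compute_service_level` and its parameters are hypothetical and exist only for illustration.

```python
def compute_service_level(role, mxaccess_connected, db_reachable,
                          service_level_base=200, redundancy_enabled=True):
    """Return the expected ServiceLevel (0-255) per the documented rules.

    Hypothetical helper for illustration; names are not taken from LmxOpcUa.
    """
    # With redundancy disabled, the server reports a fixed value of 255.
    if not redundancy_enabled:
        return 255

    if role not in ("Primary", "Secondary"):
        raise ValueError(f"unknown role: {role!r}")

    # Both dependencies down: ServiceLevel is forced to 0.
    if not mxaccess_connected and not db_reachable:
        return 0

    # Baseline: Primary gets the base, Secondary gets base - 50.
    level = service_level_base if role == "Primary" else service_level_base - 50

    # Penalties applied to the baseline.
    if not mxaccess_connected:
        level -= 100
    if not db_reachable:
        level -= 50

    # Clamp the final value to the valid range.
    return max(0, min(255, level))


if __name__ == "__main__":
    # Recompute the documented example scenarios for the default base of 200.
    scenarios = [
        ("Both healthy", True, True),
        ("MXAccess down", False, True),
        ("DB down", True, False),
        ("Both down", False, False),
    ]
    for name, mx_ok, db_ok in scenarios:
        primary = compute_service_level("Primary", mx_ok, db_ok)
        secondary = compute_service_level("Secondary", mx_ok, db_ok)
        print(f"{name}: Primary={primary} Secondary={secondary}")
```

If the value read by the CLI `redundancy` command differs from what this sketch yields for the instance's role and health state, the role or `ServiceLevelBase` setting on that instance is a reasonable first thing to check.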