lmxopcua/docs/Redundancy.md
Joseph Doherty a55153d7d5 Add configurable non-transparent OPC UA server redundancy
Separates ApplicationUri from namespace identity so each instance in a
redundant pair has a unique server URI while sharing the same Galaxy
namespace. Exposes RedundancySupport, ServerUriArray, and dynamic
ServiceLevel through the standard OPC UA server object. ServiceLevel
is computed from role (Primary/Secondary) and runtime health (MXAccess
and DB connectivity). Adds CLI redundancy command, second deployed
service instance, and 31 new tests including paired-server integration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:32:17 -04:00

Redundancy

Overview

LmxOpcUa supports OPC UA non-transparent redundancy in Warm or Hot mode. In a non-transparent redundancy deployment, two independent server instances run side by side. Both connect to the same Galaxy repository database and the same MXAccess runtime, but each maintains its own OPC UA sessions and subscriptions. Clients discover the redundant set through the ServerUriArray exposed in each server's address space and are responsible for managing failover between the two endpoints.

When redundancy is disabled (the default), the server reports RedundancySupport.None and a fixed ServiceLevel of 255.

Namespace vs Application Identity

Both servers in the redundant set share the same namespace URI so that clients see identical node IDs regardless of which instance they are connected to. The namespace URI follows the pattern urn:{GalaxyName}:LmxOpcUa (e.g., urn:ZB:LmxOpcUa).

The ApplicationUri, on the other hand, must be unique per instance. This is how the OPC UA stack and clients distinguish one server from the other within the redundant set. Each instance sets its own ApplicationUri via the OpcUa.ApplicationUri configuration property (e.g., urn:localhost:LmxOpcUa:instance1 and urn:localhost:LmxOpcUa:instance2).

When redundancy is disabled, ApplicationUri defaults to urn:{GalaxyName}:LmxOpcUa if left null.
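
The split between the shared namespace URI and the per-instance ApplicationUri can be illustrated with a small Python sketch. It is not the server's actual code; the function names `namespace_uri` and `effective_application_uri` are hypothetical, and only the URI pattern and the null fallback come from the text above.

```python
from typing import Optional


def namespace_uri(galaxy_name: str) -> str:
    """Namespace URI shared by every instance that serves the same Galaxy."""
    return f"urn:{galaxy_name}:LmxOpcUa"


def effective_application_uri(galaxy_name: str, configured: Optional[str]) -> str:
    """Return the configured ApplicationUri, or the namespace-style default when it is null."""
    if configured is None:
        return namespace_uri(galaxy_name)
    return configured


# Both instances share the namespace URI but carry distinct application URIs.
print(namespace_uri("ZB"))  # urn:ZB:LmxOpcUa
print(effective_application_uri("ZB", "urn:localhost:LmxOpcUa:instance1"))
```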

Configuration

Redundancy Section

| Property | Type | Default | Description |
|----------|------|---------|-------------|
| Enabled | bool | false | Enables non-transparent redundancy. When false, the server reports RedundancySupport.None and ServiceLevel = 255. |
| Mode | string | "Warm" | The redundancy mode advertised to clients. Valid values: Warm, Hot. |
| Role | string | "Primary" | This instance's role in the redundant pair. Valid values: Primary, Secondary. The Primary advertises a higher ServiceLevel than the Secondary when both are healthy. |
| ServerUris | string[] | [] | The ApplicationUri values of all servers in the redundant set. Must include this instance's own OpcUa.ApplicationUri. Should contain at least 2 entries. |
| ServiceLevelBase | int | 200 | The base ServiceLevel when the server is fully healthy. Valid range: 1-255. The Secondary automatically receives ServiceLevelBase - 50. |

OpcUa.ApplicationUri

| Property | Type | Default | Description |
|----------|------|---------|-------------|
| ApplicationUri | string | null | Explicit application URI for this server instance. When null, defaults to urn:{GalaxyName}:LmxOpcUa. Required when redundancy is enabled -- each instance needs a unique identity. |
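
The constraints in the two tables above can be collected into one check. The following Python sketch is illustrative only: `RedundancyConfig` and `validate` are hypothetical names, and apart from the ApplicationUri message (which this document quotes under Troubleshooting) the message wording is made up. The document also does not say which of these conditions the real validator treats as errors rather than warnings, so the sketch simply reports all of them.

```python
from dataclasses import dataclass, field
from typing import List, Optional

VALID_MODES = {"Warm", "Hot"}
VALID_ROLES = {"Primary", "Secondary"}


@dataclass
class RedundancyConfig:
    """Mirror of the Redundancy section with its documented defaults."""
    enabled: bool = False
    mode: str = "Warm"
    role: str = "Primary"
    server_uris: List[str] = field(default_factory=list)
    service_level_base: int = 200


def validate(application_uri: Optional[str], cfg: RedundancyConfig) -> List[str]:
    """Return a list of problems; an empty list means the settings look consistent."""
    problems: List[str] = []
    if not cfg.enabled:
        return problems  # nothing to check when redundancy is disabled
    if not application_uri:
        problems.append("OpcUa.ApplicationUri must be set when redundancy is enabled")
    elif application_uri not in cfg.server_uris:
        problems.append("Redundancy.ServerUris must include this instance's ApplicationUri")
    if len(cfg.server_uris) < 2:
        problems.append("Redundancy.ServerUris should contain at least 2 entries")
    if cfg.mode not in VALID_MODES:
        problems.append(f"Redundancy.Mode must be one of {sorted(VALID_MODES)}")
    if cfg.role not in VALID_ROLES:
        problems.append(f"Redundancy.Role must be one of {sorted(VALID_ROLES)}")
    if not 1 <= cfg.service_level_base <= 255:
        problems.append("Redundancy.ServiceLevelBase must be in the range 1-255")
    return problems
```

Returning every problem at once, rather than stopping at the first, makes it easier to fix a configuration file in one pass.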

ServiceLevel Computation

ServiceLevel is a standard OPC UA property of the Server object, a Byte in the range 0-255, that indicates how well the server can currently provide data. Clients in a redundant deployment should prefer the server advertising the highest ServiceLevel.
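
A client-side selection rule along these lines might look like the following Python sketch. It assumes the client has already read ServiceLevel from each endpoint by some means; `pick_preferred` is a hypothetical helper, and treating a level of 0 as unusable is an assumption of this sketch rather than something stated here.

```python
from typing import Dict, Optional


def pick_preferred(service_levels: Dict[str, Optional[int]]) -> Optional[str]:
    """Return the endpoint URL with the highest ServiceLevel, or None if none is usable.

    A value of None means the endpoint could not be read (for example, the
    connection attempt failed). Ties go to the endpoint listed first.
    """
    usable = {url: level for url, level in service_levels.items()
              if level is not None and level > 0}
    if not usable:
        return None
    return max(usable, key=lambda url: usable[url])


# Primary has lost MXAccess (100) while the Secondary is healthy (150).
levels = {
    "opc.tcp://localhost:4840/LmxOpcUa": 100,
    "opc.tcp://localhost:4841/LmxOpcUa": 150,
}
print(pick_preferred(levels))  # opc.tcp://localhost:4841/LmxOpcUa
```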

Baseline values:

| Role | Baseline |
|------|----------|
| Primary | ServiceLevelBase (default 200) |
| Secondary | ServiceLevelBase - 50 (default 150) |

Penalties applied to the baseline:

| Condition | Penalty |
|-----------|---------|
| MXAccess disconnected | -100 |
| Galaxy DB unreachable | -50 |
| Both MXAccess and DB down | ServiceLevel forced to 0 |

The final value is clamped to the range 0-255.

Examples (with default ServiceLevelBase = 200):

| Scenario | Primary | Secondary |
|----------|---------|-----------|
| Both healthy | 200 | 150 |
| MXAccess down | 100 | 50 |
| DB down | 150 | 100 |
| Both down | 0 | 0 |
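
The baseline, penalty, and clamping rules above can be expressed compactly. The Python sketch below is an illustration of the documented arithmetic, not the server's implementation; the function name and parameters are hypothetical.

```python
def compute_service_level(role: str,
                          mxaccess_connected: bool,
                          db_connected: bool,
                          base: int = 200,
                          enabled: bool = True) -> int:
    """Compute ServiceLevel from role and runtime health, following the tables above."""
    if not enabled:
        return 255  # redundancy disabled: fixed ServiceLevel
    if not mxaccess_connected and not db_connected:
        return 0    # both dependencies down: forced to 0
    level = base if role == "Primary" else base - 50
    if not mxaccess_connected:
        level -= 100
    if not db_connected:
        level -= 50
    return max(0, min(255, level))  # clamp to the Byte range


# Reproduce the example table for the default base of 200.
for name, mx, db in [("Both healthy", True, True),
                     ("MXAccess down", False, True),
                     ("DB down", True, False),
                     ("Both down", False, False)]:
    print(name,
          compute_service_level("Primary", mx, db),
          compute_service_level("Secondary", mx, db))
```

With a low base the clamp matters: a Secondary with `base=100` and MXAccess down computes 100 - 50 - 100 = -50, which is reported as 0.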

Two-Instance Deployment

When deploying a redundant pair, the following configuration properties must differ between the two instances. All other settings (GalaxyName, ConnectionString, etc.) are shared.

| Property | Instance 1 (Primary) | Instance 2 (Secondary) |
|----------|----------------------|------------------------|
| OpcUa.Port | 4840 | 4841 |
| OpcUa.ServerName | LmxOpcUa-1 | LmxOpcUa-2 |
| OpcUa.ApplicationUri | urn:localhost:LmxOpcUa:instance1 | urn:localhost:LmxOpcUa:instance2 |
| Dashboard.Port | 8081 | 8082 |
| MxAccess.ClientName | LmxOpcUa-1 | LmxOpcUa-2 |
| Redundancy.Role | Primary | Secondary |

Instance 1 -- Primary (appsettings.json)

```json
{
  "OpcUa": {
    "Port": 4840,
    "ServerName": "LmxOpcUa-1",
    "GalaxyName": "ZB",
    "ApplicationUri": "urn:localhost:LmxOpcUa:instance1"
  },
  "MxAccess": {
    "ClientName": "LmxOpcUa-1"
  },
  "Dashboard": {
    "Port": 8081
  },
  "Redundancy": {
    "Enabled": true,
    "Mode": "Warm",
    "Role": "Primary",
    "ServerUris": [
      "urn:localhost:LmxOpcUa:instance1",
      "urn:localhost:LmxOpcUa:instance2"
    ],
    "ServiceLevelBase": 200
  }
}
```

Instance 2 -- Secondary (appsettings.json)

```json
{
  "OpcUa": {
    "Port": 4841,
    "ServerName": "LmxOpcUa-2",
    "GalaxyName": "ZB",
    "ApplicationUri": "urn:localhost:LmxOpcUa:instance2"
  },
  "MxAccess": {
    "ClientName": "LmxOpcUa-2"
  },
  "Dashboard": {
    "Port": 8082
  },
  "Redundancy": {
    "Enabled": true,
    "Mode": "Warm",
    "Role": "Secondary",
    "ServerUris": [
      "urn:localhost:LmxOpcUa:instance1",
      "urn:localhost:LmxOpcUa:instance2"
    ],
    "ServiceLevelBase": 200
  }
}
```
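
Before deploying, the two files can be cross-checked against the rules in this section. The Python sketch below is a hypothetical helper, not part of the project: it takes the two configurations as already-parsed dictionaries and reports settings that should differ but do not, plus any mismatch in ServerUris. It compares ServerUris as sets, on the assumption that ordering does not matter.

```python
from typing import Any, Dict, List

# Settings that must differ between the two instances (per the table in this section).
MUST_DIFFER = [
    ("OpcUa", "Port"),
    ("OpcUa", "ServerName"),
    ("OpcUa", "ApplicationUri"),
    ("Dashboard", "Port"),
    ("MxAccess", "ClientName"),
    ("Redundancy", "Role"),
]


def check_pair(a: Dict[str, Any], b: Dict[str, Any]) -> List[str]:
    """Return a list of problems found when comparing two instance configurations."""
    problems: List[str] = []
    for section, key in MUST_DIFFER:
        if a.get(section, {}).get(key) == b.get(section, {}).get(key):
            problems.append(f"{section}.{key} must differ between instances")
    uris_a = set(a.get("Redundancy", {}).get("ServerUris", []))
    uris_b = set(b.get("Redundancy", {}).get("ServerUris", []))
    if uris_a != uris_b:
        problems.append("Redundancy.ServerUris must list the same URIs on both instances")
    return problems
```

In practice each dictionary would come from `json.load` on the corresponding appsettings.json; an empty result means the pair is consistent with the rules checked here.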

CLI redundancy Command

The CLI tool at tools/opcuacli-dotnet/ includes a redundancy command that reads the redundancy state from a running server.

```shell
dotnet run -- redundancy -u opc.tcp://localhost:4840/LmxOpcUa
dotnet run -- redundancy -u opc.tcp://localhost:4841/LmxOpcUa
```

The command reads the following standard OPC UA nodes and displays their values:

  • Redundancy Mode -- from Server_ServerRedundancy_RedundancySupport (None, Warm, or Hot)
  • Service Level -- from Server_ServiceLevel (0-255)
  • Server URIs -- from Server_ServerRedundancy_ServerUriArray (list of ApplicationUri values in the redundant set)
  • Application URI -- from Server_ServerArray (this instance's ApplicationUri)

Example output for a healthy Primary:

```
Redundancy Mode:  Warm
Service Level:    200
Server URIs:
  - urn:localhost:LmxOpcUa:instance1
  - urn:localhost:LmxOpcUa:instance2
Application URI:  urn:localhost:LmxOpcUa:instance1
```

The command also supports --username/--password and --security options for authenticated or encrypted connections.
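
For scripted health checks, the command's text output can be parsed into a dictionary. The Python sketch below assumes the output has exactly the shape of the example above (a `Key: value` line per field, with list items indented and prefixed by `- `); the real CLI output may differ, and `parse_redundancy_output` is a hypothetical helper.

```python
from typing import Dict, List, Optional, Union


def parse_redundancy_output(text: str) -> Dict[str, Union[str, List[str]]]:
    """Parse 'Key: value' lines and '- item' lists from the redundancy command output."""
    result: Dict[str, Union[str, List[str]]] = {}
    current_list: Optional[List[str]] = None
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and current_list is not None:
            current_list.append(stripped[2:].strip())  # list item under the last key
            continue
        # Split on the first colon only, since URIs contain colons themselves.
        key, sep, value = stripped.partition(":")
        if not sep:
            continue
        value = value.strip()
        if value:
            result[key.strip()] = value
            current_list = None
        else:
            current_list = []  # a key with no inline value starts a list
            result[key.strip()] = current_list
    return result


sample = """Redundancy Mode:  Warm
Service Level:    200
Server URIs:
  - urn:localhost:LmxOpcUa:instance1
  - urn:localhost:LmxOpcUa:instance2
Application URI:  urn:localhost:LmxOpcUa:instance1
"""
print(parse_redundancy_output(sample)["Service Level"])  # 200
```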

Troubleshooting

Mismatched ServerUris between instances -- Both instances must list the exact same set of ApplicationUri values in Redundancy.ServerUris. If they differ, clients may not discover the full redundant set. Check the startup log for the Redundancy.ServerUris line on each instance.

ServiceLevel stuck at 255 -- This indicates redundancy is not enabled. When Redundancy.Enabled is false (the default), the server always reports ServiceLevel = 255 and RedundancySupport.None. Verify that Redundancy.Enabled is set to true in the configuration and that the configuration section is correctly bound.

ApplicationUri not set -- The configuration validator rejects startup when redundancy is enabled but OpcUa.ApplicationUri is null or empty. Each instance must have a unique ApplicationUri. Check the error log for: OpcUa.ApplicationUri must be set when redundancy is enabled.

Both servers report the same ServiceLevel -- Verify that one instance has Redundancy.Role set to Primary and the other to Secondary. Both set to Primary (or both to Secondary) will produce identical baseline values, preventing clients from distinguishing the preferred server.

ServerUriArray not readable -- When RedundancySupport is None (redundancy disabled), the OPC UA SDK may not expose the ServerUriArray node or it may return an empty value. The CLI redundancy command handles this gracefully by catching the read error. Enable redundancy to populate this array.