diff --git a/Component-DataConnectionLayer.md b/Component-DataConnectionLayer.md index 713ffe2..66562aa 100644 --- a/Component-DataConnectionLayer.md +++ b/Component-DataConnectionLayer.md @@ -32,10 +32,11 @@ IDataConnection : IAsyncDisposable ├── Write(tagPath, value) → void ├── WriteBatch(values) → void ├── WriteBatchAndWait(values, flagPath, flagValue, responsePath, responseValue, timeout) → bool -└── Status → ConnectionHealth +├── Status → ConnectionHealth +└── Disconnected → event Action? ``` -Additional protocols can be added by implementing this interface. +The `Disconnected` event is raised by an adapter when it detects an unexpected connection loss (server offline, network failure, keep-alive timeout). The `DataConnectionActor` subscribes to this event to trigger the reconnection state machine. Additional protocols can be added by implementing this interface. ### Concrete Type Mappings @@ -51,6 +52,7 @@ Additional protocols can be added by implementing this interface. | `WriteBatch(values)` | OPC UA Write (multiple nodes) | gRPC `WriteBatch` RPC (throws on failure) | | `WriteBatchAndWait(...)` | OPC UA Write + poll for confirmation | `WriteBatch` + poll `Read` at 100ms intervals until response value matches or timeout | | `Status` | OPC UA session state | `IsConnected` — true when `SessionId` is non-empty | +| `Disconnected` | `Session.KeepAlive` event fires with bad `ServiceResult` | gRPC subscription stream ends or throws non-cancellation `RpcException` | ### Common Value Type @@ -66,8 +68,11 @@ Both protocols produce the same value tuple consumed by Instance Actors. Before ## Supported Protocols ### OPC UA -- Standard OPC UA client implementation. -- Supports subscriptions (monitored items) and read/write operations. +- Uses the **OPC Foundation .NET Standard Library** (`OPCFoundation.NetStandard.Opc.Ua.Client`). +- Session-based connection with endpoint discovery, certificate handling, and configurable security modes. 
+- Subscriptions via OPC UA Monitored Items with data change notifications (1000ms sampling, queue size 10, discard-oldest). +- Read/Write via OPC UA Read/Write services with StatusCode-based quality mapping. +- Disconnect detection via `Session.KeepAlive` event (see Disconnect Detection Pattern below). ### LmxProxy (Custom Protocol) @@ -97,6 +102,49 @@ LmxProxy is a gRPC-based protocol for communicating with LMX data servers. The D **Proto Source**: The `.proto` file originates from the LmxProxy server repository (`lmx/Proxy/Grpc/Protos/scada.proto` in ScadaBridge). The C# stubs are pre-generated and stored at `Adapters/LmxProxyGrpc/`. +**Test Infrastructure**: The `infra/lmxfakeproxy/` project provides a fake LmxProxy server that bridges to the OPC UA test server. It implements the full `scada.ScadaService` proto, enabling end-to-end testing of `RealLmxProxyClient` without a Windows LmxProxy deployment. See [test_infra_lmxfakeproxy.md](test_infra_lmxfakeproxy.md) for setup. + +## Connection Configuration Reference + +All settings are parsed from the data connection's `Configuration` JSON dictionary (stored as `IDictionary` connection details). Invalid numeric values fall back to defaults silently. 
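+
+The tolerant lookup described above might be sketched as follows (a hypothetical helper — the method name and the `IDictionary<string, string>` value type are illustrative, not the actual implementation):
+
+```csharp
+// Returns the configured value when present and parseable; any missing or
+// invalid entry falls back to the default silently, as described above.
+static int GetIntOrDefault(IDictionary<string, string> config, string key, int defaultValue) =>
+    config.TryGetValue(key, out var raw) && int.TryParse(raw, out var parsed)
+        ? parsed
+        : defaultValue;
+
+// e.g. var sessionTimeoutMs = GetIntOrDefault(config, "SessionTimeoutMs", 60000);
+```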
+ +### OPC UA Settings + +| Key | Type | Default | Description | +|-----|------|---------|-------------| +| `endpoint` / `EndpointUrl` | string | `opc.tcp://localhost:4840` | OPC UA server endpoint URL | +| `SessionTimeoutMs` | int | `60000` | OPC UA session timeout in milliseconds | +| `OperationTimeoutMs` | int | `15000` | Transport operation timeout in milliseconds | +| `PublishingIntervalMs` | int | `1000` | Subscription publishing interval in milliseconds | +| `KeepAliveCount` | int | `10` | Keep-alive frames before session timeout | +| `LifetimeCount` | int | `30` | Subscription lifetime in publish intervals | +| `MaxNotificationsPerPublish` | int | `100` | Max notifications batched per publish cycle | +| `SamplingIntervalMs` | int | `1000` | Per-item server sampling rate in milliseconds | +| `QueueSize` | int | `10` | Per-item notification buffer size | +| `SecurityMode` | string | `None` | Preferred endpoint security: `None`, `Sign`, or `SignAndEncrypt` | +| `AutoAcceptUntrustedCerts` | bool | `true` | Accept untrusted server certificates | + +### LmxProxy Settings + +| Key | Type | Default | Description | +|-----|------|---------|-------------| +| `Host` | string | `localhost` | LmxProxy server hostname | +| `Port` | int | `50051` | LmxProxy gRPC port | +| `ApiKey` | string | *(none)* | API key for `x-api-key` header authentication | +| `SamplingIntervalMs` | int | `0` | Subscription sampling interval: 0 = on-change, >0 = time-based (ms) | +| `UseTls` | bool | `false` | Use HTTPS instead of plain HTTP/2 for gRPC channel | + +### Shared Settings (appsettings.json) + +These are configured via `DataConnectionOptions` in `appsettings.json`, not per-connection: + +| Setting | Default | Description | +|---------|---------|-------------| +| `ReconnectInterval` | 5s | Fixed interval between reconnection attempts | +| `TagResolutionRetryInterval` | 10s | Retry interval for unresolved tag paths | +| `WriteTimeout` | 30s | Timeout for write operations | +| 
`LmxProxyKeepAliveInterval` | 30s | Keep-alive ping interval for LmxProxy sessions | + ## Subscription Management - When an Instance Actor is created (as part of the Site Runtime actor hierarchy), it registers its data source references with the Data Connection Layer. @@ -130,17 +178,34 @@ Each data connection is managed by a dedicated connection actor that uses the Ak This pattern ensures no messages are lost during connection transitions and is the standard Akka.NET approach for actors with I/O lifecycle dependencies. -**LmxProxy-specific notes**: The `RealLmxProxyClient` holds the `SessionId` returned by the `Connect` RPC and includes it in all subsequent operations. The `LmxProxyDataConnection` adapter has no keep-alive timer — session liveness is handled by the DCL's existing reconnect cycle. Subscriptions use server-streaming gRPC — a background task reads from the `ResponseStream` and invokes the callback for each `VtqMessage`. On connection failure, the DCL actor transitions to **Reconnecting**, disposes the client (which cancels active subscriptions), and retries at the fixed interval. +**LmxProxy-specific notes**: The `RealLmxProxyClient` holds the `SessionId` returned by the `Connect` RPC and includes it in all subsequent operations. Subscriptions use server-streaming gRPC — a background task reads from the `ResponseStream` and invokes the callback for each `VtqMessage`. When the stream breaks (server offline, network failure), the background task detects the `RpcException` or stream end and invokes the `onStreamError` callback, which triggers the adapter's `Disconnected` event. The DCL actor transitions to **Reconnecting**, pushes bad quality, disposes the client, and retries at the fixed interval. + +**OPC UA-specific notes**: The `RealOpcUaClient` uses the OPC Foundation SDK's `Session.KeepAlive` event for proactive disconnect detection. The SDK sends keep-alive requests at the subscription's `KeepAliveCount × PublishingInterval` (default: 10s). 
When keep-alive fails, the `ConnectionLost` event fires, triggering the same reconnection flow. On reconnection, the DCL re-creates the OPC UA session and subscription, then re-adds all monitored items. ## Connection Lifecycle & Reconnection The DCL manages connection lifecycle automatically: 1. **Connection drop detection**: When a connection to a data source is lost, the DCL immediately pushes a value update with quality `bad` for **every tag subscribed on that connection**. Instance Actors and their downstream consumers (alarms, scripts checking quality) see the staleness immediately. -2. **Auto-reconnect with fixed interval**: The DCL retries the connection at a configurable fixed interval (e.g., every 5 seconds). The retry interval is defined **per data connection**. This is consistent with the fixed-interval retry philosophy used throughout the system. For LmxProxy, the DCL's reconnect cycle owns all recovery — re-establishing the gRPC channel and session after any connection failure. Individual gRPC operations (reads, writes) fail immediately to the caller on error; there is no operation-level retry within the adapter. +2. **Auto-reconnect with fixed interval**: The DCL retries the connection at a configurable fixed interval (e.g., every 5 seconds). The retry interval is defined **per data connection**. This is consistent with the fixed-interval retry philosophy used throughout the system. Individual gRPC/OPC UA operations (reads, writes) fail immediately to the caller on error; there is no operation-level retry within the adapter. 3. **Connection state transitions**: The DCL tracks each connection's state as `connected`, `disconnected`, or `reconnecting`. All transitions are logged to Site Event Logging. 4. **Transparent re-subscribe**: On successful reconnection, the DCL automatically re-establishes all previously active subscriptions for that connection. 
Instance Actors require no action — they simply see quality return to `good` as fresh values arrive from restored subscriptions. +### Disconnect Detection Pattern + +Each adapter implements the `IDataConnection.Disconnected` event to proactively signal connection loss to the `DataConnectionActor`. Detection uses two complementary paths: + +**Proactive detection** (server goes offline between operations): +- **OPC UA**: The OPC Foundation SDK fires `Session.KeepAlive` events at regular intervals. `RealOpcUaClient` hooks this event; when `ServiceResult.IsBad(e.Status)` (server unreachable, keep-alive timeout), it fires `ConnectionLost`. The `OpcUaDataConnection` adapter translates this into `IDataConnection.Disconnected`. +- **LmxProxy**: gRPC server-streaming subscriptions run in background tasks reading from `ResponseStream`. When the server goes offline, the stream either ends normally (server closed) or throws a non-cancellation `RpcException`. `RealLmxProxyClient` invokes the `onStreamError` callback, which `LmxProxyDataConnection` translates into `IDataConnection.Disconnected`. + +**Reactive detection** (failure discovered during an operation): +- Both adapters wrap `ReadAsync` (and by extension `ReadBatchAsync`) with exception handling. If a read throws a non-cancellation exception, the adapter calls `RaiseDisconnected()` and re-throws. The `DataConnectionActor`'s existing error handling catches the exception while the disconnect event triggers the reconnection state machine. + +**Event marshalling**: The `DataConnectionActor` subscribes to `_adapter.Disconnected` in `PreStart()`. Since `Disconnected` may fire from a background thread (gRPC stream task, OPC UA keep-alive timer), the handler sends an `AdapterDisconnected` message to `Self`, marshalling the notification onto the actor's message loop. This triggers `BecomeReconnecting()` → bad quality push → retry timer. 
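+
+A minimal sketch of this marshalling pattern, assuming an Akka.NET `ReceiveActor` (only `AdapterDisconnected`, `BecomeReconnecting`, and the `Disconnected` event come from this document; the surrounding shape is illustrative):
+
+```csharp
+protected override void PreStart()
+{
+    // Disconnected may fire on a background thread (gRPC stream task,
+    // OPC UA keep-alive timer), so the handler only Tells Self — it never
+    // touches actor state directly.
+    _adapter.Disconnected += () => Self.Tell(new AdapterDisconnected());
+}
+
+// Registered in the actor's constructor; handled on the message loop,
+// where mutating actor state is safe.
+Receive<AdapterDisconnected>(_ => BecomeReconnecting());
+```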
+ +**Once-only guard**: Both `LmxProxyDataConnection` and `OpcUaDataConnection` use a `volatile bool _disconnectFired` flag to ensure `RaiseDisconnected()` fires exactly once per connection session. The flag resets on successful reconnection (`ConnectAsync`). + ## Write Failure Handling Writes to physical devices are **synchronous** from the script's perspective: diff --git a/infra/README.md b/infra/README.md index 4f67e4d..ce8dc0d 100644 --- a/infra/README.md +++ b/infra/README.md @@ -17,6 +17,7 @@ This starts five services: | MS SQL 2022 | 1433 | Configuration and machine data databases | | SMTP (Mailpit) | 1025 (SMTP), 8025 (web) | Email capture for notification testing | | REST API (Flask) | 5200 | External REST API for Gateway and Inbound API testing | +| LmxFakeProxy (.NET gRPC) | 50051 (gRPC) | LmxProxy-compatible server bridging to OPC UA test server | ## First-Time SQL Setup diff --git a/test_infra.md b/test_infra.md index 2ab4ca0..a5095ab 100644 --- a/test_infra.md +++ b/test_infra.md @@ -11,6 +11,7 @@ This document describes the local Docker-based test infrastructure for ScadaLink | MS SQL 2022 | `mcr.microsoft.com/mssql/server:2022-latest` | 1433 | `infra/mssql/setup.sql` | | SMTP (Mailpit) | `axllent/mailpit:latest` | 1025 (SMTP), 8025 (web) | Environment vars | | REST API (Flask) | Custom build (`infra/restapi/Dockerfile`) | 5200 | `infra/restapi/app.py` | +| LmxFakeProxy | Custom build (`infra/lmxfakeproxy/Dockerfile`) | 50051 (gRPC) | Environment vars | ## Quick Start @@ -40,6 +41,7 @@ Each service has a dedicated document with configuration details, verification s - [test_infra_db.md](test_infra_db.md) — MS SQL 2022 database - [test_infra_smtp.md](test_infra_smtp.md) — SMTP test server (Mailpit) - [test_infra_restapi.md](test_infra_restapi.md) — REST API test server (Flask) +- [test_infra_lmxfakeproxy.md](test_infra_lmxfakeproxy.md) — LmxProxy fake server (OPC UA bridge) ## Connection Strings @@ -107,6 +109,7 @@ infra/ opcua/nodes.json # Custom OPC 
UA tag definitions restapi/app.py # Flask REST API server restapi/Dockerfile # REST API container build + lmxfakeproxy/ # .NET gRPC proxy bridging LmxProxy protocol to OPC UA tools/ # Python CLI tools (opcua, ldap, mssql, smtp, restapi) README.md # Quick-start for the infra folder ``` diff --git a/test_infra_lmxfakeproxy.md b/test_infra_lmxfakeproxy.md new file mode 100644 index 0000000..aba8ae6 --- /dev/null +++ b/test_infra_lmxfakeproxy.md @@ -0,0 +1,76 @@ +# Test Infrastructure: LmxFakeProxy + +## Overview + +LmxFakeProxy is a .NET gRPC server that implements the `scada.ScadaService` proto (full parity with the real LmxProxy server) but bridges to the OPC UA test server instead of System Platform MXAccess. This enables end-to-end testing of `RealLmxProxyClient` and the LmxProxy DCL adapter. + +## Image & Ports + +- **Image**: Custom build (`infra/lmxfakeproxy/Dockerfile`) +- **gRPC endpoint**: `localhost:50051` + +## Configuration + +| Environment Variable | Default | Description | +|---------------------|---------|-------------| +| `PORT` | `50051` | gRPC listen port | +| `OPC_ENDPOINT` | `opc.tcp://localhost:50000` | Backend OPC UA server | +| `OPC_PREFIX` | `ns=3;s=` | Prefix prepended to LMX tags to form OPC UA NodeIds | +| `API_KEY` | *(none)* | If set, enforces API key on all gRPC calls | + +## Tag Address Mapping + +LMX-style flat addresses are mapped to OPC UA NodeIds by prepending the configured prefix: + +| LMX Tag | OPC UA NodeId | +|---------|--------------| +| `Motor.Speed` | `ns=3;s=Motor.Speed` | +| `Pump.FlowRate` | `ns=3;s=Pump.FlowRate` | +| `Tank.Level` | `ns=3;s=Tank.Level` | + +## Supported RPCs + +Full parity with the `scada.ScadaService` proto: + +- **Connect / Disconnect / GetConnectionState** — Session management +- **Read / ReadBatch** — Read tag values via OPC UA +- **Write / WriteBatch / WriteBatchAndWait** — Write values via OPC UA +- **Subscribe** — Server-streaming subscriptions via OPC UA MonitoredItems +- **CheckApiKey** — API 
key validation + +## Verification + +1. Ensure the OPC UA test server is running: +```bash +docker ps --filter name=scadalink-opcua +``` + +2. Start the fake proxy: +```bash +docker compose up -d lmxfakeproxy +``` + +3. Check logs: +```bash +docker logs scadalink-lmxfakeproxy +``` + +4. Test with the ScadaLink CLI or a gRPC client. + +## Running Standalone (without Docker) + +```bash +cd infra/lmxfakeproxy +dotnet run -- --opc-endpoint opc.tcp://localhost:50000 --opc-prefix "ns=3;s=" +``` + +With API key enforcement: +```bash +dotnet run -- --api-key my-secret-key +``` + +## Relevance to ScadaLink Components + +- **Data Connection Layer** — Test `RealLmxProxyClient` and `LmxProxyDataConnection` against real OPC UA data +- **Site Runtime** — Deploy instances with LmxProxy data connections pointing at this server +- **Integration Tests** — End-to-end tests of the LmxProxy protocol path
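+
+For step 4 of the Verification section above, a generic gRPC client works; assuming `grpcurl` is installed and the fake proxy enables server reflection (an assumption — if reflection is off, supply the proto with `-proto` and `-import-path` instead), a quick smoke test might be:
+
+```bash
+# List the services exposed by the fake proxy — scada.ScadaService should appear.
+grpcurl -plaintext localhost:50051 list
+```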