Component: Data Connection Layer
Purpose
The Data Connection Layer provides a uniform interface for reading from and writing to physical machines at site clusters. It abstracts protocol-specific details behind a common interface, manages subscriptions, and delivers live tag value updates to Instance Actors. It is a clean data pipe — it performs no evaluation of triggers, alarm conditions, or business logic.
Location
Site clusters only. Central does not interact with machines directly.
Responsibilities
- Manage data connections defined at the site level (OPC UA servers, custom protocol endpoints).
- Establish and maintain connections to data sources based on deployed instance configurations.
- Subscribe to tag paths as requested by Instance Actors (based on attribute data source references in the flattened configuration).
- Deliver tag value updates to the requesting Instance Actors.
- Support writing values to machines (when Instance Actors forward SetAttribute write requests for data-connected attributes).
- Report data connection health status to the Health Monitoring component.
Common Interface
Both OPC UA and the custom protocol implement the same interface:
IDataConnection
├── Connect(connectionDetails) → void
├── Disconnect() → void
├── Subscribe(tagPath, callback) → subscriptionId
├── Unsubscribe(subscriptionId) → void
├── Read(tagPath) → value
├── Write(tagPath, value) → void
└── Status → ConnectionHealth
Additional protocols can be added by implementing this interface.
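As a concrete illustration of the interface above, here is a minimal sketch in Python (the production implementation is Akka.NET/C#; names mirror the tree above, and the ConnectionHealth states are assumed from the lifecycle section below):

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Callable


class ConnectionHealth(Enum):
    # States assumed from the connection lifecycle section.
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    RECONNECTING = "reconnecting"


class IDataConnection(ABC):
    """Common interface implemented by every protocol adapter."""

    @abstractmethod
    def connect(self, connection_details: dict) -> None: ...

    @abstractmethod
    def disconnect(self) -> None: ...

    @abstractmethod
    def subscribe(self, tag_path: str, callback: Callable[[Any], None]) -> int:
        """Returns a subscription id used later for unsubscribe."""

    @abstractmethod
    def unsubscribe(self, subscription_id: int) -> None: ...

    @abstractmethod
    def read(self, tag_path: str) -> Any: ...

    @abstractmethod
    def write(self, tag_path: str, value: Any) -> None: ...

    @property
    @abstractmethod
    def status(self) -> ConnectionHealth: ...
```

A new protocol adapter subclasses this and implements all members; the rest of the DCL is written against the abstract type only.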
Supported Protocols
OPC UA
- Standard OPC UA client implementation.
- Supports subscriptions (monitored items) and read/write operations.
Custom Protocol
- Proprietary protocol adapter.
- Implements the same subscription-based model as OPC UA.
Subscription Management
- When an Instance Actor is created (as part of the Site Runtime actor hierarchy), it registers its data source references with the Data Connection Layer.
- The DCL subscribes to the tag paths using the concrete connection details from the flattened configuration.
- Tag value updates are delivered directly to the requesting Instance Actor.
- When an Instance Actor is stopped (due to disable, delete, or redeployment), the DCL cleans up the associated subscriptions.
- When a new Instance Actor is created for a redeployment, subscriptions are established fresh based on the new configuration.
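The bookkeeping implied by these rules can be sketched as a small registry, shown here in Python as an illustration (class and method names are hypothetical, not part of the design):

```python
from collections import defaultdict


class SubscriptionRegistry:
    """Tracks which subscriptions belong to which Instance Actor so that
    stopping an actor cleans up exactly its subscriptions."""

    def __init__(self, connection):
        self._connection = connection       # object with subscribe/unsubscribe
        self._by_actor = defaultdict(list)  # actor id -> [subscription ids]

    def register(self, actor_id, tag_paths, deliver):
        # Subscribe each tag path; updates go straight to the requesting actor.
        for path in tag_paths:
            sub_id = self._connection.subscribe(path, deliver)
            self._by_actor[actor_id].append(sub_id)

    def on_actor_stopped(self, actor_id):
        # Disable, delete, or redeployment: drop all of this actor's subscriptions.
        for sub_id in self._by_actor.pop(actor_id, []):
            self._connection.unsubscribe(sub_id)
```

A redeployed instance simply calls register again with the new configuration, so its subscriptions are established fresh rather than mutated in place.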
Write-Back Support
- When a script calls Instance.SetAttribute for an attribute with a data source reference, the Instance Actor sends a write request to the DCL.
- The DCL writes the value to the physical device via the appropriate protocol.
- The existing subscription picks up the confirmed new value from the device and delivers it back to the Instance Actor as a standard value update.
- The Instance Actor's in-memory value is not updated until the device confirms the write.
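The key property, that the in-memory value only changes on device confirmation, can be sketched as follows (a minimal illustration; the class and callback names are hypothetical):

```python
class DataConnectedAttribute:
    """In-memory attribute whose value only changes when the device
    confirms it via the normal subscription callback."""

    def __init__(self, dcl_write, tag_path, initial=None):
        self._write = dcl_write    # function forwarding the write to the DCL
        self._tag_path = tag_path
        self.value = initial

    def set_attribute(self, new_value):
        # Forward the write to the DCL; deliberately do NOT touch self.value here.
        self._write(self._tag_path, new_value)

    def on_value_update(self, confirmed_value):
        # Called when the existing subscription reports the device's value.
        self.value = confirmed_value
```

This keeps the attribute's value a faithful mirror of the device: a rejected or lost write never leaves a phantom value in memory.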
Value Update Message Format
Each value update delivered to an Instance Actor includes:
- Tag path: The relative path of the attribute's data source reference.
- Value: The new value from the device.
- Quality: Data quality indicator (good, bad, uncertain).
- Timestamp: When the value was read from the device.
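The four fields above could be carried as an immutable record, sketched here in Python (the type names are illustrative only):

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Any


class Quality(Enum):
    GOOD = "good"
    BAD = "bad"
    UNCERTAIN = "uncertain"


@dataclass(frozen=True)
class ValueUpdate:
    tag_path: str        # relative path of the attribute's data source reference
    value: Any           # new value from the device
    quality: Quality     # good / bad / uncertain
    timestamp: datetime  # when the value was read from the device
```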
Connection Actor Model
Each data connection is managed by a dedicated connection actor that uses the Akka.NET Become/Stash pattern to model its lifecycle as a state machine:
- Connecting: The actor attempts to establish the connection. Subscription requests and write commands received during this phase are stashed (buffered in the actor's stash).
- Connected: The actor is actively servicing subscriptions. On entering this state, all stashed messages are unstashed and processed.
- Reconnecting: The connection was lost. The actor transitions back to a connecting-like state, stashing new requests while it retries.
This pattern ensures no messages are lost during connection transitions and is the standard Akka.NET approach for actors with I/O lifecycle dependencies.
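The pattern can be illustrated without any actor framework: in Akka.NET, Become swaps the active receive handler and Stash buffers messages; in this Python sketch the current state is simply the handler the actor has "become", and the stash is a plain list (the message strings and transport object are hypothetical):

```python
class ConnectionActor:
    """Framework-free sketch of the Become/Stash lifecycle state machine."""

    def __init__(self, transport):
        self._transport = transport         # object with .handle(message)
        self._stash = []
        self._behavior = self._connecting   # start in Connecting

    def tell(self, message):
        self._behavior(message)

    # --- Connecting (and Reconnecting): buffer work until the link is up ---
    def _connecting(self, message):
        if message == "connection-established":
            self._behavior = self._connected  # Become(Connected)
            self._unstash_all()
        else:
            self._stash.append(message)       # Stash()

    # --- Connected: service subscriptions and writes -----------------------
    def _connected(self, message):
        if message == "connection-lost":
            self._behavior = self._connecting  # back to a connecting-like state
        else:
            self._transport.handle(message)

    def _unstash_all(self):
        pending, self._stash = self._stash, []
        for m in pending:
            self._behavior(m)
```

Because stashed messages are replayed in arrival order when the Connected state is entered, callers never observe a lost subscription request or write command across a reconnect.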
Connection Lifecycle & Reconnection
The DCL manages connection lifecycle automatically:
- Connection drop detection: When a connection to a data source is lost, the DCL immediately pushes a value update with quality bad for every tag subscribed on that connection. Instance Actors and their downstream consumers (alarms, scripts checking quality) see the staleness immediately.
- Auto-reconnect with fixed interval: The DCL retries the connection at a configurable fixed interval (e.g., every 5 seconds). The retry interval is defined per data connection. This is consistent with the fixed-interval retry philosophy used throughout the system.
- Connection state transitions: The DCL tracks each connection's state as connected, disconnected, or reconnecting. All transitions are logged to Site Event Logging.
- Transparent re-subscribe: On successful reconnection, the DCL automatically re-establishes all previously active subscriptions for that connection. Instance Actors require no action; they simply see quality return to good as fresh values arrive from restored subscriptions.
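The drop-detection step can be sketched in a few lines (function and field names are illustrative; deliver stands in for the message send to each Instance Actor):

```python
def on_connection_lost(subscribed_tags, deliver, now):
    """On a detected drop, immediately push a quality-bad update for every
    tag subscribed on the lost connection, so alarms and quality-checking
    scripts see the staleness at once rather than reading stale values."""
    for tag in subscribed_tags:
        deliver({"tag_path": tag, "value": None,
                 "quality": "bad", "timestamp": now})
```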
Write Failure Handling
Writes to physical devices are synchronous from the script's perspective:
- If the write fails (connection down, device rejection, timeout), the error is returned to the calling script. Script authors can catch and handle write errors (log, notify, retry, etc.).
- Write failures are also logged to Site Event Logging.
- There is no store-and-forward for device writes — these are real-time control operations. Buffering stale setpoints for later application would be dangerous in an industrial context.
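From the script author's side, this policy means a failed write surfaces as an ordinary catchable error, sketched here with hypothetical names:

```python
class WriteError(Exception):
    """Error surfaced synchronously to the calling script."""


def write_attribute(write_to_device, log_event, tag_path, value):
    """Forward a write; on failure, log to Site Event Logging and re-raise
    so the script can catch and handle it. Deliberately no store-and-forward:
    the write either reaches the device now or fails visibly."""
    try:
        write_to_device(tag_path, value)
    except WriteError as err:
        log_event(f"write to {tag_path} failed: {err}")
        raise
```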
Tag Path Resolution
When the DCL subscribes to a tag path from the flattened configuration but the path does not exist on the physical device (e.g., typo in the template, device firmware changed, device still booting):
- The failure is logged to Site Event Logging.
- The attribute is marked with quality bad.
- The DCL periodically retries resolution at a configurable interval, accommodating devices that come online in stages or load modules after startup.
- On successful resolution, the subscription activates normally and quality reflects the live value from the device.
Note: Pre-deployment validation at central does not verify that tag paths resolve to real tags on physical devices — that is a runtime concern handled here.
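One periodic retry pass can be sketched as follows (an illustration; device_tags stands in for whatever tag browse or lookup the protocol adapter provides):

```python
def resolution_pass(device_tags, pending, log_event):
    """One periodic retry pass: tag paths that now exist on the device are
    resolved (their subscriptions activate normally); the rest stay
    quality-bad and are retried on the next pass."""
    resolved, still_pending = [], []
    for tag in pending:
        if tag in device_tags:
            resolved.append(tag)
        else:
            log_event(f"tag path not found on device: {tag}")
            still_pending.append(tag)
    return resolved, still_pending
```

Running this at the configured interval naturally handles devices that expose their tag space in stages after boot.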
Health Reporting
The DCL reports the following metrics to the Health Monitoring component via the existing periodic heartbeat:
- Connection status:
connected,disconnected, orreconnectingper data connection. - Tag resolution counts: Per connection, the number of total subscribed tags vs. successfully resolved tags. This gives operators visibility into misconfigured templates without needing to open the debug view for individual instances.
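The heartbeat's per-connection metrics could take a shape like the following (a sketch; the field names and connection name are illustrative, not a defined wire format):

```python
def heartbeat_payload(connections):
    """Build the per-connection health metrics carried on the periodic
    heartbeat: connection state plus subscribed-vs-resolved tag counts.
    `connections` maps connection name -> (state, subscribed, resolved)."""
    return {
        name: {
            "status": state,  # connected / disconnected / reconnecting
            "subscribed_tags": subscribed,
            "resolved_tags": resolved,
        }
        for name, (state, subscribed, resolved) in connections.items()
    }
```

A gap between subscribed_tags and resolved_tags is the operator's signal that a template references tag paths the device does not expose.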
Dependencies
- Site Runtime (Instance Actors): Receives subscription registrations and delivers value updates. Receives write requests.
- Health Monitoring: Reports connection status.
- Site Event Logging: Logs connection status changes.
Interactions
- Site Runtime (Instance Actors): Bidirectional — delivers value updates, receives subscription registrations and write-back commands.
- Health Monitoring: Reports connection health periodically.
- Site Event Logging: Logs connection/disconnection events.