# Historical Data Access

LmxNodeManager exposes OPC UA historical data access (HDA) through an abstract IHistorianDataSource interface (Historian/IHistorianDataSource.cs). The Wonderware Historian implementation lives in a separate assembly, ZB.MOM.WW.LmxOpcUa.Historian.Aveva, which is loaded at runtime only when Historian.Enabled=true. This keeps the aahClientManaged SDK out of the core Host, so deployments that do not use history never need the SDK installed.

## Plugin Architecture

The historian surface is split across two assemblies:

- ZB.MOM.WW.LmxOpcUa.Host (core) owns only OPC UA / BCL types:
  - IHistorianDataSource -- the interface LmxNodeManager depends on
  - HistorianEventDto -- SDK-free representation of a historian event record
  - HistorianAggregateMap -- maps OPC UA aggregate NodeIds to AnalogSummary column names
  - HistorianPluginLoader -- loads the plugin via Assembly.LoadFrom at startup
  - HistoryContinuationPointManager -- paginates HistoryRead results
- ZB.MOM.WW.LmxOpcUa.Historian.Aveva (plugin) owns everything SDK-bound:
  - HistorianDataSource -- implements IHistorianDataSource, wraps aahClientManaged
  - IHistorianConnectionFactory / SdkHistorianConnectionFactory -- opens and polls ArchestrA.HistorianAccess connections
  - AvevaHistorianPluginEntry.Create(HistorianConfiguration) -- the static factory invoked by the loader

The plugin assembly and its SDK dependencies (aahClientManaged.dll, aahClient.dll, aahClientCommon.dll, Historian.CBE.dll, Historian.DPAPI.dll, ArchestrA.CloudHistorian.Contract.dll) deploy to a Historian/ subfolder next to ZB.MOM.WW.LmxOpcUa.Host.exe. See Service Hosting for the full layout and deployment matrix.

## Plugin Loading

When the service starts with Historian.Enabled=true, OpcUaService calls HistorianPluginLoader.TryLoad(config). The loader:

  1. Probes AppDomain.CurrentDomain.BaseDirectory\Historian\ZB.MOM.WW.LmxOpcUa.Historian.Aveva.dll.
  2. Installs a one-shot AppDomain.AssemblyResolve handler that redirects any aahClientManaged/aahClientCommon/Historian.* lookups to the same subfolder, so the CLR can resolve SDK dependencies when the plugin first JITs.
  3. Calls the plugin's AvevaHistorianPluginEntry.Create(HistorianConfiguration) via reflection and returns the resulting IHistorianDataSource.
  4. On any failure (plugin missing, entry type not found, SDK assembly unresolvable, bad image), logs a warning with the expected plugin path and returns null. The server starts normally and LmxNodeManager returns BadHistoryOperationUnsupported for every history call.
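
A minimal sketch of that sequence, assuming the documented file layout. The entry type's namespace and the error logging here are illustrative; only the assembly name, subfolder, and Create entry point come from the loader contract above:

```csharp
using System;
using System.IO;
using System.Reflection;

public static class HistorianPluginLoaderSketch
{
    public static IHistorianDataSource? TryLoad(HistorianConfiguration config)
    {
        string pluginDir  = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Historian");
        string pluginPath = Path.Combine(pluginDir, "ZB.MOM.WW.LmxOpcUa.Historian.Aveva.dll");

        // Step 2: redirect failed probes (aahClientManaged, Historian.*, ...) into
        // the Historian subfolder so the SDK resolves when the plugin first JITs.
        AppDomain.CurrentDomain.AssemblyResolve += (_, args) =>
        {
            string candidate = Path.Combine(pluginDir, new AssemblyName(args.Name).Name + ".dll");
            return File.Exists(candidate) ? Assembly.LoadFrom(candidate) : null;
        };

        try
        {
            // Steps 1 and 3: load the plugin, then invoke the static factory by reflection.
            Assembly plugin = Assembly.LoadFrom(pluginPath);
            Type entry = plugin.GetType(
                "ZB.MOM.WW.LmxOpcUa.Historian.Aveva.AvevaHistorianPluginEntry", throwOnError: true)!;
            MethodInfo create = entry.GetMethod("Create", BindingFlags.Public | BindingFlags.Static)!;
            return (IHistorianDataSource?)create.Invoke(null, new object[] { config });
        }
        catch (Exception ex)
        {
            // Step 4: degrade gracefully; history reads answer
            // BadHistoryOperationUnsupported until the plugin deploys correctly.
            Console.Error.WriteLine($"Historian plugin not loaded from '{pluginPath}': {ex.Message}");
            return null;
        }
    }
}
```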

## Wonderware Historian SDK

The plugin uses the AVEVA Historian managed SDK (aahClientManaged.dll) to query historical data. The SDK provides a cursor-based query API through ArchestrA.HistorianAccess, replacing direct SQL queries against the Historian Runtime database. Two query types are used:

- HistoryQuery -- Raw historical samples with timestamp, value (numeric or string), and OPC quality.
- AnalogSummaryQuery -- Pre-computed aggregates with properties for Average, Minimum, Maximum, ValueCount, First, Last, StdDev, and more.

The SDK DLLs are located in lib/ and originate from C:\Program Files (x86)\Wonderware\Historian\. Only the plugin project (src/ZB.MOM.WW.LmxOpcUa.Historian.Aveva/) references them at build time; the core Host project does not.

## Configuration

HistorianConfiguration controls the SDK connection:

```csharp
public class HistorianConfiguration
{
    public bool Enabled { get; set; } = false;
    public string ServerName { get; set; } = "localhost";
    public List<string> ServerNames { get; set; } = new();
    public int FailureCooldownSeconds { get; set; } = 60;
    public bool IntegratedSecurity { get; set; } = true;
    public string? UserName { get; set; }
    public string? Password { get; set; }
    public int Port { get; set; } = 32568;
    public int CommandTimeoutSeconds { get; set; } = 30;
    public int MaxValuesPerRead { get; set; } = 10000;
    public int RequestTimeoutSeconds { get; set; } = 60;
}
```

When Enabled is false, HistorianPluginLoader.TryLoad is not called, no plugin is loaded, and the node manager returns BadHistoryOperationUnsupported for history read requests. When Enabled is true but the plugin cannot be loaded (missing Historian/ subfolder, SDK assembly resolve failure, etc.), the server still starts and returns the same BadHistoryOperationUnsupported status with a warning in the log.
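
For illustration, a cluster-enabled configuration object might be populated like this; the host names are hypothetical, and the timeout pairing follows the property table in the next subsection:

```csharp
var historian = new HistorianConfiguration
{
    Enabled = true,
    // Ordered failover list; when non-empty it supersedes ServerName.
    ServerNames = { "hist-primary", "hist-standby" },   // hypothetical hosts
    FailureCooldownSeconds = 60,
    IntegratedSecurity = true,
    // Keep the outer safety timeout above the SDK packet timeout.
    CommandTimeoutSeconds = 30,
    RequestTimeoutSeconds = 60,
};
```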

### Connection Properties

| Property | Default | Description |
|---|---|---|
| ServerName | localhost | Single Historian server hostname used when ServerNames is empty. Preserved for backward compatibility with pre-cluster deployments |
| ServerNames | [] | Ordered list of Historian cluster nodes. When non-empty, supersedes ServerName and enables read-only cluster failover (see Read-Only Cluster Failover) |
| FailureCooldownSeconds | 60 | How long a failed cluster node is skipped before being retried. Zero means no cooldown (retry on every request) |
| IntegratedSecurity | true | Use Windows authentication |
| UserName | null | Username when IntegratedSecurity is false |
| Password | null | Password when IntegratedSecurity is false |
| Port | 32568 | Historian TCP port |
| CommandTimeoutSeconds | 30 | SDK packet timeout in seconds (inner async bound) |
| RequestTimeoutSeconds | 60 | Outer safety timeout applied to sync-over-async history reads on the OPC UA stack thread. Backstop for CommandTimeoutSeconds; a timed-out read returns BadTimeout. Should be greater than CommandTimeoutSeconds (stability review 2026-04-13, Finding 3) |
| MaxValuesPerRead | 10000 | Maximum values per history read request |

## Connection Lifecycle

HistorianDataSource (in the plugin assembly) maintains a persistent connection to the Historian server via ArchestrA.HistorianAccess:

  1. Lazy connect -- The connection is established on the first query via EnsureConnected(). When a cluster is configured, the data source iterates HistorianClusterEndpointPicker.GetHealthyNodes() in order and returns the first node that successfully connects.
  2. Connection reuse -- Subsequent queries reuse the same connection. The active node is tracked in _activeProcessNode / _activeEventNode and surfaced on the dashboard.
  3. Auto-reconnect -- On connection failure, the connection is disposed, the active node is marked failed in the picker, and the next query re-enters the picker loop to try the next eligible candidate.
  4. Clean shutdown -- Dispose() closes the connection when the service stops.

The connection is opened with ReadOnly = true and ConnectionType = Process. The event (alarm history) path uses a separate connection with ConnectionType = Event, but both silos share the same cluster picker so a node that fails on one silo is immediately skipped on the other.

## Read-Only Cluster Failover

When HistorianConfiguration.ServerNames is non-empty, the plugin picks from an ordered list of cluster nodes instead of a single ServerName. Each connection attempt tries candidates in configuration order until one succeeds. Failed nodes are placed into a timed cooldown and re-admitted when the cooldown elapses.

### HistorianClusterEndpointPicker

The picker (in the plugin assembly, internal) is pure logic with no SDK dependency, so all cluster behavior is unit-testable with a fake clock and a scripted factory. Key characteristics (a condensed sketch follows the list):

- Ordered iteration: nodes are tried in the exact order they appear in ServerNames. Operators can express a preference ("primary first, fallback second") by ordering the list.
- Per-node cooldown: MarkFailed(node, error) starts a FailureCooldownSeconds window during which the node is excluded from GetHealthyNodes(). MarkHealthy(node) clears the window immediately (used on successful connect).
- Automatic re-admission: when a node's cooldown elapses, the next call to GetHealthyNodes() includes it automatically; there is no background probe and no manual reset. The cumulative FailureCount and LastError are retained for operator diagnostics.
- Thread-safe: a single lock guards the per-node state. Operations are microsecond-scale, so contention is a non-issue.
- Shared across silos: one picker instance is shared by the process-values connection and the event-history connection, so a node failure on one path immediately benches it for the other.
- Zero cooldown mode: FailureCooldownSeconds = 0 disables the cooldown entirely, so the node is never benched. Useful for tests, or for operators who want the SDK's own retry semantics to be the sole gate.
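
A condensed sketch of that logic. The real type is internal to the plugin; the injectable clock shown here is an assumption made to match the fake-clock testability note:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal sealed class HistorianClusterEndpointPicker
{
    private sealed class NodeState
    {
        public DateTime CooldownUntil = DateTime.MinValue; // UTC; node is skipped until then
        public int FailureCount;                           // cumulative, never reset
        public string? LastError;
    }

    private readonly object _lock = new();
    private readonly List<string> _order;                  // preserves configuration order
    private readonly Dictionary<string, NodeState> _nodes;
    private readonly TimeSpan _cooldown;
    private readonly Func<DateTime> _utcNow;               // injectable clock for tests

    public HistorianClusterEndpointPicker(
        IEnumerable<string> serverNames, TimeSpan cooldown, Func<DateTime>? utcNow = null)
    {
        _order = serverNames.ToList();
        _nodes = _order.ToDictionary(n => n, _ => new NodeState());
        _cooldown = cooldown;
        _utcNow = utcNow ?? (() => DateTime.UtcNow);
    }

    public IReadOnlyList<string> GetHealthyNodes()
    {
        lock (_lock)
        {
            // Cooled-down nodes re-admit themselves once the window elapses.
            return _order.Where(n => _nodes[n].CooldownUntil <= _utcNow()).ToList();
        }
    }

    public void MarkFailed(string node, string error)
    {
        lock (_lock)
        {
            var state = _nodes[node];
            state.FailureCount++;
            state.LastError = error;
            // A zero cooldown leaves CooldownUntil at "now", so the node is never benched.
            state.CooldownUntil = _utcNow() + _cooldown;
        }
    }

    public void MarkHealthy(string node)
    {
        lock (_lock) { _nodes[node].CooldownUntil = DateTime.MinValue; }
    }
}
```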

### Connection attempt flow

HistorianDataSource.ConnectToAnyHealthyNode(HistorianConnectionType) performs the actual iteration:

  1. Snapshot healthy nodes from the picker. If empty, throw InvalidOperationException with either "No historian nodes configured" or "All N historian nodes are in cooldown".
  2. For each candidate, clone HistorianConfiguration with the candidate as ServerName and pass it to the factory. On success: MarkHealthy(node) and return the (Connection, Node) tuple. On exception: MarkFailed(node, ex.Message), log a warning, continue.
  3. If all candidates fail, wrap the last inner exception in an InvalidOperationException with the cumulative failure count so the existing read-method catch blocks surface a meaningful error through the health counters.

The wrapping exception intentionally includes the last inner error message in the outer Message so the health snapshot's LastError field is still human-readable when the cluster exhausts every candidate.
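
A sketch of that loop as a free-standing helper. The connect delegate stands in for the factory call plus the config clone, and the connection type parameter is elided; the exception messages follow the steps above:

```csharp
using System;

internal static class ClusterConnectSketch
{
    // TConnection stands in for the SDK connection type owned by the factory.
    public static (TConnection Connection, string Node) ConnectToAnyHealthyNode<TConnection>(
        HistorianClusterEndpointPicker picker,
        int configuredNodeCount,
        Func<string, TConnection> connect)   // clones the config for the node and opens it
    {
        var candidates = picker.GetHealthyNodes();
        if (candidates.Count == 0)
            throw new InvalidOperationException(configuredNodeCount == 0
                ? "No historian nodes configured"
                : $"All {configuredNodeCount} historian nodes are in cooldown");

        Exception? lastError = null;
        int failures = 0;
        foreach (var node in candidates)
        {
            try
            {
                var connection = connect(node);
                picker.MarkHealthy(node);       // clears any pending cooldown
                return (connection, node);
            }
            catch (Exception ex)
            {
                picker.MarkFailed(node, ex.Message);   // bench and move on
                lastError = ex;
                failures++;
            }
        }

        // Wrap the last inner error so LastError in the health snapshot stays readable.
        throw new InvalidOperationException(
            $"All {failures} candidate nodes failed; last error: {lastError!.Message}", lastError);
    }
}
```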

### Single-node backward compatibility

When ServerNames is empty, the picker is seeded with a single entry from ServerName and the iteration loop still runs with a single candidate. Legacy deployments see no behavior change: the picker marks the single node healthy on success, runs the same cooldown logic on failure, and the dashboard renders a compact `Node: <hostname>` line instead of the cluster table.

### Cluster health surface

Runtime cluster state is exposed on HistorianHealthSnapshot:

- NodeCount / HealthyNodeCount -- size of the configured cluster and how many are currently eligible.
- ActiveProcessNode / ActiveEventNode -- which nodes are currently serving the two connection silos, or null when a silo has no open connection.
- Nodes: List<HistorianClusterNodeState> -- per-node state with Name, IsHealthy, CooldownUntil, FailureCount, LastError, LastFailureTime.

The dashboard renders this as a cluster table when NodeCount > 1. See Status Dashboard. HealthCheckService flips the overall service health to Degraded when HealthyNodeCount < NodeCount, so operators can alert on a partially failed cluster even while queries are still succeeding via the remaining nodes.
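
The cluster-related slice of the snapshot can be pictured as the following DTOs. Property types are inferred from the descriptions above; the real snapshot also carries the query counters described in the next section:

```csharp
using System;
using System.Collections.Generic;

public sealed class HistorianClusterNodeState
{
    public string Name { get; init; } = "";
    public bool IsHealthy { get; init; }
    public DateTime? CooldownUntil { get; init; }    // null or past when eligible
    public int FailureCount { get; init; }           // cumulative since startup
    public string? LastError { get; init; }
    public DateTime? LastFailureTime { get; init; }
}

public sealed class HistorianHealthSnapshot
{
    public int NodeCount { get; init; }
    public int HealthyNodeCount { get; init; }
    public string? ActiveProcessNode { get; init; }  // null when the silo has no open connection
    public string? ActiveEventNode { get; init; }
    public List<HistorianClusterNodeState> Nodes { get; init; } = new();
}
```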

## Runtime Health Counters

HistorianDataSource maintains runtime query counters updated on every read-method exit (success or failure), so the dashboard can distinguish "plugin loaded but never queried" from "plugin loaded and queries are failing". The load-time HistorianPluginLoader.LastOutcome only reports whether the assembly resolved at startup; it cannot catch a connection that succeeds at boot and degrades later.

### Counters

- TotalQueries / TotalSuccesses / TotalFailures -- cumulative since startup. Every call to RecordSuccess or RecordFailure in the read methods updates these under _healthLock. Empty result sets count as successes: the counter reflects "the SDK call returned", not "the SDK call returned data".
- ConsecutiveFailures -- increments while queries are failing; reset to zero by the first success. Drives HealthCheckService degradation at a threshold of 3.
- LastSuccessTime / LastFailureTime -- UTC timestamps of the most recent success or failure, or null when no query of that outcome has occurred yet.
- LastError -- exception message from the most recent failure, prefixed with the read-path name (raw:, aggregate:, at-time:, events:) so operators can tell which SDK call is broken. Cleared on the next success.
- ProcessConnectionOpen / EventConnectionOpen -- whether the plugin currently holds an open SDK connection on each silo. Read from the data source's _connection / _eventConnection fields via a Volatile.Read.

These fields are read once per dashboard refresh via IHistorianDataSource.GetHealthSnapshot() and serialized into HistorianStatusInfo. See Status Dashboard for the HTML/JSON surface.
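
The update pattern reduces to a pair of small helpers guarded by _healthLock. A sketch with field names following the counter descriptions above (the exact helper shapes are assumptions):

```csharp
using System;

public sealed class HistorianDataSourceCountersSketch
{
    private readonly object _healthLock = new();
    private long _totalQueries, _totalSuccesses, _totalFailures;
    private int _consecutiveFailures;
    private DateTime? _lastSuccessTime, _lastFailureTime;
    private string? _lastError;

    private void RecordSuccess()
    {
        lock (_healthLock)
        {
            _totalQueries++;
            _totalSuccesses++;
            _consecutiveFailures = 0;           // first success releases the latch
            _lastSuccessTime = DateTime.UtcNow;
            _lastError = null;                  // cleared on the next success
        }
    }

    private void RecordFailure(string readPath, Exception ex)
    {
        lock (_healthLock)
        {
            _totalQueries++;
            _totalFailures++;
            _consecutiveFailures++;             // HealthCheckService degrades at 3
            _lastFailureTime = DateTime.UtcNow;
            _lastError = $"{readPath}: {ex.Message}";   // e.g. "raw: ..." or "events: ..."
        }
    }
}
```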

## Two SDK connection silos

The plugin maintains two independent ArchestrA.HistorianAccess connections, one per HistorianConnectionType:

- Process connection (ConnectionType = Process) -- serves historical value queries: ReadRawAsync, ReadAggregateAsync, ReadAtTimeAsync. This is the SDK's query channel for tags stored in the Historian runtime.
- Event connection (ConnectionType = Event) -- serves historical event/alarm queries: ReadEventsAsync. The SDK requires a separately opened connection for its event store because the query API and wire schema are distinct from value queries.

Both connections are lazy: they open on the first query that needs them. Either can be open, closed, or connected to a different cluster node than the other. The dashboard renders both independently in the Historian panel (`Process Conn: open (host-a) | Event Conn: closed`) so operators can tell which silos are active and which node is serving each. When a cluster is configured, both silos share the same HistorianClusterEndpointPicker, so a failure on one silo marks the node unhealthy for the other as well.

## Raw Reads

IHistorianDataSource.ReadRawAsync (plugin implementation) uses a HistoryQuery to retrieve individual samples within a time range:

  1. Create a HistoryQuery via _connection.CreateHistoryQuery()
  2. Configure HistoryQueryArgs with TagNames, StartDateTime, EndDateTime, and RetrievalMode = Full
  3. Iterate: StartQuery -> MoveNext loop -> EndQuery

Each result row is converted to an OPC UA DataValue:

- QueryResult.Value (double) takes priority; QueryResult.StringValue is used as fallback for string-typed tags.
- SourceTimestamp and ServerTimestamp are both set to QueryResult.StartDateTime.
- StatusCode is mapped from the QueryResult.OpcQuality (UInt16) via QualityMapper (the same OPC DA quality byte mapping used for live MXAccess data).
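
Stitched together, the loop looks roughly like the fragment below. The SDK type and member names are taken verbatim from the description above, but the cursor surface (how MoveNext exposes the current row) and the numeric-versus-string decision are assumptions to be checked against the aahClientManaged documentation:

```csharp
var values = new List<DataValue>();
HistoryQuery query = _connection.CreateHistoryQuery();
query.StartQuery(new HistoryQueryArgs
{
    TagNames = new[] { tagName },
    StartDateTime = startTime,
    EndDateTime = endTime,
    RetrievalMode = RetrievalMode.Full,   // enum spelling assumed
});

try
{
    while (query.MoveNext() && values.Count < _config.MaxValuesPerRead)
    {
        QueryResult row = query.Current;   // assumed cursor property

        values.Add(new DataValue
        {
            // Numeric Value takes priority; StringValue covers string-typed tags.
            Value = row.StringValue is null ? (object)row.Value : row.StringValue,
            SourceTimestamp = row.StartDateTime,
            ServerTimestamp = row.StartDateTime,
            // Low byte of the OPC DA quality runs through the shared mapper.
            StatusCode = QualityMapper.MapToOpcUaStatusCode(
                QualityMapper.MapFromMxAccessQuality((byte)row.OpcQuality)),
        });
    }
}
finally
{
    query.EndQuery();
}
```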

## Aggregate Reads

IHistorianDataSource.ReadAggregateAsync (plugin implementation) uses an AnalogSummaryQuery to retrieve pre-computed aggregates:

  1. Create an AnalogSummaryQuery via _connection.CreateAnalogSummaryQuery()
  2. Configure AnalogSummaryQueryArgs with TagNames, StartDateTime, EndDateTime, and Resolution (milliseconds)
  3. Iterate the same StartQuery -> MoveNext -> EndQuery pattern
  4. Extract the requested aggregate from named properties on AnalogSummaryQueryResult

Null aggregate values return BadNoData status rather than Good with a null variant.
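
Step 4 plus the null rule amount to one property lookup per row. A sketch; the text does not say whether the real code uses reflection or a switch, so the GetProperty call here is an assumption:

```csharp
// columnName comes from HistorianAggregateMap.MapAggregateToColumn, e.g. "Average".
object? raw = typeof(AnalogSummaryQueryResult).GetProperty(columnName)?.GetValue(row);

values.Add(raw is null
    // Null aggregates surface as BadNoData, never Good with a null variant.
    ? new DataValue { StatusCode = StatusCodes.BadNoData, SourceTimestamp = row.StartDateTime }
    : new DataValue
    {
        Value = raw,
        StatusCode = StatusCodes.Good,
        SourceTimestamp = row.StartDateTime,
        ServerTimestamp = row.StartDateTime,
    });
```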

## Quality Mapping

The Historian SDK returns standard OPC DA quality values in QueryResult.OpcQuality (UInt16). The low byte is passed through the shared QualityMapper pipeline (MapFromMxAccessQuality -> MapToOpcUaStatusCode), which maps the OPC DA quality families to OPC UA status codes:

| OPC Quality Byte | OPC DA Family | OPC UA StatusCode |
|---|---|---|
| 0-63 | Bad | Bad (with sub-code when an exact enum match exists) |
| 64-191 | Uncertain | Uncertain (with sub-code when an exact enum match exists) |
| 192+ | Good | Good (with sub-code when an exact enum match exists) |

See Domain/QualityMapper.cs and Domain/Quality.cs for the full mapping table and sub-code definitions.
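
Conceptually the family split keys off the top two bits of the quality byte. A simplified version; the real mapper also resolves the sub-codes noted in the table above:

```csharp
using Opc.Ua;

// Top two bits of the OPC DA quality byte select the family:
// 00 => Bad (0-63), 01/10 => Uncertain (64-191), 11 => Good (192+).
static StatusCode MapQualityFamily(byte opcDaQuality) => (opcDaQuality >> 6) switch
{
    0b00 => StatusCodes.Bad,
    0b11 => StatusCodes.Good,
    _    => StatusCodes.Uncertain,
};
```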

## Aggregate Function Mapping

HistorianAggregateMap.MapAggregateToColumn (in the core Host assembly, so the node manager can validate aggregate support without requiring the plugin to be loaded) translates OPC UA aggregate NodeIds to AnalogSummaryQueryResult property names:

| OPC UA Aggregate | Result Property |
|---|---|
| AggregateFunction_Average | Average |
| AggregateFunction_Minimum | Minimum |
| AggregateFunction_Maximum | Maximum |
| AggregateFunction_Count | ValueCount |
| AggregateFunction_Start | First |
| AggregateFunction_End | Last |
| AggregateFunction_StandardDeviationPopulation | StdDev |

Unsupported aggregates return null, which causes the node manager to return BadAggregateNotSupported.
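
The table is small enough to live in a dictionary keyed by the stack's well-known aggregate NodeIds. A sketch of the shape; the real class may validate differently:

```csharp
using System.Collections.Generic;
using Opc.Ua;

public static class HistorianAggregateMap
{
    private static readonly Dictionary<NodeId, string> Columns = new()
    {
        [ObjectIds.AggregateFunction_Average] = "Average",
        [ObjectIds.AggregateFunction_Minimum] = "Minimum",
        [ObjectIds.AggregateFunction_Maximum] = "Maximum",
        [ObjectIds.AggregateFunction_Count] = "ValueCount",
        [ObjectIds.AggregateFunction_Start] = "First",
        [ObjectIds.AggregateFunction_End] = "Last",
        [ObjectIds.AggregateFunction_StandardDeviationPopulation] = "StdDev",
    };

    // Null means unsupported; the node manager answers BadAggregateNotSupported.
    public static string? MapAggregateToColumn(NodeId aggregateType) =>
        Columns.TryGetValue(aggregateType, out var column) ? column : null;
}
```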

## HistoryReadRawModified Override

LmxNodeManager overrides HistoryReadRawModified to handle raw history read requests:

  1. Resolve the NodeHandle to a tag reference via _nodeIdToTagReference. Return BadNodeIdUnknown if not found.
  2. Check that _historianDataSource is not null. Return BadHistoryOperationUnsupported if historian is disabled.
  3. Call ReadRawAsync with the time range and NumValuesPerNode from the ReadRawModifiedDetails.
  4. Pack the resulting DataValue list into a HistoryData object and wrap it in an ExtensionObject for the HistoryReadResult.
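
Step 4 uses the stock OPC UA stack types. A fragment, assuming values is the IList<DataValue> returned by ReadRawAsync:

```csharp
using Opc.Ua;

// Pack the samples into the standard HistoryData payload and wrap it in an
// ExtensionObject, which is what HistoryReadResult.HistoryData expects.
var historyData = new HistoryData
{
    DataValues = new DataValueCollection(values),
};

var result = new HistoryReadResult
{
    StatusCode = StatusCodes.Good,
    HistoryData = new ExtensionObject(historyData),
};
```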

## HistoryReadProcessed Override

HistoryReadProcessed handles aggregate history requests with additional validation:

  1. Resolve the node and check historian availability (same as raw).
  2. Validate that AggregateType is present in the ReadProcessedDetails. Return BadAggregateListMismatch if empty.
  3. Map the requested aggregate to a result property via MapAggregateToColumn. Return BadAggregateNotSupported if unmapped.
  4. Call ReadAggregateAsync with the time range, ProcessingInterval, and property name.
  5. Return results in the same HistoryData / ExtensionObject format.

## Historizing Flag and AccessLevel

During variable node creation in CreateAttributeVariable, attributes with IsHistorized == true receive two additional settings:

```csharp
if (attr.IsHistorized)
    accessLevel |= AccessLevels.HistoryRead;
variable.Historizing = attr.IsHistorized;
```

- Historizing = true -- Tells OPC UA clients that this node has historical data available.
- AccessLevels.HistoryRead -- Enables the HistoryRead access bit on the node, which the OPC UA stack checks before routing history requests to the node manager override. Nodes without this bit set are rejected by the framework before reaching HistoryReadRawModified or HistoryReadProcessed.

The IsHistorized flag originates from the Galaxy repository database query, which checks whether the attribute has Historian logging configured.