Doc refresh (task #203) — driver docs split + drivers index + IHistoryProvider-aware HistoricalDataAccess

Restructure the driver-facing docs to match the OtOpcUa v2 multi-driver
reality (Galaxy, Modbus, S7, AB CIP, AB Legacy, TwinCAT, FOCAS, OPC UA Client
— 8 drivers total; Galaxy ships as three projects) and the capability-interface
architecture where every driver opts into IDriver + whichever of IReadable /
IWritable / ITagDiscovery / ISubscribable / IHostConnectivityProbe /
IPerCallHostResolver / IAlarmSource / IHistoryProvider / IRediscoverable it
supports. Doc scope follows the code: driver-specific docs are scoped to that
driver, cross-driver concerns live once at the top level, and per-driver specs
cross-link to docs/v2/driver-specs.md rather than duplicating it.

What changed per file:

- docs/MxAccessBridge.md -> docs/drivers/Galaxy.md (git mv + rewrite): retitled
  "Galaxy Driver", reframed as one of eight drivers. Added Project Split table
  (Shared .NET Standard 2.0 / Host .NET 4.8 x86 / Proxy .NET 10) and Why
  Out-of-Process section citing both the MXAccess bitness constraint and Tier C
  stability isolation per docs/v2/plan.md section 4. Added IPC Transport
  section covering pipe naming, MessagePack framing, DACL that denies Admins,
  shared-secret handshake, heartbeat, and CallAsync<TReq,TResp> dispatch.
  Moved file paths from src/ZB.MOM.WW.LmxOpcUa.Host/MxAccess/* to
  src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/Backend/MxAccess/* and added the
  Shared + Proxy key-file tables. Added CapabilityInvoker + OTOPCUA0001
  analyzer callout. Cross-linked to drivers/README.md, Galaxy-Repository.md,
  HistoricalDataAccess.md.

- docs/GalaxyRepository.md -> docs/drivers/Galaxy-Repository.md (git mv +
  rewrite): retitled "Galaxy Repository — Tag Discovery for the Galaxy
  Driver", opened with a comparison table showing how every driver's
  ITagDiscovery source is different (AB CIP @tags walker, TwinCAT
  SymbolLoaderFactory, FOCAS CNC queries, OPC UA Client Session.Browse, etc).
  Repositioned GalaxyRepositoryService as the Galaxy driver's
  ITagDiscovery.DiscoverAsync implementation. Updated paths to
  Driver.Galaxy.Host/Backend/GalaxyRepository/*. Added IRediscoverable section
  covering the on-change-redeploy IPC path.

- docs/drivers/README.md (new): index with ground-truth driver table —
  project path, stability tier, wire library, capability-interface list, and
  one notable quirk per driver. Verified against the driver csproj files and
  class declarations on focas-pr3-remaining-capabilities (the most recent
  branch containing every driver). Galaxy gets its own dedicated docs; the
  other seven drivers cross-link to docs/v2/driver-specs.md. Lists the full
  Core.Abstractions capability surface, DriverTypeRegistry, CapabilityInvoker,
  and OTOPCUA0001 analyzer.

- docs/HistoricalDataAccess.md (rewrite): reframed around IHistoryProvider as
  a per-driver optional capability interface. Replaced v1 HistorianPluginLoader
  / AvevaHistorianPluginEntry plugin architecture with the v2 story —
  Historian.Aveva was merged into Driver.Galaxy.Host/Backend/Historian/ and
  IPC-forwarded through GalaxyProxyDriver. Documented all four IHistoryProvider
  methods (ReadRawAsync / ReadProcessedAsync / ReadAtTimeAsync /
  ReadEventsAsync), CapabilityInvoker wrapping with DriverCapability.HistoryRead,
  and the per-driver coverage matrix (Galaxy + OPC UA Client implement; the
  six protocol drivers don't and return BadHistoryOperationUnsupported). Kept
  the cluster-failover + health-counter + quality-mapping detail for the
  Galaxy Historian implementation. Flagged one gap: Proxy forwards all four
  history message kinds but the Host-side HistoryAggregateType -> AnalogSummary
  column mapping may surface GalaxyIpcException{Code="not-implemented"} on a
  given branch until the Phase 2 Galaxy out-of-process gate lands.

Driver list built against ground truth (src on focas-pr3-remaining-capabilities):
  Driver.Galaxy.{Shared,Host,Proxy}, Driver.Modbus, Driver.S7, Driver.AbCip,
  Driver.AbLegacy, Driver.TwinCAT, Driver.FOCAS, Driver.OpcUaClient.
Capability interface lists verified against each *Driver.cs class declaration.
Aveva Historian ported to Driver.Galaxy.Host/Backend/Historian/; no separate
Historian.Aveva assembly on v2 branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-04-20 01:30:47 -04:00
parent 985b7aba26
commit 71339307fa
5 changed files with 372 additions and 385 deletions

@@ -1,228 +1,109 @@
# Historical Data Access
`LmxNodeManager` exposes OPC UA historical data access (HDA) through an abstract `IHistorianDataSource` interface (`Historian/IHistorianDataSource.cs`). The Wonderware Historian implementation lives in a separate assembly, `ZB.MOM.WW.OtOpcUa.Historian.Aveva`, which is loaded at runtime only when `Historian.Enabled=true`. This keeps the `aahClientManaged` SDK out of the core Host so deployments that do not need history do not need the SDK installed.
OPC UA HistoryRead is a **per-driver optional capability** in OtOpcUa. The Core dispatches HistoryRead service calls to the owning driver through the `IHistoryProvider` capability interface (`src/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IHistoryProvider.cs`). Drivers that don't implement the interface return `BadHistoryOperationUnsupported` for every history call on their nodes; that is the expected behavior for protocol drivers (Modbus, S7, AB CIP, AB Legacy, TwinCAT, FOCAS) whose wire protocols carry no time-series data.
## Plugin Architecture
Historian integration is no longer a separate bolt-on assembly, as it was in v1 (`ZB.MOM.WW.LmxOpcUa.Historian.Aveva` plugin). It is now one optional capability any driver can implement. The first implementation is the Galaxy driver's Wonderware Historian integration; OPC UA Client forwards HistoryRead to the upstream server. Every other driver leaves the capability unimplemented and the Core short-circuits history calls on nodes that belong to those drivers.
The historian surface is split across two assemblies:
## `IHistoryProvider`
- **`ZB.MOM.WW.OtOpcUa.Host`** (core) owns only OPC UA / BCL types:
- `IHistorianDataSource` -- the interface `LmxNodeManager` depends on
- `HistorianEventDto` -- SDK-free representation of a historian event record
- `HistorianAggregateMap` -- maps OPC UA aggregate NodeIds to AnalogSummary column names
- `HistorianPluginLoader` -- loads the plugin via `Assembly.LoadFrom` at startup
- `HistoryContinuationPointManager` -- paginates HistoryRead results
- **`ZB.MOM.WW.OtOpcUa.Historian.Aveva`** (plugin) owns everything SDK-bound:
- `HistorianDataSource` -- implements `IHistorianDataSource`, wraps `aahClientManaged`
- `IHistorianConnectionFactory` / `SdkHistorianConnectionFactory` -- opens and polls `ArchestrA.HistorianAccess` connections
- `AvevaHistorianPluginEntry.Create(HistorianConfiguration)` -- the static factory invoked by the loader
Four methods, mapping onto the four OPC UA HistoryRead service variants:
The plugin assembly and its SDK dependencies (`aahClientManaged.dll`, `aahClient.dll`, `aahClientCommon.dll`, `Historian.CBE.dll`, `Historian.DPAPI.dll`, `ArchestrA.CloudHistorian.Contract.dll`) deploy to a `Historian/` subfolder next to `ZB.MOM.WW.OtOpcUa.Host.exe`. See [Service Hosting](ServiceHosting.md#required-runtime-assemblies) for the full layout and deployment matrix.
| Method | OPC UA service | Notes |
|--------|----------------|-------|
| `ReadRawAsync` | HistoryReadRawModified (raw subset) | Returns `HistoryReadResult { Samples, ContinuationPoint? }`. The Core handles `ContinuationPoint` pagination. |
| `ReadProcessedAsync` | HistoryReadProcessed | Takes a `HistoryAggregateType` (Average / Minimum / Maximum / Total / Count) and a bucket `interval`. Drivers that can't express an aggregate throw `NotSupportedException`; the Core translates that into `BadAggregateNotSupported`. |
| `ReadAtTimeAsync` | HistoryReadAtTime | Default implementation throws `NotSupportedException` — drivers without interpolation / prior-boundary support leave the default. |
| `ReadEventsAsync` | HistoryReadEvents | Historical alarm/event rows, distinct from the live `IAlarmSource` stream. Default throws; only drivers with an event historian (Galaxy's A&E log) override. |
## Plugin Loading
Supporting DTOs live alongside the interface in `Core.Abstractions`:
When the service starts with `Historian.Enabled=true`, `OpcUaService` calls `HistorianPluginLoader.TryLoad(config)`. The loader:
- `HistoryReadResult(IReadOnlyList<DataValueSnapshot> Samples, byte[]? ContinuationPoint)`
- `HistoryAggregateType` — enum `{ Average, Minimum, Maximum, Total, Count }`
- `HistoricalEvent(EventId, SourceName?, EventTimeUtc, ReceivedTimeUtc, Message?, Severity)`
- `HistoricalEventsResult(IReadOnlyList<HistoricalEvent> Events, byte[]? ContinuationPoint)`
1. Probes `AppDomain.CurrentDomain.BaseDirectory\Historian\ZB.MOM.WW.OtOpcUa.Historian.Aveva.dll`.
2. Installs a one-shot `AppDomain.AssemblyResolve` handler that redirects any `aahClientManaged`/`aahClientCommon`/`Historian.*` lookups to the same subfolder, so the CLR can resolve SDK dependencies when the plugin first JITs.
3. Calls the plugin's `AvevaHistorianPluginEntry.Create(HistorianConfiguration)` via reflection and returns the resulting `IHistorianDataSource`.
4. On any failure (plugin missing, entry type not found, SDK assembly unresolvable, bad image), logs a warning with the expected plugin path and returns `null`. The server starts normally and `LmxNodeManager` returns `BadHistoryOperationUnsupported` for every history call.
## Dispatch through `CapabilityInvoker`
## Wonderware Historian SDK
All four HistoryRead surfaces are wrapped by `CapabilityInvoker` (`Core/Resilience/CapabilityInvoker.cs`) with `DriverCapability.HistoryRead`. The Polly pipeline keyed on `(DriverInstanceId, HostName, DriverCapability.HistoryRead)` provides timeout, circuit-breaker, and bulkhead defaults per the driver's stability tier (see [docs/v2/driver-stability.md](v2/driver-stability.md)).
The plugin uses the AVEVA Historian managed SDK (`aahClientManaged.dll`) to query historical data. The SDK provides a cursor-based query API through `ArchestrA.HistorianAccess`, replacing direct SQL queries against the Historian Runtime database. Two query types are used:
The dispatch point is `DriverNodeManager` in `ZB.MOM.WW.OtOpcUa.Server`. When the OPC UA stack calls `HistoryRead`, the node manager:
- **`HistoryQuery`** -- Raw historical samples with timestamp, value (numeric or string), and OPC quality.
- **`AnalogSummaryQuery`** -- Pre-computed aggregates with properties for Average, Minimum, Maximum, ValueCount, First, Last, StdDev, and more.
1. Resolves the target `NodeHandle` to a `(DriverInstanceId, fullReference)` pair.
2. Checks the owning driver's `DriverTypeMetadata` to see if the type may advertise history at all (fast reject for types that never implement `IHistoryProvider`).
3. If the driver instance implements `IHistoryProvider`, wraps the `ReadRawAsync` / `ReadProcessedAsync` / `ReadAtTimeAsync` / `ReadEventsAsync` call in `CapabilityInvoker.InvokeAsync(... DriverCapability.HistoryRead ...)`.
4. Translates the `HistoryReadResult` into an OPC UA `HistoryData` + `ExtensionObject`.
5. Manages the continuation point via `HistoryContinuationPointManager` so clients can page through large result sets.
The SDK DLLs are located in `lib/` and originate from `C:\Program Files (x86)\Wonderware\Historian\`. Only the plugin project (`src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/`) references them at build time; the core Host project does not.
Driver-level history code never sees the continuation-point protocol or the OPC UA stack types — those stay in the Core.
## Configuration
## Driver coverage
`HistorianConfiguration` controls the SDK connection:
| Driver | Implements `IHistoryProvider`? | Source |
|--------|:------------------------------:|--------|
| Galaxy | Yes — raw, processed, at-time, events | `aahClientManaged` SDK (Wonderware Historian) on the Host side, forwarded through the Proxy's IPC |
| OPC UA Client | Yes — raw, processed, at-time, events (forwarded to upstream) | `Opc.Ua.Client.Session.HistoryRead` against the remote server |
| Modbus | No | Wire protocol has no time-series concept |
| Siemens S7 | No | S7comm has no time-series concept |
| AB CIP | No | CIP has no time-series concept |
| AB Legacy | No | PCCC has no time-series concept |
| TwinCAT | No | ADS symbol reads are point-in-time; archiving is an external concern |
| FOCAS | No | Default — FOCAS has no general-purpose historian API |
```csharp
public class HistorianConfiguration
{
public bool Enabled { get; set; } = false;
public string ServerName { get; set; } = "localhost";
public List<string> ServerNames { get; set; } = new();
public int FailureCooldownSeconds { get; set; } = 60;
public bool IntegratedSecurity { get; set; } = true;
public string? UserName { get; set; }
public string? Password { get; set; }
public int Port { get; set; } = 32568;
public int CommandTimeoutSeconds { get; set; } = 30;
public int MaxValuesPerRead { get; set; } = 10000;
public int RequestTimeoutSeconds { get; set; } = 60;
}
```
## Galaxy — Wonderware Historian (`aahClientManaged`)
When `Enabled` is `false`, `HistorianPluginLoader.TryLoad` is not called, no plugin is loaded, and the node manager returns `BadHistoryOperationUnsupported` for history read requests. When `Enabled` is `true` but the plugin cannot be loaded (missing `Historian/` subfolder, SDK assembly resolve failure, etc.), the server still starts and returns the same `BadHistoryOperationUnsupported` status with a warning in the log.
The Galaxy driver's `IHistoryProvider` implementation lives on the Host side (`.NET 4.8 x86`) in `src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/Backend/Historian/`. The Proxy's `GalaxyProxyDriver.ReadRawAsync` / `ReadProcessedAsync` / `ReadAtTimeAsync` / `ReadEventsAsync` each serializes a `HistoryRead*Request` and awaits the matching `HistoryRead*Response` over the named pipe (see [drivers/Galaxy.md](drivers/Galaxy.md#ipc-transport)).
### Connection Properties
Host-side, `HistorianDataSource` uses the AVEVA Historian managed SDK (`aahClientManaged.dll`) to query historical data via a cursor-based API through `ArchestrA.HistorianAccess`:
| Property | Default | Description |
|---|---|---|
| `ServerName` | `localhost` | Single Historian server hostname used when `ServerNames` is empty. Preserved for backward compatibility with pre-cluster deployments |
| `ServerNames` | `[]` | Ordered list of Historian cluster nodes. When non-empty, supersedes `ServerName` and enables read-only cluster failover (see [Cluster Failover](#read-only-cluster-failover)) |
| `FailureCooldownSeconds` | `60` | How long a failed cluster node is skipped before being re-tried. Zero means no cooldown (retry on every request) |
| `IntegratedSecurity` | `true` | Use Windows authentication |
| `UserName` | `null` | Username when `IntegratedSecurity` is false |
| `Password` | `null` | Password when `IntegratedSecurity` is false |
| `Port` | `32568` | Historian TCP port |
| `CommandTimeoutSeconds` | `30` | SDK packet timeout in seconds (inner async bound) |
| `RequestTimeoutSeconds` | `60` | Outer safety timeout applied to sync-over-async history reads on the OPC UA stack thread. Backstop for `CommandTimeoutSeconds`; a timed-out read returns `BadTimeout`. Should be greater than `CommandTimeoutSeconds`. Stability review 2026-04-13 Finding 3 |
| `MaxValuesPerRead` | `10000` | Maximum values per history read request |
- **`HistoryQuery`** — raw historical samples (timestamp, value, OPC quality)
- **`AnalogSummaryQuery`** — pre-computed aggregates (Average, Minimum, Maximum, ValueCount, First, Last, StdDev)
## Connection Lifecycle
The SDK DLLs are pulled into the Galaxy.Host project at build time; the Server and every other driver project remain SDK-free.
`HistorianDataSource` (in the plugin assembly) maintains a persistent connection to the Historian server via `ArchestrA.HistorianAccess`:
> **Gap / status note.** The raw SDK wrapper (`HistorianDataSource`, `HistorianClusterEndpointPicker`, `HistorianHealthSnapshot`, etc.) has been ported from the v1 `ZB.MOM.WW.LmxOpcUa.Historian.Aveva` plugin into `Driver.Galaxy.Host/Backend/Historian/`. The **IPC wire-up** — `HistoryReadRequest` / `HistoryReadResponse` message kinds, Proxy-side `ReadRawAsync` / `ReadProcessedAsync` / `ReadAtTimeAsync` / `ReadEventsAsync` forwarding — is in place on `GalaxyProxyDriver`. What remains to close on a given branch is Host-side **mapping of `HistoryAggregateType` onto the `AnalogSummaryQuery` column names** (done in `GalaxyProxyDriver.MapAggregateToColumn`; the Host side must mirror it) and the **end-to-end integration test** that was held by the v1 plugin suite. Until those land on a given driver branch, history calls against Galaxy may surface `GalaxyIpcException { Code = "not-implemented" }` or backend-specific errors rather than populated `HistoryReadResult`s. Track the remaining work against the Phase 2 Galaxy out-of-process gate in `docs/v2/plan.md`.
1. **Lazy connect** -- The connection is established on the first query via `EnsureConnected()`. When a cluster is configured, the data source iterates `HistorianClusterEndpointPicker.GetHealthyNodes()` in order and returns the first node that successfully connects.
2. **Connection reuse** -- Subsequent queries reuse the same connection. The active node is tracked in `_activeProcessNode` / `_activeEventNode` and surfaced on the dashboard.
3. **Auto-reconnect** -- On connection failure, the connection is disposed, the active node is marked failed in the picker, and the next query re-enters the picker loop to try the next eligible candidate.
4. **Clean shutdown** -- `Dispose()` closes the connection when the service stops.
### Aggregate function mapping
The connection is opened with `ReadOnly = true` and `ConnectionType = Process`. The event (alarm history) path uses a separate connection with `ConnectionType = Event`, but both silos share the same cluster picker so a node that fails on one silo is immediately skipped on the other.
`GalaxyProxyDriver.MapAggregateToColumn` (Proxy-side) translates the OPC UA Part 13 standard aggregate enum onto `AnalogSummaryQuery` column names consumed by `HistorianDataSource.ReadAggregateAsync`:
## Read-Only Cluster Failover
| `HistoryAggregateType` | Result Property |
|------------------------|-----------------|
| `Average` | `Average` |
| `Minimum` | `Minimum` |
| `Maximum` | `Maximum` |
| `Count` | `ValueCount` |
When `HistorianConfiguration.ServerNames` is non-empty, the plugin picks from an ordered list of cluster nodes instead of a single `ServerName`. Each connection attempt tries candidates in configuration order until one succeeds. Failed nodes are placed into a timed cooldown and re-admitted when the cooldown elapses.
`HistoryAggregateType.Total` is **not supported** by Wonderware `AnalogSummary` and raises `NotSupportedException`, which the Core translates to `BadAggregateNotSupported`. Additional OPC UA aggregates (`Start`, `End`, `StandardDeviationPopulation`) sit on the Historian columns `First`, `Last`, `StdDev` and can be exposed by extending the enum + mapping together.
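The mapping table and the `Total` exception can be sketched as a single switch. This is a minimal illustration under stated assumptions, not the code in `GalaxyProxyDriver`: the enum below is a local stand-in for the `Core.Abstractions` type, and the `AggregateColumnMap` class name is hypothetical.

```csharp
using System;

// Local stand-in for the Core.Abstractions enum described in the text.
public enum HistoryAggregateType { Average, Minimum, Maximum, Total, Count }

// Hypothetical holder class; the text attributes the real mapping to GalaxyProxyDriver.
public static class AggregateColumnMap
{
    // Returns the AnalogSummary column name for a supported aggregate.
    // Total has no AnalogSummary column, so it throws NotSupportedException,
    // which the Core is described as translating to BadAggregateNotSupported.
    public static string MapAggregateToColumn(HistoryAggregateType aggregate) => aggregate switch
    {
        HistoryAggregateType.Average => "Average",
        HistoryAggregateType.Minimum => "Minimum",
        HistoryAggregateType.Maximum => "Maximum",
        HistoryAggregateType.Count => "ValueCount",
        HistoryAggregateType.Total => throw new NotSupportedException(
            "Total is not available from AnalogSummary."),
        _ => throw new ArgumentOutOfRangeException(nameof(aggregate)),
    };
}
```

Exposing `Start`, `End`, or `StandardDeviationPopulation` would mean adding an enum member and a switch arm together, which is why the text recommends extending the enum and the mapping in one change.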
### HistorianClusterEndpointPicker
### Read-only cluster failover
The picker (in the plugin assembly, internal) is pure logic with no SDK dependency — all cluster behavior is unit-testable with a fake clock and scripted factory. Key characteristics:
`HistorianConfiguration.ServerNames` accepts an ordered list of cluster nodes. `HistorianClusterEndpointPicker` iterates the list in configuration order, marks failed nodes with a `FailureCooldownSeconds` window, and re-admits them when the cooldown elapses. One picker instance is shared by the process-values connection and the event-history connection (two SDK silos), so a node failure on one silo immediately benches it for the other. `FailureCooldownSeconds = 0` disables the cooldown — the SDK's own retry semantics are the sole gate.
- **Ordered iteration**: nodes are tried in the exact order they appear in `ServerNames`. Operators can express a preference ("primary first, fallback second") by ordering the list.
- **Per-node cooldown**: `MarkFailed(node, error)` starts a `FailureCooldownSeconds` window during which the node is skipped from `GetHealthyNodes()`. `MarkHealthy(node)` clears the window immediately (used on successful connect).
- **Automatic re-admission**: when a node's cooldown elapses, the next call to `GetHealthyNodes()` includes it automatically — no background probe, no manual reset. The cumulative `FailureCount` and `LastError` are retained for operator diagnostics.
- **Thread-safe**: a single lock guards the per-node state. Operations are microsecond-scale so contention is a non-issue.
- **Shared across silos**: one picker instance is shared by the process-values connection and the event-history connection, so a node failure on one path immediately benches it for the other.
- **Zero cooldown mode**: `FailureCooldownSeconds = 0` disables the cooldown entirely — the node is never benched. Useful for tests or for operators who want the SDK's own retry semantics to be the sole gate.
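The picker behavior listed above (ordered iteration, per-node cooldown, automatic re-admission, zero-cooldown mode) can be sketched as follows. This is an illustrative approximation only: `CooldownEndpointPicker` is a hypothetical name, the constructor shape and the injected clock are assumptions, and the real `HistorianClusterEndpointPicker` also tracks `FailureCount` and `LastError`, which are omitted here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch in the spirit of HistorianClusterEndpointPicker; members may differ.
public sealed class CooldownEndpointPicker
{
    private readonly object _lock = new object();
    private readonly List<string> _nodes;
    private readonly Dictionary<string, DateTime> _cooldownUntil = new Dictionary<string, DateTime>();
    private readonly TimeSpan _cooldown;
    private readonly Func<DateTime> _utcNow; // injectable clock so tests can advance time

    public CooldownEndpointPicker(IEnumerable<string> nodes, int failureCooldownSeconds, Func<DateTime> utcNow)
    {
        _nodes = new List<string>(nodes);
        _cooldown = TimeSpan.FromSeconds(failureCooldownSeconds);
        _utcNow = utcNow;
    }

    // Nodes in configuration order, skipping any still inside their cooldown window.
    public IReadOnlyList<string> GetHealthyNodes()
    {
        lock (_lock)
        {
            var now = _utcNow();
            var healthy = new List<string>();
            foreach (var node in _nodes)
            {
                if (!_cooldownUntil.TryGetValue(node, out var until) || now >= until)
                    healthy.Add(node);
            }
            return healthy;
        }
    }

    // Benches the node for the cooldown window; a zero cooldown never benches.
    public void MarkFailed(string node)
    {
        lock (_lock)
        {
            if (_cooldown > TimeSpan.Zero)
                _cooldownUntil[node] = _utcNow() + _cooldown;
        }
    }

    // Clears the cooldown immediately, as on a successful connect.
    public void MarkHealthy(string node)
    {
        lock (_lock) { _cooldownUntil.Remove(node); }
    }
}
```

Re-admission needs no background probe in this sketch: an expired cooldown entry simply stops filtering the node on the next `GetHealthyNodes()` call.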
Host-side cluster health is surfaced via `HistorianHealthSnapshot { NodeCount, HealthyNodeCount, ActiveProcessNode, ActiveEventNode, Nodes }` and forwarded to the Proxy so the Admin UI Historian panel can render a per-node table. `HealthCheckService` flips overall service health to `Degraded` when `HealthyNodeCount < NodeCount`.
### Connection attempt flow
### Runtime health counters
`HistorianDataSource.ConnectToAnyHealthyNode(HistorianConnectionType)` performs the actual iteration:
`HistorianDataSource` maintains per-read counters — `TotalQueries`, `TotalSuccesses`, `TotalFailures`, `ConsecutiveFailures`, `LastSuccessTime`, `LastFailureTime`, `LastError`, `ProcessConnectionOpen`, `EventConnectionOpen` — so the dashboard can distinguish "backend loaded but never queried" from "backend loaded and queries are failing". `LastError` is prefixed with the read path (`raw:`, `aggregate:`, `at-time:`, `events:`) so operators can tell which silo is broken. `HealthCheckService` degrades at `ConsecutiveFailures >= 3`.
1. Snapshot healthy nodes from the picker. If empty, throw `InvalidOperationException` with either "No historian nodes configured" or "All N historian nodes are in cooldown".
2. For each candidate, clone `HistorianConfiguration` with the candidate as `ServerName` and pass it to the factory. On success: `MarkHealthy(node)` and return the `(Connection, Node)` tuple. On exception: `MarkFailed(node, ex.Message)`, log a warning, continue.
3. If all candidates fail, wrap the last inner exception in an `InvalidOperationException` with the cumulative failure count so the existing read-method catch blocks surface a meaningful error through the health counters.
### Quality mapping
The wrapping exception intentionally includes the last inner error message in the outer `Message` so the health snapshot's `LastError` field is still human-readable when the cluster exhausts every candidate.
### Single-node backward compatibility
When `ServerNames` is empty, the picker is seeded with a single entry from `ServerName` and the iteration loop still runs — it just has one candidate. Legacy deployments see no behavior change: the picker marks the single node healthy on success, runs the same cooldown logic on failure, and the dashboard renders a compact `Node: <hostname>` line instead of the cluster table.
### Cluster health surface
Runtime cluster state is exposed on `HistorianHealthSnapshot`:
- `NodeCount` / `HealthyNodeCount` -- size of the configured cluster and how many are currently eligible.
- `ActiveProcessNode` / `ActiveEventNode` -- which nodes are currently serving the two connection silos, or `null` when a silo has no open connection.
- `Nodes: List<HistorianClusterNodeState>` -- per-node state with `Name`, `IsHealthy`, `CooldownUntil`, `FailureCount`, `LastError`, `LastFailureTime`.
The dashboard renders this as a cluster table when `NodeCount > 1`. See [Status Dashboard](StatusDashboard.md#historian). `HealthCheckService` flips the overall service health to `Degraded` when `HealthyNodeCount < NodeCount` so operators can alert on a partially-failed cluster even while queries are still succeeding via the remaining nodes.
## Runtime Health Counters
`HistorianDataSource` maintains runtime query counters updated on every read method exit — success or failure — so the dashboard can distinguish "plugin loaded but never queried" from "plugin loaded and queries are failing". The load-time `HistorianPluginLoader.LastOutcome` only reports whether the assembly resolved at startup; it cannot catch a connection that succeeds at boot and degrades later.
### Counters
- `TotalQueries` / `TotalSuccesses` / `TotalFailures` — cumulative since startup. Every call to `RecordSuccess` or `RecordFailure` in the read methods updates these under `_healthLock`. Empty result sets count as successes — the counter reflects "the SDK call returned" rather than "the SDK call returned data".
- `ConsecutiveFailures` — latches while queries are failing; reset to zero by the first success. Drives `HealthCheckService` degradation at threshold 3.
- `LastSuccessTime` / `LastFailureTime` — UTC timestamps of the most recent success or failure, or `null` when no query of that outcome has occurred yet.
- `LastError` — exception message from the most recent failure, prefixed with the read-path name (`raw:`, `aggregate:`, `at-time:`, `events:`) so operators can tell which SDK call is broken. Cleared on the next success.
- `ProcessConnectionOpen` / `EventConnectionOpen` — whether the plugin currently holds an open SDK connection on each silo. Read from the data source's `_connection` / `_eventConnection` fields via a `Volatile.Read`.
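The counter semantics above can be approximated with a small lock-guarded class. This is a sketch under assumptions, not the `HistorianDataSource` implementation: `ReadHealthCounters`, `Snapshot`, and `IsDegraded` are hypothetical names, the timestamps and connection flags are omitted, and the exact `LastError` format (a space after the colon) is a guess.

```csharp
using System;

// Hypothetical sketch of the described counters; not the real HistorianDataSource.
public sealed class ReadHealthCounters
{
    private readonly object _healthLock = new object();
    private long _totalQueries, _totalSuccesses, _totalFailures;
    private int _consecutiveFailures;
    private string? _lastError;

    // Empty result sets still count as successes: the SDK call returned.
    public void RecordSuccess()
    {
        lock (_healthLock)
        {
            _totalQueries++;
            _totalSuccesses++;
            _consecutiveFailures = 0;   // first success resets the latch
            _lastError = null;          // cleared on the next success
        }
    }

    // readPath is one of "raw", "aggregate", "at-time", "events".
    public void RecordFailure(string readPath, Exception error)
    {
        lock (_healthLock)
        {
            _totalQueries++;
            _totalFailures++;
            _consecutiveFailures++;
            _lastError = readPath + ": " + error.Message;
        }
    }

    // Consistent point-in-time copy of the counters for a dashboard refresh.
    public (long Total, long Successes, long Failures, int Consecutive, string? LastError) Snapshot()
    {
        lock (_healthLock)
            return (_totalQueries, _totalSuccesses, _totalFailures, _consecutiveFailures, _lastError);
    }

    // Threshold of 3 is the HealthCheckService degradation point named in the text.
    public bool IsDegraded()
    {
        lock (_healthLock) return _consecutiveFailures >= 3;
    }
}
```

Taking the snapshot under the same lock as the writers keeps the totals and `ConsecutiveFailures` mutually consistent within one refresh.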
These fields are read once per dashboard refresh via `IHistorianDataSource.GetHealthSnapshot()` and serialized into `HistorianStatusInfo`. See [Status Dashboard](StatusDashboard.md#historian) for the HTML/JSON surface.
### Two SDK connection silos
The plugin maintains two independent `ArchestrA.HistorianAccess` connections, one per `HistorianConnectionType`:
- **Process connection** (`ConnectionType = Process`) — serves historical *value* queries: `ReadRawAsync`, `ReadAggregateAsync`, `ReadAtTimeAsync`. This is the SDK's query channel for tags stored in the Historian runtime.
- **Event connection** (`ConnectionType = Event`) — serves historical *event/alarm* queries: `ReadEventsAsync`. The SDK requires a separately opened connection for its event store because the query API and wire schema are distinct from value queries.
Both connections are lazy: they open on the first query that needs them. Either can be open, closed, or open against a different cluster node than the other. The dashboard renders both independently in the Historian panel (`Process Conn: open (host-a) | Event Conn: closed`) so operators can tell which silos are active and which node is serving each. When cluster support is configured, both silos share the same `HistorianClusterEndpointPicker`, so a failure on one silo marks the node unhealthy for the other as well.
## Raw Reads
`IHistorianDataSource.ReadRawAsync` (plugin implementation) uses a `HistoryQuery` to retrieve individual samples within a time range:
1. Create a `HistoryQuery` via `_connection.CreateHistoryQuery()`
2. Configure `HistoryQueryArgs` with `TagNames`, `StartDateTime`, `EndDateTime`, and `RetrievalMode = Full`
3. Iterate: `StartQuery` -> `MoveNext` loop -> `EndQuery`
Each result row is converted to an OPC UA `DataValue`:
- `QueryResult.Value` (double) takes priority; `QueryResult.StringValue` is used as fallback for string-typed tags.
- `SourceTimestamp` and `ServerTimestamp` are both set to `QueryResult.StartDateTime`.
- `StatusCode` is mapped from the `QueryResult.OpcQuality` (UInt16) via `QualityMapper` (the same OPC DA quality byte mapping used for live MXAccess data).
## Aggregate Reads
`IHistorianDataSource.ReadAggregateAsync` (plugin implementation) uses an `AnalogSummaryQuery` to retrieve pre-computed aggregates:
1. Create an `AnalogSummaryQuery` via `_connection.CreateAnalogSummaryQuery()`
2. Configure `AnalogSummaryQueryArgs` with `TagNames`, `StartDateTime`, `EndDateTime`, and `Resolution` (milliseconds)
3. Iterate the same `StartQuery` -> `MoveNext` -> `EndQuery` pattern
4. Extract the requested aggregate from named properties on `AnalogSummaryQueryResult`
Null aggregate values return `BadNoData` status rather than `Good` with a null variant.
## Quality Mapping
The Historian SDK returns standard OPC DA quality values in `QueryResult.OpcQuality` (UInt16). The low byte is passed through the shared `QualityMapper` pipeline (`MapFromMxAccessQuality` -> `MapToOpcUaStatusCode`), which maps the OPC DA quality families to OPC UA status codes:
The Historian SDK returns standard OPC DA quality values in `QueryResult.OpcQuality` (UInt16). The low byte flows through the shared `QualityMapper` pipeline (`MapFromMxAccessQuality` -> `MapToOpcUaStatusCode`):
| OPC Quality Byte | OPC DA Family | OPC UA StatusCode |
|---|---|---|
|------------------|---------------|-------------------|
| 0-63 | Bad | `Bad` (with sub-code when an exact enum match exists) |
| 64-191 | Uncertain | `Uncertain` (with sub-code when an exact enum match exists) |
| 192+ | Good | `Good` (with sub-code when an exact enum match exists) |
See `Domain/QualityMapper.cs` and `Domain/Quality.cs` for the full mapping table and sub-code definitions.
See `Domain/QualityMapper.cs` and `Domain/Quality.cs` in `Driver.Galaxy.Host` for the full table.
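The family boundaries in the table reduce to two comparisons on the low byte. The following is a simplified sketch, not the `QualityMapper` code: `QualityFamily` and `QualityFamilySketch` are hypothetical names, and sub-code matching is left out entirely.

```csharp
using System;

// Hypothetical coarse families; the real mapper also resolves sub-codes.
public enum QualityFamily { Bad, Uncertain, Good }

public static class QualityFamilySketch
{
    // Classifies an OPC DA quality word by the byte ranges in the table above.
    public static QualityFamily FromOpcQuality(ushort opcQuality)
    {
        int low = opcQuality & 0xFF;   // only the low byte carries the quality family
        if (low >= 192) return QualityFamily.Good;
        if (low >= 64) return QualityFamily.Uncertain;
        return QualityFamily.Bad;
    }
}
```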
## Aggregate Function Mapping
## OPC UA Client — upstream forwarding
`HistorianAggregateMap.MapAggregateToColumn` (in the core Host assembly, so the node manager can validate aggregate support without requiring the plugin to be loaded) translates OPC UA aggregate NodeIds to `AnalogSummaryQueryResult` property names:
The OPC UA Client driver (`Driver.OpcUaClient`) implements `IHistoryProvider` by forwarding each call to the upstream server via `Session.HistoryRead`. Raw / processed / at-time / events map onto the stack's native HistoryRead details types. Continuation points are passed through — the Core's `HistoryContinuationPointManager` treats the driver as an opaque pager.
| OPC UA Aggregate | Result Property |
|---|---|
| `AggregateFunction_Average` | `Average` |
| `AggregateFunction_Minimum` | `Minimum` |
| `AggregateFunction_Maximum` | `Maximum` |
| `AggregateFunction_Count` | `ValueCount` |
| `AggregateFunction_Start` | `First` |
| `AggregateFunction_End` | `Last` |
| `AggregateFunction_StandardDeviationPopulation` | `StdDev` |
## Historizing flag and AccessLevel
Unsupported aggregates return `null`, which causes the node manager to return `BadAggregateNotSupported`.
## HistoryReadRawModified Override
`LmxNodeManager` overrides `HistoryReadRawModified` to handle raw history read requests:
1. Resolve the `NodeHandle` to a tag reference via `_nodeIdToTagReference`. Return `BadNodeIdUnknown` if not found.
2. Check that `_historianDataSource` is not null. Return `BadHistoryOperationUnsupported` if historian is disabled.
3. Call `ReadRawAsync` with the time range and `NumValuesPerNode` from the `ReadRawModifiedDetails`.
4. Pack the resulting `DataValue` list into a `HistoryData` object and wrap it in an `ExtensionObject` for the `HistoryReadResult`.
## HistoryReadProcessed Override
`HistoryReadProcessed` handles aggregate history requests with additional validation:
1. Resolve the node and check historian availability (same as raw).
2. Validate that `AggregateType` is present in the `ReadProcessedDetails`. Return `BadAggregateListMismatch` if empty.
3. Map the requested aggregate to a result property via `MapAggregateToColumn`. Return `BadAggregateNotSupported` if unmapped.
4. Call `ReadAggregateAsync` with the time range, `ProcessingInterval`, and property name.
5. Return results in the same `HistoryData` / `ExtensionObject` format.
## Historizing Flag and AccessLevel
During variable node creation in `CreateAttributeVariable`, attributes with `IsHistorized == true` receive two additional settings:
During variable node creation, drivers that advertise history set:
```csharp
if (attr.IsHistorized)
@@ -230,7 +111,13 @@ if (attr.IsHistorized)
variable.Historizing = attr.IsHistorized;
```
- **`Historizing = true`** -- Tells OPC UA clients that this node has historical data available.
- **`AccessLevels.HistoryRead`** -- Enables the `HistoryRead` access bit on the node, which the OPC UA stack checks before routing history requests to the node manager override. Nodes without this bit set will be rejected by the framework before reaching `HistoryReadRawModified` or `HistoryReadProcessed`.
- **`Historizing = true`** — tells OPC UA clients that the node has historical data available.
- **`AccessLevels.HistoryRead`** — enables the `HistoryRead` access bit. The OPC UA stack checks this bit before routing history requests to the Core dispatcher; nodes without it are rejected before reaching `IHistoryProvider`.
The `IsHistorized` flag originates from the Galaxy repository database query, which checks whether the attribute has Historian logging configured.
The `IsHistorized` flag originates in the driver's discovery output. For Galaxy it comes from the repository query detecting a `HistoryExtension` primitive (see [drivers/Galaxy-Repository.md](drivers/Galaxy-Repository.md)). For OPC UA Client it is copied from the upstream server's `Historizing` property.
## Configuration
Driver-specific historian config lives in each driver's `DriverConfig` JSON blob, validated against the driver type's `DriverConfigJsonSchema` in `DriverTypeRegistry`. The Galaxy driver's historian section carries the fields exercised by `HistorianConfiguration`: `ServerName` / `ServerNames`, `FailureCooldownSeconds`, `IntegratedSecurity` / `UserName` / `Password`, `Port` (default `32568`), `CommandTimeoutSeconds`, `RequestTimeoutSeconds`, `MaxValuesPerRead`. The OPC UA Client driver inherits its timeouts from the upstream session.
See [Configuration.md](Configuration.md) for the schema shape and validation path.