ScadaBridge/docs/plans/2026-05-28-mxgateway-data-connection-design.md
Joseph Doherty 8730c6e30a docs(dcl): design for MxGateway data connection (2nd protocol)
Add design doc for a second data-connection protocol, MxGateway, alongside
the OPC UA client. New IDataConnection adapter behind the existing
DataConnectionFactory extension point; tag pipe (read/subscribe/write) plus
Galaxy hierarchy browse, optional 2nd endpoint for failover. Generalizes the
OPC-UA-named browse plumbing to protocol-agnostic browse via
IBrowsableDataConnection. No entity/schema changes.
2026-05-29 07:28:21 -04:00


MxGateway Data Connection — Design

Date: 2026-05-28
Component: Data Connection Layer (#4), with touches to Commons (#16), Central UI (#9), Host (#15)
Status: Approved, ready for implementation planning

Summary

Add a second data-connection protocol, MxGateway, alongside the existing OPC UA client. MxGateway connects to the MxAccess Gateway (https://gitea.dohertylan.com/dohertj2, packages ZB.MOM.WW.MxGateway.Client + ZB.MOM.WW.MxGateway.Contracts) over gRPC and exposes an AVEVA/Wonderware MXAccess-backed Galaxy as a clean tag-value pipe, identical in role to the OPC UA adapter.

The Data Connection Layer was built for exactly this: DataConnectionFactory exposes RegisterAdapter(protocolType, factory) and every surrounding mechanism (the DataConnectionActor Become/Stash state machine, primary/backup failover, health reporting, re-subscribe-on-reconnect) is protocol-agnostic. The new protocol is a single IDataConnection adapter plus one registration line — no changes to the actor, the entity schema, or the failover machinery.

Scope

In scope (this slice):

  • Read / Subscribe / Write — MxGateway as a clean tag-value pipe.
  • Galaxy hierarchy browse for the instance-config tag picker.
  • Optional second endpoint for failover (reusing the existing primary/backup model).

Out of scope (possible later slices):

  • Native MXAccess alarms (QueryActiveAlarms / StreamAlarms / AcknowledgeAlarm). ScadaBridge evaluates its own alarms via Alarm Actors from tag values; native alarms are a new concept.
  • Secured writes (WriteSecured, operator + verifier userId). Plain writes carry a configurable WriteUserId only.

Decisions

| Decision | Choice |
| --- | --- |
| Approach | New IDataConnection adapter behind the existing factory extension point (not a shared base class, not a separate subsystem). |
| Protocol string | "MxGateway" (matches the NuGet package family). |
| Browse plumbing | Generalized to protocol-agnostic browse driven by IBrowsableDataConnection; OPC UA and MxGateway share one path. |
| Write user context | Optional WriteUserId config field, default 0. No script API change. |
| Endpoint redundancy | Reuse existing primary/backup failover; backup = a second gateway endpoint. |
| ApiKey secret handling | Match whatever OPC UA UserIdentityConfig username/password does today. |

Section 1 — Adapter & client lifecycle mapping

New project-internal MxGatewayDataConnection : IDataConnection, IBrowsableDataConnection in ZB.MOM.WW.ScadaBridge.DataConnectionLayer/Adapters/, wrapping an injected IMxGatewayClientFactory (mirrors the IOpcUaClientFactory seam so it is unit-testable with a fake).

| IDataConnection | MxGateway client |
| --- | --- |
| ConnectAsync(details) | MxGatewayClient.Create(Endpoint, ApiKey, TLS) → OpenSessionAsync → RegisterAsync(clientName) (store serverHandle); start background StreamEventsAsync consumer loop |
| SubscribeAsync(tagPath, cb) | AddItemAsync → AdviseAsync (or SubscribeBulkAsync); map itemHandle ↔ tagPath ↔ callback; return subscriptionId |
| UnsubscribeAsync(id) | UnAdviseAsync + RemoveItemAsync |
| ReadAsync / ReadBatchAsync | ReadBulkAsync (uses cached advised value when present) |
| WriteAsync / WriteBatchAsync | WriteBulkAsync with WriteUserId; value via ToMxValue() |
| WriteBatchAndWaitAsync | generic compose: write values → write flag → poll responsePath (advised value or ReadBulk) until match/timeout |
| Status | ConnectionHealth tracked across session state |
| Disconnected | fired once (Interlocked guard) when StreamEventsAsync faults or the channel breaks |
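
The polling step of WriteBatchAndWaitAsync can be pictured with a small helper that keeps reading a response value until it matches or a timeout elapses. This is a minimal sketch, not the adapter's code: ResponsePoller, WaitForMatchAsync, and the delegate shape are hypothetical names, and the real adapter would source the response from the advised value or ReadBulk.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: poll a response value until it equals the expected
// value or the timeout elapses. Not part of the MxGateway client API.
public static class ResponsePoller
{
    // Returns true on a match, false on timeout. Caller cancellation propagates
    // as OperationCanceledException so it is not mistaken for a timeout.
    public static async Task<bool> WaitForMatchAsync(
        Func<CancellationToken, Task<object>> readResponse,
        object expected,
        TimeSpan timeout,
        TimeSpan pollInterval,
        CancellationToken ct = default)
    {
        using var timeoutCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        timeoutCts.CancelAfter(timeout);
        try
        {
            while (true)
            {
                var current = await readResponse(timeoutCts.Token).ConfigureAwait(false);
                if (object.Equals(current, expected)) return true;
                await Task.Delay(pollInterval, timeoutCts.Token).ConfigureAwait(false);
            }
        }
        catch (OperationCanceledException) when (!ct.IsCancellationRequested)
        {
            return false; // the linked timeout fired, not the caller's token
        }
    }
}
```

A caller would first write the values and the flag, then await WaitForMatchAsync with a delegate that reads responsePath. The poll interval is an assumed tuning knob that the design does not specify.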

Value/quality mapping. Each OnDataChange MxEvent carries item_handle, value (MxValue → ToClrValue()), quality (OPC-style int), source_timestamp, statuses, and worker_sequence. It is dispatched to the matching tag's SubscriptionCallback as TagValue(ToClrValue(value), mapQuality(quality, statuses), source_timestamp). Quality: quality >= 192 → Good; bad-category status → Bad; otherwise Uncertain. The loop tracks worker_sequence and resumes with afterWorkerSequence on reconnect so no change is missed.
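
The quality rule above can be sketched as a three-way mapping. This is an illustrative sketch under assumptions: TagQuality and QualityMapper are hypothetical names, the statuses collection is collapsed into a single boolean because its real shape is not described here, and the rules are applied in the order the text lists them.

```csharp
// Hypothetical quality enum; the real ScadaBridge type may differ.
public enum TagQuality { Good, Uncertain, Bad }

public static class QualityMapper
{
    // OPC-style quality codes at or above 192 (0xC0) fall in the Good range.
    private const int GoodThreshold = 192;

    // hasBadCategoryStatus is an assumed pre-computed flag standing in for
    // "any entry in statuses belongs to a bad category".
    public static TagQuality Map(int quality, bool hasBadCategoryStatus)
    {
        if (quality >= GoodThreshold) return TagQuality.Good;
        if (hasBadCategoryStatus) return TagQuality.Bad;
        return TagQuality.Uncertain;
    }
}
```

Whether a bad-category status should override a quality of 192 or higher is not stated in the design; the sketch gives the numeric check precedence only because it is listed first.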

Reconnection needs no new logic. The existing DataConnectionActor catches Disconnected, pushes bad quality to all subscribed tags, disposes the adapter, and on retry calls ConnectAsync on a fresh adapter then re-subscribes all tags — identical to OPC UA.
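
Because the actor disposes the adapter as soon as it sees Disconnected, the adapter must raise that signal only once even if the stream fault and the channel break are reported on different threads. A minimal sketch of such a guard follows; DisconnectedNotifier and TryRaise are hypothetical names, and the event signature on the real IDataConnection may differ.

```csharp
using System;
using System.Threading;

// Hypothetical fire-once wrapper around a Disconnected notification.
public sealed class DisconnectedNotifier
{
    private int _fired; // 0 = not yet raised, 1 = already raised

    public event Action Disconnected;

    // Safe to call from several fault paths concurrently: Interlocked.Exchange
    // lets exactly one caller observe the 0 -> 1 transition and raise the event.
    public bool TryRaise()
    {
        if (Interlocked.Exchange(ref _fired, 1) != 0) return false;
        Disconnected?.Invoke();
        return true;
    }
}
```

Since the actor creates a fresh adapter on every retry, the guard never needs resetting in this design.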

Section 2 — Configuration, secrets & endpoint redundancy

New MxGatewayEndpointConfig in Commons (alongside OpcUaEndpointConfig) with a matching MxGatewayEndpointConfigSerializer (flat-dict ⇄ JSON) and MxGatewayEndpointConfigValidator. Stored exactly like OPC UA: per-connection JSON in DataConnection.PrimaryConfiguration / BackupConfiguration. Primary/backup failover works for free — backup = a second gateway endpoint, round-robin, no auto-failback, driven by the existing FailoverRetryCount state machine. No entity or migration changes.

| Key | Type | Default | Notes |
| --- | --- | --- | --- |
| Endpoint | string | http://localhost:5000 | Gateway base URL |
| ApiKey | string | | Sent as authorization: Bearer <key> |
| ClientName | string | scadabridge-<connName> | Registration name |
| WriteUserId | int | 0 | Applied to every write-back |
| UseTls / CaFile / ServerName | bool/string/string | false / none / none | TLS to a secured gateway |
| ReadTimeoutMs | int | 5000 | ReadBulk per-call timeout |
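
To make the defaults concrete, here is a sketch of how a flat dictionary might be turned into a config object that falls back to the table's defaults when a value is missing or not numeric. The type name MxGatewayEndpointConfigSketch, the FromFlat method, and the fallback behaviour are assumptions for illustration; CaFile and ServerName are omitted for brevity.

```csharp
using System.Collections.Generic;
using System.Globalization;

// Hypothetical shape; the real MxGatewayEndpointConfig may differ.
public sealed record MxGatewayEndpointConfigSketch
{
    public string Endpoint { get; init; } = "http://localhost:5000";
    public string ApiKey { get; init; } = "";
    public string ClientName { get; init; } = "";
    public int WriteUserId { get; init; } = 0;
    public bool UseTls { get; init; } = false;
    public int ReadTimeoutMs { get; init; } = 5000;

    // Builds a config from flat key/value pairs, keeping defaults for
    // anything missing, blank, or unparseable.
    public static MxGatewayEndpointConfigSketch FromFlat(
        IReadOnlyDictionary<string, string> values, string connectionName)
    {
        var defaults = new MxGatewayEndpointConfigSketch();
        return new MxGatewayEndpointConfigSketch
        {
            Endpoint = Text(values, "Endpoint", defaults.Endpoint),
            ApiKey = Text(values, "ApiKey", defaults.ApiKey),
            ClientName = Text(values, "ClientName", "scadabridge-" + connectionName),
            WriteUserId = Number(values, "WriteUserId", defaults.WriteUserId),
            UseTls = Flag(values, "UseTls", defaults.UseTls),
            ReadTimeoutMs = Number(values, "ReadTimeoutMs", defaults.ReadTimeoutMs),
        };
    }

    private static string Text(IReadOnlyDictionary<string, string> values, string key, string fallback) =>
        values.TryGetValue(key, out var raw) && !string.IsNullOrWhiteSpace(raw) ? raw : fallback;

    private static int Number(IReadOnlyDictionary<string, string> values, string key, int fallback) =>
        values.TryGetValue(key, out var raw)
        && int.TryParse(raw, NumberStyles.Integer, CultureInfo.InvariantCulture, out var parsed)
            ? parsed : fallback;

    private static bool Flag(IReadOnlyDictionary<string, string> values, string key, bool fallback) =>
        values.TryGetValue(key, out var raw) && bool.TryParse(raw, out var parsed) ? parsed : fallback;
}
```

With this sketch, a dictionary containing only ReadTimeoutMs = "abc" for a connection named line1 would yield ReadTimeoutMs 5000 and ClientName scadabridge-line1.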

Secrets. ApiKey follows whatever OPC UA UserIdentityConfig username/password does today (same at-rest treatment, same log/telemetry redaction). Match that pattern exactly; if OPC UA stores credentials in plaintext, ApiKey inherits the same known limitation (not a new regression) — flag during implementation.

Shared settings (ReconnectInterval, TagResolutionRetryInterval, WriteTimeout) stay in DataConnectionOptions, unchanged, applying to all protocols.

Section 3 — Protocol-agnostic browse (tag picker)

IBrowsableDataConnection is already protocol-neutral (node ids are opaque strings). Generalize the OPC-UA-named plumbing so both protocols flow through one path.

Renames (site + central + UI):

| Today | Becomes |
| --- | --- |
| BrowseOpcUaNodeCommand / BrowseOpcUaNodeResult | BrowseNodeCommand / BrowseNodeResult |
| OpcUaBrowseService / IOpcUaBrowseService | BrowseService / IBrowseService |
| OpcUaBrowserDialog.razor | NodeBrowserDialog.razor |
| BrowseFailure / BrowseFailureKind | kept (already generic) |

DataConnectionManagerActor resolves the connection, checks adapter is IBrowsableDataConnection, and calls BrowseChildrenAsync(parentNodeId) regardless of protocol (already the OPC UA logic — just drop the "OpcUa" from names). Adapters without the interface return a "browse not supported" failure (unchanged).

MxGateway side. MxGatewayDataConnection.BrowseChildrenAsync wraps GalaxyRepositoryClient.BrowseChildrenAsync (one Galaxy level per call). Mapping:

  • Galaxy object → BrowseNode(NodeId = gobjectId-or-contained-path, DisplayName = tagName, NodeClass = Object, HasChildren = child_has_children[i]).
  • Each object's attributes → BrowseNode(NodeId = FullTagReference, NodeClass = Variable, HasChildren = false) — Variable rows are the selectable tag paths stored in instance config.

GalaxyRepositoryClient is a separate gRPC client from MxGatewayClient, so the adapter holds both (same Endpoint + ApiKey): browse uses the read-only repository client, the hot path uses the gateway client. The tag-picker dialog opens identically for either protocol; only the tree shape and opaque node-id strings differ.
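
The object and attribute mapping described above could look roughly like the following. Every type here is a hypothetical stand-in: the real BrowseNode and the GalaxyRepositoryClient response types are not shown in this document, and using the attribute name as DisplayName is an assumption.

```csharp
using System.Collections.Generic;

public enum BrowseNodeClass { Object, Variable }

// Hypothetical stand-in for the protocol-neutral browse node.
public sealed record BrowseNode(string NodeId, string DisplayName, BrowseNodeClass NodeClass, bool HasChildren);

// Hypothetical stand-ins for one level of Galaxy browse results.
public sealed record GalaxyObject(string GObjectId, string TagName);
public sealed record GalaxyAttribute(string Name, string FullTagReference);

public static class GalaxyBrowseMapper
{
    // Maps child objects to Object nodes (HasChildren taken from the parallel
    // hint list) and attributes to selectable Variable nodes.
    public static List<BrowseNode> Map(
        IReadOnlyList<GalaxyObject> children,
        IReadOnlyList<bool> childHasChildren,
        IReadOnlyList<GalaxyAttribute> attributes)
    {
        var nodes = new List<BrowseNode>(children.Count + attributes.Count);
        for (var i = 0; i < children.Count; i++)
        {
            // Treat a missing hint as "no children" rather than failing.
            var hasChildren = i < childHasChildren.Count && childHasChildren[i];
            nodes.Add(new BrowseNode(children[i].GObjectId, children[i].TagName,
                BrowseNodeClass.Object, hasChildren));
        }
        foreach (var attribute in attributes)
        {
            // The NodeId of a Variable is the tag path stored in instance config.
            nodes.Add(new BrowseNode(attribute.FullTagReference, attribute.Name,
                BrowseNodeClass.Variable, false));
        }
        return nodes;
    }
}
```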

Section 4 — Packaging, DI registration & error classification

NuGet feed. Add a repo-root nuget.config declaring the Gitea feed (https://gitea.dohertylan.com/api/packages/dohertj2/nuget/index.json) alongside nuget.org. Credentials are not committed: they come from the developer's ~/.nuget configuration or, for the Docker image build, from a build-arg or secret-mounted credential (wired into docker/deploy.sh). The DCL project references ZB.MOM.WW.MxGateway.Client (…Contracts transitively); both target net10.0.

DI registration in DataConnectionFactory:

```csharp
// Register the MxGateway adapter under its protocol string.
RegisterAdapter("MxGateway", details => new MxGatewayDataConnection(
    new MxGatewayClientFactory(_loggerFactory),
    _loggerFactory.CreateLogger<MxGatewayDataConnection>()));
```

plus an MxGatewayGlobalOptions (parallel to OpcUaGlobalOptions) bound in Host. OPC UA registration untouched.

Error classification (drives bad-quality push vs. synchronous script error):

  • Connection/transport faults (MxGatewaySessionException, gRPC unavailable, stream break) → Disconnected → reconnect + bad quality. Transient.
  • Per-item read/write failures (BulkReadResult / BulkWriteResult with WasSuccessful = false: bad tag, MXAccess rejection) → returned to caller (write) or bad quality (read). Not a disconnect.
  • Auth failures (MxGatewayAuthenticationException / …AuthorizationException) → treated like a failed connect (logged, retried on failover/reconnect cadence); a rotated key is operationally a connection problem, not per-tag.

Matches OPC UA's "operations fail immediately to the caller; connection loss triggers reconnect" split.
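
The three-way split can be summarized as a lookup from failure kind to handling. This is only an illustrative sketch: FailureKind, FailureHandling, and FailureClassifier are hypothetical names, and the real adapter would derive the kind from the client's exception types and per-item results rather than from an enum.

```csharp
// Hypothetical categories mirroring the bullets above.
public enum FailureKind { Transport, Authentication, ItemRead, ItemWrite }

public enum FailureHandling
{
    DisconnectAndReconnect, // raise Disconnected, push bad quality, reconnect
    RetryAsFailedConnect,   // log, retry on the failover/reconnect cadence
    PushBadQuality,         // per-item read failure, connection stays up
    ReturnToCaller          // per-item write failure, connection stays up
}

public static class FailureClassifier
{
    public static FailureHandling Classify(FailureKind kind) => kind switch
    {
        FailureKind.Transport => FailureHandling.DisconnectAndReconnect,
        FailureKind.Authentication => FailureHandling.RetryAsFailedConnect,
        FailureKind.ItemRead => FailureHandling.PushBadQuality,
        FailureKind.ItemWrite => FailureHandling.ReturnToCaller,
        _ => throw new System.ArgumentOutOfRangeException(nameof(kind)),
    };
}
```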

Section 5 — Testing, docs & deploy

Testing (fake client seam, no live gateway, following the OPC UA adapter style):

  • MxGatewayDataConnection against a FakeMxGatewayClient: connect → register → advise lifecycle; OnDataChange → TagValue dispatch incl. quality mapping; read/write/batch success + per-item failure; WriteBatchAndWait match & timeout; Disconnected fires once on stream fault; worker_sequence resume on reconnect.
  • MxGatewayEndpointConfigSerializer / Validator round-trip + defaults + invalid-numeric fallback.
  • Browse mapping (object→Object, attribute→Variable, HasChildren hint) against a fake repository client.
  • Generalized-browse regression: existing OPC UA browse tests updated to renamed BrowseNodeCommand / BrowseService and still passing.

Docs (spec travels with code):

  • Component-DataConnectionLayer.md: add MxGateway under "Supported Protocols", an "MxGateway Settings" config table, note IBrowsableDataConnection now backs both protocols.
  • README.md protocol mentions if any.
  • This design doc.

Deploy. bash docker/deploy.sh rebuilds the image; the only deploy-config change is NuGet credential wiring for restore. Sites get the adapter automatically (compiled into Host). No new ports or services are needed: the adapter is an outbound gRPC client to the gateway.

Affected components: DCL (adapter, factory, options), Commons (config type, serializer, validator, renamed browse messages + IBrowsableDataConnection consumers), Configuration Database (none — no schema change), Central UI (renamed browse service/dialog, protocol selector + MxGatewayEndpointEditor in DataConnectionForm — net-new UI, use frontend-design skill), Host (options binding), tests, docs, nuget.config.