3yearplan/handoffs/otopcua-handoff.md
Joseph Doherty fc3e19fde1 Add OtOpcUa implementation handoff document
Self-contained extract of all OtOpcUa design material from the plan:
architecture context, LmxOpcUa starting point, two namespaces, driver
strategy, deployment, auth, rollout tiers, UNS hierarchy, canonical
model integration, digital twin touchpoints, sites, roadmap, and all
open TBDs. Includes correction-submission protocol for the implementing
agent.
2026-04-17 09:21:25 -04:00


OtOpcUa — Implementation Handoff

Extracted: 2026-04-17
Source plan: ../goal-state.md, ../current-state.md, ../roadmap.md, ../current-state/equipment-protocol-survey.md
Repo for existing codebase: lmxopcua (see ../links.md)

This is a point-in-time extract, not a living document. The authoritative plan content lives in the source files above. If anything here conflicts with the source files, the source files win.

Corrections from the implementation agent are expected and welcome. If the implementation work surfaces inaccuracies, missing constraints, or architectural decisions that need revisiting, send corrections back for integration into the plan. Format: describe what's wrong, what you found, and what the plan should say instead. Corrections will be reviewed and folded into the authoritative plan files — they do not get applied to this handoff document (which is a snapshot, not the source of truth).


What OtOpcUa Is

OtOpcUa is a per-site clustered OPC UA server that is the single sanctioned OPC UA access point for all OT data at each site. It owns the one connection to each piece of equipment and exposes a unified OPC UA surface to every downstream consumer (Aveva System Platform, Ignition, ScadaBridge, future consumers).

It is not a new component from scratch — it is the evolution of the existing LmxOpcUa codebase. LmxOpcUa is absorbed into OtOpcUa, not replaced by a separate component.

Where it sits in the architecture

Layer 1  Equipment (PLCs, controllers, instruments)
            ↕
Layer 2  OtOpcUa  ← THIS COMPONENT
            ↕
Layer 3  SCADA (Aveva System Platform + Ignition)
            ↕
Layer 4  ScadaBridge (sole IT↔OT crossing point)
         ─── IT/OT Boundary ───
         Enterprise IT

OtOpcUa lives entirely on the OT side. It does not change where the IT↔OT crossing sits (that's ScadaBridge central). It is OT-data-facing, site-local, and fronts OT consumers.


What Exists Today (LmxOpcUa — the starting point)

Repo: lmxopcua

  • What: in-house OPC UA server with tight integration to Aveva System Platform.
  • Role: exposes System Platform data/objects via OPC UA, enabling OPC UA clients (including ScadaBridge and third parties) to consume System Platform data natively.
  • Deployment: built and deployed to every Aveva System Platform node — primary cluster in South Bend and every site-level application server cluster. Each System Platform node runs its own local instance.
  • Namespace source: each instance interfaces with its local Application Server's LMX API. The OPC UA address space reflects the System Platform objects reachable through that node's LMX API — namespace is per-node and scoped to whatever the local App Server surfaces.
  • Security model: standard OPC UA security — None / Sign / SignAndEncrypt modes, Basic256Sha256 and related profiles, UserName token authentication for clients. No bespoke auth scheme.
  • Technology: .NET (in-house pattern shared with ScadaBridge).

Current equipment access problem that OtOpcUa solves

Today, multiple systems connect to equipment directly and concurrently:

  • Aveva System Platform (for validated data collection via IO drivers)
  • Ignition SCADA (for KPI data, central from South Bend over WAN)
  • ScadaBridge (for bridge/integration workloads via Akka.NET OPC UA client)

Consequences:

  • Multiple OPC UA sessions per equipment — strains devices with limited concurrent-session support
  • No single access-control point — authorization is per-consumer, no site-level chokepoint
  • Inconsistent data — same tag read by three consumers can produce three subtly different values (different sampling intervals, deadbands, session buffers)

OtOpcUa eliminates all three problems by collapsing to one session per equipment.
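The one-session-per-equipment collapse is essentially a fan-out: OtOpcUa holds the single upstream subscription and replays each update to every downstream subscriber. A minimal sketch of that shape — illustrative Python only (the real codebase is .NET, and every name here is hypothetical):

```python
from collections import defaultdict

class TagFanout:
    """One upstream value stream per tag, any number of downstream
    subscribers — every consumer sees the identical value."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # tag -> callbacks
        self._last_value = {}                  # tag -> latest value

    def subscribe(self, tag, callback):
        """Register a downstream consumer. No new equipment session
        is opened no matter how many consumers subscribe."""
        self._subscribers[tag].append(callback)
        if tag in self._last_value:            # replay current value
            callback(self._last_value[tag])

    def on_upstream_update(self, tag, value):
        """Called once per update, from the single equipment session."""
        self._last_value[tag] = value
        for cb in self._subscribers[tag]:
            cb(value)

# Three consumers, one update, one identical value for all of them.
fan = TagFanout()
seen = {}
for name in ("scadabridge", "ignition", "system-platform"):
    fan.subscribe("cnc-mill-05/spindle-rpm",
                  lambda v, n=name: seen.__setitem__(n, v))
fan.on_upstream_update("cnc-mill-05/spindle-rpm", 1200)
```

Because all three consumers observe the same buffered value, the "three subtly different values" inconsistency above disappears by construction.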


Two Namespaces

OtOpcUa serves two logical namespaces through a single endpoint:

1. Equipment namespace (raw data) — NEW

Live values read from equipment via native OPC UA or via native device protocols (Modbus, EtherNet/IP, Siemens S7, etc.) translated to OPC UA. This is the new capability — what the "Layer 2 — raw data" role describes.

Raw equipment data at this layer is exactly that — raw — no deadbanding, no aggregation, no business meaning. Business meaning is added at Layer 3 (System Platform / Ignition).

2. System Platform namespace (processed data tap) — EXISTING (from LmxOpcUa)

The former LmxOpcUa functionality, folded in. Exposes Aveva System Platform objects (via the local App Server's LMX API) as OPC UA so that OPC UA-native consumers can read processed data through the same endpoint they use for raw equipment data.

Extensible namespace model

The two-namespace design is not a hard cap. A future simulated namespace could expose synthetic or replayed equipment data to consumers, letting tier-1/tier-2 consumers (ScadaBridge, Ignition, System Platform IO) be exercised against real-shaped-but-offline data streams without physical equipment. Architecturally supported, not committed for build in the 3-year scope. Design the namespace system so adding a third namespace is a configuration change, not a structural refactor.
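One way to honor the "configuration change, not structural refactor" requirement is to make namespaces data rather than code. A hedged sketch under that assumption — illustrative Python with hypothetical names; the actual .NET design may differ:

```python
# Hypothetical registry: each namespace is a config entry mapping a
# browse root to its value source. Adding a simulated namespace would
# be one more entry, not new plumbing. All names are illustrative.
NAMESPACES = {
    "equipment":       {"source": "drivers", "writable": False},
    "system-platform": {"source": "lmx-api", "writable": False},
    # future: "simulated": {"source": "replay", "writable": False},
}

def resolve_namespace(browse_root: str) -> dict:
    """Look up a namespace's config; unknown roots are an error."""
    try:
        return NAMESPACES[browse_root]
    except KeyError:
        raise ValueError(f"unknown namespace: {browse_root}") from None
```

The design point is simply that nothing outside the registry enumerates namespaces, so a third one touches configuration only.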


Responsibilities

  • Single connection per equipment. OtOpcUa is the only OPC UA client that talks to equipment directly. Equipment holds one session — to OtOpcUa — regardless of how many downstream consumers need its data.
  • Site-local aggregation. Downstream consumers connect to OtOpcUa rather than to equipment directly. A consumer reading the same tag gets the same value regardless of who else is subscribed.
  • Unified OPC UA endpoint for OT data. Clients read both raw equipment data and processed System Platform data from one OPC UA endpoint with two namespaces.
  • Access control / authorization chokepoint. Authentication, authorization, rate limiting, and audit of OT OPC UA reads/writes are enforced at OtOpcUa, not at each consumer.
  • Clustered for HA. Multi-node cluster — node loss does not drop equipment or System Platform visibility.

Build vs Buy

Decision: custom build, in-house. Not Kepware, Matrikon, Aveva Communication Drivers, HiveMQ Edge, or any off-the-shelf OPC UA aggregator.

Rationale:

  • Matches the existing in-house .NET pattern (ScadaBridge, SnowBridge, and LmxOpcUa itself)
  • Full control over clustering semantics, access model, and integration with ScadaBridge's operational surface
  • No per-site commercial license
  • No vendor roadmap risk for a component this central

Primary cost acknowledged: equipment driver coverage. Commercial aggregators like Kepware justify their license cost through their driver library. Picking custom build means that library has to be built in-house. See Driver Strategy below.

Reference products (Kepware, Matrikon, etc.) may still be useful for comparison on specific capabilities even though they're not the target.


Driver Strategy: Hybrid — Proactive Core Library + On-Demand Long-Tail

Core driver library (proactive, Year 1 → Year 2)

A core library covering the top equipment protocols for the estate, built proactively so that most site onboardings can draw from existing drivers rather than blocking on driver work.

Core library scope is driven by the equipment protocol survey — see below and ../current-state/equipment-protocol-survey.md. A protocol becomes "core" if it meets any of:

  1. Present at 3+ sites
  2. Combined instance count above ~25
  3. Needed to onboard a Year 1 or Year 2 site
  4. Strategic vendor whose equipment is expected to grow (judgment call)
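The four criteria above are mechanical enough to encode directly. A sketch (Python for illustration; the "~25" threshold is taken literally as 25 here):

```python
def is_core_protocol(site_count: int,
                     instance_count: int,
                     needed_year1_or_2: bool,
                     strategic: bool = False) -> bool:
    """A protocol is 'core' if it meets ANY of the four criteria."""
    return bool(
        site_count >= 3                # 1. present at 3+ sites
        or instance_count > 25         # 2. combined instances above ~25
        or needed_year1_or_2           # 3. needed for a Year 1/2 site
        or strategic                   # 4. strategic-vendor judgment call
    )
```

For example, a protocol present at only two sites but with 40 instances still qualifies for the core library.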

Long-tail drivers (on-demand, as sites onboard)

Protocols beyond the core library are built on-demand when the first site that needs the protocol reaches onboarding.

Implementation approach (not committed, one possible tactic)

Embedded open-source protocol stacks wrapped in OtOpcUa's driver framework:

  • NModbus for Modbus TCP/RTU
  • Sharp7 for Siemens S7
  • libplctag for EtherNet/IP (Allen-Bradley)
  • Other libraries as needed

This reduces driver work to "write the OPC UA ↔ protocol adapter" rather than "implement the protocol from scratch." The build team may pick this or a different approach per driver.
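The "write the adapter, not the protocol" framing implies a small driver contract that wrapped stacks plug into. One possible shape — an illustrative Python stub only (the real framework would be .NET wrapping NModbus/Sharp7/libplctag; class and method names here are assumptions):

```python
from abc import ABC, abstractmethod

class ProtocolDriver(ABC):
    """Hypothetical driver-framework contract: one driver per native
    protocol, each exposing the same read surface to the OPC UA layer."""

    @abstractmethod
    def connect(self) -> None:
        """Open the single session to the device."""

    @abstractmethod
    def read(self, address: str):
        """Read one native address and return a typed value."""

class ModbusDriver(ProtocolDriver):
    """Stub showing the adapter shape only — a real driver would call
    into a Modbus stack instead of returning a placeholder."""
    def __init__(self, host: str, unit: int = 1):
        self.host, self.unit, self.connected = host, unit, False

    def connect(self) -> None:
        self.connected = True   # real stack connect call goes here

    def read(self, address: str):
        if not self.connected:
            raise RuntimeError("not connected")
        return 0                # placeholder; real read hits the device
```

Under this shape, a long-tail driver is one new subclass plus its wrapped stack, and the OPC UA layer never sees protocol specifics.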

Equipment where no driver is needed

Equipment that already speaks native OPC UA requires no driver build — OtOpcUa simply proxies the OPC UA session. The driver-build effort is scoped only to equipment exposing non-OPC-UA protocols.


Equipment Protocol Survey (Year 1 prerequisite — not yet run)

The protocol survey determines the core driver library scope. It has not been run yet.

Template, schema, classification rule, rollup views, and a 6-step discovery approach are documented in ../current-state/equipment-protocol-survey.md.

Pre-seeded expected categories (placeholders, not confirmed):

| ID | Equipment class | Native protocol | Core candidate? |
|---|---|---|---|
| EQP-001 | OPC UA-native equipment | OPC UA | No driver needed |
| EQP-002 | Siemens S7 PLCs (S7-300/400/1200/1500) | Siemens S7 / OPC UA on newer models | Unknown — depends on S7-1500 vs older ratio |
| EQP-003 | Allen-Bradley / Rockwell PLCs | EtherNet/IP (CIP) | Likely core |
| EQP-004 | Generic Modbus devices | Modbus TCP / RTU | Likely core |
| EQP-005 | Fanuc CNC controllers | FOCAS (proprietary library) | Depends on CNC count |
| EQP-006 | Long-tail (everything else) | Various | On-demand |

Dual mandate: the same discovery walk also produces the initial UNS naming hierarchy snapshot at equipment-instance granularity (see UNS section below). Two outputs, one walk.


Deployment Footprint

Co-located on existing Aveva System Platform nodes. Same pattern as ScadaBridge — no dedicated hardware.

  • Cluster size: 2-node clusters at most sites. Largest sites (Warsaw West, Warsaw North) run one cluster per production building, matching ScadaBridge's and System Platform's existing per-building cluster pattern.
  • Rationale: zero new hardware footprint; OtOpcUa largely replaces the LmxOpcUa workload already running on these nodes, so the incremental resource draw is limited to the new equipment-driver and clustering work.
  • Trade-off accepted: System Platform, ScadaBridge, and OtOpcUa all share nodes. Resource contention mitigated by (1) modest driver workload relative to ScadaBridge's proven 225k/sec OPC UA ingestion ceiling, (2) monitoring via observability signals, (3) option to move off-node if contention is observed.

TBD — measured impact of adding this workload; headroom numbers at largest sites; whether any site needs dedicated hardware.


Authorization Model

OPC UA-native — user tokens for authentication + namespace-level ACLs for authorization.

  • Every downstream consumer authenticates with standard OPC UA user tokens (UserName tokens and/or X.509 client certs, per site/consumer policy)
  • Authorization enforced via namespace-level ACLs — each identity scoped to permitted equipment/namespaces
  • Inherits the LmxOpcUa auth pattern — consumer-side experience does not change for clients that used LmxOpcUa previously

Explicitly not federated with the enterprise IdP. OT data access is a pure OT concern. The plan's IT/OT boundary stays at ScadaBridge central, not at OtOpcUa. Two identity stores to operate (enterprise IdP for IT-facing components, OPC UA-native identities for OtOpcUa) is the accepted trade-off.
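As a concrete shape for namespace-level ACLs — illustrative only; the identity names, path layout, and the "namespace root is the first browse segment" convention below are all assumptions, and the production component is .NET:

```python
# Hypothetical ACL table: identity -> namespace roots it may read.
ACLS = {
    "scadabridge-svc": {"equipment", "system-platform"},
    "ignition-svc":    {"equipment"},
}

def authorize_read(identity: str, browse_path: str) -> bool:
    """Permit a read only when the identity is granted the path's
    namespace root; unknown identities get an empty grant set."""
    root = browse_path.split("/", 1)[0]
    return root in ACLS.get(identity, set())
```

The real design will also need write rules, subtree-level scoping, and the audit trail noted in the TBDs below; this shows only the chokepoint idea.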

TBD — specific security mode + profile combinations required; credential source (local directory, per-site vault, AD/LDAP); rotation cadence; audit trail of authz decisions.


Rollout Posture

Deploy everywhere fast

The cluster software (server + core driver library) is built and rolled out to every site's System Platform nodes as fast as practical — deployment to all sites is treated as a prerequisite, not a gradual effort.

"Deployment" = installing and configuring so the node is ready to front equipment. It does not mean immediately migrating consumers. A deployed but inactive cluster is cheap.

Tiered consumer cutover (sequenced by risk)

Existing direct equipment connections are moved to OtOpcUa one consumer at a time, in risk order:

| Tier | Consumer | Why this order | Timeline |
|---|---|---|---|
| 1 | ScadaBridge | We own both ends; lowest-risk cutover; validates the cluster under real load | Year 1 (begin at large sites) → Year 2 (complete all sites) |
| 2 | Ignition | Collapses WAN OPC UA sessions from one per equipment to one per site; medium risk | Year 2 (begin) → Year 3 (complete) |
| 3 | Aveva System Platform IO | Hardest cutover — System Platform IO feeds validated data collection; needs compliance validation | Year 3 |

Steady state at end of Year 3: every equipment session is held by OtOpcUa; every downstream consumer reads OT data through it.


UNS Naming Hierarchy (must implement in equipment namespace)

OtOpcUa's equipment namespace browse paths must implement the plan's 5-level UNS naming hierarchy:

Five levels, always present

| Level | Name | Example |
|---|---|---|
| 1 | Enterprise | ent (placeholder — real shortname TBD) |
| 2 | Site | warsaw-west, shannon, south-bend |
| 3 | Area | bldg-3, _default (placeholder at single-cluster sites) |
| 4 | Line | line-2, assembly-a |
| 5 | Equipment | cnc-mill-05, injection-molder-02 |

OPC UA browse path form: ent/warsaw-west/bldg-3/line-2/cnc-mill-05
Text form (for messages, dbt keys): ent.warsaw-west.bldg-3.line-2.cnc-mill-05

Signals / tags are children of equipment nodes (level 6), not a separate path level.

Naming rules

  • [a-z0-9-] only. Lowercase enforced.
  • Hyphens within a segment (warsaw-west), slashes between segments in OPC UA browse paths.
  • Max 32 chars per segment, max 200 chars total path.
  • _default is the only reserved segment name (placeholder for levels that don't apply); it is also the one sanctioned exception to the [a-z0-9-] character set.
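The naming rules above are easy to enforce mechanically at config load. A sketch (illustrative Python; the five-level check reflects that signals hang below level 5, and _default is special-cased as the reserved name):

```python
import re

# One segment: lowercase letters, digits, hyphens, 1-32 chars.
SEGMENT_RE = re.compile(r"^[a-z0-9-]{1,32}$")
RESERVED = "_default"   # sole reserved segment name

def validate_path(path: str) -> bool:
    """Validate a 5-level UNS browse path (signals live below
    level 5, so they are not part of this path)."""
    if len(path) > 200:
        return False
    segments = path.split("/")
    if len(segments) != 5:
        return False
    return all(SEGMENT_RE.match(s) or s == RESERVED for s in segments)

def to_text_form(browse_path: str) -> str:
    """Browse path -> dotted text form (for messages, dbt keys)."""
    return browse_path.replace("/", ".")
```

Running the validator at deploy/config time is one way to make drift from the schemas-repo definition fail fast rather than surface as a browse-tree defect.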

Stable equipment UUID

Every equipment node must expose a stable UUIDv4 as a property:

  • UUID is assigned once, never changes, never reused.
  • Path can change (equipment moves, area renamed); UUID cannot.
  • Canonical events downstream carry both UUID (for joins/lineage) and path (for dashboards/filtering).
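The identity rule — mutable path, immutable UUID — can be captured in a few lines. Illustrative Python (field and method names are assumptions, not the real node model):

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class EquipmentNode:
    """Mutable browse path, immutable identity."""
    path: str
    uid: str = field(default_factory=lambda: str(uuid.uuid4()))

    def move(self, new_path: str) -> None:
        """Equipment moved or area renamed: the path changes,
        the UUID does not, so downstream joins/lineage survive."""
        self.path = new_path

node = EquipmentNode("ent/warsaw-west/bldg-3/line-2/cnc-mill-05")
original_uid = node.uid
node.move("ent/warsaw-west/bldg-7/line-1/cnc-mill-05")
```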

Authority

The hierarchy definition lives in the central schemas repo (not yet created). OtOpcUa is a consumer of the authoritative definition — it builds its per-site browse tree from the relevant subtree at deploy/config time. Drift between OtOpcUa's browse paths and the schemas repo is a defect.


Canonical Model Integration

OtOpcUa's equipment namespace is one of three surfaces that expose the plan's canonical equipment / production / event model:

| Surface | Role |
|---|---|
| OtOpcUa equipment namespace | Canonical per-equipment OPC UA node structure. Equipment-class templates from the schemas repo define the node layout. |
| Redpanda topics + Protobuf schemas | Canonical event shape on the wire. Source of truth for the model lives in the schemas repo. |
| dbt curated layer in Snowflake | Canonical analytics model — same vocabulary, different access pattern. |

Canonical machine state vocabulary

The plan commits to a canonical set of machine state values. OtOpcUa does not derive these states (that's a Layer 3 responsibility — System Platform / Ignition), but OtOpcUa's equipment namespace should expose the raw signals that feed the derivation, and the System Platform namespace will expose the derived state values using this vocabulary:

| State | Semantics |
|---|---|
| Running | Actively producing at or near theoretical cycle time |
| Idle | Powered and available but not producing |
| Faulted | Fault raised, requires intervention |
| Starved | Ready but blocked by missing upstream input |
| Blocked | Ready but blocked by downstream constraint |

Under consideration (TBD): Changeover, Maintenance, Setup / WarmingUp.

State derivation lives at Layer 3 and is published as equipment.state.transitioned events on Redpanda. OtOpcUa's role is to deliver the raw signals cleanly so derivation can be accurate.
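For reference, the committed vocabulary expressed as a shared enum — illustrative Python only; the authoritative definition would live in the schemas repo, and the TBD states (Changeover, Maintenance, Setup / WarmingUp) are deliberately omitted:

```python
from enum import Enum

class MachineState(Enum):
    """The five committed canonical machine states."""
    RUNNING = "Running"
    IDLE = "Idle"
    FAULTED = "Faulted"
    STARVED = "Starved"
    BLOCKED = "Blocked"
```

Using a closed enum (rather than free-text state strings) means an unrecognized value from any Layer 3 deriver fails loudly instead of silently forking the vocabulary.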


Digital Twin Touchpoints

Use case 1 — Standardized equipment state model

OtOpcUa delivers the raw signals that feed the canonical state derivation at Layer 3. Equipment-class templates in the schemas repo define which raw signals each equipment class exposes, standardized across the estate.

Use case 2 — Virtual testing / simulation

OtOpcUa's namespace architecture can accommodate a future simulated namespace — replaying historical equipment data to exercise tier-1/tier-2 consumers without physical equipment. Not committed for build, but the namespace system should be designed so adding it is a configuration change.

Use case 3 — Cross-system canonical model

OtOpcUa's equipment namespace IS the OT-side surface of the canonical model. Every consumer reading equipment data through OtOpcUa sees the same node structure, same naming, same data types, same units — regardless of the underlying equipment's native protocol.


Downstream Consumer Impact

When OtOpcUa is deployed and consumers are cut over:

  • ScadaBridge reads equipment data from OtOpcUa's equipment namespace and System Platform data from OtOpcUa's System Platform namespace — all from the same OPC UA endpoint. Data locality preserved.
  • Ignition consumes from each site's OtOpcUa instead of holding direct WAN OPC UA sessions. WAN sessions collapse from one per equipment to one per site.
  • Aveva System Platform IO consumes equipment data from OtOpcUa's equipment namespace rather than direct equipment sessions. This is a meaningful shift in System Platform's IO layer and needs validation against Aveva's supported patterns — System Platform is the most opinionated consumer.
  • LmxOpcUa consumers continue working — the System Platform namespace carries forward unchanged; the previous auth pattern (credentials, security modes) carries forward.

Sites

Primary data center

  • South Bend — primary cluster

Largest sites (one cluster per production building)

  • Warsaw West
  • Warsaw North

Other integrated sites (single cluster per site)

  • Shannon
  • Galway
  • TMT
  • Ponce

Not yet integrated (Year 2+ onboarding)

  • Berlin
  • Winterthur
  • Jacksonville
  • Others — list is expected to change

Roadmap Summary

| Year | What happens |
|---|---|
| Year 1 — Foundation | Evolve LmxOpcUa into OtOpcUa (equipment namespace + clustering). Run protocol survey (Q1). Build core driver library (Q2+). Deploy to every site. Begin tier-1 cutover (ScadaBridge) at large sites. |
| Year 2 — Scale | Complete tier 1 (ScadaBridge) at all sites. Begin tier 2 (Ignition). Build long-tail drivers on demand. |
| Year 3 — Completion | Complete tier 2 (Ignition). Execute tier 3 (System Platform IO) with compliance validation. Reach steady state. |

Open Questions / TBDs

Collected from across the plan files — these are items the implementation work will need to resolve:

  • Equipment-protocol inventory (drives core library scope) — survey not yet run
  • First-cutover site selection for tier-1 (ScadaBridge)
  • Per-site tier-2 rollout sequence (Ignition)
  • Per-equipment-class criteria for System Platform IO re-validation (tier 3)
  • Measured resource impact of co-location with System Platform and ScadaBridge
  • Headroom numbers at largest sites (Warsaw campuses)
  • Whether any site needs dedicated hardware
  • Specific OPC UA security mode + profile combinations required vs allowed
  • Where UserName credentials/certs are sourced from (local directory, per-site vault, AD/LDAP)
  • Credential rotation cadence
  • Audit trail of authz decisions
  • Whether namespace ACL definitions live alongside driver/topology config or in their own governance surface
  • Exact OPC UA namespace shape for the equipment namespace (how equipment-class templates map to browse tree structure)
  • How ScadaBridge templates address equipment across multiple per-node OtOpcUa instances
  • Enterprise shortname for UNS hierarchy root (currently ent placeholder)
  • Storage format for the hierarchy in the schemas repo (YAML vs Protobuf vs both)
  • Reconciliation rule if System Platform and Ignition derivations of the same equipment's state diverge
  • Pilot equipment class for the first canonical definition

Sending Corrections Back

If implementation work surfaces any of the following, send corrections back for integration into the 3-year plan:

  • Inaccuracies — something stated here or in the plan doesn't match what the codebase or equipment actually does.
  • Missing constraints — a real-world constraint (Aveva limitation, OPC UA spec requirement, equipment behavior) that the plan doesn't account for.
  • Architectural decisions that need revisiting — a plan decision that turns out to be impractical, with evidence for why and a proposed alternative.
  • Resolved TBDs — answers to any of the open questions above, discovered during implementation.
  • New TBDs — questions the plan didn't think to ask but should have.

Format for corrections:

  1. What the plan currently says (quote or cite file + section)
  2. What you found (evidence — code, equipment behavior, Aveva docs, etc.)
  3. What the plan should say instead (proposed change)

Corrections will be reviewed and folded into the authoritative plan files (goal-state.md, current-state.md, roadmap.md, etc.). This handoff document is a snapshot and will not be updated — the plan files are the living source of truth.