lmxopcua/docs/implementation-plan.md
Joseph Doherty a7576ffb38 Implement LmxOpcUa server — all 6 phases complete
Full OPC UA server on .NET Framework 4.8 (x86) exposing AVEVA System
Platform Galaxy tags via MXAccess. Mirrors Galaxy object hierarchy as
OPC UA address space, translating contained-name browse paths to
tag-name runtime references.

Components implemented:
- Configuration: AppConfiguration with 4 sections, validator
- Domain: ConnectionState, Quality, Vtq, MxDataTypeMapper, error codes
- MxAccess: StaComThread, MxAccessClient (partial classes), MxProxyAdapter
  using strongly-typed ArchestrA.MxAccess COM interop
- Galaxy Repository: SQL queries (hierarchy, attributes, change detection),
  ChangeDetectionService with auto-rebuild on deploy
- OPC UA Server: LmxNodeManager (CustomNodeManager2), LmxOpcUaServer,
  OpcUaServerHost with programmatic config, SecurityPolicy None
- Status Dashboard: HTTP server with HTML/JSON/health endpoints
- Integration: Full 14-step startup, graceful shutdown, component wiring

175 tests (174 unit + 1 integration), all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 05:55:27 -04:00


Implementation Plan: LmxOpcUa Server — All 44 Requirements

Context

The LmxOpcUa project is scaffolded (solution, projects, configs, requirements docs) but has no implementation beyond Program.cs and a stub OpcUaService.cs. This plan implements all 44 requirements across 6 phases, each with verification gates and wiring checks to ensure nothing is left unconnected.

Architecture

Five major components wired together in OpcUaService.cs:

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│ Galaxy Repository│────>│   OPC UA Server   │<───>│  OPC UA Clients │
│  (SQL queries)   │     │  (address space)  │     │                 │
└─────────────────┘     └────────┬──────────┘     └─────────────────┘
                                 │
                        ┌────────┴──────────┐
                        │  MxAccessClient   │
                        │  (STA + COM)      │
                        └───────────────────┘
                                 │
                        ┌────────┴──────────┐
                        │ Status Dashboard  │
                        │ (HTTP + metrics)  │
                        └───────────────────┘

Reference implementation: C:\Users\dohertj2\Desktop\scadalink-design\lmxproxy\src\ZB.MOM.WW.LmxProxy.Host\


PHASE 1: Foundation — Domain Models, Configuration, Interfaces

Reqs: SVC-003, SVC-006 (partial), MXA-008 (interfaces), MXA-009, OPC-005, OPC-012 (partial), GR-005 (config)

Files to Create

Configuration/

  • AppConfiguration.cs — top-level holder for all config sections
  • OpcUaConfiguration.cs — Port, EndpointPath, ServerName, GalaxyName, MaxSessions, SessionTimeoutMinutes
  • MxAccessConfiguration.cs — ClientName, timeouts, concurrency, probe settings
  • GalaxyRepositoryConfiguration.cs — ConnectionString, intervals, command timeout
  • DashboardConfiguration.cs — Enabled, Port, RefreshIntervalSeconds
  • ConfigurationValidator.cs — validate and log effective config at startup

Domain/

  • ConnectionState.cs — enum: Disconnected, Connecting, Connected, Disconnecting, Error, Reconnecting
  • ConnectionStateChangedEventArgs.cs — PreviousState, CurrentState, Message
  • Vtq.cs — Value/Timestamp/Quality struct with factory methods
  • Quality.cs — enum with Bad/Uncertain/Good families matching OPC DA codes
  • QualityMapper.cs — MapFromMxAccessQuality(int) and MapToOpcUaStatusCode(Quality)
  • MxDataTypeMapper.cs — MapToOpcUaDataType(int mxDataType), MapToClrType(int). Unknown defaults to String
  • MxErrorCodes.cs — translate 1008/1012/1013 to human messages
  • GalaxyObjectInfo.cs — DTO matching hierarchy.sql columns
  • GalaxyAttributeInfo.cs — DTO matching attributes.sql columns
  • IMxAccessClient.cs — interface: Connect, Disconnect, Subscribe, Read, Write, OnTagValueChanged delegate
  • IGalaxyRepository.cs — interface: GetHierarchy, GetAttributes, GetLastDeployTime, TestConnection, OnGalaxyChanged event
  • IMxProxy.cs — abstraction over LMXProxyServer COM object (enables testing without DLL)

Metrics/

  • PerformanceMetrics.cs — ITimingScope, OperationMetrics (1000-entry rolling buffer), BeginOperation/GetStatistics. Adapt from reference.
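
The rolling buffer behind OperationMetrics could look roughly like the sketch below. It is not taken from the reference implementation: the member names (Record, WindowCount, Percentile95) and the nearest-rank percentile definition are assumptions made for illustration, and only the 1000-entry capacity comes from the plan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: a fixed-capacity rolling window of operation durations.
public sealed class OperationMetrics
{
    private readonly int _capacity;
    private readonly Queue<double> _durationsMs = new Queue<double>();
    private readonly object _lock = new object();
    private long _total;
    private long _succeeded;

    public OperationMetrics(int capacity = 1000)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    public void Record(double durationMs, bool success)
    {
        lock (_lock)
        {
            _total++;
            if (success) _succeeded++;
            _durationsMs.Enqueue(durationMs);
            // Evict the oldest sample once the window is full.
            if (_durationsMs.Count > _capacity) _durationsMs.Dequeue();
        }
    }

    public long TotalCount { get { lock (_lock) return _total; } }

    public long SuccessCount { get { lock (_lock) return _succeeded; } }

    public int WindowCount { get { lock (_lock) return _durationsMs.Count; } }

    // Nearest-rank P95 over the current window (assumed definition); 0 when empty.
    public double Percentile95()
    {
        lock (_lock)
        {
            int n = _durationsMs.Count;
            if (n == 0) return 0;
            double[] sorted = _durationsMs.OrderBy(d => d).ToArray();
            int rank = (95 * n + 99) / 100; // integer ceil(0.95 * n), 1-based
            return sorted[rank - 1];
        }
    }
}
```

Counters cover every recorded operation, while the percentile only sees the most recent window, so a long-running service reports recent latency rather than a lifetime average.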

Tests

  • ConfigurationLoadingTests.cs — bind appsettings.json, verify defaults
  • MxDataTypeMapperTests.cs — all 12 type mappings + unknown default
  • QualityMapperTests.cs — boundary values (0, 63, 64, 191, 192)
  • MxErrorCodesTests.cs — known codes + unknown
  • PerformanceMetricsTests.cs — recording, P95, buffer eviction, empty state
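
The boundary values listed for QualityMapperTests (0, 63, 64, 191, 192) suggest a three-way split on the raw quality integer. The sketch below shows one way that classification might be written. The type and method names are hypothetical, and treating the whole 64 to 191 range as Uncertain is an assumption inferred from those boundary values, not something the plan states.

```csharp
using System;

// Hypothetical names; the plan's own types are Quality and QualityMapper.
public enum QualityFamily { Bad, Uncertain, Good }

public static class QualityFamilyClassifier
{
    // Assumed thresholds: below 64 is Bad, 64..191 is Uncertain, 192 and above is Good.
    public static QualityFamily Classify(int rawQuality)
    {
        if (rawQuality >= 192) return QualityFamily.Good;
        if (rawQuality >= 64) return QualityFamily.Uncertain;
        return QualityFamily.Bad;
    }
}
```

Each listed boundary value sits on one side of a threshold, which is why those five values are enough to pin the comparisons down.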

Verification Gate 1

  • dotnet build — zero errors
  • All Phase 1 tests pass
  • Config binding loads all 4 sections from appsettings.json
  • MxDataTypeMapper covers every row in gr/data_type_mapping.md
  • Quality enum covers all reference impl values
  • Builds WITHOUT ArchestrA.MxAccess.dll (interface-based, no COM refs in Phase 1)
  • Every new file has doc-comment referencing requirement ID(s)
  • IMxAccessClient has every method needed by OPC-007, OPC-008, OPC-009
  • IGalaxyRepository has every method needed by GR-001 through GR-004

PHASE 2: MxAccessClient — STA Thread and COM Interop

Reqs: MXA-001, MXA-002, MXA-003, MXA-004, MXA-005, MXA-006, MXA-007, MXA-008 (wiring)

Files to Create

MxAccess/

  • StaComThread.cs — adapt from reference. STA thread, Win32 message pump, RunAsync(Action)/RunAsync(Func), WM_APP dispatch
  • MxAccessClient.cs — core partial class implementing IMxAccessClient. Fields: StaComThread, IMxProxy, handle, state, semaphores, maps
  • MxAccessClient.Connection.cs — ConnectAsync (Register on STA), DisconnectAsync (cleanup per MXA-007), COM cleanup
  • MxAccessClient.Subscription.cs — SubscribeAsync (AddItem+AdviseSupervisory), UnsubscribeAsync, ReplayStoredSubscriptions
  • MxAccessClient.ReadWrite.cs — ReadAsync (subscribe-get-first-unsubscribe), WriteAsync (Write+OnWriteComplete), semaphore-limited, timeout, ITimingScope metrics
  • MxAccessClient.EventHandlers.cs — OnDataChange (resolve handle→address, create Vtq, invoke callback, update probe), OnWriteComplete (complete TCS, translate errors)
  • MxAccessClient.Monitor.cs — monitor loop (reconnect on disconnect, probe staleness→force reconnect), cancellable
  • MxProxyAdapter.cs — wraps real LMXProxyServer COM object, forwards calls to IMxProxy interface
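
The subscribe-get-first-unsubscribe read described for MxAccessClient.ReadWrite.cs can be sketched independently of COM. In the version below the subscription is modelled as a hypothetical delegate that returns an IDisposable, so OneShotReader and Unsubscriber are illustrative names and not part of the planned class layout. It shows the three moving parts: a semaphore limiting concurrency, a TaskCompletionSource completed by the first value, and a timeout.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs a callback when the subscription is disposed.
public sealed class Unsubscriber : IDisposable
{
    private readonly Action _onDispose;
    public Unsubscriber(Action onDispose) { _onDispose = onDispose; }
    public void Dispose() { _onDispose(); }
}

// Hypothetical sketch of a one-shot read built on top of a subscription.
public sealed class OneShotReader
{
    private readonly SemaphoreSlim _gate;
    private readonly Func<string, Action<object>, IDisposable> _subscribe;

    public OneShotReader(Func<string, Action<object>, IDisposable> subscribe, int maxConcurrent)
    {
        _subscribe = subscribe;
        _gate = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    public async Task<object> ReadAsync(string tag, TimeSpan timeout)
    {
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            var tcs = new TaskCompletionSource<object>(
                TaskCreationOptions.RunContinuationsAsynchronously);
            using (var timeoutCts = new CancellationTokenSource())
            using (_subscribe(tag, value => tcs.TrySetResult(value)))
            {
                Task delay = Task.Delay(timeout, timeoutCts.Token);
                Task first = await Task.WhenAny(tcs.Task, delay).ConfigureAwait(false);
                timeoutCts.Cancel(); // stop the timer once a winner is known
                if (first != tcs.Task)
                    throw new TimeoutException(
                        "No value received for '" + tag + "' within " + timeout);
                return await tcs.Task.ConfigureAwait(false);
            }
        }
        finally
        {
            _gate.Release(); // always free the slot, on success, timeout or error
        }
    }
}
```

Disposing the subscription inside the using block means the unsubscribe runs on both the success and the timeout path, which is the property the read-timeout test would want to check.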

Test Helpers (in Tests project):

  • FakeMxProxy.cs — implements IMxProxy, simulates connections/data changes for testing

Design Decision: IMxProxy Abstraction

Code against IMxProxy interface (not LMXProxyServer directly). This allows testing without ArchestrA.MxAccess.dll. MxProxyAdapter wraps the real COM object at runtime.
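
As an illustration of this decision, the sketch below pairs a deliberately reduced interface with an in-memory fake. The member list is hypothetical and much smaller than what the real abstraction needs (no advise, write, or write-complete members), and the signatures are assumptions made for the example, not a transcription of the COM type library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, reduced shape of the abstraction for illustration only.
public interface IMxProxy
{
    int Register(string clientName);
    void Unregister(int serverHandle);
    int AddItem(int serverHandle, string address);
    void RemoveItem(int serverHandle, int itemHandle);
    event Action<int, object, int> DataChanged; // itemHandle, value, quality
}

// In-memory stand-in used by tests; no COM reference is required to build it.
public sealed class FakeMxProxy : IMxProxy
{
    private readonly Dictionary<int, string> _items = new Dictionary<int, string>();
    private int _nextHandle = 1;

    public event Action<int, object, int> DataChanged;

    public bool IsRegistered { get; private set; }

    public int Register(string clientName)
    {
        IsRegistered = true;
        return 1;
    }

    public void Unregister(int serverHandle)
    {
        IsRegistered = false;
        _items.Clear();
    }

    public int AddItem(int serverHandle, string address)
    {
        int handle = _nextHandle++;
        _items[handle] = address;
        return handle;
    }

    public void RemoveItem(int serverHandle, int itemHandle)
    {
        _items.Remove(itemHandle);
    }

    // Test hook: simulate the runtime pushing a new value for a known item.
    public void SimulateDataChange(int itemHandle, object value, int quality)
    {
        if (!_items.ContainsKey(itemHandle)) return;
        Action<int, object, int> handler = DataChanged;
        if (handler != null) handler(itemHandle, value, quality);
    }
}
```

Because the client only sees the interface, the same client code can run against the fake in unit tests and against the adapter around the real COM object in production.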

Tests

  • StaComThreadTests.cs — STA apartment verified, work item execution, dispose
  • MxAccessClientConnectionTests.cs — state transitions, cleanup order
  • MxAccessClientSubscriptionTests.cs — subscribe/unsubscribe, stored subscriptions, reconnect replay, OnDataChange→callback
  • MxAccessClientReadWriteTests.cs — read returns value, read timeout, write completes on callback, write timeout, semaphore limiting
  • MxAccessClientMonitorTests.cs — reconnect on disconnect, probe staleness

Verification Gate 2

  • Solution builds without ArchestrA.MxAccess.dll
  • STA thread test proves work items execute on STA apartment
  • Connection lifecycle: Disconnected→Connecting→Connected→Disconnecting→Disconnected
  • Subscription replay: stored subscriptions replayed after simulated reconnect
  • Read/Write: timeout behavior returns error within expected window
  • Metrics: Read/Write record timing in PerformanceMetrics
  • WIRING CHECK: OnDataChange callback reaches OnTagValueChanged delegate
  • COM cleanup order: UnAdvise→RemoveItem→unwire events→Unregister→ReleaseComObject
  • Error codes 1008/1012/1013 translate correctly in OnWriteComplete path

PHASE 3: Galaxy Repository — SQL Queries and Change Detection

Reqs: GR-001, GR-002, GR-003, GR-004, GR-006, GR-007

Files to Create

GalaxyRepository/

  • GalaxyRepositoryService.cs — implements IGalaxyRepository. SQL embedded as const string (from gr/queries/). ADO.NET SqlConnection per-query. GetHierarchyAsync, GetAttributesAsync, GetLastDeployTimeAsync, TestConnectionAsync
  • ChangeDetectionService.cs — background Timer at configured interval. Polls GetLastDeployTimeAsync, compares to last known, fires OnGalaxyChanged on change. First poll always triggers. Failed poll logs Warning, retries next interval
  • GalaxyRepositoryStats.cs — POCO for dashboard: GalaxyName, DbConnected, LastDeployTime, ObjectCount, AttributeCount, LastRebuildTime
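
The polling rules for ChangeDetectionService (first poll always triggers, unchanged timestamp skips, failed poll warns and retries) can be captured in a single method that a timer calls at each interval. The sketch below assumes a hypothetical DeployTimePoller class with injected delegates for the query and the warning log; the timer itself is left out.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical sketch of the comparison logic behind change detection.
public sealed class DeployTimePoller
{
    private readonly Func<Task<DateTime?>> _getLastDeployTime;
    private readonly Action<string> _logWarning;
    private DateTime? _lastKnown;
    private bool _hasPolled;

    public event Action GalaxyChanged;

    public DeployTimePoller(Func<Task<DateTime?>> getLastDeployTime, Action<string> logWarning)
    {
        _getLastDeployTime = getLastDeployTime;
        _logWarning = logWarning;
    }

    // Returns true when GalaxyChanged was raised for this poll.
    public async Task<bool> PollOnceAsync()
    {
        DateTime? current;
        try
        {
            current = await _getLastDeployTime().ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            // State is left untouched, so the next interval simply retries.
            _logWarning("Deploy-time poll failed: " + ex.Message);
            return false;
        }

        bool changed = !_hasPolled || current != _lastKnown;
        _hasPolled = true;
        _lastKnown = current;

        if (changed)
        {
            Action handler = GalaxyChanged;
            if (handler != null) handler();
        }
        return changed;
    }
}
```

A failed poll neither updates the stored timestamp nor marks the first poll as done, so a failure cannot cause a false rebuild and cannot swallow the initial trigger.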

Tests

  • ChangeDetectionServiceTests.cs — first poll triggers, same timestamp skips, changed triggers, failed poll retries
  • GalaxyRepositoryServiceTests.cs (integration, in IntegrationTests) — TestConnection, GetHierarchy returns rows, GetAttributes returns rows

Verification Gate 3

  • All SQL is const string — no concatenation, no parameters, no INSERT/UPDATE/DELETE (GR-006 code review)
  • GetHierarchyAsync maps all columns: gobject_id, tag_name, contained_name, browse_name, parent_gobject_id, is_area
  • GetAttributesAsync maps all columns including array_dimension
  • Change detection: first poll fires, same timestamp skips, changed fires
  • Failed query does NOT crash or trigger false rebuild
  • GalaxyRepositoryStats populated for dashboard
  • Zero rows from hierarchy logs Warning

PHASE 4: OPC UA Server — Address Space and Node Manager

Reqs: OPC-001, OPC-002, OPC-003, OPC-004, OPC-005, OPC-006, OPC-007, OPC-008, OPC-009, OPC-010, OPC-011, OPC-012, OPC-013

Files to Create

OpcUa/

  • LmxOpcUaServer.cs — inherits StandardServer. Creates custom node manager. SecurityPolicy None. Registers namespace urn:{GalaxyName}:LmxOpcUa
  • LmxNodeManager.cs — inherits CustomNodeManager2. Core class:
    • BuildAddressSpace(hierarchy, attributes) — creates folder/object/variable nodes from Galaxy data. NodeId: ns=1;s={tag_name} / ns=1;s={tag_name}.{attr}. Stores full_tag_reference lookup
    • RebuildAddressSpace(hierarchy, attributes) — removes old nodes, rebuilds. Preserves sessions
    • Read/Write overrides delegate to IMxAccessClient via stored full_tag_reference
    • Subscription management: ref-counted shared MXAccess subscriptions
  • OpcUaServerHost.cs — manages ApplicationInstance lifecycle. Programmatic config (no XML). Start/Stop. Exposes ActiveSessionCount
  • OpcUaQualityMapper.cs — domain Quality → OPC UA StatusCodes
  • DataValueConverter.cs — COM variant ↔ OPC UA DataValue. Handles all types from data_type_mapping.md. DateTime UTC. Arrays

Tests

  • DataValueConverterTests.cs — all type conversions, arrays, DateTime UTC
  • LmxNodeManagerBuildTests.cs — synthetic hierarchy matching gr/layout.md, verify node types, NodeIds, data types, ValueRank, ArrayDimensions
  • LmxNodeManagerRebuildTests.cs — rebuild replaces nodes, old nodes gone, new nodes present
  • OpcUaQualityMapperTests.cs — all quality families

Verification Gate 4

  • Endpoint URL: opc.tcp://{hostname}:{port}/LmxOpcUa
  • Namespace: urn:{GalaxyName}:LmxOpcUa at index 1
  • Root ZB folder under Objects
  • Areas → FolderType + Organizes reference
  • Non-areas → BaseObjectType + HasComponent reference
  • Variable nodes: correct DataType, ValueRank, ArrayDimensions per data_type_mapping.md
  • WIRING CHECK: Read handler resolves NodeId → full_tag_reference → calls IMxAccessClient.ReadAsync
  • WIRING CHECK: Write handler resolves NodeId → full_tag_reference → calls IMxAccessClient.WriteAsync
  • Rebuild removes old nodes, creates new ones without crash
  • SecurityPolicy is None
  • MaxSessions/SessionTimeout configured from appsettings

PHASE 5: Status Dashboard — HTTP, HTML, JSON, Health

Reqs: DASH-001 through DASH-009

Files to Create

Status/

  • StatusData.cs — DTO: ConnectionInfo, HealthInfo, SubscriptionInfo, GalaxyInfo, OperationMetrics, Footer
  • HealthCheckService.cs — rules: not connected → Unhealthy; success rate below 50% with more than 100 operations → Degraded; otherwise Healthy
  • StatusReportService.cs — aggregates from all components. GenerateHtml (self-contained, inline CSS, color-coded panels, meta-refresh). GenerateJson. IsHealthy
  • StatusWebServer.cs — HttpListener. Routes: / → HTML, /api/status → JSON, /api/health → 200/503. GET only. no-cache headers. Disableable
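
The three health rules listed for HealthCheckService are simple enough to express as one pure function, which also makes them easy to unit test. The sketch below uses hypothetical names (HealthStatus, HealthRules.Evaluate) and takes the inputs as plain numbers; how the real service obtains them is not shown.

```csharp
using System;

public enum HealthStatus { Healthy, Degraded, Unhealthy }

// Hypothetical sketch of the rule evaluation, in priority order.
public static class HealthRules
{
    public static HealthStatus Evaluate(bool connected, long totalOperations, long successfulOperations)
    {
        // Rule 1: no MXAccess connection means Unhealthy, whatever the metrics say.
        if (!connected) return HealthStatus.Unhealthy;

        // Rule 2: only judge the success rate once more than 100 operations exist.
        if (totalOperations > 100)
        {
            double successRate = (double)successfulOperations / totalOperations;
            if (successRate < 0.5) return HealthStatus.Degraded;
        }

        // Rule 3: everything else is Healthy.
        return HealthStatus.Healthy;
    }
}
```

The operation-count threshold keeps a handful of early failures right after startup from flipping the dashboard to Degraded.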

Tests

  • HealthCheckServiceTests.cs — three health rules, messages
  • StatusReportServiceTests.cs — HTML contains all panels, JSON deserializes, meta-refresh tag
  • StatusWebServerTests.cs — routing (200/405/404), cache headers, start/stop

Verification Gate 5

  • HTML contains all panels: Connection, Health, Subscriptions, Galaxy Info, Operations table, Footer
  • Connection panel: green/red/yellow border per state
  • Health panel: three states with correct colors
  • Operations table: Read/Write/Subscribe/Browse with Count/SuccessRate/Avg/Min/Max/P95
  • Galaxy Info panel: galaxy name, DB status, last deploy, object/attribute counts, last rebuild
  • Footer: timestamp + assembly version
  • JSON API: all same data as HTML
  • /api/health: 200 when healthy, 503 when unhealthy
  • Meta-refresh tag with configured interval
  • Port conflict does not prevent service startup
  • Dashboard disabled via config skips HttpListener

PHASE 6: Integration Wiring and End-to-End Verification

Reqs: SVC-004, SVC-005, SVC-006, ALL wiring verification

OpcUaService.cs — Full Implementation

Start() sequence (SVC-005):

  1. Load AppConfiguration via IConfiguration
  2. ConfigurationValidator.ValidateAndLog()
  3. Register AppDomain.UnhandledException handler (SVC-006)
  4. Create PerformanceMetrics
  5. Create MxAccessClient → ConnectAsync (failure = fatal, don't start)
  6. Start MxAccessClient monitor loop
  7. Create GalaxyRepositoryService → TestConnectionAsync (failure = warning, continue)
  8. Create OpcUaServerHost + LmxNodeManager, inject IMxAccessClient
  9. Query initial hierarchy + attributes → BuildAddressSpace
  10. Start OPC UA server listener (failure = fatal)
  11. Create ChangeDetectionService → wire OnGalaxyChanged → nodeManager.RebuildAddressSpace
  12. Start change detection polling
  13. Create HealthCheckService, StatusReportService, StatusWebServer → Start (failure = warning)
  14. Log "LmxOpcUa service started successfully"

Critical wiring (GUARDRAILS):

  • _mxAccessClient.OnTagValueChanged → node manager subscription delivery
  • _changeDetectionService.OnGalaxyChanged → _nodeManager.RebuildAddressSpace
  • _mxAccessClient.ConnectionStateChanged → health check updates
  • Node manager Read/Write → _mxAccessClient.ReadAsync/WriteAsync
  • StatusReportService reads from: MxAccessClient, PerformanceMetrics, GalaxyRepositoryStats, OpcUaServerHost

Stop() sequence (SVC-004, reverse order, 30s max):

  1. Cancel CancellationTokenSource (stops all background loops)
  2. Stop change detection
  3. Stop OPC UA server
  4. Disconnect MXAccess (full COM cleanup)
  5. Stop StatusWebServer
  6. Dispose PerformanceMetrics
  7. Log "Service shutdown complete"
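
One way to enforce the 30s cap on the Stop() sequence is to run the ordered steps on a worker task and wait on it with a deadline. The sketch below is an assumption about how that might be structured, with a hypothetical ShutdownRunner helper. Continuing past a failed step, rather than aborting, is also a design choice of this sketch and not something the plan specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical sketch: run named shutdown steps in order under an overall deadline.
public static class ShutdownRunner
{
    // Returns false when the deadline elapses before all steps have finished.
    public static bool Run(
        IEnumerable<KeyValuePair<string, Action>> steps,
        TimeSpan maxDuration,
        Action<string> log)
    {
        Task work = Task.Run(() =>
        {
            foreach (KeyValuePair<string, Action> step in steps)
            {
                try
                {
                    step.Value();
                }
                catch (Exception ex)
                {
                    // A failing step is logged and the remaining steps still run.
                    log("Shutdown step '" + step.Key + "' failed: " + ex.Message);
                }
            }
        });

        bool completed = work.Wait(maxDuration);
        if (!completed) log("Shutdown did not complete within " + maxDuration);
        return completed;
    }
}
```

The worker here is a thread-pool task, so any step that touches COM objects would still need to marshal onto the STA thread itself, as DisconnectAsync is planned to do.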

Wiring Verification Tests (GUARDRAILS)

These tests prove components are connected end-to-end, not just implemented in isolation:

  • Wiring/MxAccessToNodeManagerWiringTest.cs — simulate OnDataChange on FakeMxProxy → verify data reaches node manager subscription delivery
  • Wiring/ChangeDetectionToRebuildWiringTest.cs — mock GalaxyRepository returns changed timestamp → verify RebuildAddressSpace called
  • Wiring/OpcUaReadToMxAccessWiringTest.cs — issue Read via NodeManager → verify FakeMxProxy receives correct full_tag_reference
  • Wiring/OpcUaWriteToMxAccessWiringTest.cs — issue Write via NodeManager → verify FakeMxProxy receives correct tag + value
  • Wiring/ServiceStartupSequenceTest.cs — create OpcUaService with fakes, call Start(), verify all components created and wired
  • Wiring/ShutdownCompletesTest.cs — Start then Stop, verify completes within 30s
  • EndToEnd/FullDataFlowTest.cs — THE ULTIMATE SMOKE TEST: full service with fakes, verify: (1) address space built, (2) MXAccess data change → OPC UA variable, (3) read → correct tag ref, (4) write → correct tag+value, (5) dashboard HTML has real data

Verification Gate 6 (FINAL)

  • Startup: all 14 steps execute in order
  • Shutdown: completes within 30s, all components disposed in reverse order
  • WIRING: MXAccess OnDataChange → node manager subscription delivery
  • WIRING: Galaxy change → address space rebuild
  • WIRING: OPC UA Read → MXAccess ReadAsync with correct tag reference
  • WIRING: OPC UA Write → MXAccess WriteAsync with correct tag+value
  • WIRING: Dashboard aggregates data from all components
  • WIRING: Health endpoint reflects actual connection state
  • AppDomain.UnhandledException registered
  • TopShelf recovery configured (restart, 60s delay)
  • FullDataFlowTest passes end-to-end

Master Requirement Traceability (all 44)

Req       Phase  Verified By
SVC-001   Done   Program.cs already configured
SVC-002   Done   Program.cs already configured
SVC-003   1      ConfigurationLoadingTests
SVC-004   6      ShutdownCompletesTest
SVC-005   6      ServiceStartupSequenceTest
SVC-006   6      AppDomain handler registration test
MXA-001   2      StaComThreadTests
MXA-002   2      MxAccessClientConnectionTests
MXA-003   2      MxAccessClientSubscriptionTests
MXA-004   2      MxAccessClientReadWriteTests
MXA-005   2      MxAccessClientMonitorTests
MXA-006   2      MxAccessClientMonitorTests (probe)
MXA-007   2      Cleanup order test
MXA-008   2      Metrics integration in ReadWrite
MXA-009   1+2    MxErrorCodesTests + write error path
GR-001    3      GetHierarchyAsync maps all columns
GR-002    3      GetAttributesAsync maps all columns
GR-003    3      ChangeDetectionServiceTests
GR-004    3+6    ChangeDetectionToRebuildWiringTest
GR-005    1+3    Config tests + ADO.NET usage
GR-006    3      Code review: const string SQL only
GR-007    3      TestConnectionAsync test
OPC-001   4      Endpoint URL test
OPC-002   4      BuildTests: node types + references
OPC-003   4      BuildTests: variable nodes
OPC-004   4+6    ReadWiringTest: browse→tag_name
OPC-005   1+4    MxDataTypeMapperTests + variable node DataType
OPC-006   4      BuildTests: ValueRank + ArrayDimensions
OPC-007   4+6    OpcUaReadToMxAccessWiringTest
OPC-008   4+6    OpcUaWriteToMxAccessWiringTest
OPC-009   4+6    MxAccessToNodeManagerWiringTest
OPC-010   4+6    RebuildTests + ChangeDetectionToRebuildWiringTest
OPC-011   4      ServerStatus node test
OPC-012   4      Namespace URI test
OPC-013   4      Session config test
DASH-001  5      StatusWebServerTests routing
DASH-002  5      HTML contains Connection panel
DASH-003  5      HealthCheckServiceTests
DASH-004  5      HTML contains Subscriptions panel
DASH-005  5      HTML contains Operations table
DASH-006  5      HTML contains Footer
DASH-007  5      Meta-refresh tag test
DASH-008  5      JSON API deserialization test
DASH-009  5      HTML contains Galaxy Info panel

Final Folder Structure

src/ZB.MOM.WW.LmxOpcUa.Host/
    Configuration/       (Phase 1)
    Domain/              (Phase 1)
    Metrics/             (Phase 1)
    MxAccess/            (Phase 2)
    GalaxyRepository/    (Phase 3)
    OpcUa/               (Phase 4)
    Status/              (Phase 5)
    OpcUaService.cs      (Phase 6 — full wiring)
    Program.cs           (existing)
    appsettings.json     (existing)
tests/ZB.MOM.WW.LmxOpcUa.Tests/
    Configuration/       (Phase 1)
    Domain/              (Phase 1)
    Metrics/             (Phase 1)
    MxAccess/            (Phase 2)
    GalaxyRepository/    (Phase 3)
    OpcUa/               (Phase 4)
    Status/              (Phase 5)
    Wiring/              (Phase 6 — GUARDRAILS)
    EndToEnd/            (Phase 6 — GUARDRAILS)
    Helpers/FakeMxProxy.cs (Phase 2)

Verification: How to Run

# Build
dotnet build ZB.MOM.WW.LmxOpcUa.slnx

# All tests
dotnet test ZB.MOM.WW.LmxOpcUa.slnx

# Phase-specific (by namespace convention)
dotnet test tests/ZB.MOM.WW.LmxOpcUa.Tests --filter "FullyQualifiedName~Configuration"
dotnet test tests/ZB.MOM.WW.LmxOpcUa.Tests --filter "FullyQualifiedName~MxAccess"
dotnet test tests/ZB.MOM.WW.LmxOpcUa.Tests --filter "FullyQualifiedName~GalaxyRepository"
dotnet test tests/ZB.MOM.WW.LmxOpcUa.Tests --filter "FullyQualifiedName~OpcUa"
dotnet test tests/ZB.MOM.WW.LmxOpcUa.Tests --filter "FullyQualifiedName~Status"
dotnet test tests/ZB.MOM.WW.LmxOpcUa.Tests --filter "FullyQualifiedName~Wiring"
dotnet test tests/ZB.MOM.WW.LmxOpcUa.Tests --filter "FullyQualifiedName~EndToEnd"

# Integration tests (requires ZB database)
dotnet test tests/ZB.MOM.WW.LmxOpcUa.IntegrationTests