lmxopcua/docs/ServiceHosting.md
Joseph Doherty 95ad9c6866 Resolve 6 of 7 stability review findings and close test coverage gaps
Fixes P1 StaComThread hang (crash-path faulting via WorkItem queue), P1 subscription
fire-and-forget (block+log or ContinueWith on 5 call sites), P2 continuation point
leak (PurgeExpired on Retrieve/Release), P2 dashboard bind failure (localhost prefix,
bool Start), P3 background loop double-start (task handles + join on stop in 3 files),
and P3 config logging exposure (SqlConnectionStringBuilder password masking). Adds
FakeMxAccessClient fault injection and 12 new tests. Documents required runtime
assemblies in ServiceHosting.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:37:27 -04:00

Service Hosting

Overview

The service runs as a Windows service or console application using TopShelf for lifecycle management. It targets .NET Framework 4.8 with an x86 (32-bit) platform target, which is required for MXAccess COM interop with the ArchestrA runtime DLLs.

TopShelf Configuration

Program.Main() configures TopShelf to manage the OpcUaService lifecycle:

var exitCode = HostFactory.Run(host =>
{
    host.UseSerilog();

    host.Service<OpcUaService>(svc =>
    {
        svc.ConstructUsing(() => new OpcUaService());
        svc.WhenStarted(s => s.Start());
        svc.WhenStopped(s => s.Stop());
    });

    host.SetServiceName("LmxOpcUa");
    host.SetDisplayName("LMX OPC UA Server");
    host.SetDescription("OPC UA server exposing System Platform Galaxy tags via MXAccess.");
    host.RunAsLocalSystem();
    host.StartAutomatically();
});

TopShelf provides these deployment modes from the same executable:

| Command | Description |
|---|---|
| ZB.MOM.WW.LmxOpcUa.Host.exe | Run as a console application (foreground) |
| ZB.MOM.WW.LmxOpcUa.Host.exe install | Install as a Windows service |
| ZB.MOM.WW.LmxOpcUa.Host.exe uninstall | Remove the Windows service |
| ZB.MOM.WW.LmxOpcUa.Host.exe start | Start the installed service |
| ZB.MOM.WW.LmxOpcUa.Host.exe stop | Stop the installed service |

The service is configured to run as LocalSystem and start automatically on boot.

Working Directory

Before configuring Serilog, Program.Main() sets the working directory to the executable's location:

Environment.CurrentDirectory = AppDomain.CurrentDomain.BaseDirectory;

This is necessary because Windows services default their working directory to System32, which would cause relative log paths and appsettings.json to resolve incorrectly.
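
As a rough illustration of why the assignment matters, the sketch below resolves relative paths the way file APIs do, against the current directory. It uses only the base class library; the class name WorkingDirectoryDemo and its methods are hypothetical and not part of the service.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class WorkingDirectoryDemo
{
    // Relative paths are resolved against Environment.CurrentDirectory.
    public static string Resolve(string relativePath)
    {
        return Path.GetFullPath(relativePath);
    }

    // Anchors relative paths to the folder that contains the executable,
    // then returns where the settings file would be looked up.
    public static string AnchorAndResolveSettings()
    {
        Environment.CurrentDirectory = AppDomain.CurrentDomain.BaseDirectory;
        return Resolve("appsettings.json");
    }
}
```

Calling WorkingDirectoryDemo.Resolve("appsettings.json") before and after the assignment shows the difference: under the Service Control Manager the first call would typically point into System32, the second into the install folder.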

Startup Sequence

OpcUaService.Start() executes the following steps in order. If any required step fails, the service logs the error and throws, preventing a partially initialized state.

  1. Load configuration -- The production constructor reads appsettings.json, an optional environment overlay, and environment variables, then binds each section to its typed configuration class.
  2. Validate configuration -- ConfigurationValidator.ValidateAndLog() logs all resolved values and checks required constraints (port range, non-empty names and connection strings). If validation fails, the service throws InvalidOperationException.
  3. Register exception handler -- Registers AppDomain.CurrentDomain.UnhandledException to log fatal unhandled exceptions with IsTerminating context.
  4. Create performance metrics -- Creates the PerformanceMetrics instance and a CancellationTokenSource for coordinating shutdown.
  5. Create and connect MXAccess client -- Starts the STA COM thread, creates the MxAccessClient, and attempts an initial connection. If the connection fails, the service logs a warning and continues -- the monitor loop will retry in the background.
  6. Start MXAccess monitor -- Starts the connectivity monitor loop that probes the runtime connection at the configured interval and handles auto-reconnect.
  7. Test Galaxy repository connection -- Calls TestConnectionAsync() on the Galaxy repository to verify the SQL Server database is reachable. If it fails, the service continues without initial address-space data.
  8. Create OPC UA server host -- Creates OpcUaServerHost with the effective MXAccess client (real, override, or null fallback), performance metrics, and optional historian data source.
  9. Query Galaxy hierarchy -- Fetches the object hierarchy and attribute definitions from the Galaxy repository database, recording object and attribute counts.
  10. Start server and build address space -- Starts the OPC UA server, retrieves the LmxNodeManager, and calls BuildAddressSpace() with the queried hierarchy and attributes. If the query or build fails, the server still starts with an empty address space.
  11. Start change detection -- Creates and starts ChangeDetectionService, which polls galaxy.time_of_last_deploy at the configured interval. When a change is detected, it triggers an address-space rebuild via the OnGalaxyChanged event.
  12. Start status dashboard -- Creates the HealthCheckService and StatusReportService, wires in all live components, and starts the StatusWebServer HTTP listener if the dashboard is enabled.
  13. Log startup complete -- Logs "LmxOpcUa service started successfully" at Information level.
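
The split between required steps (which abort startup) and optional steps (which log and continue) can be sketched as follows. This is a minimal illustration, not the OpcUaService implementation: the StartupSequence class, its Required and Optional methods, and the log-message prefixes are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: runs named startup steps in registration order.
public sealed class StartupSequence
{
    private sealed class Step
    {
        public string Name;
        public bool IsRequired;
        public Action Body;
    }

    private readonly List<Step> _steps = new List<Step>();
    private readonly Action<string> _log;

    public StartupSequence(Action<string> log)
    {
        _log = log;
    }

    public StartupSequence Required(string name, Action body)
    {
        _steps.Add(new Step { Name = name, IsRequired = true, Body = body });
        return this;
    }

    public StartupSequence Optional(string name, Action body)
    {
        _steps.Add(new Step { Name = name, IsRequired = false, Body = body });
        return this;
    }

    public void Run()
    {
        foreach (var step in _steps)
        {
            try
            {
                step.Body();
                _log("OK: " + step.Name);
            }
            catch (Exception ex) when (!step.IsRequired)
            {
                // Optional step: record the failure and keep starting up.
                _log("WARN: " + step.Name + " failed: " + ex.Message);
            }
            catch (Exception ex)
            {
                // Required step: log, then rethrow so the host sees the failure.
                _log("FATAL: " + step.Name + " failed: " + ex.Message);
                throw;
            }
        }
    }
}
```

In this sketch, configuration validation and the OPC UA server start would be registered with Required, while the initial MXAccess connection and the Galaxy repository test would be registered with Optional.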

Shutdown Sequence

OpcUaService.Stop() tears down components in reverse dependency order:

  1. Cancel operations -- Signals the CancellationTokenSource to stop all background loops.
  2. Stop change detection -- Stops the Galaxy deploy polling loop.
  3. Stop OPC UA server -- Shuts down the OPC UA server host, disconnecting all client sessions.
  4. Stop MXAccess monitor -- Stops the connectivity monitor loop.
  5. Disconnect MXAccess -- Disconnects the MXAccess client and releases COM resources.
  6. Dispose STA thread -- Shuts down the dedicated STA COM thread and its message pump.
  7. Stop dashboard -- Disposes the StatusWebServer HTTP listener.
  8. Dispose metrics -- Releases the performance metrics collector.
  9. Dispose change detection -- Releases the change detection service.
  10. Unregister exception handler -- Removes the AppDomain.UnhandledException handler.

The entire shutdown is wrapped in a try/catch that logs warnings for errors during cleanup, ensuring the service exits even if a component fails to dispose cleanly.
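
Steps 1, 2, and 4 rely on background loops that stop promptly once cancellation is signalled. One way to structure such a loop is sketched below, assuming a task handle that Stop() waits on and a guard against double start. The PollingLoop class and the five-second join timeout are hypothetical, and Start() is assumed to be called from a single thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical background loop, for illustration only.
public sealed class PollingLoop
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private readonly TimeSpan _interval;
    private readonly Action _work;
    private Task _task;

    public PollingLoop(TimeSpan interval, Action work)
    {
        _interval = interval;
        _work = work;
    }

    public void Start()
    {
        if (_task != null) return; // already running: ignore a second Start()
        _task = Task.Run(() => RunAsync(_cts.Token));
    }

    private async Task RunAsync(CancellationToken token)
    {
        try
        {
            while (!token.IsCancellationRequested)
            {
                _work();
                await Task.Delay(_interval, token).ConfigureAwait(false);
            }
        }
        catch (OperationCanceledException)
        {
            // Expected when Stop() cancels the delay.
        }
    }

    public void Stop()
    {
        _cts.Cancel();
        try
        {
            // Join the loop so no iteration is still running after Stop() returns.
            if (_task != null) _task.Wait(TimeSpan.FromSeconds(5));
        }
        catch (AggregateException)
        {
            // The loop body faulted; a real service would log this as a warning.
        }
    }
}
```

Waiting on the task handle is what lets later teardown steps, such as disconnecting MXAccess, assume that the loop is no longer touching the resources being released.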

Error Handling

Unhandled exceptions

AppDomain.CurrentDomain.UnhandledException is registered at startup and removed at shutdown. The handler logs the exception at Fatal level with the IsTerminating flag:

Log.Fatal(e.ExceptionObject as Exception,
    "Unhandled exception (IsTerminating={IsTerminating})", e.IsTerminating);

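Registering a named method rather than a lambda is what makes the later removal possible. The sketch below shows that pattern with the base class library only. The UnhandledExceptionLogger class and its Format helper are hypothetical, and a plain Action<string> stands in for the Serilog call.

```csharp
using System;

// Hypothetical wrapper: subscribes on construction, unsubscribes on Dispose.
public sealed class UnhandledExceptionLogger : IDisposable
{
    private readonly Action<string> _logFatal;

    public UnhandledExceptionLogger(Action<string> logFatal)
    {
        _logFatal = logFatal;
        AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;
    }

    // ExceptionObject is typed as object, so it may not be an Exception.
    public static string Format(object exceptionObject, bool isTerminating)
    {
        var ex = exceptionObject as Exception;
        string detail = ex != null ? ex.ToString() : Convert.ToString(exceptionObject);
        return "Unhandled exception (IsTerminating=" + isTerminating + "): " + detail;
    }

    private void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        _logFatal(Format(e.ExceptionObject, e.IsTerminating));
    }

    public void Dispose()
    {
        // The same method group must be used to remove the handler.
        AppDomain.CurrentDomain.UnhandledException -= OnUnhandledException;
    }
}
```
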
Startup resilience

The startup sequence is designed to degrade gracefully rather than fail entirely:

  • If the MXAccess connection fails, the service continues with a NullMxAccessClient that returns bad-quality values for all reads.
  • If the Galaxy repository database is unreachable, the OPC UA server starts with an empty address space.
  • If the status dashboard port is in use, the dashboard logs a warning and does not start, but the OPC UA server continues.
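
The MXAccess fallback is an instance of the null-object pattern. A minimal sketch follows. Only the name NullMxAccessClient comes from this document; the IMxAccessClient shape, the TagValue and Quality types, and the selection order (override first, then the real client, then the null fallback) are assumptions made for illustration.

```csharp
using System;

public enum Quality { Good, Bad }

// Hypothetical value type pairing a reading with its quality.
public struct TagValue
{
    public readonly object Value;
    public readonly Quality Quality;

    public TagValue(object value, Quality quality)
    {
        Value = value;
        Quality = quality;
    }
}

// Hypothetical interface shape; the real interface is likely larger.
public interface IMxAccessClient
{
    TagValue Read(string tagName);
}

// Null object: always answers, but every value is marked bad quality.
public sealed class NullMxAccessClient : IMxAccessClient
{
    public TagValue Read(string tagName)
    {
        return new TagValue(null, Quality.Bad);
    }
}

public static class MxAccessClientSelector
{
    // Assumed precedence: override, then the real client, then the fallback.
    public static IMxAccessClient Effective(IMxAccessClient overrideClient, IMxAccessClient realClient)
    {
        return overrideClient ?? realClient ?? new NullMxAccessClient();
    }
}
```

Because callers always receive a non-null client, the OPC UA layer needs no null checks; clients simply see bad-quality values until a real connection exists.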

Fatal startup failure

If a critical step (configuration validation, OPC UA server start) throws, Start() catches the exception, logs it at Fatal, and re-throws to let TopShelf report the failure.

Logging

The service uses Serilog with two sinks configured in Program.Main():

Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Information()
    .WriteTo.Console()
    .WriteTo.File(
        path: "logs/lmxopcua-.log",
        rollingInterval: RollingInterval.Day,
        retainedFileCountLimit: 31)
    .CreateLogger();

| Sink | Details |
|---|---|
| Console | Writes to stdout; useful when running as a console application |
| Rolling file | Writes to logs/lmxopcua-{date}.log, rolls daily, and retains the 31 most recent files (31 days of history at one file per day) |

Log files are written relative to the executable directory (see Working Directory above). Each component creates its own contextual logger using Log.ForContext<T>() or Log.ForContext(typeof(T)).

Log.CloseAndFlush() is called in the finally block of Program.Main() to ensure all buffered log entries are written before process exit.

Multi-Instance Deployment

The service supports running multiple instances for redundancy. Each instance requires:

  • A unique Windows service name (e.g., LmxOpcUa, LmxOpcUa2)
  • A unique OPC UA port and dashboard port
  • A unique OpcUa.ApplicationUri and OpcUa.ServerName
  • A unique MxAccess.ClientName
  • Matching Redundancy.ServerUris arrays on all instances

Install additional instances using TopShelf's -servicename flag:

cd C:\publish\lmxopcua\instance2
ZB.MOM.WW.LmxOpcUa.Host.exe install -servicename "LmxOpcUa2" -displayname "LMX OPC UA Server (Instance 2)"

See Redundancy Guide for full deployment details.
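
Before installing a second instance, the requirements above can be checked mechanically. The sketch below compares two sets of settings and reports conflicts. The InstanceSettings class and its field names are hypothetical stand-ins for the real configuration classes, and "matching" ServerUris is taken to mean the same entries in the same order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened view of one instance's settings.
public sealed class InstanceSettings
{
    public string ServiceName;
    public int OpcUaPort;
    public int DashboardPort;
    public string ApplicationUri;
    public string ServerName;
    public string MxAccessClientName;
    public string[] ServerUris;
}

public static class InstanceConflictChecker
{
    public static List<string> FindConflicts(InstanceSettings a, InstanceSettings b)
    {
        var problems = new List<string>();
        var ignoreCase = StringComparison.OrdinalIgnoreCase;

        if (string.Equals(a.ServiceName, b.ServiceName, ignoreCase))
            problems.Add("Windows service name must be unique");
        if (a.OpcUaPort == b.OpcUaPort)
            problems.Add("OPC UA port must be unique");
        if (a.DashboardPort == b.DashboardPort)
            problems.Add("Dashboard port must be unique");
        if (string.Equals(a.ApplicationUri, b.ApplicationUri, ignoreCase))
            problems.Add("OpcUa.ApplicationUri must be unique");
        if (string.Equals(a.ServerName, b.ServerName, ignoreCase))
            problems.Add("OpcUa.ServerName must be unique");
        if (string.Equals(a.MxAccessClientName, b.MxAccessClientName, ignoreCase))
            problems.Add("MxAccess.ClientName must be unique");

        // Assumption: arrays match when they hold the same URIs in the same order.
        var urisA = a.ServerUris ?? new string[0];
        var urisB = b.ServerUris ?? new string[0];
        if (!urisA.SequenceEqual(urisB))
            problems.Add("Redundancy.ServerUris must match on all instances");

        return problems;
    }
}
```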

Required Runtime Assemblies

The build uses Costura.Fody to embed all NuGet dependencies into the single ZB.MOM.WW.LmxOpcUa.Host.exe. However, the following ArchestrA and Historian DLLs are excluded from embedding and must be present alongside the executable at runtime:

| Assembly | Purpose |
|---|---|
| ArchestrA.MxAccess.dll | MXAccess COM interop: runtime data access to Galaxy tags |
| aahClientManaged.dll | Wonderware Historian managed SDK: historical data queries |
| aahClient.dll | Historian native dependency |
| aahClientCommon.dll | Historian native dependency |
| Historian.CBE.dll | Historian native dependency |
| Historian.DPAPI.dll | Historian native dependency |
| ArchestrA.CloudHistorian.Contract.dll | Historian contract dependency |

These DLLs are sourced from the lib/ folder in the repository and are copied to the build output directory automatically. When deploying, ensure all seven DLLs are in the same directory as the executable.

These assemblies are not redistributable — they are provided by the AVEVA System Platform and Historian installations on the target machine. The copies in lib/ are taken from Program Files (x86)\ArchestrA\Framework\bin on a machine with the platform installed.
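
A deployment script or startup check could verify that the seven files are present before the service is started. The sketch below is one way to do that with the base class library. The file names are taken from the table above; the RuntimeAssemblyCheck class itself is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical pre-flight check, for illustration only.
public static class RuntimeAssemblyCheck
{
    public static readonly string[] RequiredFiles =
    {
        "ArchestrA.MxAccess.dll",
        "aahClientManaged.dll",
        "aahClient.dll",
        "aahClientCommon.dll",
        "Historian.CBE.dll",
        "Historian.DPAPI.dll",
        "ArchestrA.CloudHistorian.Contract.dll"
    };

    // Returns the names of required files that are absent from the directory.
    public static List<string> FindMissing(string directory)
    {
        return RequiredFiles
            .Where(name => !File.Exists(Path.Combine(directory, name)))
            .ToList();
    }
}
```

Passing AppDomain.CurrentDomain.BaseDirectory checks the folder that contains the executable; an empty result means all seven files were found.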

Platform Target

The service must be compiled and run as x86 (32-bit). The MXAccess COM toolkit DLLs in Program Files (x86)\ArchestrA\Framework\bin are 32-bit only. Running the service as x64, or as AnyCPU without Prefer 32-bit (so that it loads as a 64-bit process on 64-bit Windows), causes COM interop failures when creating the LMXProxyServer object on the STA thread.
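
A wrong platform target can be caught early with a bitness check at startup, so that it does not first surface as a COM error. The sketch below uses Environment.Is64BitProcess from the base class library. The PlatformGuard class and its message text are hypothetical; the source does not say whether the service performs such a check.

```csharp
using System;

// Hypothetical fail-fast guard, for illustration only.
public static class PlatformGuard
{
    // Returns a description of the problem, or null when the process is 32-bit.
    public static string DescribeProblem(bool is64BitProcess)
    {
        return is64BitProcess
            ? "Process is 64-bit, but MXAccess COM interop requires an x86 (32-bit) process."
            : null;
    }

    // Throws before any COM object is created if the process has the wrong bitness.
    public static void EnsureX86()
    {
        string problem = DescribeProblem(Environment.Is64BitProcess);
        if (problem != null) throw new PlatformNotSupportedException(problem);
    }
}
```

Calling PlatformGuard.EnsureX86() at the top of Program.Main() would turn a mis-built binary into an immediate, readable error.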