Service Hosting
Overview
The service runs as a Windows service or console application using TopShelf for lifecycle management. It targets .NET Framework 4.8 with an x86 (32-bit) platform target, which is required for MXAccess COM interop with the ArchestrA runtime DLLs.
TopShelf Configuration
Program.Main() configures TopShelf to manage the OpcUaService lifecycle:
```csharp
var exitCode = HostFactory.Run(host =>
{
    host.UseSerilog();
    host.Service<OpcUaService>(svc =>
    {
        svc.ConstructUsing(() => new OpcUaService());
        svc.WhenStarted(s => s.Start());
        svc.WhenStopped(s => s.Stop());
    });
    host.SetServiceName("LmxOpcUa");
    host.SetDisplayName("LMX OPC UA Server");
    host.SetDescription("OPC UA server exposing System Platform Galaxy tags via MXAccess.");
    host.RunAsLocalSystem();
    host.StartAutomatically();
});
```
TopShelf provides these deployment modes from the same executable:
| Command | Description |
|---|---|
| LmxOpcUa.Host.exe | Run as a console application (foreground) |
| LmxOpcUa.Host.exe install | Install as a Windows service |
| LmxOpcUa.Host.exe uninstall | Remove the Windows service |
| LmxOpcUa.Host.exe start | Start the installed service |
| LmxOpcUa.Host.exe stop | Stop the installed service |
The service is configured to run as LocalSystem and start automatically on boot.
Working Directory
Before configuring Serilog, Program.Main() sets the working directory to the executable's location:
```csharp
Environment.CurrentDirectory = AppDomain.CurrentDomain.BaseDirectory;
```
This is necessary because Windows services default their working directory to System32, which would cause relative log paths and appsettings.json to resolve incorrectly.
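As a small illustration of the same idea, the sketch below resolves a relative path against the executable's directory explicitly instead of relying on the process working directory. The class and method names (BaseDirectoryPaths, ResolveFromBaseDirectory) are hypothetical and not part of the service.

```csharp
using System;
using System.IO;

public static class BaseDirectoryPaths
{
    // Hypothetical helper (not from the service): resolves a relative path
    // against the directory containing the executable instead of the
    // process working directory.
    public static string ResolveFromBaseDirectory(string relativePath)
    {
        string combined = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, relativePath);
        return Path.GetFullPath(combined);
    }
}
```

Once the working directory has been set to the base directory, a plain relative path such as appsettings.json should resolve to the same location as ResolveFromBaseDirectory("appsettings.json").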
Startup Sequence
OpcUaService.Start() executes the following steps in order. If any required step fails, the service logs the error and throws, preventing a partially initialized state.
- Load configuration -- The production constructor reads appsettings.json, optional environment overlay, and environment variables, then binds each section to its typed configuration class.
- Validate configuration -- ConfigurationValidator.ValidateAndLog() logs all resolved values and checks required constraints (port range, non-empty names and connection strings). If validation fails, the service throws InvalidOperationException.
- Register exception handler -- Registers AppDomain.CurrentDomain.UnhandledException to log fatal unhandled exceptions with IsTerminating context.
- Create performance metrics -- Creates the PerformanceMetrics instance and a CancellationTokenSource for coordinating shutdown.
- Create and connect MXAccess client -- Starts the STA COM thread, creates the MxAccessClient, and attempts an initial connection. If the connection fails, the service logs a warning and continues -- the monitor loop will retry in the background.
- Start MXAccess monitor -- Starts the connectivity monitor loop that probes the runtime connection at the configured interval and handles auto-reconnect.
- Test Galaxy repository connection -- Calls TestConnectionAsync() on the Galaxy repository to verify the SQL Server database is reachable. If it fails, the service continues without initial address-space data.
- Create OPC UA server host -- Creates OpcUaServerHost with the effective MXAccess client (real, override, or null fallback), performance metrics, and optional historian data source.
- Query Galaxy hierarchy -- Fetches the object hierarchy and attribute definitions from the Galaxy repository database, recording object and attribute counts.
- Start server and build address space -- Starts the OPC UA server, retrieves the LmxNodeManager, and calls BuildAddressSpace() with the queried hierarchy and attributes. If the query or build fails, the server still starts with an empty address space.
- Start change detection -- Creates and starts ChangeDetectionService, which polls galaxy.time_of_last_deploy at the configured interval. When a change is detected, it triggers an address-space rebuild via the OnGalaxyChanged event.
- Start status dashboard -- Creates the HealthCheckService and StatusReportService, wires in all live components, and starts the StatusWebServer HTTP listener if the dashboard is enabled.
- Log startup complete -- Logs "LmxOpcUa service started successfully" at Information level.
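The distinction between required and optional steps can be sketched as a small step runner. This is a minimal illustration under stated assumptions, not the service's actual code: StartupStep and StartupRunner are hypothetical names, and the real Start() method may be structured quite differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; the service's own classes may differ.
public sealed class StartupStep
{
    public StartupStep(string name, bool required, Action run)
    {
        Name = name;
        Required = required;
        Run = run;
    }

    public string Name { get; }
    public bool Required { get; }
    public Action Run { get; }
}

public static class StartupRunner
{
    // Runs the steps in order and returns the names of optional steps that failed.
    // A failing required step is logged and rethrown so startup aborts.
    public static List<string> RunAll(IEnumerable<StartupStep> steps, Action<string> log)
    {
        var degraded = new List<string>();
        foreach (StartupStep step in steps)
        {
            try
            {
                step.Run();
            }
            catch (Exception ex) when (!step.Required)
            {
                // Optional step: record the failure and keep going.
                log("WARN  " + step.Name + " failed, continuing: " + ex.Message);
                degraded.Add(step.Name);
            }
            catch (Exception ex)
            {
                // Required step: log and rethrow, preserving the stack trace.
                log("FATAL " + step.Name + " failed: " + ex.Message);
                throw;
            }
        }
        return degraded;
    }
}
```

In this sketch, configuration validation would be registered as a required step, while the Galaxy repository connection test would be optional, so a failure there only adds an entry to the returned list.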
Shutdown Sequence
OpcUaService.Stop() tears down components in reverse dependency order:
- Cancel operations -- Signals the CancellationTokenSource to stop all background loops.
- Stop change detection -- Stops the Galaxy deploy polling loop.
- Stop OPC UA server -- Shuts down the OPC UA server host, disconnecting all client sessions.
- Stop MXAccess monitor -- Stops the connectivity monitor loop.
- Disconnect MXAccess -- Disconnects the MXAccess client and releases COM resources.
- Dispose STA thread -- Shuts down the dedicated STA COM thread and its message pump.
- Stop dashboard -- Disposes the StatusWebServer HTTP listener.
- Dispose metrics -- Releases the performance metrics collector.
- Dispose change detection -- Releases the change detection service.
- Unregister exception handler -- Removes the AppDomain.UnhandledException handler.
The entire shutdown is wrapped in a try/catch that logs warnings for errors during cleanup, ensuring the service exits even if a component fails to dispose cleanly.
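One way to picture this pattern is a stack of teardown actions that is unwound inside a single try/catch. The sketch below is an assumption-laden illustration: TeardownStack is a hypothetical helper, and the service's Stop() method may simply call each component directly in the order listed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not taken from the service.
public sealed class TeardownStack
{
    private readonly Stack<Action> _teardowns = new Stack<Action>();

    // Call as each component starts; the last registered is torn down first.
    public void Register(Action teardown)
    {
        _teardowns.Push(teardown);
    }

    // Runs teardowns in reverse registration order inside a single try/catch.
    // Returns false if a teardown threw; the exception is logged, not rethrown.
    public bool StopAll(Action<string> logWarning)
    {
        try
        {
            while (_teardowns.Count > 0)
            {
                Action teardown = _teardowns.Pop();
                teardown();
            }
            return true;
        }
        catch (Exception ex)
        {
            logWarning("Error during shutdown: " + ex.Message);
            return false;
        }
    }
}
```

With a single try/catch, teardowns queued after the failing one are skipped in this sketch. Wrapping each call individually would let the remaining components still be released, at the cost of slightly more code.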
Error Handling
Unhandled exceptions
AppDomain.CurrentDomain.UnhandledException is registered at startup and removed at shutdown. The handler logs the exception at Fatal level with the IsTerminating flag:
```csharp
Log.Fatal(e.ExceptionObject as Exception,
    "Unhandled exception (IsTerminating={IsTerminating})", e.IsTerminating);
```
Startup resilience
The startup sequence is designed to degrade gracefully rather than fail entirely:
- If the MXAccess connection fails, the service continues with a NullMxAccessClient that returns bad-quality values for all reads.
- If the Galaxy repository database is unreachable, the OPC UA server starts with an empty address space.
- If the status dashboard port is in use, the dashboard logs a warning and does not start, but the OPC UA server continues.
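The null-client fallback is an instance of the null object pattern, which can be sketched as follows. Every name here (ITagClient, TagReading, TagQuality, NullTagClient, ClientSelection) is a hypothetical stand-in, since the service's real client interface is not shown in this document, and the precedence between the override and the real client is an assumption.

```csharp
// All names below are hypothetical stand-ins; the service's real client
// interface and value types are not shown in this document.
public enum TagQuality { Good, Bad }

public struct TagReading
{
    public TagReading(object value, TagQuality quality)
    {
        Value = value;
        Quality = quality;
    }

    public object Value { get; }
    public TagQuality Quality { get; }
}

public interface ITagClient
{
    TagReading Read(string tagName);
}

// Null-object fallback: every read succeeds but reports bad quality.
public sealed class NullTagClient : ITagClient
{
    public TagReading Read(string tagName)
    {
        return new TagReading(null, TagQuality.Bad);
    }
}

public static class ClientSelection
{
    // Assumed precedence: an override wins, then the real client,
    // then the null fallback.
    public static ITagClient Effective(ITagClient real, ITagClient overrideClient)
    {
        return overrideClient ?? real ?? new NullTagClient();
    }
}
```

Because callers always receive a usable client, the rest of the startup path needs no null checks; a failed connection surfaces to OPC UA clients as bad-quality values rather than as an exception.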
Fatal startup failure
If a critical step (configuration validation, OPC UA server start) throws, Start() catches the exception, logs it at Fatal, and re-throws to let TopShelf report the failure.
Logging
The service uses Serilog with two sinks configured in Program.Main():
```csharp
Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Information()
    .WriteTo.Console()
    .WriteTo.File(
        path: "logs/lmxopcua-.log",
        rollingInterval: RollingInterval.Day,
        retainedFileCountLimit: 31)
    .CreateLogger();
```
| Sink | Details |
|---|---|
| Console | Writes to stdout, useful when running as a console application |
| Rolling file | Writes to logs/lmxopcua-{date}.log, rolls daily, retains the 31 most recent log files (about 31 days of history) |
Log files are written relative to the executable directory (see Working Directory above). Each component creates its own contextual logger using Log.ForContext<T>() or Log.ForContext(typeof(T)).
Log.CloseAndFlush() is called in the finally block of Program.Main() to ensure all buffered log entries are written before process exit.
Multi-Instance Deployment
The service supports running multiple instances for redundancy. Each instance requires:
- A unique Windows service name (e.g., LmxOpcUa, LmxOpcUa2)
- A unique OPC UA port and dashboard port
- A unique OpcUa.ApplicationUri and OpcUa.ServerName
- A unique MxAccess.ClientName
- Matching Redundancy.ServerUris arrays on all instances
Install additional instances using TopShelf's -servicename flag:
```
cd C:\publish\lmxopcua\instance2
ZB.MOM.WW.LmxOpcUa.Host.exe install -servicename "LmxOpcUa2" -displayname "LMX OPC UA Server (Instance 2)"
```
See Redundancy Guide for full deployment details.
Required Runtime Assemblies
The build uses Costura.Fody to embed all NuGet dependencies into the single ZB.MOM.WW.LmxOpcUa.Host.exe. However, the following ArchestrA and Historian DLLs are excluded from embedding and must be present alongside the executable at runtime:
| Assembly | Purpose |
|---|---|
| ArchestrA.MxAccess.dll | MXAccess COM interop: runtime data access to Galaxy tags |
| aahClientManaged.dll | Wonderware Historian managed SDK: historical data queries |
| aahClient.dll | Historian native dependency |
| aahClientCommon.dll | Historian native dependency |
| Historian.CBE.dll | Historian native dependency |
| Historian.DPAPI.dll | Historian native dependency |
| ArchestrA.CloudHistorian.Contract.dll | Historian contract dependency |
These DLLs are sourced from the lib/ folder in the repository and are copied to the build output directory automatically. When deploying, ensure all seven DLLs are in the same directory as the executable.
These assemblies are not redistributable — they are provided by the AVEVA System Platform and Historian installations on the target machine. The copies in lib/ are taken from Program Files (x86)\ArchestrA\Framework\bin on a machine with the platform installed.
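A deployment check for the seven DLLs listed in the table above could look like the sketch below. RuntimeAssemblyCheck and FindMissing are hypothetical names; nothing in this document indicates that the service performs such a check itself.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical deployment check; not part of the service.
public static class RuntimeAssemblyCheck
{
    // The seven assemblies that are excluded from embedding.
    public static readonly string[] Required =
    {
        "ArchestrA.MxAccess.dll",
        "aahClientManaged.dll",
        "aahClient.dll",
        "aahClientCommon.dll",
        "Historian.CBE.dll",
        "Historian.DPAPI.dll",
        "ArchestrA.CloudHistorian.Contract.dll"
    };

    // Returns the names of required assemblies that are not present in the directory.
    public static List<string> FindMissing(string directory)
    {
        return Required
            .Where(name => !File.Exists(Path.Combine(directory, name)))
            .ToList();
    }
}
```

Calling FindMissing with the executable's directory (for example AppDomain.CurrentDomain.BaseDirectory) and logging any returned names before starting the service would turn a late assembly-load failure into an early, readable message.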
Platform Target
The service must be compiled and run as x86 (32-bit). The MXAccess COM toolkit DLLs in Program Files (x86)\ArchestrA\Framework\bin are 32-bit only. Running the service as x64, or as AnyCPU without the Prefer 32-bit option (which loads as a 64-bit process on 64-bit Windows), causes COM interop failures when creating the LMXProxyServer object on the STA thread.
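A guard like the following could fail fast when the process is not 32-bit, before any COM object is created. This is only a sketch: PlatformCheck and EnsureX86 are hypothetical names, and the document does not say the service includes such a guard. Environment.Is64BitProcess is a standard .NET property.

```csharp
using System;

// Hypothetical guard; not part of the service.
public static class PlatformCheck
{
    // True when the current process is running as 32-bit.
    public static bool Is32BitProcess()
    {
        return !Environment.Is64BitProcess;
    }

    // Throws if the process was started as 64-bit.
    public static void EnsureX86()
    {
        if (!Is32BitProcess())
        {
            throw new PlatformNotSupportedException(
                "This service must run as a 32-bit (x86) process for MXAccess COM interop.");
        }
    }
}
```

Calling EnsureX86() at the top of Program.Main() would report a wrong build configuration as a clear message instead of a COM activation error later in startup.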