# Service Hosting

## Overview

The service runs as a Windows service or console application using TopShelf for lifecycle management. It targets .NET Framework 4.8 with an x86 (32-bit) platform target, which is required for MXAccess COM interop with the ArchestrA runtime DLLs.

## TopShelf Configuration

`Program.Main()` configures TopShelf to manage the `OpcUaService` lifecycle:

```csharp
var exitCode = HostFactory.Run(host =>
{
    host.UseSerilog();

    host.Service<OpcUaService>(svc =>
    {
        svc.ConstructUsing(() => new OpcUaService());
        svc.WhenStarted(s => s.Start());
        svc.WhenStopped(s => s.Stop());
    });

    host.SetServiceName("LmxOpcUa");
    host.SetDisplayName("LMX OPC UA Server");
    host.SetDescription("OPC UA server exposing System Platform Galaxy tags via MXAccess.");
    host.RunAsLocalSystem();
    host.StartAutomatically();
});
```

TopShelf provides these deployment modes from the same executable:

| Command | Description |
|---------|-------------|
| `ZB.MOM.WW.LmxOpcUa.Host.exe` | Run as a console application (foreground) |
| `ZB.MOM.WW.LmxOpcUa.Host.exe install` | Install as a Windows service |
| `ZB.MOM.WW.LmxOpcUa.Host.exe uninstall` | Remove the Windows service |
| `ZB.MOM.WW.LmxOpcUa.Host.exe start` | Start the installed service |
| `ZB.MOM.WW.LmxOpcUa.Host.exe stop` | Stop the installed service |

The service is configured to run as `LocalSystem` and start automatically on boot.

## Working Directory

Before configuring Serilog, `Program.Main()` sets the working directory to the executable's location:

```csharp
Environment.CurrentDirectory = AppDomain.CurrentDomain.BaseDirectory;
```

This is necessary because Windows services default their working directory to `System32`, which would cause relative log paths and `appsettings.json` to resolve incorrectly.

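To make the effect concrete, the following minimal sketch (not taken from the service source) prints how a relative path such as `logs` resolves before and after the assignment. The `WorkingDirectoryDemo` class and its `ResolveRelative` helper are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;

public static class WorkingDirectoryDemo
{
    // Hypothetical helper: resolves a relative path against the
    // process's current working directory.
    public static string ResolveRelative(string relativePath)
    {
        return Path.GetFullPath(relativePath);
    }

    public static void Main()
    {
        // When launched as a Windows service, this typically resolves
        // under the System32 directory.
        Console.WriteLine("Before: " + ResolveRelative("logs"));

        // Same assignment as in Program.Main().
        Environment.CurrentDirectory = AppDomain.CurrentDomain.BaseDirectory;

        // Now the path resolves next to the executable.
        Console.WriteLine("After:  " + ResolveRelative("logs"));
    }
}
```

When started from a console in the executable's own folder, both lines print the same path; the difference only shows up when the process is launched from another directory or by the Service Control Manager.
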
## Startup Sequence

`OpcUaService.Start()` executes the following steps in order. If any required step fails, the service logs the error and throws, preventing a partially initialized state.

1. **Load configuration** -- The production constructor reads `appsettings.json`, optional environment overlay, and environment variables, then binds each section to its typed configuration class.
2. **Validate configuration** -- `ConfigurationValidator.ValidateAndLog()` logs all resolved values and checks required constraints (port range, non-empty names and connection strings). If validation fails, the service throws `InvalidOperationException`.
3. **Register exception handler** -- Registers `AppDomain.CurrentDomain.UnhandledException` to log fatal unhandled exceptions with `IsTerminating` context.
4. **Create performance metrics** -- Creates the `PerformanceMetrics` instance and a `CancellationTokenSource` for coordinating shutdown.
5. **Create and connect MXAccess client** -- Starts the STA COM thread, creates the `MxAccessClient`, and attempts an initial connection. If the connection fails, the service logs a warning and continues -- the monitor loop will retry in the background.
6. **Start MXAccess monitor** -- Starts the connectivity monitor loop that probes the runtime connection at the configured interval and handles auto-reconnect.
7. **Test Galaxy repository connection** -- Calls `TestConnectionAsync()` on the Galaxy repository to verify the SQL Server database is reachable. If it fails, the service continues without initial address-space data.
8. **Create OPC UA server host** -- Creates `OpcUaServerHost` with the effective MXAccess client (real, override, or null fallback), performance metrics, and optional historian data source.
9. **Query Galaxy hierarchy** -- Fetches the object hierarchy and attribute definitions from the Galaxy repository database, recording object and attribute counts.
10. **Start server and build address space** -- Starts the OPC UA server, retrieves the `LmxNodeManager`, and calls `BuildAddressSpace()` with the queried hierarchy and attributes. If the query or build fails, the server still starts with an empty address space.
11. **Start change detection** -- Creates and starts `ChangeDetectionService`, which polls `galaxy.time_of_last_deploy` at the configured interval. When a change is detected, it triggers an address-space rebuild via the `OnGalaxyChanged` event.
12. **Start status dashboard** -- Creates the `HealthCheckService` and `StatusReportService`, wires in all live components, and starts the `StatusWebServer` HTTP listener if the dashboard is enabled.
13. **Log startup complete** -- Logs "LmxOpcUa service started successfully" at `Information` level.

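The list above mixes two kinds of steps: required ones that abort startup on failure (for example configuration validation) and optional ones that log a warning and continue (for example the initial MXAccess connection). The sketch below illustrates one way to express that split. It is an assumption about structure, not the service's actual code; `StartupSteps`, `RunRequired`, and `TryOptional` are hypothetical names, and the logging callbacks stand in for Serilog.

```csharp
using System;

public static class StartupSteps
{
    // Runs a step that must succeed. Logs the error and rethrows so the
    // host (TopShelf in the real service) sees the failure.
    public static void RunRequired(string name, Action step, Action<string> logError)
    {
        try
        {
            step();
        }
        catch (Exception ex)
        {
            logError(name + " failed, aborting startup: " + ex.Message);
            throw; // rethrow without resetting the stack trace
        }
    }

    // Runs a step that may fail without aborting startup.
    // Returns false when the step threw.
    public static bool TryOptional(string name, Action step, Action<string> logWarning)
    {
        try
        {
            step();
            return true;
        }
        catch (Exception ex)
        {
            logWarning(name + " failed, continuing: " + ex.Message);
            return false;
        }
    }

    public static void Main()
    {
        RunRequired("Validate configuration", () => { }, Console.Error.WriteLine);

        // Simulated connection failure: startup continues regardless.
        bool connected = TryOptional(
            "Connect MXAccess",
            () => { throw new TimeoutException("runtime not reachable"); },
            Console.WriteLine);

        Console.WriteLine("MXAccess connected: " + connected);
    }
}
```

Keeping the distinction in the helper name makes it visible at each call site whether a failure is tolerated.
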
## Shutdown Sequence

`OpcUaService.Stop()` tears down components in reverse dependency order:

1. **Cancel operations** -- Signals the `CancellationTokenSource` to stop all background loops.
2. **Stop change detection** -- Stops the Galaxy deploy polling loop.
3. **Stop OPC UA server** -- Shuts down the OPC UA server host, disconnecting all client sessions.
4. **Stop MXAccess monitor** -- Stops the connectivity monitor loop.
5. **Disconnect MXAccess** -- Disconnects the MXAccess client and releases COM resources.
6. **Dispose STA thread** -- Shuts down the dedicated STA COM thread and its message pump.
7. **Stop dashboard** -- Disposes the `StatusWebServer` HTTP listener.
8. **Dispose metrics** -- Releases the performance metrics collector.
9. **Dispose change detection** -- Releases the change detection service.
10. **Unregister exception handler** -- Removes the `AppDomain.UnhandledException` handler.

The entire shutdown is wrapped in a `try/catch` that logs warnings for errors during cleanup, ensuring the service exits even if a component fails to dispose cleanly.

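As a rough illustration of the pattern described above (cancel first, then run teardown steps in a fixed order inside a single `try/catch`), consider the following sketch. `ShutdownSketch` and `AddStep` are hypothetical; the real `OpcUaService.Stop()` is not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class ShutdownSketch
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private readonly List<KeyValuePair<string, Action>> _steps =
        new List<KeyValuePair<string, Action>>();

    // Background loops would observe this token and exit when it is cancelled.
    public CancellationToken Token
    {
        get { return _cts.Token; }
    }

    // Register teardown actions in the order they should run during Stop().
    public void AddStep(string name, Action action)
    {
        _steps.Add(new KeyValuePair<string, Action>(name, action));
    }

    public void Stop(Action<string> logWarning)
    {
        try
        {
            _cts.Cancel(); // signal background loops first

            foreach (var step in _steps)
            {
                step.Value();
            }
        }
        catch (Exception ex)
        {
            // Swallow and log so the process can still exit.
            logWarning("Error during shutdown: " + ex.Message);
        }
    }
}
```

One consequence of a single outer `try/catch` is that an exception in one step skips the steps after it. Whether the service additionally guards individual steps is not stated in this document.
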
## Error Handling

### Unhandled exceptions

`AppDomain.CurrentDomain.UnhandledException` is registered at startup and removed at shutdown. The handler logs the exception at `Fatal` level with the `IsTerminating` flag:

```csharp
Log.Fatal(e.ExceptionObject as Exception,
    "Unhandled exception (IsTerminating={IsTerminating})", e.IsTerminating);
```

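Because the handler is both registered and later removed, it has to be attached as a named method rather than an anonymous lambda; otherwise `-=` has no matching delegate to remove. The sketch below shows that shape under stated assumptions: `UnhandledExceptionHook` is a hypothetical wrapper, and the `Action<Exception, bool>` callback stands in for the `Log.Fatal` call shown above.

```csharp
using System;

public sealed class UnhandledExceptionHook : IDisposable
{
    private readonly Action<Exception, bool> _log;

    public UnhandledExceptionHook(Action<Exception, bool> log)
    {
        _log = log;
        AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;
    }

    // Public only so the handler can be exercised without crashing a process.
    public void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        // ExceptionObject is typed as object, so the cast can yield null.
        _log(e.ExceptionObject as Exception, e.IsTerminating);
    }

    public void Dispose()
    {
        // Same instance and method, so this removes the delegate added above.
        AppDomain.CurrentDomain.UnhandledException -= OnUnhandledException;
    }
}
```

A caller would create the hook during startup and dispose it as the last shutdown step, mirroring the register and unregister steps in the sequences above.
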
### Startup resilience

The startup sequence is designed to degrade gracefully rather than fail entirely:

- If MXAccess connection fails, the service continues with a `NullMxAccessClient` that returns bad-quality values for all reads.
- If the Galaxy repository database is unreachable, the OPC UA server starts with an empty address space.
- If the status dashboard port is in use, the dashboard logs a warning and does not start, but the OPC UA server continues.

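The first bullet describes a null-object fallback. The sketch below shows the general idea with simplified, hypothetical stand-in types (`ITagReader`, `TagValue`, `TagQuality`, `NullTagReader`); the real MXAccess client interface and `NullMxAccessClient` are not shown in this document and will differ.

```csharp
using System;

// Hypothetical, simplified stand-ins for illustration only.
public enum TagQuality { Good, Bad }

public struct TagValue
{
    public object Value;
    public TagQuality Quality;
}

public interface ITagReader
{
    TagValue Read(string tagName);
}

// Null-object fallback: never throws, always reports bad quality.
public sealed class NullTagReader : ITagReader
{
    public TagValue Read(string tagName)
    {
        return new TagValue { Value = null, Quality = TagQuality.Bad };
    }
}

public static class NullReaderDemo
{
    // Picks the real reader when one is available, otherwise the fallback.
    public static ITagReader Effective(ITagReader real)
    {
        return real ?? new NullTagReader();
    }

    public static void Main()
    {
        ITagReader reader = Effective(null); // simulate a failed connection
        Console.WriteLine(reader.Read("Tank01.Level").Quality);
    }
}
```

The benefit of this design is that downstream code never needs a null check: it reads through the same interface and simply sees bad-quality values until a real connection exists.
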
### Fatal startup failure

If a critical step (configuration validation, OPC UA server start) throws, `Start()` catches the exception, logs it at `Fatal`, and re-throws to let TopShelf report the failure.

## Logging

The service uses Serilog with two sinks configured in `Program.Main()`:

```csharp
Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Information()
    .WriteTo.Console()
    .WriteTo.File(
        path: "logs/lmxopcua-.log",
        rollingInterval: RollingInterval.Day,
        retainedFileCountLimit: 31)
    .CreateLogger();
```

| Sink | Details |
|------|---------|
| Console | Writes to stdout, useful when running as a console application |
| Rolling file | Writes to `logs/lmxopcua-{date}.log`, rolls daily, retains the 31 most recent files (about one month of history) |

Log files are written relative to the executable directory (see Working Directory above). Each component creates its own contextual logger using `Log.ForContext<T>()` or `Log.ForContext(typeof(T))`.

`Log.CloseAndFlush()` is called in the `finally` block of `Program.Main()` to ensure all buffered log entries are written before process exit.

## Multi-Instance Deployment

The service supports running multiple instances for redundancy. Each instance requires:

- A unique Windows service name (e.g., `LmxOpcUa`, `LmxOpcUa2`)
- A unique OPC UA port and dashboard port
- A unique `OpcUa.ApplicationUri` and `OpcUa.ServerName`
- A unique `MxAccess.ClientName`
- Matching `Redundancy.ServerUris` arrays on all instances

Install additional instances using TopShelf's `-servicename` flag:

```bash
cd C:\publish\lmxopcua\instance2
ZB.MOM.WW.LmxOpcUa.Host.exe install -servicename "LmxOpcUa2" -displayname "LMX OPC UA Server (Instance 2)"
```

See [Redundancy Guide](Redundancy.md) for full deployment details.

## Required Runtime Assemblies

The build uses Costura.Fody to embed all NuGet dependencies into the single `ZB.MOM.WW.LmxOpcUa.Host.exe`. However, the following ArchestrA and Historian DLLs are **excluded from embedding** and must be present alongside the executable at runtime:

| Assembly | Purpose |
|----------|---------|
| `ArchestrA.MxAccess.dll` | MXAccess COM interop: runtime data access to Galaxy tags |
| `aahClientManaged.dll` | Wonderware Historian managed SDK: historical data queries |
| `aahClient.dll` | Historian native dependency |
| `aahClientCommon.dll` | Historian native dependency |
| `Historian.CBE.dll` | Historian native dependency |
| `Historian.DPAPI.dll` | Historian native dependency |
| `ArchestrA.CloudHistorian.Contract.dll` | Historian contract dependency |

These DLLs are sourced from the `lib/` folder in the repository and are copied to the build output directory automatically. When deploying, ensure all seven DLLs are in the same directory as the executable.

These assemblies are not redistributable. They are provided by the AVEVA System Platform and Historian installations on the target machine. The copies in `lib/` are taken from `Program Files (x86)\ArchestrA\Framework\bin` on a machine with the platform installed.

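A deployment can be checked for the seven files before the service is started. The following is a minimal sketch of such a check, not part of the service itself; `RuntimeAssemblyCheck` and `FindMissing` are hypothetical names, and the file list is copied from the table above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class RuntimeAssemblyCheck
{
    // File names taken from the table of non-embedded assemblies.
    public static readonly string[] Required =
    {
        "ArchestrA.MxAccess.dll",
        "aahClientManaged.dll",
        "aahClient.dll",
        "aahClientCommon.dll",
        "Historian.CBE.dll",
        "Historian.DPAPI.dll",
        "ArchestrA.CloudHistorian.Contract.dll"
    };

    // Returns the required files that are not present in the given directory.
    public static List<string> FindMissing(string directory)
    {
        var missing = new List<string>();
        foreach (var name in Required)
        {
            if (!File.Exists(Path.Combine(directory, name)))
            {
                missing.Add(name);
            }
        }
        return missing;
    }

    public static void Main()
    {
        var missing = FindMissing(AppDomain.CurrentDomain.BaseDirectory);
        Console.WriteLine(missing.Count == 0
            ? "All required assemblies are present."
            : "Missing: " + string.Join(", ", missing));
    }
}
```

The check only confirms that the files exist; it does not verify versions or that the underlying platform components are installed and registered.
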
## Platform Target

The service must be compiled and run as x86 (32-bit). The MXAccess COM toolkit DLLs in `Program Files (x86)\ArchestrA\Framework\bin` are 32-bit only. Running the service as x64, or as AnyCPU without the "Prefer 32-bit" option (which yields a 64-bit process on 64-bit Windows), causes COM interop failures when creating the `LMXProxyServer` object on the STA thread.

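A misconfigured build can be caught early with a bitness check at process start. The sketch below is one possible guard, assuming it would run before any COM objects are created; `PlatformGuard` and its methods are hypothetical and not taken from the service source.

```csharp
using System;

public static class PlatformGuard
{
    // True when the current process is running as 32-bit.
    public static bool Is32BitProcess()
    {
        return !Environment.Is64BitProcess;
    }

    // Throws when the process is 64-bit, so the failure is reported with a
    // clear message instead of surfacing later as a COM interop error.
    public static void EnsureX86()
    {
        if (!Is32BitProcess())
        {
            throw new PlatformNotSupportedException(
                "This process is 64-bit; the MXAccess DLLs are 32-bit only. " +
                "Build with the x86 platform target.");
        }
    }

    public static void Main()
    {
        Console.WriteLine("32-bit process: " + Is32BitProcess());
        EnsureX86();
    }
}
```
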