Renames all 11 projects (5 src + 6 tests), the .slnx solution file, all source-file namespaces, all axaml namespace references, and all v1 documentation references in CLAUDE.md and docs/*.md (excluding docs/v2/ which is already in OtOpcUa form). Also updates the TopShelf service registration name from "LmxOpcUa" to "OtOpcUa" per Phase 0 Task 0.6.
Preserves runtime identifiers per Phase 0 Out-of-Scope rules to avoid breaking v1/v2 client trust during coexistence: OPC UA `ApplicationUri` defaults (`urn:{GalaxyName}:LmxOpcUa`), server `EndpointPath` (`/LmxOpcUa`), `ServerName` default (feeds cert subject CN), `MxAccessConfiguration.ClientName` default (defensive — stays "LmxOpcUa" for MxAccess audit-trail consistency), client OPC UA identifiers (`ApplicationName = "LmxOpcUaClient"`, `ApplicationUri = "urn:localhost:LmxOpcUaClient"`, cert directory `%LocalAppData%\LmxOpcUaClient\pki\`), and the `LmxOpcUaServer` class name (class rename out of Phase 0 scope per Task 0.5 sed pattern; happens in Phase 1 alongside `LmxNodeManager → GenericDriverNodeManager` Core extraction). 23 LmxOpcUa references retained, all enumerated and justified in `docs/v2/implementation/exit-gate-phase-0.md`.
Build clean: 0 errors, 30 warnings (lower than baseline 167). Tests at strict improvement over baseline: 821 passing / 1 failing vs baseline 820 / 2 (one flaky pre-existing failure passed this run; the other still fails — both pre-existing and unrelated to the rename). `Client.UI.Tests`, `Historian.Aveva.Tests`, `Client.Shared.Tests`, `IntegrationTests` all match baseline exactly. Exit gate compliance results recorded in `docs/v2/implementation/exit-gate-phase-0.md` with all 7 checks PASS or DEFERRED-to-PR-review (#7 service install verification needs Windows service permissions on the reviewer's box).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
191 lines
11 KiB
Markdown
# Service Hosting

## Overview

The service runs as a Windows service or console application using TopShelf for lifecycle management. It targets .NET Framework 4.8 with an x86 (32-bit) platform target, which is required for MXAccess COM interop with the ArchestrA runtime DLLs.

## TopShelf Configuration

`Program.Main()` configures TopShelf to manage the `OpcUaService` lifecycle:

```csharp
var exitCode = HostFactory.Run(host =>
{
    host.UseSerilog();

    host.Service<OpcUaService>(svc =>
    {
        svc.ConstructUsing(() => new OpcUaService());
        svc.WhenStarted(s => s.Start());
        svc.WhenStopped(s => s.Stop());
    });

    host.SetServiceName("LmxOpcUa");
    host.SetDisplayName("LMX OPC UA Server");
    host.SetDescription("OPC UA server exposing System Platform Galaxy tags via MXAccess.");
    host.RunAsLocalSystem();
    host.StartAutomatically();
});
```

TopShelf provides these deployment modes from the same executable:

| Command | Description |
|---------|-------------|
| `OtOpcUa.Host.exe` | Run as a console application (foreground) |
| `OtOpcUa.Host.exe install` | Install as a Windows service |
| `OtOpcUa.Host.exe uninstall` | Remove the Windows service |
| `OtOpcUa.Host.exe start` | Start the installed service |
| `OtOpcUa.Host.exe stop` | Stop the installed service |

The service is configured to run as `LocalSystem` and start automatically on boot.

## Working Directory

Before configuring Serilog, `Program.Main()` sets the working directory to the executable's location:

```csharp
Environment.CurrentDirectory = AppDomain.CurrentDomain.BaseDirectory;
```

This is necessary because Windows services default their working directory to `System32`, which would cause relative log paths and `appsettings.json` to resolve incorrectly.

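As a complementary illustration, the sketch below resolves a relative path against the executable's directory explicitly, so the result does not depend on the process working directory at all. `BaseDirectoryPaths` and `Resolve` are hypothetical names used only for this example; they are not part of the service.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the service code base.
public static class BaseDirectoryPaths
{
    // Combines the executable's directory with a relative path and normalizes it.
    // Unlike a bare relative path, the result does not depend on
    // Environment.CurrentDirectory (System32 for a Windows service).
    public static string Resolve(string relativePath)
    {
        string baseDir = AppDomain.CurrentDomain.BaseDirectory;
        return Path.GetFullPath(Path.Combine(baseDir, relativePath));
    }
}
```

Setting `Environment.CurrentDirectory` once, as the service does, has the advantage that libraries which accept only relative paths also resolve them correctly.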
## Startup Sequence

`OpcUaService.Start()` executes the following steps in order. If any required step fails, the service logs the error and throws, preventing a partially initialized state.

1. **Load configuration** -- The production constructor reads `appsettings.json`, an optional environment overlay, and environment variables, then binds each section to its typed configuration class.
2. **Validate configuration** -- `ConfigurationValidator.ValidateAndLog()` logs all resolved values and checks required constraints (port range, non-empty names and connection strings). If validation fails, the service throws `InvalidOperationException`.
3. **Register exception handler** -- Registers `AppDomain.CurrentDomain.UnhandledException` to log fatal unhandled exceptions with `IsTerminating` context.
4. **Create performance metrics** -- Creates the `PerformanceMetrics` instance and a `CancellationTokenSource` for coordinating shutdown.
5. **Create and connect MXAccess client** -- Starts the STA COM thread, creates the `MxAccessClient`, and attempts an initial connection. If the connection fails, the service logs a warning and continues -- the monitor loop will retry in the background.
6. **Start MXAccess monitor** -- Starts the connectivity monitor loop that probes the runtime connection at the configured interval and handles auto-reconnect.
7. **Test Galaxy repository connection** -- Calls `TestConnectionAsync()` on the Galaxy repository to verify the SQL Server database is reachable. If it fails, the service continues without initial address-space data.
8. **Create OPC UA server host** -- Creates `OpcUaServerHost` with the effective MXAccess client (real, override, or null fallback), performance metrics, and an optional `IHistorianDataSource` obtained from `HistorianPluginLoader.TryLoad` when `Historian.Enabled=true` (returns `null` if the plugin is absent or fails to load).
9. **Query Galaxy hierarchy** -- Fetches the object hierarchy and attribute definitions from the Galaxy repository database, recording object and attribute counts.
10. **Start server and build address space** -- Starts the OPC UA server, retrieves the `LmxNodeManager`, and calls `BuildAddressSpace()` with the queried hierarchy and attributes. If the query or build fails, the server still starts with an empty address space.
11. **Start change detection** -- Creates and starts `ChangeDetectionService`, which polls `galaxy.time_of_last_deploy` at the configured interval. When a change is detected, it triggers an address-space rebuild via the `OnGalaxyChanged` event.
12. **Start status dashboard** -- Creates the `HealthCheckService` and `StatusReportService`, wires in all live components, and starts the `StatusWebServer` HTTP listener if the dashboard is enabled. If `StatusWebServer.Start()` returns `false` (port already bound, insufficient permissions, etc.), the service logs a warning, disposes the unstarted instance, sets `OpcUaService.DashboardStartFailed = true`, and continues in degraded mode. This matches the warning-and-continue policy applied to the MxAccess connect, the Galaxy DB connect, and the initial address-space build (stability review 2026-04-13, Finding 2).
13. **Log startup complete** -- Logs "LmxOpcUa service started successfully" at `Information` level.

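The split between required steps (fail fast) and optional steps (warn and continue) in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `OpcUaService` code: `StartupRunner`, `RunRequired`, `RunOptional`, and `DegradedSteps` are hypothetical names, and the `Action<string>` stands in for the Serilog logger.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the two failure policies; not the service's code.
public sealed class StartupRunner
{
    private readonly Action<string> _log;

    // Names of optional steps that failed, so callers can report degraded mode.
    public List<string> DegradedSteps { get; } = new List<string>();

    public StartupRunner(Action<string> log)
    {
        _log = log;
    }

    // Required step: log, then rethrow so startup is aborted.
    public void RunRequired(string name, Action step)
    {
        try
        {
            step();
        }
        catch (Exception ex)
        {
            _log("FATAL " + name + ": " + ex.Message);
            throw;
        }
    }

    // Optional step: log a warning, record the failure, and keep going.
    public bool RunOptional(string name, Action step)
    {
        try
        {
            step();
            return true;
        }
        catch (Exception ex)
        {
            _log("WARN " + name + ": " + ex.Message);
            DegradedSteps.Add(name);
            return false;
        }
    }
}
```

Under this sketch, configuration validation would go through `RunRequired`, while the initial MXAccess connection and the Galaxy repository connection test would go through `RunOptional`.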
## Shutdown Sequence

`OpcUaService.Stop()` tears down components in reverse dependency order:

1. **Cancel operations** -- Signals the `CancellationTokenSource` to stop all background loops.
2. **Stop change detection** -- Stops the Galaxy deploy polling loop.
3. **Stop OPC UA server** -- Shuts down the OPC UA server host, disconnecting all client sessions.
4. **Stop MXAccess monitor** -- Stops the connectivity monitor loop.
5. **Disconnect MXAccess** -- Disconnects the MXAccess client and releases COM resources.
6. **Dispose STA thread** -- Shuts down the dedicated STA COM thread and its message pump.
7. **Stop dashboard** -- Disposes the `StatusWebServer` HTTP listener.
8. **Dispose metrics** -- Releases the performance metrics collector.
9. **Dispose change detection** -- Releases the change detection service.
10. **Unregister exception handler** -- Removes the `AppDomain.UnhandledException` handler.

The entire shutdown is wrapped in a `try/catch` that logs warnings for errors during cleanup, ensuring the service exits even if a component fails to dispose cleanly.

## Error Handling

### Unhandled exceptions

`AppDomain.CurrentDomain.UnhandledException` is registered at startup and removed at shutdown. The handler logs the exception at `Fatal` level with the `IsTerminating` flag:

```csharp
Log.Fatal(e.ExceptionObject as Exception,
    "Unhandled exception (IsTerminating={IsTerminating})", e.IsTerminating);
```

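One way to pair the registration with its removal is to subscribe a named method, so the same handler can be unsubscribed at shutdown. The sketch below assumes a hypothetical `UnhandledExceptionHook` wrapper and uses an `Action<string>` in place of the Serilog call; it also shows that `e.ExceptionObject as Exception` yields `null` when the object is not an `Exception`.

```csharp
using System;

// Hypothetical wrapper; the Action<string> stands in for a Serilog Fatal call.
public sealed class UnhandledExceptionHook : IDisposable
{
    private readonly Action<string> _logFatal;

    public UnhandledExceptionHook(Action<string> logFatal)
    {
        _logFatal = logFatal;
        // Subscribe a named method so the same handler can be removed later.
        AppDomain.CurrentDomain.UnhandledException += OnUnhandled;
    }

    // Builds the log line; tolerates an ExceptionObject that is not an Exception.
    public static string Describe(object exceptionObject, bool isTerminating)
    {
        var ex = exceptionObject as Exception;
        string detail = ex != null
            ? ex.GetType().Name + ": " + ex.Message
            : "non-Exception object";
        return "Unhandled exception (IsTerminating=" + isTerminating + ") " + detail;
    }

    private void OnUnhandled(object sender, UnhandledExceptionEventArgs e)
    {
        _logFatal(Describe(e.ExceptionObject, e.IsTerminating));
    }

    // Mirrors the shutdown step that unregisters the handler.
    public void Dispose()
    {
        AppDomain.CurrentDomain.UnhandledException -= OnUnhandled;
    }
}
```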
### Startup resilience

The startup sequence is designed to degrade gracefully rather than fail entirely:

- If MXAccess connection fails, the service continues with a `NullMxAccessClient` that returns bad-quality values for all reads.
- If the Galaxy repository database is unreachable, the OPC UA server starts with an empty address space.
- If the status dashboard port is in use, the dashboard logs a warning and does not start, but the OPC UA server continues.

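The `NullMxAccessClient` fallback is an instance of the null-object pattern. The real client interface is not shown in this document, so the sketch below uses entirely hypothetical types (`ISketchTagReader`, `SketchTagValue`, `SketchQuality`) to illustrate the idea: callers always get a usable reader, and a missing connection shows up as bad quality rather than as an exception.

```csharp
using System;

// All names below are hypothetical; the real client interface is not shown here.
public enum SketchQuality { Good, Bad }

public struct SketchTagValue
{
    public object Value;
    public SketchQuality Quality;
}

public interface ISketchTagReader
{
    SketchTagValue Read(string tagName);
}

// Null-object fallback: never throws, always reports bad quality.
public sealed class NullSketchTagReader : ISketchTagReader
{
    public SketchTagValue Read(string tagName)
    {
        return new SketchTagValue { Value = null, Quality = SketchQuality.Bad };
    }
}

public static class TagReaderSelector
{
    // Uses the connected reader when one exists, otherwise the fallback.
    public static ISketchTagReader Choose(ISketchTagReader connectedOrNull)
    {
        return connectedOrNull ?? new NullSketchTagReader();
    }
}
```

The design choice is that downstream code never needs a null check: it reads through the same interface in both the healthy and the degraded case.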
### Fatal startup failure

If a critical step (configuration validation, OPC UA server start) throws, `Start()` catches the exception, logs it at `Fatal`, and re-throws to let TopShelf report the failure.

## Logging

The service uses Serilog with two sinks configured in `Program.Main()`:

```csharp
Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Information()
    .WriteTo.Console()
    .WriteTo.File(
        path: "logs/lmxopcua-.log",
        rollingInterval: RollingInterval.Day,
        retainedFileCountLimit: 31)
    .CreateLogger();
```

| Sink | Details |
|------|---------|
| Console | Writes to stdout, useful when running as a console application |
| Rolling file | Writes to `logs/lmxopcua-{date}.log`, rolls daily, retains the 31 most recent log files |

Log files are written relative to the executable directory (see Working Directory above). Each component creates its own contextual logger using `Log.ForContext<T>()` or `Log.ForContext(typeof(T))`.

`Log.CloseAndFlush()` is called in the `finally` block of `Program.Main()` to ensure all buffered log entries are written before process exit.

## Multi-Instance Deployment

The service supports running multiple instances for redundancy. Each instance requires:

- A unique Windows service name (e.g., `LmxOpcUa`, `LmxOpcUa2`)
- A unique OPC UA port and dashboard port
- A unique `OpcUa.ApplicationUri` and `OpcUa.ServerName`
- A unique `MxAccess.ClientName`
- Matching `Redundancy.ServerUris` arrays on all instances

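The uniqueness requirements above lend themselves to a simple pre-deployment check. The following is a rough sketch only: `InstanceSettings` and `InstanceConflictCheck` are hypothetical names whose fields mirror the bullets, not the service's actual configuration classes, and the case-insensitive comparison of service names is an assumption of this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical settings holder; fields mirror the bullets above,
// not the service's actual configuration classes.
public sealed class InstanceSettings
{
    public string ServiceName;
    public int OpcUaPort;
    public int DashboardPort;
    public string ApplicationUri;
    public string ServerName;
    public string MxAccessClientName;
}

public static class InstanceConflictCheck
{
    // Returns the names of settings that two instances share but must not.
    public static List<string> FindConflicts(InstanceSettings a, InstanceSettings b)
    {
        var conflicts = new List<string>();
        // Assumption: service names are compared case-insensitively.
        if (string.Equals(a.ServiceName, b.ServiceName, StringComparison.OrdinalIgnoreCase))
            conflicts.Add("ServiceName");
        if (a.OpcUaPort == b.OpcUaPort) conflicts.Add("OpcUaPort");
        if (a.DashboardPort == b.DashboardPort) conflicts.Add("DashboardPort");
        if (a.ApplicationUri == b.ApplicationUri) conflicts.Add("ApplicationUri");
        if (a.ServerName == b.ServerName) conflicts.Add("ServerName");
        if (a.MxAccessClientName == b.MxAccessClientName) conflicts.Add("MxAccessClientName");
        return conflicts;
    }
}
```

An empty result means the two instances can coexist as far as these settings are concerned; the `Redundancy.ServerUris` arrays still need to match and are not covered by this sketch.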
Install additional instances using TopShelf's `-servicename` flag:

```bash
cd C:\publish\lmxopcua\instance2
ZB.MOM.WW.OtOpcUa.Host.exe install -servicename "LmxOpcUa2" -displayname "LMX OPC UA Server (Instance 2)"
```

See [Redundancy Guide](Redundancy.md) for full deployment details.

## Required Runtime Assemblies

The build uses Costura.Fody to embed all NuGet dependencies into the single `ZB.MOM.WW.OtOpcUa.Host.exe`. The only native dependency that must sit alongside the executable in every deployment is the MXAccess COM toolkit:

| Assembly | Purpose |
|----------|---------|
| `ArchestrA.MxAccess.dll` | MXAccess COM interop: runtime data access to Galaxy tags |

The Wonderware Historian SDK is packaged as a **runtime-loaded plugin** so hosts that will not use historical data access do not need the SDK installed. The plugin lives in a `Historian/` subfolder next to `ZB.MOM.WW.OtOpcUa.Host.exe`:

```
ZB.MOM.WW.OtOpcUa.Host.exe
ArchestrA.MxAccess.dll
Historian/
    ZB.MOM.WW.OtOpcUa.Historian.Aveva.dll
    aahClientManaged.dll
    aahClientCommon.dll
    aahClient.dll
    Historian.CBE.dll
    Historian.DPAPI.dll
    ArchestrA.CloudHistorian.Contract.dll
```

At startup, if `Historian.Enabled=true` in `appsettings.json`, `HistorianPluginLoader` probes `Historian/ZB.MOM.WW.OtOpcUa.Historian.Aveva.dll` via `Assembly.LoadFrom` and instantiates the plugin's entry point. An `AppDomain.AssemblyResolve` handler redirects the SDK assembly lookups (`aahClientManaged`, `aahClientCommon`, …) to the same subfolder so the CLR can resolve them when the plugin first JITs. If the plugin directory is absent or any SDK dependency fails to load, the loader logs a warning and the server continues to run with history support disabled; `LmxNodeManager` then returns `BadHistoryOperationUnsupported` for every history call.

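The `AssemblyResolve` redirection described above can be sketched roughly as follows. This is an illustration of the general technique, not the actual `HistorianPluginLoader`: `HistorianProbeSketch`, `FindCandidate`, and `CreateResolver` are hypothetical names, and the sketch assumes each dependency is a file named after the assembly's simple name with a `.dll` extension.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical sketch of redirecting assembly lookups to a plugin folder.
public static class HistorianProbeSketch
{
    // Maps a requested assembly name (simple or full display name) to a DLL path
    // inside pluginDir, or returns null when no such file exists.
    public static string FindCandidate(string pluginDir, string requestedAssemblyName)
    {
        string simpleName = new AssemblyName(requestedAssemblyName).Name;
        string candidate = Path.Combine(pluginDir, simpleName + ".dll");
        return File.Exists(candidate) ? candidate : null;
    }

    // Builds a handler for AppDomain.AssemblyResolve. Returning null tells the
    // CLR this handler could not resolve the assembly.
    public static ResolveEventHandler CreateResolver(string pluginDir)
    {
        return (sender, args) =>
        {
            string path = FindCandidate(pluginDir, args.Name);
            return path == null ? null : Assembly.LoadFrom(path);
        };
    }
}
```

A caller would register the handler before loading the plugin, for example `AppDomain.CurrentDomain.AssemblyResolve += HistorianProbeSketch.CreateResolver(historianDir);`, where `historianDir` is the path of the `Historian/` subfolder.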
Deployment matrix:

| Scenario | Host exe | `ArchestrA.MxAccess.dll` | `Historian/` subfolder |
|----------|----------|--------------------------|------------------------|
| `Historian.Enabled=false` | required | required | **omit** |
| `Historian.Enabled=true` | required | required | required |

`ArchestrA.MxAccess.dll` and the historian SDK DLLs are not redistributable; they are provided by the AVEVA System Platform and Historian installations on the target machine. The copies in `lib/` are taken from `Program Files (x86)\ArchestrA\Framework\bin` on a machine with the platform installed.

## Platform Target

The service must be compiled and run as x86 (32-bit). The MXAccess COM toolkit DLLs in `Program Files (x86)\ArchestrA\Framework\bin` are 32-bit only. Running the service as x64, or as AnyCPU without the Prefer 32-bit option (so that it loads as a 64-bit process on 64-bit Windows), causes COM interop failures when creating the `LMXProxyServer` object on the STA thread.
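A startup guard can surface a wrong platform target early, with a clearer message than a later COM activation failure. The sketch below is an assumption for illustration (the document does not say the service performs such a check); `PlatformGuard` and its members are hypothetical names, while `Environment.Is64BitProcess` is the standard .NET property.

```csharp
using System;

// Hypothetical startup guard; not part of the service as documented.
public static class PlatformGuard
{
    // Pure check so the rule can be exercised without changing process bitness.
    public static bool IsSupported(bool is64BitProcess)
    {
        return !is64BitProcess;
    }

    // Throws early with a clear message instead of failing later in COM activation.
    public static void EnsureThirtyTwoBitProcess()
    {
        if (!IsSupported(Environment.Is64BitProcess))
        {
            throw new PlatformNotSupportedException(
                "MXAccess COM interop requires a 32-bit process; build the host as x86.");
        }
    }
}
```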