deprecate(lmxproxy): move all LmxProxy code, tests, and docs to deprecated/

LmxProxy is no longer needed. Moved the entire lmxproxy/ workspace, DCL
adapter files, and related docs to deprecated/. Removed LmxProxy registration
from DataConnectionFactory, project reference from DCL, protocol option from
UI, and cleaned up all requirement docs.
Joseph Doherty
2026-04-08 15:56:23 -04:00
parent 8423915ba1
commit 9dccf8e72f
220 changed files with 25 additions and 132 deletions


@@ -0,0 +1,121 @@
# Component: HealthAndMetrics
## Purpose
Provides health checking, performance metrics collection, and an HTTP status dashboard for monitoring the LmxProxy service.
## Location
- `src/ZB.MOM.WW.LmxProxy.Host/Health/HealthCheckService.cs` — basic health check.
- `src/ZB.MOM.WW.LmxProxy.Host/Health/DetailedHealthCheckService.cs` — detailed health check with test tag read.
- `src/ZB.MOM.WW.LmxProxy.Host/Metrics/PerformanceMetrics.cs` — operation metrics collection.
- `src/ZB.MOM.WW.LmxProxy.Host/Status/StatusReportService.cs` — status report generation.
- `src/ZB.MOM.WW.LmxProxy.Host/Status/StatusWebServer.cs` — HTTP status endpoint.
## Responsibilities
- Evaluate service health based on connection state, operation success rates, and test tag reads.
- Track per-operation performance metrics (counts, latencies, percentiles).
- Serve an HTML status dashboard, a JSON status endpoint, and a plain-text health endpoint over HTTP.
- Report metrics to logs on a periodic interval.
## 1. Health Checks
### 1.1 Basic Health Check (HealthCheckService)
`CheckHealthAsync()` evaluates:
| Check | Healthy | Degraded |
|-------|---------|----------|
| MxAccess connected | Yes | — |
| Success rate (if > 100 total ops) | ≥ 50% | < 50% |
| Client count | ≤ 100 | > 100 |
Returns a health data dictionary with the keys `scada_connected`, `scada_connection_state`, `total_clients`, `total_tags`, `total_operations`, and `average_success_rate`.
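
The thresholds in the table above can be read as a small decision function. The following is a minimal sketch, not the actual `HealthCheckService` code: the `HealthStatus` enum, the `BasicHealthRules` class, and its parameter names are hypothetical, and mapping a lost MxAccess connection to Unhealthy is an assumption, since the table only lists the healthy case for that check.

```csharp
using System;

// Hypothetical status enum for illustration; the real service may use a framework type.
public enum HealthStatus { Healthy, Degraded, Unhealthy }

public static class BasicHealthRules
{
    // Thresholds taken from the table above.
    public const long MinOperationsForRateCheck = 100;
    public const double MinSuccessRatePercent = 50.0;
    public const int MaxClients = 100;

    public static HealthStatus Evaluate(
        bool connected, long totalOperations, double successRatePercent, int clientCount)
    {
        // Assumption: a missing MxAccess connection is reported as Unhealthy.
        if (!connected)
            return HealthStatus.Unhealthy;

        // The success rate is only considered once more than 100 operations were recorded.
        if (totalOperations > MinOperationsForRateCheck &&
            successRatePercent < MinSuccessRatePercent)
            return HealthStatus.Degraded;

        // More than 100 connected clients degrades the service.
        if (clientCount > MaxClients)
            return HealthStatus.Degraded;

        return HealthStatus.Healthy;
    }
}
```

Keeping the rules in a pure function like this makes the thresholds easy to unit test without a live MxAccess connection.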
### 1.2 Detailed Health Check (DetailedHealthCheckService)
`CheckHealthAsync()` performs an active probe:
1. Checks `IsConnected` — returns **Unhealthy** if not connected.
2. Reads a test tag (default `System.Heartbeat`).
3. If test tag quality is not Good — returns **Degraded**.
4. If test tag timestamp is older than **5 minutes** — returns **Degraded** (stale data detection).
5. Otherwise returns **Healthy**.
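
The five steps above reduce to an ordered series of checks. The sketch below illustrates that ordering only; `ProbeStatus`, `DetailedHealthRules`, and the parameters are hypothetical names, and the test tag read itself (step 2) is assumed to have happened already, with its quality and timestamp passed in.

```csharp
using System;

// Hypothetical status enum for illustration.
public enum ProbeStatus { Healthy, Degraded, Unhealthy }

public static class DetailedHealthRules
{
    // Stale-data threshold from step 4.
    public static readonly TimeSpan MaxTagAge = TimeSpan.FromMinutes(5);

    public static ProbeStatus Evaluate(
        bool isConnected, bool qualityIsGood, DateTime tagTimestampUtc, DateTime nowUtc)
    {
        // Step 1: no connection means the probe cannot succeed.
        if (!isConnected)
            return ProbeStatus.Unhealthy;

        // Step 3: a test tag with non-Good quality degrades the service.
        if (!qualityIsGood)
            return ProbeStatus.Degraded;

        // Step 4: a timestamp older than 5 minutes indicates stale data.
        if (nowUtc - tagTimestampUtc > MaxTagAge)
            return ProbeStatus.Degraded;

        // Step 5: all checks passed.
        return ProbeStatus.Healthy;
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the method, keeps the stale-data rule deterministic under test.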
## 2. Performance Metrics
### 2.1 Tracking
`PerformanceMetrics` uses a `ConcurrentDictionary<string, OperationMetrics>` to track operations by name.
Operations tracked: `Read`, `ReadBatch`, `Write`, `WriteBatch` (recorded by ScadaGrpcService).
### 2.2 Recording
Two recording patterns:
- `RecordOperation(name, duration, success)` — explicit recording.
- `BeginOperation(name)` — returns an `ITimingScope` (disposable). On dispose, automatically records duration (via `Stopwatch`) and success flag (set via `SetSuccess(bool)`).
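
The disposable timing-scope pattern can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the `ITimingScope` shape is inferred from the description above, `TimingScopeSketch` and its callback parameter are hypothetical, and defaulting the success flag to `true` is an assumption.

```csharp
using System;
using System.Diagnostics;

// Assumed shape of the interface named in the text.
public interface ITimingScope : IDisposable
{
    void SetSuccess(bool success);
}

// Hypothetical scope that reports to a callback standing in for RecordOperation.
public sealed class TimingScopeSketch : ITimingScope
{
    private readonly string _name;
    private readonly Action<string, TimeSpan, bool> _record;
    private readonly Stopwatch _stopwatch = Stopwatch.StartNew();
    private bool _success = true; // Assumption: success unless the caller says otherwise.
    private bool _disposed;

    public TimingScopeSketch(string name, Action<string, TimeSpan, bool> record)
    {
        _name = name ?? throw new ArgumentNullException(nameof(name));
        _record = record ?? throw new ArgumentNullException(nameof(record));
    }

    public void SetSuccess(bool success)
    {
        _success = success;
    }

    public void Dispose()
    {
        // Record exactly once, even if Dispose is called repeatedly.
        if (_disposed) return;
        _disposed = true;
        _stopwatch.Stop();
        _record(_name, _stopwatch.Elapsed, _success);
    }
}
```

A caller would typically wrap the operation in a `using` block, call `SetSuccess(false)` in its error path, and let disposal record the elapsed time.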
### 2.3 Per-Operation Statistics
`OperationMetrics` maintains:
- `_totalCount`, `_successCount` — running counters.
- `_totalMilliseconds`, `_minMilliseconds`, `_maxMilliseconds` — latency range.
- `_durations` — rolling buffer of up to **1000 latency samples** for percentile calculation.
`MetricsStatistics` snapshot:
- `TotalCount`, `SuccessCount`, `SuccessRate` (percentage).
- `AverageMilliseconds`, `MinMilliseconds`, `MaxMilliseconds`.
- `Percentile95Milliseconds` — calculated from sorted samples at the 95th percentile index.
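
One way to combine the 1000-sample rolling buffer with a sorted-index percentile is shown below. This is a sketch only: `LatencyWindow` is a hypothetical class, the lock-based synchronization is an assumption, and the nearest-rank index formula is one common convention that the real `OperationMetrics` may not use.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rolling window of latency samples.
public sealed class LatencyWindow
{
    private const int MaxSamples = 1000; // Buffer size from the text above.
    private readonly Queue<double> _durations = new Queue<double>();
    private readonly object _lock = new object();

    public void Add(double milliseconds)
    {
        lock (_lock)
        {
            // Drop the oldest sample once the buffer is full.
            if (_durations.Count == MaxSamples)
                _durations.Dequeue();
            _durations.Enqueue(milliseconds);
        }
    }

    public double Percentile95()
    {
        double[] sorted;
        lock (_lock)
        {
            sorted = _durations.ToArray();
        }

        if (sorted.Length == 0)
            return 0;

        Array.Sort(sorted);

        // Nearest-rank index: ceil(0.95 * n) - 1, computed with integer arithmetic.
        int index = (95 * sorted.Length + 99) / 100 - 1;
        return sorted[index];
    }
}
```

Because only the most recent 1000 samples are kept, the reported P95 reflects recent behaviour, while the running counters and min/max values cover the full lifetime of the operation.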
### 2.4 Periodic Reporting
A timer fires every **60 seconds**, logging a summary of all operation metrics to Serilog.
## 3. Status Web Server
### 3.1 Server
`StatusWebServer` uses `HttpListener` on `http://+:{Port}/` (default port 8080).
- Starts an async request-handling loop, spawning a task per request.
- Graceful shutdown: cancels the listener, waits **5 seconds** for the listener task to exit.
- Returns HTTP 405 for non-GET methods, HTTP 500 on errors.
### 3.2 Endpoints
| Endpoint | Method | Response |
|----------|--------|----------|
| `/` | GET | HTML dashboard (auto-refresh every 30 seconds) |
| `/api/status` | GET | JSON status report (camelCase) |
| `/api/health` | GET | Plain text `OK` (200) or `UNHEALTHY` (503) |
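
The routing rules in sections 3.1 and 3.2 can be summarised as a pure function from method and path to a response. The sketch below is illustrative: `StatusResponse` and `StatusRouter` are hypothetical types, the response bodies for `/` and `/api/status` are placeholders, and returning 404 for unknown paths is an assumption, since the text does not say how they are handled.

```csharp
using System;

// Hypothetical response holder for illustration.
public sealed class StatusResponse
{
    public StatusResponse(int statusCode, string contentType, string body)
    {
        StatusCode = statusCode;
        ContentType = contentType;
        Body = body;
    }

    public int StatusCode { get; }
    public string ContentType { get; }
    public string Body { get; }
}

public static class StatusRouter
{
    public static StatusResponse Route(string method, string path, bool isHealthy)
    {
        // Only GET is supported; everything else returns 405.
        if (!string.Equals(method, "GET", StringComparison.OrdinalIgnoreCase))
            return new StatusResponse(405, "text/plain", "Method Not Allowed");

        switch (path)
        {
            case "/":
                // Placeholder for the HTML dashboard.
                return new StatusResponse(200, "text/html", "<html></html>");
            case "/api/status":
                // Placeholder for the JSON status report.
                return new StatusResponse(200, "application/json", "{}");
            case "/api/health":
                return isHealthy
                    ? new StatusResponse(200, "text/plain", "OK")
                    : new StatusResponse(503, "text/plain", "UNHEALTHY");
            default:
                // Assumption: unknown paths return 404.
                return new StatusResponse(404, "text/plain", "Not Found");
        }
    }
}
```

An `HttpListener` request loop could call such a function and copy the result onto `HttpListenerResponse`, which keeps the routing logic testable without opening a port.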
### 3.3 HTML Dashboard
Generated by `StatusReportService`:
- Bootstrap-like CSS grid layout with status cards.
- Color-coded status: green = Healthy, yellow = Degraded, red = Unhealthy/Error.
- Operations table with columns: Count, SuccessRate, Avg/Min/Max/P95 milliseconds.
- Service metadata: ServiceName, Version (assembly version), connection state.
- Subscription stats: TotalClients, TotalTags, ActiveSubscriptions.
- Auto-refresh via `<meta http-equiv="refresh" content="30">`.
- Last updated timestamp.
### 3.4 JSON Status Report
A nested JSON object with camelCase property names, containing:
- Service metadata, connection status, subscription stats, performance data, health check results.
## Dependencies
- **MxAccessClient** — `IsConnected`, `ConnectionState` for health checks; test tag read for detailed check.
- **SubscriptionManager** — subscription statistics.
- **PerformanceMetrics** — operation statistics for status report and health evaluation.
- **Configuration** — `WebServerConfiguration` for port and prefix.
## Interactions
- **GrpcServer** populates PerformanceMetrics via timing scopes on every RPC.
- **ServiceHost** creates all health/metrics/status components at startup and disposes them at shutdown.
- External monitoring systems can poll `/api/health` for availability checks.