feat(lmxproxy): replace subscribe/unsubscribe health probe with persistent subscription

The old probe did a subscribe-read-unsubscribe cycle every 5 seconds to
check connection health. This created unnecessary churn and didn't detect
the failure mode where long-lived subscriptions silently stop receiving
COM callbacks (e.g. stalled STA message pump). The new approach keeps a
persistent subscription on the health check tag and forces reconnect if
no value update arrives within a configurable threshold (ProbeStaleThresholdMs,
default 5s). Also adds STA message pump debug logging (5-min heartbeat with
message counters) and fixes log file path resolution for Windows services.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-03-24 11:57:22 -04:00
parent b3222cf30b
commit 95168253fc
12 changed files with 112 additions and 155 deletions
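
The staleness check described in the commit message can be sketched roughly as follows. This is an illustration only, written in Python rather than the project's own language, and it is not the code from this commit: the class name `StaleSubscriptionProbe`, its methods, and the injected `clock` and `reconnect` callables are all hypothetical. Only the idea (a persistent subscription plus a `ProbeStaleThresholdMs` window, default 5000 ms) comes from the message above.

```python
import time
from typing import Callable


class StaleSubscriptionProbe:
    """Hypothetical helper: tracks the last update seen on a persistent
    health-check subscription and reports when it has gone stale."""

    def __init__(self, stale_threshold_ms: int = 5000,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # stale_threshold_ms mirrors ProbeStaleThresholdMs (default 5s).
        self._threshold_s = stale_threshold_ms / 1000.0
        self._clock = clock
        # Assumption: the moment of subscribing counts as the first update.
        self._last_update = clock()

    def on_value_update(self) -> None:
        """Call from the subscription callback whenever the tag delivers a value."""
        self._last_update = self._clock()

    def is_stale(self) -> bool:
        """True when no update has arrived within the threshold."""
        return self._clock() - self._last_update > self._threshold_s

    def check(self, reconnect: Callable[[], None]) -> bool:
        """Force a reconnect if stale; return True when one was triggered."""
        if not self.is_stale():
            return False
        reconnect()
        # Restart the window so the fresh connection gets a full threshold.
        self._last_update = self._clock()
        return True
```

A periodic timer would call `check(...)` while the subscription callback calls `on_value_update()`. Unlike a subscribe-read-unsubscribe cycle, this also catches a subscription that stays registered but stops delivering callbacks.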

@@ -34,9 +34,7 @@
"HealthCheck": {
"TestTagAddress": "DevPlatform.Scheduler.ScanTime",
- "ProbeTimeoutMs": 5000,
- "MaxConsecutiveTransportFailures": 3,
- "DegradedProbeIntervalMs": 30000
+ "ProbeStaleThresholdMs": 5000
},
"ServiceRecovery": {
@@ -58,7 +56,8 @@
"Override": {
"Microsoft": "Warning",
"System": "Warning",
- "Grpc": "Information"
+ "Grpc": "Information",
+ "ZB.MOM.WW.LmxProxy.Host.MxAccess.StaComThread": "Debug"
}
},
"WriteTo": [