Compare commits

106 Commits

Author SHA1 Message Date
Joseph Doherty
9dccf8e72f deprecate(lmxproxy): move all LmxProxy code, tests, and docs to deprecated/
LmxProxy is no longer needed. Moved the entire lmxproxy/ workspace, DCL
adapter files, and related docs to deprecated/. Removed LmxProxy registration
from DataConnectionFactory, project reference from DCL, protocol option from
UI, and cleaned up all requirement docs.
2026-04-08 15:56:23 -04:00
Joseph Doherty
8423915ba1 fix(site-runtime): publish quality changes to site stream for real-time debug view updates
HandleConnectionQualityChanged now publishes AttributeValueChanged events
to the SiteStreamManager for all affected attributes. This ensures the
central UI debug view updates in real-time when a data connection
disconnects and attributes go bad quality.

Only publishes to the stream — does NOT notify script or alarm actors,
since the value hasn't changed and firing scripts/alarms on quality-only
changes would cause spurious evaluations.
2026-03-24 16:32:00 -04:00
Joseph Doherty
6df2cbdf90 fix(lmxproxy): support multiple subscriptions per session
Key subscriptions by unique subscriptionId instead of sessionId to prevent
overwrites when the same session calls Subscribe multiple times (e.g. DCL
StaleTagMonitor). Add session-to-subscription reverse lookup for cleanup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 16:30:06 -04:00
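The keying change in 6df2cbdf90 can be pictured with a small registry. This is only a sketch of the idea in the commit message, not the LmxProxy code: the class name, method names, and the `sub-N` id format are hypothetical.

```python
import itertools
from collections import defaultdict


class SubscriptionRegistry:
    """Hypothetical registry: one entry per Subscribe call, not per session."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._subscriptions = {}              # subscription_id -> set of tags
        self._by_session = defaultdict(set)   # session_id -> subscription ids

    def subscribe(self, session_id, tags):
        # A fresh id per call means a second Subscribe from the same
        # session cannot overwrite the first one.
        sub_id = f"sub-{next(self._ids)}"
        self._subscriptions[sub_id] = set(tags)
        self._by_session[session_id].add(sub_id)
        return sub_id

    def unsubscribe_session(self, session_id):
        # The reverse lookup lets cleanup find every subscription the
        # session created; returns the tags that were released.
        released = set()
        for sub_id in self._by_session.pop(session_id, set()):
            released |= self._subscriptions.pop(sub_id, set())
        return released

    def active_tags(self):
        return set().union(*self._subscriptions.values())
```

With a session-keyed dictionary the second `subscribe` call would have replaced the first; here both survive until `unsubscribe_session` removes them together.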
Joseph Doherty
b3076e18db docs(lmxproxy): add stale session subscription fix plan 2026-03-24 16:19:39 -04:00
Joseph Doherty
de7c4067e4 feat(dcl): add debug-level logging for heartbeat subscription callbacks 2026-03-24 16:19:39 -04:00
Joseph Doherty
5fdeaf613f feat(dcl): failover on repeated unstable connections (connect-then-stale pattern)
Previously, failover only triggered when ConnectAsync failed consecutively.
If a connection succeeded but went stale quickly (e.g., heartbeat timeout),
the failure counter reset on each successful connect and failover never
triggered.

Added a separate _consecutiveUnstableDisconnects counter that increments
when a connection lasts less than StableConnectionThreshold (60s) before
disconnecting. When this counter reaches failoverRetryCount, the actor
fails over to the backup endpoint. Stable connections (lasting >60s)
reset this counter.

The original connection-failure failover path is unchanged.
2026-03-24 16:19:39 -04:00
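The connect-then-stale counter described in 5fdeaf613f could look roughly like the sketch below. Assumptions beyond the commit message: the class and method names are invented, a connection lasting exactly the threshold is treated as stable, and the counter is cleared once failover fires.

```python
class FailoverTracker:
    """Hypothetical model of the unstable-disconnect counter."""

    def __init__(self, failover_retry_count, stable_threshold_s=60.0):
        self.failover_retry_count = failover_retry_count
        self.stable_threshold_s = stable_threshold_s
        self.consecutive_unstable_disconnects = 0
        self.active = "primary"

    def on_disconnect(self, connected_for_s):
        """Return True when this disconnect should trigger failover."""
        if connected_for_s >= self.stable_threshold_s:
            # A stable connection resets the counter.
            self.consecutive_unstable_disconnects = 0
            return False
        self.consecutive_unstable_disconnects += 1
        if self.consecutive_unstable_disconnects >= self.failover_retry_count:
            # Assumption: start counting afresh on the new endpoint.
            self.consecutive_unstable_disconnects = 0
            self.active = "backup" if self.active == "primary" else "primary"
            return True
        return False
```

Because the counter is independent of the connect-failure counter, a successful connect no longer hides a connection that keeps dropping within seconds.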
Joseph Doherty
ff2784b862 fix(site-runtime): add SQLite schema migration for backup_configuration column
Existing site databases created before the primary/backup data connections
feature lack the backup_configuration and failover_retry_count columns.
Added TryAddColumnAsync migration that runs on startup after table creation.
2026-03-24 16:19:39 -04:00
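An add-column-if-missing migration of the kind ff2784b862 describes can be sketched with Python's standard sqlite3 module. The helper name mirrors TryAddColumnAsync, but the body, the `data_connections` table name, and the column types in the usage note are assumptions, not the project's code.

```python
import sqlite3


def try_add_column(conn, table, column, ddl_type):
    """Add `column` to `table` unless it already exists.

    Identifiers are interpolated directly, so they must be trusted
    constants from the migration itself, never user input.
    """
    # PRAGMA table_info yields one row per column; index 1 is the name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True
```

Run after table creation on every startup, for example `try_add_column(conn, "data_connections", "backup_configuration", "TEXT")`; the second and later runs are no-ops, so older and newer databases converge on the same schema.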
Joseph Doherty
0d03aec4f2 feat(dcl): log connection disconnect events to site event log 2026-03-24 16:19:39 -04:00
Joseph Doherty
d4397910f0 feat(dcl): add StaleTagMonitor for heartbeat-based disconnect detection
Composable StaleTagMonitor class in Commons fires a Stale event when no
value is received within a configurable max silence period. Integrated
into both LmxProxyDataConnection and OpcUaDataConnection adapters via
optional HeartbeatTagPath/HeartbeatMaxSilence connection config keys.
When stale, the adapter fires Disconnected triggering the standard
reconnect cycle. 10 unit tests cover timer behavior.
2026-03-24 16:19:39 -04:00
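A heartbeat monitor in the spirit of d4397910f0 might be modeled as below. The commit describes a timer-driven class; this sketch instead uses an injected clock and an explicit `check()` call so it stays deterministic, and every name except StaleTagMonitor is hypothetical.

```python
class StaleTagMonitor:
    """Fires on_stale once when no value arrives within max_silence_s."""

    def __init__(self, max_silence_s, on_stale, clock):
        self._max_silence_s = max_silence_s
        self._on_stale = on_stale
        self._clock = clock                  # callable returning seconds
        self._last_value_at = clock()
        self._fired = False

    def on_value_received(self):
        # Any heartbeat value restarts the silence window.
        self._last_value_at = self._clock()
        self._fired = False

    def check(self):
        """Call periodically; returns True on the call that fires on_stale."""
        silent_for = self._clock() - self._last_value_at
        if silent_for > self._max_silence_s and not self._fired:
            self._fired = True
            self._on_stale()
            return True
        return False
```

An adapter would pass a callback that raises its Disconnected event, which then feeds the normal reconnect cycle.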
Joseph Doherty
02a7e8abc6 feat(health): show all cluster nodes (online/offline, primary/standby) in health dashboard
Add NodeStatus record, IClusterNodeProvider interface, and AkkaClusterNodeProvider
that queries Akka cluster membership for all site-role nodes. HealthReportSender
populates ClusterNodes before each report. UI shows a row per node with
hostname, Online/Offline badge, and Primary/Standby badge. Falls back to
single-node display if ClusterNodes is not populated.
2026-03-24 16:19:39 -04:00
Joseph Doherty
65cc7b69cd feat(health): wire up NodeHostname, ConnectionEndpoint, TagQuality, ParkedMessageCount collectors
- AkkaHostedService: SetNodeHostname from NodeOptions
- DataConnectionActor: UpdateConnectionEndpoint on state transitions,
  track per-tag quality counts and UpdateTagQuality on value changes
- HealthReportSender: query StoreAndForwardStorage for parked message count
- StoreAndForwardStorage: add GetParkedMessageCountAsync()
2026-03-24 16:19:39 -04:00
Joseph Doherty
e84a831a02 feat(health): redesign health dashboard with 4-column layout and new metrics
New fields in SiteHealthReport: NodeHostname, DataConnectionEndpoints
(primary/secondary), DataConnectionTagQuality (good/bad/uncertain),
ParkedMessageCount. New collector methods to populate them.

Health dashboard redesigned to match mockup: Nodes | Data Connections
(with per-connection tag quality) | Instances + S&F Buffers | Error
Counts + Parked Messages. Site names resolved from repository.
2026-03-24 16:19:39 -04:00
Joseph Doherty
5e2a4c9080 fix(ui): align TreeView node text by giving toggle and spacer equal fixed width 2026-03-24 16:19:39 -04:00
Joseph Doherty
0abaa47de2 fix(ui): normalize TreeView expanded keys to strings for sessionStorage compatibility
Keys from KeySelector (e.g. boxed int) were compared against string keys
restored from sessionStorage, causing expansion state to be lost on
navigation. All keys are now normalized to strings internally.
2026-03-24 16:19:39 -04:00
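The key mismatch fixed in 0abaa47de2 is easy to reproduce in any language where an integer and its string form are unequal. A minimal illustration, with a hypothetical ExpansionState class standing in for the TreeView's internal state:

```python
class ExpansionState:
    """Hypothetical holder of expanded node keys, normalized to strings."""

    def __init__(self):
        self._expanded = set()

    @staticmethod
    def _normalize(key):
        # Storage round-trips keys as strings, so compare as strings.
        return str(key)

    def expand(self, key):
        self._expanded.add(self._normalize(key))

    def restore(self, stored_keys):
        self._expanded = {self._normalize(k) for k in stored_keys}

    def is_expanded(self, key):
        return self._normalize(key) in self._expanded
```

Without the normalization, a lookup of the integer key `42` against the restored set `{"42"}` would miss, which is the lost-expansion symptom the commit describes.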
Joseph Doherty
a0a6bb4986 refactor(ui): replace manual template inheritance tree with TreeView component 2026-03-24 16:19:39 -04:00
Joseph Doherty
2b5dabb336 refactor(ui): redesign Areas page with TreeView and dedicated Add/Edit/Delete pages
Areas page now shows a single TreeView with sites as roots and areas as
children. Context menus: sites get "Add Area", areas get "Add Child Area",
"Edit Area", "Delete Area" — each navigating to a dedicated page.

The Delete Area page shows a TreeView of the area and all recursive children
with assigned instances. Deletion is blocked if any instances are assigned
to the area or its descendants.
2026-03-24 16:19:39 -04:00
Joseph Doherty
968fc4adc7 fix(ui): disable site and instance dropdowns while debug view is connected 2026-03-24 16:19:39 -04:00
Joseph Doherty
4c7fa03c07 fix(ui): remove default list-style bullets from TreeView ul elements 2026-03-24 16:19:39 -04:00
Joseph Doherty
addbb6ffeb fix(ui): move treeview-storage.js to Host wwwroot where static files are served 2026-03-24 16:19:39 -04:00
Joseph Doherty
f1537b62ca refactor(ui): replace instances table with hierarchical TreeView (Site → Area → Instance) 2026-03-24 16:19:39 -04:00
Joseph Doherty
71894f4ba9 refactor(ui): replace manual area tree rendering with TreeView component 2026-03-24 16:19:39 -04:00
Joseph Doherty
4426f3e928 refactor(ui): replace data connections table with TreeView grouped by site 2026-03-24 16:19:39 -04:00
Joseph Doherty
08d511f609 test(ui): add external filtering tests for TreeView (R8) 2026-03-24 16:19:39 -04:00
Joseph Doherty
4e5b5facec feat(ui): add right-click context menu to TreeView (R15) 2026-03-24 16:19:39 -04:00
Joseph Doherty
f127efe6ea feat(ui): add ExpandAll, CollapseAll, RevealNode to TreeView (R12, R13) 2026-03-24 16:19:39 -04:00
Joseph Doherty
d3a6ed5f68 feat(ui): add sessionStorage persistence for TreeView expansion state (R11) 2026-03-24 16:19:39 -04:00
Joseph Doherty
da4f29f6ee feat(ui): add selection support to TreeView (R5) 2026-03-24 16:19:39 -04:00
Joseph Doherty
75648c0c76 feat(ui): add TreeView<TItem> component with core rendering, expand/collapse, ARIA (R1-R4, R14) 2026-03-24 16:19:39 -04:00
Joseph Doherty
4db93cae2b fix(lmxproxy): fix orphaned tag subscriptions when client subscribes per-tag
When a client calls Subscribe multiple times with the same session ID
(one tag per RPC), each call overwrites the ClientSubscription entry.
UnsubscribeClient only cleaned up tags from the last entry, leaving
earlier tags orphaned in _tagSubscriptions. Now scans all tag
subscriptions for client references during cleanup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 15:43:29 -04:00
Joseph Doherty
eecd82b787 fix(lmxproxy): clean up stale session subscriptions on scavenge and add stream timeout
Grpc.Core doesn't reliably fire CancellationToken on client disconnect,
so Subscribe RPCs can hang forever and leak session subscriptions. Bridge
SessionManager scavenging to SubscriptionManager cleanup, and add a
30-second periodic session validity check in the Subscribe loop so stale
streams exit within 30s of session scavenge rather than hanging until
process restart.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 15:21:06 -04:00
Joseph Doherty
b74e139a85 fix(lmxproxy): reset probe timer after reconnect to prevent false stale triggers
Without this, the staleness check could fire immediately after reconnect
before the first OnDataChange callback arrives, causing a reconnect loop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 15:06:42 -04:00
Joseph Doherty
488a7b534b feat(lmxproxy): add Connected Since and Reconnect Count to status page
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 13:32:46 -04:00
Joseph Doherty
73fe618953 fix(lmxproxy): protect probe subscription from ReadAsync teardown, add instance configs
ReadAsync internally subscribes/unsubscribes the same ScanTime tag used
by the persistent probe, which was tearing down the probe subscription
and triggering false reconnects every ~5s. Guard UnsubscribeInternal and
stored subscription state so the probe tag is never removed by other
callers. Also removes DetailedHealthCheckService (redundant with the
persistent probe), adds per-instance config files (appsettings.v2.json,
appsettings.v2b.json) loaded via LMXPROXY_INSTANCE env var so deploys
no longer overwrite port settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 12:20:05 -04:00
Joseph Doherty
95168253fc feat(lmxproxy): replace subscribe/unsubscribe health probe with persistent subscription
The old probe did a subscribe-read-unsubscribe cycle every 5 seconds to
check connection health. This created unnecessary churn and didn't detect
the failure mode where long-lived subscriptions silently stop receiving
COM callbacks (e.g. stalled STA message pump). The new approach keeps a
persistent subscription on the health check tag and forces reconnect if
no value update arrives within a configurable threshold (ProbeStaleThresholdMs,
default 5s). Also adds STA message pump debug logging (5-min heartbeat with
message counters) and fixes log file path resolution for Windows services.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:57:35 -04:00
Joseph Doherty
b3222cf30b fix(site-runtime): wire EventLogHandlerActor so site event log queries work
The SiteCommunicationActor expected an event log handler but none was
registered, causing "Event log handler not available" on the Event Logs
page and CLI. Bridge IEventLogQueryService to Akka via a simple actor.
2026-03-23 00:37:33 -04:00
Joseph Doherty
64c914019d feat(lmxproxy): always show RPC Operations table, rename from 'Operations'
Table now displays all 5 RPC types (Read, ReadBatch, Write, WriteBatch,
Subscribe) with dashes for zero-count operations instead of hiding the
table entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 00:12:19 -04:00
Joseph Doherty
7f74b660b3 feat(lmxproxy): add delivered/dropped message counts to subscription stats
Subscription metrics (totalDelivered, totalDropped) now visible in
/api/status JSON and HTML dashboard. Card turns yellow if drops > 0.
Aggregated from per-client counters in SubscriptionManager.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 00:07:58 -04:00
Joseph Doherty
59d143e4c8 docs(lmxproxy): update deviations for STA resolution, OnWriteComplete, subscribe fix
- Deviation #2: document three STA iterations (failed → Task.Run → StaComThread)
- Deviation #7: mark resolved — OnWriteComplete now works via STA message pump
- Deviation #8: note awaited subscription creation fixes flaky subscribe test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 23:52:09 -04:00
Joseph Doherty
b218773ab0 fix(lmxproxy): await COM subscription creation to fix Subscribe flakiness
SubscriptionManager.Subscribe was fire-and-forgetting the MxAccess COM
subscription creation. The initial OnDataChange callback could fire
before the subscription was established, losing the first (and possibly
only) value update. Changed to async SubscribeAsync that awaits
CreateMxAccessSubscriptionsAsync before returning the channel reader.

Subscribe_ReceivesUpdates now passes 5/5 consecutive runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 23:48:01 -04:00
Joseph Doherty
84b7b6a7a9 feat(lmxproxy): re-enable OnWriteComplete callback via STA message pump
With StaComThread's GetMessage loop in place, OnWriteComplete callbacks
are now delivered properly. Write flow: dispatch Write() on STA thread,
await OnWriteComplete via TCS, clean up on STA thread. Falls back to
fire-and-forget on timeout as safety net. OnWriteComplete now resolves
or rejects the TCS with MxStatus error details.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 23:35:09 -04:00
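The write flow in 84b7b6a7a9 (dispatch, await a completion callback, fall back on timeout) can be approximated with an asyncio future playing the role of the TCS. This is a sketch under stated assumptions: `dispatch_write`, the return strings, and the use of RuntimeError for a rejected write are all invented for illustration.

```python
import asyncio


async def write_with_completion(dispatch_write, timeout_s):
    """Await a write-complete callback, degrading to fire-and-forget."""
    loop = asyncio.get_running_loop()
    done = loop.create_future()

    def on_write_complete(error=None):
        if done.done():
            return  # late callback after a timeout; nothing to resolve
        if error is None:
            done.set_result("confirmed")
        else:
            done.set_exception(RuntimeError(error))

    dispatch_write(on_write_complete)
    try:
        return await asyncio.wait_for(done, timeout_s)
    except asyncio.TimeoutError:
        # Safety net: the write was sent but never acknowledged.
        return "unconfirmed"
```

A rejected write surfaces as an exception to the caller, while a missing callback only downgrades the result instead of failing the write.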
Joseph Doherty
a326a8cbde fix(lmxproxy): make MxAccess client name unique per instance
Multiple instances registering with the same name may cause MxAccess to
conflict on callback routing. ClientName is now configurable via
appsettings.json, defaulting to a GUID-suffixed name if not set.
Instances A and B use "LmxProxy-A" and "LmxProxy-B" respectively.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 23:18:09 -04:00
Joseph Doherty
a59d4ad76c fix(lmxproxy): use raw Win32 message pump instead of WinForms Application.Run
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 23:18:09 -04:00
Joseph Doherty
b6408726bc feat(lmxproxy): add STA thread with message pump for MxAccess COM callbacks
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 23:18:09 -04:00
Joseph Doherty
c96e71c83c Revert "fix(lmxproxy): resolve subscribe/unsubscribe race condition on client reconnect"
This reverts commit 9e9efbecab399fd7dcfb3e7e14e8b08418c3c3fc.
2026-03-22 23:18:09 -04:00
Joseph Doherty
fa33e1acf1 fix(lmxproxy): resolve subscribe/unsubscribe race condition on client reconnect
Three fixes for the SubscriptionManager/MxAccessClient subscription pipeline:

1. Serialize Subscribe and UnsubscribeClient with a SemaphoreSlim gate to prevent
   race where old-session unsubscribe removes new-session COM subscriptions.
   CreateMxAccessSubscriptionsAsync is now awaited instead of fire-and-forget.

2. Fix dual VTQ delivery in MxAccessClient.OnDataChange — each update was delivered
   twice (once via stored callback, once via OnTagValueChanged property). Now uses
   stored callback as the single delivery path.

3. Store pending tag addresses when CreateMxAccessSubscriptionsAsync fails (MxAccess
   down) and retry them on reconnect via NotifyReconnection/RetryPendingSubscriptionsAsync.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 23:18:08 -04:00
Joseph Doherty
bc4fc97652 refactor(ui): extract instance bindings and overrides to dedicated Configure page
Move connection bindings, attribute overrides, and area assignment from
inline expandable rows on the Instances table to a separate page at
/deployment/instances/{id}/configure for a cleaner, less cramped UX.
2026-03-22 15:58:32 -04:00
Joseph Doherty
161dc406ed feat(scripts): add typed Parameters.Get<T>() helpers for script API
Replace raw dictionary casting with ScriptParameters wrapper that provides
Get<T>, Get<T?>, Get<T[]>, and Get<List<T>> with clear error messages,
numeric conversion, and JsonElement support for Inbound API parameters.
2026-03-22 15:47:18 -04:00
Joseph Doherty
a0e036fb6b chore(lmxproxy): switch health probe tag to DevPlatform.Scheduler.ScanTime
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 15:21:00 -04:00
Joseph Doherty
ecf4b434c2 refactor(dcl): simplify ValueFormatter now that SDK returns native .NET arrays
The LmxProxy client's ExtractArrayValue now returns proper .NET arrays
(bool[], int[], DateTime[], etc.) instead of ArrayValue objects. Removed
the reflection-based FormatArrayContainer logic — IEnumerable handling
is sufficient for all array types.
2026-03-22 15:15:38 -04:00
Joseph Doherty
af7335f9e2 docs(dcl): update protocol and type mapping docs to reflect v2 TypedValue and SDK integration 2026-03-22 15:11:58 -04:00
Joseph Doherty
ce3942990e feat(lmxproxy): add DatetimeArray proto type for DateTime[] round-trip fidelity
Added DatetimeArray message (repeated int64, UTC ticks) to proto and
code-first contracts. Host serializes DateTime[] → DatetimeArray.
Client deserializes DatetimeArray → DateTime[] (not raw long[]).
Client ExtractArrayValue now unpacks all array types including DateTime.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 15:08:15 -04:00
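The DateTime[] round trip in ce3942990e relies on encoding each timestamp as an int64 tick count. Assuming the usual .NET convention (100 ns ticks counted from 0001-01-01 UTC), the conversion can be sketched as follows; the function names are hypothetical, and Python's microsecond resolution means only tick values that are multiples of 10 round-trip exactly.

```python
from datetime import datetime, timedelta, timezone

# Assumed tick origin: 0001-01-01T00:00:00Z, as in .NET DateTime.Ticks.
_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)
_TICKS_PER_MICROSECOND = 10


def to_ticks(dt):
    """Convert an aware datetime to UTC ticks."""
    micros = (dt.astimezone(timezone.utc) - _EPOCH) // timedelta(microseconds=1)
    return micros * _TICKS_PER_MICROSECOND


def from_ticks(ticks):
    """Convert UTC ticks back to an aware datetime."""
    return _EPOCH + timedelta(microseconds=ticks // _TICKS_PER_MICROSECOND)


def encode_array(values):
    return [to_ticks(v) for v in values]


def decode_array(ticks):
    return [from_ticks(t) for t in ticks]
```

Carrying the values in a dedicated array type, rather than a plain int64 array, is what lets the client know to decode them back into timestamps instead of raw longs.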
Joseph Doherty
b050371dd5 fix(lmxproxy): handle DateTime[] COM arrays in TypedValueConverter
DateTime[] from MxAccess was falling through to ToString() fallback,
producing "System.DateTime[]" instead of actual values. Now converts
each DateTime to UTC ticks and stores in Int64Array.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 14:56:08 -04:00
Joseph Doherty
dcdf79afdc fix(dcl): format ArrayValue objects as comma-separated strings for display
ArrayValue from LmxProxy client was showing as type name in debug views.
Added ValueFormatter utility and NormalizeValue in LmxProxyDataConnection
to convert arrays at the adapter boundary. DateTime arrays remain as
"System.DateTime[]" due to server-side v1 string serialization.
2026-03-22 14:46:15 -04:00
Joseph Doherty
ea9c2857a7 fix(docker,cli): add LmxProxy.Client to Docker build, fix set-bindings JSON parsing
Docker: include lmxproxy/src/ZB.MOM.WW.LmxProxy.Client in build context
so the project reference resolves during container image build.

CLI: fix set-bindings JSON parsing — use JsonElement.GetString()/GetInt32()
instead of object.ToString() which returned null for deserialized elements.
2026-03-22 14:25:09 -04:00
Joseph Doherty
847302e297 test(dcl): add failover state machine tests for DataConnectionActor 2026-03-22 08:47:44 -04:00
Joseph Doherty
5de6c8d052 docs(dcl): document primary/backup endpoint redundancy across requirements and test infra 2026-03-22 08:43:59 -04:00
Joseph Doherty
e8df71ea64 feat(cli): add --primary-config, --backup-config, --failover-retry-count to data connection commands
Thread backup data connection fields through management command messages,
ManagementActor handlers, SiteService, site-side SQLite storage, and
deployment/replication actors. The old --configuration CLI flag is kept
as a hidden alias for backwards compatibility.
2026-03-22 08:41:57 -04:00
Joseph Doherty
ab4e88f17f feat(ui): add primary/backup endpoint fields to data connection form 2026-03-22 08:36:18 -04:00
Joseph Doherty
801c0c1df2 feat(dcl): add active endpoint to health reports and log failover events
Add ActiveEndpoint field to DataConnectionHealthReport showing which
endpoint is active (Primary, Backup, or Primary with no backup configured).
Log failover transitions and connection restoration events to the site
event log via ISiteEventLogger, passed as an optional parameter through
the actor hierarchy for backwards compatibility.
2026-03-22 08:34:05 -04:00
Joseph Doherty
da290fa4f8 feat(dcl): add failover state machine to DataConnectionActor with round-robin endpoint switching 2026-03-22 08:30:03 -04:00
Joseph Doherty
46304678da feat(dcl): extend CreateConnectionCommand with backup config and failover retry count
Update CreateConnectionCommand to carry PrimaryConnectionDetails,
BackupConnectionDetails, and FailoverRetryCount. Update all callers:
DataConnectionManagerActor, DataConnectionActor, DeploymentManagerActor,
FlatteningService, and ConnectionConfig. The actor stores both configs
but continues using primary only — failover logic comes in Task 3.
2026-03-22 08:24:39 -04:00
Joseph Doherty
04af03980e feat(dcl): rename Configuration to PrimaryConfiguration, add BackupConfiguration and FailoverRetryCount 2026-03-22 08:18:31 -04:00
Joseph Doherty
5ca1be328c docs(dcl): add primary/backup data connections implementation plan
8 tasks with TDD steps, exact file paths, and code samples.
Covers entity model, failover state machine, health reporting,
UI, CLI, management API, deployment, and documentation.
2026-03-22 08:13:23 -04:00
Joseph Doherty
6267ff882c docs(dcl): add primary/backup data connection endpoints design
Covers entity model, failover state machine, health reporting,
UI/CLI changes, and deployment flow for optional backup endpoints
with automatic failover after configurable retry count.
2026-03-22 08:09:25 -04:00
Joseph Doherty
5ec7f35150 feat(dcl): replace hand-rolled LmxProxy gRPC client with real LmxProxyClient library
Switches from v1 string-based proto stubs to the production LmxProxyClient
(v2 native TypedValue protocol) via project reference. Deletes 6k+ lines of
generated proto code. Preserves ILmxProxyClient adapter interface for testability.
2026-03-22 07:55:50 -04:00
Joseph Doherty
abb7579227 chore(infra): remove LmxFakeProxy — replaced by real LmxProxy v2 instances on windev
LmxFakeProxy is no longer needed now that two real LmxProxy v2 instances
are available for testing. Added remote test infra section to test_infra.md
documenting the windev instances. Removed tagsim (never committed).
2026-03-22 07:42:13 -04:00
Joseph Doherty
efed8352c3 feat(infra): add second OPC UA server instance (opcua2) on port 50010
Enables multi-server testing with independent state. Both instances
share the same nodes.json tag config. Updated all infra documentation.
2026-03-22 07:31:56 -04:00
Joseph Doherty
ac44122bf7 docs(lmxproxy): add dual-instance configuration (A on 50100, B on 50101)
Both instances share API keys and connect to the same AVEVA platform.
Verified: 17/17 integration tests pass against both instances.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 07:26:06 -04:00
Joseph Doherty
2c99b370a0 chore(lmxproxy): switch health probe tag to DevAppEngine.Scheduler.ScanTime, remove temp prompts
AppEngine built-in tag is always present and constantly updating (~1s),
making it a more reliable probe than a user-deployed TestChildObject tag.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 07:18:39 -04:00
Joseph Doherty
ec21a9a2a0 docs(lmxproxy): mark gap 1 and gap 2 as resolved with test verification
Gap 1: Active health probing verified — 60s recovery after platform restart.
Gap 2: Address-based subscription cleanup — no stale handles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 07:10:38 -04:00
Joseph Doherty
a6c01d73e2 feat(lmxproxy): active health probing + address-based subscription cleanup (gap 1 & 2)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 06:44:21 -04:00
Joseph Doherty
86a15c0a65 docs(lmxproxy): document reconnection gaps from platform restart testing
Tested aaBootstrap kill on windev — three gaps identified:
1. No active health probing (IsConnected stays true on dead connection)
2. Stale SubscriptionManager handles after reconnect cycle
3. AVEVA objects don't auto-start after platform crash (platform behavior)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 06:19:30 -04:00
Joseph Doherty
5a9574fb95 feat(lmxproxy): add MxAccess status detail mapping for richer error messages
- MxStatusMapper: maps all 40+ MxStatusDetail codes, MxStatusCategory,
  and MxStatusSource to human-readable names and client messages
- OnDataChange: checks MXSTATUS_PROXY.success and overrides quality with
  specific OPC UA code when MxAccess reports a failure (e.g., CommFailure,
  ConfigError, WaitingForInitialData)
- OnWriteComplete: uses MxStatusMapper.FormatStatus for structured logging
- Write errors: catches COMException separately with HRESULT in message
- Read errors: distinguishes COM, timeout, and generic failures in logging

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 05:10:50 -04:00
Joseph Doherty
73b2b2f6d7 docs(lmxproxy): add STA message pump gap analysis with implementation guide
Documents when the full STA+Application.Run() approach is needed
(secured/verified writes), why our first attempt failed, the correct
pattern using Form.BeginInvoke(), and tradeoffs vs fire-and-forget.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 05:02:15 -04:00
Joseph Doherty
467fdc34d8 docs(lmxproxy): correct deviation #7 — OnWriteComplete is a COM threading issue, not MxAccess behavior
The MxAccess docs explicitly state OnWriteComplete always fires after Write().
The real cause is no Windows message pump in the headless service process to
marshal the COM callback. Fire-and-forget is safe for supervisory writes but
would miss secured/verified write rejections (errors 1012/1013).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 04:53:54 -04:00
Joseph Doherty
866c73dcd4 docs(lmxproxy): add deviation #8 — SubscriptionManager COM subscription wiring
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 04:47:23 -04:00
Joseph Doherty
7bed4b901a fix(lmxproxy): wire MxAccess COM subscriptions in SubscriptionManager
SubscriptionManager tracked client-to-tag routing but never called
MxAccessClient.SubscribeAsync to create the actual COM subscriptions,
so OnDataChange never fired. Now creates MxAccess subscriptions for
new tags and disposes them when the last client unsubscribes.

All 17 integration tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 04:46:15 -04:00
Joseph Doherty
c5d4849bd3 fix(lmxproxy): resolve write timeout — bypass OnWriteComplete callback for supervisory writes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 04:39:14 -04:00
Joseph Doherty
e2c204b62b docs(lmxproxy): add execution prompt to fix failing write integration tests 2026-03-22 04:38:30 -04:00
Joseph Doherty
7079f6eed4 docs(lmxproxy): add ArchestrA MXAccess Toolkit reference documentation 2026-03-22 04:30:39 -04:00
Joseph Doherty
f4386bc518 docs(lmxproxy): record v2 rebuild deviations and key technical decisions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 04:21:36 -04:00
Joseph Doherty
779598d962 feat(lmxproxy): phase 7 — integration tests, deployment to windev, v1 cutover
- Replaced STA dispatch thread with Task.Run pattern for COM interop
- Fixed TypedValue oneof tracking with property-level _setCase field
- Added x-api-key DelegatingHandler for gRPC metadata authentication
- Fixed CheckApiKey RPC to validate request body key (not header)
- Integration tests: 15/17 pass (reads, subscribes, API keys, connections)
- 2 write tests pending (OnWriteComplete callback timing issue)
- v2 service deployed on windev port 50100

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 01:11:44 -04:00
Joseph Doherty
6d9bf594ec feat(lmxproxy): phase 7 — integration test project and test scenarios
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 00:31:26 -04:00
Joseph Doherty
215cfa29f3 feat(lmxproxy): phase 6 — client extras (builder, factory, DI, streaming extensions)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 00:29:16 -04:00
Joseph Doherty
8ba75b50e8 feat(lmxproxy): phase 5 — client core (ILmxProxyClient, connection, read/write/subscribe)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 00:22:29 -04:00
Joseph Doherty
9eb81180c0 feat(lmxproxy): phase 4 — host health monitoring, metrics, status web server
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 00:14:40 -04:00
Joseph Doherty
16d1b95e9a feat(lmxproxy): phase 3 — host gRPC server, security, configuration, service hosting
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 00:05:36 -04:00
Joseph Doherty
64c92c63e5 feat(lmxproxy): phase 2 — host core (MxAccessClient, SessionManager, SubscriptionManager)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 23:58:17 -04:00
Joseph Doherty
0d63fb1105 feat(lmxproxy): phase 1 — v2 protocol types and domain model
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 23:41:56 -04:00
Joseph Doherty
08d2a07d8b docs(lmxproxy): update test tags to TestChildObject namespace for v2 type coverage
Replace JoeAppEngine tags with TestChildObject tags (TestBool, TestInt, TestFloat,
TestDouble, TestString, TestDateTime, and array variants) in Phase 4 and Phase 7
plans. These tags cover all TypedValue oneof cases for comprehensive v2 testing.
2026-03-21 23:35:15 -04:00
Joseph Doherty
4303f06fc3 docs(lmxproxy): add v2 rebuild design, 7-phase implementation plans, and execution prompt
Design doc covers architecture, v2 protocol (TypedValue/QualityCode), COM threading
model, session lifecycle, subscription semantics, error model, and guardrails.
Implementation plans are detailed enough for autonomous Claude Code execution.
Verified all dev tooling on windev (Grpc.Tools, protobuf-net.Grpc, Polly v8, xUnit).
2026-03-21 23:29:42 -04:00
Joseph Doherty
683aea0fbe docs: add LmxProxy requirements documentation with v2 protocol as authoritative design
Generate high-level requirements and 10 component documents derived from source code
and protocol specs. Uses lmxproxy_updates.md (v2 TypedValue/QualityCode) as the source
of truth, with v1 string-based encoding documented as legacy context.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 22:38:11 -04:00
Joseph Doherty
970d0a5cb3 refactor: simplify data connections from many-to-many site assignment to direct site ownership
Replace SiteDataConnectionAssignment join table with a direct SiteId FK on DataConnection,
simplifying the data model, repositories, UI, CLI, and deployment service.
2026-03-21 21:07:10 -04:00
Joseph Doherty
cd6efeea90 docs: add requirements generation prompt for LmxProxy project 2026-03-21 21:06:59 -04:00
Joseph Doherty
2810306415 feat: add standalone LmxProxy solution, windev VM documentation
Split LmxProxy Host and Client into a self-contained solution under lmxproxy/,
ported from the ScadaBridge monorepo with updated namespaces (ZB.MOM.WW.LmxProxy.*).
Client project (.NET 10) inlines Core/DataEngine dependencies and builds clean.
Host project (.NET Fx 4.8) retains ArchestrA.MXAccess for Windows deployment.
Added windev.md documenting the WW_DEV_VM development environment setup.
2026-03-21 20:50:05 -04:00
Joseph Doherty
512153646a test: add role-based navigation tests verifying correct nav sections per user role 2026-03-21 15:25:34 -04:00
Joseph Doherty
d3194e3634 feat: separate create/edit form pages, Playwright test infrastructure, /auth/token endpoint
Move all CRUD create/edit forms from inline on list pages to dedicated form pages
with back-button navigation and post-save redirect. Add Playwright Docker container
(browser server on port 3000) with 25 passing E2E tests covering login, navigation,
and site CRUD workflows. Add POST /auth/token endpoint for clean JWT retrieval.
2026-03-21 15:17:24 -04:00
Joseph Doherty
b3f8850711 docs: document script hot-reload mechanisms for all script types 2026-03-21 13:42:06 -04:00
Joseph Doherty
eeca930cbd fix: add EF migration for GrpcNodeAAddress/GrpcNodeBAddress columns on Sites table 2026-03-21 12:44:21 -04:00
Joseph Doherty
416a03b782 feat: complete gRPC streaming channel — site host, docker config, docs, integration tests
Switch site host to WebApplicationBuilder with Kestrel HTTP/2 gRPC server,
add GrpcPort/keepalive config, wire SiteStreamManager as ISiteStreamSubscriber,
expose gRPC ports in docker-compose, add site seed script, update all 10
requirement docs + CLAUDE.md + README.md for the new dual-transport architecture.
2026-03-21 12:38:33 -04:00
Joseph Doherty
3fe3c4161b test: add proto contract, cleanup verification, and regression guardrail tests 2026-03-21 12:36:27 -04:00
Joseph Doherty
49f042a937 refactor: remove ClusterClient streaming path (DebugStreamEvent), events flow via gRPC 2026-03-21 12:18:52 -04:00
Joseph Doherty
2cd43b6992 feat: update DebugStreamBridgeActor to use gRPC for streaming events
After receiving the initial snapshot via ClusterClient, the bridge actor
now opens a gRPC server-streaming subscription via SiteStreamGrpcClient
for ongoing AttributeValueChanged/AlarmStateChanged events. Adds NodeA/
NodeB failover with max 3 retries, retry count reset on successful event,
and IWithTimers-based reconnect scheduling.

- DebugStreamBridgeActor: gRPC stream after snapshot, reconnect state machine
- DebugStreamService: inject SiteStreamGrpcClientFactory, resolve gRPC addresses
- ServiceCollectionExtensions: register SiteStreamGrpcClientFactory singleton
- SiteStreamGrpcClient: make SubscribeAsync/Unsubscribe virtual for testability
- SiteStreamGrpcClientFactory: make GetOrCreate virtual for testability
- New test suite: DebugStreamBridgeActorTests (8 tests)
2026-03-21 12:14:24 -04:00
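
The reconnect behaviour described in this commit (NodeA/NodeB failover, at most 3 retries, counter reset on a successful event) can be sketched as a small state machine. This is only an illustration, not the actor's actual code: the node addresses, the strict A/B alternation, and the names `NextNodeAfterError` and `OnEventReceived` are assumptions, and the timer-based scheduling is left out.

```csharp
using System;

const int MaxRetries = 3;
// Hypothetical gRPC addresses for the two site nodes.
string[] nodes = { "https://node-a:8083", "https://node-b:8083" };
var nodeIndex = 0;   // start on NodeA
var retryCount = 0;

// Stream faulted: switch to the other node, or return null once the retry budget is spent.
string? NextNodeAfterError()
{
    if (retryCount >= MaxRetries) return null;
    retryCount++;
    nodeIndex = (nodeIndex + 1) % nodes.Length;
    return nodes[nodeIndex];
}

// Any received event shows the stream is healthy again, so the budget is restored.
void OnEventReceived() => retryCount = 0;

NextNodeAfterError();          // retry 1, now on NodeB
NextNodeAfterError();          // retry 2, back on NodeA
OnEventReceived();             // counter reset to 0
var a = NextNodeAfterError();  // retry 1, NodeB
var b = NextNodeAfterError();  // retry 2, NodeA
var c = NextNodeAfterError();  // retry 3, NodeB
var d = NextNodeAfterError();  // budget exhausted, null

var last = d ?? "gave up";
Console.WriteLine($"{a} {b} {c} {last}");
```

Resetting the counter on a received event rather than on a successful connect means a stream that connects but never delivers data still counts against the retry budget.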
Joseph Doherty
25a6022f7b feat: add SiteStreamGrpcClient and SiteStreamGrpcClientFactory
Per-site gRPC client for central-side streaming subscriptions to site
servers. SiteStreamGrpcClient manages server-streaming calls with
keepalive, converts proto events to domain types, and supports
cancellation via Unsubscribe. SiteStreamGrpcClientFactory caches one
client per site identifier.

Includes InternalsVisibleTo for test access to conversion helpers and
comprehensive unit tests for event mapping, quality/alarm-state
conversion, unsubscribe behavior, and factory caching.
2026-03-21 12:06:38 -04:00
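
A factory that "caches one client per site identifier" is commonly built on `ConcurrentDictionary.GetOrAdd`. The sketch below is an assumption about the shape of such a cache, not the contents of SiteStreamGrpcClientFactory: a plain `object` stands in for the gRPC client, and the `Lazy<T>` wrapper is one way to guarantee a single construction per key under contention.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Lazy<T> ensures the stand-in client is constructed once per site even if
// two threads race inside GetOrAdd for the same key.
var clients = new ConcurrentDictionary<string, Lazy<object>>();
var created = 0;

object GetOrCreate(string siteId) =>
    clients.GetOrAdd(siteId, _ => new Lazy<object>(() =>
    {
        Interlocked.Increment(ref created);
        return new object(); // hypothetical placeholder for a per-site client
    })).Value;

var first = GetOrCreate("site-1");
var second = GetOrCreate("site-1");
var other = GetOrCreate("site-2");

Console.WriteLine(ReferenceEquals(first, second)); // same cached instance for site-1
Console.WriteLine(created);                        // one construction per distinct site
```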
Joseph Doherty
55a05914d0 feat: add SiteStreamGrpcServer with Channel<T> bridge and stream limits
- Define ISiteStreamSubscriber interface for decoupling from SiteRuntime
- Implement SiteStreamGrpcServer (inherits SiteStreamServiceBase) with:
  - Readiness gate (SetReady)
  - Max concurrent stream enforcement
  - Duplicate correlationId replacement (cancels previous stream)
  - StreamRelayActor creation per subscription
  - Bounded Channel<SiteStreamEvent> bridge (1000 capacity, drop-oldest)
  - Clean teardown: unsubscribe, stop actor, remove tracking entry
- Identity-safe cleanup using ConcurrentDictionary.TryRemove(KeyValuePair)
  to prevent replacement streams from being removed by predecessor cleanup
- 7 unit tests covering reject-not-ready, max-streams, duplicate cancel,
  cleanup-on-cancel, subscribe/remove lifecycle, event forwarding
2026-03-21 11:52:31 -04:00
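
The "identity-safe cleanup" item relies on the `ConcurrentDictionary.TryRemove(KeyValuePair<TKey, TValue>)` overload, which removes an entry only when both key and value match. A minimal sketch of why that matters for duplicate correlationId replacement follows; using a `CancellationTokenSource` as the tracked value and the key `"corr-1"` are assumptions made for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

var streams = new ConcurrentDictionary<string, CancellationTokenSource>();

var original = new CancellationTokenSource();
streams["corr-1"] = original;

// Duplicate correlationId: cancel the previous stream and take over its slot.
var replacement = new CancellationTokenSource();
streams.AddOrUpdate("corr-1", replacement, (_, previous) =>
{
    previous.Cancel();
    return replacement;
});

// The cancelled predecessor now runs its cleanup. The pair overload removes the
// entry only if the key still maps to this exact value, so nothing is removed.
var removedByPredecessor = streams.TryRemove(
    new KeyValuePair<string, CancellationTokenSource>("corr-1", original));

// The replacement's own cleanup matches and removes the entry.
var removedByOwner = streams.TryRemove(
    new KeyValuePair<string, CancellationTokenSource>("corr-1", replacement));

Console.WriteLine($"{removedByPredecessor} {removedByOwner}");
```

A key-only `TryRemove` in the predecessor's cleanup would have deleted the replacement's tracking entry, which is the failure mode the commit message describes.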
Joseph Doherty
d70bbbe739 feat: add StreamRelayActor bridging Akka events to gRPC proto channel 2026-03-21 11:48:04 -04:00
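
The channel this relay feeds is described in commit 55a05914d0 as bounded with drop-oldest behaviour. In `System.Threading.Channels` that corresponds to `BoundedChannelFullMode.DropOldest`. The sketch below uses a capacity of 3 and `int` items instead of 1000 `SiteStreamEvent` messages so the effect is visible; both substitutions are assumptions made for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;

// Small capacity and int payloads stand in for the real 1000-item event channel.
var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(3)
{
    FullMode = BoundedChannelFullMode.DropOldest,
    SingleReader = true
});

// Producer side: TryWrite never blocks; when the channel is full the oldest
// queued item is discarded to make room for the new one.
for (var i = 1; i <= 5; i++)
    channel.Writer.TryWrite(i);
channel.Writer.Complete();

// Consumer side: drain whatever survived.
var received = new List<int>();
while (channel.Reader.TryRead(out var item))
    received.Add(item);

Console.WriteLine(string.Join(",", received)); // prints "3,4,5"
```

Drop-oldest suits a live debug view: a slow consumer sees the most recent values rather than stalling the producer or replaying stale ones.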
402 changed files with 54079 additions and 10857 deletions

@@ -7,8 +7,7 @@ This project contains design documentation for a distributed SCADA system built
- `README.md` — Master index with component table and architecture diagrams.
- `docs/requirements/HighLevelReqs.md` — Complete high-level requirements covering all functional areas.
- `docs/requirements/Component-*.md` — Individual component design documents (one per component).
- `docs/requirements/lmxproxy_protocol.md` — LmxProxy gRPC protocol specification.
- `docs/test_infra/test_infra.md` — Master test infrastructure doc (OPC UA, LDAP, MS SQL, SMTP, REST API, LmxFakeProxy, Traefik).
- `docs/test_infra/test_infra.md` — Master test infrastructure doc (OPC UA, LDAP, MS SQL, SMTP, REST API, Traefik).
- `docs/plans/` — Design decision documents from refinement sessions.
- `AkkaDotNet/` — Akka.NET reference documentation and best practices notes.
- `infra/` — Docker Compose and config files for local test services.
@@ -43,7 +42,7 @@ This project contains design documentation for a distributed SCADA system built
2. Deployment Manager — Central-side deployment pipeline, system-wide artifact deployment, instance lifecycle.
3. Site Runtime — Site-side actor hierarchy (Deployment Manager singleton, Instance/Script/Alarm Actors), script compilation, Akka stream.
4. Data Connection Layer — Protocol abstraction (OPC UA, custom), subscription management, clean data pipe.
5. Central-Site Communication — Akka.NET ClusterClient/ClusterClientReceptionist, message patterns, debug streaming.
5. Central-Site Communication — Akka.NET ClusterClient (command/control) + gRPC server-streaming (real-time data), message patterns, debug streaming.
6. Store-and-Forward Engine — Buffering, fixed-interval retry, parking, SQLite persistence, replication.
7. External System Gateway — External system definitions, API method invocation, database connections.
8. Notification Service — Notification lists, email delivery, store-and-forward integration.
@@ -81,7 +80,8 @@ This project contains design documentation for a distributed SCADA system built
- Tag path resolution retried periodically for devices still booting.
- Static attribute writes persisted to local SQLite (survive restart/failover, reset on redeployment).
- All timestamps are UTC throughout the system.
- Inter-cluster communication uses ClusterClient/ClusterClientReceptionist. Both CentralCommunicationActor and SiteCommunicationActor registered with receptionist. Central creates one ClusterClient per site using NodeA/NodeB as contact points. Sites configure multiple central contact points for failover. Addresses cached in CentralCommunicationActor, refreshed periodically (60s) and on admin changes. Heartbeats serve health monitoring only.
- Inter-cluster communication uses two transports: ClusterClient for command/control (deployments, lifecycle, subscribe/unsubscribe handshake, snapshots) and gRPC server-streaming for real-time data (attribute values, alarm states). Both CentralCommunicationActor and SiteCommunicationActor registered with receptionist. Central creates one ClusterClient per site using NodeA/NodeB as contact points. Sites configure multiple central contact points for failover. Addresses cached in CentralCommunicationActor, refreshed periodically (60s) and on admin changes. Heartbeats serve health monitoring only.
- gRPC streaming channel: SiteStreamGrpcServer on each site node (Kestrel HTTP/2, port 8083); central creates per-site SiteStreamGrpcClient via SiteStreamGrpcClientFactory. Site entity has GrpcNodeAAddress/GrpcNodeBAddress fields. Proto: sitestream.proto with SiteStreamService, SiteStreamEvent (oneof: AttributeValueUpdate, AlarmStateUpdate). DebugStreamEvent message removed (no longer flows through ClusterClient).
### External Integrations
- External System Gateway: HTTP/REST only, JSON serialization, API key + Basic Auth.
@@ -126,7 +126,7 @@ This project contains design documentation for a distributed SCADA system built
### UI & Monitoring
- Central UI: Blazor Server (ASP.NET Core + SignalR) with Bootstrap CSS. No third-party component frameworks (no Blazorise, MudBlazor, Radzen, etc.). Build custom Blazor components for tables, grids, forms, etc.
- UI design: Clean, corporate, internal-use aesthetic. Not flashy. Use the `frontend-design` skill when designing UI pages/components.
- Debug view: real-time streaming via DebugStreamBridgeActor. Health dashboard: 10s polling timer. Deployment status: real-time push via SignalR.
- Debug view: real-time streaming via DebugStreamBridgeActor + gRPC (events via SiteStreamGrpcClient, snapshot via ClusterClient). Health dashboard: 10s polling timer. Deployment status: real-time push via SignalR.
- Health reports: 30s interval, 60s offline threshold, monotonic sequence numbers, raw error counts per interval.
- Dead letter monitoring as a health metric.
- Site Event Logging: 30-day retention, 1GB storage cap, daily purge, paginated queries with keyword search.
@@ -159,5 +159,5 @@ This project contains design documentation for a distributed SCADA system built
- **Test user**: `--username multi-role --password password` — has Admin, Design, and Deployment roles. The `admin` user only has the Admin role and cannot create templates, data connections, or deploy.
- **Config file**: `~/.scadalink/config.json` — stores `managementUrl` and default format. See `docker/README.md` for a ready-to-use test config.
- **Rebuild cluster**: `bash docker/deploy.sh` — builds the `scadalink:latest` image and recreates all containers. Run this after code changes to ManagementActor, Host, or any server-side component.
- **Infrastructure services**: `cd infra && docker compose up -d` — starts LDAP, MS SQL, OPC UA, SMTP, REST API, and LmxFakeProxy. These are separate from the cluster containers in `docker/`.
- **Infrastructure services**: `cd infra && docker compose up -d` — starts LDAP, MS SQL, OPC UA, SMTP, and REST API. These are separate from the cluster containers in `docker/`.
- **All test LDAP passwords**: `password` (see `infra/glauth/config.toml` for users and groups).

@@ -38,7 +38,7 @@ This document serves as the master index for the SCADA system design. The system
| 2 | Deployment Manager | [docs/requirements/Component-DeploymentManager.md](docs/requirements/Component-DeploymentManager.md) | Central-side deployment pipeline with deployment ID/idempotency, per-instance operation lock, state transition matrix, all-or-nothing site apply, system-wide artifact deployment with per-site status. |
| 3 | Site Runtime | [docs/requirements/Component-SiteRuntime.md](docs/requirements/Component-SiteRuntime.md) | Site-side actor hierarchy with explicit supervision strategies, staggered startup, script trust model (constrained APIs), Tell/Ask conventions, concurrency serialization, and site-wide Akka stream with per-subscriber backpressure. |
| 4 | Data Connection Layer | [docs/requirements/Component-DataConnectionLayer.md](docs/requirements/Component-DataConnectionLayer.md) | Common data connection interface (OPC UA, custom), Become/Stash connection actor model, auto-reconnect, immediate bad quality on disconnect, transparent re-subscribe, synchronous write failures, tag path resolution retry. |
| 5 | Central-Site Communication | [docs/requirements/Component-Communication.md](docs/requirements/Component-Communication.md) | Akka.NET remoting/cluster topology, 8 message patterns with per-pattern timeouts, application-level correlation IDs, transport heartbeat config, message ordering, connection failure behavior. |
| 5 | Central-Site Communication | [docs/requirements/Component-Communication.md](docs/requirements/Component-Communication.md) | Dual transport: Akka.NET ClusterClient (command/control) + gRPC server-streaming (real-time data). 8 message patterns with per-pattern timeouts, SiteStreamGrpcServer/Client, application-level correlation IDs, transport heartbeat config, gRPC keepalive, message ordering, connection failure behavior. |
| 6 | Store-and-Forward Engine | [docs/requirements/Component-StoreAndForward.md](docs/requirements/Component-StoreAndForward.md) | Buffering (transient failures only), fixed-interval retry, parking, async best-effort replication, SQLite persistence at sites. |
| 7 | External System Gateway | [docs/requirements/Component-ExternalSystemGateway.md](docs/requirements/Component-ExternalSystemGateway.md) | HTTP/REST + JSON, API key/Basic Auth, per-system timeout, dual call modes (Call/CachedCall), transient/permanent error classification, dedicated blocking I/O dispatcher, ADO.NET connection pooling. |
| 8 | Notification Service | [docs/requirements/Component-NotificationService.md](docs/requirements/Component-NotificationService.md) | SMTP with OAuth2 (M365) or Basic Auth, BCC delivery, plain text, transient/permanent SMTP error classification, store-and-forward integration. |
@@ -90,6 +90,8 @@ This document serves as the master index for the SCADA system design. The system
│ └──────────┘ │
│ ┌───────────────────────────────────┐ │
│ │ Akka.NET Communication Layer │ │
│ │ ClusterClient: command/control │ │
│ │ gRPC Client: real-time streams │ │
│ │ (correlation IDs, per-pattern │ │
│ │ timeouts, message ordering) │ │
│ └──────────────┬────────────────────┘ │
@@ -98,7 +100,8 @@ This document serves as the master index for the SCADA system design. The system
│ └───────────────────────────────────┘ (Config DB)│
│ │ Machine Data DB│
└─────────────────┼───────────────────────────────────┘
│ Akka.NET Remoting
│ Akka.NET Remoting (command/control)
│ gRPC HTTP/2 (real-time data, port 8083)
┌────────────┼────────────┐
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
@@ -112,6 +115,9 @@ This document serves as the master index for the SCADA system design. The system
│ │Site │ │ │ │Site │ │ │ │Site │ │
│ │Runtm│ │ │ │Runtm│ │ │ │Runtm│ │
│ ├─────┤ │ │ ├─────┤ │ │ ├─────┤ │
│ │gRPC │ │ │ │gRPC │ │ │ │gRPC │ │
│ │Srvr │ │ │ │Srvr │ │ │ │Srvr │ │
│ ├─────┤ │ │ ├─────┤ │ │ ├─────┤ │
│ │S&F │ │ │ │S&F │ │ │ │S&F │ │
│ │Engine│ │ │ │Engine│ │ │ │Engine│ │
│ ├─────┤ │ │ ├─────┤ │ │ ├─────┤ │

@@ -41,5 +41,6 @@
<Project Path="tests/ScadaLink.ManagementService.Tests/ScadaLink.ManagementService.Tests.csproj" />
<Project Path="tests/ScadaLink.IntegrationTests/ScadaLink.IntegrationTests.csproj" />
<Project Path="tests/ScadaLink.PerformanceTests/ScadaLink.PerformanceTests.csproj" />
<Project Path="tests/ScadaLink.CentralUI.PlaywrightTests/ScadaLink.CentralUI.PlaywrightTests.csproj" />
</Folder>
</Solution>

@@ -0,0 +1,43 @@
using ZB.MOM.WW.LmxProxy.Client.Domain;
namespace ScadaLink.DataConnectionLayer.Adapters;
/// <summary>
/// Subscription handle returned by <see cref="ILmxProxyClient.SubscribeAsync"/>.
/// Disposing the subscription stops receiving updates.
/// </summary>
public interface ILmxSubscription : IAsyncDisposable { }
/// <summary>
/// Abstraction over the LmxProxy SDK client for testability.
/// The production implementation delegates to the real
/// <see cref="ZB.MOM.WW.LmxProxy.Client.LmxProxyClient"/> library.
/// </summary>
public interface ILmxProxyClient : IAsyncDisposable
{
bool IsConnected { get; }
Task ConnectAsync(CancellationToken cancellationToken = default);
Task DisconnectAsync();
Task<Vtq> ReadAsync(string address, CancellationToken cancellationToken = default);
Task<IDictionary<string, Vtq>> ReadBatchAsync(IEnumerable<string> addresses, CancellationToken cancellationToken = default);
Task WriteAsync(string address, TypedValue value, CancellationToken cancellationToken = default);
Task WriteBatchAsync(IDictionary<string, TypedValue> values, CancellationToken cancellationToken = default);
Task<ILmxSubscription> SubscribeAsync(
IEnumerable<string> addresses,
Action<string, Vtq> onUpdate,
Action<Exception>? onStreamError = null,
CancellationToken cancellationToken = default);
}
/// <summary>
/// Factory for creating <see cref="ILmxProxyClient"/> instances configured
/// with host, port, and optional API key.
/// </summary>
public interface ILmxProxyClientFactory
{
ILmxProxyClient Create(string host, int port, string? apiKey, bool useTls = false);
}

@@ -1,17 +1,23 @@
using Microsoft.Extensions.Logging;
using ScadaLink.Commons.Interfaces.Protocol;
using ScadaLink.Commons.Types.Enums;
using ZB.MOM.WW.LmxProxy.Client.Domain;
using ScadaLink.Commons.Types;
using QualityCode = ScadaLink.Commons.Interfaces.Protocol.QualityCode;
using WriteResult = ScadaLink.Commons.Interfaces.Protocol.WriteResult;
namespace ScadaLink.DataConnectionLayer.Adapters;
/// <summary>
/// LmxProxy adapter implementing IDataConnection.
/// Maps IDataConnection operations to the LmxProxy SDK client.
/// Maps IDataConnection operations to the real LmxProxy SDK client
/// via the <see cref="ILmxProxyClient"/> abstraction.
///
/// LmxProxy-specific behavior:
/// - Session-based connection with automatic 30s keep-alive (managed by SDK)
/// - gRPC streaming for subscriptions via ILmxSubscription handles
/// - API key authentication via x-api-key gRPC metadata header
/// - Native TypedValue writes (v2 protocol)
/// </summary>
public class LmxProxyDataConnection : IDataConnection
{
@@ -24,6 +30,8 @@ public class LmxProxyDataConnection : IDataConnection
private readonly Dictionary<string, ILmxSubscription> _subscriptions = new();
private volatile bool _disconnectFired;
private StaleTagMonitor? _staleMonitor;
private string? _heartbeatSubscriptionId;
public LmxProxyDataConnection(ILmxProxyClientFactory clientFactory, ILogger<LmxProxyDataConnection> logger)
{
@@ -41,21 +49,58 @@ public class LmxProxyDataConnection : IDataConnection
_port = port;
connectionDetails.TryGetValue("ApiKey", out var apiKey);
var samplingIntervalMs = connectionDetails.TryGetValue("SamplingIntervalMs", out var sampStr) && int.TryParse(sampStr, out var samp) ? samp : 0;
var useTls = connectionDetails.TryGetValue("UseTls", out var tlsStr) && bool.TryParse(tlsStr, out var tls) && tls;
_status = ConnectionHealth.Connecting;
_client = _clientFactory.Create(_host, _port, apiKey, samplingIntervalMs, useTls);
_client = _clientFactory.Create(_host, _port, apiKey, useTls);
await _client.ConnectAsync(cancellationToken);
_status = ConnectionHealth.Connected;
_disconnectFired = false;
_logger.LogInformation("LmxProxy connected to {Host}:{Port}", _host, _port);
// Heartbeat stale tag monitoring (optional)
await StartHeartbeatMonitorAsync(connectionDetails, cancellationToken);
}
private async Task StartHeartbeatMonitorAsync(IDictionary<string, string> connectionDetails, CancellationToken cancellationToken)
{
if (!connectionDetails.TryGetValue("HeartbeatTagPath", out var heartbeatTag) || string.IsNullOrWhiteSpace(heartbeatTag))
return;
var maxSilenceSeconds = connectionDetails.TryGetValue("HeartbeatMaxSilence", out var silenceStr)
&& int.TryParse(silenceStr, out var sec) ? sec : 30;
_staleMonitor?.Dispose();
_staleMonitor = new StaleTagMonitor(TimeSpan.FromSeconds(maxSilenceSeconds));
_staleMonitor.Stale += () =>
{
_logger.LogWarning("LmxProxy heartbeat tag '{Tag}' stale — no update in {Seconds}s", heartbeatTag, maxSilenceSeconds);
RaiseDisconnected();
};
try
{
_heartbeatSubscriptionId = await SubscribeAsync(heartbeatTag, (tag, value) =>
{
_logger.LogDebug("LmxProxy heartbeat received: {Tag} = {Value} (quality={Quality})", tag, value.Value, value.Quality);
_staleMonitor.OnValueReceived();
}, cancellationToken);
_staleMonitor.Start();
_logger.LogInformation("LmxProxy heartbeat monitor started for '{Tag}' with {Seconds}s max silence", heartbeatTag, maxSilenceSeconds);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to subscribe to heartbeat tag '{Tag}' — stale monitor not active", heartbeatTag);
_staleMonitor.Dispose();
_staleMonitor = null;
}
}
public async Task DisconnectAsync(CancellationToken cancellationToken = default)
{
StopHeartbeatMonitor();
if (_client != null)
{
await _client.DisconnectAsync();
@@ -72,9 +117,9 @@ public class LmxProxyDataConnection : IDataConnection
{
var vtq = await _client!.ReadAsync(tagPath, cancellationToken);
var quality = MapQuality(vtq.Quality);
var tagValue = new TagValue(vtq.Value, quality, new DateTimeOffset(vtq.TimestampUtc, TimeSpan.Zero));
var tagValue = new TagValue(NormalizeValue(vtq.Value), quality, new DateTimeOffset(vtq.Timestamp, TimeSpan.Zero));
return vtq.Quality == LmxQuality.Bad
return vtq.Quality.IsBad()
? new ReadResult(false, tagValue, "LmxProxy read returned bad quality")
: new ReadResult(true, tagValue, null);
}
@@ -96,8 +141,8 @@ public class LmxProxyDataConnection : IDataConnection
foreach (var (tag, vtq) in vtqs)
{
var quality = MapQuality(vtq.Quality);
var tagValue = new TagValue(vtq.Value, quality, new DateTimeOffset(vtq.TimestampUtc, TimeSpan.Zero));
results[tag] = vtq.Quality == LmxQuality.Bad
var tagValue = new TagValue(NormalizeValue(vtq.Value), quality, new DateTimeOffset(vtq.Timestamp, TimeSpan.Zero));
results[tag] = vtq.Quality.IsBad()
? new ReadResult(false, tagValue, "LmxProxy read returned bad quality")
: new ReadResult(true, tagValue, null);
}
@@ -111,7 +156,7 @@ public class LmxProxyDataConnection : IDataConnection
try
{
await _client!.WriteAsync(tagPath, value!, cancellationToken);
await _client!.WriteAsync(tagPath, ToTypedValue(value), cancellationToken);
return new WriteResult(true, null);
}
catch (Exception ex)
@@ -126,9 +171,8 @@ public class LmxProxyDataConnection : IDataConnection
try
{
var nonNullValues = values.Where(kv => kv.Value != null)
.ToDictionary(kv => kv.Key, kv => kv.Value!);
await _client!.WriteBatchAsync(nonNullValues, cancellationToken);
var typedValues = values.ToDictionary(kv => kv.Key, kv => ToTypedValue(kv.Value));
await _client!.WriteBatchAsync(typedValues, cancellationToken);
return values.Keys.ToDictionary(k => k, _ => new WriteResult(true, null))
as IReadOnlyDictionary<string, WriteResult>;
@@ -174,11 +218,11 @@ public class LmxProxyDataConnection : IDataConnection
(path, vtq) =>
{
var quality = MapQuality(vtq.Quality);
callback(path, new TagValue(vtq.Value, quality, new DateTimeOffset(vtq.TimestampUtc, TimeSpan.Zero)));
callback(path, new TagValue(NormalizeValue(vtq.Value), quality, new DateTimeOffset(vtq.Timestamp, TimeSpan.Zero)));
},
onStreamError: () =>
onStreamError: ex =>
{
_logger.LogWarning("LmxProxy subscription stream ended unexpectedly for {TagPath}", tagPath);
_logger.LogWarning(ex, "LmxProxy subscription stream ended unexpectedly for {TagPath}", tagPath);
RaiseDisconnected();
},
cancellationToken);
@@ -196,8 +240,16 @@ public class LmxProxyDataConnection : IDataConnection
}
}
private void StopHeartbeatMonitor()
{
_staleMonitor?.Dispose();
_staleMonitor = null;
_heartbeatSubscriptionId = null;
}
public async ValueTask DisposeAsync()
{
StopHeartbeatMonitor();
foreach (var subscription in _subscriptions.Values)
{
try { await subscription.DisposeAsync(); }
@@ -219,10 +271,6 @@ public class LmxProxyDataConnection : IDataConnection
throw new InvalidOperationException("LmxProxy client is not connected.");
}
/// <summary>
/// Marks the connection as disconnected and fires the Disconnected event once.
/// Thread-safe: only the first caller triggers the event.
/// </summary>
private void RaiseDisconnected()
{
if (_disconnectFired) return;
@@ -232,11 +280,35 @@ public class LmxProxyDataConnection : IDataConnection
Disconnected?.Invoke();
}
private static QualityCode MapQuality(LmxQuality quality) => quality switch
/// <summary>
/// Normalizes a Vtq value for consumption by the rest of the system.
/// Converts .NET arrays (bool[], int[], DateTime[], etc.) to comma-separated
/// display strings so downstream code sees simple string representations.
/// </summary>
private static object? NormalizeValue(object? value) => value switch
{
LmxQuality.Good => QualityCode.Good,
LmxQuality.Uncertain => QualityCode.Uncertain,
LmxQuality.Bad => QualityCode.Bad,
_ => QualityCode.Bad
null or string => value,
IFormattable => value,
_ => ValueFormatter.FormatDisplayValue(value)
};
private static QualityCode MapQuality(Quality quality)
{
if (quality.IsGood()) return QualityCode.Good;
if (quality.IsUncertain()) return QualityCode.Uncertain;
return QualityCode.Bad;
}
private static TypedValue ToTypedValue(object? value) => value switch
{
bool b => new TypedValue { BoolValue = b },
int i => new TypedValue { Int32Value = i },
long l => new TypedValue { Int64Value = l },
float f => new TypedValue { FloatValue = f },
double d => new TypedValue { DoubleValue = d },
string s => new TypedValue { StringValue = s },
DateTime dt => new TypedValue { DatetimeValue = dt.ToUniversalTime().Ticks },
null => new TypedValue { StringValue = string.Empty },
_ => new TypedValue { StringValue = value.ToString() ?? string.Empty }
};
}
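
`RaiseDisconnected` above can be reached from both the heartbeat stale callback and the stream-error callback, and the hunk shows only the `_disconnectFired` check. If the flag is set with a plain assignment, two racing callers could both pass the check. One way to make the fire-once guarantee atomic is `Interlocked.Exchange` on an `int` flag; the sketch below illustrates that technique under this assumption and is not the adapter's actual code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var disconnectFired = 0; // 0 = not fired, 1 = fired
var eventCount = 0;      // stands in for Disconnected?.Invoke()

void RaiseDisconnected()
{
    // Exchange returns the previous value, so exactly one caller observes 0.
    if (Interlocked.Exchange(ref disconnectFired, 1) == 1) return;
    Interlocked.Increment(ref eventCount);
}

// Simulate many callbacks racing to report the same disconnect.
Parallel.For(0, 100, _ => RaiseDisconnected());

Console.WriteLine(eventCount); // prints 1
```

Resetting the flag on a successful reconnect, as `ConnectAsync` does with `_disconnectFired = false`, would become `Interlocked.Exchange(ref disconnectFired, 0)` in this variant.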

@@ -4,6 +4,8 @@ using NSubstitute.ExceptionExtensions;
using ScadaLink.Commons.Interfaces.Protocol;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.DataConnectionLayer.Adapters;
using ZB.MOM.WW.LmxProxy.Client.Domain;
using QualityCode = ScadaLink.Commons.Interfaces.Protocol.QualityCode;
namespace ScadaLink.DataConnectionLayer.Tests;
@@ -17,7 +19,7 @@ public class LmxProxyDataConnectionTests
{
_mockClient = Substitute.For<ILmxProxyClient>();
_mockFactory = Substitute.For<ILmxProxyClientFactory>();
_mockFactory.Create(Arg.Any<string>(), Arg.Any<int>(), Arg.Any<string?>(), Arg.Any<int>(), Arg.Any<bool>()).Returns(_mockClient);
_mockFactory.Create(Arg.Any<string>(), Arg.Any<int>(), Arg.Any<string?>(), Arg.Any<bool>()).Returns(_mockClient);
_adapter = new LmxProxyDataConnection(_mockFactory, NullLogger<LmxProxyDataConnection>.Instance);
}
@@ -41,7 +43,7 @@ public class LmxProxyDataConnectionTests
});
Assert.Equal(ConnectionHealth.Connected, _adapter.Status);
_mockFactory.Received(1).Create("myhost", 5001, null, 0, false);
_mockFactory.Received(1).Create("myhost", 5001, null, false);
await _mockClient.Received(1).ConnectAsync(Arg.Any<CancellationToken>());
}
@@ -57,7 +59,7 @@ public class LmxProxyDataConnectionTests
["ApiKey"] = "my-secret-key"
});
_mockFactory.Received(1).Create("server", 50051, "my-secret-key", 0, false);
_mockFactory.Received(1).Create("server", 50051, "my-secret-key", false);
}
[Fact]
@@ -67,7 +69,7 @@ public class LmxProxyDataConnectionTests
await _adapter.ConnectAsync(new Dictionary<string, string>());
_mockFactory.Received(1).Create("localhost", 50051, null, 0, false);
_mockFactory.Received(1).Create("localhost", 50051, null, false);
}
[Fact]
@@ -88,7 +90,7 @@ public class LmxProxyDataConnectionTests
await ConnectAdapter();
var now = DateTime.UtcNow;
_mockClient.ReadAsync("Tag1", Arg.Any<CancellationToken>())
.Returns(new LmxVtq(42.5, now, LmxQuality.Good));
.Returns(new Vtq(42.5, now, Quality.Good));
var result = await _adapter.ReadAsync("Tag1");
@@ -102,7 +104,7 @@ public class LmxProxyDataConnectionTests
{
await ConnectAdapter();
_mockClient.ReadAsync("Tag1", Arg.Any<CancellationToken>())
.Returns(new LmxVtq(null, DateTime.UtcNow, LmxQuality.Bad));
.Returns(new Vtq(null, DateTime.UtcNow, Quality.Bad));
var result = await _adapter.ReadAsync("Tag1");
@@ -116,7 +118,7 @@ public class LmxProxyDataConnectionTests
{
await ConnectAdapter();
_mockClient.ReadAsync("Tag1", Arg.Any<CancellationToken>())
.Returns(new LmxVtq("maybe", DateTime.UtcNow, LmxQuality.Uncertain));
.Returns(new Vtq("maybe", DateTime.UtcNow, Quality.Uncertain));
var result = await _adapter.ReadAsync("Tag1");
@@ -130,10 +132,10 @@ public class LmxProxyDataConnectionTests
await ConnectAdapter();
var now = DateTime.UtcNow;
_mockClient.ReadBatchAsync(Arg.Any<IEnumerable<string>>(), Arg.Any<CancellationToken>())
.Returns(new Dictionary<string, LmxVtq>
.Returns(new Dictionary<string, Vtq>
{
["Tag1"] = new(10, now, LmxQuality.Good),
["Tag2"] = new(null, now, LmxQuality.Bad)
["Tag1"] = new(10, now, Quality.Good),
["Tag2"] = new(null, now, Quality.Bad)
});
var results = await _adapter.ReadBatchAsync(["Tag1", "Tag2"]);
@@ -153,14 +155,14 @@ public class LmxProxyDataConnectionTests
var result = await _adapter.WriteAsync("Tag1", 42);
Assert.True(result.Success);
await _mockClient.Received(1).WriteAsync("Tag1", 42, Arg.Any<CancellationToken>());
await _mockClient.Received(1).WriteAsync("Tag1", Arg.Any<TypedValue>(), Arg.Any<CancellationToken>());
}
[Fact]
public async Task Write_Failure_ReturnsError()
{
await ConnectAdapter();
_mockClient.WriteAsync("Tag1", 42, Arg.Any<CancellationToken>())
_mockClient.WriteAsync("Tag1", Arg.Any<TypedValue>(), Arg.Any<CancellationToken>())
.Throws(new InvalidOperationException("Write failed for tag"));
var result = await _adapter.WriteAsync("Tag1", 42);
@@ -184,7 +186,7 @@ public class LmxProxyDataConnectionTests
public async Task WriteBatch_Failure_ReturnsAllErrors()
{
await ConnectAdapter();
_mockClient.WriteBatchAsync(Arg.Any<IDictionary<string, object>>(), Arg.Any<CancellationToken>())
_mockClient.WriteBatchAsync(Arg.Any<IDictionary<string, TypedValue>>(), Arg.Any<CancellationToken>())
.Throws(new InvalidOperationException("Batch write failed"));
var results = await _adapter.WriteBatchAsync(new Dictionary<string, object?> { ["T1"] = 1, ["T2"] = 2 });
@@ -201,7 +203,7 @@ public class LmxProxyDataConnectionTests
{
await ConnectAdapter();
var mockSub = Substitute.For<ILmxSubscription>();
_mockClient.SubscribeAsync(Arg.Any<IEnumerable<string>>(), Arg.Any<Action<string, LmxVtq>>(), Arg.Any<Action?>(), Arg.Any<CancellationToken>())
_mockClient.SubscribeAsync(Arg.Any<IEnumerable<string>>(), Arg.Any<Action<string, Vtq>>(), Arg.Any<Action<Exception>?>(), Arg.Any<CancellationToken>())
.Returns(mockSub);
var subId = await _adapter.SubscribeAsync("Tag1", (_, _) => { });
@@ -209,7 +211,7 @@ public class LmxProxyDataConnectionTests
Assert.NotNull(subId);
Assert.NotEmpty(subId);
await _mockClient.Received(1).SubscribeAsync(
Arg.Any<IEnumerable<string>>(), Arg.Any<Action<string, LmxVtq>>(), Arg.Any<Action?>(), Arg.Any<CancellationToken>());
Arg.Any<IEnumerable<string>>(), Arg.Any<Action<string, Vtq>>(), Arg.Any<Action<Exception>?>(), Arg.Any<CancellationToken>());
}
[Fact]
@@ -217,7 +219,7 @@ public class LmxProxyDataConnectionTests
{
await ConnectAdapter();
var mockSub = Substitute.For<ILmxSubscription>();
_mockClient.SubscribeAsync(Arg.Any<IEnumerable<string>>(), Arg.Any<Action<string, LmxVtq>>(), Arg.Any<Action?>(), Arg.Any<CancellationToken>())
_mockClient.SubscribeAsync(Arg.Any<IEnumerable<string>>(), Arg.Any<Action<string, Vtq>>(), Arg.Any<Action<Exception>?>(), Arg.Any<CancellationToken>())
.Returns(mockSub);
var subId = await _adapter.SubscribeAsync("Tag1", (_, _) => { });
@@ -240,7 +242,7 @@ public class LmxProxyDataConnectionTests
{
await ConnectAdapter();
var mockSub = Substitute.For<ILmxSubscription>();
_mockClient.SubscribeAsync(Arg.Any<IEnumerable<string>>(), Arg.Any<Action<string, LmxVtq>>(), Arg.Any<Action?>(), Arg.Any<CancellationToken>())
_mockClient.SubscribeAsync(Arg.Any<IEnumerable<string>>(), Arg.Any<Action<string, Vtq>>(), Arg.Any<Action<Exception>?>(), Arg.Any<CancellationToken>())
.Returns(mockSub);
await _adapter.SubscribeAsync("Tag1", (_, _) => { });
@@ -280,21 +282,6 @@ public class LmxProxyDataConnectionTests
// --- Configuration Parsing ---
[Fact]
public async Task Connect_ParsesSamplingInterval()
{
_mockClient.IsConnected.Returns(true);
await _adapter.ConnectAsync(new Dictionary<string, string>
{
["Host"] = "server",
["Port"] = "50051",
["SamplingIntervalMs"] = "500"
});
_mockFactory.Received(1).Create("server", 50051, null, 500, false);
}
[Fact]
public async Task Connect_ParsesUseTls()
{
@@ -307,16 +294,16 @@ public class LmxProxyDataConnectionTests
["UseTls"] = "true"
});
_mockFactory.Received(1).Create("server", 50051, null, 0, true);
_mockFactory.Received(1).Create("server", 50051, null, true);
}
[Fact]
public async Task Connect_DefaultsSamplingAndTls()
public async Task Connect_DefaultsHostPortAndTls()
{
_mockClient.IsConnected.Returns(true);
await _adapter.ConnectAsync(new Dictionary<string, string>());
_mockFactory.Received(1).Create("localhost", 50051, null, 0, false);
_mockFactory.Received(1).Create("localhost", 50051, null, false);
}
}

@@ -0,0 +1,92 @@
using Microsoft.Extensions.Logging;
using ZB.MOM.WW.LmxProxy.Client;
using ZB.MOM.WW.LmxProxy.Client.Domain;
namespace ScadaLink.DataConnectionLayer.Adapters;
/// <summary>
/// Production ILmxProxyClient that delegates to the real
/// <see cref="ZB.MOM.WW.LmxProxy.Client.LmxProxyClient"/> library.
/// </summary>
internal class RealLmxProxyClient : ILmxProxyClient
{
private readonly ZB.MOM.WW.LmxProxy.Client.LmxProxyClient _inner;
public RealLmxProxyClient(ZB.MOM.WW.LmxProxy.Client.LmxProxyClient inner)
{
_inner = inner;
}
public bool IsConnected => _inner.IsConnected;
public Task ConnectAsync(CancellationToken cancellationToken = default)
=> _inner.ConnectAsync(cancellationToken);
public Task DisconnectAsync()
=> _inner.DisconnectAsync();
public Task<Vtq> ReadAsync(string address, CancellationToken cancellationToken = default)
=> _inner.ReadAsync(address, cancellationToken);
public Task<IDictionary<string, Vtq>> ReadBatchAsync(IEnumerable<string> addresses, CancellationToken cancellationToken = default)
=> _inner.ReadBatchAsync(addresses, cancellationToken);
public Task WriteAsync(string address, TypedValue value, CancellationToken cancellationToken = default)
=> _inner.WriteAsync(address, value, cancellationToken);
public Task WriteBatchAsync(IDictionary<string, TypedValue> values, CancellationToken cancellationToken = default)
=> _inner.WriteBatchAsync(values, cancellationToken);
public async Task<ILmxSubscription> SubscribeAsync(
IEnumerable<string> addresses,
Action<string, Vtq> onUpdate,
Action<Exception>? onStreamError = null,
CancellationToken cancellationToken = default)
{
var innerSub = await _inner.SubscribeAsync(addresses, onUpdate, onStreamError, cancellationToken);
return new SubscriptionWrapper(innerSub);
}
public async ValueTask DisposeAsync()
{
await _inner.DisposeAsync();
}
private sealed class SubscriptionWrapper(ZB.MOM.WW.LmxProxy.Client.LmxProxyClient.ISubscription inner) : ILmxSubscription
{
public async ValueTask DisposeAsync()
{
await inner.DisposeAsync();
}
}
}
/// <summary>
/// Production factory that creates LmxProxy clients using the real library's builder.
/// </summary>
public class RealLmxProxyClientFactory : ILmxProxyClientFactory
{
private readonly ILoggerFactory _loggerFactory;
public RealLmxProxyClientFactory(ILoggerFactory loggerFactory)
{
_loggerFactory = loggerFactory;
}
public ILmxProxyClient Create(string host, int port, string? apiKey, bool useTls = false)
{
var builder = new LmxProxyClientBuilder()
.WithHost(host)
.WithPort(port)
.WithLogger(_loggerFactory.CreateLogger<ZB.MOM.WW.LmxProxy.Client.LmxProxyClient>());
if (!string.IsNullOrEmpty(apiKey))
builder.WithApiKey(apiKey);
if (useTls)
builder.WithSslCredentials(null);
var client = builder.Build();
return new RealLmxProxyClient(client);
}
}


@@ -0,0 +1,388 @@
# LmxProxy Protocol Specification
> **Note:** This specification reflects the v2 protocol with native `TypedValue` support. The original v1 string-based protocol (string values, string quality) has been replaced.
The LmxProxy protocol is a gRPC-based SCADA read/write interface for bridging ScadaLink's Data Connection Layer to devices via an intermediary proxy server (LmxProxy). The proxy translates LmxProxy protocol operations into backend device calls (e.g., OPC UA). All communication uses HTTP/2 gRPC with Protocol Buffers.
## Service Definition
```protobuf
syntax = "proto3";
package scada;
service ScadaService {
rpc Connect(ConnectRequest) returns (ConnectResponse);
rpc Disconnect(DisconnectRequest) returns (DisconnectResponse);
rpc GetConnectionState(GetConnectionStateRequest) returns (GetConnectionStateResponse);
rpc Read(ReadRequest) returns (ReadResponse);
rpc ReadBatch(ReadBatchRequest) returns (ReadBatchResponse);
rpc Write(WriteRequest) returns (WriteResponse);
rpc WriteBatch(WriteBatchRequest) returns (WriteBatchResponse);
rpc WriteBatchAndWait(WriteBatchAndWaitRequest) returns (WriteBatchAndWaitResponse);
rpc Subscribe(SubscribeRequest) returns (stream VtqMessage);
rpc CheckApiKey(CheckApiKeyRequest) returns (CheckApiKeyResponse);
}
```
Proto file location: `src/ScadaLink.DataConnectionLayer/Adapters/Protos/scada.proto`
## Connection Lifecycle
### Session Model
Every client must call `Connect` before performing any read, write, or subscribe operation. The server returns a session ID (32-character hex GUID) that must be included in all subsequent requests. Sessions persist until `Disconnect` is called or the server restarts — there is no idle timeout.
### Authentication
API key authentication is optional, controlled by server configuration:
- **If required**: The `Connect` RPC fails with `success=false` if the API key doesn't match.
- **If not required**: All API keys are accepted (including empty).
- The API key is sent both in the `ConnectRequest.api_key` field and as an `x-api-key` gRPC metadata header on the `Connect` call.
### Connect
```
ConnectRequest {
client_id: string // Client identifier (e.g., "ScadaLink-{guid}")
api_key: string // API key for authentication (empty if none)
}
ConnectResponse {
success: bool // Whether connection succeeded
message: string // Status message
session_id: string // 32-char hex GUID (only valid if success=true)
}
```
The client generates `client_id` as `"ScadaLink-{Guid:N}"` for uniqueness.
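As a rough illustration (not the adapter's actual code), the `N` format specifier renders a GUID as 32 hex digits without hyphens, so an identifier of this shape could be produced as follows. `ClientIdSketch` and `NewClientId` are hypothetical names.

```csharp
using System;

public static class ClientIdSketch
{
    // Hypothetical helper: "N" formats the GUID as 32 hex digits with no hyphens.
    public static string NewClientId() => $"ScadaLink-{Guid.NewGuid():N}";
}
```

Each call returns a fresh value, so two adapters on the same machine would not normally collide.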
### Disconnect
```
DisconnectRequest {
session_id: string
}
DisconnectResponse {
success: bool
message: string
}
```
Best-effort — the client calls disconnect but does not retry on failure.
### GetConnectionState
```
GetConnectionStateRequest {
session_id: string
}
GetConnectionStateResponse {
is_connected: bool
client_id: string
connected_since_utc_ticks: int64 // DateTime.UtcNow.Ticks at connect time
}
```
### CheckApiKey
```
CheckApiKeyRequest {
api_key: string
}
CheckApiKeyResponse {
is_valid: bool
message: string
}
```
Standalone API key validation without creating a session.
## Value-Timestamp-Quality (VTQ)
The core data structure for all read and subscription results:
```
VtqMessage {
tag: string // Tag address
value: TypedValue // Native typed value (protobuf oneof)
timestamp_utc_ticks: int64 // UTC DateTime.Ticks (100ns intervals since 0001-01-01)
quality: QualityCode // OPC UA status code + symbolic name
}
```
### TypedValue
Values are transmitted as native types via a protobuf `oneof`:
| Oneof Variant | Proto Type | .NET Type |
|---|---|---|
| `bool_value` | `bool` | `bool` |
| `int32_value` | `int32` | `int` |
| `int64_value` | `int64` | `long` |
| `float_value` | `float` | `float` |
| `double_value` | `double` | `double` |
| `string_value` | `string` | `string` |
| `bytes_value` | `bytes` | `byte[]` |
| `datetime_value` | `int64` | `DateTime` (UTC ticks) |
| `array_value` | `ArrayValue` | See below |
### ArrayValue
`ArrayValue` contains typed sub-arrays via a protobuf `oneof`:
| Sub-array | Element Type |
|---|---|
| `BoolArray` | `repeated bool` |
| `Int32Array` | `repeated int32` |
| `Int64Array` | `repeated int64` |
| `FloatArray` | `repeated float` |
| `DoubleArray` | `repeated double` |
| `StringArray` | `repeated string` |
> **Note:** `DateTime` arrays are not natively supported in the proto — they are serialized as `Int64Array` (UTC ticks) by the Host.
The ScadaLink adapter normalizes `ArrayValue` objects to comma-separated display strings at the adapter boundary (see [Component-DataConnectionLayer.md](Component-DataConnectionLayer.md#value-serialization)).
### Value Encoding (v1 — deprecated)
The v1 protocol transmitted all values as strings with client-side parsing (`double.TryParse`, `bool.TryParse`). This has been replaced by native `TypedValue`. The v1 heuristics are no longer used.
### Quality Codes
Quality is transmitted as a `QualityCode` enum with OPC UA status code semantics:
| QualityCode | Meaning | OPC UA Mapping |
|---|---|---|
| Good | Value is reliable | StatusCode high bits clear |
| Uncertain | Value may not be current | Non-zero, high bit clear |
| Bad | Value is unreliable or unavailable | High bit set (`0x80000000`) |
The SDK provides `IsGood()`, `IsUncertain()`, and `IsBad()` extension methods on the `Quality` enum. The adapter maps these to ScadaLink's `QualityCode`.
A null or missing VTQ message is treated as Bad quality with null value and current UTC timestamp.
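The severity split in the table above can be sketched as a small classifier. This is an assumption-laden sketch: it uses the OPC UA severity field (the top two bits of the 32-bit status code) rather than the SDK's own `IsGood()`/`IsUncertain()`/`IsBad()` helpers, and `QualitySketch` and `StatusCodeSketch` are hypothetical names.

```csharp
public enum QualitySketch { Good, Uncertain, Bad }

public static class StatusCodeSketch
{
    // Assumption: the top two bits carry severity (00 = Good, 01 = Uncertain, 1x = Bad).
    public static QualitySketch Classify(uint statusCode)
    {
        if ((statusCode & 0x80000000u) != 0) return QualitySketch.Bad;       // high bit set
        if ((statusCode & 0x40000000u) != 0) return QualitySketch.Uncertain; // second bit set
        return QualitySketch.Good;                                           // both bits clear
    }
}
```

Treating the reserved `11` pattern as Bad follows the table's "high bit set" rule; a stricter implementation might reject it instead.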
### Timestamps
- All timestamps are UTC.
- Encoded as `int64` representing `DateTime.Ticks` (100-nanosecond intervals since 0001-01-01 00:00:00 UTC).
- Client reconstructs via `new DateTime(ticks, DateTimeKind.Utc)`.
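A minimal round-trip sketch of the tick encoding described above. `TickTimestampSketch` is a hypothetical helper, not part of the client library.

```csharp
using System;

public static class TickTimestampSketch
{
    // Wire -> .NET: interpret the int64 as UTC ticks.
    public static DateTime FromWire(long ticks) => new DateTime(ticks, DateTimeKind.Utc);

    // .NET -> wire: normalize to UTC first. Note that ToUniversalTime treats
    // DateTimeKind.Unspecified values as local time.
    public static long ToWire(DateTime value) => value.ToUniversalTime().Ticks;
}
```

Callers that already hold UTC values get the same ticks back unchanged; the normalization only matters for local or unspecified inputs.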
## Read Operations
### Read (Single Tag)
```
ReadRequest {
session_id: string // Valid session ID
tag: string // Tag address
}
ReadResponse {
success: bool // Whether read succeeded
message: string // Error message if failed
vtq: VtqMessage // Value-timestamp-quality result
}
```
### ReadBatch (Multiple Tags)
```
ReadBatchRequest {
session_id: string
tags: repeated string // Tag addresses
}
ReadBatchResponse {
success: bool // false if any tag failed
message: string // Error message
vtqs: repeated VtqMessage // Results in same order as request
}
```
Batch reads are **partially successful** — individual tags may have Bad quality while the overall response succeeds. If a tag read throws an exception, its VTQ is returned with Bad quality and current UTC timestamp.
## Write Operations
### Write (Single Tag)
```
WriteRequest {
session_id: string
tag: string
value: TypedValue // Native typed value (see TypedValue)
}
WriteResponse {
success: bool
message: string
}
```
The client adapter's `ToTypedValue` method converts `object?` values to the appropriate `TypedValue` variant before transmission. See [Component-DataConnectionLayer.md](Component-DataConnectionLayer.md#value-serialization) for the mapping table.
### WriteBatch (Multiple Tags)
```
WriteItem {
tag: string
value: TypedValue
}
WriteResult {
tag: string
success: bool
message: string
}
WriteBatchRequest {
session_id: string
items: repeated WriteItem
}
WriteBatchResponse {
success: bool // Overall success (all items must succeed)
message: string
results: repeated WriteResult // Per-item results
}
```
Batch writes are **all-or-nothing** at the reporting level — if any item fails, overall `success` is `false`.
### WriteBatchAndWait (Atomic Write + Flag Polling)
A compound operation: write values, then poll a flag tag until it matches an expected value or times out.
```
WriteBatchAndWaitRequest {
session_id: string
items: repeated WriteItem // Values to write (TypedValue)
flag_tag: string // Tag to poll after writes
flag_value: TypedValue // Expected value (typed comparison)
timeout_ms: int32 // Timeout in ms (default 5000 if ≤ 0)
poll_interval_ms: int32 // Poll interval in ms (default 100 if ≤ 0)
}
WriteBatchAndWaitResponse {
success: bool // Overall operation success
message: string
write_results: repeated WriteResult // Per-item write results
flag_reached: bool // Whether flag matched before timeout
elapsed_ms: int32 // Total elapsed time
}
```
**Behavior:**
1. All writes execute first. If any write fails, the operation returns immediately with `success=false`.
2. If writes succeed, polls `flag_tag` at `poll_interval_ms` intervals.
3. Compares the read result's `TypedValue` against `flag_value`.
4. If flag matches before timeout: `success=true`, `flag_reached=true`.
5. If timeout expires: `success=true`, `flag_reached=false` (timeout is not an error).
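The five steps above can be sketched as a write-then-poll loop. This is not the server's implementation: `writeAll` and `readFlag` are hypothetical delegates standing in for the backend calls, and the flag comparison is reduced to `object.Equals`.

```csharp
#nullable enable
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class WriteAndWaitSketch
{
    public sealed record Result(bool Success, bool FlagReached, int ElapsedMs);

    public static async Task<Result> RunAsync(
        Func<Task<bool>> writeAll,      // hypothetical: true if every write succeeded
        Func<Task<object?>> readFlag,   // hypothetical: current value of flag_tag
        object expected,
        int timeoutMs,
        int pollIntervalMs,
        CancellationToken ct = default)
    {
        if (timeoutMs <= 0) timeoutMs = 5000;          // documented default
        if (pollIntervalMs <= 0) pollIntervalMs = 100; // documented default

        var clock = Stopwatch.StartNew();

        // Step 1: a failed write ends the operation immediately.
        if (!await writeAll())
            return new Result(false, false, (int)clock.ElapsedMilliseconds);

        // Steps 2-4: poll until the flag matches or the timeout expires.
        while (clock.ElapsedMilliseconds < timeoutMs)
        {
            if (Equals(await readFlag(), expected))
                return new Result(true, true, (int)clock.ElapsedMilliseconds);
            await Task.Delay(pollIntervalMs, ct);
        }

        // Step 5: a timeout is reported through FlagReached, not as a failure.
        return new Result(true, false, (int)clock.ElapsedMilliseconds);
    }
}
```

Callers therefore need to check `FlagReached` as well as `Success`, since a timed-out wait still reports `Success = true`.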
## Subscription (Server Streaming)
### Subscribe
```
SubscribeRequest {
session_id: string
tags: repeated string // Tag addresses to monitor
sampling_ms: int32 // Backend sampling interval in milliseconds
}
// Returns: stream of VtqMessage
```
**Behavior:**
1. Server validates the session. Invalid session → `RpcException` with `StatusCode.Unauthenticated`.
2. Server registers monitored items on the backend (e.g., OPC UA subscriptions) for all requested tags.
3. On each value change, the server pushes a `VtqMessage` to the response stream.
4. The stream remains open indefinitely until:
- The client cancels (disposes the subscription).
- The server encounters an error (backend disconnect, etc.).
- The gRPC connection drops.
5. On stream termination, the client's `onStreamError` callback fires exactly once.
**Client-side subscription lifecycle:**
```
ILmxSubscription subscription = await client.SubscribeAsync(
addresses: ["Motor.Speed", "Motor.Temperature"],
onUpdate: (tag, vtq) => { /* handle value change */ },
onStreamError: ex => { /* handle disconnect */ });
// Later:
await subscription.DisposeAsync(); // Cancels the stream
```
Disposing the subscription cancels the underlying `CancellationTokenSource`, which terminates the background stream-reading task and triggers server-side cleanup of monitored items.
## Tag Addressing
Tags are string addresses that identify data points. The proxy maps tag addresses to backend-specific identifiers.
**LmxFakeProxy example** (OPC UA backend):
Tag addresses are concatenated with a configurable prefix to form OPC UA node IDs:
```
Prefix: "ns=3;s="
Tag: "Motor.Speed"
NodeId: "ns=3;s=Motor.Speed"
```
The prefix is configured at server startup via the `OPC_UA_PREFIX` environment variable.
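The mapping amounts to plain string concatenation. A minimal sketch, assuming no escaping or validation is applied; `TagMapperSketch` is a hypothetical name and may not match the real `TagMapper`.

```csharp
public static class TagMapperSketch
{
    // Hypothetical helper: prefix + tag, with no escaping or validation.
    public static string ToNodeId(string prefix, string tag) => prefix + tag;
}
```

With the values from the example above, `TagMapperSketch.ToNodeId("ns=3;s=", "Motor.Speed")` yields `ns=3;s=Motor.Speed`.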
## Transport Details
| Setting | Value |
|---------|-------|
| Protocol | gRPC over HTTP/2 |
| Default port | 50051 |
| TLS | Optional (controlled by `UseTls` connection parameter) |
| Metadata headers | `x-api-key` (sent on Connect call if API key configured) |
### Connection Parameters
The ScadaLink DCL configures LmxProxy connections via a string dictionary:
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `Host` | string | `"localhost"` | gRPC server hostname |
| `Port` | string (parsed as int) | `"50051"` | gRPC server port |
| `ApiKey` | string | (none) | API key for authentication |
| `SamplingIntervalMs` | string (parsed as int) | `"0"` | Backend sampling interval for subscriptions |
| `UseTls` | string (parsed as bool) | `"false"` | Use HTTPS instead of HTTP |
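The table can be read as a parse-with-defaults routine over the string dictionary. The following is a sketch under that reading, not the adapter's code: `ConnectionParamsSketch` and `Parsed` are hypothetical, and falling back to the default on an unparsable value is an assumption.

```csharp
#nullable enable
using System.Collections.Generic;

public static class ConnectionParamsSketch
{
    public sealed record Parsed(string Host, int Port, string? ApiKey, int SamplingIntervalMs, bool UseTls);

    // Hypothetical parser: missing or unparsable entries fall back to the documented defaults.
    public static Parsed Parse(IReadOnlyDictionary<string, string> settings)
    {
        var host = settings.TryGetValue("Host", out var h) && !string.IsNullOrWhiteSpace(h) ? h : "localhost";
        var port = settings.TryGetValue("Port", out var p) && int.TryParse(p, out var portValue) ? portValue : 50051;
        var apiKey = settings.TryGetValue("ApiKey", out var k) && !string.IsNullOrEmpty(k) ? k : null;
        var sampling = settings.TryGetValue("SamplingIntervalMs", out var s) && int.TryParse(s, out var samplingValue) ? samplingValue : 0;
        var useTls = settings.TryGetValue("UseTls", out var t) && bool.TryParse(t, out var tlsValue) && tlsValue;
        return new Parsed(host, port, apiKey, sampling, useTls);
    }
}
```

A production parser might prefer to reject malformed values outright, so that a typo in `Port` fails loudly instead of silently selecting 50051.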
## Error Handling
| Operation | Error Mechanism | Client Behavior |
|-----------|----------------|-----------------|
| Connect | `success=false` in response | Throws `InvalidOperationException` |
| Read/ReadBatch | `success=false` in response | Throws `InvalidOperationException` |
| Write/WriteBatch | `success=false` in response | Throws `InvalidOperationException` |
| WriteBatchAndWait | `success=false` or `flag_reached=false` | Returns result (timeout is not an exception) |
| Subscribe (auth) | `RpcException` with `Unauthenticated` | Propagated to caller |
| Subscribe (stream) | Stream ends or gRPC error | `onStreamError` callback invoked; `sessionId` nullified |
| Any (disconnected) | Client checks `IsConnected` | Throws `InvalidOperationException("not connected")` |
When a subscription stream ends unexpectedly, the client immediately nullifies its session ID, causing `IsConnected` to return `false`. The DCL adapter fires its `Disconnected` event, which triggers the reconnection cycle in the `DataConnectionActor`.
## Implementation Files
| Component | File |
|-----------|------|
| Proto definition | `src/ScadaLink.DataConnectionLayer/Adapters/Protos/scada.proto` |
| Client interface | `src/ScadaLink.DataConnectionLayer/Adapters/ILmxProxyClient.cs` |
| Client implementation | `src/ScadaLink.DataConnectionLayer/Adapters/RealLmxProxyClient.cs` |
| DCL adapter | `src/ScadaLink.DataConnectionLayer/Adapters/LmxProxyDataConnection.cs` |
| Client factory | `src/ScadaLink.DataConnectionLayer/Adapters/LmxProxyClientFactory.cs` |
| Server implementation | `infra/lmxfakeproxy/Services/ScadaServiceImpl.cs` |
| Session manager | `infra/lmxfakeproxy/Sessions/SessionManager.cs` |
| Tag mapper | `infra/lmxfakeproxy/TagMapper.cs` |
| OPC UA bridge interface | `infra/lmxfakeproxy/Bridge/IOpcUaBridge.cs` |
| OPC UA bridge impl | `infra/lmxfakeproxy/Bridge/OpcUaBridge.cs` |

deprecated/lmxproxy/.gitignore

@@ -0,0 +1 @@
publish-v2/


@@ -0,0 +1,71 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## What This Is
LmxProxy is a gRPC proxy that bridges ScadaLink's Data Connection Layer to AVEVA System Platform via the ArchestrA MXAccess COM API. It has two projects:
- **Host** (`ZB.MOM.WW.LmxProxy.Host`) — .NET Framework 4.8, x86-only Windows service. Hosts a gRPC server (Grpc.Core) that fronts an MxAccessClient talking to ArchestrA MXAccess. Runs as a Windows service via Topshelf.
- **Client** (`ZB.MOM.WW.LmxProxy.Client`) — .NET 10, AnyCPU library. Code-first gRPC client (protobuf-net.Grpc) consumed by ScadaLink's DCL. This is a NuGet-packable library.
The two projects use **different gRPC stacks**: Host uses proto-file-generated code (`Grpc.Core` + `Grpc.Tools`), Client uses code-first contracts (`protobuf-net.Grpc` with `[DataContract]`/`[ServiceContract]` attributes). They are wire-compatible because both target the same `scada.ScadaService` gRPC service.
## Build Commands
```bash
dotnet build ZB.MOM.WW.LmxProxy.slnx # Build entire solution
dotnet build src/ZB.MOM.WW.LmxProxy.Host # Host only (requires x86 platform)
dotnet build src/ZB.MOM.WW.LmxProxy.Client # Client only
```
The Host project requires the `ArchestrA.MXAccess.dll` COM interop assembly in `lib/`. It targets x86 exclusively (MXAccess is 32-bit COM).
## Architecture
### Host Service Startup Chain
`Program.Main` → Topshelf `HostFactory` → `LmxProxyService.Start()`, which:
1. Validates configuration (`appsettings.json` bound to `LmxProxyConfiguration`)
2. Creates `MxAccessClient` (the `IScadaClient` impl that wraps ArchestrA.MXAccess COM)
3. Connects to MxAccess synchronously at startup
4. Starts connection monitor loop (auto-reconnect)
5. Creates `SubscriptionManager`, `SessionManager`, `PerformanceMetrics`, `ApiKeyService`
6. Creates `ScadaGrpcService` (the proto-generated service impl) with all dependencies
7. Starts Grpc.Core `Server` on configured port (default 50051)
8. Starts HTTP status web server (default port 8080)
### Key Host Components
- `MxAccessClient` — Partial class split across 6 files (Connection, ReadWrite, Subscription, EventHandlers, NestedTypes, main). Wraps `LMXProxyServer` COM object. Uses semaphores for concurrency control.
- `ScadaGrpcService` — Inherits proto-generated `ScadaService.ScadaServiceBase`. All RPCs validate session first, then delegate to `IScadaClient`. Values are string-serialized on the wire (v1 protocol).
- `SessionManager` — Tracks client sessions by GUID.
- `SubscriptionManager` — Manages MxAccess subscriptions, fans out updates via `System.Threading.Channels`.
- `ApiKeyInterceptor` — gRPC server interceptor for API key validation.
### Client Architecture
- `ILmxProxyClient` — Public interface for consumers. Connect/Read/Write/Subscribe/Dispose.
- `LmxProxyClient` — Partial class split across multiple files (Connection, Subscription, Metrics, etc.). Uses `protobuf-net.Grpc` code-first contracts (`IScadaService` in `Domain/ScadaContracts.cs`).
- `LmxProxyClientBuilder` — Fluent builder for configuring client instances.
- `Domain/ScadaContracts.cs` — All gRPC message types as `[DataContract]` POCOs and the `IScadaService` interface with `[ServiceContract]`.
- Value conversion: Client parses string values from wire using double → bool → string heuristic in `ConvertToVtq()`. Writes use `.ToString()` via `ConvertToString()`.
### Protocol
Proto definition: `src/ZB.MOM.WW.LmxProxy.Host/Grpc/Protos/scada.proto`
Currently v1 protocol (string-encoded values, string quality). A v2 protocol spec exists in `docs/lmxproxy_updates.md` that introduces `TypedValue` (protobuf oneof) and `QualityCode` (OPC UA status codes) — not yet implemented.
RPCs: Connect, Disconnect, GetConnectionState, Read, ReadBatch, Write, WriteBatch, WriteBatchAndWait, Subscribe (server streaming), CheckApiKey.
### Configuration
Host configured via `appsettings.json` bound to `LmxProxyConfiguration`. Key sections: GrpcPort, Connection (timeouts, auto-reconnect), Subscription (channel capacity), Tls, WebServer, Serilog, RetryPolicies, HealthCheck.
## Important Constraints
- Host **must** target x86 and .NET Framework 4.8 (ArchestrA.MXAccess is 32-bit COM interop).
- Host uses `Grpc.Core` (the deprecated C-core gRPC library), not `Grpc.Net`. This is required because .NET 4.8 doesn't support `Grpc.Net.Server`.
- Client uses `Grpc.Net.Client` and targets .NET 10 — it runs in the ScadaLink central/site clusters.
- The solution file is `.slnx` format (XML-based, not the older text format).


@@ -0,0 +1,11 @@
<Solution>
<Folder Name="/src/">
<Project Path="src/ZB.MOM.WW.LmxProxy.Host/ZB.MOM.WW.LmxProxy.Host.csproj" />
<Project Path="src/ZB.MOM.WW.LmxProxy.Client/ZB.MOM.WW.LmxProxy.Client.csproj" />
</Folder>
<Folder Name="/tests/">
<Project Path="tests/ZB.MOM.WW.LmxProxy.Host.Tests/ZB.MOM.WW.LmxProxy.Host.Tests.csproj" />
<Project Path="tests/ZB.MOM.WW.LmxProxy.Client.Tests/ZB.MOM.WW.LmxProxy.Client.Tests.csproj" />
<Project Path="tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests.csproj" />
</Folder>
</Solution>


@@ -0,0 +1,107 @@
# LmxProxy v2 Rebuild — Deviations & Key Technical Decisions
Decisions made during implementation that differ from or extend the original plan.
## 1. Grpc.Tools downgraded to 2.68.1
**Plan specified**: Grpc.Tools 2.71.0
**Actual**: 2.68.1
**Why**: `protoc.exe` from 2.71.0 crashes with an access violation (exit code 0xC0000005) on windev (Windows 10, x64). Version 2.68.1 works reliably.
**How to apply**: If upgrading Grpc.Tools in the future, test protoc on windev first.
## 2. STA threading — three iterations
**Plan specified**: Dedicated STA thread with `BlockingCollection<Action>` dispatch queue and `Application.DoEvents()` message pump.
**Iteration 1 (failed)**: `StaDispatchThread` with `BlockingCollection.Take()` + `Application.DoEvents()`. Failed because `Take()` blocked the STA thread, preventing the message pump from running. COM callbacks never fired.
**Iteration 2 (partial)**: Replaced with `Task.Run` on thread pool (MTA). `OnDataChange` worked (MxAccess fires it on its own threads), but `OnWriteComplete` never fired (needs message-pump-based marshaling). Writes used fire-and-forget as a workaround.
**Iteration 3 (current)**: `StaComThread` with Win32 `GetMessage`/`DispatchMessage` loop. Work dispatched via `PostThreadMessage(WM_APP)` which wakes the message pump. COM callbacks (`OnDataChange`, `OnWriteComplete`) are delivered between work items via `DispatchMessage`. All COM objects created and called on this single STA thread.
**How to apply**: All MxAccess COM calls must go through `_staThread.RunAsync()`. Never call COM objects directly from thread pool threads. See `docs/sta_gap.md` for the full design analysis.
## 3. TypedValue property-level `_setCase` tracking
**Plan specified**: `GetValueCase()` heuristic checking non-default values (e.g., `if (BoolValue) return BoolValue`).
**Actual**: Each property setter records `_setCase = TypedValueCase.XxxValue`, and `GetValueCase()` returns `_setCase` directly.
**Why**: protobuf-net code-first has no native `oneof` support. The heuristic approach can't distinguish "field not set" from "field set to default value" (e.g., `BoolValue = false`, `DoubleValue = 0.0`, `Int32Value = 0`). Since protobuf-net calls property setters during deserialization, tracking in the setter correctly identifies which field was deserialized.
**How to apply**: Always use `GetValueCase()` to determine which TypedValue field is set, never check for non-default values directly.
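The setter-tracking idea can be sketched as follows. This is a simplified stand-in, not the real contract: the protobuf-net attributes are omitted, only three variants are shown, and `TypedValueSketch` and `TypedValueCaseSketch` are hypothetical names.

```csharp
public enum TypedValueCaseSketch { None, BoolValue, Int32Value, DoubleValue }

public sealed class TypedValueSketch
{
    private TypedValueCaseSketch _setCase = TypedValueCaseSketch.None;
    private bool _bool;
    private int _int32;
    private double _double;

    // Each setter records which field was written, even when the value is the type's default.
    public bool BoolValue
    {
        get => _bool;
        set { _bool = value; _setCase = TypedValueCaseSketch.BoolValue; }
    }

    public int Int32Value
    {
        get => _int32;
        set { _int32 = value; _setCase = TypedValueCaseSketch.Int32Value; }
    }

    public double DoubleValue
    {
        get => _double;
        set { _double = value; _setCase = TypedValueCaseSketch.DoubleValue; }
    }

    // Returns the recorded case directly instead of guessing from non-default values.
    public TypedValueCaseSketch GetValueCase() => _setCase;
}
```

With this shape, `new TypedValueSketch { BoolValue = false }` reports `BoolValue`, whereas a non-default-value heuristic would report that nothing is set.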
## 4. API key sent via HTTP header (DelegatingHandler)
**Plan specified**: API key sent in `ConnectRequest.ApiKey` field (request body).
**Actual**: API key sent as `x-api-key` HTTP header on every gRPC request via `ApiKeyDelegatingHandler`, in addition to the request body.
**Why**: The Host's `ApiKeyInterceptor` validates the `x-api-key` gRPC metadata header before any RPC handler executes. protobuf-net.Grpc's `CreateGrpcService<T>()` doesn't expose per-call metadata, so the header must be added at the HTTP transport level. A `DelegatingHandler` wrapping the `SocketsHttpHandler` adds it to all outgoing requests.
**How to apply**: The `GrpcChannelFactory.CreateChannel()` accepts an optional `apiKey` parameter. The `LmxProxyClient` passes it during channel creation in `ConnectAsync`.
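A transport-level handler of this kind could look roughly like the sketch below. It is an illustration only: the class name `ApiKeyHeaderHandlerSketch` is hypothetical, and the real `ApiKeyDelegatingHandler` may differ in its constructor and header handling.

```csharp
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public sealed class ApiKeyHeaderHandlerSketch : DelegatingHandler
{
    private readonly string _apiKey;

    public ApiKeyHeaderHandlerSketch(string apiKey, HttpMessageHandler inner) : base(inner)
    {
        _apiKey = apiKey;
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Replace rather than append so a resent request does not carry the header twice.
        request.Headers.Remove("x-api-key");
        request.Headers.TryAddWithoutValidation("x-api-key", _apiKey);
        return base.SendAsync(request, cancellationToken);
    }
}
```

With `Grpc.Net.Client`, a handler like this would typically be supplied through `GrpcChannelOptions.HttpHandler`, wrapping a `SocketsHttpHandler` as the inner handler.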
## 5. v2 test deployment on port 50100
**Plan specified**: Port 50052 for v2 test deployment.
**Actual**: Port 50100.
**Why**: Ports 50049–50060 are used by MxAccess internal COM connections (established TCP pairs between the COM client and server). Port 50052 was occupied by an ephemeral MxAccess connection from the v1 service.
**How to apply**: When deploying alongside v1, use port 50100 or higher to stay clear of the ports MxAccess uses for its internal connections.
## 6. CheckApiKey validates request body key
**Plan specified**: Not explicitly defined — the interceptor validates the header key.
**Actual**: `CheckApiKey` RPC validates the key from the *request body* (`request.ApiKey`) against `ApiKeyService`, not the header key.
**Why**: The `x-api-key` header always carries the caller's valid key (for interceptor auth). The `CheckApiKey` RPC is designed for clients to test whether a *different* key is valid, so it must check the body key independently.
**How to apply**: `ScadaGrpcService` receives `ApiKeyService` as an optional constructor parameter.
## 7. OnWriteComplete callback — resolved via STA message pump
**Plan specified**: Wait for `OnWriteComplete` COM callback to confirm write success.
**History**: Initially implemented as fire-and-forget because `OnWriteComplete` never fired — the Host had no Windows message pump to deliver the COM callback. See `docs/sta_gap.md` for the full analysis.
**Resolution**: `StaComThread` (a dedicated STA thread with a Win32 `GetMessage`/`DispatchMessage` loop) was introduced, providing a proper message pump. All COM operations are now dispatched to this thread via `PostThreadMessage(WM_APP)`. The message pump delivers `OnWriteComplete` callbacks between work items.
**Current behavior**: A write dispatches `_lmxProxy.Write()` on the STA thread, registers a `TaskCompletionSource` in `_pendingWrites`, then awaits the callback with a timeout. `OnWriteComplete` resolves or rejects the TCS with `MxStatusMapper` error details. If the callback doesn't arrive within the write timeout, the write is reported as successful (fire-and-forget safety net). Cleanup (UnAdvise + RemoveItem) happens on the STA thread after the callback or timeout.
**How to apply**: Writes now get real confirmation from MxAccess. Secured write (1012) and verified write (1013) rejections are surfaced as exceptions via `OnWriteComplete`. The timeout fallback ensures writes don't hang if the callback is delayed.
## 8. SubscriptionManager must create MxAccess COM subscriptions
**Plan specified**: SubscriptionManager manages per-client channels and routes updates from MxAccess.
**Actual**: SubscriptionManager must also call `IScadaClient.SubscribeAsync()` to create the underlying COM subscriptions when a tag is first subscribed, and dispose them when the last client unsubscribes.
**Why**: The Phase 2 implementation tracked client-to-tag routing in internal dictionaries but never called `MxAccessClient.SubscribeAsync()` to create the actual MxAccess COM subscriptions (`AddItem` + `AdviseSupervisory`). Without the COM subscription, `OnDataChange` never fired and no updates were delivered to clients. This caused the `Subscribe_ReceivesUpdates` integration test to receive 0 updates over 30 seconds.
**How to apply**: `SubscriptionManager.SubscribeAsync()` collects newly-seen tags (those without an existing `TagSubscription`) and **awaits** `_scadaClient.SubscribeAsync()` for them, passing `OnTagValueChanged` as the callback. The await ensures the COM subscription is fully established before the channel reader is returned — this prevents a race where the initial `OnDataChange` (first value delivery after `AdviseSupervisory`) fires before the gRPC stream handler starts reading. Previously this was fire-and-forget (`_ = CreateMxAccessSubscriptionsAsync()`), causing intermittent `Subscribe_ReceivesUpdates` test failures (0 updates in 30s).
---
# Known Gaps
## Gap 1: No active connection health probing
**Status**: Resolved (2026-03-22, commit `a6c01d7`).
**Problem**: `MxAccessClient.IsConnected` checks `_connectionState == Connected && _connectionHandle > 0`. When the AVEVA platform (aaBootstrap) is killed or restarted, the MxAccess COM object and handle remain valid in memory — `IsConnected` stays `true`. The auto-reconnect monitor loop (`MonitorConnectionAsync`) only triggers when `IsConnected` is `false`, so it never attempts reconnection.
**Observed behavior** (tested 2026-03-22): After killing the aaBootstrap process, all reads returned null values with Bad quality indefinitely. The monitor loop kept seeing `IsConnected == true` and never reconnected.
**Fix implemented**: The monitor loop now actively probes the connection using `ProbeConnectionAsync`, which reads a configurable test tag and classifies the result as `Healthy`, `TransportFailure`, or `DataDegraded`.
- `TransportFailure` for N consecutive probes (default 3) → forced disconnect + full reconnect (new COM object, `Register`, `RecreateStoredSubscriptionsAsync`)
- `DataDegraded` → stay connected, back off probe interval to 30s, report degraded status (platform objects may be stopped)
- `Healthy` → reset counters, resume normal interval
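The three outcomes above can be sketched as a small counter-based state tracker. The `ProbeResult` values come from the text; `ProbeAction` and `ProbeTrackerSketch` are hypothetical, and resetting the failure counter on `DataDegraded` is an assumption rather than documented behavior.

```csharp
public enum ProbeResult { Healthy, TransportFailure, DataDegraded }
public enum ProbeAction { Continue, BackOff, Reconnect }

public sealed class ProbeTrackerSketch
{
    private readonly int _maxTransportFailures;
    private int _consecutiveTransportFailures;

    public ProbeTrackerSketch(int maxTransportFailures = 3)
        => _maxTransportFailures = maxTransportFailures;

    public ProbeAction Record(ProbeResult result)
    {
        switch (result)
        {
            case ProbeResult.TransportFailure:
                // Only N failures in a row force a reconnect.
                if (++_consecutiveTransportFailures >= _maxTransportFailures)
                {
                    _consecutiveTransportFailures = 0;
                    return ProbeAction.Reconnect;
                }
                return ProbeAction.Continue;

            case ProbeResult.DataDegraded:
                // Assumption: a degraded read still proves the transport works.
                _consecutiveTransportFailures = 0;
                return ProbeAction.BackOff;

            default:
                _consecutiveTransportFailures = 0;
                return ProbeAction.Continue;
        }
    }
}
```

Keeping the decision separate from the probe itself makes the threshold logic testable without a live MxAccess connection.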
**Verified** (tested 2026-03-22): Graceful platform stop via SMC → 4 failed probes → automatic reconnect → reads restored within ~60 seconds. All 17 integration tests pass after recovery. Subscribed clients receive `Bad_NotConnected` quality during outage, then Good quality resumes automatically.
**Configuration** (`appsettings.json` → `HealthCheck` section):
- `TestTagAddress`: Tag to probe (default `TestChildObject.TestBool`)
- `ProbeTimeoutMs`: Probe read timeout (default 5000ms)
- `MaxConsecutiveTransportFailures`: Failures before forced reconnect (default 3)
- `DegradedProbeIntervalMs`: Probe interval in degraded mode (default 30000ms)
## Gap 2: Stale SubscriptionManager handles after reconnect
**Status**: Resolved (2026-03-22, commit `a6c01d7`).
**Problem**: `SubscriptionManager` stored `IAsyncDisposable` handles from `_scadaClient.SubscribeAsync()` in `_mxAccessHandles`. After a reconnect, `MxAccessClient.RecreateStoredSubscriptionsAsync()` recreated COM subscriptions internally but `SubscriptionManager._mxAccessHandles` still held stale handles. Additionally, a batch subscription stored the same handle for every address — disposing one address would dispose the entire batch.
**Fix implemented**: Removed `_mxAccessHandles` entirely. `SubscriptionManager` no longer tracks COM subscription handles. Ownership is cleanly split:
- `SubscriptionManager` owns client routing and ref-counting only
- `MxAccessClient` owns COM subscription lifecycle via `_storedSubscriptions` and `_addressToHandle`
- Unsubscribe uses `_scadaClient.UnsubscribeByAddressAsync(addresses)` — address-based, resolves to current handles regardless of reconnect history
## Gap 3: AVEVA objects don't auto-start after platform crash
**Status**: Documented. Platform behavior, not an LmxProxy issue.
**Observed behavior** (tested 2026-03-22): After killing aaBootstrap, the service auto-restarted (via Windows SCM recovery or Watchdog) within seconds. However, the ArchestrA objects (TestChildObject) did not automatically start. MxAccess connected successfully (`Register()` returned a valid handle) but all tag reads returned null values with Bad quality for 40+ minutes. Objects only recovered after manual restart via the System Management Console (SMC).
**Implication for LmxProxy**: Even with Gap 1 fixed (active probing + reconnect), reads will still return Bad quality until the platform objects are running. LmxProxy cannot fix this — it's a platform-level recovery issue. The health check should report this clearly: "MxAccess connected but tag quality is Bad — platform objects may need manual restart."
**Timeline**: After a graceful aaBootstrap restart from SMC, objects take ~5 minutes to come back. After an aaBootstrap kill (crash), objects do not auto-recover and must be restarted manually via SMC.


@@ -0,0 +1,646 @@
# LmxProxy Protocol v2 — OPC UA Alignment
This document specifies all changes to the LmxProxy gRPC protocol to align it with OPC UA semantics. The changes replace string-serialized values with typed values and simple quality strings with OPC UA-style status codes.
**Baseline:** `lmxproxy_protocol.md` (v1 protocol spec)
**Strategy:** Clean break — all clients and servers updated simultaneously. No backward compatibility layer.
---
## 1. Change Summary
| Message / Field | v1 Type | v2 Type | Breaking? |
|-----------------|---------|---------|-----------|
| `VtqMessage.value` | `string` | `TypedValue` | Yes |
| `VtqMessage.quality` | `string` | `QualityCode` | Yes |
| `WriteRequest.value` | `string` | `TypedValue` | Yes |
| `WriteItem.value` | `string` | `TypedValue` | Yes |
| `WriteBatchAndWaitRequest.flag_value` | `string` | `TypedValue` | Yes |
**Unchanged messages:** `ConnectRequest`, `ConnectResponse`, `DisconnectRequest`, `DisconnectResponse`, `GetConnectionStateRequest`, `GetConnectionStateResponse`, `CheckApiKeyRequest`, `CheckApiKeyResponse`, `ReadRequest`, `ReadBatchRequest`, `SubscribeRequest`, `WriteResponse`, `WriteBatchResponse`, `WriteBatchAndWaitResponse`, `WriteResult`.
**Unchanged RPCs:** The `ScadaService` definition is identical — same RPC names, same request/response pairing. Only the internal message shapes change.
---
## 2. Complete Updated Proto File
```protobuf
syntax = "proto3";
package scada;
// ============================================================
// Service Definition (unchanged)
// ============================================================
service ScadaService {
rpc Connect(ConnectRequest) returns (ConnectResponse);
rpc Disconnect(DisconnectRequest) returns (DisconnectResponse);
rpc GetConnectionState(GetConnectionStateRequest) returns (GetConnectionStateResponse);
rpc Read(ReadRequest) returns (ReadResponse);
rpc ReadBatch(ReadBatchRequest) returns (ReadBatchResponse);
rpc Write(WriteRequest) returns (WriteResponse);
rpc WriteBatch(WriteBatchRequest) returns (WriteBatchResponse);
rpc WriteBatchAndWait(WriteBatchAndWaitRequest) returns (WriteBatchAndWaitResponse);
rpc Subscribe(SubscribeRequest) returns (stream VtqMessage);
rpc CheckApiKey(CheckApiKeyRequest) returns (CheckApiKeyResponse);
}
// ============================================================
// NEW: Typed Value System
// ============================================================
// Replaces the v1 string-encoded value field.
// At most one field is set. An unset oneof represents null.
message TypedValue {
oneof value {
bool bool_value = 1;
int32 int32_value = 2;
int64 int64_value = 3;
float float_value = 4;
double double_value = 5;
string string_value = 6;
bytes bytes_value = 7; // byte[]
int64 datetime_value = 8; // UTC DateTime.Ticks (100ns intervals since 0001-01-01)
ArrayValue array_value = 9; // arrays of primitives
}
}
// Container for typed arrays. Exactly one field will be set.
message ArrayValue {
oneof values {
BoolArray bool_values = 1;
Int32Array int32_values = 2;
Int64Array int64_values = 3;
FloatArray float_values = 4;
DoubleArray double_values = 5;
StringArray string_values = 6;
}
}
message BoolArray { repeated bool values = 1; }
message Int32Array { repeated int32 values = 1; }
message Int64Array { repeated int64 values = 1; }
message FloatArray { repeated float values = 1; }
message DoubleArray { repeated double values = 1; }
message StringArray { repeated string values = 1; }
// ============================================================
// NEW: OPC UA-Style Quality Codes
// ============================================================
// Replaces the v1 string quality field ("Good", "Bad", "Uncertain").
message QualityCode {
uint32 status_code = 1; // OPC UA-compatible numeric status code
string symbolic_name = 2; // Human-readable name (e.g., "Good", "BadSensorFailure")
}
// ============================================================
// Connection Lifecycle (unchanged)
// ============================================================
message ConnectRequest {
string client_id = 1;
string api_key = 2;
}
message ConnectResponse {
bool success = 1;
string message = 2;
string session_id = 3;
}
message DisconnectRequest {
string session_id = 1;
}
message DisconnectResponse {
bool success = 1;
string message = 2;
}
message GetConnectionStateRequest {
string session_id = 1;
}
message GetConnectionStateResponse {
bool is_connected = 1;
string client_id = 2;
int64 connected_since_utc_ticks = 3;
}
message CheckApiKeyRequest {
string api_key = 1;
}
message CheckApiKeyResponse {
bool is_valid = 1;
string message = 2;
}
// ============================================================
// Value-Timestamp-Quality (CHANGED)
// ============================================================
message VtqMessage {
string tag = 1; // Tag address (unchanged)
TypedValue value = 2; // CHANGED: typed value instead of string
int64 timestamp_utc_ticks = 3; // UTC DateTime.Ticks (unchanged)
QualityCode quality = 4; // CHANGED: structured quality instead of string
}
// ============================================================
// Read Operations (request unchanged, response uses new VtqMessage)
// ============================================================
message ReadRequest {
string session_id = 1;
string tag = 2;
}
message ReadResponse {
bool success = 1;
string message = 2;
VtqMessage vtq = 3; // Uses updated VtqMessage with TypedValue + QualityCode
}
message ReadBatchRequest {
string session_id = 1;
repeated string tags = 2;
}
message ReadBatchResponse {
bool success = 1;
string message = 2;
repeated VtqMessage vtqs = 3; // Uses updated VtqMessage
}
// ============================================================
// Write Operations (CHANGED: TypedValue instead of string)
// ============================================================
message WriteRequest {
string session_id = 1;
string tag = 2;
TypedValue value = 3; // CHANGED from string
}
message WriteResponse {
bool success = 1;
string message = 2;
}
message WriteItem {
string tag = 1;
TypedValue value = 2; // CHANGED from string
}
message WriteResult {
string tag = 1;
bool success = 2;
string message = 3;
}
message WriteBatchRequest {
string session_id = 1;
repeated WriteItem items = 2;
}
message WriteBatchResponse {
bool success = 1;
string message = 2;
repeated WriteResult results = 3;
}
// ============================================================
// WriteBatchAndWait (CHANGED: TypedValue for items and flag)
// ============================================================
message WriteBatchAndWaitRequest {
string session_id = 1;
repeated WriteItem items = 2; // Uses updated WriteItem with TypedValue
string flag_tag = 3;
TypedValue flag_value = 4; // CHANGED from string — type-aware comparison
int32 timeout_ms = 5;
int32 poll_interval_ms = 6;
}
message WriteBatchAndWaitResponse {
bool success = 1;
string message = 2;
repeated WriteResult write_results = 3;
bool flag_reached = 4;
int32 elapsed_ms = 5;
}
// ============================================================
// Subscription (request unchanged, stream uses new VtqMessage)
// ============================================================
message SubscribeRequest {
string session_id = 1;
repeated string tags = 2;
int32 sampling_ms = 3;
}
// Returns: stream of VtqMessage (updated with TypedValue + QualityCode)
```
---
## 3. Detailed Change Specifications
### 3.1 Typed Value Representation
**What changed:** The `string value` field throughout the protocol is replaced by `TypedValue`, a protobuf `oneof` that carries the value in its native type.
**v1 behavior (removed):**
- All values serialized to string via `.ToString()`
- Client-side parsing heuristic: numeric → bool → string → null
- Arrays JSON-serialized as strings (e.g., `"[1,2,3]"`)
- Empty string treated as null
**v2 behavior:**
- Values transmitted in their native protobuf type
- No parsing ambiguity — the `oneof` case tells you the type
- Arrays use dedicated repeated-field messages (`Int32Array`, `FloatArray`, etc.)
- Null represented by an unset `oneof` (no field selected in `TypedValue`)
- `datetime_value` uses `int64` UTC Ticks (same wire encoding as v1 timestamps, but now semantically typed as a DateTime value rather than a string)
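The `datetime_value` encoding can be sketched as below. This is a minimal illustration, not the project's code: the class `DateTimeWire` and its two methods are hypothetical names, and the only framework features relied on are `DateTime.Ticks`, `DateTime.ToUniversalTime()`, and the `DateTime(long, DateTimeKind)` constructor.

```csharp
using System;

public static class DateTimeWire
{
    // Hypothetical helper: produce the int64 for TypedValue.datetime_value.
    // Ticks are 100 ns intervals since 0001-01-01. Non-UTC inputs are
    // converted first; note that ToUniversalTime() treats Unspecified as local.
    public static long ToWireTicks(DateTime value) =>
        value.Kind == DateTimeKind.Utc ? value.Ticks : value.ToUniversalTime().Ticks;

    // Hypothetical helper: rebuild a UTC DateTime from the wire value.
    public static DateTime FromWireTicks(long ticks) =>
        new DateTime(ticks, DateTimeKind.Utc);
}
```

Because the wire value is a plain tick count, a round trip through these helpers preserves the value exactly, which is what the tick-level equality rule for flag comparison depends on.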
**Null handling:**
| Scenario | v1 | v2 |
|----------|----|----|
| Null value | `value = ""` (empty string) | `TypedValue` with no `oneof` case set |
| Missing VTQ | Treated as Bad quality, null value | Same — Bad quality, unset `TypedValue` |
**Type mapping from internal tag model:**
| Tag Data Type | TypedValue Field | Notes |
|---------------|-----------------|-------|
| `bool` | `bool_value` | |
| `int32` | `int32_value` | |
| `int64` | `int64_value` | |
| `float` | `float_value` | |
| `double` | `double_value` | |
| `string` | `string_value` | |
| `byte[]` | `bytes_value` | |
| `DateTime` | `datetime_value` | UTC Ticks as int64 |
| `float[]` | `array_value.float_values` | |
| `int32[]` | `array_value.int32_values` | |
| Other arrays | Corresponding `ArrayValue` field | |
### 3.2 OPC UA-Style Quality Codes
**What changed:** The `string quality` field (one of `"Good"`, `"Uncertain"`, `"Bad"`) is replaced by `QualityCode` containing a numeric OPC UA status code and a human-readable symbolic name.
**v1 behavior (removed):**
- Quality as case-insensitive string: `"Good"`, `"Uncertain"`, `"Bad"`
- No sub-codes — all failures were just `"Bad"`
**v2 behavior:**
- `status_code` is a `uint32` matching OPC UA `StatusCode` bit layout
- `symbolic_name` is the human-readable equivalent (for logging, debugging, display)
- Category is derived from the two high-order bits (mask `0xC0000000`): `0x00000000` = Good, `0x40000000` = Uncertain, `0x80000000` = Bad
**Supported quality codes:**
The quality codes below are limited to those actively used by AVEVA System Platform, InTouch, and OI Server/DAServer (per AVEVA Tech Note TN1305). AVEVA's ecosystem maps OPC DA quality codes to OPC UA status codes when communicating over OPC UA. The tables below list the OPC UA equivalents for the AVEVA-relevant quality states.
**Good Quality:**
| Symbolic Name | Status Code | AVEVA OPC DA Hex | AVEVA Description |
|---------------|-------------|------------------|-------------------|
| `Good` | `0x00000000` | `0x00C0` | Value is reliable, non-specific |
| `GoodLocalOverride` | `0x00D80000` | `0x00D8` | Value has been manually overridden; input disconnected |
**Uncertain Quality:**
| Symbolic Name | Status Code | AVEVA OPC DA Hex | AVEVA Description |
|---------------|-------------|------------------|-------------------|
| `UncertainLastUsableValue` | `0x40900000` | `0x0044` | External source stopped writing; value is stale |
| `UncertainSensorNotAccurate` | `0x42390000` | `0x0050` | Sensor out of calibration or clamped at limit |
| `UncertainEngineeringUnitsExceeded` | `0x40540000` | `0x0054` | Value is outside defined engineering limits |
| `UncertainSubNormal` | `0x40580000` | `0x0058` | Derived from multiple sources with insufficient good sources |
**Bad Quality:**
| Symbolic Name | Status Code | AVEVA OPC DA Hex | AVEVA Description |
|---------------|-------------|------------------|-------------------|
| `Bad` | `0x80000000` | `0x0000` | Non-specific bad; value is not useful |
| `BadConfigurationError` | `0x80040000` | `0x0004` | Server-specific configuration problem (e.g., item deleted) |
| `BadNotConnected` | `0x808A0000` | `0x0008` | Input not logically connected to a source |
| `BadDeviceFailure` | `0x806B0000` | `0x000C` | Device failure detected |
| `BadSensorFailure` | `0x806D0000` | `0x0010` | Sensor failure detected |
| `BadLastKnownValue` | `0x80050000` | `0x0014` | Communication failed; last known value available (check timestamp age) |
| `BadCommunicationFailure` | `0x80050000` | `0x0018` | Communication failed; no last known value available |
| `BadOutOfService` | `0x808F0000` | `0x001C` | Block is off-scan or locked; item/group is inactive |
**Notes:**
- AVEVA OPC DA quality codes are 16-bit words whose low byte carries the quality: 2 bits major (Good/Bad/Uncertain), 4 bits minor (sub-status), 2 bits limit (Not Limited, Low, High, Constant). The high byte is reserved for vendor use. The OPC UA status codes above are the UA-side equivalents this protocol uses.
- The limit bits (Not Limited `0x00`, Low Limited `0x01`, High Limited `0x02`, Constant `0x03`) occupy the two low-order bits and are OR-ed into any quality code. For example, `Good + High Limited` = `0x00C0 | 0x02` = `0x00C2` in OPC DA. In OPC UA, limits are conveyed via separate status code bits but the base code remains the same.
- AVEVA's "Initializing" state (seen when OI Server is still establishing communication) maps to `Bad` with no sub-code in OPC DA (`0x0000`). In OPC UA this is `BadWaitingForInitialData` (`0x80320000`).
- This is the minimum set needed to simulate realistic AVEVA System Platform behavior. Additional OPC UA codes can be added if specific simulation scenarios require them.
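To make the OPC DA bit layout in the notes above concrete, here is a small decoding sketch. It assumes the conventional `QQSSSSLL` arrangement of the low byte (two quality bits, four sub-status bits, two limit bits); `OpcDaQuality`, `Decode`, and `MajorName` are illustrative names, not part of LmxProxy.

```csharp
using System;

public static class OpcDaQuality
{
    // Splits the low byte of an OPC DA quality word into its three fields.
    // Assumed layout: bits 7-6 major quality, bits 5-2 sub-status, bits 1-0 limit.
    public static (int Major, int SubStatus, int Limit) Decode(ushort quality)
    {
        int low = quality & 0xFF;
        return ((low >> 6) & 0x3, (low >> 2) & 0xF, low & 0x3);
    }

    // Major quality: 0 = Bad, 1 = Uncertain, 3 = Good (2 is not assigned).
    public static string MajorName(int major) => major switch
    {
        0 => "Bad",
        1 => "Uncertain",
        3 => "Good",
        _ => "Reserved"
    };
}
```

For instance, `Decode(0x00C2)` yields major `3` (Good), sub-status `0`, and limit `2` (High Limited), matching the `Good + High Limited` example in the notes.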
**Category helper logic (C#):**
```csharp
public static string GetCategory(uint statusCode) => statusCode switch
{
_ when (statusCode & 0xC0000000) == 0x00000000 => "Good",
_ when (statusCode & 0xC0000000) == 0x40000000 => "Uncertain",
_ when (statusCode & 0xC0000000) == 0x80000000 => "Bad",
_ => "Unknown"
};
public static bool IsGood(uint statusCode) => (statusCode & 0xC0000000) == 0x00000000;
public static bool IsBad(uint statusCode) => (statusCode & 0xC0000000) == 0x80000000;
```
### 3.3 WriteBatchAndWait Flag Comparison
**What changed:** `flag_value` is now `TypedValue` instead of `string`. The server uses type-aware equality comparison instead of string comparison.
**v1 behavior (removed):**
```csharp
// v1: string comparison
bool matched = readResult.Value?.ToString() == request.FlagValue;
```
**v2 behavior:**
```csharp
// v2: type-aware comparison
bool matched = TypedValueEquals(readResult.TypedValue, request.FlagValue);
```
**Comparison rules:**
- Both values must have the same `oneof` case (same type). Mismatched types are never equal.
- Numeric comparison uses the native type's equality (no floating-point string round-trip issues).
- String comparison is case-sensitive (unchanged from v1).
- Bool comparison is direct equality.
- Null (unset `oneof`) equals null. Null does not equal any set value.
- Array comparison: element-by-element equality, same length required.
- `datetime_value` compared as `int64` equality (tick-level precision).
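The comparison rules above could be implemented along the following lines. This is a sketch under stated assumptions: `TypedValueSketch` and `ValueCase` are stand-ins for the proto-generated `TypedValue` class so that the example is self-contained, and the real `TypedValueEquals()` would switch on the generated oneof case instead.

```csharp
using System;
using System.Linq;

// Stand-in for the generated oneof case enum (assumption, for illustration).
public enum ValueCase { None, Bool, Int32, Int64, Float, Double, String, Bytes, DateTime, Array }

// Stand-in for the generated TypedValue: Case == None models an unset oneof.
public sealed class TypedValueSketch
{
    public ValueCase Case { get; }
    public object Value { get; }
    public TypedValueSketch(ValueCase valueCase, object value) { Case = valueCase; Value = value; }
}

public static class TypedValueComparer
{
    public static bool TypedValueEquals(TypedValueSketch a, TypedValueSketch b)
    {
        // A missing message is treated like an unset oneof (null).
        var caseA = a?.Case ?? ValueCase.None;
        var caseB = b?.Case ?? ValueCase.None;
        if (caseA != caseB) return false;            // mismatched types are never equal

        switch (caseA)
        {
            case ValueCase.None:
                return true;                          // null equals null
            case ValueCase.String:                    // case-sensitive, ordinal
                return string.Equals((string)a.Value, (string)b.Value, StringComparison.Ordinal);
            case ValueCase.Bytes:
                return ((byte[])a.Value).SequenceEqual((byte[])b.Value);
            case ValueCase.Array:
                return ArrayEquals((Array)a.Value, (Array)b.Value);
            default:
                // bool, numeric, and datetime ticks: boxed Equals on the native type.
                // Note: unlike ==, Equals treats NaN as equal to NaN.
                return a.Value.Equals(b.Value);
        }
    }

    // Same element type, same length, element-by-element equality.
    private static bool ArrayEquals(Array x, Array y)
    {
        if (x.GetType() != y.GetType() || x.Length != y.Length) return false;
        for (int i = 0; i < x.Length; i++)
            if (!object.Equals(x.GetValue(i), y.GetValue(i))) return false;
        return true;
    }
}
```

Checking the case first means an `int32` value of `1` never matches a `double` flag of `1.0`, which is the type-mismatch behavior the rules call for.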
---
## 4. Behavioral Changes
### 4.1 Read Operations
No RPC signature changes. The returned `VtqMessage` now uses `TypedValue` and `QualityCode` instead of strings.
**v1 client code:**
```csharp
var response = await client.ReadAsync(new ReadRequest { SessionId = sid, Tag = "Motor.Speed" });
double value = double.Parse(response.Vtq.Value); // string → double
bool isGood = response.Vtq.Quality.Equals("Good", ...); // string comparison
```
**v2 client code:**
```csharp
var response = await client.ReadAsync(new ReadRequest { SessionId = sid, Tag = "Motor.Speed" });
double value = response.Vtq.Value.DoubleValue; // direct typed access
bool isGood = IsGood(response.Vtq.Quality.StatusCode); // category check via the Section 3.2 helper
// note: StatusCode == 0x00000000 matches only plain Good and misses GoodLocalOverride
```
### 4.2 Write Operations
Client must construct `TypedValue` instead of converting to string.
**v1 client code:**
```csharp
await client.WriteAsync(new WriteRequest
{
SessionId = sid,
Tag = "Motor.Speed",
Value = 42.5.ToString() // double → string
});
```
**v2 client code:**
```csharp
await client.WriteAsync(new WriteRequest
{
SessionId = sid,
Tag = "Motor.Speed",
Value = new TypedValue { DoubleValue = 42.5 } // native type
});
```
### 4.3 Subscription Stream
No RPC signature changes. The streamed `VtqMessage` items now use the updated format. Client `onUpdate` callbacks receive typed values and structured quality.
### 4.4 Error Conditions with New Quality Codes
The server now returns specific quality codes instead of generic `"Bad"`:
| Scenario | v1 Quality | v2 Quality |
|----------|-----------|-----------|
| Tag not found | `"Bad"` | `BadConfigurationError` (`0x80040000`) |
| Tag read exception / comms loss | `"Bad"` | `BadCommunicationFailure` (`0x80050000`) |
| Write to read-only tag | `success=false` | WriteResult.success=false, message indicates read-only |
| Type mismatch on write | `success=false` | WriteResult.success=false, message indicates type mismatch |
| Simulated sensor failure | `"Bad"` | `BadSensorFailure` (`0x806D0000`) |
| Simulated device failure | `"Bad"` | `BadDeviceFailure` (`0x806B0000`) |
| Stale value (fault injection) | `"Uncertain"` | `UncertainLastUsableValue` (`0x40900000`) |
| Block off-scan / disabled | `"Bad"` | `BadOutOfService` (`0x808F0000`) |
| Local override active | `"Good"` | `GoodLocalOverride` (`0x00D80000`) |
| Initializing / waiting for first value | `"Bad"` | `BadWaitingForInitialData` (`0x80320000`) |
---
## 5. Migration Guide
### 5.1 Strategy
**Clean break** — all clients and servers are updated simultaneously in a single coordinated release. No backward compatibility layer, no version negotiation, no dual-format support.
This is appropriate because:
- LmxProxy is an internal protocol between ScadaLink components, not a public API
- The number of clients is small and controlled
- Maintaining dual formats adds complexity with no long-term benefit
### 5.2 Server-Side Changes
**Files to update:**
| File | Changes |
|------|---------|
| `scada.proto` | Replace with v2 proto (Section 2 of this document) |
| `ScadaServiceImpl.cs` | Update all RPC handlers to construct `TypedValue` and `QualityCode` instead of strings |
| `SessionManager.cs` | No changes (session model unchanged) |
| `TagMapper.cs` | Update to return `TypedValue` from tag reads instead of string conversion |
**Server implementation notes:**
- When reading a tag, construct `TypedValue` by setting the appropriate `oneof` field based on the tag's data type. Do not call `.ToString()`.
- When a tag read fails, return `QualityCode { StatusCode = 0x80050000, SymbolicName = "BadCommunicationFailure" }` (or a more specific code) instead of the string `"Bad"`.
- When handling writes, extract the value from the `TypedValue` oneof and apply it to the tag actor. If the `oneof` case doesn't match the tag's expected data type, return `WriteResult` with `success=false` and message indicating type mismatch.
- For `WriteBatchAndWait` flag comparison, implement `TypedValueEquals()` per the comparison rules in Section 3.3.
### 5.3 Client-Side Changes
**Files to update:**
| File | Changes |
|------|---------|
| `ILmxProxyClient.cs` | Interface unchanged (same method signatures, updated message types come from proto regeneration) |
| `RealLmxProxyClient.cs` | Update value construction in write methods; update value extraction in read callbacks |
| `LmxProxyDataConnection.cs` | Update DCL adapter to map between DCL's internal value model and `TypedValue`/`QualityCode` |
| `LmxProxyClientFactory.cs` | No changes |
**Client implementation notes:**
- Replace all `double.Parse(vtq.Value)` / `bool.Parse(vtq.Value)` calls with direct typed access (e.g., `vtq.Value.DoubleValue`).
- Replace all `vtq.Quality.Equals("Good", ...)` string comparisons with numeric status code checks or the `IsGood()`/`IsBad()` helpers.
- Replace all `.ToString()` value serialization in write paths with `TypedValue` construction.
- The `onUpdate` callback signature in `SubscribeAsync` doesn't change at the interface level, but the `VtqMessage` it receives now contains `TypedValue` and `QualityCode`.
### 5.4 Migration Checklist
```
[ ] Generate updated C# classes from v2 proto file
[ ] Update server: ScadaServiceImpl read handlers → TypedValue + QualityCode
[ ] Update server: ScadaServiceImpl write handlers → accept TypedValue
[ ] Update server: WriteBatchAndWait flag comparison → TypedValueEquals()
[ ] Update server: Error paths → specific QualityCode status codes
[ ] Update client: RealLmxProxyClient read paths → typed value extraction
[ ] Update client: RealLmxProxyClient write paths → TypedValue construction
[ ] Update client: Quality checks → numeric status code comparison
[ ] Update client: LmxProxyDataConnection DCL adapter → map TypedValue ↔ DCL values
[ ] Update all unit tests for new message shapes
[ ] Integration test: client ↔ server round-trip with all data types
[ ] Integration test: WriteBatchAndWait with typed flag comparison
[ ] Integration test: Subscription stream delivers typed VTQ messages
[ ] Integration test: Error paths return correct QualityCode sub-codes
[ ] Remove all string-based value parsing/serialization code
[ ] Remove all string-based quality comparison code
```
---
## 6. Test Scenarios for v2 Validation
These scenarios validate that the v2 protocol behaves correctly across all data types and quality codes.
### 6.1 Round-Trip Type Fidelity
For each supported data type, write a value via `Write`, read it back via `Read`, and verify the `TypedValue` oneof case and value match exactly:
| Data Type | Test Value | TypedValue Field | Verify |
|-----------|-----------|-----------------|--------|
| `bool` | `true` | `bool_value` | `== true` |
| `int32` | `2147483647` | `int32_value` | `== int.MaxValue` |
| `int64` | `9223372036854775807` | `int64_value` | `== long.MaxValue` |
| `float` | `3.14159f` | `float_value` | `== 3.14159f` (exact bits) |
| `double` | `2.718281828459045` | `double_value` | `== 2.718281828459045` (exact bits) |
| `string` | `"Hello World"` | `string_value` | `== "Hello World"` |
| `byte[]` | `[0x00, 0xFF, 0x42]` | `bytes_value` | byte-for-byte match |
| `DateTime` | `638789000000000000L` | `datetime_value` | `== 638789000000000000L` |
| `float[]` | `[1.0f, 2.0f, 3.0f]` | `array_value.float_values` | element-wise match |
| `int32[]` | `[10, 20, 30]` | `array_value.int32_values` | element-wise match |
| null | (unset) | no oneof case | `ValueCase == None` |
### 6.2 Quality Code Propagation
| Scenario | Trigger | Expected QualityCode |
|----------|---------|---------------------|
| Normal read | Read a healthy tag | `{ 0x00000000, "Good" }` |
| Local override | Script sets `GoodLocalOverride` | `{ 0x00D80000, "GoodLocalOverride" }` |
| Fault injection: sensor failure | Script sets `BadSensorFailure` | `{ 0x806D0000, "BadSensorFailure" }` |
| Fault injection: device failure | Script sets `BadDeviceFailure` | `{ 0x806B0000, "BadDeviceFailure" }` |
| Fault injection: stale value | Script sets `UncertainLastUsableValue` | `{ 0x40900000, "UncertainLastUsableValue" }` |
| Fault injection: off-scan | Script sets `BadOutOfService` | `{ 0x808F0000, "BadOutOfService" }` |
| Fault injection: comms failure | Script sets `BadCommunicationFailure` | `{ 0x80050000, "BadCommunicationFailure" }` |
| Unknown tag | Read nonexistent tag | `{ 0x80040000, "BadConfigurationError" }` |
| Write to read-only | Write to a read-only tag | WriteResult.success=false, message contains "read-only" |
### 6.3 WriteBatchAndWait Typed Flag Comparison
| Flag Type | Written Value | Flag Value | Expected Result |
|-----------|--------------|-----------|-----------------|
| `bool` | `true` | `TypedValue { bool_value = true }` | `flag_reached = true` |
| `bool` | `false` | `TypedValue { bool_value = true }` | `flag_reached = false` (timeout) |
| `double` | `42.5` | `TypedValue { double_value = 42.5 }` | `flag_reached = true` |
| `double` | `42.500001` | `TypedValue { double_value = 42.5 }` | `flag_reached = false` |
| `string` | `"DONE"` | `TypedValue { string_value = "DONE" }` | `flag_reached = true` |
| `string` | `"done"` | `TypedValue { string_value = "DONE" }` | `flag_reached = false` (case-sensitive) |
| `int32` | `1` | `TypedValue { double_value = 1.0 }` | `flag_reached = false` (type mismatch) |
### 6.4 Subscription Stream
- Subscribe to tags of mixed data types
- Verify each streamed `VtqMessage` has the correct `oneof` case matching the tag's data type
- Inject a fault mid-stream and verify the quality code changes from `Good` to the injected code
- Cancel the subscription and verify the stream terminates cleanly
---
## 7. Appendix: v1 → v2 Quick Reference
**Reading a value:**
```csharp
// v1
string raw = vtq.Value;
if (double.TryParse(raw, out var d)) { /* use d */ }
else if (bool.TryParse(raw, out var b)) { /* use b */ }
else { /* it's a string */ }
// v2
switch (vtq.Value.ValueCase)
{
case TypedValue.ValueOneofCase.DoubleValue:
double d = vtq.Value.DoubleValue;
break;
case TypedValue.ValueOneofCase.BoolValue:
bool b = vtq.Value.BoolValue;
break;
case TypedValue.ValueOneofCase.StringValue:
string s = vtq.Value.StringValue;
break;
case TypedValue.ValueOneofCase.None:
// null value
break;
// ... other cases
}
```
**Writing a value:**
```csharp
// v1
new WriteItem { Tag = "Motor.Speed", Value = 42.5.ToString() }
// v2
new WriteItem { Tag = "Motor.Speed", Value = new TypedValue { DoubleValue = 42.5 } }
```
**Checking quality:**
```csharp
// v1
bool isGood = vtq.Quality.Equals("Good", StringComparison.OrdinalIgnoreCase);
bool isBad = vtq.Quality.Equals("Bad", StringComparison.OrdinalIgnoreCase);
// v2
bool isGood = (vtq.Quality.StatusCode & 0xC0000000) == 0x00000000;
bool isBad = (vtq.Quality.StatusCode & 0xC0000000) == 0x80000000;
// or use helper:
bool isGood = QualityHelper.IsGood(vtq.Quality.StatusCode);
```
**Constructing quality (server-side):**
```csharp
// v1
vtq.Quality = "Good";
// v2
vtq.Quality = new QualityCode { StatusCode = 0x00000000, SymbolicName = "Good" };
// or for errors:
vtq.Quality = new QualityCode { StatusCode = 0x806D0000, SymbolicName = "BadSensorFailure" };
vtq.Quality = new QualityCode { StatusCode = 0x80050000, SymbolicName = "BadCommunicationFailure" };
vtq.Quality = new QualityCode { StatusCode = 0x00D80000, SymbolicName = "GoodLocalOverride" };
```
---
*Document version: 1.0 — All decisions resolved. Complete proto, migration guide, and test scenarios.*

@@ -0,0 +1,210 @@
# LmxProxy v2 Rebuild — Design Document
**Date**: 2026-03-21
**Status**: Approved
**Scope**: Complete rebuild of LmxProxy Host and Client with v2 protocol
## 1. Overview
Rebuild the LmxProxy gRPC proxy service from scratch, implementing the v2 protocol (TypedValue + QualityCode) as defined in `docs/lmxproxy_updates.md`. The existing code in `src/` is retained as reference only. No backward compatibility with v1.
## 2. Key Design Decisions
| Decision | Choice | Rationale |
|----------|--------|-----------|
| gRPC server for Host | Grpc.Core (C-core) | Only option for .NET Framework 4.8 server-side |
| Service hosting | Topshelf | Proven, already deployed, simple install/uninstall |
| Protocol version | v2 only, clean break | Small controlled client count, no value in v1 compat |
| Shared code between projects | None — fully independent | Different .NET runtimes (.NET Fx 4.8 vs .NET 10), wire compat is the contract |
| Client retry library | Polly v8+ | Building fresh on .NET 10, modern API |
| Testing strategy | Unit tests during implementation, integration tests after Client functional | Phased approach, real hardware validation on windev |
## 3. Architecture
### 3.1 Host (.NET Framework 4.8, x86)
```
Program.cs (Topshelf entry point)
└── LmxProxyService (lifecycle manager)
├── Configuration (appsettings.json binding + validation)
├── MxAccessClient (COM interop, STA dispatch thread)
│ ├── Connection state machine
│ ├── Read/Write with semaphore concurrency
│ ├── Subscription storage for reconnect replay
│ └── Auto-reconnect loop (5s interval)
├── SessionManager (ConcurrentDictionary, 5-min inactivity scavenging)
├── SubscriptionManager (per-client channels, shared MxAccess subscriptions)
├── ApiKeyService (JSON file, FileSystemWatcher hot-reload)
├── ScadaGrpcService (proto-generated, all 10 RPCs)
│ └── ApiKeyInterceptor (x-api-key header enforcement)
├── PerformanceMetrics (per-op tracking, p95, 60s log)
├── HealthCheckService (basic + detailed with test tag)
└── StatusWebServer (HTML dashboard, JSON status, health endpoint)
```
### 3.2 Client (.NET 10, AnyCPU)
```
ILmxProxyClient (public interface)
└── LmxProxyClient (partial class)
├── Connection (GrpcChannel, protobuf-net.Grpc, 30s keep-alive)
├── Read/Write/Subscribe operations
├── CodeFirstSubscription (IAsyncEnumerable streaming)
├── ClientMetrics (p95/p99, 1000-sample buffer)
└── Disposal (session disconnect, channel cleanup)
LmxProxyClientBuilder (fluent builder, Polly v8 resilience pipeline)
ILmxProxyClientFactory + LmxProxyClientFactory (config-based creation)
ServiceCollectionExtensions (DI registrations)
StreamingExtensions (batched reads/writes, parallel processing)
Domain/
├── ScadaContracts.cs (IScadaService + all DataContract messages)
├── Quality.cs, QualityExtensions.cs
├── Vtq.cs
└── ConnectionState.cs
```
### 3.3 Wire Compatibility
The `.proto` file is the single source of truth for the wire format. Host generates server stubs from it. Client implements code-first contracts (`[DataContract]`/`[ServiceContract]`) that mirror the proto exactly — same field numbers, names, nesting, and streaming shapes. Cross-stack serialization tests verify compatibility.
## 4. Protocol (v2)
### 4.1 TypedValue System
Protobuf `oneof` carrying native types:
| Case | Proto Type | .NET Type |
|------|-----------|-----------|
| bool_value | bool | bool |
| int32_value | int32 | int |
| int64_value | int64 | long |
| float_value | float | float |
| double_value | double | double |
| string_value | string | string |
| bytes_value | bytes | byte[] |
| datetime_value | int64 (UTC Ticks) | DateTime |
| array_value | ArrayValue | typed arrays |
Unset `oneof` = null. No string serialization heuristics.
### 4.2 COM Variant Coercion Table
| COM Variant Type | TypedValue Case | Notes |
|-----------------|-----------------|-------|
| VT_BOOL | bool_value | |
| VT_I2 (short) | int32_value | Widened |
| VT_I4 (int) | int32_value | |
| VT_I8 (long) | int64_value | |
| VT_UI2 (ushort) | int32_value | Widened |
| VT_UI4 (uint) | int64_value | Widened to avoid sign issues |
| VT_UI8 (ulong) | int64_value | Overflow risk logged if > long.MaxValue |
| VT_R4 (float) | float_value | |
| VT_R8 (double) | double_value | |
| VT_BSTR (string) | string_value | |
| VT_DATE (DateTime) | datetime_value | Converted to UTC Ticks |
| VT_DECIMAL | double_value | Precision loss logged |
| VT_CY (Currency) | double_value | |
| VT_NULL, VT_EMPTY, DBNull | unset oneof | Represents null |
| VT_ARRAY | array_value | Element type determines ArrayValue field |
| VT_UNKNOWN | string_value | ToString() fallback, logged as warning |
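The coercion table can be read as a type switch over the boxed value that COM interop hands back. The sketch below is illustrative only: `VariantCoercion.Coerce` is a hypothetical helper that returns the target oneof field name and the widened payload instead of building the generated `TypedValue`, and it assumes the usual interop marshaling (for example, `VT_CY` and `VT_DECIMAL` both arriving as `decimal`). Logging and per-element array handling are left out.

```csharp
using System;

public static class VariantCoercion
{
    // Returns (oneof field name, payload) for a value marshaled from a COM VARIANT.
    public static (string Case, object Value) Coerce(object comValue)
    {
        switch (comValue)
        {
            case null: return ("unset", null);                 // VT_NULL / VT_EMPTY
            case DBNull _: return ("unset", null);
            case bool b: return ("bool_value", b);
            case short s: return ("int32_value", (int)s);      // widened
            case ushort us: return ("int32_value", (int)us);   // widened
            case int i: return ("int32_value", i);
            case uint ui: return ("int64_value", (long)ui);    // widened, avoids sign issues
            case long l: return ("int64_value", l);
            case ulong ul:                                     // wraps above long.MaxValue;
                return ("int64_value", unchecked((long)ul));   // the caller is expected to log
            case float f: return ("float_value", f);
            case double d: return ("double_value", d);
            case decimal m: return ("double_value", (double)m); // precision loss possible
            case string str: return ("string_value", str);
            case DateTime dt: return ("datetime_value", dt.ToUniversalTime().Ticks);
            case Array a: return ("array_value", a);           // element mapping omitted here
            default: return ("string_value", comValue.ToString()); // fallback, log a warning
        }
    }
}
```

The specific cases are ordered before the `Array` and default cases so that the fallback only fires for types the table does not list.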
### 4.3 QualityCode System
`status_code` (uint32, OPC UA-compatible) is canonical. `symbolic_name` is derived from a lookup table, never set independently.
Category derived from high bits:
- `0x00xxxxxx` = Good
- `0x40xxxxxx` = Uncertain
- `0x80xxxxxx` = Bad
Domain `Quality` enum uses byte values for the low-order byte, with extension methods `IsGood()`, `IsBad()`, `IsUncertain()`.
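One way to keep `symbolic_name` strictly derived from `status_code` is a single lookup with a category fallback, sketched below. The class and method names are hypothetical, and the numeric values are copied from the v2 protocol document's tables rather than checked here against the OPC UA specification.

```csharp
using System.Collections.Generic;

public static class QualityNames
{
    // Status code -> symbolic name, using the values listed in the v2 protocol document.
    private static readonly Dictionary<uint, string> Names = new Dictionary<uint, string>
    {
        [0x00000000] = "Good",
        [0x00D80000] = "GoodLocalOverride",
        [0x40900000] = "UncertainLastUsableValue",
        [0x80000000] = "Bad",
        [0x80040000] = "BadConfigurationError",
        [0x80050000] = "BadCommunicationFailure",
        [0x806B0000] = "BadDeviceFailure",
        [0x806D0000] = "BadSensorFailure",
        [0x808F0000] = "BadOutOfService",
        [0x80320000] = "BadWaitingForInitialData",
    };

    // Known codes resolve to their name; unknown codes fall back to the
    // category given by the two high-order bits.
    public static string Resolve(uint statusCode)
    {
        if (Names.TryGetValue(statusCode, out var name)) return name;
        switch (statusCode & 0xC0000000)
        {
            case 0x00000000: return "Good";
            case 0x40000000: return "Uncertain";
            case 0x80000000: return "Bad";
            default: return "Unknown";
        }
    }
}
```

Building `QualityCode` only through a function like this means the two fields cannot drift apart, since callers never supply the name themselves.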
### 4.4 Error Model
| Error Type | Mechanism | Examples |
|-----------|-----------|----------|
| Infrastructure | gRPC StatusCode | Unauthenticated (bad API key), PermissionDenied (ReadOnly write), InvalidArgument (bad session), Unavailable (MxAccess down) |
| Business outcome | Payload `success`/`message` fields | Tag read failure, write type mismatch, batch partial failure, WriteBatchAndWait flag timeout |
| Subscription | gRPC StatusCode on stream | Unauthenticated (invalid session), Internal (unexpected error) |
## 5. COM Threading Model
MxAccess is an STA COM component. All COM operations execute on a **dedicated STA thread** with a `BlockingCollection<Action>` dispatch queue:
- `MxAccessClient` creates a single STA thread at construction
- All COM calls (connect, read, write, subscribe, disconnect) are dispatched to this thread via the queue
- Callers await a `TaskCompletionSource<T>` that the STA thread completes after the COM call
- The STA thread runs a message pump loop (`Application.Run` or manual `MSG` pump)
- On disposal, a sentinel is enqueued and the thread joins with a 10-second timeout
This replaces the fragile `Task.Run` + `SemaphoreSlim` pattern in the reference code.
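The dispatch pattern described in this section might look roughly like the following. It is a simplified sketch, not the project's `MxAccessClient`: `StaDispatcher` is a hypothetical name, the message pump is omitted (the loop only drains the queue), and `CompleteAdding()` stands in for the sentinel. `TrySetApartmentState` is used so that the sketch also runs where STA is unavailable.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class StaDispatcher : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public StaDispatcher()
    {
        _thread = new Thread(Run) { IsBackground = true, Name = "MxAccess-STA" };
        // Returns false instead of throwing on platforms without STA support.
        _thread.TrySetApartmentState(ApartmentState.STA);
        _thread.Start();
    }

    // Queues a call for the dedicated thread; the returned task completes
    // with the call's result or its exception.
    public Task<T> InvokeAsync<T>(Func<T> comCall)
    {
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        _queue.Add(() =>
        {
            try { tcs.SetResult(comCall()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        return tcs.Task;
    }

    private void Run()
    {
        // Ends once CompleteAdding() has been called and the queue is drained.
        foreach (var work in _queue.GetConsumingEnumerable())
            work();
    }

    public void Dispose()
    {
        _queue.CompleteAdding();                   // plays the role of the sentinel
        _thread.Join(TimeSpan.FromSeconds(10));    // 10-second join timeout, as in the design
    }
}
```

`RunContinuationsAsynchronously` keeps caller continuations off the dispatch thread, so an awaiting caller cannot accidentally run its own code on the thread reserved for COM calls.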
## 6. Session Lifecycle
- Sessions created on `Connect` with GUID "N" format (32-char hex)
- Tracked in `ConcurrentDictionary<string, SessionInfo>`
- **Inactivity scavenging**: sessions not accessed for 5 minutes are automatically terminated. Client keep-alive pings (30s) keep legitimate sessions alive.
- On termination: subscriptions cleaned up, session removed from dictionary
- All sessions lost on service restart (in-memory only)
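The lifecycle above can be sketched with a small in-memory table. This is an illustration under assumptions, not the actual `SessionManager`: `SessionTable` and its members are hypothetical, the clock is passed in explicitly to keep the example deterministic, and subscription cleanup on termination is not shown.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class SessionTable
{
    private sealed class Entry { public long LastSeenTicks; }

    private readonly ConcurrentDictionary<string, Entry> _sessions =
        new ConcurrentDictionary<string, Entry>();
    private readonly TimeSpan _timeout;

    public SessionTable(TimeSpan timeout) { _timeout = timeout; }

    // Creates a session whose ID is a GUID in "N" format (32 hex characters).
    public string Create(DateTime nowUtc)
    {
        string id = Guid.NewGuid().ToString("N");
        _sessions[id] = new Entry { LastSeenTicks = nowUtc.Ticks };
        return id;
    }

    // Records activity; returns false if the session is unknown or was scavenged.
    public bool Touch(string id, DateTime nowUtc)
    {
        if (!_sessions.TryGetValue(id, out var entry)) return false;
        Interlocked.Exchange(ref entry.LastSeenTicks, nowUtc.Ticks);
        return true;
    }

    // Removes sessions idle for longer than the timeout; returns how many were removed.
    public int Scavenge(DateTime nowUtc)
    {
        int removed = 0;
        foreach (var pair in _sessions)
        {
            long idle = nowUtc.Ticks - Interlocked.Read(ref pair.Value.LastSeenTicks);
            if (idle > _timeout.Ticks && _sessions.TryRemove(pair.Key, out _))
                removed++;
        }
        return removed;
    }
}
```

With a 5-minute timeout, a client that touches its session every 30 seconds is never scavenged, while a session that goes quiet is dropped on the first sweep after the timeout passes.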
## 7. Subscription Semantics
- **Shared MxAccess subscriptions**: first client to subscribe creates the underlying MxAccess subscription. Last to unsubscribe disposes it. Ref-counted.
- **Sampling rate**: when multiple clients subscribe to the same tag with different `sampling_ms`, the fastest (lowest non-zero) rate is used for the MxAccess subscription. All clients receive updates at this rate.
- **Per-client channels**: each client gets an independent `BoundedChannel<VtqMessage>` (capacity 1000, DropOldest). One slow consumer's drops do not affect other clients.
- **MxAccess disconnect**: all subscribed clients receive a bad-quality notification for all their subscribed tags.
- **Session termination**: all subscriptions for that session are cleaned up.
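Two of the rules above, the sampling-rate selection and the per-client drop policy, can be sketched as follows. The helper names are hypothetical, treating `0` as "no preference" is an assumption, and on .NET Framework the `System.Threading.Channels` types come from a NuGet package.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Channels;

// Hypothetical helpers illustrating the subscription semantics.
public static class SubscriptionSketch
{
    // The fastest (lowest non-zero) requested rate wins.
    // Returns 0 when no client asked for a specific rate (assumed meaning of 0).
    public static int EffectiveSamplingMs(IEnumerable<int> requestedMs)
    {
        var nonZero = requestedMs.Where(ms => ms > 0).ToList();
        return nonZero.Count == 0 ? 0 : nonZero.Min();
    }

    // One bounded channel per client: when full, the oldest item is dropped,
    // so a slow consumer loses stale updates instead of blocking the producer.
    public static Channel<T> CreateClientChannel<T>(int capacity = 1000)
    {
        return Channel.CreateBounded<T>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.DropOldest
        });
    }
}
```

With `DropOldest`, `TryWrite` always succeeds, which is what lets the MxAccess callback fan out to every client channel without waiting on any of them.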
## 8. Authentication
- `x-api-key` gRPC metadata header is the authoritative authentication mechanism
- `ConnectRequest.api_key` is accepted but the interceptor is the enforcement point
- API keys loaded from JSON file with FileSystemWatcher hot-reload (1-second debounce)
- Auto-generates default file with two random keys (ReadOnly + ReadWrite) if missing
- Write-protected RPCs: Write, WriteBatch, WriteBatchAndWait
## 9. Phasing
| Phase | Scope | Depends On |
|-------|-------|------------|
| 1 | Protocol & Domain Types | — |
| 2 | Host Core (MxAccessClient, SessionManager, SubscriptionManager) | Phase 1 |
| 3 | Host gRPC Server, Security, Configuration, Service Hosting | Phase 2 |
| 4 | Host Health, Metrics, Status Server | Phase 3 |
| 5 | Client Core | Phase 1 |
| 6 | Client Extras (Builder, Factory, DI, Streaming) | Phase 5 |
| 7 | Integration Tests & Deployment | Phases 4 + 6 |
Phases 2-4 (Host) and 5-6 (Client) can proceed in parallel after Phase 1.
## 10. Guardrails
1. **Proto is the source of truth** — any wire format question is resolved by reading `scada.proto`, not the code-first contracts.
2. **No v1 code in the new build** — reference only. Do not copy-paste and modify; write fresh.
3. **Cross-stack tests in Phase 1** — Host proto serialize → Client code-first deserialize (and vice versa) before any business logic.
4. **COM calls only on STA thread** — no `Task.Run` for COM operations. All go through the dispatch queue.
5. **`status_code` is canonical for quality**: `symbolic_name` is always derived, never independently set.
6. **Unit tests before integration** — every phase includes unit tests. Integration tests are Phase 7 only.
7. **Each phase must compile and pass tests** before the next phase begins.
8. **No string serialization heuristics** — v2 uses native TypedValue. No `double.TryParse` or `bool.TryParse` on values.
## 11. Resolved Conflicts
| Conflict | Resolution |
|----------|-----------|
| WriteBatchAndWait signature (MxAccessClient vs Protocol) | Follow Protocol spec: write items, poll flagTag for flagValue. IScadaClient interface matches protocol semantics. |
| Builder default port 5050 vs Host 50051 | Standardize builder default to 50051 |
| Auth in metadata vs payload | x-api-key header is authoritative; ConnectRequest.api_key accepted but interceptor enforces |
## 12. Reference Code
The existing v1 code (formerly in `src/`) is kept as `src-reference/` for consultation:
- `src-reference/ZB.MOM.WW.LmxProxy.Host/` — v1 Host implementation
- `src-reference/ZB.MOM.WW.LmxProxy.Client/` — v1 Client implementation
Key reference files for COM interop patterns:
- `Implementation/MxAccessClient.Connection.cs` — COM object lifecycle
- `Implementation/MxAccessClient.EventHandlers.cs` — MxAccess callbacks
- `Implementation/MxAccessClient.Subscription.cs` — Advise/Unadvise patterns

@@ -0,0 +1,673 @@
# Gap 1 & Gap 2: Active Health Probing + Subscription Handle Cleanup
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.
**Goal:** Fix two reconnect-related gaps: (1) the monitor loop cannot detect a silently-dead MxAccess connection, and (2) SubscriptionManager holds stale IAsyncDisposable handles after reconnect.
**Architecture:** Add a domain-level connection probe to `MxAccessClient` that classifies results as Healthy/TransportFailure/DataDegraded. The monitor loop uses this to decide reconnect vs degrade-and-backoff. Separately, remove `SubscriptionManager._mxAccessHandles` entirely and switch to address-based unsubscribe through `IScadaClient`, making `MxAccessClient` the sole owner of COM subscription lifecycle.
**Tech Stack:** .NET Framework 4.8, C#, MxAccess COM interop, Serilog
---
## Task 0: Add `ProbeResult` domain type
**Files:**
- Create: `src/ZB.MOM.WW.LmxProxy.Host/Domain/ProbeResult.cs`
**Step 1: Create the ProbeResult type**
```csharp
using System;

namespace ZB.MOM.WW.LmxProxy.Host.Domain
{
    public enum ProbeStatus
    {
        Healthy,
        TransportFailure,
        DataDegraded
    }

    public sealed class ProbeResult
    {
        public ProbeStatus Status { get; }
        public Quality? Quality { get; }
        public DateTime? Timestamp { get; }
        public string? Message { get; }
        public Exception? Exception { get; }

        private ProbeResult(ProbeStatus status, Quality? quality, DateTime? timestamp,
            string? message, Exception? exception)
        {
            Status = status;
            Quality = quality;
            Timestamp = timestamp;
            Message = message;
            Exception = exception;
        }

        public static ProbeResult Healthy(Quality quality, DateTime timestamp)
            => new ProbeResult(ProbeStatus.Healthy, quality, timestamp, null, null);

        public static ProbeResult Degraded(Quality quality, DateTime timestamp, string message)
            => new ProbeResult(ProbeStatus.DataDegraded, quality, timestamp, message, null);

        public static ProbeResult TransportFailed(string message, Exception? ex = null)
            => new ProbeResult(ProbeStatus.TransportFailure, null, null, message, ex);
    }
}
```
**Step 2: Commit**
```bash
git add src/ZB.MOM.WW.LmxProxy.Host/Domain/ProbeResult.cs
git commit -m "feat: add ProbeResult domain type for connection health classification"
```
---
## Task 1: Add `ProbeConnectionAsync` to `MxAccessClient`
**Files:**
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/Domain/IScadaClient.cs` — add `ProbeConnectionAsync` to interface
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.Connection.cs` — implement probe method
**Step 1: Add to IScadaClient interface**
In `IScadaClient.cs`, add after the `DisconnectAsync` method:
```csharp
/// <summary>
/// Probes connection health by reading a test tag.
/// Returns a classified result: Healthy, TransportFailure, or DataDegraded.
/// </summary>
Task<ProbeResult> ProbeConnectionAsync(string testTagAddress, int timeoutMs, CancellationToken ct = default);
```
**Step 2: Implement in MxAccessClient.Connection.cs**
Add before `MonitorConnectionAsync`:
```csharp
/// <summary>
/// Probes the connection by reading a test tag with a timeout.
/// Classifies the result as transport failure vs data degraded.
/// </summary>
public async Task<ProbeResult> ProbeConnectionAsync(string testTagAddress, int timeoutMs,
    CancellationToken ct = default)
{
    if (!IsConnected)
        return ProbeResult.TransportFailed("Not connected");

    // Good-quality data older than this is reported as degraded rather than healthy.
    var maxDataAge = TimeSpan.FromMinutes(5);

    try
    {
        using (var cts = CancellationTokenSource.CreateLinkedTokenSource(ct))
        {
            cts.CancelAfter(timeoutMs);

            Vtq vtq;
            try
            {
                vtq = await ReadAsync(testTagAddress, cts.Token);
            }
            catch (OperationCanceledException) when (!ct.IsCancellationRequested)
            {
                // Our timeout fired, not the caller's: treat as transport failure
                return ProbeResult.TransportFailed("Probe read timed out after " + timeoutMs + "ms");
            }

            // Qualities that indicate the COM transport itself is down
            if (vtq.Quality == Domain.Quality.Bad_NotConnected ||
                vtq.Quality == Domain.Quality.Bad_CommFailure)
            {
                return ProbeResult.TransportFailed("Probe returned " + vtq.Quality);
            }

            // Any other non-good quality: session alive, data unusable
            if (!vtq.Quality.IsGood())
            {
                return ProbeResult.Degraded(vtq.Quality, vtq.Timestamp,
                    "Probe quality: " + vtq.Quality);
            }

            if (DateTime.UtcNow - vtq.Timestamp > maxDataAge)
            {
                return ProbeResult.Degraded(vtq.Quality, vtq.Timestamp,
                    "Probe data stale (>" + maxDataAge.TotalMinutes + "min)");
            }

            return ProbeResult.Healthy(vtq.Quality, vtq.Timestamp);
        }
    }
    catch (System.Runtime.InteropServices.COMException ex)
    {
        return ProbeResult.TransportFailed("COM exception: " + ex.Message, ex);
    }
    catch (InvalidOperationException ex) when (ex.Message.Contains("Not connected"))
    {
        return ProbeResult.TransportFailed(ex.Message, ex);
    }
    catch (Exception ex)
    {
        return ProbeResult.TransportFailed("Probe failed: " + ex.Message, ex);
    }
}
```
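For unit testing, the classification rules in the method above could be isolated into a pure function along these lines. This is a sketch under assumptions: `ProbeOutcome` and `ProbeClassifier` are hypothetical names, and the inputs are reduced to plain booleans and timestamps so no MxAccess types are needed.

```csharp
using System;

public enum ProbeOutcome { Healthy, TransportFailure, DataDegraded }

// Hypothetical pure function mirroring the probe's decision order.
public static class ProbeClassifier
{
    public static ProbeOutcome Classify(
        bool timedOut,             // the probe's own timeout fired
        bool transportBadQuality,  // e.g. Bad_NotConnected or Bad_CommFailure
        bool isGood,               // quality is in the Good category
        DateTime timestampUtc,
        DateTime nowUtc,
        TimeSpan maxDataAge)
    {
        // Transport problems take precedence over everything else.
        if (timedOut || transportBadQuality) return ProbeOutcome.TransportFailure;
        // The session answered, but the value is not usable.
        if (!isGood) return ProbeOutcome.DataDegraded;
        // Good quality with an old timestamp is treated as stale data.
        if (nowUtc - timestampUtc > maxDataAge) return ProbeOutcome.DataDegraded;
        return ProbeOutcome.Healthy;
    }
}
```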
**Step 3: Commit**
```bash
git add src/ZB.MOM.WW.LmxProxy.Host/Domain/IScadaClient.cs
git add src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.Connection.cs
git commit -m "feat: add ProbeConnectionAsync to MxAccessClient for active health probing"
```
---
## Task 2: Add health check configuration
**Files:**
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/Configuration/LmxProxyConfiguration.cs` — add HealthCheckConfiguration class and property
**Step 1: Add HealthCheckConfiguration**
Add a new class in the Configuration namespace (can be in the same file or a new file — keep it simple, same file):
```csharp
/// <summary>Health check / probe configuration.</summary>
public class HealthCheckConfiguration
{
/// <summary>Tag address to probe for connection liveness. Default: TestChildObject.TestBool.</summary>
public string TestTagAddress { get; set; } = "TestChildObject.TestBool";
/// <summary>Probe timeout in milliseconds. Default: 5000.</summary>
public int ProbeTimeoutMs { get; set; } = 5000;
/// <summary>Consecutive transport failures before forced reconnect. Default: 3.</summary>
public int MaxConsecutiveTransportFailures { get; set; } = 3;
/// <summary>Probe interval while in degraded state (ms). Default: 30000 (30s).</summary>
public int DegradedProbeIntervalMs { get; set; } = 30000;
}
```
Add to `LmxProxyConfiguration`:
```csharp
/// <summary>Health check / active probe settings.</summary>
public HealthCheckConfiguration HealthCheck { get; set; } = new HealthCheckConfiguration();
```
**Step 2: Add to appsettings.json**
In the existing `appsettings.json`, add the `HealthCheck` section:
```json
"HealthCheck": {
"TestTagAddress": "TestChildObject.TestBool",
"ProbeTimeoutMs": 5000,
"MaxConsecutiveTransportFailures": 3,
"DegradedProbeIntervalMs": 30000
}
```
**Step 3: Commit**
```bash
git add src/ZB.MOM.WW.LmxProxy.Host/Configuration/LmxProxyConfiguration.cs
git add src/ZB.MOM.WW.LmxProxy.Host/appsettings.json
git commit -m "feat: add HealthCheck configuration section for active connection probing"
```
---
## Task 3: Rewrite `MonitorConnectionAsync` with active probing
**Files:**
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.cs` — add probe state fields
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.Connection.cs` — rewrite monitor loop
The monitor needs configuration passed in. The simplest approach: add constructor parameters for the probe settings alongside the existing ones.
**Step 1: Add probe fields to MxAccessClient.cs**
Add fields after the existing reconnect fields (around line 42):
```csharp
// Probe configuration
private readonly string? _probeTestTagAddress;
private readonly int _probeTimeoutMs;
private readonly int _maxConsecutiveTransportFailures;
private readonly int _degradedProbeIntervalMs;
// Probe state
private int _consecutiveTransportFailures;
private bool _isDegraded;
```
Add the four probe parameters to the end of the constructor signature:
```csharp
public MxAccessClient(
    int maxConcurrentOperations = 10,
    int readTimeoutSeconds = 5,
    int writeTimeoutSeconds = 5,
    int monitorIntervalSeconds = 5,
    bool autoReconnect = true,
    string? nodeName = null,
    string? galaxyName = null,
    string? probeTestTagAddress = null,
    int probeTimeoutMs = 5000,
    int maxConsecutiveTransportFailures = 3,
    int degradedProbeIntervalMs = 30000)
```
Then, in the constructor body, add the assignments after the existing `_galaxyName = galaxyName;` line:
```csharp
_probeTestTagAddress = probeTestTagAddress;
_probeTimeoutMs = probeTimeoutMs;
_maxConsecutiveTransportFailures = maxConsecutiveTransportFailures;
_degradedProbeIntervalMs = degradedProbeIntervalMs;
```
**Step 2: Rewrite MonitorConnectionAsync in MxAccessClient.Connection.cs**
Replace the existing `MonitorConnectionAsync` (lines 177-213) with:
```csharp
/// <summary>
/// Auto-reconnect monitor loop with active health probing.
/// - If IsConnected is false: immediate reconnect (existing behavior).
/// - If IsConnected is true and probe configured: read test tag each interval.
/// - TransportFailure for N consecutive probes → forced disconnect + reconnect.
/// - DataDegraded → stay connected, back off probe interval, report degraded.
/// - Healthy → reset counters and resume normal interval.
/// </summary>
private async Task MonitorConnectionAsync(CancellationToken ct)
{
Log.Information("Connection monitor loop started (interval={IntervalMs}ms, probe={ProbeEnabled})",
_monitorIntervalMs, _probeTestTagAddress != null);
while (!ct.IsCancellationRequested)
{
var interval = _isDegraded ? _degradedProbeIntervalMs : _monitorIntervalMs;
try
{
await Task.Delay(interval, ct);
}
catch (OperationCanceledException)
{
break;
}
// ── Case 1: Already disconnected ──
if (!IsConnected)
{
_isDegraded = false;
_consecutiveTransportFailures = 0;
await AttemptReconnectAsync(ct);
continue;
}
// ── Case 2: Connected, no probe configured — legacy behavior ──
if (_probeTestTagAddress == null)
continue;
// ── Case 3: Connected, probe configured — active health check ──
var probe = await ProbeConnectionAsync(_probeTestTagAddress, _probeTimeoutMs, ct);
switch (probe.Status)
{
case ProbeStatus.Healthy:
if (_isDegraded)
{
Log.Information("Probe healthy — exiting degraded mode");
_isDegraded = false;
}
_consecutiveTransportFailures = 0;
break;
case ProbeStatus.DataDegraded:
_consecutiveTransportFailures = 0;
if (!_isDegraded)
{
Log.Warning("Probe degraded: {Message} — entering degraded mode (probe interval {IntervalMs}ms)",
probe.Message, _degradedProbeIntervalMs);
_isDegraded = true;
}
break;
case ProbeStatus.TransportFailure:
_isDegraded = false;
_consecutiveTransportFailures++;
Log.Warning("Probe transport failure ({Count}/{Max}): {Message}",
_consecutiveTransportFailures, _maxConsecutiveTransportFailures, probe.Message);
if (_consecutiveTransportFailures >= _maxConsecutiveTransportFailures)
{
Log.Warning("Max consecutive transport failures reached — forcing reconnect");
_consecutiveTransportFailures = 0;
try
{
await DisconnectAsync(ct);
}
catch (Exception ex)
{
Log.Warning(ex, "Error during forced disconnect before reconnect");
// DisconnectAsync already calls CleanupComObjectsAsync on error path
}
await AttemptReconnectAsync(ct);
}
break;
}
}
Log.Information("Connection monitor loop exited");
}
private async Task AttemptReconnectAsync(CancellationToken ct)
{
Log.Information("Attempting reconnect...");
SetState(ConnectionState.Reconnecting);
try
{
await ConnectAsync(ct);
Log.Information("Reconnected to MxAccess successfully");
}
catch (OperationCanceledException)
{
// Let the outer loop handle cancellation
}
catch (Exception ex)
{
Log.Warning(ex, "Reconnect attempt failed, will retry at next interval");
}
}
```
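The counter and degraded-flag transitions in the loop above are easy to get subtly wrong, so one option is to mirror them in a small state holder that can be tested without COM. This is an illustrative sketch; `ProbeStateTracker` is a hypothetical name and is not part of the plan.

```csharp
public enum ProbeStatus { Healthy, TransportFailure, DataDegraded }

// Hypothetical helper: same transitions as the monitor loop's switch statement.
public sealed class ProbeStateTracker
{
    private readonly int _maxConsecutiveTransportFailures;

    public int ConsecutiveTransportFailures { get; private set; }
    public bool IsDegraded { get; private set; }

    public ProbeStateTracker(int maxConsecutiveTransportFailures)
    {
        _maxConsecutiveTransportFailures = maxConsecutiveTransportFailures;
    }

    // Returns true when the caller should force a disconnect and reconnect.
    public bool Observe(ProbeStatus status)
    {
        switch (status)
        {
            case ProbeStatus.Healthy:
                IsDegraded = false;
                ConsecutiveTransportFailures = 0;
                return false;

            case ProbeStatus.DataDegraded:
                // Degraded data is not a transport problem, so the failure streak resets.
                IsDegraded = true;
                ConsecutiveTransportFailures = 0;
                return false;

            default: // ProbeStatus.TransportFailure
                IsDegraded = false;
                ConsecutiveTransportFailures++;
                if (ConsecutiveTransportFailures < _maxConsecutiveTransportFailures)
                    return false;
                ConsecutiveTransportFailures = 0; // start a fresh streak after reconnect
                return true;
        }
    }
}
```

Note that a `DataDegraded` result between two transport failures resets the streak, so only uninterrupted failures trigger a reconnect.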
**Step 3: Commit**
```bash
git add src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.cs
git add src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.Connection.cs
git commit -m "feat: rewrite monitor loop with active probing, transport vs degraded classification"
```
---
## Task 4: Wire probe config through `LmxProxyService.Start()`
**Files:**
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/LmxProxyService.cs` — pass HealthCheck config to MxAccessClient constructor
**Step 1: Update MxAccessClient construction**
In `LmxProxyService.Start()`, update the MxAccessClient creation (around line 62) to pass the new parameters:
```csharp
_mxAccessClient = new MxAccessClient(
maxConcurrentOperations: _config.Connection.MaxConcurrentOperations,
readTimeoutSeconds: _config.Connection.ReadTimeoutSeconds,
writeTimeoutSeconds: _config.Connection.WriteTimeoutSeconds,
monitorIntervalSeconds: _config.Connection.MonitorIntervalSeconds,
autoReconnect: _config.Connection.AutoReconnect,
nodeName: _config.Connection.NodeName,
galaxyName: _config.Connection.GalaxyName,
probeTestTagAddress: _config.HealthCheck.TestTagAddress,
probeTimeoutMs: _config.HealthCheck.ProbeTimeoutMs,
maxConsecutiveTransportFailures: _config.HealthCheck.MaxConsecutiveTransportFailures,
degradedProbeIntervalMs: _config.HealthCheck.DegradedProbeIntervalMs);
```
**Step 2: Update DetailedHealthCheckService to use shared probe**
In `LmxProxyService.Start()`, update the DetailedHealthCheckService construction (around line 114) to pass the test tag address from config:
```csharp
_detailedHealthCheckService = new DetailedHealthCheckService(
_mxAccessClient, _config.HealthCheck.TestTagAddress);
```
**Step 3: Commit**
```bash
git add src/ZB.MOM.WW.LmxProxy.Host/LmxProxyService.cs
git commit -m "feat: wire HealthCheck config to MxAccessClient and DetailedHealthCheckService"
```
---
## Task 5: Add `UnsubscribeByAddressAsync` to `IScadaClient` and `MxAccessClient`
This is the foundation for removing handle-based unsubscribe from SubscriptionManager.
**Files:**
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/Domain/IScadaClient.cs` — add `UnsubscribeByAddressAsync`
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.Subscription.cs` — implement, change `UnsubscribeAsync` visibility
**Step 1: Add to IScadaClient**
After `SubscribeAsync`:
```csharp
/// <summary>
/// Unsubscribes specific tag addresses. Removes from stored subscriptions
/// and COM state. Safe to call after reconnect — uses current handle mappings.
/// </summary>
Task UnsubscribeByAddressAsync(IEnumerable<string> addresses);
```
**Step 2: Implement in MxAccessClient.Subscription.cs**
The existing `UnsubscribeAsync` (line 53) already does exactly this — it's just `internal`. Rename it or add a public wrapper:
```csharp
/// <summary>
/// Unsubscribes specific addresses by address name.
/// Removes from both COM state and stored subscriptions (no reconnect replay).
/// </summary>
public async Task UnsubscribeByAddressAsync(IEnumerable<string> addresses)
{
await UnsubscribeAsync(addresses);
}
```
This keeps the existing `internal UnsubscribeAsync` unchanged (it's still used by `SubscriptionHandle.DisposeAsync`).
**Step 3: Commit**
```bash
git add src/ZB.MOM.WW.LmxProxy.Host/Domain/IScadaClient.cs
git add src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.Subscription.cs
git commit -m "feat: add UnsubscribeByAddressAsync to IScadaClient for address-based unsubscribe"
```
---
## Task 6: Remove `_mxAccessHandles` from `SubscriptionManager`
**Files:**
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/Subscriptions/SubscriptionManager.cs`
**Step 1: Remove `_mxAccessHandles` field**
Delete lines 34-35:
```csharp
// REMOVE:
private readonly ConcurrentDictionary<string, IAsyncDisposable> _mxAccessHandles
= new ConcurrentDictionary<string, IAsyncDisposable>(StringComparer.OrdinalIgnoreCase);
```
**Step 2: Rewrite `CreateMxAccessSubscriptionsAsync`**
The method no longer stores handles. It just calls `SubscribeAsync` to create the COM subscriptions. `MxAccessClient` stores them in `_storedSubscriptions` internally.
```csharp
private async Task CreateMxAccessSubscriptionsAsync(List<string> addresses)
{
try
{
await _scadaClient.SubscribeAsync(
addresses,
(address, vtq) => OnTagValueChanged(address, vtq));
}
catch (Exception ex)
{
Log.Error(ex, "Failed to create MxAccess subscriptions for {Count} tags", addresses.Count);
}
}
```
**Step 3: Rewrite unsubscribe logic in `UnsubscribeClient`**
Replace the handle disposal section (lines 198-212) with address-based unsubscribe:
```csharp
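// Unsubscribe tags with no remaining clients via address-based API.
// This blocks synchronously because UnsubscribeClient is a synchronous method.
// It must not be called from the STA dispatch thread: the unsubscribe work is
// queued to that thread, so blocking it here would deadlock.
if (tagsToDispose.Count > 0)
{
    try
    {
        _scadaClient.UnsubscribeByAddressAsync(tagsToDispose).GetAwaiter().GetResult();
    }
    catch (Exception ex)
    {
        Log.Warning(ex, "Error unsubscribing {Count} tags from MxAccess", tagsToDispose.Count);
    }
}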
```
**Step 4: Verify build**
```bash
dotnet build src/ZB.MOM.WW.LmxProxy.Host
```
Expected: Build succeeds. No references to `_mxAccessHandles` remain.
**Step 5: Commit**
```bash
git add src/ZB.MOM.WW.LmxProxy.Host/Subscriptions/SubscriptionManager.cs
git commit -m "fix: remove _mxAccessHandles from SubscriptionManager, use address-based unsubscribe"
```
---
## Task 7: Wire `ConnectionStateChanged` for reconnect notification in `SubscriptionManager`
After reconnect, `RecreateStoredSubscriptionsAsync` recreates COM subscriptions, and `SubscriptionManager` continues to receive `OnTagValueChanged` callbacks because the callback references are preserved in `_storedSubscriptions`. Subscribed clients therefore see quality restored through the resumed data flow, with no explicit notification needed. This task only logs the reconnection for observability.
**Files:**
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/Subscriptions/SubscriptionManager.cs` — add `NotifyReconnection` method
- Modify: `src/ZB.MOM.WW.LmxProxy.Host/LmxProxyService.cs` — wire Connected state to SubscriptionManager
**Step 1: Add `NotifyReconnection` to SubscriptionManager**
```csharp
/// <summary>
/// Logs reconnection for observability. Data flow resumes automatically
/// via MxAccessClient.RecreateStoredSubscriptionsAsync callbacks.
/// </summary>
public void NotifyReconnection()
{
Log.Information("MxAccess reconnected — subscriptions recreated, " +
"data flow will resume via OnDataChange callbacks " +
"({ClientCount} clients, {TagCount} tags)",
_clientSubscriptions.Count, _tagSubscriptions.Count);
}
```
**Step 2: Wire in LmxProxyService.Start()**
Extend the existing `ConnectionStateChanged` handler (around line 97):
```csharp
_mxAccessClient.ConnectionStateChanged += (sender, e) =>
{
if (e.CurrentState == Domain.ConnectionState.Disconnected ||
e.CurrentState == Domain.ConnectionState.Error)
{
_subscriptionManager.NotifyDisconnection();
}
else if (e.CurrentState == Domain.ConnectionState.Connected &&
e.PreviousState == Domain.ConnectionState.Reconnecting)
{
_subscriptionManager.NotifyReconnection();
}
};
```
**Step 3: Commit**
```bash
git add src/ZB.MOM.WW.LmxProxy.Host/Subscriptions/SubscriptionManager.cs
git add src/ZB.MOM.WW.LmxProxy.Host/LmxProxyService.cs
git commit -m "feat: wire reconnection notification to SubscriptionManager for observability"
```
---
## Task 8: Build, deploy to windev, test
**Files:**
- No code changes — build and deployment task.
**Step 1: Build the solution**
```bash
dotnet build ZB.MOM.WW.LmxProxy.slnx
```
Expected: Clean build, no errors.
**Step 2: Deploy to windev**
Follow existing deployment procedure per `docker/README.md` or manual copy to windev.
**Step 3: Manual test — Gap 1 (active probing)**
1. Start the v2 service on windev. Verify logs show: `Connection monitor loop started (interval=5000ms, probe=True)`.
2. Verify probe runs: logs should show no warnings while platform is healthy.
3. Kill aaBootstrap on windev. Within 15-20s (3 probe failures at 5s intervals), logs should show:
- `Probe transport failure (1/3): Probe returned Bad_CommFailure` (or similar)
- `Probe transport failure (2/3): ...`
- `Probe transport failure (3/3): ...`
- `Max consecutive transport failures reached — forcing reconnect`
- `Attempting reconnect...`
4. After platform restart (but objects still stopped): Logs should show `Probe degraded` and `entering degraded mode`, then probe backs off to 30s interval. No reconnect churn.
5. After objects restart via SMC: Logs should show `Probe healthy — exiting degraded mode`.
**Step 4: Manual test — Gap 2 (subscription cleanup)**
1. Connect a gRPC client, subscribe to tags.
2. Kill aaBootstrap → client receives `Bad_NotConnected` quality.
3. Restart platform + objects. Verify client starts receiving Good quality updates again (via `RecreateStoredSubscriptionsAsync`).
4. Disconnect the client. Verify logs show `Unsubscribed from N tags` (address-based) with no handle disposal errors.
---
## Design Rationale
### Why two failure modes in the probe?
| Failure Mode | Cause | Correct Response |
|---|---|---|
| **Transport failure** | COM object dead, platform process crashed, MxAccess unreachable | Force disconnect + reconnect |
| **Data degraded** | COM session alive, AVEVA objects stopped, all reads return Bad quality | Stay connected, report degraded, back off probes |
Reconnecting on DataDegraded would churn COM objects with no benefit — the platform objects are stopped regardless of connection state. Observed: 40+ minutes of Bad quality after aaBootstrap crash until manual SMC restart.
### Why remove `_mxAccessHandles`?
1. **Batch handle bug**: `CreateMxAccessSubscriptionsAsync` stored the same `IAsyncDisposable` handle for every address in a batch. Disposing any one address disposed the entire batch, silently removing unrelated subscriptions from `_storedSubscriptions`.
2. **Stale after reconnect**: `RecreateStoredSubscriptionsAsync` recreates COM subscriptions but doesn't produce new `SubscriptionManager` handles. Old handles point to disposed COM state.
3. **Ownership violation**: `MxAccessClient` already owns subscription lifecycle via `_storedSubscriptions` and `_addressToHandle`. Duplicating ownership in `SubscriptionManager._mxAccessHandles` is a leaky abstraction.
The fix: `SubscriptionManager` owns client routing and ref counts only. `MxAccessClient` owns COM subscription lifecycle. Unsubscribe is by address, not by opaque handle.
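The ownership split described above can be sketched as a ref-count table keyed by tag address. This is an illustration under assumptions: `TagRefCounter` is a hypothetical name, and the class is not thread-safe, whereas the real `SubscriptionManager` has to guard its state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: tracks which clients reference each tag address.
// The caller subscribes or unsubscribes the underlying tag by address
// based on the lists returned here; no opaque handles are stored.
public sealed class TagRefCounter
{
    private readonly Dictionary<string, HashSet<string>> _clientsByTag =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    // Returns the tags that had no subscribers before this call
    // (these need a new underlying subscription).
    public List<string> AddClient(string clientId, IEnumerable<string> tags)
    {
        var created = new List<string>();
        foreach (var tag in tags)
        {
            if (!_clientsByTag.TryGetValue(tag, out var clients))
            {
                clients = new HashSet<string>();
                _clientsByTag[tag] = clients;
                created.Add(tag);
            }
            clients.Add(clientId);
        }
        return created;
    }

    // Returns the tags left with no subscribers
    // (these should be unsubscribed by address).
    public List<string> RemoveClient(string clientId)
    {
        var orphaned = new List<string>();
        foreach (var entry in _clientsByTag)
        {
            if (entry.Value.Remove(clientId) && entry.Value.Count == 0)
                orphaned.Add(entry.Key);
        }
        foreach (var tag in orphaned)
            _clientsByTag.Remove(tag);
        return orphaned;
    }
}
```

Because each address is tracked individually, removing one client can never dispose a subscription that another client in the same original batch still needs, which is the failure the batch-handle design allowed.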

@@ -0,0 +1,15 @@
{
"planPath": "lmxproxy/docs/plans/2026-03-22-gap1-gap2-reconnect-subscriptions.md",
"tasks": [
{"id": 0, "subject": "Task 0: Add ProbeResult domain type", "status": "pending"},
{"id": 1, "subject": "Task 1: Add ProbeConnectionAsync to MxAccessClient", "status": "pending", "blockedBy": [0]},
{"id": 2, "subject": "Task 2: Add health check configuration", "status": "pending"},
{"id": 3, "subject": "Task 3: Rewrite MonitorConnectionAsync with active probing", "status": "pending", "blockedBy": [1, 2]},
{"id": 4, "subject": "Task 4: Wire probe config through LmxProxyService.Start()", "status": "pending", "blockedBy": [2, 3]},
{"id": 5, "subject": "Task 5: Add UnsubscribeByAddressAsync to IScadaClient", "status": "pending"},
{"id": 6, "subject": "Task 6: Remove _mxAccessHandles from SubscriptionManager", "status": "pending", "blockedBy": [5]},
{"id": 7, "subject": "Task 7: Wire ConnectionStateChanged for reconnect notification", "status": "pending", "blockedBy": [6]},
{"id": 8, "subject": "Task 8: Build, deploy to windev, test", "status": "pending", "blockedBy": [4, 7]}
],
"lastUpdated": "2026-03-22T00:00:00Z"
}

@@ -0,0 +1,185 @@
# LmxProxy Stale Session Subscription Leak Fix
## Problem
When a gRPC client disconnects abruptly, Grpc.Core (the C-core library used by the .NET Framework 4.8 server) does not reliably fire the `ServerCallContext.CancellationToken`. This means:
1. The `Subscribe` RPC in `ScadaGrpcService` blocks forever on `reader.WaitToReadAsync(context.CancellationToken)` (line 368)
2. The `finally` block with `_subscriptionManager.UnsubscribeClient(request.SessionId)` never runs
3. The `ct.Register(() => UnsubscribeClient(clientId))` in `SubscriptionManager.SubscribeAsync` also never fires (same token)
4. The old session's subscriptions leak in `SubscriptionManager._clientSubscriptions` and `_tagSubscriptions`
When the client reconnects with a new session ID, it creates duplicate subscriptions. Tags aren't cleaned up because they still have a ref-count from the leaked old session. Over time, client count grows and tag subscriptions accumulate.
The `SessionManager` does scavenge inactive sessions after 5 minutes, but it only removes the session from its own dictionary — it doesn't notify `SubscriptionManager` to clean up subscriptions.
## Fix
Bridge `SessionManager` scavenging to `SubscriptionManager` cleanup. When a session is scavenged due to inactivity, also call `SubscriptionManager.UnsubscribeClient()`.
### Step 1: Add cleanup callback to SessionManager
File: `src/ZB.MOM.WW.LmxProxy.Host/Sessions/SessionManager.cs`
Add a callback field and expose it:
```csharp
// Add after the _inactivityTimeout field (line 22)
private Action<string>? _onSessionScavenged;
/// <summary>
/// Register a callback invoked when a session is scavenged due to inactivity.
/// The callback receives the session ID.
/// </summary>
public void OnSessionScavenged(Action<string> callback)
{
_onSessionScavenged = callback;
}
```
Then in `ScavengeInactiveSessions`, invoke the callback for each scavenged session:
```csharp
// In ScavengeInactiveSessions (line 103-118), change the foreach to:
foreach (var kvp in expired)
{
    if (_sessions.TryRemove(kvp.Key, out _))
    {
        Log.Information("Session {SessionId} scavenged (inactive since {LastActivity})",
            kvp.Key, kvp.Value.LastActivity);

        // Notify subscriber cleanup
        try
        {
            _onSessionScavenged?.Invoke(kvp.Key);
        }
        catch (Exception ex)
        {
            Log.Warning(ex, "Error in session scavenge callback for {SessionId}", kvp.Key);
        }
    }
}
```
### Step 2: Wire up the callback in LmxProxyService
File: `src/ZB.MOM.WW.LmxProxy.Host/LmxProxyService.cs`
After both `SessionManager` and `SubscriptionManager` are created, register the callback:
```csharp
// Add after SubscriptionManager creation:
_sessionManager.OnSessionScavenged(sessionId =>
{
Log.Information("Cleaning up subscriptions for scavenged session {SessionId}", sessionId);
_subscriptionManager.UnsubscribeClient(sessionId);
});
```
Find where `_sessionManager` and `_subscriptionManager` are both initialized and add this line right after.
### Step 3: Also clean up on explicit Disconnect
This is already handled — `ScadaGrpcService.Disconnect()` (line 86) calls `_subscriptionManager.UnsubscribeClient(request.SessionId)` before terminating the session. No change needed.
### Step 4: Add proactive stream timeout (belt-and-suspenders)
The scavenger runs every 60 seconds with a 5-minute timeout. This means a leaked session could take up to 6 minutes to clean up. For faster detection, add a secondary timeout in the Subscribe RPC itself.
File: `src/ZB.MOM.WW.LmxProxy.Host/Grpc/Services/ScadaGrpcService.cs`
In the `Subscribe` method, replace the simple `context.CancellationToken` with a combined token that also expires if the session becomes invalid:
```csharp
// Replace the Subscribe method (lines 353-390) with:
public override async Task Subscribe(
Scada.SubscribeRequest request,
IServerStreamWriter<Scada.VtqMessage> responseStream,
ServerCallContext context)
{
if (!_sessionManager.ValidateSession(request.SessionId))
{
throw new RpcException(new GrpcStatus(StatusCode.Unauthenticated, "Invalid session"));
}
var reader = await _subscriptionManager.SubscribeAsync(
request.SessionId, request.Tags, context.CancellationToken);
try
{
// Use a combined approach: check both the gRPC cancellation token AND
// periodic session validity. This works around Grpc.Core not reliably
// firing CancellationToken on client disconnect.
while (true)
{
// Wait for data with a timeout so we can periodically check session validity
using var timeoutCts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(
context.CancellationToken, timeoutCts.Token);
bool hasData;
try
{
hasData = await reader.WaitToReadAsync(linkedCts.Token);
}
catch (OperationCanceledException) when (timeoutCts.IsCancellationRequested
&& !context.CancellationToken.IsCancellationRequested)
{
// Timeout expired, not a client disconnect — check if session is still valid
if (!_sessionManager.ValidateSession(request.SessionId))
{
Log.Information("Subscribe stream ending — session {SessionId} no longer valid",
request.SessionId);
break;
}
continue; // Session still valid, keep waiting
}
if (!hasData) break; // Channel completed
while (reader.TryRead(out var item))
{
var protoVtq = ConvertToProtoVtq(item.address, item.vtq);
await responseStream.WriteAsync(protoVtq);
}
}
}
catch (OperationCanceledException)
{
// Client disconnected -- normal
}
catch (Exception ex)
{
Log.Error(ex, "Subscribe stream error for session {SessionId}", request.SessionId);
throw new RpcException(new GrpcStatus(StatusCode.Internal, ex.Message));
}
finally
{
_subscriptionManager.UnsubscribeClient(request.SessionId);
}
}
```
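This adds a 30-second periodic check: if no data arrives for 30 seconds, it checks whether the session is still valid. If the session was scavenged (client disconnected, 5-min timeout), the stream exits cleanly and runs the `finally` cleanup. This relies on `ValidateSession` not refreshing the session's last-activity time; if it does, the periodic check would itself keep a dead session alive, and a lookup that does not touch the session should be used instead.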
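The wait-with-timeout step in the loop above can be factored into a helper that makes the three outcomes explicit. This is a sketch, not the plan's code: `WaitOutcome` and `StreamWait` are hypothetical names, and `System.Threading.Channels` is assumed to be available as a package reference.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public enum WaitOutcome { Data, Completed, TimedOut }

// Hypothetical helper: waits for channel data, but gives control back
// after a timeout so the caller can re-check session validity.
public static class StreamWait
{
    public static async Task<WaitOutcome> WaitForDataAsync<T>(
        ChannelReader<T> reader, TimeSpan timeout, CancellationToken callerToken)
    {
        using (var timeoutCts = new CancellationTokenSource(timeout))
        using (var linked = CancellationTokenSource.CreateLinkedTokenSource(
            callerToken, timeoutCts.Token))
        {
            try
            {
                // false means the channel was completed and fully drained
                return await reader.WaitToReadAsync(linked.Token)
                    ? WaitOutcome.Data
                    : WaitOutcome.Completed;
            }
            catch (OperationCanceledException) when (!callerToken.IsCancellationRequested)
            {
                // Only our own timeout fired; caller cancellation still propagates.
                return WaitOutcome.TimedOut;
            }
        }
    }
}
```

In the `Subscribe` loop, `TimedOut` would map to the session validity check, `Completed` to `break`, and `Data` to draining the reader.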
## Summary of Changes
| File | Change |
|------|--------|
| `Sessions/SessionManager.cs` | Add `_onSessionScavenged` callback, invoke during `ScavengeInactiveSessions` |
| `LmxProxyService.cs` | Wire `_sessionManager.OnSessionScavenged` to `_subscriptionManager.UnsubscribeClient` |
| `Grpc/Services/ScadaGrpcService.cs` | Add 30-second periodic session validity check in `Subscribe` loop |
## Testing
1. Start LmxProxy server
2. Connect a client and subscribe to tags
3. Kill the client process abruptly (not a clean disconnect)
4. Check status page — client count should still show the old session
5. Wait up to 5 minutes — session should be scavenged, subscription count should drop
6. Reconnect client — should get a clean new session, no duplicate subscriptions
7. Verify tag subscription counts match expected (no leaked refs)
## Optional: Reduce scavenge timeout for faster cleanup
In `LmxProxyService.cs` where `SessionManager` is constructed, consider reducing `inactivityTimeoutMinutes` from 5 to 2, since the Subscribe RPC now has its own 30-second validity check. The 5-minute timeout was the only cleanup path before; now it's belt-and-suspenders.


@@ -0,0 +1,666 @@
# Phase 4: Host Health, Metrics & Status Server — Implementation Plan
**Date**: 2026-03-21
**Prerequisites**: Phase 3 complete and passing (gRPC server, Security, Configuration, Service Hosting all functional)
**Working Directory**: The lmxproxy repo is on windev at `C:\src\lmxproxy`
## Guardrails
1. **This is a v2 rebuild** — do not copy code from the v1 reference in `src-reference/`. Write fresh implementations guided by the design docs and the reference code's structure.
2. **Host targets .NET Framework 4.8, x86** — all code must use C# 9.0 language features maximum (`LangVersion` is `9.0` in the csproj). No file-scoped namespaces, no `required` keyword, no collection expressions in Host code.
3. **No new NuGet packages** — all required packages are already in the Host `.csproj` (`Microsoft.Extensions.Diagnostics.HealthChecks`, `Serilog`, `System.Threading.Channels`, `System.Text.Json` via framework).
4. **Namespace**: `ZB.MOM.WW.LmxProxy.Host` with sub-namespaces matching folder structure (e.g., `ZB.MOM.WW.LmxProxy.Host.Health`, `ZB.MOM.WW.LmxProxy.Host.Metrics`, `ZB.MOM.WW.LmxProxy.Host.Status`).
5. **All COM operations are on the STA thread** — health checks that read test tags must go through `MxAccessClient.ReadAsync()`, never directly touching COM objects.
6. **Build must pass after each step**: `dotnet build src/ZB.MOM.WW.LmxProxy.Host --platform x86`
7. **Tests run on windev**: `dotnet test tests/ZB.MOM.WW.LmxProxy.Host.Tests --platform x86`
## Step 1: Create PerformanceMetrics
**File**: `src/ZB.MOM.WW.LmxProxy.Host/Metrics/PerformanceMetrics.cs`
Create the `PerformanceMetrics` class in namespace `ZB.MOM.WW.LmxProxy.Host.Metrics`.
### 1.1 OperationMetrics (nested or separate class in same file)
```csharp
namespace ZB.MOM.WW.LmxProxy.Host.Metrics
{
public class OperationMetrics
{
private readonly List<double> _durations = new List<double>();
private readonly object _lock = new object();
private long _totalCount;
private long _successCount;
private double _totalMilliseconds;
private double _minMilliseconds = double.MaxValue;
private double _maxMilliseconds;
public void Record(TimeSpan duration, bool success) { ... }
public MetricsStatistics GetStatistics() { ... }
}
}
```
Implementation details:
- `Record(TimeSpan duration, bool success)`: Inside `lock (_lock)`, increment `_totalCount`, conditionally increment `_successCount`, add `duration.TotalMilliseconds` to the `_durations` list, and update `_totalMilliseconds`, `_minMilliseconds`, and `_maxMilliseconds`. If `_durations.Count > 1000`, call `_durations.RemoveAt(0)` to drop the oldest sample, so the list acts as a rolling buffer of the most recent 1000 durations.
- `GetStatistics()`: Inside `lock (_lock)`, return early with an empty `MetricsStatistics` if `_totalCount == 0`. Otherwise sort a copy of `_durations` (`sortedDurations`), leaving the buffer itself in insertion order, compute the p95 index as `(int)Math.Ceiling(sortedDurations.Count * 0.95) - 1`, clamp it with `Math.Max(0, p95Index)`, and read `Percentile95Milliseconds` from that position.
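The two bullets above could be filled in roughly as follows. This is a minimal sketch, not the final implementation: `MaxSamples` is a name introduced here for the 1000-sample cap, `MetricsStatistics` is repeated only so the snippet compiles on its own, and `SuccessRate` is assumed to be a 0..1 fraction (which matches the `0.6` and `< 0.5` values used elsewhere in this plan).

```csharp
using System;
using System.Collections.Generic;

// Repeated from section 1.2 so this sketch is self-contained.
public class MetricsStatistics
{
    public long TotalCount { get; set; }
    public long SuccessCount { get; set; }
    public double SuccessRate { get; set; }
    public double AverageMilliseconds { get; set; }
    public double MinMilliseconds { get; set; }
    public double MaxMilliseconds { get; set; }
    public double Percentile95Milliseconds { get; set; }
}

public class OperationMetrics
{
    private const int MaxSamples = 1000; // assumed name for the rolling-buffer cap
    private readonly List<double> _durations = new List<double>();
    private readonly object _lock = new object();
    private long _totalCount;
    private long _successCount;
    private double _totalMilliseconds;
    private double _minMilliseconds = double.MaxValue;
    private double _maxMilliseconds;

    public void Record(TimeSpan duration, bool success)
    {
        double ms = duration.TotalMilliseconds;
        lock (_lock)
        {
            _totalCount++;
            if (success) _successCount++;
            _totalMilliseconds += ms;
            if (ms < _minMilliseconds) _minMilliseconds = ms;
            if (ms > _maxMilliseconds) _maxMilliseconds = ms;

            _durations.Add(ms);
            // Drop the oldest sample once the buffer is over the cap.
            if (_durations.Count > MaxSamples) _durations.RemoveAt(0);
        }
    }

    public MetricsStatistics GetStatistics()
    {
        lock (_lock)
        {
            if (_totalCount == 0) return new MetricsStatistics();

            // Sort a copy so the buffer keeps its insertion order.
            var sortedDurations = new List<double>(_durations);
            sortedDurations.Sort();
            int p95Index = Math.Max(0, (int)Math.Ceiling(sortedDurations.Count * 0.95) - 1);

            return new MetricsStatistics
            {
                TotalCount = _totalCount,
                SuccessCount = _successCount,
                SuccessRate = (double)_successCount / _totalCount,
                AverageMilliseconds = _totalMilliseconds / _totalCount,
                MinMilliseconds = _minMilliseconds,
                MaxMilliseconds = _maxMilliseconds,
                Percentile95Milliseconds = sortedDurations[p95Index]
            };
        }
    }
}
```

Note that count, average, min, and max cover every recorded operation, while the percentile only sees the samples still in the buffer. Sorting `_durations` in place would make `RemoveAt(0)` evict the smallest sample instead of the oldest, which is why the sketch sorts a copy.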
### 1.2 MetricsStatistics
```csharp
public class MetricsStatistics
{
public long TotalCount { get; set; }
public long SuccessCount { get; set; }
public double SuccessRate { get; set; }
public double AverageMilliseconds { get; set; }
public double MinMilliseconds { get; set; }
public double MaxMilliseconds { get; set; }
public double Percentile95Milliseconds { get; set; }
}
```
### 1.3 ITimingScope interface and TimingScope implementation
```csharp
public interface ITimingScope : IDisposable
{
void SetSuccess(bool success);
}
```
`TimingScope` is a private nested class inside `PerformanceMetrics`:
- Constructor takes `PerformanceMetrics metrics, string operationName`, starts a `Stopwatch`.
- `SetSuccess(bool success)` stores the flag (default `true`).
- `Dispose()`: stops stopwatch, calls `_metrics.RecordOperation(_operationName, _stopwatch.Elapsed, _success)`. Guard against double-dispose with `_disposed` flag.
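One possible shape for the nested scope is sketched below. The `PerformanceMetrics` class here is a trimmed stand-in that only remembers the last sample; its `RecordedCount`, `LastSuccess`, and `LastDuration` properties are hypothetical and exist purely so the scope has something observable to report to.

```csharp
using System;
using System.Diagnostics;

public interface ITimingScope : IDisposable
{
    void SetSuccess(bool success);
}

// Trimmed stand-in for PerformanceMetrics (hypothetical counters, no timer, no dictionary).
public class PerformanceMetrics
{
    public int RecordedCount { get; private set; }
    public bool LastSuccess { get; private set; }
    public TimeSpan LastDuration { get; private set; }

    public void RecordOperation(string operationName, TimeSpan duration, bool success = true)
    {
        RecordedCount++;
        LastSuccess = success;
        LastDuration = duration;
    }

    public ITimingScope BeginOperation(string operationName)
    {
        return new TimingScope(this, operationName);
    }

    private sealed class TimingScope : ITimingScope
    {
        private readonly PerformanceMetrics _metrics;
        private readonly string _operationName;
        private readonly Stopwatch _stopwatch;
        private bool _success = true; // success unless the caller says otherwise
        private bool _disposed;

        public TimingScope(PerformanceMetrics metrics, string operationName)
        {
            _metrics = metrics;
            _operationName = operationName;
            _stopwatch = Stopwatch.StartNew();
        }

        public void SetSuccess(bool success)
        {
            _success = success;
        }

        public void Dispose()
        {
            if (_disposed) return; // a second Dispose must not record twice
            _disposed = true;
            _stopwatch.Stop();
            _metrics.RecordOperation(_operationName, _stopwatch.Elapsed, _success);
        }
    }
}
```

Because the scope is private and only exposed through `ITimingScope`, callers can wrap it in a `using` block and never need to know the concrete type.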
### 1.4 PerformanceMetrics class
```csharp
public class PerformanceMetrics : IDisposable
{
private static readonly ILogger Logger = Log.ForContext<PerformanceMetrics>();
private readonly ConcurrentDictionary<string, OperationMetrics> _metrics = new ConcurrentDictionary<string, OperationMetrics>();
private readonly Timer _reportingTimer;
private bool _disposed;
public PerformanceMetrics()
{
_reportingTimer = new Timer(ReportMetrics, null, TimeSpan.FromSeconds(60), TimeSpan.FromSeconds(60));
}
public void RecordOperation(string operationName, TimeSpan duration, bool success = true) { ... }
public ITimingScope BeginOperation(string operationName) => new TimingScope(this, operationName);
public OperationMetrics? GetMetrics(string operationName) { ... }
public IReadOnlyDictionary<string, OperationMetrics> GetAllMetrics() { ... }
public Dictionary<string, MetricsStatistics> GetStatistics() { ... }
private void ReportMetrics(object? state) { ... } // Log each operation's stats at Information level
public void Dispose() { ... } // Dispose timer, call ReportMetrics one final time
}
```
`ReportMetrics` iterates `_metrics`, calls `GetStatistics()` on each, logs via Serilog structured logging with properties: `Operation`, `Count`, `SuccessRate`, `AverageMs`, `MinMs`, `MaxMs`, `P95Ms`.
### 1.5 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Host --platform x86"
```
## Step 2: Create HealthCheckService
**File**: `src/ZB.MOM.WW.LmxProxy.Host/Health/HealthCheckService.cs`
Namespace: `ZB.MOM.WW.LmxProxy.Host.Health`
### 2.1 Basic HealthCheckService
```csharp
public class HealthCheckService : IHealthCheck
{
private static readonly ILogger Logger = Log.ForContext<HealthCheckService>();
private readonly IScadaClient _scadaClient;
private readonly SubscriptionManager _subscriptionManager;
private readonly PerformanceMetrics _performanceMetrics;
public HealthCheckService(
IScadaClient scadaClient,
SubscriptionManager subscriptionManager,
PerformanceMetrics performanceMetrics) { ... }
public Task<HealthCheckResult> CheckHealthAsync(
HealthCheckContext context,
CancellationToken cancellationToken = default) { ... }
}
```
Dependencies imported:
- `ZB.MOM.WW.LmxProxy.Host.Domain` for `IScadaClient`, `ConnectionState`
- `ZB.MOM.WW.LmxProxy.Host.Services` for `SubscriptionManager` (if still in that namespace after Phase 2/3; adjust import to match actual location)
- `ZB.MOM.WW.LmxProxy.Host.Metrics` for `PerformanceMetrics`
- `Microsoft.Extensions.Diagnostics.HealthChecks` for `IHealthCheck`, `HealthCheckResult`, `HealthCheckContext`
`CheckHealthAsync` logic:
1. Create `Dictionary<string, object> data`.
2. Read `_scadaClient.IsConnected` and `_scadaClient.ConnectionState` into `data["scada_connected"]` and `data["scada_connection_state"]`.
3. Get subscription stats via `_subscriptionManager.GetSubscriptionStats()` — store `TotalClients`, `TotalTags` in data.
4. Iterate `_performanceMetrics.GetAllMetrics()` to compute `totalOperations` and `averageSuccessRate`.
5. Store `total_operations` and `average_success_rate` in data.
6. Decision tree:
- If `!isConnected` → `HealthCheckResult.Unhealthy("SCADA client is not connected", data: data)`
- If `averageSuccessRate < 0.5 && totalOperations > 100` → `HealthCheckResult.Degraded(...)`
- If `subscriptionStats.TotalClients > 100` → `HealthCheckResult.Degraded(...)`
- Otherwise → `HealthCheckResult.Healthy("LmxProxy is healthy", data)`
7. Wrap everything in try/catch — on exception return `Unhealthy` with exception details.
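The decision tree in step 6 is easiest to test when it is isolated from the data gathering. The sketch below does that with a pure function. `HealthLevel`, `HealthDecision`, and the constant names are hypothetical stand-ins so the snippet needs no packages; the real service would return `HealthCheckResult` values instead.

```csharp
using System;

// Hypothetical stand-in for HealthStatus, so the sketch has no package dependency.
public enum HealthLevel { Healthy, Degraded, Unhealthy }

public static class HealthDecision
{
    public const double MinSuccessRate = 0.5;          // below this the service is degraded...
    public const long MinOperationsForRateCheck = 100; // ...but only once enough operations exist
    public const int MaxClients = 100;

    public static HealthLevel Evaluate(
        bool isConnected,
        double averageSuccessRate,
        long totalOperations,
        int totalClients)
    {
        // Connectivity is checked first: nothing else matters without a SCADA connection.
        if (!isConnected) return HealthLevel.Unhealthy;

        // A low success rate only counts once the sample is large enough to be meaningful.
        if (averageSuccessRate < MinSuccessRate && totalOperations > MinOperationsForRateCheck)
            return HealthLevel.Degraded;

        if (totalClients > MaxClients) return HealthLevel.Degraded;

        return HealthLevel.Healthy;
    }
}
```

The order of the checks matters: a disconnected client with a poor success rate should report `Unhealthy`, not `Degraded`.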
### 2.2 DetailedHealthCheckService
In the same file or a separate file `src/ZB.MOM.WW.LmxProxy.Host/Health/DetailedHealthCheckService.cs`:
```csharp
public class DetailedHealthCheckService : IHealthCheck
{
private static readonly ILogger Logger = Log.ForContext<DetailedHealthCheckService>();
private readonly IScadaClient _scadaClient;
private readonly string _testTagAddress;
public DetailedHealthCheckService(IScadaClient scadaClient, string testTagAddress = "TestChildObject.TestBool") { ... }
public async Task<HealthCheckResult> CheckHealthAsync(
HealthCheckContext context,
CancellationToken cancellationToken = default) { ... }
}
```
`CheckHealthAsync` logic:
1. If `!_scadaClient.IsConnected` → return `Unhealthy`.
2. Try `Vtq vtq = await _scadaClient.ReadAsync(_testTagAddress, cancellationToken)`.
3. If `vtq.Quality != Quality.Good` → return `Degraded` with quality info.
4. If `DateTime.UtcNow - vtq.Timestamp > TimeSpan.FromMinutes(5)` → return `Degraded` (stale data).
5. Otherwise → `Healthy`.
6. Catch read exceptions → return `Degraded("Could not read test tag")`.
7. Catch all exceptions → return `Unhealthy`.
### 2.3 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Host --platform x86"
```
## Step 3: Create StatusReportService
**File**: `src/ZB.MOM.WW.LmxProxy.Host/Status/StatusReportService.cs`
Namespace: `ZB.MOM.WW.LmxProxy.Host.Status`
### 3.1 Data model classes
Define in the same file (or a separate `StatusModels.cs` in the same folder):
```csharp
public class StatusData
{
public DateTime Timestamp { get; set; }
public string ServiceName { get; set; } = "";
public string Version { get; set; } = "";
public ConnectionStatus Connection { get; set; } = new ConnectionStatus();
public SubscriptionStatus Subscriptions { get; set; } = new SubscriptionStatus();
public PerformanceStatus Performance { get; set; } = new PerformanceStatus();
public HealthInfo Health { get; set; } = new HealthInfo();
public HealthInfo? DetailedHealth { get; set; }
}
public class ConnectionStatus
{
public bool IsConnected { get; set; }
public string State { get; set; } = "";
public string NodeName { get; set; } = "";
public string GalaxyName { get; set; } = "";
}
public class SubscriptionStatus
{
public int TotalClients { get; set; }
public int TotalTags { get; set; }
public int ActiveSubscriptions { get; set; }
}
public class PerformanceStatus
{
public long TotalOperations { get; set; }
public double AverageSuccessRate { get; set; }
public Dictionary<string, OperationStatus> Operations { get; set; } = new Dictionary<string, OperationStatus>();
}
public class OperationStatus
{
public long TotalCount { get; set; }
public double SuccessRate { get; set; }
public double AverageMilliseconds { get; set; }
public double MinMilliseconds { get; set; }
public double MaxMilliseconds { get; set; }
public double Percentile95Milliseconds { get; set; }
}
public class HealthInfo
{
public string Status { get; set; } = "";
public string Description { get; set; } = "";
public Dictionary<string, string> Data { get; set; } = new Dictionary<string, string>();
}
```
### 3.2 StatusReportService
```csharp
public class StatusReportService
{
private static readonly ILogger Logger = Log.ForContext<StatusReportService>();
private readonly IScadaClient _scadaClient;
private readonly SubscriptionManager _subscriptionManager;
private readonly PerformanceMetrics _performanceMetrics;
private readonly HealthCheckService _healthCheckService;
private readonly DetailedHealthCheckService? _detailedHealthCheckService;
public StatusReportService(
IScadaClient scadaClient,
SubscriptionManager subscriptionManager,
PerformanceMetrics performanceMetrics,
HealthCheckService healthCheckService,
DetailedHealthCheckService? detailedHealthCheckService = null) { ... }
public async Task<string> GenerateHtmlReportAsync() { ... }
public async Task<string> GenerateJsonReportAsync() { ... }
public async Task<bool> IsHealthyAsync() { ... }
private async Task<StatusData> CollectStatusDataAsync() { ... }
private static string GenerateHtmlFromStatusData(StatusData statusData) { ... }
private static string GenerateErrorHtml(Exception ex) { ... }
}
```
`CollectStatusDataAsync`:
- Populate `StatusData.Timestamp = DateTime.UtcNow`, `ServiceName = "ZB.MOM.WW.LmxProxy.Host"`, `Version` from `Assembly.GetExecutingAssembly().GetName().Version`.
- Connection info from `_scadaClient.IsConnected`, `_scadaClient.ConnectionState`.
- Subscription stats from `_subscriptionManager.GetSubscriptionStats()`.
- Performance stats from `_performanceMetrics.GetStatistics()` — include P95 in the `OperationStatus`.
- Health from `_healthCheckService.CheckHealthAsync(new HealthCheckContext())`.
- Detailed health from `_detailedHealthCheckService?.CheckHealthAsync(new HealthCheckContext())` if not null.
`GenerateJsonReportAsync`:
- Use `System.Text.Json.JsonSerializer.Serialize(statusData, new JsonSerializerOptions { WriteIndented = true, PropertyNamingPolicy = JsonNamingPolicy.CamelCase })`.
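As an illustration of those serializer options, the sketch below serializes a cut-down status object. `StatusJson` is a hypothetical helper name, and the two model classes are trimmed copies of the ones in section 3.1, kept only so the snippet stands alone.

```csharp
using System;
using System.Text.Json;

// Trimmed copies of the section 3.1 models, for illustration only.
public class ConnectionStatus
{
    public bool IsConnected { get; set; }
    public string State { get; set; } = "";
}

public class StatusData
{
    public string ServiceName { get; set; } = "";
    public ConnectionStatus Connection { get; set; } = new ConnectionStatus();
}

public static class StatusJson
{
    // One shared options instance: System.Text.Json caches type metadata per options object.
    private static readonly JsonSerializerOptions Options = new JsonSerializerOptions
    {
        WriteIndented = true,
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase
    };

    public static string Serialize(StatusData statusData)
    {
        return JsonSerializer.Serialize(statusData, Options);
    }
}
```

With `JsonNamingPolicy.CamelCase`, `ServiceName` and `IsConnected` are emitted as `serviceName` and `isConnected`, which is what the `GenerateJsonReportAsync_ReturnsCamelCaseJson` test in Step 6 checks for.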
`GenerateHtmlFromStatusData`:
- Use `StringBuilder` to generate self-contained HTML.
- Include inline CSS (Bootstrap-like grid, status cards with color-coded left borders).
- Color coding: green (#28a745) for Healthy/Connected, yellow (#ffc107) for Degraded, red (#dc3545) for Unhealthy/Disconnected.
- Operations table with columns: Operation, Count, Success Rate, Avg (ms), Min (ms), Max (ms), P95 (ms).
- `<meta http-equiv="refresh" content="30">` for auto-refresh.
- Last updated timestamp at the bottom.
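A small sketch of how the color coding and page shell could be built with `StringBuilder` follows. `StatusHtml`, `ColorFor`, `RenderCard`, and `RenderPage` are names invented for this example, the CSS is reduced to a single inline border, and treating every unrecognized status as red is an assumption rather than something the plan specifies.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text;

public static class StatusHtml
{
    // Maps a status label to the border color listed above.
    public static string ColorFor(string status)
    {
        switch (status)
        {
            case "Healthy":
            case "Connected":
                return "#28a745";
            case "Degraded":
                return "#ffc107";
            default:
                return "#dc3545"; // Unhealthy, Disconnected, and (by assumption) anything unknown
        }
    }

    public static string RenderCard(string title, string status)
    {
        var sb = new StringBuilder();
        sb.Append("<div class=\"card\" style=\"border-left: 4px solid ")
          .Append(ColorFor(status))
          .Append("\">");
        // Encode dynamic text so tag names or descriptions cannot inject markup.
        sb.Append("<h3>").Append(WebUtility.HtmlEncode(title)).Append("</h3>");
        sb.Append("<p>").Append(WebUtility.HtmlEncode(status)).Append("</p>");
        sb.Append("</div>");
        return sb.ToString();
    }

    public static string RenderPage(string cardsHtml, DateTime generatedUtc)
    {
        var sb = new StringBuilder();
        sb.Append("<!DOCTYPE html><html><head><meta charset=\"utf-8\">");
        sb.Append("<meta http-equiv=\"refresh\" content=\"30\">");
        sb.Append("<title>LmxProxy Status</title></head><body>");
        sb.Append(cardsHtml);
        sb.Append("<p>Last updated: ")
          .Append(generatedUtc.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture))
          .Append(" UTC</p></body></html>");
        return sb.ToString();
    }
}
```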
`IsHealthyAsync`:
- Run basic health check, return `result.Status == HealthStatus.Healthy`.
### 3.3 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Host --platform x86"
```
## Step 4: Create StatusWebServer
**File**: `src/ZB.MOM.WW.LmxProxy.Host/Status/StatusWebServer.cs`
Namespace: `ZB.MOM.WW.LmxProxy.Host.Status`
```csharp
public class StatusWebServer : IDisposable
{
private static readonly ILogger Logger = Log.ForContext<StatusWebServer>();
private readonly WebServerConfiguration _configuration;
private readonly StatusReportService _statusReportService;
private HttpListener? _httpListener;
private CancellationTokenSource? _cancellationTokenSource;
private Task? _listenerTask;
private bool _disposed;
public StatusWebServer(WebServerConfiguration configuration, StatusReportService statusReportService) { ... }
public bool Start() { ... }
public bool Stop() { ... }
public void Dispose() { ... }
private async Task HandleRequestsAsync(CancellationToken cancellationToken) { ... }
private async Task HandleRequestAsync(HttpListenerContext context) { ... }
private async Task HandleStatusPageAsync(HttpListenerResponse response) { ... }
private async Task HandleStatusApiAsync(HttpListenerResponse response) { ... }
private async Task HandleHealthApiAsync(HttpListenerResponse response) { ... }
private static async Task WriteResponseAsync(HttpListenerResponse response, string content, string contentType) { ... }
}
```
### 4.1 Start()
1. If `!_configuration.Enabled`, log info and return `true`.
2. Create `HttpListener`, add prefix `_configuration.Prefix ?? $"http://+:{_configuration.Port}/"` (ensure trailing `/`).
3. Call `_httpListener.Start()`.
4. Create `_cancellationTokenSource = new CancellationTokenSource()`.
5. Start `_listenerTask = Task.Run(() => HandleRequestsAsync(_cancellationTokenSource.Token))`.
6. On exception, log error and return `false`.
### 4.2 Stop()
1. If not enabled or listener is null, return `true`.
2. Cancel `_cancellationTokenSource`.
3. Wait for `_listenerTask` with 5-second timeout.
4. Stop and close `_httpListener`.
### 4.3 HandleRequestsAsync
- Loop while not cancelled and listener is listening.
- `await _httpListener.GetContextAsync()` — on success, spawn `Task.Run` to handle.
- Catch `ObjectDisposedException` and `HttpListenerException(995)` as expected shutdown signals.
- On other errors, log and delay 1 second before continuing.
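The accept loop described above might look like the sketch below. It is written as a free-standing helper (`ListenerLoop.RunAsync`, a hypothetical name) that takes the listener and a per-request handler as parameters, and it writes to `Console.Error` where the real class would use its Serilog `Logger`.

```csharp
using System;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

public static class ListenerLoop
{
    public static async Task RunAsync(
        HttpListener listener,
        Func<HttpListenerContext, Task> handleRequestAsync,
        CancellationToken cancellationToken)
    {
        while (!cancellationToken.IsCancellationRequested && listener.IsListening)
        {
            try
            {
                // GetContextAsync takes no token; Stop()/Close() on the listener unblocks it.
                HttpListenerContext context = await listener.GetContextAsync().ConfigureAwait(false);

                // Handle each request off the accept loop so a slow client cannot stall it.
                _ = Task.Run(() => handleRequestAsync(context));
            }
            catch (ObjectDisposedException)
            {
                break; // listener was closed: expected during shutdown
            }
            catch (HttpListenerException ex) when (ex.ErrorCode == 995)
            {
                break; // ERROR_OPERATION_ABORTED: pending accept cancelled by Stop()
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine("Status listener error: " + ex.Message);
                try
                {
                    await Task.Delay(TimeSpan.FromSeconds(1), cancellationToken).ConfigureAwait(false);
                }
                catch (OperationCanceledException)
                {
                    break; // cancelled while backing off
                }
            }
        }
    }
}
```

Since `GetContextAsync` cannot be cancelled directly, `Stop()` has to stop or close the listener for the loop to exit promptly; cancelling the token alone only takes effect on the next iteration.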
### 4.4 HandleRequestAsync routing
| Path (lowered) | Handler |
|---|---|
| `/` | `HandleStatusPageAsync` — calls `_statusReportService.GenerateHtmlReportAsync()`, content type `text/html; charset=utf-8` |
| `/api/status` | `HandleStatusApiAsync` — calls `_statusReportService.GenerateJsonReportAsync()`, content type `application/json; charset=utf-8` |
| `/api/health` | `HandleHealthApiAsync` — calls `_statusReportService.IsHealthyAsync()`, returns `"OK"` (200) or `"UNHEALTHY"` (503) as `text/plain` |
| Non-GET method | Return 405 Method Not Allowed |
| Unknown path | Return 404 Not Found |
| Exception | Return 500 Internal Server Error |
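The routing table can be expressed as a small pure function, which keeps it testable without opening a socket. In this sketch `StatusRoute` and `StatusRouter` are hypothetical names, and checking the HTTP method before the path is an assumption, since the table does not state an order.

```csharp
using System;

public enum StatusRoute { StatusPage, StatusApi, HealthApi, MethodNotAllowed, NotFound }

public static class StatusRouter
{
    public static StatusRoute Resolve(string httpMethod, string absolutePath)
    {
        // Assumed order: reject non-GET requests before looking at the path.
        if (!string.Equals(httpMethod, "GET", StringComparison.OrdinalIgnoreCase))
            return StatusRoute.MethodNotAllowed;

        // Paths are compared in lower case, as in the table above.
        switch ((absolutePath ?? string.Empty).ToLowerInvariant())
        {
            case "/":
                return StatusRoute.StatusPage;
            case "/api/status":
                return StatusRoute.StatusApi;
            case "/api/health":
                return StatusRoute.HealthApi;
            default:
                return StatusRoute.NotFound;
        }
    }

    // The health endpoint answers 200 "OK" or 503 "UNHEALTHY".
    public static int HealthStatusCode(bool isHealthy)
    {
        return isHealthy ? 200 : 503;
    }
}
```

`HandleRequestAsync` would then call something like `StatusRouter.Resolve(context.Request.HttpMethod, context.Request.Url.AbsolutePath)` and dispatch on the result inside its try/catch, with the catch block producing the 500 response.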
### 4.5 WriteResponseAsync
- Set `Content-Type`, add `Cache-Control: no-cache, no-store, must-revalidate`, `Pragma: no-cache`, `Expires: 0`.
- Convert content to UTF-8 bytes, set `ContentLength64`, write to `response.OutputStream`.
### 4.6 Dispose
- Guard with `_disposed` flag. Call `Stop()`. Dispose `_cancellationTokenSource` and close `_httpListener`.
### 4.7 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Host --platform x86"
```
## Step 5: Wire into LmxProxyService
**File**: `src/ZB.MOM.WW.LmxProxy.Host/LmxProxyService.cs`
This file already exists. Modify the `Start()` method to create and wire the new components. The v2 rebuild should create these fresh, but the wiring pattern follows the same order as the reference.
### 5.1 Add using directives
```csharp
using ZB.MOM.WW.LmxProxy.Host.Health;
using ZB.MOM.WW.LmxProxy.Host.Metrics;
using ZB.MOM.WW.LmxProxy.Host.Status;
```
### 5.2 Add fields
```csharp
private PerformanceMetrics? _performanceMetrics;
private HealthCheckService? _healthCheckService;
private DetailedHealthCheckService? _detailedHealthCheckService;
private StatusReportService? _statusReportService;
private StatusWebServer? _statusWebServer;
```
### 5.3 In Start(), after SessionManager and SubscriptionManager creation
```csharp
// Create performance metrics
_performanceMetrics = new PerformanceMetrics();
// Create health check services
_healthCheckService = new HealthCheckService(_scadaClient, _subscriptionManager, _performanceMetrics);
_detailedHealthCheckService = new DetailedHealthCheckService(_scadaClient);
// Create status report service
_statusReportService = new StatusReportService(
_scadaClient, _subscriptionManager, _performanceMetrics,
_healthCheckService, _detailedHealthCheckService);
// Start status web server
_statusWebServer = new StatusWebServer(_configuration.WebServer, _statusReportService);
if (!_statusWebServer.Start())
{
Logger.Warning("Status web server failed to start — continuing without it");
}
```
### 5.4 In Stop(), before gRPC server shutdown
```csharp
// Stop status web server
_statusWebServer?.Stop();
// Dispose performance metrics
_performanceMetrics?.Dispose();
```
### 5.5 Pass _performanceMetrics to ScadaGrpcService constructor
Ensure `ScadaGrpcService` receives `_performanceMetrics` so it can record timings on each RPC call. The gRPC service should call `_performanceMetrics.BeginOperation("Read")` (etc.) and dispose the timing scope at the end of each RPC handler.
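One way to keep the timing code out of every handler body is a small wrapper, sketched here under the assumption that a failed RPC is one that throws. `RpcTiming` and `RecordingScope` are hypothetical: the first is the helper, the second is a test double that stands in for the scope returned by `BeginOperation`.

```csharp
using System;
using System.Threading.Tasks;

public interface ITimingScope : IDisposable
{
    void SetSuccess(bool success);
}

public static class RpcTiming
{
    // Runs a handler inside a timing scope and marks the scope failed if the handler throws.
    public static async Task<T> RunAsync<T>(Func<ITimingScope> beginOperation, Func<Task<T>> handler)
    {
        using (ITimingScope scope = beginOperation())
        {
            try
            {
                return await handler().ConfigureAwait(false);
            }
            catch
            {
                scope.SetSuccess(false); // recorded when the using block disposes the scope
                throw;
            }
        }
    }
}

// Hypothetical test double; the real scope comes from PerformanceMetrics.BeginOperation.
public sealed class RecordingScope : ITimingScope
{
    public bool Success { get; private set; } = true;
    public bool Disposed { get; private set; }

    public void SetSuccess(bool success) { Success = success; }
    public void Dispose() { Disposed = true; }
}
```

Inside `ScadaGrpcService`, a handler could then pass `() => _performanceMetrics.BeginOperation("Read")` as the first argument and its existing body as the second. Handlers that report failure through a response field rather than an exception would still need to call `SetSuccess(false)` themselves.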
### 5.6 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Host --platform x86"
```
## Step 6: Unit Tests
**Project**: `tests/ZB.MOM.WW.LmxProxy.Host.Tests/`
If this project does not exist yet, create it:
```bash
ssh windev "cd C:\src\lmxproxy && dotnet new xunit -n ZB.MOM.WW.LmxProxy.Host.Tests -o tests/ZB.MOM.WW.LmxProxy.Host.Tests --framework net48"
```
**Csproj adjustments** for `tests/ZB.MOM.WW.LmxProxy.Host.Tests/ZB.MOM.WW.LmxProxy.Host.Tests.csproj`:
- `<TargetFramework>net48</TargetFramework>`
- `<PlatformTarget>x86</PlatformTarget>`
- `<LangVersion>9.0</LangVersion>`
- Add `<ProjectReference Include="..\..\src\ZB.MOM.WW.LmxProxy.Host\ZB.MOM.WW.LmxProxy.Host.csproj" />`
- Add `<PackageReference Include="xunit" Version="2.9.3" />`
- Add `<PackageReference Include="xunit.runner.visualstudio" Version="2.8.2" />`
- Add `<PackageReference Include="NSubstitute" Version="5.3.0" />` (for mocking IScadaClient)
- Add `<PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.12.0" />`
**Also add to solution** in `ZB.MOM.WW.LmxProxy.slnx`:
```xml
<Folder Name="/tests/">
<Project Path="tests/ZB.MOM.WW.LmxProxy.Host.Tests/ZB.MOM.WW.LmxProxy.Host.Tests.csproj" />
</Folder>
```
### 6.1 PerformanceMetrics Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Host.Tests/Metrics/PerformanceMetricsTests.cs`
```csharp
namespace ZB.MOM.WW.LmxProxy.Host.Tests.Metrics
{
public class PerformanceMetricsTests
{
[Fact]
public void RecordOperation_TracksCountAndDuration()
// Record 5 operations, verify GetStatistics returns TotalCount=5
[Fact]
public void RecordOperation_TracksSuccessAndFailure()
// Record 3 success + 2 failure, verify SuccessRate == 0.6
[Fact]
public void GetStatistics_CalculatesP95Correctly()
// Record 100 operations with known durations (1ms through 100ms)
// Verify P95 is approximately 95ms
[Fact]
public void RollingBuffer_CapsAt1000Samples()
// Record 1500 operations, verify _durations list doesn't exceed 1000
// (test via GetStatistics behavior — TotalCount is 1500 but percentile computed from 1000)
[Fact]
public async Task BeginOperation_RecordsDurationOnDispose()
// Use BeginOperation, await Task.Delay(50), dispose scope
// Verify recorded duration >= 50ms
[Fact]
public void TimingScope_DefaultsToSuccess()
// BeginOperation + dispose without calling SetSuccess
// Verify SuccessCount == 1
[Fact]
public void TimingScope_RespectsSetSuccessFalse()
// BeginOperation, SetSuccess(false), dispose
// Verify SuccessCount == 0, TotalCount == 1
[Fact]
public void GetMetrics_ReturnsNullForUnknownOperation()
[Fact]
public void GetAllMetrics_ReturnsAllTrackedOperations()
}
}
```
### 6.2 HealthCheckService Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Host.Tests/Health/HealthCheckServiceTests.cs`
Use NSubstitute to mock `IScadaClient`. Create a real `PerformanceMetrics` instance and a real or mock `SubscriptionManager` (depends on Phase 2/3 implementation — if `SubscriptionManager` has an interface, mock it; if not, use the `GetSubscriptionStats()` approach with a concrete instance).
```csharp
namespace ZB.MOM.WW.LmxProxy.Host.Tests.Health
{
public class HealthCheckServiceTests
{
[Fact]
public async Task ReturnsHealthy_WhenConnectedAndNormalMetrics()
// Mock: IsConnected=true, ConnectionState=Connected
// SubscriptionStats: TotalClients=5, TotalTags=10
// PerformanceMetrics: record some successes
// Assert: HealthStatus.Healthy
[Fact]
public async Task ReturnsUnhealthy_WhenNotConnected()
// Mock: IsConnected=false
// Assert: HealthStatus.Unhealthy, description contains "not connected"
[Fact]
public async Task ReturnsDegraded_WhenSuccessRateBelow50Percent()
// Mock: IsConnected=true
// Record 200 operations with 40% success rate
// Assert: HealthStatus.Degraded
[Fact]
public async Task ReturnsDegraded_WhenClientCountOver100()
// Mock: IsConnected=true, SubscriptionStats.TotalClients=150
// Assert: HealthStatus.Degraded
[Fact]
public async Task DoesNotFlagLowSuccessRate_Under100Operations()
// Record 50 operations with 0% success rate
// Assert: still Healthy (threshold is > 100 total ops)
}
public class DetailedHealthCheckServiceTests
{
[Fact]
public async Task ReturnsUnhealthy_WhenNotConnected()
[Fact]
public async Task ReturnsHealthy_WhenTestTagGoodAndRecent()
// Mock ReadAsync returns Good quality with recent timestamp
// Assert: Healthy
[Fact]
public async Task ReturnsDegraded_WhenTestTagQualityNotGood()
// Mock ReadAsync returns Uncertain quality
// Assert: Degraded
[Fact]
public async Task ReturnsDegraded_WhenTestTagTimestampStale()
// Mock ReadAsync returns Good quality but timestamp 10 minutes ago
// Assert: Degraded
[Fact]
public async Task ReturnsDegraded_WhenTestTagReadThrows()
// Mock ReadAsync throws exception
// Assert: Degraded
}
}
```
### 6.3 StatusReportService Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Host.Tests/Status/StatusReportServiceTests.cs`
```csharp
namespace ZB.MOM.WW.LmxProxy.Host.Tests.Status
{
public class StatusReportServiceTests
{
[Fact]
public async Task GenerateJsonReportAsync_ReturnsCamelCaseJson()
// Verify JSON contains "serviceName", "connection", "isConnected" (camelCase)
[Fact]
public async Task GenerateHtmlReportAsync_ContainsAutoRefresh()
// Verify HTML contains <meta http-equiv="refresh" content="30">
[Fact]
public async Task IsHealthyAsync_ReturnsTrueWhenHealthy()
[Fact]
public async Task IsHealthyAsync_ReturnsFalseWhenUnhealthy()
[Fact]
public async Task GenerateJsonReportAsync_IncludesPerformanceMetrics()
// Record some operations, verify JSON includes operation names and stats
}
}
```
### 6.4 Run tests
```bash
ssh windev "cd C:\src\lmxproxy && dotnet test tests/ZB.MOM.WW.LmxProxy.Host.Tests --platform x86 --verbosity normal"
```
## Step 7: Build Verification
Run full solution build and tests:
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build ZB.MOM.WW.LmxProxy.slnx && dotnet test --verbosity normal"
```
If the test project is .NET 4.8 x86, you may need:
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build ZB.MOM.WW.LmxProxy.slnx --platform x86 && dotnet test tests/ZB.MOM.WW.LmxProxy.Host.Tests --platform x86"
```
## Completion Criteria
- [ ] `PerformanceMetrics` class with `OperationMetrics`, `MetricsStatistics`, `ITimingScope` in `src/ZB.MOM.WW.LmxProxy.Host/Metrics/`
- [ ] `HealthCheckService` and `DetailedHealthCheckService` in `src/ZB.MOM.WW.LmxProxy.Host/Health/`
- [ ] `StatusReportService` with data model classes in `src/ZB.MOM.WW.LmxProxy.Host/Status/`
- [ ] `StatusWebServer` with HTML dashboard, JSON status, and health endpoints in `src/ZB.MOM.WW.LmxProxy.Host/Status/`
- [ ] All components wired into `LmxProxyService.Start()` / `Stop()`
- [ ] `ScadaGrpcService` uses `PerformanceMetrics.BeginOperation()` for Read, ReadBatch, Write, WriteBatch RPCs
- [ ] Unit tests for PerformanceMetrics (recording, percentile, rolling buffer, timing scope)
- [ ] Unit tests for HealthCheckService (healthy, unhealthy, degraded transitions)
- [ ] Unit tests for DetailedHealthCheckService (connected, quality, staleness)
- [ ] Unit tests for StatusReportService (JSON format, HTML format, health aggregation)
- [ ] Solution builds without errors: `dotnet build ZB.MOM.WW.LmxProxy.slnx`
- [ ] All tests pass: `dotnet test`

@@ -0,0 +1,852 @@
# Phase 5: Client Core — Implementation Plan
**Date**: 2026-03-21
**Prerequisites**: Phase 1 complete and passing (Protocol & Domain Types — `ScadaContracts.cs` with v2 `TypedValue`/`QualityCode` messages, `Quality.cs`, `QualityExtensions.cs`, `Vtq.cs`, `ConnectionState.cs` all exist and cross-stack serialization tests pass)
**Working Directory**: The lmxproxy repo is on windev at `C:\src\lmxproxy`
## Guardrails
1. **Client targets .NET 10, AnyCPU** — use latest C# features freely. The csproj `<TargetFramework>` is `net10.0`, `<LangVersion>latest</LangVersion>`.
2. **Code-first gRPC only** — the Client uses `protobuf-net.Grpc` with `[ServiceContract]`/`[DataContract]` attributes. Never reference proto files or `Grpc.Tools`.
3. **No string serialization heuristics** — v2 uses native `TypedValue`. Do not write `double.TryParse`, `bool.TryParse`, or any string-to-value parsing on tag values.
4. **`status_code` is canonical for quality** — `symbolic_name` is derived. Never set `symbolic_name` independently.
5. **Polly v8 API** — the Client csproj already has `<PackageReference Include="Polly" Version="8.5.2" />`. Use the v8 `ResiliencePipeline` API, not the legacy v7 `IAsyncPolicy` API.
6. **No new NuGet packages** — all needed packages are already in `src/ZB.MOM.WW.LmxProxy.Client/ZB.MOM.WW.LmxProxy.Client.csproj`.
7. **Build command**: `dotnet build src/ZB.MOM.WW.LmxProxy.Client`
8. **Test command**: `dotnet test tests/ZB.MOM.WW.LmxProxy.Client.Tests`
9. **Namespace root**: `ZB.MOM.WW.LmxProxy.Client`
## Step 1: ClientTlsConfiguration
**File**: `src/ZB.MOM.WW.LmxProxy.Client/ClientTlsConfiguration.cs`
This file already exists with the correct shape. Verify it has all these properties (from Component-Client.md):
```csharp
namespace ZB.MOM.WW.LmxProxy.Client;
public class ClientTlsConfiguration
{
public bool UseTls { get; set; } = false;
public string? ClientCertificatePath { get; set; }
public string? ClientKeyPath { get; set; }
public string? ServerCaCertificatePath { get; set; }
public string? ServerNameOverride { get; set; }
public bool ValidateServerCertificate { get; set; } = true;
public bool AllowSelfSignedCertificates { get; set; } = false;
public bool IgnoreAllCertificateErrors { get; set; } = false;
}
```
If it matches, no changes needed. If any properties are missing, add them.
## Step 2: Security/GrpcChannelFactory
**File**: `src/ZB.MOM.WW.LmxProxy.Client/Security/GrpcChannelFactory.cs`
This file already exists. Verify the implementation covers:
1. `CreateChannel(Uri address, ClientTlsConfiguration? tlsConfiguration, ILogger logger)` — returns `GrpcChannel`.
2. Creates `SocketsHttpHandler` with `EnableMultipleHttp2Connections = true`.
3. For TLS: sets `SslProtocols = Tls12 | Tls13`, configures `ServerNameOverride` as `TargetHost`, loads client certificate from PEM files for mTLS.
4. Certificate validation callback handles: `IgnoreAllCertificateErrors`, `!ValidateServerCertificate`, custom CA trust store via `ServerCaCertificatePath`, `AllowSelfSignedCertificates`.
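5. Static constructor enables the `System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport` switch by calling `AppContext.SetSwitch("System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport", true)`, so HTTP/2 can be used over non-TLS connections.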
The existing implementation matches. No changes expected unless Phase 1 introduced breaking changes.
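For reference while verifying items 2 to 4, the handler setup could look roughly like the sketch below. It is not the existing implementation: `HandlerSketch.CreateHandler` and its flat parameter list are invented for the example (the real factory takes a `ClientTlsConfiguration`), and the custom CA and self-signed branches of the validation callback are left out.

```csharp
#nullable enable
using System.Net.Http;
using System.Net.Security;
using System.Security.Authentication;
using System.Security.Cryptography.X509Certificates;

public static class HandlerSketch
{
    public static SocketsHttpHandler CreateHandler(
        bool useTls,
        string? serverNameOverride,
        string? clientCertificatePath,
        string? clientKeyPath,
        bool ignoreAllCertificateErrors)
    {
        var handler = new SocketsHttpHandler { EnableMultipleHttp2Connections = true };
        if (!useTls) return handler;

        var ssl = new SslClientAuthenticationOptions
        {
            EnabledSslProtocols = SslProtocols.Tls12 | SslProtocols.Tls13
        };

        // ServerNameOverride maps onto the TLS target host used for SNI and name checks.
        if (!string.IsNullOrEmpty(serverNameOverride)) ssl.TargetHost = serverNameOverride;

        // mTLS: load the client certificate and its private key from PEM files.
        if (clientCertificatePath != null && clientKeyPath != null)
        {
            ssl.ClientCertificates = new X509CertificateCollection
            {
                X509Certificate2.CreateFromPemFile(clientCertificatePath, clientKeyPath)
            };
        }

        // Only the "accept everything" branch is shown; it should stay a diagnostics-only option.
        if (ignoreAllCertificateErrors)
        {
            ssl.RemoteCertificateValidationCallback = (_, _, _, _) => true;
        }

        handler.SslOptions = ssl;
        return handler;
    }
}
```

Note that the protocol list lives on `SslClientAuthenticationOptions.EnabledSslProtocols`, reached through `SocketsHttpHandler.SslOptions`, rather than on a property named `SslProtocols`.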
## Step 3: ILmxProxyClient Interface
**File**: `src/ZB.MOM.WW.LmxProxy.Client/ILmxProxyClient.cs`
Rewrite for v2 protocol. The key changes from v1:
- `WriteAsync` and `WriteBatchAsync` accept `TypedValue` instead of `object`
- `SubscribeAsync` has an `onStreamError` callback parameter
- `CheckApiKeyAsync` is added
- Return types use v2 domain `Vtq` (which wraps `TypedValue` + `QualityCode`)
```csharp
using ZB.MOM.WW.LmxProxy.Client.Domain;
namespace ZB.MOM.WW.LmxProxy.Client;
/// <summary>
/// Interface for LmxProxy client operations.
/// </summary>
public interface ILmxProxyClient : IDisposable, IAsyncDisposable
{
/// <summary>Gets or sets the default timeout for operations (range: 1s to 10min).</summary>
TimeSpan DefaultTimeout { get; set; }
/// <summary>Connects to the LmxProxy service and establishes a session.</summary>
Task ConnectAsync(CancellationToken cancellationToken = default);
/// <summary>Disconnects from the LmxProxy service.</summary>
Task DisconnectAsync();
/// <summary>Returns true if the client has an active session.</summary>
Task<bool> IsConnectedAsync();
/// <summary>Reads a single tag value.</summary>
Task<Vtq> ReadAsync(string address, CancellationToken cancellationToken = default);
/// <summary>Reads multiple tag values in a single batch.</summary>
Task<IDictionary<string, Vtq>> ReadBatchAsync(IEnumerable<string> addresses, CancellationToken cancellationToken = default);
/// <summary>Writes a single tag value (native TypedValue — no string heuristics).</summary>
Task WriteAsync(string address, TypedValue value, CancellationToken cancellationToken = default);
/// <summary>Writes multiple tag values in a single batch.</summary>
Task WriteBatchAsync(IDictionary<string, TypedValue> values, CancellationToken cancellationToken = default);
/// <summary>
/// Writes a batch of values, then polls a flag tag until it matches or timeout expires.
/// Returns (writeResults, flagReached, elapsedMs).
/// </summary>
Task<WriteBatchAndWaitResponse> WriteBatchAndWaitAsync(
IDictionary<string, TypedValue> values,
string flagTag,
TypedValue flagValue,
int timeoutMs = 5000,
int pollIntervalMs = 100,
CancellationToken cancellationToken = default);
/// <summary>Subscribes to tag updates with value and error callbacks.</summary>
Task<LmxProxyClient.ISubscription> SubscribeAsync( // ISubscription is nested in LmxProxyClient (see Step 9)
IEnumerable<string> addresses,
Action<string, Vtq> onUpdate,
Action<Exception>? onStreamError = null,
CancellationToken cancellationToken = default);
/// <summary>Validates an API key and returns info.</summary>
Task<LmxProxyClient.ApiKeyInfo> CheckApiKeyAsync(string apiKey, CancellationToken cancellationToken = default); // ApiKeyInfo is nested in LmxProxyClient (see Step 8)
/// <summary>Returns a snapshot of client-side metrics.</summary>
Dictionary<string, object> GetMetrics();
}
```
**Note**: The `TypedValue` class referenced here is from `Domain/ScadaContracts.cs` — it should already have been updated in Phase 1 to use `[DataContract]` with the v2 oneof-style properties (e.g., `BoolValue`, `Int32Value`, `DoubleValue`, `StringValue`, `DatetimeValue`, etc., with a `ValueCase` enum or similar discriminator).
## Step 4: LmxProxyClient — Main File
**File**: `src/ZB.MOM.WW.LmxProxy.Client/LmxProxyClient.cs`
This is a partial class. The main file contains the constructor, fields, properties, and the Read/Write/WriteBatch/WriteBatchAndWait/CheckApiKey methods.
### 4.1 Fields and Constructor
```csharp
public partial class LmxProxyClient : ILmxProxyClient
{
private readonly ILogger<LmxProxyClient> _logger;
private readonly string _host;
private readonly int _port;
private readonly string? _apiKey;
private readonly ClientTlsConfiguration? _tlsConfiguration;
private readonly ClientMetrics _metrics = new();
private readonly SemaphoreSlim _connectionLock = new(1, 1);
private readonly List<ISubscription> _activeSubscriptions = [];
private readonly Lock _subscriptionLock = new();
private GrpcChannel? _channel;
private IScadaService? _client;
private string _sessionId = string.Empty;
private bool _disposed;
private bool _isConnected;
private TimeSpan _defaultTimeout = TimeSpan.FromSeconds(30);
private ClientConfiguration? _configuration;
private ResiliencePipeline? _resiliencePipeline; // Polly v8
private Timer? _keepAliveTimer;
private readonly TimeSpan _keepAliveInterval = TimeSpan.FromSeconds(30);
// IsConnected computed property
public bool IsConnected => !_disposed && _isConnected && !string.IsNullOrEmpty(_sessionId);
public LmxProxyClient(
string host, int port, string? apiKey,
ClientTlsConfiguration? tlsConfiguration,
ILogger<LmxProxyClient>? logger = null)
{
_host = host ?? throw new ArgumentNullException(nameof(host));
_port = port;
_apiKey = apiKey;
_tlsConfiguration = tlsConfiguration;
_logger = logger ?? NullLogger<LmxProxyClient>.Instance;
}
internal void SetBuilderConfiguration(ClientConfiguration config)
{
_configuration = config;
// Build Polly v8 ResiliencePipeline from config
if (config.MaxRetryAttempts > 0)
{
_resiliencePipeline = new ResiliencePipelineBuilder()
.AddRetry(new RetryStrategyOptions
{
MaxRetryAttempts = config.MaxRetryAttempts,
Delay = config.RetryDelay,
BackoffType = DelayBackoffType.Exponential,
ShouldHandle = new PredicateBuilder()
.Handle<RpcException>(ex =>
ex.StatusCode == StatusCode.Unavailable ||
ex.StatusCode == StatusCode.DeadlineExceeded ||
ex.StatusCode == StatusCode.ResourceExhausted ||
ex.StatusCode == StatusCode.Aborted),
OnRetry = args =>
{
_logger.LogWarning("Retry {Attempt} after {Delay} for {Exception}",
args.AttemptNumber + 1, args.RetryDelay, args.Outcome.Exception?.Message); // AttemptNumber is zero-based in Polly v8
return ValueTask.CompletedTask;
}
})
.Build();
}
}
}
```
### 4.2 ReadAsync
```csharp
public async Task<Vtq> ReadAsync(string address, CancellationToken cancellationToken = default)
{
EnsureConnected();
_metrics.IncrementOperationCount("Read");
var sw = Stopwatch.StartNew();
try
{
var request = new ReadRequest { SessionId = _sessionId, Tag = address };
ReadResponse response = await ExecuteWithRetry(
() => _client!.ReadAsync(request).AsTask(), cancellationToken);
if (!response.Success)
throw new InvalidOperationException($"Read failed: {response.Message}");
return ConvertVtqMessage(response.Vtq);
}
catch
{
_metrics.IncrementErrorCount("Read");
throw;
}
finally
{
sw.Stop();
_metrics.RecordLatency("Read", sw.ElapsedMilliseconds);
}
}
```
### 4.3 ReadBatchAsync
```csharp
public async Task<IDictionary<string, Vtq>> ReadBatchAsync(
IEnumerable<string> addresses, CancellationToken cancellationToken = default)
{
EnsureConnected();
_metrics.IncrementOperationCount("ReadBatch");
var sw = Stopwatch.StartNew();
try
{
var request = new ReadBatchRequest { SessionId = _sessionId, Tags = addresses.ToList() };
ReadBatchResponse response = await ExecuteWithRetry(
() => _client!.ReadBatchAsync(request).AsTask(), cancellationToken);
var result = new Dictionary<string, Vtq>();
foreach (var vtqMsg in response.Vtqs)
{
result[vtqMsg.Tag] = ConvertVtqMessage(vtqMsg);
}
return result;
}
catch
{
_metrics.IncrementErrorCount("ReadBatch");
throw;
}
finally
{
sw.Stop();
_metrics.RecordLatency("ReadBatch", sw.ElapsedMilliseconds);
}
}
```
### 4.4 WriteAsync
```csharp
public async Task WriteAsync(string address, TypedValue value, CancellationToken cancellationToken = default)
{
EnsureConnected();
_metrics.IncrementOperationCount("Write");
var sw = Stopwatch.StartNew();
try
{
var request = new WriteRequest { SessionId = _sessionId, Tag = address, Value = value };
WriteResponse response = await ExecuteWithRetry(
() => _client!.WriteAsync(request).AsTask(), cancellationToken);
if (!response.Success)
throw new InvalidOperationException($"Write failed: {response.Message}");
}
catch
{
_metrics.IncrementErrorCount("Write");
throw;
}
finally
{
sw.Stop();
_metrics.RecordLatency("Write", sw.ElapsedMilliseconds);
}
}
```
### 4.5 WriteBatchAsync
```csharp
public async Task WriteBatchAsync(IDictionary<string, TypedValue> values, CancellationToken cancellationToken = default)
{
EnsureConnected();
_metrics.IncrementOperationCount("WriteBatch");
var sw = Stopwatch.StartNew();
try
{
var request = new WriteBatchRequest
{
SessionId = _sessionId,
Items = values.Select(kv => new WriteItem { Tag = kv.Key, Value = kv.Value }).ToList()
};
WriteBatchResponse response = await ExecuteWithRetry(
() => _client!.WriteBatchAsync(request).AsTask(), cancellationToken);
if (!response.Success)
throw new InvalidOperationException($"WriteBatch failed: {response.Message}");
}
catch
{
_metrics.IncrementErrorCount("WriteBatch");
throw;
}
finally
{
sw.Stop();
_metrics.RecordLatency("WriteBatch", sw.ElapsedMilliseconds);
}
}
```
### 4.6 WriteBatchAndWaitAsync
```csharp
public async Task<WriteBatchAndWaitResponse> WriteBatchAndWaitAsync(
IDictionary<string, TypedValue> values, string flagTag, TypedValue flagValue,
int timeoutMs = 5000, int pollIntervalMs = 100, CancellationToken cancellationToken = default)
{
EnsureConnected();
var request = new WriteBatchAndWaitRequest
{
SessionId = _sessionId,
Items = values.Select(kv => new WriteItem { Tag = kv.Key, Value = kv.Value }).ToList(),
FlagTag = flagTag,
FlagValue = flagValue,
TimeoutMs = timeoutMs,
PollIntervalMs = pollIntervalMs
};
return await ExecuteWithRetry(
() => _client!.WriteBatchAndWaitAsync(request).AsTask(), cancellationToken);
}
```
### 4.7 CheckApiKeyAsync
```csharp
public async Task<ApiKeyInfo> CheckApiKeyAsync(string apiKey, CancellationToken cancellationToken = default)
{
EnsureConnected();
var request = new CheckApiKeyRequest { ApiKey = apiKey };
CheckApiKeyResponse response = await _client!.CheckApiKeyAsync(request);
return new ApiKeyInfo { IsValid = response.IsValid, Description = response.Message };
}
```
### 4.8 ConvertVtqMessage helper
This converts the wire `VtqMessage` (v2 with `TypedValue` + `QualityCode`) to the domain `Vtq`:
```csharp
private static Vtq ConvertVtqMessage(VtqMessage? msg)
{
if (msg is null)
return new Vtq(null, DateTime.UtcNow, Quality.Bad);
object? value = ExtractTypedValue(msg.Value);
DateTime timestamp = msg.TimestampUtcTicks > 0
? new DateTime(msg.TimestampUtcTicks, DateTimeKind.Utc)
: DateTime.UtcNow;
Quality quality = QualityExtensions.FromStatusCode(msg.Quality?.StatusCode ?? 0x80000000u);
return new Vtq(value, timestamp, quality);
}
private static object? ExtractTypedValue(TypedValue? tv)
{
if (tv is null) return null;
// Switch on whichever oneof-style property is set
// The exact property names depend on the Phase 1 code-first contract design
// e.g., tv.BoolValue, tv.Int32Value, tv.DoubleValue, tv.StringValue, etc.
// Return the native .NET value directly — no string conversions
...
}
```
**Important**: The exact shape of `TypedValue` in code-first contracts depends on Phase 1's implementation. Phase 1 should have defined a discriminator pattern (e.g., `ValueCase` enum or nullable properties with a convention). Adapt `ExtractTypedValue` to whatever pattern was chosen. The key rule: **no string heuristics**.
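A minimal sketch of the discriminator approach, assuming Phase 1 settled on a `ValueCase`-style enum. `SketchTypedValue`, `SketchValueCase`, and `TypedValueSketch` are hypothetical stand-ins, not the real contract types, and `DatetimeValue` is assumed to carry UTC ticks. Adapt the property names to whatever Phase 1 actually defined.

```csharp
using System;

// Hypothetical stand-ins for the Phase 1 contract; the real TypedValue may differ.
public enum SketchValueCase { None, Bool, Int32, Int64, Float, Double, String, Datetime }

public sealed class SketchTypedValue
{
    public SketchValueCase ValueCase { get; init; }
    public bool BoolValue { get; init; }
    public int Int32Value { get; init; }
    public long Int64Value { get; init; }
    public float FloatValue { get; init; }
    public double DoubleValue { get; init; }
    public string? StringValue { get; init; }
    public long DatetimeValue { get; init; } // assumption: UTC ticks
}

public static class TypedValueSketch
{
    // Returns the boxed native value selected by the discriminator.
    // Nothing is parsed from or formatted to a string.
    public static object? Extract(SketchTypedValue? tv)
    {
        if (tv is null) return null;
        return tv.ValueCase switch
        {
            // The explicit object cast keeps numeric arms from widening to a common numeric type.
            SketchValueCase.Bool => (object)tv.BoolValue,
            SketchValueCase.Int32 => tv.Int32Value,
            SketchValueCase.Int64 => tv.Int64Value,
            SketchValueCase.Float => tv.FloatValue,
            SketchValueCase.Double => tv.DoubleValue,
            SketchValueCase.String => tv.StringValue,
            SketchValueCase.Datetime => new DateTime(tv.DatetimeValue, DateTimeKind.Utc),
            _ => null // None or an unknown case
        };
    }
}
```

Switching on the discriminator rather than probing each property keeps a legitimate `false`, `0`, or empty string distinguishable from "not set".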
### 4.9 ExecuteWithRetry helper
```csharp
private async Task<T> ExecuteWithRetry<T>(Func<Task<T>> operation, CancellationToken ct)
{
if (_resiliencePipeline is not null)
{
return await _resiliencePipeline.ExecuteAsync(
async token => await operation(), ct);
}
return await operation();
}
```
### 4.10 EnsureConnected, Dispose, DisposeAsync
```csharp
private void EnsureConnected()
{
ObjectDisposedException.ThrowIf(_disposed, this);
if (!IsConnected)
throw new InvalidOperationException("Client is not connected. Call ConnectAsync first.");
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
_keepAliveTimer?.Dispose();
_channel?.Dispose();
_connectionLock.Dispose();
}
public async ValueTask DisposeAsync()
{
if (_disposed) return;
try { await DisconnectAsync(); } catch { /* swallow */ }
Dispose();
}
```
### 4.11 IsConnectedAsync
```csharp
public Task<bool> IsConnectedAsync() => Task.FromResult(IsConnected);
```
### 4.12 GetMetrics
```csharp
public Dictionary<string, object> GetMetrics() => _metrics.GetSnapshot();
```
### 4.13 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Client"
```
## Step 5: LmxProxyClient.Connection
**File**: `src/ZB.MOM.WW.LmxProxy.Client/LmxProxyClient.Connection.cs`
Partial class containing `ConnectAsync`, `DisconnectAsync`, keep-alive, `MarkDisconnectedAsync`, `BuildEndpointUri`.
### 5.1 ConnectAsync
1. Acquire `_connectionLock`.
2. Throw `ObjectDisposedException` if disposed.
3. Return early if already connected.
4. Build endpoint URI via `BuildEndpointUri()`.
5. Create channel: `GrpcChannelFactory.CreateChannel(endpoint, _tlsConfiguration, _logger)`.
6. Create code-first client: `channel.CreateGrpcService<IScadaService>()` (from `ProtoBuf.Grpc.Client`).
7. Send `ConnectRequest` with `ClientId = $"ScadaBridge-{Guid.NewGuid():N}"` and `ApiKey = _apiKey ?? string.Empty`.
8. If `!response.Success`, dispose channel and throw.
9. Store channel, client, sessionId. Set `_isConnected = true`.
10. Call `StartKeepAlive()`.
11. On failure, reset all state and rethrow.
12. Release lock in `finally`.
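
The locking and rollback shape of these steps can be sketched as below. This is a skeleton under stated assumptions, not the real `ConnectAsync`: `ConnectionSketch` is a hypothetical class, and the `openSession` delegate stands in for channel creation plus the `ConnectRequest` round trip (steps 4 to 8).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical skeleton of the connect sequence.
public sealed class ConnectionSketch : IDisposable
{
    private readonly SemaphoreSlim _connectionLock = new(1, 1);
    private readonly Func<CancellationToken, Task<string>> _openSession;
    private string _sessionId = string.Empty;
    private bool _isConnected;
    private bool _disposed;

    public ConnectionSketch(Func<CancellationToken, Task<string>> openSession)
        => _openSession = openSession ?? throw new ArgumentNullException(nameof(openSession));

    public bool IsConnected => !_disposed && _isConnected && !string.IsNullOrEmpty(_sessionId);

    public async Task ConnectAsync(CancellationToken cancellationToken = default)
    {
        await _connectionLock.WaitAsync(cancellationToken);   // step 1
        try
        {
            ObjectDisposedException.ThrowIf(_disposed, this); // step 2
            if (IsConnected) return;                          // step 3: repeated calls are no-ops
            try
            {
                _sessionId = await _openSession(cancellationToken);
                _isConnected = true;                          // step 9
            }
            catch
            {
                _sessionId = string.Empty;                    // step 11: reset state, then rethrow
                _isConnected = false;
                throw;
            }
        }
        finally
        {
            _connectionLock.Release();                        // step 12
        }
    }

    public void Dispose()
    {
        _disposed = true;
        _connectionLock.Dispose();
    }
}
```

Injecting the session-opening step as a delegate is only a device for the sketch, though a similar seam would make the connection tests in Step 10 easier to write.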
### 5.2 DisconnectAsync
1. Acquire `_connectionLock`.
2. Stop keep-alive.
3. If client and session exist, send `DisconnectRequest`. Swallow exceptions.
4. Clear client, sessionId, isConnected. Dispose channel.
5. Release lock.
### 5.3 Keep-alive timer
- `StartKeepAlive()`: creates `Timer` with `_keepAliveInterval` (30s) interval.
- Timer callback: sends `GetConnectionStateRequest`. On failure: stops timer, calls `MarkDisconnectedAsync(ex)`.
- `StopKeepAlive()`: disposes timer, nulls it.
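
A sketch of the timer wiring, assuming `System.Threading.Timer`. `KeepAliveSketch` is hypothetical: `probe` stands in for sending `GetConnectionStateRequest`, and `onFailure` stands in for `MarkDisconnectedAsync`. The interval is a constructor parameter here so that a test can use milliseconds instead of 30s.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical keep-alive wrapper; not the real LmxProxyClient members.
public sealed class KeepAliveSketch : IDisposable
{
    private readonly Func<Task> _probe;
    private readonly Func<Exception, Task> _onFailure;
    private readonly TimeSpan _interval;
    private Timer? _timer;

    public KeepAliveSketch(Func<Task> probe, Func<Exception, Task> onFailure, TimeSpan interval)
    {
        _probe = probe ?? throw new ArgumentNullException(nameof(probe));
        _onFailure = onFailure ?? throw new ArgumentNullException(nameof(onFailure));
        _interval = interval;
    }

    public void Start()
    {
        Stop();
        // Timer callbacks are synchronous, so the async probe is started and not awaited here.
        _timer = new Timer(state => { _ = TickAsync(); }, null, _interval, _interval);
    }

    // Atomically detaches the timer so that concurrent Stop calls dispose it only once.
    public void Stop() => Interlocked.Exchange(ref _timer, null)?.Dispose();

    private async Task TickAsync()
    {
        try
        {
            await _probe();
        }
        catch (Exception ex)
        {
            Stop(); // no further ticks after the first failure
            try { await _onFailure(ex); }
            catch { /* an unobserved exception on a timer thread helps nobody */ }
        }
    }

    public void Dispose() => Stop();
}
```

A tick that is already running when `Stop()` is called still completes, so the failure handler should tolerate being invoked after a disconnect.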
### 5.4 MarkDisconnectedAsync
1. If disposed, return.
2. Acquire `_connectionLock`, set `_isConnected = false`, clear client/sessionId, dispose channel. Release lock.
3. Copy and clear `_activeSubscriptions` under `_subscriptionLock`.
4. Dispose each subscription (swallow errors).
5. Log warning with the exception.
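
Steps 3 and 4 amount to a copy-and-clear under the lock followed by disposal outside it. A minimal sketch, assuming a plain `object` lock for portability (the plan itself uses `System.Threading.Lock`); `SubscriptionRegistrySketch` is a hypothetical name:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry illustrating copy-and-clear; not the real _activeSubscriptions handling.
public sealed class SubscriptionRegistrySketch
{
    private readonly List<IDisposable> _active = new();
    private readonly object _gate = new();

    public void Add(IDisposable subscription)
    {
        lock (_gate) _active.Add(subscription);
    }

    // Disposes every registered subscription and returns how many Dispose calls threw.
    public int DisposeAll()
    {
        IDisposable[] snapshot;
        lock (_gate)
        {
            snapshot = _active.ToArray();
            _active.Clear();
        }

        int failures = 0;
        foreach (IDisposable subscription in snapshot)
        {
            try { subscription.Dispose(); }
            catch { failures++; } // swallow, as step 4 requires
        }
        return failures;
    }
}
```

Disposing from the snapshot means a dispose hook that calls back into the registry cannot modify the list while it is being iterated.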
### 5.5 BuildEndpointUri
```csharp
private Uri BuildEndpointUri()
{
string scheme = _tlsConfiguration?.UseTls == true ? Uri.UriSchemeHttps : Uri.UriSchemeHttp;
return new UriBuilder { Scheme = scheme, Host = _host, Port = _port }.Uri;
}
```
### 5.6 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Client"
```
## Step 6: LmxProxyClient.CodeFirstSubscription
**File**: `src/ZB.MOM.WW.LmxProxy.Client/LmxProxyClient.CodeFirstSubscription.cs`
Nested class inside `LmxProxyClient` implementing `ISubscription`.
### 6.1 CodeFirstSubscription class
```csharp
private class CodeFirstSubscription : ISubscription
{
private readonly IScadaService _client;
private readonly string _sessionId;
private readonly List<string> _tags;
private readonly Action<string, Vtq> _onUpdate;
private readonly Action<Exception>? _onStreamError;
private readonly ILogger<LmxProxyClient> _logger;
private readonly Action<ISubscription>? _onDispose;
private readonly CancellationTokenSource _cts = new();
private Task? _processingTask;
private bool _disposed;
private bool _streamErrorFired;
```
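The constructor takes the injected dependencies (client, session ID, tags, both callbacks, logger, and the optional dispose hook); the CTS, `_processingTask`, and the two flags are internal state. `StartAsync` stores `_processingTask = ProcessUpdatesAsync(cancellationToken)`.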
### 6.2 ProcessUpdatesAsync
```csharp
private async Task ProcessUpdatesAsync(CancellationToken cancellationToken)
{
try
{
var request = new SubscribeRequest
{
SessionId = _sessionId,
Tags = _tags,
SamplingMs = 1000
};
using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, _cts.Token);
await foreach (VtqMessage vtqMsg in _client.SubscribeAsync(request, linkedCts.Token))
{
try
{
Vtq vtq = ConvertVtqMessage(vtqMsg); // static method from outer class
_onUpdate(vtqMsg.Tag, vtq);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error processing subscription update for {Tag}", vtqMsg.Tag);
}
}
}
catch (OperationCanceledException) when (_cts.IsCancellationRequested || cancellationToken.IsCancellationRequested)
{
_logger.LogDebug("Subscription cancelled");
}
catch (Exception ex)
{
_logger.LogError(ex, "Error in subscription processing");
FireStreamError(ex);
}
finally
{
if (!_disposed)
{
_disposed = true;
_onDispose?.Invoke(this);
}
}
}
private void FireStreamError(Exception ex)
{
if (_streamErrorFired) return;
_streamErrorFired = true;
try { _onStreamError?.Invoke(ex); }
catch (Exception cbEx) { _logger.LogWarning(cbEx, "onStreamError callback threw"); }
}
```
**Key difference from v1**: `ConvertVtqMessage` now handles `TypedValue` + `QualityCode` natively instead of parsing strings. Also, the `_onStreamError` callback is invoked at most once, when the stream terminates with an error (per Component-Client.md section 5.1); in the code above, cancellation does not trigger it.
### 6.3 DisposeAsync and Dispose
`DisposeAsync()`: Cancel CTS, await `_processingTask` (swallow errors), dispose CTS. 5-second timeout guard.
`Dispose()`: Calls `DisposeAsync()` synchronously with `Task.Wait(TimeSpan.FromSeconds(5))`.
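One way to express the cancel, await, and timeout sequence is sketched below, assuming `Task.WaitAsync(TimeSpan)` (available since .NET 6) as the timeout guard. `SubscriptionDisposeSketch` is hypothetical, and `processLoop` stands in for `ProcessUpdatesAsync`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical disposal skeleton for a streaming subscription.
public sealed class SubscriptionDisposeSketch : IDisposable
{
    private static readonly TimeSpan DisposeTimeout = TimeSpan.FromSeconds(5);
    private readonly CancellationTokenSource _cts = new();
    private readonly Task _processingTask;
    private int _disposed; // 0 = live, 1 = disposed

    public SubscriptionDisposeSketch(Func<CancellationToken, Task> processLoop)
        => _processingTask = processLoop(_cts.Token);

    public async Task DisposeAsync()
    {
        // Only the first caller performs the shutdown.
        if (Interlocked.Exchange(ref _disposed, 1) == 1) return;

        _cts.Cancel();
        try
        {
            // Give the loop up to five seconds to observe the cancellation.
            await _processingTask.WaitAsync(DisposeTimeout);
        }
        catch
        {
            // Cancellation, stream faults, and the timeout itself are all swallowed.
        }
        _cts.Dispose();
    }

    // Synchronous path: block on the async path with the same upper bound.
    public void Dispose() => DisposeAsync().Wait(DisposeTimeout);
}
```

Blocking on `DisposeAsync()` is acceptable in a sketch, but under a synchronization context it can deadlock, so callers that can await should prefer the async path.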
### 6.4 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Client"
```
## Step 7: LmxProxyClient.ClientMetrics
**File**: `src/ZB.MOM.WW.LmxProxy.Client/LmxProxyClient.ClientMetrics.cs`
Internal class. Already exists in v1 reference. Rewrite for v2 with p99 support.
```csharp
internal class ClientMetrics
{
private readonly ConcurrentDictionary<string, long> _operationCounts = new();
private readonly ConcurrentDictionary<string, long> _errorCounts = new();
private readonly ConcurrentDictionary<string, List<long>> _latencies = new();
private readonly Lock _latencyLock = new();
public void IncrementOperationCount(string operation) { ... }
public void IncrementErrorCount(string operation) { ... }
public void RecordLatency(string operation, long milliseconds) { ... }
public Dictionary<string, object> GetSnapshot() { ... }
}
```
`RecordLatency`: Under `_latencyLock`, add to list. If count > 1000, `RemoveAt(0)`.
`GetSnapshot`: Returns dictionary with keys `{op}_count`, `{op}_errors`, `{op}_avg_latency_ms`, `{op}_p95_latency_ms`, `{op}_p99_latency_ms`.
`GetPercentile(List<long> values, int percentile)`: Sort, compute index as `(int)Math.Ceiling(percentile / 100.0 * sorted.Count) - 1`, clamp with `Math.Max(0, ...)`.
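The rolling buffer and the nearest-rank percentile described above can be sketched as follows. `LatencyWindowSketch` is a hypothetical single-operation version (the real `ClientMetrics` keys a list per operation), it uses a plain `object` lock, and the upper clamp on the index is an addition not stated in the plan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical single-operation latency window.
public sealed class LatencyWindowSketch
{
    private const int MaxSamples = 1000;
    private readonly List<long> _samples = new();
    private readonly object _gate = new();

    public int Count
    {
        get { lock (_gate) return _samples.Count; }
    }

    public void Record(long milliseconds)
    {
        lock (_gate)
        {
            _samples.Add(milliseconds);
            if (_samples.Count > MaxSamples)
                _samples.RemoveAt(0); // drop the oldest sample
        }
    }

    // Nearest-rank percentile: index = ceil(p / 100 * n) - 1, clamped to the valid range.
    public long Percentile(int percentile)
    {
        List<long> sorted;
        lock (_gate) sorted = _samples.OrderBy(v => v).ToList();
        if (sorted.Count == 0) return 0;

        int index = (int)Math.Ceiling(percentile / 100.0 * sorted.Count) - 1;
        return sorted[Math.Clamp(index, 0, sorted.Count - 1)];
    }
}
```

`RemoveAt(0)` shifts the whole list, which is fine at 1000 samples; a `Queue<long>` or ring buffer would avoid the copy if the window ever grows.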
## Step 8: LmxProxyClient.ApiKeyInfo
**File**: `src/ZB.MOM.WW.LmxProxy.Client/LmxProxyClient.ApiKeyInfo.cs`
Simple DTO returned by `CheckApiKeyAsync`:
```csharp
namespace ZB.MOM.WW.LmxProxy.Client;
public partial class LmxProxyClient
{
/// <summary>
/// Result of an API key validation check.
/// </summary>
public class ApiKeyInfo
{
public bool IsValid { get; init; }
public string? Role { get; init; }
public string? Description { get; init; }
}
}
```
## Step 9: LmxProxyClient.ISubscription
**File**: `src/ZB.MOM.WW.LmxProxy.Client/LmxProxyClient.ISubscription.cs`
```csharp
namespace ZB.MOM.WW.LmxProxy.Client;
public partial class LmxProxyClient
{
/// <summary>
/// Represents an active tag subscription. Dispose to unsubscribe.
/// </summary>
public interface ISubscription : IDisposable
{
/// <summary>Asynchronously cancels the subscription and waits for stream processing to stop.</summary>
Task DisposeAsync();
}
}
```
## Step 10: Unit Tests
**Project**: `tests/ZB.MOM.WW.LmxProxy.Client.Tests/`
Create the test project if it does not already exist:
```bash
ssh windev "cd C:\src\lmxproxy && dotnet new xunit -n ZB.MOM.WW.LmxProxy.Client.Tests -o tests/ZB.MOM.WW.LmxProxy.Client.Tests --framework net10.0"
```
**Csproj** for `tests/ZB.MOM.WW.LmxProxy.Client.Tests/ZB.MOM.WW.LmxProxy.Client.Tests.csproj`:
- `<TargetFramework>net10.0</TargetFramework>`
- `<ProjectReference Include="..\..\src\ZB.MOM.WW.LmxProxy.Client\ZB.MOM.WW.LmxProxy.Client.csproj" />`
- `<PackageReference Include="xunit" Version="2.9.3" />`
- `<PackageReference Include="xunit.runner.visualstudio" Version="2.8.2" />`
- `<PackageReference Include="NSubstitute" Version="5.3.0" />`
- `<PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.12.0" />`
**Add to solution** `ZB.MOM.WW.LmxProxy.slnx`:
```xml
<Folder Name="/tests/">
<Project Path="tests/ZB.MOM.WW.LmxProxy.Client.Tests/ZB.MOM.WW.LmxProxy.Client.Tests.csproj" />
</Folder>
```
### 10.1 Connection Lifecycle Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.Tests/LmxProxyClientConnectionTests.cs`
Mock `IScadaService` using NSubstitute.
```csharp
public class LmxProxyClientConnectionTests
{
[Fact]
public async Task ConnectAsync_EstablishesSessionAndStartsKeepAlive()
[Fact]
public async Task ConnectAsync_ThrowsWhenServerReturnsFailure()
[Fact]
public async Task DisconnectAsync_SendsDisconnectAndClearsState()
[Fact]
public async Task IsConnectedAsync_ReturnsFalseBeforeConnect()
[Fact]
public async Task IsConnectedAsync_ReturnsTrueAfterConnect()
[Fact]
public async Task KeepAliveFailure_MarksDisconnected()
}
```
Note: Testing the keep-alive requires either waiting 30s (too slow) or making the interval configurable for tests. Consider passing the interval as an internal constructor parameter or using a test-only subclass. Alternatively, test `MarkDisconnectedAsync` directly.
### 10.2 Read/Write Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.Tests/LmxProxyClientReadWriteTests.cs`
```csharp
public class LmxProxyClientReadWriteTests
{
[Fact]
public async Task ReadAsync_ReturnsVtqFromResponse()
// Mock ReadAsync to return a VtqMessage with TypedValue.DoubleValue = 42.5
// Verify returned Vtq.Value is 42.5 (double)
[Fact]
public async Task ReadAsync_ThrowsOnFailureResponse()
[Fact]
public async Task ReadBatchAsync_ReturnsDictionaryOfVtqs()
[Fact]
public async Task WriteAsync_SendsTypedValueDirectly()
// Verify the WriteRequest.Value is the TypedValue passed in, not a string
[Fact]
public async Task WriteBatchAsync_SendsAllItems()
[Fact]
public async Task WriteBatchAndWaitAsync_ReturnsResponse()
}
```
### 10.3 Subscription Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.Tests/LmxProxyClientSubscriptionTests.cs`
```csharp
public class LmxProxyClientSubscriptionTests
{
[Fact]
public async Task SubscribeAsync_InvokesCallbackForEachUpdate()
[Fact]
public async Task SubscribeAsync_InvokesStreamErrorOnFailure()
[Fact]
public async Task SubscribeAsync_DisposeStopsProcessing()
}
```
### 10.4 TypedValue Conversion Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.Tests/TypedValueConversionTests.cs`
```csharp
public class TypedValueConversionTests
{
[Fact] public void ConvertVtqMessage_ExtractsBoolValue()
[Fact] public void ConvertVtqMessage_ExtractsInt32Value()
[Fact] public void ConvertVtqMessage_ExtractsInt64Value()
[Fact] public void ConvertVtqMessage_ExtractsFloatValue()
[Fact] public void ConvertVtqMessage_ExtractsDoubleValue()
[Fact] public void ConvertVtqMessage_ExtractsStringValue()
[Fact] public void ConvertVtqMessage_ExtractsDateTimeValue()
[Fact] public void ConvertVtqMessage_HandlesNullTypedValue()
[Fact] public void ConvertVtqMessage_HandlesNullMessage()
[Fact] public void ConvertVtqMessage_MapsQualityCodeCorrectly()
[Fact] public void ConvertVtqMessage_GoodQualityCode()
[Fact] public void ConvertVtqMessage_BadQualityCode()
[Fact] public void ConvertVtqMessage_UncertainQualityCode()
}
```
### 10.5 Metrics Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.Tests/ClientMetricsTests.cs`
```csharp
public class ClientMetricsTests
{
[Fact] public void IncrementOperationCount_Increments()
[Fact] public void IncrementErrorCount_Increments()
[Fact] public void RecordLatency_StoresValues()
[Fact] public void RollingBuffer_CapsAt1000()
[Fact] public void GetSnapshot_IncludesP95AndP99()
}
```
### 10.6 Run tests
```bash
ssh windev "cd C:\src\lmxproxy && dotnet test tests/ZB.MOM.WW.LmxProxy.Client.Tests --verbosity normal"
```
## Step 11: Build Verification
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build ZB.MOM.WW.LmxProxy.slnx && dotnet test --verbosity normal"
```
## Completion Criteria
- [ ] `ILmxProxyClient` interface updated for v2 (TypedValue parameters, onStreamError callback, CheckApiKeyAsync)
- [ ] `LmxProxyClient.cs` — main file with Read/Write/WriteBatch/WriteBatchAndWait/CheckApiKey using v2 TypedValue
- [ ] `LmxProxyClient.Connection.cs` — ConnectAsync, DisconnectAsync, keep-alive (30s), MarkDisconnectedAsync
- [ ] `LmxProxyClient.CodeFirstSubscription.cs` — IAsyncEnumerable processing, onStreamError callback, 5s dispose timeout
- [ ] `LmxProxyClient.ClientMetrics.cs` — per-op counts/errors/latency, 1000-sample buffer, p95/p99
- [ ] `LmxProxyClient.ApiKeyInfo.cs` — simple DTO
- [ ] `LmxProxyClient.ISubscription.cs` — IDisposable + DisposeAsync
- [ ] `ClientTlsConfiguration.cs` — all properties present
- [ ] `Security/GrpcChannelFactory.cs` — TLS 1.2/1.3, cert validation, custom CA, self-signed support
- [ ] No string serialization heuristics anywhere in Client code
- [ ] ConvertVtqMessage extracts native TypedValue without parsing
- [ ] Polly v8 ResiliencePipeline for retry (not v7 IAsyncPolicy)
- [ ] All unit tests pass
- [ ] Solution builds cleanly

@@ -0,0 +1,815 @@
# Phase 6: Client Extras — Implementation Plan
**Date**: 2026-03-21
**Prerequisites**: Phase 5 complete and passing (Client Core — `ILmxProxyClient`, `LmxProxyClient` partial classes, `ClientMetrics`, `ISubscription`, `ApiKeyInfo` all functional with unit tests passing)
**Working Directory**: The lmxproxy repo is on windev at `C:\src\lmxproxy`
## Guardrails
1. **Client targets .NET 10, AnyCPU** — latest C# features permitted.
2. **Polly v8 API**: `ResiliencePipeline`, `ResiliencePipelineBuilder`, `RetryStrategyOptions`. Do NOT use Polly v7 `IAsyncPolicy`, `Policy.Handle<>().WaitAndRetryAsync(...)`.
3. **Builder default port is 50051** (per design doc section 11 — resolved conflict).
4. **No new NuGet packages**: `Polly 8.5.2`, `Microsoft.Extensions.DependencyInjection.Abstractions 10.0.0`, `Microsoft.Extensions.Configuration.Abstractions 10.0.0`, `Microsoft.Extensions.Configuration.Binder 10.0.0`, `Microsoft.Extensions.Logging.Abstractions 10.0.0` are already in the csproj.
5. **Build command**: `dotnet build src/ZB.MOM.WW.LmxProxy.Client`
6. **Test command**: `dotnet test tests/ZB.MOM.WW.LmxProxy.Client.Tests`
## Step 1: LmxProxyClientBuilder
**File**: `src/ZB.MOM.WW.LmxProxy.Client/LmxProxyClientBuilder.cs`
Rewrite the builder for v2. Key changes from v1:
- Default port changes from `5050` to `50051`
- Retry uses Polly v8 `ResiliencePipeline` (built in `SetBuilderConfiguration`)
- `WithCorrelationIdHeader` support
### 1.1 Builder fields
```csharp
public class LmxProxyClientBuilder
{
private string? _host;
private int _port = 50051; // CHANGED from 5050
private string? _apiKey;
private ILogger<LmxProxyClient>? _logger;
private TimeSpan _defaultTimeout = TimeSpan.FromSeconds(30);
private int _maxRetryAttempts = 3;
private TimeSpan _retryDelay = TimeSpan.FromSeconds(1);
private bool _enableMetrics;
private string? _correlationIdHeader;
private ClientTlsConfiguration? _tlsConfiguration;
```
### 1.2 Fluent methods
Each method returns `this` for chaining. Validation at call site:
| Method | Default | Validation |
|---|---|---|
| `WithHost(string host)` | Required | `!string.IsNullOrWhiteSpace(host)` |
| `WithPort(int port)` | 50051 | 1-65535 |
| `WithApiKey(string? apiKey)` | null | none |
| `WithLogger(ILogger<LmxProxyClient> logger)` | NullLogger | `!= null` |
| `WithTimeout(TimeSpan timeout)` | 30s | `> TimeSpan.Zero && <= TimeSpan.FromMinutes(10)` |
| `WithSslCredentials(string? certificatePath)` | disabled | creates/updates `_tlsConfiguration` with `UseTls=true` |
| `WithTlsConfiguration(ClientTlsConfiguration config)` | null | `!= null` |
| `WithRetryPolicy(int maxAttempts, TimeSpan retryDelay)` | 3, 1s | `maxAttempts > 0`, `retryDelay > TimeSpan.Zero` |
| `WithMetrics()` | disabled | sets `_enableMetrics = true` |
| `WithCorrelationIdHeader(string headerName)` | null | `!string.IsNullOrEmpty` |
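
The validate-then-return-`this` pattern behind the table can be sketched for two of the methods. `BuilderValidationSketch` is a hypothetical cut-down builder, and the choice of `ArgumentOutOfRangeException` is an assumption, since the plan does not name the exception types.

```csharp
using System;

// Hypothetical cut-down builder showing call-site validation.
public sealed class BuilderValidationSketch
{
    public int Port { get; private set; } = 50051;
    public TimeSpan Timeout { get; private set; } = TimeSpan.FromSeconds(30);

    public BuilderValidationSketch WithPort(int port)
    {
        if (port is < 1 or > 65535)
            throw new ArgumentOutOfRangeException(nameof(port), port, "Port must be between 1 and 65535.");
        Port = port;
        return this; // enables chaining
    }

    public BuilderValidationSketch WithTimeout(TimeSpan timeout)
    {
        if (timeout <= TimeSpan.Zero || timeout > TimeSpan.FromMinutes(10))
            throw new ArgumentOutOfRangeException(nameof(timeout), timeout, "Timeout must be greater than zero and at most 10 minutes.");
        Timeout = timeout;
        return this;
    }
}
```

Validating in each `With…` method reports a bad value at the line that supplied it, rather than later in `Build()`.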
### 1.3 Build()
```csharp
public LmxProxyClient Build()
{
if (string.IsNullOrWhiteSpace(_host))
throw new InvalidOperationException("Host must be specified. Call WithHost() before Build().");
ValidateTlsConfiguration();
var client = new LmxProxyClient(_host, _port, _apiKey, _tlsConfiguration, _logger)
{
DefaultTimeout = _defaultTimeout
};
client.SetBuilderConfiguration(new ClientConfiguration
{
MaxRetryAttempts = _maxRetryAttempts,
RetryDelay = _retryDelay,
EnableMetrics = _enableMetrics,
CorrelationIdHeader = _correlationIdHeader
});
return client;
}
```
### 1.4 ValidateTlsConfiguration
If `_tlsConfiguration?.UseTls == true`:
- If `ServerCaCertificatePath` is set and file doesn't exist → throw `FileNotFoundException`.
- If `ClientCertificatePath` is set and file doesn't exist → throw `FileNotFoundException`.
- If `ClientKeyPath` is set and file doesn't exist → throw `FileNotFoundException`.
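
These checks could look roughly like the sketch below. `TlsPathsSketch` is a hypothetical record that mirrors only the four `ClientTlsConfiguration` properties named above, and the exception messages are illustrative.

```csharp
using System;
using System.IO;

// Hypothetical stand-in carrying only the properties the validation reads.
public sealed record TlsPathsSketch(
    bool UseTls,
    string? ServerCaCertificatePath = null,
    string? ClientCertificatePath = null,
    string? ClientKeyPath = null);

public static class TlsValidationSketch
{
    public static void Validate(TlsPathsSketch? tls)
    {
        // Nothing to check when TLS is absent or disabled.
        if (tls is null || !tls.UseTls) return;

        RequireFile(tls.ServerCaCertificatePath, "Server CA certificate");
        RequireFile(tls.ClientCertificatePath, "Client certificate");
        RequireFile(tls.ClientKeyPath, "Client key");
    }

    // An unset path is allowed; a set path must point at an existing file.
    private static void RequireFile(string? path, string description)
    {
        if (!string.IsNullOrWhiteSpace(path) && !File.Exists(path))
            throw new FileNotFoundException($"{description} file not found.", path);
    }
}
```

Failing in `Build()` rather than at first connect surfaces a mistyped certificate path at startup.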
### 1.5 Polly v8 ResiliencePipeline setup (in LmxProxyClient.SetBuilderConfiguration)
This was defined in Step 4 of Phase 5. Verify it uses:
```csharp
using Polly;
using Polly.Retry;
using Grpc.Core;
_resiliencePipeline = new ResiliencePipelineBuilder()
.AddRetry(new RetryStrategyOptions
{
MaxRetryAttempts = config.MaxRetryAttempts,
Delay = config.RetryDelay,
BackoffType = DelayBackoffType.Exponential,
ShouldHandle = new PredicateBuilder()
.Handle<RpcException>(ex =>
ex.StatusCode == StatusCode.Unavailable ||
ex.StatusCode == StatusCode.DeadlineExceeded ||
ex.StatusCode == StatusCode.ResourceExhausted ||
ex.StatusCode == StatusCode.Aborted),
OnRetry = args =>
{
_logger.LogWarning(
"Retry {Attempt}/{Max} after {Delay}ms — {Error}",
args.AttemptNumber + 1, config.MaxRetryAttempts, // AttemptNumber is zero-based in Polly v8
args.RetryDelay.TotalMilliseconds,
args.Outcome.Exception?.Message ?? "unknown");
return ValueTask.CompletedTask;
}
})
.Build();
```
Backoff sequence: `retryDelay * 2^(attempt-1)` → 1s, 2s, 4s for defaults.
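The sequence can be checked with plain arithmetic, independent of Polly. The helper below is a hypothetical illustration that assumes jitter is left off and no maximum delay is configured; it reproduces the formula, not Polly's internal implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: computes the expected delay before each retry.
public static class BackoffSketch
{
    // Delay before retry n (1-based) is baseDelay * 2^(n - 1).
    public static IReadOnlyList<TimeSpan> Delays(TimeSpan baseDelay, int maxRetryAttempts)
    {
        var delays = new List<TimeSpan>(maxRetryAttempts);
        for (int attempt = 1; attempt <= maxRetryAttempts; attempt++)
        {
            double factor = Math.Pow(2, attempt - 1);
            delays.Add(TimeSpan.FromMilliseconds(baseDelay.TotalMilliseconds * factor));
        }
        return delays;
    }
}
```

With the builder defaults (3 attempts, 1s delay) the worst-case added wait before the final failure is 1s + 2s + 4s = 7s, on top of the time spent in the four calls themselves.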
### 1.6 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Client"
```
## Step 2: ClientConfiguration
**File**: This is already defined in `LmxProxyClientBuilder.cs` (at the bottom of the file, as an `internal class`). Verify it contains:
```csharp
internal class ClientConfiguration
{
public int MaxRetryAttempts { get; set; }
public TimeSpan RetryDelay { get; set; }
public bool EnableMetrics { get; set; }
public string? CorrelationIdHeader { get; set; }
}
```
No changes needed if it matches.
## Step 3: ILmxProxyClientFactory + LmxProxyClientFactory
**File**: `src/ZB.MOM.WW.LmxProxy.Client/ILmxProxyClientFactory.cs`
### 3.1 Interface
```csharp
namespace ZB.MOM.WW.LmxProxy.Client;
public interface ILmxProxyClientFactory
{
LmxProxyClient CreateClient();
LmxProxyClient CreateClient(string configName);
LmxProxyClient CreateClient(Action<LmxProxyClientBuilder> builderAction);
}
```
### 3.2 Implementation
```csharp
public class LmxProxyClientFactory : ILmxProxyClientFactory
{
private readonly IConfiguration _configuration;
public LmxProxyClientFactory(IConfiguration configuration)
{
_configuration = configuration ?? throw new ArgumentNullException(nameof(configuration));
}
public LmxProxyClient CreateClient() => CreateClient("LmxProxy");
public LmxProxyClient CreateClient(string configName)
{
IConfigurationSection section = _configuration.GetSection(configName);
var options = new LmxProxyClientOptions();
section.Bind(options);
return BuildFromOptions(options);
}
public LmxProxyClient CreateClient(Action<LmxProxyClientBuilder> builderAction)
{
var builder = new LmxProxyClientBuilder();
builderAction(builder);
return builder.Build();
}
private static LmxProxyClient BuildFromOptions(LmxProxyClientOptions options)
{
var builder = new LmxProxyClientBuilder()
.WithHost(options.Host)
.WithPort(options.Port)
.WithTimeout(options.Timeout)
.WithRetryPolicy(options.Retry.MaxAttempts, options.Retry.Delay);
if (!string.IsNullOrEmpty(options.ApiKey))
builder.WithApiKey(options.ApiKey);
if (options.EnableMetrics)
builder.WithMetrics();
if (!string.IsNullOrEmpty(options.CorrelationIdHeader))
builder.WithCorrelationIdHeader(options.CorrelationIdHeader);
if (options.UseSsl)
{
builder.WithTlsConfiguration(new ClientTlsConfiguration
{
UseTls = true,
ServerCaCertificatePath = options.CertificatePath
});
}
return builder.Build();
}
}
```
### 3.3 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Client"
```
## Step 4: ServiceCollectionExtensions
**File**: `src/ZB.MOM.WW.LmxProxy.Client/ServiceCollectionExtensions.cs`
### 4.1 Options classes
Define at the bottom of the file or in a separate `LmxProxyClientOptions.cs`:
```csharp
public class LmxProxyClientOptions
{
public string Host { get; set; } = "localhost";
public int Port { get; set; } = 50051; // CHANGED from 5050
public string? ApiKey { get; set; }
public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(30);
public bool UseSsl { get; set; }
public string? CertificatePath { get; set; }
public bool EnableMetrics { get; set; }
public string? CorrelationIdHeader { get; set; }
public RetryOptions Retry { get; set; } = new();
}
public class RetryOptions
{
public int MaxAttempts { get; set; } = 3;
public TimeSpan Delay { get; set; } = TimeSpan.FromSeconds(1);
}
```
### 4.2 Extension methods
```csharp
public static class ServiceCollectionExtensions
{
/// <summary>Registers a singleton ILmxProxyClient from the "LmxProxy" config section.</summary>
public static IServiceCollection AddLmxProxyClient(
this IServiceCollection services, IConfiguration configuration)
{
return services.AddLmxProxyClient(configuration, "LmxProxy");
}
/// <summary>Registers a singleton ILmxProxyClient from a named config section.</summary>
public static IServiceCollection AddLmxProxyClient(
this IServiceCollection services, IConfiguration configuration, string sectionName)
{
services.AddSingleton<ILmxProxyClientFactory>(
sp => new LmxProxyClientFactory(configuration));
services.AddSingleton<ILmxProxyClient>(sp =>
{
var factory = sp.GetRequiredService<ILmxProxyClientFactory>();
return factory.CreateClient(sectionName);
});
return services;
}
/// <summary>Registers a singleton ILmxProxyClient via builder action.</summary>
public static IServiceCollection AddLmxProxyClient(
this IServiceCollection services, Action<LmxProxyClientBuilder> configure)
{
services.AddSingleton<ILmxProxyClient>(sp =>
{
var builder = new LmxProxyClientBuilder();
configure(builder);
return builder.Build();
});
return services;
}
/// <summary>Registers a scoped ILmxProxyClient from the "LmxProxy" config section.</summary>
public static IServiceCollection AddScopedLmxProxyClient(
this IServiceCollection services, IConfiguration configuration)
{
services.AddSingleton<ILmxProxyClientFactory>(
sp => new LmxProxyClientFactory(configuration));
services.AddScoped<ILmxProxyClient>(sp =>
{
var factory = sp.GetRequiredService<ILmxProxyClientFactory>();
return factory.CreateClient();
});
return services;
}
/// <summary>Registers a keyed singleton ILmxProxyClient.</summary>
public static IServiceCollection AddNamedLmxProxyClient(
this IServiceCollection services, string name, Action<LmxProxyClientBuilder> configure)
{
services.AddKeyedSingleton<ILmxProxyClient>(name, (sp, key) =>
{
var builder = new LmxProxyClientBuilder();
configure(builder);
return builder.Build();
});
return services;
}
}
```
### 4.3 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Client"
```
## Step 5: StreamingExtensions
**File**: `src/ZB.MOM.WW.LmxProxy.Client/StreamingExtensions.cs`
### 5.1 ReadStreamAsync
```csharp
public static class StreamingExtensions
{
    /// <summary>
    /// Reads multiple tags as an async stream in batches.
    /// Retries up to 2 times per batch. Aborts after 3 consecutive batch errors.
    /// </summary>
    public static async IAsyncEnumerable<KeyValuePair<string, Vtq>> ReadStreamAsync(
        this ILmxProxyClient client,
        IEnumerable<string> addresses,
        int batchSize = 100,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        ArgumentNullException.ThrowIfNull(client);
        ArgumentNullException.ThrowIfNull(addresses);
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        const int maxConsecutiveErrors = 3;
        const int maxRetries = 2;
        var batch = new List<string>(batchSize);
        int consecutiveErrors = 0;

        foreach (string address in addresses)
        {
            cancellationToken.ThrowIfCancellationRequested();
            batch.Add(address);
            if (batch.Count < batchSize)
                continue;

            var results = await ReadBatchWithRetry(client, batch, maxRetries, cancellationToken);
            batch.Clear();

            // A null result means the batch failed on every attempt.
            consecutiveErrors = results is null ? consecutiveErrors + 1 : 0;
            if (consecutiveErrors >= maxConsecutiveErrors)
                throw new InvalidOperationException(
                    $"Aborting read stream after {maxConsecutiveErrors} consecutive batch errors.");

            if (results is not null)
            {
                foreach (var kvp in results)
                    yield return kvp;
            }
        }

        // Process the final partial batch, if any.
        if (batch.Count > 0)
        {
            var results = await ReadBatchWithRetry(client, batch, maxRetries, cancellationToken);
            if (results is not null)
            {
                foreach (var kvp in results)
                    yield return kvp;
            }
        }
    }

    /// <summary>
    /// Reads one batch, retrying on failure. Returns null if every attempt failed.
    /// </summary>
    private static async Task<IDictionary<string, Vtq>?> ReadBatchWithRetry(
        ILmxProxyClient client,
        List<string> batch,
        int maxRetries,
        CancellationToken ct)
    {
        for (int attempt = 0; attempt <= maxRetries; attempt++)
        {
            try
            {
                return await client.ReadBatchAsync(batch, ct);
            }
            catch (Exception) when (!ct.IsCancellationRequested)
            {
                // Swallow and retry; cancellation propagates to the caller.
            }
        }
        return null;
    }
```
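The batching arithmetic behind `ReadStreamAsync` can be checked in isolation. The sketch below is not the plan's implementation: it uses `Enumerable.Chunk` (available from .NET 6) instead of the manual `List<string>` buffer, and the tag names and the `BatchSizes` helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical tag addresses; real ones come from the caller.
List<string> addresses = Enumerable.Range(1, 250)
    .Select(i => $"Tag{i}")
    .ToList();

// Splits the source into batches of at most batchSize items and reports each batch size.
static List<int> BatchSizes(IEnumerable<string> source, int batchSize) =>
    source.Chunk(batchSize).Select(chunk => chunk.Length).ToList();

List<int> sizes = BatchSizes(addresses, 100);
Console.WriteLine(string.Join(", ", sizes)); // 100, 100, 50
```

With 250 addresses and a batch size of 100, a mocked `ReadBatchAsync` should therefore be called three times, with the last call carrying the 50 leftover addresses.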
### 5.2 WriteStreamAsync
```csharp
/// <summary>
/// Writes values from an async enumerable in batches. Returns total count written.
/// </summary>
public static async Task<int> WriteStreamAsync(
this ILmxProxyClient client,
IAsyncEnumerable<KeyValuePair<string, TypedValue>> values,
int batchSize = 100,
CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(client);
ArgumentNullException.ThrowIfNull(values);
if (batchSize <= 0)
throw new ArgumentOutOfRangeException(nameof(batchSize));
var batch = new Dictionary<string, TypedValue>(batchSize);
int totalWritten = 0;
await foreach (var kvp in values.WithCancellation(cancellationToken))
{
batch[kvp.Key] = kvp.Value;
if (batch.Count >= batchSize)
{
await client.WriteBatchAsync(batch, cancellationToken);
totalWritten += batch.Count;
batch.Clear();
}
}
if (batch.Count > 0)
{
await client.WriteBatchAsync(batch, cancellationToken);
totalWritten += batch.Count;
}
return totalWritten;
}
```
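One consequence of buffering into a `Dictionary` is worth seeing directly: two values for the same tag inside one batch collapse to the last one, so the returned total counts distinct tags per batch rather than input items. The sketch below reproduces only the batching loop, under assumptions: `WriteInBatches` and `Source` are hypothetical helpers, a delegate stands in for `WriteBatchAsync`, and plain strings stand in for `TypedValue`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Stand-in input stream; "A" appears twice within the first batch.
static async IAsyncEnumerable<KeyValuePair<string, string>> Source()
{
    yield return KeyValuePair.Create("A", "1");
    yield return KeyValuePair.Create("B", "2");
    yield return KeyValuePair.Create("A", "3"); // overwrites the earlier "A"
    yield return KeyValuePair.Create("C", "4");
    await Task.CompletedTask;
}

// Same buffering logic as WriteStreamAsync, with the client call replaced by a delegate.
static async Task<int> WriteInBatches(
    IAsyncEnumerable<KeyValuePair<string, string>> values,
    int batchSize,
    Func<IDictionary<string, string>, Task> writeBatch)
{
    var batch = new Dictionary<string, string>(batchSize);
    int total = 0;
    await foreach (var kvp in values)
    {
        batch[kvp.Key] = kvp.Value;
        if (batch.Count >= batchSize)
        {
            await writeBatch(batch);
            total += batch.Count;
            batch.Clear();
        }
    }
    if (batch.Count > 0)
    {
        await writeBatch(batch);
        total += batch.Count;
    }
    return total;
}

string lastA = "";
int written = await WriteInBatches(Source(), 3, b =>
{
    lastA = b["A"]; // value that actually reaches the writer
    return Task.CompletedTask;
});
Console.WriteLine($"{written} written, A={lastA}"); // 3 written, A=3
```

Four items go in but only three are reported as written. If callers need every intermediate value delivered, a list of pairs would be a safer buffer than a dictionary.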
### 5.3 ProcessInParallelAsync
```csharp
/// <summary>
/// Processes items in parallel with a configurable max concurrency (default 4).
/// </summary>
public static async Task ProcessInParallelAsync<T>(
this IAsyncEnumerable<T> source,
Func<T, CancellationToken, Task> processor,
int maxConcurrency = 4,
CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(source);
ArgumentNullException.ThrowIfNull(processor);
if (maxConcurrency <= 0)
throw new ArgumentOutOfRangeException(nameof(maxConcurrency));
using var semaphore = new SemaphoreSlim(maxConcurrency);
var tasks = new List<Task>();
await foreach (T item in source.WithCancellation(cancellationToken))
{
await semaphore.WaitAsync(cancellationToken);
tasks.Add(Task.Run(async () =>
{
try
{
await processor(item, cancellationToken);
}
finally
{
semaphore.Release();
}
}, cancellationToken));
}
await Task.WhenAll(tasks);
}
```
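The throttling behaviour can be observed without a client. The following sketch inlines the same `SemaphoreSlim` pattern in a hypothetical `RunThrottled` helper (cancellation omitted for brevity) and records the highest number of processors running at once. It could serve as a starting point for the `ProcessInParallelAsync_RespectsMaxConcurrency` test.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Simple async source of integers.
static async IAsyncEnumerable<int> Items(int count)
{
    for (int i = 0; i < count; i++)
    {
        yield return i;
        await Task.Yield();
    }
}

// Same throttling pattern as ProcessInParallelAsync, inlined so the sketch runs on its own.
static async Task RunThrottled<T>(
    IAsyncEnumerable<T> source, Func<T, Task> processor, int maxConcurrency)
{
    using var semaphore = new SemaphoreSlim(maxConcurrency);
    var tasks = new List<Task>();
    await foreach (T item in source)
    {
        await semaphore.WaitAsync();
        tasks.Add(Task.Run(async () =>
        {
            try { await processor(item); }
            finally { semaphore.Release(); }
        }));
    }
    await Task.WhenAll(tasks);
}

var gate = new object();
int current = 0, peak = 0;

await RunThrottled(Items(20), async _ =>
{
    lock (gate) { current++; peak = Math.Max(peak, current); }
    await Task.Delay(20); // simulated work
    lock (gate) { current--; }
}, maxConcurrency: 2);

Console.WriteLine($"peak concurrency: {peak}"); // never exceeds 2
```

The semaphore is acquired before each task is started, so the number of in-flight processors is bounded by `maxConcurrency` regardless of how fast the source produces items.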
### 5.4 SubscribeStreamAsync
```csharp
/// <summary>
/// Wraps a callback-based subscription into an IAsyncEnumerable via System.Threading.Channels.
/// </summary>
public static async IAsyncEnumerable<(string Tag, Vtq Vtq)> SubscribeStreamAsync(
this ILmxProxyClient client,
IEnumerable<string> addresses,
[EnumeratorCancellation] CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(client);
ArgumentNullException.ThrowIfNull(addresses);
var channel = Channel.CreateBounded<(string, Vtq)>(
new BoundedChannelOptions(1000)
{
FullMode = BoundedChannelFullMode.DropOldest,
SingleReader = true,
SingleWriter = false
});
ISubscription? subscription = null;
try
{
subscription = await client.SubscribeAsync(
addresses,
(tag, vtq) =>
{
channel.Writer.TryWrite((tag, vtq));
},
ex =>
{
channel.Writer.TryComplete(ex);
},
cancellationToken);
await foreach (var item in channel.Reader.ReadAllAsync(cancellationToken))
{
yield return item;
}
}
finally
{
subscription?.Dispose();
channel.Writer.TryComplete();
}
}
}
```
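The `DropOldest` setting determines what a slow consumer sees. A minimal sketch with a capacity of 3 instead of 1000 (integers stand in for tag updates) shows which items survive when the writer outpaces the reader:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;

var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(3)
{
    FullMode = BoundedChannelFullMode.DropOldest,
    SingleReader = true,
    SingleWriter = false
});

// Five updates arrive before the consumer reads anything.
for (int i = 1; i <= 5; i++)
    channel.Writer.TryWrite(i); // never blocks; the oldest item is evicted when full
channel.Writer.TryComplete();

var seen = new List<int>();
await foreach (int item in channel.Reader.ReadAllAsync())
    seen.Add(item);

Console.WriteLine(string.Join(", ", seen)); // 3, 4, 5
```

For `SubscribeStreamAsync` this means a consumer that falls more than 1000 updates behind silently loses the oldest ones and always receives the most recent values, which is usually the right trade-off for live tag data.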
### 5.5 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Client"
```
## Step 6: Properties/AssemblyInfo.cs
**File**: `src/ZB.MOM.WW.LmxProxy.Client/Properties/AssemblyInfo.cs`
Create this file if it doesn't already exist:
```csharp
using System.Runtime.CompilerServices;
[assembly: InternalsVisibleTo("ZB.MOM.WW.LmxProxy.Client.Tests")]
```
This allows the test project to access `internal` types like `ClientMetrics` and `ClientConfiguration`.
### 6.1 Verify build
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build src/ZB.MOM.WW.LmxProxy.Client"
```
## Step 7: Unit Tests
Add tests to the existing `tests/ZB.MOM.WW.LmxProxy.Client.Tests/` project (created in Phase 5).
### 7.1 Builder Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.Tests/LmxProxyClientBuilderTests.cs`
```csharp
public class LmxProxyClientBuilderTests
{
[Fact]
public void Build_ThrowsWhenHostNotSet()
{
var builder = new LmxProxyClientBuilder();
Assert.Throws<InvalidOperationException>(() => builder.Build());
}
[Fact]
public void Build_DefaultPort_Is50051()
{
var client = new LmxProxyClientBuilder()
.WithHost("localhost")
.Build();
// TODO: assert the port is 50051 (via reflection or the connection URI);
// NotNull alone does not prove the default.
Assert.NotNull(client);
}
[Fact]
public void WithPort_ThrowsOnZero()
{
Assert.Throws<ArgumentOutOfRangeException>(() =>
new LmxProxyClientBuilder().WithPort(0));
}
[Fact]
public void WithPort_ThrowsOn65536()
{
Assert.Throws<ArgumentOutOfRangeException>(() =>
new LmxProxyClientBuilder().WithPort(65536));
}
[Fact]
public void WithTimeout_ThrowsOnNegative()
{
Assert.Throws<ArgumentOutOfRangeException>(() =>
new LmxProxyClientBuilder().WithTimeout(TimeSpan.FromSeconds(-1)));
}
[Fact]
public void WithTimeout_ThrowsOver10Minutes()
{
Assert.Throws<ArgumentOutOfRangeException>(() =>
new LmxProxyClientBuilder().WithTimeout(TimeSpan.FromMinutes(11)));
}
[Fact]
public void WithRetryPolicy_ThrowsOnZeroAttempts()
{
Assert.Throws<ArgumentOutOfRangeException>(() =>
new LmxProxyClientBuilder().WithRetryPolicy(0, TimeSpan.FromSeconds(1)));
}
[Fact]
public void WithRetryPolicy_ThrowsOnZeroDelay()
{
Assert.Throws<ArgumentOutOfRangeException>(() =>
new LmxProxyClientBuilder().WithRetryPolicy(3, TimeSpan.Zero));
}
[Fact]
public void Build_WithAllOptions_Succeeds()
{
var client = new LmxProxyClientBuilder()
.WithHost("10.100.0.48")
.WithPort(50051)
.WithApiKey("test-key")
.WithTimeout(TimeSpan.FromSeconds(15))
.WithRetryPolicy(5, TimeSpan.FromSeconds(2))
.WithMetrics()
.WithCorrelationIdHeader("X-Correlation-ID")
.Build();
Assert.NotNull(client);
}
[Fact]
public void Build_WithTls_ValidatesCertificatePaths()
{
var builder = new LmxProxyClientBuilder()
.WithHost("localhost")
.WithTlsConfiguration(new ClientTlsConfiguration
{
UseTls = true,
ServerCaCertificatePath = "/nonexistent/cert.pem"
});
Assert.Throws<FileNotFoundException>(() => builder.Build());
}
[Fact]
public void WithHost_ThrowsOnNull()
{
Assert.Throws<ArgumentException>(() =>
new LmxProxyClientBuilder().WithHost(null!));
}
[Fact]
public void WithHost_ThrowsOnEmpty()
{
Assert.Throws<ArgumentException>(() =>
new LmxProxyClientBuilder().WithHost(""));
}
}
```
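The builder tests imply a set of guard clauses. The sketch below shows one way they might look. `ValidatePort`, `ValidateTimeout`, and `Throws` are hypothetical helpers, not the builder's actual members, and treating a zero timeout as invalid is an assumption (the tests only cover negative values and values over 10 minutes).

```csharp
using System;

// Hypothetical guard: valid TCP ports are 1..65535.
static int ValidatePort(int port)
{
    if (port < 1 || port > 65535)
        throw new ArgumentOutOfRangeException(nameof(port), port, "Port must be between 1 and 65535.");
    return port;
}

// Hypothetical guard: timeout must be positive (assumption) and at most 10 minutes.
static TimeSpan ValidateTimeout(TimeSpan timeout)
{
    if (timeout <= TimeSpan.Zero || timeout > TimeSpan.FromMinutes(10))
        throw new ArgumentOutOfRangeException(nameof(timeout), timeout, "Timeout must be in (0, 10 min].");
    return timeout;
}

// Reports whether the action throws ArgumentOutOfRangeException.
static bool Throws(Action action)
{
    try { action(); return false; }
    catch (ArgumentOutOfRangeException) { return true; }
}

Console.WriteLine(Throws(() => ValidatePort(0)));                            // True
Console.WriteLine(Throws(() => ValidatePort(65536)));                        // True
Console.WriteLine(Throws(() => ValidatePort(50051)));                        // False
Console.WriteLine(Throws(() => ValidateTimeout(TimeSpan.FromSeconds(-1))));  // True
Console.WriteLine(Throws(() => ValidateTimeout(TimeSpan.FromMinutes(11))));  // True
Console.WriteLine(Throws(() => ValidateTimeout(TimeSpan.FromSeconds(15))));  // False
```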
### 7.2 Factory Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.Tests/LmxProxyClientFactoryTests.cs`
```csharp
public class LmxProxyClientFactoryTests
{
[Fact]
public void CreateClient_BindsFromConfiguration()
{
var config = new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string?>
{
["LmxProxy:Host"] = "10.100.0.48",
["LmxProxy:Port"] = "50052",
["LmxProxy:ApiKey"] = "test-key",
["LmxProxy:Retry:MaxAttempts"] = "5",
["LmxProxy:Retry:Delay"] = "00:00:02",
})
.Build();
var factory = new LmxProxyClientFactory(config);
var client = factory.CreateClient();
Assert.NotNull(client);
}
[Fact]
public void CreateClient_NamedSection()
{
var config = new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string?>
{
["MyProxy:Host"] = "10.100.0.48",
["MyProxy:Port"] = "50052",
})
.Build();
var factory = new LmxProxyClientFactory(config);
var client = factory.CreateClient("MyProxy");
Assert.NotNull(client);
}
[Fact]
public void CreateClient_BuilderAction()
{
var config = new ConfigurationBuilder().Build();
var factory = new LmxProxyClientFactory(config);
var client = factory.CreateClient(b => b.WithHost("localhost").WithPort(50051));
Assert.NotNull(client);
}
}
```
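The `Retry:Delay` value in the first test is written as `00:00:02` for a reason. Assuming the factory binds options through a `TypeConverter`-based binder (as the standard .NET configuration binder does), the string goes through the stock `TimeSpan` converter. The sketch below, with a hypothetical `ParseDelay` helper, shows how that converter reads two similar-looking values:

```csharp
#nullable enable
using System;
using System.ComponentModel;

// Converts a configuration string the way a TypeConverter-based binder would.
static TimeSpan ParseDelay(string configValue) =>
    (TimeSpan)TypeDescriptor.GetConverter(typeof(TimeSpan))
        .ConvertFromInvariantString(configValue)!;

Console.WriteLine(ParseDelay("00:00:02").TotalSeconds); // 2
Console.WriteLine(ParseDelay("2").TotalDays);           // 2
```

A bare `"2"` is parsed as two days, not two seconds, so delay and timeout settings should always use the full `hh:mm:ss` form.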
### 7.3 StreamingExtensions Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.Tests/StreamingExtensionsTests.cs`
```csharp
public class StreamingExtensionsTests
{
    [Fact]
    public Task ReadStreamAsync_BatchesCorrectly()
    {
        // TODO: Create mock client, provide 250 addresses with batchSize=100
        // Verify ReadBatchAsync called 3 times (100, 100, 50)
        throw new NotImplementedException();
    }

    [Fact]
    public Task ReadStreamAsync_RetriesOnError()
    {
        // TODO: Mock first ReadBatchAsync to throw, second to succeed
        // Verify results returned from second attempt
        throw new NotImplementedException();
    }

    [Fact]
    public Task WriteStreamAsync_BatchesAndReturnsCount()
    {
        // TODO: Provide async enumerable of 250 items, batchSize=100
        // Verify WriteBatchAsync called 3 times, total returned = 250
        throw new NotImplementedException();
    }

    [Fact]
    public Task ProcessInParallelAsync_RespectsMaxConcurrency()
    {
        // TODO: Track concurrent count with a shared counter
        // maxConcurrency=2, verify never exceeds 2 concurrent calls
        throw new NotImplementedException();
    }

    [Fact]
    public Task SubscribeStreamAsync_YieldsFromChannel()
    {
        // TODO: Mock SubscribeAsync to invoke onUpdate callback with test values
        // Verify IAsyncEnumerable yields matching items
        throw new NotImplementedException();
    }
}
```
### 7.4 Run all tests
```bash
ssh windev "cd C:\src\lmxproxy && dotnet test tests/ZB.MOM.WW.LmxProxy.Client.Tests --verbosity normal"
```
## Step 8: Build Verification
Run full solution build and all tests:
```bash
ssh windev "cd C:\src\lmxproxy && dotnet build ZB.MOM.WW.LmxProxy.slnx && dotnet test --verbosity normal"
```
## Completion Criteria
- [ ] `LmxProxyClientBuilder` with default port 50051, Polly v8 wiring, all fluent methods, TLS validation
- [ ] `ClientConfiguration` internal record with retry, metrics, correlation header fields
- [ ] `ILmxProxyClientFactory` + `LmxProxyClientFactory` with 3 `CreateClient` overloads
- [ ] `ServiceCollectionExtensions` with `AddLmxProxyClient` (3 overloads), `AddScopedLmxProxyClient`, `AddNamedLmxProxyClient`
- [ ] `LmxProxyClientOptions` + `RetryOptions` configuration classes
- [ ] `StreamingExtensions` with `ReadStreamAsync` (batched, 2 retries per batch, abort after 3 consecutive batch errors), `WriteStreamAsync` (batched), `ProcessInParallelAsync` (SemaphoreSlim, default max concurrency 4), `SubscribeStreamAsync` (Channel-based IAsyncEnumerable)
- [ ] `Properties/AssemblyInfo.cs` with `InternalsVisibleTo` for test project
- [ ] Builder tests: validation, defaults, Polly pipeline wiring, TLS cert validation
- [ ] Factory tests: config binding from IConfiguration, named sections, builder action
- [ ] StreamingExtensions tests: batching, error recovery, parallel throttling, subscription streaming
- [ ] Solution builds cleanly
- [ ] All tests pass

@@ -0,0 +1,837 @@
# Phase 7: Integration Tests & Deployment — Implementation Plan
**Date**: 2026-03-21
**Prerequisites**: Phase 4 (Host complete) and Phase 6 (Client complete) both passing. All unit tests green.
**Working Directory (Mac)**: `/Users/dohertj2/Desktop/scadalink-design/lmxproxy`
**Working Directory (windev)**: `C:\src\lmxproxy`
**windev SSH**: `ssh windev` (alias configured in `~/.ssh/config`, passwordless ed25519, user `dohertj2`)
## Guardrails
1. **Never stop the v1 service until v2 is verified** — deploy v2 on alternate ports first.
2. **Take a Veeam backup before cutover** — provides rollback point.
3. **Integration tests run from Mac against windev** — they use `Grpc.Net.Client` which is cross-platform.
4. **All integration tests must pass before cutover**.
5. **API keys**: The existing `apikeys.json` on windev is the source of truth for valid keys. Read it to get test keys.
6. **Real MxAccess tags**: Use the `TestChildObject` tags on windev's AVEVA System Platform instance. Available tags cover all TypedValue cases:
- `TestChildObject.TestBool` (bool)
- `TestChildObject.TestInt` (int)
- `TestChildObject.TestFloat` (float)
- `TestChildObject.TestDouble` (double)
- `TestChildObject.TestString` (string)
- `TestChildObject.TestDateTime` (datetime)
- `TestChildObject.TestBoolArray[]` (bool array)
- `TestChildObject.TestDateTimeArray[]` (datetime array)
- `TestChildObject.TestDoubleArray[]` (double array)
- `TestChildObject.TestFloatArray[]` (float array)
- `TestChildObject.TestIntArray[]` (int array)
- `TestChildObject.TestStringArray[]` (string array)
## Step 1: Build Host on windev
### 1.1 Pull latest code
```bash
ssh windev "cd C:\src\lmxproxy && git pull"
```
If the repo doesn't exist on windev yet:
```bash
ssh windev "git clone https://gitea.dohertylan.com/dohertj2/lmxproxy.git C:\src\lmxproxy"
```
### 1.2 Publish Host binary
```bash
ssh windev "cd C:\src\lmxproxy && dotnet publish src/ZB.MOM.WW.LmxProxy.Host -c Release -r win-x86 --self-contained false -o C:\publish-v2"
```
**Expected output**: `C:\publish-v2\ZB.MOM.WW.LmxProxy.Host.exe` plus dependencies.
### 1.3 Create v2 appsettings.json
Create `C:\publish-v2\appsettings.json` configured for testing on alternate ports:
```bash
ssh windev "powershell -Command \"@'
{
\"GrpcPort\": 50052,
\"ApiKeyConfigFile\": \"apikeys.json\",
\"Connection\": {
\"MonitorIntervalSeconds\": 5,
\"ConnectionTimeoutSeconds\": 30,
\"ReadTimeoutSeconds\": 5,
\"WriteTimeoutSeconds\": 5,
\"MaxConcurrentOperations\": 10,
\"AutoReconnect\": true
},
\"Subscription\": {
\"ChannelCapacity\": 1000,
\"ChannelFullMode\": \"DropOldest\"
},
\"HealthCheck\": {
\"Enabled\": true,
\"TestTagAddress\": \"TestChildObject.TestBool\",
\"MaxStaleDataMinutes\": 5
},
\"Tls\": {
\"Enabled\": false
},
\"WebServer\": {
\"Enabled\": true,
\"Port\": 8081
},
\"Serilog\": {
\"MinimumLevel\": {
\"Default\": \"Information\",
\"Override\": {
\"Microsoft\": \"Warning\",
\"System\": \"Warning\",
\"Grpc\": \"Information\"
}
},
\"WriteTo\": [
{ \"Name\": \"Console\" },
{
\"Name\": \"File\",
\"Args\": {
\"path\": \"logs/lmxproxy-v2-.txt\",
\"rollingInterval\": \"Day\",
\"retainedFileCountLimit\": 30
}
}
]
}
}
'@ | Set-Content -Path 'C:\publish-v2\appsettings.json' -Encoding UTF8\""
```
**Key differences from production config**: gRPC port is 50052 (not 50051), web port is 8081 (not 8080), log file prefix is `lmxproxy-v2-`.
### 1.4 Copy apikeys.json
If v2 should use the same API keys as v1:
```bash
ssh windev "copy C:\publish\apikeys.json C:\publish-v2\apikeys.json"
```
If `C:\publish\apikeys.json` doesn't exist, the v2 service will auto-generate one on first start. To check whether the file exists:
```bash
ssh windev "if not exist C:\publish\apikeys.json echo No existing apikeys.json - v2 will auto-generate"
```
### 1.5 Verify the publish directory
```bash
ssh windev "dir C:\publish-v2\ZB.MOM.WW.LmxProxy.Host.exe && dir C:\publish-v2\appsettings.json"
```
## Step 2: Deploy v2 Host Service
### 2.1 Install as a separate Topshelf service
The v2 service runs alongside v1 on different ports. Install with a distinct service name:
```bash
ssh windev "C:\publish-v2\ZB.MOM.WW.LmxProxy.Host.exe install -servicename \"ZB.MOM.WW.LmxProxy.Host.V2\" -displayname \"SCADA Bridge LMX Proxy V2\" -description \"LmxProxy v2 gRPC service (test deployment)\" --autostart"
```
### 2.2 Start the v2 service
```bash
ssh windev "sc start ZB.MOM.WW.LmxProxy.Host.V2"
```
### 2.3 Wait 10 seconds for startup, then verify
```bash
sleep 10 && ssh windev "sc query ZB.MOM.WW.LmxProxy.Host.V2"
```
Expected: `STATE: 4 RUNNING`.
### 2.4 Verify status page
From Mac, use curl to check the v2 status page:
```bash
curl -s http://10.100.0.48:8081/ | head -20
```
Expected: HTML containing "LmxProxy Status Dashboard".
```bash
curl -s http://10.100.0.48:8081/api/health
```
Expected: `OK` with HTTP 200.
```bash
curl -s http://10.100.0.48:8081/api/status | python3 -m json.tool | head -30
```
Expected: JSON with `serviceName`, `connection.isConnected: true`, version info.
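A fixed 10-second wait can be too short or needlessly long. As an alternative, the health endpoint can be polled until it responds. The sketch below is an assumption-laden illustration: `WaitForHealthyAsync` and `HttpProbe` are hypothetical helpers, and the demo uses a fake probe so it runs without network access.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Polls until the probe reports healthy or the attempts run out.
static async Task<bool> WaitForHealthyAsync(
    Func<Task<bool>> probe, int maxAttempts, TimeSpan delay)
{
    for (int attempt = 1; attempt <= maxAttempts; attempt++)
    {
        if (await probe())
            return true;
        if (attempt < maxAttempts)
            await Task.Delay(delay);
    }
    return false;
}

// Probe for an HTTP health endpoint; connection errors count as "not yet healthy".
static Func<Task<bool>> HttpProbe(HttpClient http, string url) => async () =>
{
    try
    {
        using HttpResponseMessage response = await http.GetAsync(url);
        return response.IsSuccessStatusCode;
    }
    catch (HttpRequestException)
    {
        return false;
    }
};

// Offline demo: a fake probe that succeeds on the third call.
int calls = 0;
bool healthy = await WaitForHealthyAsync(
    () => Task.FromResult(++calls >= 3),
    maxAttempts: 5,
    delay: TimeSpan.FromMilliseconds(10));
Console.WriteLine($"healthy={healthy} after {calls} calls"); // healthy=True after 3 calls
```

Against the real service the probe would be `HttpProbe(new HttpClient(), "http://10.100.0.48:8081/api/health")` with a delay of a second or so between attempts.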
### 2.5 Verify MxAccess connected
The status page should show `MxAccess Connection: Connected`. If it shows `Disconnected`, check the logs:
```bash
ssh windev "type C:\publish-v2\logs\lmxproxy-v2-*.txt | findstr /i \"error\""
```
### 2.6 Read the apikeys.json to get test keys
```bash
ssh windev "type C:\publish-v2\apikeys.json"
```
Record the ReadWrite and ReadOnly API keys for use in integration tests. Example structure:
```json
{
"Keys": [
{ "Key": "abc123...", "Role": "ReadWrite", "Description": "Default ReadWrite key" },
{ "Key": "def456...", "Role": "ReadOnly", "Description": "Default ReadOnly key" }
]
}
```
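Copying keys by hand is error-prone, so the keys can also be extracted programmatically. The sketch below assumes the file has exactly the structure shown above. `FindKey` is a hypothetical helper and the key values are placeholders, not real keys.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Placeholder content in the shape shown above; real keys come from windev.
string json = """
{
  "Keys": [
    { "Key": "abc123", "Role": "ReadWrite", "Description": "Default ReadWrite key" },
    { "Key": "def456", "Role": "ReadOnly", "Description": "Default ReadOnly key" }
  ]
}
""";

// Returns the first key with the given role, or null if none is present.
static string? FindKey(string apiKeysJson, string role)
{
    using JsonDocument doc = JsonDocument.Parse(apiKeysJson);
    foreach (JsonElement entry in doc.RootElement.GetProperty("Keys").EnumerateArray())
    {
        if (entry.GetProperty("Role").GetString() == role)
            return entry.GetProperty("Key").GetString();
    }
    return null;
}

Console.WriteLine(FindKey(json, "ReadWrite")); // abc123
Console.WriteLine(FindKey(json, "ReadOnly"));  // def456
```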
## Step 3: Create Integration Test Project
### 3.1 Create project
On the Mac (the test project targets .NET 10 and is cross-platform, so it can also be created on windev):
```bash
cd /Users/dohertj2/Desktop/scadalink-design/lmxproxy
dotnet new xunit -n ZB.MOM.WW.LmxProxy.Client.IntegrationTests -o tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests --framework net10.0
```
### 3.2 Configure csproj
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests.csproj`
```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <LangVersion>latest</LangVersion>
    <Nullable>enable</Nullable>
    <!-- The test files below rely on implicit System.* usings. -->
    <ImplicitUsings>enable</ImplicitUsings>
    <IsPackable>false</IsPackable>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="xunit" Version="2.9.3" />
    <PackageReference Include="xunit.runner.visualstudio" Version="2.8.2" />
    <PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.12.0" />
    <PackageReference Include="Microsoft.Extensions.Configuration" Version="10.0.0" />
    <PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="10.0.0" />
  </ItemGroup>
  <ItemGroup>
    <!-- Global using so [Fact], Assert and IAsyncLifetime resolve without a using directive. -->
    <Using Include="Xunit" />
  </ItemGroup>
  <ItemGroup>
    <ProjectReference Include="..\..\src\ZB.MOM.WW.LmxProxy.Client\ZB.MOM.WW.LmxProxy.Client.csproj" />
  </ItemGroup>
  <ItemGroup>
    <None Update="appsettings.test.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    </None>
  </ItemGroup>
</Project>
```
### 3.3 Add to solution
Edit `ZB.MOM.WW.LmxProxy.slnx`:
```xml
<Solution>
<Folder Name="/src/">
<Project Path="src/ZB.MOM.WW.LmxProxy.Host/ZB.MOM.WW.LmxProxy.Host.csproj" />
<Project Path="src/ZB.MOM.WW.LmxProxy.Client/ZB.MOM.WW.LmxProxy.Client.csproj" />
</Folder>
<Folder Name="/tests/">
<Project Path="tests/ZB.MOM.WW.LmxProxy.Host.Tests/ZB.MOM.WW.LmxProxy.Host.Tests.csproj" />
<Project Path="tests/ZB.MOM.WW.LmxProxy.Client.Tests/ZB.MOM.WW.LmxProxy.Client.Tests.csproj" />
<Project Path="tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests.csproj" />
</Folder>
</Solution>
```
### 3.4 Create test configuration
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/appsettings.test.json`
```json
{
"LmxProxy": {
"Host": "10.100.0.48",
"Port": 50052,
"ReadWriteApiKey": "REPLACE_WITH_ACTUAL_KEY",
"ReadOnlyApiKey": "REPLACE_WITH_ACTUAL_KEY",
"InvalidApiKey": "invalid-key-that-does-not-exist"
}
}
```
**IMPORTANT**: After reading the actual `apikeys.json` from windev in Step 2.6, replace the placeholder values with the real keys.
### 3.5 Create test base class
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/IntegrationTestBase.cs`
```csharp
using Microsoft.Extensions.Configuration;
using ZB.MOM.WW.LmxProxy.Client;
namespace ZB.MOM.WW.LmxProxy.Client.IntegrationTests;
public abstract class IntegrationTestBase : IAsyncLifetime
{
protected IConfiguration Configuration { get; }
protected string Host { get; }
protected int Port { get; }
protected string ReadWriteApiKey { get; }
protected string ReadOnlyApiKey { get; }
protected string InvalidApiKey { get; }
protected LmxProxyClient? Client { get; set; }
protected IntegrationTestBase()
{
Configuration = new ConfigurationBuilder()
.AddJsonFile("appsettings.test.json")
.Build();
var section = Configuration.GetSection("LmxProxy");
Host = section["Host"] ?? "10.100.0.48";
Port = int.Parse(section["Port"] ?? "50052");
ReadWriteApiKey = section["ReadWriteApiKey"] ?? throw new Exception("ReadWriteApiKey not configured");
ReadOnlyApiKey = section["ReadOnlyApiKey"] ?? throw new Exception("ReadOnlyApiKey not configured");
InvalidApiKey = section["InvalidApiKey"] ?? "invalid-key";
}
protected LmxProxyClient CreateClient(string? apiKey = null)
{
return new LmxProxyClientBuilder()
.WithHost(Host)
.WithPort(Port)
.WithApiKey(apiKey ?? ReadWriteApiKey)
.WithTimeout(TimeSpan.FromSeconds(10))
.WithRetryPolicy(2, TimeSpan.FromSeconds(1))
.WithMetrics()
.Build();
}
public virtual async Task InitializeAsync()
{
Client = CreateClient();
await Client.ConnectAsync();
}
public virtual async Task DisposeAsync()
{
if (Client is not null)
{
await Client.DisconnectAsync();
Client.Dispose();
}
}
}
```
## Step 4: Integration Test Scenarios
### 4.1 Connection Lifecycle
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/ConnectionTests.cs`
```csharp
namespace ZB.MOM.WW.LmxProxy.Client.IntegrationTests;
public class ConnectionTests : IntegrationTestBase
{
[Fact]
public async Task ConnectAndDisconnect_Succeeds()
{
// Client is connected in InitializeAsync
Assert.True(await Client!.IsConnectedAsync());
await Client.DisconnectAsync();
Assert.False(await Client.IsConnectedAsync());
}
[Fact]
public async Task ConnectWithInvalidApiKey_Fails()
{
using var badClient = CreateClient(InvalidApiKey);
// Expect RpcException with StatusCode.Unauthenticated
var ex = await Assert.ThrowsAsync<Grpc.Core.RpcException>(
() => badClient.ConnectAsync());
Assert.Equal(Grpc.Core.StatusCode.Unauthenticated, ex.StatusCode);
}
[Fact]
public async Task DoubleConnect_IsIdempotent()
{
await Client!.ConnectAsync(); // Already connected — should be no-op
Assert.True(await Client.IsConnectedAsync());
}
}
```
### 4.2 Read Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/ReadTests.cs`
```csharp
namespace ZB.MOM.WW.LmxProxy.Client.IntegrationTests;
public class ReadTests : IntegrationTestBase
{
[Fact]
public async Task Read_BoolTag_ReturnsBoolValue()
{
var vtq = await Client!.ReadAsync("TestChildObject.TestBool");
Assert.IsType<bool>(vtq.Value);
Assert.True(vtq.Quality.IsGood());
}
[Fact]
public async Task Read_IntTag_ReturnsIntValue()
{
var vtq = await Client!.ReadAsync("TestChildObject.TestInt");
Assert.True(vtq.Value is int or long);
Assert.True(vtq.Quality.IsGood());
}
[Fact]
public async Task Read_FloatTag_ReturnsFloatValue()
{
var vtq = await Client!.ReadAsync("TestChildObject.TestFloat");
Assert.True(vtq.Value is float or double);
Assert.True(vtq.Quality.IsGood());
}
[Fact]
public async Task Read_DoubleTag_ReturnsDoubleValue()
{
var vtq = await Client!.ReadAsync("TestChildObject.TestDouble");
Assert.IsType<double>(vtq.Value);
Assert.True(vtq.Quality.IsGood());
}
[Fact]
public async Task Read_StringTag_ReturnsStringValue()
{
var vtq = await Client!.ReadAsync("TestChildObject.TestString");
Assert.IsType<string>(vtq.Value);
Assert.True(vtq.Quality.IsGood());
}
[Fact]
public async Task Read_DateTimeTag_ReturnsDateTimeValue()
{
var vtq = await Client!.ReadAsync("TestChildObject.TestDateTime");
Assert.IsType<DateTime>(vtq.Value);
Assert.True(vtq.Quality.IsGood());
Assert.True(DateTime.UtcNow - vtq.Timestamp < TimeSpan.FromHours(1));
}
[Fact]
public async Task ReadBatch_MultipleTags_ReturnsDictionary()
{
// Use two distinct tags; a duplicated tag would collapse to one dictionary entry.
var tags = new[] { "TestChildObject.TestString", "TestChildObject.TestInt" };
var results = await Client!.ReadBatchAsync(tags);
Assert.Equal(2, results.Count);
Assert.True(results.ContainsKey("TestChildObject.TestString"));
Assert.True(results.ContainsKey("TestChildObject.TestInt"));
}
[Fact]
public async Task Read_NonexistentTag_ReturnsBadQuality()
{
// Reading a tag that doesn't exist should return Bad quality
// (or throw — depends on Host implementation. Adjust assertion accordingly.)
var vtq = await Client!.ReadAsync("NonExistent.Tag.12345");
// If the Host returns success=false, ReadAsync will throw.
// If it returns success=true with bad quality, check quality.
// Adjust based on actual behavior.
}
}
```
### 4.3 Write Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/WriteTests.cs`
```csharp
using ZB.MOM.WW.LmxProxy.Client.Domain;
namespace ZB.MOM.WW.LmxProxy.Client.IntegrationTests;
public class WriteTests : IntegrationTestBase
{
[Fact]
public async Task WriteAndReadBack_StringValue()
{
string testValue = $"IntTest-{DateTime.UtcNow:HHmmss}";
// Write to a writable string tag
await Client!.WriteAsync("TestChildObject.TestString",
new TypedValue { StringValue = testValue });
// Read back and verify
await Task.Delay(500); // Allow time for write to propagate
var vtq = await Client.ReadAsync("TestChildObject.TestString");
Assert.Equal(testValue, vtq.Value);
}
[Fact]
public async Task WriteWithReadOnlyKey_ThrowsPermissionDenied()
{
using var readOnlyClient = CreateClient(ReadOnlyApiKey);
await readOnlyClient.ConnectAsync();
var ex = await Assert.ThrowsAsync<Grpc.Core.RpcException>(
() => readOnlyClient.WriteAsync("TestChildObject.TestString",
new TypedValue { StringValue = "should-fail" }));
Assert.Equal(Grpc.Core.StatusCode.PermissionDenied, ex.StatusCode);
}
}
```
### 4.4 Subscribe Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/SubscribeTests.cs`
```csharp
namespace ZB.MOM.WW.LmxProxy.Client.IntegrationTests;
public class SubscribeTests : IntegrationTestBase
{
[Fact]
public async Task Subscribe_ReceivesUpdates()
{
var received = new List<(string Tag, Vtq Vtq)>();
var receivedEvent = new TaskCompletionSource<bool>();
var subscription = await Client!.SubscribeAsync(
new[] { "TestChildObject.TestInt" },
(tag, vtq) =>
{
received.Add((tag, vtq));
if (received.Count >= 3)
receivedEvent.TrySetResult(true);
},
ex => receivedEvent.TrySetException(ex));
// Wait up to 30 seconds for 3 updates; the assertion below only requires 1.
await Task.WhenAny(receivedEvent.Task, Task.Delay(TimeSpan.FromSeconds(30)));
subscription.Dispose();
Assert.True(received.Count >= 1, $"Expected at least 1 update, got {received.Count}");
// Verify the VTQ has correct structure
var first = received[0];
Assert.Equal("TestChildObject.TestInt", first.Tag);
Assert.NotNull(first.Vtq.Value);
// ScanTime should be a DateTime value
Assert.True(first.Vtq.Timestamp > DateTime.MinValue);
}
}
```
### 4.5 WriteBatchAndWait Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/WriteBatchAndWaitTests.cs`
```csharp
using ZB.MOM.WW.LmxProxy.Client.Domain;
namespace ZB.MOM.WW.LmxProxy.Client.IntegrationTests;
public class WriteBatchAndWaitTests : IntegrationTestBase
{
[Fact]
public async Task WriteBatchAndWait_TypeAwareComparison()
{
// This test requires a writable tag and a flag tag.
// Adjust tag names based on available tags in TestChildObject.
// Example: write values and poll a flag.
var values = new Dictionary<string, TypedValue>
{
["TestChildObject.TestString"] = new TypedValue { StringValue = "BatchTest" }
};
// Poll the same tag we wrote to (simple self-check)
var response = await Client!.WriteBatchAndWaitAsync(
values,
flagTag: "TestChildObject.TestString",
flagValue: new TypedValue { StringValue = "BatchTest" },
timeoutMs: 5000,
pollIntervalMs: 200);
Assert.True(response.Success);
Assert.True(response.FlagReached);
Assert.True(response.ElapsedMs < 5000);
}
}
```
### 4.6 CheckApiKey Tests
**File**: `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/CheckApiKeyTests.cs`
```csharp
namespace ZB.MOM.WW.LmxProxy.Client.IntegrationTests;
public class CheckApiKeyTests : IntegrationTestBase
{
[Fact]
public async Task CheckApiKey_ValidReadWrite_ReturnsValid()
{
var info = await Client!.CheckApiKeyAsync(ReadWriteApiKey);
Assert.True(info.IsValid);
}
[Fact]
public async Task CheckApiKey_ValidReadOnly_ReturnsValid()
{
var info = await Client!.CheckApiKeyAsync(ReadOnlyApiKey);
Assert.True(info.IsValid);
}
[Fact]
public async Task CheckApiKey_Invalid_ReturnsInvalid()
{
var info = await Client!.CheckApiKeyAsync(InvalidApiKey);
Assert.False(info.IsValid);
}
}
```
## Step 5: Run Integration Tests
### 5.1 Build the test project (from Mac)
```bash
cd /Users/dohertj2/Desktop/scadalink-design/lmxproxy
dotnet build tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests
```
### 5.2 Run integration tests against v2 on alternate port
```bash
dotnet test tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests --verbosity normal
```
All tests should pass against `10.100.0.48:50052`.
### 5.3 Debug failures
If tests fail, check:
1. v2 service is running: `ssh windev "sc query ZB.MOM.WW.LmxProxy.Host.V2"`
2. v2 service logs: `ssh windev "type C:\publish-v2\logs\lmxproxy-v2-*.txt | findstr /i error"`
3. Network connectivity: `curl -s http://10.100.0.48:8081/api/health`
4. API keys match: `ssh windev "type C:\publish-v2\apikeys.json"`
### 5.4 Verify metrics after test run
```bash
curl -s http://10.100.0.48:8081/api/status | python3 -m json.tool
```
Should show non-zero operation counts for Read, ReadBatch, Write, etc.
## Step 6: Cutover
**Only proceed if ALL integration tests pass and the Veeam backup required by Guardrail 2 has been taken.**
### 6.1 Stop v1 service
```bash
ssh windev "sc stop ZB.MOM.WW.LmxProxy.Host"
```
Verify stopped:
```bash
ssh windev "sc query ZB.MOM.WW.LmxProxy.Host"
```
Expected: `STATE: 1 STOPPED`.
### 6.2 Stop v2 service
```bash
ssh windev "sc stop ZB.MOM.WW.LmxProxy.Host.V2"
```
### 6.3 Reconfigure v2 to production ports
Update `C:\publish-v2\appsettings.json`:
- Change `GrpcPort` from `50052` to `50051`
- Change `WebServer.Port` from `8081` to `8080`
- Change log file prefix from `lmxproxy-v2-` to `lmxproxy-`
```bash
ssh windev "powershell -Command \"(Get-Content 'C:\publish-v2\appsettings.json') -replace '50052','50051' -replace '8081','8080' -replace 'lmxproxy-v2-','lmxproxy-' | Set-Content 'C:\publish-v2\appsettings.json'\""
```
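A blanket text replace also rewrites any other occurrence of `50052` or `8081` in the file. A structured edit touches only the three intended settings. The following is a sketch under assumptions, not part of the plan: `ToProductionConfig` is a hypothetical helper, the sample is a trimmed version of the Step 1.3 config, and the file sink is assumed to be the second `WriteTo` entry as it is there.

```csharp
#nullable enable
using System;
using System.Text.Json;
using System.Text.Json.Nodes;

// Trimmed sample of the test configuration from Step 1.3.
string testConfig = """
{
  "GrpcPort": 50052,
  "WebServer": { "Enabled": true, "Port": 8081 },
  "Serilog": {
    "WriteTo": [
      { "Name": "Console" },
      { "Name": "File", "Args": { "path": "logs/lmxproxy-v2-.txt" } }
    ]
  }
}
""";

// Rewrites only the gRPC port, web port, and log file path.
static string ToProductionConfig(string testConfigJson)
{
    JsonNode root = JsonNode.Parse(testConfigJson)!;
    root["GrpcPort"] = 50051;
    root["WebServer"]!["Port"] = 8080;
    // Assumption: the File sink is the second WriteTo entry.
    JsonNode fileSinkArgs = root["Serilog"]!["WriteTo"]![1]!["Args"]!;
    fileSinkArgs["path"] = "logs/lmxproxy-.txt";
    return root.ToJsonString(new JsonSerializerOptions { WriteIndented = true });
}

string productionConfig = ToProductionConfig(testConfig);
Console.WriteLine(productionConfig);
```

Either approach should be followed by a quick look at the resulting file before the service is reinstalled.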
### 6.4 Uninstall v1 service
```bash
ssh windev "C:\publish\ZB.MOM.WW.LmxProxy.Host.exe uninstall -servicename \"ZB.MOM.WW.LmxProxy.Host\""
```
### 6.5 Uninstall v2 test service and reinstall as production service
```bash
ssh windev "C:\publish-v2\ZB.MOM.WW.LmxProxy.Host.exe uninstall -servicename \"ZB.MOM.WW.LmxProxy.Host.V2\""
```
```bash
ssh windev "C:\publish-v2\ZB.MOM.WW.LmxProxy.Host.exe install -servicename \"ZB.MOM.WW.LmxProxy.Host\" -displayname \"SCADA Bridge LMX Proxy\" -description \"LmxProxy v2 gRPC service\" --autostart"
```
### 6.6 Start the production service
```bash
ssh windev "sc start ZB.MOM.WW.LmxProxy.Host"
```
### 6.7 Verify on production ports
```bash
ssh windev "timeout /t 10 /nobreak >nul && sc query ZB.MOM.WW.LmxProxy.Host"
```
Expected: `STATE: 4 RUNNING`.
```bash
curl -s http://10.100.0.48:8080/api/health
```
Expected: `OK`.
```bash
curl -s http://10.100.0.48:8080/api/status | python3 -m json.tool | head -15
```
Expected: the connection state is Connected and the version shows v2.
### 6.8 Update test configuration and re-run integration tests
Update `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/appsettings.test.json`:
- Change `Port` from `50052` to `50051`
```bash
dotnet test tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests --verbosity normal
```
All tests should pass on the production port.
### 6.9 Configure service recovery
```bash
ssh windev "sc failure ZB.MOM.WW.LmxProxy.Host reset= 86400 actions= restart/60000/restart/300000/restart/600000"
```
This configures: restart after 1 min on first failure, 5 min on second, 10 min on subsequent. Reset counter after 1 day (86400 seconds).
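
The arithmetic behind those arguments (minutes to milliseconds, days to seconds) can be sketched as follows. This is an illustrative Python helper with a hypothetical name; the defaults mirror the delays described above.

```python
def sc_failure_args(first_min=1, second_min=5, subsequent_min=10, reset_days=1):
    """Build the `reset=` and `actions=` arguments for `sc failure`."""
    reset_seconds = reset_days * 24 * 60 * 60          # sc expects seconds
    delays_ms = [m * 60 * 1000 for m in (first_min, second_min, subsequent_min)]
    actions = "/".join(f"restart/{ms}" for ms in delays_ms)  # sc expects ms
    return f"reset= {reset_seconds} actions= {actions}"


if __name__ == "__main__":
    print(sc_failure_args())
```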
## Step 7: Documentation Updates
### 7.1 Update windev.md
Add a section about the LmxProxy v2 service to `/Users/dohertj2/Desktop/scadalink-design/windev.md`:
```markdown
## LmxProxy v2
| Field | Value |
|---|---|
| Service Name | ZB.MOM.WW.LmxProxy.Host |
| Display Name | SCADA Bridge LMX Proxy |
| gRPC Port | 50051 |
| Status Page | http://10.100.0.48:8080/ |
| Health Endpoint | http://10.100.0.48:8080/api/health |
| Publish Directory | C:\publish-v2\ |
| API Keys | C:\publish-v2\apikeys.json |
| Logs | C:\publish-v2\logs\ |
| Protocol | v2 (TypedValue + QualityCode) |
```
### 7.2 Update lmxproxy CLAUDE.md
If `lmxproxy/CLAUDE.md` references v1 behavior, update:
- Change "currently v1 protocol" references to "v2 protocol"
- Update publish directory references from `C:\publish\` to `C:\publish-v2\`
- Update any value conversion notes (no more string heuristics)
### 7.3 Clean up v1 publish directory (optional)
```bash
ssh windev "if exist C:\publish\ ren C:\publish publish-v1-backup"
```
## Step 8: Veeam Backup
### 8.1 Take incremental backup
```bash
ssh dohertj2@10.100.0.30 "powershell -Command \"Add-PSSnapin VeeamPSSnapin; Connect-VBRServer -Server localhost; Start-VBRJob -Job (Get-VBRJob -Name 'Backup WW_DEV_VM')\""
```
### 8.2 Wait for backup to complete (check status)
```bash
ssh dohertj2@10.100.0.30 "powershell -Command \"Add-PSSnapin VeeamPSSnapin; Connect-VBRServer -Server localhost; (Get-VBRJob -Name 'Backup WW_DEV_VM').FindLastSession() | Select-Object State, Result, CreationTime, EndTime\""
```
Expected: `State: Stopped, Result: Success`.
### 8.3 Get the restore point ID
```bash
ssh dohertj2@10.100.0.30 "powershell -Command \"Add-PSSnapin VeeamPSSnapin; Connect-VBRServer -Server localhost; Get-VBRRestorePoint -Backup (Get-VBRBackup -Name 'Backup WW_DEV_VM') | Select-Object Id, CreationTime, Type, @{N='SizeGB';E={[math]::Round(\`$_.ApproxSize/1GB,2)}} | Format-Table -AutoSize\""
```
### 8.4 Record in windev.md
Add a new row to the Restore Points table in `windev.md`:
```markdown
| `XXXXXXXX` | 2026-XX-XX XX:XX | Increment | **Post-v2 deployment** — LmxProxy v2 live on port 50051 |
```
Replace placeholders with actual restore point ID and timestamp.
## Completion Criteria
- [ ] v2 Host binary published to `C:\publish-v2\` on windev
- [ ] v2 service installed and running on alternate ports (50052/8081) — verified via status page
- [ ] Integration test project created at `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/`
- [ ] All integration tests pass against v2 on alternate ports:
- [ ] Connect/disconnect lifecycle
- [ ] Read string tag `TestChildObject.TestString` — value "JoeDev", Good quality
- [ ] Read writable tag `TestChildObject.TestString`
- [ ] Write string then read-back verification
- [ ] ReadBatch multiple tags
- [ ] Subscribe to `TestChildObject.TestInt` — verify updates received with TypedValue + QualityCode
- [ ] WriteBatchAndWait with type-aware flag comparison
- [ ] CheckApiKey — valid ReadWrite, valid ReadOnly, invalid
- [ ] Write with ReadOnly key — PermissionDenied
- [ ] Connect with invalid API key — Unauthenticated
- [ ] v1 service stopped and uninstalled
- [ ] v2 service reconfigured to production ports (50051/8080) and reinstalled
- [ ] All integration tests pass on production ports
- [ ] Service recovery configured (restart on failure)
- [ ] `windev.md` updated with v2 service details
- [ ] `lmxproxy/CLAUDE.md` updated for v2
- [ ] Veeam backup taken and restore point ID recorded in `windev.md`
- [ ] v1 publish directory backed up or removed


@@ -0,0 +1,200 @@
# Component: Client
## Purpose
A .NET 10 class library providing a typed gRPC client for consuming the LmxProxy service. Used by ScadaLink's Data Connection Layer to connect to AVEVA System Platform via the LmxProxy Host.
## Location
`src/ZB.MOM.WW.LmxProxy.Client/` — all files in this project.
Key files:
- `ILmxProxyClient.cs` — public interface.
- `LmxProxyClient.cs` — main implementation (partial class across multiple files).
- `LmxProxyClientBuilder.cs` — fluent builder for client construction.
- `ServiceCollectionExtensions.cs` — DI integration and options classes.
- `ILmxProxyClientFactory.cs` — factory interface and implementation.
- `StreamingExtensions.cs` — batch and parallel streaming helpers.
- `Domain/ScadaContracts.cs` — code-first gRPC contracts.
- `Security/GrpcChannelFactory.cs` — TLS channel creation.
## Responsibilities
- Connect to and communicate with the LmxProxy Host gRPC service.
- Manage session lifecycle (connect, keep-alive, disconnect).
- Execute read, write, and subscribe operations with retry and concurrency control.
- Provide a fluent builder and DI integration for configuration.
- Track client-side performance metrics.
- Support TLS and mutual TLS connections.
## 1. Public Interface (ILmxProxyClient)
| Method | Description |
|--------|-------------|
| `ConnectAsync(ct)` | Establish gRPC channel and session |
| `DisconnectAsync()` | Graceful disconnect |
| `IsConnectedAsync()` | Thread-safe connection state check |
| `ReadAsync(address, ct)` | Read single tag, returns Vtq |
| `ReadBatchAsync(addresses, ct)` | Read multiple tags, returns dictionary |
| `WriteAsync(address, value, ct)` | Write single tag value |
| `WriteBatchAsync(values, ct)` | Write multiple tag values |
| `SubscribeAsync(addresses, onUpdate, onStreamError, ct)` | Subscribe to tag updates with value and error callbacks |
| `GetMetrics()` | Return operation counts, errors, latency stats |
| `DefaultTimeout` | Configurable timeout (default 30s, range 1s–10min) |
Implements `IDisposable` and `IAsyncDisposable`.
## 2. Connection Management
### 2.1 Connect
`ConnectAsync()`:
1. Creates a gRPC channel via `GrpcChannelFactory` (HTTP or HTTPS based on TLS config).
2. Creates a `protobuf-net.Grpc` client for `IScadaService`.
3. Calls the `Connect` RPC with a client ID (format: `ScadaBridge-{guid}`) and optional API key.
4. Stores the returned session ID.
5. Starts the keep-alive timer.
### 2.2 Keep-Alive
- Timer-based ping every **30 seconds** (hardcoded).
- Sends a lightweight `GetConnectionState` RPC.
- On failure: stops the timer, marks disconnected, triggers subscription cleanup.
### 2.3 Disconnect
`DisconnectAsync()`:
1. Stops keep-alive timer.
2. Calls `Disconnect` RPC.
3. Clears session ID.
4. Disposes gRPC channel.
### 2.4 Connection State
`IsConnected` property: `!_disposed && _isConnected && !string.IsNullOrEmpty(_sessionId)`.
## 3. Builder Pattern (LmxProxyClientBuilder)
| Method | Default | Constraint |
|--------|---------|-----------|
| `WithHost(string)` | Required | Non-null/non-empty |
| `WithPort(int)` | 5050 | 1–65535 |
| `WithApiKey(string?)` | null | Optional |
| `WithTimeout(TimeSpan)` | 30 seconds | > 0 and ≤ 10 minutes |
| `WithLogger(ILogger)` | NullLogger | Optional |
| `WithSslCredentials(string?)` | Disabled | Optional cert path |
| `WithTlsConfiguration(ClientTlsConfiguration)` | null | Full TLS config |
| `WithRetryPolicy(int, TimeSpan)` | 3 attempts, 1s delay | maxAttempts > 0, delay > 0 |
| `WithMetrics()` | Disabled | Enables metric collection |
| `WithCorrelationIdHeader(string)` | null | Custom header name |
## 4. Retry Policy
Polly-based exponential backoff:
- Default: **3 attempts** with **1-second** initial delay.
- Backoff sequence: `delay * 2^(retryAttempt - 1)` → 1s, 2s, 4s.
- Transient errors retried: `Unavailable`, `DeadlineExceeded`, `ResourceExhausted`, `Aborted`.
- Each retry is logged with correlation ID at Warning level.
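
The backoff formula can be checked with a few lines of Python. This is only an illustration of the arithmetic, not the Polly configuration itself; the function names are hypothetical, and the status names are the ones listed above.

```python
from typing import List

# gRPC status names treated as transient, as listed in this section.
TRANSIENT_STATUSES = {"Unavailable", "DeadlineExceeded", "ResourceExhausted", "Aborted"}


def backoff_delays(initial_delay_s: float, retries: int) -> List[float]:
    """Delay before each retry: delay * 2^(retryAttempt - 1)."""
    return [initial_delay_s * 2 ** (attempt - 1) for attempt in range(1, retries + 1)]


def is_transient(status_name: str) -> bool:
    """True when the status is one of the retried error codes."""
    return status_name in TRANSIENT_STATUSES
```

With the default 1-second initial delay, `backoff_delays(1.0, 3)` gives the 1s, 2s, 4s sequence.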
## 5. Subscription
### 5.1 Subscribe API
`SubscribeAsync(addresses, onUpdate, onStreamError, ct)` returns an `ISubscription`:
- Calls the `Subscribe` RPC (server streaming) with the tag list and default sampling interval (**1000ms**).
- Processes streamed `VtqMessage` items asynchronously, invoking the `onUpdate(tag, vtq)` callback for each.
- On stream termination (server disconnect, gRPC error, or connection drop), invokes the `onStreamError` callback exactly once.
- On stream error, the client immediately nullifies its session ID, causing `IsConnected` to return `false`. This triggers the DCL adapter's `Disconnected` event and reconnection cycle.
- Errors are logged per-subscription.
### 5.2 ISubscription
- `Dispose()` — synchronous disposal with **5-second** timeout.
- Automatic callback on disposal for cleanup.
## 6. DI Integration
### 6.1 Service Collection Extensions
| Method | Lifetime | Description |
|--------|----------|-------------|
| `AddLmxProxyClient(IConfiguration)` | Singleton | Bind `LmxProxy` config section |
| `AddLmxProxyClient(IConfiguration, string)` | Singleton | Bind named config section |
| `AddLmxProxyClient(Action<Builder>)` | Singleton | Builder action |
| `AddScopedLmxProxyClient(IConfiguration)` | Scoped | Per-scope lifetime |
| `AddNamedLmxProxyClient(string, Action<Builder>)` | Keyed singleton | Named/keyed registration |
### 6.2 Configuration Options (LmxProxyClientOptions)
Bound from `appsettings.json`:
| Setting | Default | Description |
|---------|---------|-------------|
| Host | `localhost` | Server hostname |
| Port | 5050 | Server port |
| ApiKey | null | API key |
| Timeout | 30 seconds | Operation timeout |
| UseSsl | false | Enable TLS |
| CertificatePath | null | SSL certificate path |
| EnableMetrics | false | Enable client metrics |
| CorrelationIdHeader | null | Custom correlation header |
| Retry:MaxAttempts | 3 | Retry attempts |
| Retry:Delay | 1 second | Initial retry delay |
### 6.3 Factory Pattern
`ILmxProxyClientFactory` creates configured clients:
- `CreateClient()` — uses default `LmxProxy` config section.
- `CreateClient(string)` — uses named config section.
- `CreateClient(Action<Builder>)` — uses builder action.
Registered as singleton in DI.
## 7. Streaming Extensions
Helper methods for large-scale batch operations:
| Method | Default Batch Size | Description |
|--------|--------------------|-------------|
| `ReadStreamAsync` | 100 | Batched reads, 2 retries per batch, stops after 3 consecutive errors. Returns `IAsyncEnumerable<KeyValuePair<string, Vtq>>`. |
| `WriteStreamAsync` | 100 | Batched writes from async enumerable input. Returns total count written. |
| `ProcessInParallelAsync` | — | Parallel processing with max concurrency of **4** (configurable). Semaphore-based rate limiting. |
| `SubscribeStreamAsync` | — | Wraps callback-based subscription into `IAsyncEnumerable<Vtq>` via `System.Threading.Channels`. |
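
The batching behaviour of `ReadStreamAsync` can be sketched in Python. This is a simplified, synchronous illustration rather than the C# implementation: `read_batch` is a caller-supplied function returning a dictionary, and the sketch assumes that "consecutive errors" counts batches that still fail after their retries.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def read_stream(addresses, read_batch, batch_size=100,
                retries_per_batch=2, max_consecutive_errors=3):
    """Yield (address, value) pairs batch by batch, giving up after repeated failures."""
    consecutive_errors = 0
    for batch in chunked(addresses, batch_size):
        results = None
        for _ in range(1 + retries_per_batch):  # first try plus retries
            try:
                results = read_batch(batch)
                break
            except Exception:
                continue
        if results is None:
            consecutive_errors += 1
            if consecutive_errors >= max_consecutive_errors:
                return  # stop the stream after too many failed batches in a row
            continue
        consecutive_errors = 0
        yield from results.items()
```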
## 8. Client Metrics
When metrics are enabled (`WithMetrics()`):
- Per-operation tracking: counts, error counts, latency.
- Rolling buffer of **1000** latency samples per operation (prevents memory growth).
- Snapshot via `GetMetrics()` returns: `{op}_count`, `{op}_errors`, `{op}_avg_latency_ms`, `{op}_p95_latency_ms`, `{op}_p99_latency_ms`.
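
A rolling latency buffer with percentile snapshots might look like the Python sketch below. It is an illustration only: the class name is hypothetical, and the percentile index calculation is an assumption, since the exact method the client uses is not stated here.

```python
from collections import deque


class OperationLatency:
    """Counts plus a bounded window of recent latency samples for one operation."""

    def __init__(self, max_samples=1000):
        self.count = 0
        self.errors = 0
        self.samples = deque(maxlen=max_samples)  # oldest samples fall off

    def record(self, latency_ms, success=True):
        self.count += 1
        if not success:
            self.errors += 1
        self.samples.append(latency_ms)

    def percentile(self, pct):
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        index = min(len(ordered) - 1, int(len(ordered) * pct / 100))
        return ordered[index]

    def snapshot(self, op):
        avg = sum(self.samples) / len(self.samples) if self.samples else 0.0
        return {
            f"{op}_count": self.count,
            f"{op}_errors": self.errors,
            f"{op}_avg_latency_ms": avg,
            f"{op}_p95_latency_ms": self.percentile(95),
            f"{op}_p99_latency_ms": self.percentile(99),
        }
```

The counters keep growing while the sample window stays bounded, which is what prevents memory growth.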
## 9. Value and Quality Handling
### 9.1 Values (TypedValue)
Read responses and subscription updates return values as `TypedValue` (protobuf oneof). The client extracts the value directly from the appropriate oneof field (e.g., `vtq.Value.DoubleValue`, `vtq.Value.BoolValue`). Write operations construct `TypedValue` with the correct oneof case for the value's native type. No string serialization or parsing is needed.
### 9.2 Quality (QualityCode)
Quality is received as a `QualityCode` message. Category checks use bitmask: `IsGood = (statusCode & 0xC0000000) == 0x00000000`, `IsBad = (statusCode & 0xC0000000) == 0x80000000`. The `symbolic_name` field provides human-readable quality for logging and display.
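
The bitmask checks can be illustrated in Python. The function names are hypothetical; the Uncertain check is an addition that follows the same OPC UA severity-bit convention and is not described in the text above.

```python
SEVERITY_MASK = 0xC0000000  # top two bits of the status code


def is_good(status_code: int) -> bool:
    return (status_code & SEVERITY_MASK) == 0x00000000


def is_bad(status_code: int) -> bool:
    return (status_code & SEVERITY_MASK) == 0x80000000


def is_uncertain(status_code: int) -> bool:
    # Assumption: follows the OPC UA convention for the remaining severity value.
    return (status_code & SEVERITY_MASK) == 0x40000000
```

The lower bits carry the specific sub-code, so any status whose top two bits are `10` counts as Bad regardless of the rest.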
### 9.3 Current Implementation (V1 Legacy)
The current codebase still uses v1 string-based encoding. During v2 migration, the following will be removed:
- `ConvertToVtq()` — parses string values via heuristic (double → bool → null → raw string).
- `ConvertToString()` — serializes values via `.ToString()`.
## Dependencies
- **protobuf-net.Grpc** — code-first gRPC client.
- **Grpc.Net.Client** — HTTP/2 gRPC transport.
- **Polly** — retry policies.
- **Microsoft.Extensions.DependencyInjection** — DI integration.
- **Microsoft.Extensions.Configuration** — options binding.
- **Microsoft.Extensions.Logging** — logging abstraction.
## Interactions
- **ScadaLink Data Connection Layer** consumes the client library via `ILmxProxyClient`.
- **Protocol** — the client uses code-first contracts (`IScadaService`) that are wire-compatible with the Host's proto-generated service.
- **Security** — `GrpcChannelFactory` creates TLS-configured channels matching the Host's TLS configuration.


@@ -0,0 +1,122 @@
# Component: Configuration
## Purpose
Defines the `appsettings.json` structure, configuration binding, and startup validation for the LmxProxy Host service.
## Location
- `src/ZB.MOM.WW.LmxProxy.Host/Configuration/LmxProxyConfiguration.cs` — root configuration class.
- `src/ZB.MOM.WW.LmxProxy.Host/Configuration/ConfigurationValidator.cs` — validation logic.
- `src/ZB.MOM.WW.LmxProxy.Host/appsettings.json` — default configuration file.
## Responsibilities
- Define all configurable settings as strongly-typed classes.
- Bind `appsettings.json` sections to configuration objects via `Microsoft.Extensions.Configuration`.
- Validate all settings at startup, failing fast on invalid values.
- Support environment variable overrides.
## 1. Configuration Structure
### 1.1 Root: LmxProxyConfiguration
| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| GrpcPort | int | 50051 | gRPC server listen port |
| ApiKeyConfigFile | string | `apikeys.json` | Path to API key configuration file |
| Subscription | SubscriptionConfiguration | — | Subscription channel settings |
| ServiceRecovery | ServiceRecoveryConfiguration | — | Windows SCM recovery settings |
| Connection | ConnectionConfiguration | — | MxAccess connection settings |
| Tls | TlsConfiguration | — | TLS/SSL settings |
| WebServer | WebServerConfiguration | — | Status web server settings |
### 1.2 ConnectionConfiguration
| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| MonitorIntervalSeconds | int | 5 | Auto-reconnect check interval |
| ConnectionTimeoutSeconds | int | 30 | Initial connection timeout |
| ReadTimeoutSeconds | int | 5 | Per-read operation timeout |
| WriteTimeoutSeconds | int | 5 | Per-write operation timeout |
| MaxConcurrentOperations | int | 10 | Semaphore limit for concurrent MxAccess operations |
| AutoReconnect | bool | true | Enable auto-reconnect loop |
| NodeName | string? | null | MxAccess node name (optional) |
| GalaxyName | string? | null | MxAccess galaxy name (optional) |
### 1.3 SubscriptionConfiguration
| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| ChannelCapacity | int | 1000 | Per-client subscription buffer size |
| ChannelFullMode | string | `DropOldest` | Backpressure strategy: `DropOldest`, `DropNewest`, `Wait` |
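
The drop strategies can be illustrated with a small bounded buffer in Python. This sketch assumes the mode names carry the same meaning as .NET's `BoundedChannelFullMode` (`DropNewest` discards the most recently buffered item to make room); the class is hypothetical, and the `Wait` mode is reported to the caller rather than modelled.

```python
from collections import deque


class BoundedBuffer:
    """Fixed-capacity buffer that applies a drop strategy when full."""

    def __init__(self, capacity, full_mode="DropOldest"):
        if capacity < 1:
            raise ValueError("this sketch needs capacity >= 1")
        self.capacity = capacity
        self.full_mode = full_mode
        self.items = deque()

    def offer(self, item) -> bool:
        """Add an item; return False if the caller would have to wait."""
        if len(self.items) < self.capacity:
            self.items.append(item)
            return True
        if self.full_mode == "DropOldest":
            self.items.popleft()   # discard the oldest buffered item
        elif self.full_mode == "DropNewest":
            self.items.pop()       # discard the most recently buffered item
        else:
            return False           # "Wait": blocking is not modelled here
        self.items.append(item)
        return True
```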
### 1.4 TlsConfiguration
| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| Enabled | bool | false | Enable TLS on gRPC server |
| ServerCertificatePath | string | `certs/server.crt` | PEM server certificate |
| ServerKeyPath | string | `certs/server.key` | PEM server private key |
| ClientCaCertificatePath | string | `certs/ca.crt` | CA certificate for mTLS |
| RequireClientCertificate | bool | false | Require client certificates |
| CheckCertificateRevocation | bool | false | Enable CRL checking |
### 1.5 WebServerConfiguration
| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| Enabled | bool | true | Enable status web server |
| Port | int | 8080 | HTTP listen port |
| Prefix | string? | null | Custom URL prefix (defaults to `http://+:{Port}/`) |
### 1.6 ServiceRecoveryConfiguration
| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| FirstFailureDelayMinutes | int | 1 | Restart delay after first failure |
| SecondFailureDelayMinutes | int | 5 | Restart delay after second failure |
| SubsequentFailureDelayMinutes | int | 10 | Restart delay after subsequent failures |
| ResetPeriodDays | int | 1 | Days before failure count resets |
## 2. Validation
`ConfigurationValidator.ValidateAndLog()` runs at startup and checks:
- **GrpcPort**: Must be 1–65535.
- **Connection**: All timeout values > 0. NodeName and GalaxyName ≤ 255 characters.
- **Subscription**: ChannelCapacity 0–100000. ChannelFullMode must be one of `DropOldest`, `DropNewest`, `Wait`.
- **ServiceRecovery**: All failure delay values ≥ 0. ResetPeriodDays > 0.
- **TLS**: If enabled, validates certificate file paths exist.
Validation errors are logged and cause the service to throw `InvalidOperationException`, preventing startup.
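
The rules above can be sketched as a single function that collects every error before failing. This Python version is an illustration, not `ConfigurationValidator` itself: the function name is hypothetical, the TLS file-existence check is left out, and defaults are taken from the tables in section 1.

```python
VALID_FULL_MODES = {"DropOldest", "DropNewest", "Wait"}


def validate(config: dict) -> list:
    """Return a list of validation error messages (empty when valid)."""
    errors = []
    if not 1 <= config.get("GrpcPort", 50051) <= 65535:
        errors.append("GrpcPort must be between 1 and 65535")

    connection = config.get("Connection", {})
    for key, default in (("ConnectionTimeoutSeconds", 30),
                         ("ReadTimeoutSeconds", 5),
                         ("WriteTimeoutSeconds", 5)):
        if connection.get(key, default) <= 0:
            errors.append(f"Connection.{key} must be > 0")
    for key in ("NodeName", "GalaxyName"):
        value = connection.get(key)
        if value is not None and len(value) > 255:
            errors.append(f"Connection.{key} must be at most 255 characters")

    subscription = config.get("Subscription", {})
    if not 0 <= subscription.get("ChannelCapacity", 1000) <= 100000:
        errors.append("Subscription.ChannelCapacity must be between 0 and 100000")
    if subscription.get("ChannelFullMode", "DropOldest") not in VALID_FULL_MODES:
        errors.append("Subscription.ChannelFullMode is not a known mode")

    recovery = config.get("ServiceRecovery", {})
    for key, default in (("FirstFailureDelayMinutes", 1),
                         ("SecondFailureDelayMinutes", 5),
                         ("SubsequentFailureDelayMinutes", 10)):
        if recovery.get(key, default) < 0:
            errors.append(f"ServiceRecovery.{key} must be >= 0")
    if recovery.get("ResetPeriodDays", 1) <= 0:
        errors.append("ServiceRecovery.ResetPeriodDays must be > 0")
    return errors
```

A caller would log each message and refuse to start when the list is non-empty.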
## 3. Configuration Sources
Configuration is loaded via `Microsoft.Extensions.Configuration.ConfigurationBuilder`:
1. `appsettings.json` (required).
2. Environment variables (override any JSON setting).
## 4. Serilog Configuration
Logging is configured in the `Serilog` section of `appsettings.json`:
| Setting | Value |
|---------|-------|
| Console sink | ANSI theme, custom template with HH:mm:ss timestamp |
| File sink | `logs/lmxproxy-.txt`, daily rolling, 30 files retained |
| Default level | Information |
| Override: Microsoft | Warning |
| Override: System | Warning |
| Override: Grpc | Information |
| Enrichment | FromLogContext, WithMachineName, WithThreadId |
## Dependencies
- **Microsoft.Extensions.Configuration** — configuration binding.
- **Serilog.Settings.Configuration** — Serilog configuration from appsettings.
## Interactions
- **ServiceHost** (Program.cs) loads and validates configuration at startup.
- All other components receive their settings from the bound configuration objects.


@@ -0,0 +1,86 @@
# Component: GrpcServer
## Purpose
The gRPC service implementation that receives client RPCs, validates sessions, and delegates operations to the MxAccessClient. It is the network-facing entry point for all SCADA operations.
## Location
`src/ZB.MOM.WW.LmxProxy.Host/Grpc/ScadaGrpcService.cs` — inherits proto-generated `ScadaService.ScadaServiceBase`.
## Responsibilities
- Implement all 10 gRPC RPCs defined in `scada.proto`.
- Validate session IDs on all data operations before processing.
- Delegate read/write/subscribe operations to the MxAccessClient.
- Convert between gRPC message types and internal domain types (Vtq, Quality).
- Track operation timing and success/failure via PerformanceMetrics.
- Handle errors gracefully, returning structured error responses rather than throwing.
## 1. RPC Implementations
### 1.1 Connection Management
- **Connect**: Creates a new session via SessionManager if MxAccess is connected. Returns the session ID (32-character hex GUID). Rejects if MxAccess is disconnected.
- **Disconnect**: Terminates the session via SessionManager.
- **GetConnectionState**: Returns `IsConnected`, `ClientId`, and `ConnectedSinceUtcTicks` from the MxAccessClient.
### 1.2 Read Operations
- **Read**: Validates session, applies Polly retry policy, calls MxAccessClient.ReadAsync(), returns VtqMessage. On invalid session, returns a VtqMessage with `Quality.Bad`.
- **ReadBatch**: Validates session, reads all tags via MxAccessClient.ReadBatchAsync() with semaphore-controlled concurrency (max 10 concurrent). Returns results in request order. A batch read can succeed partially: individual tags may come back with Bad quality (and the current UTC timestamp) while the overall response still succeeds. If a tag read throws an exception, its VTQ is returned with Bad quality.
### 1.3 Write Operations
- **Write**: Validates session, parses the string value using the type heuristic, calls MxAccessClient.WriteAsync().
- **WriteBatch**: Validates session, writes all items in parallel via MxAccessClient with semaphore concurrency control. Returns per-item success/failure results. Overall `success` is `false` if any item fails (all-or-nothing at the reporting level).
- **WriteBatchAndWait**: Validates session, writes all items first. If any write fails, returns immediately with `success=false`. If writes succeed, polls `flag_tag` at `poll_interval_ms` intervals using type-aware `TypedValueEquals()` comparison (same oneof case required, native type equality, case-sensitive strings, null equals null only). Default timeout: 5000ms, default poll interval: 100ms. If flag matches before timeout: `success=true`, `flag_reached=true`. If timeout expires: `success=true`, `flag_reached=false` (timeout is not an error). Returns `flag_reached` boolean and `elapsed_ms`.
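
The flag-polling half of WriteBatchAndWait can be sketched in Python. This is an illustration only: Python's runtime type stands in for the protobuf oneof case, `read_flag` is a caller-supplied function, and the function names are hypothetical rather than the server's actual methods.

```python
import time


def typed_value_equals(actual, expected) -> bool:
    """Equal only when both are null, or both have the same type and value."""
    if actual is None or expected is None:
        return actual is None and expected is None
    return type(actual) is type(expected) and actual == expected


def wait_for_flag(read_flag, expected, timeout_ms=5000, poll_interval_ms=100):
    """Poll until the flag matches or the timeout expires.

    Returns (flag_reached, elapsed_ms). A timeout is reported, not raised.
    """
    start = time.monotonic()
    while True:
        if typed_value_equals(read_flag(), expected):
            return True, int((time.monotonic() - start) * 1000)
        elapsed_ms = (time.monotonic() - start) * 1000
        if elapsed_ms >= timeout_ms:
            return False, int(elapsed_ms)
        time.sleep(poll_interval_ms / 1000)
```

Because the comparison is type-aware, an integer `1` never satisfies an expected boolean `True`, and `"ok"` never matches `"OK"`.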
### 1.4 Subscription
- **Subscribe**: Validates session (throws `RpcException(Unauthenticated)` on invalid). Creates a subscription handle via SubscriptionManager. Streams VtqMessage items from the subscription channel to the client. Cleans up the subscription on stream cancellation or error.
### 1.5 API Key Check
- **CheckApiKey**: Returns validity and role information from the interceptor context.
## 2. Value and Quality Handling
### 2.1 Values (TypedValue)
Read responses and subscription updates return values as `TypedValue` (protobuf oneof carrying native types). Write requests receive `TypedValue` and apply the value directly to MxAccess by its native type. If the `oneof` case doesn't match the tag's expected data type, the write returns `WriteResult` with `success=false` indicating type mismatch. No string serialization or parsing heuristics are used.
### 2.2 Quality (QualityCode)
Quality is returned as a `QualityCode` message with `uint32 status_code` (OPC UA-compatible) and `string symbolic_name`. The server maps MxAccess quality codes to OPC UA status codes per the quality table in Component-Protocol. Specific error scenarios return specific quality codes (e.g., tag not found → `BadConfigurationError`, comms loss → `BadCommunicationFailure`).
### 2.3 Current Implementation (V1 Legacy)
The current codebase still uses v1 string-based encoding. During v2 migration, the following v1 behavior will be removed:
- `ConvertValueToString()` — serializes values to strings (bool → lowercase, DateTime → ISO-8601, arrays → JSON, others → `.ToString()`).
- `ParseValue()` — parses string values in order: bool → int → long → double → DateTime → raw string.
- Three-state string quality mapping: ≥192 → `"Good"`, 64–191 → `"Uncertain"`, <64 → `"Bad"`.
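
For reference, the legacy three-state quality mapping amounts to two threshold checks. The Python sketch below uses a hypothetical function name and reads the middle band as 64 to 191 inclusive.

```python
def v1_quality_name(raw_quality: int) -> str:
    """Map a raw quality value to the v1 string form."""
    if raw_quality >= 192:
        return "Good"
    if raw_quality >= 64:
        return "Uncertain"
    return "Bad"
```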
## 3. Error Handling
- All RPC methods catch exceptions and return error responses with `success=false` and a descriptive message. Exceptions do not propagate as gRPC status codes (except Subscribe, which throws `RpcException` for invalid sessions).
- Each operation is wrapped in a PerformanceMetrics timing scope that records duration and success/failure.
## 4. Session Validation
- All data operations (Read, ReadBatch, Write, WriteBatch, WriteBatchAndWait, Subscribe) validate the session ID before processing.
- Invalid session on read/write operations returns a response with Bad quality VTQ.
- Invalid session on Subscribe throws `RpcException` with `StatusCode.Unauthenticated`.
## Dependencies
- **MxAccessClient** (IScadaClient) — all SCADA operations are delegated here.
- **SessionManager** — session creation, validation, and termination.
- **SubscriptionManager** — subscription lifecycle for the Subscribe RPC.
- **PerformanceMetrics** — operation timing and success/failure tracking.
## Interactions
- **ApiKeyInterceptor** intercepts all RPCs before they reach ScadaGrpcService, enforcing API key authentication and role-based write authorization.
- **SubscriptionManager** provides the channel that Subscribe streams from.
- **StatusReportService** reads PerformanceMetrics data that ScadaGrpcService populates.


@@ -0,0 +1,121 @@
# Component: HealthAndMetrics
## Purpose
Provides health checking, performance metrics collection, and an HTTP status dashboard for monitoring the LmxProxy service.
## Location
- `src/ZB.MOM.WW.LmxProxy.Host/Health/HealthCheckService.cs` — basic health check.
- `src/ZB.MOM.WW.LmxProxy.Host/Health/DetailedHealthCheckService.cs` — detailed health check with test tag read.
- `src/ZB.MOM.WW.LmxProxy.Host/Metrics/PerformanceMetrics.cs` — operation metrics collection.
- `src/ZB.MOM.WW.LmxProxy.Host/Status/StatusReportService.cs` — status report generation.
- `src/ZB.MOM.WW.LmxProxy.Host/Status/StatusWebServer.cs` — HTTP status endpoint.
## Responsibilities
- Evaluate service health based on connection state, operation success rates, and test tag reads.
- Track per-operation performance metrics (counts, latencies, percentiles).
- Serve an HTML status dashboard and JSON/health HTTP endpoints.
- Report metrics to logs on a periodic interval.
## 1. Health Checks
### 1.1 Basic Health Check (HealthCheckService)
`CheckHealthAsync()` evaluates:
| Check | Healthy | Degraded |
|-------|---------|----------|
| MxAccess connected | Yes | — |
| Success rate (if > 100 total ops) | ≥ 50% | < 50% |
| Client count | ≤ 100 | > 100 |
Returns health data dictionary: `scada_connected`, `scada_connection_state`, `total_clients`, `total_tags`, `total_operations`, `average_success_rate`.
### 1.2 Detailed Health Check (DetailedHealthCheckService)
`CheckHealthAsync()` performs an active probe:
1. Checks `IsConnected` — returns **Unhealthy** if not connected.
2. Reads a test tag (default `System.Heartbeat`).
3. If test tag quality is not Good — returns **Degraded**.
4. If test tag timestamp is older than **5 minutes** — returns **Degraded** (stale data detection).
5. Otherwise returns **Healthy**.
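
The decision order of the probe can be sketched as a pure function. This Python version is an illustration with hypothetical parameter names; the actual service performs the test tag read itself, whereas here the result of that read is passed in.

```python
from datetime import datetime, timedelta, timezone


def detailed_health(is_connected, tag_quality_good, tag_timestamp,
                    now=None, max_age=timedelta(minutes=5)):
    """Return 'Unhealthy', 'Degraded', or 'Healthy' following the steps above."""
    now = now or datetime.now(timezone.utc)
    if not is_connected:
        return "Unhealthy"
    if not tag_quality_good:
        return "Degraded"
    if now - tag_timestamp > max_age:
        return "Degraded"  # stale data detection
    return "Healthy"
```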
## 2. Performance Metrics
### 2.1 Tracking
`PerformanceMetrics` uses a `ConcurrentDictionary<string, OperationMetrics>` to track operations by name.
Operations tracked: `Read`, `ReadBatch`, `Write`, `WriteBatch` (recorded by ScadaGrpcService).
### 2.2 Recording
Two recording patterns:
- `RecordOperation(name, duration, success)` — explicit recording.
- `BeginOperation(name)` — returns an `ITimingScope` (disposable). On dispose, automatically records duration (via `Stopwatch`) and success flag (set via `SetSuccess(bool)`).
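
The timing-scope pattern maps naturally onto a context manager. The Python sketch below illustrates the idea with hypothetical class names; it assumes the success flag defaults to false until `set_success` is called, which the text above does not confirm.

```python
import time


class MetricsSketch:
    """Collects (name, duration_ms, success) records."""

    def __init__(self):
        self.records = []

    def record_operation(self, name, duration_ms, success):
        self.records.append((name, duration_ms, success))

    def begin_operation(self, name):
        return TimingScope(self, name)


class TimingScope:
    """Records duration and success when the scope is closed."""

    def __init__(self, metrics, name):
        self._metrics = metrics
        self._name = name
        self._success = False  # assumption: failure unless marked otherwise
        self._start = time.perf_counter()

    def set_success(self, success):
        self._success = success

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        duration_ms = (time.perf_counter() - self._start) * 1000
        self._metrics.record_operation(self._name, duration_ms, self._success)
        return False  # never swallow exceptions
```

Usage: `with metrics.begin_operation("Read") as scope: ...; scope.set_success(True)`. An exception inside the block still produces a record, marked as a failure.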
### 2.3 Per-Operation Statistics
`OperationMetrics` maintains:
- `_totalCount`, `_successCount` — running counters.
- `_totalMilliseconds`, `_minMilliseconds`, `_maxMilliseconds` — latency range.
- `_durations` — rolling buffer of up to **1000 latency samples** for percentile calculation.
`MetricsStatistics` snapshot:
- `TotalCount`, `SuccessCount`, `SuccessRate` (percentage).
- `AverageMilliseconds`, `MinMilliseconds`, `MaxMilliseconds`.
- `Percentile95Milliseconds` — calculated from sorted samples at the 95th percentile index.
### 2.4 Periodic Reporting
A timer fires every **60 seconds**, logging a summary of all operation metrics to Serilog.
## 3. Status Web Server
### 3.1 Server
`StatusWebServer` uses `HttpListener` on `http://+:{Port}/` (default port 8080).
- Starts an async request-handling loop, spawning a task per request.
- Graceful shutdown: cancels the listener, waits **5 seconds** for the listener task to exit.
- Returns HTTP 405 for non-GET methods, HTTP 500 on errors.
### 3.2 Endpoints
| Endpoint | Method | Response |
|----------|--------|----------|
| `/` | GET | HTML dashboard (auto-refresh every 30 seconds) |
| `/api/status` | GET | JSON status report (camelCase) |
| `/api/health` | GET | Plain text `OK` (200) or `UNHEALTHY` (503) |
### 3.3 HTML Dashboard
Generated by `StatusReportService`:
- Bootstrap-like CSS grid layout with status cards.
- Color-coded status: green = Healthy, yellow = Degraded, red = Unhealthy/Error.
- Operations table with columns: Count, SuccessRate, Avg/Min/Max/P95 milliseconds.
- Service metadata: ServiceName, Version (assembly version), connection state.
- Subscription stats: TotalClients, TotalTags, ActiveSubscriptions.
- Auto-refresh via `<meta http-equiv="refresh" content="30">`.
- Last updated timestamp.
### 3.4 JSON Status Report
Fully nested structure with camelCase property names:
- Service metadata, connection status, subscription stats, performance data, health check results.
## Dependencies
- **MxAccessClient** — `IsConnected`, `ConnectionState` for health checks; test tag read for detailed check.
- **SubscriptionManager** — subscription statistics.
- **PerformanceMetrics** — operation statistics for status report and health evaluation.
- **Configuration** — `WebServerConfiguration` for port and prefix.
## Interactions
- **GrpcServer** populates PerformanceMetrics via timing scopes on every RPC.
- **ServiceHost** creates all health/metrics/status components at startup and disposes them at shutdown.
- External monitoring systems can poll `/api/health` for availability checks.


@@ -0,0 +1,108 @@
# Component: MxAccessClient
## Purpose
The core component that wraps the ArchestrA MXAccess COM API, providing connection management, tag read/write operations, and subscription-based value change notifications. This is the bridge between the gRPC service layer and AVEVA System Platform.
## Location
`src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.cs` — partial class split across 6 files:
- `MxAccessClient.cs` — Main class, properties, disposal, factory.
- `MxAccessClient.Connection.cs` — Connection lifecycle (connect, disconnect, reconnect, cleanup).
- `MxAccessClient.ReadWrite.cs` — Read and write operations with retry and concurrency control.
- `MxAccessClient.Subscription.cs` — Subscription management and stored subscription state.
- `MxAccessClient.EventHandlers.cs` — COM event handlers (OnDataChange, OnWriteComplete, OperationComplete).
- `MxAccessClient.NestedTypes.cs` — Internal types and enums.
## Responsibilities
- Manage the MXAccess COM object lifecycle (create, register, unregister, release).
- Maintain connection state (Disconnected, Connecting, Connected, Disconnecting, Error, Reconnecting) and fire state change events.
- Execute read and write operations against MXAccess with concurrency control via semaphores.
- Manage tag subscriptions via MXAccess advise callbacks and store subscription state for reconnection.
- Handle COM threading constraints (STA thread context via `Task.Run`).
## 1. Connection Lifecycle
### 1.1 Connect
`ConnectAsync()` wraps `ConnectInternal()` in `Task.Run` for STA thread context:
1. Validates not disposed.
2. Returns early if already connected.
3. Sets state to `Connecting`.
4. `InitializeMxAccessConnection()` — creates new `LMXProxyServer` COM object, wires event handlers (OnDataChange, OnWriteComplete, OperationComplete).
5. `RegisterWithMxAccess()` — calls `_lmxProxy.Register("ZB.MOM.WW.LmxProxy.Host")`, stores the returned connection handle.
6. Sets state to `Connected`.
7. On error, calls `Cleanup()` and re-throws.
After successful connection, calls `RecreateStoredSubscriptionsAsync()` to restore any previously active subscriptions.
### 1.2 Disconnect
`DisconnectAsync()` wraps `DisconnectInternal()` in `Task.Run`:
1. Checks `IsConnected`.
2. Sets state to `Disconnecting`.
3. `RemoveAllSubscriptions()` — unsubscribes all tags from MXAccess but retains subscription state in `_storedSubscriptions` for reconnection.
4. `UnregisterFromMxAccess()` — calls `_lmxProxy.Unregister(_connectionHandle)`.
5. `Cleanup()` — removes event handlers, calls `Marshal.ReleaseComObject(_lmxProxy)` to force-release all COM references, nulls the proxy and resets the connection handle.
6. Sets state to `Disconnected`.
### 1.3 Connection State
- `IsConnected` property: `_lmxProxy != null && _connectionState == Connected && _connectionHandle > 0`.
- `ConnectionState` enum: Disconnected, Connecting, Connected, Disconnecting, Error, Reconnecting.
- `ConnectionStateChanged` event fires on all state transitions with previous state, current state, and optional message.
### 1.4 Auto-Reconnect
When `AutoReconnect` is enabled (default), the `MonitorConnectionAsync` loop runs continuously:
- Checks `IsConnected` every `MonitorIntervalSeconds` (default 5 seconds).
- On disconnect, attempts reconnect via semaphore-protected `ConnectAsync()`.
- On failure, logs warning and retries at the next interval.
- Reconnection restores stored subscriptions automatically.
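The monitor loop described above can be sketched as follows. This is an illustrative Python model, not the C# implementation: `FakeClient`, `monitor_connection`, and the `max_checks` stopping parameter are hypothetical names introduced here, and an `asyncio.Lock` stands in for the reconnect semaphore.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for MxAccessClient; fails the first N connects."""

    def __init__(self, failures_before_success: int) -> None:
        self.is_connected = False
        self.connect_attempts = 0
        self._failures_left = failures_before_success

    async def connect(self) -> None:
        self.connect_attempts += 1
        if self._failures_left > 0:
            self._failures_left -= 1
            raise ConnectionError("simulated MXAccess failure")
        self.is_connected = True


async def monitor_connection(client, interval_s: float, max_checks: int) -> int:
    """Check the connection every interval and reconnect when it is down.

    Returns the number of failed reconnect attempts. max_checks bounds the
    loop for demonstration; a real monitor would run until cancelled.
    """
    reconnect_lock = asyncio.Lock()  # plays the role of the reconnect semaphore
    failures = 0
    for _ in range(max_checks):
        if not client.is_connected:
            async with reconnect_lock:
                try:
                    await client.connect()
                except ConnectionError:
                    failures += 1  # log a warning, retry at the next interval
        await asyncio.sleep(interval_s)
    return failures
```

A failed attempt is not retried immediately; the loop simply waits for the next interval, which keeps the retry rate bounded by `MonitorIntervalSeconds`.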
## 2. Thread Safety & COM Constraints
- State mutations protected by `lock (_lock)`.
- COM operations wrapped in `Task.Run` for STA thread context (MXAccess is 32-bit COM).
- Concurrency control: `_readSemaphore` and `_writeSemaphore` limit concurrent MXAccess operations to `MaxConcurrentOperations` (default 10, configurable).
- Default max concurrency constant: `DefaultMaxConcurrency = 10`.
## 3. Read Operations
- `ReadAsync(address, ct)` — Applies Polly retry policy, calls `ReadSingleValueAsync()`, returns `Vtq`.
- `ReadBatchAsync(addresses, ct)` — Creates parallel tasks per address via `ReadAddressWithSemaphoreAsync()`. Each task acquires `_readSemaphore` before reading. Returns `IReadOnlyDictionary<address, Vtq>`.
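The batch-read pattern (one task per address, each gated by a shared semaphore) can be modelled in a few lines. This is a Python sketch under assumptions: `read_batch` and the `read_one` callback are hypothetical names, and the Polly retry layer is omitted.

```python
import asyncio


async def read_batch(addresses, read_one, max_concurrency=10):
    """Read all addresses in parallel, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def read_with_semaphore(address):
        async with semaphore:  # acquire before touching the backend
            return address, await read_one(address)

    pairs = await asyncio.gather(*(read_with_semaphore(a) for a in addresses))
    return dict(pairs)  # address -> value, mirroring the dictionary result
```

All tasks are created up front, but only `max_concurrency` of them are inside `read_one` at any moment; the rest wait on the semaphore.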
## 4. Write Operations
- `WriteAsync(address, value, ct)` — Applies Polly retry policy, calls `WriteInternalAsync(address, value, ct)`.
- `WriteBatchAsync(values, ct)` — Parallel tasks via `WriteAddressWithSemaphoreAsync()`. Each task acquires `_writeSemaphore` before writing.
- `WriteBatchAndWaitAsync(values, flagAddress, flagValue, responseAddress, responseValue, ct)` — Writes batch, writes flag, polls response tag until match.
## 5. Subscription Management
- Subscriptions stored in `_storedSubscriptions` for reconnection persistence.
- `SubscribeInternalAsync(addresses, callback, storeSubscription)` — registers tags with MXAccess and stores subscription state.
- `RecreateStoredSubscriptionsAsync()` — called after reconnect to re-subscribe all previously active tags without re-storing.
- `RemoveAllSubscriptions()` — unsubscribes from MXAccess but retains `_storedSubscriptions`.
## 6. Event Handlers
- **OnDataChange** — Fired by MXAccess when a subscribed tag value changes. Routes the update to the SubscriptionManager.
- **OnWriteComplete** — Fired when an async write operation completes.
- **OperationComplete** — General operation completion callback.
## Dependencies
- **ArchestrA.MXAccess** COM interop assembly (`lib/ArchestrA.MXAccess.dll`).
- **Polly** — retry policies for read/write operations.
- **Configuration** — `ConnectionConfiguration` for timeouts, concurrency limits, and auto-reconnect settings.
## Interactions
- **GrpcServer** (ScadaGrpcService) delegates all SCADA operations to MxAccessClient via the `IScadaClient` interface.
- **SubscriptionManager** receives value change callbacks originating from MxAccessClient's COM event handlers.
- **HealthAndMetrics** queries `IsConnected` and `ConnectionState` for health checks.
- **ServiceHost** manages the MxAccessClient lifecycle (create at startup, dispose at shutdown).

@@ -0,0 +1,301 @@
# Component: Protocol
## Purpose
Defines the gRPC protocol specification for communication between the LmxProxy Client and Host, including the proto file definition, code-first contracts, message schemas, value type system, and quality codes. The authoritative specification is `docs/lmxproxy_updates.md`.
## Location
- `src/ZB.MOM.WW.LmxProxy.Host/Grpc/Protos/scada.proto` — proto file (Host, proto-generated).
- `src/ZB.MOM.WW.LmxProxy.Client/Domain/ScadaContracts.cs` — code-first contracts (Client, protobuf-net.Grpc).
- `docs/lmxproxy_updates.md` — authoritative protocol specification.
- `docs/lmxproxy_protocol.md` — legacy v1 protocol documentation (superseded).
## Responsibilities
- Define the gRPC service interface (`scada.ScadaService`) and all message types.
- Ensure wire compatibility between the Host's proto-generated code and the Client's code-first contracts.
- Specify the VTQ data model: `TypedValue` for values, `QualityCode` for quality.
- Document OPC UA-aligned quality codes filtered to AVEVA System Platform usage.
## 1. Service Definition
Service: `scada.ScadaService` (gRPC package: `scada`)
| RPC | Request | Response | Type |
|-----|---------|----------|------|
| Connect | ConnectRequest | ConnectResponse | Unary |
| Disconnect | DisconnectRequest | DisconnectResponse | Unary |
| GetConnectionState | GetConnectionStateRequest | GetConnectionStateResponse | Unary |
| Read | ReadRequest | ReadResponse | Unary |
| ReadBatch | ReadBatchRequest | ReadBatchResponse | Unary |
| Write | WriteRequest | WriteResponse | Unary |
| WriteBatch | WriteBatchRequest | WriteBatchResponse | Unary |
| WriteBatchAndWait | WriteBatchAndWaitRequest | WriteBatchAndWaitResponse | Unary |
| Subscribe | SubscribeRequest | stream VtqMessage | Server streaming |
| CheckApiKey | CheckApiKeyRequest | CheckApiKeyResponse | Unary |
## 2. Value Type System (TypedValue)
Values are transmitted in their native protobuf types via a `TypedValue` oneof. No string serialization or parsing heuristics are used.
```
message TypedValue {
  oneof value {
    bool bool_value = 1;
    int32 int32_value = 2;
    int64 int64_value = 3;
    float float_value = 4;
    double double_value = 5;
    string string_value = 6;
    bytes bytes_value = 7;
    int64 datetime_value = 8;    // UTC DateTime.Ticks (100 ns intervals since 0001-01-01)
    ArrayValue array_value = 9;  // typed arrays
  }
}
```
`ArrayValue` contains typed repeated fields via oneof: `BoolArray`, `Int32Array`, `Int64Array`, `FloatArray`, `DoubleArray`, `StringArray`. Each contains a `repeated` field of the corresponding primitive.
### 2.1 Null Handling
- Null is represented by an unset `oneof` (no field selected in `TypedValue`).
- A null or missing VTQ message is treated as Bad quality with null value and current UTC timestamp.
### 2.2 Type Mapping from Internal Tag Model
| Tag Data Type | TypedValue Field |
|---------------|-----------------|
| `bool` | `bool_value` |
| `int32` | `int32_value` |
| `int64` | `int64_value` |
| `float` | `float_value` |
| `double` | `double_value` |
| `string` | `string_value` |
| `byte[]` | `bytes_value` |
| `DateTime` | `datetime_value` (UTC Ticks as int64) |
| `float[]` | `array_value.float_values` |
| `int32[]` | `array_value.int32_values` |
| Other arrays | Corresponding `ArrayValue` field |
## 3. Quality System (QualityCode)
Quality is a structured message with an OPC UA-compatible numeric status code and a human-readable symbolic name:
```
message QualityCode {
  uint32 status_code = 1;    // OPC UA-compatible numeric status code
  string symbolic_name = 2;  // Human-readable name (e.g., "Good", "BadSensorFailure")
}
```
### 3.1 Category Extraction
The category is derived from the two high bits via `(statusCode & 0xC0000000)`:
- `0x00000000` = Good
- `0x40000000` = Uncertain
- `0x80000000` = Bad
```csharp
public static bool IsGood(uint statusCode) => (statusCode & 0xC0000000) == 0x00000000;
public static bool IsBad(uint statusCode) => (statusCode & 0xC0000000) == 0x80000000;
```
### 3.2 Supported Quality Codes
Filtered to codes actively used by AVEVA System Platform, InTouch, and OI Server/DAServer (per AVEVA Tech Note TN1305):
**Good Quality:**
| Symbolic Name | OPC UA Status Code | AVEVA OPC DA Hex | Description |
|--------------|-------------------|------------------|-------------|
| `Good` | `0x00000000` | `0x00C0` | Value is reliable, non-specific |
| `GoodLocalOverride` | `0x00D80000` | `0x00D8` | Manually overridden; input disconnected |
**Uncertain Quality:**
| Symbolic Name | OPC UA Status Code | AVEVA OPC DA Hex | Description |
|--------------|-------------------|------------------|-------------|
| `UncertainLastUsableValue` | `0x40900000` | `0x0044` | External source stopped writing; value is stale |
| `UncertainSensorNotAccurate` | `0x42390000` | `0x0050` | Sensor out of calibration or clamped |
| `UncertainEngineeringUnitsExceeded` | `0x40540000` | `0x0054` | Outside defined engineering limits |
| `UncertainSubNormal` | `0x40580000` | `0x0058` | Derived from insufficient good sources |
**Bad Quality:**
| Symbolic Name | OPC UA Status Code | AVEVA OPC DA Hex | Description |
|--------------|-------------------|------------------|-------------|
| `Bad` | `0x80000000` | `0x0000` | Non-specific bad; value not useful |
| `BadConfigurationError` | `0x80040000` | `0x0004` | Server config problem (e.g., item deleted) |
| `BadNotConnected` | `0x808A0000` | `0x0008` | Input not logically connected to source |
| `BadDeviceFailure` | `0x806B0000` | `0x000C` | Device failure detected |
| `BadSensorFailure` | `0x806D0000` | `0x0010` | Sensor failure detected |
| `BadLastKnownValue` | `0x80050000` | `0x0014` | Comm failed; last known value available |
| `BadCommunicationFailure` | `0x80050000` | `0x0018` | Comm failed; no last known value |
| `BadOutOfService` | `0x808F0000` | `0x001C` | Block off-scan/locked; item inactive |
| `BadWaitingForInitialData` | `0x80320000` | — | Initializing; OI Server establishing communication |
**Notes:**
- AVEVA OPC DA quality codes are 16-bit words; the low byte carries the quality as 2 bits major (Good/Bad/Uncertain), 4 bits minor (sub-status), and 2 bits limit (Not Limited, Low, High, Constant). The OPC UA status codes above are the standard UA equivalents.
- The limit bits can be combined with any quality code. For example, `Good + High Limited` = `0x00C2` in OPC DA. In OPC UA, limits are conveyed via separate status code bits, so the base code remains the same.
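The OPC DA bit layout in the notes above can be checked with a small decoder. This is a Python sketch; `decode_da_quality` is a hypothetical helper, and the major-field encoding (`00` Bad, `01` Uncertain, `11` Good) and limit ordering follow the usual OPC DA convention rather than anything stated in this document.

```python
MAJOR_NAMES = {0: "Bad", 1: "Uncertain", 3: "Good"}  # 2 is reserved in OPC DA
LIMIT_NAMES = {0: "NotLimited", 1: "Low", 2: "High", 3: "Constant"}


def decode_da_quality(quality: int):
    """Split an OPC DA quality word into (major, sub-status, limit)."""
    low_byte = quality & 0xFF           # quality lives in the low byte
    major = (low_byte >> 6) & 0x3       # top 2 bits: Good / Uncertain / Bad
    substatus = (low_byte >> 2) & 0xF   # middle 4 bits: sub-status
    limit = low_byte & 0x3              # bottom 2 bits: limit
    return MAJOR_NAMES.get(major, "Reserved"), substatus, LIMIT_NAMES[limit]
```

For example, `decode_da_quality(0x00C2)` yields `("Good", 0, "High")`, matching the `Good + High Limited` case in the notes.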
### 3.3 Error Condition Mapping
| Scenario | Quality |
|----------|---------|
| Normal read | `Good` (`0x00000000`) |
| Tag not found | `BadConfigurationError` (`0x80040000`) |
| Tag read exception / comms loss | `BadCommunicationFailure` (`0x80050000`) |
| Sensor failure | `BadSensorFailure` (`0x806D0000`) |
| Device failure | `BadDeviceFailure` (`0x806B0000`) |
| Stale value | `UncertainLastUsableValue` (`0x40900000`) |
| Block off-scan / disabled | `BadOutOfService` (`0x808F0000`) |
| Local override active | `GoodLocalOverride` (`0x00D80000`) |
| Initializing / waiting for first value | `BadWaitingForInitialData` (`0x80320000`) |
| Write to read-only tag | `WriteResult.success=false`, message indicates read-only |
| Type mismatch on write | `WriteResult.success=false`, message indicates type mismatch |
## 4. Message Schemas
### 4.1 VtqMessage
The core data type for tag value transport:
| Field | Proto Type | Order | Description |
|-------|-----------|-------|-------------|
| tag | string | 1 | Tag address |
| value | TypedValue | 2 | Typed value (native protobuf types) |
| timestamp_utc_ticks | int64 | 3 | UTC DateTime.Ticks (100ns intervals since 0001-01-01) |
| quality | QualityCode | 4 | Structured quality with status code and symbolic name |
A null or missing VTQ message is treated as Bad quality with null value and current UTC timestamp.
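Non-.NET consumers need to convert `timestamp_utc_ticks` themselves. The following Python sketch shows one way to do it; `to_utc_ticks` and `from_utc_ticks` are hypothetical helper names, and inputs are assumed to be timezone-aware. Python datetimes have microsecond resolution, so the final 100 ns digit of a tick value is truncated on the way in.

```python
from datetime import datetime, timedelta, timezone

TICKS_PER_SECOND = 10_000_000  # one tick is 100 ns
DOTNET_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)  # tick 0


def to_utc_ticks(moment: datetime) -> int:
    """Convert a timezone-aware datetime to UTC DateTime.Ticks."""
    delta = moment.astimezone(timezone.utc) - DOTNET_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


def from_utc_ticks(ticks: int) -> datetime:
    """Convert UTC DateTime.Ticks to a timezone-aware UTC datetime."""
    return DOTNET_EPOCH + timedelta(microseconds=ticks // 10)
```

As a sanity check, the Unix epoch (1970-01-01T00:00:00Z) corresponds to tick value 621355968000000000.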
### 4.2 Connection Messages
**ConnectRequest**: `client_id` (string), `api_key` (string)
**ConnectResponse**: `success` (bool), `message` (string), `session_id` (string — 32-char hex GUID)
**DisconnectRequest**: `session_id` (string)
**DisconnectResponse**: `success` (bool), `message` (string)
**GetConnectionStateRequest**: `session_id` (string)
**GetConnectionStateResponse**: `is_connected` (bool), `client_id` (string), `connected_since_utc_ticks` (int64)
### 4.3 Read Messages
**ReadRequest**: `session_id` (string), `tag` (string)
**ReadResponse**: `success` (bool), `message` (string), `vtq` (VtqMessage)
**ReadBatchRequest**: `session_id` (string), `tags` (repeated string)
**ReadBatchResponse**: `success` (bool), `message` (string), `vtqs` (repeated VtqMessage)
### 4.4 Write Messages
**WriteRequest**: `session_id` (string), `tag` (string), `value` (TypedValue)
**WriteResponse**: `success` (bool), `message` (string)
**WriteItem**: `tag` (string), `value` (TypedValue)
**WriteResult**: `tag` (string), `success` (bool), `message` (string)
**WriteBatchRequest**: `session_id` (string), `items` (repeated WriteItem)
**WriteBatchResponse**: `success` (bool), `message` (string), `results` (repeated WriteResult)
### 4.5 WriteBatchAndWait Messages
**WriteBatchAndWaitRequest**:
- `session_id` (string)
- `items` (repeated WriteItem) — values to write
- `flag_tag` (string) — tag to poll after writes
- `flag_value` (TypedValue) — expected value (type-aware comparison)
- `timeout_ms` (int32) — max wait time (default 5000ms if ≤ 0)
- `poll_interval_ms` (int32) — polling interval (default 100ms if ≤ 0)
**WriteBatchAndWaitResponse**:
- `success` (bool)
- `message` (string)
- `write_results` (repeated WriteResult)
- `flag_reached` (bool) — whether the flag value was matched
- `elapsed_ms` (int32) — total elapsed time
**Behavior:**
1. All writes execute first. If any write fails, returns immediately with `success=false`.
2. If writes succeed, polls `flag_tag` at `poll_interval_ms` intervals.
3. Uses type-aware `TypedValueEquals()` comparison (see Section 4.5.1).
4. If flag matches before timeout: `success=true`, `flag_reached=true`.
5. If timeout expires: `success=true`, `flag_reached=false` (timeout is not an error).
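The behavior above (write everything, then poll until match or timeout) can be sketched as a small synchronous function. This is a Python model under assumptions: `write_batch_and_wait`, the `write` and `read` callbacks, and the dictionary response shape are hypothetical, and plain `==` stands in for the type-aware `TypedValueEquals()` comparison.

```python
import time


def write_batch_and_wait(write, read, items, flag_tag, flag_value,
                         timeout_ms=0, poll_interval_ms=0):
    """Write all items, then poll flag_tag until it equals flag_value."""
    timeout_ms = timeout_ms if timeout_ms > 0 else 5000                  # default 5000 ms
    poll_interval_ms = poll_interval_ms if poll_interval_ms > 0 else 100  # default 100 ms
    started = time.monotonic()

    def response(success, flag_reached, write_results):
        return {
            "success": success,
            "flag_reached": flag_reached,
            "write_results": write_results,
            "elapsed_ms": int((time.monotonic() - started) * 1000),
        }

    # Step 1: execute every write and record a per-tag result.
    write_results = [(tag, write(tag, value)) for tag, value in items]
    if not all(ok for _, ok in write_results):
        return response(False, False, write_results)

    # Steps 2-5: poll until the flag matches or the timeout expires.
    deadline = started + timeout_ms / 1000
    while True:
        if read(flag_tag) == flag_value:
            return response(True, True, write_results)
        if time.monotonic() >= deadline:
            return response(True, False, write_results)  # timeout is not an error
        time.sleep(poll_interval_ms / 1000)
```

Callers therefore need to inspect `flag_reached` rather than `success` to tell a matched flag from a timeout.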
#### 4.5.1 Flag Comparison Rules
Type-aware comparison via `TypedValueEquals()`:
- Both values must have the same `oneof` case (same type). Mismatched types are never equal.
- Numeric comparison uses the native type's equality (no floating-point string round-trip issues).
- String comparison is case-sensitive.
- Bool comparison is direct equality.
- Null (unset `oneof`) equals null. Null does not equal any set value.
- Array comparison: element-by-element equality, same length required.
- `datetime_value` compared as `int64` equality (tick-level precision).
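These rules can be expressed compactly. In the Python sketch below, a `TypedValue` is modelled as `None` (unset `oneof`) or a `(case, value)` tuple, and an array as `("array_value", (element_case, [items]))`; this modelling and the name `typed_value_equals` are assumptions made for illustration.

```python
def typed_value_equals(a, b) -> bool:
    """Type-aware equality following the flag comparison rules."""
    # Null equals null, and never equals a set value.
    if a is None or b is None:
        return a is None and b is None

    case_a, value_a = a
    case_b, value_b = b
    if case_a != case_b:
        return False  # mismatched oneof cases are never equal

    if case_a == "array_value":
        elem_case_a, items_a = value_a
        elem_case_b, items_b = value_b
        return (elem_case_a == elem_case_b
                and len(items_a) == len(items_b)
                and all(x == y for x, y in zip(items_a, items_b)))

    # bool, numeric, string (case-sensitive), bytes, and datetime ticks
    # all use the native equality of their type.
    return value_a == value_b
```

The explicit case check matters: without it, an `int32_value` of 1 would compare equal to an `int64_value` of 1 or a `bool_value` of true in many languages.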
### 4.6 Subscription Messages
**SubscribeRequest**: `session_id` (string), `tags` (repeated string), `sampling_ms` (int32)
Response: streamed `VtqMessage` items.
### 4.7 API Key Messages
**CheckApiKeyRequest**: `api_key` (string)
**CheckApiKeyResponse**: `is_valid` (bool), `message` (string)
## 5. Dual gRPC Stack Compatibility
The Host and Client use different gRPC implementations:
| Aspect | Host | Client |
|--------|------|--------|
| Stack | Grpc.Core (C-core) | Grpc.Net.Client |
| Contract | Proto file (`scada.proto`) + Grpc.Tools codegen | Code-first (`[ServiceContract]`, `[DataContract]`) via protobuf-net.Grpc |
| Runtime | .NET Framework 4.8 | .NET 10 |
Both target `scada.ScadaService` and produce identical wire format. Field ordering in `[DataMember(Order = N)]` matches proto field numbers.
## 6. V1 Legacy Protocol
The current codebase implements the v1 protocol. The following describes v1 behavior that will be replaced during migration to v2.
### 6.1 V1 Value Encoding
All values transmitted as strings:
- Write direction: server parses string values in order: bool → int → long → double → DateTime → raw string.
- Read direction: server serializes via `.ToString()` (bool → lowercase, DateTime → ISO-8601, arrays → JSON).
- Client parses: double → bool → null (empty string) → raw string.
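To see why the v1 heuristics are lossy, the write-direction parse order can be modelled in Python. This is an approximation only: `parse_v1_write_value` is a hypothetical name, and Python's `int`, `float`, and `datetime.fromisoformat` accept slightly different inputs than the .NET parsers the server actually uses.

```python
from datetime import datetime

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def parse_v1_write_value(text: str):
    """Return (kind, value) using the v1 order: bool, int, long, double, DateTime, string."""
    lowered = text.strip().lower()
    if lowered in ("true", "false"):
        return "bool", lowered == "true"
    try:
        number = int(text)
        if INT32_MIN <= number <= INT32_MAX:
            return "int", number
        if INT64_MIN <= number <= INT64_MAX:
            return "long", number
    except ValueError:
        pass
    try:
        return "double", float(text)
    except ValueError:
        pass
    try:
        return "DateTime", datetime.fromisoformat(text)
    except ValueError:
        pass
    return "string", text  # nothing else matched: keep the raw string
```

A string tag written with the text `007` is silently turned into the integer 7, which is the kind of ambiguity `TypedValue` removes.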
### 6.2 V1 Quality
Three-state string quality (`"Good"`, `"Uncertain"`, `"Bad"`, case-insensitive). OPC UA numeric ranges: ≥192 = Good, 64 to 191 = Uncertain, <64 = Bad.
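The v1 quality rules reduce to two small functions. The Python sketch below uses hypothetical names (`v1_quality_from_code`, `parse_v1_quality`) and assumes the middle range is 64 to 191 inclusive, as implied by the Good and Bad thresholds.

```python
def v1_quality_from_code(code: int) -> str:
    """Map a numeric quality to the v1 three-state string."""
    if code >= 192:
        return "Good"
    if code >= 64:
        return "Uncertain"
    return "Bad"


def parse_v1_quality(text: str) -> str:
    """Normalize a v1 quality string, ignoring case."""
    normalized = text.strip().capitalize()
    if normalized not in ("Good", "Uncertain", "Bad"):
        raise ValueError(f"unknown v1 quality: {text!r}")
    return normalized
```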
### 6.3 V1 → V2 Field Changes
| Message | Field | V1 Type | V2 Type |
|---------|-------|---------|---------|
| VtqMessage | value | string | TypedValue |
| VtqMessage | quality | string | QualityCode |
| WriteRequest | value | string | TypedValue |
| WriteItem | value | string | TypedValue |
| WriteBatchAndWaitRequest | flag_value | string | TypedValue |
All RPC signatures remain unchanged. Only value and quality fields change type.
### 6.4 Migration Strategy
Clean break — no backward compatibility layer. All clients and servers updated simultaneously. This is appropriate because LmxProxy is an internal protocol with a small, controlled client count. Dual-format support adds complexity with no long-term benefit.
## Dependencies
- **Grpc.Core** + **Grpc.Tools** — proto compilation and server hosting (Host).
- **protobuf-net.Grpc** — code-first contracts (Client).
- **Grpc.Net.Client** — HTTP/2 transport (Client).
## Interactions
- **GrpcServer** implements the service defined by this protocol.
- **Client** consumes the service defined by this protocol.
- **MxAccessClient** is the backend that executes the operations requested via the protocol.

@@ -0,0 +1,119 @@
# Component: Security
## Purpose
Provides API key-based authentication and role-based authorization for the gRPC service, along with TLS certificate management for transport security.
## Location
- `src/ZB.MOM.WW.LmxProxy.Host/Security/ApiKeyService.cs` — API key storage and validation.
- `src/ZB.MOM.WW.LmxProxy.Host/Security/ApiKeyInterceptor.cs` — gRPC server interceptor for authentication/authorization.
- `src/ZB.MOM.WW.LmxProxy.Client/Security/GrpcChannelFactory.cs` — Client-side TLS channel factory.
## Responsibilities
- Load and hot-reload API keys from a JSON configuration file.
- Validate API keys on every gRPC request via a server interceptor.
- Enforce role-based access control (ReadOnly vs ReadWrite).
- Manage TLS certificates for server and optional mutual TLS.
## 1. API Key Service
### 1.1 Key Storage
- Keys are stored in a JSON file (default `apikeys.json`).
- File format: `{ "ApiKeys": [{ "Key": "...", "Description": "...", "Role": "ReadOnly|ReadWrite", "Enabled": true|false }] }`.
- If the file does not exist at startup, the service auto-generates a default file with two random keys: one ReadOnly and one ReadWrite.
### 1.2 Hot Reload
- A `FileSystemWatcher` monitors the API key file for changes.
- Rapid changes are debounced (1-second minimum between reloads).
- `ReloadConfigurationAsync` uses a `SemaphoreSlim` to serialize reload operations.
- New and modified keys take effect on the next request. Removed or disabled keys reject future requests immediately.
- Active sessions are not affected by key changes — sessions are tracked independently by SessionManager.
### 1.3 Validation
- `ValidateApiKey(apiKey)` — Returns the `ApiKey` object if the key exists and `Enabled` is true, otherwise null.
- `HasRole(apiKey, requiredRole)` — Returns true if the key has the required role. Role hierarchy: ReadWrite implies ReadOnly.
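The two validation methods can be modelled as follows. This is a Python sketch, not the C# code: the `ApiKey` dataclass mirrors the documented fields, while `ROLE_RANK`, `validate_api_key`, and `has_role` are illustrative names, and the key store is assumed to be a dictionary keyed by the secret value.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ApiKey:
    key: str
    description: str
    role: str      # "ReadOnly" or "ReadWrite"
    enabled: bool


# Higher rank implies every lower rank: ReadWrite implies ReadOnly.
ROLE_RANK = {"ReadOnly": 1, "ReadWrite": 2}


def validate_api_key(keys: Dict[str, ApiKey], candidate: Optional[str]) -> Optional[ApiKey]:
    """Return the ApiKey if it exists and is enabled, otherwise None."""
    entry = keys.get(candidate) if candidate else None
    return entry if entry is not None and entry.enabled else None


def has_role(api_key: ApiKey, required_role: str) -> bool:
    """True when the key's role is at least the required role."""
    return ROLE_RANK[api_key.role] >= ROLE_RANK[required_role]
```

Ranking the roles numerically keeps the hierarchy check to a single comparison and extends naturally if more roles are added later.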
## 2. API Key Interceptor
### 2.1 Authentication Flow
The `ApiKeyInterceptor` intercepts every unary and server-streaming RPC:
1. Extracts the `x-api-key` header from gRPC request metadata.
2. Calls `ApiKeyService.ValidateApiKey()`.
3. If the key is invalid or missing, returns `StatusCode.Unauthenticated`.
4. For write-protected methods (`Write`, `WriteBatch`, `WriteBatchAndWait`), checks that the key has the `ReadWrite` role. Returns `StatusCode.PermissionDenied` if the key is `ReadOnly`.
5. Adds the validated `ApiKey` to `context.UserState["ApiKey"]` for downstream use.
6. Continues to the service method.
### 2.2 Write-Protected Methods
These RPCs require the `ReadWrite` role:
- `Write`
- `WriteBatch`
- `WriteBatchAndWait`
All other RPCs (`Connect`, `Disconnect`, `GetConnectionState`, `Read`, `ReadBatch`, `Subscribe`, `CheckApiKey`) are allowed for `ReadOnly` keys.
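The interceptor's decision logic can be summarized in one function. The Python sketch below is a model under assumptions: `authorize` is a hypothetical name, request metadata is a plain dictionary, key entries use the JSON field names from the key file, and status codes are returned as strings rather than gRPC status objects.

```python
WRITE_PROTECTED = {"Write", "WriteBatch", "WriteBatchAndWait"}


def authorize(method: str, metadata: dict, keys: dict) -> str:
    """Return the gRPC status name the interceptor would produce."""
    api_key = metadata.get("x-api-key")
    entry = keys.get(api_key) if api_key else None

    # Missing, unknown, or disabled keys fail authentication.
    if entry is None or not entry["Enabled"]:
        return "Unauthenticated"

    # Write RPCs additionally require the ReadWrite role.
    if method in WRITE_PROTECTED and entry["Role"] != "ReadWrite":
        return "PermissionDenied"

    return "OK"  # continue to the service method
```

Authentication is checked before authorization, so a disabled `ReadWrite` key yields `Unauthenticated` rather than `PermissionDenied`.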
## 3. API Key Model
| Field | Type | Description |
|-------|------|-------------|
| Key | string | The secret API key value |
| Description | string | Human-readable name for the key |
| Role | ApiKeyRole | `ReadOnly` or `ReadWrite` |
| Enabled | bool | Whether the key is active |
`ApiKeyRole` enum: `ReadOnly` (read and subscribe only), `ReadWrite` (full access including writes).
## 4. TLS Configuration
### 4.1 Server-Side (Host)
Configured via `TlsConfiguration` in `appsettings.json`:
| Setting | Default | Description |
|---------|---------|-------------|
| Enabled | false | Enable TLS on the gRPC server |
| ServerCertificatePath | `certs/server.crt` | PEM server certificate |
| ServerKeyPath | `certs/server.key` | PEM server private key |
| ClientCaCertificatePath | `certs/ca.crt` | CA certificate for mTLS client validation |
| RequireClientCertificate | false | Require client certificates (mutual TLS) |
| CheckCertificateRevocation | false | Check certificate revocation lists |
If TLS is enabled but certificates are missing, the service generates self-signed certificates at startup.
### 4.2 Client-Side
`ClientTlsConfiguration` in the client library:
| Setting | Default | Description |
|---------|---------|-------------|
| UseTls | false | Enable TLS on the client connection |
| ClientCertificatePath | null | Client certificate for mTLS |
| ClientKeyPath | null | Client private key for mTLS |
| ServerCaCertificatePath | null | Custom CA for server validation |
| ServerNameOverride | null | SNI/hostname override |
| ValidateServerCertificate | true | Validate the server certificate chain |
| AllowSelfSignedCertificates | false | Accept self-signed server certificates |
| IgnoreAllCertificateErrors | false | Skip all certificate validation (dangerous) |
- SSL protocols: TLS 1.2 and TLS 1.3.
- Client certificates loaded from PEM files and converted to PKCS12.
- Custom CA trust store support via chain building.
## Dependencies
- **Configuration** — TLS settings and API key file path from `appsettings.json`.
- **System.IO.FileSystemWatcher** — API key file change detection.
## Interactions
- **GrpcServer** — the ApiKeyInterceptor runs before every RPC in ScadaGrpcService.
- **ServiceHost** — creates ApiKeyService and ApiKeyInterceptor at startup, configures gRPC server credentials.
- **Client** — GrpcChannelFactory creates TLS-configured gRPC channels in LmxProxyClient.

@@ -0,0 +1,108 @@
# Component: ServiceHost
## Purpose
The entry point and lifecycle manager for the LmxProxy Windows service. Handles Topshelf service hosting, Serilog logging setup, component initialization/teardown ordering, and Windows SCM service recovery configuration.
## Location
- `src/ZB.MOM.WW.LmxProxy.Host/Program.cs` — entry point, Serilog setup, Topshelf configuration.
- `src/ZB.MOM.WW.LmxProxy.Host/LmxProxyService.cs` — service lifecycle (Start, Stop, Pause, Continue, Shutdown).
## Responsibilities
- Configure and launch the Topshelf Windows service.
- Load and validate configuration from `appsettings.json`.
- Initialize Serilog logging.
- Orchestrate service startup: create all components in dependency order, connect to MxAccess, start servers.
- Orchestrate service shutdown: stop servers, dispose all components in reverse order.
- Configure Windows SCM service recovery policies.
## 1. Entry Point (Program.cs)
1. Builds configuration from `appsettings.json` + environment variables via `ConfigurationBuilder`.
2. Configures Serilog from the `Serilog` section of appsettings (console + file sinks).
3. Validates configuration using `ConfigurationValidator.ValidateAndLog()`.
4. Configures Topshelf `HostFactory`:
- Service name: `ZB.MOM.WW.LmxProxy.Host`
- Display name: `SCADA Bridge LMX Proxy`
- Start automatically on boot.
- Service recovery: first failure 1 min, second 5 min, subsequent 10 min, reset period 1 day.
5. Runs the Topshelf host (blocks until service stops).
## 2. Service Lifecycle (LmxProxyService)
### 2.1 Startup Sequence (Start)
Components are created and started in dependency order:
1. Validate configuration.
2. Check/generate TLS certificates (if TLS enabled).
3. Create `PerformanceMetrics`.
4. Create `ApiKeyService` — loads API keys from file.
5. Create `MxAccessClient` via factory.
6. Subscribe to connection state changes.
7. Connect to MxAccess synchronously — times out at `ConnectionTimeoutSeconds` (default 30s).
8. Start `MonitorConnectionAsync` (if `AutoReconnect` enabled).
9. Create `SubscriptionManager`.
10. Create `SessionManager`.
11. Create `HealthCheckService` + `DetailedHealthCheckService`.
12. Create `StatusReportService` + `StatusWebServer`.
13. Create `ScadaGrpcService`.
14. Create `ApiKeyInterceptor`.
15. Configure gRPC `Server` with TLS or insecure credentials.
16. Start gRPC server on `0.0.0.0:{GrpcPort}`.
17. Start `StatusWebServer`.
### 2.2 Shutdown Sequence (Stop)
Components are stopped and disposed in reverse order:
1. Cancel reconnect monitor — wait **5 seconds** for exit.
2. Graceful gRPC server shutdown — **10-second** timeout, then kill.
3. Stop StatusWebServer — **5-second** wait.
4. Dispose all components in reverse creation order.
5. Disconnect from MxAccess — **10-second** timeout.
### 2.3 Other Lifecycle Events
- **Pause**: Supported by Topshelf but behavior is a no-op beyond logging.
- **Continue**: Resume from pause, no-op beyond logging.
- **Shutdown**: System shutdown signal, triggers the same shutdown sequence as Stop.
## 3. Service Recovery (Windows SCM)
Configured via Topshelf's `EnableServiceRecovery`:
| Failure | Action | Delay |
|---------|--------|-------|
| First | Restart service | 1 minute |
| Second | Restart service | 5 minutes |
| Subsequent | Restart service | 10 minutes |
| Reset period | — | 1 day |
All values are configurable via `ServiceRecoveryConfiguration`.
## 4. Service Identity
| Property | Value |
|----------|-------|
| Service name | `ZB.MOM.WW.LmxProxy.Host` |
| Display name | `SCADA Bridge LMX Proxy` |
| Start mode | Automatic |
| Platform | x86 (.NET Framework 4.8) |
| Framework | Topshelf |
## Dependencies
- **Topshelf** — Windows service framework.
- **Serilog** — structured logging (console + file sinks).
- **Microsoft.Extensions.Configuration** — configuration loading.
- **Configuration** — validated configuration objects.
- All other components are created and managed by LmxProxyService.
## Interactions
- **Configuration** is loaded and validated first; all other components receive their settings from it.
- **MxAccessClient** is connected synchronously during startup. If connection fails within the timeout, the service fails to start.
- **GrpcServer** and **StatusWebServer** are started last, after all dependencies are ready.

@@ -0,0 +1,76 @@
# Component: SessionManager
## Purpose
Tracks active client sessions, mapping session IDs to client metadata. Provides session creation, validation, and termination for the gRPC service layer.
## Location
`src/ZB.MOM.WW.LmxProxy.Host/Sessions/SessionManager.cs`
## Responsibilities
- Create new sessions with unique identifiers when clients connect.
- Validate session IDs on every data operation.
- Track session metadata (client ID, API key, connection time, last activity).
- Terminate sessions on client disconnect.
- Provide session listing for monitoring and status reporting.
## 1. Session Storage
- Sessions are stored in a `ConcurrentDictionary<string, SessionInfo>` (lock-free, thread-safe).
- Session state is in-memory only — all sessions are lost on service restart.
- `ActiveSessionCount` property returns the current count of tracked sessions.
## 2. Session Lifecycle
### 2.1 Creation
`CreateSession(clientId, apiKey)`:
- Generates a unique session ID: `Guid.NewGuid().ToString("N")` (32-character lowercase hex string, no hyphens).
- Creates a `SessionInfo` record with `ConnectedAt` and `LastActivity` set to `DateTime.UtcNow`.
- Stores the session in the dictionary.
- Returns the session ID to the client.
### 2.2 Validation
`ValidateSession(sessionId)`:
- Looks up the session ID in the dictionary.
- If found, updates `LastActivity` to `DateTime.UtcNow` and returns `true`.
- If not found, returns `false`.
### 2.3 Termination
`TerminateSession(sessionId)`:
- Removes the session from the dictionary.
- Returns `true` if the session existed, `false` otherwise.
### 2.4 Query
- `GetSession(sessionId)` — Returns `SessionInfo` or `null` if not found.
- `GetAllSessions()` — Returns `IReadOnlyList<SessionInfo>` snapshot of all active sessions.
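The session lifecycle above maps onto a small in-memory store. The Python sketch below is a single-threaded model with hypothetical method names; a plain `dict` stands in for `ConcurrentDictionary`, so it does not reproduce the thread-safety of the real class.

```python
import uuid
from datetime import datetime, timezone


class SessionManager:
    """Minimal in-memory session store mirroring the documented lifecycle."""

    def __init__(self) -> None:
        self._sessions = {}

    @property
    def active_session_count(self) -> int:
        return len(self._sessions)

    def create_session(self, client_id: str, api_key: str) -> str:
        session_id = uuid.uuid4().hex  # 32 lowercase hex chars, no hyphens
        now = datetime.now(timezone.utc)
        self._sessions[session_id] = {
            "client_id": client_id,
            "api_key": api_key,
            "connected_at": now,
            "last_activity": now,
        }
        return session_id

    def validate_session(self, session_id: str) -> bool:
        session = self._sessions.get(session_id)
        if session is None:
            return False
        session["last_activity"] = datetime.now(timezone.utc)  # touch on use
        return True

    def terminate_session(self, session_id: str) -> bool:
        # True only if the session existed.
        return self._sessions.pop(session_id, None) is not None
```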
## 3. SessionInfo
| Field | Type | Description |
|-------|------|-------------|
| SessionId | string | 32-character hex GUID |
| ClientId | string | Client-provided identifier |
| ApiKey | string | API key used for authentication |
| ConnectedAt | DateTime | UTC time of session creation |
| LastActivity | DateTime | UTC time of last operation (updated on each validation) |
| ConnectedSinceUtcTicks | long | `ConnectedAt.Ticks` for gRPC response serialization |
## 4. Disposal
`Dispose()` clears all sessions from the dictionary. No notifications are sent to connected clients.
## Dependencies
None. SessionManager is a standalone in-memory store with no external dependencies.
## Interactions
- **GrpcServer** calls `CreateSession` on Connect, `ValidateSession` on every data operation, and `TerminateSession` on Disconnect.
- **HealthAndMetrics** reads `ActiveSessionCount` for health check data.
- **StatusReportService** reads session information for the status dashboard.

@@ -0,0 +1,116 @@
# Component: SubscriptionManager
## Purpose
Manages the lifecycle of tag value subscriptions, multiplexing multiple client subscriptions onto shared MXAccess tag subscriptions and delivering updates via per-client bounded channels with configurable backpressure.
## Location
`src/ZB.MOM.WW.LmxProxy.Host/Subscriptions/SubscriptionManager.cs`
## Responsibilities
- Create per-client subscription channels with bounded capacity.
- Share underlying MXAccess tag subscriptions across multiple clients subscribing to the same tags.
- Deliver tag value updates from MXAccess callbacks to all subscribed clients.
- Handle backpressure when client channels are full (DropOldest, DropNewest, or Wait).
- Clean up subscriptions on client disconnect.
- Notify all subscribed clients with bad quality when MXAccess disconnects.
## 1. Architecture
### 1.1 Per-Client Channels
Each subscribing client gets a bounded `System.Threading.Channel<(string address, Vtq vtq)>`:
- Capacity: configurable (default 1000 messages).
- Full mode: configurable (default `DropOldest`).
- `SingleReader = true`, `SingleWriter = false`.
### 1.2 Shared Tag Subscriptions
Tag subscriptions to MXAccess are shared across clients:
- When the first client subscribes to a tag, a new MXAccess subscription is created.
- When additional clients subscribe to the same tag, they are added to the existing tag subscription's client set.
- When the last client unsubscribes from a tag, the MXAccess subscription is disposed.
### 1.3 Thread Safety
- `ReaderWriterLockSlim` protects tag subscription updates.
- `ConcurrentDictionary` for client subscription tracking.
## 2. Subscription Flow
### 2.1 Subscribe
`SubscribeAsync(clientId, addresses, ct)`:
1. Creates a bounded channel with configured capacity and full mode.
2. Creates a `ClientSubscription` record (clientId, channel, address set, CancellationTokenSource, counters).
3. For each tag address:
- If the tag already has a subscription, adds the client to the existing `TagSubscription.clientIds` set.
- Otherwise, creates a new `TagSubscription` and calls `_scadaClient.SubscribeAsync()` to register with MXAccess (outside the lock to avoid blocking).
4. Registers a cancellation token callback to automatically call `UnsubscribeClient` on disconnect.
5. Returns the channel reader for the GrpcServer to stream from.
### 2.2 Value Updates
`OnTagValueChanged(address, Vtq)` — called from MxAccessClient's COM event handler:
1. Looks up the tag subscription to find all subscribed clients.
2. For each client, calls `channel.Writer.TryWrite((address, vtq))`.
3. If the channel is full:
- **DropOldest**: Logs a warning, increments `DroppedMessageCount`. The oldest message is automatically discarded by the channel.
- **DropNewest**: Drops the incoming message.
- **Wait**: Blocks the writer until space is available (not recommended for gRPC streaming).
4. On channel closed (client disconnected), schedules `UnsubscribeClient` cleanup.
### 2.3 Unsubscribe
`UnsubscribeClient(clientId)`:
1. Removes the client from the client dictionary.
2. For each tag the client was subscribed to, removes the client from the tag's subscriber set.
3. If a tag has no remaining subscribers, disposes the MXAccess subscription handle.
4. Completes the client's channel writer (signals end of stream).
## 3. Backpressure
| Mode | Behavior | Use Case |
|------|----------|----------|
| DropOldest | Discards the oldest message when the channel is full; the drop is logged and counted in `DroppedMessageCount` | Default. Fire-and-forget semantics. No client blocking. |
| DropNewest | Drops the incoming message when channel is full | Preserves history, drops latest updates. |
| Wait | Blocks the writer until space is available | Not recommended for gRPC streaming (blocks callback thread). |
Per-client statistics track `DeliveredMessageCount` and `DroppedMessageCount` for monitoring via the status dashboard.
## 4. Disconnection Handling
### 4.1 Client Disconnect
When a client's gRPC stream ends (cancellation or error), the cancellation token callback triggers `UnsubscribeClient`, which cleans up all tag subscriptions for that client.
### 4.2 MxAccess Disconnect
`OnConnectionStateChanged` — when the MxAccess connection drops:
- Sends a bad-quality Vtq to all subscribed clients via their channels.
- Each client receives an async notification of the connection loss.
- Tag subscriptions are retained in memory for reconnection (via MxAccessClient's `_storedSubscriptions`).
## 5. Statistics
`GetSubscriptionStats()` returns:
- `TotalClients` — number of active client subscriptions.
- `TotalTags` — number of unique tags with active MXAccess subscriptions.
- `ActiveSubscriptions` — total client-tag subscription count.
## Dependencies
- **MxAccessClient** (IScadaClient) — creates and disposes MXAccess tag subscriptions.
- **Configuration** — `SubscriptionConfiguration` for channel capacity and full mode.
## Interactions
- **GrpcServer** calls `SubscribeAsync` on Subscribe RPC and reads from the returned channel.
- **MxAccessClient** delivers value updates via the `OnTagValueChanged` callback.
- **HealthAndMetrics** reads subscription statistics for health checks and status reports.
- **ServiceHost** disposes the SubscriptionManager at shutdown.

@@ -0,0 +1,274 @@
# LmxProxy - High Level Requirements
## 1. System Purpose
LmxProxy is a gRPC proxy service that bridges SCADA clients to AVEVA System Platform (Wonderware) via the ArchestrA MXAccess COM API. It exists because MXAccess is a 32-bit COM component that requires co-location with System Platform on a Windows machine running .NET Framework 4.8. LmxProxy isolates this constraint behind a gRPC interface, allowing modern .NET clients to access System Platform data remotely over HTTP/2.
## 2. Architecture
### 2.1 Two-Project Structure
- **ZB.MOM.WW.LmxProxy.Host** — .NET Framework 4.8, x86-only Windows service. Hosts a gRPC server (Grpc.Core) fronting the MXAccess COM API. Runs on the same machine as AVEVA System Platform.
- **ZB.MOM.WW.LmxProxy.Client** — .NET 10, AnyCPU class library. Code-first gRPC client (protobuf-net.Grpc) consumed by ScadaLink's Data Connection Layer. Packaged as a NuGet library.
### 2.2 Dual gRPC Stacks
The two projects use different gRPC implementations that are wire-compatible:
- **Host**: Proto-file-generated code via `Grpc.Core` + `Grpc.Tools`. Uses the deprecated C-core gRPC library because .NET Framework 4.8 cannot host the ASP.NET Core gRPC server (`Grpc.AspNetCore.Server`).
- **Client**: Code-first contracts via `protobuf-net.Grpc` with `[DataContract]`/`[ServiceContract]` attributes over `Grpc.Net.Client`.
Both target the same `scada.ScadaService` gRPC service definition.
### 2.3 Deployment Model
- The Host service runs on the AVEVA System Platform machine (or any machine with MXAccess access).
- Clients connect remotely over gRPC (HTTP/2) on a configurable port (default 50051).
- The Host runs as a Windows service managed by Topshelf.
## 3. Communication Protocol
### 3.1 Transport
- gRPC over HTTP/2.
- Default server port: 50051.
- Optional TLS with mutual TLS (mTLS) support.
### 3.2 RPCs
The service exposes 10 RPCs:
| RPC | Type | Description |
|-----|------|-------------|
| Connect | Unary | Establish session, returns session ID |
| Disconnect | Unary | Terminate session |
| GetConnectionState | Unary | Query MxAccess connection status |
| Read | Unary | Read single tag value |
| ReadBatch | Unary | Read multiple tag values |
| Write | Unary | Write single tag value |
| WriteBatch | Unary | Write multiple tag values |
| WriteBatchAndWait | Unary | Write values, poll flag tag until match or timeout |
| Subscribe | Server streaming | Stream tag value updates to client |
| CheckApiKey | Unary | Validate API key and return role |
### 3.3 Data Model (VTQ)
All tag values are represented as VTQ (Value, Timestamp, Quality) tuples:
- **Value**: `TypedValue` — a protobuf `oneof` carrying the value in its native type (bool, int32, int64, float, double, string, bytes, datetime, typed arrays). An unset `oneof` represents null.
- **Timestamp**: UTC `DateTime.Ticks` as `int64` (100-nanosecond intervals since 0001-01-01 00:00:00 UTC).
- **Quality**: `QualityCode` — a structured message with `uint32 status_code` (OPC UA-compatible) and `string symbolic_name`. Category derived from high bits: `0x00xxxxxx` = Good, `0x40xxxxxx` = Uncertain, `0x80xxxxxx` = Bad.
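To make the VTQ encoding concrete, the sketch below derives the quality category from the two most significant bits of `status_code` and converts the tick-based timestamp back to a `DateTime`. The `VtqDecoding` and `QualityCategory` names are hypothetical, and treating the unused `0xC0xxxxxx` range as Bad is an assumption of this sketch rather than documented behavior.

```csharp
using System;

public enum QualityCategory { Good, Uncertain, Bad }

// Hypothetical helper; not part of the documented API.
public static class VtqDecoding
{
    // Classifies a status code by its two most significant bits.
    public static QualityCategory CategoryOf(uint statusCode)
    {
        switch (statusCode >> 30)
        {
            case 0: return QualityCategory.Good;       // 0x00xxxxxx
            case 1: return QualityCategory.Uncertain;  // 0x40xxxxxx
            default: return QualityCategory.Bad;       // 0x80xxxxxx; 0xC0xxxxxx also lands here (assumption)
        }
    }

    // Converts the wire timestamp (UTC ticks) back into a UTC DateTime.
    public static DateTime TimestampOf(long utcTicks)
    {
        return new DateTime(utcTicks, DateTimeKind.Utc);
    }
}
```

For example, `CategoryOf(0x806D0000)` yields `Bad` and `CategoryOf(0x00D80000)` yields `Good`.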
## 4. Session Lifecycle
- Clients call `Connect` with a client ID and optional API key to establish a session.
- The server returns a 32-character hex GUID as the session ID.
- All subsequent operations require the session ID for validation.
- Sessions persist until explicit `Disconnect` or server restart. There is no idle timeout.
- Session state is tracked in memory (not persisted). All sessions are lost on service restart.
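A 32-character hex session ID is what the `"N"` GUID format produces. A minimal sketch, assuming a hypothetical `SessionIds` helper (the real SessionManager may generate the ID differently):

```csharp
using System;

// Hypothetical helper; not taken from the codebase.
public static class SessionIds
{
    // "N" formats a GUID as 32 hex digits without hyphens or braces.
    public static string NewSessionId()
    {
        return Guid.NewGuid().ToString("N");
    }
}
```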
## 5. Authentication & Authorization
### 5.1 API Key Authentication
- API keys are validated via the `x-api-key` gRPC metadata header.
- Keys are stored in a JSON file (`apikeys.json` by default) with hot-reload via FileSystemWatcher (1-second debounce).
- If no API key file exists, the service auto-generates a default file with two random keys (one ReadOnly, one ReadWrite).
- Authentication is enforced at the gRPC interceptor level before any service method executes.
### 5.2 Role-Based Authorization
Two roles with hierarchical permissions:
| Role | Read | Subscribe | Write |
|------|------|-----------|-------|
| ReadOnly | Yes | Yes | No |
| ReadWrite | Yes | Yes | Yes |
Write-protected methods: `Write`, `WriteBatch`, `WriteBatchAndWait`. A ReadOnly key attempting a write receives `StatusCode.PermissionDenied`.
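The permission table reduces to a single check per RPC. The sketch below is illustrative only: `ApiKeyRole` and `RoleAuthorization` are hypothetical names, and the real interceptor is assumed to map a `false` result to `StatusCode.PermissionDenied`.

```csharp
using System;
using System.Collections.Generic;

public enum ApiKeyRole { ReadOnly, ReadWrite }

// Hypothetical helper mirroring the role table above.
public static class RoleAuthorization
{
    private static readonly HashSet<string> WriteMethods =
        new HashSet<string>(StringComparer.Ordinal)
        {
            "Write", "WriteBatch", "WriteBatchAndWait"
        };

    // Non-write RPCs are open to both roles; write RPCs require ReadWrite.
    public static bool IsAllowed(ApiKeyRole role, string rpcName)
    {
        if (!WriteMethods.Contains(rpcName)) return true;
        return role == ApiKeyRole.ReadWrite;
    }
}
```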
### 5.3 TLS/Security
- TLS is optional (disabled by default in configuration, though `Tls.Enabled` defaults to `true` in the config class).
- Supports server TLS and mutual TLS (client certificate validation).
- Client CA certificate path configurable for mTLS.
- Certificate revocation checking is optional.
- Client library supports TLS 1.2 and TLS 1.3, custom CA trust stores, self-signed certificate allowance, and server name override.
## 6. Operations
### 6.1 Read
- Single tag read with configurable retry policy.
- Batch read with semaphore-controlled concurrency (default max 10 concurrent operations).
- Read timeout: 5 seconds (configurable).
### 6.2 Write
- Single tag write with retry policy. Values are sent as `TypedValue` (native protobuf types). Type mismatches between the value and the tag's expected type return a write failure.
- Batch write with semaphore-controlled concurrency.
- Write timeout: 5 seconds (configurable).
- WriteBatchAndWait: writes a batch, then polls the flag tag at a configurable interval until its value matches the expected flag value (type-aware comparison via `TypedValueEquals`) or a timeout expires. Default timeout: 5000ms, default poll interval: 100ms. Timeout is not an error — returns `flag_reached=false`.
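The polling half of WriteBatchAndWait can be sketched as follows. This is an assumption-laden illustration: `FlagPolling` and the `readFlag` delegate are hypothetical, and `object.Equals` stands in for the type-aware `TypedValueEquals` comparison.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; not the Host's implementation.
public static class FlagPolling
{
    // Polls readFlag until it returns the expected value or the timeout elapses.
    // Returns true when the flag was reached; a timeout returns false rather than throwing.
    public static async Task<bool> WaitForFlagAsync(
        Func<CancellationToken, Task<object>> readFlag,
        object expected,
        TimeSpan timeout,
        TimeSpan pollInterval,
        CancellationToken ct)
    {
        var elapsed = Stopwatch.StartNew();
        while (true)
        {
            ct.ThrowIfCancellationRequested();
            object current = await readFlag(ct).ConfigureAwait(false);
            if (object.Equals(current, expected)) return true;
            if (elapsed.Elapsed >= timeout) return false;
            await Task.Delay(pollInterval, ct).ConfigureAwait(false);
        }
    }
}
```

The flag is read at least once even with a zero timeout, so a write that completes immediately is still reported as reached.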
### 6.3 Subscribe
- Server-streaming RPC. Client sends a list of tags and a sampling interval (in milliseconds).
- Server maintains a per-client bounded channel (default capacity 1000 messages).
- Updates are pushed as `VtqMessage` items on the stream.
- When the MxAccess connection drops, all subscribed clients receive a bad-quality notification.
- Subscriptions are cleaned up on client disconnect. When the last client unsubscribes from a tag, the underlying MxAccess subscription is disposed.
## 7. Connection Resilience
### 7.1 Host Auto-Reconnect
- If the MxAccess connection is lost, the Host automatically attempts reconnection at a fixed interval (default 5 seconds).
- Stored subscriptions are recreated after a successful reconnect.
- Auto-reconnect is configurable (`Connection.AutoReconnect`, default true).
### 7.2 Client Keep-Alive
- The client sends a lightweight `GetConnectionState` ping every 30 seconds.
- On keep-alive failure, the client marks the connection as disconnected and cleans up subscriptions.
### 7.3 Client Retry Policy
- Polly-based exponential backoff retry.
- Default: 3 retries with a 1-second initial delay that doubles after each attempt (1s → 2s → 4s).
- Transient errors retried: Unavailable, DeadlineExceeded, ResourceExhausted, Aborted.
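The 1s → 2s → 4s schedule follows from doubling the initial delay on each retry. A small sketch of that arithmetic, independent of Polly; `RetryBackoff` is a hypothetical name, not part of the client library:

```csharp
using System;

// Hypothetical helper showing the backoff arithmetic only.
public static class RetryBackoff
{
    // Delay before the given retry (1-based): initialDelay * 2^(retry - 1).
    public static TimeSpan DelayFor(int retry, TimeSpan initialDelay)
    {
        if (retry < 1) throw new ArgumentOutOfRangeException(nameof(retry));
        double factor = Math.Pow(2, retry - 1);
        return TimeSpan.FromMilliseconds(initialDelay.TotalMilliseconds * factor);
    }
}
```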
## 8. Health Monitoring & Metrics
### 8.1 Health Checks
Two health check implementations:
- **Basic** (`HealthCheckService`): Checks MxAccess connection state, subscription stats, and operation success rate. Returns Degraded if success rate < 50% (with > 100 operations) or client count > 100.
- **Detailed** (`DetailedHealthCheckService`): Reads a test tag (`System.Heartbeat`). Returns Unhealthy if not connected, Degraded if test tag quality is not Good or timestamp is older than 5 minutes.
### 8.2 Performance Metrics
- Per-operation tracking: Read, ReadBatch, Write, WriteBatch.
- Metrics: total count, success count, success rate, average/min/max latency, 95th percentile latency.
- Rolling buffer of 1000 latency samples per operation for percentile calculation.
- Metrics reported to logs every 60 seconds.
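A rolling sample buffer with a percentile query can be sketched as below. `LatencyWindow` is a hypothetical class, and the nearest-rank percentile method is an assumption; the source does not state which method the Host uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rolling window of latency samples.
public sealed class LatencyWindow
{
    private readonly int _capacity;
    private readonly Queue<double> _samples = new Queue<double>();
    private readonly object _gate = new object();

    public LatencyWindow(int capacity = 1000)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    // Adds a sample, evicting the oldest one once the window is full.
    public void Record(double milliseconds)
    {
        lock (_gate)
        {
            if (_samples.Count == _capacity) _samples.Dequeue();
            _samples.Enqueue(milliseconds);
        }
    }

    // Nearest-rank percentile over the current window; returns 0 when empty.
    public double Percentile(double percentile)
    {
        double[] sorted;
        lock (_gate) { sorted = _samples.ToArray(); }
        if (sorted.Length == 0) return 0;
        Array.Sort(sorted);
        int rank = (int)Math.Ceiling(percentile * sorted.Length / 100.0);
        rank = Math.Min(Math.Max(rank, 1), sorted.Length);
        return sorted[rank - 1];
    }
}
```

Copying the samples before sorting keeps the lock short, at the cost of one array allocation per query.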
### 8.3 Status Web Server
- HTTP status server on port 8080 (configurable).
- Endpoints:
- `GET /` — HTML dashboard with auto-refresh (30 seconds), color-coded status cards, operations table.
- `GET /api/status` — JSON status report.
- `GET /api/health` — Plain text `OK` (200) or `UNHEALTHY` (503).
### 8.4 Client Metrics
- Per-operation counts, error counts, and latency tracking (average, p95, p99).
- Rolling buffer of 1000 latency samples.
- Exposed via `ILmxProxyClient.GetMetrics()`.
## 9. Service Hosting
### 9.1 Topshelf Windows Service
- Service name: `ZB.MOM.WW.LmxProxy.Host`
- Display name: `SCADA Bridge LMX Proxy`
- Starts automatically on boot.
### 9.2 Service Recovery (Windows SCM)
| Failure | Restart Delay |
|---------|--------------|
| First | 1 minute |
| Second | 5 minutes |
| Subsequent | 10 minutes |
| Reset period | 1 day |
### 9.3 Startup Sequence
1. Load configuration from `appsettings.json` + environment variables.
2. Configure Serilog (console + file sinks).
3. Validate configuration.
4. Check/generate TLS certificates (if TLS enabled).
5. Initialize services: PerformanceMetrics, ApiKeyService, MxAccessClient, SubscriptionManager, SessionManager, HealthCheckService, StatusReportService.
6. Connect to MxAccess synchronously (timeout: 30 seconds).
7. Start auto-reconnect monitor loop (if enabled).
8. Start gRPC server on configured port.
9. Start HTTP status web server.
### 9.4 Shutdown Sequence
1. Cancel reconnect monitor (5-second wait).
2. Graceful gRPC server shutdown (10-second timeout, then kill).
3. Stop status web server (5-second wait).
4. Dispose all components in reverse order.
5. Disconnect from MxAccess (10-second timeout).
## 10. Configuration
All configuration is via `appsettings.json` bound to `LmxProxyConfiguration`. Key settings:
| Section | Setting | Default |
|---------|---------|---------|
| Root | GrpcPort | 50051 |
| Root | ApiKeyConfigFile | `apikeys.json` |
| Connection | MonitorIntervalSeconds | 5 |
| Connection | ConnectionTimeoutSeconds | 30 |
| Connection | ReadTimeoutSeconds | 5 |
| Connection | WriteTimeoutSeconds | 5 |
| Connection | MaxConcurrentOperations | 10 |
| Connection | AutoReconnect | true |
| Subscription | ChannelCapacity | 1000 |
| Subscription | ChannelFullMode | DropOldest |
| Tls | Enabled | false |
| Tls | RequireClientCertificate | false |
| WebServer | Enabled | true |
| WebServer | Port | 8080 |
Configuration is validated at startup. Invalid values cause the service to fail to start.
## 11. Logging
- Serilog with console and file sinks.
- File sink: `logs/lmxproxy-.txt`, daily rolling, 30 files retained.
- Default level: Information. Overrides: Microsoft=Warning, System=Warning, Grpc=Information.
- Enrichment: FromLogContext, WithMachineName, WithThreadId.
## 12. Constraints
- Host **must** target x86 and .NET Framework 4.8 (MXAccess is 32-bit COM).
- Host uses `Grpc.Core` (deprecated C-core library), required because .NET Framework 4.8 cannot host the ASP.NET Core gRPC server (`Grpc.AspNetCore.Server`).
- Client targets .NET 10 and runs in ScadaLink central/site clusters.
- MxAccess COM operations expect an STA thread context. The Host currently wraps them in `Task.Run`, so they execute on thread-pool threads that have no message pump.
- The solution file uses `.slnx` format.
## 13. Protocol
The protocol specification is defined in `lmxproxy_updates.md`, which is the authoritative source of truth. All RPC signatures, message schemas, and behavioral specifications are per that document.
### 13.1 Value System (TypedValue)
Values are transmitted in their native protobuf types via a `TypedValue` oneof: bool, int32, int64, float, double, string, bytes, datetime (int64 UTC Ticks), and typed arrays. An unset oneof represents null. No string serialization or parsing heuristics are used.
### 13.2 Quality System (QualityCode)
Quality is a structured `QualityCode` message with `uint32 status_code` (OPC UA-compatible) and `string symbolic_name`. Supports AVEVA-aligned quality sub-codes (e.g., `BadSensorFailure` = `0x806D0000`, `GoodLocalOverride` = `0x00D80000`, `BadWaitingForInitialData` = `0x80320000`). See Component-Protocol for the full quality code table.
### 13.3 Migration from V1
The current codebase implements the v1 protocol (string-encoded values, three-state string quality). The v2 protocol is a clean break — all clients and servers will be updated simultaneously. No backward compatibility layer. This is appropriate because LmxProxy is an internal protocol with a small, controlled client count.
## 14. Component List (10 Components)
| # | Component | Description |
|---|-----------|-------------|
| 1 | GrpcServer | gRPC service implementation, session validation, request routing |
| 2 | MxAccessClient | MXAccess COM interop wrapper, connection lifecycle, read/write/subscribe |
| 3 | SessionManager | Client session tracking and lifecycle |
| 4 | Security | API key authentication, role-based authorization, TLS management |
| 5 | SubscriptionManager | Tag subscription lifecycle, channel-based update delivery, backpressure |
| 6 | Configuration | appsettings.json structure, validation, options binding |
| 7 | HealthAndMetrics | Health checks, performance metrics, status web server |
| 8 | ServiceHost | Topshelf hosting, startup/shutdown, logging setup, service recovery |
| 9 | Client | LmxProxyClient library, builder, retry, streaming, DI integration |
| 10 | Protocol | gRPC protocol specification, proto definition, code-first contracts |

@@ -0,0 +1,167 @@
# STA Message Pump Gap — OnWriteComplete COM Callback
**Status**: Documented gap. Fire-and-forget workaround in place (deviation #7). Full fix deferred until secured/verified writes are needed.
## When This Matters
The current fire-and-forget write approach works for **supervisory writes** where:
- Security is handled at the LmxProxy API key level, not MxAccess attribute level
- Writes succeed synchronously (no secured/verified write requirements)
- Write confirmation is handled at the application level (read-back in `WriteBatchAndWait`)
This gap becomes a **blocking issue** if any of these scenarios arise:
- **Secured writes (MxAccess error 1012)**: Attribute requires ArchestrA user authentication. `OnWriteComplete` returns the error, and the caller must retry with `WriteSecured()`.
- **Verified writes (MxAccess error 1013)**: Attribute requires two-user verification. Same retry pattern.
- **Write failure detection**: MxAccess accepts the `Write()` call but can't complete it (e.g., downstream device failure). `OnWriteComplete` is the only notification of this — without it, the caller assumes success.
## Root Cause
The MxAccess documentation (Write() Method) states: *"Upon completion of the write, your program receives notification of the success/failure status through the OnWriteComplete() event"* and *"that item should not be taken off advise or removed from the internal tables until the OnWriteComplete() event is received."*
`OnWriteComplete` **should** fire after every `Write()` call. It doesn't in our service because:
- MxAccess is a COM component designed for Windows Forms apps with a UI message loop
- COM event callbacks are delivered via the Windows message pump
- Our Topshelf Windows service has no message pump — `Write()` is called from thread pool threads (`Task.Run`) with no message loop
- `OnDataChange` works because MxAccess fires it proactively on its own internal threads; `OnWriteComplete` is a response callback that needs message-pump-based marshaling
## Correct Solution: Dedicated STA Thread + `Application.Run()`
Based on research (Stephen Toub, MSDN Magazine 2007; Microsoft Learn COM interop docs; community patterns), the correct approach is a dedicated STA thread running a Windows Forms message pump via `Application.Run()`.
### Architecture
```
Service main thread (MTA)
  ├── gRPC server threads (handle client RPCs)
  │     │
  │     └── Marshal COM calls via Form.BeginInvoke() ──┐
  │                                                    │
  └── Dedicated STA thread                             │
        │                                              │
        ├── Creates LMXProxyServerClass COM object     │
        ├── Wires event handlers (OnDataChange,        │
        │   OnWriteComplete, OperationComplete)        │
        ├── Runs Application.Run() ← continuous        │
        │   message pump                               │
        │                                              │
        └── Hidden Form receives BeginInvoke calls ◄───┘
              ├── Executes COM operations (Read, Write,
              │   AddItem, AdviseSupervisory, etc.)
              └── COM callbacks delivered via message pump
                  (OnWriteComplete, OnDataChange, etc.)
```
### Implementation Pattern
```csharp
// In MxAccessClient constructor or Start():
var initDone = new ManualResetEventSlim(false);
_staThread = new Thread(() =>
{
    // 1. Create hidden form for marshaling
    _marshalForm = new Form();
    // Reading Handle forces HWND creation without showing the form
    // (Control.CreateHandle() is protected and cannot be called from here).
    _ = _marshalForm.Handle;

    // 2. Create COM objects ON THIS THREAD
    _lmxProxy = new LMXProxyServerClass();
    _lmxProxy.OnDataChange += OnDataChange;
    _lmxProxy.OnWriteComplete += OnWriteComplete;

    // 3. Signal that init is complete
    initDone.Set();

    // 4. Run message pump (blocks until Application.ExitThread(), pumps COM callbacks)
    Application.Run();
});
_staThread.Name = "MxAccess-STA";
_staThread.IsBackground = true;
_staThread.SetApartmentState(ApartmentState.STA);
_staThread.Start();
initDone.Wait(); // wait for COM objects to be ready
### Dispatching Work to the STA Thread
```csharp
// All COM calls must go through the hidden form's invoke:
public Task<Vtq> ReadAsync(string address, CancellationToken ct)
{
    // Run continuations off the STA thread so awaiting callers cannot stall the message pump.
    var tcs = new TaskCompletionSource<Vtq>(TaskCreationOptions.RunContinuationsAsynchronously);
    _marshalForm.BeginInvoke((Action)(() =>
    {
        try
        {
            // COM call executes on STA thread
            int handle = _lmxProxy.AddItem(_connectionHandle, address);
            _lmxProxy.AdviseSupervisory(_connectionHandle, handle);
            // ... etc (the elided steps produce vtq)
            tcs.SetResult(vtq);
        }
        catch (Exception ex)
        {
            tcs.SetException(ex);
        }
    }));
    return tcs.Task;
}
```
### Shutdown
```csharp
// To stop the message pump:
_marshalForm.BeginInvoke((Action)(() =>
{
    // Clean up COM objects on STA thread
    // ... UnAdvise, RemoveItem, Unregister ...
    Marshal.ReleaseComObject(_lmxProxy);
    Application.ExitThread(); // stops Application.Run()
}));
_staThread.Join(TimeSpan.FromSeconds(10));
```
### Why Our First Attempt Failed
Our original `StaDispatchThread` (Phase 2) used `BlockingCollection.Take()` to wait for work items, with `Application.DoEvents()` between items. This failed because:
| Our failed approach | Correct approach |
|---|---|
| `BlockingCollection.Take()` blocks the STA thread, preventing the message pump from running | `Application.Run()` runs continuously, pumping messages at all times |
| `Application.DoEvents()` only pumps messages already in the queue at that instant | Message pump runs an infinite loop, processing messages as they arrive |
| Work dispatched by enqueueing to `BlockingCollection` | Work dispatched via `Form.BeginInvoke()` which posts a Windows message to the STA thread's queue |
The key difference: `BeginInvoke` posts a `WM_` message that the message pump processes alongside COM callbacks. `BlockingCollection` bypasses the message pump entirely.
## Drawbacks of the STA Approach
### Performance
- **All COM calls serialize onto one thread.** Under load (batch reads of 100+ tags), operations queue up single-file. Current `Task.Run` approach allows MxAccess's internal marshaling to handle some concurrency.
- **Double context switch per operation.** Caller → STA thread (invoke) → wait → back to caller. Adds ~0.1-1ms per call. Negligible for single reads, noticeable for large batch operations.
### Safety
- **Single point of failure.** If the STA thread dies, all MxAccess operations stop. Recovery requires tearing down and recreating the thread + all COM objects.
- **Deadlock risk.** If STA thread code synchronously waits on something that needs the STA thread (circular dependency), the message pump freezes. All waits must be async/non-blocking.
- **Reentrancy.** While pumping messages, inbound COM callbacks can reenter your code during another COM call. Event handlers must be reentrant-safe.
### Complexity
- Every COM call needs `_marshalForm.BeginInvoke()` wrapping.
- COM object affinity to STA thread is hard to enforce at compile time.
- Unit tests need STA thread support or must use fakes.
## Decision
Fire-and-forget is the correct choice for now. Revisit when secured/verified writes are needed.
## References
- [.NET Matters: Handling Messages in Console Apps — Stephen Toub, MSDN Magazine 2007](https://learn.microsoft.com/en-us/archive/msdn-magazine/2007/june/net-matters-handling-messages-in-console-apps)
- [How to: Support COM Interop by Displaying Each Windows Form on Its Own Thread — Microsoft Learn](https://learn.microsoft.com/en-us/dotnet/desktop/winforms/advanced/how-to-support-com-interop-by-displaying-each-windows-form-on-its-own-thread)
- [.NET Windows Service needs STAThread — hirenppatel](https://hirenppatel.wordpress.com/2012/11/24/net-windows-service-needs-to-use-stathread-instead-of-mtathread/)
- [Application.Run() In a Windows Service — PC Review](https://www.pcreview.co.uk/threads/application-run-in-a-windows-service.3087159/)
- [Build a message pump for a Windows service? — CodeProject](https://www.codeproject.com/Messages/1365966/Build-a-message-pump-for-a-Windows-service.aspx)
- MxAccess Toolkit User's Guide — Write() Method, OnWriteComplete Callback sections

@@ -0,0 +1,95 @@
# LmxProxy v2 — Instance Configuration
Two instances of the LmxProxy v2 Host service are deployed on windev (10.100.0.48), both connecting to the same AVEVA System Platform via MxAccess COM.
## Instances
| | Instance A | Instance B |
|---|---|---|
| **Service Name** | `ZB.MOM.WW.LmxProxy.Host.V2` | `ZB.MOM.WW.LmxProxy.Host.V2B` |
| **Display Name** | SCADA Bridge LMX Proxy V2 | SCADA Bridge LMX Proxy V2B |
| **MxAccess Client Name** | `LmxProxy-A` | `LmxProxy-B` |
| **Publish Directory** | `C:\publish-v2\` | `C:\publish-v2b\` |
| **gRPC Port** | 50100 | 50101 |
| **HTTP Status Port** | 8081 | 8082 |
| **Log File Prefix** | `lmxproxy-v2-` | `lmxproxy-v2b-` |
| **Log Directory** | `C:\publish-v2\logs\` | `C:\publish-v2b\logs\` |
| **Health Probe Tag** | `DevPlatform.Scheduler.ScanTime` | `DevPlatform.Scheduler.ScanTime` |
| **API Keys File** | `C:\publish-v2\apikeys.json` | `C:\publish-v2b\apikeys.json` |
| **Auto-Start** | Yes | Yes |
## Shared API Keys
Both instances use the same API keys (copied from instance A during setup).
| Role | Key |
|---|---|
| **ReadWrite** | `c4559c7c6acc60a997135c1381162e3c30f4572ece78dd933c1a626e6fd933b4` |
| **ReadOnly** | `a77d090d4adcfeaac1a50379ec5f971ff282c998599fd8ccf410090c9f290150` |
## Service Management
```bash
# Instance A
sc start ZB.MOM.WW.LmxProxy.Host.V2
sc stop ZB.MOM.WW.LmxProxy.Host.V2
sc query ZB.MOM.WW.LmxProxy.Host.V2
# Instance B
sc start ZB.MOM.WW.LmxProxy.Host.V2B
sc stop ZB.MOM.WW.LmxProxy.Host.V2B
sc query ZB.MOM.WW.LmxProxy.Host.V2B
```
## Health Endpoints
```bash
# Instance A
curl http://10.100.0.48:8081/api/health
curl http://10.100.0.48:8081/api/status
# Instance B
curl http://10.100.0.48:8082/api/health
curl http://10.100.0.48:8082/api/status
```
## Client Connection
```csharp
// Instance A
var clientA = new LmxProxyClientBuilder()
.WithHost("10.100.0.48")
.WithPort(50100)
.WithApiKey("c4559c7c6acc60a997135c1381162e3c30f4572ece78dd933c1a626e6fd933b4")
.Build();
// Instance B
var clientB = new LmxProxyClientBuilder()
.WithHost("10.100.0.48")
.WithPort(50101)
.WithApiKey("c4559c7c6acc60a997135c1381162e3c30f4572ece78dd933c1a626e6fd933b4")
.Build();
```
## Updating Instances
After code changes, both instances need to be republished separately:
```bash
# Stop both
sc stop ZB.MOM.WW.LmxProxy.Host.V2
sc stop ZB.MOM.WW.LmxProxy.Host.V2B
# Publish
dotnet publish src/ZB.MOM.WW.LmxProxy.Host -c Release -r win-x86 --self-contained false -o C:\publish-v2
dotnet publish src/ZB.MOM.WW.LmxProxy.Host -c Release -r win-x86 --self-contained false -o C:\publish-v2b
# Restore instance-specific config (publish overwrites appsettings.json)
# Instance B needs port 50101/8082 and log prefix lmxproxy-v2b-
# Start both
sc start ZB.MOM.WW.LmxProxy.Host.V2
sc start ZB.MOM.WW.LmxProxy.Host.V2B
```
**Note:** `dotnet publish` overwrites `appsettings.json` in the output directory with the source copy (which has default ports 50051/8080). After publishing, verify the instance-specific config is correct before starting the service.

Binary file not shown.

File diff suppressed because it is too large

@@ -0,0 +1,48 @@
namespace ZB.MOM.WW.LmxProxy.Client;
/// <summary>
/// TLS configuration for LmxProxy client connections
/// </summary>
public class ClientTlsConfiguration
{
/// <summary>
/// Gets or sets whether to use TLS for the connection
/// </summary>
public bool UseTls { get; set; } = false;
/// <summary>
/// Gets or sets the path to the client certificate file (optional for mutual TLS)
/// </summary>
public string? ClientCertificatePath { get; set; }
/// <summary>
/// Gets or sets the path to the client private key file (optional for mutual TLS)
/// </summary>
public string? ClientKeyPath { get; set; }
/// <summary>
/// Gets or sets the path to the CA certificate for server validation (optional)
/// </summary>
public string? ServerCaCertificatePath { get; set; }
/// <summary>
/// Gets or sets the server name override for certificate validation (optional)
/// </summary>
public string? ServerNameOverride { get; set; }
/// <summary>
/// Gets or sets whether to validate the server certificate
/// </summary>
public bool ValidateServerCertificate { get; set; } = true;
/// <summary>
/// Gets or sets whether to allow self-signed certificates (for testing only)
/// </summary>
public bool AllowSelfSignedCertificates { get; set; } = false;
/// <summary>
/// Gets or sets whether to ignore all certificate errors (DANGEROUS - for testing only)
/// WARNING: This completely disables certificate validation and should never be used in production
/// </summary>
public bool IgnoreAllCertificateErrors { get; set; } = false;
}

@@ -0,0 +1,49 @@
using System;
namespace ZB.MOM.WW.LmxProxy.Client.Domain;
/// <summary>
/// Represents the connection state of an LmxProxy client.
/// </summary>
public enum ConnectionState
{
/// <summary>Not connected to the server.</summary>
Disconnected,
/// <summary>Connection attempt in progress.</summary>
Connecting,
/// <summary>Connected and ready for operations.</summary>
Connected,
/// <summary>Graceful disconnect in progress.</summary>
Disconnecting,
/// <summary>Connection failed with an error.</summary>
Error,
/// <summary>Attempting to re-establish a lost connection.</summary>
Reconnecting
}
/// <summary>
/// Event arguments for connection state change notifications.
/// </summary>
public class ConnectionStateChangedEventArgs : EventArgs
{
/// <summary>The previous connection state.</summary>
public ConnectionState OldState { get; }
/// <summary>The new connection state.</summary>
public ConnectionState NewState { get; }
/// <summary>Optional message describing the state change (e.g., error details).</summary>
public string? Message { get; }
public ConnectionStateChangedEventArgs(ConnectionState oldState, ConnectionState newState, string? message = null)
{
OldState = oldState;
NewState = newState;
Message = message;
}
}

@@ -0,0 +1,118 @@
namespace ZB.MOM.WW.LmxProxy.Client.Domain;
/// <summary>
/// OPC-style quality codes for SCADA data values.
/// Based on OPC DA quality encoding as a single byte:
/// bits 7-6 = major (00=Bad, 01=Uncertain, 11=Good),
/// bits 5-2 = substatus, bits 1-0 = limit (00=None, 01=Low, 10=High, 11=Constant).
/// </summary>
public enum Quality : byte
{
/// <summary>Bad non-specific.</summary>
Bad = 0,
/// <summary>Bad configuration error in the server.</summary>
Bad_ConfigError = 4,
/// <summary>Bad input source is not connected.</summary>
Bad_NotConnected = 8,
/// <summary>Bad device failure detected.</summary>
Bad_DeviceFailure = 12,
/// <summary>Bad sensor failure detected.</summary>
Bad_SensorFailure = 16,
/// <summary>Bad last known value (communication lost, value stale).</summary>
Bad_LastKnownValue = 20,
/// <summary>Bad communication failure.</summary>
Bad_CommFailure = 24,
/// <summary>Bad item is out of service.</summary>
Bad_OutOfService = 28,
/// <summary>Uncertain non-specific.</summary>
Uncertain = 64,
/// <summary>Uncertain non-specific, low limited.</summary>
Uncertain_LowLimited = 65,
/// <summary>Uncertain non-specific, high limited.</summary>
Uncertain_HighLimited = 66,
/// <summary>Uncertain non-specific, constant.</summary>
Uncertain_Constant = 67,
/// <summary>Uncertain last usable value.</summary>
Uncertain_LastUsable = 68,
/// <summary>Uncertain last usable value, low limited.</summary>
Uncertain_LastUsable_LL = 69,
/// <summary>Uncertain last usable value, high limited.</summary>
Uncertain_LastUsable_HL = 70,
/// <summary>Uncertain last usable value, constant.</summary>
Uncertain_LastUsable_Cnst = 71,
/// <summary>Uncertain sensor not accurate.</summary>
Uncertain_SensorNotAcc = 80,
/// <summary>Uncertain sensor not accurate, low limited.</summary>
Uncertain_SensorNotAcc_LL = 81,
/// <summary>Uncertain sensor not accurate, high limited.</summary>
Uncertain_SensorNotAcc_HL = 82,
/// <summary>Uncertain sensor not accurate, constant.</summary>
Uncertain_SensorNotAcc_C = 83,
/// <summary>Uncertain engineering units exceeded.</summary>
Uncertain_EuExceeded = 84,
/// <summary>Uncertain engineering units exceeded, low limited.</summary>
Uncertain_EuExceeded_LL = 85,
/// <summary>Uncertain engineering units exceeded, high limited.</summary>
Uncertain_EuExceeded_HL = 86,
/// <summary>Uncertain engineering units exceeded, constant.</summary>
Uncertain_EuExceeded_C = 87,
/// <summary>Uncertain sub-normal operating conditions.</summary>
Uncertain_SubNormal = 88,
/// <summary>Uncertain sub-normal, low limited.</summary>
Uncertain_SubNormal_LL = 89,
/// <summary>Uncertain sub-normal, high limited.</summary>
Uncertain_SubNormal_HL = 90,
/// <summary>Uncertain sub-normal, constant.</summary>
Uncertain_SubNormal_C = 91,
/// <summary>Good non-specific.</summary>
Good = 192,
/// <summary>Good low limited.</summary>
Good_LowLimited = 193,
/// <summary>Good high limited.</summary>
Good_HighLimited = 194,
/// <summary>Good constant.</summary>
Good_Constant = 195,
/// <summary>Good local override active.</summary>
Good_LocalOverride = 216,
/// <summary>Good local override active, low limited.</summary>
Good_LocalOverride_LL = 217,
/// <summary>Good local override active, high limited.</summary>
Good_LocalOverride_HL = 218,
/// <summary>Good local override active, constant.</summary>
Good_LocalOverride_C = 219
}


@@ -0,0 +1,8 @@
namespace ZB.MOM.WW.LmxProxy.Client.Domain;
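/// <summary>Range checks over the numeric value of a <see cref="Quality"/>.</summary>
public static class QualityExtensions
{
/// <summary>True for values 128 and above; every Good_* member (192 to 219) falls in this range.</summary>
public static bool IsGood(this Quality q) => (byte)q >= 128;
/// <summary>True for values 64 to 127, the range that holds the Uncertain_* members.</summary>
public static bool IsUncertain(this Quality q) => (byte)q is >= 64 and < 128;
/// <summary>True for values below 64.</summary>
public static bool IsBad(this Quality q) => (byte)q < 64;
}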
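
The three predicates above partition the quality byte by range. A minimal sketch of that partition, written against raw byte values so it stands alone; the `Classify` helper is hypothetical and not part of the client library:

```csharp
using System;

// Hypothetical helper mirroring the range checks in QualityExtensions.
// It works on raw bytes so the sketch does not depend on the Quality enum.
static string Classify(byte q) =>
    q >= 128 ? "Good"
    : q >= 64 ? "Uncertain"
    : "Bad";

Console.WriteLine(Classify(192)); // Good      (Quality.Good)
Console.WriteLine(Classify(68));  // Uncertain (Quality.Uncertain_LastUsable)
Console.WriteLine(Classify(0));   // Bad
Console.WriteLine(Classify(128)); // Good: no enum member shown here sits at 128 to 191, but IsGood accepts it
```

Note that the 128 to 191 range has no named member in the portion of the enum shown here, so callers that need a strict "Good" check may want to compare against 192 instead; whether that matters depends on what the server actually sends.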


@@ -0,0 +1,444 @@
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.ServiceModel;
using System.Threading;
using System.Threading.Tasks;
namespace ZB.MOM.WW.LmxProxy.Client.Domain;
// ────────────────────────────────────────────────────────────────
// Service contract
// ────────────────────────────────────────────────────────────────
/// <summary>
/// Code-first gRPC service contract for SCADA operations.
/// </summary>
[ServiceContract(Name = "scada.ScadaService")]
public interface IScadaService
{
/// <summary>Establishes a connection with the SCADA service.</summary>
ValueTask<ConnectResponse> ConnectAsync(ConnectRequest request);
/// <summary>Terminates a SCADA service connection.</summary>
ValueTask<DisconnectResponse> DisconnectAsync(DisconnectRequest request);
/// <summary>Retrieves the current state of a SCADA connection.</summary>
ValueTask<GetConnectionStateResponse> GetConnectionStateAsync(GetConnectionStateRequest request);
/// <summary>Reads a single tag value from the SCADA system.</summary>
ValueTask<ReadResponse> ReadAsync(ReadRequest request);
/// <summary>Reads multiple tag values from the SCADA system in a batch operation.</summary>
ValueTask<ReadBatchResponse> ReadBatchAsync(ReadBatchRequest request);
/// <summary>Writes a single value to a tag in the SCADA system.</summary>
ValueTask<WriteResponse> WriteAsync(WriteRequest request);
/// <summary>Writes multiple values to tags in the SCADA system in a batch operation.</summary>
ValueTask<WriteBatchResponse> WriteBatchAsync(WriteBatchRequest request);
/// <summary>Writes multiple values and waits for a completion flag before returning.</summary>
ValueTask<WriteBatchAndWaitResponse> WriteBatchAndWaitAsync(WriteBatchAndWaitRequest request);
/// <summary>Subscribes to real-time value changes from specified tags.</summary>
IAsyncEnumerable<VtqMessage> SubscribeAsync(SubscribeRequest request, CancellationToken cancellationToken = default);
/// <summary>Validates an API key for authentication.</summary>
ValueTask<CheckApiKeyResponse> CheckApiKeyAsync(CheckApiKeyRequest request);
}
// ────────────────────────────────────────────────────────────────
// VTQ message
// ────────────────────────────────────────────────────────────────
/// <summary>
/// Value-Timestamp-Quality message transmitted over gRPC.
/// All values are string-encoded; timestamps are UTC ticks.
/// </summary>
[DataContract]
public class VtqMessage
{
/// <summary>Tag address.</summary>
[DataMember(Order = 1)]
public string Tag { get; set; } = string.Empty;
/// <summary>Value encoded as a string.</summary>
[DataMember(Order = 2)]
public string Value { get; set; } = string.Empty;
/// <summary>UTC timestamp as DateTime.Ticks (100ns intervals since 0001-01-01).</summary>
[DataMember(Order = 3)]
public long TimestampUtcTicks { get; set; }
/// <summary>Quality string: "Good", "Uncertain", or "Bad".</summary>
[DataMember(Order = 4)]
public string Quality { get; set; } = string.Empty;
}
// ────────────────────────────────────────────────────────────────
// Connect
// ────────────────────────────────────────────────────────────────
/// <summary>Request to establish a session with the proxy server.</summary>
[DataContract]
public class ConnectRequest
{
/// <summary>Client identifier (e.g., "ScadaBridge-{guid}", the form sent by LmxProxyClient.ConnectAsync).</summary>
[DataMember(Order = 1)]
public string ClientId { get; set; } = string.Empty;
/// <summary>API key for authentication (empty if none required).</summary>
[DataMember(Order = 2)]
public string ApiKey { get; set; } = string.Empty;
}
/// <summary>Response from a Connect call.</summary>
[DataContract]
public class ConnectResponse
{
/// <summary>Whether the connection was established successfully.</summary>
[DataMember(Order = 1)]
public bool Success { get; set; }
/// <summary>Status or error message.</summary>
[DataMember(Order = 2)]
public string Message { get; set; } = string.Empty;
/// <summary>Session ID (32-char hex GUID). Only valid when <see cref="Success"/> is <c>true</c>.</summary>
[DataMember(Order = 3)]
public string SessionId { get; set; } = string.Empty;
}
// ────────────────────────────────────────────────────────────────
// Disconnect
// ────────────────────────────────────────────────────────────────
/// <summary>Request to terminate a session.</summary>
[DataContract]
public class DisconnectRequest
{
/// <summary>Active session ID to disconnect.</summary>
[DataMember(Order = 1)]
public string SessionId { get; set; } = string.Empty;
}
/// <summary>Response from a Disconnect call.</summary>
[DataContract]
public class DisconnectResponse
{
/// <summary>Whether the disconnect succeeded.</summary>
[DataMember(Order = 1)]
public bool Success { get; set; }
/// <summary>Status or error message.</summary>
[DataMember(Order = 2)]
public string Message { get; set; } = string.Empty;
}
// ────────────────────────────────────────────────────────────────
// GetConnectionState
// ────────────────────────────────────────────────────────────────
/// <summary>Request to query connection state for a session.</summary>
[DataContract]
public class GetConnectionStateRequest
{
/// <summary>Session ID to query.</summary>
[DataMember(Order = 1)]
public string SessionId { get; set; } = string.Empty;
}
/// <summary>Response with connection state information.</summary>
[DataContract]
public class GetConnectionStateResponse
{
/// <summary>Whether the session is currently connected.</summary>
[DataMember(Order = 1)]
public bool IsConnected { get; set; }
/// <summary>Client identifier for this session.</summary>
[DataMember(Order = 2)]
public string ClientId { get; set; } = string.Empty;
/// <summary>UTC ticks when the connection was established.</summary>
[DataMember(Order = 3)]
public long ConnectedSinceUtcTicks { get; set; }
}
// ────────────────────────────────────────────────────────────────
// Read
// ────────────────────────────────────────────────────────────────
/// <summary>Request to read a single tag.</summary>
[DataContract]
public class ReadRequest
{
/// <summary>Valid session ID.</summary>
[DataMember(Order = 1)]
public string SessionId { get; set; } = string.Empty;
/// <summary>Tag address to read.</summary>
[DataMember(Order = 2)]
public string Tag { get; set; } = string.Empty;
}
/// <summary>Response from a single-tag Read call.</summary>
[DataContract]
public class ReadResponse
{
/// <summary>Whether the read succeeded.</summary>
[DataMember(Order = 1)]
public bool Success { get; set; }
/// <summary>Error message if the read failed.</summary>
[DataMember(Order = 2)]
public string Message { get; set; } = string.Empty;
/// <summary>The value-timestamp-quality result.</summary>
[DataMember(Order = 3)]
public VtqMessage? Vtq { get; set; }
}
// ────────────────────────────────────────────────────────────────
// ReadBatch
// ────────────────────────────────────────────────────────────────
/// <summary>Request to read multiple tags in a single round-trip.</summary>
[DataContract]
public class ReadBatchRequest
{
/// <summary>Valid session ID.</summary>
[DataMember(Order = 1)]
public string SessionId { get; set; } = string.Empty;
/// <summary>Tag addresses to read.</summary>
[DataMember(Order = 2)]
public List<string> Tags { get; set; } = [];
}
/// <summary>Response from a batch Read call.</summary>
[DataContract]
public class ReadBatchResponse
{
/// <summary>False if any tag read failed.</summary>
[DataMember(Order = 1)]
public bool Success { get; set; }
/// <summary>Error message.</summary>
[DataMember(Order = 2)]
public string Message { get; set; } = string.Empty;
/// <summary>VTQ results in the same order as the request tags.</summary>
[DataMember(Order = 3)]
public List<VtqMessage> Vtqs { get; set; } = [];
}
// ────────────────────────────────────────────────────────────────
// Write
// ────────────────────────────────────────────────────────────────
/// <summary>Request to write a single tag value.</summary>
[DataContract]
public class WriteRequest
{
/// <summary>Valid session ID.</summary>
[DataMember(Order = 1)]
public string SessionId { get; set; } = string.Empty;
/// <summary>Tag address to write.</summary>
[DataMember(Order = 2)]
public string Tag { get; set; } = string.Empty;
/// <summary>Value as a string (parsed server-side).</summary>
[DataMember(Order = 3)]
public string Value { get; set; } = string.Empty;
}
/// <summary>Response from a single-tag Write call.</summary>
[DataContract]
public class WriteResponse
{
/// <summary>Whether the write succeeded.</summary>
[DataMember(Order = 1)]
public bool Success { get; set; }
/// <summary>Status or error message.</summary>
[DataMember(Order = 2)]
public string Message { get; set; } = string.Empty;
}
// ────────────────────────────────────────────────────────────────
// WriteItem / WriteResult
// ────────────────────────────────────────────────────────────────
/// <summary>A single tag-value pair for batch write operations.</summary>
[DataContract]
public class WriteItem
{
/// <summary>Tag address.</summary>
[DataMember(Order = 1)]
public string Tag { get; set; } = string.Empty;
/// <summary>Value as a string.</summary>
[DataMember(Order = 2)]
public string Value { get; set; } = string.Empty;
}
/// <summary>Per-item result from a batch write operation.</summary>
[DataContract]
public class WriteResult
{
/// <summary>Tag address that was written.</summary>
[DataMember(Order = 1)]
public string Tag { get; set; } = string.Empty;
/// <summary>Whether the individual write succeeded.</summary>
[DataMember(Order = 2)]
public bool Success { get; set; }
/// <summary>Error message for this item, if any.</summary>
[DataMember(Order = 3)]
public string Message { get; set; } = string.Empty;
}
// ────────────────────────────────────────────────────────────────
// WriteBatch
// ────────────────────────────────────────────────────────────────
/// <summary>Request to write multiple tag values in a single round-trip.</summary>
[DataContract]
public class WriteBatchRequest
{
/// <summary>Valid session ID.</summary>
[DataMember(Order = 1)]
public string SessionId { get; set; } = string.Empty;
/// <summary>Tag-value pairs to write.</summary>
[DataMember(Order = 2)]
public List<WriteItem> Items { get; set; } = [];
}
/// <summary>Response from a batch Write call.</summary>
[DataContract]
public class WriteBatchResponse
{
/// <summary>Overall success — false if any item failed.</summary>
[DataMember(Order = 1)]
public bool Success { get; set; }
/// <summary>Status or error message.</summary>
[DataMember(Order = 2)]
public string Message { get; set; } = string.Empty;
/// <summary>Per-item write results.</summary>
[DataMember(Order = 3)]
public List<WriteResult> Results { get; set; } = [];
}
// ────────────────────────────────────────────────────────────────
// WriteBatchAndWait
// ────────────────────────────────────────────────────────────────
/// <summary>
/// Request to write multiple tag values then poll a flag tag
/// until it matches an expected value or the timeout expires.
/// </summary>
[DataContract]
public class WriteBatchAndWaitRequest
{
/// <summary>Valid session ID.</summary>
[DataMember(Order = 1)]
public string SessionId { get; set; } = string.Empty;
/// <summary>Tag-value pairs to write.</summary>
[DataMember(Order = 2)]
public List<WriteItem> Items { get; set; } = [];
/// <summary>Tag to poll after writes complete.</summary>
[DataMember(Order = 3)]
public string FlagTag { get; set; } = string.Empty;
/// <summary>Expected value for the flag tag (string comparison).</summary>
[DataMember(Order = 4)]
public string FlagValue { get; set; } = string.Empty;
/// <summary>Timeout in milliseconds (default 5000 if &lt;= 0).</summary>
[DataMember(Order = 5)]
public int TimeoutMs { get; set; }
/// <summary>Poll interval in milliseconds (default 100 if &lt;= 0).</summary>
[DataMember(Order = 6)]
public int PollIntervalMs { get; set; }
}
/// <summary>Response from a WriteBatchAndWait call.</summary>
[DataContract]
public class WriteBatchAndWaitResponse
{
/// <summary>Overall operation success.</summary>
[DataMember(Order = 1)]
public bool Success { get; set; }
/// <summary>Status or error message.</summary>
[DataMember(Order = 2)]
public string Message { get; set; } = string.Empty;
/// <summary>Per-item write results.</summary>
[DataMember(Order = 3)]
public List<WriteResult> WriteResults { get; set; } = [];
/// <summary>Whether the flag tag matched the expected value before timeout.</summary>
[DataMember(Order = 4)]
public bool FlagReached { get; set; }
/// <summary>Total elapsed time in milliseconds.</summary>
[DataMember(Order = 5)]
public int ElapsedMs { get; set; }
}
// ────────────────────────────────────────────────────────────────
// Subscribe
// ────────────────────────────────────────────────────────────────
/// <summary>Request to subscribe to value change notifications on one or more tags.</summary>
[DataContract]
public class SubscribeRequest
{
/// <summary>Valid session ID.</summary>
[DataMember(Order = 1)]
public string SessionId { get; set; } = string.Empty;
/// <summary>Tag addresses to monitor.</summary>
[DataMember(Order = 2)]
public List<string> Tags { get; set; } = [];
/// <summary>Backend sampling interval in milliseconds.</summary>
[DataMember(Order = 3)]
public int SamplingMs { get; set; }
}
// ────────────────────────────────────────────────────────────────
// CheckApiKey
// ────────────────────────────────────────────────────────────────
/// <summary>Request to validate an API key without creating a session.</summary>
[DataContract]
public class CheckApiKeyRequest
{
/// <summary>API key to validate.</summary>
[DataMember(Order = 1)]
public string ApiKey { get; set; } = string.Empty;
}
/// <summary>Response from an API key validation check.</summary>
[DataContract]
public class CheckApiKeyResponse
{
/// <summary>Whether the API key is valid.</summary>
[DataMember(Order = 1)]
public bool IsValid { get; set; }
/// <summary>Validation message.</summary>
[DataMember(Order = 2)]
public string Message { get; set; } = string.Empty;
}
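
`VtqMessage` and `GetConnectionStateResponse` carry timestamps as UTC ticks. A minimal sketch of the round trip between a `DateTime` and that representation, assuming the receiver rebuilds the value with `DateTimeKind.Utc` (the client's own `ConvertToVtq` is not shown in this diff, so this is an illustration rather than its implementation):

```csharp
using System;

// Sender side: serialize a UTC instant as ticks (100 ns units since 0001-01-01).
DateTime sent = new DateTime(2026, 3, 24, 16, 30, 6, DateTimeKind.Utc);
long timestampUtcTicks = sent.Ticks;

// Receiver side: rebuild the DateTime and mark it as UTC explicitly,
// because the ticks value alone carries no kind information.
DateTime received = new DateTime(timestampUtcTicks, DateTimeKind.Utc);

Console.WriteLine(received == sent);                  // True
Console.WriteLine(received.Kind == DateTimeKind.Utc); // True
```

Without the explicit kind, the rebuilt value would be `DateTimeKind.Unspecified`, which can shift unexpectedly if later code calls `ToLocalTime()` or `ToUniversalTime()`.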


@@ -0,0 +1,27 @@
using System;
namespace ZB.MOM.WW.LmxProxy.Client.Domain;
/// <summary>
/// Value, Timestamp, and Quality structure for SCADA data.
/// </summary>
/// <param name="Value">The value.</param>
/// <param name="Timestamp">The timestamp when the value was read.</param>
/// <param name="Quality">The quality of the value.</param>
public readonly record struct Vtq(object? Value, DateTime Timestamp, Quality Quality)
{
/// <summary>Creates a new VTQ with the specified value and quality, using the current UTC timestamp.</summary>
public static Vtq New(object? value, Quality quality) => new(value, DateTime.UtcNow, quality);
/// <summary>Creates a new VTQ with the specified value, timestamp, and quality.</summary>
public static Vtq New(object? value, DateTime timestamp, Quality quality) => new(value, timestamp, quality);
/// <summary>Creates a Good-quality VTQ with the current UTC time.</summary>
public static Vtq Good(object? value) => new(value, DateTime.UtcNow, Quality.Good);
/// <summary>Creates a Bad-quality VTQ with the current UTC time.</summary>
public static Vtq Bad(object? value = null) => new(value, DateTime.UtcNow, Quality.Bad);
/// <summary>Creates an Uncertain-quality VTQ with the current UTC time.</summary>
public static Vtq Uncertain(object? value) => new(value, DateTime.UtcNow, Quality.Uncertain);
}


@@ -0,0 +1,77 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.LmxProxy.Client.Domain;
namespace ZB.MOM.WW.LmxProxy.Client
{
/// <summary>
/// Interface for LmxProxy client operations
/// </summary>
public interface ILmxProxyClient : IDisposable, IAsyncDisposable
{
/// <summary>
/// Gets or sets the default timeout for operations
/// </summary>
TimeSpan DefaultTimeout { get; set; }
/// <summary>
/// Connects to the LmxProxy service
/// </summary>
/// <param name="cancellationToken">Cancellation token.</param>
Task ConnectAsync(CancellationToken cancellationToken = default);
/// <summary>
/// Disconnects from the LmxProxy service
/// </summary>
Task DisconnectAsync();
/// <summary>
/// Checks if the client is connected to the service
/// </summary>
Task<bool> IsConnectedAsync();
/// <summary>
/// Reads a single tag value
/// </summary>
/// <param name="address">The tag address to read.</param>
/// <param name="cancellationToken">Cancellation token.</param>
Task<Vtq> ReadAsync(string address, CancellationToken cancellationToken = default);
/// <summary>
/// Reads multiple tag values in a single batch
/// </summary>
/// <param name="addresses">The tag addresses to read.</param>
/// <param name="cancellationToken">Cancellation token.</param>
Task<IDictionary<string, Vtq>> ReadBatchAsync(IEnumerable<string> addresses, CancellationToken cancellationToken = default);
/// <summary>
/// Writes a single tag value
/// </summary>
/// <param name="address">The tag address to write.</param>
/// <param name="value">The value to write.</param>
/// <param name="cancellationToken">Cancellation token.</param>
Task WriteAsync(string address, object value, CancellationToken cancellationToken = default);
/// <summary>
/// Writes multiple tag values in a single batch
/// </summary>
/// <param name="values">The tag addresses and values to write.</param>
/// <param name="cancellationToken">Cancellation token.</param>
Task WriteBatchAsync(IDictionary<string, object> values, CancellationToken cancellationToken = default);
/// <summary>
/// Subscribes to tag updates
/// </summary>
/// <param name="addresses">The tag addresses to subscribe to.</param>
/// <param name="onUpdate">Callback invoked when tag values change.</param>
/// <param name="cancellationToken">Cancellation token.</param>
Task<ISubscription> SubscribeAsync(IEnumerable<string> addresses, Action<string, Vtq> onUpdate, CancellationToken cancellationToken = default);
/// <summary>
/// Gets the current metrics snapshot
/// </summary>
Dictionary<string, object> GetMetrics();
}
}


@@ -0,0 +1,150 @@
using System;
using System.Linq;
using Microsoft.Extensions.Configuration;
namespace ZB.MOM.WW.LmxProxy.Client
{
/// <summary>
/// Factory interface for creating LmxProxyClient instances
/// </summary>
public interface ILmxProxyClientFactory
{
/// <summary>
/// Creates a new LmxProxyClient instance with default configuration
/// </summary>
/// <returns>A configured LmxProxyClient instance</returns>
LmxProxyClient CreateClient();
/// <summary>
/// Creates a new LmxProxyClient instance with custom configuration
/// </summary>
/// <param name="configurationName">Name of the configuration section to use</param>
/// <returns>A configured LmxProxyClient instance</returns>
LmxProxyClient CreateClient(string configurationName);
/// <summary>
/// Creates a new LmxProxyClient instance using a builder
/// </summary>
/// <param name="builderAction">Action to configure the builder</param>
/// <returns>A configured LmxProxyClient instance</returns>
LmxProxyClient CreateClient(Action<LmxProxyClientBuilder> builderAction);
}
/// <summary>
/// Default implementation of ILmxProxyClientFactory
/// </summary>
public class LmxProxyClientFactory : ILmxProxyClientFactory
{
private readonly IConfiguration _configuration;
/// <summary>
/// Initializes a new instance of the LmxProxyClientFactory
/// </summary>
/// <param name="configuration">Application configuration</param>
public LmxProxyClientFactory(IConfiguration configuration)
{
_configuration = configuration ?? throw new ArgumentNullException(nameof(configuration));
}
/// <summary>
/// Creates a new LmxProxyClient instance with default configuration
/// </summary>
/// <returns>A configured LmxProxyClient instance</returns>
public LmxProxyClient CreateClient()
{
return CreateClient("LmxProxy");
}
/// <summary>
/// Creates a new LmxProxyClient instance with custom configuration
/// </summary>
/// <param name="configurationName">Name of the configuration section to use</param>
/// <returns>A configured LmxProxyClient instance</returns>
public LmxProxyClient CreateClient(string configurationName)
{
IConfigurationSection section = _configuration.GetSection(configurationName);
if (!section.GetChildren().Any() && section.Value == null)
{
throw new InvalidOperationException($"Configuration section '{configurationName}' not found");
}
var builder = new LmxProxyClientBuilder();
// Configure from appsettings
string? host = section["Host"];
if (!string.IsNullOrEmpty(host))
{
builder.WithHost(host);
}
if (int.TryParse(section["Port"], out int port))
{
builder.WithPort(port);
}
string? apiKey = section["ApiKey"];
if (!string.IsNullOrEmpty(apiKey))
{
builder.WithApiKey(apiKey);
}
if (TimeSpan.TryParse(section["Timeout"], out TimeSpan timeout))
{
builder.WithTimeout(timeout);
}
// Retry configuration
IConfigurationSection? retrySection = section.GetSection("Retry");
if (retrySection != null && (retrySection.GetChildren().Any() || retrySection.Value != null))
{
if (int.TryParse(retrySection["MaxAttempts"], out int maxAttempts) &&
TimeSpan.TryParse(retrySection["Delay"], out TimeSpan retryDelay))
{
builder.WithRetryPolicy(maxAttempts, retryDelay);
}
}
// SSL configuration
bool useSsl = section.GetValue<bool>("UseSsl");
if (useSsl)
{
string? certificatePath = section["CertificatePath"];
builder.WithSslCredentials(certificatePath);
}
// Metrics configuration
if (section.GetValue<bool>("EnableMetrics"))
{
builder.WithMetrics();
}
// Correlation ID configuration
string? correlationHeader = section["CorrelationIdHeader"];
if (!string.IsNullOrEmpty(correlationHeader))
{
builder.WithCorrelationIdHeader(correlationHeader);
}
// Logger is optional - don't set a default one
return builder.Build();
}
/// <summary>
/// Creates a new LmxProxyClient instance using a builder
/// </summary>
/// <param name="builderAction">Action to configure the builder</param>
/// <returns>A configured LmxProxyClient instance</returns>
public LmxProxyClient CreateClient(Action<LmxProxyClientBuilder> builderAction)
{
ArgumentNullException.ThrowIfNull(builderAction);
var builder = new LmxProxyClientBuilder();
builderAction(builder);
// Logger is optional - caller can set it via builderAction if needed
return builder.Build();
}
}
}
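
The factory reads `Timeout` and `Retry:Delay` through `TimeSpan.TryParse`, so the configured strings need to be in a form that method accepts; anything else is skipped without an error. A small sketch of how a few candidate values parse (the sample strings are illustrative only):

```csharp
using System;

// TimeSpan.TryParse expects [d.]hh:mm[:ss] style input; a bare integer is read as days.
string[] samples = { "00:00:30", "30", "30s" };

foreach (string s in samples)
{
    if (TimeSpan.TryParse(s, out TimeSpan parsed))
    {
        Console.WriteLine($"'{s}' -> {parsed}");
    }
    else
    {
        // In the factory, this branch means WithTimeout / WithRetryPolicy is never called.
        Console.WriteLine($"'{s}' -> not parsed");
    }
}
```

A value such as `"30"` is therefore worth avoiding in the configuration: it parses successfully, but as 30 days rather than 30 seconds.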


@@ -0,0 +1,36 @@
namespace ZB.MOM.WW.LmxProxy.Client
{
/// <summary>
/// API key information returned from CheckApiKey
/// </summary>
public class ApiKeyInfo
{
/// <summary>
/// Whether the API key is valid
/// </summary>
public bool IsValid { get; }
/// <summary>
/// The role assigned to the API key
/// </summary>
public string Role { get; }
/// <summary>
/// Description of the API key
/// </summary>
public string Description { get; }
/// <summary>
/// Initializes a new instance of the ApiKeyInfo class
/// </summary>
/// <param name="isValid">Whether the API key is valid</param>
/// <param name="role">The role assigned to the API key</param>
/// <param name="description">Description of the API key</param>
public ApiKeyInfo(bool isValid, string role, string description)
{
IsValid = isValid;
Role = role ?? string.Empty;
Description = description ?? string.Empty;
}
}
}


@@ -0,0 +1,100 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
namespace ZB.MOM.WW.LmxProxy.Client
{
/// <summary>
/// Metrics collection for client operations
/// </summary>
internal class ClientMetrics
{
private readonly ConcurrentDictionary<string, long> _operationCounts = new();
private readonly ConcurrentDictionary<string, long> _errorCounts = new();
private readonly ConcurrentDictionary<string, List<long>> _latencies = new();
private readonly object _latencyLock = new();
/// <summary>
/// Increments the operation count for a specific operation.
/// </summary>
/// <param name="operation">The operation name.</param>
public void IncrementOperationCount(string operation)
{
_operationCounts.AddOrUpdate(operation, 1, (_, oldValue) => oldValue + 1);
}
/// <summary>
/// Increments the error count for a specific operation.
/// </summary>
/// <param name="operation">The operation name.</param>
public void IncrementErrorCount(string operation)
{
_errorCounts.AddOrUpdate(operation, 1, (_, oldValue) => oldValue + 1);
}
/// <summary>
/// Records latency for a specific operation.
/// </summary>
/// <param name="operation">The operation name.</param>
/// <param name="milliseconds">The latency in milliseconds.</param>
public void RecordLatency(string operation, long milliseconds)
{
lock (_latencyLock)
{
// Look the list up once instead of indexing the dictionary on every access.
List<long> samples = _latencies.GetOrAdd(operation, _ => new List<long>());
samples.Add(milliseconds);
// Keep only last 1000 entries to prevent memory growth
if (samples.Count > 1000)
{
samples.RemoveAt(0);
}
}
}
/// <summary>
/// Gets a snapshot of current metrics.
/// </summary>
/// <returns>A dictionary containing metric data.</returns>
public Dictionary<string, object> GetSnapshot()
{
var snapshot = new Dictionary<string, object>();
foreach (KeyValuePair<string, long> kvp in _operationCounts)
{
snapshot[$"{kvp.Key}_count"] = kvp.Value;
}
foreach (KeyValuePair<string, long> kvp in _errorCounts)
{
snapshot[$"{kvp.Key}_errors"] = kvp.Value;
}
lock (_latencyLock)
{
foreach (KeyValuePair<string, List<long>> kvp in _latencies)
{
if (kvp.Value.Any())
{
snapshot[$"{kvp.Key}_avg_latency_ms"] = kvp.Value.Average();
snapshot[$"{kvp.Key}_p95_latency_ms"] = GetPercentile(kvp.Value, 95);
snapshot[$"{kvp.Key}_p99_latency_ms"] = GetPercentile(kvp.Value, 99);
}
}
}
return snapshot;
}
private double GetPercentile(List<long> values, int percentile)
{
var sorted = values.OrderBy(x => x).ToList();
int index = (int)Math.Ceiling(percentile / 100.0 * sorted.Count) - 1;
return sorted[Math.Max(0, index)];
}
}
}
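
`GetPercentile` uses the nearest-rank method: sort the samples, then take the element at index `ceil(p / 100 * n) - 1`. A minimal worked example, with the formula copied into a local function (`NearestRank` is a name chosen for this sketch) and made-up latency values:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same nearest-rank formula as ClientMetrics.GetPercentile, copied here as a local function.
static double NearestRank(List<long> values, int percentile)
{
    var sorted = values.OrderBy(x => x).ToList();
    int index = (int)Math.Ceiling(percentile / 100.0 * sorted.Count) - 1;
    return sorted[Math.Max(0, index)];
}

var latencies = new List<long> { 40, 10, 30, 20 };

Console.WriteLine(NearestRank(latencies, 50)); // 20: ceil(0.50 * 4) - 1 = index 1
Console.WriteLine(NearestRank(latencies, 95)); // 40: ceil(0.95 * 4) - 1 = index 3
```

With only a handful of samples, the p95 and p99 figures both collapse to the maximum, so the snapshot values are most meaningful once an operation has accumulated a reasonable number of latency records.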


@@ -0,0 +1,156 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
using ZB.MOM.WW.LmxProxy.Client.Domain;
namespace ZB.MOM.WW.LmxProxy.Client
{
public partial class LmxProxyClient
{
private class CodeFirstSubscription : ISubscription
{
private readonly IScadaService _client;
private readonly string _sessionId;
private readonly List<string> _tags;
private readonly Action<string, Vtq> _onUpdate;
private readonly ILogger<LmxProxyClient> _logger;
private readonly Action<ISubscription>? _onDispose;
private readonly CancellationTokenSource _cts = new();
private Task? _processingTask;
private bool _disposed;
/// <summary>
/// Initializes a new instance of the CodeFirstSubscription class.
/// </summary>
/// <param name="client">The gRPC ScadaService client.</param>
/// <param name="sessionId">The session identifier.</param>
/// <param name="tags">The list of tag addresses to subscribe to.</param>
/// <param name="onUpdate">Callback invoked when tag values change.</param>
/// <param name="logger">Logger for diagnostic information.</param>
/// <param name="onDispose">Optional callback invoked when the subscription is disposed.</param>
public CodeFirstSubscription(
IScadaService client,
string sessionId,
List<string> tags,
Action<string, Vtq> onUpdate,
ILogger<LmxProxyClient> logger,
Action<ISubscription>? onDispose = null)
{
_client = client;
_sessionId = sessionId;
_tags = tags;
_onUpdate = onUpdate;
_logger = logger;
_onDispose = onDispose;
}
/// <summary>
/// Starts the subscription asynchronously and begins processing tag value updates.
/// </summary>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>A task that completes when the subscription processing has started.</returns>
public Task StartAsync(CancellationToken cancellationToken = default)
{
_processingTask = ProcessUpdatesAsync(cancellationToken);
return Task.CompletedTask;
}
private async Task ProcessUpdatesAsync(CancellationToken cancellationToken)
{
try
{
var request = new SubscribeRequest
{
SessionId = _sessionId,
Tags = _tags,
SamplingMs = 1000
};
using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, _cts.Token);
await foreach (VtqMessage vtq in _client.SubscribeAsync(request, linkedCts.Token))
{
try
{
Vtq convertedVtq = ConvertToVtq(vtq.Tag, vtq);
_onUpdate(vtq.Tag, convertedVtq);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error processing subscription update for {Tag}", vtq.Tag);
}
}
}
catch (OperationCanceledException) when (_cts.Token.IsCancellationRequested || cancellationToken.IsCancellationRequested)
{
_logger.LogDebug("Subscription cancelled");
}
catch (Exception ex)
{
_logger.LogError(ex, "Error in subscription processing");
try { await _cts.CancelAsync(); } catch { /* ignore */ }
}
finally
{
if (!_disposed)
{
_disposed = true;
_onDispose?.Invoke(this);
}
}
}
/// <summary>
/// Asynchronously disposes the subscription and stops processing tag updates.
/// </summary>
/// <returns>A task representing the asynchronous disposal operation.</returns>
public async Task DisposeAsync()
{
if (_disposed) return;
_disposed = true;
await _cts.CancelAsync();
try
{
if (_processingTask != null)
{
await _processingTask;
}
}
catch (Exception ex)
{
_logger.LogError(ex, "Error disposing subscription");
}
finally
{
_cts.Dispose();
_onDispose?.Invoke(this);
}
}
/// <summary>
/// Synchronously disposes the subscription and stops processing tag updates.
/// </summary>
public void Dispose()
{
if (_disposed) return;
try
{
Task task = DisposeAsync();
if (!task.Wait(TimeSpan.FromSeconds(5)))
{
_logger.LogWarning("Subscription disposal timed out");
}
}
catch (Exception ex)
{
_logger.LogError(ex, "Error during synchronous disposal");
}
}
}
}
}
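
`ProcessUpdatesAsync` links the caller's token with the subscription's own `_cts`, so cancelling either one ends the `await foreach` loop. A self-contained sketch of that pattern, using a hypothetical `FakeStream` iterator in place of `IScadaService.SubscribeAsync`:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the gRPC stream: yields increasing integers until cancelled.
static async IAsyncEnumerable<int> FakeStream([EnumeratorCancellation] CancellationToken ct = default)
{
    for (int i = 0; ; i++)
    {
        await Task.Delay(10, ct);
        yield return i;
    }
}

using var callerCts = new CancellationTokenSource();       // plays the caller-supplied token
using var subscriptionCts = new CancellationTokenSource(); // plays the subscription's _cts
using var linked = CancellationTokenSource.CreateLinkedTokenSource(callerCts.Token, subscriptionCts.Token);

int received = 0;
bool cancelled = false;
try
{
    await foreach (int item in FakeStream(linked.Token))
    {
        received++;
        if (received == 3) subscriptionCts.Cancel(); // simulates DisposeAsync cancelling _cts
    }
}
catch (OperationCanceledException)
{
    cancelled = true; // cancelling either source token surfaces here
}

Console.WriteLine($"received={received}, cancelled={cancelled}");
```

The linked source is disposed when the loop exits, which mirrors the `using var linkedCts` in the subscription; the two source tokens stay owned by whoever created them.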


@@ -0,0 +1,262 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Grpc.Net.Client;
using Microsoft.Extensions.Logging;
using ProtoBuf.Grpc.Client;
using ZB.MOM.WW.LmxProxy.Client.Domain;
using ZB.MOM.WW.LmxProxy.Client.Security;
namespace ZB.MOM.WW.LmxProxy.Client
{
public partial class LmxProxyClient
{
/// <summary>
/// Connects to the LmxProxy service and establishes a session
/// </summary>
/// <param name="cancellationToken">Cancellation token.</param>
public async Task ConnectAsync(CancellationToken cancellationToken = default)
{
GrpcChannel? provisionalChannel = null;
await _connectionLock.WaitAsync(cancellationToken);
try
{
if (_disposed)
{
throw new ObjectDisposedException(nameof(LmxProxyClient));
}
if (_isConnected && _client != null && !string.IsNullOrEmpty(_sessionId))
{
_logger.LogDebug("LmxProxyClient already connected to {Host}:{Port} with session {SessionId}",
_host, _port, _sessionId);
return;
}
string securityMode = _tlsConfiguration?.UseTls == true ? "TLS/SSL" : "INSECURE";
_logger.LogInformation("Creating new {SecurityMode} connection to LmxProxy at {Host}:{Port}",
securityMode, _host, _port);
Uri endpoint = BuildEndpointUri();
provisionalChannel = GrpcChannelFactory.CreateChannel(endpoint, _tlsConfiguration, _logger);
// Create code-first gRPC client
IScadaService provisionalClient = provisionalChannel.CreateGrpcService<IScadaService>();
// Establish session with the server
var connectRequest = new ConnectRequest
{
ClientId = $"ScadaBridge-{Guid.NewGuid():N}",
ApiKey = _apiKey ?? string.Empty
};
ConnectResponse connectResponse = await provisionalClient.ConnectAsync(connectRequest);
if (!connectResponse.Success)
{
// The finally block disposes provisionalChannel, so no explicit Dispose is needed here.
throw new InvalidOperationException($"Failed to establish session: {connectResponse.Message}");
}
// Dispose any existing channel before replacing it
_channel?.Dispose();
_channel = provisionalChannel;
_client = provisionalClient;
_sessionId = connectResponse.SessionId;
_isConnected = true;
provisionalChannel = null;
StartKeepAlive();
_logger.LogInformation("Successfully connected to LmxProxy with session {SessionId}", _sessionId);
}
catch (Exception ex)
{
_isConnected = false;
_client = null;
_sessionId = string.Empty;
_logger.LogError(ex, "Failed to connect to LmxProxy");
throw;
}
finally
{
provisionalChannel?.Dispose();
_connectionLock.Release();
}
}
private void StartKeepAlive()
{
StopKeepAlive();
_keepAliveTimer = new Timer(async _ =>
{
try
{
// Snapshot shared state so a concurrent disconnect cannot null it between the check and the call
IScadaService? client = _client;
string sessionId = _sessionId;
if (_isConnected && client != null && !string.IsNullOrEmpty(sessionId))
{
// Send a lightweight ping to keep session alive
var request = new GetConnectionStateRequest { SessionId = sessionId };
await client.GetConnectionStateAsync(request);
_logger.LogDebug("Keep-alive ping sent successfully for session {SessionId}", sessionId);
}
}
catch (Exception ex)
{
_logger.LogDebug(ex, "Keep-alive ping failed");
StopKeepAlive();
await MarkDisconnectedAsync(ex).ConfigureAwait(false);
}
}, null, _keepAliveInterval, _keepAliveInterval);
}
private void StopKeepAlive()
{
_keepAliveTimer?.Dispose();
_keepAliveTimer = null;
}
/// <summary>
/// Disconnects from the LmxProxy service
/// </summary>
public async Task DisconnectAsync()
{
await _connectionLock.WaitAsync();
try
{
StopKeepAlive();
if (_client != null && !string.IsNullOrEmpty(_sessionId))
{
try
{
var request = new DisconnectRequest { SessionId = _sessionId };
await _client.DisconnectAsync(request);
_logger.LogInformation("Session {SessionId} disconnected", _sessionId);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Error during disconnect");
}
}
_client = null;
_sessionId = string.Empty;
_isConnected = false;
_channel?.Dispose();
_channel = null;
}
finally
{
_connectionLock.Release();
}
}
/// <summary>
/// Connects the LmxProxy to MxAccess (legacy method - session now established in ConnectAsync)
/// </summary>
/// <param name="cancellationToken">Cancellation token.</param>
public Task<(bool Success, string? ErrorMessage)> ConnectToMxAccessAsync(CancellationToken cancellationToken = default)
{
// Session is now established in ConnectAsync
if (IsConnected)
return Task.FromResult((true, (string?)null));
return Task.FromResult<(bool Success, string? ErrorMessage)>((false, "Not connected. Call ConnectAsync first."));
}
/// <summary>
/// Disconnects the LmxProxy from MxAccess (legacy method)
/// </summary>
/// <param name="cancellationToken">Cancellation token.</param>
public async Task<(bool Success, string? ErrorMessage)> DisconnectFromMxAccessAsync(CancellationToken cancellationToken = default)
{
try
{
await DisconnectAsync();
return (true, null);
}
catch (Exception ex)
{
return (false, ex.Message);
}
}
/// <summary>
/// Gets the connection state of the LmxProxy
/// </summary>
/// <param name="cancellationToken">Cancellation token.</param>
public async Task<(bool IsConnected, string? ClientId)> GetConnectionStateAsync(CancellationToken cancellationToken = default)
{
EnsureConnected();
var request = new GetConnectionStateRequest { SessionId = _sessionId };
GetConnectionStateResponse response = await _client!.GetConnectionStateAsync(request);
return (response.IsConnected, response.ClientId);
}
/// <summary>
/// Builds the gRPC endpoint URI (http/https) based on TLS configuration.
/// </summary>
private Uri BuildEndpointUri()
{
string scheme = _tlsConfiguration?.UseTls == true ? Uri.UriSchemeHttps : Uri.UriSchemeHttp;
return new UriBuilder
{
Scheme = scheme,
Host = _host,
Port = _port
}.Uri;
}
private async Task MarkDisconnectedAsync(Exception? ex = null)
{
if (_disposed)
return;
await _connectionLock.WaitAsync().ConfigureAwait(false);
try
{
_isConnected = false;
_client = null;
_sessionId = string.Empty;
_channel?.Dispose();
_channel = null;
}
finally
{
_connectionLock.Release();
}
List<ISubscription> subsToDispose;
lock (_subscriptionLock)
{
subsToDispose = new List<ISubscription>(_activeSubscriptions);
_activeSubscriptions.Clear();
}
foreach (ISubscription sub in subsToDispose)
{
try
{
await sub.DisposeAsync().ConfigureAwait(false);
}
catch (Exception disposeEx)
{
_logger.LogWarning(disposeEx, "Error disposing subscription after disconnect");
}
}
if (ex != null)
{
_logger.LogWarning(ex, "Connection marked disconnected due to keep-alive failure");
}
}
}
}
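
A minimal sketch, not part of this change set: the keep-alive in StartKeepAlive relies on a System.Threading.Timer with an async callback, so ticks can overlap if a ping takes longer than the interval. Assuming .NET 6 or later, a PeriodicTimer loop runs one ping at a time. The names KeepAliveSketch, RunAsync, and pingAsync are hypothetical; pingAsync stands in for the GetConnectionStateAsync call above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class KeepAliveSketch
{
    // Runs pingAsync once per interval until it throws or the token is cancelled.
    // Returns the exception that ended the loop, or null when the loop was cancelled.
    public static async Task<Exception?> RunAsync(
        Func<CancellationToken, Task> pingAsync,
        TimeSpan interval,
        CancellationToken cancellationToken)
    {
        using var timer = new PeriodicTimer(interval);
        try
        {
            // The next tick is only awaited after the previous ping has finished,
            // so pings never run concurrently.
            while (await timer.WaitForNextTickAsync(cancellationToken).ConfigureAwait(false))
            {
                await pingAsync(cancellationToken).ConfigureAwait(false);
            }
        }
        catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
        {
            return null;
        }
        catch (Exception ex)
        {
            // The caller decides how to react, for example by marking the client disconnected.
            return ex;
        }
        return null;
    }
}
```

A caller could start the loop after a successful connect, cancel the token on disconnect, and treat a non-null result as the trigger for its disconnect handling.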

@@ -0,0 +1,16 @@
using System;
using System.Threading.Tasks;
namespace ZB.MOM.WW.LmxProxy.Client
{
/// <summary>
/// Represents a subscription to tag value changes
/// </summary>
public interface ISubscription : IDisposable
{
/// <summary>
/// Disposes the subscription asynchronously
/// </summary>
Task DisposeAsync();
}
}

@@ -0,0 +1,573 @@
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Grpc.Net.Client;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using Polly;
using ZB.MOM.WW.LmxProxy.Client.Domain;
using ZB.MOM.WW.LmxProxy.Client.Security;
namespace ZB.MOM.WW.LmxProxy.Client
{
/// <summary>
/// Client for communicating with the LmxProxy gRPC service using protobuf-net.Grpc code-first
/// </summary>
public partial class LmxProxyClient : ILmxProxyClient
{
private static readonly string Http2InsecureSwitch = "System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport";
private readonly ILogger<LmxProxyClient> _logger;
private readonly string _host;
private readonly int _port;
private readonly string? _apiKey;
private GrpcChannel? _channel;
private IScadaService? _client;
private string _sessionId = string.Empty;
private readonly SemaphoreSlim _connectionLock = new(1, 1);
private readonly List<ISubscription> _activeSubscriptions = [];
private readonly Lock _subscriptionLock = new();
private bool _disposed;
private bool _isConnected;
private TimeSpan _defaultTimeout = TimeSpan.FromSeconds(30);
private ClientConfiguration? _configuration;
private IAsyncPolicy? _retryPolicy;
private readonly ClientMetrics _metrics = new();
private Timer? _keepAliveTimer;
private readonly TimeSpan _keepAliveInterval = TimeSpan.FromSeconds(30);
private readonly ClientTlsConfiguration? _tlsConfiguration;
static LmxProxyClient()
{
AppContext.SetSwitch(Http2InsecureSwitch, true);
}
/// <summary>
/// Gets or sets the default timeout for operations
/// </summary>
public TimeSpan DefaultTimeout
{
get => _defaultTimeout;
set
{
if (value <= TimeSpan.Zero)
throw new ArgumentOutOfRangeException(nameof(value), "Timeout must be positive");
if (value > TimeSpan.FromMinutes(10))
throw new ArgumentOutOfRangeException(nameof(value), "Timeout cannot exceed 10 minutes");
_defaultTimeout = value;
}
}
/// <summary>
/// Initializes a new instance of the LmxProxyClient
/// </summary>
/// <param name="host">The host address of the LmxProxy service</param>
/// <param name="port">The port of the LmxProxy service</param>
/// <param name="apiKey">The API key for authentication</param>
/// <param name="logger">Optional logger instance</param>
public LmxProxyClient(string host, int port, string? apiKey = null, ILogger<LmxProxyClient>? logger = null)
: this(host, port, apiKey, null, logger)
{
}
/// <summary>
/// Creates a new instance of the LmxProxyClient with TLS configuration
/// </summary>
/// <param name="host">The host address of the LmxProxy service</param>
/// <param name="port">The port of the LmxProxy service</param>
/// <param name="apiKey">The API key for authentication</param>
/// <param name="tlsConfiguration">TLS configuration for secure connections</param>
/// <param name="logger">Optional logger instance</param>
public LmxProxyClient(string host, int port, string? apiKey, ClientTlsConfiguration? tlsConfiguration, ILogger<LmxProxyClient>? logger = null)
{
if (string.IsNullOrWhiteSpace(host))
throw new ArgumentException("Host cannot be null or empty", nameof(host));
if (port < 1 || port > 65535)
throw new ArgumentOutOfRangeException(nameof(port), "Port must be between 1 and 65535");
_host = host;
_port = port;
_apiKey = apiKey;
_tlsConfiguration = tlsConfiguration;
_logger = logger ?? NullLogger<LmxProxyClient>.Instance;
}
/// <summary>
/// Gets whether the client is connected to the service
/// </summary>
public bool IsConnected => !_disposed && _isConnected && !string.IsNullOrEmpty(_sessionId);
/// <summary>
/// Asynchronously checks if the client is connected with proper synchronization
/// </summary>
public async Task<bool> IsConnectedAsync()
{
await _connectionLock.WaitAsync();
try
{
return !_disposed && _client != null && _isConnected && !string.IsNullOrEmpty(_sessionId);
}
finally
{
_connectionLock.Release();
}
}
/// <summary>
/// Sets the builder configuration (internal use)
/// </summary>
/// <param name="configuration">The client configuration.</param>
internal void SetBuilderConfiguration(ClientConfiguration configuration)
{
_configuration = configuration;
// Setup retry policy if configured
if (configuration.MaxRetryAttempts > 0)
{
_retryPolicy = Policy
.Handle<Exception>(IsTransientError)
.WaitAndRetryAsync(
configuration.MaxRetryAttempts,
retryAttempt => configuration.RetryDelay * Math.Pow(2, retryAttempt - 1),
onRetry: (exception, timeSpan, retryCount, context) =>
{
object? correlationId = context.GetValueOrDefault("CorrelationId", "N/A");
_logger.LogWarning(exception,
"Retry {RetryCount} after {Delay}ms. CorrelationId: {CorrelationId}",
retryCount, timeSpan.TotalMilliseconds, correlationId);
});
}
}
/// <summary>
/// Reads a single tag value
/// </summary>
/// <param name="address">The tag address to read.</param>
/// <param name="cancellationToken">Cancellation token.</param>
public async Task<Vtq> ReadAsync(string address, CancellationToken cancellationToken = default)
{
if (string.IsNullOrEmpty(address))
throw new ArgumentNullException(nameof(address));
EnsureConnected();
string correlationId = GenerateCorrelationId();
var stopwatch = Stopwatch.StartNew();
try
{
_metrics.IncrementOperationCount("Read");
var request = new ReadRequest
{
SessionId = _sessionId,
Tag = address
};
ReadResponse response = await ExecuteWithRetryAsync(async () =>
await _client!.ReadAsync(request),
correlationId);
if (!response.Success)
{
// The catch block below increments the error count, so it is not incremented here as well
throw new InvalidOperationException($"Read failed for tag '{address}': {response.Message}. CorrelationId: {correlationId}");
}
_metrics.RecordLatency("Read", stopwatch.ElapsedMilliseconds);
return ConvertToVtq(address, response.Vtq);
}
catch (Exception ex)
{
_metrics.IncrementErrorCount("Read");
_logger.LogError(ex, "Read operation failed for tag: {Tag}, CorrelationId: {CorrelationId}",
address, correlationId);
throw;
}
}
/// <summary>
/// Reads multiple tag values
/// </summary>
/// <param name="addresses">The tag addresses to read.</param>
/// <param name="cancellationToken">Cancellation token.</param>
public async Task<IDictionary<string, Vtq>> ReadBatchAsync(IEnumerable<string> addresses, CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(addresses);
var addressList = addresses.ToList();
if (!addressList.Any())
throw new ArgumentException("At least one address must be provided", nameof(addresses));
EnsureConnected();
var request = new ReadBatchRequest
{
SessionId = _sessionId,
Tags = addressList
};
ReadBatchResponse response = await _client!.ReadBatchAsync(request);
if (!response.Success)
throw new InvalidOperationException($"ReadBatch failed: {response.Message}");
var results = new Dictionary<string, Vtq>();
foreach (VtqMessage vtq in response.Vtqs)
{
results[vtq.Tag] = ConvertToVtq(vtq.Tag, vtq);
}
return results;
}
/// <summary>
/// Writes a single tag value
/// </summary>
/// <param name="address">The tag address to write.</param>
/// <param name="value">The value to write.</param>
/// <param name="cancellationToken">Cancellation token.</param>
public async Task WriteAsync(string address, object value, CancellationToken cancellationToken = default)
{
if (string.IsNullOrEmpty(address))
throw new ArgumentNullException(nameof(address));
ArgumentNullException.ThrowIfNull(value);
EnsureConnected();
var request = new WriteRequest
{
SessionId = _sessionId,
Tag = address,
Value = ConvertToString(value)
};
WriteResponse response = await _client!.WriteAsync(request);
if (!response.Success)
throw new InvalidOperationException($"Write failed: {response.Message}");
}
/// <summary>
/// Writes multiple tag values
/// </summary>
/// <param name="values">The tag addresses and values to write.</param>
/// <param name="cancellationToken">Cancellation token.</param>
public async Task WriteBatchAsync(IDictionary<string, object> values, CancellationToken cancellationToken = default)
{
if (values == null || !values.Any())
throw new ArgumentException("At least one value must be provided", nameof(values));
EnsureConnected();
var request = new WriteBatchRequest
{
SessionId = _sessionId,
Items = values.Select(kvp => new WriteItem
{
Tag = kvp.Key,
Value = ConvertToString(kvp.Value)
}).ToList()
};
WriteBatchResponse response = await _client!.WriteBatchAsync(request);
if (!response.Success)
throw new InvalidOperationException($"WriteBatch failed: {response.Message}");
}
/// <summary>
/// Writes values and waits for a condition to be met
/// </summary>
/// <param name="values">The tag addresses and values to write.</param>
/// <param name="flagAddress">The flag address to write.</param>
/// <param name="flagValue">The flag value to write.</param>
/// <param name="responseAddress">The response address to monitor.</param>
/// <param name="responseValue">The expected response value.</param>
/// <param name="timeoutSeconds">Timeout in seconds.</param>
/// <param name="cancellationToken">Cancellation token.</param>
public async Task<bool> WriteBatchAndWaitAsync(
IDictionary<string, object> values,
string flagAddress,
object flagValue,
string responseAddress,
object responseValue,
int timeoutSeconds = 30,
CancellationToken cancellationToken = default)
{
if (values == null || !values.Any())
throw new ArgumentException("At least one value must be provided", nameof(values));
EnsureConnected();
var request = new WriteBatchAndWaitRequest
{
SessionId = _sessionId,
Items = values.Select(kvp => new WriteItem
{
Tag = kvp.Key,
Value = ConvertToString(kvp.Value)
}).ToList(),
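// NOTE: responseAddress and responseValue are not sent; the request only carries the flag tag and value
FlagTag = flagAddress,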
FlagValue = ConvertToString(flagValue),
TimeoutMs = timeoutSeconds * 1000,
PollIntervalMs = 100
};
WriteBatchAndWaitResponse response = await _client!.WriteBatchAndWaitAsync(request);
if (!response.Success)
throw new InvalidOperationException($"WriteBatchAndWait failed: {response.Message}");
return response.FlagReached;
}
/// <summary>
/// Checks the validity and permissions of the current API key
/// </summary>
/// <param name="cancellationToken">Cancellation token.</param>
public async Task<ApiKeyInfo> CheckApiKeyAsync(CancellationToken cancellationToken = default)
{
EnsureConnected();
var request = new CheckApiKeyRequest { ApiKey = _apiKey ?? string.Empty };
CheckApiKeyResponse response = await _client!.CheckApiKeyAsync(request);
return new ApiKeyInfo(
response.IsValid,
"ReadWrite", // Code-first contract doesn't return role
response.Message);
}
/// <summary>
/// Subscribes to tag value changes
/// </summary>
/// <param name="addresses">The tag addresses to subscribe to.</param>
/// <param name="onUpdate">Callback invoked when tag values change.</param>
/// <param name="cancellationToken">Cancellation token.</param>
public Task<ISubscription> SubscribeAsync(
IEnumerable<string> addresses,
Action<string, Vtq> onUpdate,
CancellationToken cancellationToken = default)
{
List<string> addressList = addresses?.ToList() ?? throw new ArgumentNullException(nameof(addresses));
if (!addressList.Any())
throw new ArgumentException("At least one address must be provided", nameof(addresses));
ArgumentNullException.ThrowIfNull(onUpdate);
EnsureConnected();
var subscription = new CodeFirstSubscription(_client!, _sessionId, addressList, onUpdate, _logger, RemoveSubscription);
// Track the subscription
lock (_subscriptionLock)
{
_activeSubscriptions.Add(subscription);
}
// Start processing updates
Task startTask = subscription.StartAsync(cancellationToken);
// Log any startup errors but don't throw; the continuation only runs when the task faults
_ = startTask.ContinueWith(
t => _logger.LogError(t.Exception, "Subscription startup failed"),
CancellationToken.None,
TaskContinuationOptions.OnlyOnFaulted,
TaskScheduler.Default);
return Task.FromResult<ISubscription>(subscription);
}
private void EnsureConnected()
{
if (_disposed)
throw new ObjectDisposedException(nameof(LmxProxyClient));
if (_client == null || !_isConnected || string.IsNullOrEmpty(_sessionId))
throw new InvalidOperationException("Client is not connected. Call ConnectAsync first.");
}
private static Vtq ConvertToVtq(string tag, VtqMessage? vtqMessage)
{
if (vtqMessage == null)
return new Vtq(null, DateTime.UtcNow, Quality.Bad);
// Parse the string value
object? value = vtqMessage.Value;
if (!string.IsNullOrEmpty(vtqMessage.Value))
{
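// Try to parse as numeric types (invariant culture, so "1.5" parses the same under every host locale)
if (double.TryParse(vtqMessage.Value,
System.Globalization.NumberStyles.Float | System.Globalization.NumberStyles.AllowThousands,
System.Globalization.CultureInfo.InvariantCulture, out double doubleVal))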
value = doubleVal;
else if (bool.TryParse(vtqMessage.Value, out bool boolVal))
value = boolVal;
else
value = vtqMessage.Value;
}
var timestamp = new DateTime(vtqMessage.TimestampUtcTicks, DateTimeKind.Utc);
Quality quality = vtqMessage.Quality?.ToUpperInvariant() switch
{
"GOOD" => Quality.Good,
"UNCERTAIN" => Quality.Uncertain,
_ => Quality.Bad
};
return new Vtq(value, timestamp, quality);
}
private static string ConvertToString(object value)
{
if (value == null)
return string.Empty;
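return value switch
{
DateTime dt => dt.ToUniversalTime().ToString("O"),
DateTimeOffset dto => dto.ToString("O"),
bool b => b.ToString().ToLowerInvariant(),
// Format numbers and other formattable values independently of the host locale
IFormattable formattable => formattable.ToString(null, System.Globalization.CultureInfo.InvariantCulture),
_ => value.ToString() ?? string.Empty
};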
}
/// <summary>
/// Removes a subscription from the active tracking list
/// </summary>
private void RemoveSubscription(ISubscription subscription)
{
lock (_subscriptionLock)
{
_activeSubscriptions.Remove(subscription);
}
}
/// <summary>
/// Disposes of the client and closes the connection
/// </summary>
public void Dispose()
{
if (_disposed)
{
return;
}
DisposeAsync().AsTask().GetAwaiter().GetResult();
GC.SuppressFinalize(this);
}
/// <summary>
/// Asynchronously disposes of the client and closes the connection
/// </summary>
public async ValueTask DisposeAsync()
{
if (_disposed)
return;
_disposed = true;
await DisposeCoreAsync().ConfigureAwait(false);
_connectionLock.Dispose();
GC.SuppressFinalize(this);
}
/// <summary>
/// Protected disposal implementation
/// </summary>
/// <param name="disposing">True if disposing managed resources.</param>
protected virtual void Dispose(bool disposing)
{
if (!disposing || _disposed)
return;
_disposed = true;
DisposeCoreAsync().GetAwaiter().GetResult();
_connectionLock.Dispose();
}
private async Task DisposeCoreAsync()
{
StopKeepAlive();
List<ISubscription> subscriptionsToDispose;
lock (_subscriptionLock)
{
subscriptionsToDispose = new List<ISubscription>(_activeSubscriptions);
_activeSubscriptions.Clear();
}
foreach (ISubscription subscription in subscriptionsToDispose)
{
try
{
await subscription.DisposeAsync().ConfigureAwait(false);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error disposing subscription");
}
}
// Disconnect session
if (_client != null && !string.IsNullOrEmpty(_sessionId))
{
try
{
var request = new DisconnectRequest { SessionId = _sessionId };
await _client.DisconnectAsync(request);
}
catch (Exception ex)
{
_logger.LogDebug(ex, "Error during disconnect on dispose");
}
}
await _connectionLock.WaitAsync().ConfigureAwait(false);
try
{
_client = null;
_sessionId = string.Empty;
_isConnected = false;
_channel?.Dispose();
_channel = null;
}
finally
{
_connectionLock.Release();
}
}
private string GenerateCorrelationId()
{
return Guid.NewGuid().ToString("N");
}
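private bool IsTransientError(Exception ex)
{
// Prefer the gRPC status code when one is available
if (ex is Grpc.Core.RpcException rpcException)
{
return rpcException.StatusCode is Grpc.Core.StatusCode.Unavailable
or Grpc.Core.StatusCode.DeadlineExceeded
or Grpc.Core.StatusCode.ResourceExhausted
or Grpc.Core.StatusCode.Aborted;
}
// Fall back to matching the message text for other exception types
return ex.Message.Contains("Unavailable") ||
ex.Message.Contains("DeadlineExceeded") ||
ex.Message.Contains("ResourceExhausted") ||
ex.Message.Contains("Aborted");
}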
private async Task<T> ExecuteWithRetryAsync<T>(Func<Task<T>> operation, string correlationId)
{
if (_retryPolicy != null)
{
var context = new Context { ["CorrelationId"] = correlationId };
return await _retryPolicy.ExecuteAsync(async (_) => await operation(), context);
}
return await operation();
}
/// <summary>
/// Gets the current metrics snapshot
/// </summary>
public Dictionary<string, object> GetMetrics() => _metrics.GetSnapshot();
}
}
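
For reference, the wait schedule produced by the retry policy in SetBuilderConfiguration can be worked out without Polly. This is a small sketch that assumes the same formula, RetryDelay * 2^(retryAttempt - 1); BackoffSketch and Delays are illustrative names only.

```csharp
using System;
using System.Collections.Generic;

public static class BackoffSketch
{
    // Returns the wait before each retry: baseDelay doubled on every further attempt.
    public static IReadOnlyList<TimeSpan> Delays(TimeSpan baseDelay, int maxRetryAttempts)
    {
        var delays = new List<TimeSpan>(maxRetryAttempts);
        for (int retryAttempt = 1; retryAttempt <= maxRetryAttempts; retryAttempt++)
        {
            // Same expression as the sleep-duration lambda passed to WaitAndRetryAsync
            delays.Add(baseDelay * Math.Pow(2, retryAttempt - 1));
        }
        return delays;
    }
}
```

With the builder defaults of 3 attempts and a 1 second delay, the waits come out as 1 s, 2 s, and 4 s, so a call that keeps failing adds about 7 seconds before the final exception surfaces.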

@@ -0,0 +1,241 @@
using System;
using System.IO;
using Microsoft.Extensions.Logging;
namespace ZB.MOM.WW.LmxProxy.Client
{
/// <summary>
/// Builder for creating configured instances of LmxProxyClient
/// </summary>
public class LmxProxyClientBuilder
{
private string? _host;
private int _port = 5050;
private string? _apiKey;
private ILogger<LmxProxyClient>? _logger;
private TimeSpan _defaultTimeout = TimeSpan.FromSeconds(30);
private int _maxRetryAttempts = 3;
private TimeSpan _retryDelay = TimeSpan.FromSeconds(1);
private bool _enableMetrics;
private string? _correlationIdHeader;
private ClientTlsConfiguration? _tlsConfiguration;
/// <summary>
/// Sets the host address for the LmxProxy service
/// </summary>
/// <param name="host">The host address</param>
/// <returns>The builder instance for method chaining</returns>
public LmxProxyClientBuilder WithHost(string host)
{
if (string.IsNullOrWhiteSpace(host))
throw new ArgumentException("Host cannot be null or empty", nameof(host));
_host = host;
return this;
}
/// <summary>
/// Sets the port for the LmxProxy service
/// </summary>
/// <param name="port">The port number</param>
/// <returns>The builder instance for method chaining</returns>
public LmxProxyClientBuilder WithPort(int port)
{
if (port < 1 || port > 65535)
throw new ArgumentOutOfRangeException(nameof(port), "Port must be between 1 and 65535");
_port = port;
return this;
}
/// <summary>
/// Sets the API key for authentication
/// </summary>
/// <param name="apiKey">The API key</param>
/// <returns>The builder instance for method chaining</returns>
public LmxProxyClientBuilder WithApiKey(string apiKey)
{
_apiKey = apiKey;
return this;
}
/// <summary>
/// Sets the logger instance
/// </summary>
/// <param name="logger">The logger</param>
/// <returns>The builder instance for method chaining</returns>
public LmxProxyClientBuilder WithLogger(ILogger<LmxProxyClient> logger)
{
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
return this;
}
/// <summary>
/// Sets the default timeout for operations
/// </summary>
/// <param name="timeout">The timeout duration</param>
/// <returns>The builder instance for method chaining</returns>
public LmxProxyClientBuilder WithTimeout(TimeSpan timeout)
{
if (timeout <= TimeSpan.Zero)
throw new ArgumentOutOfRangeException(nameof(timeout), "Timeout must be positive");
if (timeout > TimeSpan.FromMinutes(10))
throw new ArgumentOutOfRangeException(nameof(timeout), "Timeout cannot exceed 10 minutes");
_defaultTimeout = timeout;
return this;
}
/// <summary>
/// Enables SSL/TLS with the specified certificate
/// </summary>
/// <param name="certificatePath">Path to the certificate file</param>
/// <returns>The builder instance for method chaining</returns>
public LmxProxyClientBuilder WithSslCredentials(string? certificatePath = null)
{
_tlsConfiguration ??= new ClientTlsConfiguration();
_tlsConfiguration.UseTls = true;
_tlsConfiguration.ServerCaCertificatePath = string.IsNullOrWhiteSpace(certificatePath) ? null : certificatePath;
return this;
}
/// <summary>
/// Applies a full TLS configuration to the client.
/// </summary>
/// <param name="configuration">The TLS configuration to apply.</param>
/// <returns>The builder instance for method chaining.</returns>
public LmxProxyClientBuilder WithTlsConfiguration(ClientTlsConfiguration configuration)
{
_tlsConfiguration = configuration ?? throw new ArgumentNullException(nameof(configuration));
return this;
}
/// <summary>
/// Sets the retry configuration
/// </summary>
/// <param name="maxAttempts">Maximum number of retry attempts</param>
/// <param name="retryDelay">Base delay before the first retry; the client doubles it on each subsequent attempt</param>
/// <returns>The builder instance for method chaining</returns>
public LmxProxyClientBuilder WithRetryPolicy(int maxAttempts, TimeSpan retryDelay)
{
if (maxAttempts <= 0)
throw new ArgumentOutOfRangeException(nameof(maxAttempts), "Max attempts must be positive");
if (retryDelay <= TimeSpan.Zero)
throw new ArgumentOutOfRangeException(nameof(retryDelay), "Retry delay must be positive");
_maxRetryAttempts = maxAttempts;
_retryDelay = retryDelay;
return this;
}
/// <summary>
/// Enables metrics collection
/// </summary>
/// <returns>The builder instance for method chaining</returns>
public LmxProxyClientBuilder WithMetrics()
{
_enableMetrics = true;
return this;
}
/// <summary>
/// Sets the correlation ID header name for request tracing
/// </summary>
/// <param name="headerName">The header name for correlation ID</param>
/// <returns>The builder instance for method chaining</returns>
public LmxProxyClientBuilder WithCorrelationIdHeader(string headerName)
{
if (string.IsNullOrEmpty(headerName))
throw new ArgumentException("Header name cannot be null or empty", nameof(headerName));
_correlationIdHeader = headerName;
return this;
}
/// <summary>
/// Builds the configured LmxProxyClient instance
/// </summary>
/// <returns>A configured LmxProxyClient instance</returns>
public LmxProxyClient Build()
{
if (string.IsNullOrWhiteSpace(_host))
throw new InvalidOperationException("Host must be specified");
ValidateTlsConfiguration();
var client = new LmxProxyClient(_host, _port, _apiKey, _tlsConfiguration, _logger)
{
DefaultTimeout = _defaultTimeout
};
// Apply the retry settings and store the remaining configuration on the client
client.SetBuilderConfiguration(new ClientConfiguration
{
MaxRetryAttempts = _maxRetryAttempts,
RetryDelay = _retryDelay,
EnableMetrics = _enableMetrics,
CorrelationIdHeader = _correlationIdHeader
});
return client;
}
private void ValidateTlsConfiguration()
{
if (_tlsConfiguration?.UseTls != true)
{
return;
}
if (!string.IsNullOrWhiteSpace(_tlsConfiguration.ServerCaCertificatePath) &&
!File.Exists(_tlsConfiguration.ServerCaCertificatePath))
{
throw new FileNotFoundException(
$"Certificate file not found: {_tlsConfiguration.ServerCaCertificatePath}",
_tlsConfiguration.ServerCaCertificatePath);
}
if (!string.IsNullOrWhiteSpace(_tlsConfiguration.ClientCertificatePath) &&
!File.Exists(_tlsConfiguration.ClientCertificatePath))
{
throw new FileNotFoundException(
$"Client certificate file not found: {_tlsConfiguration.ClientCertificatePath}",
_tlsConfiguration.ClientCertificatePath);
}
if (!string.IsNullOrWhiteSpace(_tlsConfiguration.ClientKeyPath) &&
!File.Exists(_tlsConfiguration.ClientKeyPath))
{
throw new FileNotFoundException(
$"Client key file not found: {_tlsConfiguration.ClientKeyPath}",
_tlsConfiguration.ClientKeyPath);
}
}
}
/// <summary>
/// Internal configuration class for storing builder settings
/// </summary>
internal class ClientConfiguration
{
/// <summary>
/// Gets or sets the maximum number of retry attempts.
/// </summary>
public int MaxRetryAttempts { get; set; }
/// <summary>
/// Gets or sets the retry delay.
/// </summary>
public TimeSpan RetryDelay { get; set; }
/// <summary>
/// Gets or sets a value indicating whether metrics are enabled.
/// </summary>
public bool EnableMetrics { get; set; }
/// <summary>
/// Gets or sets the correlation ID header name.
/// </summary>
public string? CorrelationIdHeader { get; set; }
}
}

@@ -0,0 +1,4 @@
using System.Runtime.CompilerServices;
// Expose internal members to test assembly
[assembly: InternalsVisibleTo("ZB.MOM.WW.LmxProxy.Client.Tests")]

@@ -0,0 +1,184 @@
using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Security;
using System.Security.Authentication;
using System.Security.Cryptography.X509Certificates;
using Grpc.Net.Client;
using Microsoft.Extensions.Logging;
namespace ZB.MOM.WW.LmxProxy.Client.Security;
internal static class GrpcChannelFactory
{
private const string Http2UnencryptedSwitch = "System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport";
static GrpcChannelFactory()
{
AppContext.SetSwitch(Http2UnencryptedSwitch, true);
}
/// <summary>
/// Creates a gRPC channel with optional TLS configuration.
/// </summary>
/// <param name="address">The server address.</param>
/// <param name="tlsConfiguration">Optional TLS configuration.</param>
/// <param name="logger">The logger.</param>
/// <returns>A configured gRPC channel.</returns>
public static GrpcChannel CreateChannel(Uri address, ClientTlsConfiguration? tlsConfiguration, ILogger logger)
{
var options = new GrpcChannelOptions
{
HttpHandler = CreateHttpHandler(tlsConfiguration, logger)
};
return GrpcChannel.ForAddress(address, options);
}
private static HttpMessageHandler CreateHttpHandler(ClientTlsConfiguration? tlsConfiguration, ILogger logger)
{
var handler = new SocketsHttpHandler
{
AutomaticDecompression = DecompressionMethods.None,
AllowAutoRedirect = false,
EnableMultipleHttp2Connections = true
};
if (tlsConfiguration?.UseTls == true)
{
ConfigureTls(handler, tlsConfiguration, logger);
}
return handler;
}
private static void ConfigureTls(SocketsHttpHandler handler, ClientTlsConfiguration tlsConfiguration, ILogger logger)
{
SslClientAuthenticationOptions sslOptions = handler.SslOptions;
sslOptions.EnabledSslProtocols = SslProtocols.Tls12 | SslProtocols.Tls13;
if (!string.IsNullOrWhiteSpace(tlsConfiguration.ServerNameOverride))
{
sslOptions.TargetHost = tlsConfiguration.ServerNameOverride;
}
if (!string.IsNullOrWhiteSpace(tlsConfiguration.ClientCertificatePath) &&
!string.IsNullOrWhiteSpace(tlsConfiguration.ClientKeyPath))
{
try
{
var clientCertificate = X509Certificate2.CreateFromPemFile(
tlsConfiguration.ClientCertificatePath,
tlsConfiguration.ClientKeyPath);
clientCertificate = new X509Certificate2(clientCertificate.Export(X509ContentType.Pfx));
sslOptions.ClientCertificates ??= new X509CertificateCollection();
sslOptions.ClientCertificates.Add(clientCertificate);
logger.LogInformation("Configured client certificate for mutual TLS ({CertificatePath})", tlsConfiguration.ClientCertificatePath);
}
catch (Exception ex)
{
logger.LogWarning(ex, "Failed to load client certificate from {CertificatePath}", tlsConfiguration.ClientCertificatePath);
}
}
sslOptions.RemoteCertificateValidationCallback = (_, certificate, chain, sslPolicyErrors) =>
ValidateServerCertificate(tlsConfiguration, logger, certificate, chain, sslPolicyErrors);
}
private static bool ValidateServerCertificate(
ClientTlsConfiguration tlsConfiguration,
ILogger logger,
X509Certificate? certificate,
X509Chain? chain,
SslPolicyErrors sslPolicyErrors)
{
if (tlsConfiguration.IgnoreAllCertificateErrors)
{
logger.LogWarning("SECURITY WARNING: Ignoring all certificate validation errors for LmxProxy gRPC connection.");
return true;
}
if (certificate is null)
{
logger.LogWarning("Server certificate was null.");
return false;
}
if (!tlsConfiguration.ValidateServerCertificate)
{
logger.LogWarning("SECURITY WARNING: Server certificate validation disabled for LmxProxy gRPC connection.");
return true;
}
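// The chain checks below do not cover host names, so honor the framework's name check
// unless a ServerNameOverride is configured, which is verified separately below
if (string.IsNullOrWhiteSpace(tlsConfiguration.ServerNameOverride) &&
(sslPolicyErrors & SslPolicyErrors.RemoteCertificateNameMismatch) != 0)
{
logger.LogWarning("Server certificate name does not match the requested host.");
return false;
}
X509Certificate2 certificate2 = certificate as X509Certificate2 ?? new X509Certificate2(certificate);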
if (!string.IsNullOrWhiteSpace(tlsConfiguration.ServerNameOverride))
{
string dnsName = certificate2.GetNameInfo(X509NameType.DnsName, forIssuer: false);
if (!string.Equals(dnsName, tlsConfiguration.ServerNameOverride, StringComparison.OrdinalIgnoreCase))
{
logger.LogWarning("Server certificate subject '{Subject}' does not match expected host '{ExpectedHost}'",
dnsName, tlsConfiguration.ServerNameOverride);
return false;
}
}
using X509Chain validationChain = chain ?? new X509Chain();
validationChain.ChainPolicy.RevocationMode = X509RevocationMode.NoCheck;
validationChain.ChainPolicy.VerificationFlags = X509VerificationFlags.NoFlag;
if (!string.IsNullOrWhiteSpace(tlsConfiguration.ServerCaCertificatePath) &&
File.Exists(tlsConfiguration.ServerCaCertificatePath))
{
try
{
X509Certificate2 ca = LoadCertificate(tlsConfiguration.ServerCaCertificatePath);
validationChain.ChainPolicy.CustomTrustStore.Add(ca);
validationChain.ChainPolicy.TrustMode = X509ChainTrustMode.CustomRootTrust;
}
catch (Exception ex)
{
logger.LogWarning(ex, "Failed to load CA certificate from {Path}", tlsConfiguration.ServerCaCertificatePath);
}
}
if (tlsConfiguration.AllowSelfSignedCertificates)
{
validationChain.ChainPolicy.VerificationFlags |= X509VerificationFlags.AllowUnknownCertificateAuthority;
}
bool isValid = validationChain.Build(certificate2);
if (isValid)
{
return true;
}
if (tlsConfiguration.AllowSelfSignedCertificates &&
validationChain.ChainStatus.All(status =>
status.Status == X509ChainStatusFlags.UntrustedRoot ||
status.Status == X509ChainStatusFlags.PartialChain))
{
logger.LogWarning("Accepting self-signed certificate for {Subject}", certificate2.Subject);
return true;
}
string statusMessage = string.Join(", ", validationChain.ChainStatus.Select(s => s.Status));
logger.LogWarning("Server certificate validation failed: {Status}", statusMessage);
return false;
}
private static X509Certificate2 LoadCertificate(string path)
{
try
{
return X509Certificate2.CreateFromPemFile(path);
}
catch
{
return new X509Certificate2(File.ReadAllBytes(path));
}
}
}
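
The early exits in ValidateServerCertificate above are order-sensitive: IgnoreAllCertificateErrors is checked before the null-certificate guard, so it accepts a connection even when no certificate was presented. A minimal sketch of that precedence, assuming nothing beyond the three checks shown; TlsDecision and TlsDecisionSketch are hypothetical names used only for illustration and are not part of the client library:

```csharp
using System;

// Hypothetical result type for the sketch; the real callback returns bool.
public enum TlsDecision
{
    AcceptInsecure,
    Reject,
    ContinueToNameAndChainChecks
}

public static class TlsDecisionSketch
{
    // Mirrors the order of the first three checks in ValidateServerCertificate.
    public static TlsDecision EarlyChecks(
        bool ignoreAllCertificateErrors,
        bool certificatePresent,
        bool validateServerCertificate)
    {
        // 1. The "ignore everything" switch wins, even with no certificate at all.
        if (ignoreAllCertificateErrors) return TlsDecision.AcceptInsecure;

        // 2. Without that switch, a missing certificate is always rejected.
        if (!certificatePresent) return TlsDecision.Reject;

        // 3. Validation can still be switched off for a certificate that is present.
        if (!validateServerCertificate) return TlsDecision.AcceptInsecure;

        // 4. Otherwise the host-name comparison and chain build decide.
        return TlsDecision.ContinueToNameAndChainChecks;
    }
}
```

Both insecure outcomes log a security warning in the code above; only the last outcome reaches the ServerNameOverride comparison and the X509Chain build.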

@@ -0,0 +1,182 @@
using System;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
namespace ZB.MOM.WW.LmxProxy.Client
{
/// <summary>
/// Extension methods for registering LmxProxyClient with dependency injection
/// </summary>
public static class ServiceCollectionExtensions
{
/// <summary>
/// Adds LmxProxyClient services to the service collection
/// </summary>
/// <param name="services">The service collection</param>
/// <param name="configuration">Application configuration</param>
/// <returns>The service collection for chaining</returns>
public static IServiceCollection AddLmxProxyClient(this IServiceCollection services, IConfiguration configuration)
{
return services.AddLmxProxyClient(configuration, "LmxProxy");
}
/// <summary>
/// Adds LmxProxyClient services to the service collection with a specific configuration section
/// </summary>
/// <param name="services">The service collection</param>
/// <param name="configuration">Application configuration (not read by this method; only the section name is passed to the factory)</param>
/// <param name="configurationSection">Name of the configuration section</param>
/// <returns>The service collection for chaining</returns>
public static IServiceCollection AddLmxProxyClient(
this IServiceCollection services,
IConfiguration configuration,
string configurationSection)
{
services.AddSingleton<ILmxProxyClientFactory, LmxProxyClientFactory>();
// Register a singleton client built from the named configuration section
services.AddSingleton<LmxProxyClient>(provider =>
{
ILmxProxyClientFactory factory = provider.GetRequiredService<ILmxProxyClientFactory>();
return factory.CreateClient(configurationSection);
});
return services;
}
/// <summary>
/// Adds LmxProxyClient services to the service collection with custom configuration
/// </summary>
/// <param name="services">The service collection</param>
/// <param name="configureClient">Action to configure the client builder</param>
/// <returns>The service collection for chaining</returns>
public static IServiceCollection AddLmxProxyClient(
this IServiceCollection services,
Action<LmxProxyClientBuilder> configureClient)
{
services.AddSingleton<ILmxProxyClientFactory, LmxProxyClientFactory>();
// Register a singleton client with custom configuration
services.AddSingleton<LmxProxyClient>(provider =>
{
ILmxProxyClientFactory factory = provider.GetRequiredService<ILmxProxyClientFactory>();
return factory.CreateClient(configureClient);
});
return services;
}
/// <summary>
/// Adds LmxProxyClient services to the service collection with scoped lifetime
/// </summary>
/// <param name="services">The service collection</param>
/// <param name="configuration">Application configuration (not read by this method; the factory's parameterless CreateClient() is used)</param>
/// <returns>The service collection for chaining</returns>
public static IServiceCollection AddScopedLmxProxyClient(
this IServiceCollection services,
IConfiguration configuration)
{
services.AddSingleton<ILmxProxyClientFactory, LmxProxyClientFactory>();
// Register a scoped client
services.AddScoped<LmxProxyClient>(provider =>
{
ILmxProxyClientFactory factory = provider.GetRequiredService<ILmxProxyClientFactory>();
return factory.CreateClient();
});
return services;
}
/// <summary>
/// Adds named LmxProxyClient services to the service collection
/// </summary>
/// <param name="services">The service collection</param>
/// <param name="name">Name for the client</param>
/// <param name="configureClient">Action to configure the client builder</param>
/// <returns>The service collection for chaining</returns>
public static IServiceCollection AddNamedLmxProxyClient(
this IServiceCollection services,
string name,
Action<LmxProxyClientBuilder> configureClient)
{
services.AddSingleton<ILmxProxyClientFactory, LmxProxyClientFactory>();
// Register a keyed singleton
services.AddKeyedSingleton<LmxProxyClient>(name, (provider, _) =>
{
ILmxProxyClientFactory factory = provider.GetRequiredService<ILmxProxyClientFactory>();
return factory.CreateClient(configureClient);
});
return services;
}
}
/// <summary>
/// Configuration options for LmxProxyClient
/// </summary>
public class LmxProxyClientOptions
{
/// <summary>
/// Gets or sets the host address
/// </summary>
public string Host { get; set; } = "localhost";
/// <summary>
/// Gets or sets the port number
/// </summary>
public int Port { get; set; } = 5050;
/// <summary>
/// Gets or sets the API key
/// </summary>
public string? ApiKey { get; set; }
/// <summary>
/// Gets or sets the timeout duration
/// </summary>
public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(30);
/// <summary>
/// Gets or sets whether to use SSL
/// </summary>
public bool UseSsl { get; set; }
/// <summary>
/// Gets or sets the certificate path for SSL
/// </summary>
public string? CertificatePath { get; set; }
/// <summary>
/// Gets or sets whether to enable metrics
/// </summary>
public bool EnableMetrics { get; set; }
/// <summary>
/// Gets or sets the correlation ID header name
/// </summary>
public string? CorrelationIdHeader { get; set; }
/// <summary>
/// Gets or sets the retry configuration
/// </summary>
public RetryOptions? Retry { get; set; }
}
/// <summary>
/// Retry configuration options
/// </summary>
public class RetryOptions
{
/// <summary>
/// Gets or sets the maximum number of retry attempts
/// </summary>
public int MaxAttempts { get; set; } = 3;
/// <summary>
/// Gets or sets the delay between retries
/// </summary>
public TimeSpan Delay { get; set; } = TimeSpan.FromSeconds(1);
}
}
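
How the factory consumes the "LmxProxy" section is not visible in this diff. A minimal sketch, assuming it binds the section onto LmxProxyClientOptions with the standard configuration binder. DemoClientOptions, DemoRetryOptions and the host name are hypothetical stand-ins so the example stands alone, and ConfigurationBuilder with AddInMemoryCollection comes from the Microsoft.Extensions.Configuration package, which is an extra dependency assumed here:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

// Trimmed, hypothetical copies of the option classes above.
public class DemoRetryOptions
{
    public int MaxAttempts { get; set; } = 3;
    public TimeSpan Delay { get; set; } = TimeSpan.FromSeconds(1);
}

public class DemoClientOptions
{
    public string Host { get; set; } = "localhost";
    public int Port { get; set; } = 5050;
    public TimeSpan Timeout { get; set; } = TimeSpan.FromSeconds(30);
    public DemoRetryOptions? Retry { get; set; }
}

public static class OptionsBindingSketch
{
    public static DemoClientOptions Bind()
    {
        // In-memory keys use the same "Section:Property" paths a JSON file would produce.
        IConfiguration configuration = new ConfigurationBuilder()
            .AddInMemoryCollection(new Dictionary<string, string?>
            {
                ["LmxProxy:Host"] = "scada-gw.example.test",
                ["LmxProxy:Port"] = "5051",
                ["LmxProxy:Timeout"] = "00:00:10",
                ["LmxProxy:Retry:MaxAttempts"] = "5",
            })
            .Build();

        // Keys that are absent keep the defaults declared on the class.
        return configuration.GetSection("LmxProxy").Get<DemoClientOptions>()
               ?? new DemoClientOptions();
    }
}
```

In this sketch Retry.Delay is never set, so it keeps its one-second default while MaxAttempts becomes 5.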

@@ -0,0 +1,260 @@
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.LmxProxy.Client.Domain;
namespace ZB.MOM.WW.LmxProxy.Client
{
/// <summary>
/// Extension methods for streaming operations with the LmxProxy client
/// </summary>
public static class StreamingExtensions
{
/// <summary>
/// Reads multiple tag values as an async stream for efficient memory usage with large datasets
/// </summary>
/// <param name="client">The LmxProxy client</param>
/// <param name="addresses">The addresses to read</param>
/// <param name="batchSize">Size of each batch to process</param>
/// <param name="cancellationToken">Cancellation token</param>
/// <returns>An async enumerable of tag values</returns>
public static async IAsyncEnumerable<KeyValuePair<string, Vtq>> ReadStreamAsync(
this ILmxProxyClient client,
IEnumerable<string> addresses,
int batchSize = 100,
[EnumeratorCancellation] CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(client);
ArgumentNullException.ThrowIfNull(addresses);
if (batchSize <= 0)
throw new ArgumentOutOfRangeException(nameof(batchSize), "Batch size must be positive");
var batch = new List<string>(batchSize);
int errorCount = 0;
const int maxConsecutiveErrors = 3;
foreach (string address in addresses)
{
batch.Add(address);
if (batch.Count >= batchSize)
{
bool success = false;
int retries = 0;
const int maxRetries = 2;
while (!success && retries < maxRetries)
{
IDictionary<string, Vtq>? results = null;
try
{
results = await client.ReadBatchAsync(batch, cancellationToken);
errorCount = 0; // Reset error count on success
success = true;
}
catch (OperationCanceledException)
{
throw; // Don't retry on cancellation
}
catch (Exception ex)
{
retries++;
errorCount++;
if (errorCount >= maxConsecutiveErrors)
{
throw new InvalidOperationException(
$"Stream reading failed after {maxConsecutiveErrors} consecutive errors", ex);
}
if (retries >= maxRetries)
{
// Give up on this batch: write to debug output, drop its addresses, and continue with the next batch
System.Diagnostics.Debug.WriteLine($"Failed to read batch after {maxRetries} attempts: {ex.Message}");
batch.Clear();
break;
}
// Wait before retry with exponential backoff
await Task.Delay(TimeSpan.FromMilliseconds(100 * Math.Pow(2, retries - 1)), cancellationToken);
}
if (results != null)
{
foreach (KeyValuePair<string, Vtq> result in results)
{
yield return result;
}
batch.Clear();
}
}
}
cancellationToken.ThrowIfCancellationRequested();
}
// Process remaining items
if (batch.Count > 0)
{
IDictionary<string, Vtq>? results = null;
try
{
results = await client.ReadBatchAsync(batch, cancellationToken);
}
catch (OperationCanceledException)
{
throw;
}
catch (Exception ex)
{
// Log error for final batch but don't throw to allow partial results
System.Diagnostics.Debug.WriteLine($"Failed to read final batch: {ex.Message}");
}
if (results != null)
{
foreach (KeyValuePair<string, Vtq> result in results)
{
yield return result;
}
}
}
}
/// <summary>
/// Writes multiple tag values as an async stream for efficient memory usage with large datasets
/// </summary>
/// <param name="client">The LmxProxy client</param>
/// <param name="values">The values to write as an async enumerable</param>
/// <param name="batchSize">Size of each batch to process</param>
/// <param name="cancellationToken">Cancellation token</param>
/// <returns>The number of values written</returns>
public static async Task<int> WriteStreamAsync(
this ILmxProxyClient client,
IAsyncEnumerable<KeyValuePair<string, object>> values,
int batchSize = 100,
CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(client);
ArgumentNullException.ThrowIfNull(values);
if (batchSize <= 0)
throw new ArgumentOutOfRangeException(nameof(batchSize), "Batch size must be positive");
var batch = new Dictionary<string, object>(batchSize);
int totalWritten = 0;
await foreach (KeyValuePair<string, object> kvp in values.WithCancellation(cancellationToken))
{
batch[kvp.Key] = kvp.Value;
if (batch.Count >= batchSize)
{
await client.WriteBatchAsync(batch, cancellationToken);
totalWritten += batch.Count;
batch.Clear();
}
}
// Process remaining items
if (batch.Count > 0)
{
await client.WriteBatchAsync(batch, cancellationToken);
totalWritten += batch.Count;
}
return totalWritten;
}
/// <summary>
/// Reads tag values in batches and runs the processor on each value concurrently, up to maxDegreeOfParallelism at a time
/// </summary>
/// <param name="client">The LmxProxy client</param>
/// <param name="addresses">The addresses to read</param>
/// <param name="processor">The async function to process each value</param>
/// <param name="maxDegreeOfParallelism">Maximum number of concurrent operations</param>
/// <param name="cancellationToken">Cancellation token</param>
public static async Task ProcessInParallelAsync(
this ILmxProxyClient client,
IEnumerable<string> addresses,
Func<string, Vtq, Task> processor,
int maxDegreeOfParallelism = 4,
CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(client);
ArgumentNullException.ThrowIfNull(addresses);
ArgumentNullException.ThrowIfNull(processor);
if (maxDegreeOfParallelism <= 0)
throw new ArgumentOutOfRangeException(nameof(maxDegreeOfParallelism));
var semaphore = new SemaphoreSlim(maxDegreeOfParallelism, maxDegreeOfParallelism);
var tasks = new List<Task>();
await foreach (KeyValuePair<string, Vtq> kvp in client.ReadStreamAsync(addresses, cancellationToken: cancellationToken))
{
await semaphore.WaitAsync(cancellationToken);
var task = Task.Run(async () =>
{
try
{
await processor(kvp.Key, kvp.Value);
}
finally
{
semaphore.Release();
}
}); // No token passed to Task.Run: a task cancelled before it starts would never release its semaphore slot
tasks.Add(task);
}
await Task.WhenAll(tasks);
}
/// <summary>
/// Subscribes to multiple tags and returns updates as an async stream
/// </summary>
/// <param name="client">The LmxProxy client</param>
/// <param name="addresses">The addresses to subscribe to</param>
/// <param name="pollIntervalMs">Poll interval in milliseconds (currently unused; updates are pushed through the subscription callback)</param>
/// <param name="cancellationToken">Cancellation token</param>
/// <returns>An async enumerable of tag updates</returns>
public static async IAsyncEnumerable<Vtq> SubscribeStreamAsync(
this ILmxProxyClient client,
IEnumerable<string> addresses,
int pollIntervalMs = 1000,
[EnumeratorCancellation] CancellationToken cancellationToken = default)
{
ArgumentNullException.ThrowIfNull(client);
ArgumentNullException.ThrowIfNull(addresses);
var updateChannel = System.Threading.Channels.Channel.CreateUnbounded<Vtq>();
// Setup update handler
void OnUpdate(string address, Vtq vtq)
{
updateChannel.Writer.TryWrite(vtq);
}
ISubscription subscription = await client.SubscribeAsync(addresses, OnUpdate, cancellationToken);
try
{
await foreach (Vtq update in updateChannel.Reader.ReadAllAsync(cancellationToken))
{
yield return update;
}
}
finally
{
await subscription.DisposeAsync();
}
}
}
}
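
The batching and retry arithmetic in ReadStreamAsync can be checked in isolation. A minimal sketch, assuming only the formula and loop shape shown above; ReadStreamSketch, BackoffDelay and ToBatches are hypothetical helper names introduced for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class ReadStreamSketch
{
    // Same formula as the retry delay in ReadStreamAsync:
    // 100 ms for the first retry, doubling for each further one.
    public static TimeSpan BackoffDelay(int retries) =>
        TimeSpan.FromMilliseconds(100 * Math.Pow(2, retries - 1));

    // Groups addresses the way the foreach loop fills its batch list:
    // full batches first, then one trailing partial batch if anything is left.
    public static IEnumerable<IReadOnlyList<string>> ToBatches(
        IEnumerable<string> addresses, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize), "Batch size must be positive");

        var batch = new List<string>(batchSize);
        foreach (string address in addresses)
        {
            batch.Add(address);
            if (batch.Count >= batchSize)
            {
                yield return batch;
                batch = new List<string>(batchSize); // fresh list so callers can keep the old one
            }
        }

        if (batch.Count > 0)
            yield return batch;
    }
}
```

With maxRetries fixed at 2, the loop above only ever waits once per batch: the first failure sleeps for BackoffDelay(1), which is 100 ms, and the second failure abandons the batch before any further delay.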

@@ -0,0 +1,27 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<LangVersion>latest</LangVersion>
<Nullable>enable</Nullable>
<RootNamespace>ZB.MOM.WW.LmxProxy.Client</RootNamespace>
<AssemblyName>ZB.MOM.WW.LmxProxy.Client</AssemblyName>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<IsPackable>true</IsPackable>
<Description>gRPC client library for LmxProxy service</Description>
<PlatformTarget>AnyCPU</PlatformTarget>
<Platforms>AnyCPU</Platforms>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Grpc.Core.Api" Version="2.71.0" />
<PackageReference Include="Grpc.Net.Client" Version="2.71.0" />
<PackageReference Include="protobuf-net.Grpc" Version="1.2.5" />
<PackageReference Include="Microsoft.Extensions.Configuration.Abstractions" Version="10.0.0" />
<PackageReference Include="Microsoft.Extensions.Configuration.Binder" Version="10.0.0" />
<PackageReference Include="Microsoft.Extensions.DependencyInjection.Abstractions" Version="10.0.0" />
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" Version="10.0.0" />
<PackageReference Include="Polly" Version="8.5.2" />
</ItemGroup>
</Project>

@@ -0,0 +1,25 @@
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<runtime>
<assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
<dependentAssembly>
<assemblyIdentity name="System.Threading.Tasks.Extensions" publicKeyToken="cc7b13ffcd2ddd51"
culture="neutral"/>
<bindingRedirect oldVersion="0.0.0.0-4.2.0.1" newVersion="4.2.0.1"/>
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity name="System.Runtime.CompilerServices.Unsafe" publicKeyToken="b03f5f7f11d50a3a"
culture="neutral"/>
<bindingRedirect oldVersion="0.0.0.0-4.0.6.0" newVersion="4.0.6.0"/>
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity name="System.Memory" publicKeyToken="cc7b13ffcd2ddd51" culture="neutral"/>
<bindingRedirect oldVersion="0.0.0.0-4.0.1.2" newVersion="4.0.1.2"/>
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity name="System.Buffers" publicKeyToken="cc7b13ffcd2ddd51" culture="neutral"/>
<bindingRedirect oldVersion="0.0.0.0-4.0.3.0" newVersion="4.0.3.0"/>
</dependentAssembly>
</assemblyBinding>
</runtime>
</configuration>

@@ -0,0 +1,206 @@
using System;
using System.Collections.Generic;
using System.Linq;
using Serilog;
namespace ZB.MOM.WW.LmxProxy.Host.Configuration
{
/// <summary>
/// Validates LmxProxy configuration settings on startup.
/// </summary>
public static class ConfigurationValidator
{
private static readonly ILogger Logger = Log.ForContext(typeof(ConfigurationValidator));
/// <summary>
/// Validates the provided configuration and returns a list of validation errors.
/// </summary>
/// <param name="configuration">The configuration to validate.</param>
/// <returns>A list of validation error messages. Empty if configuration is valid.</returns>
public static List<string> Validate(LmxProxyConfiguration configuration)
{
var errors = new List<string>();
if (configuration == null)
{
errors.Add("Configuration is null");
return errors;
}
// Validate gRPC port
if (configuration.GrpcPort <= 0 || configuration.GrpcPort > 65535)
{
errors.Add($"Invalid gRPC port: {configuration.GrpcPort}. Must be between 1 and 65535.");
}
// Validate API key configuration file
if (string.IsNullOrWhiteSpace(configuration.ApiKeyConfigFile))
{
errors.Add("API key configuration file path is not specified.");
}
// Validate Connection settings
if (configuration.Connection != null)
{
ValidateConnectionConfiguration(configuration.Connection, errors);
}
else
{
errors.Add("Connection configuration is missing.");
}
// Validate Subscription settings
if (configuration.Subscription != null)
{
ValidateSubscriptionConfiguration(configuration.Subscription, errors);
}
// Validate Service Recovery settings
if (configuration.ServiceRecovery != null)
{
ValidateServiceRecoveryConfiguration(configuration.ServiceRecovery, errors);
}
// Validate TLS settings
if (configuration.Tls != null)
{
if (!configuration.Tls.Validate())
{
errors.Add("TLS configuration validation failed. Check the logs for details.");
}
}
return errors;
}
private static void ValidateConnectionConfiguration(ConnectionConfiguration config, List<string> errors)
{
if (config.MonitorIntervalSeconds <= 0)
{
errors.Add(
$"Invalid monitor interval: {config.MonitorIntervalSeconds} seconds. Must be greater than 0.");
}
if (config.ConnectionTimeoutSeconds <= 0)
{
errors.Add(
$"Invalid connection timeout: {config.ConnectionTimeoutSeconds} seconds. Must be greater than 0.");
}
if (config.ReadTimeoutSeconds <= 0)
{
errors.Add($"Invalid read timeout: {config.ReadTimeoutSeconds} seconds. Must be greater than 0.");
}
if (config.WriteTimeoutSeconds <= 0)
{
errors.Add($"Invalid write timeout: {config.WriteTimeoutSeconds} seconds. Must be greater than 0.");
}
if (config.MaxConcurrentOperations.HasValue && config.MaxConcurrentOperations.Value <= 0)
{
errors.Add(
$"Invalid max concurrent operations: {config.MaxConcurrentOperations}. Must be greater than 0.");
}
// Validate node and galaxy names if provided
if (!string.IsNullOrWhiteSpace(config.NodeName) && config.NodeName?.Length > 255)
{
errors.Add($"Node name is too long: {config.NodeName.Length} characters. Maximum is 255.");
}
if (!string.IsNullOrWhiteSpace(config.GalaxyName) && config.GalaxyName?.Length > 255)
{
errors.Add($"Galaxy name is too long: {config.GalaxyName.Length} characters. Maximum is 255.");
}
}
private static void ValidateSubscriptionConfiguration(SubscriptionConfiguration config, List<string> errors)
{
if (config.ChannelCapacity <= 0)
{
errors.Add($"Invalid channel capacity: {config.ChannelCapacity}. Must be greater than 0.");
}
if (config.ChannelCapacity > 100000)
{
errors.Add($"Channel capacity too large: {config.ChannelCapacity}. Maximum is 100000.");
}
string[] validChannelModes = { "DropOldest", "DropNewest", "Wait" };
if (!validChannelModes.Contains(config.ChannelFullMode))
{
errors.Add(
$"Invalid channel full mode: {config.ChannelFullMode}. Valid values are: {string.Join(", ", validChannelModes)}");
}
}
private static void ValidateServiceRecoveryConfiguration(ServiceRecoveryConfiguration config,
List<string> errors)
{
if (config.FirstFailureDelayMinutes < 0)
{
errors.Add(
$"Invalid first failure delay: {config.FirstFailureDelayMinutes} minutes. Must be 0 or greater.");
}
if (config.SecondFailureDelayMinutes < 0)
{
errors.Add(
$"Invalid second failure delay: {config.SecondFailureDelayMinutes} minutes. Must be 0 or greater.");
}
if (config.SubsequentFailureDelayMinutes < 0)
{
errors.Add(
$"Invalid subsequent failure delay: {config.SubsequentFailureDelayMinutes} minutes. Must be 0 or greater.");
}
if (config.ResetPeriodDays <= 0)
{
errors.Add($"Invalid reset period: {config.ResetPeriodDays} days. Must be greater than 0.");
}
}
/// <summary>
/// Logs validation results and returns whether the configuration is valid.
/// </summary>
/// <param name="configuration">The configuration to validate.</param>
/// <returns>True if configuration is valid; otherwise, false.</returns>
public static bool ValidateAndLog(LmxProxyConfiguration configuration)
{
List<string> errors = Validate(configuration);
if (errors.Any())
{
Logger.Error("Configuration validation failed with {ErrorCount} errors:", errors.Count);
foreach (string error in errors)
{
Logger.Error(" - {ValidationError}", error);
}
return false;
}
Logger.Information("Configuration validation successful");
return true;
}
/// <summary>
/// Throws an exception if the configuration is invalid.
/// </summary>
/// <param name="configuration">The configuration to validate.</param>
/// <exception cref="InvalidOperationException">Thrown when configuration is invalid.</exception>
public static void ValidateOrThrow(LmxProxyConfiguration configuration)
{
List<string> errors = Validate(configuration);
if (errors.Any())
{
string message = $"Configuration validation failed with {errors.Count} error(s):\n" +
string.Join("\n", errors.Select(e => $" - {e}"));
throw new InvalidOperationException(message);
}
}
}
}

@@ -0,0 +1,110 @@
namespace ZB.MOM.WW.LmxProxy.Host.Configuration
{
/// <summary>
/// Configuration settings for LmxProxy service
/// </summary>
public class LmxProxyConfiguration
{
/// <summary>
/// gRPC server port
/// </summary>
public int GrpcPort { get; set; } = 50051;
/// <summary>
/// Subscription management settings
/// </summary>
public SubscriptionConfiguration Subscription { get; set; } = new();
/// <summary>
/// Windows service recovery settings
/// </summary>
public ServiceRecoveryConfiguration ServiceRecovery { get; set; } = new();
/// <summary>
/// API key configuration file path
/// </summary>
public string ApiKeyConfigFile { get; set; } = "apikeys.json";
/// <summary>
/// MxAccess connection settings
/// </summary>
public ConnectionConfiguration Connection { get; set; } = new();
/// <summary>
/// TLS/SSL configuration for secure gRPC communication
/// </summary>
public TlsConfiguration Tls { get; set; } = new();
/// <summary>
/// Web server configuration for status display
/// </summary>
public WebServerConfiguration WebServer { get; set; } = new();
}
/// <summary>
/// Configuration for MxAccess connection monitoring and reconnection
/// </summary>
public class ConnectionConfiguration
{
/// <summary>
/// Interval in seconds between connection health checks
/// </summary>
public int MonitorIntervalSeconds { get; set; } = 5;
/// <summary>
/// Timeout in seconds for initial connection attempts
/// </summary>
public int ConnectionTimeoutSeconds { get; set; } = 30;
/// <summary>
/// Whether to automatically reconnect when connection is lost
/// </summary>
public bool AutoReconnect { get; set; } = true;
/// <summary>
/// Timeout in seconds for read operations
/// </summary>
public int ReadTimeoutSeconds { get; set; } = 5;
/// <summary>
/// Timeout in seconds for write operations
/// </summary>
public int WriteTimeoutSeconds { get; set; } = 5;
/// <summary>
/// Maximum number of concurrent read/write operations allowed
/// </summary>
public int? MaxConcurrentOperations { get; set; } = 10;
/// <summary>
/// Name of the node to connect to (optional)
/// </summary>
public string? NodeName { get; set; }
/// <summary>
/// Name of the galaxy to connect to (optional)
/// </summary>
public string? GalaxyName { get; set; }
}
/// <summary>
/// Configuration for web server that displays status information
/// </summary>
public class WebServerConfiguration
{
/// <summary>
/// Whether the web server is enabled
/// </summary>
public bool Enabled { get; set; } = true;
/// <summary>
/// Port number for the web server
/// </summary>
public int Port { get; set; } = 8080;
/// <summary>
/// Prefix URL for the web server (default: http://+:{Port}/)
/// </summary>
public string? Prefix { get; set; }
}
}

@@ -0,0 +1,28 @@
namespace ZB.MOM.WW.LmxProxy.Host.Configuration
{
/// <summary>
/// Configuration for Windows service recovery
/// </summary>
public class ServiceRecoveryConfiguration
{
/// <summary>
/// Minutes to wait before restart on first failure
/// </summary>
public int FirstFailureDelayMinutes { get; set; } = 1;
/// <summary>
/// Minutes to wait before restart on second failure
/// </summary>
public int SecondFailureDelayMinutes { get; set; } = 5;
/// <summary>
/// Minutes to wait before restart on subsequent failures
/// </summary>
public int SubsequentFailureDelayMinutes { get; set; } = 10;
/// <summary>
/// Days before resetting the failure count
/// </summary>
public int ResetPeriodDays { get; set; } = 1;
}
}

@@ -0,0 +1,18 @@
namespace ZB.MOM.WW.LmxProxy.Host.Configuration
{
/// <summary>
/// Configuration for subscription management
/// </summary>
public class SubscriptionConfiguration
{
/// <summary>
/// Buffer size for each client's channel (number of messages)
/// </summary>
public int ChannelCapacity { get; set; } = 1000;
/// <summary>
/// Strategy when channel buffer is full: "DropOldest", "DropNewest", or "Wait"
/// </summary>
public string ChannelFullMode { get; set; } = "DropOldest";
}
}
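
The three ChannelFullMode strings line up with members of BoundedChannelFullMode in System.Threading.Channels. How the host actually builds its per-client channel is not shown in this diff, so the following is a sketch under that assumption; ChannelFullModeSketch and its methods are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelFullModeSketch
{
    // Assumption: each configured string maps onto the enum member of the same name.
    public static Channel<int> Create(int capacity, string channelFullMode)
    {
        BoundedChannelFullMode mode = channelFullMode switch
        {
            "DropOldest" => BoundedChannelFullMode.DropOldest,
            "DropNewest" => BoundedChannelFullMode.DropNewest,
            "Wait" => BoundedChannelFullMode.Wait,
            _ => throw new ArgumentException($"Unknown channel full mode: {channelFullMode}")
        };

        return Channel.CreateBounded<int>(new BoundedChannelOptions(capacity) { FullMode = mode });
    }

    // Writes three items into a two-slot DropOldest channel and drains it.
    public static async Task<List<int>> DemoAsync()
    {
        Channel<int> channel = Create(2, "DropOldest");

        for (int i = 1; i <= 3; i++)
        {
            // With DropOldest the write succeeds; the oldest buffered item is evicted instead.
            channel.Writer.TryWrite(i);
        }

        channel.Writer.Complete();

        var received = new List<int>();
        await foreach (int item in channel.Reader.ReadAllAsync())
        {
            received.Add(item);
        }

        return received; // item 1 was evicted, leaving 2 and 3
    }
}
```

DropOldest, the default above, favours fresh values over completeness, while "Wait" applies backpressure to the producer instead of discarding anything.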

@@ -0,0 +1,90 @@
using System.IO;
using Serilog;
namespace ZB.MOM.WW.LmxProxy.Host.Configuration
{
/// <summary>
/// Configuration for TLS/SSL settings for secure gRPC communication
/// </summary>
public class TlsConfiguration
{
/// <summary>
/// Gets or sets whether TLS is enabled for gRPC communication
/// </summary>
public bool Enabled { get; set; } = false;
/// <summary>
/// Gets or sets the path to the server certificate file (.pem or .crt)
/// </summary>
public string ServerCertificatePath { get; set; } = string.Empty;
/// <summary>
/// Gets or sets the path to the server private key file (.key)
/// </summary>
public string ServerKeyPath { get; set; } = string.Empty;
/// <summary>
/// Gets or sets the path to the certificate authority file for client certificate validation (optional)
/// </summary>
public string? ClientCaCertificatePath { get; set; }
/// <summary>
/// Gets or sets whether to require client certificates for mutual TLS
/// </summary>
public bool RequireClientCertificate { get; set; } = false;
/// <summary>
/// Gets or sets whether to check certificate revocation
/// </summary>
public bool CheckCertificateRevocation { get; set; } = true;
/// <summary>
/// Validates the TLS configuration
/// </summary>
/// <returns>True if configuration is valid, false otherwise</returns>
public bool Validate()
{
if (!Enabled)
{
return true; // No validation needed if TLS is disabled
}
if (string.IsNullOrWhiteSpace(ServerCertificatePath))
{
Log.Error("TLS is enabled but ServerCertificatePath is not configured");
return false;
}
if (string.IsNullOrWhiteSpace(ServerKeyPath))
{
Log.Error("TLS is enabled but ServerKeyPath is not configured");
return false;
}
if (!File.Exists(ServerCertificatePath))
{
Log.Warning("Server certificate file not found: {Path} - will be auto-generated on startup",
ServerCertificatePath);
}
if (!File.Exists(ServerKeyPath))
{
Log.Warning("Server key file not found: {Path} - will be auto-generated on startup", ServerKeyPath);
}
if (RequireClientCertificate && string.IsNullOrWhiteSpace(ClientCaCertificatePath))
{
Log.Error("Client certificate is required but ClientCaCertificatePath is not configured");
return false;
}
if (!string.IsNullOrWhiteSpace(ClientCaCertificatePath) && !File.Exists(ClientCaCertificatePath))
{
Log.Warning("Client CA certificate file not found: {Path} - will be auto-generated on startup",
ClientCaCertificatePath);
}
return true;
}
}
}

@@ -0,0 +1,23 @@
namespace ZB.MOM.WW.LmxProxy.Host.Domain
{
/// <summary>
/// Per-client subscription statistics.
/// </summary>
public class ClientStats
{
/// <summary>
/// Gets or sets the number of tags the client is subscribed to.
/// </summary>
public int SubscribedTags { get; set; }
/// <summary>
/// Gets or sets the number of delivered messages.
/// </summary>
public long DeliveredMessages { get; set; }
/// <summary>
/// Gets or sets the number of dropped messages.
/// </summary>
public long DroppedMessages { get; set; }
}
}

@@ -0,0 +1,38 @@
namespace ZB.MOM.WW.LmxProxy.Host.Domain
{
/// <summary>
/// Represents the state of a SCADA client connection.
/// </summary>
public enum ConnectionState
{
/// <summary>
/// The client is disconnected.
/// </summary>
Disconnected,
/// <summary>
/// The client is in the process of connecting.
/// </summary>
Connecting,
/// <summary>
/// The client is connected.
/// </summary>
Connected,
/// <summary>
/// The client is in the process of disconnecting.
/// </summary>
Disconnecting,
/// <summary>
/// The client encountered an error.
/// </summary>
Error,
/// <summary>
/// The client is reconnecting after a connection loss.
/// </summary>
Reconnecting
}
}

@@ -0,0 +1,45 @@
using System;
namespace ZB.MOM.WW.LmxProxy.Host.Domain
{
/// <summary>
/// Event arguments for SCADA client connection state changes.
/// </summary>
public class ConnectionStateChangedEventArgs : EventArgs
{
/// <summary>
/// Initializes a new instance of the <see cref="ConnectionStateChangedEventArgs" /> class.
/// </summary>
/// <param name="previousState">The previous connection state.</param>
/// <param name="currentState">The current connection state.</param>
/// <param name="message">Optional message providing additional information about the state change.</param>
public ConnectionStateChangedEventArgs(ConnectionState previousState, ConnectionState currentState,
string? message = null)
{
PreviousState = previousState;
CurrentState = currentState;
Timestamp = DateTime.UtcNow;
Message = message;
}
/// <summary>
/// Gets the previous connection state.
/// </summary>
public ConnectionState PreviousState { get; }
/// <summary>
/// Gets the current connection state.
/// </summary>
public ConnectionState CurrentState { get; }
/// <summary>
/// Gets the timestamp when the state change occurred.
/// </summary>
public DateTime Timestamp { get; }
/// <summary>
/// Gets additional information about the state change, such as error messages.
/// </summary>
public string? Message { get; }
}
}

@@ -0,0 +1,104 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace ZB.MOM.WW.LmxProxy.Host.Domain
{
/// <summary>
/// Interface for SCADA system clients.
/// </summary>
public interface IScadaClient : IAsyncDisposable
{
/// <summary>
/// Gets the connection status.
/// </summary>
bool IsConnected { get; }
/// <summary>
/// Gets the current connection state.
/// </summary>
ConnectionState ConnectionState { get; }
/// <summary>
/// Occurs when the connection state changes.
/// </summary>
event EventHandler<ConnectionStateChangedEventArgs> ConnectionStateChanged;
/// <summary>
/// Connects to the SCADA system.
/// </summary>
/// <param name="ct">Cancellation token.</param>
Task ConnectAsync(CancellationToken ct = default);
/// <summary>
/// Disconnects from the SCADA system.
/// </summary>
/// <param name="ct">Cancellation token.</param>
Task DisconnectAsync(CancellationToken ct = default);
/// <summary>
/// Reads a single tag value from the SCADA system.
/// </summary>
/// <param name="address">The tag address.</param>
/// <param name="ct">Cancellation token.</param>
/// <returns>The value, timestamp, and quality.</returns>
Task<Vtq> ReadAsync(string address, CancellationToken ct = default);
/// <summary>
/// Reads multiple tag values from the SCADA system.
/// </summary>
/// <param name="addresses">The tag addresses.</param>
/// <param name="ct">Cancellation token.</param>
/// <returns>Dictionary of address to VTQ values.</returns>
Task<IReadOnlyDictionary<string, Vtq>>
ReadBatchAsync(IEnumerable<string> addresses, CancellationToken ct = default);
/// <summary>
/// Writes a single tag value to the SCADA system.
/// </summary>
/// <param name="address">The tag address.</param>
/// <param name="value">The value to write.</param>
/// <param name="ct">Cancellation token.</param>
Task WriteAsync(string address, object value, CancellationToken ct = default);
/// <summary>
/// Writes multiple tag values to the SCADA system.
/// </summary>
/// <param name="values">Dictionary of address to value.</param>
/// <param name="ct">Cancellation token.</param>
Task WriteBatchAsync(IReadOnlyDictionary<string, object> values, CancellationToken ct = default);
/// <summary>
/// Writes a batch of tag values and a flag tag, then waits for a response tag to
/// equal the expected value.
/// </summary>
/// <param name="values">The regular tag values to write.</param>
/// <param name="flagAddress">The address of the flag tag to write.</param>
/// <param name="flagValue">The value to write to the flag tag.</param>
/// <param name="responseAddress">The address of the response tag to monitor.</param>
/// <param name="responseValue">The expected value of the response tag.</param>
/// <param name="ct">Cancellation token controlling the wait.</param>
/// <returns>
/// <c>true</c> if the response value was observed before cancellation;
/// otherwise <c>false</c>.
/// </returns>
Task<bool> WriteBatchAndWaitAsync(
IReadOnlyDictionary<string, object> values,
string flagAddress,
object flagValue,
string responseAddress,
object responseValue,
CancellationToken ct = default);
/// <summary>
/// Subscribes to value changes for specified addresses.
/// </summary>
/// <param name="addresses">The tag addresses to monitor.</param>
/// <param name="callback">Callback for value changes.</param>
/// <param name="ct">Cancellation token.</param>
/// <returns>Subscription handle for unsubscribing.</returns>
Task<IAsyncDisposable> SubscribeAsync(IEnumerable<string> addresses, Action<string, Vtq> callback,
CancellationToken ct = default);
}
}
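The `WriteBatchAndWaitAsync` contract above (return `true` if the response value is observed before cancellation, otherwise `false`) can be sketched as a small polling helper. This is only an illustration under assumptions: `ResponsePolling`, `WaitForValueAsync`, and the `readAsync` delegate are hypothetical names that stand in for `IScadaClient.ReadAsync`, and the delegate returns just the value part of a VTQ.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ResponsePolling
{
    // Hypothetical helper (not part of the diff): polls 'readAsync' until it returns
    // a value equal to 'expected', or until 'ct' is cancelled.
    public static async Task<bool> WaitForValueAsync(
        Func<CancellationToken, Task<object>> readAsync,
        object expected,
        TimeSpan pollInterval,
        CancellationToken ct)
    {
        try
        {
            while (true)
            {
                ct.ThrowIfCancellationRequested();
                object current = await readAsync(ct).ConfigureAwait(false);
                if (object.Equals(current, expected))
                {
                    return true; // response observed before cancellation
                }
                await Task.Delay(pollInterval, ct).ConfigureAwait(false);
            }
        }
        catch (OperationCanceledException)
        {
            return false; // cancelled (for example by a timeout) before the value appeared
        }
    }
}
```

A caller would typically pass a token from a `CancellationTokenSource` with a timeout, so that the wait is bounded. Note that this sketch compares with `object.Equals`, which is stricter than the string-based comparison the gRPC service uses later in this diff.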

@@ -0,0 +1,124 @@
namespace ZB.MOM.WW.LmxProxy.Host.Domain
{
/// <summary>
/// OPC quality codes mapped to domain-level values.
/// The byte value matches the low-order byte of the OPC DA quality word
/// (QQSSSSLL layout: quality, substatus, and limit bits), so it can be
/// persisted or round-tripped without translation.
/// </summary>
public enum Quality : byte
{
// ─────────────── Bad family (0-31) ───────────────
/// <summary>0x00 Bad [Non-Specific]</summary>
Bad = 0,
/// <summary>0x01 Unknown quality value</summary>
Unknown = 1,
/// <summary>0x04 Bad [Configuration Error]</summary>
Bad_ConfigError = 4,
/// <summary>0x08 Bad [Not Connected]</summary>
Bad_NotConnected = 8,
/// <summary>0x0C Bad [Device Failure]</summary>
Bad_DeviceFailure = 12,
/// <summary>0x10 Bad [Sensor Failure]</summary>
Bad_SensorFailure = 16,
/// <summary>0x14 Bad [Last Known Value]</summary>
Bad_LastKnownValue = 20,
/// <summary>0x18 Bad [Communication Failure]</summary>
Bad_CommFailure = 24,
/// <summary>0x1C Bad [Out of Service]</summary>
Bad_OutOfService = 28,
// ──────────── Uncertain family (64-95) ───────────
/// <summary>0x40 Uncertain [Non-Specific]</summary>
Uncertain = 64,
/// <summary>0x41 Uncertain [Non-Specific] (Low Limited)</summary>
Uncertain_LowLimited = 65,
/// <summary>0x42 Uncertain [Non-Specific] (High Limited)</summary>
Uncertain_HighLimited = 66,
/// <summary>0x43 Uncertain [Non-Specific] (Constant)</summary>
Uncertain_Constant = 67,
/// <summary>0x44 Uncertain [Last Usable]</summary>
Uncertain_LastUsable = 68,
/// <summary>0x45 Uncertain [Last Usable] (Low Limited)</summary>
Uncertain_LastUsable_LL = 69,
/// <summary>0x46 Uncertain [Last Usable] (High Limited)</summary>
Uncertain_LastUsable_HL = 70,
/// <summary>0x47 Uncertain [Last Usable] (Constant)</summary>
Uncertain_LastUsable_Cnst = 71,
/// <summary>0x50 Uncertain [Sensor Not Accurate]</summary>
Uncertain_SensorNotAcc = 80,
/// <summary>0x51 Uncertain [Sensor Not Accurate] (Low Limited)</summary>
Uncertain_SensorNotAcc_LL = 81,
/// <summary>0x52 Uncertain [Sensor Not Accurate] (High Limited)</summary>
Uncertain_SensorNotAcc_HL = 82,
/// <summary>0x53 Uncertain [Sensor Not Accurate] (Constant)</summary>
Uncertain_SensorNotAcc_C = 83,
/// <summary>0x54 Uncertain [EU Exceeded]</summary>
Uncertain_EuExceeded = 84,
/// <summary>0x55 Uncertain [EU Exceeded] (Low Limited)</summary>
Uncertain_EuExceeded_LL = 85,
/// <summary>0x56 Uncertain [EU Exceeded] (High Limited)</summary>
Uncertain_EuExceeded_HL = 86,
/// <summary>0x57 Uncertain [EU Exceeded] (Constant)</summary>
Uncertain_EuExceeded_C = 87,
/// <summary>0x58 Uncertain [Sub-Normal]</summary>
Uncertain_SubNormal = 88,
/// <summary>0x59 Uncertain [Sub-Normal] (Low Limited)</summary>
Uncertain_SubNormal_LL = 89,
/// <summary>0x5A Uncertain [Sub-Normal] (High Limited)</summary>
Uncertain_SubNormal_HL = 90,
/// <summary>0x5B Uncertain [Sub-Normal] (Constant)</summary>
Uncertain_SubNormal_C = 91,
// ─────────────── Good family (192-219) ────────────
/// <summary>0xC0 Good [Non-Specific]</summary>
Good = 192,
/// <summary>0xC1 Good [Non-Specific] (Low Limited)</summary>
Good_LowLimited = 193,
/// <summary>0xC2 Good [Non-Specific] (High Limited)</summary>
Good_HighLimited = 194,
/// <summary>0xC3 Good [Non-Specific] (Constant)</summary>
Good_Constant = 195,
/// <summary>0xD8 Good [Local Override]</summary>
Good_LocalOverride = 216,
/// <summary>0xD9 Good [Local Override] (Low Limited)</summary>
Good_LocalOverride_LL = 217,
/// <summary>0xDA Good [Local Override] (High Limited)</summary>
Good_LocalOverride_HL = 218,
/// <summary>0xDB Good [Local Override] (Constant)</summary>
Good_LocalOverride_C = 219
}
}
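The values in this enum appear to follow the classic quality bit layout: two quality bits, four substatus bits, and two limit bits. As a rough sketch, the family and limit can be read with bit masks. `QualityFamily`, `Classify`, and `LimitBits` are hypothetical names that are not part of this diff.

```csharp
using System;

public static class QualityFamily
{
    // Hypothetical helper: classifies a quality byte by its two most significant bits.
    public static string Classify(byte quality)
    {
        switch (quality & 0xC0)
        {
            case 0xC0: return "Good";
            case 0x40: return "Uncertain";
            case 0x00: return "Bad";
            default: return "Reserved"; // 0x80-0xBF has no member in the enum above
        }
    }

    // Limit bits: 0 = not limited, 1 = low limited, 2 = high limited, 3 = constant.
    public static int LimitBits(byte quality)
    {
        return quality & 0x03;
    }
}
```

The mask-based version differs from a plain `>=` threshold check in one respect: values in the 128 to 191 range are reported as reserved rather than being folded into the Uncertain family. The enum's `Unknown = 1` member is a project-specific value, so its limit bits should not be interpreted.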

@@ -0,0 +1,30 @@
using System.Collections.Generic;
namespace ZB.MOM.WW.LmxProxy.Host.Domain
{
/// <summary>
/// Subscription statistics for all clients and tags.
/// </summary>
public class SubscriptionStats
{
/// <summary>
/// Gets or sets the total number of clients.
/// </summary>
public int TotalClients { get; set; }
/// <summary>
/// Gets or sets the total number of tags.
/// </summary>
public int TotalTags { get; set; }
/// <summary>
/// Gets or sets the mapping of tag addresses to client counts.
/// </summary>
public Dictionary<string, int> TagClientCounts { get; set; } = new();
/// <summary>
/// Gets or sets the mapping of client IDs to their statistics.
/// </summary>
public Dictionary<string, ClientStats> ClientStats { get; set; } = new();
}
}

@@ -0,0 +1,129 @@
using System;
namespace ZB.MOM.WW.LmxProxy.Host.Domain
{
/// <summary>
/// Value, Timestamp, and Quality structure for SCADA data.
/// </summary>
public readonly struct Vtq : IEquatable<Vtq>
{
/// <summary>
/// Gets the value.
/// </summary>
public object? Value { get; }
/// <summary>
/// Gets the timestamp when the value was read.
/// </summary>
public DateTime Timestamp { get; }
/// <summary>
/// Gets the quality of the value.
/// </summary>
public Quality Quality { get; }
/// <summary>
/// Initializes a new instance of the <see cref="Vtq" /> struct.
/// </summary>
/// <param name="value">The value.</param>
/// <param name="timestamp">The timestamp when the value was read.</param>
/// <param name="quality">The quality of the value.</param>
public Vtq(object? value, DateTime timestamp, Quality quality)
{
Value = value;
Timestamp = timestamp;
Quality = quality;
}
/// <summary>
/// Creates a new <see cref="Vtq" /> instance with the specified value and quality, using the current UTC timestamp.
/// </summary>
/// <param name="value">The value.</param>
/// <param name="quality">The quality of the value.</param>
/// <returns>A new <see cref="Vtq" /> instance.</returns>
public static Vtq New(object value, Quality quality) => new(value, DateTime.UtcNow, quality);
/// <summary>
/// Creates a new <see cref="Vtq" /> instance with the specified value, timestamp, and quality.
/// </summary>
/// <param name="value">The value.</param>
/// <param name="timestamp">The timestamp when the value was read.</param>
/// <param name="quality">The quality of the value.</param>
/// <returns>A new <see cref="Vtq" /> instance.</returns>
public static Vtq New(object value, DateTime timestamp, Quality quality) => new(value, timestamp, quality);
/// <summary>
/// Creates a <see cref="Vtq" /> instance with good quality and the current UTC timestamp.
/// </summary>
/// <param name="value">The value.</param>
/// <returns>A new <see cref="Vtq" /> instance with good quality.</returns>
public static Vtq Good(object value) => new(value, DateTime.UtcNow, Quality.Good);
/// <summary>
/// Creates a <see cref="Vtq" /> instance with bad quality and the current UTC timestamp.
/// </summary>
/// <param name="value">The value. Optional.</param>
/// <returns>A new <see cref="Vtq" /> instance with bad quality.</returns>
public static Vtq Bad(object? value = null) => new(value, DateTime.UtcNow, Quality.Bad);
/// <summary>
/// Creates a <see cref="Vtq" /> instance with uncertain quality and the current UTC timestamp.
/// </summary>
/// <param name="value">The value.</param>
/// <returns>A new <see cref="Vtq" /> instance with uncertain quality.</returns>
public static Vtq Uncertain(object value) => new(value, DateTime.UtcNow, Quality.Uncertain);
/// <summary>
/// Determines whether the specified <see cref="Vtq" /> is equal to the current <see cref="Vtq" />.
/// </summary>
/// <param name="other">The <see cref="Vtq" /> to compare with the current <see cref="Vtq" />.</param>
/// <returns>true if the specified <see cref="Vtq" /> is equal to the current <see cref="Vtq" />; otherwise, false.</returns>
public bool Equals(Vtq other) =>
Equals(Value, other.Value) && Timestamp.Equals(other.Timestamp) && Quality == other.Quality;
/// <summary>
/// Determines whether the specified object is equal to the current <see cref="Vtq" />.
/// </summary>
/// <param name="obj">The object to compare with the current <see cref="Vtq" />.</param>
/// <returns>true if the specified object is equal to the current <see cref="Vtq" />; otherwise, false.</returns>
public override bool Equals(object? obj) => obj is Vtq other && Equals(other);
/// <summary>
/// Returns the hash code for this instance.
/// </summary>
/// <returns>A 32-bit signed integer hash code.</returns>
public override int GetHashCode()
{
unchecked
{
int hashCode = Value != null ? Value.GetHashCode() : 0;
hashCode = (hashCode * 397) ^ Timestamp.GetHashCode();
hashCode = (hashCode * 397) ^ (int)Quality;
return hashCode;
}
}
/// <summary>
/// Returns a string that represents the current object.
/// </summary>
/// <returns>A string that represents the current object.</returns>
public override string ToString() =>
$"{{Value={Value}, Timestamp={Timestamp:yyyy-MM-dd HH:mm:ss.fff}, Quality={Quality}}}";
/// <summary>
/// Determines whether two specified instances of <see cref="Vtq" /> are equal.
/// </summary>
/// <param name="left">The first <see cref="Vtq" /> to compare.</param>
/// <param name="right">The second <see cref="Vtq" /> to compare.</param>
/// <returns>true if left and right are equal; otherwise, false.</returns>
public static bool operator ==(Vtq left, Vtq right) => left.Equals(right);
/// <summary>
/// Determines whether two specified instances of <see cref="Vtq" /> are not equal.
/// </summary>
/// <param name="left">The first <see cref="Vtq" /> to compare.</param>
/// <param name="right">The second <see cref="Vtq" /> to compare.</param>
/// <returns>true if left and right are not equal; otherwise, false.</returns>
public static bool operator !=(Vtq left, Vtq right) => !left.Equals(right);
}
}
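One consequence of `Vtq.Equals` comparing `Value` through `object.Equals` is that boxed numbers of different types are never equal, even when they are numerically identical. The sketch below isolates that behaviour. `BoxedEqualityDemo` and `SameBoxedValue` are hypothetical names used only for illustration.

```csharp
using System;

public static class BoxedEqualityDemo
{
    // Same static call that Vtq.Equals applies to its Value member.
    public static bool SameBoxedValue(object a, object b)
    {
        return object.Equals(a, b);
    }
}
```

With this helper, `SameBoxedValue(1, 1)` is true, while `SameBoxedValue(1, 1L)` and `SameBoxedValue(1.0, 1)` are both false. Callers that need numeric equality across types would have to normalize the values first.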

@@ -1,6 +1,6 @@
syntax = "proto3";
-option csharp_namespace = "LmxFakeProxy.Grpc";
+option csharp_namespace = "ZB.MOM.WW.LmxProxy.Host.Grpc";
package scada;

@@ -0,0 +1,804 @@
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.Json;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Grpc.Core;
using Serilog;
using ZB.MOM.WW.LmxProxy.Host.Domain;
using ZB.MOM.WW.LmxProxy.Host.Security;
using ZB.MOM.WW.LmxProxy.Host.Services;
using ZB.MOM.WW.LmxProxy.Host.Grpc;
namespace ZB.MOM.WW.LmxProxy.Host.Grpc.Services
{
/// <summary>
/// gRPC service implementation for SCADA operations.
/// Provides methods for connecting, reading, writing, batch operations, and subscriptions.
/// </summary>
public class ScadaGrpcService : ScadaService.ScadaServiceBase
{
private static readonly ILogger Logger = Log.ForContext<ScadaGrpcService>();
private readonly PerformanceMetrics _performanceMetrics;
private readonly IScadaClient _scadaClient;
private readonly SessionManager _sessionManager;
private readonly SubscriptionManager _subscriptionManager;
/// <summary>
/// Initializes a new instance of the <see cref="ScadaGrpcService" /> class.
/// </summary>
/// <param name="scadaClient">The SCADA client instance.</param>
/// <param name="subscriptionManager">The subscription manager instance.</param>
/// <param name="sessionManager">The session manager instance.</param>
/// <param name="performanceMetrics">Optional performance metrics service for tracking operations.</param>
/// <exception cref="ArgumentNullException">Thrown if any required argument is null.</exception>
public ScadaGrpcService(
IScadaClient scadaClient,
SubscriptionManager subscriptionManager,
SessionManager sessionManager,
PerformanceMetrics performanceMetrics = null)
{
_scadaClient = scadaClient ?? throw new ArgumentNullException(nameof(scadaClient));
_subscriptionManager = subscriptionManager ?? throw new ArgumentNullException(nameof(subscriptionManager));
_sessionManager = sessionManager ?? throw new ArgumentNullException(nameof(sessionManager));
_performanceMetrics = performanceMetrics;
}
#region Connection Management
/// <summary>
/// Creates a new session for a client.
/// The MxAccess connection is managed separately at server startup.
/// </summary>
/// <param name="request">The connection request with client ID and API key.</param>
/// <param name="context">The gRPC server call context.</param>
/// <returns>A <see cref="ConnectResponse" /> with session ID.</returns>
public override Task<ConnectResponse> Connect(ConnectRequest request, ServerCallContext context)
{
try
{
Logger.Information("Connect request from {Peer} - ClientId: {ClientId}",
context.Peer, request.ClientId);
// Validate that MxAccess is connected
if (!_scadaClient.IsConnected)
{
return Task.FromResult(new ConnectResponse
{
Success = false,
Message = "SCADA server is not connected to MxAccess",
SessionId = string.Empty
});
}
// Create a new session
var sessionId = _sessionManager.CreateSession(request.ClientId, request.ApiKey);
return Task.FromResult(new ConnectResponse
{
Success = true,
Message = "Session created successfully",
SessionId = sessionId
});
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to create session for client {ClientId}", request.ClientId);
return Task.FromResult(new ConnectResponse
{
Success = false,
Message = ex.Message,
SessionId = string.Empty
});
}
}
/// <summary>
/// Terminates a client session.
/// </summary>
/// <param name="request">The disconnect request with session ID.</param>
/// <param name="context">The gRPC server call context.</param>
/// <returns>A <see cref="DisconnectResponse" /> indicating success or failure.</returns>
public override Task<DisconnectResponse> Disconnect(DisconnectRequest request, ServerCallContext context)
{
try
{
Logger.Information("Disconnect request from {Peer} - SessionId: {SessionId}",
context.Peer, request.SessionId);
var terminated = _sessionManager.TerminateSession(request.SessionId);
return Task.FromResult(new DisconnectResponse
{
Success = terminated,
Message = terminated ? "Session terminated successfully" : "Session not found"
});
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to disconnect session {SessionId}", request.SessionId);
return Task.FromResult(new DisconnectResponse
{
Success = false,
Message = ex.Message
});
}
}
/// <summary>
/// Gets the connection state for a session.
/// </summary>
/// <param name="request">The connection state request with session ID.</param>
/// <param name="context">The gRPC server call context.</param>
/// <returns>A <see cref="GetConnectionStateResponse" /> with connection details.</returns>
public override Task<GetConnectionStateResponse> GetConnectionState(GetConnectionStateRequest request,
ServerCallContext context)
{
var session = _sessionManager.GetSession(request.SessionId);
if (session == null)
{
return Task.FromResult(new GetConnectionStateResponse
{
IsConnected = false,
ClientId = string.Empty,
ConnectedSinceUtcTicks = 0
});
}
return Task.FromResult(new GetConnectionStateResponse
{
IsConnected = _scadaClient.IsConnected,
ClientId = session.ClientId,
ConnectedSinceUtcTicks = session.ConnectedSinceUtcTicks
});
}
#endregion
#region Read Operations
/// <summary>
/// Reads a single tag value from the SCADA system.
/// </summary>
/// <param name="request">The read request with session ID and tag.</param>
/// <param name="context">The gRPC server call context.</param>
/// <returns>A <see cref="ReadResponse" /> with the VTQ data.</returns>
public override async Task<ReadResponse> Read(ReadRequest request, ServerCallContext context)
{
using (PerformanceMetrics.ITimingScope scope = _performanceMetrics?.BeginOperation("Read"))
{
try
{
// Validate session
if (!_sessionManager.ValidateSession(request.SessionId))
{
return new ReadResponse
{
Success = false,
Message = "Invalid session ID",
Vtq = CreateBadVtqMessage(request.Tag)
};
}
Logger.Debug("Read request from {Peer} for {Tag}", context.Peer, request.Tag);
Vtq vtq = await _scadaClient.ReadAsync(request.Tag, context.CancellationToken);
scope?.SetSuccess(true);
return new ReadResponse
{
Success = true,
Message = string.Empty,
Vtq = ConvertToVtqMessage(request.Tag, vtq)
};
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to read {Tag}", request.Tag);
scope?.SetSuccess(false);
return new ReadResponse
{
Success = false,
Message = ex.Message,
Vtq = CreateBadVtqMessage(request.Tag)
};
}
}
}
/// <summary>
/// Reads multiple tag values from the SCADA system.
/// </summary>
/// <param name="request">The batch read request with session ID and tags.</param>
/// <param name="context">The gRPC server call context.</param>
/// <returns>A <see cref="ReadBatchResponse" /> with VTQ data for each tag.</returns>
public override async Task<ReadBatchResponse> ReadBatch(ReadBatchRequest request, ServerCallContext context)
{
using (PerformanceMetrics.ITimingScope scope = _performanceMetrics?.BeginOperation("ReadBatch"))
{
try
{
// Validate session
if (!_sessionManager.ValidateSession(request.SessionId))
{
var badResponse = new ReadBatchResponse
{
Success = false,
Message = "Invalid session ID"
};
foreach (var tag in request.Tags)
{
badResponse.Vtqs.Add(CreateBadVtqMessage(tag));
}
return badResponse;
}
Logger.Debug("ReadBatch request from {Peer} for {Count} tags", context.Peer, request.Tags.Count);
IReadOnlyDictionary<string, Vtq> results =
await _scadaClient.ReadBatchAsync(request.Tags, context.CancellationToken);
var response = new ReadBatchResponse
{
Success = true,
Message = string.Empty
};
// Return results in the same order as the request tags
foreach (var tag in request.Tags)
{
if (results.TryGetValue(tag, out Vtq vtq))
{
response.Vtqs.Add(ConvertToVtqMessage(tag, vtq));
}
else
{
response.Vtqs.Add(CreateBadVtqMessage(tag));
}
}
scope?.SetSuccess(true);
return response;
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to read batch");
scope?.SetSuccess(false);
var response = new ReadBatchResponse
{
Success = false,
Message = ex.Message
};
foreach (var tag in request.Tags)
{
response.Vtqs.Add(CreateBadVtqMessage(tag));
}
return response;
}
}
}
#endregion
#region Write Operations
/// <summary>
/// Writes a single tag value to the SCADA system.
/// </summary>
/// <param name="request">The write request with session ID, tag, and value.</param>
/// <param name="context">The gRPC server call context.</param>
/// <returns>A <see cref="WriteResponse" /> indicating success or failure.</returns>
public override async Task<WriteResponse> Write(WriteRequest request, ServerCallContext context)
{
using (PerformanceMetrics.ITimingScope scope = _performanceMetrics?.BeginOperation("Write"))
{
try
{
// Validate session
if (!_sessionManager.ValidateSession(request.SessionId))
{
return new WriteResponse
{
Success = false,
Message = "Invalid session ID"
};
}
Logger.Debug("Write request from {Peer} for {Tag}", context.Peer, request.Tag);
// Parse the string value to an appropriate type
var value = ParseValue(request.Value);
await _scadaClient.WriteAsync(request.Tag, value, context.CancellationToken);
scope?.SetSuccess(true);
return new WriteResponse
{
Success = true,
Message = string.Empty
};
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to write to {Tag}", request.Tag);
scope?.SetSuccess(false);
return new WriteResponse
{
Success = false,
Message = ex.Message
};
}
}
}
/// <summary>
/// Writes multiple tag values to the SCADA system.
/// </summary>
/// <param name="request">The batch write request with session ID and items.</param>
/// <param name="context">The gRPC server call context.</param>
/// <returns>A <see cref="WriteBatchResponse" /> with results for each tag.</returns>
public override async Task<WriteBatchResponse> WriteBatch(WriteBatchRequest request, ServerCallContext context)
{
using (PerformanceMetrics.ITimingScope scope = _performanceMetrics?.BeginOperation("WriteBatch"))
{
try
{
// Validate session
if (!_sessionManager.ValidateSession(request.SessionId))
{
var badResponse = new WriteBatchResponse
{
Success = false,
Message = "Invalid session ID"
};
foreach (var item in request.Items)
{
badResponse.Results.Add(new WriteResult
{
Tag = item.Tag,
Success = false,
Message = "Invalid session ID"
});
}
return badResponse;
}
Logger.Debug("WriteBatch request from {Peer} for {Count} items", context.Peer, request.Items.Count);
var values = new Dictionary<string, object>();
foreach (var item in request.Items)
{
values[item.Tag] = ParseValue(item.Value);
}
await _scadaClient.WriteBatchAsync(values, context.CancellationToken);
scope?.SetSuccess(true);
var response = new WriteBatchResponse
{
Success = true,
Message = string.Empty
};
foreach (var item in request.Items)
{
response.Results.Add(new WriteResult
{
Tag = item.Tag,
Success = true,
Message = string.Empty
});
}
return response;
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to write batch");
scope?.SetSuccess(false);
var response = new WriteBatchResponse
{
Success = false,
Message = ex.Message
};
foreach (var item in request.Items)
{
response.Results.Add(new WriteResult
{
Tag = item.Tag,
Success = false,
Message = ex.Message
});
}
return response;
}
}
}
/// <summary>
/// Writes a batch of tag values and waits for a flag tag to reach a specific value.
/// </summary>
/// <param name="request">The batch write and wait request.</param>
/// <param name="context">The gRPC server call context.</param>
/// <returns>A <see cref="WriteBatchAndWaitResponse" /> with results and flag status.</returns>
public override async Task<WriteBatchAndWaitResponse> WriteBatchAndWait(WriteBatchAndWaitRequest request,
ServerCallContext context)
{
var startTime = DateTime.UtcNow;
try
{
// Validate session
if (!_sessionManager.ValidateSession(request.SessionId))
{
var badResponse = new WriteBatchAndWaitResponse
{
Success = false,
Message = "Invalid session ID",
FlagReached = false,
ElapsedMs = 0
};
foreach (var item in request.Items)
{
badResponse.WriteResults.Add(new WriteResult
{
Tag = item.Tag,
Success = false,
Message = "Invalid session ID"
});
}
return badResponse;
}
Logger.Debug("WriteBatchAndWait request from {Peer}", context.Peer);
var values = new Dictionary<string, object>();
foreach (var item in request.Items)
{
values[item.Tag] = ParseValue(item.Value);
}
var flagValue = ParseValue(request.FlagValue);
var pollInterval = request.PollIntervalMs > 0 ? request.PollIntervalMs : 100;
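// Bound the whole operation (batch write plus polling) by the client-supplied timeout.
// Note: a TimeoutMs of 0 cancels almost immediately, and a negative value other than -1
// makes CancelAfter throw ArgumentOutOfRangeException.
using var cts = CancellationTokenSource.CreateLinkedTokenSource(context.CancellationToken);
cts.CancelAfter(TimeSpan.FromMilliseconds(request.TimeoutMs));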
// Write the batch first
await _scadaClient.WriteBatchAsync(values, cts.Token);
// Poll for the flag value
var flagReached = false;
while (!cts.Token.IsCancellationRequested)
{
try
{
var flagVtq = await _scadaClient.ReadAsync(request.FlagTag, cts.Token);
if (flagVtq.Value != null && AreValuesEqual(flagVtq.Value, flagValue))
{
flagReached = true;
break;
}
await Task.Delay(pollInterval, cts.Token);
}
catch (OperationCanceledException)
{
break;
}
}
var elapsedMs = (int)(DateTime.UtcNow - startTime).TotalMilliseconds;
var response = new WriteBatchAndWaitResponse
{
Success = true,
Message = string.Empty,
FlagReached = flagReached,
ElapsedMs = elapsedMs
};
foreach (var item in request.Items)
{
response.WriteResults.Add(new WriteResult
{
Tag = item.Tag,
Success = true,
Message = string.Empty
});
}
return response;
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to write batch and wait");
var elapsedMs = (int)(DateTime.UtcNow - startTime).TotalMilliseconds;
var response = new WriteBatchAndWaitResponse
{
Success = false,
Message = ex.Message,
FlagReached = false,
ElapsedMs = elapsedMs
};
foreach (var item in request.Items)
{
response.WriteResults.Add(new WriteResult
{
Tag = item.Tag,
Success = false,
Message = ex.Message
});
}
return response;
}
}
#endregion
#region Subscription Operations
/// <summary>
/// Subscribes to value changes for specified tags and streams updates to the client.
/// </summary>
/// <param name="request">The subscribe request with session ID and tags.</param>
/// <param name="responseStream">The server stream writer for VTQ updates.</param>
/// <param name="context">The gRPC server call context.</param>
/// <returns>A task representing the asynchronous operation.</returns>
public override async Task Subscribe(SubscribeRequest request,
IServerStreamWriter<VtqMessage> responseStream, ServerCallContext context)
{
// Validate session
if (!_sessionManager.ValidateSession(request.SessionId))
{
Logger.Warning("Subscribe failed: Invalid session ID {SessionId}", request.SessionId);
throw new RpcException(new Status(StatusCode.Unauthenticated, "Invalid session ID"));
}
var clientId = Guid.NewGuid().ToString();
try
{
Logger.Information("Subscribe request from {Peer} with client ID {ClientId} for {Count} tags",
context.Peer, clientId, request.Tags.Count);
Channel<(string address, Vtq vtq)> channel = await _subscriptionManager.SubscribeAsync(
clientId,
request.Tags,
context.CancellationToken);
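// Stream updates until the client cancels or the channel is completed.
// WaitToReadAsync returns false once the channel completes, so no outer retry loop
// is needed (retrying on a completed channel would spin without yielding data).
try
{
while (await channel.Reader.WaitToReadAsync(context.CancellationToken))
{
// Drain everything that is currently buffered before waiting again
while (channel.Reader.TryRead(out (string address, Vtq vtq) item))
{
var vtqMessage = ConvertToVtqMessage(item.address, item.vtq);
await responseStream.WriteAsync(vtqMessage);
}
}
}
catch (OperationCanceledException)
{
// Client cancelled the stream; cleanup happens in the finally block below
}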
}
catch (OperationCanceledException)
{
Logger.Information("Subscription cancelled for client {ClientId}", clientId);
}
catch (Exception ex)
{
Logger.Error(ex, "Error in subscription for client {ClientId}", clientId);
throw;
}
finally
{
_subscriptionManager.UnsubscribeClient(clientId);
}
}
#endregion
#region Authentication
/// <summary>
/// Checks the validity of an API key.
/// </summary>
/// <param name="request">The API key check request.</param>
/// <param name="context">The gRPC server call context.</param>
/// <returns>A <see cref="CheckApiKeyResponse" /> with validity and details.</returns>
public override Task<CheckApiKeyResponse> CheckApiKey(CheckApiKeyRequest request, ServerCallContext context)
{
var response = new CheckApiKeyResponse
{
IsValid = false,
Message = "API key validation failed"
};
// Check if API key was validated by interceptor
if (context.UserState.TryGetValue("ApiKey", out object apiKeyObj) && apiKeyObj is ApiKey apiKey)
{
response.IsValid = apiKey.IsValid();
response.Message = apiKey.IsValid()
? $"API key is valid (Role: {apiKey.Role})"
: "API key is disabled";
Logger.Information("API key check - Valid: {IsValid}, Role: {Role}",
response.IsValid, apiKey.Role);
}
else
{
Logger.Warning("API key check failed - no API key in context");
}
return Task.FromResult(response);
}
#endregion
#region Value Conversion Helpers
/// <summary>
/// Converts a domain <see cref="Vtq" /> to a gRPC <see cref="VtqMessage" />.
/// </summary>
private static VtqMessage ConvertToVtqMessage(string tag, Vtq vtq)
{
return new VtqMessage
{
Tag = tag,
Value = ConvertValueToString(vtq.Value),
TimestampUtcTicks = vtq.Timestamp.Ticks,
Quality = ConvertQualityToString(vtq.Quality)
};
}
/// <summary>
/// Creates a bad quality VTQ message for error cases.
/// </summary>
private static VtqMessage CreateBadVtqMessage(string tag)
{
return new VtqMessage
{
Tag = tag,
Value = string.Empty,
TimestampUtcTicks = DateTime.UtcNow.Ticks,
Quality = "Bad"
};
}
/// <summary>
/// Converts a value to its string representation.
/// </summary>
private static string ConvertValueToString(object value)
{
if (value == null)
{
return string.Empty;
}
return value switch
{
bool b => b.ToString().ToLowerInvariant(),
DateTime dt => dt.ToUniversalTime().ToString("O"),
DateTimeOffset dto => dto.ToString("O"),
float f => f.ToString(CultureInfo.InvariantCulture),
double d => d.ToString(CultureInfo.InvariantCulture),
decimal dec => dec.ToString(CultureInfo.InvariantCulture),
Array => JsonSerializer.Serialize(value, value.GetType()),
_ => value.ToString() ?? string.Empty
};
}
/// <summary>
/// Converts a domain quality value to a string.
/// </summary>
private static string ConvertQualityToString(Domain.Quality quality)
{
// Simplified quality mapping for the new API
var qualityValue = (int)quality;
if (qualityValue >= 192) // Good family
{
return "Good";
}
if (qualityValue >= 64) // Uncertain family
{
return "Uncertain";
}
return "Bad"; // Bad family
}
/// <summary>
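/// Parses a string value to an appropriate .NET type, trying bool, int, long, double,
/// and DateTime in that order and falling back to the original string.
/// A null or empty input yields an empty string.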
/// </summary>
private static object ParseValue(string value)
{
if (string.IsNullOrEmpty(value))
{
return string.Empty;
}
// Try to parse as boolean
if (bool.TryParse(value, out bool boolResult))
{
return boolResult;
}
// Try to parse as integer
if (int.TryParse(value, NumberStyles.Integer, CultureInfo.InvariantCulture, out int intResult))
{
return intResult;
}
// Try to parse as long
if (long.TryParse(value, NumberStyles.Integer, CultureInfo.InvariantCulture, out long longResult))
{
return longResult;
}
// Try to parse as double
if (double.TryParse(value, NumberStyles.Float | NumberStyles.AllowThousands, CultureInfo.InvariantCulture,
out double doubleResult))
{
return doubleResult;
}
// Try to parse as DateTime
if (DateTime.TryParse(value, CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind,
out DateTime dateResult))
{
return dateResult;
}
// Return as string
return value;
}
/// <summary>
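/// Compares two values for equality by their invariant string representations,
/// ignoring case, so that e.g. an int and a long with the same value compare equal.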
/// </summary>
private static bool AreValuesEqual(object value1, object value2)
{
if (value1 == null && value2 == null)
{
return true;
}
if (value1 == null || value2 == null)
{
return false;
}
// Convert both to strings for comparison
var str1 = ConvertValueToString(value1);
var str2 = ConvertValueToString(value2);
return string.Equals(str1, str2, StringComparison.OrdinalIgnoreCase);
}
#endregion
}
}
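The streaming loop in `Subscribe` reads from a `Channel` until the call is cancelled. A minimal sketch of that read pattern follows, assuming only the standard `System.Threading.Channels` API. `ChannelDrain` and `DrainAsync` are hypothetical names, and the sketch collects items into a list instead of writing them to a gRPC stream.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelDrain
{
    // Hypothetical helper: reads items until the writer completes the channel
    // or the token is cancelled (in which case OperationCanceledException is thrown).
    public static async Task<List<T>> DrainAsync<T>(ChannelReader<T> reader, CancellationToken ct)
    {
        var items = new List<T>();

        // WaitToReadAsync returns false once the channel is completed and empty,
        // which ends the loop without any extra retry logic.
        while (await reader.WaitToReadAsync(ct).ConfigureAwait(false))
        {
            // Drain everything that is buffered before waiting again.
            while (reader.TryRead(out T item))
            {
                items.Add(item);
            }
        }

        return items;
    }
}
```

The inner `while` around `TryRead` matters when several items arrive between two waits: reading only one item per wait still works, but it costs an extra `WaitToReadAsync` call for each buffered item.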

@@ -0,0 +1,298 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using ZB.MOM.WW.LmxProxy.Host.Domain;
namespace ZB.MOM.WW.LmxProxy.Host.Implementation
{
/// <summary>
/// Connection management for MxAccessClient.
/// </summary>
public sealed partial class MxAccessClient
{
/// <summary>
/// Asynchronously connects to the MxAccess server.
/// </summary>
/// <param name="ct">A cancellation token to observe while waiting for the task to complete.</param>
/// <returns>A task that represents the asynchronous connect operation.</returns>
/// <exception cref="ObjectDisposedException">Thrown if the client has been disposed.</exception>
/// <exception cref="InvalidOperationException">Thrown if registration with MxAccess fails.</exception>
/// <exception cref="Exception">Thrown if any other error occurs during connection.</exception>
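public async Task ConnectAsync(CancellationToken ct = default)
{
// Skip when already connected so stored subscriptions are not recreated a second time
if (IsConnected)
{
return;
}
// Task.Run moves the blocking COM calls off the caller's thread. Thread-pool threads
// are MTA, so this does not provide an STA thread.
await Task.Run(ConnectInternal, ct);
// Recreate stored subscriptions after successful connection
await RecreateStoredSubscriptionsAsync();
}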
/// <summary>
/// Asynchronously disconnects from the MxAccess server and cleans up resources.
/// </summary>
/// <param name="ct">A cancellation token to observe while waiting for the task to complete.</param>
/// <returns>A task that represents the asynchronous disconnect operation.</returns>
public async Task DisconnectAsync(CancellationToken ct = default)
{
// Task.Run moves the blocking COM calls off the caller's thread (thread-pool threads are MTA, not STA)
await Task.Run(() => DisconnectInternal(), ct);
}
/// <summary>
/// Internal synchronous connection logic.
/// </summary>
private void ConnectInternal()
{
lock (_lock)
{
ValidateNotDisposed();
if (IsConnected)
{
return;
}
try
{
Logger.Information("Attempting to connect to MxAccess");
SetConnectionState(ConnectionState.Connecting);
InitializeMxAccessConnection();
RegisterWithMxAccess();
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to connect to MxAccess");
Cleanup();
SetConnectionState(ConnectionState.Disconnected, ex.Message);
throw;
}
}
}
/// <summary>
/// Validates that the client has not been disposed.
/// </summary>
private void ValidateNotDisposed()
{
if (_disposed)
{
throw new ObjectDisposedException(nameof(MxAccessClient));
}
}
/// <summary>
/// Initializes the MxAccess COM connection and event handlers.
/// </summary>
private void InitializeMxAccessConnection()
{
// Create the COM object
_lmxProxy = new LMXProxyServer();
// Wire up event handlers
_lmxProxy.OnDataChange += OnDataChange;
_lmxProxy.OnWriteComplete += OnWriteComplete;
_lmxProxy.OperationComplete += OnOperationComplete;
}
/// <summary>
/// Registers with the MxAccess server.
/// </summary>
private void RegisterWithMxAccess()
{
// Register with the server
if (_lmxProxy == null)
{
throw new InvalidOperationException("MxAccess proxy is not initialized");
}
_connectionHandle = _lmxProxy.Register("ZB.MOM.WW.LmxProxy.Host");
if (_connectionHandle > 0)
{
SetConnectionState(ConnectionState.Connected);
Logger.Information("Successfully connected to MxAccess with handle {Handle}", _connectionHandle);
}
else
{
throw new InvalidOperationException("Failed to register with MxAccess - invalid handle returned");
}
}
/// <summary>
/// Internal synchronous disconnection logic.
/// </summary>
private void DisconnectInternal()
{
lock (_lock)
{
if (!IsConnected || _lmxProxy == null)
{
return;
}
try
{
Logger.Information("Disconnecting from MxAccess");
SetConnectionState(ConnectionState.Disconnecting);
RemoveAllSubscriptions();
UnregisterFromMxAccess();
Cleanup();
SetConnectionState(ConnectionState.Disconnected);
Logger.Information("Successfully disconnected from MxAccess");
}
catch (Exception ex)
{
Logger.Error(ex, "Error during disconnect");
Cleanup();
SetConnectionState(ConnectionState.Disconnected, ex.Message);
}
}
}
/// <summary>
/// Removes all active subscriptions.
/// </summary>
private void RemoveAllSubscriptions()
{
var subscriptionsToRemove = _subscriptions.Values.ToList();
var failedRemovals = new List<string>();
foreach (SubscriptionInfo? sub in subscriptionsToRemove)
{
if (!TryRemoveSubscription(sub))
{
failedRemovals.Add(sub.Address);
}
}
if (failedRemovals.Any())
{
Logger.Warning("Failed to cleanly remove {Count} subscriptions: {Addresses}",
failedRemovals.Count, string.Join(", ", failedRemovals));
}
_subscriptions.Clear();
_subscriptionsByHandle.Clear();
// Note: We intentionally keep _storedSubscriptions to recreate them on reconnect
}
/// <summary>
/// Attempts to remove a single subscription.
/// </summary>
private bool TryRemoveSubscription(SubscriptionInfo subscription)
{
try
{
if (_lmxProxy == null)
{
return false;
}
_lmxProxy.UnAdvise(_connectionHandle, subscription.ItemHandle);
_lmxProxy.RemoveItem(_connectionHandle, subscription.ItemHandle);
return true;
}
catch (Exception ex)
{
Logger.Warning(ex, "Error removing subscription for {Address}", subscription.Address);
return false;
}
}
/// <summary>
/// Unregisters from the MxAccess server.
/// </summary>
private void UnregisterFromMxAccess()
{
if (_connectionHandle > 0 && _lmxProxy != null)
{
_lmxProxy.Unregister(_connectionHandle);
_connectionHandle = 0;
}
}
/// <summary>
/// Cleans up resources and releases the COM object.
/// Removes event handlers and releases the proxy COM object if present.
/// </summary>
private void Cleanup()
{
try
{
if (_lmxProxy != null)
{
// Remove event handlers
_lmxProxy.OnDataChange -= OnDataChange;
_lmxProxy.OnWriteComplete -= OnWriteComplete;
_lmxProxy.OperationComplete -= OnOperationComplete;
// Release COM object
int refCount = Marshal.ReleaseComObject(_lmxProxy);
if (refCount > 0)
{
Logger.Warning("COM object reference count after release: {RefCount}", refCount);
// Force final release
while (refCount > 0)
{
refCount = Marshal.ReleaseComObject(_lmxProxy);
}
}
_lmxProxy = null;
}
_connectionHandle = 0;
}
catch (Exception ex)
{
Logger.Warning(ex, "Error during cleanup");
}
}
/// <summary>
/// Recreates all stored subscriptions after reconnection.
/// </summary>
private async Task RecreateStoredSubscriptionsAsync()
{
List<StoredSubscription> subscriptionsToRecreate;
lock (_lock)
{
// Create a copy to avoid holding the lock during async operations
subscriptionsToRecreate = new List<StoredSubscription>(_storedSubscriptions);
}
if (subscriptionsToRecreate.Count == 0)
{
Logger.Debug("No stored subscriptions to recreate");
return;
}
Logger.Information("Recreating {Count} stored subscription groups after reconnection",
subscriptionsToRecreate.Count);
foreach (StoredSubscription? storedSub in subscriptionsToRecreate)
{
try
{
// Recreate the subscription without storing it again
await SubscribeInternalAsync(storedSub.Addresses, storedSub.Callback, false);
Logger.Information("Successfully recreated subscription group {GroupId} with {Count} addresses",
storedSub.GroupId, storedSub.Addresses.Count);
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to recreate subscription group {GroupId}", storedSub.GroupId);
}
}
}
}
}

@@ -0,0 +1,166 @@
using System;
using ArchestrA.MxAccess;
using ZB.MOM.WW.LmxProxy.Host.Domain;
namespace ZB.MOM.WW.LmxProxy.Host.Implementation
{
/// <summary>
/// Event handlers for MxAccessClient to process data changes, write completions, and operation completions.
/// </summary>
public sealed partial class MxAccessClient
{
/// <summary>
/// Handles data change events from the MxAccess server.
/// </summary>
/// <param name="hLMXServerHandle">Server handle.</param>
/// <param name="phItemHandle">Item handle.</param>
/// <param name="pvItemValue">Item value.</param>
/// <param name="pwItemQuality">Item quality code.</param>
/// <param name="pftItemTimeStamp">Item timestamp.</param>
/// <param name="ItemStatus">Status array.</param>
private void OnDataChange(int hLMXServerHandle, int phItemHandle, object pvItemValue,
int pwItemQuality, object pftItemTimeStamp, ref MXSTATUS_PROXY[] ItemStatus)
{
try
{
if (!_subscriptionsByHandle.TryGetValue(phItemHandle, out SubscriptionInfo? subscription))
{
return;
}
// Convert quality from integer
Quality quality = ConvertQuality(pwItemQuality);
DateTime timestamp = ConvertTimestamp(pftItemTimeStamp);
var vtq = new Vtq(pvItemValue, timestamp, quality);
// Invoke callback
subscription.Callback?.Invoke(subscription.Address, vtq);
}
catch (Exception ex)
{
Logger.Error(ex, "Error processing data change for handle {Handle}", phItemHandle);
}
}
/// <summary>
/// Handles write completion events from the MxAccess server.
/// </summary>
/// <param name="hLMXServerHandle">Server handle.</param>
/// <param name="phItemHandle">Item handle.</param>
/// <param name="ItemStatus">Status array.</param>
private void OnWriteComplete(int hLMXServerHandle, int phItemHandle, ref MXSTATUS_PROXY[] ItemStatus)
{
try
{
WriteOperation? writeOp;
lock (_lock)
{
if (_pendingWrites.TryGetValue(phItemHandle, out writeOp))
{
_pendingWrites.Remove(phItemHandle);
}
}
if (writeOp != null)
{
try
{
if (ItemStatus is { Length: > 0 })
{
var status = ItemStatus[0];
if (status.success == 0)
{
string errorMsg = GetWriteErrorMessage(status.detail);
Logger.Warning(
"Write failed for {Address} (handle {Handle}): {Error} (Category={Category}, Detail={Detail})",
writeOp.Address, phItemHandle, errorMsg, status.category, status.detail);
writeOp.CompletionSource.TrySetException(new InvalidOperationException(
$"Write failed: {errorMsg}"));
}
else
{
Logger.Debug("Write completed successfully for {Address} (handle {Handle})",
writeOp.Address, phItemHandle);
writeOp.CompletionSource.TrySetResult(true);
}
}
else
{
Logger.Debug("Write completed for {Address} (handle {Handle}) with no status",
writeOp.Address, phItemHandle);
writeOp.CompletionSource.TrySetResult(true);
}
}
finally
{
// Clean up the item after write completes
lock (_lock)
{
if (_lmxProxy != null)
{
try
{
_lmxProxy.UnAdvise(_connectionHandle, phItemHandle);
_lmxProxy.RemoveItem(_connectionHandle, phItemHandle);
}
catch (Exception ex)
{
Logger.Debug(ex, "Error cleaning up after write for handle {Handle}", phItemHandle);
}
}
}
}
}
else if (ItemStatus is { Length: > 0 })
{
var status = ItemStatus[0];
if (status.success == 0)
{
Logger.Warning("Write failed for unknown handle {Handle}: Category={Category}, Detail={Detail}",
phItemHandle, status.category, status.detail);
}
}
}
catch (Exception ex)
{
Logger.Error(ex, "Error processing write complete for handle {Handle}", phItemHandle);
}
}
/// <summary>
/// Handles operation completion events from the MxAccess server.
/// </summary>
/// <param name="hLMXServerHandle">Server handle.</param>
/// <param name="phItemHandle">Item handle.</param>
/// <param name="ItemStatus">Status array.</param>
private void OnOperationComplete(int hLMXServerHandle, int phItemHandle, ref MXSTATUS_PROXY[] ItemStatus)
{
// Log operation completion
Logger.Debug("Operation complete for handle {Handle}", phItemHandle);
}
/// <summary>
/// Converts an integer MxAccess quality code to <see cref="Quality" />.
/// </summary>
/// <param name="mxQuality">The MxAccess quality code.</param>
/// <returns>The corresponding <see cref="Quality" /> value.</returns>
private Quality ConvertQuality(int mxQuality) => (Quality)mxQuality;
/// <summary>
/// Converts a timestamp object to <see cref="DateTime" /> in UTC.
/// </summary>
/// <param name="timestamp">The timestamp object.</param>
/// <returns>The UTC <see cref="DateTime" /> value.</returns>
private DateTime ConvertTimestamp(object timestamp)
{
if (timestamp is DateTime dt)
{
return dt.Kind == DateTimeKind.Utc ? dt : dt.ToUniversalTime();
}
return DateTime.UtcNow;
}
}
}

@@ -0,0 +1,132 @@
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using ZB.MOM.WW.LmxProxy.Host.Domain;
namespace ZB.MOM.WW.LmxProxy.Host.Implementation
{
/// <summary>
/// Private nested types for MxAccessClient to encapsulate subscription and write operation details.
/// </summary>
public sealed partial class MxAccessClient
{
/// <summary>
/// Holds information about a subscription to a SCADA tag.
/// </summary>
private class SubscriptionInfo
{
/// <summary>
/// Gets or sets the address of the tag.
/// </summary>
public string Address { get; set; } = string.Empty;
/// <summary>
/// Gets or sets the item handle.
/// </summary>
public int ItemHandle { get; set; }
/// <summary>
/// Gets or sets the callback for value changes.
/// </summary>
public Action<string, Vtq>? Callback { get; set; }
/// <summary>
/// Gets or sets the subscription identifier.
/// </summary>
public string SubscriptionId { get; set; } = string.Empty;
}
/// <summary>
/// Represents a handle for a subscription, allowing asynchronous disposal.
/// </summary>
private class SubscriptionHandle : IAsyncDisposable
{
private readonly MxAccessClient _client;
private readonly string _groupId;
private readonly List<string> _subscriptionIds;
private bool _disposed;
/// <summary>
/// Initializes a new instance of the <see cref="SubscriptionHandle" /> class.
/// </summary>
/// <param name="client">The owning <see cref="MxAccessClient" />.</param>
/// <param name="subscriptionIds">The subscription identifiers.</param>
/// <param name="groupId">The group identifier for stored subscriptions.</param>
public SubscriptionHandle(MxAccessClient client, List<string> subscriptionIds, string groupId)
{
_client = client;
_subscriptionIds = subscriptionIds;
_groupId = groupId;
}
/// <inheritdoc />
public async ValueTask DisposeAsync()
{
if (_disposed)
{
return;
}
_disposed = true;
var tasks = new List<Task>();
foreach (string? id in _subscriptionIds)
{
tasks.Add(_client.UnsubscribeInternalAsync(id));
}
await Task.WhenAll(tasks);
// Remove the stored subscription group
_client.RemoveStoredSubscription(_groupId);
}
}
/// <summary>
/// Represents a pending write operation.
/// </summary>
private class WriteOperation
{
/// <summary>
/// Gets or sets the address of the tag.
/// </summary>
public string Address { get; set; } = string.Empty;
/// <summary>
/// Gets or sets the item handle.
/// </summary>
public int ItemHandle { get; set; }
/// <summary>
/// Gets or sets the completion source for the write operation.
/// </summary>
public TaskCompletionSource<bool> CompletionSource { get; set; } = null!;
/// <summary>
/// Gets or sets the start time of the write operation.
/// </summary>
public DateTime StartTime { get; set; }
}
/// <summary>
/// Stores subscription information for automatic recreation after reconnection.
/// </summary>
private class StoredSubscription
{
/// <summary>
/// Gets or sets the addresses that were subscribed to.
/// </summary>
public List<string> Addresses { get; set; } = new();
/// <summary>
/// Gets or sets the callback for value changes.
/// </summary>
public Action<string, Vtq> Callback { get; set; } = null!;
/// <summary>
/// Gets or sets the unique identifier for this stored subscription group.
/// </summary>
public string GroupId { get; set; } = string.Empty;
}
}
}

@@ -0,0 +1,402 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using ZB.MOM.WW.LmxProxy.Host.Domain;
using ZB.MOM.WW.LmxProxy.Host.Services;
namespace ZB.MOM.WW.LmxProxy.Host.Implementation
{
/// <summary>
/// Read and write operations for MxAccessClient.
/// </summary>
public sealed partial class MxAccessClient
{
/// <inheritdoc />
public async Task<Vtq> ReadAsync(string address, CancellationToken ct = default)
{
// Apply retry policy for read operations
IAsyncPolicy<Vtq> policy = RetryPolicies.CreateReadPolicy<Vtq>();
return await policy.ExecuteWithRetryAsync(async () =>
{
ValidateConnection();
return await ReadSingleValueAsync(address, ct);
}, $"Read-{address}");
}
/// <inheritdoc />
public async Task<IReadOnlyDictionary<string, Vtq>> ReadBatchAsync(IEnumerable<string> addresses,
CancellationToken ct = default)
{
var addressList = addresses.ToList();
var results = new Dictionary<string, Vtq>(addressList.Count);
// Create tasks for parallel reading
IEnumerable<Task> tasks =
addressList.Select(address => ReadAddressWithSemaphoreAsync(address, results, ct));
await Task.WhenAll(tasks);
return results;
}
/// <inheritdoc />
public async Task WriteAsync(string address, object value, CancellationToken ct = default)
{
// Apply retry policy for write operations
IAsyncPolicy policy = RetryPolicies.CreateWritePolicy();
await policy.ExecuteWithRetryAsync(async () => { await WriteInternalAsync(address, value, ct); },
$"Write-{address}");
}
/// <inheritdoc />
public async Task WriteBatchAsync(IReadOnlyDictionary<string, object> values, CancellationToken ct = default)
{
// Create tasks for parallel writing
IEnumerable<Task> tasks = values.Select(kvp => WriteAddressWithSemaphoreAsync(kvp.Key, kvp.Value, ct));
await Task.WhenAll(tasks);
}
/// <inheritdoc />
public async Task<bool> WriteBatchAndWaitAsync(
IReadOnlyDictionary<string, object> values,
string flagAddress,
object flagValue,
string responseAddress,
object responseValue,
CancellationToken ct = default)
{
// Write the batch values
await WriteBatchAsync(values, ct);
// Write the flag
await WriteAsync(flagAddress, flagValue, ct);
// Wait for the response
return await WaitForResponseAsync(responseAddress, responseValue, ct);
}
#region Private Helper Methods
/// <summary>
/// Validates that the client is connected.
/// </summary>
private void ValidateConnection()
{
if (!IsConnected)
{
throw new InvalidOperationException("Not connected to MxAccess");
}
}
/// <summary>
/// Reads a single value from the specified address.
/// </summary>
private async Task<Vtq> ReadSingleValueAsync(string address, CancellationToken ct)
{
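// MxAccess doesn't support direct read - we need to subscribe, get the value, then unsubscribe.
// RunContinuationsAsynchronously keeps the awaiting code off the MxAccess callback thread.
var tcs = new TaskCompletionSource<Vtq>(TaskCreationOptions.RunContinuationsAsynchronously);
IAsyncDisposable? subscription = null;
try
{
// This subscription is transient, so it is not stored for recreation after a reconnect
subscription = await SubscribeInternalAsync(new[] { address }, (addr, vtq) => { tcs.TrySetResult(vtq); }, false, ct);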
return await WaitForReadResultAsync(tcs, ct);
}
finally
{
if (subscription != null)
{
await subscription.DisposeAsync();
}
}
}
/// <summary>
/// Waits for a read result with timeout.
/// </summary>
private async Task<Vtq> WaitForReadResultAsync(TaskCompletionSource<Vtq> tcs, CancellationToken ct)
{
using (var timeoutCts = new CancellationTokenSource(TimeSpan.FromSeconds(_configuration.ReadTimeoutSeconds)))
// Caller cancellation surfaces as a cancelled task rather than a TimeoutException
using (ct.Register(() => tcs.TrySetCanceled()))
using (timeoutCts.Token.Register(() => tcs.TrySetException(new TimeoutException("Read timeout"))))
{
return await tcs.Task;
}
}
/// <summary>
/// Reads an address with semaphore protection for batch operations.
/// </summary>
private async Task ReadAddressWithSemaphoreAsync(string address, Dictionary<string, Vtq> results,
CancellationToken ct)
{
await _readSemaphore.WaitAsync(ct);
try
{
Vtq vtq = await ReadAsync(address, ct);
lock (results)
{
results[address] = vtq;
}
}
catch (Exception ex)
{
Logger.Warning(ex, "Failed to read {Address}", address);
lock (results)
{
results[address] = Vtq.Bad();
}
}
finally
{
_readSemaphore.Release();
}
}
/// <summary>
/// Internal write implementation.
/// </summary>
private async Task WriteInternalAsync(string address, object value, CancellationToken ct)
{
var tcs = new TaskCompletionSource<bool>();
int itemHandle = await SetupWriteOperationAsync(address, value, tcs, ct);
try
{
await WaitForWriteCompletionAsync(tcs, itemHandle, address, ct);
}
catch
{
await CleanupWriteOperationAsync(itemHandle);
throw;
}
}
/// <summary>
/// Sets up a write operation and returns the item handle.
/// </summary>
private async Task<int> SetupWriteOperationAsync(string address, object value, TaskCompletionSource<bool> tcs,
CancellationToken ct)
{
return await Task.Run(() =>
{
lock (_lock)
{
ValidateConnectionLocked();
return InitiateWriteOperation(address, value, tcs);
}
}, ct);
}
/// <summary>
/// Validates connection while holding the lock.
/// </summary>
private void ValidateConnectionLocked()
{
if (!IsConnected || _lmxProxy == null)
{
throw new InvalidOperationException("Not connected to MxAccess");
}
}
/// <summary>
/// Initiates a write operation and returns the item handle.
/// </summary>
private int InitiateWriteOperation(string address, object value, TaskCompletionSource<bool> tcs)
{
int itemHandle = 0;
try
{
if (_lmxProxy == null)
{
throw new InvalidOperationException("MxAccess proxy is not initialized");
}
// Add the item to obtain an item handle for this write
itemHandle = _lmxProxy.AddItem(_connectionHandle, address);
// Advise the item to enable writing
_lmxProxy.AdviseSupervisory(_connectionHandle, itemHandle);
// Track the pending write operation
TrackPendingWrite(address, itemHandle, tcs);
// Write the value
_lmxProxy.Write(_connectionHandle, itemHandle, value, -1); // -1 for no security
return itemHandle;
}
catch (Exception ex)
{
CleanupFailedWrite(itemHandle);
Logger.Error(ex, "Failed to write value to {Address}", address);
throw;
}
}
/// <summary>
/// Tracks a pending write operation.
/// </summary>
private void TrackPendingWrite(string address, int itemHandle, TaskCompletionSource<bool> tcs)
{
var writeOp = new WriteOperation
{
Address = address,
ItemHandle = itemHandle,
CompletionSource = tcs,
StartTime = DateTime.UtcNow
};
_pendingWrites[itemHandle] = writeOp;
}
/// <summary>
/// Cleans up a failed write operation.
/// </summary>
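private void CleanupFailedWrite(int itemHandle)
{
if (itemHandle <= 0)
{
return;
}
// Always drop the tracking entry, even if the COM cleanup below throws
_pendingWrites.Remove(itemHandle);
if (_lmxProxy == null)
{
return;
}
try
{
_lmxProxy.UnAdvise(_connectionHandle, itemHandle);
_lmxProxy.RemoveItem(_connectionHandle, itemHandle);
}
catch (Exception ex)
{
// Best-effort cleanup: the original write failure is rethrown by the caller
Logger.Debug(ex, "Error cleaning up failed write for handle {Handle}", itemHandle);
}
}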
/// <summary>
/// Waits for write completion with timeout.
/// </summary>
private async Task WaitForWriteCompletionAsync(TaskCompletionSource<bool> tcs, int itemHandle, string address,
CancellationToken ct)
{
using (ct.Register(() => tcs.TrySetCanceled()))
{
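var timeoutTask = Task.Delay(TimeSpan.FromSeconds(_configuration.WriteTimeoutSeconds), ct);
Task? completedTask = await Task.WhenAny(tcs.Task, timeoutTask);
// The delay task also completes when ct is cancelled; only treat it as a timeout otherwise
if (completedTask == timeoutTask && !ct.IsCancellationRequested)
{
await HandleWriteTimeoutAsync(itemHandle, address);
}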
await tcs.Task; // This will throw if the write failed
}
}
/// <summary>
/// Handles write timeout by cleaning up resources.
/// </summary>
private async Task HandleWriteTimeoutAsync(int itemHandle, string address)
{
await CleanupWriteOperationAsync(itemHandle);
throw new TimeoutException($"Write operation to {address} timed out");
}
/// <summary>
/// Cleans up a write operation.
/// </summary>
private async Task CleanupWriteOperationAsync(int itemHandle)
{
await Task.Run(() =>
{
lock (_lock)
{
if (_pendingWrites.ContainsKey(itemHandle))
{
_pendingWrites.Remove(itemHandle);
if (_lmxProxy != null)
{
try
{
_lmxProxy.UnAdvise(_connectionHandle, itemHandle);
_lmxProxy.RemoveItem(_connectionHandle, itemHandle);
}
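catch (Exception ex)
{
// Best-effort cleanup: the item may already have been removed
Logger.Debug(ex, "Error cleaning up write for handle {Handle}", itemHandle);
}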
}
}
}
});
}
/// <summary>
/// Writes an address with semaphore protection for batch operations.
/// </summary>
private async Task WriteAddressWithSemaphoreAsync(string address, object value, CancellationToken ct)
{
await _writeSemaphore.WaitAsync(ct);
try
{
await WriteAsync(address, value, ct);
}
finally
{
_writeSemaphore.Release();
}
}
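/// <summary>
/// Waits until the response address reports the expected value.
/// There is no built-in timeout: the method returns false only when <paramref name="ct" /> is cancelled.
/// </summary>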
private async Task<bool> WaitForResponseAsync(string responseAddress, object responseValue,
CancellationToken ct)
{
var tcs = new TaskCompletionSource<bool>();
IAsyncDisposable? subscription = null;
try
{
subscription = await SubscribeAsync(new[] { responseAddress }, (addr, vtq) =>
{
if (Equals(vtq.Value, responseValue))
{
tcs.TrySetResult(true);
}
}, ct);
// Wait for the response value
using (ct.Register(() => tcs.TrySetResult(false)))
{
return await tcs.Task;
}
}
finally
{
if (subscription != null)
{
await subscription.DisposeAsync();
}
}
}
/// <summary>
/// Gets a human-readable error message for a write error code.
/// </summary>
/// <param name="errorCode">The error code.</param>
/// <returns>The error message.</returns>
private static string GetWriteErrorMessage(int errorCode)
{
return errorCode switch
{
1008 => "User lacks proper security for write operation",
1012 => "Secured write required",
1013 => "Verified write required",
_ => $"Unknown error code: {errorCode}"
};
}
#endregion
}
}

@@ -0,0 +1,153 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.LmxProxy.Host.Domain;
namespace ZB.MOM.WW.LmxProxy.Host.Implementation
{
/// <summary>
/// Subscription management for MxAccessClient to handle SCADA tag updates.
/// </summary>
public sealed partial class MxAccessClient
{
/// <summary>
/// Subscribes to a set of addresses and registers a callback for value changes.
/// </summary>
/// <param name="addresses">The collection of addresses to subscribe to.</param>
/// <param name="callback">
/// The callback to invoke when a value changes.
/// The callback receives the address and the new <see cref="Vtq" /> value.
/// </param>
/// <param name="ct">An optional <see cref="CancellationToken" /> to cancel the operation.</param>
/// <returns>
/// A <see cref="Task{IAsyncDisposable}" /> that completes with a handle to the subscription.
/// Disposing the handle will unsubscribe from all addresses.
/// </returns>
/// <exception cref="InvalidOperationException">Thrown if not connected to MxAccess.</exception>
/// <exception cref="Exception">Thrown if subscription fails for any address.</exception>
public Task<IAsyncDisposable> SubscribeAsync(IEnumerable<string> addresses, Action<string, Vtq> callback,
CancellationToken ct = default) => SubscribeInternalAsync(addresses, callback, true, ct);
/// <summary>
/// Internal subscription method that allows control over whether to store the subscription for recreation.
/// </summary>
private Task<IAsyncDisposable> SubscribeInternalAsync(IEnumerable<string> addresses,
Action<string, Vtq> callback, bool storeForRecreation, CancellationToken ct = default)
{
return Task.Run<IAsyncDisposable>(() =>
{
lock (_lock)
{
if (!IsConnected || _lmxProxy == null)
{
throw new InvalidOperationException("Not connected to MxAccess");
}
var subscriptionIds = new List<string>();
try
{
var addressList = addresses.ToList();
foreach (string? address in addressList)
{
// Add the item
var itemHandle = _lmxProxy.AddItem(_connectionHandle, address);
// Create subscription info
string subscriptionId = Guid.NewGuid().ToString();
var subscription = new SubscriptionInfo
{
Address = address,
ItemHandle = itemHandle,
Callback = callback,
SubscriptionId = subscriptionId
};
// Store subscription
_subscriptions[subscriptionId] = subscription;
_subscriptionsByHandle[itemHandle] = subscription;
subscriptionIds.Add(subscriptionId);
// Advise the item
_lmxProxy.AdviseSupervisory(_connectionHandle, itemHandle);
Logger.Debug("Subscribed to {Address} with handle {Handle}", address, itemHandle);
}
// Store subscription group for automatic recreation after reconnect
string groupId = Guid.NewGuid().ToString();
if (storeForRecreation)
{
_storedSubscriptions.Add(new StoredSubscription
{
Addresses = addressList,
Callback = callback,
GroupId = groupId
});
Logger.Debug(
"Stored subscription group {GroupId} with {Count} addresses for automatic recreation",
groupId, addressList.Count);
}
return new SubscriptionHandle(this, subscriptionIds, groupId);
}
catch (Exception ex)
{
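// Clean up any subscriptions that were created. This runs inline because _lock is
// already held: waiting on UnsubscribeInternalAsync here would deadlock, since it
// takes the same lock on another thread.
foreach (string id in subscriptionIds)
{
if (!_subscriptions.TryGetValue(id, out SubscriptionInfo? created))
{
continue;
}
_subscriptions.Remove(id);
_subscriptionsByHandle.Remove(created.ItemHandle);
try
{
_lmxProxy.UnAdvise(_connectionHandle, created.ItemHandle);
_lmxProxy.RemoveItem(_connectionHandle, created.ItemHandle);
}
catch (Exception cleanupEx)
{
Logger.Warning(cleanupEx, "Error cleaning up subscription for {Address}", created.Address);
}
}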
Logger.Error(ex, "Failed to subscribe to addresses");
throw;
}
}
}, ct);
}
/// <summary>
/// Unsubscribes from a subscription by its ID.
/// </summary>
/// <param name="subscriptionId">The subscription identifier.</param>
/// <returns>
/// A <see cref="Task" /> representing the asynchronous operation.
/// </returns>
private Task UnsubscribeInternalAsync(string subscriptionId)
{
return Task.Run(() =>
{
lock (_lock)
{
if (!_subscriptions.TryGetValue(subscriptionId, out SubscriptionInfo? subscription))
{
return;
}
try
{
if (_lmxProxy != null && _connectionHandle > 0)
{
_lmxProxy.UnAdvise(_connectionHandle, subscription.ItemHandle);
_lmxProxy.RemoveItem(_connectionHandle, subscription.ItemHandle);
}
_subscriptions.Remove(subscriptionId);
_subscriptionsByHandle.Remove(subscription.ItemHandle);
Logger.Debug("Unsubscribed from {Address}", subscription.Address);
}
catch (Exception ex)
{
Logger.Warning(ex, "Error unsubscribing from {Address}", subscription.Address);
}
}
});
}
}
}

@@ -0,0 +1,136 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using Serilog;
using ZB.MOM.WW.LmxProxy.Host.Configuration;
using ZB.MOM.WW.LmxProxy.Host.Domain;
namespace ZB.MOM.WW.LmxProxy.Host.Implementation
{
/// <summary>
/// Implementation of <see cref="IScadaClient" /> using ArchestrA MxAccess.
/// Provides connection management, read/write operations, and subscription support for SCADA tags.
/// </summary>
public sealed partial class MxAccessClient : IScadaClient
{
private const int DefaultMaxConcurrency = 10;
private static readonly ILogger Logger = Log.ForContext<MxAccessClient>();
private readonly ConnectionConfiguration _configuration;
private readonly object _lock = new();
private readonly Dictionary<int, WriteOperation> _pendingWrites = new();
// Concurrency control for batch operations
private readonly SemaphoreSlim _readSemaphore;
// Store subscription details for automatic recreation after reconnect
private readonly List<StoredSubscription> _storedSubscriptions = new();
private readonly Dictionary<string, SubscriptionInfo> _subscriptions = new();
private readonly Dictionary<int, SubscriptionInfo> _subscriptionsByHandle = new();
private readonly SemaphoreSlim _writeSemaphore;
private int _connectionHandle;
private ConnectionState _connectionState = ConnectionState.Disconnected;
private bool _disposed;
private LMXProxyServer? _lmxProxy;
/// <summary>
/// Initializes a new instance of the <see cref="MxAccessClient" /> class.
/// </summary>
/// <param name="configuration">The connection configuration settings.</param>
public MxAccessClient(ConnectionConfiguration configuration)
{
_configuration = configuration ?? throw new ArgumentNullException(nameof(configuration));
// Initialize semaphores with configurable concurrency limits
int maxConcurrency = _configuration.MaxConcurrentOperations ?? DefaultMaxConcurrency;
_readSemaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);
_writeSemaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);
}
/// <inheritdoc />
public bool IsConnected
{
get
{
lock (_lock)
{
return _lmxProxy != null && _connectionState == ConnectionState.Connected && _connectionHandle > 0;
}
}
}
/// <inheritdoc />
public ConnectionState ConnectionState
{
get
{
lock (_lock)
{
return _connectionState;
}
}
}
/// <summary>
/// Occurs when the connection state changes.
/// </summary>
public event EventHandler<ConnectionStateChangedEventArgs>? ConnectionStateChanged;
/// <inheritdoc />
public async ValueTask DisposeAsync()
{
if (_disposed)
{
return;
}
await DisconnectAsync();
_disposed = true;
// Dispose semaphores
_readSemaphore?.Dispose();
_writeSemaphore?.Dispose();
}
/// <inheritdoc />
public void Dispose() => DisposeAsync().GetAwaiter().GetResult();
/// <summary>
/// Sets the connection state and raises the <see cref="ConnectionStateChanged" /> event.
/// </summary>
/// <param name="newState">The new connection state.</param>
/// <param name="message">Optional message describing the state change.</param>
private void SetConnectionState(ConnectionState newState, string? message = null)
{
ConnectionState previousState = _connectionState;
if (previousState == newState)
{
return;
}
_connectionState = newState;
Logger.Information("Connection state changed from {Previous} to {Current}", previousState, newState);
ConnectionStateChanged?.Invoke(this, new ConnectionStateChangedEventArgs(previousState, newState, message));
}
/// <summary>
/// Removes a stored subscription group by its ID.
/// </summary>
/// <param name="groupId">The group identifier to remove.</param>
private void RemoveStoredSubscription(string groupId)
{
lock (_lock)
{
_storedSubscriptions.RemoveAll(s => s.GroupId == groupId);
Logger.Debug("Removed stored subscription group {GroupId}", groupId);
}
}
#pragma warning disable CS0169 // Field is never used - reserved for future functionality
private string? _currentNodeName;
private string? _currentGalaxyName;
#pragma warning restore CS0169
}
}


@@ -0,0 +1,592 @@
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;
using Grpc.Core;
using Grpc.Core.Interceptors;
using Serilog;
using ZB.MOM.WW.LmxProxy.Host.Configuration;
using ZB.MOM.WW.LmxProxy.Host.Domain;
using ZB.MOM.WW.LmxProxy.Host.Grpc.Services;
using ZB.MOM.WW.LmxProxy.Host.Implementation;
using ZB.MOM.WW.LmxProxy.Host.Security;
using ZB.MOM.WW.LmxProxy.Host.Services;
using ZB.MOM.WW.LmxProxy.Host.Grpc;
using ConnectionState = ZB.MOM.WW.LmxProxy.Host.Domain.ConnectionState;
namespace ZB.MOM.WW.LmxProxy.Host
{
/// <summary>
/// Windows service that hosts the gRPC server and MxAccess client.
/// Manages lifecycle of gRPC server, SCADA client, subscription manager, and API key service.
/// </summary>
public class LmxProxyService
{
private static readonly ILogger Logger = Log.ForContext<LmxProxyService>();
private readonly LmxProxyConfiguration _configuration;
private readonly SemaphoreSlim _reconnectSemaphore = new(1, 1);
private readonly Func<LmxProxyConfiguration, IScadaClient> _scadaClientFactory;
private readonly CancellationTokenSource _shutdownCts = new();
private ApiKeyService? _apiKeyService;
private Task? _connectionMonitorTask;
private DetailedHealthCheckService? _detailedHealthCheckService;
private Server? _grpcServer;
private HealthCheckService? _healthCheckService;
private PerformanceMetrics? _performanceMetrics;
private IScadaClient? _scadaClient;
private SessionManager? _sessionManager;
private StatusReportService? _statusReportService;
private StatusWebServer? _statusWebServer;
private SubscriptionManager? _subscriptionManager;
/// <summary>
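/// Initializes a new instance of the <see cref="LmxProxyService" /> class.
/// </summary>
/// <param name="configuration">Configuration settings for the service.</param>
/// <param name="scadaClientFactory">
/// Optional factory that creates the SCADA client. When null, an <see cref="MxAccessClient" /> is created
/// from the connection settings.
/// </param>
/// <exception cref="ArgumentNullException">Thrown if <paramref name="configuration" /> is null.</exception>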
public LmxProxyService(LmxProxyConfiguration configuration,
Func<LmxProxyConfiguration, IScadaClient>? scadaClientFactory = null)
{
_configuration = configuration ?? throw new ArgumentNullException(nameof(configuration));
_scadaClientFactory = scadaClientFactory ?? (config => new MxAccessClient(config.Connection));
}
/// <summary>
/// Starts the LmxProxy service, initializing all required components and starting the gRPC server.
/// </summary>
/// <returns><c>true</c> if the service started successfully; otherwise, <c>false</c>.</returns>
public bool Start()
{
try
{
Logger.Information("Starting LmxProxy service on port {Port}", _configuration.GrpcPort);
// Validate configuration before proceeding
if (!ValidateConfiguration())
{
Logger.Error("Configuration validation failed");
return false;
}
// Check and ensure TLS certificates are valid
if (_configuration.Tls.Enabled)
{
Logger.Information("Checking TLS certificate configuration");
var tlsManager = new TlsCertificateManager(_configuration.Tls);
if (!tlsManager.EnsureCertificatesValid())
{
Logger.Error("Failed to ensure valid TLS certificates");
throw new InvalidOperationException("TLS certificate validation or generation failed");
}
Logger.Information("TLS certificates validated successfully");
}
// Create performance metrics service
_performanceMetrics = new PerformanceMetrics();
Logger.Information("Performance metrics service initialized");
// Create API key service
string apiKeyConfigPath = Path.GetFullPath(_configuration.ApiKeyConfigFile);
_apiKeyService = new ApiKeyService(apiKeyConfigPath);
Logger.Information("API key service initialized with config file: {ConfigFile}", apiKeyConfigPath);
// Create SCADA client via factory
_scadaClient = _scadaClientFactory(_configuration) ??
throw new InvalidOperationException("SCADA client factory returned null.");
// Subscribe to connection state changes
_scadaClient.ConnectionStateChanged += OnConnectionStateChanged;
// Automatically connect to MxAccess on startup
try
{
Logger.Information("Connecting to MxAccess...");
Task connectTask = _scadaClient.ConnectAsync();
if (!connectTask.Wait(TimeSpan.FromSeconds(_configuration.Connection.ConnectionTimeoutSeconds)))
{
throw new TimeoutException(
$"Timeout connecting to MxAccess after {_configuration.Connection.ConnectionTimeoutSeconds} seconds");
}
Logger.Information("Successfully connected to MxAccess");
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to connect to MxAccess on startup");
throw;
}
// Start connection monitoring if auto-reconnect is enabled
if (_configuration.Connection.AutoReconnect)
{
_connectionMonitorTask = Task.Run(() => MonitorConnectionAsync(_shutdownCts.Token));
Logger.Information("Connection monitoring started with {Interval} second interval",
_configuration.Connection.MonitorIntervalSeconds);
}
// Create subscription manager with configuration
_subscriptionManager = new SubscriptionManager(_scadaClient, _configuration.Subscription);
// Create session manager for tracking client sessions
_sessionManager = new SessionManager();
Logger.Information("Session manager initialized");
// Create health check services
_healthCheckService = new HealthCheckService(_scadaClient, _subscriptionManager, _performanceMetrics);
_detailedHealthCheckService = new DetailedHealthCheckService(_scadaClient);
Logger.Information("Health check services initialized");
// Create status report service and web server
_statusReportService = new StatusReportService(
_scadaClient,
_subscriptionManager,
_performanceMetrics,
_healthCheckService,
_detailedHealthCheckService);
_statusWebServer = new StatusWebServer(_configuration.WebServer, _statusReportService);
Logger.Information("Status web server initialized");
// Create gRPC service with session manager and performance metrics
var scadaService = new ScadaGrpcService(_scadaClient, _subscriptionManager, _sessionManager, _performanceMetrics);
// Create API key interceptor
var apiKeyInterceptor = new ApiKeyInterceptor(_apiKeyService);
// Configure server credentials based on TLS configuration
ServerCredentials serverCredentials;
if (_configuration.Tls.Enabled)
{
serverCredentials = CreateTlsCredentials(_configuration.Tls);
Logger.Information("TLS enabled for gRPC server");
}
else
{
serverCredentials = ServerCredentials.Insecure;
Logger.Warning("gRPC server running without TLS encryption - not recommended for production");
}
// Configure and start gRPC server with interceptor
_grpcServer = new Server
{
Services = { ScadaService.BindService(scadaService).Intercept(apiKeyInterceptor) },
Ports = { new ServerPort("0.0.0.0", _configuration.GrpcPort, serverCredentials) }
};
_grpcServer.Start();
string securityMode = _configuration.Tls.Enabled ? "TLS/SSL" : "INSECURE";
Logger.Information("LmxProxy service started successfully on port {Port} ({SecurityMode})",
_configuration.GrpcPort, securityMode);
Logger.Information("gRPC server listening on 0.0.0.0:{Port}", _configuration.GrpcPort);
// Start status web server
if (_statusWebServer != null && !_statusWebServer.Start())
{
Logger.Warning("Failed to start status web server, continuing without it");
}
return true;
}
catch (Exception ex)
{
Logger.Fatal(ex, "Failed to start LmxProxy service");
return false;
}
}
/// <summary>
/// Stops the LmxProxy service, shutting down the gRPC server and disposing all resources.
/// </summary>
/// <returns><c>true</c> if the service stopped successfully; otherwise, <c>false</c>.</returns>
public bool Stop()
{
try
{
Logger.Information("Stopping LmxProxy service");
_shutdownCts.Cancel();
// Stop connection monitoring
if (_connectionMonitorTask != null)
{
try
{
_connectionMonitorTask.Wait(TimeSpan.FromSeconds(5));
}
catch (Exception ex)
{
Logger.Warning(ex, "Error stopping connection monitor");
}
}
// Shutdown gRPC server
if (_grpcServer != null)
{
Logger.Information("Shutting down gRPC server");
Task? shutdownTask = _grpcServer.ShutdownAsync();
// Wait up to 10 seconds for graceful shutdown
if (!shutdownTask.Wait(TimeSpan.FromSeconds(10)))
{
Logger.Warning("gRPC server shutdown timeout, forcing kill");
_grpcServer.KillAsync().Wait(TimeSpan.FromSeconds(5));
}
_grpcServer = null;
}
// Stop status web server
if (_statusWebServer != null)
{
Logger.Information("Stopping status web server");
try
{
_statusWebServer.Stop();
_statusWebServer.Dispose();
_statusWebServer = null;
}
catch (Exception ex)
{
Logger.Warning(ex, "Error stopping status web server");
}
}
// Dispose status report service
if (_statusReportService != null)
{
Logger.Information("Disposing status report service");
_statusReportService = null;
}
// Dispose health check services
if (_detailedHealthCheckService != null)
{
Logger.Information("Disposing detailed health check service");
_detailedHealthCheckService = null;
}
if (_healthCheckService != null)
{
Logger.Information("Disposing health check service");
_healthCheckService = null;
}
// Dispose subscription manager
if (_subscriptionManager != null)
{
Logger.Information("Disposing subscription manager");
_subscriptionManager.Dispose();
_subscriptionManager = null;
}
// Dispose session manager
if (_sessionManager != null)
{
Logger.Information("Disposing session manager");
_sessionManager.Dispose();
_sessionManager = null;
}
// Dispose API key service
if (_apiKeyService != null)
{
Logger.Information("Disposing API key service");
_apiKeyService.Dispose();
_apiKeyService = null;
}
// Dispose performance metrics
if (_performanceMetrics != null)
{
Logger.Information("Disposing performance metrics service");
_performanceMetrics.Dispose();
_performanceMetrics = null;
}
// Disconnect and dispose SCADA client
if (_scadaClient != null)
{
Logger.Information("Disconnecting SCADA client");
// Unsubscribe from events
_scadaClient.ConnectionStateChanged -= OnConnectionStateChanged;
try
{
Task disconnectTask = _scadaClient.DisconnectAsync();
if (!disconnectTask.Wait(TimeSpan.FromSeconds(10)))
{
Logger.Warning("SCADA client disconnect timeout");
}
}
catch (Exception ex)
{
Logger.Warning(ex, "Error disconnecting SCADA client");
}
try
{
Task? disposeTask = _scadaClient.DisposeAsync().AsTask();
if (!disposeTask.Wait(TimeSpan.FromSeconds(5)))
{
Logger.Warning("SCADA client dispose timeout");
}
}
catch (Exception ex)
{
Logger.Warning(ex, "Error disposing SCADA client");
}
_scadaClient = null;
}
Logger.Information("LmxProxy service stopped successfully");
return true;
}
catch (Exception ex)
{
Logger.Error(ex, "Error stopping LmxProxy service");
return false;
}
}
/// <summary>
/// Pauses the LmxProxy service. No operation is performed except logging.
/// </summary>
public void Pause() => Logger.Information("LmxProxy service paused");
/// <summary>
/// Continues the LmxProxy service after a pause. No operation is performed except logging.
/// </summary>
public void Continue() => Logger.Information("LmxProxy service continued");
/// <summary>
/// Requests shutdown of the LmxProxy service and stops all components.
/// </summary>
public void Shutdown()
{
Logger.Information("LmxProxy service shutdown requested");
Stop();
}
/// <summary>
/// Handles connection state changes from the SCADA client.
/// </summary>
private void OnConnectionStateChanged(object? sender, ConnectionStateChangedEventArgs e)
{
Logger.Information("MxAccess connection state changed from {Previous} to {Current}",
e.PreviousState, e.CurrentState);
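if (e.CurrentState == ConnectionState.Disconnected &&
e.PreviousState == ConnectionState.Connected)
{
// The connection monitor only runs when AutoReconnect is enabled, so say which case applies.
if (_configuration.Connection.AutoReconnect)
{
Logger.Warning("MxAccess connection lost. Automatic reconnection will be attempted.");
}
else
{
Logger.Warning("MxAccess connection lost. Automatic reconnection is disabled.");
}
}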
}
/// <summary>
/// Monitors the connection and attempts to reconnect when disconnected.
/// </summary>
private async Task MonitorConnectionAsync(CancellationToken cancellationToken)
{
Logger.Information("Starting connection monitor");
while (!cancellationToken.IsCancellationRequested)
{
try
{
await Task.Delay(TimeSpan.FromSeconds(_configuration.Connection.MonitorIntervalSeconds),
cancellationToken);
if (_scadaClient != null && !_scadaClient.IsConnected && !cancellationToken.IsCancellationRequested)
{
await _reconnectSemaphore.WaitAsync(cancellationToken);
try
{
if (_scadaClient != null && !_scadaClient.IsConnected)
{
Logger.Information("Attempting to reconnect to MxAccess...");
try
{
await _scadaClient.ConnectAsync(cancellationToken);
Logger.Information("Successfully reconnected to MxAccess");
}
catch (Exception ex)
{
Logger.Warning(ex,
"Failed to reconnect to MxAccess. Will retry in {Interval} seconds.",
_configuration.Connection.MonitorIntervalSeconds);
}
}
}
finally
{
_reconnectSemaphore.Release();
}
}
}
catch (OperationCanceledException)
{
// Expected when shutting down
break;
}
catch (Exception ex)
{
Logger.Error(ex, "Error in connection monitor");
}
}
Logger.Information("Connection monitor stopped");
}
/// <summary>
/// Creates TLS server credentials from configuration
/// </summary>
private static ServerCredentials CreateTlsCredentials(TlsConfiguration tlsConfig)
{
try
{
// Read certificate and key files
string serverCert = File.ReadAllText(tlsConfig.ServerCertificatePath);
string serverKey = File.ReadAllText(tlsConfig.ServerKeyPath);
var keyCertPairs = new List<KeyCertificatePair>
{
new(serverCert, serverKey)
};
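// Configure client certificate requirements.
// Note: CheckCertificateRevocation selects the gRPC verification mode here. With
// RequestAndRequireButDontVerify the client must still present a certificate, but
// gRPC does not verify it; any verification is left to the application.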
if (tlsConfig.RequireClientCertificate && !string.IsNullOrWhiteSpace(tlsConfig.ClientCaCertificatePath))
{
string clientCaCert = File.ReadAllText(tlsConfig.ClientCaCertificatePath);
return new SslServerCredentials(
keyCertPairs,
clientCaCert,
tlsConfig.CheckCertificateRevocation
? SslClientCertificateRequestType.RequestAndRequireAndVerify
: SslClientCertificateRequestType.RequestAndRequireButDontVerify);
}
if (tlsConfig.RequireClientCertificate)
{
// Require client certificate but no CA specified - use system CA
return new SslServerCredentials(
keyCertPairs,
null,
SslClientCertificateRequestType.RequestAndRequireAndVerify);
}
// No client certificate required
return new SslServerCredentials(keyCertPairs);
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to create TLS credentials");
throw new InvalidOperationException("Failed to configure TLS for gRPC server", ex);
}
}
/// <summary>
/// Validates the service configuration and returns false if any critical issues are found
/// </summary>
private bool ValidateConfiguration()
{
try
{
// Validate gRPC port
if (_configuration.GrpcPort <= 0 || _configuration.GrpcPort > 65535)
{
Logger.Error("Invalid gRPC port: {Port}. Port must be between 1 and 65535",
_configuration.GrpcPort);
return false;
}
// Validate API key configuration file
if (string.IsNullOrWhiteSpace(_configuration.ApiKeyConfigFile))
{
Logger.Error("API key configuration file path is null or empty");
return false;
}
// Check if API key file exists or can be created
string apiKeyPath = Path.GetFullPath(_configuration.ApiKeyConfigFile);
string? apiKeyDirectory = Path.GetDirectoryName(apiKeyPath);
if (!string.IsNullOrEmpty(apiKeyDirectory) && !Directory.Exists(apiKeyDirectory))
{
try
{
Directory.CreateDirectory(apiKeyDirectory);
}
catch (Exception ex)
{
Logger.Error(ex, "Cannot create directory for API key file: {Directory}", apiKeyDirectory);
return false;
}
}
// If API key file exists, validate it can be read
if (File.Exists(apiKeyPath))
{
try
{
string content = File.ReadAllText(apiKeyPath);
if (!string.IsNullOrWhiteSpace(content))
{
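// Try to parse as JSON to validate format. Comments are skipped so this check
// accepts the same files that ApiKeyService accepts when it loads them.
var parseOptions = new JsonDocumentOptions { CommentHandling = JsonCommentHandling.Skip };
using (JsonDocument.Parse(content, parseOptions))
{
// Parsing succeeded; the document itself is not needed.
}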
}
}
catch (Exception ex)
{
Logger.Error(ex, "API key configuration file is invalid or unreadable: {FilePath}", apiKeyPath);
return false;
}
}
// Validate TLS configuration if enabled
if (_configuration.Tls.Enabled)
{
if (!_configuration.Tls.Validate())
{
Logger.Error("TLS configuration validation failed");
return false;
}
}
// Validate web server configuration if enabled
if (_configuration.WebServer.Enabled)
{
if (_configuration.WebServer.Port <= 0 || _configuration.WebServer.Port > 65535)
{
Logger.Error("Invalid web server port: {Port}. Port must be between 1 and 65535",
_configuration.WebServer.Port);
return false;
}
// Check for port conflicts
if (_configuration.WebServer.Port == _configuration.GrpcPort)
{
Logger.Error("Web server port {WebPort} conflicts with gRPC port {GrpcPort}",
_configuration.WebServer.Port, _configuration.GrpcPort);
return false;
}
}
Logger.Information("Configuration validation passed");
return true;
}
catch (Exception ex)
{
Logger.Error(ex, "Error during configuration validation");
return false;
}
}
}
}


@@ -0,0 +1,87 @@
using System;
using System.IO;
using Microsoft.Extensions.Configuration;
using Serilog;
using Topshelf;
using ZB.MOM.WW.LmxProxy.Host.Configuration;
namespace ZB.MOM.WW.LmxProxy.Host
{
internal class Program
{
private static void Main(string[] args)
{
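// Build configuration. Resolve appsettings.json relative to the executable:
// when the Service Control Manager starts the process, the current directory
// is typically the Windows system directory rather than the install folder.
IConfigurationRoot configuration = new ConfigurationBuilder()
.SetBasePath(AppContext.BaseDirectory)
.AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
.AddEnvironmentVariables()
.Build();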
// Configure Serilog from appsettings.json
Log.Logger = new LoggerConfiguration()
.ReadFrom.Configuration(configuration)
.CreateLogger();
try
{
Log.Information("Starting ZB.MOM.WW.LmxProxy.Host");
// Load configuration
var config = new LmxProxyConfiguration();
configuration.Bind(config);
// Validate configuration
if (!ConfigurationValidator.ValidateAndLog(config))
{
Log.Fatal("Configuration validation failed. Please check the configuration and try again.");
Environment.ExitCode = 1;
return;
}
// Configure and run the Windows service using Topshelf
TopshelfExitCode exitCode = HostFactory.Run(hostConfig =>
{
hostConfig.Service<LmxProxyService>(serviceConfig =>
{
serviceConfig.ConstructUsing(() => new LmxProxyService(config));
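// Use the HostControl overload so the bool returned by Start() reaches Topshelf;
// the single-argument overload takes an Action and would discard a failed start.
serviceConfig.WhenStarted((service, hostControl) => service.Start());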
serviceConfig.WhenStopped(service => service.Stop());
serviceConfig.WhenPaused(service => service.Pause());
serviceConfig.WhenContinued(service => service.Continue());
serviceConfig.WhenShutdown(service => service.Shutdown());
});
hostConfig.UseSerilog(Log.Logger);
hostConfig.SetServiceName("ZB.MOM.WW.LmxProxy.Host");
hostConfig.SetDisplayName("SCADA Bridge LMX Proxy");
hostConfig.SetDescription("Provides gRPC access to Archestra MxAccess for SCADA Bridge");
hostConfig.StartAutomatically();
hostConfig.EnableServiceRecovery(recoveryConfig =>
{
recoveryConfig.RestartService(config.ServiceRecovery.FirstFailureDelayMinutes);
recoveryConfig.RestartService(config.ServiceRecovery.SecondFailureDelayMinutes);
recoveryConfig.RestartService(config.ServiceRecovery.SubsequentFailureDelayMinutes);
recoveryConfig.SetResetPeriod(config.ServiceRecovery.ResetPeriodDays);
});
hostConfig.OnException(ex => { Log.Fatal(ex, "Unhandled exception in service"); });
});
Log.Information("Service exited with code: {ExitCode}", exitCode);
Environment.ExitCode = (int)exitCode;
}
catch (Exception ex)
{
Log.Fatal(ex, "Failed to start service");
Environment.ExitCode = 1;
}
finally
{
Log.CloseAndFlush();
}
}
}
}
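
For orientation, here is a sketch of an `appsettings.json` that the `configuration.Bind(config)` call above could consume. The property names are the ones the code in this diff reads (`GrpcPort`, `ApiKeyConfigFile`, `Connection`, `Tls`, `WebServer`, `ServiceRecovery`). Every value, including the port numbers and file paths, is an illustrative assumption and not a default taken from the repository. Sections whose properties are not visible here, such as `Subscription` and `Serilog`, are omitted, and `ConfigurationValidator` or `Tls.Validate()` may require fields that this sketch does not show.

```json
{
  "GrpcPort": 50051,
  "ApiKeyConfigFile": "apikeys.json",
  "Connection": {
    "ConnectionTimeoutSeconds": 30,
    "AutoReconnect": true,
    "MonitorIntervalSeconds": 5,
    "MaxConcurrentOperations": 10
  },
  "Tls": {
    "Enabled": false,
    "ServerCertificatePath": "certs/server.crt",
    "ServerKeyPath": "certs/server.key",
    "RequireClientCertificate": false,
    "ClientCaCertificatePath": "",
    "CheckCertificateRevocation": false
  },
  "WebServer": {
    "Enabled": true,
    "Port": 8080
  },
  "ServiceRecovery": {
    "FirstFailureDelayMinutes": 1,
    "SecondFailureDelayMinutes": 5,
    "SubsequentFailureDelayMinutes": 10,
    "ResetPeriodDays": 1
  }
}
```

`ValidateConfiguration` in `LmxProxyService` rejects a `WebServer.Port` equal to `GrpcPort`, so the two ports must differ whenever the web server is enabled.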


@@ -0,0 +1,49 @@
namespace ZB.MOM.WW.LmxProxy.Host.Security
{
/// <summary>
/// Represents an API key with associated permissions
/// </summary>
public class ApiKey
{
/// <summary>
/// The API key value
/// </summary>
public string Key { get; set; } = string.Empty;
/// <summary>
/// Description of what this API key is used for
/// </summary>
public string Description { get; set; } = string.Empty;
/// <summary>
/// The role assigned to this API key
/// </summary>
public ApiKeyRole Role { get; set; } = ApiKeyRole.ReadOnly;
/// <summary>
/// Whether this API key is enabled
/// </summary>
public bool Enabled { get; set; } = true;
/// <summary>
/// Checks whether the API key is valid. A key is valid when it is enabled.
/// </summary>
public bool IsValid() => Enabled;
}
/// <summary>
/// API key roles
/// </summary>
public enum ApiKeyRole
{
/// <summary>
/// Can only read data
/// </summary>
ReadOnly,
/// <summary>
/// Can read and write data
/// </summary>
ReadWrite
}
}


@@ -0,0 +1,15 @@
using System.Collections.Generic;
namespace ZB.MOM.WW.LmxProxy.Host.Security
{
/// <summary>
/// Configuration for API keys loaded from file
/// </summary>
public class ApiKeyConfiguration
{
/// <summary>
/// List of API keys
/// </summary>
public List<ApiKey> ApiKeys { get; set; } = new();
}
}
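
As an illustration of the file shape that `ApiKeyConfiguration` and `ApiKey` describe, a key file might look like the sketch below. The property names come from the two classes above. The key values and descriptions are placeholders, not real credentials. Writing `Role` as a string assumes the loader registers `JsonStringEnumConverter`, which `ApiKeyService.LoadConfiguration` in this diff does.

```json
{
  "ApiKeys": [
    {
      "Key": "replace-with-a-long-random-value",
      "Description": "Historian (read-only)",
      "Role": "ReadOnly",
      "Enabled": true
    },
    {
      "Key": "replace-with-another-long-random-value",
      "Description": "Engineering workstation",
      "Role": "ReadWrite",
      "Enabled": true
    }
  ]
}
```

Setting `Enabled` to `false` revokes a key without deleting its entry, since `ApiKey.IsValid()` returns `Enabled`.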


@@ -0,0 +1,168 @@
using System;
using System.Linq;
using System.Threading.Tasks;
using Grpc.Core;
using Grpc.Core.Interceptors;
using Serilog;
namespace ZB.MOM.WW.LmxProxy.Host.Security
{
/// <summary>
/// gRPC interceptor for API key authentication.
/// Validates API keys for incoming requests and enforces role-based access control.
/// </summary>
public class ApiKeyInterceptor : Interceptor
{
private static readonly ILogger Logger = Log.ForContext<ApiKeyInterceptor>();
/// <summary>
/// List of gRPC method names that require write access.
/// </summary>
private static readonly string[] WriteMethodNames =
{
"Write",
"WriteBatch",
"WriteBatchAndWait"
};
private readonly ApiKeyService _apiKeyService;
/// <summary>
/// Initializes a new instance of the <see cref="ApiKeyInterceptor" /> class.
/// </summary>
/// <param name="apiKeyService">The API key service used for validation.</param>
/// <exception cref="ArgumentNullException">Thrown if <paramref name="apiKeyService" /> is null.</exception>
public ApiKeyInterceptor(ApiKeyService apiKeyService)
{
_apiKeyService = apiKeyService ?? throw new ArgumentNullException(nameof(apiKeyService));
}
/// <summary>
/// Handles unary gRPC calls, validating API key and enforcing permissions.
/// </summary>
/// <typeparam name="TRequest">The request type.</typeparam>
/// <typeparam name="TResponse">The response type.</typeparam>
/// <param name="request">The request message.</param>
/// <param name="context">The server call context.</param>
/// <param name="continuation">The continuation delegate.</param>
/// <returns>The response message.</returns>
/// <exception cref="RpcException">Thrown if authentication or authorization fails.</exception>
public override async Task<TResponse> UnaryServerHandler<TRequest, TResponse>(
TRequest request,
ServerCallContext context,
UnaryServerMethod<TRequest, TResponse> continuation)
{
string apiKey = GetApiKeyFromContext(context);
string methodName = GetMethodName(context.Method);
if (string.IsNullOrEmpty(apiKey))
{
Logger.Warning("Missing API key for method {Method} from {Peer}",
context.Method, context.Peer);
throw new RpcException(new Status(StatusCode.Unauthenticated, "API key is required"));
}
ApiKey? key = _apiKeyService.ValidateApiKey(apiKey);
if (key == null)
{
Logger.Warning("Invalid API key for method {Method} from {Peer}",
context.Method, context.Peer);
throw new RpcException(new Status(StatusCode.Unauthenticated, "Invalid API key"));
}
// Check if method requires write access
if (IsWriteMethod(methodName) && key.Role != ApiKeyRole.ReadWrite)
{
Logger.Warning("Insufficient permissions for method {Method} with API key {Description}",
context.Method, key.Description);
throw new RpcException(new Status(StatusCode.PermissionDenied,
"API key does not have write permissions"));
}
// Add API key info to context items for use in service methods
context.UserState["ApiKey"] = key;
Logger.Debug("Authorized method {Method} for API key {Description}",
context.Method, key.Description);
return await continuation(request, context);
}
/// <summary>
/// Handles server streaming gRPC calls, validating the API key.
/// No write-permission check is applied here, so any valid key may open a stream.
/// </summary>
/// <typeparam name="TRequest">The request type.</typeparam>
/// <typeparam name="TResponse">The response type.</typeparam>
/// <param name="request">The request message.</param>
/// <param name="responseStream">The response stream writer.</param>
/// <param name="context">The server call context.</param>
/// <param name="continuation">The continuation delegate.</param>
/// <returns>A task representing the asynchronous operation.</returns>
/// <exception cref="RpcException">Thrown if authentication fails.</exception>
public override async Task ServerStreamingServerHandler<TRequest, TResponse>(
TRequest request,
IServerStreamWriter<TResponse> responseStream,
ServerCallContext context,
ServerStreamingServerMethod<TRequest, TResponse> continuation)
{
string apiKey = GetApiKeyFromContext(context);
if (string.IsNullOrEmpty(apiKey))
{
Logger.Warning("Missing API key for streaming method {Method} from {Peer}",
context.Method, context.Peer);
throw new RpcException(new Status(StatusCode.Unauthenticated, "API key is required"));
}
ApiKey? key = _apiKeyService.ValidateApiKey(apiKey);
if (key == null)
{
Logger.Warning("Invalid API key for streaming method {Method} from {Peer}",
context.Method, context.Peer);
throw new RpcException(new Status(StatusCode.Unauthenticated, "Invalid API key"));
}
// Add API key info to context items
context.UserState["ApiKey"] = key;
Logger.Debug("Authorized streaming method {Method} for API key {Description}",
context.Method, key.Description);
await continuation(request, responseStream, context);
}
/// <summary>
/// Extracts the API key from the gRPC request headers.
/// </summary>
/// <param name="context">The server call context.</param>
/// <returns>The API key value, or an empty string if not found.</returns>
private static string GetApiKeyFromContext(ServerCallContext context)
{
// Check for API key in metadata (headers)
Metadata.Entry? entry = context.RequestHeaders.FirstOrDefault(e =>
e.Key.Equals("x-api-key", StringComparison.OrdinalIgnoreCase));
return entry?.Value ?? string.Empty;
}
/// <summary>
/// Gets the method name from the full gRPC method string.
/// </summary>
/// <param name="method">The full method string (e.g., /package.Service/Method).</param>
/// <returns>The method name.</returns>
private static string GetMethodName(string method)
{
// Method format is /package.Service/Method
int lastSlash = method.LastIndexOf('/');
return lastSlash >= 0 ? method.Substring(lastSlash + 1) : method;
}
/// <summary>
/// Determines whether the specified method name requires write access.
/// </summary>
/// <param name="methodName">The method name.</param>
/// <returns><c>true</c> if the method requires write access; otherwise, <c>false</c>.</returns>
private static bool IsWriteMethod(string methodName) =>
WriteMethodNames.Contains(methodName, StringComparer.OrdinalIgnoreCase);
}
}


@@ -0,0 +1,305 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Threading;
using System.Threading.Tasks;
using Serilog;
namespace ZB.MOM.WW.LmxProxy.Host.Security
{
/// <summary>
/// Service for managing API keys with file-based storage.
/// Handles validation, role checking, and automatic reload on file changes.
/// </summary>
public class ApiKeyService : IDisposable
{
private static readonly ILogger Logger = Log.ForContext<ApiKeyService>();
private readonly ConcurrentDictionary<string, ApiKey> _apiKeys;
private readonly string _configFilePath;
private readonly SemaphoreSlim _reloadLock = new(1, 1);
private bool _disposed;
private FileSystemWatcher? _fileWatcher;
private DateTime _lastReloadTime = DateTime.MinValue;
/// <summary>
/// Initializes a new instance of the <see cref="ApiKeyService" /> class.
/// </summary>
/// <param name="configFilePath">The path to the API key configuration file.</param>
/// <exception cref="ArgumentNullException">Thrown if <paramref name="configFilePath" /> is null.</exception>
public ApiKeyService(string configFilePath)
{
_configFilePath = configFilePath ?? throw new ArgumentNullException(nameof(configFilePath));
_apiKeys = new ConcurrentDictionary<string, ApiKey>();
InitializeFileWatcher();
LoadConfiguration();
}
/// <summary>
/// Disposes the <see cref="ApiKeyService" /> and releases resources.
/// </summary>
public void Dispose()
{
if (_disposed)
{
return;
}
_disposed = true;
_fileWatcher?.Dispose();
_reloadLock?.Dispose();
Logger.Information("API key service disposed");
}
/// <summary>
/// Validates an API key and returns its details if valid.
/// </summary>
/// <param name="apiKey">The API key value to validate.</param>
/// <returns>The <see cref="ApiKey" /> if valid; otherwise, <c>null</c>.</returns>
public ApiKey? ValidateApiKey(string apiKey)
{
if (string.IsNullOrWhiteSpace(apiKey))
{
return null;
}
if (_apiKeys.TryGetValue(apiKey, out ApiKey? key) && key.IsValid())
{
Logger.Debug("API key validated successfully for {Description}", key.Description);
return key;
}
Logger.Warning("Invalid or disabled API key attempted");
return null;
}
/// <summary>
/// Checks if an API key has the specified role.
/// </summary>
/// <param name="apiKey">The API key value.</param>
/// <param name="requiredRole">The required <see cref="ApiKeyRole" />.</param>
/// <returns><c>true</c> if the API key has the required role; otherwise, <c>false</c>.</returns>
public bool HasRole(string apiKey, ApiKeyRole requiredRole)
{
ApiKey? key = ValidateApiKey(apiKey);
if (key == null)
{
return false;
}
// ReadWrite role has access to everything
if (key.Role == ApiKeyRole.ReadWrite)
{
return true;
}
// ReadOnly role only has access to ReadOnly operations
return requiredRole == ApiKeyRole.ReadOnly;
}
/// <summary>
/// Initializes the file system watcher for the API key configuration file.
/// </summary>
private void InitializeFileWatcher()
{
string? directory = Path.GetDirectoryName(_configFilePath);
string? fileName = Path.GetFileName(_configFilePath);
if (string.IsNullOrEmpty(directory) || string.IsNullOrEmpty(fileName))
{
Logger.Warning("Invalid config file path, file watching disabled");
return;
}
try
{
_fileWatcher = new FileSystemWatcher(directory, fileName)
{
NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.Size | NotifyFilters.CreationTime,
EnableRaisingEvents = true
};
_fileWatcher.Changed += OnFileChanged;
_fileWatcher.Created += OnFileChanged;
_fileWatcher.Renamed += OnFileRenamed;
Logger.Information("File watcher initialized for {FilePath}", _configFilePath);
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to initialize file watcher for {FilePath}", _configFilePath);
}
}
/// <summary>
/// Handles file change events for the configuration file.
/// </summary>
/// <param name="sender">The event sender.</param>
/// <param name="e">The <see cref="FileSystemEventArgs" /> instance containing event data.</param>
private void OnFileChanged(object sender, FileSystemEventArgs e)
{
if (e.ChangeType == WatcherChangeTypes.Changed || e.ChangeType == WatcherChangeTypes.Created)
{
Logger.Information("API key configuration file changed, reloading");
Task.Run(() => ReloadConfigurationAsync());
}
}
/// <summary>
/// Handles file rename events for the configuration file.
/// </summary>
/// <param name="sender">The event sender.</param>
/// <param name="e">The <see cref="RenamedEventArgs" /> instance containing event data.</param>
private void OnFileRenamed(object sender, RenamedEventArgs e)
{
if (e.FullPath.Equals(_configFilePath, StringComparison.OrdinalIgnoreCase))
{
Logger.Information("API key configuration file renamed, reloading");
Task.Run(() => ReloadConfigurationAsync());
}
}
/// <summary>
/// Asynchronously reloads the API key configuration from file.
/// Debounces rapid file changes to avoid excessive reloads.
/// </summary>
private async Task ReloadConfigurationAsync()
{
// Debounce rapid file changes
TimeSpan timeSinceLastReload = DateTime.UtcNow - _lastReloadTime;
if (timeSinceLastReload < TimeSpan.FromSeconds(1))
{
await Task.Delay(TimeSpan.FromSeconds(1) - timeSinceLastReload);
}
await _reloadLock.WaitAsync();
try
{
LoadConfiguration();
_lastReloadTime = DateTime.UtcNow;
}
finally
{
_reloadLock.Release();
}
}
/// <summary>
/// Loads the API key configuration from file.
/// If the file does not exist, creates a default configuration.
/// </summary>
private void LoadConfiguration()
{
try
{
if (!File.Exists(_configFilePath))
{
Logger.Warning("API key configuration file not found at {FilePath}, creating default",
_configFilePath);
CreateDefaultConfiguration();
return;
}
string json = File.ReadAllText(_configFilePath);
var options = new JsonSerializerOptions
{
PropertyNameCaseInsensitive = true,
ReadCommentHandling = JsonCommentHandling.Skip
};
options.Converters.Add(new JsonStringEnumConverter());
ApiKeyConfiguration? config = JsonSerializer.Deserialize<ApiKeyConfiguration>(json, options);
if (config?.ApiKeys == null || !config.ApiKeys.Any())
{
Logger.Warning("No API keys found in configuration file");
return;
}
// Clear existing keys and load new ones
_apiKeys.Clear();
foreach (ApiKey? apiKey in config.ApiKeys)
{
if (apiKey == null || string.IsNullOrWhiteSpace(apiKey.Key))
{
Logger.Warning("Skipping API key entry with a missing or empty key value");
continue;
}
if (_apiKeys.TryAdd(apiKey.Key, apiKey))
{
Logger.Information("Loaded API key: {Description} with role {Role}",
apiKey.Description, apiKey.Role);
}
else
{
Logger.Warning("Duplicate API key found: {Description}", apiKey.Description);
}
}
Logger.Information("Loaded {Count} API keys from configuration", _apiKeys.Count);
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to load API key configuration from {FilePath}", _configFilePath);
}
}
/// <summary>
/// Creates a default API key configuration file with sample keys.
/// </summary>
private void CreateDefaultConfiguration()
{
try
{
var defaultConfig = new ApiKeyConfiguration
{
ApiKeys = new List<ApiKey>
{
new()
{
Key = Guid.NewGuid().ToString("N"),
Description = "Default read-only API key",
Role = ApiKeyRole.ReadOnly,
Enabled = true
},
new()
{
Key = Guid.NewGuid().ToString("N"),
Description = "Default read-write API key",
Role = ApiKeyRole.ReadWrite,
Enabled = true
}
}
};
string? json = JsonSerializer.Serialize(defaultConfig, new JsonSerializerOptions
{
WriteIndented = true
});
string? directory = Path.GetDirectoryName(_configFilePath);
if (!string.IsNullOrEmpty(directory) && !Directory.Exists(directory))
{
Directory.CreateDirectory(directory);
}
File.WriteAllText(_configFilePath, json);
Logger.Information("Created default API key configuration at {FilePath}", _configFilePath);
// Load the created configuration
LoadConfiguration();
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to create default API key configuration");
}
}
}
}

@@ -0,0 +1,329 @@
using System;
using System.IO;
using System.Net;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;
using System.Text;
using Serilog;
using ZB.MOM.WW.LmxProxy.Host.Configuration;
namespace ZB.MOM.WW.LmxProxy.Host.Security
{
/// <summary>
/// Manages TLS certificates for the LmxProxy service, including generation and validation
/// </summary>
public class TlsCertificateManager
{
private static readonly ILogger Logger = Log.ForContext<TlsCertificateManager>();
private readonly TlsConfiguration _tlsConfiguration;
public TlsCertificateManager(TlsConfiguration tlsConfiguration)
{
_tlsConfiguration = tlsConfiguration ?? throw new ArgumentNullException(nameof(tlsConfiguration));
}
/// <summary>
/// Checks TLS certificate status and creates new certificates if needed
/// </summary>
/// <returns>True if certificates are valid or were successfully created</returns>
public bool EnsureCertificatesValid()
{
if (!_tlsConfiguration.Enabled)
{
Logger.Information("TLS is disabled, skipping certificate check");
return true;
}
try
{
// Check if certificate files exist
bool certificateExists = File.Exists(_tlsConfiguration.ServerCertificatePath);
bool keyExists = File.Exists(_tlsConfiguration.ServerKeyPath);
if (!certificateExists || !keyExists)
{
Logger.Warning("TLS certificate or key not found, generating new certificate");
return GenerateNewCertificate();
}
// Check certificate expiration
if (IsCertificateExpiringSoon(_tlsConfiguration.ServerCertificatePath))
{
Logger.Warning("TLS certificate is expiring within the next year, generating new certificate");
return GenerateNewCertificate();
}
Logger.Information("TLS certificate is valid");
return true;
}
catch (Exception ex)
{
Logger.Error(ex, "Error checking TLS certificates");
return false;
}
}
/// <summary>
/// Checks if a certificate is expiring within the next year
/// </summary>
private bool IsCertificateExpiringSoon(string certificatePath)
{
try
{
string certPem = File.ReadAllText(certificatePath);
byte[] certBytes = GetBytesFromPem(certPem, "CERTIFICATE");
using var cert = new X509Certificate2(certBytes);
DateTime expirationDate = cert.NotAfter;
double daysUntilExpiration = (expirationDate - DateTime.Now).TotalDays;
Logger.Information("Certificate expires on {ExpirationDate} ({DaysUntilExpiration:F0} days from now)",
expirationDate, daysUntilExpiration);
// Check if expiring within the next year (365 days)
return daysUntilExpiration <= 365;
}
catch (Exception ex)
{
Logger.Error(ex, "Error checking certificate expiration");
// If we can't check expiration, assume it needs renewal
return true;
}
}
/// <summary>
/// Generates a new self-signed certificate
/// </summary>
private bool GenerateNewCertificate()
{
try
{
Logger.Information("Generating new self-signed TLS certificate");
// Ensure directory exists
string? certDir = Path.GetDirectoryName(_tlsConfiguration.ServerCertificatePath);
if (!string.IsNullOrEmpty(certDir) && !Directory.Exists(certDir))
{
Directory.CreateDirectory(certDir);
Logger.Information("Created certificate directory: {Directory}", certDir);
}
// Generate a new self-signed certificate
using var rsa = RSA.Create(2048);
var request = new CertificateRequest(
"CN=LmxProxy, O=SCADA Bridge, C=US",
rsa,
HashAlgorithmName.SHA256,
RSASignaturePadding.Pkcs1);
// Add certificate extensions
request.CertificateExtensions.Add(
new X509BasicConstraintsExtension(false, false, 0, false));
request.CertificateExtensions.Add(
new X509KeyUsageExtension(
X509KeyUsageFlags.DigitalSignature | X509KeyUsageFlags.KeyEncipherment,
false));
request.CertificateExtensions.Add(
new X509EnhancedKeyUsageExtension(
new OidCollection
{
new Oid("1.3.6.1.5.5.7.3.1") // Server Authentication
},
false));
// Add Subject Alternative Names
var sanBuilder = new SubjectAlternativeNameBuilder();
sanBuilder.AddDnsName("localhost");
sanBuilder.AddDnsName(Environment.MachineName);
sanBuilder.AddIpAddress(IPAddress.Loopback);
sanBuilder.AddIpAddress(IPAddress.IPv6Loopback);
request.CertificateExtensions.Add(sanBuilder.Build());
// Create the certificate with 2-year validity
DateTimeOffset notBefore = DateTimeOffset.Now.AddDays(-1);
DateTimeOffset notAfter = DateTimeOffset.Now.AddYears(2);
using X509Certificate2? cert = request.CreateSelfSigned(notBefore, notAfter);
// Export certificate to PEM format
string certPem = ExportCertificateToPem(cert);
File.WriteAllText(_tlsConfiguration.ServerCertificatePath, certPem);
Logger.Information("Saved certificate to {Path}", _tlsConfiguration.ServerCertificatePath);
// Export private key to PEM format
string keyPem = ExportPrivateKeyToPem(rsa);
File.WriteAllText(_tlsConfiguration.ServerKeyPath, keyPem);
Logger.Information("Saved private key to {Path}", _tlsConfiguration.ServerKeyPath);
// If client CA path is specified and doesn't exist, create it
if (!string.IsNullOrWhiteSpace(_tlsConfiguration.ClientCaCertificatePath) &&
!File.Exists(_tlsConfiguration.ClientCaCertificatePath))
{
// For self-signed certificates, the CA cert is the same as the server cert
File.WriteAllText(_tlsConfiguration.ClientCaCertificatePath, certPem);
Logger.Information("Saved CA certificate to {Path}", _tlsConfiguration.ClientCaCertificatePath);
}
Logger.Information("Successfully generated new TLS certificate valid until {NotAfter}", notAfter);
return true;
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to generate new TLS certificate");
return false;
}
}
/// <summary>
/// Exports a certificate to PEM format
/// </summary>
private static string ExportCertificateToPem(X509Certificate2 cert)
{
var builder = new StringBuilder();
builder.AppendLine("-----BEGIN CERTIFICATE-----");
builder.AppendLine(Convert.ToBase64String(cert.Export(X509ContentType.Cert),
Base64FormattingOptions.InsertLineBreaks));
builder.AppendLine("-----END CERTIFICATE-----");
return builder.ToString();
}
/// <summary>
/// Exports an RSA private key to PEM format
/// </summary>
private static string ExportPrivateKeyToPem(RSA rsa)
{
var builder = new StringBuilder();
builder.AppendLine("-----BEGIN RSA PRIVATE KEY-----");
// .NET Framework 4.8 has no RSA.ExportRSAPrivateKey, so the PKCS#1 structure is encoded manually
RSAParameters parameters = rsa.ExportParameters(true);
byte[] keyBytes = EncodeRSAPrivateKey(parameters);
builder.AppendLine(Convert.ToBase64String(keyBytes, Base64FormattingOptions.InsertLineBreaks));
builder.AppendLine("-----END RSA PRIVATE KEY-----");
return builder.ToString();
}
/// <summary>
/// Encodes RSA parameters to PKCS#1 format for .NET Framework 4.8
/// </summary>
private static byte[] EncodeRSAPrivateKey(RSAParameters parameters)
{
using (var stream = new MemoryStream())
using (var writer = new BinaryWriter(stream))
{
// Write version
writer.Write((byte)0x02); // INTEGER
writer.Write((byte)0x01); // Length
writer.Write((byte)0x00); // Version
// Write modulus
WriteIntegerBytes(writer, parameters.Modulus);
// Write public exponent
WriteIntegerBytes(writer, parameters.Exponent);
// Write private exponent
WriteIntegerBytes(writer, parameters.D);
// Write prime1
WriteIntegerBytes(writer, parameters.P);
// Write prime2
WriteIntegerBytes(writer, parameters.Q);
// Write exponent1
WriteIntegerBytes(writer, parameters.DP);
// Write exponent2
WriteIntegerBytes(writer, parameters.DQ);
// Write coefficient
WriteIntegerBytes(writer, parameters.InverseQ);
byte[] innerBytes = stream.ToArray();
// Create SEQUENCE wrapper
using (var finalStream = new MemoryStream())
using (var finalWriter = new BinaryWriter(finalStream))
{
finalWriter.Write((byte)0x30); // SEQUENCE
WriteLength(finalWriter, innerBytes.Length);
finalWriter.Write(innerBytes);
return finalStream.ToArray();
}
}
}
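private static void WriteIntegerBytes(BinaryWriter writer, byte[] bytes)
{
if (bytes == null || bytes.Length == 0)
{
bytes = new byte[] { 0 };
}
// DER requires the minimal encoding, so skip redundant leading zero bytes
int offset = 0;
while (offset < bytes.Length - 1 && bytes[offset] == 0)
{
offset++;
}
int count = bytes.Length - offset;
writer.Write((byte)0x02); // INTEGER
if (bytes[offset] >= 0x80)
{
// Prepend 0x00 so the value is not read as a negative number
WriteLength(writer, count + 1);
writer.Write((byte)0x00);
}
else
{
WriteLength(writer, count);
}
writer.Write(bytes, offset, count);
}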
private static void WriteLength(BinaryWriter writer, int length)
{
if (length < 0x80)
{
writer.Write((byte)length);
}
else if (length <= 0xFF)
{
writer.Write((byte)0x81);
writer.Write((byte)length);
}
else
{
writer.Write((byte)0x82);
writer.Write((byte)(length >> 8));
writer.Write((byte)(length & 0xFF));
}
}
/// <summary>
/// Extracts bytes from PEM format
/// </summary>
private static byte[] GetBytesFromPem(string pem, string section)
{
string header = $"-----BEGIN {section}-----";
string footer = $"-----END {section}-----";
int start = pem.IndexOf(header, StringComparison.Ordinal);
if (start < 0)
{
throw new InvalidOperationException($"PEM {section} header not found");
}
start += header.Length;
int end = pem.IndexOf(footer, start, StringComparison.Ordinal);
if (end < 0)
{
throw new InvalidOperationException($"PEM {section} footer not found");
}
// Use Substring instead of range syntax for .NET Framework 4.8 compatibility
string base64 = pem.Substring(start, end - start).Replace("\r", "").Replace("\n", "");
return Convert.FromBase64String(base64);
}
}
}
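
The DER helpers in the file above (`WriteLength`, `WriteIntegerBytes`) are easier to follow with concrete byte sequences. The following is a minimal standalone sketch, not the code from this commit: `DerSketch`, `EncodeLength`, and `EncodeUnsignedInteger` are hypothetical names introduced here for illustration. It assumes lengths of at most 0xFFFF, which is enough for a 2048-bit RSA key, and it trims leading zero bytes because DER requires the shortest INTEGER encoding.

```csharp
using System;
using System.IO;

// Hypothetical illustration of DER length and INTEGER encoding.
public static class DerSketch
{
    // Short form below 0x80; otherwise 0x81 or 0x82 followed by the
    // length in big-endian order.
    public static byte[] EncodeLength(int length)
    {
        if (length < 0 || length > 0xFFFF)
        {
            throw new ArgumentOutOfRangeException(nameof(length));
        }
        if (length < 0x80)
        {
            return new[] { (byte)length };
        }
        if (length <= 0xFF)
        {
            return new byte[] { 0x81, (byte)length };
        }
        return new byte[] { 0x82, (byte)(length >> 8), (byte)(length & 0xFF) };
    }

    // Encodes an unsigned big-endian value as a DER INTEGER (tag 0x02).
    public static byte[] EncodeUnsignedInteger(byte[] value)
    {
        if (value == null || value.Length == 0)
        {
            value = new byte[] { 0 };
        }
        // Skip redundant leading zeros so the encoding is minimal.
        int offset = 0;
        while (offset < value.Length - 1 && value[offset] == 0)
        {
            offset++;
        }
        int count = value.Length - offset;
        // A set high bit would mark the number as negative, so pad with 0x00.
        bool pad = value[offset] >= 0x80;
        using (var stream = new MemoryStream())
        {
            stream.WriteByte(0x02);
            byte[] len = EncodeLength(count + (pad ? 1 : 0));
            stream.Write(len, 0, len.Length);
            if (pad)
            {
                stream.WriteByte(0x00);
            }
            stream.Write(value, offset, count);
            return stream.ToArray();
        }
    }
}
```

For example, `DerSketch.EncodeLength(200)` yields `81 C8`, and `DerSketch.EncodeUnsignedInteger(new byte[] { 0x80 })` yields `02 02 00 80`.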

@@ -0,0 +1,189 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Serilog;
using ZB.MOM.WW.LmxProxy.Host.Domain;
namespace ZB.MOM.WW.LmxProxy.Host.Services
{
/// <summary>
/// Health check service for monitoring LmxProxy health
/// </summary>
public class HealthCheckService : IHealthCheck
{
private static readonly ILogger Logger = Log.ForContext<HealthCheckService>();
private readonly PerformanceMetrics _performanceMetrics;
private readonly IScadaClient _scadaClient;
private readonly SubscriptionManager _subscriptionManager;
public HealthCheckService(
IScadaClient scadaClient,
SubscriptionManager subscriptionManager,
PerformanceMetrics performanceMetrics)
{
_scadaClient = scadaClient ?? throw new ArgumentNullException(nameof(scadaClient));
_subscriptionManager = subscriptionManager ?? throw new ArgumentNullException(nameof(subscriptionManager));
_performanceMetrics = performanceMetrics ?? throw new ArgumentNullException(nameof(performanceMetrics));
}
public Task<HealthCheckResult> CheckHealthAsync(
HealthCheckContext context,
CancellationToken cancellationToken = default)
{
var data = new Dictionary<string, object>();
try
{
// Check SCADA connection
bool isConnected = _scadaClient.IsConnected;
ConnectionState connectionState = _scadaClient.ConnectionState;
data["scada_connected"] = isConnected;
data["scada_connection_state"] = connectionState.ToString();
// Get subscription statistics
SubscriptionStats subscriptionStats = _subscriptionManager.GetSubscriptionStats();
data["total_clients"] = subscriptionStats.TotalClients;
data["total_tags"] = subscriptionStats.TotalTags;
// Get performance metrics
IReadOnlyDictionary<string, OperationMetrics> metrics = _performanceMetrics.GetAllMetrics();
long totalOperations = 0L;
double averageSuccessRate = 0.0;
foreach (OperationMetrics? metric in metrics.Values)
{
MetricsStatistics stats = metric.GetStatistics();
totalOperations += stats.TotalCount;
averageSuccessRate += stats.SuccessRate;
}
if (metrics.Count > 0)
{
averageSuccessRate /= metrics.Count;
}
data["total_operations"] = totalOperations;
data["average_success_rate"] = averageSuccessRate;
// Determine health status
if (!isConnected)
{
return Task.FromResult(HealthCheckResult.Unhealthy(
"SCADA client is not connected",
data: data));
}
if (averageSuccessRate < 0.5 && totalOperations > 100)
{
return Task.FromResult(HealthCheckResult.Degraded(
$"Low success rate: {averageSuccessRate:P}",
data: data));
}
if (subscriptionStats.TotalClients > 100)
{
return Task.FromResult(HealthCheckResult.Degraded(
$"High client count: {subscriptionStats.TotalClients}",
data: data));
}
return Task.FromResult(HealthCheckResult.Healthy(
"LmxProxy is healthy",
data));
}
catch (Exception ex)
{
Logger.Error(ex, "Health check failed");
data["error"] = ex.Message;
return Task.FromResult(HealthCheckResult.Unhealthy(
"Health check threw an exception",
ex,
data));
}
}
}
/// <summary>
/// Detailed health check that performs additional connectivity tests
/// </summary>
public class DetailedHealthCheckService : IHealthCheck
{
private static readonly ILogger Logger = Log.ForContext<DetailedHealthCheckService>();
private readonly IScadaClient _scadaClient;
private readonly string _testTagAddress;
public DetailedHealthCheckService(IScadaClient scadaClient, string testTagAddress = "System.Heartbeat")
{
_scadaClient = scadaClient ?? throw new ArgumentNullException(nameof(scadaClient));
_testTagAddress = testTagAddress;
}
public async Task<HealthCheckResult> CheckHealthAsync(
HealthCheckContext context,
CancellationToken cancellationToken = default)
{
var data = new Dictionary<string, object>();
try
{
// Basic connectivity check
if (!_scadaClient.IsConnected)
{
data["connected"] = false;
return HealthCheckResult.Unhealthy("SCADA client is not connected", data: data);
}
data["connected"] = true;
// Try to read a test tag
try
{
Vtq vtq = await _scadaClient.ReadAsync(_testTagAddress, cancellationToken);
data["test_tag_quality"] = vtq.Quality.ToString();
data["test_tag_timestamp"] = vtq.Timestamp;
if (vtq.Quality != Quality.Good)
{
return HealthCheckResult.Degraded(
$"Test tag quality is {vtq.Quality}",
data: data);
}
// Check if timestamp is recent (within last 5 minutes)
TimeSpan age = DateTime.UtcNow - vtq.Timestamp;
if (age > TimeSpan.FromMinutes(5))
{
data["timestamp_age_minutes"] = age.TotalMinutes;
return HealthCheckResult.Degraded(
$"Test tag timestamp is stale ({age.TotalMinutes:F1} minutes old)",
data: data);
}
}
catch (Exception readEx)
{
data["test_tag_error"] = readEx.Message;
return HealthCheckResult.Degraded(
"Could not read test tag",
data: data);
}
return HealthCheckResult.Healthy("All checks passed", data);
}
catch (Exception ex)
{
Logger.Error(ex, "Detailed health check failed");
data["error"] = ex.Message;
return HealthCheckResult.Unhealthy(
"Health check threw an exception",
ex,
data);
}
}
}
}

@@ -0,0 +1,213 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using Serilog;
namespace ZB.MOM.WW.LmxProxy.Host.Services
{
/// <summary>
/// Provides performance metrics tracking for LmxProxy operations
/// </summary>
public class PerformanceMetrics : IDisposable
{
private static readonly ILogger Logger = Log.ForContext<PerformanceMetrics>();
private readonly ConcurrentDictionary<string, OperationMetrics> _metrics = new();
private readonly Timer _reportingTimer;
private bool _disposed;
/// <summary>
/// Initializes a new instance of the PerformanceMetrics class
/// </summary>
public PerformanceMetrics()
{
// Report metrics every minute
_reportingTimer = new Timer(ReportMetrics, null, TimeSpan.FromMinutes(1), TimeSpan.FromMinutes(1));
}
public void Dispose()
{
if (_disposed)
{
return;
}
_disposed = true;
_reportingTimer?.Dispose();
ReportMetrics(null); // Final report
}
/// <summary>
/// Records the execution time of an operation
/// </summary>
public void RecordOperation(string operationName, TimeSpan duration, bool success = true)
{
OperationMetrics? metrics = _metrics.GetOrAdd(operationName, _ => new OperationMetrics());
metrics.Record(duration, success);
}
/// <summary>
/// Creates a timing scope for measuring operation duration
/// </summary>
public ITimingScope BeginOperation(string operationName) => new TimingScope(this, operationName);
/// <summary>
/// Gets current metrics for a specific operation
/// </summary>
public OperationMetrics? GetMetrics(string operationName) =>
_metrics.TryGetValue(operationName, out OperationMetrics? metrics) ? metrics : null;
/// <summary>
/// Gets all current metrics
/// </summary>
public IReadOnlyDictionary<string, OperationMetrics> GetAllMetrics() =>
_metrics.ToDictionary(kvp => kvp.Key, kvp => kvp.Value);
/// <summary>
/// Gets statistics for all operations
/// </summary>
public Dictionary<string, MetricsStatistics> GetStatistics() =>
_metrics.ToDictionary(kvp => kvp.Key, kvp => kvp.Value.GetStatistics());
private void ReportMetrics(object? state)
{
foreach (KeyValuePair<string, OperationMetrics> kvp in _metrics)
{
MetricsStatistics stats = kvp.Value.GetStatistics();
if (stats.TotalCount > 0)
{
Logger.Information(
"Performance Metrics - {Operation}: Count={Count}, Success={SuccessRate:P}, " +
"Avg={AverageMs:F2}ms, Min={MinMs:F2}ms, Max={MaxMs:F2}ms, P95={P95Ms:F2}ms",
kvp.Key,
stats.TotalCount,
stats.SuccessRate,
stats.AverageMilliseconds,
stats.MinMilliseconds,
stats.MaxMilliseconds,
stats.Percentile95Milliseconds);
}
}
}
/// <summary>
/// Timing scope for automatic duration measurement
/// </summary>
public interface ITimingScope : IDisposable
{
void SetSuccess(bool success);
}
private class TimingScope : ITimingScope
{
private readonly PerformanceMetrics _metrics;
private readonly string _operationName;
private readonly Stopwatch _stopwatch;
private bool _disposed;
private bool _success = true;
public TimingScope(PerformanceMetrics metrics, string operationName)
{
_metrics = metrics;
_operationName = operationName;
_stopwatch = Stopwatch.StartNew();
}
public void SetSuccess(bool success) => _success = success;
public void Dispose()
{
if (_disposed)
{
return;
}
_disposed = true;
_stopwatch.Stop();
_metrics.RecordOperation(_operationName, _stopwatch.Elapsed, _success);
}
}
}
/// <summary>
/// Metrics for a specific operation
/// </summary>
public class OperationMetrics
{
private readonly List<double> _durations = new();
private readonly object _lock = new();
private double _maxMilliseconds;
private double _minMilliseconds = double.MaxValue;
private long _successCount;
private long _totalCount;
private double _totalMilliseconds;
public void Record(TimeSpan duration, bool success)
{
lock (_lock)
{
double ms = duration.TotalMilliseconds;
_durations.Add(ms);
_totalCount++;
if (success)
{
_successCount++;
}
_totalMilliseconds += ms;
_minMilliseconds = Math.Min(_minMilliseconds, ms);
_maxMilliseconds = Math.Max(_maxMilliseconds, ms);
// Keep only last 1000 samples for percentile calculation
if (_durations.Count > 1000)
{
_durations.RemoveAt(0);
}
}
}
public MetricsStatistics GetStatistics()
{
lock (_lock)
{
if (_totalCount == 0)
{
return new MetricsStatistics();
}
var sortedDurations = _durations.OrderBy(d => d).ToList();
int p95Index = (int)Math.Ceiling(sortedDurations.Count * 0.95) - 1;
return new MetricsStatistics
{
TotalCount = _totalCount,
SuccessCount = _successCount,
SuccessRate = _successCount / (double)_totalCount,
AverageMilliseconds = _totalMilliseconds / _totalCount,
MinMilliseconds = _minMilliseconds == double.MaxValue ? 0 : _minMilliseconds,
MaxMilliseconds = _maxMilliseconds,
Percentile95Milliseconds = sortedDurations.Count > 0 ? sortedDurations[Math.Max(0, p95Index)] : 0
};
}
}
}
/// <summary>
/// Statistics for an operation
/// </summary>
public class MetricsStatistics
{
public long TotalCount { get; set; }
public long SuccessCount { get; set; }
public double SuccessRate { get; set; }
public double AverageMilliseconds { get; set; }
public double MinMilliseconds { get; set; }
public double MaxMilliseconds { get; set; }
public double Percentile95Milliseconds { get; set; }
}
}
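
`OperationMetrics.GetStatistics` in the file above picks the 95th percentile by the nearest-rank method: sort the retained samples and take the element at index `ceil(n * 0.95) - 1`. The sketch below isolates that calculation so it can be checked by hand. `PercentileSketch` and `NearestRank` are hypothetical names used only for illustration and are not part of the commit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of the nearest-rank percentile calculation.
public static class PercentileSketch
{
    public static double NearestRank(IEnumerable<double> samples, double percentile)
    {
        List<double> sorted = samples.OrderBy(d => d).ToList();
        if (sorted.Count == 0)
        {
            return 0;
        }
        // Nearest rank: the smallest sample with at least `percentile`
        // of all samples at or below it.
        int index = (int)Math.Ceiling(sorted.Count * percentile) - 1;
        // Clamp so very small sample counts still give a valid index.
        return sorted[Math.Min(sorted.Count - 1, Math.Max(0, index))];
    }
}
```

With the samples 1 through 100, `NearestRank(samples, 0.95)` returns 95. With a single sample it returns that sample.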

@@ -0,0 +1,193 @@
using System;
using System.Threading.Tasks;
using Polly;
using Polly.Timeout;
using Serilog;
namespace ZB.MOM.WW.LmxProxy.Host.Services
{
/// <summary>
/// Provides retry policies for resilient operations
/// </summary>
public static class RetryPolicies
{
private static readonly ILogger Logger = Log.ForContext(typeof(RetryPolicies));
/// <summary>
/// Creates a retry policy with exponential backoff for read operations
/// </summary>
public static IAsyncPolicy<T> CreateReadPolicy<T>()
{
return Policy<T>
.Handle<Exception>(ex => !(ex is ArgumentException || ex is InvalidOperationException))
.WaitAndRetryAsync(
3,
retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt - 1)),
(outcome, timespan, retryCount, context) =>
{
Exception? exception = outcome.Exception;
Logger.Warning(exception,
"Read operation retry {RetryCount} after {DelayMs}ms. Operation: {Operation}",
retryCount,
timespan.TotalMilliseconds,
context.ContainsKey("Operation") ? context["Operation"] : "Unknown");
});
}
/// <summary>
/// Creates a retry policy with exponential backoff for write operations
/// </summary>
public static IAsyncPolicy CreateWritePolicy()
{
return Policy
.Handle<Exception>(ex => !(ex is ArgumentException || ex is InvalidOperationException))
.WaitAndRetryAsync(
3,
retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)),
(exception, timespan, retryCount, context) =>
{
Logger.Warning(exception,
"Write operation retry {RetryCount} after {DelayMs}ms. Operation: {Operation}",
retryCount,
timespan.TotalMilliseconds,
context.ContainsKey("Operation") ? context["Operation"] : "Unknown");
});
}
/// <summary>
/// Creates a retry policy for connection operations with longer delays
/// </summary>
public static IAsyncPolicy CreateConnectionPolicy()
{
return Policy
.Handle<Exception>()
.WaitAndRetryAsync(
5,
retryAttempt =>
{
// 2s, 4s, 8s, 16s, 32s
var delay = TimeSpan.FromSeconds(Math.Min(32, Math.Pow(2, retryAttempt)));
return delay;
},
(exception, timespan, retryCount, context) =>
{
Logger.Warning(exception,
"Connection retry {RetryCount} after {DelayMs}ms",
retryCount,
timespan.TotalMilliseconds);
});
}
/// <summary>
/// Creates a circuit breaker policy for protecting against repeated failures
/// </summary>
public static IAsyncPolicy<T> CreateCircuitBreakerPolicy<T>()
{
return Policy<T>
.Handle<Exception>()
.CircuitBreakerAsync(
5,
TimeSpan.FromSeconds(30),
(result, timespan) =>
{
Logger.Error(result.Exception,
"Circuit breaker opened for {BreakDurationSeconds}s due to repeated failures",
timespan.TotalSeconds);
},
() => { Logger.Information("Circuit breaker reset - resuming normal operations"); },
() => { Logger.Information("Circuit breaker half-open - testing operation"); });
}
/// <summary>
/// Creates a combined policy with retry and circuit breaker
/// </summary>
public static IAsyncPolicy<T> CreateCombinedPolicy<T>()
{
IAsyncPolicy<T> retry = CreateReadPolicy<T>();
IAsyncPolicy<T> circuitBreaker = CreateCircuitBreakerPolicy<T>();
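// Retry is the outer policy and the circuit breaker is the inner one.
// Every individual attempt therefore passes through the breaker, so each failed attempt
// (not just an exhausted retry sequence) counts toward opening the circuit.
// Once the circuit is open, the resulting BrokenCircuitException is itself retried by the outer policy.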
return Policy.WrapAsync(retry, circuitBreaker);
}
/// <summary>
/// Creates a timeout policy for operations
/// </summary>
public static IAsyncPolicy CreateTimeoutPolicy(TimeSpan timeout)
{
return Policy
.TimeoutAsync(
timeout,
TimeoutStrategy.Pessimistic,
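(context, timespan, task) =>
{
Logger.Warning(
"Operation timed out after {TimeoutMs}ms. Operation: {Operation}",
timespan.TotalMilliseconds,
context.ContainsKey("Operation") ? context["Operation"] : "Unknown");
// Do not await the abandoned task here: Polly awaits this callback before
// throwing TimeoutRejectedException, so awaiting would defeat the timeout.
// Attach a continuation that observes any late fault instead.
task?.ContinueWith(
t => Logger.Debug(t.Exception, "Timed-out operation faulted after the timeout elapsed"),
TaskContinuationOptions.OnlyOnFaulted);
return Task.CompletedTask;
});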
}
/// <summary>
/// Creates a bulkhead policy to limit concurrent operations
/// </summary>
public static IAsyncPolicy CreateBulkheadPolicy(int maxParallelization, int maxQueuingActions = 100)
{
return Policy
.BulkheadAsync(
maxParallelization,
maxQueuingActions,
context =>
{
Logger.Warning(
"Bulkhead rejected operation. Max parallelization: {MaxParallel}, Queue: {MaxQueue}",
maxParallelization,
maxQueuingActions);
return Task.CompletedTask;
});
}
}
/// <summary>
/// Extension methods for applying retry policies
/// </summary>
public static class RetryPolicyExtensions
{
/// <summary>
/// Executes an operation with retry policy
/// </summary>
public static async Task<T> ExecuteWithRetryAsync<T>(
this IAsyncPolicy<T> policy,
Func<Task<T>> operation,
string operationName)
{
var context = new Context { ["Operation"] = operationName };
return await policy.ExecuteAsync(async ctx => await operation(), context);
}
/// <summary>
/// Executes an operation with retry policy (non-generic)
/// </summary>
public static async Task ExecuteWithRetryAsync(
this IAsyncPolicy policy,
Func<Task> operation,
string operationName)
{
var context = new Context { ["Operation"] = operationName };
await policy.ExecuteAsync(async ctx => await operation(), context);
}
}
}
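
The three retry policies in the file above differ only in retry count, exponent, and cap. The sketch below reproduces just the delay arithmetic, without Polly, to make the resulting schedules explicit. `BackoffSketch` and `DelaysSeconds` are hypothetical names introduced for illustration; `exponentOffset` is -1 for the read policy (`2^(attempt - 1)`) and 0 for the write and connection policies (`2^attempt`).

```csharp
using System;
using System.Linq;

// Hypothetical illustration of the exponential backoff schedules.
public static class BackoffSketch
{
    // Returns the delay in seconds before each retry, for attempts 1..retries.
    public static double[] DelaysSeconds(
        int retries,
        int exponentOffset,
        double capSeconds = double.MaxValue)
    {
        return Enumerable.Range(1, retries)
            .Select(attempt => Math.Min(capSeconds, Math.Pow(2, attempt + exponentOffset)))
            .ToArray();
    }
}
```

Under these assumptions, `DelaysSeconds(3, -1)` gives 1, 2, 4 (read), `DelaysSeconds(3, 0)` gives 2, 4, 8 (write), and `DelaysSeconds(5, 0, 32)` gives 2, 4, 8, 16, 32 (connection). Because the connection policy stops after five retries, its 32-second cap never actually limits a delay. It would only matter if the retry count were raised.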

@@ -0,0 +1,182 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using Serilog;
namespace ZB.MOM.WW.LmxProxy.Host.Services
{
/// <summary>
/// Manages client sessions for the gRPC service.
/// Tracks active sessions with unique session IDs.
/// </summary>
public class SessionManager : IDisposable
{
private static readonly ILogger Logger = Log.ForContext<SessionManager>();
private readonly ConcurrentDictionary<string, SessionInfo> _sessions = new();
private bool _disposed;
/// <summary>
/// Gets the number of active sessions.
/// </summary>
public int ActiveSessionCount => _sessions.Count;
/// <summary>
/// Creates a new session for a client.
/// </summary>
/// <param name="clientId">The client identifier.</param>
/// <param name="apiKey">The API key used for authentication (optional).</param>
/// <returns>The session ID for the new session.</returns>
/// <exception cref="ObjectDisposedException">Thrown if the manager is disposed.</exception>
public string CreateSession(string clientId, string apiKey = null)
{
if (_disposed)
{
throw new ObjectDisposedException(nameof(SessionManager));
}
var sessionId = Guid.NewGuid().ToString("N");
var sessionInfo = new SessionInfo
{
SessionId = sessionId,
ClientId = clientId ?? string.Empty,
ApiKey = apiKey ?? string.Empty,
ConnectedAt = DateTime.UtcNow,
LastActivity = DateTime.UtcNow
};
_sessions[sessionId] = sessionInfo;
Logger.Information("Created session {SessionId} for client {ClientId}", sessionId, clientId);
return sessionId;
}
/// <summary>
/// Validates a session ID and updates the last activity timestamp.
/// </summary>
/// <param name="sessionId">The session ID to validate.</param>
/// <returns>True if the session is valid; otherwise, false.</returns>
public bool ValidateSession(string sessionId)
{
if (_disposed)
{
return false;
}
if (string.IsNullOrEmpty(sessionId))
{
return false;
}
if (_sessions.TryGetValue(sessionId, out SessionInfo sessionInfo))
{
sessionInfo.LastActivity = DateTime.UtcNow;
return true;
}
return false;
}
/// <summary>
/// Gets the session information for a session ID.
/// </summary>
/// <param name="sessionId">The session ID.</param>
/// <returns>The session information, or null if not found.</returns>
public SessionInfo GetSession(string sessionId)
{
if (_disposed || string.IsNullOrEmpty(sessionId))
{
return null;
}
_sessions.TryGetValue(sessionId, out SessionInfo sessionInfo);
return sessionInfo;
}
/// <summary>
/// Terminates a session.
/// </summary>
/// <param name="sessionId">The session ID to terminate.</param>
/// <returns>True if the session was terminated; otherwise, false.</returns>
public bool TerminateSession(string sessionId)
{
if (_disposed || string.IsNullOrEmpty(sessionId))
{
return false;
}
if (_sessions.TryRemove(sessionId, out SessionInfo sessionInfo))
{
Logger.Information("Terminated session {SessionId} for client {ClientId}", sessionId, sessionInfo.ClientId);
return true;
}
return false;
}
/// <summary>
/// Gets all active sessions.
/// </summary>
/// <returns>A list of all active session information.</returns>
public IReadOnlyList<SessionInfo> GetAllSessions()
{
return _sessions.Values.ToList();
}
/// <summary>
/// Disposes the session manager and clears all sessions.
/// </summary>
public void Dispose()
{
if (_disposed)
{
return;
}
_disposed = true;
var count = _sessions.Count;
_sessions.Clear();
Logger.Information("SessionManager disposed, cleared {Count} sessions", count);
}
}
/// <summary>
/// Contains information about a client session.
/// </summary>
public class SessionInfo
{
/// <summary>
/// Gets or sets the unique session identifier.
/// </summary>
public string SessionId { get; set; } = string.Empty;
/// <summary>
/// Gets or sets the client identifier.
/// </summary>
public string ClientId { get; set; } = string.Empty;
/// <summary>
/// Gets or sets the API key used for this session.
/// </summary>
public string ApiKey { get; set; } = string.Empty;
/// <summary>
/// Gets or sets the time when the session was created.
/// </summary>
public DateTime ConnectedAt { get; set; }
/// <summary>
/// Gets or sets the time of the last activity on this session.
/// </summary>
public DateTime LastActivity { get; set; }
/// <summary>
/// Gets the connected time as UTC ticks for the gRPC response.
/// </summary>
public long ConnectedSinceUtcTicks => ConnectedAt.Ticks;
}
}

@@ -0,0 +1,433 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Serilog;
using ZB.MOM.WW.LmxProxy.Host.Domain;
namespace ZB.MOM.WW.LmxProxy.Host.Services
{
/// <summary>
/// Service for collecting and formatting status information from various LmxProxy components
/// </summary>
public class StatusReportService
{
private static readonly ILogger Logger = Log.ForContext<StatusReportService>();
private readonly DetailedHealthCheckService? _detailedHealthCheckService;
private readonly HealthCheckService _healthCheckService;
private readonly PerformanceMetrics _performanceMetrics;
private readonly IScadaClient _scadaClient;
private readonly SubscriptionManager _subscriptionManager;
/// <summary>
/// Initializes a new instance of the StatusReportService class
/// </summary>
public StatusReportService(
IScadaClient scadaClient,
SubscriptionManager subscriptionManager,
PerformanceMetrics performanceMetrics,
HealthCheckService healthCheckService,
DetailedHealthCheckService? detailedHealthCheckService = null)
{
_scadaClient = scadaClient ?? throw new ArgumentNullException(nameof(scadaClient));
_subscriptionManager = subscriptionManager ?? throw new ArgumentNullException(nameof(subscriptionManager));
_performanceMetrics = performanceMetrics ?? throw new ArgumentNullException(nameof(performanceMetrics));
_healthCheckService = healthCheckService ?? throw new ArgumentNullException(nameof(healthCheckService));
_detailedHealthCheckService = detailedHealthCheckService;
}
/// <summary>
/// Generates a comprehensive status report as HTML
/// </summary>
public async Task<string> GenerateHtmlReportAsync()
{
try
{
StatusData statusData = await CollectStatusDataAsync();
return GenerateHtmlFromStatusData(statusData);
}
catch (Exception ex)
{
Logger.Error(ex, "Error generating HTML status report");
return GenerateErrorHtml(ex);
}
}
/// <summary>
/// Generates a comprehensive status report as JSON
/// </summary>
public async Task<string> GenerateJsonReportAsync()
{
try
{
StatusData statusData = await CollectStatusDataAsync();
return JsonSerializer.Serialize(statusData, new JsonSerializerOptions
{
WriteIndented = true,
PropertyNamingPolicy = JsonNamingPolicy.CamelCase
});
}
catch (Exception ex)
{
Logger.Error(ex, "Error generating JSON status report");
return JsonSerializer.Serialize(new { error = ex.Message }, new JsonSerializerOptions
{
WriteIndented = true,
PropertyNamingPolicy = JsonNamingPolicy.CamelCase
});
}
}
/// <summary>
/// Checks if the service is healthy
/// </summary>
public async Task<bool> IsHealthyAsync()
{
try
{
HealthCheckResult healthResult = await _healthCheckService.CheckHealthAsync(new HealthCheckContext());
return healthResult.Status == HealthStatus.Healthy;
}
catch (Exception ex)
{
Logger.Error(ex, "Error checking health status");
return false;
}
}
/// <summary>
/// Collects status data from all components
/// </summary>
private async Task<StatusData> CollectStatusDataAsync()
{
var statusData = new StatusData
{
Timestamp = DateTime.UtcNow,
ServiceName = "ZB.MOM.WW.LmxProxy.Host",
Version = Assembly.GetExecutingAssembly().GetName().Version?.ToString() ?? "Unknown"
};
// Collect connection status
statusData.Connection = new ConnectionStatus
{
IsConnected = _scadaClient.IsConnected,
State = _scadaClient.ConnectionState.ToString(),
NodeName = "N/A", // Could be extracted from configuration if needed
GalaxyName = "N/A" // Could be extracted from configuration if needed
};
// Collect subscription statistics
SubscriptionStats subscriptionStats = _subscriptionManager.GetSubscriptionStats();
statusData.Subscriptions = new SubscriptionStatus
{
TotalClients = subscriptionStats.TotalClients,
TotalTags = subscriptionStats.TotalTags,
ActiveSubscriptions = subscriptionStats.TotalTags // Approximation: one SCADA subscription is held per tag, so the tag count is reported
};
// Collect performance metrics
Dictionary<string, MetricsStatistics> perfMetrics = _performanceMetrics.GetStatistics();
statusData.Performance = new PerformanceStatus
{
TotalOperations = perfMetrics.Values.Sum(m => m.TotalCount),
AverageSuccessRate = perfMetrics.Count > 0 ? perfMetrics.Values.Average(m => m.SuccessRate) : 1.0,
Operations = perfMetrics.ToDictionary(
kvp => kvp.Key,
kvp => new OperationStatus
{
TotalCount = kvp.Value.TotalCount,
SuccessRate = kvp.Value.SuccessRate,
AverageMilliseconds = kvp.Value.AverageMilliseconds,
MinMilliseconds = kvp.Value.MinMilliseconds,
MaxMilliseconds = kvp.Value.MaxMilliseconds
})
};
// Collect health check results
try
{
HealthCheckResult healthResult = await _healthCheckService.CheckHealthAsync(new HealthCheckContext());
statusData.Health = new HealthInfo
{
Status = healthResult.Status.ToString(),
Description = healthResult.Description ?? "",
Data = healthResult.Data?.ToDictionary(kvp => kvp.Key, kvp => kvp.Value?.ToString() ?? "") ??
new Dictionary<string, string>()
};
// Collect detailed health check if available
if (_detailedHealthCheckService != null)
{
HealthCheckResult detailedHealthResult =
await _detailedHealthCheckService.CheckHealthAsync(new HealthCheckContext());
statusData.DetailedHealth = new HealthInfo
{
Status = detailedHealthResult.Status.ToString(),
Description = detailedHealthResult.Description ?? "",
Data = detailedHealthResult.Data?.ToDictionary(kvp => kvp.Key,
kvp => kvp.Value?.ToString() ?? "") ?? new Dictionary<string, string>()
};
}
}
catch (Exception ex)
{
Logger.Error(ex, "Error collecting health check data");
statusData.Health = new HealthInfo
{
Status = "Error",
Description = $"Health check failed: {ex.Message}",
Data = new Dictionary<string, string>()
};
}
return statusData;
}
/// <summary>
/// Generates HTML from status data
/// </summary>
private static string GenerateHtmlFromStatusData(StatusData statusData)
{
var html = new StringBuilder();
html.AppendLine("<!DOCTYPE html>");
html.AppendLine("<html>");
html.AppendLine("<head>");
html.AppendLine(" <title>LmxProxy Status</title>");
html.AppendLine(" <meta charset=\"utf-8\">");
html.AppendLine(" <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">");
html.AppendLine(" <meta http-equiv=\"refresh\" content=\"30\">");
html.AppendLine(" <style>");
html.AppendLine(
" body { font-family: Arial, sans-serif; margin: 40px; background-color: #f5f5f5; }");
html.AppendLine(
" .container { max-width: 1200px; margin: 0 auto; background-color: white; padding: 20px; border-radius: 8px; box-shadow: 0 2px 4px rgba(0,0,0,0.1); }");
html.AppendLine(" .header { text-align: center; margin-bottom: 30px; }");
html.AppendLine(
" .status-grid { display: grid; grid-template-columns: repeat(auto-fit, minmax(300px, 1fr)); gap: 20px; }");
html.AppendLine(
" .status-card { background: #f9f9f9; padding: 15px; border-radius: 6px; border-left: 4px solid #007acc; }");
html.AppendLine(" .status-card h3 { margin-top: 0; color: #333; }");
html.AppendLine(" .status-value { font-weight: bold; color: #007acc; }");
html.AppendLine(" .status-healthy { color: #28a745; }");
html.AppendLine(" .status-warning { color: #ffc107; }");
html.AppendLine(" .status-error { color: #dc3545; }");
html.AppendLine(" .status-connected { border-left-color: #28a745; }");
html.AppendLine(" .status-disconnected { border-left-color: #dc3545; }");
html.AppendLine(" table { width: 100%; border-collapse: collapse; margin-top: 10px; }");
html.AppendLine(" th, td { text-align: left; padding: 8px; border-bottom: 1px solid #ddd; }");
html.AppendLine(" th { background-color: #f2f2f2; }");
html.AppendLine(
" .timestamp { text-align: center; margin-top: 20px; color: #666; font-size: 0.9em; }");
html.AppendLine(" </style>");
html.AppendLine("</head>");
html.AppendLine("<body>");
html.AppendLine(" <div class=\"container\">");
// Header
html.AppendLine(" <div class=\"header\">");
html.AppendLine(" <h1>LmxProxy Status Dashboard</h1>");
html.AppendLine($" <p>Service: {statusData.ServiceName} | Version: {statusData.Version}</p>");
html.AppendLine(" </div>");
html.AppendLine(" <div class=\"status-grid\">");
// Connection Status Card
string connectionClass = statusData.Connection.IsConnected ? "status-connected" : "status-disconnected";
string connectionStatusText = statusData.Connection.IsConnected ? "Connected" : "Disconnected";
string connectionStatusClass = statusData.Connection.IsConnected ? "status-healthy" : "status-error";
html.AppendLine($" <div class=\"status-card {connectionClass}\">");
html.AppendLine(" <h3>MxAccess Connection</h3>");
html.AppendLine(
$" <p>Status: <span class=\"status-value {connectionStatusClass}\">{connectionStatusText}</span></p>");
html.AppendLine(
$" <p>State: <span class=\"status-value\">{statusData.Connection.State}</span></p>");
html.AppendLine(" </div>");
// Subscription Status Card
html.AppendLine(" <div class=\"status-card\">");
html.AppendLine(" <h3>Subscriptions</h3>");
html.AppendLine(
$" <p>Total Clients: <span class=\"status-value\">{statusData.Subscriptions.TotalClients}</span></p>");
html.AppendLine(
$" <p>Total Tags: <span class=\"status-value\">{statusData.Subscriptions.TotalTags}</span></p>");
html.AppendLine(
$" <p>Active Subscriptions: <span class=\"status-value\">{statusData.Subscriptions.ActiveSubscriptions}</span></p>");
html.AppendLine(" </div>");
// Performance Status Card
html.AppendLine(" <div class=\"status-card\">");
html.AppendLine(" <h3>Performance</h3>");
html.AppendLine(
$" <p>Total Operations: <span class=\"status-value\">{statusData.Performance.TotalOperations:N0}</span></p>");
html.AppendLine(
$" <p>Success Rate: <span class=\"status-value\">{statusData.Performance.AverageSuccessRate:P2}</span></p>");
html.AppendLine(" </div>");
// Health Status Card
string healthStatusClass = statusData.Health.Status.ToLowerInvariant() switch
{
"healthy" => "status-healthy",
"degraded" => "status-warning",
_ => "status-error"
};
html.AppendLine(" <div class=\"status-card\">");
html.AppendLine(" <h3>Health Status</h3>");
html.AppendLine(
$" <p>Status: <span class=\"status-value {healthStatusClass}\">{statusData.Health.Status}</span></p>");
html.AppendLine(
$" <p>Description: <span class=\"status-value\">{System.Net.WebUtility.HtmlEncode(statusData.Health.Description)}</span></p>");
html.AppendLine(" </div>");
html.AppendLine(" </div>");
// Performance Metrics Table
if (statusData.Performance.Operations.Any())
{
html.AppendLine(" <div class=\"status-card\" style=\"margin-top: 20px;\">");
html.AppendLine(" <h3>Operation Performance Metrics</h3>");
html.AppendLine(" <table>");
html.AppendLine(" <tr>");
html.AppendLine(" <th>Operation</th>");
html.AppendLine(" <th>Count</th>");
html.AppendLine(" <th>Success Rate</th>");
html.AppendLine(" <th>Avg (ms)</th>");
html.AppendLine(" <th>Min (ms)</th>");
html.AppendLine(" <th>Max (ms)</th>");
html.AppendLine(" </tr>");
foreach (KeyValuePair<string, OperationStatus> operation in statusData.Performance.Operations)
{
html.AppendLine(" <tr>");
html.AppendLine($" <td>{System.Net.WebUtility.HtmlEncode(operation.Key)}</td>");
html.AppendLine($" <td>{operation.Value.TotalCount:N0}</td>");
html.AppendLine($" <td>{operation.Value.SuccessRate:P2}</td>");
html.AppendLine($" <td>{operation.Value.AverageMilliseconds:F2}</td>");
html.AppendLine($" <td>{operation.Value.MinMilliseconds:F2}</td>");
html.AppendLine($" <td>{operation.Value.MaxMilliseconds:F2}</td>");
html.AppendLine(" </tr>");
}
html.AppendLine(" </table>");
html.AppendLine(" </div>");
}
// Timestamp
html.AppendLine(
$" <div class=\"timestamp\">Last updated: {statusData.Timestamp:yyyy-MM-dd HH:mm:ss} UTC</div>");
html.AppendLine(" </div>");
html.AppendLine("</body>");
html.AppendLine("</html>");
return html.ToString();
}
/// <summary>
/// Generates error HTML when status collection fails
/// </summary>
private static string GenerateErrorHtml(Exception ex)
{
return $@"<!DOCTYPE html>
<html>
<head>
<title>LmxProxy Status - Error</title>
<meta charset=""utf-8"">
<style>
body {{ font-family: Arial, sans-serif; margin: 40px; background-color: #f5f5f5; }}
.container {{ max-width: 800px; margin: 0 auto; background-color: white; padding: 20px; border-radius: 8px; box-shadow: 0 2px 4px rgba(0,0,0,0.1); }}
.error {{ color: #dc3545; background-color: #f8d7da; padding: 15px; border-radius: 6px; border: 1px solid #f5c6cb; }}
</style>
</head>
<body>
<div class=""container"">
<h1>LmxProxy Status Dashboard</h1>
<div class=""error"">
<h3>Error Loading Status</h3>
<p>An error occurred while collecting status information:</p>
<p><strong>{System.Net.WebUtility.HtmlEncode(ex.Message)}</strong></p>
</div>
<div style=""text-align: center; margin-top: 20px; color: #666; font-size: 0.9em;"">
Last updated: {DateTime.UtcNow:yyyy-MM-dd HH:mm:ss} UTC
</div>
</div>
</body>
</html>";
}
}
/// <summary>
/// Data structure for holding complete status information
/// </summary>
public class StatusData
{
public DateTime Timestamp { get; set; }
public string ServiceName { get; set; } = "";
public string Version { get; set; } = "";
public ConnectionStatus Connection { get; set; } = new();
public SubscriptionStatus Subscriptions { get; set; } = new();
public PerformanceStatus Performance { get; set; } = new();
public HealthInfo Health { get; set; } = new();
public HealthInfo? DetailedHealth { get; set; }
}
/// <summary>
/// Connection status information
/// </summary>
public class ConnectionStatus
{
public bool IsConnected { get; set; }
public string State { get; set; } = "";
public string NodeName { get; set; } = "";
public string GalaxyName { get; set; } = "";
}
/// <summary>
/// Subscription status information
/// </summary>
public class SubscriptionStatus
{
public int TotalClients { get; set; }
public int TotalTags { get; set; }
public int ActiveSubscriptions { get; set; }
}
/// <summary>
/// Performance status information
/// </summary>
public class PerformanceStatus
{
public long TotalOperations { get; set; }
public double AverageSuccessRate { get; set; }
public Dictionary<string, OperationStatus> Operations { get; set; } = new();
}
/// <summary>
/// Individual operation status
/// </summary>
public class OperationStatus
{
public long TotalCount { get; set; }
public double SuccessRate { get; set; }
public double AverageMilliseconds { get; set; }
public double MinMilliseconds { get; set; }
public double MaxMilliseconds { get; set; }
}
/// <summary>
/// Health check status information
/// </summary>
public class HealthInfo
{
public string Status { get; set; } = "";
public string Description { get; set; } = "";
public Dictionary<string, string> Data { get; set; } = new();
}
}
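
Review note: the HTML report interpolates strings that originate outside this class (health check descriptions, operation names, exception messages). A minimal sketch of encoding such a value before it is placed in markup, using `System.Net.WebUtility.HtmlEncode`. The class and method names below are hypothetical and only illustrate the pattern:

```csharp
using System;
using System.Net;

public static class HtmlEncodingSketch
{
    // Encodes the description so that characters such as '<' and '&'
    // are rendered as text instead of being parsed as markup.
    public static string RenderDescription(string description)
    {
        string safe = WebUtility.HtmlEncode(description ?? string.Empty);
        return $"<p>Description: <span class=\"status-value\">{safe}</span></p>";
    }
}
```

For example, `RenderDescription("<b>x</b>")` emits `&lt;b&gt;x&lt;/b&gt;` inside the span rather than a bold element.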


@@ -0,0 +1,315 @@
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Serilog;
using ZB.MOM.WW.LmxProxy.Host.Configuration;
namespace ZB.MOM.WW.LmxProxy.Host.Services
{
/// <summary>
/// HTTP web server that serves status information for the LmxProxy service
/// </summary>
public class StatusWebServer : IDisposable
{
private static readonly ILogger Logger = Log.ForContext<StatusWebServer>();
private readonly WebServerConfiguration _configuration;
private readonly StatusReportService _statusReportService;
private CancellationTokenSource? _cancellationTokenSource;
private bool _disposed;
private HttpListener? _httpListener;
private Task? _listenerTask;
/// <summary>
/// Initializes a new instance of the StatusWebServer class
/// </summary>
/// <param name="configuration">Web server configuration</param>
/// <param name="statusReportService">Service for collecting status information</param>
public StatusWebServer(WebServerConfiguration configuration, StatusReportService statusReportService)
{
_configuration = configuration ?? throw new ArgumentNullException(nameof(configuration));
_statusReportService = statusReportService ?? throw new ArgumentNullException(nameof(statusReportService));
}
/// <summary>
/// Disposes the web server and releases resources
/// </summary>
public void Dispose()
{
if (_disposed)
{
return;
}
_disposed = true;
Stop();
_cancellationTokenSource?.Dispose();
_httpListener?.Close();
}
/// <summary>
/// Starts the HTTP web server
/// </summary>
/// <returns>True if started successfully, false otherwise</returns>
public bool Start()
{
try
{
if (!_configuration.Enabled)
{
Logger.Information("Status web server is disabled");
return true;
}
Logger.Information("Starting status web server on port {Port}", _configuration.Port);
_httpListener = new HttpListener();
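// Configure the URL prefix. On Windows, a wildcard host such as "+" requires
// a URL ACL reservation (netsh http add urlacl) or an elevated process.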
string prefix = _configuration.Prefix ?? $"http://+:{_configuration.Port}/";
if (!prefix.EndsWith("/"))
{
prefix += "/";
}
_httpListener.Prefixes.Add(prefix);
_httpListener.Start();
_cancellationTokenSource = new CancellationTokenSource();
_listenerTask = Task.Run(() => HandleRequestsAsync(_cancellationTokenSource.Token));
Logger.Information("Status web server started successfully on {Prefix}", prefix);
return true;
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to start status web server");
return false;
}
}
/// <summary>
/// Stops the HTTP web server
/// </summary>
/// <returns>True if stopped successfully, false otherwise</returns>
public bool Stop()
{
try
{
if (!_configuration.Enabled || _httpListener == null)
{
return true;
}
Logger.Information("Stopping status web server");
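_cancellationTokenSource?.Cancel();
// Stop the listener before waiting: GetContextAsync does not observe the
// cancellation token, so the request loop only exits once the pending call is aborted.
_httpListener?.Stop();
if (_listenerTask != null)
{
try
{
_listenerTask.Wait(TimeSpan.FromSeconds(5));
}
catch (Exception ex)
{
Logger.Warning(ex, "Error waiting for listener task to complete");
}
}
_httpListener?.Close();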
Logger.Information("Status web server stopped successfully");
return true;
}
catch (Exception ex)
{
Logger.Error(ex, "Error stopping status web server");
return false;
}
}
/// <summary>
/// Main request handling loop
/// </summary>
private async Task HandleRequestsAsync(CancellationToken cancellationToken)
{
Logger.Information("Status web server listener started");
while (!cancellationToken.IsCancellationRequested && _httpListener != null && _httpListener.IsListening)
{
try
{
HttpListenerContext? context = await _httpListener.GetContextAsync();
// Handle request asynchronously without waiting
_ = Task.Run(async () =>
{
try
{
await HandleRequestAsync(context);
}
catch (Exception ex)
{
Logger.Error(ex, "Error handling HTTP request from {RemoteEndPoint}",
context.Request.RemoteEndPoint);
}
}); // No token here: a request that was already accepted should still be answered and closed
}
catch (ObjectDisposedException)
{
// Expected when stopping the listener
break;
}
catch (HttpListenerException ex) when (ex.ErrorCode == 995) // ERROR_OPERATION_ABORTED
{
// Expected when stopping the listener
break;
}
catch (Exception ex)
{
Logger.Error(ex, "Error in request listener loop");
// Brief delay before continuing to avoid tight error loops
try
{
await Task.Delay(1000, cancellationToken);
}
catch (OperationCanceledException)
{
break;
}
}
}
Logger.Information("Status web server listener stopped");
}
/// <summary>
/// Handles a single HTTP request
/// </summary>
private async Task HandleRequestAsync(HttpListenerContext context)
{
HttpListenerRequest? request = context.Request;
HttpListenerResponse response = context.Response;
try
{
Logger.Debug("Handling {Method} request to {Url} from {RemoteEndPoint}",
request.HttpMethod, request.Url?.AbsolutePath, request.RemoteEndPoint);
// Only allow GET requests
if (request.HttpMethod != "GET")
{
response.StatusCode = 405; // Method Not Allowed
response.StatusDescription = "Method Not Allowed";
response.AddHeader("Allow", "GET"); // A 405 response should list the supported methods
await WriteResponseAsync(response, "Only GET requests are supported", "text/plain");
return;
}
string path = request.Url?.AbsolutePath?.ToLowerInvariant() ?? "/";
switch (path)
{
case "/":
await HandleStatusPageAsync(response);
break;
case "/api/status":
await HandleStatusApiAsync(response);
break;
case "/api/health":
await HandleHealthApiAsync(response);
break;
default:
response.StatusCode = 404; // Not Found
response.StatusDescription = "Not Found";
await WriteResponseAsync(response, "Resource not found", "text/plain");
break;
}
}
catch (Exception ex)
{
Logger.Error(ex, "Error handling HTTP request");
try
{
response.StatusCode = 500; // Internal Server Error
response.StatusDescription = "Internal Server Error";
await WriteResponseAsync(response, "Internal server error", "text/plain");
}
catch (Exception responseEx)
{
Logger.Error(responseEx, "Error writing error response");
}
}
finally
{
try
{
response.Close();
}
catch (Exception ex)
{
Logger.Warning(ex, "Error closing HTTP response");
}
}
}
/// <summary>
/// Handles the main status page (HTML)
/// </summary>
private async Task HandleStatusPageAsync(HttpListenerResponse response)
{
string statusHtml = await _statusReportService.GenerateHtmlReportAsync();
await WriteResponseAsync(response, statusHtml, "text/html; charset=utf-8");
}
/// <summary>
/// Handles the status API endpoint (JSON)
/// </summary>
private async Task HandleStatusApiAsync(HttpListenerResponse response)
{
string statusJson = await _statusReportService.GenerateJsonReportAsync();
await WriteResponseAsync(response, statusJson, "application/json; charset=utf-8");
}
/// <summary>
/// Handles the health API endpoint (simple text)
/// </summary>
private async Task HandleHealthApiAsync(HttpListenerResponse response)
{
bool isHealthy = await _statusReportService.IsHealthyAsync();
string healthText = isHealthy ? "OK" : "UNHEALTHY";
response.StatusCode = isHealthy ? 200 : 503; // Service Unavailable if unhealthy
await WriteResponseAsync(response, healthText, "text/plain");
}
/// <summary>
/// Writes a response to the HTTP context
/// </summary>
private static async Task WriteResponseAsync(HttpListenerResponse response, string content, string contentType)
{
response.ContentType = contentType;
response.Headers.Add("Cache-Control", "no-cache, no-store, must-revalidate");
response.Headers.Add("Pragma", "no-cache");
response.Headers.Add("Expires", "0");
byte[] buffer = Encoding.UTF8.GetBytes(content);
response.ContentLength64 = buffer.Length;
using (Stream? output = response.OutputStream)
{
await output.WriteAsync(buffer, 0, buffer.Length);
}
}
}
}
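
Review note: `/api/health` answers `200` with body `OK` when healthy and `503` with body `UNHEALTHY` otherwise. A minimal sketch of a client-side probe under that contract. `HealthProbeSketch` is a hypothetical name, and the base address passed in (host and port) is an assumption that depends on the configured prefix:

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class HealthProbeSketch
{
    // Mirrors the server's mapping: only 200 with body "OK" counts as healthy.
    public static bool IsHealthy(HttpStatusCode statusCode, string body) =>
        statusCode == HttpStatusCode.OK &&
        string.Equals(body?.Trim(), "OK", StringComparison.Ordinal);

    // Issues GET {baseAddress}api/health and interprets the result.
    // baseAddress is assumed to end with '/', e.g. http://localhost:8080/ (hypothetical port).
    public static async Task<bool> ProbeAsync(HttpClient client, Uri baseAddress)
    {
        using (HttpResponseMessage response = await client.GetAsync(new Uri(baseAddress, "api/health")))
        {
            string body = await response.Content.ReadAsStringAsync();
            return IsHealthy(response.StatusCode, body);
        }
    }
}
```

Keeping the interpretation in a pure function makes the mapping easy to check without a running server.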


@@ -0,0 +1,535 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Serilog;
using ZB.MOM.WW.LmxProxy.Host.Configuration;
using ZB.MOM.WW.LmxProxy.Host.Domain;
namespace ZB.MOM.WW.LmxProxy.Host.Services
{
/// <summary>
/// Manages subscriptions for multiple gRPC clients, handling tag subscriptions, message delivery, and client
/// statistics.
/// </summary>
public class SubscriptionManager : IDisposable
{
private static readonly ILogger Logger = Log.ForContext<SubscriptionManager>();
// Configuration for channel buffering
private readonly int _channelCapacity;
private readonly BoundedChannelFullMode _channelFullMode;
private readonly ConcurrentDictionary<string, ClientSubscription> _clientSubscriptions = new();
private readonly ReaderWriterLockSlim _lock = new(LockRecursionPolicy.NoRecursion);
private readonly IScadaClient _scadaClient;
private readonly ConcurrentDictionary<string, TagSubscription> _tagSubscriptions = new();
private bool _disposed;
/// <summary>
/// Initializes a new instance of the <see cref="SubscriptionManager" /> class.
/// </summary>
/// <param name="scadaClient">The SCADA client to use for subscriptions.</param>
/// <param name="configuration">The subscription configuration.</param>
/// <exception cref="ArgumentNullException">
/// Thrown if <paramref name="scadaClient" /> or <paramref name="configuration" />
/// is null.
/// </exception>
public SubscriptionManager(IScadaClient scadaClient, SubscriptionConfiguration configuration)
{
_scadaClient = scadaClient ?? throw new ArgumentNullException(nameof(scadaClient));
if (configuration == null)
{
throw new ArgumentNullException(nameof(configuration));
}
_channelCapacity = configuration.ChannelCapacity;
_channelFullMode = ParseChannelFullMode(configuration.ChannelFullMode);
// Subscribe to connection state changes
_scadaClient.ConnectionStateChanged += OnConnectionStateChanged;
Logger.Information("SubscriptionManager initialized with channel capacity: {Capacity}, full mode: {Mode}",
_channelCapacity, _channelFullMode);
}
/// <summary>
/// Disposes the <see cref="SubscriptionManager" />, unsubscribing all clients and cleaning up resources.
/// </summary>
public void Dispose()
{
if (_disposed)
{
return;
}
_disposed = true;
Logger.Information("Disposing SubscriptionManager");
// Unsubscribe from connection state changes
_scadaClient.ConnectionStateChanged -= OnConnectionStateChanged;
// Unsubscribe all clients
var clientIds = _clientSubscriptions.Keys.ToList();
foreach (string? clientId in clientIds)
{
UnsubscribeClient(clientId);
}
_clientSubscriptions.Clear();
_tagSubscriptions.Clear();
// Dispose the lock
_lock?.Dispose();
}
/// <summary>
/// Gets the number of active client subscriptions.
/// </summary>
public virtual int GetActiveSubscriptionCount() => _clientSubscriptions.Count;
/// <summary>
/// Parses the channel full mode string to <see cref="BoundedChannelFullMode" />.
/// </summary>
/// <param name="mode">The mode string.</param>
/// <returns>The parsed <see cref="BoundedChannelFullMode" /> value.</returns>
private static BoundedChannelFullMode ParseChannelFullMode(string mode)
{
return mode?.ToUpperInvariant() switch
{
"DROPOLDEST" => BoundedChannelFullMode.DropOldest,
"DROPNEWEST" => BoundedChannelFullMode.DropNewest,
"WAIT" => BoundedChannelFullMode.Wait,
_ => BoundedChannelFullMode.DropOldest // Default
};
}
/// <summary>
/// Creates a new subscription for a client to a set of tag addresses.
/// </summary>
/// <param name="clientId">The client identifier.</param>
/// <param name="addresses">The tag addresses to subscribe to.</param>
/// <param name="ct">Optional cancellation token.</param>
/// <returns>A channel for receiving tag updates.</returns>
/// <exception cref="ObjectDisposedException">Thrown if the manager is disposed.</exception>
public async Task<Channel<(string address, Vtq vtq)>> SubscribeAsync(
string clientId,
IEnumerable<string> addresses,
CancellationToken ct = default)
{
if (_disposed)
{
throw new ObjectDisposedException(nameof(SubscriptionManager));
}
var addressList = addresses.ToList();
Logger.Information("Client {ClientId} subscribing to {Count} tags", clientId, addressList.Count);
// Create a bounded channel for this client with buffering
var channel = Channel.CreateBounded<(string address, Vtq vtq)>(new BoundedChannelOptions(_channelCapacity)
{
FullMode = _channelFullMode,
SingleReader = true,
SingleWriter = false,
AllowSynchronousContinuations = false
});
Logger.Debug("Created bounded channel for client {ClientId} with capacity {Capacity}", clientId,
_channelCapacity);
var clientSubscription = new ClientSubscription
{
ClientId = clientId,
Channel = channel,
Addresses = new HashSet<string>(addressList),
CancellationTokenSource = CancellationTokenSource.CreateLinkedTokenSource(ct)
};
_clientSubscriptions[clientId] = clientSubscription;
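// Subscribe to each tag; roll back the client registration if any tag fails
try
{
foreach (string? address in addressList)
{
await SubscribeToTagAsync(address, clientId);
}
}
catch
{
UnsubscribeClient(clientId);
throw;
}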
// Handle client disconnection
clientSubscription.CancellationTokenSource.Token.Register(() =>
{
Logger.Information("Client {ClientId} disconnected, cleaning up subscriptions", clientId);
UnsubscribeClient(clientId);
});
return channel;
}
/// <summary>
/// Unsubscribes a client from all tags and cleans up resources.
/// </summary>
/// <param name="clientId">The client identifier.</param>
public void UnsubscribeClient(string clientId)
{
if (_clientSubscriptions.TryRemove(clientId, out ClientSubscription? clientSubscription))
{
Logger.Information(
"Unsubscribing client {ClientId} from {Count} tags. Stats: Delivered={Delivered}, Dropped={Dropped}",
clientId, clientSubscription.Addresses.Count,
clientSubscription.DeliveredMessageCount, clientSubscription.DroppedMessageCount);
_lock.EnterWriteLock();
try
{
foreach (string? address in clientSubscription.Addresses)
{
if (_tagSubscriptions.TryGetValue(address, out TagSubscription? tagSubscription))
{
tagSubscription.ClientIds.Remove(clientId);
// If no more clients are subscribed to this tag, unsubscribe from SCADA
if (tagSubscription.ClientIds.Count == 0)
{
Logger.Information(
"No more clients subscribed to {Address}, removing SCADA subscription", address);
_tagSubscriptions.TryRemove(address, out _);
// Dispose the SCADA subscription
Task.Run(async () =>
{
try
{
if (tagSubscription.ScadaSubscription != null)
{
await tagSubscription.ScadaSubscription.DisposeAsync();
Logger.Debug("Successfully disposed SCADA subscription for {Address}",
address);
}
}
catch (Exception ex)
{
Logger.Error(ex, "Error disposing SCADA subscription for {Address}", address);
}
});
}
else
{
Logger.Debug(
"Client {ClientId} removed from {Address} subscription (remaining clients: {Count})",
clientId, address, tagSubscription.ClientIds.Count);
}
}
}
}
finally
{
_lock.ExitWriteLock();
}
// Complete the channel
clientSubscription.Channel.Writer.TryComplete();
clientSubscription.CancellationTokenSource.Dispose();
}
}
/// <summary>
/// Subscribes a client to a tag address, creating a new SCADA subscription if needed.
/// </summary>
/// <param name="address">The tag address.</param>
/// <param name="clientId">The client identifier.</param>
private async Task SubscribeToTagAsync(string address, string clientId)
{
bool needsSubscription;
TagSubscription? tagSubscription;
_lock.EnterWriteLock();
try
{
if (_tagSubscriptions.TryGetValue(address, out TagSubscription? existingSubscription))
{
// Tag is already subscribed, just add this client
existingSubscription.ClientIds.Add(clientId);
Logger.Debug(
"Client {ClientId} added to existing subscription for {Address} (total clients: {Count})",
clientId, address, existingSubscription.ClientIds.Count);
return;
}
// Create new tag subscription and reserve the spot
tagSubscription = new TagSubscription
{
Address = address,
ClientIds = new HashSet<string> { clientId }
};
_tagSubscriptions[address] = tagSubscription;
needsSubscription = true;
}
finally
{
_lock.ExitWriteLock();
}
if (needsSubscription && tagSubscription != null)
{
// Subscribe to SCADA outside of lock to avoid blocking
Logger.Debug("Creating new SCADA subscription for {Address}", address);
try
{
IAsyncDisposable scadaSubscription = await _scadaClient.SubscribeAsync(
new[] { address },
(addr, vtq) => OnTagValueChanged(addr, vtq),
CancellationToken.None);
_lock.EnterWriteLock();
try
{
tagSubscription.ScadaSubscription = scadaSubscription;
}
finally
{
_lock.ExitWriteLock();
}
Logger.Information("Successfully subscribed to {Address} for client {ClientId}", address, clientId);
}
catch (Exception ex)
{
Logger.Error(ex, "Failed to subscribe to {Address}", address);
// Remove the failed subscription
_lock.EnterWriteLock();
try
{
_tagSubscriptions.TryRemove(address, out _);
}
finally
{
_lock.ExitWriteLock();
}
throw;
}
}
}
/// <summary>
/// Handles tag value changes and delivers updates to all subscribed clients.
/// </summary>
/// <param name="address">The tag address.</param>
/// <param name="vtq">The value, timestamp, and quality.</param>
private void OnTagValueChanged(string address, Vtq vtq)
{
Logger.Debug("Tag value changed: {Address} = {Vtq}", address, vtq);
_lock.EnterReadLock();
try
{
if (!_tagSubscriptions.TryGetValue(address, out TagSubscription? tagSubscription))
{
Logger.Warning("Received update for untracked tag {Address}", address);
return;
}
// Send update to all subscribed clients
// Use the existing collection directly without ToList() since we're in a read lock
foreach (string? clientId in tagSubscription.ClientIds)
{
if (_clientSubscriptions.TryGetValue(clientId, out ClientSubscription? clientSubscription))
{
try
{
if (!clientSubscription.Channel.Writer.TryWrite((address, vtq)))
{
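// TryWrite returns false only when the channel is full in Wait mode or has
// been completed; in DropOldest/DropNewest modes the channel discards an
// item itself and TryWrite still returns true
Logger.Warning(
"Channel full or completed for client {ClientId}, dropping message for {Address}. Consider increasing buffer size.",
clientId, address);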
clientSubscription.DroppedMessageCount++;
}
else
{
clientSubscription.DeliveredMessageCount++;
}
}
catch (InvalidOperationException ex) when (ex.Message.Contains("closed"))
{
Logger.Debug("Channel closed for client {ClientId}, removing subscription", clientId);
// Schedule cleanup of disconnected client
Task.Run(() => UnsubscribeClient(clientId));
}
catch (Exception ex)
{
Logger.Error(ex, "Error sending update to client {ClientId}", clientId);
}
}
}
}
finally
{
_lock.ExitReadLock();
}
}
/// <summary>
/// Gets current subscription statistics for all clients and tags.
/// </summary>
/// <returns>A <see cref="SubscriptionStats" /> object containing statistics.</returns>
public virtual SubscriptionStats GetSubscriptionStats()
{
_lock.EnterReadLock();
try
{
var tagClientCounts = _tagSubscriptions.ToDictionary(
kvp => kvp.Key,
kvp => kvp.Value.ClientIds.Count);
var clientStats = _clientSubscriptions.ToDictionary(
kvp => kvp.Key,
kvp => new ClientStats
{
SubscribedTags = kvp.Value.Addresses.Count,
DeliveredMessages = kvp.Value.DeliveredMessageCount,
DroppedMessages = kvp.Value.DroppedMessageCount
});
return new SubscriptionStats
{
TotalClients = _clientSubscriptions.Count,
TotalTags = _tagSubscriptions.Count,
TagClientCounts = tagClientCounts,
ClientStats = clientStats
};
}
finally
{
_lock.ExitReadLock();
}
}
/// <summary>
/// Handles SCADA client connection state changes and notifies clients of disconnection.
/// </summary>
/// <param name="sender">The event sender.</param>
/// <param name="e">The connection state change event arguments.</param>
private void OnConnectionStateChanged(object? sender, ConnectionStateChangedEventArgs e)
{
Logger.Information("Connection state changed from {Previous} to {Current}",
e.PreviousState, e.CurrentState);
// If we're disconnected, notify all subscribed clients with bad quality
if (e.CurrentState != ConnectionState.Connected)
{
Task.Run(async () =>
{
try
{
await NotifyAllClientsOfDisconnection();
}
catch (Exception ex)
{
Logger.Error(ex, "Error notifying clients of disconnection");
}
});
}
}
/// <summary>
/// Notifies all clients of a SCADA disconnection by sending bad quality updates.
/// </summary>
private async Task NotifyAllClientsOfDisconnection()
{
Logger.Information("Notifying all clients of disconnection");
var badQualityVtq = new Vtq(null, DateTime.UtcNow, Quality.Bad);
// Snapshot the (address, client, writer) targets under the read lock: ClientIds is a
// plain HashSet and must not be enumerated while a writer mutates it. The lock is
// released before awaiting because it cannot be held across an await.
var targets = new List<(string Address, string ClientId, ChannelWriter<(string address, Vtq vtq)> Writer)>();
_lock.EnterReadLock();
try
{
foreach (var tagEntry in _tagSubscriptions)
{
foreach (string clientId in tagEntry.Value.ClientIds)
{
if (_clientSubscriptions.TryGetValue(clientId, out ClientSubscription? clientSubscription))
{
targets.Add((tagEntry.Key, clientId, clientSubscription.Channel.Writer));
}
}
}
}
finally
{
_lock.ExitReadLock();
}
// Send bad quality update for each address to all subscribed clients
foreach (var (address, clientId, writer) in targets)
{
try
{
await writer.WriteAsync((address, badQualityVtq));
Logger.Debug("Sent bad quality notification for {Address} to client {ClientId}",
address, clientId);
}
catch (Exception ex)
{
Logger.Warning(ex, "Failed to send bad quality notification to client {ClientId}",
clientId);
}
}
}
/// <summary>
/// Represents a client's subscription, including channel, addresses, and statistics.
/// </summary>
private class ClientSubscription
{
/// <summary>
/// Gets or sets the client identifier.
/// </summary>
public string ClientId { get; set; } = string.Empty;
/// <summary>
/// Gets or sets the channel for delivering tag updates.
/// </summary>
public Channel<(string address, Vtq vtq)> Channel { get; set; } = null!;
/// <summary>
/// Gets or sets the set of addresses the client is subscribed to.
/// </summary>
public HashSet<string> Addresses { get; set; } = new();
/// <summary>
/// Gets or sets the cancellation token source for the client.
/// </summary>
public CancellationTokenSource CancellationTokenSource { get; set; } = null!;
/// <summary>
/// Gets or sets the count of delivered messages.
/// </summary>
public long DeliveredMessageCount { get; set; }
/// <summary>
/// Gets or sets the count of dropped messages.
/// </summary>
public long DroppedMessageCount { get; set; }
}
/// <summary>
/// Represents a tag subscription, including address, client IDs, and SCADA subscription handle.
/// </summary>
private class TagSubscription
{
/// <summary>
/// Gets or sets the tag address.
/// </summary>
public string Address { get; set; } = string.Empty;
/// <summary>
/// Gets or sets the set of client IDs subscribed to this tag.
/// </summary>
public HashSet<string> ClientIds { get; set; } = new();
/// <summary>
/// Gets or sets the SCADA subscription handle.
/// </summary>
public IAsyncDisposable ScadaSubscription { get; set; } = null!;
}
}
}
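
Review note on the channel handling in `OnTagValueChanged`: the in-code comment mentions DropOldest mode, but the channel construction is not part of this hunk, so the following is only a sketch under that assumption. It illustrates two behaviours of `System.Threading.Channels` that bear on the `TryWrite` branch and the closed-channel `catch`: a bounded channel in `DropOldest` mode accepts every write while open, and `TryWrite` on a completed channel returns `false` instead of throwing. The class and method names (`ChannelSemanticsSketch`, `FillDropOldest`, `TryWriteAfterComplete`) are hypothetical and do not exist in the repository.

```csharp
using System;
using System.Threading.Channels;

// Hypothetical helper for illustration only; not part of LmxProxy.
public static class ChannelSemanticsSketch
{
    // Writes 1..writes into a DropOldest channel of the given capacity.
    // Returns whether every TryWrite was accepted and the first item still readable.
    public static (bool allAccepted, int firstRead) FillDropOldest(int capacity, int writes)
    {
        var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.DropOldest
        });

        bool allAccepted = true;
        for (int i = 1; i <= writes; i++)
        {
            // In DropOldest mode a full channel evicts its oldest item and still accepts the write.
            allAccepted &= channel.Writer.TryWrite(i);
        }

        channel.Reader.TryRead(out int first);
        return (allAccepted, first);
    }

    // Reports what TryWrite returns once the writer has been completed.
    public static bool TryWriteAfterComplete()
    {
        var channel = Channel.CreateUnbounded<int>();
        channel.Writer.Complete();
        return channel.Writer.TryWrite(42); // false: no exception is thrown
    }
}
```

If the production channel is configured this way, a `false` result from `TryWrite` more likely indicates a completed channel (a disconnected client) than a full buffer, so it may be worth handling cleanup in that branch.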

@@ -0,0 +1,65 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net48</TargetFramework>
<OutputType>Exe</OutputType>
<LangVersion>9.0</LangVersion>
<Nullable>enable</Nullable>
<IsPackable>false</IsPackable>
<RootNamespace>ZB.MOM.WW.LmxProxy.Host</RootNamespace>
<AssemblyName>ZB.MOM.WW.LmxProxy.Host</AssemblyName>
<!-- Force x86 architecture for all configurations (required by ArchestrA.MXAccess) -->
<PlatformTarget>x86</PlatformTarget>
<Platforms>x86</Platforms>
<Prefer32Bit>true</Prefer32Bit>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Grpc.Core" Version="2.46.6"/>
<PackageReference Include="Grpc.Tools" Version="2.51.0">
<PrivateAssets>all</PrivateAssets>
<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
</PackageReference>
<PackageReference Include="Google.Protobuf" Version="3.21.12"/>
<PackageReference Include="Topshelf" Version="4.3.0"/>
<PackageReference Include="Topshelf.Serilog" Version="4.3.0"/>
<PackageReference Include="Serilog" Version="2.10.0"/>
<PackageReference Include="Serilog.Sinks.Console" Version="4.0.1"/>
<PackageReference Include="Serilog.Sinks.File" Version="5.0.0"/>
<PackageReference Include="Serilog.Settings.Configuration" Version="3.3.0"/>
<PackageReference Include="Serilog.Formatting.Compact" Version="1.1.0"/>
<PackageReference Include="System.Threading.Channels" Version="4.7.1"/>
<PackageReference Include="Microsoft.Extensions.Configuration" Version="3.1.32"/>
<PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="3.1.32"/>
<PackageReference Include="Microsoft.Extensions.Configuration.EnvironmentVariables" Version="3.1.32"/>
<PackageReference Include="Microsoft.Extensions.Configuration.Binder" Version="3.1.32"/>
<PackageReference Include="Polly" Version="7.2.4"/>
<PackageReference Include="Microsoft.Extensions.Diagnostics.HealthChecks" Version="3.1.32"/>
<PackageReference Include="System.Memory" Version="4.5.5"/>
<PackageReference Include="System.Runtime.CompilerServices.Unsafe" Version="4.7.1"/>
</ItemGroup>
<ItemGroup>
<Reference Include="ArchestrA.MXAccess">
<HintPath>..\..\lib\ArchestrA.MXAccess.dll</HintPath>
<Private>true</Private>
</Reference>
</ItemGroup>
<ItemGroup>
<Protobuf Include="Grpc\Protos\*.proto" GrpcServices="Both"/>
</ItemGroup>
<ItemGroup>
<None Update="appsettings.json">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Update="appsettings.*.json">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Update="App.config">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
</ItemGroup>
</Project>

@@ -0,0 +1,40 @@
{
"Serilog": {
"MinimumLevel": {
"Default": "Information",
"Override": {
"Microsoft": "Warning",
"System": "Warning",
"Grpc": "Information"
}
},
"WriteTo": [
{
"Name": "Console",
"Args": {
"formatter": "Serilog.Formatting.Compact.CompactJsonFormatter, Serilog.Formatting.Compact"
}
},
{
"Name": "File",
"Args": {
"path": "logs/lmxproxy-.json",
"rollingInterval": "Day",
"retainedFileCountLimit": 30,
"formatter": "Serilog.Formatting.Compact.RenderedCompactJsonFormatter, Serilog.Formatting.Compact"
}
}
],
"Enrich": [
"FromLogContext",
"WithMachineName",
"WithThreadId",
"WithProcessId",
"WithEnvironmentName"
],
"Properties": {
"Application": "LmxProxy",
"Environment": "Production"
}
}
}

Some files were not shown because too many files have changed in this diff