# LmxProxy - High Level Requirements

## 1. System Purpose

LmxProxy is a gRPC proxy service that bridges SCADA clients to AVEVA System Platform (Wonderware) via the ArchestrA MXAccess COM API. It exists because MXAccess is a 32-bit COM component that requires co-location with System Platform on a Windows machine running .NET Framework 4.8. LmxProxy isolates this constraint behind a gRPC interface, allowing modern .NET clients to access System Platform data remotely over HTTP/2.

## 2. Architecture

### 2.1 Two-Project Structure

- **ZB.MOM.WW.LmxProxy.Host**: .NET Framework 4.8, x86-only Windows service. Hosts a gRPC server (Grpc.Core) fronting the MXAccess COM API. Runs on the same machine as AVEVA System Platform.
- **ZB.MOM.WW.LmxProxy.Client**: .NET 10, AnyCPU class library. Code-first gRPC client (protobuf-net.Grpc) consumed by ScadaLink's Data Connection Layer. Packaged as a NuGet library.

### 2.2 Dual gRPC Stacks

The two projects use different gRPC implementations that are wire-compatible:

- **Host**: Proto-file-generated code via `Grpc.Core` + `Grpc.Tools`. Uses the deprecated C-core gRPC library because .NET Framework 4.8 does not support `Grpc.Net.Server`.
- **Client**: Code-first contracts via `protobuf-net.Grpc` with `[DataContract]`/`[ServiceContract]` attributes over `Grpc.Net.Client`.

Both target the same `scada.ScadaService` gRPC service definition, which is what keeps them wire-compatible.

### 2.3 Deployment Model

- The Host service runs on the AVEVA System Platform machine (or any machine with MXAccess access).
- Clients connect remotely over gRPC (HTTP/2) on a configurable port (default 50051).
- The Host runs as a Windows service managed by Topshelf.

## 3. Communication Protocol

### 3.1 Transport

- gRPC over HTTP/2.
- Default server port: 50051.
- Optional TLS with mutual TLS (mTLS) support.

### 3.2 RPCs

The service exposes 10 RPCs:

| RPC | Type | Description |
|-----|------|-------------|
| Connect | Unary | Establish session, returns session ID |
| Disconnect | Unary | Terminate session |
| GetConnectionState | Unary | Query MxAccess connection status |
| Read | Unary | Read single tag value |
| ReadBatch | Unary | Read multiple tag values |
| Write | Unary | Write single tag value |
| WriteBatch | Unary | Write multiple tag values |
| WriteBatchAndWait | Unary | Write values, poll flag tag until match or timeout |
| Subscribe | Server streaming | Stream tag value updates to client |
| CheckApiKey | Unary | Validate API key and return role |

### 3.3 Data Model (VTQ)

All tag values are represented as VTQ (Value, Timestamp, Quality) tuples:

- **Value**: `TypedValue`, a protobuf `oneof` carrying the value in its native type (bool, int32, int64, float, double, string, bytes, datetime, typed arrays). An unset `oneof` represents null.
- **Timestamp**: UTC `DateTime.Ticks` as `int64` (100-nanosecond intervals since 0001-01-01 00:00:00 UTC).
- **Quality**: `QualityCode`, a structured message with `uint32 status_code` (OPC UA-compatible) and `string symbolic_name`. Category derived from high bits: `0x00xxxxxx` = Good, `0x40xxxxxx` = Uncertain, `0x80xxxxxx` = Bad.

## 4. Session Lifecycle

- Clients call `Connect` with a client ID and optional API key to establish a session.
- The server returns a 32-character hex GUID as the session ID.
- All subsequent operations require the session ID for validation.
- Sessions persist until explicit `Disconnect` or server restart. There is no idle timeout.
- Session state is tracked in memory (not persisted). All sessions are lost on service restart.

## 5. Authentication & Authorization

### 5.1 API Key Authentication

- API keys are validated via the `x-api-key` gRPC metadata header.
- Keys are stored in a JSON file (`apikeys.json` by default) with hot-reload via FileSystemWatcher (1-second debounce).
- If no API key file exists, the service auto-generates a default file with two random keys (one ReadOnly, one ReadWrite).
- Authentication is enforced at the gRPC interceptor level before any service method executes.

### 5.2 Role-Based Authorization

Two roles with hierarchical permissions:

| Role | Read | Subscribe | Write |
|------|------|-----------|-------|
| ReadOnly | Yes | Yes | No |
| ReadWrite | Yes | Yes | Yes |

Write-protected methods: `Write`, `WriteBatch`, `WriteBatchAndWait`. A ReadOnly key attempting a write receives `StatusCode.PermissionDenied`.

### 5.3 TLS/Security

- TLS is optional (disabled by default in configuration, though `Tls.Enabled` defaults to `true` in the config class).
- Supports server TLS and mutual TLS (client certificate validation).
- Client CA certificate path configurable for mTLS.
- Certificate revocation checking is optional.
- Client library supports TLS 1.2 and TLS 1.3, custom CA trust stores, self-signed certificate allowance, and server name override.

## 6. Operations

### 6.1 Read

- Single tag read with configurable retry policy.
- Batch read with semaphore-controlled concurrency (default max 10 concurrent operations).
- Read timeout: 5 seconds (configurable).

### 6.2 Write

- Single tag write with retry policy. Values are sent as `TypedValue` (native protobuf types). Type mismatches between the value and the tag's expected type return a write failure.
- Batch write with semaphore-controlled concurrency.
- Write timeout: 5 seconds (configurable).
- WriteBatchAndWait: writes a batch, then polls the flag tag at a configurable interval until its value matches the expected flag value (type-aware comparison via `TypedValueEquals`) or a timeout expires. Default timeout: 5000ms, default poll interval: 100ms. A timeout is not an error; the call returns `flag_reached=false`.

### 6.3 Subscribe

- Server-streaming RPC. Client sends a list of tags and a sampling interval (in milliseconds).
- Server maintains a per-client bounded channel (default capacity 1000 messages).
- Updates are pushed as `VtqMessage` items on the stream.
- When the MxAccess connection drops, all subscribed clients receive a bad-quality notification.
- Subscriptions are cleaned up on client disconnect. When the last client unsubscribes from a tag, the underlying MxAccess subscription is disposed.

## 7. Connection Resilience

### 7.1 Host Auto-Reconnect

- If the MxAccess connection is lost, the Host automatically attempts reconnection at a fixed interval (default 5 seconds).
- Stored subscriptions are recreated after a successful reconnect.
- Auto-reconnect is configurable (`Connection.AutoReconnect`, default true).

### 7.2 Client Keep-Alive

- The client sends a lightweight `GetConnectionState` ping every 30 seconds.
- On keep-alive failure, the client marks the connection as disconnected and cleans up subscriptions.

### 7.3 Client Retry Policy

- Polly-based exponential backoff retry.
- Default: 3 attempts with 1-second initial delay (1s → 2s → 4s).
- Transient errors retried: Unavailable, DeadlineExceeded, ResourceExhausted, Aborted.

## 8. Health Monitoring & Metrics

### 8.1 Health Checks

Two health check implementations:

- **Basic** (`HealthCheckService`): Checks MxAccess connection state, subscription stats, and operation success rate. Returns Degraded if success rate < 50% (with > 100 operations) or client count > 100.
- **Detailed** (`DetailedHealthCheckService`): Reads a test tag (`System.Heartbeat`). Returns Unhealthy if not connected, Degraded if test tag quality is not Good or timestamp is older than 5 minutes.

### 8.2 Performance Metrics

- Per-operation tracking: Read, ReadBatch, Write, WriteBatch.
- Metrics: total count, success count, success rate, average/min/max latency, 95th percentile latency.
- Rolling buffer of 1000 latency samples per operation for percentile calculation.
- Metrics reported to logs every 60 seconds.

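
The rolling-buffer metrics in section 8.2 can be illustrated with a small, language-neutral sketch. It is written in Python for brevity and is not the Host's implementation: the class name `OperationMetrics`, its methods, and the nearest-rank percentile method are all assumptions made for illustration, since the requirements only specify a 1000-sample buffer and a 95th percentile.

```python
from collections import deque
import math


class OperationMetrics:
    """Hypothetical per-operation tracker (illustrative names, not the Host's API)."""

    def __init__(self, max_samples: int = 1000):
        self.total = 0
        self.successes = 0
        self.sum_ms = 0.0
        self.min_ms = math.inf
        self.max_ms = 0.0
        # deque(maxlen=...) silently drops the oldest sample once the buffer is full.
        self.samples = deque(maxlen=max_samples)

    def record(self, latency_ms: float, success: bool) -> None:
        """Record one completed operation and its latency in milliseconds."""
        self.total += 1
        if success:
            self.successes += 1
        self.sum_ms += latency_ms
        self.min_ms = min(self.min_ms, latency_ms)
        self.max_ms = max(self.max_ms, latency_ms)
        self.samples.append(latency_ms)

    def success_rate(self) -> float:
        """Fraction of successful operations over the tracker's lifetime."""
        return self.successes / self.total if self.total else 0.0

    def average_ms(self) -> float:
        """Lifetime average latency (not limited to the rolling buffer)."""
        return self.sum_ms / self.total if self.total else 0.0

    def p95(self) -> float:
        """Nearest-rank 95th percentile over the rolling buffer (assumed method)."""
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        # Integer form of ceil(0.95 * n), avoiding floating-point rounding.
        rank = (95 * len(ordered) + 99) // 100
        return ordered[rank - 1]
```

In this sketch the counters and min/max/average cover the tracker's lifetime, while the percentile only reflects the most recent samples held in the buffer. Whether the real service scopes its averages the same way is not stated in this document.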
### 8.3 Status Web Server

- HTTP status server on port 8080 (configurable).
- Endpoints:
  - `GET /`: HTML dashboard with auto-refresh (30 seconds), color-coded status cards, operations table.
  - `GET /api/status`: JSON status report.
  - `GET /api/health`: Plain text `OK` (200) or `UNHEALTHY` (503).

### 8.4 Client Metrics

- Per-operation counts, error counts, and latency tracking (average, p95, p99).
- Rolling buffer of 1000 latency samples.
- Exposed via `ILmxProxyClient.GetMetrics()`.

## 9. Service Hosting

### 9.1 Topshelf Windows Service

- Service name: `ZB.MOM.WW.LmxProxy.Host`
- Display name: `SCADA Bridge LMX Proxy`
- Starts automatically on boot.

### 9.2 Service Recovery (Windows SCM)

| Failure | Restart Delay |
|---------|--------------|
| First | 1 minute |
| Second | 5 minutes |
| Subsequent | 10 minutes |
| Reset period | 1 day |

### 9.3 Startup Sequence

1. Load configuration from `appsettings.json` + environment variables.
2. Configure Serilog (console + file sinks).
3. Validate configuration.
4. Check/generate TLS certificates (if TLS enabled).
5. Initialize services: PerformanceMetrics, ApiKeyService, MxAccessClient, SubscriptionManager, SessionManager, HealthCheckService, StatusReportService.
6. Connect to MxAccess synchronously (timeout: 30 seconds).
7. Start auto-reconnect monitor loop (if enabled).
8. Start gRPC server on configured port.
9. Start HTTP status web server.

### 9.4 Shutdown Sequence

1. Cancel reconnect monitor (5-second wait).
2. Graceful gRPC server shutdown (10-second timeout, then kill).
3. Stop status web server (5-second wait).
4. Dispose all components in reverse order.
5. Disconnect from MxAccess (10-second timeout).

## 10. Configuration

All configuration is via `appsettings.json` bound to `LmxProxyConfiguration`.

Key settings:

| Section | Setting | Default |
|---------|---------|---------|
| Root | GrpcPort | 50051 |
| Root | ApiKeyConfigFile | `apikeys.json` |
| Connection | MonitorIntervalSeconds | 5 |
| Connection | ConnectionTimeoutSeconds | 30 |
| Connection | ReadTimeoutSeconds | 5 |
| Connection | WriteTimeoutSeconds | 5 |
| Connection | MaxConcurrentOperations | 10 |
| Connection | AutoReconnect | true |
| Subscription | ChannelCapacity | 1000 |
| Subscription | ChannelFullMode | DropOldest |
| Tls | Enabled | false |
| Tls | RequireClientCertificate | false |
| WebServer | Enabled | true |
| WebServer | Port | 8080 |

Configuration is validated at startup. Invalid values cause the service to fail to start.

## 11. Logging

- Serilog with console and file sinks.
- File sink: `logs/lmxproxy-.txt`, daily rolling, 30 files retained.
- Default level: Information. Overrides: Microsoft=Warning, System=Warning, Grpc=Information.
- Enrichment: FromLogContext, WithMachineName, WithThreadId.

## 12. Constraints

- Host **must** target x86 and .NET Framework 4.8 (MXAccess is 32-bit COM).
- Host uses `Grpc.Core` (deprecated C-core library), required because .NET Framework 4.8 does not support `Grpc.Net.Server`.
- Client targets .NET 10 and runs in ScadaLink central/site clusters.
- MxAccess COM operations require STA thread context (wrapped in `Task.Run`).
- The solution file uses `.slnx` format.

## 13. Protocol

The protocol specification is defined in `lmxproxy_updates.md`, which is the authoritative source of truth. All RPC signatures, message schemas, and behavioral specifications are per that document.

### 13.1 Value System (TypedValue)

Values are transmitted in their native protobuf types via a `TypedValue` oneof: bool, int32, int64, float, double, string, bytes, datetime (int64 UTC Ticks), and typed arrays. An unset oneof represents null. No string serialization or parsing heuristics are used.

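
Because timestamps and `datetime` values travel as `int64` UTC Ticks (100-nanosecond intervals since 0001-01-01 00:00:00 UTC), a consumer outside .NET has to convert them explicitly. The following is a minimal sketch of that arithmetic in Python. The function names are hypothetical and are not part of the LmxProxy client library.

```python
from datetime import datetime, timedelta, timezone

TICKS_PER_SECOND = 10_000_000   # one tick is 100 ns
TICKS_PER_MICROSECOND = 10
# Tick zero: 0001-01-01 00:00:00 UTC, as stated for the VTQ timestamp.
DOTNET_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)


def ticks_to_datetime(ticks: int) -> datetime:
    """Convert UTC ticks to an aware datetime.

    Python datetimes only hold microseconds, so the last 100 ns digit is truncated.
    """
    return DOTNET_EPOCH + timedelta(microseconds=ticks // TICKS_PER_MICROSECOND)


def datetime_to_ticks(value: datetime) -> int:
    """Convert an aware datetime to UTC ticks using integer arithmetic only."""
    if value.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = value.astimezone(timezone.utc) - DOTNET_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * TICKS_PER_MICROSECOND


# The Unix epoch expressed in ticks is a convenient sanity check.
unix_epoch_ticks = datetime_to_ticks(datetime(1970, 1, 1, tzinfo=timezone.utc))
print(unix_epoch_ticks)  # 621355968000000000
```

Integer arithmetic is used throughout because tick counts for current dates exceed the 53 bits a float can represent exactly.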
### 13.2 Quality System (QualityCode)

Quality is a structured `QualityCode` message with `uint32 status_code` (OPC UA-compatible) and `string symbolic_name`. Supports AVEVA-aligned quality sub-codes (e.g., `BadSensorFailure` = `0x806D0000`, `GoodLocalOverride` = `0x00D80000`, `BadWaitingForInitialData` = `0x80320000`). See Component-Protocol for the full quality code table.

### 13.3 Migration from V1

The current codebase implements the v1 protocol (string-encoded values, three-state string quality). The v2 protocol is a clean break: all clients and servers will be updated simultaneously. No backward compatibility layer. This is appropriate because LmxProxy is an internal protocol with a small, controlled client count.

## 14. Component List (10 Components)

| # | Component | Description |
|---|-----------|-------------|
| 1 | GrpcServer | gRPC service implementation, session validation, request routing |
| 2 | MxAccessClient | MXAccess COM interop wrapper, connection lifecycle, read/write/subscribe |
| 3 | SessionManager | Client session tracking and lifecycle |
| 4 | Security | API key authentication, role-based authorization, TLS management |
| 5 | SubscriptionManager | Tag subscription lifecycle, channel-based update delivery, backpressure |
| 6 | Configuration | appsettings.json structure, validation, options binding |
| 7 | HealthAndMetrics | Health checks, performance metrics, status web server |
| 8 | ServiceHost | Topshelf hosting, startup/shutdown, logging setup, service recovery |
| 9 | Client | LmxProxyClient library, builder, retry, streaming, DI integration |
| 10 | Protocol | gRPC protocol specification, proto definition, code-first contracts |
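
As a supplementary illustration of the quality categories described in sections 3.3 and 13.2, the sketch below derives Good/Uncertain/Bad from a `status_code`. It assumes the OPC UA convention that severity lives in the top two bits, which is consistent with the `0x00`/`0x40`/`0x80` prefixes given above. The function name is hypothetical, and the authoritative mapping remains the one in the protocol specification.

```python
def quality_category(status_code: int) -> str:
    """Return the category encoded in the two most significant bits of a uint32 code.

    Assumption: severity follows OPC UA (00 = Good, 01 = Uncertain, 10 = Bad).
    The remaining pattern (11) is treated here as Unknown.
    """
    severity = (status_code >> 30) & 0b11
    return {0: "Good", 1: "Uncertain", 2: "Bad"}.get(severity, "Unknown")


# Sub-codes quoted in section 13.2.
print(quality_category(0x806D0000))  # Bad  (BadSensorFailure)
print(quality_category(0x00D80000))  # Good (GoodLocalOverride)
print(quality_category(0x80320000))  # Bad  (BadWaitingForInitialData)
```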