diff --git a/lmxproxy/EXECUTE_FIX_WRITE_TESTS.md b/lmxproxy/EXECUTE_FIX_WRITE_TESTS.md
deleted file mode 100644
index 5f1239a..0000000
--- a/lmxproxy/EXECUTE_FIX_WRITE_TESTS.md
+++ /dev/null
@@ -1,119 +0,0 @@
-# LmxProxy v2 — Fix Failing Write Integration Tests
-
-Run this prompt with Claude Code from the `lmxproxy/` directory.
-
-## Prompt
-
-You are debugging and fixing 2 failing integration tests for the LmxProxy v2 gRPC proxy service. The tests are `WriteAndReadBack` and `WriteBatchAndWait` in the integration test project.
-
-### Context
-
-Read these documents before starting:
-
-1. `CLAUDE.md` — project-level instructions and architecture
-2. `docs/deviations.md` — deviation #7 describes the failure (OnWriteComplete COM callback not firing)
-3. `mxaccess_documentation.md` — the official MxAccess Toolkit reference. Search for "OnWriteComplete", "Write() method", and "AdviseSupervisory" sections.
-
-Read these source files:
-
-4. `src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.ReadWrite.cs` — current v2 write implementation
-5. `src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.EventHandlers.cs` — OnWriteComplete callback handler
-6. `src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.Connection.cs` — COM event wiring (`_lmxProxy.OnWriteComplete += OnWriteComplete`)
-7. `src-reference/ZB.MOM.WW.LmxProxy.Host/Implementation/MxAccessClient.ReadWrite.cs` — v1 write implementation (for comparison)
-8. `src-reference/ZB.MOM.WW.LmxProxy.Host/Implementation/MxAccessClient.EventHandlers.cs` — v1 OnWriteComplete handler
-9. `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/WriteTests.cs` — failing WriteAndReadBack test
-10. `tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/WriteBatchAndWaitTests.cs` — failing WriteBatchAndWait test
-
-### Problem Statement
-
-The `OnWriteComplete` COM callback from MxAccess never fires within the write timeout (default 5s). The write operation follows this flow:
-
-1. `AddItem()` — registers the tag with MxAccess
-2. `AdviseSupervisory()` — establishes a supervisory connection for the tag
-3. Store a `TaskCompletionSource` in `_pendingWrites[itemHandle]`
-4. `Write()` — sends the write to MxAccess
-5. Wait for `OnWriteComplete` callback to resolve the TCS — **this never fires, causing a timeout**
-
-The v1 code used the same pattern and presumably worked, so the issue is either:
-
-- (a) MxAccess completes the write synchronously and never fires `OnWriteComplete` for simple (non-secured, non-verified) writes. The documentation says: "The LMXProxyInterface triggers an event for OnWriteComplete when your program calls the Write() or WriteSecured() function." However, this may not be true in all configurations or for all attribute types.
-- (b) The COM event subscription wiring (`_lmxProxy.OnWriteComplete += OnWriteComplete`) is correct syntactically but the callback doesn't fire because the thread that called `Write()` (a thread pool thread via `Task.Run`) isn't pumping COM messages. Note: STA threading was abandoned (see deviation #2 in `docs/deviations.md`) because it caused other issues with `OnDataChange` callbacks.
-- (c) There's a difference in how v1 vs v2 initializes or interacts with the COM object that affects event delivery.
-
-### Investigation Steps
-
-These steps require SSH access to windev where the v2 Host is deployed.
-
-**Step 1: Check if the write actually succeeds despite no callback.**
-
-The `WriteAndReadBack` test writes a value and then reads it back. The test fails because `WriteAsync` throws a `TimeoutException` (OnWriteComplete never fires). But the write itself may have succeeded at the MxAccess level.
-
-SSH to windev (`ssh windev`) and:
-- Check the v2 Host service logs in `C:\publish-v2\logs\` for any write-related log entries
-- Look for "Write failed", "WriteComplete", or "timeout" messages
-
-**Step 2: Add a fire-and-forget write mode.**
-
-If the write succeeds at the MxAccess level but `OnWriteComplete` never fires, the simplest fix is to bypass the callback wait. Modify `MxAccessClient.ReadWrite.cs`:
-
-- After calling `_lmxProxy.Write()`, immediately resolve the TCS with success instead of waiting for the callback
-- Keep the `OnWriteComplete` handler wired up — if it does fire, it can log the result for diagnostics but shouldn't block the write path
-- Add a configuration option `WriteConfirmationMode` with values `FireAndForget` (default) and `WaitForCallback`, so the behavior can be switched if needed
-
-The rationale: the MxAccess documentation's sample application (Chapter 6) uses `OnWriteComplete` to detect whether a *secured* or *verified* write is needed, then retries with `WriteSecured()`. For simple supervisory writes (which is what LmxProxy does), the Write() call itself is the confirmation — if it doesn't throw, the write was accepted.
-
-**Step 3: Implement the fix.**
-
-In `MxAccessClient.ReadWrite.cs`, modify `SetupWriteOperationAsync`:
-
-```
-// After _lmxProxy.Write():
-// Immediately complete the write — OnWriteComplete may not fire for supervisory writes
-tcs.TrySetResult(true);
-```
-
-Remove the `_pendingWrites[itemHandle] = tcs` tracking since it's no longer needed for the default path. Keep `OnWriteComplete` wired for logging/diagnostics.
-
-Clean up `WaitForWriteCompletionAsync` — in fire-and-forget mode, the TCS is already completed so the await returns immediately. The cleanup (UnAdvise + RemoveItem) should still happen.
-
-**Step 4: Consider an alternative approach — poll-based write confirmation.**
-
-If fire-and-forget is too loose (we want to confirm the write took effect), consider a poll-based approach similar to `WriteBatchAndWaitAsync`: after writing, read the tag back and compare. This is more reliable than depending on a COM callback. However, this adds latency and may not be needed — the `WriteAndReadBack` test already does a read-back verification.
-
-**Step 5: Build and deploy.**
-
-After making changes:
-
-```bash
-ssh windev "cd C:\source\lmxproxy && git pull && dotnet build src\ZB.MOM.WW.LmxProxy.Host -c Release -r win-x86 --no-self-contained -o C:\publish-v2"
-```
-
-Restart the v2 service:
-```bash
-ssh windev "net stop LmxProxyV2 && net start LmxProxyV2"
-```
-
-**Step 6: Run the integration tests.**
-
-From the Mac:
-```bash
-cd tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests
-dotnet test --filter "WriteAndReadBack|WriteBatchAndWait" -v n
-```
-
-Both tests should pass. If they do, run the full integration test suite:
-```bash
-dotnet test -v n
-```
-
-**Step 7: Update deviations document.**
-
-If the fix works, update `docs/deviations.md` deviation #7 to reflect the resolution. Change the status from pending to resolved and document what was done.
-
-### Guardrails
-
-1. **Do not reintroduce STA threading.** The MTA/Task.Run approach works for `OnDataChange` callbacks and subscriptions. Do not change the threading model.
-2. **Do not modify the integration tests** unless they have a genuine bug (e.g., wrong assertion, wrong tag name). The tests define the expected behavior — fix the implementation to match.
-3. **Do not modify the proto file or gRPC contracts.** This is a Host-side implementation fix only.
-4. **Keep the OnWriteComplete handler wired.** Even if we don't wait on it, the callback provides diagnostic information (security errors, verified write requirements) that should be logged.
-5. **Commit with message:** `fix(lmxproxy): resolve write timeout — bypass OnWriteComplete callback for supervisory writes`
diff --git a/lmxproxy/EXECUTE_REBUILD.md b/lmxproxy/EXECUTE_REBUILD.md
deleted file mode 100644
index d7b6417..0000000
--- a/lmxproxy/EXECUTE_REBUILD.md
+++ /dev/null
@@ -1,98 +0,0 @@
-# LmxProxy v2 Rebuild — Execution Prompt
-
-Run this prompt with Claude Code from the `lmxproxy/` directory to execute all 7 phases of the rebuild autonomously.
-
-## Prompt
-
-You are executing a pre-approved implementation plan for rebuilding the LmxProxy gRPC proxy service. All design decisions have been made and documented. You do NOT need to ask for approval — execute each phase completely, then move to the next.
-
-### Context
-
-Read these documents in order before starting:
-
-1. `docs/plans/2026-03-21-lmxproxy-v2-rebuild-design.md` — the approved design
-2. `CLAUDE.md` — project-level instructions
-3. `docs/requirements/HighLevelReqs.md` — high-level requirements
-4. `docs/requirements/Component-*.md` — all component requirements (10 files)
-5. `docs/lmxproxy_updates.md` — authoritative v2 protocol specification
-
-### Execution Order
-
-Execute phases in this exact order. Each phase has a detailed plan in `docs/plans/`:
-
-1. **Phase 1**: `docs/plans/phase-1-protocol-domain-types.md`
-2. **Phase 2**: `docs/plans/phase-2-host-core.md`
-3. **Phase 3**: `docs/plans/phase-3-host-grpc-security-config.md`
-4. **Phase 4**: `docs/plans/phase-4-host-health-metrics.md`
-5. **Phase 5**: `docs/plans/phase-5-client-core.md`
-6. **Phase 6**: `docs/plans/phase-6-client-extras.md`
-7. **Phase 7**: `docs/plans/phase-7-integration-deployment.md`
-
-### How to Execute Each Phase
-
-For each phase:
-
-1. Read the phase plan document completely before writing any code.
-2. Read any referenced requirements documents for that phase.
-3. Execute each step in the plan in order.
-4. After all steps, run `dotnet build` and `dotnet test` to verify.
-5. If build or tests fail, fix the issues before proceeding.
-6. Commit the phase with message: `feat(lmxproxy): phase N — `
-7. Push to remote: `git push`
-8. Move to the next phase.
-
-### Guardrails (MUST follow)
-
-1. **Proto is the source of truth** — any wire format question is resolved by reading `src/ZB.MOM.WW.LmxProxy.Host/Grpc/Protos/scada.proto`, not the code-first contracts.
-2. **No v1 code in the new build** — the `src-reference/` directory is for reading only. Do not copy-paste and modify; write fresh code guided by the plan.
-3. **Cross-stack tests in Phase 1** — Host proto serialize to Client code-first deserialize (and vice versa) must pass before any business logic.
-4. **COM calls only on STA dispatch thread** — no `Task.Run` for COM operations. All go through the `StaDispatchThread` dispatch queue.
-5. **status_code is canonical for quality** — `symbolic_name` is always derived from lookup, never independently set.
-6. **Unit tests before integration** — every phase includes unit tests. Integration tests are Phase 7 only.
-7. **Each phase must compile and pass tests** before the next phase begins. Do not skip failing tests.
-8. **No string serialization heuristics** — v2 uses native TypedValue. No `double.TryParse` or `bool.TryParse` on values.
-9. **Do not modify requirements or design docs** — if you find a conflict, follow the design doc's resolution (section 11).
-10. **Do not ask for user approval** — all decisions are pre-approved in the design document.
-
-### Error Recovery
-
-- If a build fails, read the error messages carefully, fix the code, and rebuild.
-- If a test fails, fix the implementation (not the test) unless the test has a clear bug.
-- If a step in the plan is ambiguous, consult the requirements document for that component.
-- If the requirements are ambiguous, consult the design document's resolution table (section 11).
-- If you cannot resolve an issue after 3 attempts, skip that step, leave a `// TODO: ` comment, and continue.
-
-### Phase 7 Special Instructions
-
-Phase 7 requires SSH access to windev (10.100.0.48). See `windev.md` in the repo root for connection details:
-- SSH: `ssh windev` (passwordless)
-- Default shell: cmd.exe, use `powershell -Command` for PowerShell
-- Git and .NET SDK 10 are installed
-- The existing v1 LmxProxy service is at `C:\publish\` on port 50051
-
-For Veeam backups, SSH to the Veeam server:
-- SSH: `ssh dohertj2@10.100.0.30` (passwordless)
-- Use `Add-PSSnapin VeeamPSSnapin` for Veeam PowerShell
-
-### Commit Messages
-
-Use this format for each phase commit:
-
-- Phase 1: `feat(lmxproxy): phase 1 — v2 protocol types and domain model`
-- Phase 2: `feat(lmxproxy): phase 2 — host core (MxAccessClient, SessionManager, SubscriptionManager)`
-- Phase 3: `feat(lmxproxy): phase 3 — host gRPC server, security, configuration, service hosting`
-- Phase 4: `feat(lmxproxy): phase 4 — host health monitoring, metrics, status web server`
-- Phase 5: `feat(lmxproxy): phase 5 — client core (ILmxProxyClient, connection, read/write/subscribe)`
-- Phase 6: `feat(lmxproxy): phase 6 — client extras (builder, factory, DI, streaming extensions)`
-- Phase 7: `feat(lmxproxy): phase 7 — integration tests, deployment to windev, v1 cutover`
-
-### After All Phases
-
-When all 7 phases are complete:
-
-1. Run `dotnet build ZB.MOM.WW.LmxProxy.slnx` to verify the full solution builds.
-2. Run `dotnet test` to verify all unit tests pass.
-3. Verify the integration tests passed in Phase 7.
-4. Create a final commit if any cleanup was needed.
-5. Push all changes.
-6. Report: total files created, total tests, build status, integration test results.
diff --git a/lmxproxy/REQUIREMENTS_PROMPT.md b/lmxproxy/REQUIREMENTS_PROMPT.md
deleted file mode 100644
index 359de91..0000000
--- a/lmxproxy/REQUIREMENTS_PROMPT.md
+++ /dev/null
@@ -1,95 +0,0 @@
-# LmxProxy Requirements Documentation Prompt
-
-Use this prompt with Claude Code to generate the requirements documentation for the LmxProxy project. Run from the `lmxproxy/` directory.
-
-## Prompt
-
-Create requirements documentation for the LmxProxy project in `docs/requirements/`. Follow the same structure used in the ScadaLink project (`docs/requirements/` in the parent repo) — a high-level requirements doc and per-component breakout documents.
-
-### Context
-
-LmxProxy is a gRPC proxy service that bridges SCADA clients to industrial automation systems (primarily AVEVA/Wonderware System Platform via ArchestrA.MXAccess). It consists of two projects:
-
-1. **ZB.MOM.WW.LmxProxy.Host** — A .NET Framework 4.8 Windows service (Topshelf) that runs on the same machine as System Platform. It connects to MXAccess (COM interop, x86) and exposes a gRPC server for remote SCADA operations (read, write, subscribe, batch operations). It handles session management, API key authentication, TLS, health checks, performance metrics, and subscription management.
-
-2. **ZB.MOM.WW.LmxProxy.Client** — A .NET 10 class library providing a typed gRPC client for consuming the LmxProxy service. It uses protobuf-net.Grpc (code-first, no .proto files). It includes connection management, retry policies, TLS support, streaming extensions, DI integration, and a builder pattern for configuration.
-
-### What to Generate
-
-**1. `docs/requirements/HighLevelReqs.md`** — High-level requirements covering:
-- System purpose and architecture (proxy pattern, why it exists)
-- Deployment model (runs on System Platform machine, clients connect remotely)
-- Communication protocol (gRPC, HTTP/2, code-first and proto-based)
-- Session lifecycle (connect, session ID, disconnect, no idle timeout)
-- Authentication model (API key via metadata header, configurable enforcement)
-- TLS/security model (optional TLS, mutual TLS support, certificate validation)
-- Data model (VTQ — Value/Timestamp/Quality, OPC-style quality codes)
-- Operations (read, read batch, write, write batch, write-and-wait, subscribe)
-- Subscription model (server-streaming, tag-based, sampling interval)
-- Health monitoring and metrics
-- Service hosting (Topshelf Windows service, service recovery)
-- Configuration (appsettings.json sections)
-- Scale considerations
-- Protocol versioning (v1 string-based, v2 OPC UA-aligned typed values)
-
-**2. Component documents** — One `Component-.md` for each logical component:
-
-- **Component-GrpcServer.md** — The gRPC service implementation (ScadaGrpcService). Session validation, request routing to MxAccessClient, subscription lifecycle, error handling, proto-based serialization.
-
-- **Component-MxAccessClient.md** — The MXAccess COM interop wrapper. Connection lifecycle (Become/Stash-like state machine), tag registration, read/write operations, subscription via advise callbacks, event handling, x86/COM threading constraints. This is the core component.
-
-- **Component-SessionManager.md** — Client session tracking, session creation/destruction, session-to-client mapping, concurrent session limits.
-
-- **Component-Security.md** — API key authentication (ApiKeyService, ApiKeyInterceptor), key file management, role-based permissions (ReadOnly/ReadWrite), TLS certificate management.
-
-- **Component-SubscriptionManager.md** — Tag subscription lifecycle, channel-based update delivery, sampling intervals, backpressure (channel full modes), subscription cleanup on disconnect.
-
-- **Component-Configuration.md** — appsettings.json structure, configuration validation, TLS configuration, service recovery configuration, connection timeouts, retry policies.
-
-- **Component-HealthAndMetrics.md** — Health check service (test tag reads, stale data detection), performance metrics (operation counts, latencies, percentiles), status web server (HTTP status endpoint).
-
-- **Component-ServiceHost.md** — Topshelf service hosting, Program.cs entry point, Serilog logging setup, service install/uninstall, service recovery (Windows SCM restart policies).
-
-- **Component-Client.md** — The LmxProxyClient library. Builder pattern, connection management, retry with Polly, keep-alive pings, streaming extensions, DI registration (ServiceCollectionExtensions), factory pattern, TLS configuration.
-
-- **Component-Protocol.md** — The gRPC protocol specification. Proto definition, code-first contracts (IScadaService), message schemas, VTQ format, quality codes, v1 vs v2 differences.
-
-### Document Structure (per component)
-
-Each component doc must follow this structure exactly:
-```
-# Component:
-
-## Purpose
-<1-2 sentence description>
-
-## Location
-
-
-## Responsibilities
-
-
-##
-
-
-## Dependencies
-
-
-## Interactions
-
-```
-
-### Sources
-
-Derive requirements from:
-- The source code in `src/ZB.MOM.WW.LmxProxy.Host/` and `src/ZB.MOM.WW.LmxProxy.Client/`
-- The protocol docs in `docs/lmxproxy_protocol.md` and `docs/lmxproxy_updates.md`
-- The appsettings.json configuration files
-
-### Rules
-
-- Write requirements as design decisions, not aspirational statements. Describe what the system **does**, not what it **should** do.
-- Include specific values from configuration (ports, timeouts, intervals, limits).
-- Cross-reference between documents using component names.
-- Keep the high-level doc focused on system-wide concerns; push implementation details to component docs.
-- Do not invent features not present in the source code.
diff --git a/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/Configuration/LmxProxyConfiguration.cs b/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/Configuration/LmxProxyConfiguration.cs
index b7b7906..f74a2e1 100644
--- a/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/Configuration/LmxProxyConfiguration.cs
+++ b/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/Configuration/LmxProxyConfiguration.cs
@@ -31,8 +31,8 @@ namespace ZB.MOM.WW.LmxProxy.Host.Configuration
     /// Health check / probe configuration.
     public class HealthCheckConfiguration
     {
-        /// Tag address to probe for connection liveness. Default: TestChildObject.TestBool.
-        public string TestTagAddress { get; set; } = "TestChildObject.TestBool";
+        /// Tag address to probe for connection liveness. Default: DevAppEngine.Scheduler.ScanTime.
+        public string TestTagAddress { get; set; } = "DevAppEngine.Scheduler.ScanTime";
 
         /// Probe timeout in milliseconds. Default: 5000.
         public int ProbeTimeoutMs { get; set; } = 5000;
diff --git a/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/Health/DetailedHealthCheckService.cs b/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/Health/DetailedHealthCheckService.cs
index 7e94ba1..d177b24 100644
--- a/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/Health/DetailedHealthCheckService.cs
+++ b/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/Health/DetailedHealthCheckService.cs
@@ -20,7 +20,7 @@ namespace ZB.MOM.WW.LmxProxy.Host.Health
 
         public DetailedHealthCheckService(
             IScadaClient scadaClient,
-            string testTagAddress = "TestChildObject.TestBool")
+            string testTagAddress = "DevAppEngine.Scheduler.ScanTime")
         {
             _scadaClient = scadaClient;
             _testTagAddress = testTagAddress;
diff --git a/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/appsettings.json b/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/appsettings.json
index 1202291..2eff7e5 100644
--- a/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/appsettings.json
+++ b/lmxproxy/src/ZB.MOM.WW.LmxProxy.Host/appsettings.json
@@ -33,7 +33,7 @@
   },
 
   "HealthCheck": {
-    "TestTagAddress": "TestChildObject.TestBool",
+    "TestTagAddress": "DevAppEngine.Scheduler.ScanTime",
     "ProbeTimeoutMs": 5000,
     "MaxConsecutiveTransportFailures": 3,
     "DegradedProbeIntervalMs": 30000
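
Note on the new probe default: the three hunks above change the same default (`TestChildObject.TestBool` to `DevAppEngine.Scheduler.ScanTime`) in the configuration class, the `DetailedHealthCheckService` constructor, and `appsettings.json`. A deployment that has no `DevAppEngine` object could presumably keep probing the previous tag by overriding the `HealthCheck` section in its own `appsettings.json`. The fragment below is a minimal sketch of such an override. It assumes the Host binds this section onto `HealthCheckConfiguration`, which the diff implies but does not show, and it uses only keys and values that appear in the diff.

```json
{
  "HealthCheck": {
    "TestTagAddress": "TestChildObject.TestBool",
    "ProbeTimeoutMs": 5000,
    "MaxConsecutiveTransportFailures": 3,
    "DegradedProbeIntervalMs": 30000
  }
}
```

If that assumption holds, the constructor default in `DetailedHealthCheckService` would only matter for callers that construct the service without passing a tag address.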