Commit Graph

12 Commits

Author SHA1 Message Date
Joseph Doherty b3907efa6e review(Driver.Historian.Wonderware): AtTime fails over on connection-class errors
Re-review at 7286d320. -014 (Medium): ReadAtTimeAsync didn't classify StartQuery failures,
so a connection-class failure left a dead connection, re-failed every timestamp, and returned
Success=true with all-Bad (no failover); now resets+fails over via a shared classifier + tests.
-015: refresh stale named-pipe comments to TCP (no wire change). -013 (silent cap truncation,
ties OpcUaServer-002/Core.Abstractions-009) deferred cross-module. NOTE: the SDK-touching tests
are net48 + native aahClientManaged and run only on Windows; macOS verifies build + the SDK-free
subset only.
2026-06-19 11:47:11 -04:00
Joseph Doherty feddc2b80e feat(historian): emit PermanentFail for poison alarm events via additive PerEventStatus sidecar IPC field 2026-06-18 12:30:14 -04:00
Joseph Doherty db2e4777dd fix(historian-sidecar): close active TCP client on cancel so RunAsync unwinds
v2-ci / build (push) Failing after 33s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
The net48 sidecar's TcpFrameServer.RunOneConnectionAsync registered the
cancellation token to Stop() only the listener (to unblock a parked
AcceptTcpClientAsync), but never closed the active client. On net48
NetworkStream.ReadAsync ignores the CancellationToken, so while the frame
loop is parked reading an idle connected client, cancelling the token cannot
unblock it — only closing the socket can. RunAsync therefore never returned
on Ctrl-C/service-stop while a connection was open (Program.Main's
RunAsync().GetAwaiter().GetResult() would hang until NSSM force-killed).

Register the cancel to Close() the active client, and convert the resulting
cancel-time read/handshake exception to OperationCanceledException so RunAsync
unwinds cleanly without logging it as a connection failure or counting it
toward MaxConsecutiveFailures.

Caught by the first-ever net48 execution of TcpRoundTripTests on the Windows
VM (these only compile on macOS): SingleActive_SecondClientHelloCompletesOnly
AfterFirstCloses deadlocked in teardown. Full net48 historian suite now green
(122 passed, 0 failed, 2 skipped); all 6 TcpRoundTrip tests pass.
2026-06-12 13:34:45 -04:00
Joseph Doherty 6218512365 fix(historian-sidecar): don't wedge the TCP listener when Start() bind fails
v2-ci / build (push) Failing after 46s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
Live verification on a Windows VM surfaced a crash loop: TcpFrameServer.EnsureListening
assigned _listener = new TcpListener(...) BEFORE calling Start(). When Start() throws —
e.g. the port is in a Windows excluded/reserved range (WSAEACCES) or already in use — the
field was left non-null-but-unstarted, so the `if (_listener is not null) return` guard
permanently skipped re-Start() and every subsequent AcceptTcpClientAsync() threw the
misleading InvalidOperationException "Not listening" → 20 failures → exit 2 → NSSM restart
→ loop. Now _listener is assigned only after Start() succeeds, so a transient bind failure
is retried and a permanent one surfaces the real bind error each iteration. Adds a
regression test that forces a bind conflict and asserts the SocketException persists.
2026-06-12 13:02:22 -04:00
Joseph Doherty 72f32045a4 refactor(historian): remove named-pipe transport 2026-06-12 11:51:53 -04:00
Joseph Doherty 999e58c605 fix(historian-sidecar): cancel SocketException guard + version-reject log + TLS test (review) 2026-06-12 11:31:04 -04:00
Joseph Doherty 3528702185 feat(historian-sidecar): TcpFrameServer (TCP + optional TLS) 2026-06-12 11:16:28 -04:00
Joseph Doherty 64e3fbe035 docs: backfill XML documentation across 756 files
v2-ci / build (push) Failing after 1m43s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
Adds <summary>, <param>, <typeparam>, and <inheritdoc/> tags to public
members surfaced by commentchecker — resolves 5,847 of 5,869 issues
(99.6%) across three /fixdocs passes.
2026-05-28 08:10:17 -04:00
Joseph Doherty 1f29b215c8 fix(driver-historian-wonderware): resolve Low code-review findings (Driver.Historian.Wonderware-004,005,007,008,010,011,012)
- Driver.Historian.Wonderware-004: ToHistorianEvent synthesises a fresh
  Guid when the upstream EventId is unparseable and logs the substitution
  instead of writing the historian with Guid.Empty.
- Driver.Historian.Wonderware-005: GetHealthSnapshot derives the
  connection-open booleans from the active-node fields so the snapshot
  is self-consistent without depending on the secondary lock.
- Driver.Historian.Wonderware-007: SID-mismatch branch in PipeServer now
  sends a HelloAck { Accepted=false, RejectReason } so the client sees a
  symmetric rejection.
- Driver.Historian.Wonderware-008: classify StartQuery failures —
  connection-class codes drop the connection, query-class codes throw
  QueryClassStartQueryException so the IPC layer surfaces Success=false.
- Driver.Historian.Wonderware-010: RequestTimeoutSeconds now enforced
  via BuildRequestCts linked to the caller's CancellationToken.
- Driver.Historian.Wonderware-011: refreshed XML docs to describe the
  current sidecar / named-pipe architecture (Galaxy.Host / Proxy
  references reframed as historical context).
- Driver.Historian.Wonderware-012: pinned the previously-uncovered
  HistorianDataSource behaviours with five new test files; also removed
  the stale empty tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests
  directory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 08:18:10 -04:00
Joseph Doherty 7474631992 fix(driver-historian-wonderware): resolve Medium code-review finding (Driver.Historian.Wonderware-006)
Add exponential backoff (250 ms → 500 ms → 1 s → 2 s → 4 s → 8 s cap) to
PipeServer.RunAsync after each connection-loop exception, replacing the spin
loop that previously pegged a CPU core and flooded the log on persistent errors
such as a duplicate pipe name or a failing PipeAcl.Create. After 20 consecutive
failures the method re-throws so the SCM / NSSM supervisor can restart the
sidecar cleanly. A clean connection (even a short-lived one) resets the counter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 09:23:42 -04:00
Joseph Doherty 205b07f6aa fix(driver-historian-wonderware): resolve Medium code-review finding (Driver.Historian.Wonderware-002)
Normalise req.Events to Array.Empty<AlarmHistorianEventDto>() immediately
after MessagePack deserialization in HandleWriteAlarmEventsAsync. MessagePack
deserializes an absent or explicit-nil array field as null, not Array.Empty,
so a peer that sends a null Events array would trigger a NullReferenceException
on either .Length dereference (no-writer branch or catch block), leaving the
client correlation-id wait hanging with no reply frame.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 09:21:46 -04:00
Joseph Doherty a25593a9c6 chore: organize solution into module folders (Core/Server/Drivers/Client/Tooling)
Group all 69 projects into category subfolders under src/ and tests/ so the
Rider Solution Explorer mirrors the module structure. Folders: Core, Server,
Drivers (with a nested Driver CLIs subfolder), Client, Tooling.

- Move every project folder on disk with git mv (history preserved as renames).
- Recompute relative paths in 57 .csproj files: cross-category ProjectReferences,
  the lib/ HintPath+None refs in Driver.Historian.Wonderware, and the external
  mxaccessgw refs in Driver.Galaxy and its test project.
- Rebuild ZB.MOM.WW.OtOpcUa.slnx with nested solution folders.
- Re-prefix project paths in functional scripts (e2e, compliance, smoke SQL,
  integration, install).

Build green (0 errors); unit tests pass. Docs left for a separate pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 01:55:28 -04:00