The net48 sidecar's TcpFrameServer.RunOneConnectionAsync registered the
cancellation token to Stop() only the listener (to unblock a parked
AcceptTcpClientAsync), but never closed the active client. On net48
NetworkStream.ReadAsync ignores the CancellationToken, so while the frame
loop is parked reading an idle connected client, cancelling the token cannot
unblock it — only closing the socket can. RunAsync therefore never returned
on Ctrl-C/service-stop while a connection was open (Program.Main's
RunAsync().GetAwaiter().GetResult() would hang until NSSM force-killed).
Register the cancel to Close() the active client, and convert the resulting
cancel-time read/handshake exception to OperationCanceledException so RunAsync
unwinds cleanly without logging it as a connection failure or counting it
toward MaxConsecutiveFailures.
Caught by the first-ever net48 execution of TcpRoundTripTests on the Windows
VM (these only compile on macOS): SingleActive_SecondClientHelloCompletesOnly
AfterFirstCloses deadlocked in teardown. Full net48 historian suite now green
(122 passed, 0 failed, 2 skipped); all 6 TcpRoundTrip tests pass.
Live verification on a Windows VM surfaced a crash loop: TcpFrameServer.EnsureListening
assigned _listener = new TcpListener(...) BEFORE calling Start(). When Start() throws —
e.g. the port is in a Windows excluded/reserved range (WSAEACCES) or already in use — the
field was left non-null-but-unstarted, so the `if (_listener is not null) return` guard
permanently skipped re-Start() and every subsequent AcceptTcpClientAsync() threw the
misleading InvalidOperationException "Not listening" → 20 failures → exit 2 → NSSM restart
→ loop. Now _listener is assigned only after Start() succeeds, so a transient bind failure
is retried and a permanent one surfaces the real bind error each iteration. Adds a
regression test that forces a bind conflict and asserts the SocketException persists.
All five suppressed advisories are now resolved at baseline/resolved versions,
so every NuGetAuditSuppress is removed repo-wide:
- System.Security.Cryptography.Xml (GHSA-37gx-xxp4-5rgx / GHSA-w3x6-4m5h-cxqf)
-> fixed by the .NET 10 baseline (10.0.6)
- OPCFoundation Opc.Ua.Core (GHSA-h958-fxgg-g7w3) -> fixed at resolved 1.5.378.106
Two were still live and are now patched via direct security pins:
- OpenTelemetry.Api 1.9.0 -> 1.15.3 (GHSA-g94r-2vxg-569j) pinned in Cluster;
Runtime/ControlPlane/AdminUI + tests inherit via project reference
- Tmds.DBus.Protocol 0.20.0 -> 0.21.3 (GHSA-xrw6-gwf8-vvr9) pinned in Client.UI
Also correct the Historian sidecar runtime comments (x86 -> x64, matching the
csproj PlatformTarget). Solution audit: 0 vulnerable packages; full build clean.
Adds <summary>, <param>, <typeparam>, and <inheritdoc/> tags to public
members surfaced by commentchecker — resolves 5,847 of 5,869 issues
(99.6%) across three /fixdocs passes.
Adds Directory.Packages.props (ManagePackageVersionsCentrally) and
Directory.Build.props (net10.0/nullable/implicit usings/LangVersion latest).
Strips Version attributes from every csproj PackageReference and consolidates
versions into the central file.
Side fixes (necessary to keep the build green on .NET SDK 10.0.105 on macOS):
- Microsoft.CodeAnalysis.CSharp{,.Workspaces}: 5.3.0 -> 5.0.0. The 5.3.0
analyzer DLL references compiler 5.3.0.0 and the local SDK ships compiler
5.0.0.0, producing CS9057 on every project that loaded the Analyzers
output. Master itself was broken on this machine pre-change.
- Server + Server.Tests pin OPCFoundation.NetStandard.Opc.Ua.{Configuration,
Client} to 1.5.374.126 via VersionOverride, matching Opc.Ua.Server's
pin. Mixing 1.5.378.106 Opc.Ua.Core transitively with 1.5.374.126
Opc.Ua.Server breaks CustomNodeManager2 override signatures
(CS0115 on LoadPredefinedNodes/Browse/HistoryRead*) and CS7069 in
the tests. The pin disappears when the legacy Server project is
deleted in Task 56.
- Client.UI + Client.UI.Tests: NuGetAuditSuppress for
GHSA-xrw6-gwf8-vvr9 (Tmds.DBus.Protocol 0.20.0 reaches both projects
transitively from Avalonia.Desktop on Linux/macOS only).
Deviation from the plan: TreatWarningsAsErrors=true is NOT set in
Directory.Build.props because the pre-v2 Admin/Server test projects carry
~240 xUnit1051 analyzer warnings that would fail the build. New v2 projects
opt in via their own csproj; the global flag can return once the legacy
projects are deleted in Task 56.
- Driver.Historian.Wonderware-004: ToHistorianEvent synthesises a fresh
Guid when the upstream EventId is unparseable and logs the substitution
instead of writing the historian with Guid.Empty.
- Driver.Historian.Wonderware-005: GetHealthSnapshot derives the
connection-open booleans from the active-node fields so the snapshot
is self-consistent without depending on the secondary lock.
- Driver.Historian.Wonderware-007: SID-mismatch branch in PipeServer now
sends a HelloAck { Accepted=false, RejectReason } so the client sees a
symmetric rejection.
- Driver.Historian.Wonderware-008: classify StartQuery failures —
connection-class codes drop the connection, query-class codes throw
QueryClassStartQueryException so the IPC layer surfaces Success=false.
- Driver.Historian.Wonderware-010: RequestTimeoutSeconds now enforced
via BuildRequestCts linked to the caller's CancellationToken.
- Driver.Historian.Wonderware-011: refreshed XML docs to describe the
current sidecar / named-pipe architecture (Galaxy.Host / Proxy
references reframed as historical context).
- Driver.Historian.Wonderware-012: pinned the previously-uncovered
HistorianDataSource behaviours with five new test files; also removed
the stale empty tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests
directory.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply _config.MaxValuesPerRead as a bucket cap in ReadAggregateAsync,
mirroring the existing cap in ReadRawAsync. Without this guard a processed
read over a wide time range with a small IntervalMs could accumulate an
unbounded HistorianAggregateSample list; if the serialised reply exceeded
the 16 MiB FrameWriter frame cap WriteAsync would throw and the client
correlation-id wait would hang. Truncation now logs a Warning with a hint
to widen IntervalMs or reduce the time range.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add exponential backoff (250 ms → 500 ms → 1 s → 2 s → 4 s → 8 s cap) to
PipeServer.RunAsync after each connection-loop exception, replacing the spin
loop that previously pegged a CPU core and flooded the log on persistent errors
such as a duplicate pipe name or a failing PipeAcl.Create. After 20 consecutive
failures the method re-throws so the SCM / NSSM supervisor can restart the
sidecar cleanly. A clean connection (even a short-lived one) resets the counter.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extract the string-vs-numeric value selection from raw and at-time read
loops into a SelectValue helper method. aahClientManaged's HistoryQueryResult
has no data-type field in the bound SDK version, so the heuristic (prefer
StringValue when non-empty and Value==0) is unavoidable; the helper now
documents the limitation explicitly in its XML doc so the known edge case
(numeric tag at exactly zero with a formatted StringValue) is self-evident.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Normalise req.Events to Array.Empty<AlarmHistorianEventDto>() immediately
after MessagePack deserialization in HandleWriteAlarmEventsAsync. MessagePack
deserializes an absent or explicit-nil array field as null, not Array.Empty,
so a peer that sends a null Events array would trigger a NullReferenceException
on either .Length dereference (no-writer branch or catch block), leaving the
client correlation-id wait hanging with no reply frame.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WriteToReadOnlyFile was listed in MalformedErrors, so ClassifyOutcome/
MapOutcome routed it to PermanentFail and the store-and-forward sink
dead-lettered every alarm event in the batch. But WriteToReadOnlyFile is
a connection-configuration fault (the write session was opened without
ReadOnly = false), not an event-payload fault — treating it as permanent
silently and permanently discards alarm events on a misconfigured or
regressed connection, which is data loss.
Move WriteToReadOnlyFile from MalformedErrors into ConnectionErrors. The
batch loop now aborts the batch, resets the connection (so the reconnect
path re-opens a writable ReadOnly = false session), and defers the
events as RetryPlease for the next drain tick.
Updated the ClassifyOutcome theory data and added a dedicated regression
test pinning WriteToReadOnlyFile -> RetryPlease.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SdkAlarmHistorianWriteBackend.WriteBatchAsync replaces the RetryPlease
placeholder with the real entry point — HistorianAccess.AddStreamedValue
(HistorianEvent, out HistorianAccessError) in aahClientManaged, pinned by
decompiling the installed SDK.
The write path opens its own ReadOnly=false connection: the query-side
HistorianDataSource opens ReadOnly sessions and AddStreamedValue fails on
those with WriteToReadOnlyFile. IHistorianConnectionFactory gains a readOnly
parameter (default true, query path unchanged); BuildConnectionArgs is
extracted as a pure helper. HistorianClusterEndpointPicker is shared for
node failover; connection-class errors abort the batch as RetryPlease and
reset the connection, malformed-input codes map to PermanentFail.
Tests: connection-unavailable batch deferral, ClassifyOutcome error-code
table, BuildConnectionArgs read-vs-write shaping (80 pass, 2 rig-skipped).
Live_* round-trip tests stay Skip-gated for the D.1 rollout smoke.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Group all 69 projects into category subfolders under src/ and tests/ so the
Rider Solution Explorer mirrors the module structure. Folders: Core, Server,
Drivers (with a nested Driver CLIs subfolder), Client, Tooling.
- Move every project folder on disk with git mv (history preserved as renames).
- Recompute relative paths in 57 .csproj files: cross-category ProjectReferences,
the lib/ HintPath+None refs in Driver.Historian.Wonderware, and the external
mxaccessgw refs in Driver.Galaxy and its test project.
- Rebuild ZB.MOM.WW.OtOpcUa.slnx with nested solution folders.
- Re-prefix project paths in functional scripts (e2e, compliance, smoke SQL,
integration, install).
Build green (0 errors); unit tests pass. Docs left for a separate pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>