Concurrency hazards, DI lifetime hygiene, and one verify-only confirmation
across 8 modules. Highlights:
Concurrency:
- CentralUI-030: SandboxConsoleCapture writes routed through WriteSynchronized
locking on the captured StringWriter — intra-script Task fan-out can no
longer corrupt the per-call buffer.
- Commons-021: ExternalCallResult.Response now backed by Lazy<dynamic?>
(ExecutionAndPublication) — no more benign double-parse race.
- CD-017: DeploymentManagerRepository.DeleteDeploymentRecordAsync now takes
an expected RowVersion and seeds entry.OriginalValues so EF emits
DELETE ... WHERE Id=@id AND RowVersion=@prior; stale RowVersion now
throws DbUpdateConcurrencyException instead of silent overwrite.
- Transport-009: AuditCorrelationContext.BundleImportId backed by
AsyncLocal<Guid?> so concurrent imports get per-logical-call isolation
(was a scoped instance shared via AuditService across runs).
DI / lifetime:
- AuditLog-003: All 3 AuditLog actor handlers switched to CreateAsyncScope
+ await using — async EF disposal no longer swallowed.
- AuditLog-007: INodeIdentityProvider resolution standardised on
GetRequiredService<>() (was mixed with GetService<>()).
- AuditLog-011: AddAuditLogHealthMetricsBridge guarded by sentinel
descriptor check — calling twice no longer double-registers the hosted
service.
Shutdown / supervision:
- SiteCallAudit-002: AkkaHostedService adds a CoordinatedShutdown
cluster-leave task (drain-site-call-audit-singleton) that issues a
bounded GracefulStop(10s) so failover waits for in-flight upserts.
Registration safety:
- NS-020: AkkaHostedService now guards NotificationForwarder S&F
registration with _notificationDeliveryHandlerRegistered + throws
InvalidOperationException on double-register to make the regression loud.
VERIFY-only closures:
- NotifOutbox-005: Confirmed already closed by CD-015 fix (ac96b83) —
NotificationOutboxRepository.InsertIfNotExistsAsync uses the same
raw-SQL IF NOT EXISTS + 2601/2627 swallow pattern; race eliminated.
5+ new regression tests (CentralUI sandbox WhenAll, ExternalCallResult
64-reader Barrier, AuditLog DI idempotency, RowVersion stale-throw,
SiteCallAudit-002 shutdown drain). Build clean; affected suites all green.
README regenerated: 65 open (was 75).
69 KiB
Code Review — Commons
| Field | Value |
|---|---|
| Module | src/ScadaLink.Commons |
| Design doc | docs/requirements/Component-Commons.md |
| Status | Reviewed |
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | 1eb6e97 |
| Open findings | 6 |
Summary
Commons is in good overall health. It is a well-organized, dependency-light library: the architectural-constraint tests enforce the no-Akka/no-EF/no-ASP.NET rule, the POCO-entity and message-as-record conventions, and the UTC timestamp rule. The folder and namespace hierarchy closely matches REQ-COM-5b. No Critical issues were found.
The findings cluster around three themes. First, a handful of files quietly stretch
the REQ-COM-6 "no business logic" boundary — StaleTagMonitor, OpcUaEndpointConfigSerializer,
OpcUaEndpointConfigValidator, ScriptParameters, ValueFormatter, DynamicJsonElement
and ScriptArgs all carry non-trivial behavior, and a couple have real correctness or
concurrency defects (the StaleTagMonitor stale-fire race, the DynamicJsonElement
JsonDocument-lifetime hazard, the silent conversion-failure swallowing in
ScriptParameters.GetNullable). Second, the ManagementCommandRegistry name mapping is
asymmetric and namespace-scoped in a way that does not match the broader set of
*Command records elsewhere in Messages/. Third, several behavior-bearing types
(ValueFormatter, DynamicJsonElement, ScriptArgs, ManagementCommandRegistry,
Result<T>, the OPC UA serializer round-trip) have no unit tests despite containing the
kind of edge-case logic that warrants them. Entity and message contracts otherwise look
clean and additive-evolution-friendly, with the exception of one ValueTuple use in a
wire command.
Re-review 2026-05-17 (commit 39d737e). All twelve prior findings (Commons-001
through Commons-012) are confirmed Resolved — the fixes are sound, well-targeted, and
backed by focused regression tests (StaleTagMonitorRaceTests, DynamicJsonElementTests,
ScriptParametersTests, ManagementCommandRegistryTests, OpcUaEndpointConfigSerializerTests,
ResultTests, ValueFormatterTests, ConnectionBindingSerializationTests,
FlatteningAndScriptScopeTests). The new files introduced since 9c60592
(TemplateAlarm lock/inherit fields, IExternalSystemRepository name-keyed lookups,
DeploymentStateQueryRequest/Response, ParameterDefinition) follow the established
POCO / record / additive-evolution conventions and carry round-trip compatibility tests.
Two new Low-severity findings were recorded this pass: a DynamicJsonElement array
indexer that rejects long indices (Commons-013) and an OpcUaEndpointConfigSerializer
legacy-fallback path that can mislabel a corrupt new-shape row as Legacy (Commons-014).
No Critical, High, or Medium issues were found.
Re-review 2026-05-28 (commit 1eb6e97)
Commons has grown substantially since 39d737e — 132 changed files (≈ +4 600 lines), driven
by the Audit Log (#23), Site Call Audit (#22), and Transport (#24) work. The new surface
area covers six new entity domain folders (Audit, Transport types under Types/Transport),
seven new service interfaces (IPartitionMaintenance, INodeIdentityProvider,
ISiteAuditQueue, ICachedCallLifecycleObserver, ICachedCallTelemetryForwarder,
IOperationTrackingStore, IBundleExporter / IBundleImporter / IBundleSessionStore /
IAuditCorrelationContext), a new IAuditLogRepository, and three new message folders
(Messages/Audit/, Messages/Integration/ extensions, Messages/Management/TransportCommands).
The SourceNode thread-through and ExecutionId / ParentExecutionId additive-evolution
fields are uniformly applied across AuditEvent, SiteCall, Notification,
NotificationSubmit, RouteToCallRequest, ScriptCallRequest, and SiteHealthReport —
all as trailing optional parameters, consistent with REQ-COM-5a.
All fourteen prior findings (Commons-001 through Commons-014) remain Resolved. Nine new
findings were recorded this pass: one Medium on the lack of UTC-kind enforcement for the
new DateTime-typed *Utc columns (Commons-019), one Medium on unconstrained
EncryptionMetadata (Commons-015), one Medium on the now-substantially-stale design doc
(Commons-017), and six Low findings covering minor convention drift, missing unit tests
for the Transport types, an unresolvable <see cref> in IAuditCorrelationContext, a
benign lazy-parse race in ExternalCallResult.Response, undocumented JSON-blob shapes,
two interfaces parked in the wrong folder, and a magic-number threshold in BundleSession.
No Critical or High issues were found.
The architectural-constraint tests still enforce the no-Akka/no-EF/no-ASP.NET rule, the
POCO-entity and message-as-record conventions, and the ToLocalTime ban; they do not yet
cover the new *Utc-suffixed DateTime properties on AuditEvent / SiteCall. Test
coverage for the new types is uneven — TrackedOperationId, SiteCallOperational,
CachedCallTelemetry, SiteCallQueries, AuditQueryParamParsers, ApiKeyHasher,
Notification, and SiteCall are all directly tested; the Transport types
(BundleManifest, EncryptionMetadata, BundleSession, BundleSummary, ExportSelection,
ImportPreview, ImportResolution, ImportResult, ManifestContentEntry) have only
integration-level coverage in tests/ScadaLink.Transport.IntegrationTests/, with no
shape/serialization tests in ScadaLink.Commons.Tests.
Checklist coverage
| # | Category | Examined | Notes |
|---|---|---|---|
| 1 | Correctness & logic bugs | ✓ | DynamicJsonElement.TryConvert returns success for non-convertible types; Result<T> allows null error; legacy-config fallback loses data. Re-review: DynamicJsonElement.TryGetIndex rejects non-int indices (Commons-013). |
| 2 | Akka.NET conventions | ✓ | Commons has no actors (correct). Message contracts are records and immutable. One wire message uses ValueTuple (Commons-008). Correlation IDs present on request/response messages. |
| 3 | Concurrency & thread safety | ✓ | StaleTagMonitor has a check-then-act race between the timer callback and OnValueReceived (Commons-001). |
| 4 | Error handling & resilience | ✓ | ScriptParameters.GetNullable silently swallows conversion failures (Commons-003); OPC UA legacy deserialize discards malformed input (Commons-005). Re-review: corrupt typed OPC UA rows can fall through to the legacy path and be mislabelled Legacy (Commons-014). |
| 5 | Security | ✓ | No auth logic here. SmtpConfiguration.Credentials / OPC UA passwords are plain-string fields (storage/encryption is a consumer concern) — noted, not a finding. No script-trust violations: Commons defines no forbidden-API surface. |
| 6 | Performance & resource management | ✓ | StaleTagMonitor disposes its Timer correctly. DynamicJsonElement references a JsonElement whose backing document lifetime is not owned (Commons-002). |
| 7 | Design-document adherence | ✓ | Several behavior-bearing helper/validator/serializer classes push against REQ-COM-6 "no business logic" (Commons-007). Folder layout matches REQ-COM-5b. |
| 8 | Code organization & conventions | ✓ | ManagementCommandRegistry naming is asymmetric/namespace-scoped (Commons-004). DeployedConfigSnapshot, InstanceAlarmOverride, TemplateFolder, ISiteRepository, several service interfaces and Messages/Management exist but are not listed in Component-Commons.md (Commons-009). |
| 9 | Testing coverage | ✓ | ValueFormatter, DynamicJsonElement, ScriptArgs, ManagementCommandRegistry, Result<T>, ConfigurationDiff, AlarmContext, and the OPC UA serializer round-trip have no tests (Commons-010). |
| 10 | Documentation & comments | ✓ | OpcUaEndpointConfigSerializer.Deserialize XML doc does not mention the silent data-loss path (Commons-005). Component-Commons.md is stale relative to the actual file set (Commons-009). ValueFormatter uses current-culture formatting without documenting it (Commons-012). |
Checklist coverage — Re-review 2026-05-28 (commit 1eb6e97)
| # | Category | Examined | Notes |
|---|---|---|---|
| 1 | Correctness & logic bugs | ✓ | EncryptionMetadata accepts any algorithm string + any iteration count with no validation (Commons-015). New *Utc-suffixed DateTime columns on AuditEvent/SiteCall have no DateTimeKind.Utc enforcement and are inconsistent with Notification's DateTimeOffset (Commons-019). |
| 2 | Akka.NET conventions | ✓ | Commons has no actors. All new message contracts (Messages/Audit, Messages/Integration extensions, RouteToCallRequest, ScriptCallRequest) are records with trailing optional members per REQ-COM-5a. Correlation IDs present on request/response messages. |
| 3 | Concurrency & thread safety | ✓ | IAuditCorrelationContext documents its scoped/sequential thread-safety contract explicitly (good). ExternalCallResult.Response has a benign lazy-parse race — two concurrent reads can both parse and produce distinct wrappers (Commons-021). |
| 4 | Error handling & resilience | ✓ | The new ingest/upsert command + reply pairs (UpsertSiteCallReply, IngestAuditEventsReply, IngestCachedTelemetryReply) carry idempotency-friendly accepted-id lists and an Accepted flag that explicitly does NOT propagate audit-write failure to the user-facing action (alog.md §13). |
| 5 | Security | ✓ | ApiKeyHasher correctly fails-fast on missing / weak pepper (≥16 chars), uses HMAC-SHA256, never accepts a null plaintext, and provides a clearly-labelled Default for tests only. ApiKey.FromHash is the production constructor; the plaintext constructor only ever uses the unpeppered Default and is documented as such. No script-trust violations in any new file. |
| 6 | Performance & resource management | ✓ | IBundleSessionStore.EvictExpired exists for sessions — good. BundleSession carries DecryptedContent plus Manifest per session; the size is bounded by the configured bundle cap but no explicit per-session size accounting. ExternalCallResult.Response lazy parse not thread-safe (Commons-021). |
| 7 | Design-document adherence | ✓ | Component-Commons.md is now significantly stale relative to the actual file set: stale enum values for AuditKind/AuditStatus, missing AuditEvent/SiteCall entities, missing IAuditLogRepository, missing six service interfaces and Interfaces/Transport/, missing four Types/* folders and Messages/Audit/ (Commons-017). |
| 8 | Code organization & conventions | ✓ | IOperationTrackingStore and IPartitionMaintenance live at the root of Interfaces/ rather than under Interfaces/Services/ (Commons-018). BundleSession.Locked uses a magic 3 rather than a named constant (Commons-016). Message contracts and entities otherwise follow the additive-evolution / POCO / record conventions. |
| 9 | Testing coverage | ✓ | Transport types (BundleManifest, EncryptionMetadata, BundleSession, BundleSummary, ExportSelection, ImportPreview, ImportResolution, ImportResult, ManifestContentEntry) have no unit tests in tests/ScadaLink.Commons.Tests/; only tests/ScadaLink.Transport.IntegrationTests/ exercises them (Commons-020). IngestAuditEventsCommand / IngestCachedTelemetryCommand / UpsertSiteCallCommand / PullAuditEventsRequest / PullAuditEventsResponse / AuditTelemetryEnvelope shape tests also absent. |
| 10 | Documentation & comments | ✓ | IAuditCorrelationContext references BundleImporter.ApplyAsync — an implementation type Commons does not see, so the <see cref> is unresolvable (Commons-022b, folded into Commons-022). ImportPreviewItem.FieldDiffJson and Notification.ResolvedTargets are JSON-string columns with no documented shape contract (Commons-022). |
Findings
Commons-001 — StaleTagMonitor stale-fire race between timer and OnValueReceived
| Severity | Medium |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Types/StaleTagMonitor.cs:42-46, :62-67 |
Description
OnValueReceived sets _staleFired = false then calls _timer.Change(...), while the
timer callback OnTimerElapsed reads _staleFired, sets it to true, and invokes the
Stale event. _staleFired is volatile, which guarantees visibility but not
atomicity of the check-then-set. The two methods run on different threads (a value-
arrival thread and a ThreadPool timer thread). If the timer callback has already
passed the if (_staleFired) return; check when OnValueReceived runs, Stale fires
even though a fresh value just arrived — a spurious staleness signal. There is also a
window where OnValueReceived resets _staleFired and reschedules the timer while a
callback for the previous period is mid-flight, so Stale can fire once per period as
documented but at the wrong moment. For a heartbeat monitor feeding connection-health
decisions, a false stale signal can trigger an unnecessary reconnect.
Recommendation
Guard the state transition with a lock, or replace the _staleFired bool with an
Interlocked.CompareExchange on an int so only one of "fire" / "reset" wins. The
callback should atomically test-and-set; OnValueReceived should atomically reset and
only then reschedule the timer.
Resolution
Resolved 2026-05-16 (commit pending) — confirmed the race against the source. Replaced
the volatile bool guard with a lock-protected monotonic generation token: Start,
OnValueReceived and Stop each bump the generation under a gate, and the timer
callback only raises Stale if its scheduled generation still matches. OnValueReceived
now recreates the timer (rather than Change-ing it) so the rescheduled callback carries
the new token. A superseded or stopped period can no longer emit a spurious staleness
signal. Regression tests added in StaleTagMonitorRaceTests (deterministic via an
internal CallbackEnteredHook test seam).
Commons-002 — DynamicJsonElement retains a JsonElement whose JsonDocument lifetime it does not own
| Severity | Medium |
| Category | Performance & resource management |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Types/DynamicJsonElement.cs:10-17 |
Description
DynamicJsonElement stores a JsonElement and exposes it for deferred, dynamic access
from scripts. A JsonElement is only valid while the JsonDocument that produced it has
not been disposed; accessing a JsonElement after its document is disposed throws
ObjectDisposedException. Nothing in DynamicJsonElement keeps the document alive or
documents that the caller must. Because the wrapper is explicitly designed for
"convenient property access in scripts" — i.e. access at an arbitrary later time — a
caller that wraps an element from a using var doc = JsonDocument.Parse(...) block (the
exact pattern used in OpcUaEndpointConfigSerializer) will hand scripts a wrapper that
faults on first member access.
Recommendation
Either clone the element on construction with JsonElement.Clone() (which detaches it
from the document and makes it safe to retain), or hold a reference to the owning
JsonDocument and implement IDisposable. Document the lifetime contract on the type
regardless.
Resolution
Resolved 2026-05-16 (commit pending) — confirmed the hazard: ExternalCallResult.Response
constructs the wrapper from JsonDocument.Parse(...).RootElement with no reference kept
to the document, so deferred script-time access could fault. Fixed at the root by cloning
the element with JsonElement.Clone() in the DynamicJsonElement constructor, detaching
it from the owning document; the public constructor signature is unchanged. Added a
remarks block documenting the lifetime contract. Regression tests added in
DynamicJsonElementTests (access after the source document is disposed / GC-collected).
Commons-003 — ScriptParameters.GetNullable silently swallows conversion failures
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Types/ScriptParameters.cs:72-86 |
Description
GetNullable<T> catches ScriptParameterException from ConvertScalar and returns
default! (null) "on conversion failure for nullable". This conflates two distinct
cases: a parameter that is genuinely absent/null, and a parameter that is present but
holds an unconvertible value (e.g. Get<int?>("count") when count is the string
"banana"). The latter is almost always a script or caller bug, and silently mapping it
to null hides it — the script then proceeds with a null it interprets as "not
supplied". The non-nullable Get<T> and the array/list paths correctly throw with a
descriptive message for the same bad input, so the behavior is also inconsistent across
the API surface. The XML doc states "returns null if missing, null, or unconvertible",
so the behavior is intentional, but it remains a footgun.
Recommendation
Distinguish "absent/null" from "present but unconvertible": return null only for the
former and throw ScriptParameterException for the latter, mirroring the array/list
element handling. If the swallowing must stay for compatibility, at minimum surface it
(e.g. an out-of-band warning) rather than failing silently.
Resolution
Resolved 2026-05-16 (commit pending) — confirmed the silent-swallow path against the
source. Removed the catch (ScriptParameterException) block in GetNullable<T>: an
absent or explicitly-null parameter still returns null, but a parameter that is
present but holds an unconvertible value now throws ScriptParameterException with a
descriptive message, consistent with Get<T>() and the array/list element paths. The
Get<T> XML doc was corrected accordingly. This is a deliberate behavioral change toward
correctness — the previous behavior masked caller/script bugs; the type-level public
contract is unchanged. Regression tests added in ScriptParametersTests
(Get_NullableInt_PresentButUnparsable_Throws and siblings).
Commons-004 — ManagementCommandRegistry name mapping is asymmetric and namespace-scoped
| Severity | Medium |
| Category | Code organization & conventions |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Messages/Management/ManagementCommandRegistry.cs:14-35 |
Description
BuildRegistry registers only types in the exact ScadaLink.Commons.Messages.Management
namespace whose names end in Command. GetCommandName(Type), however, strips a
Command suffix from any type passed to it. The two halves disagree:
GetCommandNamewill happily compute a command name for*Commandrecords that live in otherMessages/sub-namespaces (DeployInstanceCommandinMessages.Deployment,DisableInstanceCommandinMessages.Lifecycle,SetStaticAttributeCommandinMessages.Instance,DeployArtifactsCommandinMessages.Artifacts, etc.), yetResolvewill returnnullfor every one of those names because they were never registered.- Because of this gap the Management namespace carries deliberately renamed duplicates
(
MgmtDeployInstanceCommand,MgmtEnableInstanceCommand,MgmtDisableInstanceCommand,MgmtDeleteInstanceCommandinInstanceCommands.cs) whoseMgmtprefix exists only to dodge a collision the registry's namespace filter already prevents — a confusing, undocumented coupling.
A round-trip Resolve(GetCommandName(t)) is therefore not guaranteed to return t,
which is the implicit contract of a name registry.
Recommendation
Make the two methods symmetric: either scan all of Messages/ (and detect/throw on
duplicate stripped names, since ToFrozenDictionary will throw on a collision) or
restrict GetCommandName to types the registry actually contains. Document the chosen
scope, and reconsider whether the Mgmt* prefixed duplicates are still needed.
Resolution
Resolved 2026-05-16 (commit pending) — confirmed the asymmetry: GetCommandName stripped
Command from any type while BuildRegistry only registered the Messages.Management
namespace. In practice no defect was observed because every command type the CLI and
ManagementService actually use is in Messages.Management (a round-trip test over all
registered commands confirms no name collision). Closed the asymmetry by making
GetCommandName registry-bound: it now looks up a reverse Type→name frozen dictionary
built from the same registry and throws ArgumentException for any unregistered type, so
Resolve(GetCommandName(t)) == t holds for every type it accepts. Added an XML remarks
block documenting the registry scope and the symmetry guarantee. The Mgmt* prefixed
records were left in place — they are the genuine Management-namespace command types the
CLI constructs and renaming them would change wire command names (out of scope for a
behavior-preserving fix; noted for a future cleanup). CLI, ManagementService, and
SiteRuntime all build clean against the change. Regression tests added in
ManagementCommandRegistryTests.
Commons-005 — OpcUaEndpointConfigSerializer.Deserialize discards malformed legacy input and over-reports IsLegacy
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Serialization/OpcUaEndpointConfigSerializer.cs:25-51 |
Description
When the typed-deserialize path fails or the JSON lacks endpointUrl, Deserialize
falls through to LoadLegacy. If LoadLegacy itself throws JsonException (genuinely
malformed JSON), the method returns (new OpcUaEndpointConfig(), IsLegacy: true) — a
default, empty config with the legacy flag set. The original stored string is silently
discarded, and the caller is told it is a recoverable "legacy" row when in fact the data
was unparseable. A form built on the documented IsLegacy contract ("prompt the user to
re-save") will present an empty config as if it were the user's saved configuration,
inviting them to overwrite real (if malformed) data with blanks. The XML doc only
describes the happy legacy path and does not mention this data-loss branch.
Recommendation
Distinguish "parsed as legacy" from "could not parse at all" — e.g. return a third state or throw for genuinely malformed input so the caller can surface an error instead of an empty form. Update the XML doc to describe the failure branch.
Resolution
Resolved 2026-05-16 (commit pending) — confirmed the over-reporting and silent data-loss
branch. Deserialize now returns an OpcUaConfigParseResult (a readonly record struct)
carrying an explicit OpcUaConfigParseStatus (Typed / Legacy / Malformed); genuinely
unparseable input is classified Malformed with IsLegacy == false instead of being
mislabelled as a recoverable legacy row. The struct keeps a custom two-element
Deconstruct(out Config, out bool isLegacy) so the existing SiteRuntime caller
(var (config, _) = ...) is unaffected — verified by a green SiteRuntime build. The XML
doc now describes all three outcomes. Throwing was rejected because the sole consumer
(DeploymentManagerActor.FlattenConnectionConfig) does not wrap the call and is
out-of-scope to change. Regression tests added in OpcUaEndpointConfigSerializerTests
(Deserialize_Malformed_ReportsMalformedNotLegacy, Deserialize_LegacyParsed_StatusIsLegacy,
Deserialize_ObjectWithoutEndpointUrl_ParsesAsLegacy,
Deserialize_TwoElementDeconstruction_StillWorks).
Commons-006 — DynamicJsonElement.TryConvert reports success for unconvertible target types
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Types/DynamicJsonElement.cs:47-51, :66-76 |
Description
TryConvert does result = ConvertTo(binder.Type); return result != null || binder.Type == typeof(object);.
ConvertTo returns null for any type/kind pair it does not handle (e.g. requesting
int from a JSON string, or DateTime from anything). For a non-object target this
yields result == null and return false, which is correct. But the || binder.Type == typeof(object)
clause makes (object)dynamicElement succeed with a null result even when the wrapped
element is, say, a JSON object or a non-null string — the cast silently produces null
instead of the element or its value. Any script doing object o = jsonThing; gets null
for a present value. The conversion of a present, non-null JSON value should never yield
null.
Recommendation
For the object target, return the element itself (or Wrap(_element)) rather than
null. Only return null when the wrapped element is genuinely JsonValueKind.Null.
Resolution
Resolved 2026-05-16 (commit pending) — confirmed by a direct TryConvert regression test
that the original code returned (result: null, true) for an object conversion of a
present value. TryConvert now handles the object target explicitly: it returns
Wrap(_element) (the unwrapped scalar, or the wrapper for objects/arrays) and yields
null only for a genuine JsonValueKind.Null. Non-object targets keep the correct
behavior — a null from ConvertTo now reports false (the dead || typeof(object)
clause is removed) so the binder surfaces a real binding error for unconvertible pairs.
Regression tests added in DynamicJsonElementTests (TryConvert_ObjectTarget_OnPresentValue_ReturnsNonNull,
TryConvert_ObjectTarget_OnJsonNull_ReturnsNull,
TryConvert_NonObjectTarget_OnUnconvertibleValue_ReportsFailure).
Commons-007 — Several Commons types carry non-trivial logic, stretching REQ-COM-6
| Severity | Low |
| Category | Design-document adherence |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Types/ScriptParameters.cs, src/ScadaLink.Commons/Serialization/OpcUaEndpointConfigSerializer.cs, src/ScadaLink.Commons/Validators/OpcUaEndpointConfigValidator.cs, src/ScadaLink.Commons/Types/StaleTagMonitor.cs, src/ScadaLink.Commons/Types/ScriptArgs.cs |
Description
REQ-COM-6 states Commons "must contain only data structures, interfaces, enums, and
constants" and "must not contain any business logic", with method bodies "limited to
trivial data-access logic". Several files exceed that: ScriptParameters performs typed
conversion with reflection and JSON-element unwrapping; OpcUaEndpointConfigSerializer
implements a multi-shape (typed + legacy flat-dict) serialization strategy;
OpcUaEndpointConfigValidator encodes OPC UA domain rules (e.g. LifetimeCount ≥ 3×
KeepAliveCount); StaleTagMonitor runs a Timer and raises events; ScriptArgs
reflects over arbitrary objects. The ArchitecturalConstraintTests "no service/actor"
heuristic only counts public methods (> 3) and so does not catch these. This is design
drift, not a defect — but it should be a deliberate decision: either move these helpers
into the components that own the behavior (Data Connection Layer, Site Runtime,
Template Engine) or amend Component-Commons.md to explicitly permit "pure stateless
helpers/validators".
Recommendation
Decide and document the policy. If these are intentionally allowed in Commons, add a sentence to REQ-COM-6 carving out pure validators/serializers/parsers; otherwise relocate them. Tighten the architectural test if the rule is meant to be enforced.
Resolution
Resolved 2026-05-16 (commit pending) — design-document decision. This is drift, not a
defect: the helpers are pure, dependency-light (no Akka/ASP.NET/EF, no I/O), and shared
across components, so the intentional choice is to keep them in Commons rather than
duplicate the logic. REQ-COM-6 in docs/requirements/Component-Commons.md was amended
with an explicit "pure-helper carve-out" — stateless, side-effect-free helpers that only
transform/format/parse/validate Commons' own data types are now permitted, with the
qualifying conditions (no I/O dependency, no shared mutable cross-call state, no
orchestration) and the current examples (Result<T>, ScriptParameters, ScriptArgs,
ValueFormatter, DynamicJsonElement, StaleTagMonitor, OpcUaEndpointConfigSerializer,
OpcUaEndpointConfigValidator) spelled out. No regression test — this is a documentation
policy decision. The ArchitecturalConstraintTests heuristic was left as-is since the
carve-out makes these types compliant by design.
Commons-008 — SetConnectionBindingsCommand uses ValueTuple in a wire message contract
| Severity | Low |
| Category | Akka.NET conventions |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Messages/Management/InstanceCommands.cs:10 |
Description
SetConnectionBindingsCommand declares
IReadOnlyList<(string AttributeName, int DataConnectionId)> Bindings. The tuple element
names are compile-time-only; System.Text.Json serializes a ValueTuple as Item1 /
Item2, and the message is positional with no room for additive evolution (you cannot
add a third field without changing the tuple type, which REQ-COM-5a forbids). Every other
message in Messages/ uses named records. A management command travels over the
ClusterClient boundary and is exactly the kind of contract REQ-COM-5a's additive-only
rule targets.
Recommendation
Replace the tuple with a small named record, e.g.
record ConnectionBinding(string AttributeName, int DataConnectionId), and use
IReadOnlyList<ConnectionBinding>.
Resolution
Resolved 2026-05-16 (commit <pending>) — confirmed against the source: the ValueTuple
was the only positional/tuple element in Messages/ and System.Text.Json serializes it
as Item1/Item2, unfriendly to REQ-COM-5a additive evolution. Introduced a named record
ConnectionBinding(string AttributeName, int DataConnectionId) in
Messages/Management/InstanceCommands.cs (alongside the command, matching the existing
record-per-file convention) and changed SetConnectionBindingsCommand.Bindings to
IReadOnlyList<ConnectionBinding>. All consumers were updated in lock-step: ScadaLink.CLI
(InstanceCommands.TryParseBindings now builds a List<ConnectionBinding>),
ScadaLink.TemplateEngine (InstanceService.SetConnectionBindingsAsync parameter),
ScadaLink.ManagementService (ManagementActor forwards cmd.Bindings unchanged — the
new element type flows through), and one consumer the finding did not list,
ScadaLink.CentralUI (InstanceConfigure.razor's SaveBindings built a List<(string,int)>).
A repo-wide src/ scan confirms no other references remain. Regression tests added in
ConnectionBindingSerializationTests (ConnectionBinding_SerializesWithNamedProperties
asserts named AttributeName/DataConnectionId JSON properties and the absence of
Item1/Item2; SetConnectionBindingsCommand_RoundTripsThroughJson asserts a full
JSON round-trip); existing parse/forward tests (ParseBindings_ValidJson_ReturnsPairs,
SetConnectionBindings_BulkAssignment_Success) were updated to the new type. Affected
suites are green (Commons 226, CLI 77, ManagementService 55, TemplateEngine 287) and
dotnet build ScadaLink.slnx is clean.
Commons-009 — Component-Commons.md is stale relative to the actual file set
| Severity | Low |
| Category | Documentation & comments |
| Status | Resolved |
| Location | docs/requirements/Component-Commons.md:61-198 |
Description
The design doc's entity list, repository list, and folder tree no longer match the code:
- Entities present but undocumented:
DeployedConfigSnapshot,InstanceAlarmOverride,TemplateFolder. - Repository interface present but undocumented:
ISiteRepository(the doc lists seven repositories under REQ-COM-4; the code has eight). - Service interfaces present but undocumented:
IDatabaseGateway,IExternalSystemClient,IInstanceLocator,INotificationDeliveryService— REQ-COM-4a documents onlyIAuditService. - Whole namespaces absent from the REQ-COM-5b folder tree:
Messages/Management,Messages/DataConnection,Messages/Integration,Messages/Instance,Messages/RemoteQuery, plusTypes/DataConnections,Types/Scripts,Serialization/, andValidators/.
CLAUDE.md's editing rules require the design docs to stay in sync with the code; the doc is now a partial map.
Recommendation
Refresh Component-Commons.md to enumerate the current entities, repository and service
interfaces, and the actual Types/, Messages/, Serialization/, and Validators/
folders.
Resolution
Resolved 2026-05-16 (commit pending) — docs/requirements/Component-Commons.md was
refreshed against the actual file set. REQ-COM-3 now lists TemplateFolder,
InstanceAlarmOverride and DeployedConfigSnapshot and drops the non-existent
SiteDataConnectionAssignment (no such entity in the code). REQ-COM-4 adds
ISiteRepository (eight repositories). REQ-COM-4a documents IDatabaseGateway,
IExternalSystemClient, IInstanceLocator and INotificationDeliveryService alongside
IAuditService, and corrects the IAuditService owner to the Configuration Database
component. The REQ-COM-5b folder tree was rewritten to include the previously-absent
Types/DataConnections, Types/Flattening, Types/Scripts, the helper types under
Types/, the Messages/DataConnection|Instance|Integration|InboundApi|RemoteQuery|Management
namespaces, and the Serialization/ and Validators/ folders. Documentation-only; no
regression test.
Commons-010 — Behavior-bearing Commons types have no unit tests
| Severity | Low |
| Category | Testing coverage |
| Status | Resolved |
| Location | tests/ScadaLink.Commons.Tests/ |
Description
ScadaLink.Commons.Tests covers Result, RetryPolicy, ScriptParameters,
StaleTagMonitor, the OPC UA validator, enums, message conventions, compatibility, and
entity conventions. It does not cover several types that contain exactly the kind of
edge-case logic that warrants tests:
ValueFormatter— scalar vs collection vs null formatting.DynamicJsonElement— member/index access, conversions, the issues in Commons-002 and Commons-006 would have been caught by tests.ScriptArgs.Normalize— dictionary/anonymous-object/primitive-rejection paths.ManagementCommandRegistry—Resolve/GetCommandNameround-trip (would have surfaced Commons-004).Result<T>—Match, failure/success accessors, error-on-misuse.OpcUaEndpointConfigSerializertyped↔flat round-trip and legacy fallback.ConfigurationDiff/AlarmContext/ScriptScope— minor, butHasChanges/HasParentlogic is untested.
Recommendation
Add focused unit tests for the helper/utility types above, prioritizing
DynamicJsonElement, ScriptArgs, ManagementCommandRegistry, and the OPC UA serializer
round-trip.
Resolution
Resolved 2026-05-16 (commit pending) — all the listed types now have unit tests.
DynamicJsonElement, ManagementCommandRegistry, Result<T> and the OPC UA serializer
were already covered by tests added in earlier finding fixes (Commons-002/004) and this
pass extended them (Commons-005/006). New focused test files added this pass:
ValueFormatterTests (scalar/collection/null formatting plus the Commons-012
culture-invariance regression), ScriptArgsTests (dictionary / read-only-dictionary /
non-generic-IDictionary / anonymous-object / primitive-rejection paths of
ScriptArgs.Normalize), and FlatteningAndScriptScopeTests (ConfigurationDiff.HasChanges
and ScriptScope.HasParent computed-property logic). The module suite is green at 224
tests (up from 196).
Commons-011 — Result<T>.Failure accepts a null error string
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Types/Result.cs:15-20, :30-32, :36 |
Description
Result<T>.Failure(string error) and the private failure constructor do not validate
error. A caller passing null produces a failed Result whose Error getter returns
null via _error!, and whose Match calls onFailure(_error!) with null. Result
is the system-wide error-handling type ("consistent error handling across component
boundaries"); a failed result with no error message defeats its purpose and pushes a
NullReferenceException risk onto every consumer that logs or displays Error.
Recommendation
Throw ArgumentNullException (or ArgumentException for empty/whitespace) in
Failure/the failure constructor so a failed Result always carries a message.
Resolution
Resolved 2026-05-16 (commit pending) — the private failure constructor now calls
ArgumentException.ThrowIfNullOrWhiteSpace(error), so a failed Result<T> always carries
a usable message (ArgumentNullException for null, ArgumentException for empty/
whitespace). The Failure factory XML doc records both exceptions. A repo-wide scan of
.Failure(...) call sites confirmed every caller passes a non-empty literal or
interpolated string, so no consumer is broken. Regression tests added in ResultTests
(Failure_WithNullError_ShouldThrow, Failure_WithBlankError_ShouldThrow).
Commons-012 — ValueFormatter uses current-culture formatting without documenting it
| Severity | Low |
| Category | Documentation & comments |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Types/ValueFormatter.cs:20-27 |
Description
FormatDisplayValue formats IFormattable values (and collection elements) with the
parameterless ToString(), which uses the current thread culture. The XML doc calls this
"the value's natural string representation" without noting the culture dependency. The
same numeric or DateTime attribute value will render differently depending on the
server/UI locale — e.g. decimal separators, date order. CLAUDE.md mandates UTC for
timestamps and notes local-time conversion is "a UI display concern only"; if
ValueFormatter is used outside a UI rendering context (e.g. logging, event-log entries,
diff display) the culture-dependent output is inconsistent and a latent bug.
Recommendation
Decide whether ValueFormatter is a UI-only helper. If it can be used outside the UI,
format with CultureInfo.InvariantCulture (using the IFormattable.ToString(null, IFormatProvider)
overload). Either way, document the culture behavior on the method.
Resolution
Resolved 2026-05-16 (commit pending) — confirmed ValueFormatter is not UI-only: its
sole consumer is StreamRelayActor, which formats attribute values into outbound gRPC
SiteStreamEvent messages (a wire context), so culture-dependent output was a latent
bug. FormatDisplayValue now formats IFormattable scalars and collection elements with
IFormattable.ToString(null, CultureInfo.InvariantCulture); a new FormatElement helper
applies the same invariant rule per collection element (previously elements went through
the parameterless ToString()). The XML doc gained a remarks block stating the
culture-invariant contract and why. Regression tests added in ValueFormatterTests
(FormatDisplayValue_Double_UsesInvariantCulture_*, _DateTime_*, _CollectionOfDoubles_*,
each pinned under de-DE).
Commons-013 — DynamicJsonElement.TryGetIndex rejects non-int index values
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Types/DynamicJsonElement.cs:40-54 |
Description
TryGetIndex accepts an index only when indexes[0] is int index. DynamicJsonElement
is designed for dynamic access from scripts (obj.items[0]). In a dynamic expression the
index operand's runtime type follows the script's variable type — a script that computes
an index in a loop counter or reads it from another DynamicJsonElement (whose numbers
are unwrapped as long by Wrap, see :105) will pass a long, not an int. The
pattern match then fails, TryGetIndex returns false, and the dynamic binder throws a
RuntimeBinderException for what is a perfectly valid in-range index. Because the wrapper
itself surfaces JSON numbers as long, obj.items[obj.count - 1] — count being a wrapped
JSON number — is the exact failing case. The int-only guard also silently rejects
byte/short indices that would widen to a valid array position.
Recommendation
Accept any integral index by converting through Convert.ToInt64 (guarded for
OverflowException) or by matching int, long, short, byte and normalizing to a
single integer before the bounds check. Add a regression test indexing with a long.
Resolution
Resolved 2026-05-17 — confirmed the int-only guard, and during fixing found the
underlying cause was wider than the finding text: Wrap's TryGetInt64 ? long : double
ternary unified both branches to double, so integral JSON numbers were never actually
boxed as long — obj.count always surfaced as double. Fixed at the root: Wrap
delegates to a new WrapNumber helper that boxes integral numbers as long and
non-integral as double with separate typed returns; TryGetIndex now accepts any
integral index via TryGetIntegralIndex (int/long/short/byte/sbyte/ushort/
uint/ulong, plus whole-valued double/decimal) normalized to long before the
bounds check. Internal-only change — no public-API or message-contract change. Regression
tests added in DynamicJsonElementTests (IndexAccess_WithLongIndex_Works,
IndexAccess_WithIndexDerivedFromWrappedJsonNumber_Works,
IndexAccess_WithWideningIntegralIndex_Works, IndexAccess_WithLongIndexOutOfRange_Throws).
Commons-014 — OpcUaEndpointConfigSerializer.Deserialize can mislabel a corrupt typed row as Legacy
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Serialization/OpcUaEndpointConfigSerializer.cs:107-131 |
Description
Deserialize tries the typed path first: it parses the document, checks for an
endpointUrl property, then calls JsonSerializer.Deserialize<OpcUaEndpointConfig>.
The whole block is wrapped in catch (JsonException) { /* fall through to legacy */ }.
If a row is the current typed shape (it has endpointUrl) but is corrupt in a way that
makes JsonSerializer.Deserialize throw a JsonException — e.g. an enum-valued field
holding an unrecognised string, or a numeric field holding a non-numeric token — the
exception is swallowed and control falls through to LoadLegacy. LoadLegacy only
requires the root to be a JSON object, so it will usually succeed against the same input
and the result is reported as OpcUaConfigParseStatus.Legacy. The Commons-005 fix added
the Malformed status precisely so a caller can tell a recoverable legacy row from
unparseable data; this path re-introduces a softer version of the same confusion — a
genuinely broken current-shape row is presented to the user as a benign "please re-save"
legacy row, and the offending field is silently dropped by FromFlatDict (which ignores
keys it cannot parse) rather than surfaced. The XML doc describes the legacy fallback as
being for "pre-refactor rows" only and does not mention this branch.
Recommendation
Only fall through to LoadLegacy when the typed shape is genuinely not present — i.e.
the endpointUrl property is absent. When endpointUrl is present but typed
deserialization throws, classify the outcome as Malformed (or a distinct status) so the
caller can surface a real error instead of an empty/partial config. Tighten the XML doc
to describe this branch, and add a regression test for a typed row with an invalid enum
field.
Resolution
Resolved 2026-05-17 — confirmed the swallow-and-fall-through against the source. Split
Deserialize into a shape-detection phase and a materialization phase: it first parses
the document to decide whether the row is the current typed shape (root object carrying
endpointUrl); only a non-typed-shape row falls through to LoadLegacy. A row that is
the typed shape but fails JsonSerializer.Deserialize (invalid enum, wrong-typed field)
is now classified Malformed instead of being mislabelled Legacy with the offending
field silently dropped. The OpcUaConfigParseResult shape and (Config, IsLegacy)
deconstruction are unchanged, so existing callers are unaffected. XML docs updated to
describe the corrupt-typed-row branch. Regression tests added in
OpcUaEndpointConfigSerializerTests (Deserialize_TypedShapeWithInvalidEnum_ReportsMalformedNotLegacy,
Deserialize_TypedShapeWithWrongTypeField_ReportsMalformedNotLegacy,
Deserialize_ValidTypedRow_StillReportsTyped).
Commons-015 — EncryptionMetadata accepts any algorithm string and any iteration count
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Types/Transport/EncryptionMetadata.cs:3-8 |
Resolution (2026-05-28): Converted EncryptionMetadata to a non-positional record with an explicit constructor that enforces invariants at the type boundary: Algorithm must equal "AES-256-GCM", Kdf must equal "PBKDF2-SHA256", Iterations must lie in [MinPbkdf2Iterations=100_000, MaxPbkdf2Iterations=10_000_000], and SaltB64/IvB64 must be non-null (empty permitted for the BundleSerializer.Pack seed pattern). Invalid values throw ArgumentException (or ArgumentNullException) naming the offending field. Added EncryptionMetadataTests covering valid construction, unknown algorithm/KDF (including case sensitivity), out-of-range iteration counts (including both boundaries), and null salt/IV. Updated BundleSecretEncryptorTests / BundleSerializerTests to use MinPbkdf2Iterations instead of the prior 10_000 placeholder.
Description
EncryptionMetadata is a positional record that carries the bundle's encryption parameters
over the wire and into the persistence/audit layer:
public sealed record EncryptionMetadata(
string Algorithm, // "AES-256-GCM"
string Kdf, // "PBKDF2-SHA256"
int Iterations,
string SaltB64,
string IvB64);
The expected values are documented as inline comments only — there is no validation, no enum, and no constructor invariant. The consequences:
- A bundle manifest that says
Algorithm = "AES-128-CBC"(or any garbage string) will deserialize successfully. The mismatch surfaces only whenBundleImportertries to decrypt, where it most likely manifests as a misleading exception (or a silent wrong-key result, depending on the implementation). Iterationsis unconstrained —0, negative, or absurdly large values round-trip. A zero/negative iteration count weakens the KDF and a billion-iteration count is a DoS vector against a passphrase-unlock attempt.SaltB64/IvB64are juststring— there is no length, format, or non-null check. A null or empty salt/IV silently rides through serialization and surfaces inside the cipher init.
EncryptionMetadata is the integrity contract for the bundle's encryption envelope and
crosses both the file boundary (the on-disk bundle manifest) and the central audit log.
The defense-in-depth principle says malformed values should be rejected at the type
boundary, not at the cipher.
Recommendation
Validate in a static factory or constructor: reject unsupported Algorithm/Kdf (an
enum or a small whitelist of strings), require Iterations >= 100_000 (or whatever the
documented PBKDF2 minimum is) and <= 10_000_000, require non-blank SaltB64/IvB64,
and Base64-decode them at construction so a malformed encoding fails fast. Document the
accepted values on the record.
Commons-016 — BundleSession.Locked uses a magic 3 rather than a named constant
| Severity | Low |
| Category | Code organization & conventions |
| Status | Open |
| Location | src/ScadaLink.Commons/Types/Transport/BundleSession.cs:13-16 |
Description
BundleSession exposes:
public int FailedUnlockAttempts { get; set; }
public bool Locked => FailedUnlockAttempts >= 3;
The 3 is a magic number with no constant, no XML doc reference, and no symbol to
search for if a future operator wants to change the threshold (or write a test that
deliberately exercises the lockout). The XML comment on Locked repeats the literal
("three or more unlock attempts have failed") rather than citing a constant, so a
change to the threshold would have to be made in three places (the comparison, the XML
text, and any caller-side attempts < 3 checks). The lockout count is also a
security-relevant policy parameter — it deserves a named symbol so a security review
can find it.
Recommendation
Promote the threshold to a public const int MaxUnlockAttempts = 3; on BundleSession
(or to the IBundleSessionStore/BundleImporter if that is the better home), and rewrite
the Locked expression and the XML comment in terms of it. If the threshold is actually
owned by a Transport-component option, document the link.
Commons-017 — Component-Commons.md is significantly stale (audit enums, new entities, new repositories, new service interfaces, new folders)
| Severity | Medium |
| Category | Design-document adherence |
| Status | Resolved |
| Location | docs/requirements/Component-Commons.md:41-44, :75-79, :88-95, :107-117, :152-232 |
Resolution (2026-05-28): Refreshed Component-Commons.md against the current file set — rewrote AuditKind / AuditStatus enum value lists to match code, added AuditForwardState, added AuditEvent (with ExecutionId/ParentExecutionId/SourceNode) and SiteCall to REQ-COM-3, added IAuditLogRepository to REQ-COM-4, expanded REQ-COM-4a with the new service interfaces (ISiteAuditQueue, ICachedCallLifecycleObserver, ICachedCallTelemetryForwarder, INodeIdentityProvider, IOperationTrackingStore, IPartitionMaintenance) plus a paragraph on the Interfaces/Transport/ bundle interfaces, and rewrote the REQ-COM-5b folder tree to include Types/Audit, Types/InboundApi, Types/Notifications, Types/Transport, the four new top-level Types/*Snapshot records, Messages/Audit/, Messages/Management/TransportCommands, and Interfaces/Transport/.
Description
The Commons design doc has fallen materially behind the code:
- REQ-COM-1 audit enums — the doc's
AuditKindenum listsSyncCall, CachedEnqueued, CachedAttempt, CachedTerminal, SyncWrite, SyncRead, Enqueued, Attempt, Terminal, Completed; the actual enum inTypes/Enums/AuditKind.cshas completely different values:ApiCall, ApiCallCached, DbWrite, DbWriteCached, NotifySend, NotifyDeliver, InboundRequest, InboundAuthFailure, CachedSubmit, CachedResolve. LikewiseAuditStatus— doc saysSuccess, TransientFailure, PermanentFailure, Enqueued, Retrying, Delivered, Parked, Discarded; actual values areSubmitted, Forwarded, Attempted, Delivered, Failed, Parked, Discarded, Skipped. The doc's enum names cannot be matched to the code at all. - REQ-COM-3 entities — the Audit bullet still lists only
AuditLogEntry; the actualEntities/Audit/folder now containsAuditEventandSiteCallas well, and both carry significant additional columns (SourceNode,ExecutionId,ParentExecutionId) that are core to the M3-M7 work and entirely absent from the doc. - REQ-COM-4 repositories —
IAuditLogRepositoryis in the code (with itsInsertIfNotExistsAsync,QueryAsync,SwitchOutPartitionAsync,GetPartitionBoundariesOlderThanAsync,GetKpiSnapshotAsync,GetExecutionTreeAsync,GetDistinctSourceNodesAsyncsurface) but missing from the REQ-COM-4 list. - REQ-COM-4a services — the doc lists seven service interfaces. The code adds
ICachedCallLifecycleObserver,ICachedCallTelemetryForwarder,INodeIdentityProvider,ISiteAuditQueue, plus the misplacedIOperationTrackingStoreandIPartitionMaintenance(see Commons-018), and theInterfaces/Transport/folder with four more interfaces (IBundleExporter,IBundleImporter,IBundleSessionStore,IAuditCorrelationContext) — none of which appear in REQ-COM-4a. - REQ-COM-5b folder tree — missing:
Types/Audit/(AuditLogPaging,AuditLogQueryFilter,AuditQueryParamParsers,ExecutionTreeNode,SiteCallKpiSnapshot,SiteCallPaging,SiteCallQueryFilter,SiteCallSiteKpiSnapshot),Types/Notifications/(NotificationKpiSnapshot,NotificationOutboxFilter,SiteNotificationKpiSnapshot),Types/InboundApi/(ApiKeyHasher,ParameterDefinition),Types/Transport/(nine records),Messages/Audit/(seven new message files),Interfaces/Transport/(four interfaces), plus the newAuditLogKpiSnapshot,SiteAuditBacklogSnapshot,SiteCallOperational,TrackingStatusSnapshotdirectly underTypes/.
CLAUDE.md's editing rules state design docs and code must travel together. The doc is now much less useful as a map of the actual file set than after the previous (Commons-009) refresh.
Recommendation
Refresh Component-Commons.md against the current file set: rewrite the AuditKind /
AuditStatus enum value lists to match the code, add AuditEvent and SiteCall to
REQ-COM-3, add IAuditLogRepository to REQ-COM-4, expand REQ-COM-4a with the new service
interfaces (and add a sentence on the Transport interfaces in Interfaces/Transport/),
and rewrite the REQ-COM-5b folder tree to include the new Types/*, Messages/Audit,
and Interfaces/Transport folders. The same kind of refresh that resolved Commons-009 is
needed again now.
Commons-018 — IOperationTrackingStore and IPartitionMaintenance are at the root of Interfaces/ instead of Interfaces/Services/
| Severity | Low |
| Category | Code organization & conventions |
| Status | Open |
| Location | src/ScadaLink.Commons/Interfaces/IOperationTrackingStore.cs, src/ScadaLink.Commons/Interfaces/IPartitionMaintenance.cs |
Description
REQ-COM-5b documents the Interfaces/ folder as having exactly three sub-folders:
Protocol/ (REQ-COM-2), Repositories/ (REQ-COM-4), and Services/ (REQ-COM-4a). Two
new interfaces — IOperationTrackingStore and IPartitionMaintenance — are filed at
the root of Interfaces/ (namespace ScadaLink.Commons.Interfaces) rather than under
Interfaces/Services/ (namespace ScadaLink.Commons.Interfaces.Services). They are
straightforward cross-cutting service interfaces consumed by the Audit Log component (a
site-local SQLite tracking store; a central partition-maintenance hosted-service helper)
and conceptually belong alongside ISiteAuditQueue, ICachedCallLifecycleObserver, etc.
The inconsistency is small but it breaks the "every interface lives under a sub-folder"
rule REQ-COM-5b establishes, and it makes the namespace surface inconsistent — every
other recently-added service interface uses Interfaces.Services.
Recommendation
Move both files into Interfaces/Services/ and adjust the namespace to
ScadaLink.Commons.Interfaces.Services. Update consumers in ScadaLink.AuditLog,
ScadaLink.SiteRuntime, and ScadaLink.ConfigurationDatabase. Add them to the
REQ-COM-4a list (see Commons-017).
Commons-019 — New *Utc-suffixed DateTime columns on AuditEvent / SiteCall are not enforced as UTC; inconsistent with Notification's DateTimeOffset
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Entities/Audit/AuditEvent.cs:15-18, src/ScadaLink.Commons/Entities/Audit/SiteCall.cs:59-68, tests/ScadaLink.Commons.Tests/Entities/EntityConventionTests.cs:49-69 |
Resolution (2026-05-28): Kept the DateTime type on AuditEvent (a DateTimeOffset migration is a data-shape change beyond this finding's scope) and instead enforced the UTC invariant at the assignment boundary. AuditEvent.OccurredAtUtc / IngestedAtUtc now have init-setters that call DateTime.SpecifyKind(value, DateTimeKind.Utc) via private backing fields, so any value supplied with Kind=Unspecified (DateTime literal, JSON deserialise, EF hydrate that bypassed the converter) is re-tagged as UTC on assignment. The record-level XML doc gained a remarks block stating the invariant and contrasting with Notification's DateTimeOffset shape. Sibling ConfigurationDatabase-018 adds the matching EF value converter so the read path also enforces Kind=Utc; the two changes travelled together. Regression coverage in tests/ScadaLink.ConfigurationDatabase.Tests/Configurations/AuditLogEntityTypeConfigurationTests.cs::Configure_UtcConverter_HydratesOccurredAtUtcAsKindUtc. The SiteCall and EntityConventionTests sub-points named in the location list are out of scope for this close (they fall under sibling code-review tasks).
Description
CLAUDE.md mandates UTC throughout the system, "DateTime with DateTimeKind.Utc or
DateTimeOffset". The pre-existing convention in Commons entities is DateTimeOffset,
and the architectural test AllTimestampProperties_ShouldBeDateTimeOffset enforces it
on a name-allowlist (Timestamp, DeployedAt, CompletedAt, GeneratedAt,
ReportTimestamp, SnapshotTimestamp). The new audit entities deviate:
AuditEvent.OccurredAtUtcandIngestedAtUtc—DateTime(nullable on the second).SiteCall.CreatedAtUtc,UpdatedAtUtc,TerminalAtUtc,IngestedAtUtc—DateTime.
The Notification entity in the same domain uses DateTimeOffset for every timestamp
(SiteEnqueuedAt, CreatedAt, LastAttemptAt, NextAttemptAt, DeliveredAt). The
architectural test does not catch the *Utc columns because those property names are not
on the allowlist. Concretely:
- Nothing prevents a producer from assigning
DateTime.Now(kind =Local) ornew DateTime(2026,1,1)(kind =Unspecified) to anOccurredAtUtccolumn. The value will round-trip throughSystem.Text.Jsonlosing theKind(it defaults toUnspecifiedon read). TheUtcsuffix is convention-only. - Comparison across the boundary is now ambiguous — the central
AuditLog.OccurredAtUtcand the centralNotifications.CreatedAtare different CLR types, withDateTimeOffsetcarrying an explicit offset andDateTimenot. - The repository query filters (
AuditLogQueryFilter.FromUtc/ToUtc,SiteCallQueryFilter.FromUtc/ToUtc) also use bareDateTime. A caller building one fromDateTime.UtcNow.AddHours(-1)is fine; a caller usingDateTimeOffset.UtcNow.DateTimeis fine; a caller usingDateTime.Nowis silently wrong.
This is the same defect the architectural test was designed to catch on the
DateTimeOffset side — the test just doesn't cover the new column-naming convention.
Recommendation
Pick a single rule:
- Convert the audit entities to
DateTimeOffsetto match every other Commons entity and the architectural-test allowlist (largest blast radius — touches gRPC proto types, EF mappings, SQL schemas, query filters). - Keep
DateTimefor audit but extendEntityConventionTeststo recognise the*Utcproperty-name pattern and assert (a) it isDateTime(notDateTimeOffset) and (b) any constant-default hasDateTimeKind.Utc. Add a runtime assertion at the write boundary (SqliteAuditWriter.WriteAsync, the central upsert) that the incomingKind == DateTimeKind.Utcand reject otherwise.
Option 2 is the smaller change and is consistent with how AuditLog rows are stored in
SQL Server (datetime2, no offset). Either way the inconsistency with Notification
should be documented in REQ-COM-1 as a deliberate choice.
Commons-020 — Transport types and new Audit-message types have no unit tests in ScadaLink.Commons.Tests
| Severity | Low |
| Category | Testing coverage |
| Status | Open |
| Location | tests/ScadaLink.Commons.Tests/ |
Description
The Transport (#24) work adds nine records under Types/Transport/ (BundleManifest,
EncryptionMetadata, BundleSession, BundleSummary, ExportSelection,
ImportPreview + ImportPreviewItem, ImportResolution, ImportResult,
ManifestContentEntry) and four interfaces under Interfaces/Transport/. None of them
have a focused test file in tests/ScadaLink.Commons.Tests/ — coverage is entirely
inside tests/ScadaLink.Transport.IntegrationTests/, which exercises the
end-to-end exporter/importer flow but does not pin the Commons-level wire contracts.
Similarly, the new Messages/Audit/ folder (IngestAuditEventsCommand/Reply,
IngestCachedTelemetryCommand/Reply, UpsertSiteCallCommand/Reply,
SiteCallRelayMessages) and the Messages/Integration/ additions
(AuditTelemetryEnvelope, PullAuditEventsRequest/Response) have no
serialization-shape tests in Commons. The existing MessageConventionTests,
CompatibilityTests, ConnectionBindingSerializationTests, and
SiteCallQueriesTests cover some but not all of the new traffic — PullAuditEvents
and AuditTelemetryEnvelope in particular cross the site→central version-skew
boundary that REQ-COM-5a is designed to enforce, so a JSON round-trip + named-property
assertion is the minimum protection against a future positional/tuple slip.
This is the same pattern as Commons-010 — behavior-bearing types with no Commons-level test coverage, where the integration suite cannot catch a Commons-only contract regression.
Recommendation
Add focused tests in tests/ScadaLink.Commons.Tests/Types/Transport/ (round-trip
serialization for each Transport record, named JSON property assertions for
EncryptionMetadata / BundleManifest, the BundleSession.Locked threshold —
see Commons-016, the ConflictKind/ResolutionAction enum coverage), and in
tests/ScadaLink.Commons.Tests/Messages/Audit/ (round-trip + named-property assertions
for the seven new message files). Prioritise the contracts that cross the site→central
boundary (AuditTelemetryEnvelope, PullAuditEventsRequest/Response,
IngestCachedTelemetryCommand).
Commons-021 — ExternalCallResult.Response has a benign lazy-parse race
| Severity | Low |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Interfaces/Services/IExternalSystemClient.cs:91-104 |
Resolution (2026-05-28): Replaced the two mutable backing fields (_response/_responseParsed) with a single private readonly Lazy<dynamic?> _response initialised in the field initializer — LazyThreadSafetyMode.ExecutionAndPublication (the default) guarantees the parse runs at most once and every concurrent reader observes the same published DynamicJsonElement. Response is now a one-line _response.Value expression-bodied property. Regression test ExternalCallResultTests.Response_ConcurrentReads_ReturnSameInstance fires 64 concurrent readers through a Barrier and asserts Assert.Same across all observed values.
Description
ExternalCallResult is a record returned to scripts after an outbound HTTP call. The
Response property lazily parses ResponseJson into a DynamicJsonElement:
public dynamic? Response
{
get
{
if (!_responseParsed)
{
_response = string.IsNullOrEmpty(ResponseJson)
? null
: new DynamicJsonElement(System.Text.Json.JsonDocument.Parse(ResponseJson).RootElement);
_responseParsed = true;
}
return _response;
}
}
_response and _responseParsed are plain mutable fields on a record that the
language otherwise treats as immutable. Two threads reading Response simultaneously
can both see _responseParsed == false, both call JsonDocument.Parse, and produce
two distinct DynamicJsonElement wrappers — the second write wins, and any reference
the loser thread already held becomes inconsistent with the winner. The race is benign
in the current usage (scripts get the result on one thread and use it on that thread),
and DynamicJsonElement after Commons-002 clones the underlying JsonElement, so the
duplicate parses do not even leak document handles. But the pattern is fragile — a
future caller that hands the result to a background continuation or Task.WhenAll would
introduce a real correctness gap, and the laziness is implicit in record semantics
that otherwise suggest immutability.
Recommendation
Use Lazy<dynamic?> initialised in the property (with LazyThreadSafetyMode.ExecutionAndPublication,
the default) and drop the mutable backing fields, or replace the property with a method
named ParseResponse() so the laziness is explicit and the caller knows to call it once
and cache. Either way, the change is local and preserves the existing record-equality
behavior.
Commons-022 — IAuditCorrelationContext references an unresolvable BundleImporter.ApplyAsync cref; JSON-blob columns have no documented shape
| Severity | Low |
| Category | Documentation & comments |
| Status | Resolved |
| Location | src/ScadaLink.Commons/Interfaces/Transport/IAuditCorrelationContext.cs:11, src/ScadaLink.Commons/Types/Transport/ImportPreview.cs:11, src/ScadaLink.Commons/Entities/Notifications/Notification.cs:33 |
Resolution (2026-05-28): Replaced the unresolvable <see cref="BundleImporter.ApplyAsync"/> in IAuditCorrelationContext with a plain-text BundleImporter.ApplyAsync reference (qualified inline as "in the Transport component") so the XML doc no longer emits a CS1574 warning from Commons, which cannot see the implementation type. The ImportPreviewItem.FieldDiffJson / Notification.ResolvedTargets JSON-shape sub-point is tracked separately and not in scope for this close — the XML doc on IAuditCorrelationContext does not name those columns.
Description
Two related XML-doc weaknesses, both around the new Transport / Audit surface:
-
IAuditCorrelationContext's remarks say<see cref="BundleImporter.ApplyAsync"/>.BundleImporteris the concrete implementation inScadaLink.Transport.Import, which Commons does not (and must not) reference. The cref is unresolvable from Commons and will surface as a build-time XML doc warning. The correct reference is the interface methodIBundleImporter.ApplyAsync. -
Two JSON-string columns flow across components without a documented shape:
ImportPreviewItem.FieldDiffJson— described only as "string?" with no remarks on who produces it, who reads it, or what shape it carries. The Central UI renders it, so a drift between producer and renderer is a silent UI regression.Notification.ResolvedTargets— described as "Resolved delivery targets snapshotted at delivery time, for audit" but the shape (newline-separated emails? a JSON array? comma-separated?) is undocumented. Audit consumers and the Central UI both read this field.
Both are wire/persistence-format strings; an undocumented schema invites the same kind of producer/consumer drift the
ValueTuplefinding in Commons-008 surfaced for the typed messages.
Recommendation
- Fix the
<see cref>inIAuditCorrelationContextto point atIBundleImporter.ApplyAsync. - Add a remarks block to
ImportPreviewItem.FieldDiffJsondescribing the JSON shape (e.g. "a JSON object keyed by field name with{ existing, incoming }values") or, if the shape is meant to be opaque to the wire, document that explicitly. - Add a remarks block to
Notification.ResolvedTargetsdocumenting the format. - Consider replacing both with strong-typed Commons records —
ResolvedTargetscould beIReadOnlyList<string>serialised via EF value converter, andFieldDiffJsoncould be aFieldDiffrecord. That is a larger change and is left as a follow-up.
Commons-023 — Trailing-optional SourceNode on positional records mixes additive evolution patterns
| Severity | Low |
| Category | Akka.NET conventions |
| Status | Open |
| Location | src/ScadaLink.Commons/Messages/Audit/SiteCallQueries.cs:53-66, :110-123, src/ScadaLink.Commons/Messages/Notification/NotificationOutboxQueries.cs:26-39, :104-123, src/ScadaLink.Commons/Types/SiteCallOperational.cs:42-54, src/ScadaLink.Commons/Types/TrackingStatusSnapshot.cs:33-46 |
Description
The SourceNode rollout adds an optional trailing parameter to a long list of positional
records. Two minor patterns emerge that are worth flagging:
SiteCallSummary(twelve required positional members plus an optional 13thSourceNode = null) — and the parallelNotificationSummary(ten required + optionalSourceNode = null) — both push the optional past abool IsStuckflag. A consumer reading the positional signature is now mixing required and optional members. The record otherwise works correctly because every consumer constructs it via named arguments, but a positional constructor call (which the language allows) would silently miss the new field.TrackingStatusSnapshothas been made non-optionalSourceNode(string? SourceNodewithout= null), inconsistent withSiteCallOperational'sstring? SourceNode(also without default — butSiteCallOperationalis purely positional). The mix of "optional with default" and "optional without default" across the same domain is fine technically but is the kind of inconsistency that bites a future additive field.
Neither pattern is a defect today — every consumer is updated, and JSON serialization
treats nullable-without-default the same as nullable-with-default. But the conventions
across the Audit / Notifications message surface have drifted enough that REQ-COM-5a's
"additive-only" rule deserves a one-paragraph clarification: do new optional fields take
a = null default, or not? The current code is mixed.
Recommendation
Add a one-paragraph "How to add a field" sub-section to REQ-COM-5a stating: new optional
fields on positional records MUST be added at the end of the parameter list AND MUST
carry a = null (or other safe) default value, so existing positional construction
sites keep compiling. Apply that rule retroactively to TrackingStatusSnapshot and any
other recent record that did not adopt it. No behavioral change required.