Configuration-001: wrap the EXEC dbo.sp_ValidateDraft call in sp_PublishGeneration in a BEGIN TRY/CATCH ROLLBACK; THROW block so a validation RAISERROR aborts the publish instead of being ignored. Configuration-008: route caller-supplied strings interpolated into ConfigAuditLog.DetailsJson through STRING_ESCAPE(@x, 'json') and emit sp_RollbackToGeneration's @TargetGenerationId as a bare JSON number, closing the JSON-injection / denial-of-operation vector. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
13 KiB
Code Review — Configuration
| Field | Value |
|---|---|
| Module | src/Core/ZB.MOM.WW.OtOpcUa.Configuration |
| Reviewer | Claude Code |
| Review date | 2026-05-22 |
| Commit reviewed | 76d35d1 |
| Status | Reviewed |
| Open findings | 9 |
Checklist coverage
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Configuration-001, Configuration-002, Configuration-003 |
| 2 | OtOpcUa conventions | Configuration-004 |
| 3 | Concurrency & thread safety | Configuration-005 |
| 4 | Error handling & resilience | Configuration-006, Configuration-007 |
| 5 | Security | Configuration-008, Configuration-009, Configuration-010 |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Configuration-011 |
| 10 | Documentation & comments | No issues found |
Findings
Configuration-001
| Field | Value |
|---|---|
| Severity | High |
| Category | Correctness & logic bugs |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/20260417215224_StoredProcedures.cs:282 |
| Status | Resolved |
Description: sp_PublishGeneration invokes EXEC dbo.sp_ValidateDraft @DraftGenerationId = @DraftGenerationId; and then continues unconditionally to the reservation MERGE and the Status='Published' update. sp_ValidateDraft signals every failure with RAISERROR(..., 16, 1) followed by RETURN. A severity-16 RAISERROR is not a batch-aborting error and SET XACT_ABORT ON does not abort the transaction for it, so control returns to sp_PublishGeneration, which publishes the draft even though validation rejected it (cross-cluster namespace binding, dangling tag FKs, duplicate external identifiers, EquipmentUuid immutability all pass through). Pre-publish validation is effectively bypassed.
Recommendation: Wrap the EXEC dbo.sp_ValidateDraft in BEGIN TRY ... END TRY BEGIN CATCH ROLLBACK; THROW; END CATCH so the validation RAISERROR propagates and aborts the publish, or have sp_ValidateDraft return a result-set/output parameter that sp_PublishGeneration inspects and explicitly rolls back on. Add a regression test that publishes a draft with a known violation and asserts it stays Draft.
Resolution: Resolved 2026-05-22 — wrapped the EXEC dbo.sp_ValidateDraft call in sp_PublishGeneration in a BEGIN TRY ... BEGIN CATCH ROLLBACK; THROW; END CATCH block so a validation RAISERROR rolls back the publish transaction and re-raises instead of being silently ignored; added DB-backed regression test Publish_aborts_when_ValidateDraft_rejects_the_draft.
Configuration-002
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/20260417215224_StoredProcedures.cs:325 |
| Status | Open |
Description: sp_RollbackToGeneration opens its own BEGIN TRANSACTION, clones rows into a new Draft, then EXEC dbo.sp_PublishGeneration, which itself runs BEGIN TRANSACTION (nesting @@TRANCOUNT to 2) and on its failure paths executes a bare ROLLBACK. A bare ROLLBACK rolls back to the outermost transaction and sets @@TRANCOUNT to 0, so when sp_RollbackToGeneration later reaches its own COMMIT it runs with no open transaction and raises error 3902. The rollback clone is silently discarded and the caller sees a confusing secondary error instead of the real publish failure.
Recommendation: Make sp_PublishGeneration transaction-nesting aware: capture @@TRANCOUNT on entry, only BEGIN TRANSACTION when zero (otherwise SAVE TRANSACTION), and only COMMIT/ROLLBACK the level it owns. Alternatively factor the publish body into an inner proc that assumes an ambient transaction.
Resolution: (open)
Configuration-003
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Validation/DraftValidator.cs:73 |
| Status | Open |
Description: ValidatePathLength computes path length with hard-coded constants — it always charges 64 chars for Enterprise+Site (32 + 32 + ...) regardless of the cluster's actual values. This over-rejects: a short Enterprise/Site is penalised by up to 64 unused chars, so a legitimately under-200-char path can fail PathTooLong. The check also silently continues when an equipment's UnsLineId/UnsAreaId does not resolve, so an orphaned-line path is never length-checked.
Recommendation: Pass the actual Enterprise and Site strings into the validator (e.g. on DraftSnapshot, or as parameters alongside ValidateClusterTopology) and compute the real length. If the cluster row cannot be supplied, document the check as a conservative upper bound or emit a lower-severity warning rather than a hard error.
Resolution: (open)
Configuration-004
| Field | Value |
|---|---|
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Enums/NodePermissions.cs:8, src/Core/ZB.MOM.WW.OtOpcUa.Configuration/OtOpcUaConfigDbContext.cs:417 |
| Status | Open |
Description: NodePermissions is declared [Flags] enum ... : uint, while its XML doc and NodeAcl.PermissionFlags' doc both say "stored as int", and ConfigureNodeAcl uses HasConversion<int>() — a uint→int conversion. Only bits 0–11 are used today, but the underlying-type/storage-type mismatch is a latent trap: a future bit-31 flag yields a uint value that overflows int and the conversion round-trip would corrupt it.
Recommendation: Change the enum underlying type to int (consistent with the docs and the conversion). No high bit is in use, so this is the smaller change.
Resolution: (open)
Configuration-005
| Field | Value |
|---|---|
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/LocalCache/LiteDbConfigCache.cs:50 |
| Status | Open |
Description: PutAsync performs a non-atomic find-then-insert/update. Two concurrent PutAsync calls for the same (ClusterId, GenerationId) can both observe existing is null and both Insert, producing two rows for one generation. The constructor's EnsureIndex calls are non-unique, so the storage layer does not prevent the duplicate, and PruneOldGenerationsAsync's keepLatest accounting is then off.
Recommendation: Declare a unique index on (ClusterId, GenerationId) and treat the duplicate-key exception as an idempotent no-op, or guard PutAsync with an instance SemaphoreSlim/lock. Document the concurrency contract on ILocalConfigCache.
Resolution: (open)
Configuration-006
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Error handling & resilience |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/LocalCache/ResilientConfigReader.cs:79 |
| Status | Open |
Description: The fallback catch filters on ex is not OperationCanceledException. A SQL command timeout surfaced by ADO.NET as a TaskCanceledException (derives from OperationCanceledException) is then treated as caller cancellation and propagates instead of falling back to the sealed cache — the opposite of the documented "fallback on any exception including timeout". The retry ShouldHandle predicate has the same shape, so command-timeout cancellations are also not retried consistently.
Recommendation: Distinguish caller cancellation from command-timeout cancellation explicitly: inspect cancellationToken.IsCancellationRequested to decide whether an OperationCanceledException is a genuine cancel (rethrow) or a timeout (fall back). Add unit tests for both a TimeoutRejectedException path and a command-timeout TaskCanceledException path asserting cache fallback occurs.
Resolution: (open)
Configuration-007
| Field | Value |
|---|---|
| Severity | Low |
| Category | Error handling & resilience |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Apply/GenerationApplier.cs:44 |
| Status | Open |
Description: ApplyPass wraps each callback in catch (Exception ex). This swallows OperationCanceledException — a cancellation during a callback is recorded as just another entity error string and the applier keeps walking the remaining passes instead of stopping. It also masks fatal exceptions. The applier continues applying Added/Modified passes even after a Removed callback failed, leaving a partially-applied runtime state.
Recommendation: Rethrow OperationCanceledException rather than recording it as an entity error; call ct.ThrowIfCancellationRequested() between passes. Document or enforce whether a failed Removed pass should abort before the Added/Modified passes run.
Resolution: (open)
Configuration-008
| Field | Value |
|---|---|
| Severity | High |
| Category | Security |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Migrations/20260417215224_StoredProcedures.cs:150, :373, :468 |
| Status | Resolved |
Description: Three stored procedures build ConfigAuditLog.DetailsJson by raw string concatenation of caller-supplied nvarchar parameters: sp_RegisterNodeGenerationApplied (@Status), sp_RollbackToGeneration (@TargetGenerationId), sp_ReleaseExternalIdReservation (@Kind, @Value). A value with a double-quote or backslash produces malformed JSON; combined with the CK_ConfigAuditLog_DetailsJson_IsJson check constraint, the INSERT fails the constraint and aborts the surrounding publish/rollback transaction (denial of operation). It is also a JSON-injection vector that can silently rewrite the audit record's shape.
Recommendation: Build the JSON with a safe constructor (FOR JSON PATH, WITHOUT_ARRAY_WRAPPER or JSON_OBJECT(...) on SQL Server 2022+) so values are properly escaped, or run each interpolated value through STRING_ESCAPE(@Value, 'json'). Add tests with quote/backslash-containing inputs.
Resolution: Resolved 2026-05-22 — routed every caller-supplied string interpolated into DetailsJson through STRING_ESCAPE(@x, 'json') (@Status in sp_RegisterNodeGenerationApplied; @Kind/@Value in sp_ReleaseExternalIdReservation) and emitted sp_RollbackToGeneration's @TargetGenerationId as a bare JSON number via explicit CONVERT(nvarchar(20), CONVERT(bigint, ...)); added DB-backed regression tests RegisterNodeGenerationApplied_escapes_quotes_in_audit_DetailsJson and ReleaseReservation_escapes_quotes_in_audit_DetailsJson that round-trip quote/backslash inputs through ISJSON/JSON_VALUE.
Configuration-009
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Security |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/DesignTimeDbContextFactory.cs:14 |
| Status | Open |
Description: DefaultConnectionString embeds a plaintext sa password with User Id=sa directly in source, checked into the repository. Although used only at design time (dotnet ef), a checked-in sa credential normalises committing DB passwords and, if live for the shared dev SQL Server, grants sa to anyone with repo access. TrustServerCertificate=True plus Encrypt=False additionally disables transport protection for that connection.
Recommendation: Drop the embedded credential. Fall back to integrated auth (Trusted_Connection=True) or fail fast with a message instructing the developer to set OTOPCUA_CONFIG_CONNECTION. Rotate the dev sa password if this value is live.
Resolution: (open)
Configuration-010
| Field | Value |
|---|---|
| Severity | Low |
| Category | Security |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/LocalCache/ResilientConfigReader.cs:81 |
| Status | Open |
Description: On central-DB read failure the warning log records the full exception object. Callers pass arbitrary centralFetch delegates; if any delegate closes over a connection string, an exception thrown from it (or a SqlException carrying server/credential context) is logged verbatim. There is no scrubbing of connection-string fragments before logging, against the project's no-secret-logging rule.
Recommendation: Log ex.GetType().Name and ex.Message for SQL failures rather than the full exception, or run exception messages through a connection-string scrubber before they reach the log sink.
Resolution: (open)
Configuration-011
| Field | Value |
|---|---|
| Severity | Low |
| Category | Testing coverage |
| Location | src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Apply/GenerationApplier.cs:7, src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Validation/DraftValidator.cs:60 |
| Status | Open |
Description: The companion test project covers the cache, schema compliance, stored procedures, and DraftValidator well, but two flagged behaviours are not pinned: (a) GenerationApplier ordering/cancellation when a Removed callback fails — no test asserts the Added/Modified passes still run or that cancellation aborts; (b) ValidatePathLength's constant 32+32 approximation — no test exercises a long Enterprise/Site. The publish-bypasses-validation bug (Configuration-001) is also untested against the live SQL fixture.
Recommendation: Add GenerationApplierTests cases for a throwing callback (assert error recorded, assert cancellation propagates) and a DraftValidatorTests path-length boundary case. Add a StoredProceduresTests case that publishes an invalid draft and asserts it stays Draft.
Resolution: (open)