Files
lmxopcua/code-reviews/ControlPlane/findings.md
T
Joseph Doherty 1aa7905676 review(ControlPlane): fix premature deploy-seal from unexpected-node ack
Review at HEAD 7286d320. ControlPlane-001 (Medium): ConfigPublishCoordinator.HandleAck
now discards acks from nodes not in _expectedAcks (prevented premature SealDeployment) +
regression test. -002 (flipped-node log count), -003 (redundant mapper arms) tidied.
2026-06-19 10:52:22 -04:00

5.1 KiB

Code Review — ControlPlane

Field Value
Module src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane
Reviewer Claude Code
Review date 2026-06-19
Commit reviewed 7286d320
Status Reviewed
Open findings 0

Checklist coverage

A comprehensive review completes every category, recording "No issues found" where a category produced nothing rather than leaving it blank.

# Category Result
1 Correctness & logic bugs ControlPlane-001 (Resolved)
2 OtOpcUa conventions No issues found
3 Concurrency & thread safety No issues found
4 Error handling & resilience No issues found
5 Security No issues found
6 Performance & resource management No issues found
7 Design-document adherence No issues found
8 Code organization & conventions ControlPlane-002 (Resolved), ControlPlane-003 (Resolved)
9 Testing coverage No issues found
10 Documentation & comments No issues found

Findings

ControlPlane-001

Field Value
Severity Medium
Category Correctness & logic bugs
Location src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Coordinators/ConfigPublishCoordinator.cs:154
Status Resolved

Description: HandleAck accepts ApplyAck messages from any node, including nodes that were not in _expectedAcks at dispatch time. When an unexpected node's ack is added to _acks, _acks.Count can reach _expectedAcks.Count before every expected node has actually responded. The count-based gate on line 166 (if (_acks.Count < _expectedAcks.Count) return;) then fires prematurely, causing SealDeployment() to run while one or more expected nodes have not yet applied the deployment. In production, unexpected acks are unlikely (only DriverHostActor publishes to the deployment-acks topic) but the guard is missing.

Recommendation: Add a membership check at the start of the processing path in HandleAck to discard any ack whose NodeId is not in _expectedAcks, logging it at Debug level for diagnostics.

Resolution: Fixed 2026-06-19 (commit SHA pending). Added if (!_expectedAcks.Contains(msg.NodeId)) guard at the top of the processing path in HandleAck, discarding unexpected-node acks before they are counted. Regression test ApplyAck_from_unexpected_node_is_discarded_and_does_not_seal_prematurely added to ConfigPublishCoordinatorTests (proven failing before fix, green after).


ControlPlane-002

Field Value
Severity Low
Category Code organization & conventions
Location src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Fleet/FleetStatusBroadcaster.cs:139
Status Resolved

Description: OnTick uses a bool changed flag but passes it to the debug-log message via changed ? 1 : 0. The {Count} placeholder in the message implies a numeric count of flipped nodes, but the expression always yields 1 (never the true count) whenever any node flips. Operators reading the log will see "flipped 1 nodes to Degraded" even when multiple nodes flipped simultaneously during a cluster partition recovery.

Recommendation: Replace the bool changed flag with an int flippedCount counter incremented inside the loop, and log flippedCount directly.

Resolution: Fixed 2026-06-19 (commit SHA pending). Replaced bool changed with int flippedCount, incremented per flipped node, and logged directly. All 60 tests pass.


ControlPlane-003

Field Value
Severity Low
Category Code organization & conventions
Location src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditOutcomeMapper.cs:29
Status Resolved

Description: FromAction lists ten explicit AuditOutcome.Success arms that are all shadowed by the _ => AuditOutcome.Success default. The explicit arms add no behavioural differentiation and create an ongoing maintenance obligation: any new config-write verb added to the codebase silently falls through to the correct default anyway, making the explicit list misleading as a whitelist. The comment block documenting the known verbs is a better vehicle for the enumeration than code.

Recommendation: Remove the redundant explicit Success arms and document the known success verbs in a comment on the default arm instead. The two Denied arms must stay.

Resolution: Fixed 2026-06-19 (commit SHA pending). Removed the ten redundant explicit Success arms; replaced with an explanatory comment listing the known success verbs on the default _ => AuditOutcome.Success arm. All existing AuditOutcomeMapper tests still pass (the Outcome_is_derived_from_the_action_vocabulary theory covers every listed verb).