Commit Graph

13 Commits

Author SHA1 Message Date
Joseph Doherty ad4744a295 docs(code-review): resolve the 8 re-review findings (fixed in 9ab1c002)
Flip DataConnectionLayer-029, InboundAPI-031, SiteRuntime-032/033,
StoreAndForward-028, AuditLog-017, CentralUI-037, ScriptAnalysis-009 from Open to
Resolved with the fixing commit + a one-line resolution each; regen README
(0 pending / 576 total across 25 modules).
2026-06-24 09:40:47 -04:00
Joseph Doherty c42bb48585 docs(code-review): re-review 17 changed modules at 1f9de8a2 — 8 new findings
Re-reviewed the modules whose source changed since the last review baseline
(full-review remediation fd618cf1 + InboundAPI Database-helper fixes b3c90143),
focused on whether the fixes are sound and regression-free. 9 of 17 modules
clean; 8 new findings (0 Critical, 0 High, 4 Medium, 4 Low), all code-verified
by the orchestrator before recording:

- DataConnectionLayer-029 (Med): DCL-023's unsubscribe-clears-in-flight reopens a
  double-subscribe window that leaks an orphaned alarm feed; the alarm completion
  handler overwrites the subscription id without the tag-path guard at line 908.
- InboundAPI-031 (Med): WaitForAttribute's 5s grace backstop is tighter than the
  CommunicationService Ask's timeout+IntegrationTimeout (30s) round-trip slack, so
  a slow-but-valid timed-out 'false' arriving in the 5-30s window is cancelled into
  an unhandled OperationCanceledException/500 (contradicts spec 6 + its own comment).
- SiteRuntime-032 (Med): SiteRuntime-029's wasPresent guard skips the deployed-count
  decrement when deleting a DISABLED instance (absent from both maps), drifting the
  health-dashboard tally; self-heals on singleton restart (observational, hence Med).
- StoreAndForward-028 (Med): StoreAndForward-025 resets the register-guard but not
  _bufferedCount, so a same-instance Stop->Start re-seeds the depth gauge to ~2N.
- AuditLog-017, CentralUI-037, ScriptAnalysis-009, SiteRuntime-033 (Low): a
  test-coverage gap plus stale doc-comments/spec following the remediation.

Header commit/date bumped to 1f9de8a2 / 2026-06-24 on all 17 modules; README
regenerated (8 pending / 576 total).
2026-06-24 09:20:03 -04:00
Joseph Doherty d39089f4ed docs(code-review): full review at 4307c381 — 18 modules, 67 findings recorded + remediation tracked
Full per-module re-review of the 16 stale modules (last seen 1eb6e97 / 2026-05-28)
plus first-ever reviews of KpiHistory (#26) and ScriptAnalysis (#25), at HEAD 4307c381.

67 new findings (0 Critical, 6 High, 27 Medium, 34 Low). Remediation in commit
fd618cf1 closed 5 of the 6 Highs and ~33 Medium/Low; the rest are Deferred/Won't Fix
with rationale. Remaining pending (4) are all InboundAPI's Database-helper findings
(IA-026 High .. IA-029), left to the active feat/ipsen-movein effort per owner decision.

Highlights: caught a central-only-delivery security drift (SMTP creds broadcast to
sites — DM-025/SR-031), a never-committed 'Resolved' fix (SiteEventLogging-016 → -024),
an unguarded KPI recorder tick (KH-001), a trust-analyzer fallback weakening (SA-001),
and a native-alarm subscribe-path leak (DCL-023). ScriptAnalysis verdict: trust boundary
is semantically sound (symbol-based) in the production cluster config.

README regenerated; regen-readme.py --check passes (4 pending / 567 total).
2026-06-20 18:02:32 -04:00
Joseph Doherty 7b0b9c7365 refactor: rename ScadaLink → ZB.MOM.WW.ScadaBridge (code + projects + namespaces)
Solution + 23 src projects + 26 test projects renamed; folders, csproj,
namespaces, and ScadaLinkDbContext/ScadaBridgeDbContext class updated.
ActorSystem "scadalink" → "scadabridge", Akka seed-node URLs migrated.
SQL roles/logins, LDAP domains, CLI command name, and CLI config dir
(~/.scadalink → ~/.scadabridge) also renamed.

Build green; 5 Host.Tests fail awaiting SQL login rename in next commit.
Pre-existing StaleTagMonitor timing flakes unchanged.

Rename script committed at tools/rename-to-scadabridge.sh.
2026-05-28 09:37:45 -04:00
Joseph Doherty f936f55f51 fix(concurrency): close 8 race / thread-safety findings across CD, DCL, SR
CD-015: rewrite NotificationOutboxRepository.InsertIfNotExistsAsync as raw-SQL
IF NOT EXISTS … INSERT with SqlException 2601/2627 catch, ending the
at-least-once livelock on the site→central notification handoff.

DCL-018/019/020/021/022: add _subscribesInFlight guard so concurrent
same-tag subscribes don't orphan an adapter handle; delete the latent
dead _subscriptionHandles dictionary; stop double-counting
_totalSubscribed when an unresolved tag is promoted via another instance;
release adapter handles on mid-flight unsubscribe; gate the
tag-resolution retry timer with IsTimerActive so subscribe bursts don't
reset it into starvation.

SR-020: add _terminatingActorsByName shadow so a third deploy arriving
during a pending redeploy doesn't crash on InvalidActorNameException —
displaced senders get a Failed/superseded response and the latest
command wins on Terminated.

SR-024: split OperationTrackingStore reads from writes (fresh
SqliteConnection per GetStatusAsync) so long writes don't block status
queries; rewrite Dispose to drop the sync-over-async bridge that could
deadlock on a non-reentrant SyncContext; Interlocked.Exchange makes the
dispose-once flag race-safe across both paths.
2026-05-28 05:20:13 -04:00
Joseph Doherty f93b7b99bb code-review: 2026-05-28 baseline re-review of all 23 modules at 1eb6e97
Re-applies the full 10-category checklist to every src/ project — including
first-time reviews of the four newer components (AuditLog, NotificationOutbox,
SiteCallAudit, Transport) — so the code-reviews/ index reflects today's
codebase rather than the 2026-05-16 baseline. 172 new Open findings (0
Critical, 18 High, 62 Medium, 92 Low); 481 findings total across 23 modules.

regen-readme.py now derives each module's Last reviewed + Commit from its
findings.md header instead of hard-coding 2026-05-16 / 9c60592, so future
single-module re-reviews show their own date in the Module Status table.
2026-05-28 02:55:47 -04:00
Joseph Doherty 14ba5495d1 fix(data-connection-layer): resolve DataConnectionLayer-014..017 — real logger for OPC UA client, initial-connect failover, accurate subscribe response, per-tag write-batch results 2026-05-17 03:18:24 -04:00
Joseph Doherty 89636e2bbf docs(code-reviews): re-review batch 2 at 39d737e — ConfigurationDatabase, DataConnectionLayer, DeploymentManager, ExternalSystemGateway, HealthMonitoring
17 new findings: ConfigurationDatabase-012..014, DataConnectionLayer-014..017, DeploymentManager-015..017, ExternalSystemGateway-015..017, HealthMonitoring-013..016.
2026-05-17 00:45:10 -04:00
Joseph Doherty ff4a4bdeb7 fix(data-connection-layer): resolve DataConnectionLayer-008,013 — O(1) unsubscribe via reverse index, atomic disconnect guard 2026-05-16 22:14:23 -04:00
Joseph Doherty c9b236e507 fix(data-connection): resolve DataConnectionLayer-006..012 — quality-counter reconciliation, per-tag batch reads, configurable failover threshold, dedup retry, stale-callback guard, secure cert default 2026-05-16 21:11:24 -04:00
Joseph Doherty fccd3274d3 fix(data-connection-layer): resolve DataConnectionLayer-002/003/004/005 — Resume supervision, concurrent dicts, subscribe-failure classification, write timeout 2026-05-16 19:40:40 -04:00
Joseph Doherty 239bee3bc4 fix(data-connection): resolve DataConnectionLayer-001 — off-thread actor state mutation
HandleSubscribe spawned a Task.Run that mutated DataConnectionActor private
state (_subscriptionIds, _subscriptionsByInstance, _totalSubscribed,
_resolvedTags, _unresolvedTags) from a thread-pool thread, racing the actor's
own message loop — a data race on non-thread-safe Dictionary/HashSet and
non-atomic counters.

Restructured HandleSubscribe to follow the actor's existing PipeTo(Self)
pattern: the background task now performs only adapter I/O and pipes a
SubscribeCompleted message to Self; all subscription-state mutation happens
in the new HandleSubscribeCompleted handler on the actor thread (wired into
the Connected, Connecting and Reconnecting states).

Adds DCL001_ConcurrentSubscribes_DoNotCorruptSubscriptionCounters (30x30
concurrent subscribes) which fails against the pre-fix code and passes after.
2026-05-16 18:26:43 -04:00
Joseph Doherty 977d7369a7 docs: add code review process and baseline review of all 19 modules
Establishes a per-module code review workflow under code-reviews/ and
records the 2026-05-16 baseline review (commit 9c60592): 241 findings
across all src/ modules (6 Critical, 46 High, 100 Medium, 89 Low).
This is the clean starting point for remediation work.
2026-05-16 18:09:09 -04:00