Commit Graph

8 Commits

Author SHA1 Message Date
Joseph Doherty fd618cf1dc fix(review): full code-review remediation — 5 High + Medium/Low across 16 modules
Remediation from the full per-module code review at 4307c381 (findings recorded
separately in code-reviews/).

Highs fixed:
- DeploymentManager-025/SiteRuntime-031: stop broadcasting notification lists + SMTP
  configs (incl. credentials) to sites; site purges already-persisted rows on apply
  (enforces the central-only delivery design; clears plaintext SMTP creds at rest).
- DataConnectionLayer-023: guard the native-alarm subscribe path against the
  mid-flight-unsubscribe adapter-feed leak (mirrors the DCL-021 tag-path fix).
- SiteEventLogging-024: normalize From/To query bounds to UTC (the -016 fix the
  audit trail claimed but never committed).
- KpiHistory-001: add an in-flight guard to the recorder sample tick.
- ScriptAnalysis-001: harden the trust analyzer's TPA-absent fallback (resolve
  forbidden anchors in the minimal reference set; warn on degraded mode) — anchors
  added to validation references only, never the compile gate.
(InboundAPI-026 left to the feat/ipsen-movein effort per owner decision.)

Medium/Low: DM-026 deterministic deploy-status tiebreaker; SR-027/028/029/030
native-alarm leak/phantom-active/delete-during-redeploy fixes; AL-013/014/016;
TE-024 (folder-mutation audit rows now persisted)/025; SF-025 gauge-provider
clear-on-stop; ESG-025/026; SEC-023/024/025; SCA-007/008/009; plus doc/test
accuracy COM-023/024, HOST-025/026, HM-024/025, NS-027/028.

Full-solution build 0 warnings; ~3560 tests across 18 touched suites green.
2026-06-20 17:55:12 -04:00
Joseph Doherty 241a792e7b docs(kpi): K17 — #26 KpiHistory component doc + README/CLAUDE + cross-component interactions + completion-design update 2026-06-17 20:52:12 -04:00
Joseph Doherty 487859bff0 docs+code: close Theme 1 — 24 design-doc / XML-doc drift findings
Doc/XML-comment drift + small adherence fixes across 17 modules. Highlights:
- Host-017: site CoordinatedShutdown ordering — SiteStreamGrpcServer gains
  CancelAllStreams() (refuse new streams, cancel active), wired into
  Program.cs site branch via ApplicationStopping.
- InboundAPI-021: ParentExecutionId now travels on RouteToGet/SetAttributes
  symmetric with RouteToCallRequest; RouteHelper stamps from _parentExecutionId.
- ClusterInfra-012: ClusterOptionsValidator now requires both seed nodes.
- Comm-018: SiteCommunicationActor.HeartbeatMessage.IsActive derived from
  cluster leader check (was hardcoded true).
- DM-020: reconciliation audit row attributes the current user, not prior deployer.
- SEL-019: EventLogPurgeService early-exits on standby via active-node check.
- Plus comment/XML-doc accuracy fixes across AuditLog, ConfigurationDatabase,
  NotificationOutbox, SiteRuntime, SiteCallAudit; doc refreshes for Component-
  Commons / -ManagementService / -CLI / -ExternalSystemGateway / -HealthMonitoring
  / -Transport / -ConfigurationDatabase; CD-023 index-name doc alignment.

11 new regression tests (RouteHelper x4, SiteStreamGrpcServer x2,
ClusterOptionsValidator x1, SiteCommunicationActor x1, DeploymentService x1,
EventLogPurgeService x3). Build clean (0 warnings); InboundAPI/Communication/
Host suites all green. README regenerated: 112 open (was 136).
2026-05-28 06:28:31 -04:00
Joseph Doherty 72388a7616 docs(audit): fix Audit error rate semantics and CLI permission split
Bundle D code-review feedback on 0ae1a25 and e6f7a7f:

- Audit error rate (HealthMonitoring tile) was described as a combined
  view of CentralAuditWriteFailures + AuditRedactionFailure (writer
  health). Per alog.md §10.3 / §14.1 it is the operational error rate
  of audited operations: % of central AuditLog rows with Status not
  in (Success/Delivered/Enqueued) over a rolling 5-min window. Audit
  writer issues surface separately via the dedicated metrics.

- Audit volume description gains the spec-mandated 'events/min, global
  + per-site sparkline' shape.

- CLI: scadalink audit was claiming all three subcommands need both
  OperationalAudit and AuditExport. Per alog.md §11.2 / §15.1, read
  (query, verify-chain) needs OperationalAudit; bulk export
  additionally requires AuditExport. Restored the spec's split.
2026-05-20 08:30:42 -04:00
Joseph Doherty 0ae1a254d7 docs(audit): add Audit Log health metrics and dashboard tiles 2026-05-20 08:26:03 -04:00
Joseph Doherty e681a1f0e1 docs(requirements): add Site Call Audit KPIs to Health Monitoring 2026-05-19 11:58:46 -04:00
Joseph Doherty 6ccf3766dc docs(notification-outbox): add central-computed outbox KPIs to Health Monitoring 2026-05-18 23:17:07 -04:00
Joseph Doherty d91aa83665 refactor(docs): move requirements and test infra docs into docs/ subdirectories
Organize documentation by moving requirements (HighLevelReqs, Component-*,
lmxproxy_protocol) to docs/requirements/ and test infrastructure docs to
docs/test_infra/. Updates all cross-references in README, CLAUDE.md,
infra/README, component docs, and 23 plan files.
2026-03-21 01:11:35 -04:00