Commit Graph

9 Commits

Author SHA1 Message Date
Joseph Doherty 11950b0a8e fix(correctness): close Theme 10 — 5 data-integrity / serialisation findings
Final themed batch. 5 well-localised correctness fixes.

Serialisation precision:
- ESG-020: DatabaseGateway.JsonElementToParameterValue probes
  TryGetInt64 → TryGetDecimal → GetDouble, so a script's high-precision
  decimal SQL parameter survives the cached-write retry round-trip
  without silent precision loss. 3 new regression tests.

Template engine correctness:
- TE-018: DiffService gains ComputeConnectionsDiff over
  FlattenedConfiguration.Connections, mirroring the existing entity-diff
  shape and pairing with the Theme 1 TE-017 hash-coverage fix. A
  ConfigurationDiff record extension in Commons is flagged as a follow-up.
- TE-019: TemplateResolver.BuildInheritanceChain now walks via the
  int? ParentTemplateId directly — only null means "no parent". A real
  Id of 0 (the prior special-cased sentinel) now walks the chain like
  any other node, matching the TemplateEngine-013 CycleDetector fix.
  Regression of TE-013 closed.
- TE-020: All 5 Create* paths in TemplateService + SharedScriptService
  re-ordered to save-first → log-with-real-Id → save-audit (matching
  the InstanceService pattern). Create* audit rows no longer carry a
  literal "0" EntityId.

Doc deferral:
- Transport-012: Component-Transport.md §Audit Trail now spells out that
  the BundleImportId repository filter IS wired (in CentralUiRepository),
  but the Audit-Log-Viewer UI dropdown + summary-row hyperlink are a
  deferred CentralUI follow-up. CLI workaround documented
  (audit query --bundle-import-id).

11+ new regression tests (3 ESG, 4 DiffService, 3 TemplateResolver, 4
TemplateService, 1 SharedScriptService). Build clean; ESG 72/72,
TemplateEngine 324/324. README regenerated: 1 pending of 481 total.

Session-to-date: 135 of 136 originally-open Theme findings closed
across 10 themes in 10 commits.
2026-05-28 08:48:44 -04:00
Joseph Doherty ac96b83b08 fix(high-severity): close 9 of 10 open High findings across 8 modules
Comm-016: delete dead HandleConnectionStateChanged + _debugSubscriptions /
_inProgressDeployments tracking + ConnectionStateChanged message record.
Disconnect detection is owned by the transport layers (gRPC keepalive PING
~25s; Ask-timeout at CommunicationService). Updates the
Component-Communication.md design doc to make that explicit.

SnF-018: NotificationForwarder.DeliverAsync now discards a corrupt buffered
payload (Warning log + return true) instead of returning false and parking
the row — honoring the design's "notifications do not park" invariant.

DM-018: reconciliation no longer force-sets Enabled, preserving an
intentional Disabled state after central failover.

ESG-018: DeliverBufferedAsync (both ExternalSystemClient + DatabaseGateway)
catches JsonException and returns false, turning a corrupt buffered row
into a parked operation instead of a retry-forever poison message.

InboundAPI-022: register ActiveNodeGate as IActiveNodeGate in the Central
DI branch so standby-node gating is actually wired up in production.

NS-019: remove orphaned NotificationDeliveryService /
INotificationDeliveryService / NotificationResult; central notification
delivery now lives entirely in NotificationOutbox.

SEL-016: normalise From/To filters to UTC before ISO-string compare so
non-UTC DateTimeOffset clients no longer get spuriously excluded events.

TE-017: include Description on attributes/alarms and a HashableConnections
projection (protocol, endpoint JSON, failover count) in the revision hash
and DiffService; staleness detection now catches description-only and
connection-endpoint edits.

Transport-001 and Transport-002 (also High) remain Open — they're being
handled in a follow-up batch because both touch BundleImporter.cs and
must serialise.
2026-05-28 05:40:15 -04:00
Joseph Doherty 705ae95404 test(auditlog): assert ExecutionId threading hops; defensive Guid parse on S&F read 2026-05-21 15:27:58 -04:00
Joseph Doherty da8c9f171b fix(external-system-gateway): resolve ExternalSystemGateway-015..017 — treat MaxRetries=0 as unset, scope HTTP connection cap to gateway clients, no bare trailing '?' 2026-05-17 03:18:24 -04:00
Joseph Doherty a55502254e fix(external-system-gateway): resolve ExternalSystemGateway-011 — name-keyed repository lookups replace fetch-all-then-filter on the call hot path 2026-05-17 00:02:45 -04:00
Joseph Doherty e57ccd78b7 fix(external-system-gateway): resolve ExternalSystemGateway-012,013,014 — failure logging, connection-limit wiring, test coverage; ExternalSystemGateway-011 flagged 2026-05-16 22:14:23 -04:00
Joseph Doherty 2502e4d10a fix(external-system-gateway): resolve ExternalSystemGateway-004..010 — honour retry settings, dispose HTTP messages, fix URL building, truncate error bodies, fix connection leak 2026-05-16 21:11:24 -04:00
Joseph Doherty 61253e3269 fix(store-and-forward): resolve S&F delivery + replication wiring (3 Critical findings)
Resolves StoreAndForward-001, ExternalSystemGateway-001, NotificationService-001
— one systemic gap where buffered messages were persisted but never delivered,
and the active node never replicated its buffer to the standby.

Delivery handlers (ExternalSystemGateway-001 / NotificationService-001):
- AkkaHostedService registers delivery handlers for the ExternalSystem,
  CachedDbWrite and Notification categories after StoreAndForwardService starts;
  each resolves its scoped consumer in a fresh DI scope.
- ExternalSystemClient, DatabaseGateway and NotificationDeliveryService each
  gain a DeliverBufferedAsync method: re-resolve the target and re-attempt
  delivery, returning true/false/throwing per the transient-vs-permanent contract.
- EnqueueAsync gains an attemptImmediateDelivery flag; CachedCallAsync and
  NotificationDeliveryService.SendAsync pass false (they already attempted
  delivery themselves) so registering a handler does not dispatch twice.

Replication (StoreAndForward-001):
- ReplicationService is injected into StoreAndForwardService; a new BufferAsync
  helper replicates every enqueue, and successful-retry removes and parks are
  replicated too. Fire-and-forget, no-op when replication is disabled.

Tests: StoreAndForwardReplicationTests (Add/Remove/Park observed),
attemptImmediateDelivery behaviour, and DeliverBufferedAsync paths for each
consumer. Full solution builds; StoreAndForward/ExternalSystemGateway/
NotificationService suites green.
2026-05-16 18:58:11 -04:00
Joseph Doherty b659978764 Phase 8: Production readiness — failover tests, security hardening, sandboxing, deployment docs
- WP-1-3: Central/site failover + dual-node recovery tests (17 tests)
- WP-4: Performance testing framework for target scale (7 tests)
- WP-5: Security hardening (LDAPS, JWT key length, no secrets in logs) (11 tests)
- WP-6: Script sandboxing adversarial tests (28 tests, all forbidden APIs)
- WP-7: Recovery drill test scaffolds (5 tests)
- WP-8: Observability validation (structured logs, correlation IDs, metrics) (6 tests)
- WP-9: Message contract compatibility (forward/backward compat) (18 tests)
- WP-10: Deployment packaging (installation guide, production checklist, topology)
- WP-11: Operational runbooks (failover, troubleshooting, maintenance)
92 new tests, all passing. Zero warnings.
2026-03-16 22:12:31 -04:00