Commit Graph

7 Commits

Author SHA1 Message Date
Joseph Doherty
6f5a35f222 feat(auditlog): thread ExecutionId through S&F for retry-loop cached rows
The store-and-forward retry loop emits the per-attempt and terminal cached
audit rows (ApiCallCached/DbWriteCached Attempted, CachedResolve) via
CachedCallLifecycleBridge from a CachedCallAttemptContext, not from the
script context. ExecutionId (and SourceScript) were not threaded through the
S&F buffer, so those rows had ExecutionId = null and SourceScript = null.

Thread both, additively, from the cached-call enqueue path:

- StoreAndForwardMessage gains ExecutionId (Guid?) / SourceScript (string?).
- StoreAndForwardStorage adds nullable execution_id / source_script columns
  via an idempotent PRAGMA-probed ALTER TABLE migration; rows persisted by
  an older build read back null (back-compat).
- StoreAndForwardService.EnqueueAsync gains optional executionId /
  sourceScript params, stamped onto the buffered message and surfaced on the
  CachedCallAttemptContext built in the retry loop.
- CachedCallAttemptContext gains ExecutionId / SourceScript.
- CachedCallLifecycleBridge.BuildPacket sets AuditEvent.ExecutionId and
  AuditEvent.SourceScript from the context (replacing the hard-coded
  SourceScript = null and its now-stale comment).
- IExternalSystemClient.CachedCallAsync / IDatabaseGateway.CachedWriteAsync
  gain optional executionId / sourceScript params; ScriptRuntimeContext's
  CachedCall / CachedWrite helpers pass _executionId / _sourceScript.

Script-side cached rows (CachedSubmit, immediate Attempted+Resolve) are
unchanged. All threading is additive — old buffered S&F rows still
deserialize and process with the new fields null.
2026-05-21 15:18:35 -04:00
Joseph Doherty
42430dd10a feat(siteruntime): ExternalSystem.CachedCall emits CachedSubmit telemetry (#23 M3)
Rework ScriptRuntimeContext.ExternalSystem.CachedCall to fit the M3
combined-telemetry model:

* Mints a fresh TrackedOperationId and emits one CachedSubmit packet
  via ICachedCallTelemetryForwarder BEFORE handing the call off — the
  SiteCalls row is materialised before the first delivery attempt so
  Tracking.Status(id) can observe a Submitted row even if immediate
  delivery resolves before the helper returns.
* Threads the TrackedOperationId into IExternalSystemClient.CachedCallAsync
  as a new optional parameter (and into IDatabaseGateway.CachedWriteAsync
  for the Database mirror set up here for E6). The gateway uses the id
  as the StoreAndForward messageId so the retry loop (Tasks E4/E5) can
  recover it from StoreAndForwardMessage.Id.
* Returns the TrackedOperationId rather than ExternalCallResult — the
  script's contract is now "get a tracking handle, observe outcome via
  Tracking.Status". Best-effort emission: a thrown forwarder is logged
  + swallowed; the original call still runs and the id is still returned.

DatabaseHelper gets the matching siteId / sourceScript / forwarder
fields and a parallel CachedSubmit emitter (Channel=DbOutbound) so Task
E6's Database.CachedWrite mirror plugs in without further runtime
wiring.

New ICachedCallTelemetryForwarder seam in Commons.Interfaces.Services
so SiteRuntime depends on Commons (existing arrow) rather than
ScadaLink.AuditLog (would have introduced a new dependency).

Bundle E task E3 (and helper-shape work for E6).
2026-05-20 14:48:05 -04:00
Joseph Doherty
da8c9f171b fix(external-system-gateway): resolve ExternalSystemGateway-015..017 — treat MaxRetries=0 as unset, scope HTTP connection cap to gateway clients, no bare trailing '?' 2026-05-17 03:18:24 -04:00
Joseph Doherty
a55502254e fix(external-system-gateway): resolve ExternalSystemGateway-011 — name-keyed repository lookups replace fetch-all-then-filter on the call hot path 2026-05-17 00:02:45 -04:00
Joseph Doherty
2502e4d10a fix(external-system-gateway): resolve ExternalSystemGateway-004..010 — honour retry settings, dispose HTTP messages, fix URL building, truncate error bodies, fix connection leak 2026-05-16 21:11:24 -04:00
Joseph Doherty
61253e3269 fix(store-and-forward): resolve S&F delivery + replication wiring (3 Critical findings)
Resolves StoreAndForward-001, ExternalSystemGateway-001, NotificationService-001
— one systemic gap where buffered messages were persisted but never delivered,
and the active node never replicated its buffer to the standby.

Delivery handlers (ExternalSystemGateway-001 / NotificationService-001):
- AkkaHostedService registers delivery handlers for the ExternalSystem,
  CachedDbWrite and Notification categories after StoreAndForwardService starts;
  each resolves its scoped consumer in a fresh DI scope.
- ExternalSystemClient, DatabaseGateway and NotificationDeliveryService each
  gain a DeliverBufferedAsync method: re-resolve the target and re-attempt
  delivery, returning true/false/throwing per the transient-vs-permanent contract.
- EnqueueAsync gains an attemptImmediateDelivery flag; CachedCallAsync and
  NotificationDeliveryService.SendAsync pass false (they already attempted
  delivery themselves) so registering a handler does not dispatch twice.

Replication (StoreAndForward-001):
- ReplicationService is injected into StoreAndForwardService; a new BufferAsync
  helper replicates every enqueue, and successful-retry removes and parks are
  replicated too. Fire-and-forget, no-op when replication is disabled.

Tests: StoreAndForwardReplicationTests (Add/Remove/Park observed),
attemptImmediateDelivery behaviour, and DeliverBufferedAsync paths for each
consumer. Full solution builds; StoreAndForward/ExternalSystemGateway/
NotificationService suites green.
2026-05-16 18:58:11 -04:00
Joseph Doherty
b659978764 Phase 8: Production readiness — failover tests, security hardening, sandboxing, deployment docs
- WP-1-3: Central/site failover + dual-node recovery tests (17 tests)
- WP-4: Performance testing framework for target scale (7 tests)
- WP-5: Security hardening (LDAPS, JWT key length, no secrets in logs) (11 tests)
- WP-6: Script sandboxing adversarial tests (28 tests, all forbidden APIs)
- WP-7: Recovery drill test scaffolds (5 tests)
- WP-8: Observability validation (structured logs, correlation IDs, metrics) (6 tests)
- WP-9: Message contract compatibility (forward/backward compat) (18 tests)
- WP-10: Deployment packaging (installation guide, production checklist, topology)
- WP-11: Operational runbooks (failover, troubleshooting, maintenance)
92 new tests, all passing. Zero warnings.
2026-03-16 22:12:31 -04:00