scadalink-design

Author	SHA1	Message	Date
Joseph Doherty	d35551efc2	feat(auditlog): NotifyDeliver rows carry the originating ParentExecutionId	2026-05-21 18:11:04 -04:00
Joseph Doherty	c00603e2a4	feat(auditlog): thread ParentExecutionId through S&F for retry-loop cached rows The store-and-forward retry loop emits the per-attempt and terminal cached audit rows (ApiCallCached/DbWriteCached Attempted, CachedResolve) via CachedCallLifecycleBridge from a CachedCallAttemptContext, not from the script context. The ExecutionId rollout (Task 4) already threaded ExecutionId and SourceScript through this path; ParentExecutionId — the spawning inbound-API request's ExecutionId — was not, so those retry-loop rows had ParentExecutionId = null even for an inbound-API-routed run. Thread it additively as a sibling at every carry point ExecutionId passes through: - StoreAndForwardMessage gains ParentExecutionId (Guid?). - StoreAndForwardStorage adds a nullable parent_execution_id column via the same idempotent PRAGMA-probed ALTER TABLE migration; rows persisted by an older build read back null (back-compat). The defensive Guid.TryParse read helper (ParseExecutionId) is renamed ParseGuidColumn and reused for both columns so a corrupt value cannot abort the retry sweep. - StoreAndForwardService.EnqueueAsync gains an optional parentExecutionId param, stamped onto the buffered message and surfaced on the CachedCallAttemptContext built in the retry loop. - CachedCallAttemptContext gains ParentExecutionId. - CachedCallLifecycleBridge.BuildPacket sets AuditEvent.ParentExecutionId from the context, beside the existing ExecutionId. - IExternalSystemClient.CachedCallAsync / IDatabaseGateway.CachedWriteAsync gain an optional parentExecutionId param; ScriptRuntimeContext's CachedCall / CachedWrite helpers pass _parentExecutionId. All threading is additive — ParentExecutionId is Guid? everywhere, null for non-routed runs, and old buffered S&F rows still deserialize with the new field null.	2026-05-21 17:58:11 -04:00
Joseph Doherty	150ba5e63f	feat(auditlog): site script-side emitters stamp ParentExecutionId	2026-05-21 17:45:55 -04:00
Joseph Doherty	6af2607a50	feat(siteruntime): thread ParentExecutionId into the routed script's ScriptRuntimeContext	2026-05-21 17:35:49 -04:00
Joseph Doherty	85bb61a1f3	feat(auditlog): NotifyDeliver rows carry the originating ExecutionId	2026-05-21 15:35:40 -04:00
Joseph Doherty	6f5a35f222	feat(auditlog): thread ExecutionId through S&F for retry-loop cached rows The store-and-forward retry loop emits the per-attempt and terminal cached audit rows (ApiCallCached/DbWriteCached Attempted, CachedResolve) via CachedCallLifecycleBridge from a CachedCallAttemptContext, not from the script context. ExecutionId (and SourceScript) were not threaded through the S&F buffer, so those rows had ExecutionId = null and SourceScript = null. Thread both, additively, from the cached-call enqueue path: - StoreAndForwardMessage gains ExecutionId (Guid?) / SourceScript (string?). - StoreAndForwardStorage adds nullable execution_id / source_script columns via an idempotent PRAGMA-probed ALTER TABLE migration; rows persisted by an older build read back null (back-compat). - StoreAndForwardService.EnqueueAsync gains optional executionId / sourceScript params, stamped onto the buffered message and surfaced on the CachedCallAttemptContext built in the retry loop. - CachedCallAttemptContext gains ExecutionId / SourceScript. - CachedCallLifecycleBridge.BuildPacket sets AuditEvent.ExecutionId and AuditEvent.SourceScript from the context (replacing the hard-coded SourceScript = null and its now-stale comment). - IExternalSystemClient.CachedCallAsync / IDatabaseGateway.CachedWriteAsync gain optional executionId / sourceScript params; ScriptRuntimeContext's CachedCall / CachedWrite helpers pass _executionId / _sourceScript. Script-side cached rows (CachedSubmit, immediate Attempted+Resolve) are unchanged. All threading is additive — old buffered S&F rows still deserialize and process with the new fields null.	2026-05-21 15:18:35 -04:00
Joseph Doherty	0149ce6180	feat(auditlog): site script-side emitters stamp ExecutionId Move the per-script-execution Guid on ScriptRuntimeContext from _auditCorrelationId to _executionId, and stamp it into the dedicated AuditEvent.ExecutionId column on every script-side audit row: - Sync ApiCall / DbWrite: ExecutionId set; CorrelationId reverts to null (a sync one-shot call has no operation lifecycle). - Cached-call script-side rows (CachedSubmit, immediate-completion ApiCallCached + CachedResolve) and NotifySend: ExecutionId set; CorrelationId unchanged (per-operation TrackedOperationId / NotificationId). Renames the threaded ctor param/field across ExternalSystemHelper, DatabaseHelper, AuditingDbConnection and AuditingDbCommand, and threads the id through NotifyHelper/NotifyTarget. The S&F retry-loop cached rows (CachedCallLifecycleBridge) are out of scope here.	2026-05-21 15:05:00 -04:00
Joseph Doherty	aadb1fd72a	refactor(auditlog): rename audit correlation field, add cross-helper tests	2026-05-21 13:57:17 -04:00
Joseph Doherty	8243f61e96	feat(auditlog): per-script-execution correlation id on sync audit rows	2026-05-21 13:46:34 -04:00
Joseph Doherty	849a011400	fix(auditlog): capture request/response payloads on outbound API audit rows The outbound ApiCall emitter hard-coded RequestSummary/ResponseSummary to null, so audited API calls carried no inputs/outputs — contrary to the Audit Log payload-capture spec. Thread the call arguments into the sync ApiCall emitter and the cached immediate-completion path (CachedSubmit / ApiCallCached / CachedResolve), and stamp the response body from ExternalCallResult.ResponseJson. The writer's payload filter still applies the size cap + redaction downstream. The S&F retry-loop cached rows are unchanged — request data is not threaded through the store-and-forward buffer (same boundary as SourceScript).	2026-05-21 10:17:42 -04:00
Joseph Doherty	ae7329034f	fix(auditlog): populate the Actor column on outbound and central rows Per the Audit Log Actor-column spec, Actor should carry the calling script identity on outbound rows (ApiCall, DbWrite, NotifySend) and a system identity on central-dispatch rows (NotifyDeliver). The original emission code hard-coded Actor=null at all four sites, so only Inbound API rows (API key name) ever filled it. Stamp the script identity and 'system' respectively.	2026-05-21 09:50:55 -04:00
Joseph Doherty	855df759b5	feat(siteruntime): emit NotifySend(Submitted) on site-side Notify.To().Send (#23 M4) Audit Log #23 M4 Bundle C, Task C1: every script-initiated Notify.To(list).Send(...) now emits exactly one Notification/NotifySend audit row via the IAuditWriter wired through ScriptRuntimeContext. The row carries Status=Submitted, Target=list name, RequestSummary={subject,body} JSON (M5 will redact), CorrelationId=NotificationId (parsed as Guid), provenance from context, ForwardState=Pending. Emission is best-effort per alog.md §7: a thrown audit writer is logged and swallowed inside the helper; the original NotificationId still flows back to the script and the underlying S&F enqueue still happened. Mirrors the M2 Bundle F ExternalSystem.Call wrapper pattern. Tests: 7 new tests in NotifySendAuditEmissionTests covering submitted-status, list-name target, request-summary JSON shape, writer-throws fail-safe, provenance, NotificationId/CorrelationId round-trip, and the null-writer degrade path.	2026-05-20 16:18:46 -04:00
Joseph Doherty	e4d902753b	feat(siteruntime): emit DbOutbound.DbWrite on sync Database.Execute/ExecuteReader (#23 M4) Audit Log #23, M4 Bundle A (Tasks A1+A2): every script-initiated synchronous DB call routed through Database.Connection(name) now emits exactly one DbOutbound/DbWrite audit row. Implementation: three thin ADO.NET decorators in src/ScadaLink.SiteRuntime/Scripts/: - AuditingDbConnection: wraps the gateway-returned DbConnection so CreateDbCommand() hands the script an AuditingDbCommand. All other ADO.NET surface forwards unchanged. - AuditingDbCommand: intercepts ExecuteNonQuery / ExecuteScalar / ExecuteReader (sync + async). On terminal: Channel = DbOutbound, Kind = DbWrite, Status = Delivered|Failed, Extra = {"op":"write","rowsAffected":N} (Execute), {"op":"read","rowsReturned":N} (ExecuteReader), RequestSummary = JSON of SQL + parameter values (default capture; redaction in M5), Target = "<connection>.<first 60 chars of SQL>", DurationMs captured via Stopwatch, Provenance from ScriptRuntimeContext (SourceSiteId, SourceInstanceId, SourceScript). - AuditingDbDataReader: counts rows on Read/ReadAsync and fires the audit emission exactly once on Close/CloseAsync/Dispose. DatabaseHelper now takes an IAuditWriter; ScriptRuntimeContext.Database threads through _auditWriter. When the writer is null (tests / minimal hosts) Connection() returns the raw inner DbConnection unchanged. Best-effort emission (alog.md §7): mirrors M2 Bundle F's 3-layer fail-safe (build, write, continuation). Audit-build, audit-write, and audit-continuation faults are logged + swallowed; the original ADO.NET result (or original exception) flows back to the script untouched. The SiteAuditWriteFailures counter increments automatically through the existing FallbackAuditWriter (Bundle G). Tests in tests/ScadaLink.SiteRuntime.Tests/Scripts/DatabaseSyncEmissionTests.cs (7 new, all passing): 1. Execute / INSERT success: one DbWrite row, op=write, rowsAffected=1. 2. ExecuteScalar success: one DbWrite row, op=write. 3. Execute throws: Status=Failed, ErrorMessage + ErrorDetail set. 4. ExecuteReader success: op=read, rowsReturned counts rows pulled. 5. AuditWriter throws: original ADO.NET rowsAffected returned, no events captured, no exception propagates. 6. Provenance populated from context. 7. DurationMs recorded non-zero. Tests use Microsoft.Data.Sqlite in-memory (already transitively available via SiteRuntime). Total SiteRuntime test suite: 251 passing (244 baseline + 7 new). Full solution test suite passes.	2026-05-20 15:54:54 -04:00
Joseph Doherty	f81750b2aa	fix(siteruntime): immediate-success CachedCall emits terminal telemetry (#23 M3) Bundle E left a gap in ExternalSystem.CachedCall: when the underlying HTTP call succeeds immediately (WasBuffered=false), the store-and-forward retry loop is never engaged and the ICachedCallLifecycleObserver hook never fires. As a result Tracking.Status(id) would stay in Submitted forever and the audit log would be missing the Attempted + CachedResolve pair the M3 contract requires. Fix: capture the ExternalCallResult returned by IExternalSystemClient.CachedCallAsync. When WasBuffered=false, emit the two missing telemetry packets from the helper itself: - ApiCallCached / Attempted (per-attempt mechanics row, HttpStatus + ErrorMessage extracted via the same regex the synchronous Call() audit row uses) - CachedResolve / Delivered on Success, or - CachedResolve / Failed on Success=false (immediate permanent failure or transient failure without S&F). The terminal CachedResolve row carries TerminalAtUtc so SiteCallAudit can recognise the row as eligible for purge. The WasBuffered=true path is unaffected; the S&F retry loop owns the Attempted + Resolve emissions there via the CachedCallLifecycleBridge. Database.CachedWrite is unaffected too because IDatabaseGateway.CachedWriteAsync always enqueues into S&F (no immediate-success path). Both new emissions are best-effort: a throwing forwarder is logged and swallowed (alog.md §7) and each row is independently try/catch-wrapped so a single fault cannot drop both halves of the terminal pair. Tests in ExternalSystemCachedCallEmissionTests: - CachedCall_ImmediateSuccess_EmitsAttemptedAndCachedResolve - CachedCall_ImmediateFailure_EmitsAttemptedAndCachedResolveFailed - CachedCall_BufferedPath_DoesNotEmitTerminalTelemetryFromHelper Full suite: 244 SiteRuntime tests (3 new), 200 Host tests, all green.	2026-05-20 15:15:11 -04:00
Joseph Doherty	42430dd10a	feat(siteruntime): ExternalSystem.CachedCall emits CachedSubmit telemetry (#23 M3) Rework ScriptRuntimeContext.ExternalSystem.CachedCall to fit the M3 combined-telemetry model: * Mints a fresh TrackedOperationId and emits one CachedSubmit packet via ICachedCallTelemetryForwarder BEFORE handing the call off — the SiteCalls row is materialised before the first delivery attempt so Tracking.Status(id) can observe a Submitted row even if immediate delivery resolves before the helper returns. * Threads the TrackedOperationId into IExternalSystemClient.CachedCallAsync as a new optional parameter (and into IDatabaseGateway.CachedWriteAsync for the Database mirror set up here for E6). The gateway uses the id as the StoreAndForward messageId so the retry loop (Tasks E4/E5) can recover it from StoreAndForwardMessage.Id. * Returns the TrackedOperationId rather than ExternalCallResult — the script's contract is now "get a tracking handle, observe outcome via Tracking.Status". Best-effort emission: a thrown forwarder is logged + swallowed; the original call still runs and the id is still returned. DatabaseHelper gets the matching siteId / sourceScript / forwarder fields and a parallel CachedSubmit emitter (Channel=DbOutbound) so Task E6's Database.CachedWrite mirror plugs in without further runtime wiring. New ICachedCallTelemetryForwarder seam in Commons.Interfaces.Services so SiteRuntime depends on Commons (existing arrow) rather than ScadaLink.AuditLog (would have introduced a new dependency). Bundle E task E3 (and helper-shape work for E6).	2026-05-20 14:48:05 -04:00
Joseph Doherty	0f28d13da7	feat(siteruntime): Tracking.Status(id) script API (#23 M3)	2026-05-20 13:56:59 -04:00
Joseph Doherty	82a8bbf225	feat(siteruntime): ExternalSystem.Call emits Audit Log #23 event on every sync call Wraps IExternalSystemClient.CallAsync inside ScriptRuntimeContext's ExternalSystemHelper so every script-initiated ExternalSystem.Call produces exactly one ApiOutbound/ApiCall AuditEvent via IAuditWriter. - Captures duration with Stopwatch.GetTimestamp() around the call. - Builds the audit event with full provenance (SiteId, InstanceId, SourceScript) and a fresh EventId; ForwardState=Pending. - Maps Success → AuditStatus.Delivered, Failure (or thrown) → Failed; parses HTTP {code} out of the ExternalSystemClient's error message to populate HttpStatus. - Audit emission is fully best-effort: event-build failures, sync WriteAsync throws, AND async WriteAsync faults are all logged at Warning and swallowed so the script's call path is never aborted by an audit-write failure (alog.md §7). - Original ExternalCallResult or original exception flows back to the caller unchanged. ScriptExecutionActor resolves IAuditWriter from DI and threads it into ScriptRuntimeContext alongside the existing site identity. Adds ExternalSystemCallAuditEmissionTests covering: success → Delivered, HTTP 500 → Failed+httpStatus, HTTP 400 → Failed+httpStatus, client-thrown network exception → Failed with original exception re-thrown, audit-writer throw → original result returned, provenance populated from context, DurationMs recorded. Refs Audit Log #23 M2 Bundle F.	2026-05-20 13:11:19 -04:00
Joseph Doherty	558f9ceb39	feat(notification-outbox): populate SourceScript on outbound notifications FU3: thread the executing script identifier from the script-execution context down to the Notify outbox API so NotifyTarget.Send stamps NotificationSubmit.SourceScript instead of leaving it null. - ScriptRuntimeContext / NotifyHelper / NotifyTarget take an optional sourceScript value, carried through to NotificationSubmit.SourceScript. - ScriptExecutionActor supplies "ScriptActor:<scriptName>", matching the Site Event Logging "Source" convention used for script error events. - AlarmExecutionActor builds the context without the S&F engine, so its Notify API is inert; sourceScript defaults to null there.	2026-05-19 03:54:09 -04:00
Joseph Doherty	c8b5871782	fix(notification-outbox): re-align Central UI sandbox Notify API with production The script-analysis sandbox Notify surface was stale after the Notification Outbox change: SandboxNotifyTarget.Send returned Task<NotificationResult> and there was no Status method, while production NotifyTarget.Send returns Task<string> (a NotificationId) plus NotifyHelper.Status. A script that test-ran cleanly in the sandbox would not compile against the real site runtime. - Move the NotificationDeliveryStatus record from ScadaLink.SiteRuntime.Scripts into ScadaLink.Commons.Messages.Notification so both production and the CentralUI sandbox reference the exact same type (CentralUI does not, and should not, reference SiteRuntime). Production NotifyHelper.Status is otherwise untouched. - Rewrite SandboxNotifyHelper/SandboxNotifyTarget to be a signature-faithful no-op fake: Send returns Task<string> (a fake NotificationId), Status returns Task<NotificationDeliveryStatus>. Production now enqueues into the site S&F engine, which has no central-side equivalent in the sandbox, so the fake no longer carries an INotificationDeliveryService. - Add script-analysis tests proving a script using the new Notify shape both diagnoses clean and runs in the sandbox.	2026-05-19 03:44:34 -04:00
Joseph Doherty	3326bddeb0	feat(notification-outbox): async Notify.Send with status handle Notify.To(list).Send(subject,body) now generates a NotificationId GUID, enqueues a Notification-category message into the site Store-and-Forward Engine, and returns the NotificationId immediately (Task<string>). The NotificationId is the single idempotency key end-to-end: it is the S&F message Id, it is carried inside the buffered NotificationSubmit payload, and it is the id the forwarder submits to central. NotificationForwarder now deserializes the buffered payload as a NotificationSubmit and reads NotificationId from it (re-stamping only the site-owned SourceSiteId / SourceInstanceId), instead of deriving the id from StoreAndForwardMessage.Id. Adds NotifyHelper.Status(id): queries central via the site communication actor; reports the site-local Forwarding state while the notification is still buffered at the site, maps central's response when found, and Unknown otherwise. Adds a NotificationDeliveryStatus record. SiteCommunicationActor gains a NotificationStatusQuery forwarding handler mirroring NotificationSubmit. StoreAndForwardService.EnqueueAsync gains an optional messageId parameter and exposes GetMessageByIdAsync.	2026-05-19 02:30:51 -04:00
Joseph Doherty	09b4bd5dfa	fix(site-runtime): resolve SiteRuntime-001/002/003 — route data-sourced writes to DCL, real per-attribute API results, race-free redeploy	2026-05-16 19:57:28 -04:00
Joseph Doherty	295150751f	feat(scripts): realign Test Run with runtime API, add anonymous-object calls and instance binding The Test Run sandbox and Monaco analysis modelled a script API that had drifted from the site runtime's ScriptGlobals, so real scripts failed to compile in Test Run. Realign both to the runtime surface (Instance/Scripts/ExternalSystem/Attributes/Children/Parent) and drop the duplicate ScriptHost stub so the two cannot diverge again. - Script calls (Scripts.CallShared, Instance.CallScript, Route.To().Call) accept an anonymous object instead of a hand-built dictionary, via a shared ScriptArgs normalizer; existing dictionary calls still compile. - Test Run can optionally bind to a deployed instance, so Instance/Attributes/CallScript route to it cross-site; adds site-side RouteToGetAttributes/RouteToSetAttributes handlers. - Adds Test Run panels to the API method and template script editors. - Fixes the TestDatabaseQuery seed script, which queried a table that never existed. Also commits unrelated in-progress work already in the tree: the health monitoring report loop, site streaming changes, and the Admin/Design data-connection and SMTP page reorganization.	2026-05-16 03:37:56 -04:00
Joseph Doherty	b659978764	Phase 8: Production readiness — failover tests, security hardening, sandboxing, deployment docs - WP-1-3: Central/site failover + dual-node recovery tests (17 tests) - WP-4: Performance testing framework for target scale (7 tests) - WP-5: Security hardening (LDAPS, JWT key length, no secrets in logs) (11 tests) - WP-6: Script sandboxing adversarial tests (28 tests, all forbidden APIs) - WP-7: Recovery drill test scaffolds (5 tests) - WP-8: Observability validation (structured logs, correlation IDs, metrics) (6 tests) - WP-9: Message contract compatibility (forward/backward compat) (18 tests) - WP-10: Deployment packaging (installation guide, production checklist, topology) - WP-11: Operational runbooks (failover, troubleshooting, maintenance) 92 new tests, all passing. Zero warnings.	2026-03-16 22:12:31 -04:00
Joseph Doherty	389f5a0378	Phase 3B: Site I/O & Observability — Communication, DCL, Script/Alarm actors, Health, Event Logging Communication Layer (WP-1–5): - 8 message patterns with correlation IDs, per-pattern timeouts - Central/Site communication actors, transport heartbeat config - Connection failure handling (no central buffering, debug streams killed) Data Connection Layer (WP-6–14, WP-34): - Connection actor with Become/Stash lifecycle (Connecting/Connected/Reconnecting) - OPC UA + LmxProxy adapters behind IDataConnection - Auto-reconnect, bad quality propagation, transparent re-subscribe - Write-back, tag path resolution with retry, health reporting - Protocol extensibility via DataConnectionFactory Site Runtime (WP-15–25, WP-32–33): - ScriptActor/ScriptExecutionActor (triggers, concurrent execution, blocking I/O dispatcher) - AlarmActor/AlarmExecutionActor (ValueMatch/RangeViolation/RateOfChange, in-memory state) - SharedScriptLibrary (inline execution), ScriptRuntimeContext (API) - ScriptCompilationService (Roslyn, forbidden API enforcement, execution timeout) - Recursion limit (default 10), call direction enforcement - SiteStreamManager (per-subscriber bounded buffers, fire-and-forget) - Debug view backend (snapshot + stream), concurrency serialization - Local artifact storage (4 SQLite tables) Health Monitoring (WP-26–28): - SiteHealthCollector (thread-safe counters, connection state) - HealthReportSender (30s interval, monotonic sequence numbers) - CentralHealthAggregator (offline detection 60s, online recovery) Site Event Logging (WP-29–31): - SiteEventLogger (SQLite, 6 event categories, ISO 8601 UTC) - EventLogPurgeService (30-day retention, 1GB cap) - EventLogQueryService (filters, keyword search, keyset pagination) 541 tests pass, zero warnings.	2026-03-16 20:57:25 -04:00

24 Commits
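
Commits c00603e2a4 and 6f5a35f222 describe an idempotent, PRAGMA-probed ALTER TABLE migration plus a defensive GUID read helper (ParseGuidColumn). The following is a minimal sketch of that pattern only, not the project's code: it is written in Python with the standard sqlite3 module rather than C#, and the table name sf_messages, its columns id and payload, and the function names ensure_column and parse_guid_column are hypothetical stand-ins chosen for illustration.

```python
import sqlite3
import uuid


def ensure_column(conn, table, column, decl):
    """Add `column` to `table` only if PRAGMA table_info does not list it.

    Returns True when the column was added, False when it already existed.
    `table`, `column` and `decl` are trusted constants here, never user input.
    """
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


def parse_guid_column(value):
    """Defensive read: NULL or a corrupt value yields None instead of raising."""
    if value is None:
        return None
    try:
        return uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return None


# Hypothetical buffer table as an older build might have created it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sf_messages (id TEXT PRIMARY KEY, payload TEXT)")
conn.execute("INSERT INTO sf_messages VALUES ('old-row', '{}')")

# Running the migration twice is safe: the second probe finds the column.
first = ensure_column(conn, "sf_messages", "parent_execution_id", "TEXT")
second = ensure_column(conn, "sf_messages", "parent_execution_id", "TEXT")

# A row persisted before the migration reads back NULL for the new column.
legacy_value = conn.execute(
    "SELECT parent_execution_id FROM sf_messages WHERE id = 'old-row'"
).fetchone()[0]
legacy_guid = parse_guid_column(legacy_value)
```

Probing the schema before altering it lets the same startup code run against both fresh and already-migrated databases, and routing every read through one tolerant parser means a single bad value degrades to a null id rather than aborting the sweep, which matches the intent the commit messages state.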