Commit Graph

19 Commits

Author SHA1 Message Date
Joseph Doherty
aadb1fd72a refactor(auditlog): rename audit correlation field, add cross-helper tests 2026-05-21 13:57:17 -04:00
Joseph Doherty
8243f61e96 feat(auditlog): per-script-execution correlation id on sync audit rows 2026-05-21 13:46:34 -04:00
Joseph Doherty
849a011400 fix(auditlog): capture request/response payloads on outbound API audit rows
The outbound ApiCall emitter hard-coded RequestSummary/ResponseSummary to null,
so audited API calls carried no inputs/outputs — contrary to the Audit Log
payload-capture spec. Thread the call arguments into the sync ApiCall emitter
and the cached immediate-completion path (CachedSubmit / ApiCallCached /
CachedResolve), and stamp the response body from ExternalCallResult.ResponseJson.
The writer's payload filter still applies the size cap + redaction downstream.

The S&F retry-loop cached rows are unchanged — request data is not threaded
through the store-and-forward buffer (same boundary as SourceScript).
2026-05-21 10:17:42 -04:00
Joseph Doherty
ae7329034f fix(auditlog): populate the Actor column on outbound and central rows
Per the Audit Log Actor-column spec, Actor should carry the calling script
identity on outbound rows (ApiCall, DbWrite, NotifySend) and a system identity
on central-dispatch rows (NotifyDeliver). The original emission code hard-coded
Actor=null at all four sites, so only Inbound API rows (API key name) ever
filled it. Stamp the script identity and 'system' respectively.
2026-05-21 09:50:55 -04:00
Joseph Doherty
855df759b5 feat(siteruntime): emit NotifySend(Submitted) on site-side Notify.To().Send (#23 M4)
Audit Log #23 M4 Bundle C — Task C1: every script-initiated
Notify.To(list).Send(...) now emits exactly one
Notification/NotifySend audit row via the IAuditWriter wired through
ScriptRuntimeContext. The row carries Status=Submitted,
Target=list name, RequestSummary={subject,body} JSON (M5 will redact),
CorrelationId=NotificationId (parsed as Guid), provenance from context,
ForwardState=Pending.

Emission is best-effort per alog.md §7: a thrown audit writer is logged
and swallowed inside the helper; the original NotificationId still flows
back to the script and the underlying S&F enqueue still happened.

Mirrors the M2 Bundle F ExternalSystem.Call wrapper pattern.

Tests: 7 new tests in NotifySendAuditEmissionTests covering submitted-
status, list-name target, request-summary JSON shape, writer-throws
fail-safe, provenance, NotificationId/CorrelationId round-trip, and the
null-writer degrade path.
2026-05-20 16:18:46 -04:00
Joseph Doherty
e4d902753b feat(siteruntime): emit DbOutbound.DbWrite on sync Database.Execute*/ExecuteReader (#23 M4)
Audit Log #23 — M4 Bundle A (Tasks A1+A2): every script-initiated
synchronous DB call routed through Database.Connection(name) now emits
exactly one DbOutbound/DbWrite audit row.

Implementation — three thin ADO.NET decorators in
src/ScadaLink.SiteRuntime/Scripts/:

  - AuditingDbConnection: wraps the gateway-returned DbConnection so
    CreateDbCommand() hands the script an AuditingDbCommand. All other
    ADO.NET surface forwards unchanged.
  - AuditingDbCommand: intercepts ExecuteNonQuery / ExecuteScalar /
    ExecuteReader (sync + async). On terminal:
      Channel = DbOutbound, Kind = DbWrite, Status = Delivered|Failed,
      Extra = {"op":"write","rowsAffected":N}   (Execute*),
              {"op":"read","rowsReturned":N}   (ExecuteReader),
      RequestSummary = JSON of SQL + parameter values (default capture;
                       redaction in M5),
      Target = "<connection>.<first 60 chars of SQL>",
      DurationMs captured via Stopwatch,
      Provenance from ScriptRuntimeContext (SourceSiteId,
                       SourceInstanceId, SourceScript).
  - AuditingDbDataReader: counts rows on Read/ReadAsync and fires the
    audit emission exactly once on Close/CloseAsync/Dispose.

DatabaseHelper now takes an IAuditWriter; ScriptRuntimeContext.Database
threads through _auditWriter. When the writer is null (tests / minimal
hosts) Connection() returns the raw inner DbConnection unchanged.

Best-effort emission (alog.md §7): mirrors M2 Bundle F's 3-layer
fail-safe — build, write, continuation. Audit-build, audit-write, and
audit-continuation faults are logged + swallowed; the original ADO.NET
result (or original exception) flows back to the script untouched. The
SiteAuditWriteFailures counter increments automatically through the
existing FallbackAuditWriter (Bundle G).

Tests — tests/ScadaLink.SiteRuntime.Tests/Scripts/DatabaseSyncEmissionTests.cs
(7 new, all passing):
  1. Execute / INSERT success — one DbWrite row, op=write, rowsAffected=1.
  2. ExecuteScalar success — one DbWrite row, op=write.
  3. Execute throws — Status=Failed, ErrorMessage + ErrorDetail set.
  4. ExecuteReader success — op=read, rowsReturned counts rows pulled.
  5. AuditWriter throws — original ADO.NET rowsAffected returned, no
     events captured, no exception propagates.
  6. Provenance populated from context.
  7. DurationMs recorded non-zero.

Tests use Microsoft.Data.Sqlite in-memory (already transitively
available via SiteRuntime). Total SiteRuntime test suite: 251 passing
(244 baseline + 7 new). Full solution test suite passes.
2026-05-20 15:54:54 -04:00
Joseph Doherty
f81750b2aa fix(siteruntime): immediate-success CachedCall emits terminal telemetry (#23 M3)
Bundle E left a gap in ExternalSystem.CachedCall: when the underlying HTTP
call succeeds immediately (WasBuffered=false), the store-and-forward retry
loop is never engaged and the ICachedCallLifecycleObserver hook never
fires. As a result Tracking.Status(id) would stay in Submitted forever and
the audit log would be missing the Attempted + CachedResolve pair the M3
contract requires.

Fix: capture the ExternalCallResult returned by IExternalSystemClient.
CachedCallAsync. When WasBuffered=false, emit the two missing telemetry
packets from the helper itself:

- ApiCallCached / Attempted   (per-attempt mechanics row, HttpStatus +
                              ErrorMessage extracted via the same regex
                              the synchronous Call() audit row uses)
- CachedResolve / Delivered   on Success, or
- CachedResolve / Failed      on Success=false (immediate permanent
                              failure or transient failure without S&F).

The terminal CachedResolve row carries TerminalAtUtc so SiteCallAudit can
recognise the row as eligible for purge.

The WasBuffered=true path is unaffected — the S&F retry loop owns the
Attempted + Resolve emissions there via the CachedCallLifecycleBridge.
Database.CachedWrite is unaffected too because IDatabaseGateway.
CachedWriteAsync always enqueues into S&F (no immediate-success path).

Both new emissions are best-effort: a throwing forwarder is logged and
swallowed (alog.md §7) and each row is independently try/catch-wrapped so
a single fault cannot drop both halves of the terminal pair.

Tests in ExternalSystemCachedCallEmissionTests:
- CachedCall_ImmediateSuccess_EmitsAttemptedAndCachedResolve
- CachedCall_ImmediateFailure_EmitsAttemptedAndCachedResolveFailed
- CachedCall_BufferedPath_DoesNotEmitTerminalTelemetryFromHelper

Full suite: 244 SiteRuntime tests (3 new), 200 Host tests, all green.
2026-05-20 15:15:11 -04:00
Joseph Doherty
047988e4c8 feat(siteruntime): Database.CachedWrite emits combined telemetry + S&F audit bridge (#23 M3)
Wire the M3 cached-call audit pipeline end-to-end for the database
channel and close the loop between the S&F lifecycle observer and the
site-side dual emitter.

* DatabaseCachedWriteEmissionTests covers Database.CachedWrite (set up
  in Bundle E3): mints a TrackedOperationId, emits one CachedSubmit
  packet on DbOutbound, threads the id into IDatabaseGateway, and is
  best-effort on a thrown forwarder. Mirrors ExternalSystem.CachedCall
  coverage from E3.

* CachedCallLifecycleBridge (new) implements ICachedCallLifecycleObserver
  and lives alongside CachedCallTelemetryForwarder. The bridge ingests
  per-attempt notifications from the S&F retry loop and fans them out
  to the forwarder:
    - TransientFailure -> 1 Attempted row
    - Delivered        -> Attempted + CachedResolve(Delivered)
    - PermanentFailure -> Attempted + CachedResolve(Parked)
    - ParkedMaxRetries -> Attempted + CachedResolve(Parked)
  Channel string -> AuditKind mapping (ApiOutbound->ApiCallCached,
  DbOutbound->DbWriteCached). Best-effort top-level catch swallows any
  unexpected throw so the S&F retry bookkeeping is never disturbed.

* Bridge tests (7) cover all four outcomes, channel mapping, provenance
  propagation, and the no-throw-on-forwarder-failure contract.

Bundle F (Host registration) will instantiate the bridge and inject it
into StoreAndForwardService.cachedCallObserver, closing the wiring path
end-to-end.

Bundle E task E6.
2026-05-20 14:55:17 -04:00
Joseph Doherty
42430dd10a feat(siteruntime): ExternalSystem.CachedCall emits CachedSubmit telemetry (#23 M3)
Rework ScriptRuntimeContext.ExternalSystem.CachedCall to fit the M3
combined-telemetry model:

* Mints a fresh TrackedOperationId and emits one CachedSubmit packet
  via ICachedCallTelemetryForwarder BEFORE handing the call off — the
  SiteCalls row is materialised before the first delivery attempt so
  Tracking.Status(id) can observe a Submitted row even if immediate
  delivery resolves before the helper returns.
* Threads the TrackedOperationId into IExternalSystemClient.CachedCallAsync
  as a new optional parameter (and into IDatabaseGateway.CachedWriteAsync
  for the Database mirror set up here for E6). The gateway uses the id
  as the StoreAndForward messageId so the retry loop (Tasks E4/E5) can
  recover it from StoreAndForwardMessage.Id.
* Returns the TrackedOperationId rather than ExternalCallResult — the
  script's contract is now "get a tracking handle, observe outcome via
  Tracking.Status". Best-effort emission: a thrown forwarder is logged
  + swallowed; the original call still runs and the id is still returned.

DatabaseHelper gets the matching siteId / sourceScript / forwarder
fields and a parallel CachedSubmit emitter (Channel=DbOutbound) so Task
E6's Database.CachedWrite mirror plugs in without further runtime
wiring.

New ICachedCallTelemetryForwarder seam in Commons.Interfaces.Services
so SiteRuntime depends on Commons (existing arrow) rather than
ScadaLink.AuditLog (would have introduced a new dependency).

Bundle E task E3 (and helper-shape work for E6).
2026-05-20 14:48:05 -04:00
Joseph Doherty
0f28d13da7 feat(siteruntime): Tracking.Status(id) script API (#23 M3) 2026-05-20 13:56:59 -04:00
Joseph Doherty
82a8bbf225 feat(siteruntime): ExternalSystem.Call emits Audit Log #23 event on every sync call
Wraps IExternalSystemClient.CallAsync inside ScriptRuntimeContext's
ExternalSystemHelper so every script-initiated ExternalSystem.Call
produces exactly one ApiOutbound/ApiCall AuditEvent via IAuditWriter.

- Captures duration with Stopwatch.GetTimestamp() around the call.
- Builds the audit event with full provenance (SiteId, InstanceId,
  SourceScript) and a fresh EventId; ForwardState=Pending.
- Maps Success → AuditStatus.Delivered, Failure (or thrown) → Failed;
  parses HTTP {code} out of the ExternalSystemClient's error message
  to populate HttpStatus.
- Audit emission is fully best-effort: event-build failures, sync
  WriteAsync throws, AND async WriteAsync faults are all logged at
  Warning and swallowed so the script's call path is never aborted
  by an audit-write failure (alog.md §7).
- Original ExternalCallResult or original exception flows back to the
  caller unchanged.

ScriptExecutionActor resolves IAuditWriter from DI and threads it
into ScriptRuntimeContext alongside the existing site identity.

Adds ExternalSystemCallAuditEmissionTests covering: success →
Delivered, HTTP 500 → Failed+httpStatus, HTTP 400 → Failed+httpStatus,
client-thrown network exception → Failed with original exception
re-thrown, audit-writer throw → original result returned, provenance
populated from context, DurationMs recorded.

Refs Audit Log #23 M2 Bundle F.
2026-05-20 13:11:19 -04:00
Joseph Doherty
17861efa51 test(notification-outbox): assert NotNull in SourceScript negative test 2026-05-19 03:57:38 -04:00
Joseph Doherty
558f9ceb39 feat(notification-outbox): populate SourceScript on outbound notifications
FU3: thread the executing script identifier from the script-execution
context down to the Notify outbox API so NotifyTarget.Send stamps
NotificationSubmit.SourceScript instead of leaving it null.

- ScriptRuntimeContext / NotifyHelper / NotifyTarget take an optional
  sourceScript value, carried through to NotificationSubmit.SourceScript.
- ScriptExecutionActor supplies "ScriptActor:<scriptName>", matching the
  Site Event Logging "Source" convention used for script error events.
- AlarmExecutionActor builds the context without the S&F engine, so its
  Notify API is inert; sourceScript defaults to null there.
2026-05-19 03:54:09 -04:00
Joseph Doherty
3326bddeb0 feat(notification-outbox): async Notify.Send with status handle
Notify.To(list).Send(subject,body) now generates a NotificationId GUID,
enqueues a Notification-category message into the site Store-and-Forward
Engine, and returns the NotificationId immediately (Task<string>). The
NotificationId is the single idempotency key end-to-end: it is the S&F
message Id, it is carried inside the buffered NotificationSubmit payload,
and it is the id the forwarder submits to central.

NotificationForwarder now deserializes the buffered payload as a
NotificationSubmit and reads NotificationId from it (re-stamping only the
site-owned SourceSiteId / SourceInstanceId), instead of deriving the id
from StoreAndForwardMessage.Id.

Adds NotifyHelper.Status(id): queries central via the site communication
actor; reports the site-local Forwarding state while the notification is
still buffered at the site, maps central's response when found, and
Unknown otherwise. Adds a NotificationDeliveryStatus record.

SiteCommunicationActor gains a NotificationStatusQuery forwarding handler
mirroring NotificationSubmit. StoreAndForwardService.EnqueueAsync gains an
optional messageId parameter and exposes GetMessageByIdAsync.
2026-05-19 02:30:51 -04:00
Joseph Doherty
a88bec9376 fix(site-runtime): resolve SiteRuntime-004..011 — deploy-after-persist, remove reflection, deterministic IDs, non-blocking startup, dedicated script scheduler, config-change detection, semantic trust-model check 2026-05-16 21:44:10 -04:00
Joseph Doherty
efba01d10a feat(scripts): self/child/parent attribute and script accessors
Phases 1+2 of the design at
docs/plans/2026-05-12-script-scope-access-design.md.

Adds ergonomic scope-aware accessors to compiled scripts. A script
on a composed TempSensor reads its own attribute via
Attributes["Temperature"]; reaches up to the parent via
Parent.Attributes["SpeedRPM"]; invokes a child script via
Children["TempSensor"].CallScript("Sample"). All resolve to the
existing flat Instance.GetAttribute / SetAttribute / CallScript
delegates by prepending the script's canonical path prefix.

Runtime types (SiteRuntime.Scripts.ScopeAccessors):
  AttributeAccessor   sync indexer + GetAsync / SetAsync
  CompositionAccessor Attributes + CallScript
  ChildrenAccessor    Children["name"] => CompositionAccessor

ScriptGlobals gains Scope, Attributes, Children, Parent properties.
Sync indexer blocks on the Instance Actor Ask; explicit GetAsync /
SetAsync are also available for callers that want to await.

Plumbing:
  - Commons.Types.Scripts.ScriptScope record (SelfPath / ParentPath).
  - ResolvedScript.Scope (defaults to ScriptScope.Root for back-compat).
  - FlatteningService emits new ScriptScope(prefix, "") for each
    composed script so a script defined on TempSensor composed under
    a parent gets SelfPath = "TempSensor".
  - ScriptActor reads the Scope from its ResolvedScript and forwards
    it through ScriptExecutionActor into ScriptGlobals on each call.

RevisionHashService not touched: the per-script canonical name
already encodes the composition path, so any structural change
already flips the hash.

10 new unit tests on the path arithmetic. Site/Template engine
suites stay green (129 + 199).

Editor surface (Phase 3: metadata fetch, Phase 4: completion +
SCADA006 / SCADA007 diagnostics) follows in the next commits.
2026-05-12 05:45:24 -04:00
Joseph Doherty
161dc406ed feat(scripts): add typed Parameters.Get<T>() helpers for script API
Replace raw dictionary casting with ScriptParameters wrapper that provides
Get<T>, Get<T?>, Get<T[]>, and Get<List<T>> with clear error messages,
numeric conversion, and JsonElement support for Inbound API parameters.
2026-03-22 15:47:18 -04:00
Joseph Doherty
b659978764 Phase 8: Production readiness — failover tests, security hardening, sandboxing, deployment docs
- WP-1-3: Central/site failover + dual-node recovery tests (17 tests)
- WP-4: Performance testing framework for target scale (7 tests)
- WP-5: Security hardening (LDAPS, JWT key length, no secrets in logs) (11 tests)
- WP-6: Script sandboxing adversarial tests (28 tests, all forbidden APIs)
- WP-7: Recovery drill test scaffolds (5 tests)
- WP-8: Observability validation (structured logs, correlation IDs, metrics) (6 tests)
- WP-9: Message contract compatibility (forward/backward compat) (18 tests)
- WP-10: Deployment packaging (installation guide, production checklist, topology)
- WP-11: Operational runbooks (failover, troubleshooting, maintenance)
92 new tests, all passing. Zero warnings.
2026-03-16 22:12:31 -04:00
Joseph Doherty
389f5a0378 Phase 3B: Site I/O & Observability — Communication, DCL, Script/Alarm actors, Health, Event Logging
Communication Layer (WP-1–5):
- 8 message patterns with correlation IDs, per-pattern timeouts
- Central/Site communication actors, transport heartbeat config
- Connection failure handling (no central buffering, debug streams killed)

Data Connection Layer (WP-6–14, WP-34):
- Connection actor with Become/Stash lifecycle (Connecting/Connected/Reconnecting)
- OPC UA + LmxProxy adapters behind IDataConnection
- Auto-reconnect, bad quality propagation, transparent re-subscribe
- Write-back, tag path resolution with retry, health reporting
- Protocol extensibility via DataConnectionFactory

Site Runtime (WP-15–25, WP-32–33):
- ScriptActor/ScriptExecutionActor (triggers, concurrent execution, blocking I/O dispatcher)
- AlarmActor/AlarmExecutionActor (ValueMatch/RangeViolation/RateOfChange, in-memory state)
- SharedScriptLibrary (inline execution), ScriptRuntimeContext (API)
- ScriptCompilationService (Roslyn, forbidden API enforcement, execution timeout)
- Recursion limit (default 10), call direction enforcement
- SiteStreamManager (per-subscriber bounded buffers, fire-and-forget)
- Debug view backend (snapshot + stream), concurrency serialization
- Local artifact storage (4 SQLite tables)

Health Monitoring (WP-26–28):
- SiteHealthCollector (thread-safe counters, connection state)
- HealthReportSender (30s interval, monotonic sequence numbers)
- CentralHealthAggregator (offline detection 60s, online recovery)

Site Event Logging (WP-29–31):
- SiteEventLogger (SQLite, 6 event categories, ISO 8601 UTC)
- EventLogPurgeService (30-day retention, 1GB cap)
- EventLogQueryService (filters, keyword search, keyset pagination)

541 tests pass, zero warnings.
2026-03-16 20:57:25 -04:00