Commit Graph

863 Commits

Author SHA1 Message Date
Joseph Doherty
16f800b76a Merge branch 'feature/audit-parent-executionid': ParentExecutionId cross-execution audit correlation 2026-05-21 20:14:44 -04:00
Joseph Doherty
9ec83d5070 docs(auditlog): generalize two stale XML-doc comments
- AddColumnIfMissing is now shared by ExecutionId and ParentExecutionId;
  drop the ExecutionId-specific tag.
- AuditLogRepository.GetExecutionTreeAsync doc no longer hardcodes the
  MAXRECURSION literal; reference the ExecutionChainMaxDepth const instead.
2026-05-21 20:14:31 -04:00
Joseph Doherty
933f0484ba test(auditlog): ParentExecutionId e2e waits on audit kinds, not a row count
The headline ParentExecutionIdCorrelationTests intermittently failed under
full-suite parallel load, seeing 6 of 7 routed-run rows (NotifySend missing).
Root cause: WaitForSiteRowsPersistedAsync checked only a row *count*, which a
partial snapshot could satisfy before the last-emitted NotifySend row settled,
letting the SiteAuditTelemetryActor drain a partial batch. Fix is test-only:
wait on the specific audit Kinds (guaranteeing NotifySend is durably in SQLite
before the assertion) and widen the assertion ceiling 30s -> 90s for drain
headroom under load. Also drops leftover // DIAG sampler debug scaffolding.
2026-05-21 20:09:54 -04:00
Joseph Doherty
fb1312d0bf test(auditlog): end-to-end ParentExecutionId correlation + docs 2026-05-21 19:12:19 -04:00
Joseph Doherty
592cbd028e feat(audit): ParentExecutionId filter in the CLI and ManagementService 2026-05-21 18:59:06 -04:00
Joseph Doherty
9b1f78638b refactor(centralui): complete cycle fallback + polish in ExecutionTree 2026-05-21 18:56:03 -04:00
Joseph Doherty
34a4356625 feat(centralui): execution-chain tree view on the Audit Log page 2026-05-21 18:49:13 -04:00
Joseph Doherty
0b5723b777 feat(centralui): ParentExecutionId column, filter and parent drill-in on the Audit Log page 2026-05-21 18:38:02 -04:00
Joseph Doherty
252bf0a970 refactor(auditlog): GetExecutionTreeAsync recurses over a distinct edge set 2026-05-21 18:29:48 -04:00
Joseph Doherty
255dd95cd9 feat(auditlog): GetExecutionTreeAsync recursive execution-chain query 2026-05-21 18:22:21 -04:00
Joseph Doherty
d35551efc2 feat(auditlog): NotifyDeliver rows carry the originating ParentExecutionId 2026-05-21 18:11:04 -04:00
Joseph Doherty
c00603e2a4 feat(auditlog): thread ParentExecutionId through S&F for retry-loop cached rows
The store-and-forward retry loop emits the per-attempt and terminal cached
audit rows (ApiCallCached/DbWriteCached Attempted, CachedResolve) via
CachedCallLifecycleBridge from a CachedCallAttemptContext, not from the
script context. The ExecutionId rollout (Task 4) already threaded ExecutionId
and SourceScript through this path; ParentExecutionId — the spawning
inbound-API request's ExecutionId — was not, so those retry-loop rows had
ParentExecutionId = null even for an inbound-API-routed run.

Thread it additively as a sibling at every carry point ExecutionId passes
through:

- StoreAndForwardMessage gains ParentExecutionId (Guid?).
- StoreAndForwardStorage adds a nullable parent_execution_id column via the
  same idempotent PRAGMA-probed ALTER TABLE migration; rows persisted by an
  older build read back null (back-compat). The defensive Guid.TryParse read
  helper (ParseExecutionId) is renamed ParseGuidColumn and reused for both
  columns so a corrupt value cannot abort the retry sweep.
- StoreAndForwardService.EnqueueAsync gains an optional parentExecutionId
  param, stamped onto the buffered message and surfaced on the
  CachedCallAttemptContext built in the retry loop.
- CachedCallAttemptContext gains ParentExecutionId.
- CachedCallLifecycleBridge.BuildPacket sets AuditEvent.ParentExecutionId
  from the context, beside the existing ExecutionId.
- IExternalSystemClient.CachedCallAsync / IDatabaseGateway.CachedWriteAsync
  gain an optional parentExecutionId param; ScriptRuntimeContext's CachedCall
  / CachedWrite helpers pass _parentExecutionId.

All threading is additive — ParentExecutionId is Guid? everywhere, null for
non-routed runs, and old buffered S&F rows still deserialize with the new
field null.
2026-05-21 17:58:11 -04:00
Joseph Doherty
150ba5e63f feat(auditlog): site script-side emitters stamp ParentExecutionId 2026-05-21 17:45:55 -04:00
Joseph Doherty
6af2607a50 feat(siteruntime): thread ParentExecutionId into the routed script's ScriptRuntimeContext 2026-05-21 17:35:49 -04:00
Joseph Doherty
dc2c73b07d refactor(inboundapi): fail-fast on missing inbound ExecutionId stash 2026-05-21 17:26:49 -04:00
Joseph Doherty
d8453bfba2 feat(inboundapi): mint inbound ExecutionId early, carry it as RouteToCallRequest.ParentExecutionId 2026-05-21 17:22:13 -04:00
Joseph Doherty
50430b9daa feat(auditlog): ParentExecutionId on site SQLite schema + gRPC AuditEventDto 2026-05-21 17:12:34 -04:00
Joseph Doherty
0a8709e5c5 feat(auditlog): ParentExecutionId column on AuditEvent + central AuditLog 2026-05-21 17:04:39 -04:00
Joseph Doherty
e4b37e2798 docs(auditlog): ParentExecutionId implementation plan + task tracking 2026-05-21 16:58:07 -04:00
Joseph Doherty
6be26e2813 docs(auditlog): ParentExecutionId cross-execution correlation design 2026-05-21 16:53:25 -04:00
Joseph Doherty
156e560171 Merge branch 'feature/audit-executionid': ExecutionId universal audit correlation
Adds a dedicated ExecutionId column to the Audit Log — one universal per-run
correlation value stamped on every audit row (sync ApiCall/DbWrite, the cached
call lifecycle incl. the S&F retry-loop path, NotifySend, central NotifyDeliver,
inbound). CorrelationId keeps its per-operation lifecycle meaning. Additive
schema (AuditLog.ExecutionId, Notifications.OriginExecutionId); UI column +
filter + drill-in; CLI + ManagementService filter.
2026-05-21 16:30:41 -04:00
Joseph Doherty
5198b114b4 fix(auditlog): evolve existing site auditlog.db schema for ExecutionId 2026-05-21 16:18:17 -04:00
Joseph Doherty
fd76c19007 test(auditlog): end-to-end ExecutionId correlation + docs 2026-05-21 16:06:40 -04:00
Joseph Doherty
24cdfe373c feat(audit): ExecutionId filter in the CLI and ManagementService 2026-05-21 16:00:09 -04:00
Joseph Doherty
1ba62052d6 feat(centralui): ExecutionId column, filter and drill-in on the Audit Log page 2026-05-21 15:52:57 -04:00
Joseph Doherty
cfd8f1ecf4 feat(auditlog): inbound audit rows carry ExecutionId 2026-05-21 15:44:17 -04:00
Joseph Doherty
6aac4c8ed7 test(auditlog): pin OriginExecutionId preservation in forwarder + Parked NotifyDeliver 2026-05-21 15:42:45 -04:00
Joseph Doherty
85bb61a1f3 feat(auditlog): NotifyDeliver rows carry the originating ExecutionId 2026-05-21 15:35:40 -04:00
Joseph Doherty
705ae95404 test(auditlog): assert ExecutionId threading hops; defensive Guid parse on S&F read 2026-05-21 15:27:58 -04:00
Joseph Doherty
6f5a35f222 feat(auditlog): thread ExecutionId through S&F for retry-loop cached rows
The store-and-forward retry loop emits the per-attempt and terminal cached
audit rows (ApiCallCached/DbWriteCached Attempted, CachedResolve) via
CachedCallLifecycleBridge from a CachedCallAttemptContext, not from the
script context. ExecutionId (and SourceScript) were not threaded through the
S&F buffer, so those rows had ExecutionId = null and SourceScript = null.

Thread both, additively, from the cached-call enqueue path:

- StoreAndForwardMessage gains ExecutionId (Guid?) / SourceScript (string?).
- StoreAndForwardStorage adds nullable execution_id / source_script columns
  via an idempotent PRAGMA-probed ALTER TABLE migration; rows persisted by
  an older build read back null (back-compat).
- StoreAndForwardService.EnqueueAsync gains optional executionId /
  sourceScript params, stamped onto the buffered message and surfaced on the
  CachedCallAttemptContext built in the retry loop.
- CachedCallAttemptContext gains ExecutionId / SourceScript.
- CachedCallLifecycleBridge.BuildPacket sets AuditEvent.ExecutionId and
  AuditEvent.SourceScript from the context (replacing the hard-coded
  SourceScript = null and its now-stale comment).
- IExternalSystemClient.CachedCallAsync / IDatabaseGateway.CachedWriteAsync
  gain optional executionId / sourceScript params; ScriptRuntimeContext's
  CachedCall / CachedWrite helpers pass _executionId / _sourceScript.

Script-side cached rows (CachedSubmit, immediate Attempted+Resolve) are
unchanged. All threading is additive — old buffered S&F rows still
deserialize and process with the new fields null.
2026-05-21 15:18:35 -04:00
Joseph Doherty
0149ce6180 feat(auditlog): site script-side emitters stamp ExecutionId
Move the per-script-execution Guid on ScriptRuntimeContext from
_auditCorrelationId to _executionId, and stamp it into the dedicated
AuditEvent.ExecutionId column on every script-side audit row:

- Sync ApiCall / DbWrite: ExecutionId set; CorrelationId reverts to
  null (a sync one-shot call has no operation lifecycle).
- Cached-call script-side rows (CachedSubmit, immediate-completion
  ApiCallCached + CachedResolve) and NotifySend: ExecutionId set;
  CorrelationId unchanged (per-operation TrackedOperationId /
  NotificationId).

Renames the threaded ctor param/field across ExternalSystemHelper,
DatabaseHelper, AuditingDbConnection and AuditingDbCommand, and threads
the id through NotifyHelper/NotifyTarget. The S&F retry-loop cached rows
(CachedCallLifecycleBridge) are out of scope here.
2026-05-21 15:05:00 -04:00
Joseph Doherty
6b16a48886 feat(auditlog): ExecutionId on site SQLite schema + gRPC AuditEventDto 2026-05-21 14:53:08 -04:00
Joseph Doherty
990731d12f test(auditlog): cover ExecutionId in AuditEvent round-trip test; clarify staging-table comment 2026-05-21 14:48:39 -04:00
Joseph Doherty
fd12021984 feat(auditlog): ExecutionId column on AuditEvent + central AuditLog 2026-05-21 14:43:35 -04:00
Joseph Doherty
4002f4197b docs(plan): Audit Log ExecutionId implementation plan 2026-05-21 14:37:12 -04:00
Joseph Doherty
6ffa47f258 docs(design): Audit Log ExecutionId universal correlation 2026-05-21 14:34:12 -04:00
Joseph Doherty
c9229c35fc Merge branch 'feature/audit-execution-correlation': per-execution audit correlation id
Every script execution gets an audit correlation id (generated for tag/timer
runs, request-local for inbound); it is stamped as CorrelationId on the sync
ApiCall and DbWrite audit rows so all sync trust-boundary rows from one run
correlate. Shared scripts inherit it. Cached calls / notifications keep their
existing CorrelationId. No schema change.
2026-05-21 13:58:37 -04:00
Joseph Doherty
aadb1fd72a refactor(auditlog): rename audit correlation field, add cross-helper tests 2026-05-21 13:57:17 -04:00
Joseph Doherty
8243f61e96 feat(auditlog): per-script-execution correlation id on sync audit rows 2026-05-21 13:46:34 -04:00
Joseph Doherty
53508c79b2 Merge branch 'feature/audit-apicall-payloads': capture API-call payloads
Outbound API audit rows now carry the request arguments and response body
(sync ApiCall + cached immediate-completion path); the emitter previously
hard-coded both summary fields to null.
2026-05-21 10:17:50 -04:00
Joseph Doherty
849a011400 fix(auditlog): capture request/response payloads on outbound API audit rows
The outbound ApiCall emitter hard-coded RequestSummary/ResponseSummary to null,
so audited API calls carried no inputs/outputs — contrary to the Audit Log
payload-capture spec. Thread the call arguments into the sync ApiCall emitter
and the cached immediate-completion path (CachedSubmit / ApiCallCached /
CachedResolve), and stamp the response body from ExternalCallResult.ResponseJson.
The writer's payload filter still applies the size cap + redaction downstream.

The S&F retry-loop cached rows are unchanged — request data is not threaded
through the store-and-forward buffer (same boundary as SourceScript).
2026-05-21 10:17:42 -04:00
Joseph Doherty
405de525ca Merge branch 'feature/audit-channel-single-select': single-select Channel filter
The Audit Log Channel filter becomes a single-select — Kind narrows to the
chosen channel, so multi-channel selection is incoherent. Kind, Status and Site
stay multi-select.
2026-05-21 10:03:08 -04:00
Joseph Doherty
77922abb33 feat(centralui): single-select Channel filter on the Audit Log page
Channel narrows the Kind options to the chosen channel, so filtering by more
than one channel at a time is incoherent. Replace the Channel multi-select
dropdown with a native single-select (matching the Time range control); Kind,
Status and Site stay multi-select. The query filter contract is unchanged —
Channels just carries 0 or 1 value.
2026-05-21 10:02:17 -04:00
Joseph Doherty
5f544bfe1e Merge branch 'feature/audit-actor-identity': populate audit Actor column
Stamp the audit Actor column on outbound rows (calling script identity) and
central-dispatch rows (system identity); the original emission code left it
null on every channel except Inbound API.
2026-05-21 09:56:43 -04:00
Joseph Doherty
aaa6df24cf Merge branch 'feature/audit-filter-dropdowns': compact audit filter dropdowns
Replace the four stacked chip-button groups on the Audit Log filter bar with a
reusable MultiSelectDropdown component, collapsing the bar from four full-width
chip blocks to four inline dropdowns in one wrapped filter row.
2026-05-21 09:56:43 -04:00
Joseph Doherty
ae7329034f fix(auditlog): populate the Actor column on outbound and central rows
Per the Audit Log Actor-column spec, Actor should carry the calling script
identity on outbound rows (ApiCall, DbWrite, NotifySend) and a system identity
on central-dispatch rows (NotifyDeliver). The original emission code hard-coded
Actor=null at all four sites, so only Inbound API rows (API key name) ever
filled it. Stamp the script identity and 'system' respectively.
2026-05-21 09:50:55 -04:00
Joseph Doherty
e36f0bf9c8 feat(centralui): compact multi-select dropdowns for the audit filter bar
Replace the four stacked chip-button groups (Channel, Kind, Status, Site) on
the Audit Log filter bar with a reusable MultiSelectDropdown component, so the
bar collapses from four full-width chip blocks to four inline dropdowns sharing
one wrapped filter row. Bootstrap dropdown + checkbox menu (data-bs-auto-close
=outside); no third-party UI libraries.
2026-05-21 09:36:36 -04:00
Joseph Doherty
a3eb659b75 Merge branch 'feature/audit-log-followups': Audit Log #23 deferred follow-ups
Implements the five deferred follow-ups from the Audit Log #23 roadmap:
- Real ClusterClient-based site->central audit push (replaces NoOpSiteStreamAuditClient)
- Consolidated the duplicated AuditEvent/SiteCall DTO mappers
- Site Calls UI page + read-side backend + central->site Retry/Discard relay + Health KPI tiles
- Multi-value AuditLogQueryFilter end-to-end (repository, ManagementService, CLI, Central UI)
- Audit results grid column resize/reorder UX

Full solution build clean; full test suite green including Playwright 60/60.
2026-05-21 09:27:52 -04:00
Joseph Doherty
d34f536220 fix(centralui): stabilise Site Calls + Audit grid Playwright E2E
Three Playwright E2E failures, all test-side timing/data bugs (no
feature defects found):

- AuditGridColumnTests.ColumnOrderAndWidths_PersistAcrossReload: read
  sessionStorage synchronously right after Mouse.UpAsync, racing the
  async OnColumnResized/OnColumnReordered JS->.NET->JS save round-trip.
  Now polls (WaitForFunctionAsync) for the storage keys and for the
  reorder re-render to settle; also hardens the flaky ReorderDrag test.

- SiteCallsPageTests.FilterNarrowing_ChannelFilterShrinksGrid: the
  Target-keyword #sc-search @bind committed via the Query click's own
  blur, racing change vs click on the circuit so Search() sometimes
  ran with a stale empty filter. Commit the value with an explicit,
  fully-awaited DispatchEventAsync('change') and use the retrying
  ToHaveCount assertion for the negative row checks.

- SiteCallsPageTests.RetryClickThrough_OnParkedRow: seeded SourceSite
  'plant-a' is not a real cluster site (site-a/b/c), so the relay had
  no ClusterClient route and only resolved on the 10s inner Ask
  timeout - past the 5s toast wait. Seed a live site (site-a) for a
  fast NotParked round-trip and give the toast a 15s wait.

Playwright E2E suite: 60 passed, 0 failed, 0 skipped.
2026-05-21 09:22:50 -04:00
Joseph Doherty
40955bbca6 docs(plan): mark audit-log follow-up tasks complete 2026-05-21 06:41:53 -04:00