docs(m4.1): reconcile Config-DB AuditLog schema + Commons (AuditEvent/ApiKey/SiteCall/NotificationType) to shipped code
@@ -60,14 +60,50 @@ The configuration database stores all central system data, organized by domain a
- **Notifications**: The durable central notification queue owned by the Notification Outbox — one row per notification, the single source of audit truth. The schema is **type-agnostic** so it records any notification type the system supports (email today, Microsoft Teams and others later): a `Type` discriminator selects the type, and a `TypeData` JSON column (`nvarchar(max)`) carries any future per-type fields without a schema change. Columns: `NotificationId` (GUID, primary key — generated at the site, used as the idempotency key), `Type`, `ListName`, `Subject`, `Body`, `TypeData`, `Status`, `RetryCount`, `LastError`, `ResolvedTargets`, `SourceSiteId`, `SourceInstanceId`, `SourceScript`, `SiteEnqueuedAt`, `CreatedAt`, `LastAttemptAt`, `NextAttemptAt`, `DeliveredAt`. `Status` is a `NotificationStatus` enum stored with values `Pending`, `Retrying`, `Delivered`, `Parked`, `Discarded` (the site-local `Forwarding` state is never persisted centrally). Indexed on `Status` and `NextAttemptAt` for efficient dispatcher polling of due rows, and on `SourceSiteId` and `CreatedAt` for KPI computation and the Central UI query page. Terminal rows are removed by a daily purge job — see Scheduled Maintenance below. See Component-NotificationOutbox.md for the full lifecycle.
### Site Calls
- **SiteCalls**: The central audit table for cached site calls — `ExternalSystem.CachedCall()` and `Database.CachedWrite()` — owned by the Site Call Audit component and a sibling of the `Notifications` table. One row per cached operation. Columns: `TrackedOperationId` (GUID, primary key — generated site-side at call time, used as the idempotency key), `SourceSite`, `Kind` (a `TrackedOperationKind` enum stored with values `ExternalCall` / `DatabaseWrite`), `TargetSummary` (external system + method for an `ExternalCall`, database connection name for a `DatabaseWrite`), `Status` (a `TrackedOperationStatus` enum stored with values `Pending`, `Retrying`, `Delivered`, `Parked`, `Failed`, `Discarded`), `RetryCount`, `LastError`, `Provenance` (source instance / script), `CreatedAtUtc`, `UpdatedAtUtc`, `TerminalAtUtc`. The table is populated **only** by Site Call Audit telemetry and reconciliation pulls — sites are the source of truth and the row is an eventually-consistent mirror, never written by a central dispatcher. Ingestion is **insert-if-not-exists** keyed on `TrackedOperationId`, then **upsert-on-newer-status**; the lifecycle is monotonic, so at-least-once and out-of-order telemetry are harmless. Indexed on `Status` and `SourceSite` for KPI computation and the Central UI query page. Terminal rows are removed by a daily purge job — see Scheduled Maintenance below. See Component-SiteCallAudit.md for the full lifecycle.
- **SiteCalls**: The central audit table for cached site calls — `ExternalSystem.CachedCall()` and `Database.CachedWrite()` — owned by the Site Call Audit component and a sibling of the `Notifications` table. One row per cached operation. Columns: `TrackedOperationId` (`varchar(36)`, primary key — the GUID idempotency key stored as a "D"-format string, matching the site SQLite and gRPC wire format), `Channel` (`varchar(32)` — `ApiOutbound` or `DbOutbound`; the trust-boundary channel that produced the call), `Target` (`varchar(256)` — human-readable target, e.g. `ERP.GetOrder` for an external call), `SourceSite` (`varchar(64)` — the site that submitted the cached call), `SourceNode` (`varchar(64)` NULL — cluster node that emitted the call; null for reconciled rows from retired nodes), `Status` (`varchar(32)` — `AuditStatus` enum name: `Submitted`, `Forwarded`, `Attempted`, `Delivered`, `Failed`, `Parked`, `Discarded`; lifecycle is monotonic), `RetryCount` (`int`), `LastError` (`nvarchar(1024)` NULL), `HttpStatus` (`int` NULL — last HTTP status code for API calls), `CreatedAtUtc` (`datetime2`), `UpdatedAtUtc` (`datetime2`), `TerminalAtUtc` (`datetime2` NULL), `IngestedAtUtc` (`datetime2` — central ingest timestamp). The table is populated **only** by Site Call Audit telemetry and reconciliation pulls — sites are the source of truth and the row is an eventually-consistent mirror, never written by a central dispatcher. Ingestion is **insert-if-not-exists** keyed on `TrackedOperationId`, then **upsert-on-newer-status**; the lifecycle is monotonic, so at-least-once and out-of-order telemetry are harmless. Indexes: `IX_SiteCalls_Source_Created (SourceSite, CreatedAtUtc DESC)` for per-site queries and `IX_SiteCalls_Status_Updated (Status, UpdatedAtUtc DESC)` for status-filtered views. Terminal rows are removed by a daily purge job — see Scheduled Maintenance below. See Component-SiteCallAudit.md for the full lifecycle.
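The two-step ingestion rule (insert-if-not-exists, then upsert-on-newer-status) can be sketched as below. This is an illustration against an in-memory SQLite table, not the shipped repository code. The `STATUS_RANK` ordering is an assumption: the doc states only that the lifecycle is monotonic, so here the four end states share the top rank and the first one to arrive wins. `INSERT OR IGNORE` is SQLite syntax standing in for whatever the SQL Server implementation uses.

```python
import sqlite3

# Assumption: a hypothetical rank per AuditStatus name. The source only says
# the lifecycle is monotonic; the exact ordering is not given.
STATUS_RANK = {
    "Submitted": 0,
    "Forwarded": 1,
    "Attempted": 2,
    "Delivered": 3,
    "Failed": 3,
    "Parked": 3,
    "Discarded": 3,
}


def open_site_calls() -> sqlite3.Connection:
    """Create a cut-down SiteCalls table with only the columns used here."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SiteCalls ("
        " TrackedOperationId TEXT PRIMARY KEY,"
        " SourceSite TEXT NOT NULL,"
        " Status TEXT NOT NULL,"
        " UpdatedAtUtc TEXT NOT NULL)"
    )
    return conn


def ingest(conn, op_id: str, source_site: str, status: str, updated_at_utc: str) -> None:
    """Apply one telemetry record; safe to call repeatedly and out of order."""
    # Step 1: insert-if-not-exists keyed on TrackedOperationId.
    conn.execute(
        "INSERT OR IGNORE INTO SiteCalls"
        " (TrackedOperationId, SourceSite, Status, UpdatedAtUtc)"
        " VALUES (?, ?, ?, ?)",
        (op_id, source_site, status, updated_at_utc),
    )
    # Step 2: upsert-on-newer-status. Only move forward in the lifecycle.
    (current,) = conn.execute(
        "SELECT Status FROM SiteCalls WHERE TrackedOperationId = ?", (op_id,)
    ).fetchone()
    if STATUS_RANK[status] > STATUS_RANK[current]:
        conn.execute(
            "UPDATE SiteCalls SET Status = ?, UpdatedAtUtc = ?"
            " WHERE TrackedOperationId = ?",
            (status, updated_at_utc, op_id),
        )
```

Because a stale or duplicate record never lowers the rank, replaying the same telemetry batch leaves the row unchanged, which is why at-least-once delivery is harmless.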
### Audit Log
- **AuditLog**: The central, append-only audit table owned by the Audit Log component — one row per script-trust-boundary lifecycle event across all channels (outbound API calls, outbound DB writes/reads, notifications, and inbound API requests). Sibling of the `Notifications` and `SiteCalls` tables but distinct: `AuditLog` is the immutable history that observes the other subsystems, not an operational state store. Columns: `EventId` (`uniqueidentifier` primary key — generated at the originator, used as the idempotency key), `OccurredAtUtc` (`datetime2`), `IngestedAtUtc` (`datetime2`), `Channel` (`varchar(32)` — `ApiOutbound` / `DbOutbound` / `Notification` / `ApiInbound`), `Kind` (`varchar(32)` — channel-specific event kind), `CorrelationId` (`uniqueidentifier` NULL — `TrackedOperationId` for cached calls, `NotificationId` for notifications, request-id for inbound API), `SourceSiteId` (`varchar(64)` NULL), `SourceInstanceId` (`varchar(128)` NULL), `SourceScript` (`varchar(128)` NULL), `Actor` (`varchar(128)` NULL), `Target` (`varchar(256)` NULL), `Status` (`varchar(32)` — outcome of *this event*: `Success`, `TransientFailure`, `PermanentFailure`, `Enqueued`, `Retrying`, `Delivered`, `Parked`, `Discarded`), `HttpStatus` (`int` NULL), `DurationMs` (`int` NULL), `ErrorMessage` (`nvarchar(1024)` NULL), `ErrorDetail` (`nvarchar(max)` NULL), `RequestSummary` (`nvarchar(max)` NULL — truncated request payload, headers redacted), `ResponseSummary` (`nvarchar(max)` NULL — truncated response payload), `PayloadTruncated` (`bit`), `Extra` (`nvarchar(max)` NULL — channel-specific JSON for fields not promoted to columns). 
Indexes: `IX_AuditLog_OccurredAtUtc` (primary time-range index for global scans), `IX_AuditLog_Site_Occurred (SourceSiteId, OccurredAtUtc)` (per-site filters), `IX_AuditLog_CorrelationId (CorrelationId)` (drilldown from a single operation), `IX_AuditLog_Channel_Status_Occurred (Channel, Status, OccurredAtUtc)` (KPI / dashboard tiles), and `IX_AuditLog_Target_Occurred (Target, OccurredAtUtc)` ("what did we send to system X"). The primary key on `EventId` enforces idempotency — central ingest is `INSERT … WHERE NOT EXISTS`, so at-least-once telemetry and reconciliation retries collapse to a single row. **Monthly partitioning** on `OccurredAtUtc` from day one via partition function `pf_AuditLog_Month` and partition scheme `ps_AuditLog_Month`, with a filegroup-per-month rollover so that retention purge is a partition switch rather than a row-level delete. The partition-maintenance job that rolls the scheme forward and switches expired partitions is owned by the Audit Log component, not this component. The table is populated only by Audit Log writers (site telemetry, central direct-write, reconciliation pulls); central ingest is **insert-if-not-exists** keyed on `EventId`. See Component-AuditLog.md for the full lifecycle, payload-capture policy, and ingestion paths.
- **AuditLog**: The central, append-only audit table owned by the Audit Log component — one row per script-trust-boundary lifecycle event across all channels (outbound API calls, outbound DB writes/reads, notifications, and inbound API requests). Sibling of the `Notifications` and `SiteCalls` tables but distinct: `AuditLog` is the immutable history that observes the other subsystems, not an operational state store.
**Schema (post-C5 canonical collapse, migration `CollapseAuditLogToCanonical`):** The table stores the 10 canonical `ZB.MOM.WW.Audit.AuditEvent` fields directly as writable columns, plus six read-only server-side computed columns derived from `DetailsJson` via `JSON_VALUE`: five are `PERSISTED`, and `IngestedAtUtc` is not.
*Writable columns (the 10 canonical `AuditEvent` fields):*
- `EventId` (`uniqueidentifier NOT NULL`) — idempotency key; generated at the originator.
- `OccurredAtUtc` (`datetime2(7) NOT NULL`) — UTC timestamp when the audited action occurred at its source.
- `Actor` (`nvarchar(256)` NULL) — authenticated actor; null or empty for system or anonymous actions.
- `Action` (`varchar(64) NOT NULL`) — canonical action verb, format `"{channel}.{kind}"` (e.g. `ApiOutbound.ApiCall`).
- `Outcome` (`varchar(16) NOT NULL`) — normalized outcome: `Success`, `Failure`, or `Denied`.
- `Category` (`varchar(32) NOT NULL`) — canonical `Category` column; for ScadaBridge this equals the channel name (`ApiOutbound`, `DbOutbound`, `Notification`, `ApiInbound`). Mapped to the `AuditChannel` enum in EF as `Channel`.
- `Target` (`nvarchar(256)` NULL) — target of the action: external system name, DB connection name, notification list name, or inbound method name.
- `SourceNode` (`varchar(64)` NULL) — cluster node that emitted the event (e.g. `node-a`, `central-a`); null for reconciled rows from a retired node.
- `CorrelationId` (`uniqueidentifier` NULL) — links related audit rows (e.g. the cached-op lifecycle): `TrackedOperationId` for cached calls, `NotificationId` for notifications.
- `DetailsJson` (`nvarchar(max)` NULL) — JSON extension bag carrying all ScadaBridge domain fields (channel, kind, status, executionId, parentExecutionId, sourceSiteId, sourceInstanceId, sourceScript, httpStatus, durationMs, errorMessage, errorDetail, requestSummary, responseSummary, payloadTruncated, extra, ingestedAtUtc, and others).
*Computed columns (read-only; derived from `DetailsJson` by SQL Server, persisted on INSERT except for `IngestedAtUtc`):*
- `Kind` — `JSON_VALUE(DetailsJson,'$.kind')` PERSISTED — channel-specific event kind (e.g. `ApiCall`, `CachedSubmit`).
- `Status` — `JSON_VALUE(DetailsJson,'$.status')` PERSISTED — `AuditStatus` enum name for this event (e.g. `Submitted`, `Delivered`, `Parked`).
- `SourceSiteId` — `JSON_VALUE(DetailsJson,'$.sourceSiteId')` PERSISTED — site of origin; null for central-direct events.
- `ExecutionId` — `CAST(JSON_VALUE(DetailsJson,'$.executionId') AS uniqueidentifier)` PERSISTED — originating script execution / inbound request id.
- `ParentExecutionId` — `CAST(JSON_VALUE(DetailsJson,'$.parentExecutionId') AS uniqueidentifier)` PERSISTED — spawner's `ExecutionId`; null for top-level runs.
- `IngestedAtUtc` — `CAST(SWITCHOFFSET(CAST(JSON_VALUE(DetailsJson,'$.ingestedAtUtc') AS datetimeoffset), 0) AS datetime2(7))` — central ingest timestamp; **not** persisted (SQL Server rejects PERSISTED on the non-deterministic `SWITCHOFFSET` expression).
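To make the derivation concrete, the following sketch mirrors the six computed-column expressions in plain Python. It is an illustration only, not the server-side logic: `computed_columns` is a hypothetical helper, and it assumes `ingestedAtUtc` is serialized as ISO-8601 with a numeric UTC offset. Missing keys yield `None`, in the same way `JSON_VALUE` yields NULL for an absent path in its default lax mode.

```python
import json
import uuid
from datetime import datetime, timezone


def computed_columns(details_json):
    """Derive the six read-only AuditLog columns from a DetailsJson string."""
    details = json.loads(details_json) if details_json else {}

    def as_uuid(key):
        # Mirrors CAST(JSON_VALUE(...) AS uniqueidentifier); None when absent.
        value = details.get(key)
        return uuid.UUID(value) if value else None

    def as_utc(key):
        # Mirrors SWITCHOFFSET(..., 0) followed by a cast to datetime2:
        # shift to UTC, then drop the offset.
        value = details.get(key)
        if not value:
            return None
        return datetime.fromisoformat(value).astimezone(timezone.utc).replace(tzinfo=None)

    return {
        "Kind": details.get("kind"),
        "Status": details.get("status"),
        "SourceSiteId": details.get("sourceSiteId"),
        "ExecutionId": as_uuid("executionId"),
        "ParentExecutionId": as_uuid("parentExecutionId"),
        "IngestedAtUtc": as_utc("ingestedAtUtc"),
    }
```

A writer therefore only ever sets `DetailsJson`; the query-facing columns follow from it and cannot drift out of sync with the JSON bag.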
*Clustered primary key:* `(EventId, OccurredAtUtc)` — composite so the key is partition-aligned. `UX_AuditLog_EventId` (unique, non-aligned on `[PRIMARY]`) enforces global `EventId` uniqueness for `InsertIfNotExistsAsync` idempotency.
*Indexes* (all non-clustered, partition-aligned on `ps_AuditLog_Month(OccurredAtUtc)` except `UX_AuditLog_EventId`):
- `IX_AuditLog_OccurredAtUtc` (primary time-range index for global scans)
- `IX_AuditLog_Site_Occurred (SourceSiteId, OccurredAtUtc)` (per-site filters)
- `IX_AuditLog_CorrelationId (CorrelationId) WHERE CorrelationId IS NOT NULL` (drilldown from a single operation)
- `IX_AuditLog_Channel_Status_Occurred (Category, Status, OccurredAtUtc)` (KPI / dashboard tiles — note: column is `Category` in SQL, `Channel` in EF)
- `IX_AuditLog_Target_Occurred (Target, OccurredAtUtc) WHERE Target IS NOT NULL` ("what did we send to system X")
- `IX_AuditLog_Execution (ExecutionId)` — unfiltered (SQL Server forbids filtered indexes on computed columns)
- `IX_AuditLog_ParentExecution (ParentExecutionId)` — unfiltered
- `IX_AuditLog_Node_Occurred (SourceNode, OccurredAtUtc)` (per-node queries)
**Monthly partitioning** on `OccurredAtUtc` via partition function `pf_AuditLog_Month` and partition scheme `ps_AuditLog_Month`; retention purge is a partition switch, never a row-level delete. The table is populated only by Audit Log writers (site telemetry, central direct-write, reconciliation pulls); central ingest is **insert-if-not-exists** keyed on `EventId`. See Component-AuditLog.md for the full lifecycle, payload-capture policy, and ingestion paths.
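As a rough illustration of why monthly partitioning makes retention a partition-level operation, the sketch below picks the partitions that fall wholly outside a retention window. The helper names and the three-month retention value in the usage are hypothetical; the actual maintenance job is owned by the Audit Log component and is not described here.

```python
from datetime import datetime


def month_start(ts: datetime) -> datetime:
    """Lower boundary of the monthly partition containing ts."""
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def add_months(boundary: datetime, months: int) -> datetime:
    """Shift a month boundary by a whole number of months (may be negative)."""
    index = boundary.year * 12 + (boundary.month - 1) + months
    return boundary.replace(year=index // 12, month=index % 12 + 1)


def expired_partitions(partition_starts, now: datetime, retention_months: int):
    """Return partition boundaries whose entire month is older than the window.

    A partition starting at p covers [p, p + 1 month). It is eligible for a
    switch-out only when that whole range lies before the cutoff, so no row
    inside the retention window is ever removed.
    """
    cutoff = add_months(month_start(now), -retention_months)
    return [
        p for p in partition_starts
        if add_months(month_start(p), 1) <= cutoff
    ]
```

For example, with partitions for January through June 2024, a current time of 15 June 2024, and an assumed retention of three months, the cutoff is 1 March 2024 and only the January and February partitions qualify.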
### Inbound API
- **API Keys**: Key definitions (name/label, key value, enabled flag).
- **API Methods**: Method definitions (name, approved key references, parameter definitions, return value definitions, implementation script, timeout).
- **API Keys**: The `ApiKeys` SQL Server table was **retired** by migration `RetireInboundApiKeyStore`. Inbound API keys now live in the shared `ZB.MOM.WW.Auth.ApiKeys` SQLite store, not the configuration MS SQL database. The `IInboundApiRepository` no longer exposes any `ApiKey` operations; key management goes through `IInboundApiKeyAdmin` (Commons, `Interfaces/Security/`). The `ApprovedApiKeyIds` column on `ApiMethods` was removed at the same time — key-to-method scoping is now managed by the auth library.
- **API Methods**: Method definitions (name, parameter definitions, return value definitions, implementation script, timeout).
### Security
- **LDAP Group Mappings**: Mappings between LDAP group names and system roles (Admin, Design, Deployment).