Audit — current state: OtOpcUa
Repo: ~/Desktop/OtOpcUa (Gitea lmxopcua). Stack: .NET 10, Akka.NET cluster, EF Core + SQL Server.
All paths below are relative to the repo root. Verified against source on 2026-06-01.
OtOpcUa already has a structured, idempotent audit pipeline: a cluster-broadcast AuditEvent
message, a cluster-singleton writer actor that batches and bulk-inserts, and an append-only
ConfigAuditLog EF entity with two-layer dedup. There is also a second, older write path —
SQL stored procedures that INSERT dbo.ConfigAuditLog directly — so the table has two
producers with slightly different column conventions (see §1).
1. How it works today
Record shape — src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Audit/AuditEvent.cs:9-17:
a sealed record AuditEvent(Guid EventId, string Category, string Action, string Actor, DateTime OccurredAtUtc, string? DetailsJson, NodeId SourceNode, CorrelationId CorrelationId).
NodeId and CorrelationId are Commons value-types — NodeId wraps a string (the logical
cluster node / host name, explicitly not an OPC UA NodeId per its XML doc,
src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/NodeId.cs:3-8); CorrelationId wraps a Guid
(src/Core/ZB.MOM.WW.OtOpcUa.Commons/Types/CorrelationId.cs:3).
Transport — AuditEvent is an Akka message meant to be sent to the AuditWriterActor
cluster singleton (AuditEvent.cs:6 describes it as "cluster-broadcast … consumed by the
AuditWriterActor singleton"). The singleton is registered through Akka.Hosting at
src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/ServiceCollectionExtensions.cs:68-75
(WithSingleton<AuditWriterActorKey>(AuditWriterSingletonName, …)). Any cluster member can
emit an AuditEvent; the singleton is the one sink that persists it.
Storage — EF entity ConfigAuditLog
(src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigAuditLog.cs:7-44): append-only
("Grants revoked for UPDATE/DELETE on all principals", ConfigAuditLog.cs:4-5). Columns:
AuditId (identity PK), Timestamp (default SYSUTCDATETIME()), Principal, EventType,
ClusterId?, NodeId?, GenerationId?, DetailsJson?, EventId? (Guid), CorrelationId?
(Guid). Mapping/constraints in OtOpcUaConfigDbContext.cs:429-463: DetailsJson must be valid
JSON (CK_ConfigAuditLog_DetailsJson_IsJson, line 435-436); Principal/EventType/ClusterId/NodeId
length-capped (lines 441-444); supporting indexes IX_ConfigAuditLog_Cluster_Time (line 449-451)
and IX_ConfigAuditLog_Generation (line 452-454).
Writer / batching — src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs:
a ReceiveActor with FlushBatchSize = 500 (line 25) and FlushInterval = 5s (line 26).
It buffers events in a Dictionary<Guid, AuditEvent> keyed by EventId (line 30), flushing
when the buffer hits 500 (line 60), when the 5s periodic timer fires (PreStart, line 50-53),
or on PreRestart/PostStop (lines 96-107) so a supervisor swap or coordinated shutdown does
not lose the buffer. FlushBuffer (lines 63-93) snapshots and clears the buffer, then for each
event constructs a ConfigAuditLog row (lines 75-84): Timestamp = OccurredAtUtc,
Principal = Actor, EventType = $"{Category}:{Action}", NodeId = SourceNode.Value,
DetailsJson, EventId, CorrelationId = CorrelationId.Value. A failed flush is logged and the
batch is dropped (catch at lines 89-92) — best-effort, no retry/dead-letter.
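The buffering and flush behaviour described above can be modelled in a few lines. This is a language-neutral sketch in Python, not the C# actor: `AuditBuffer`, `to_row`, and the `sink` callback are hypothetical names introduced for illustration, and the 5s periodic timer is represented only by the caller invoking `flush()`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List, Optional
from uuid import UUID

FLUSH_BATCH_SIZE = 500  # mirrors FlushBatchSize from the text


@dataclass(frozen=True)
class AuditEvent:
    event_id: UUID
    category: str
    action: str
    actor: str
    occurred_at_utc: datetime
    details_json: Optional[str]
    source_node: str          # logical host name, not an OPC UA NodeId
    correlation_id: UUID


def to_row(e: AuditEvent) -> dict:
    """Project an event onto the ConfigAuditLog columns named in the text."""
    return {
        "Timestamp": e.occurred_at_utc,
        "Principal": e.actor,
        "EventType": f"{e.category}:{e.action}",
        "NodeId": e.source_node,
        "DetailsJson": e.details_json,
        "EventId": e.event_id,
        "CorrelationId": e.correlation_id,
    }


class AuditBuffer:
    """Hypothetical stand-in for the actor's buffer (assumption, not the real class)."""

    def __init__(self, sink: Callable[[List[dict]], None],
                 batch_size: int = FLUSH_BATCH_SIZE) -> None:
        self._sink = sink
        self._batch_size = batch_size
        self._pending: Dict[UUID, AuditEvent] = {}

    def handle(self, e: AuditEvent) -> None:
        self._pending[e.event_id] = e  # duplicate EventId: last write wins
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Invoked on the size threshold, on the periodic tick, and on stop."""
        if not self._pending:
            return
        snapshot = list(self._pending.values())
        self._pending.clear()  # snapshot and clear before writing
        try:
            self._sink([to_row(e) for e in snapshot])
        except Exception as exc:
            # Best effort: log and drop the batch, no retry or dead-letter.
            print(f"audit flush failed, dropping {len(snapshot)} events: {exc}")
```

Because the buffer is cleared before the write is attempted, a sink failure loses the whole snapshot, which matches the best-effort behaviour noted above.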
Dedup / idempotency (two layers), described at AuditWriterActor.cs:17-21:
- In-buffer: duplicate EventIds within a batch collapse via the dictionary (last-write-wins; HandleEvent, lines 55-61).
- Database: a filtered unique index UX_ConfigAuditLog_EventId (OtOpcUaConfigDbContext.cs:459-462, IsUnique() + HasFilter("[EventId] IS NOT NULL")) gives cross-restart safety: a retry of an already-flushed batch hits the constraint, the duplicate insert is dropped, and the rest of the batch survives. EventId/CorrelationId are nullable so legacy/backfill rows (NULL) don't collide, as confirmed in the entity XML (ConfigAuditLog.cs:33-43) and migration Migrations/20260526105027_AddConfigAuditLogEventIdColumns.cs:26-31.
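The database layer of the dedup can be illustrated with a small, runnable model. The sketch below uses SQLite's partial unique index and `INSERT OR IGNORE` as a stand-in for the SQL Server filtered index. That substitution is an assumption made for illustration: the source does not show how the actor skips the duplicate row on SQL Server, and `make_db` / `insert_batch` are hypothetical helpers.

```python
import sqlite3
import uuid


def make_db() -> sqlite3.Connection:
    """In-memory table with a unique index that ignores NULL EventIds."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ConfigAuditLog (
            AuditId   INTEGER PRIMARY KEY AUTOINCREMENT,
            EventType TEXT NOT NULL,
            EventId   TEXT NULL
        );
        CREATE UNIQUE INDEX UX_ConfigAuditLog_EventId
            ON ConfigAuditLog (EventId) WHERE EventId IS NOT NULL;
        """
    )
    return conn


def count_rows(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM ConfigAuditLog").fetchone()[0]


def insert_batch(conn: sqlite3.Connection, rows) -> int:
    """Insert (EventType, EventId) rows; return how many were actually written.

    Rows whose EventId already exists are skipped; the rest of the batch lands.
    """
    before = count_rows(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO ConfigAuditLog (EventType, EventId) VALUES (?, ?)",
        rows,
    )
    conn.commit()
    return count_rows(conn) - before


if __name__ == "__main__":
    db = make_db()
    replayed = str(uuid.uuid4())
    insert_batch(db, [("Config:Edit", replayed), ("Published", None)])
    # Replay of an already-flushed event plus a legacy NULL row and a new event.
    insert_batch(db, [("Config:Edit", replayed), ("Published", None),
                      ("Config:Edit", str(uuid.uuid4()))])
    print(count_rows(db))
```

In this model the replayed EventId is skipped while both NULL rows are kept, since rows with a NULL EventId fall outside the index.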
Scope (two producers, two conventions):
- Akka AuditEvent path (the structured one): config writes + authorization checks. The EventType vocabulary lives in the entity XML doc (ConfigAuditLog.cs:18): DraftCreated | DraftEdited | Published | RolledBack | NodeApplied | CredentialAdded | CredentialDisabled | ClusterCreated | NodeAdded | ExternalIdReleased | CrossClusterNamespaceAttempt | OpcUaAccessDenied | …. Note the access-denied / cross-cluster entries are authz-check events, not config writes.
- SQL stored-procedure path (older, still present): several SPs INSERT dbo.ConfigAuditLog directly, e.g. Published / RolledBack / NodeApplied / ExternalIdReleased / CrossClusterNamespaceAttempt in Migrations/20260417215224_StoredProcedures.cs:151,217,351,407,504. These use SUSER_SNAME() as Principal, set ClusterId/GenerationId, write a bare EventType (no Category:Action split), and leave EventId/CorrelationId NULL.
Query / UI — the only read surface is the Admin UI page
src/Server/ZB.MOM.WW.OtOpcUa.AdminUI/Components/Pages/Clusters/ClusterAudit.razor
(@page "/clusters/{ClusterId}/audit", [Authorize], lines 1-2). It reads the latest
PageSize = 200 rows (line 69) filtered by ClusterId, newest-first (OnInitializedAsync,
lines 74-82), and renders Timestamp / Principal / Event(Type) / Node / Correlation(first 8 hex) /
Details columns (lines 38-58). Tested in
tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests/AuditWriterActorTests.cs: count-threshold
flush (lines 26-41), in-buffer dedup of duplicate EventIds (lines 45-62), PostStop flush
(lines 66-81), and the column mapping incl. EventType == "Config:Edit" and NodeId == "node-a"
(lines 85-104).
Load-bearing gotcha: the actor path never sets ClusterId (lines 75-84), but the UI filters on ClusterId (ClusterAudit.razor:78). So today the cluster-scoped view surfaces the stored-procedure rows; structured AuditEvent rows written by the actor (which carry the host in NodeId, not ClusterId) won't appear under a cluster. Worth flagging during normalization.
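To make the mismatch concrete, here is a minimal model of the page query run against one row from each producer. It is a sketch only: `cluster_audit_page` and the sample rows are hypothetical, and the real page uses EF Core rather than an in-memory list.

```python
from datetime import datetime

PAGE_SIZE = 200  # mirrors PageSize in ClusterAudit.razor


def cluster_audit_page(rows, cluster_id, page_size=PAGE_SIZE):
    """Latest rows for one cluster, newest first (model of the UI query)."""
    matching = [r for r in rows if r["ClusterId"] == cluster_id]
    matching.sort(key=lambda r: r["Timestamp"], reverse=True)
    return matching[:page_size]


SAMPLE_ROWS = [
    # Stored-procedure convention: ClusterId set, bare EventType, EventId NULL.
    {"Timestamp": datetime(2026, 6, 1, 10, 0), "EventType": "Published",
     "ClusterId": "cluster-1", "NodeId": None, "EventId": None},
    # Actor convention: ClusterId never set, host name in NodeId.
    {"Timestamp": datetime(2026, 6, 1, 10, 5), "EventType": "Config:Edit",
     "ClusterId": None, "NodeId": "node-a", "EventId": "e1"},
]

if __name__ == "__main__":
    for row in cluster_audit_page(SAMPLE_ROWS, "cluster-1"):
        print(row["EventType"])
```

Only the stored-procedure row passes the filter; the actor-written row is excluded even though it is newer.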
2. Mapping to the canonical AuditEvent
Target = ZB.MOM.WW.Audit.AuditEvent (built in parallel). OtOpcUa's existing AuditEvent is
already almost field-for-field aligned; the only synthesized field is Outcome.
| Canonical field | OtOpcUa source | Mapping |
|---|---|---|
| Guid EventId | AuditEvent.EventId | Direct. Already the idempotency key (buffer key + UX_ConfigAuditLog_EventId). |
| DateTimeOffset OccurredAtUtc | AuditEvent.OccurredAtUtc (DateTime) | Direct; widen DateTime (UTC) → DateTimeOffset. |
| string Actor | AuditEvent.Actor | Direct (→ ConfigAuditLog.Principal). At Auth adoption this becomes the ZB.MOM.WW.Auth principal. |
| string Action | AuditEvent.Action (+ Category) | Direct. Today persisted as "{Category}:{Action}" in EventType; canonical keeps Action and Category separate. |
| AuditOutcome Outcome | (none) | Derived from the EventType vocabulary, not stored today. OpcUaAccessDenied/CrossClusterNamespaceAttempt → Denied; the config-write verbs → Success. No explicit Failure value exists yet (a failed flush is dropped, not recorded as an event). |
| string? Category | AuditEvent.Category | Direct (e.g. "Config"). |
| string? Target | (none) | No dedicated field today; the closest is SourceNode→NodeId (the acting host) or details. Leave null or carry the affected object in DetailsJson. |
| string? SourceNode | AuditEvent.SourceNode (NodeId.Value) | Direct: the logical cluster node / host name (NOT an OPC UA NodeId). Currently lands in ConfigAuditLog.NodeId. |
| Guid? CorrelationId | AuditEvent.CorrelationId (CorrelationId.Value) | Direct. |
| string? DetailsJson | AuditEvent.DetailsJson | Direct; carries everything else (incl. ClusterId/GenerationId, which today are separate columns on the SP path). |
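The mapping in the table can be sketched as a single conversion function. This is a Python model under stated assumptions, not the canonical library: `CanonicalAuditEvent`, `derive_outcome`, and `to_canonical` are hypothetical names, the canonical type is still being built, and matching on the last segment after `:` is an assumption made so that both bare and Category:Action EventTypes work.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional
from uuid import UUID


class AuditOutcome(Enum):
    SUCCESS = "Success"
    DENIED = "Denied"  # no Failure value exists yet, per the table


# The two authz-check verbs the table maps to Denied.
DENIED_VERBS = frozenset({"OpcUaAccessDenied", "CrossClusterNamespaceAttempt"})


def derive_outcome(event_type: str) -> AuditOutcome:
    """Derive Outcome from a bare verb or a 'Category:Action' string."""
    verb = event_type.rsplit(":", 1)[-1]
    return AuditOutcome.DENIED if verb in DENIED_VERBS else AuditOutcome.SUCCESS


@dataclass(frozen=True)
class CanonicalAuditEvent:
    event_id: UUID
    occurred_at_utc: datetime  # timezone-aware, UTC
    actor: str
    action: str
    outcome: AuditOutcome
    category: Optional[str] = None
    target: Optional[str] = None  # no OtOpcUa source today
    source_node: Optional[str] = None
    correlation_id: Optional[UUID] = None
    details_json: Optional[str] = None


def to_canonical(event_id: UUID, category: str, action: str, actor: str,
                 occurred_at_utc: datetime, details_json: Optional[str],
                 source_node: str, correlation_id: UUID) -> CanonicalAuditEvent:
    """Map the existing record's fields onto the canonical shape."""
    if occurred_at_utc.tzinfo is None:
        # Widen a naive UTC timestamp to an offset-aware one.
        occurred_at_utc = occurred_at_utc.replace(tzinfo=timezone.utc)
    return CanonicalAuditEvent(
        event_id=event_id,
        occurred_at_utc=occurred_at_utc,
        actor=actor,
        action=action,
        outcome=derive_outcome(action),
        category=category,
        target=None,
        source_node=source_node,
        correlation_id=correlation_id,
        details_json=details_json,
    )
```

Keeping the derivation in one function means the Denied set lives in one place and does not have to be repeated at every emit site.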
3. Adoption plan → ZB.MOM.WW.Audit
Effort: medium. OtOpcUa is the donor design for the canonical record, so most of the work is re-pointing types and bridging two persistence conventions, not redesigning the pipeline.
Replace with the shared library:
- Commons/Messages/Audit/AuditEvent.cs → the canonical ZB.MOM.WW.Audit.AuditEvent. Add the new Outcome field (derive it at every emit site from the EventType vocabulary, e.g. OpcUaAccessDenied → Denied); keep Category/Action/SourceNode/CorrelationId as-is. Decide whether SourceNode/CorrelationId carry the Commons value-types or the canonical primitives at the seam (likely a thin adapter at construction).
- AuditWriterActor → implement the library's IAuditWriter (keep the actor as OtOpcUa's Akka-cluster-singleton transport/batching adapter behind that seam; the 500/5s batching, PreRestart/PostStop flush, and two-layer dedup stay bespoke per §"left per-project").
Keep bespoke (thin adapter only):
- Transport: the cluster-broadcast → singleton AuditWriterActor, batching, and flush triggers.
- Storage: the ConfigAuditLog EF entity, indexes, and UX_ConfigAuditLog_EventId idempotency index. Map the canonical record onto the existing columns; add an Outcome column (or fold it into EventType/DetailsJson if a schema change is undesirable). ClusterId/GenerationId remain OtOpcUa-specific columns fed via DetailsJson or kept as side columns.
- Domain vocabulary: the EventType strings (DraftCreated, Published, OpcUaAccessDenied, …) and the Category:Action composition convention.
- Query/UI: ClusterAudit.razor and its ClusterId filter.
Reconcile, not extract:
- The two producers (Akka AuditEvent path vs. SQL stored-procedure INSERTs using SUSER_SNAME()). The SP path bypasses the canonical record entirely and writes a different column convention (bare EventType, NULL EventId/CorrelationId, populated ClusterId/GenerationId). Adopting the library does not by itself unify these; either route the SP events through the actor or accept that SP rows stay non-idempotent and absent from the EventId dedup guarantee. Flag for the normalization spec.
- The ClusterId-filter / actor-never-sets-ClusterId mismatch noted in §1: fix when the query surface is normalized so structured AuditEvent rows are discoverable by cluster.