scadalink-design/Component-SiteEventLogging.md
Joseph Doherty 70e5ae33d5 Refine remaining components: Deployment Manager, Central UI, Site Event Logging, S&F
Deployment Manager: add deployment concurrency rules (block same-instance, allow
parallel different-instance), per-site artifact deployment status, current-only
status persistence.

Central UI: specify Blazor Server framework, real-time push updates via SignalR
for debug view, health dashboard, and deployment status.

Site Event Logging: daily retention purge, paginated queries with 500-event default,
keyword search on message/source fields.

Store-and-Forward: clarify async best-effort replication to standby with acceptable
trade-offs on failover.
2026-03-16 08:48:33 -04:00

Component: Site Event Logging

Purpose

The Site Event Logging component records operational events at each site cluster, providing a local audit trail of runtime activity. Events are queryable from the central UI for remote troubleshooting.

Location

Site clusters (event recording and storage). Central cluster (remote query access via UI).

Responsibilities

  • Record operational events from all site subsystems.
  • Persist events to local SQLite.
  • Enforce 30-day retention policy with automatic purging.
  • Respond to remote queries from central for event log data.

Events Logged

| Category | Events |
|---|---|
| Script Executions | Script started, completed, failed (with error details), recursion limit exceeded |
| Alarm Events | Alarm activated, alarm cleared (which alarm, which instance), alarm evaluation error |
| Deployment Events | Configuration received from central, scripts compiled, applied successfully, apply failed |
| Data Connection Status | Connected, disconnected, reconnected (per connection) |
| Store-and-Forward | Message queued, delivered, retried, parked |
| Instance Lifecycle | Instance enabled, disabled, deleted |

Event Entry Schema

Each event entry contains:

  • Timestamp: When the event occurred.
  • Event Type: Category of the event (script, alarm, deployment, connection, store-and-forward, instance-lifecycle).
  • Severity: Info, Warning, or Error.
  • Instance ID (optional): The instance associated with the event (if applicable).
  • Source: The subsystem that generated the event (e.g., "ScriptActor:MonitorSpeed", "AlarmActor:OverTemp", "DataConnection:PLC1").
  • Message: Human-readable description of the event.
  • Details (optional): Additional structured data (e.g., exception stack trace, alarm name, message ID, compilation errors).
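
The fields above could map onto a SQLite table roughly as follows. This is a minimal sketch, not the actual schema: the table name `site_events`, the column names, the ISO 8601 text timestamps, and the helpers `open_event_log` and `record_event` are all assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

# Hypothetical table and column names; the design only lists the fields.
SCHEMA = """
CREATE TABLE IF NOT EXISTS site_events (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp   TEXT NOT NULL,  -- assumed ISO 8601, UTC
    event_type  TEXT NOT NULL CHECK (event_type IN (
        'script', 'alarm', 'deployment', 'connection',
        'store-and-forward', 'instance-lifecycle')),
    severity    TEXT NOT NULL CHECK (severity IN ('Info', 'Warning', 'Error')),
    instance_id TEXT,           -- optional
    source      TEXT NOT NULL,
    message     TEXT NOT NULL,
    details     TEXT            -- optional, e.g. a JSON string or stack trace
);
CREATE INDEX IF NOT EXISTS ix_site_events_timestamp ON site_events (timestamp);
"""


@dataclass(frozen=True)
class SiteEvent:
    """One event entry, mirroring the fields listed in the schema section."""
    timestamp: str
    event_type: str
    severity: str
    source: str
    message: str
    instance_id: Optional[str] = None
    details: Optional[str] = None


def open_event_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the event log database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_event(conn: sqlite3.Connection, event: SiteEvent) -> int:
    """Insert one event and return its row id."""
    cur = conn.execute(
        "INSERT INTO site_events "
        "(timestamp, event_type, severity, instance_id, source, message, details) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (event.timestamp, event.event_type, event.severity, event.instance_id,
         event.source, event.message, event.details),
    )
    conn.commit()
    return cur.lastrowid
```

The `CHECK` constraints reject event types or severities outside the documented sets, so a misbehaving caller fails at insert time rather than polluting the log.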

Storage

  • Events are stored in local SQLite on each site node.
  • Each node maintains its own event log (the active node generates events; the standby node generates minimal events related to replication).
  • Retention: 30 days. A daily background job runs on the active node and deletes all events older than 30 days. This is a hard delete; purged events are not archived.
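
The daily purge could look something like the sketch below. It assumes a hypothetical `site_events` table whose `timestamp` column holds ISO 8601 UTC strings in a single consistent format, so that string comparison matches chronological order. The function name `purge_expired_events` and the scheduling of the job are not specified by the design and are left out or invented here.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION_DAYS = 30  # retention period stated in the design


def purge_expired_events(conn: sqlite3.Connection,
                         now: Optional[datetime] = None) -> int:
    """Hard-delete events older than the retention period.

    Returns the number of rows deleted. `now` is injectable for testing.
    Assumes timestamps were written with isoformat(timespec="seconds") in UTC.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=RETENTION_DAYS)).isoformat(timespec="seconds")
    cur = conn.execute("DELETE FROM site_events WHERE timestamp < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

An event exactly 30 days old is kept under this comparison; only strictly older events are removed.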

Central Access

  • The central UI can query site event logs remotely via the Communication Layer.
  • Queries support filtering by:
    • Event type / category
    • Time range
    • Instance ID
    • Severity
    • Keyword search: Free-text search on message and source fields (SQLite LIKE query). Useful for finding events by script name, alarm name, or error message across all instances.
  • Results are paginated with a configurable page size (default: 500 events). Each response includes a continuation token for fetching additional pages. This prevents broad queries from overwhelming the communication channel.
  • The site processes the query locally against SQLite and returns matching results to central.
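
As an illustration of the filtering and pagination described above, here is a minimal sketch of a site-side query handler. The table layout, the function name `query_events`, and the choice of the last returned row id as the continuation token (keyset pagination, oldest first) are assumptions; the design only states that a token is returned.

```python
import sqlite3
from typing import List, Optional, Tuple

DEFAULT_PAGE_SIZE = 500  # default page size stated in the design

# Hypothetical table layout used by this sketch.
DEMO_SCHEMA = """
CREATE TABLE IF NOT EXISTS site_events (
    id INTEGER PRIMARY KEY, timestamp TEXT, event_type TEXT, severity TEXT,
    instance_id TEXT, source TEXT, message TEXT);
"""


def _escape_like(text: str) -> str:
    """Escape LIKE wildcards so the keyword is matched literally."""
    return text.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def query_events(conn: sqlite3.Connection, *,
                 event_type: Optional[str] = None,
                 severity: Optional[str] = None,
                 instance_id: Optional[str] = None,
                 start: Optional[str] = None,
                 end: Optional[str] = None,
                 keyword: Optional[str] = None,
                 page_size: int = DEFAULT_PAGE_SIZE,
                 continuation_token: Optional[str] = None
                 ) -> Tuple[List[tuple], Optional[str]]:
    """Return one page of matching events plus a token for the next page."""
    clauses, params = [], []
    for column, value in (("event_type", event_type),
                          ("severity", severity),
                          ("instance_id", instance_id)):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    if start is not None:              # inclusive lower bound
        clauses.append("timestamp >= ?")
        params.append(start)
    if end is not None:                # exclusive upper bound
        clauses.append("timestamp < ?")
        params.append(end)
    if keyword:                        # free-text search on message and source
        pattern = f"%{_escape_like(keyword)}%"
        clauses.append("(message LIKE ? ESCAPE '\\' OR source LIKE ? ESCAPE '\\')")
        params += [pattern, pattern]
    if continuation_token is not None:  # resume after the last row already sent
        clauses.append("id > ?")
        params.append(int(continuation_token))

    where = f"WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = ("SELECT id, timestamp, event_type, severity, instance_id, source, message "
           f"FROM site_events {where} ORDER BY id LIMIT ?")
    # Fetch one extra row to learn whether another page exists.
    rows = conn.execute(sql, (*params, page_size + 1)).fetchall()
    page = rows[:page_size]
    next_token = str(page[-1][0]) if len(rows) > page_size else None
    return page, next_token
```

A caller would pass the returned token back unchanged with the same filters until it comes back as `None`. All user-supplied values go through bound parameters, and the keyword is escaped so that `%` or `_` in a search term is not treated as a wildcard.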

Dependencies

  • SQLite: Local storage on each site node.
  • Communication Layer: Handles remote query requests from central.
  • Site Runtime: Generates script execution events, alarm events, deployment application events, and instance lifecycle events.
  • Data Connection Layer: Generates connection status events.
  • Store-and-Forward Engine: Generates buffer activity events.

Interactions

  • All site subsystems: Event logging is a cross-cutting concern. Any subsystem that produces notable events calls the Event Logging service.
  • Communication Layer: Receives remote queries from central and returns results.
  • Central UI: Site Event Log Viewer displays queried events.
  • Health Monitoring: Script error rates and alarm evaluation error rates can be derived from event log data.
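
One way Health Monitoring might derive an error rate from the log is sketched below. It assumes a hypothetical `site_events` table, that script failures and alarm evaluation errors are recorded with severity `Error`, and that "rate" means errors per minute over a trailing window. None of these details are fixed by the design.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


def error_rate_per_minute(conn: sqlite3.Connection,
                          event_type: str,
                          window_minutes: int = 5,
                          now: Optional[datetime] = None) -> float:
    """Average number of Error events per minute over the trailing window.

    Assumes timestamps are ISO 8601 UTC strings written with
    isoformat(timespec="seconds"), so string comparison is chronological.
    """
    now = now or datetime.now(timezone.utc)
    since = (now - timedelta(minutes=window_minutes)).isoformat(timespec="seconds")
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM site_events "
        "WHERE event_type = ? AND severity = 'Error' AND timestamp >= ?",
        (event_type, since),
    ).fetchone()
    return count / window_minutes
```

Calling it with `"script"` or `"alarm"` would give the two rates mentioned above, under the stated assumptions.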