- connection-name capable-set comparer kept as StringComparer.Ordinal:
FlatteningService and SemanticValidator use all-ordinal name-keyed
dictionaries throughout; OrdinalIgnoreCase would be inconsistent with
the rest of the binding-resolution path — added comment documenting this
- IsAlarmCapable protocol-match confirmed consistent with DataConnectionFactory
(both OrdinalIgnoreCase); added case-insensitive InlineData variants
(OPCUA, opcua, mxgateway, MXGATEWAY) to lock the contract
- clarified FlatteningPipeline comment: "filters connections by alarm-capable
protocol, then collects their names" (was "maps from the protocol string")
- added DataConnectionLayer/DataConnectionFactory.cs path reference to
AlarmCapableProtocols sync-risk comment
FlatteningPipeline loaded data connections but never passed the alarm-capable
connection set to SemanticValidator, so the native-alarm-source capability check
(built but inert) never ran — a source bound to a non-alarm-capable connection
deployed silently. Compute the capable set (IAlarmSubscribableConnection: OPC UA
+ MxGateway) and thread it through ValidationService to SemanticValidator.
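As a rough illustration of the capable-set computation described above, here is a minimal Python sketch (not the C# implementation). The `DataConnection` shape, the protocol keys, and the function names are assumptions made for the example; only the rules come from the commit text: protocol matching is case-insensitive, connection names are compared case-sensitively, and a source bound to a non-capable connection must fail validation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

# Protocol keys are illustrative; the source names OPC UA and MxGateway as alarm-capable.
ALARM_CAPABLE_PROTOCOLS = frozenset({"opcua", "mxgateway"})


@dataclass(frozen=True)
class DataConnection:
    """Hypothetical stand-in for a loaded data connection."""
    name: str
    protocol: str


def is_alarm_capable(protocol: str) -> bool:
    # Protocol match is case-insensitive (OrdinalIgnoreCase in the source).
    return protocol.lower() in ALARM_CAPABLE_PROTOCOLS


def alarm_capable_names(connections: Iterable[DataConnection]) -> FrozenSet[str]:
    # Filter connections by alarm-capable protocol, then collect their names.
    # Names stay case-sensitive (StringComparer.Ordinal in the source).
    return frozenset(c.name for c in connections if is_alarm_capable(c.protocol))


def check_native_alarm_source(connection_name: str, capable: FrozenSet[str]) -> Optional[str]:
    # Return a validation error message, or None when the binding is acceptable.
    if connection_name in capable:
        return None
    return f"connection '{connection_name}' is not alarm-capable"


connections = [
    DataConnection("PlantOpc", "OPCUA"),
    DataConnection("Gateway1", "mxgateway"),
    DataConnection("Legacy", "other"),
]
capable = alarm_capable_names(connections)
```

In this sketch a binding to "Legacy" yields an error, and so does "plantopc", because name lookup is deliberately case-sensitive.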
The actual drift was NOT OccurredAtUtc's converter (a same-CLR-type
DateTime->DateTime ValueConverter emits no snapshot annotation and never
triggers PendingModelChangesWarning). The real pending change was a HasData
seed row: SecurityConfiguration adds LdapGroupMapping Id=5 (SCADA-Viewers ->
Viewer) but the model snapshot omitted it, so MsSqlMigrationFixture's
MigrateAsync threw PendingModelChangesWarning and failed every fixture-backed
AuditLog MSSQL test (~57).
Generated via `dotnet ef migrations add`; Up/Down are seed-data DML only
(InsertData/DeleteData of the single reference row) -- no schema DDL. The
snapshot now carries the Id=5 seed and has-pending-model-changes is clean.
Closes the Tier-1 silent gaps from the stillpending.md audit (#3-#6):
- AuditLog 365-day purge actor + reconciliation self-heal now actually
start and run on the central node (previously dead code).
- SiteCall reconciliation pull (new PullSiteCalls RPC + plumbing) + daily
terminal-row purge scheduler.
- Site Event Logging now emits all 5 previously-missing categories
(alarm, deployment, instance_lifecycle, store_and_forward, notification),
plus the script started/completed events.
14 commits, each taken through implement->review->fix. Build: 0 warnings/
0 errors; cluster verified healthy with the new singletons starting cleanly
(bash docker/deploy.sh).
ScriptExecutionActor previously emitted only an Error 'script' event on failure.
It now also fire-and-forgets an Info 'script' event when execution starts (right
before RunAsync) and when it completes successfully — giving the operational log
the full started/completed/failed lifecycle. Uses the already-resolved
siteEventLogger; fire-and-forget so the event log can never block or fault the
script's own run.
Extends the SingleServiceProvider test helper to also serve IServiceScopeFactory
(returning a self-scope) so ScriptExecutionActor's serviceProvider.CreateScope()
reaches the logging hot path in tests instead of throwing into the catch.
StoreAndForwardService gains an optional ISiteEventLogger? ctor param (default
null so the many direct-construction tests still compile) and, when wired,
mirrors its own buffer/retry/park activity onto site operational events via the
existing OnActivity hook (which already isolates a throwing subscriber, so a
failing event log can never be misclassified as a transient delivery failure):
- store_and_forward (ExternalSystem / CachedDbWrite): queued/retried/delivered/
parked. Warning on buffer/retry, Error on park, Info on retry-recovery; an
immediate-success delivery is the hot path and is not logged.
- notification (the site forward-to-central path): logged ONLY on forward
FAILURE (buffered after the immediate forward threw) and on park, per the
Component-SiteEventLogging spec — routine enqueue and forward-success are
deliberately not logged (central's Notifications table is the audit record).
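The logging rules in the two bullets above can be read as a small lookup. The Python sketch below is only an illustration: the activity labels are hypothetical names for the buffer/retry/park hook, and notification severities are left out because the commit text does not state them.

```python
from typing import Optional

# Activity labels are hypothetical; severities follow the store_and_forward bullet.
STORE_AND_FORWARD_SEVERITY = {
    "buffered": "Warning",
    "retried": "Warning",
    "parked": "Error",
    "delivered_after_retry": "Info",
    # "delivered_immediately" is deliberately absent: hot path, not logged.
}

# Notification events are logged only on forward failure and on park.
NOTIFICATION_LOGGED = frozenset({"buffered_after_forward_failure", "parked"})


def store_and_forward_severity(activity: str) -> Optional[str]:
    # None means "do not emit a site event for this activity".
    return STORE_AND_FORWARD_SEVERITY.get(activity)


def notification_is_logged(activity: str) -> bool:
    # Routine enqueue and forward-success are not logged.
    return activity in NOTIFICATION_LOGGED
```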
Wired through AddStoreAndForward (resolves ISiteEventLogger optionally from DI);
StoreAndForward project now references SiteEventLogging (acyclic: SiteEventLogging
references only Commons). Also documents the 'notification' category on the
ISiteEventLogger.LogEventAsync eventType param (folds in M1.8 doc fix).
DeploymentManagerActor now fire-and-forgets a 'deployment' site operational
event on deploy/enable/disable/delete outcomes (Info on success, Error on
failure), source 'DeploymentManagerActor'. The disable/delete events are emitted
from the existing PipeTo continuations (safe: reads only the immutable
_serviceProvider and fire-and-forgets).
InstanceActor now emits an 'instance_lifecycle' Info event in PreStart (started)
and a new PostStop (stopped) — covering start/stop/enable/disable/redeploy/
failover transitions from the instance's own vantage point. Both actors already
hold _serviceProvider; no ctor change.
Resolution is optional and LogEventAsync is fire-and-forget so a logging failure
never affects the deployment pipeline or instance lifecycle.
AlarmActor (computed) and NativeAlarmActor (native mirror) now fire-and-forget
an 'alarm' site operational event on every state transition:
- raise/activate: Error (priority/severity >= 700) or Warning
- clear/return-to-normal, ack, inter-band transition: Info
Both actors take a new optional IServiceProvider? ctor param (default null so
existing direct-construction tests still compile); InstanceActor passes its
_serviceProvider at the two Props.Create sites. Resolution is optional and the
LogEventAsync call is fire-and-forget, so a logging failure never affects alarm
evaluation. Rehydration replays are not re-logged.
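For reference, the severity rule above can be written as a small function. This is a Python sketch with hypothetical transition labels; only the 700 threshold and the Error/Warning/Info split come from the commit text.

```python
ERROR_PRIORITY_THRESHOLD = 700  # priority/severity at or above this logs as Error

RAISE_TRANSITIONS = frozenset({"raise", "activate"})
INFO_TRANSITIONS = frozenset({"clear", "return_to_normal", "ack", "band_transition"})


def alarm_event_severity(transition: str, priority: int) -> str:
    """Map an alarm state transition to the severity of its 'alarm' site event."""
    if transition in RAISE_TRANSITIONS:
        return "Error" if priority >= ERROR_PRIORITY_THRESHOLD else "Warning"
    if transition in INFO_TRANSITIONS:
        return "Info"
    raise ValueError(f"unknown transition: {transition}")
```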
Adds a capturing FakeSiteEventLogger test helper + SingleServiceProvider.
Add a daily purge tick to SiteCallAuditActor that drops terminal SiteCalls
rows older than the retention window via ISiteCallAuditRepository.PurgeTerminalAsync.
The threshold is computed each tick as UtcNow - RetentionDays so an operator who
lowers RetentionDays sees it on the next purge without a restart. Mirrors
AuditLogPurgeActor's daily cadence + continue-on-error posture: a purge fault is
logged and swallowed so the central singleton stays alive and retries next tick.
The purge timer is started in PreStart alongside the reconciliation timer and
gates on the same collaborators (pull client + enumerator) being available — the
repo-only test ctor injects neither, so neither background timer runs there.
Options: PurgeInterval (default 24h, clamped >= 1 min so a zero config value
can't spin the scheduler) + RetentionDays (default 365), plus a test-only
override that bypasses the clamp for millisecond cadences.
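A minimal Python sketch of the purge tick's three rules (per-tick threshold, interval clamp, continue-on-error) follows. Function and parameter names are hypothetical; the actual actor is C# and schedules itself through Akka timers, which this sketch does not model.

```python
from datetime import datetime, timedelta, timezone

MIN_PURGE_INTERVAL = timedelta(minutes=1)


def effective_purge_interval(configured, test_override=None):
    # The test-only override bypasses the clamp so tests can tick in milliseconds.
    if test_override is not None:
        return test_override
    # Clamp so a zero (or tiny) configured value cannot spin the scheduler.
    return max(configured, MIN_PURGE_INTERVAL)


def purge_threshold(now_utc, retention_days):
    # Recomputed on every tick, so a lowered retention applies on the next purge.
    return now_utc - timedelta(days=retention_days)


def purge_tick(purge_terminal, get_retention_days, now_utc, log):
    """Run one purge; a fault is logged and swallowed so the next tick still runs."""
    try:
        return purge_terminal(purge_threshold(now_utc, get_retention_days()))
    except Exception as ex:
        log(f"purge failed: {ex}")
        return 0
```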
Tests (all in-memory, no live MSSQL): purge tick calls PurgeTerminalAsync with a
UtcNow - RetentionDays threshold (non-default 30 days); default retention yields
a 365-day threshold; a throwing repo does not kill the singleton (a second tick
still arrives).
Add a periodic reconciliation tick to SiteCallAuditActor that, per site,
pulls changed SiteCall rows since a per-site UpdatedAtUtc cursor and upserts
them idempotently (monotonic UpsertAsync) — the documented self-heal for lost
best-effort gRPC telemetry. Mirrors SiteAuditReconciliationActor's structure
(per-site cursor, per-site try/catch failure isolation, advance cursor by max
observed UpdatedAtUtc) minus the stalled-detection EventStream machinery.
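The per-site loop described above can be sketched as follows. This is an illustrative Python model, not the actor: `SiteCallRow`, `upsert_monotonic`, and the `pull(site, since, batch_size)` callable are assumptions standing in for the repository and the pull client.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumed starting cursor


@dataclass(frozen=True)
class SiteCallRow:
    id: str
    status: str
    updated_at_utc: datetime


def upsert_monotonic(store: Dict[str, SiteCallRow], row: SiteCallRow) -> None:
    # Idempotent: a row is applied only when it is newer than what is stored.
    current = store.get(row.id)
    if current is None or row.updated_at_utc > current.updated_at_utc:
        store[row.id] = row


def reconcile_tick(sites: List[str], cursors: Dict[str, datetime],
                   pull: Callable, store: Dict[str, SiteCallRow],
                   batch_size: int = 500) -> List[str]:
    """Pull and upsert changed rows per site; return the sites that failed."""
    failed = []
    for site in sites:
        try:
            rows = pull(site, cursors.get(site, EPOCH), batch_size)
            for row in rows:
                upsert_monotonic(store, row)
            if rows:
                # Advance the cursor to the newest timestamp observed for this site.
                cursors[site] = max(r.updated_at_utc for r in rows)
        except Exception:
            # Failure isolation: one failing site must not sink the others.
            failed.append(site)
    return failed
```

Because the upsert is idempotent, re-pulling a row that sits exactly on the cursor is harmless.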
Dependency wiring: add an acyclic SiteCallAudit -> AuditLog project reference
and resolve IPullSiteCallsClient + ISiteEnumerator (central-only singletons
registered by AddAuditLogCentralReconciliationClient) from the IServiceProvider
the production ctor already holds — no Host Props.Create change needed. The
repo-only test ctor injects neither collaborator, so the tick is gated off
there. A new public test ctor injects fake client + enumerator + repo so the
tick is unit-testable in-memory (public, not internal: Akka's ActivatorProducer
uses public-only reflection binding).
Options: ReconciliationInterval (default 5 min, clamped >= 1s so a zero config
value can't spin the scheduler) + ReconciliationBatchSize (default 500), plus a
test-only override that bypasses the clamp for millisecond cadences.
Tests (all in-memory, no live MSSQL): absent row is upserted on a tick; second
tick advances the cursor past already-pulled rows; one failing site does not
sink other sites; repo-only ctor does not start the tick.
Site Call Audit (#22): build the documented periodic reconciliation PULL
self-heal path for the eventually-consistent central SiteCalls mirror, as a
dedicated PullSiteCalls gRPC RPC kept separate from the audit pull. This is the
pull PLUMBING only; the central reconciliation tick is a separate follow-up.
- IOperationTrackingStore.ReadChangedSinceAsync(sinceUtc, batchSize): inclusive
UpdatedAtUtc cursor, oldest-first, batch-capped; SQLite impl projects tracking
rows onto SiteCallOperational (Kind->Channel, TargetSummary->Target, SourceSite
left empty - the store has no site-id column).
- sitestream.proto: rpc PullSiteCalls + PullSiteCallsRequest/Response, mirroring
PullAuditEvents; regenerated checked-in SiteStreamGrpc/*.cs.
- SiteCallDtoMapper.ToDto(SiteCallOperational): inverse of FromDto for the handler.
- SiteStreamGrpcServer.PullSiteCalls handler + SetOperationTrackingStore seam;
Host wires the seam alongside SetSiteAuditQueue (site roles only).
- Central IPullSiteCallsClient + GrpcPullSiteCallsClient (home: AuditLog/Central to
reuse ISiteEnumerator; SiteCallAudit does not reference AuditLog). Re-stamps
SourceSite from the dialed siteId; no-throw on tolerable transport faults;
SpecifyKind (not ToUniversalTime) cursor handling. Central-only DI registration.
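To make the cursor semantics above concrete, here is a small Python sketch over an in-memory list. All names are hypothetical and the real store is SQLite-backed; the sketch only models the inclusive/oldest-first/batch-capped read, the SourceSite re-stamp, and the "label as UTC, do not convert" cursor handling.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class SiteCallOperational:
    id: str
    source_site: str
    updated_at_utc: datetime


def as_utc_cursor(value: datetime) -> datetime:
    # Label a naive timestamp as UTC without shifting it (the SpecifyKind idea);
    # astimezone() would instead convert from local time, like ToUniversalTime.
    return value.replace(tzinfo=timezone.utc) if value.tzinfo is None else value


def read_changed_since(rows: List[SiteCallOperational], since_utc: datetime,
                       batch_size: int) -> List[SiteCallOperational]:
    changed = [r for r in rows if r.updated_at_utc >= since_utc]  # inclusive cursor
    changed.sort(key=lambda r: r.updated_at_utc)                  # oldest first
    return changed[:batch_size]                                   # batch-capped


def restamp_source_site(rows: List[SiteCallOperational],
                        dialed_site_id: str) -> List[SiteCallOperational]:
    # The site store has no site-id column, so the central client fills it in.
    return [replace(r, source_site=dialed_site_id) for r in rows]
```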
Tests: ReadChangedSinceAsync (4), PullSiteCalls handler (6), GrpcPullSiteCallsClient
(8). Full solution build 0 warnings/0 errors (TreatWarningsAsErrors).
Bite-sized TDD plan. M1 (runtime wiring) is fully detailed across 10 tasks
after verifying the purge/reconciliation actors already exist and only
need Host wiring + a gRPC pull client + event-logger injection. M2/M3/M4
are captured as right-sized task inventories with files, classification,
and acceptance criteria (AC).
Co-located .tasks.json for executing-plans resume.
The MxAccess Gateway .NET driver was republished at 0.1.1. Update both
ZB.MOM.WW.MxGateway.Client and ZB.MOM.WW.MxGateway.Contracts package
versions in central package management. Build is clean (0 errors/warnings)
and the release is API-compatible, so no code changes were required. Local
docker cluster rebuilt
and redeployed (scadabridge:latest), all 8 nodes + Traefik healthy.
Add InsertNotificationAsync with explicit status/createdAt parameters so tests
can seed back-dated Retrying rows that satisfy the IsStuck derived property
(Status ∈ {Pending,Retrying} && CreatedAt < now − 10 min). Refactor
InsertParkedNotificationAsync to delegate to it, preserving its exact public
signature and producing identical SQL for existing callers.
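The IsStuck condition quoted above, and why a back-dated row is needed to satisfy it, can be shown with a short Python sketch. The function name and argument shapes are illustrative; only the predicate itself comes from the commit text.

```python
from datetime import datetime, timedelta, timezone

STUCK_AFTER = timedelta(minutes=10)
STUCK_STATUSES = frozenset({"Pending", "Retrying"})


def is_stuck(status: str, created_at: datetime, now: datetime) -> bool:
    # Status in {Pending, Retrying} and created more than 10 minutes ago.
    return status in STUCK_STATUSES and created_at < now - STUCK_AFTER


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
back_dated = now - timedelta(minutes=11)  # the kind of createdAt a test would seed
```

A freshly inserted Retrying row is not stuck, which is why the helper needs an explicit createdAt parameter.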